lumen.

screen → camera optical modem · v0.1

4×4 patch grid · 32 bits/frame
preamble: checkerboard pair
modulation: differential 4-PSK on DCT

payload

encode & broadcast

Type a short message. The encoder emits a preamble pair, then four payload frames carrying 32 bits each (16 patches × 2 bits via differential 4-PSK on a low-frequency DCT basis).
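As a rough sketch of the packing step described above, the snippet below splits a short message into four frames of sixteen 2-bit symbols. The names (`message_to_frames`, `GRID`), the zero padding, and the most-significant-pair-first bit order are illustrative assumptions, not taken from the page's actual encoder.

```python
GRID = 4                                        # 4x4 patch grid
BITS_PER_PATCH = 2
BITS_PER_FRAME = GRID * GRID * BITS_PER_PATCH   # 32 bits per frame
PAYLOAD_FRAMES = 4


def message_to_frames(message: str) -> list[list[int]]:
    """Split a message into 4 frames of 16 two-bit symbols (values 0..3)."""
    capacity = PAYLOAD_FRAMES * BITS_PER_FRAME // 8   # 16 bytes per burst
    # Assumption: truncate long messages and pad short ones with zero bytes.
    data = message.encode("utf-8")[:capacity].ljust(capacity, b"\x00")
    symbols = []
    for byte in data:
        # Four 2-bit symbols per byte, most significant pair first (assumed order).
        for shift in (6, 4, 2, 0):
            symbols.append((byte >> shift) & 0b11)
    per_frame = GRID * GRID
    return [symbols[i:i + per_frame] for i in range(0, len(symbols), per_frame)]
```

With these assumptions one burst holds 16 bytes, so truncation can split a multi-byte UTF-8 character; a real encoder would need to handle that.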

fps: 10

amplitude: 0.45

frame size: 512

channel

visible carrier

This is what your camera (or a second device's camera) sees. Point a phone here and switch to receive.

idle

state: idle

frame: —

bits sent: 0

throughput: 0 bps

capture

open camera

Allow camera access. Point at a screen running the transmit panel. Keep the framed area filling most of the view; hold steady; even, indirect lighting works best.

camera off

fps: —

preamble: searching

SNR est.: —

frames locked: 0

decoder

recovered bytes

Once the preamble locks, the decoder samples the luminance across each patch, takes the frame-to-frame difference, projects that difference onto the horizontal and vertical DCT basis functions, and recovers two bits per patch from the orientation and sign of the stronger projection.
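A minimal sketch of that projection step, assuming the frame-to-frame difference of one patch is available as a square grid of luminance values. The function names, the choice of DCT-II index 4 for the "2-cycle" stripe, and the bit mapping (bit 1 for orientation, bit 0 for sign) are assumptions for illustration.

```python
import math


def dct_stripe(n: int, k: int = 4) -> list[float]:
    """1-D DCT-II basis function of index k at n samples (k=4 spans two cycles)."""
    return [math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)]


def decode_patch(delta: list[list[float]]) -> int:
    """Recover a 2-bit symbol (0..3) from an n x n patch of frame differences."""
    n = len(delta)
    stripe = dct_stripe(n)
    # Projection onto the pattern that varies across columns.
    h = sum(delta[r][c] * stripe[c] for r in range(n) for c in range(n))
    # Projection onto the pattern that varies down rows.
    v = sum(delta[r][c] * stripe[r] for r in range(n) for c in range(n))
    if abs(h) >= abs(v):
        return 0 if h >= 0 else 1   # assumed mapping: column-varying, sign bit
    return 2 if v >= 0 else 3       # assumed mapping: row-varying, sign bit
```

Because the stripe sums to zero over the patch, a constant brightness offset in the difference drops out of both projections, and a positive gain only scales them.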

calibration

characterize the channel

Just like the audio self-test that swept 1/5/10/15/19 kHz tones to find what your microphone actually heard, this sweeps known luminance steps and known spatial DCT patterns to see what your camera actually captures.
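One thing the luminance sweep could be used for is estimating the end-to-end response curve (screen plus camera). The sketch below fits a single gamma exponent to the five step levels by least squares in log space. The power-law model and the name `fit_gamma` are assumptions; the page does not say what its calibration actually computes.

```python
import math


def fit_gamma(levels: list[float], measured: list[float]) -> float:
    """Estimate gamma assuming measured ~ black + span * level ** gamma."""
    black, white = measured[0], measured[-1]
    xs, ys = [], []
    for level, value in zip(levels, measured):
        norm = (value - black) / (white - black)   # rescale to 0..1
        if 0 < level < 1 and norm > 0:             # skip the endpoints, log(0) and log(1)
            xs.append(math.log(level))
            ys.append(math.log(norm))
    # Least-squares slope of log(norm) against log(level), forced through the origin.
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
```

A usage example with the sweep levels from the panel would be `fit_gamma([0.0, 0.25, 0.5, 0.75, 1.0], readings)`, where `readings` are the five measured tile values.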

idle

measurement

what the camera saw

Open camera and point at the calibration panel. Each tile reports the measured response to that test pattern.

camera off

step 0%: —

step 25%: —

step 50%: —

step 75%: —

step 100%: —

DCT 1c h: —

DCT 2c h: —

DCT 4c h: —

DCT 1c v: —

DCT 4c v: —

The acoustic analogy.

An acoustic modem maps bits onto orthogonal sinusoids in time. This optical modem maps bits onto orthogonal 2D-DCT basis functions in space, modulated differentially across frames in time. The channel has more dimensions, so the carrier alphabet is richer.

Symbol design.

The screen is tiled into a 4×4 grid of patches. Each patch carries 2 bits. The bits choose one of four states: (±1) × (horizontal stripe) or (±1) × (vertical stripe), where the stripe is a 2-cycle DCT basis function. Sign and orientation give 4 PSK-like states.
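The four states can be sketched as small luminance patterns. This is an illustration under stated assumptions: the stripe is taken as the DCT-II basis function of index 4 (two full cycles across the patch), "horizontal" is read as varying across columns, and the bit mapping (bit 1 orientation, bit 0 sign) is invented for the example. The default amplitude of 0.45 matches the transmit panel setting.

```python
import math


def dct_stripe(n: int, k: int = 4) -> list[float]:
    """1-D DCT-II basis function of index k at n samples (k=4 spans two cycles)."""
    return [math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)]


def patch_pattern(symbol: int, n: int = 8, amplitude: float = 0.45) -> list[list[float]]:
    """Return an n x n luminance offset pattern for a 2-bit symbol (0..3)."""
    sign = -1.0 if symbol & 1 else 1.0   # assumed: low bit selects the sign
    vertical = bool(symbol & 2)          # assumed: high bit selects the orientation
    stripe = dct_stripe(n)
    if vertical:
        # Pattern varies down the rows and is constant along each row.
        return [[sign * amplitude * stripe[r] for _ in range(n)] for r in range(n)]
    # Pattern varies across the columns and is constant down each column.
    return [[sign * amplitude * stripe[c] for c in range(n)] for _ in range(n)]
```

The two orientations are orthogonal over the patch, and the two signs are exact negatives of each other, which is what lets a receiver separate all four states with two projections.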

Why differential.

Cameras have unknown gamma, white balance, and exposure. Encoding the change from frame to frame is robust to these unknowns in the same way that DPSK in audio is robust to an unknown phase offset. Each payload frame is interpreted relative to the frame before it.
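A toy check of that argument, assuming the camera behaves like a monotonic gain, gamma, and offset curve (a simplification: real cameras also vary spatially and over time). The function names and default parameter values are hypothetical.

```python
def camera_response(x: float, gain: float = 0.7, gamma: float = 2.2,
                    offset: float = 0.05) -> float:
    """Assumed camera model: unknown but monotonic mapping of screen luminance."""
    return gain * (x ** gamma) + offset


def delta_signs_survive(prev: list[float], curr: list[float]) -> bool:
    """True if every frame-to-frame change keeps its direction through the camera."""
    sent = [c - p for p, c in zip(prev, curr)]
    seen = [camera_response(c) - camera_response(p) for p, c in zip(prev, curr)]
    return all((s > 0) == (d > 0) for s, d in zip(sent, seen))
```

The absolute readings come out quite different from what was displayed, but the direction of each change is preserved, which is the property a sign-based differential decoder relies on.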

Preamble lock.

The transmitter starts every burst with a checkerboard frame followed by its inverse. The receiver cross-correlates that known checkerboard-difference against incoming frame-deltas; when correlation peaks, frame timing and patch grid alignment are both established at once. This is the optical version of the matched-filter preamble that worked in the audio modem.
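A simplified sketch of the timing half of that lock, working on per-patch mean luminance (16 values per camera frame). It correlates each frame delta against the known checkerboard difference and returns the frame where the correlation peaks. Grid alignment, which the real receiver establishes at the same time, is left out here; the function names and the 0.8 threshold are assumptions.

```python
import math

GRID = 4


def checkerboard() -> list[float]:
    """+1/-1 checkerboard over the 4x4 patch grid, row-major."""
    return [1.0 if (r + c) % 2 == 0 else -1.0 for r in range(GRID) for c in range(GRID)]


def normalized_correlation(a: list[float], b: list[float]) -> float:
    """Zero-mean normalized correlation in [-1, 1]; 0.0 if either input is flat."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    da, db = [x - ma for x in a], [y - mb for y in b]
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return sum(x * y for x, y in zip(da, db)) / den if den > 0 else 0.0


def find_preamble(frames: list[list[float]], threshold: float = 0.8):
    """Return the index of the frame that completes the preamble pair, or None."""
    template = [-2.0 * v for v in checkerboard()]   # inverse minus checkerboard
    best_index, best_score = None, threshold
    for i in range(1, len(frames)):
        delta = [b - a for a, b in zip(frames[i - 1], frames[i])]
        score = normalized_correlation(delta, template)
        if score > best_score:
            best_index, best_score = i, score
    return best_index
```

Normalizing the correlation makes the score independent of camera gain and offset, so the same threshold applies across exposure settings.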

Frame timing.

The camera frame rate is almost never an integer multiple of the screen rate, and rolling shutter samples rows at slightly different times. The receiver oversamples (it grabs every camera frame) and uses the preamble correlation to align to the correct sender frame, dropping camera frames that fall mid-transition.
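One way to picture the frame selection: once the preamble gives a lock time, each camera timestamp maps to a sender frame index and a phase within that frame, and captures too close to a frame boundary are discarded. The sketch below assumes timestamps in seconds, a known sender rate, and a hypothetical guard fraction of 0.2; the actual receiver may choose frames differently.

```python
import math


def pick_frames(camera_times: list[float], lock_time: float,
                sender_fps: float = 10.0, guard: float = 0.2) -> dict[int, float]:
    """Map sender frame index -> timestamp of the best camera capture for it."""
    chosen: dict[int, tuple[float, float]] = {}
    for t in camera_times:
        position = (t - lock_time) * sender_fps   # in units of sender frames
        index = math.floor(position)
        phase = position - index                  # 0 = frame start, 1 = frame end
        # Drop captures before the lock or too close to a screen transition.
        if index < 0 or phase < guard or phase > 1 - guard:
            continue
        # Keep the capture closest to the middle of the sender frame.
        if index not in chosen or abs(phase - 0.5) < abs(chosen[index][1] - 0.5):
            chosen[index] = (t, phase)
    return {i: t for i, (t, _) in sorted(chosen.items())}
```

For a 30 fps camera watching a 10 fps sender, this keeps one of the two mid-frame captures per sender frame and drops the one that lands on the transition.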

Realistic rates.

At 10 fps, 32 bits per frame gives 320 bps raw, before error correction. In practice, expect to lose about half of that to retries and frame drops. The acoustic modem landed in similar territory at the ultrasonic edge: 20 to 200 bps is an honest range for a self-contained browser implementation. Neural decoders in the literature push much higher.
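The arithmetic can be spelled out in a few lines. The preamble overhead term is an added assumption, based on the burst structure of two preamble frames followed by four payload frames; the 50% loss figure is the rough estimate from the text, not a measurement.

```python
def link_budget(fps: float = 10, bits_per_frame: int = 32,
                preamble_frames: int = 2, payload_frames: int = 4,
                loss: float = 0.5) -> tuple[float, float, float]:
    """Return (raw bps, bps after preamble overhead, bps after assumed losses)."""
    raw = fps * bits_per_frame
    # Only payload frames carry data; the preamble pair is pure overhead.
    per_burst = raw * payload_frames / (preamble_frames + payload_frames)
    effective = per_burst * (1 - loss)
    return raw, per_burst, effective
```

With the defaults this gives 320 bps raw, roughly 213 bps once the preamble pair is counted, and roughly 107 bps after the assumed losses.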

What's not here.

No FEC (a Reed-Solomon outer code would help), no multi-color carriers (chrominance is its own orthogonal channel), no perspective correction (the receiver assumes the sender fills most of the frame, roughly square). All worthwhile next steps if the basic loop closes.