WebRTC signaling protocol
WebSocket-based offer/answer exchange between clients (browser, iPad app) and the lokust-stream server. Matches the implementation in session-vm/crates/stream-server/src/signaling/mod.rs.
Connection
Endpoint: ws://<host>:<port>/signal (or wss:// when behind TLS termination).
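As a small illustration of the scheme choice, the sketch below derives the signaling URL from the protocol of the page that hosts the client. The helper name `signalUrl` and the port used in the usage note are hypothetical; only the `/signal` path and the ws/wss distinction come from this document.

```typescript
// Hypothetical helper: picks wss:// when the hosting page was loaded over
// HTTPS (i.e. TLS is terminated in front of the server), ws:// otherwise.
function signalUrl(pageProtocol: string, host: string, port: number): string {
  const scheme = pageProtocol === "https:" ? "wss" : "ws";
  return `${scheme}://${host}:${port}/signal`;
}
```

In a browser this might be called as `signalUrl(location.protocol, location.hostname, 8080)`, where 8080 is only a placeholder port.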
Messages are JSON text frames. Binary frames are ignored.
Message envelope:
type names use snake_case. Unknown types are logged and ignored by the server.
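The message shapes can be sketched as TypeScript types with a tolerant parser. This is an assumption-laden sketch rather than the server's schema: the field names `type`, `sdp` and `session_id` appear in this document, but the string type of `session_id` and the `message` field on errors are guesses, and `parseServerMessage` and `encodeOffer` are hypothetical helper names.

```typescript
// Assumed wire shapes. The `message` field on errors is hypothetical.
type ClientMessage = { type: "offer"; sdp: string };

type ServerMessage =
  | { type: "ready"; session_id: string }
  | { type: "answer"; sdp: string }
  | { type: "error"; message: string };

// Parses one JSON text frame. Returns null for anything unrecognized,
// mirroring the "log and ignore" handling described for unknown types.
function parseServerMessage(frame: string): ServerMessage | null {
  let value: unknown;
  try {
    value = JSON.parse(frame);
  } catch {
    return null; // not valid JSON
  }
  if (typeof value !== "object" || value === null) return null;
  const fields = value as Record<string, unknown>;
  const kind = fields["type"];
  const sdp = fields["sdp"];
  const sessionId = fields["session_id"];
  const message = fields["message"];
  if (kind === "ready" && typeof sessionId === "string") {
    return { type: "ready", session_id: sessionId };
  }
  if (kind === "answer" && typeof sdp === "string") {
    return { type: "answer", sdp };
  }
  if (kind === "error" && typeof message === "string") {
    return { type: "error", message };
  }
  return null;
}

// Serializes the single client-to-server message described here.
function encodeOffer(sdp: string): string {
  const msg: ClientMessage = { type: "offer", sdp };
  return JSON.stringify(msg);
}
```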
Client → Server
offer
Client initiates WebRTC negotiation with its SDP offer.
The client is expected to have already:
1. Added a recvonly video transceiver to its peer connection.
2. Created an offer with pc.createOffer().
3. Set the local description, which starts ICE gathering.
4. Waited for ICE gathering to complete (non-trickle). Trickle ICE is not yet supported by the server, so candidates cannot be sent separately.
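A minimal sketch of the client-side preparation follows. Because ICE gathering only begins once the local description is set, the sketch sets it first and then waits for gathering to finish before reading the final SDP. `PeerLike` is a hypothetical interface modeled on the parts of the browser's RTCPeerConnection that are used here, declared locally so the example does not depend on DOM typings; `prepareOffer` is likewise a hypothetical name.

```typescript
// Hypothetical structural subset of RTCPeerConnection used by this sketch.
interface PeerLike {
  iceGatheringState: string;
  localDescription: { sdp: string } | null;
  addTransceiver(kind: string, init: { direction: string }): unknown;
  createOffer(): Promise<{ type?: string; sdp?: string }>;
  setLocalDescription(desc: { type?: string; sdp?: string }): Promise<void>;
  addEventListener(type: string, listener: () => void): void;
  removeEventListener(type: string, listener: () => void): void;
}

// Returns an SDP offer that already contains all gathered ICE candidates.
async function prepareOffer(pc: PeerLike): Promise<string> {
  pc.addTransceiver("video", { direction: "recvonly" });
  const offer = await pc.createOffer();
  // Setting the local description is what starts ICE gathering.
  await pc.setLocalDescription(offer);
  if (pc.iceGatheringState !== "complete") {
    await new Promise<void>((resolve) => {
      const onChange = () => {
        if (pc.iceGatheringState === "complete") {
          pc.removeEventListener("icegatheringstatechange", onChange);
          resolve();
        }
      };
      pc.addEventListener("icegatheringstatechange", onChange);
    });
  }
  // Read the description back: unlike the object returned by createOffer(),
  // it now includes the gathered candidates.
  const local = pc.localDescription;
  if (local === null) throw new Error("local description missing");
  return local.sdp;
}
```

The returned string is what would go into the sdp field of the offer message. A production client would probably also want a timeout around the gathering wait, which this sketch omits.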
Server → Client
ready
Sent immediately after the WebSocket handshake completes. Carries the server's view of this connection's session ID.
The client typically responds to ready by sending its offer.
answer
Sent after the server applies the client's offer and generates a response.
The client sets this as its RTCPeerConnection.remoteDescription.
error
Reports an error that is not necessarily fatal. The connection may or may not survive, so clients should be prepared to reconnect.
Happy path
Client                             Server
│                                       │
│ ─────── WebSocket Upgrade ──────────→ │
│                                       │
│ ←───────── { ready, session_id } ──── │
│                                       │
│ ─────── { offer, sdp } ─────────────→ │
│                                       │
│ ←───────── { answer, sdp } ────────── │
│                                       │
│         (DTLS-SRTP over UDP)          │
│       (video track RTP flowing)       │
│                                       │
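The exchange above can be read as a three-state client machine. The sketch below is one possible way to track it; the state and action names are hypothetical, and treating an out-of-order message as "ignore" is a design choice of this example, not documented server or client behavior.

```typescript
// Hypothetical client states for the happy path.
type ClientState = "awaiting_ready" | "awaiting_answer" | "negotiated";
type ServerMessageType = "ready" | "answer" | "error";

interface Step {
  state: ClientState;
  // What the client should do in response to the message.
  action: "send_offer" | "apply_answer" | "report_error" | "ignore";
}

function step(state: ClientState, received: ServerMessageType): Step {
  // Errors are reported but do not change the negotiation state.
  if (received === "error") return { state, action: "report_error" };
  if (state === "awaiting_ready" && received === "ready") {
    return { state: "awaiting_answer", action: "send_offer" };
  }
  if (state === "awaiting_answer" && received === "answer") {
    return { state: "negotiated", action: "apply_answer" };
  }
  // Anything else arrived out of order for this sketch's purposes.
  return { state, action: "ignore" };
}
```

Starting from "awaiting_ready", feeding ready and then answer ends in "negotiated", at which point media flows outside the WebSocket.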
Codec negotiation
The server registers a single H.264 codec entry:
mimeType: video/H264
clock_rate: 90000
sdp_fmtp: level-asymmetry-allowed=1;packetization-mode=1;profile-level-id=42e01f
payload: 102
profile-level-id=42e01f is constrained baseline profile, level 3.1, which is compatible with essentially all modern browsers and iOS/iPadOS WebRTC implementations.
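The profile-level-id value is three hex bytes: profile_idc, the constraint-set flags, and level_idc. The sketch below decodes it; `decodeProfileLevelId` is a hypothetical helper, and the level computation ignores the level 1b special case, which does not apply to this value.

```typescript
interface ProfileLevelId {
  profileIdc: number;      // first byte
  constraintFlags: number; // second byte: constraint_set flags plus reserved bits
  levelIdc: number;        // third byte: ten times the level number
}

function decodeProfileLevelId(hex: string): ProfileLevelId {
  if (!/^[0-9a-fA-F]{6}$/.test(hex)) {
    throw new Error(`invalid profile-level-id: ${hex}`);
  }
  return {
    profileIdc: parseInt(hex.slice(0, 2), 16),
    constraintFlags: parseInt(hex.slice(2, 4), 16),
    levelIdc: parseInt(hex.slice(4, 6), 16),
  };
}

const decoded = decodeProfileLevelId("42e01f");
// profile_idc 66 (0x42) is Baseline; bit 0x40 of the flags byte is
// constraint_set1_flag, which marks the constrained variant.
const isConstrainedBaseline =
  decoded.profileIdc === 66 && (decoded.constraintFlags & 0x40) !== 0;
const level = decoded.levelIdc / 10; // 31 / 10 = 3.1
```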
The server outputs H.264 byte-stream with AU alignment. The h264parse element emits SPS/PPS inline (config-interval=-1, which inserts them with every IDR frame).
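When reading an offer or answer by hand, it can help to know how the codec entry listed above maps onto SDP attribute lines. The sketch below formats the rtpmap and fmtp lines for such an entry. The `CodecEntry` field names and `sdpLinesFor` are hypothetical, and the payload type seen in a negotiated answer may differ from 102 depending on what the offer proposed.

```typescript
// Hypothetical representation of the codec entry from this section.
interface CodecEntry {
  mimeType: string; // e.g. "video/H264"
  clockRate: number;
  sdpFmtp: string;
  payloadType: number;
}

function sdpLinesFor(codec: CodecEntry): string[] {
  // rtpmap uses the encoding name, i.e. the part of the MIME type after "/".
  const encodingName = codec.mimeType.split("/")[1];
  return [
    `a=rtpmap:${codec.payloadType} ${encodingName}/${codec.clockRate}`,
    `a=fmtp:${codec.payloadType} ${codec.sdpFmtp}`,
  ];
}

const h264Lines = sdpLinesFor({
  mimeType: "video/H264",
  clockRate: 90000,
  sdpFmtp: "level-asymmetry-allowed=1;packetization-mode=1;profile-level-id=42e01f",
  payloadType: 102,
});
// h264Lines[0] is "a=rtpmap:102 H264/90000"
```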
Multi-peer
Every WebSocket connection creates an independent RTCPeerConnection and TrackLocalStaticSample. A background task subscribes each peer to the shared encoded-sample broadcast channel and writes samples to the peer's video track.
Consequences:
- One pipeline, N peers: encoded frames fan out, with no re-encoding per client.
- If a peer's network is slow, its broadcast receiver lags; the sample is dropped with a warning. Other peers are unaffected.
- No simulcast or layer selection yet: all peers receive the same stream.
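To make the lagging behavior concrete, here is a toy model of a bounded broadcast channel in which the sender never blocks and a slow receiver skips ahead. It is an illustration only: the document does not state which channel type or capacity the server uses, and the `Broadcast` class, its `Recv` result type, and the capacity of 2 below are all hypothetical.

```typescript
type Recv<T> =
  | { kind: "sample"; value: T }
  | { kind: "lagged"; skipped: number }
  | { kind: "empty" };

class Broadcast<T> {
  private readonly capacity: number;
  private readonly ring: T[] = [];
  private oldest = 0; // sequence number of ring[0]

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Never blocks: when the ring is full, the oldest sample is discarded.
  send(value: T): void {
    this.ring.push(value);
    if (this.ring.length > this.capacity) {
      this.ring.shift();
      this.oldest += 1;
    }
  }

  // Each subscriber gets its own cursor, starting at the next sample sent.
  subscribe(): () => Recv<T> {
    let cursor = this.oldest + this.ring.length;
    return (): Recv<T> => {
      if (cursor < this.oldest) {
        // This receiver fell behind: report the gap and skip forward.
        const skipped = this.oldest - cursor;
        cursor = this.oldest;
        return { kind: "lagged", skipped };
      }
      const index = cursor - this.oldest;
      if (index >= this.ring.length) return { kind: "empty" };
      cursor += 1;
      return { kind: "sample", value: this.ring[index] as T };
    };
  }
}

const channel = new Broadcast<number>(2);
const fast = channel.subscribe();
const slow = channel.subscribe();
const fastSeen: number[] = [];
for (const frame of [1, 2, 3, 4]) {
  channel.send(frame);
  const r = fast(); // the fast peer drains after every send
  if (r.kind === "sample") fastSeen.push(r.value);
}
const slowFirst = slow(); // the slow peer only reads now
```

In this run the fast peer sees all four frames, while the slow peer's first read reports that it lagged by two samples and then resumes from frame 3. The fast peer is unaffected, which is the property the list above describes.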
Not yet supported
- Trickle ICE: the client must finish gathering candidates before sending the offer. On mobile networks this adds roughly 0–3 seconds to connection setup.
- Audio: video only. An audio transceiver can be added when needed.
- Data channels for input: the iPad app scaffold reserves this; the server-side handler is not implemented yet. It will follow the input_handler.py protocol from the Selkies reference (touch → uinput events), adapted to Rust.
- Renegotiation: cleanup-then-reconnect is the current strategy. Expect renegotiation to be implemented when adaptive bitrate / resolution switching lands.
Reference implementations
- Server: session-vm/crates/stream-server/src/signaling/mod.rs
- Browser test client: session-vm/crates/stream-server/src/signaling/client.html (served at /)
- iPad client: ipad/LokustIpad/Streaming/SignalingClient.swift