Multi-user collision: rtc.tcp_port wildcard-binds on *:7881, hardcoded — second user fails to start #33

Closed
opened 2026-04-29 23:36:28 +00:00 by sameh-farouk · 0 comments
Member

Context

#31 / PR #32 fixed the UDP/7882 multi-user collision by adding rtc.ips.includes so each user's livekit binds UDP only on its own node_ip. That works.

Observed on dev box 138.201.206.39 after deploying #32: the SAME multi-user collision now surfaces on TCP port 7881 (the ICE-TCP fallback). Concretely, when ashraf's livekit-server tries to start while sameh's is already running:

2026-04-30T00:12:21  starting livekit (config /home/ashraf/hero/var/hero_livekit/livekit.yaml)
listen tcp :7881: bind: address already in use

Why

The yaml renderer in crates/hero_livekit_server/src/livekit/rpc.rs:899 hardcodes:

y.push_str("  tcp_port: 7881\n");

Combined with bind_addresses: "0.0.0.0" (kept that way because the wrapper's own twirp client at rpc.rs:385 calls livekit-server via http://127.0.0.1:<livekit_port> — see the comment at line 891), this means:

UDP/7882: per-user IP-isolated via rtc.ips.includes      ✓ (PR #32)
TCP/7881: wildcard bind on `*:7881`                       ✗ collides between users
TCP/7880: configurable via cfg.livekit_port              ✓ (already configurable)

ss -lntp on the dev box shows the collision concretely:

LISTEN  *:7881  users:(("livekit-server",pid=2156583,fd=9))  ← sameh holds wildcard
LISTEN  *:7880  users:(("livekit-server",pid=2156583,fd=10)) ← sameh holds wildcard

A second instance (any user) trying to bind *:7881 gets EADDRINUSE — even though their node_ip is unique.
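
The failure mode can be reproduced outside livekit with a few lines of Rust. This is only an illustrative sketch: it binds IPv4 wildcard `0.0.0.0` on an OS-chosen port rather than 7881, and the function name `second_wildcard_bind_error` is made up for this example.

```rust
use std::io::ErrorKind;
use std::net::TcpListener;

/// Binds a wildcard listener on an OS-chosen port, then tries to bind the
/// same wildcard address and port a second time while the first listener
/// is still alive. Returns the error kind of the second attempt, if any.
fn second_wildcard_bind_error() -> Option<ErrorKind> {
    // Port 0 asks the OS for a free port, so the demo does not depend on 7881.
    let first = TcpListener::bind(("0.0.0.0", 0)).expect("first bind failed");
    let port = first.local_addr().expect("no local addr").port();

    // Same address, same port, first listener still held: this should fail.
    let second = TcpListener::bind(("0.0.0.0", port));
    second.err().map(|e| e.kind())
}

fn main() {
    println!("{:?}", second_wildcard_bind_error());
}
```

Because both listeners ask for the wildcard address, a unique `node_ip` per user makes no difference; only a different port number avoids the conflict.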

Fix

Make rtc.tcp_port configurable from runtime.json, mirroring how livekit_port and backend_port already work. Multi-user setups can then give each user its own TCP port (e.g., sameh: 7881, ashraf: 7981, ...) so the wildcard *:tcp_port listeners no longer collide.

UDP doesn't need a similar treatment in this issue's scope: rtc.ips.includes already makes UDP per-user-IP-exclusive, so keeping udp_port: 7882 shared is fine.
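
A minimal sketch of what the renderer change could look like, assuming the value arrives as an optional field and falls back to 7881 when unset. `RuntimeCfg`, `DEFAULT_RTC_TCP_PORT`, and `render_rtc_tcp_port` are hypothetical names for illustration; the actual struct and renderer in rpc.rs may be shaped differently.

```rust
/// Fallback used when runtime.json does not set the port (assumed default,
/// matching the previously hardcoded value).
const DEFAULT_RTC_TCP_PORT: u16 = 7881;

/// Hypothetical stand-in for the parsed runtime.json config.
struct RuntimeCfg {
    rtc_tcp_port: Option<u16>,
}

/// Appends the `tcp_port` line of the rtc section to the yaml buffer,
/// using the configured port instead of a hardcoded literal.
fn render_rtc_tcp_port(cfg: &RuntimeCfg, y: &mut String) {
    let port = cfg.rtc_tcp_port.unwrap_or(DEFAULT_RTC_TCP_PORT);
    y.push_str(&format!("  tcp_port: {port}\n"));
}

fn main() {
    let mut y = String::new();
    render_rtc_tcp_port(&RuntimeCfg { rtc_tcp_port: Some(7981) }, &mut y);
    print!("{y}");
}
```

Keeping 7881 as the fallback means existing single-user installs render the same yaml as before.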

Verification (after PR)

With the fix applied and ashraf's runtime.json set to "rtc_tcp_port": 7981:

ss -lntp | grep livekit
  *:7881   sameh  livekit-server
  *:7981   ashraf livekit-server      ← no collision

Both livekit-servers run concurrently.

Follow-up needed in hero_skills

This change makes the field configurable, but multi_user_add still needs to allocate non-overlapping per-user TCP ports when creating a user (and write them into runtime.json). I'll file that as a separate hero_skills issue so this PR stays scoped to hero_livekit.
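
One way the allocation could work, purely as a sketch: start at 7881 and step by 100 (following the sameh: 7881, ashraf: 7981 example above) until a port not already assigned to another user is found. The function name, the stride, and the idea of passing in the set of taken ports are all assumptions; the real `multi_user_add` logic is not specified here.

```rust
use std::collections::BTreeSet;

/// Assumed starting port and spacing between users (illustrative values).
const BASE_PORT: u16 = 7881;
const STRIDE: u16 = 100;

/// Returns the first port in BASE_PORT, BASE_PORT + STRIDE, ... that is not
/// in `taken`, or None if the u16 port range is exhausted.
fn allocate_rtc_tcp_port(taken: &BTreeSet<u16>) -> Option<u16> {
    let mut port = BASE_PORT;
    loop {
        if !taken.contains(&port) {
            return Some(port);
        }
        // checked_add returns None on overflow, which ends the search.
        port = port.checked_add(STRIDE)?;
    }
}

fn main() {
    let taken: BTreeSet<u16> = [7881].into_iter().collect();
    println!("{:?}", allocate_rtc_tcp_port(&taken));
}
```

A sketch like this only avoids collisions among ports it knows about; a real allocator would likely also need to check that the chosen port is not in use by an unrelated process.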

Reference
lhumina_code/hero_livekit#33