[bootstrap] hero_router bind defaults block public-URL bring-up on heroci AND any normal cloud VM (DO/Hetzner/AWS) #227

Closed
opened 2026-05-07 03:05:27 +00:00 by mik-tf · 1 comment
Owner

Surfaced during session 73 heroci validation (2026-05-07). Documenting for session 74's DO from-nothing demo.

Symptom

  • heroci.gent01.grid.tf returns 502 Bad Gateway on every URL despite hero_proc + mycelium + hero_router all running locally on the VM (all three reach health-green via service_<name> start --download --reset).
  • Local probes from inside the VM succeed: curl http://[<mycelium-v6>]:9988/health → 200, curl http://[<mycelium-v6>]:8991/ → 200.
  • The TF Grid grid_name_proxy.gateway backend is registered as http://<vms[0].ip>:9988; on heroci vms[0].ip is the public IPv4 178.251.27.31, NOT the mycelium IPv6.

Root cause

service_router.nu, lines 173-179:

let has_addr   = ($address | is-not-empty)
let local_port = if $has_addr { 0 } else { $port }
mut script = $"($bin) --port ($local_port)"
if $has_addr { $script = $"($script) --address ($address) --ui-port ($ui_port)" }

When mycelium is up, --port 0 disables hero_router's TCP listener entirely; the only listener bound is on [<mycelium-v6>]:9988. When mycelium is down, --port 9988 binds on 127.0.0.1:9988 (per hero_router/crates/hero_router/src/main.rs:56 — "TCP port for the UI HTTP listener on 127.0.0.1").

Neither path binds on 0.0.0.0:9988 or on the VM's public IP, so any external-facing reverse proxy (TF Grid name_proxy, nginx running on a different host, a DigitalOcean cloud LB) hitting the public IPv4 finds nothing.
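The two launch paths can be modelled in a few lines of Rust to make the gap explicit. This is a simplified sketch, not code from hero_router: `effective_listener` and `accepts` are hypothetical helpers, `400::1` stands in for the real mycelium address, and the model ignores dual-stack sockets and firewalls.

```rust
use std::net::{IpAddr, Ipv4Addr, SocketAddr};

/// Hypothetical model of where hero_router ends up listening, following the
/// service_router.nu logic quoted above:
/// - address given: `--port 0` disables the loopback listener and only
///   `<address>:<ui_port>` is bound;
/// - no address: only `127.0.0.1:<port>` is bound.
pub fn effective_listener(address: Option<IpAddr>, port: u16, ui_port: u16) -> SocketAddr {
    match address {
        Some(ip) => SocketAddr::new(ip, ui_port),
        None => SocketAddr::new(IpAddr::V4(Ipv4Addr::LOCALHOST), port),
    }
}

/// Simplified reachability check: a socket bound on `bound` accepts a
/// connection addressed to `target` if the ports match and either the IPs are
/// identical or `bound` is the wildcard address of the same family.
/// Assumption: dual-stack behaviour of `[::]` is deliberately left out.
pub fn accepts(bound: SocketAddr, target: SocketAddr) -> bool {
    if bound.port() != target.port() {
        return false;
    }
    let same_family = bound.is_ipv4() == target.is_ipv4();
    bound.ip() == target.ip() || (bound.ip().is_unspecified() && same_family)
}
```

Under this model, a gateway backend pointing at 178.251.27.31:9988 is accepted by neither path, while a wildcard bind on 0.0.0.0:9988 would accept it.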

herodemo papers over this because it has nginx running on the VM bridging 0.0.0.0:80/443 → 127.0.0.1:9988. heroci has no nginx.

Fix options (for session 74)

  1. (Preferred) Add a --bind mode to hero_router and surface it in service_router.nu. Default stays 127.0.0.1 for security. Operators who want a public listener pass --bind 0.0.0.0 (or the explicit interface address). Mirror the pattern already used by mycelium_ui --bind [::]:8991. Mechanical Rust change in hero_router/crates/hero_router/src/main.rs around line 258 TcpListener::bind(addr).
  2. Document nginx-on-VM as the canonical pattern for non-TF-Grid deploys, and ship a service_router.nu --behind-nginx flag or similar that doesn't change bind behaviour. Adds an external dep.
  3. Repoint heroci's Terraform backend from vms[0].ip to the mycelium IPv6. Fixes heroci specifically; doesn't help DO/Hetzner/AWS users.
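Option 1 could look roughly like the following. This is a minimal std-only sketch, not the actual hero_router change: `parse_bind`, `listen`, and `DEFAULT_BIND` are hypothetical names, the flag value is assumed to be a full `ip:port` pair (as in `mycelium_ui --bind [::]:8991`), and the real binary may use different argument parsing and error types.

```rust
use std::net::{SocketAddr, TcpListener};

/// Assumed default: loopback only, so nothing is exposed unless the operator opts in.
pub const DEFAULT_BIND: &str = "127.0.0.1:9988";

/// Parse the value of a hypothetical `--bind` flag, falling back to the
/// loopback default when the flag is absent.
pub fn parse_bind(raw: Option<&str>) -> Result<SocketAddr, String> {
    let raw = raw.unwrap_or(DEFAULT_BIND);
    raw.parse::<SocketAddr>()
        .map_err(|e| format!("Invalid --bind '{}': {}", raw, e))
}

/// Open the TCP listener on the requested address.
pub fn listen(raw: Option<&str>) -> Result<TcpListener, String> {
    let addr = parse_bind(raw)?;
    TcpListener::bind(addr).map_err(|e| format!("Failed to bind {}: {}", addr, e))
}
```

With this shape, `listen(None)` keeps today's loopback-only behaviour, and `listen(Some("0.0.0.0:9988"))` opens the listener on every IPv4 interface. A value without a port, such as `0.0.0.0`, is rejected by this sketch; accepting a bare IP would need an extra parsing branch.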

Why this is now a session-74 priority

Session 74 plan: validate the --download from-nothing bootstrap on a fresh DigitalOcean Ubuntu 24.04 droplet — i.e. the actual customer experience. That deploy needs hero_router to listen on a public-facing interface. Without option 1 (or option 2 + nginx), the demo can't be reached.

What session 73 delivered

  • mycelium_network v0.7.5-rc1 (first release, 2 musl assets)
  • hero_router v0.2.3-rc1 (carries --address flag now)
  • hero_skills service_mycelium --download + service_complete --download wiring
  • D-06 forge-token-bootstrap-optional already validated on heroci in session 72

Local-on-heroci bring-up works end-to-end. Public URL exposure is the only remaining gap, and it's this issue.

See:

  • https://forge.ourworld.tf/lhumina_code/home/issues/212 (naming-convention rollout)
  • https://forge.ourworld.tf/geomind_code/mycelium_network/issues/46 (mycelium branch alignment debt)

Signed-off-by: mik-tf
Author
Owner

Closing as fixed.

Session 74 added the --bind flag to hero_router and validated DO from-nothing bring-up; session 75 re-validated on hero.threefold.store with the full nginx + LE + htpasswd stack.

Confirmed in current origin/development HEAD (s80 sync):

crates/hero_router/src/main.rs:262: ...anyhow::anyhow!("Invalid --bind '{}': {e}", cli.bind)

The original blocker — bind defaults locked to 127.0.0.1 / mycelium IPv6 — is resolved by passing --bind 0.0.0.0:9988 (or whatever fits the cloud VM's nginx upstream).

Public URL bring-up working on hero.threefold.store per s75 close.

Reference
lhumina_code/home#227