feat: Auto-detect node connectivity (Public IP → Mycelium → Local fallback) #57
Reference
lhumina_code/hero_compute#57
Summary
Currently, worker nodes require manually setting MYCELIUM_IP in the environment, and the heartbeat transport only works over TCP with routable IPs. Nodes behind NAT or without public IPs cannot participate in a multi-node cluster.
We need an automatic connectivity discovery chain at node startup that determines the best available transport, with no manual config required.
Proposed Behavior
On node registration, run a connectivity discovery chain:
1. Check for Public IP
"Public IP detected: 85.x.x.x — ready for multi-node communication"
2. No Public IP → Check Mycelium
"No public IP detected, checking Mycelium..."
"Mycelium network active: 300:abc::1 — ready to accept connections"
3. No Public IP, No Mycelium → Local-only mode
"⚠ Mycelium is not running, falling back to local mode"
"⚠ Note: in local mode you will not be able to manage VMs outside this node"
4. Has Both Public IP + Mycelium
"Public IP: 85.x.x.x (primary), Mycelium: 300:abc::1"
Why
MYCELIUM_IP env var to set
Implementation Notes
tcp://[300:abc::1]:9002 should work with the existing TCP bridge; no protocol changes needed
HERO_COMPUTE_ADVERTISE_ADDRESS should be set automatically based on the discovered transport
Affected Crates
hero_compute: CLI startup logic, env var generation
hero_compute_server: node registration, heartbeat sender, config
hero_compute_explorer: TCP bridge binding, proxy connections
Update: Align with Hero RPC Transport Standards
hero_compute currently uses raw TCP sockets with newline-delimited JSON-RPC for cross-node communication (heartbeats, RPC proxy). This is a custom transport that doesn't follow the Hero ecosystem standard, which is:
hero_proxy is the sole TCP entry point
hero_inspector discovers services by scanning sockets
hero_compute bypasses all of this with a custom tcp_bridge(), but for a valid reason: the Hero RPC layer has no concept of remote service-to-service communication. OpenRpcTransport only supports connect_socket(path), no TCP. So hero_compute had to roll its own.
What this means for Mycelium support
Short-term: Mycelium IPv6 works as-is with the existing raw TCP bridge (tcp://[300:abc::1]:9002), minimal changes needed.
Long-term: this exposes a gap in the Hero RPC ecosystem, namely the lack of a cross-machine transport. Any future Hero service needing multi-node communication will hit the same wall.
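To illustrate why the bracketed form matters for the short-term path, here is a minimal sketch of turning a tcp:// endpoint into a socket address using only the Rust standard library. The helper name parse_tcp_endpoint is hypothetical and not taken from hero_compute; the sketch only assumes that endpoints follow the tcp://host:port shape shown above.

```rust
use std::net::SocketAddr;

/// Parse an endpoint such as `tcp://[300:abc::1]:9002` into a socket address.
/// Hypothetical helper for illustration; not part of hero_compute.
/// Returns `None` if the scheme is missing or the address does not parse.
fn parse_tcp_endpoint(endpoint: &str) -> Option<SocketAddr> {
    // Strip the scheme; anything without it is rejected.
    let host_port = endpoint.strip_prefix("tcp://")?;
    // `SocketAddr` accepts `a.b.c.d:port` for IPv4 and `[v6]:port` for IPv6.
    // An IPv6 literal without brackets is ambiguous and fails to parse.
    host_port.parse::<SocketAddr>().ok()
}
```

The same parser handles IPv4 and IPv6 endpoints, so a Mycelium address needs no special casing as long as it is written in brackets.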
Revised scope for this issue
Stays focused on the connectivity discovery chain:
400::/7 or 300::/8 ranges)
MYCELIUM_IP env var
[::] (not just 0.0.0.0) for IPv6/Mycelium support
Proposed detection logic
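A minimal sketch of the chain from Proposed Behavior, for illustration only. It assumes the node's interface addresses have already been collected (the Rust standard library does not enumerate interfaces), treats 300::/8 and 400::/7 as the Mycelium ranges named above, and approximates "public" as "not in a well-known reserved IPv4 range". All names here (Connectivity, classify, is_public_v4, in_mycelium_range) are hypothetical and not taken from hero_compute.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

/// Outcome of the discovery chain (hypothetical type).
#[derive(Debug, PartialEq)]
enum Connectivity {
    PublicAndMycelium { public: Ipv4Addr, mycelium: Ipv6Addr },
    Public(Ipv4Addr),
    Mycelium(Ipv6Addr),
    LocalOnly,
}

/// True if the address lies in 300::/8 or 400::/7 (assumed Mycelium ranges).
fn in_mycelium_range(addr: &Ipv6Addr) -> bool {
    let first = addr.octets()[0];
    // 300::/8 -> first byte is 0x03; 400::/7 -> first byte is 0x04 or 0x05.
    first == 0x03 || (first & 0xfe) == 0x04
}

/// Rough approximation of "publicly routable" for IPv4.
fn is_public_v4(addr: &Ipv4Addr) -> bool {
    let o = addr.octets();
    // 100.64.0.0/10 (carrier-grade NAT) is checked by hand.
    let cgnat = o[0] == 100 && (o[1] & 0xc0) == 64;
    !(addr.is_private()
        || addr.is_loopback()
        || addr.is_link_local()
        || addr.is_unspecified()
        || addr.is_broadcast()
        || addr.is_documentation()
        || cgnat)
}

/// Apply the chain: public IP, then Mycelium, then local-only fallback.
fn classify(addrs: &[IpAddr]) -> Connectivity {
    let public = addrs.iter().find_map(|a| match a {
        IpAddr::V4(v4) if is_public_v4(v4) => Some(*v4),
        _ => None,
    });
    let mycelium = addrs.iter().find_map(|a| match a {
        IpAddr::V6(v6) if in_mycelium_range(v6) => Some(*v6),
        _ => None,
    });
    match (public, mycelium) {
        (Some(p), Some(m)) => Connectivity::PublicAndMycelium { public: p, mycelium: m },
        (Some(p), None) => Connectivity::Public(p),
        (None, Some(m)) => Connectivity::Mycelium(m),
        (None, None) => Connectivity::LocalOnly,
    }
}
```

An address on a local interface says nothing about reachability from outside (for example behind 1:1 NAT), so a real implementation would likely need an additional check before advertising the result.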
Implemented in v0.1.8, with architecture changes from the original design.
What was implemented
Architecture difference from original design
The issue proposed TCP bridges on Mycelium IPv6 addresses. We instead implemented:
TCP bridges were removed entirely. hero_proxy handles all external connectivity. The user-facing result is the same: auto-detection, no manual config, NAT-friendly via Mycelium.
Not implemented (minor)
These are cosmetic. The core requirement (auto-detect, zero manual config, multi-node over Mycelium) is working and tested live.
Closing.