hero_inspector crash-loops in Docker container (hero_proc conflict) #100
Problem
hero_inspector_server crash-loops in the Docker container (~0.5s restart interval). When run manually it works fine: it discovers 42 services, binds its socket, and serves requests.
The crash-loop causes 4 smoke test failures.
Root cause
hero_proc starts hero_inspector_server, but the process may exit or get killed before hero_proc detects it as healthy. hero_proc then restarts it, but the previous socket file may still exist, causing a bind conflict. The rapid restart loop (~0.5s) suggests hero_proc is retrying without enough delay.
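If a leftover socket file really were the cause of a bind conflict, the usual guard is to probe the path before binding and remove it only when nothing is listening. The sketch below illustrates that pattern with Python's standard library; `bind_unix_socket` is a hypothetical helper name, and nothing here is taken from hero_proc or hero_inspector_server.

```python
import os
import socket


def bind_unix_socket(path: str) -> socket.socket:
    """Bind a Unix socket at `path`, removing a stale socket file first.

    Hypothetical helper for illustration only.
    """
    if os.path.exists(path):
        probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            probe.connect(path)
        except (ConnectionRefusedError, FileNotFoundError):
            # Nobody is listening: the file is a leftover from a dead process.
            try:
                os.unlink(path)
            except FileNotFoundError:
                pass
        else:
            # A live server answered, so refuse to steal its socket.
            raise RuntimeError(f"another process is already serving {path}")
        finally:
            probe.close()

    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    server.listen()
    return server
```

Probing before unlinking avoids deleting the socket of an instance that is still running, which a blind unlink-then-bind would do.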
When run manually (docker exec herolocal /root/hero/bin/hero_inspector_server), it works perfectly: it discovers all 42 services, binds its socket, and serves RPC.
Evidence
Docker logs show repeated restarts.
Likely fix
--start flag support in hero_inspector_server for proper self-registration
Signed-off-by: mik-tf
Fixed
Root cause: the service TOML had exec = "hero_inspector_server serve", but the binary's clap CLI parser has no serve subcommand. Clap errored and exited, triggering hero_proc's crash-loop retry.
Fix: removed the serve argument from both the server and UI exec lines in hero_services/services/hero_inspector.toml.
Result: 122/124 smoke tests pass, 0 failures (was 4 failures).
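The failure mode can be illustrated by analogy with Python's argparse standing in for clap: a parser that defines no subcommands rejects a stray `serve` token and exits non-zero, which a supervisor then sees as a crash. This is only a sketch of the mechanism; the `--verbose` option and the helper names are hypothetical and do not reflect the real hero_inspector_server CLI.

```python
import argparse
import contextlib
import io


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the server's CLI: options only, no subcommands.
    parser = argparse.ArgumentParser(prog="hero_inspector_server")
    parser.add_argument("--verbose", action="store_true")  # assumed option
    return parser


def exit_code_for(argv: list) -> int:
    """Return the exit code the process would end with for these arguments."""
    try:
        # Swallow the usage message argparse prints to stderr on error.
        with contextlib.redirect_stderr(io.StringIO()):
            build_parser().parse_args(argv)
    except SystemExit as exc:
        return int(exc.code or 0)
    return 0


print(exit_code_for([]))         # 0: no arguments parses cleanly
print(exit_code_for(["serve"]))  # 2: unrecognized argument, parser exits
```

Because the process exits immediately at argument parsing, it never reaches socket binding, which is consistent with the binary working when launched manually without the extra argument.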
Signed-off-by: mik-tf