refactor: remove --start/--stop from hero_* binaries; nushell owns full lifecycle #102

Open
opened 2026-04-21 12:04:54 +00:00 by mahmoud · 1 comment
Owner

Context

Per recent architectural decision (@despiegk): Nushell (tools/modules/services/service_*.nu + shared lib.nu helpers) is the single source of truth for Hero service lifecycle. The --start / --stop flags that every hero_* binary currently carries are redundant glue — they duplicate what the nu layer already does via hero_proc_sdk / proc service set / proc service start, and they create a second "canonical" spec that drifts against the nu modules.

The consolidation in 5df145f Refactor service lifecycle management into shared lib.nu helpers already moved lifecycle dispatch into nu helpers. This issue covers the next step: tearing the now-dead --start / --stop code out of each service binary.

Closes #100 and #101 (which proposed the opposite direction).

Current state

  • service_*.nu modules no longer call hero_<svc> --start / --stop. They dispatch via proc service set / proc service start in lib.nu helpers (svc_start, svc_stop, svc_start_preflight, svc_status). Verified: only remaining refs in service_biz.nu are comments and a help-text line.
  • Every hero_* binary still ships --start / --stop matching build_service_definition() and calling hero_proc_sdk::lifecycle::restart_service / stop_service.
  • claude/skills/hero_proc_service_selfstart documents the --start/--stop CLI pattern as the recommended design. This skill becomes obsolete under the new direction.

Target state

  • hero_* binaries are plain foreground daemons. No --start / --stop flags. They read their config, bind their sockets, and run until SIGTERM'd.
  • Service lifecycle (registration, start, stop, status, restart, health) lives entirely in service_*.nu.
  • One Nushell script per service replaces its Makefile's run/start/stop/install targets. make stays for build/test/lint only (or goes away entirely per-service — TBD).
  • claude/skills/hero_proc_service_selfstart is retired; claude/skills/nu_service + nu_service_use become the canonical lifecycle docs.

Per-repo checklist

Each repo gets its own issue + PR. Order is opportunistic — deletions are low-risk since nu modules no longer call these flags.

  • hero_whiteboard — remove --start/--stop from src/main.rs; drop build_service_definition() if unused elsewhere
  • hero_books
  • hero_collab
  • hero_biz
  • hero_db
  • hero_indexer
  • hero_voice
  • hero_aibroker
  • hero_foundry
  • hero_os
  • hero_router
  • hero_proxy
  • hero_mycelium
  • hero_proc (self — keep the lifecycle SDK, remove the CLI flag wiring only)
  • hero_code
  • hero_matrixchat
  • hero_slides
  • hero_embedder

Deferred (not yet carrying the flags): hero_office, hero_osis.

Deferred (no service binary yet): hero_agent, hero_claude, hero_compute, hero_shrimp.

Makefile → nu script consolidation

Parallel track. Each repo currently has a Makefile with install, build, run, test, fmt, lint, clean. The lifecycle pieces (run, anything calling the service binary with --start) move into the service's scripts/nu_service.nu (already per-repo pattern). build/test/fmt/lint stay in make for now — decision on full retirement deferred.

Scope guardrails

  • Each PR touches exactly one repo.
  • Remove: --start, --stop, build_service_definition(), any clap entries for those flags, related docs.
  • Keep: the service's actual daemon code, config loading, socket binding, graceful shutdown on SIGTERM (see rust_shutdownsignals skill).
  • Smoke test each removal: after the service is rebuilt, service_<name>.nu install | start | status | stop must still work end-to-end on Hetzner.
  • No coordinated "big bang" — ship per-repo, merge independently.

Rollout sequencing

  1. Pilot: pick hero_whiteboard (smallest lifecycle surface, already validated by the since-closed PR #101).
  2. Verify the nu script is the only caller: grep every repo and any ops docs for hero_whiteboard --start before deleting.
  3. Iterate through the checklist.
  4. Retire claude/skills/hero_proc_service_selfstart once 3+ services are through.

Refs: #75, closed #100, closed #101.

## Context Per recent architectural decision (@despiegk): Nushell (`tools/modules/services/service_*.nu` + shared `lib.nu` helpers) is the **single source of truth** for Hero service lifecycle. The `--start` / `--stop` flags that every `hero_*` binary currently carries are redundant glue — they duplicate what the nu layer already does via `hero_proc_sdk` / `proc service set` / `proc service start`, and they create a second "canonical" spec that drifts against the nu modules. The consolidation in `5df145f Refactor service lifecycle management into shared lib.nu helpers` already moved lifecycle dispatch into nu helpers. This issue covers the next step: tearing the now-dead `--start` / `--stop` code out of each service binary. Closes #100 and #101 (which proposed the opposite direction). ## Current state - `service_*.nu` modules no longer call `hero_<svc> --start` / `--stop`. They dispatch via `proc service set` / `proc service start` in `lib.nu` helpers (`svc_start`, `svc_stop`, `svc_start_preflight`, `svc_status`). Verified: only remaining refs in `service_biz.nu` are comments and a help-text line. - Every `hero_*` binary still ships `--start` / `--stop` matching `build_service_definition()` and calling `hero_proc_sdk::lifecycle::restart_service` / `stop_service`. - `claude/skills/hero_proc_service_selfstart` documents the `--start`/`--stop` CLI pattern as the recommended design. This skill becomes obsolete under the new direction. ## Target state - `hero_*` binaries are plain foreground daemons. No `--start` / `--stop` flags. They read their config, bind their sockets, and run until SIGTERM'd. - Service lifecycle (registration, start, stop, status, restart, health) lives entirely in `service_*.nu`. - One Nushell script per service replaces its Makefile's `run`/`start`/`stop`/`install` targets. `make` stays for build/test/lint only (or goes away entirely per-service — TBD). - `claude/skills/hero_proc_service_selfstart` is retired; `claude/skills/nu_service` + `nu_service_use` become the canonical lifecycle docs. ## Per-repo checklist Each repo gets its own issue + PR. Order is opportunistic — deletions are low-risk since nu modules no longer call these flags. - [ ] hero_whiteboard — remove `--start`/`--stop` from `src/main.rs`; drop `build_service_definition()` if unused elsewhere - [ ] hero_books - [ ] hero_collab - [ ] hero_biz - [ ] hero_db - [ ] hero_indexer - [ ] hero_voice - [ ] hero_aibroker - [ ] hero_foundry - [ ] hero_os - [ ] hero_router - [ ] hero_proxy - [ ] hero_mycelium - [ ] hero_proc (self — keep the lifecycle SDK, remove the CLI flag wiring only) - [ ] hero_code - [ ] hero_matrixchat - [ ] hero_slides - [ ] hero_embedder Deferred (not yet carrying the flags): hero_office, hero_osis. Deferred (no service binary yet): hero_agent, hero_claude, hero_compute, hero_shrimp. ## Makefile → nu script consolidation Parallel track. Each repo currently has a `Makefile` with `install`, `build`, `run`, `test`, `fmt`, `lint`, `clean`. The lifecycle pieces (`run`, anything calling the service binary with `--start`) move into the service's `scripts/nu_service.nu` (already per-repo pattern). `build`/`test`/`fmt`/`lint` stay in `make` for now — decision on full retirement deferred. ## Scope guardrails - Each PR touches exactly one repo. - Remove: `--start`, `--stop`, `build_service_definition()`, any `clap` entries for those flags, related docs. - Keep: the service's actual daemon code, config loading, socket binding, graceful shutdown on SIGTERM (see `rust_shutdownsignals` skill). - Smoke test each removal: after the service is rebuilt, `service_<name>.nu install | start | status | stop` must still work end-to-end on Hetzner. - No coordinated "big bang" — ship per-repo, merge independently. ## Rollout sequencing 1. **Pilot**: pick `hero_whiteboard` (smallest lifecycle surface, already validated by the since-closed PR #101). 2. **Verify** the nu script is the only caller: grep every repo and any ops docs for `hero_whiteboard --start` before deleting. 3. **Iterate** through the checklist. 4. **Retire** `claude/skills/hero_proc_service_selfstart` once 3+ services are through. Refs: #75, closed #100, closed #101.
Owner

we are too broad here, the idea is not just to copy everything from makefiles, we need to stay elegant

we are too broad here, the idea is not just to copy everything from makefiles, we need to stay elegant
Sign in to join this conversation.
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
lhumina_code/hero_skills#102
No description provided.