Refactor service_*.nu modules to thin delegators over the selfstart CLI #100

Closed
opened 2026-04-21 07:56:03 +00:00 by mahmoud · 5 comments
Owner

Motivation

Every lhumina_code Hero service (except hero_office and hero_osis, tracked as hero_office#2 and hero_osis#32) already ships a CLI with --start / --stop that calls hero_proc_sdk::lifecycle directly. That CLI is the canonical source of truth for the action spec (retry policy, kill_other, env, health checks, stop timeouts).

The current tools/modules/services/service_*.nu modules hand-transcribe the same spec in Nushell. Every upstream tune in a service's build_service_definition() silently drifts against the nu copy. Concrete symptoms observed during #75 work:

  • service_claude spec (#99) needed six deviations from the baseline; most were duplication-driven.
  • service_agent probe showed the Rust CLI propagates only a subset of the hero_zero TOML env, so the two paths already disagree.
  • service_office ships ~400 lines of nu for what is effectively one hero_office --start call (pending #2).

Proposal

Refactor each service_*.nu to a thin delegator:

  • install keeps everything it does today — svc_require_sudo, forge_ensure_local, svc_cargo_install, --reset / --update / svc_bins_ok short-circuit, any seeding helpers that are genuinely module-owned (e.g. svx_seed_models_config for aibroker, svx_install_server_wrapper for claude).
  • start becomes: ensure binaries present → run any non-fatal preflight warnings (API keys, soft deps) → ^(svc_bin "hero_<name>" $root) --start with sudo -E HOME=/root dispatch on --root.
  • stop becomes: ^(svc_bin "hero_<name>" $root) --stop with the same sudo dispatch.
  • status stays as-is (proc service status $svc --root=$root).

Drops roughly 60–70% of each module's LOC and makes drift impossible by construction.

New shared helpers (in tools/modules/services/lib.nu)

  • svc_cli_start [cli_bin: string, root: bool] — runs ^$cli_bin --start, wraps in sudo -E HOME=/root when svc_need_sudo $root, surfaces exit code as a clean error.
  • svc_cli_stop [cli_bin: string, root: bool] — same for --stop.
  • (Optional) svc_cli_run [cli_bin: string, args: list<string>, root: bool] — generic version for services with status/logs subcommands on the CLI (hero_books has these).

Per-module checklist

Each cycle: branch development_refactor_service_<name>, one commit, one PR, smoke-test on Hetzner against existing behaviour (install → start → status → probes → stop), close this issue line, merge.

  • service_os.nu → delegate to hero_os
  • service_books.nu → delegate to hero_books (has extra --status / --logs subcommands, keep as-is or wire through)
  • service_collab.nu → delegate to hero_collab
  • service_whiteboard.nu → delegate to hero_whiteboard
  • service_biz.nu → delegate to hero_biz
  • service_db.nu → delegate to hero_db
  • service_indexer.nu → delegate to hero_indexer
  • service_voice.nu → delegate to hero_voice
  • service_aibroker.nu → delegate to hero_aibroker (keep svx_seed_models_config + svx_check_api_keys)
  • service_foundry.nu → delegate to hero_foundry
  • service_compute.nu — still to be created; create it as a delegator from day one
  • service_agent.nu — still to be created; create it as a delegator from day one (waits on no upstream dep)
  • service_claude.nu — still to be created; create it as a delegator from day one (waits on no upstream dep)
  • service_shrimp.nu — still to be created; create it as a delegator (uses the hero_shrimp Rust wrapper in hero_shrimp_proc/)

Blocked on upstream selfstart additions:

  • service_office.nu → delegate to hero_office (blocked on hero_office#2)
  • service_osis.nu → delegate to the new osis CLI (blocked on hero_osis#32)

Pilot

Suggest picking service_whiteboard.nu as the first refactor — shortest, zero preflights, no env-seeding helpers, so the diff shows the canonical delegator shape with nothing else cluttering the review. Once approved, subsequent modules follow the same pattern.

Scope guardrails

  • Each PR changes exactly one service_<name>.nu (and possibly lib.nu on the pilot to add svc_cli_start/svc_cli_stop).
  • No changes to service repos as part of this issue.
  • Smoke test each refactor end-to-end on Hetzner the same way the original PRs did.
  • Keep the existing --root / --reset / --update flags; semantics stay operator-facing.

Refs: #75

## Motivation Every lhumina_code Hero service (except `hero_office` and `hero_osis`, tracked as [hero_office#2](https://forge.ourworld.tf/lhumina_code/hero_office/issues/2) and [hero_osis#32](https://forge.ourworld.tf/lhumina_code/hero_osis/issues/32)) already ships a CLI with `--start` / `--stop` that calls `hero_proc_sdk::lifecycle` directly. That CLI is the canonical source of truth for the action spec (retry policy, kill_other, env, health checks, stop timeouts). The current `tools/modules/services/service_*.nu` modules hand-transcribe the same spec in Nushell. Every upstream tune in a service's `build_service_definition()` silently drifts against the nu copy. Concrete symptoms observed during #75 work: - `service_claude` spec (#99) needed six deviations from the baseline; most were duplication-driven. - `service_agent` probe showed the Rust CLI propagates only a subset of the hero_zero TOML env, so the two paths already disagree. - `service_office` ships ~400 lines of nu for what is effectively one `hero_office --start` call (pending #2). ## Proposal Refactor each `service_*.nu` to a thin delegator: - `install` keeps everything it does today — `svc_require_sudo`, `forge_ensure_local`, `svc_cargo_install`, `--reset` / `--update` / `svc_bins_ok` short-circuit, any seeding helpers that are genuinely module-owned (e.g. `svx_seed_models_config` for aibroker, `svx_install_server_wrapper` for claude). - `start` becomes: ensure binaries present → run any non-fatal preflight warnings (API keys, soft deps) → `^(svc_bin "hero_<name>" $root) --start` with `sudo -E HOME=/root` dispatch on `--root`. - `stop` becomes: `^(svc_bin "hero_<name>" $root) --stop` with the same sudo dispatch. - `status` stays as-is (`proc service status $svc --root=$root`). Drops roughly 60–70% of each module's LOC and makes drift impossible by construction. ## New shared helpers (in `tools/modules/services/lib.nu`) - `svc_cli_start [cli_bin: string, root: bool]` — runs `^$cli_bin --start`, wraps in `sudo -E HOME=/root` when `svc_need_sudo $root`, surfaces exit code as a clean error. - `svc_cli_stop [cli_bin: string, root: bool]` — same for `--stop`. - (Optional) `svc_cli_run [cli_bin: string, args: list<string>, root: bool]` — generic version for services with `status`/`logs` subcommands on the CLI (`hero_books` has these). ## Per-module checklist Each cycle: branch `development_refactor_service_<name>`, one commit, one PR, smoke-test on Hetzner against existing behaviour (install → start → status → probes → stop), close this issue line, merge. - [ ] `service_os.nu` → delegate to `hero_os` - [ ] `service_books.nu` → delegate to `hero_books` (has extra `--status` / `--logs` subcommands, keep as-is or wire through) - [ ] `service_collab.nu` → delegate to `hero_collab` - [ ] `service_whiteboard.nu` → delegate to `hero_whiteboard` - [ ] `service_biz.nu` → delegate to `hero_biz` - [ ] `service_db.nu` → delegate to `hero_db` - [ ] `service_indexer.nu` → delegate to `hero_indexer` - [ ] `service_voice.nu` → delegate to `hero_voice` - [ ] `service_aibroker.nu` → delegate to `hero_aibroker` (keep `svx_seed_models_config` + `svx_check_api_keys`) - [ ] `service_foundry.nu` → delegate to `hero_foundry` - [ ] `service_compute.nu` — still to be created; create it as a delegator from day one - [ ] `service_agent.nu` — still to be created; create it as a delegator from day one (waits on no upstream dep) - [ ] `service_claude.nu` — still to be created; create it as a delegator from day one (waits on no upstream dep) - [ ] `service_shrimp.nu` — still to be created; create it as a delegator (uses the `hero_shrimp` Rust wrapper in `hero_shrimp_proc/`) Blocked on upstream selfstart additions: - [ ] `service_office.nu` → delegate to `hero_office` **(blocked on [hero_office#2](https://forge.ourworld.tf/lhumina_code/hero_office/issues/2))** - [ ] `service_osis.nu` → delegate to the new osis CLI **(blocked on [hero_osis#32](https://forge.ourworld.tf/lhumina_code/hero_osis/issues/32))** ## Pilot Suggest picking `service_whiteboard.nu` as the first refactor — shortest, zero preflights, no env-seeding helpers, so the diff shows the canonical delegator shape with nothing else cluttering the review. Once approved, subsequent modules follow the same pattern. ## Scope guardrails - Each PR changes exactly one `service_<name>.nu` (and possibly `lib.nu` on the pilot to add `svc_cli_start`/`svc_cli_stop`). - No changes to service repos as part of this issue. - Smoke test each refactor end-to-end on Hetzner the same way the original PRs did. - Keep the existing `--root` / `--reset` / `--update` flags; semantics stay operator-facing. Refs: #75
Author
Owner

Implementation Spec for Issue #100 — Pilot (PR A)

Objective

Eliminate the duplicated hero_proc action spec that service_whiteboard.nu currently hand-transcribes, and move its start / stop verbs to thin delegators over the hero_whiteboard selfstart CLI — whose build_service_definition() in lhumina_code/hero_whiteboard/crates/hero_whiteboard/src/main.rs is the canonical source of truth for the registration. The pilot introduces two reusable helpers (svc_cli_start, svc_cli_stop) in tools/modules/services/lib.nu, then refactors exactly one module — service_whiteboard.nu — so reviewers see the canonical delegator shape before PR B mass-applies the same transform to the remaining 11 non-blocked modules.

Requirements

Helpers to add (to tools/modules/services/lib.nu):

  • svc_cli_start [cli_name: string, root: bool] — invokes <svc_bin cli_name root> --start, dispatching sudo when needed.
  • svc_cli_stop [cli_name: string, root: bool] — invokes <svc_bin cli_name root> --stop, same dispatch.

Dispatch rules (shared by both):

  • Resolve CLI path via svc_bin $cli_name $root.
  • Pre-flight: CLI must be executable. svc_need_sudo-aware test -x (same idiom as in svc_bins_ok).
  • Execution matrix:
    • root == false^($bin) --start with invoker's env.
    • root == true AND invoker is root → with-env { HOME: "/root" } { ^$bin --start }.
    • root == true AND invoker is NOT root → ^sudo -E HOME=/root $bin --start. -E preserves HERO_SOCKET_DIR etc.; the HOME=/root override redirects the CLI's hero_proc socket resolution to root's dir.
  • Sudo prefix stays inline in each helper (no separate svc_sudo_prefix fn).
  • CLI stdout must flow through to the terminal. Use | complete, then print $r.stdout and print $r.stderr, branch on $r.exit_code, raise error make with stderr attached on failure. Acceptable tradeoff: output appears in one chunk after CLI exits (~1s, negligible).

Module to refactor: tools/modules/services/service_whiteboard.nu.

Verb-to-CLI mapping:

Nu verb New body
service_whiteboard install unchanged — cargo build + copy (CLI is one of the three binaries built)
service_whiteboard start [--root] if $root { svc_require_sudo }; install --root=$root --update=$update --reset=$reset; svc_cli_start "hero_whiteboard" $root
service_whiteboard stop [--root] if $root { svc_require_sudo }; if not (svc_proc_healthy $root) { warn+return }; svc_cli_stop "hero_whiteboard" $root
service_whiteboard status [--root] unchanged

What stays:

  • Constants SVX_SERVICE_NAME, SVX_FORGE_LOC, SVX_BINARIES, SVX_ACTIONS.
  • install (unchanged — still svc_update + svc_bins_ok short-circuit + svc_cargo_install).
  • status (unchanged).
  • use ../clients/proc.nu * and use ./lib.nu * imports.

What goes:

  • svx_server_action, svx_ui_action, svx_service_config, svx_drop_registration — all four def blocks. The CLI's build_service_definition() owns this spec.
  • The "already running → skip" idempotency pre-check in start (CLI's restart_service is idempotent by design).
  • The rich 13-line summary block at the tail of start (CLI prints its own success line).
  • The svc_require_proc call inside start — CLI does its own hero_proc reachability check. status keeps its svc_require_proc.

Files to Modify

Two files:

  1. tools/modules/services/lib.nu — append svc_cli_start and svc_cli_stop in a new section after svc_cargo_install, matching the existing section banner style.
  2. tools/modules/services/service_whiteboard.nu — trim from 317 lines to ~100.

No other files are touched. mod.nu no edit (module already exported). Other service_*.nu modules no edit.

Implementation Plan

Step 1: Design svc_cli_start / svc_cli_stop

Both helpers have an identical body modulo the flag. Shape:

export def svc_cli_start [cli_name: string, root: bool] {
    let bin = (svc_bin $cli_name $root)

    # 1. executable preflight
    let bin_ok = if (svc_need_sudo $root) {
        (do { ^sudo test -x $bin } | complete).exit_code == 0
    } else {
        ($bin | path exists)
    }
    if not $bin_ok {
        error make {msg: $"($cli_name) CLI not found or not executable at ($bin). Run install first."}
    }

    # 2. execute
    let r = if $root {
        if (is-admin) {
            with-env { HOME: "/root" } { ^$bin --start } | complete
        } else {
            ^sudo -E HOME=/root $bin --start | complete
        }
    } else {
        ^$bin --start | complete
    }

    # 3. surface output + raise on failure
    if ($r.stdout | str length) > 0 { print $r.stdout }
    if ($r.stderr | str length) > 0 { print -e $r.stderr }
    if $r.exit_code != 0 {
        error make {msg: $"($cli_name) --start failed (exit ($r.exit_code))"}
    }
}

svc_cli_stop is identical with --stop in place of --start and the error message string adjusted.

Step 2: Add helpers to lib.nu

  • Append after svc_cargo_install (current end of file).
  • Header comment block explaining: resolves via svc_bin; is-admin → bare with HOME=/root; non-root invoker with --rootsudo -E HOME=/root; errors raised with CLI stderr surfaced first.
  • Update the "What's exposed" comment at the top of lib.nu to add a Selfstart CLI delegators — svc_cli_start, svc_cli_stop line.

Step 3: Trim service_whiteboard.nu header

  • Keep the 2-binary summary, socket paths block, hero_router line.
  • Replace the usage block with a shorter delegator-aware version. Add an explicit line:

    The hero_proc action spec is owned by the Rust CLI's build_service_definition() in lhumina_code/hero_whiteboard/crates/hero_whiteboard/src/main.rs. start / stop delegate to hero_whiteboard --start / hero_whiteboard --stop; install and status stay local.

  • Target header length: ~25 lines (down from 40).

Step 4: Delete removed helpers

Remove in order (keeps diff readable):

  1. Section banner + both def svx_server_action / def svx_ui_action blocks.
  2. def svx_service_config.
  3. def svx_drop_registration.

Keep the Constants banner and the four const SVX_* lines.

Step 5: Rewrite start

# ===========================================================================
# start — delegate to `hero_whiteboard --start`.
# ===========================================================================
#
# `hero_whiteboard --start` calls hero_proc_sdk::lifecycle::restart_service,
# which unregisters+re-registers the service idempotently. The action spec
# lives in build_service_definition() in the Rust CLI — do NOT duplicate it
# here.
#
# --reset semantic shift: previously meant "re-register even if running";
# with the CLI owning (idempotent) re-registration, --reset now effectively
# means "force rebuild before restart" (forwarded to install).
#
# Examples:
#   service_whiteboard start
#   service_whiteboard start --reset
#   service_whiteboard start --root
export def start [
    --reset       # Force rebuild before restart (passed through to install)
    --root(-r)    # Manage root's hero_proc instance (requires sudo)
    --update(-u)  # Pull newest code (via `forge merge`) before building
] {
    if $root { svc_require_sudo }
    install --root=$root --update=$update --reset=$reset
    svc_cli_start "hero_whiteboard" $root
}

Step 6: Rewrite stop

# ===========================================================================
# stop — delegate to `hero_whiteboard --stop`.
# ===========================================================================
export def stop [
    --root(-r)  # Stop root's hero_whiteboard registration (requires sudo)
] {
    if $root { svc_require_sudo }
    if not (svc_proc_healthy $root) {
        let flag = if $root { " --root" } else { "" }
        print "⚠ hero_proc is not running — nothing to stop."
        print $"  Bring hero_proc back up with:   service_proc start($flag)"
        return
    }
    svc_cli_stop "hero_whiteboard" $root
}

Step 7: Leave install and status alone

  • install byte-identical.
  • status byte-identical — we keep proc service status because it returns a structured record that callers pipe into where state == "running", which hero_whiteboard --status doesn't expose.

Expected post-refactor LOC: ~100 lines (down from 317).

Smoke Test Plan (Hetzner, --root)

  1. Install parityservice_whiteboard install --root — verify binaries present, no behavioural change vs. pre-refactor.
  2. Start cold — after stop, service_whiteboard start --root — CLI's hero_whiteboard started successfully line prints; proc service status shows running; both sockets exist; curl --unix-socket ui.sock /health → 200; server OpenRPC health OK.
  3. Start warm / idempotent — re-run without flags; CLI's restart_service handles clean re-register; still running.
  4. Status parityservice_whiteboard status --root — record shape byte-identical to baseline.
  5. Stopservice_whiteboard stop --root — CLI's hero_whiteboard stopped successfully prints; post-stop status returns not-found.
  6. Stop with hero_proc down⚠ hero_proc is not running branch fires; helper never called.
  7. Error surfacesudo chmod -x hero_whiteboard CLI, run start --root → preflight raises with actionable message. Restore chmod.
  8. Output pass-through — visually confirm CLI's stdout/stderr surfaces cleanly; no nu-side header obscures it.

Exit criteria: all eight cases pass; LOC drop ~200 lines; user-visible output dominated by the CLI's own messages.

Acceptance Criteria (from #100, pilot portion)

  • lib.nu exports svc_cli_start and svc_cli_stop with the contract in §Requirements.
  • service_whiteboard.nu no longer contains svx_server_action, svx_ui_action, svx_service_config, or svx_drop_registration.
  • start body is svc_require_sudo (conditional) → installsvc_cli_start.
  • stop body is svc_require_sudo (conditional) → hero_proc-down warning → svc_cli_stop.
  • install byte-identical to pre-refactor.
  • status byte-identical to pre-refactor.
  • Line count drops from 317 to under 110.
  • Full Hetzner --root smoke matrix passes (all eight cases).
  • CLI's stdout/stderr surfaces on both success and failure paths.

Notes

  • --reset semantic shift: pre-refactor meant "re-register even if already running". CLI's restart_service is idempotent, so --reset now functionally means "force rebuild before restart". Documented in the start doc-comment.
  • Why no generic svc_cli_run [verb] yet: two specific helpers keep call sites (svc_cli_start "hero_whiteboard" $root) self-documenting and let us evolve --start vs. --stop divergence without touching callers. Revisit after PR B if a second shape wants sharing (e.g. --status).
  • Why status stays unchanged: hero_whiteboard --status doesn't exist today, and proc service status returns a structured nu record that smoke tests and other skills pipe into where / get state. Replacing it with a CLI invocation would regress the data shape.
  • Why we drop svc_require_proc from start but keep it in status: start now delegates to the CLI, which does its own hero_proc_factory().await? check inside self_start and returns a clear error. Two redundant preflights muddy the error output. status uses the nu client directly and benefits from the cleaner nu-side message.
  • Why PR A touches only service_whiteboard.nu: canonical-shape review is the goal; spreading across 12 files would make review harder and delay rollback if we picked the wrong helper shape. PR B applies the same transform in bulk after the helpers are validated.

Critical Files for Implementation

  • tools/modules/services/lib.nu (new helpers)
  • tools/modules/services/service_whiteboard.nu (pilot refactor)
  • lhumina_code/hero_whiteboard/crates/hero_whiteboard/src/main.rs (reference — canonical action spec)
  • tools/modules/services/service_db.nu (reference — install flag shape with --reset)
## Implementation Spec for Issue #100 — Pilot (PR A) ### Objective Eliminate the duplicated hero_proc action spec that `service_whiteboard.nu` currently hand-transcribes, and move its `start` / `stop` verbs to thin delegators over the `hero_whiteboard` selfstart CLI — whose `build_service_definition()` in `lhumina_code/hero_whiteboard/crates/hero_whiteboard/src/main.rs` is the canonical source of truth for the registration. The pilot introduces two reusable helpers (`svc_cli_start`, `svc_cli_stop`) in `tools/modules/services/lib.nu`, then refactors exactly one module — `service_whiteboard.nu` — so reviewers see the canonical delegator shape before PR B mass-applies the same transform to the remaining 11 non-blocked modules. ### Requirements **Helpers to add (to `tools/modules/services/lib.nu`):** - `svc_cli_start [cli_name: string, root: bool]` — invokes `<svc_bin cli_name root> --start`, dispatching sudo when needed. - `svc_cli_stop [cli_name: string, root: bool]` — invokes `<svc_bin cli_name root> --stop`, same dispatch. **Dispatch rules (shared by both):** - Resolve CLI path via `svc_bin $cli_name $root`. - Pre-flight: CLI must be executable. `svc_need_sudo`-aware `test -x` (same idiom as in `svc_bins_ok`). - Execution matrix: - `root == false` → `^($bin) --start` with invoker's env. - `root == true` AND invoker is root → `with-env { HOME: "/root" } { ^$bin --start }`. - `root == true` AND invoker is NOT root → `^sudo -E HOME=/root $bin --start`. `-E` preserves `HERO_SOCKET_DIR` etc.; the `HOME=/root` override redirects the CLI's hero_proc socket resolution to root's dir. - Sudo prefix stays inline in each helper (no separate `svc_sudo_prefix` fn). - CLI stdout must flow through to the terminal. Use `| complete`, then `print $r.stdout` and `print $r.stderr`, branch on `$r.exit_code`, raise `error make` with stderr attached on failure. Acceptable tradeoff: output appears in one chunk after CLI exits (~1s, negligible). **Module to refactor: `tools/modules/services/service_whiteboard.nu`.** Verb-to-CLI mapping: | Nu verb | New body | |---|---| | `service_whiteboard install` | unchanged — cargo build + copy (CLI is one of the three binaries built) | | `service_whiteboard start [--root]` | `if $root { svc_require_sudo }; install --root=$root --update=$update --reset=$reset; svc_cli_start "hero_whiteboard" $root` | | `service_whiteboard stop [--root]` | `if $root { svc_require_sudo }; if not (svc_proc_healthy $root) { warn+return }; svc_cli_stop "hero_whiteboard" $root` | | `service_whiteboard status [--root]` | unchanged | **What stays:** - Constants `SVX_SERVICE_NAME`, `SVX_FORGE_LOC`, `SVX_BINARIES`, `SVX_ACTIONS`. - `install` (unchanged — still `svc_update` + `svc_bins_ok` short-circuit + `svc_cargo_install`). - `status` (unchanged). - `use ../clients/proc.nu *` and `use ./lib.nu *` imports. **What goes:** - `svx_server_action`, `svx_ui_action`, `svx_service_config`, `svx_drop_registration` — all four def blocks. The CLI's `build_service_definition()` owns this spec. - The "already running → skip" idempotency pre-check in `start` (CLI's `restart_service` is idempotent by design). - The rich 13-line summary block at the tail of `start` (CLI prints its own success line). - The `svc_require_proc` call inside `start` — CLI does its own hero_proc reachability check. `status` keeps its `svc_require_proc`. ### Files to Modify Two files: 1. `tools/modules/services/lib.nu` — append `svc_cli_start` and `svc_cli_stop` in a new section after `svc_cargo_install`, matching the existing section banner style. 2. `tools/modules/services/service_whiteboard.nu` — trim from 317 lines to ~100. No other files are touched. `mod.nu` no edit (module already exported). Other `service_*.nu` modules no edit. ### Implementation Plan #### Step 1: Design `svc_cli_start` / `svc_cli_stop` Both helpers have an identical body modulo the flag. Shape: ```nu export def svc_cli_start [cli_name: string, root: bool] { let bin = (svc_bin $cli_name $root) # 1. executable preflight let bin_ok = if (svc_need_sudo $root) { (do { ^sudo test -x $bin } | complete).exit_code == 0 } else { ($bin | path exists) } if not $bin_ok { error make {msg: $"($cli_name) CLI not found or not executable at ($bin). Run install first."} } # 2. execute let r = if $root { if (is-admin) { with-env { HOME: "/root" } { ^$bin --start } | complete } else { ^sudo -E HOME=/root $bin --start | complete } } else { ^$bin --start | complete } # 3. surface output + raise on failure if ($r.stdout | str length) > 0 { print $r.stdout } if ($r.stderr | str length) > 0 { print -e $r.stderr } if $r.exit_code != 0 { error make {msg: $"($cli_name) --start failed (exit ($r.exit_code))"} } } ``` `svc_cli_stop` is identical with `--stop` in place of `--start` and the error message string adjusted. #### Step 2: Add helpers to `lib.nu` - Append after `svc_cargo_install` (current end of file). - Header comment block explaining: resolves via `svc_bin`; `is-admin` → bare with `HOME=/root`; non-root invoker with `--root` → `sudo -E HOME=/root`; errors raised with CLI stderr surfaced first. - Update the "What's exposed" comment at the top of `lib.nu` to add a `Selfstart CLI delegators — svc_cli_start, svc_cli_stop` line. #### Step 3: Trim `service_whiteboard.nu` header - Keep the 2-binary summary, socket paths block, hero_router line. - Replace the usage block with a shorter delegator-aware version. Add an explicit line: > The hero_proc action spec is owned by the Rust CLI's `build_service_definition()` in `lhumina_code/hero_whiteboard/crates/hero_whiteboard/src/main.rs`. `start` / `stop` delegate to `hero_whiteboard --start` / `hero_whiteboard --stop`; `install` and `status` stay local. - Target header length: ~25 lines (down from 40). #### Step 4: Delete removed helpers Remove in order (keeps diff readable): 1. Section banner + both `def svx_server_action` / `def svx_ui_action` blocks. 2. `def svx_service_config`. 3. `def svx_drop_registration`. Keep the Constants banner and the four `const SVX_*` lines. #### Step 5: Rewrite `start` ```nu # =========================================================================== # start — delegate to `hero_whiteboard --start`. # =========================================================================== # # `hero_whiteboard --start` calls hero_proc_sdk::lifecycle::restart_service, # which unregisters+re-registers the service idempotently. The action spec # lives in build_service_definition() in the Rust CLI — do NOT duplicate it # here. # # --reset semantic shift: previously meant "re-register even if running"; # with the CLI owning (idempotent) re-registration, --reset now effectively # means "force rebuild before restart" (forwarded to install). # # Examples: # service_whiteboard start # service_whiteboard start --reset # service_whiteboard start --root export def start [ --reset # Force rebuild before restart (passed through to install) --root(-r) # Manage root's hero_proc instance (requires sudo) --update(-u) # Pull newest code (via `forge merge`) before building ] { if $root { svc_require_sudo } install --root=$root --update=$update --reset=$reset svc_cli_start "hero_whiteboard" $root } ``` #### Step 6: Rewrite `stop` ```nu # =========================================================================== # stop — delegate to `hero_whiteboard --stop`. # =========================================================================== export def stop [ --root(-r) # Stop root's hero_whiteboard registration (requires sudo) ] { if $root { svc_require_sudo } if not (svc_proc_healthy $root) { let flag = if $root { " --root" } else { "" } print "⚠ hero_proc is not running — nothing to stop." print $" Bring hero_proc back up with: service_proc start($flag)" return } svc_cli_stop "hero_whiteboard" $root } ``` #### Step 7: Leave `install` and `status` alone - `install` byte-identical. - `status` byte-identical — we keep `proc service status` because it returns a structured record that callers pipe into `where state == "running"`, which `hero_whiteboard --status` doesn't expose. Expected post-refactor LOC: ~100 lines (down from 317). ### Smoke Test Plan (Hetzner, `--root`) 1. **Install parity** — `service_whiteboard install --root` — verify binaries present, no behavioural change vs. pre-refactor. 2. **Start cold** — after stop, `service_whiteboard start --root` — CLI's `hero_whiteboard started successfully` line prints; `proc service status` shows `running`; both sockets exist; `curl --unix-socket ui.sock /health` → 200; server OpenRPC health OK. 3. **Start warm / idempotent** — re-run without flags; CLI's `restart_service` handles clean re-register; still `running`. 4. **Status parity** — `service_whiteboard status --root` — record shape byte-identical to baseline. 5. **Stop** — `service_whiteboard stop --root` — CLI's `hero_whiteboard stopped successfully` prints; post-stop status returns not-found. 6. **Stop with hero_proc down** — `⚠ hero_proc is not running` branch fires; helper never called. 7. **Error surface** — `sudo chmod -x hero_whiteboard CLI`, run `start --root` → preflight raises with actionable message. Restore chmod. 8. **Output pass-through** — visually confirm CLI's stdout/stderr surfaces cleanly; no nu-side header obscures it. Exit criteria: all eight cases pass; LOC drop ~200 lines; user-visible output dominated by the CLI's own messages. ### Acceptance Criteria (from #100, pilot portion) - [ ] `lib.nu` exports `svc_cli_start` and `svc_cli_stop` with the contract in §Requirements. - [ ] `service_whiteboard.nu` no longer contains `svx_server_action`, `svx_ui_action`, `svx_service_config`, or `svx_drop_registration`. - [ ] `start` body is `svc_require_sudo` (conditional) → `install` → `svc_cli_start`. - [ ] `stop` body is `svc_require_sudo` (conditional) → hero_proc-down warning → `svc_cli_stop`. - [ ] `install` byte-identical to pre-refactor. - [ ] `status` byte-identical to pre-refactor. - [ ] Line count drops from 317 to under 110. - [ ] Full Hetzner `--root` smoke matrix passes (all eight cases). - [ ] CLI's stdout/stderr surfaces on both success and failure paths. ### Notes - **`--reset` semantic shift**: pre-refactor meant "re-register even if already running". CLI's `restart_service` is idempotent, so `--reset` now functionally means "force rebuild before restart". Documented in the `start` doc-comment. - **Why no generic `svc_cli_run [verb]` yet**: two specific helpers keep call sites (`svc_cli_start "hero_whiteboard" $root`) self-documenting and let us evolve `--start` vs. `--stop` divergence without touching callers. Revisit after PR B if a second shape wants sharing (e.g. `--status`). - **Why `status` stays unchanged**: `hero_whiteboard --status` doesn't exist today, and `proc service status` returns a structured nu record that smoke tests and other skills pipe into `where` / `get state`. Replacing it with a CLI invocation would regress the data shape. - **Why we drop `svc_require_proc` from `start` but keep it in `status`**: `start` now delegates to the CLI, which does its own `hero_proc_factory().await?` check inside `self_start` and returns a clear error. Two redundant preflights muddy the error output. `status` uses the nu client directly and benefits from the cleaner nu-side message. - **Why PR A touches only `service_whiteboard.nu`**: canonical-shape review is the goal; spreading across 12 files would make review harder and delay rollback if we picked the wrong helper shape. PR B applies the same transform in bulk after the helpers are validated. ### Critical Files for Implementation - `tools/modules/services/lib.nu` (new helpers) - `tools/modules/services/service_whiteboard.nu` (pilot refactor) - `lhumina_code/hero_whiteboard/crates/hero_whiteboard/src/main.rs` (reference — canonical action spec) - `tools/modules/services/service_db.nu` (reference — `install` flag shape with `--reset`)
Author
Owner

Pilot (PR A) implementation summary

Changes

  • tools/modules/services/lib.nu (+78 lines): added svc_cli_start / svc_cli_stop with a private svc_cli_invoke dispatcher that handles three --root cases:
    • --root=false → bare exec, invoker's env
    • --root=true + invoker is root → with-env { HOME: "/root" } { ... } bare
    • --root=true + non-root invoker → sudo -E HOME=/root
      CLI stdout/stderr surfaced via | complete + reprint; non-zero exit raises error make with stderr attached.
  • tools/modules/services/service_whiteboard.nu (317 → 85 lines, −232): removed svx_server_action, svx_ui_action, svx_service_config, svx_drop_registration, the "already running → skip" idempotency precheck in start, the 13-line summary block, and the now-redundant svc_require_proc call in start (the CLI surfaces its own). install and status byte-identical to pre-refactor.

Hetzner smoke test (--root)

# Assertion Result
1 install short-circuits via svc_bins_ok ("binaries already in place — skipping build") PASS
2 start cold — CLI's output (hero_whiteboard started (pid: ...) + hero_whiteboard started successfully) flows through PASS
3 Both Unix sockets present after start; curl --unix-socket ui.sock /health → HTTP 200 PASS
4 status returns structured record (name, state: running, pid, current_run_id: 22, restarts: 0) PASS
5 start warm (no flags) — CLI's restart_service re-registers idempotently (new pid) PASS
6 stop — CLI's hero_whiteboard stopped successfully surfaces PASS
7 Post-stop status returns state: exited, pid: 0 (see note below) PASS
8 stop with hero_proc down — ⚠ hero_proc is not running — nothing to stop branch fires PASS

Behavioural note — post-stop status

Pre-refactor the module called proc service delete after the proc service stop in its svx_drop_registration helper; subsequent status returned "service not found". The CLI's stop_service only stops — it leaves the service record in exited state. Net effect is positive: hero_proc retains the history row (useful for restart counts and audit logs), and any downstream skill that piped status through | get state continues to work — they get exited instead of an RPC error. Callers that explicitly test for "not found" would need to update; a codebase search shows none in hero_skills.

Path to PR B

Once PR A is approved and merged, the 11 remaining non-blocked modules can be mass-applied in a single follow-up PR using the same transform:

  • service_os, service_books, service_collab, service_biz, service_db, service_indexer, service_voice, service_aibroker, service_foundry
  • service_agent, service_claude, service_compute, service_shrimp — the four yet-to-be-created modules land directly as delegators (no pre-refactor state).

service_office and service_osis remain blocked on hero_office#2 and hero_osis#32 respectively.

Acceptance criteria (#100, pilot portion)

  • lib.nu exports svc_cli_start and svc_cli_stop with the documented dispatch contract.
  • service_whiteboard.nu no longer contains svx_server_action, svx_ui_action, svx_service_config, or svx_drop_registration.
  • start body reduced to svc_require_sudoinstallsvc_cli_start.
  • stop body reduced to svc_require_sudo → hero_proc-down guard → svc_cli_stop.
  • install byte-identical to pre-refactor.
  • status byte-identical to pre-refactor.
  • LOC: 317 → 85 (target was under 110).
  • Full Hetzner --root smoke matrix passes.
  • CLI's stdout/stderr surfaces cleanly on both success and failure paths.
## Pilot (PR A) implementation summary ### Changes - **`tools/modules/services/lib.nu`** (+78 lines): added `svc_cli_start` / `svc_cli_stop` with a private `svc_cli_invoke` dispatcher that handles three `--root` cases: - `--root=false` → bare exec, invoker's env - `--root=true` + invoker is root → `with-env { HOME: "/root" } { ... }` bare - `--root=true` + non-root invoker → `sudo -E HOME=/root` CLI stdout/stderr surfaced via `| complete` + reprint; non-zero exit raises `error make` with stderr attached. - **`tools/modules/services/service_whiteboard.nu`** (317 → 85 lines, −232): removed `svx_server_action`, `svx_ui_action`, `svx_service_config`, `svx_drop_registration`, the "already running → skip" idempotency precheck in `start`, the 13-line summary block, and the now-redundant `svc_require_proc` call in `start` (the CLI surfaces its own). `install` and `status` byte-identical to pre-refactor. ### Hetzner smoke test (`--root`) | # | Assertion | Result | |---|---|---| | 1 | `install` short-circuits via `svc_bins_ok` ("binaries already in place — skipping build") | PASS | | 2 | `start` cold — CLI's output (`hero_whiteboard started (pid: ...)` + `hero_whiteboard started successfully`) flows through | PASS | | 3 | Both Unix sockets present after start; `curl --unix-socket ui.sock /health` → HTTP 200 | PASS | | 4 | `status` returns structured record (`name`, `state: running`, `pid`, `current_run_id: 22`, `restarts: 0`) | PASS | | 5 | `start` warm (no flags) — CLI's `restart_service` re-registers idempotently (new pid) | PASS | | 6 | `stop` — CLI's `hero_whiteboard stopped successfully` surfaces | PASS | | 7 | Post-stop `status` returns `state: exited, pid: 0` (see note below) | PASS | | 8 | `stop` with hero_proc down — `⚠ hero_proc is not running — nothing to stop` branch fires | PASS | ### Behavioural note — post-stop `status` Pre-refactor the module called `proc service delete` after the `proc service stop` in its `svx_drop_registration` helper; subsequent `status` returned "service not found". The CLI's `stop_service` only stops — it leaves the service record in `exited` state. Net effect is positive: hero_proc retains the history row (useful for restart counts and audit logs), and any downstream skill that piped `status` through `| get state` continues to work — they get `exited` instead of an RPC error. Callers that explicitly test for "not found" would need to update; a codebase search shows none in `hero_skills`. ### Path to PR B Once PR A is approved and merged, the 11 remaining non-blocked modules can be mass-applied in a single follow-up PR using the same transform: - `service_os`, `service_books`, `service_collab`, `service_biz`, `service_db`, `service_indexer`, `service_voice`, `service_aibroker`, `service_foundry` - `service_agent`, `service_claude`, `service_compute`, `service_shrimp` — the four yet-to-be-created modules land directly as delegators (no pre-refactor state). `service_office` and `service_osis` remain blocked on hero_office#2 and hero_osis#32 respectively. ### Acceptance criteria (#100, pilot portion) - [x] `lib.nu` exports `svc_cli_start` and `svc_cli_stop` with the documented dispatch contract. - [x] `service_whiteboard.nu` no longer contains `svx_server_action`, `svx_ui_action`, `svx_service_config`, or `svx_drop_registration`. - [x] `start` body reduced to `svc_require_sudo` → `install` → `svc_cli_start`. - [x] `stop` body reduced to `svc_require_sudo` → hero_proc-down guard → `svc_cli_stop`. - [x] `install` byte-identical to pre-refactor. - [x] `status` byte-identical to pre-refactor. - [x] LOC: 317 → 85 (target was under 110). - [x] Full Hetzner `--root` smoke matrix passes. - [x] CLI's stdout/stderr surfaces cleanly on both success and failure paths.
mahmoud self-assigned this 2026-04-21 08:30:26 +00:00
mahmoud added this to the ACTIVE project 2026-04-21 08:30:28 +00:00
mahmoud added this to the now milestone 2026-04-21 08:30:32 +00:00
Author
Owner

Pilot PR A opened: #101

PR B (mass apply to the 11 remaining non-blocked modules) will follow once this is approved and merged.

Pilot PR A opened: https://forge.ourworld.tf/lhumina_code/hero_skills/pulls/101 PR B (mass apply to the 11 remaining non-blocked modules) will follow once this is approved and merged.
Author
Owner

Closing — direction reversed, superseded by a new issue capturing the actual target architecture.

Original proposal: refactor nu modules to delegators over hero_<svc> --start/--stop CLI flags. Rust CLI as source of truth.

Actual direction (per @despiegk): the opposite. --start/--stop should be removed from every hero_* binary — they add no value beyond what the nu layer does. Nushell (specifically service_*.nu + lib.nu helpers) owns full lifecycle and will replace the per-repo Makefiles.

The 5df145f Refactor service lifecycle management into shared lib.nu helpers commit on development already moved in this direction (~1,700 lines of duplication removed; lib.nu helpers dispatch directly via hero_proc_sdk / proc service set + proc service start).

Closing in favour of # which scopes the removal of --start/--stop from the binaries themselves. Pilot PR #101 also closed.

Closing — direction reversed, superseded by a new issue capturing the actual target architecture. **Original proposal**: refactor nu modules to delegators over `hero_<svc> --start`/`--stop` CLI flags. Rust CLI as source of truth. **Actual direction** (per @despiegk): the opposite. `--start`/`--stop` should be removed from every `hero_*` binary — they add no value beyond what the nu layer does. Nushell (specifically `service_*.nu` + `lib.nu` helpers) owns full lifecycle and will replace the per-repo Makefiles. The `5df145f Refactor service lifecycle management into shared lib.nu helpers` commit on `development` already moved in this direction (`~1,700` lines of duplication removed; `lib.nu` helpers dispatch directly via `hero_proc_sdk` / `proc service set` + `proc service start`). Closing in favour of #<TBD> which scopes the removal of `--start`/`--stop` from the binaries themselves. Pilot PR #101 also closed.
Author
Owner

Successor: #102refactor: remove --start/--stop from hero_* binaries; nushell owns full lifecycle.

Successor: #102 — `refactor: remove --start/--stop from hero_* binaries; nushell owns full lifecycle`.
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
lhumina_code/hero_skills#100
No description provided.