service_db.nu — hero_db server + UI lifecycle module #87
Reference: `lhumina_code/hero_skills#87`
Child of #75.
## Objective

Add `tools/modules/services/service_db.nu` implementing the standard `install | start | stop | status` lifecycle for the hero_db service (server + UI).

## Scope

- Repo: `ssh://git@forge.ourworld.tf/lhumina_code/hero_db.git`
- Binaries (per `buildenv.sh`): `hero_db`, `hero_db_server`, `hero_db_ui`
- Runtime actions: `hero_db_server`, `hero_db_ui`
- Service TOML: `lhumina_code/hero_zero/services/hero_db.toml`
- Sockets:
  - `$HERO_SOCKET_DIR/hero_db/rpc.sock` — HTTP/1.1 OpenRPC management (server)
  - `$HERO_SOCKET_DIR/hero_db/resp.sock` — RESP2 Unix socket (server)
  - `$HERO_SOCKET_DIR/hero_db/ui.sock` — UI admin
- TCP: `0.0.0.0:6378` for RESP2 (unprivileged port — no root required); the `HERO_DB_PORT` env var overrides it.
- Env: `RUST_LOG=info` for both actions — the TOML declares no other env. `HERO_DB_DATA_DIR` defaults to `~/.hero_db` (server-managed, no preflight needed).
- Workspace: root `Cargo.toml:1-10` is `[workspace]`-only.
- Server invoked as the bare binary `hero_db_server`; UI invoked as `hero_db_ui serve`. Matches the `service_books.nu` pattern for the UI.
- `--root` flag supported but optional; user-level default.
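For orientation, a sketch of the shape `hero_db.toml` plausibly takes, reconstructed from the facts above — the `exec` values, socket paths, and `RUST_LOG` are stated in this issue, but the table and key names are assumptions, not the real file:

```toml
# Hypothetical sketch of lhumina_code/hero_zero/services/hero_db.toml.
# Only the exec strings and RUST_LOG value below are stated in this issue;
# the [service]/[[action]] layout and key names are illustrative.
[service]
name = "hero_db"

[[action]]
name = "hero_db_server"
exec = "__HERO_BIN__/hero_db_server"        # bare binary, no subcommand
env  = { RUST_LOG = "info" }                # the only env the TOML declares

[[action]]
name = "hero_db_ui"
exec = "__HERO_BIN__/hero_db_ui serve"      # serve subcommand, kept verbatim
env  = { RUST_LOG = "info" }
```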
## Acceptance criteria

- `use services/mod.nu *` makes `service_db` available.
- `service_db install [--root] [--update]` clones `lhumina_code/hero_db`, builds all 3 binaries in release mode, installs to `~/hero/bin/` (or `/root/hero/bin/` with `--root`).
- `service_db start [--reset] [--root] [--update]` registers both runtime actions plus the service, starts it, and prints all three socket paths plus the TCP port in the summary. Idempotent without `--reset`.
- `service_db status [--root]` reports state.
- `service_db stop [--root]` cleanly unregisters.
## Template & references

- `service_whiteboard.nu` (PR #83) / `service_collab.nu` (PR #85) — virtual-workspace baseline.
- `service_books.nu` (PR #81) — reference for the `serve` subcommand pattern on the UI (used verbatim here).
- Check whether the server action needs `kill_other.port: [6378]` to catch stale TCP binds on re-register (it probably does, given the TCP listener).
## Expected deviations from the baseline template

- Server `kill_other` must cover THREE artifacts: `rpc.sock`, `resp.sock`, and TCP port `6378`.
- UI `script` uses the `serve` subcommand (per the TOML). Server `script` is the bare binary.
- Everything else follows `service_whiteboard.nu` verbatim.
# Implementation Spec for Issue #87

## Objective

Add a `service_db` Nushell lifecycle module that registers, starts, stops, and queries the status of the `hero_db` service under hero_proc. The module supervises two binaries — `hero_db_server` (OpenRPC + RESP2 store) and `hero_db_ui` (HTTP dashboard over a Unix socket) — and ships the `hero_db` CLI alongside them without registering it as an action. The shape follows the `service_whiteboard.nu` / `service_collab.nu` baseline (pure two-binary virtual workspace, no preflight) with two narrow deviations borrowed from `service_books.nu`: the UI `serve` subcommand, and an expanded `kill_other` on the server action so hero_proc can reclaim the RPC socket, the RESP2 Unix socket, and the RESP2 TCP port together.
## Requirements

- Service name `hero_db`, context `core`, class `system`, critical `false`.
- Repo `lhumina_code/hero_db`. Virtual workspace with members `crates/hero_db{,_server,_sdk,_ui,_app,_examples}`; no root `[package]`. Plain `cargo build --release` produces every bin target — use `svc_cargo_install` unmodified (no `--workspace` hand-roll like `service_books`).
- Binaries (`SVX_BINARIES`): `hero_db`, `hero_db_server`, `hero_db_ui` (matches `buildenv.sh`).
- Actions (`SVX_ACTIONS`): `hero_db_server`, `hero_db_ui`. The `hero_db` CLI is installed but not registered (whiteboard/collab convention).
- Server bind points (`hero_db_server/src/main.rs`):
  - `$HERO_SOCKET_DIR/hero_db/rpc.sock` — OpenRPC (hero_proc health-checks this)
  - `$HERO_SOCKET_DIR/hero_db/resp.sock` — RESP2 Unix
  - `0.0.0.0:6378` — RESP2 TCP (overridable via `HERO_DB_PORT`; not overridden here)
- UI bind point: `$HERO_SOCKET_DIR/hero_db/ui.sock`.
- Env: `RUST_LOG: "info"` only, on both actions. `HERO_DB_PORT`, `HERO_DB_DATA_DIR`, `HERO_DB_ENCRYPTION_KEY`, `REDIS_ADMIN_SECRET` are all left at server defaults (data dir `~/.hero_db`, no encryption, port 6378).
- No dependencies: `hero_db.toml` has no `depends_on`; both binaries pre-create the socket dir and unlink stale sockets before bind. No preflight helper warranted.
- Subcommands: `install`, `start [--reset --update]`, `stop`, `status`. All accept `--root (-r)`. Identical UX to `service_whiteboard` / `service_collab`.
## Files to Modify / Create

- `tools/modules/services/service_db.nu` — new; copy of `service_whiteboard.nu` with the two deviations below.
- `tools/modules/services/mod.nu` — add `export use service_db.nu`.
## Implementation Plan

Each step maps 1:1 onto a block of `service_whiteboard.nu`. Deviations are called out explicitly.
### Step 1: Header comment block (whiteboard lines 1–39)

Substitute every `hero_whiteboard` → `hero_db`. Rewrite the functional description to:

- `hero_db_server` — encrypted Redis-backed store exposing OpenRPC (HTTP/1.1 over Unix) plus RESP2 on both a Unix socket and TCP 6378.
- `hero_db_ui` — HTTP dashboard over a Unix socket.

Explicitly list all three server bind points so an operator reading the header understands why `kill_other` lists extras. Keep the "No external dependencies / both binaries remove stale sockets before bind" paragraph verbatim. Keep the CLI note: "`hero_db` is the CLI; it is shipped alongside the runtime binaries but is NOT registered as a hero_proc action."
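A sketch of how that header block might read — the comment text is illustrative, not the actual module:

```nu
# service_db.nu — hero_db server + UI lifecycle (install | start | stop | status)
#
# hero_db_server binds THREE artifacts in one process:
#   $HERO_SOCKET_DIR/hero_db/rpc.sock   — OpenRPC (health-checked by hero_proc)
#   $HERO_SOCKET_DIR/hero_db/resp.sock  — RESP2 Unix socket
#   0.0.0.0:6378                        — RESP2 TCP (HERO_DB_PORT overrides)
# kill_other on the server action must reclaim all three on restart.
#
# hero_db is the CLI; it is shipped alongside the runtime binaries but is
# NOT registered as a hero_proc action.
```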
### Step 2: Imports

Identical to whiteboard: `use ../clients/proc.nu *` plus `use ./lib.nu *`. Do NOT import `forge.nu` — unlike `service_books`, hero_db uses the standard `svc_cargo_install` path.

### Step 3: Constants

Same shape as the whiteboard constants, renamed for hero_db (`SVX_SERVICE_NAME`, `SVX_BINARIES`, `SVX_ACTIONS`).
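A minimal sketch of that constants block, assuming the whiteboard shape and the `SVX_*` names used elsewhere in this spec (values taken from the Requirements section):

```nu
# Sketch only — mirrors the whiteboard-style constant block.
const SVX_SERVICE_NAME = "hero_db"
const SVX_BINARIES = ["hero_db", "hero_db_server", "hero_db_ui"]  # installed to bin dir
const SVX_ACTIONS  = ["hero_db_server", "hero_db_ui"]             # registered with hero_proc
```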
### Step 4: `svx_server_action` — DEVIATION #1

Copy whiteboard's server action. Keep `script: $bin` (the TOML uses the bare binary), `env: {RUST_LOG: "info"}`, the retry policy, the stop signal/timeout, and the health check unchanged.

Only change: `kill_other` must cover all three artifacts the binary binds. Replace the single-socket list with entries for `rpc.sock`, `resp.sock`, and TCP port `6378`.

Rationale: on restart, hero_proc must reclaim any of the three listeners that may be stuck (stale process, TIME_WAIT port, orphaned socket inode) before the fresh `hero_db_server` can bind. Port `6378` is a literal because `HERO_DB_PORT` is not overridden here — if a future operator sets it, the action spec must be updated alongside. `health_checks` stays pinned to `rpc.sock`, which is the only endpoint the hero_proc OpenRPC probe speaks.
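A sketch of the expanded `kill_other` list, assuming whiteboard-style `socket`/`port` keys — the field names are assumptions; only the three artifacts themselves are stated in the spec:

```nu
# Sketch only — key names assumed from the whiteboard pattern.
kill_other: {
    socket: [
        $"($socket_dir)/hero_db/rpc.sock"    # OpenRPC management
        $"($socket_dir)/hero_db/resp.sock"   # RESP2 Unix
    ]
    port: [6378]                             # RESP2 TCP (HERO_DB_PORT default)
}
```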
### Step 5: `svx_ui_action` — DEVIATION #2

Copy whiteboard's UI action. Keep the retry policy, env, timeouts, `kill_other` (single socket: `ui.sock`), and `health_checks` (pinned to `ui.sock`) unchanged.

Only change: the invocation is `script: $"($bin) serve"`. This mirrors `hero_db.toml`'s `exec = "__HERO_BIN__/hero_db_ui serve"` verbatim. `hero_db_ui` has no clap and ignores the argument at runtime, but we stay faithful to the published TOML contract — same rationale as `service_books.nu`.
### Step 6: `svx_service_config`

Identical structure. Description: "Hero DB — encrypted Redis-backed store with graph/vector/stream/ontology APIs and dashboard".
### Step 7: `svx_drop_registration`

Identical to whiteboard — stop the service, delete the service, delete each action, all wrapped in `try { ... } catch { }`.
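A sketch of that teardown shape. The `proc service ...` / `proc action ...` command names are hypothetical stand-ins for whatever `proc.nu` actually exposes; only the stop/delete/try-catch structure is stated in the spec:

```nu
# Sketch only — client command names illustrative, not the actual proc.nu API.
def svx_drop_registration [root: bool] {
    try { proc service stop $SVX_SERVICE_NAME --root=$root } catch { }
    try { proc service delete $SVX_SERVICE_NAME --root=$root } catch { }
    for action in $SVX_ACTIONS {
        try { proc action delete $action --root=$root } catch { }
    }
}
```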
### Step 8: `install`

Copy whiteboard verbatim. `hero_db` is a pure virtual workspace — `cargo build --release` builds every bin target in one pass, and `svc_cargo_install`'s release-dir preflight catches any misnamed binary before copy. Do NOT add the `--workspace` pre-step from `service_books`.
### Step 9: `start`

Copy whiteboard verbatim, substituting names. No embedder preflight (books-only).

Update the final summary block to reflect the extras so operators can probe them directly. The `resp sock` and `resp tcp` lines are informational — they're additional bind points on the server process, not separate hero_proc actions.
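A sketch of what that summary block might print, assuming whiteboard-style output — the labels and `$socket_dir` variable are illustrative:

```nu
# Sketch only — layout and variable names illustrative.
print $"  rpc sock:  ($socket_dir)/hero_db/rpc.sock"
print $"  resp sock: ($socket_dir)/hero_db/resp.sock"   # informational extra
print $"  resp tcp:  127.0.0.1:6378"                    # informational extra
print $"  ui sock:   ($socket_dir)/hero_db/ui.sock"
```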
### Step 10: `stop`

Identical to whiteboard. `svc_proc_healthy` guard, then `svx_drop_registration`. No service-specific cleanup.

### Step 11: `status`

Identical to whiteboard. `svc_require_proc "service_db" $root`, then `proc service status $SVX_SERVICE_NAME --root=$root`.
### Step 12: `mod.nu`

Append `export use service_db.nu`. Following the existing merge-order convention in the file (no alphabetical enforcement), append it as a new line after `service_collab.nu`.
## Smoke Test Plan (Hetzner, `--root`)

After `service_proc start --root`, probe every endpoint. Expected: all four probes (rpc.sock, ui.sock, resp.sock, TCP 6378) succeed on initial start AND after `--reset`. Stop leaves no hero_db service / action entries in hero_proc.
## Acceptance Criteria

- `service_db install [--root]` clones `lhumina_code/hero_db`, runs `cargo build --release`, copies `hero_db`, `hero_db_server`, `hero_db_ui` into the correct bin dir.
- `service_db start [--root]` brings up the service with both actions registered and health checks passing on `rpc.sock` and `ui.sock`; idempotent without `--reset`.
- `service_db start --reset [--root]` drops the prior registration and restarts cleanly even when stale sockets / TCP listeners are present.
- `service_db stop [--root]` removes the service plus both actions from hero_proc; re-running it is a safe no-op.
- `service_db status [--root]` returns the hero_proc status record when up; clean actionable error via `svc_require_proc` when hero_proc is down.
- `--root` routes through root's hero_proc socket, uses `/root/hero/bin` + `/root/hero/var/sockets`, and validates passwordless sudo up front.
- Idempotent without `--reset`.
- `mod.nu` exports `service_db`.
## Notes

- Virtual workspace (no root `[package]`), so plain `cargo build --release` at the root `Cargo.toml` builds every bin target. No `--workspace` accommodation needed. `install` is a verbatim copy of whiteboard's, not books'.
- `serve` subcommand: `hero_db_ui` has no clap — the argument is ignored at runtime. We keep it in `script` anyway because the published `hero_zero/services/hero_db.toml` includes it, and this module's job is to mirror that contract. If `hero_db_ui` later grows a real CLI, the TOML (and this module) already match.
- TCP 6378 + `resp.sock` as extras: `hero_db_server` binds three artifacts in one process. They are not separate hero_proc actions — one action (`hero_db_server`), one health probe (`rpc.sock`). The RESP2 listeners ride on the same process lifecycle. The expanded `kill_other` list is the hero_proc-side reclaim mechanism that keeps restarts clean.
- Unlike `service_books`, hero_db has no soft dependency and no `depends_on`. Both binaries `create_dir_all` the per-service socket dir and unlink stale sockets before bind. A preflight helper here would be dead code.
- `HERO_DB_PORT` is left unset so the default (6378) matches the literal in `kill_other.port`. Any future override must update both places together — the `kill_other.port` list is the canonical enforcement point on restart.
- The `serve`-subcommand + multi-socket `kill_other` pattern is specific to hero_db here. Defer any refactor into `lib.nu` until a second service confirms which pattern is the outlier.
# Implementation summary

## Changes

- `tools/modules/services/service_db.nu` — ~330 lines, a copy-rename of `service_whiteboard.nu` with the two spec-approved deviations applied (expanded server `kill_other` covering `rpc.sock` + `resp.sock` + TCP 6378, and UI `script: "<bin> serve"` mirroring the TOML).
- `tools/modules/services/mod.nu` — added `export use service_db.nu`.
## End-to-end smoke test on Hetzner

- `service_db status --root` with hero_proc down → actionable error pointing to `service_proc start --root`
- `service_db stop --root` with hero_proc down → benign warning, no error
- `service_db start --root` with hero_proc down → actionable error
- `service_proc start --root` → healthy
- `service_db install --root` produced 3 binaries in `/root/hero/bin/`
- `service_db start --reset --root` registers + starts; sockets created by `hero_db_server`:
  - `rpc.sock` present (OpenRPC)
  - `resp.sock` present (RESP2 Unix)
  - `ui.sock` present (UI)
- `curl --unix-socket rpc.sock` → `rpc.discover` returns the OpenRPC doc
- `curl --unix-socket ui.sock /` → HTTP 200 (198k body)
- `redis-cli -s resp.sock ping` → PONG
- `redis-cli -h 127.0.0.1 -p 6378 ping` → PONG
- `status` → `{name: hero_db, state: running, restarts: 0, pid: 3597906}`
- Second `start` (no `--reset`) prints "already running"; `current_run_id` stable at 13, `restarts: 0`, state `running`
- `start --reset --root` while running — all 5 probes (rpc/resp/ui/resp-ping/tcp-ping) pass again after restart, proving `kill_other` reclaimed all three server bind points
- `service_db stop --root` stops + unregisters; `status` returns the expected `service 'hero_db' not found`; no `hero_db_server` / `hero_db_ui` processes remain, TCP 6378 released, socket files removed (`/root/hero/var/sockets/hero_db/` empty)
## Deviations from baseline template — confirmed behaving as intended

- Expanded server `kill_other` on rpc.sock + resp.sock + port 6378: verified end-to-end — running a `--reset` restart against a live service reclaimed every bind point cleanly. No stale-listener / EADDRINUSE on the new process.
- UI `script: "<bin> serve"`: the `serve` argument is ignored by `hero_db_ui`'s main (no clap), but the module stays faithful to `hero_zero/services/hero_db.toml`'s `exec` line. Confirmed the UI starts and serves correctly.
## Observation: better than whiteboard on shutdown

Both `hero_db_server` and `hero_db_ui` clean up their Unix sockets on SIGTERM — after `stop`, the per-service socket directory is empty. No stale inodes are left behind (unlike `hero_whiteboard`, which required `kill_other.socket` cleanup on the next start).
## Acceptance criteria

- Module importable via `use services/mod.nu *` or `use services/service_db.nu *`.
- `install` builds 3 binaries and places them in `~/hero/bin/` (or `/root/hero/bin/` with `--root`).
- `start` registers both actions plus the service, starts it, and surfaces all four endpoints (rpc.sock, resp.sock, resp tcp, ui.sock) in the summary.
- `start --reset` tears down and reclaims all three server bind points cleanly.
- `status` reports the hero_proc record.
- `stop` cleanly unregisters.
- `--root` works end-to-end with passwordless sudo.
- Idempotent without `--reset`.
- `mod.nu` exports `service_db`.

PR opened: #88