Service lifecycle alignment — lab is the only bootstrap (cargo + benches + nu shell out) #124

Closed
opened 2026-05-22 07:26:36 +00:00 by timur · 6 comments
Owner

Service lifecycle alignment — one bootstrap, one test verb (lab is the only entry)

⚠️ Scope has been refined twice. The first framing was just the
MultiDomainBuilder extraction. The second framing added a
lab_fixture library that wrapped the builder for cargo tests.
This is the third and final framing: drop lab_fixture entirely.
Cargo tests, criterion benches, and nushell scripts all invoke
lab as a subprocess. There is exactly one bootstrap codepath in
the entire stack — lab's. If you have an in-progress branch from
either earlier framing, salvage what fits; the end-state below is
what ships.

End state — one bootstrap, period

                 lab service <name> --start [--ephemeral]
                                │
                                │ (the only path that knows how to
                                │  read service.toml, walk domains,
                                │  build the RpcModule, serve_http)
                                ▼
                  ┌──────────────────────────────┐
                  │  hero_proc-managed instance  │  ← production
                  │       or                     │
                  │  ephemeral process tree      │  ← --ephemeral mode
                  │  (unique /tmp socket,        │
                  │   tempdir, --json pid+path)  │
                  └──────────────────────────────┘
                          ▲     ▲     ▲     ▲
                          │     │     │     │
       cargo `tests/`     │     │     │     │  nushell `tests/*.nu`
       criterion benches  │     │     │     │  hero_browser MCP suite
                          │     │     │     │
       all four consumers fork `lab service <name> --start --ephemeral`,
       parse the printed JSON to learn the socket path, connect, run,
       then call `lab service <name> --stop --pid <pid>` on teardown.

Scaffolded tests/src/lib.rs is not a pub use. It's a small
subprocess-driver file: a handful of lines that fork lab, parse the
JSON, and return a handle whose Drop calls --stop. No MultiDomainBuilder
import, no register_methods ladder, no per-domain cfg(feature = …)
block. Adding a domain to the schema means zero edits to
tests/src/lib.rs — the running lab knows about the new domain because
it reads service.toml.

Scaffolded crates/<name>_server/src/main.rs collapses too — it's the
MultiDomainBuilder::production() chain, but the builder itself stays
internal to hero_rpc_osis::rpc::bootstrap. No new public crate.

lab service <name> --test is the single test entry point —
contributors never type cargo test or nu tests/smoke.nu directly.

What lab grows

Three new sub-flags/verbs on the existing lab service <name> subcommand:

lab service <name> --start --ephemeral [--json]

Same code path as lab service <name> --start, but:

  • Picks a unique short socket path under /tmp/lab-<pid>-<n>/ (stays
    under macOS's ~104-byte sun_path limit even when $TMPDIR
    resolves to /private/var/folders/…).

  • Routes OSIS storage to a fresh tempdir, not $HERO_VAR_DIR.

  • Spawns the binary directly with Command::spawn; does not
    register with hero_proc. Ephemeral instances are owned by their
    parent process (the cargo test, the criterion bench, the nu
    script's wrapper), not the supervisor.

  • With --json, prints a single line of structured info to stdout
    that the parent parses:

    {"name":"hero_service","pid":12345,
     "rpc_socket":"/tmp/lab-9876-3/rpc.sock",
     "data_dir":"/tmp/lab-9876-3/db",
     "ready_at":"<rfc3339>"}
    

    Without --json, prints the same human-readable banner the
    non-ephemeral path prints.
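
The socket-path rule above can be sketched as follows. This is a minimal illustration, not lab's actual code: the helper name `ephemeral_socket_path`, the `rpc.sock` file name placement, and the per-process counter standing in for `<n>` are assumptions. It hard-codes `/tmp` rather than using `std::env::temp_dir()`, because the point of the rule is to avoid the long `$TMPDIR` expansion on macOS.

```rust
use std::path::PathBuf;
use std::sync::atomic::{AtomicU32, Ordering};

/// Conservative `sun_path` capacity in bytes, including the trailing NUL.
/// macOS allows 104; Linux allows 108, so 104 is safe on both.
const SUN_PATH_MAX: usize = 104;

/// Per-process counter standing in for the `<n>` path component (assumption).
static NEXT_INSTANCE: AtomicU32 = AtomicU32::new(0);

/// Hypothetical helper: builds `/tmp/lab-<pid>-<n>/rpc.sock` and rejects
/// any path that would not fit into `sun_path`.
fn ephemeral_socket_path(lab_pid: u32) -> Result<PathBuf, String> {
    let n = NEXT_INSTANCE.fetch_add(1, Ordering::Relaxed);
    let path = PathBuf::from(format!("/tmp/lab-{lab_pid}-{n}/rpc.sock"));
    // +1 accounts for the NUL terminator the kernel expects.
    let needed = path.as_os_str().len() + 1;
    if needed > SUN_PATH_MAX {
        return Err(format!(
            "socket path needs {needed} bytes, limit is {SUN_PATH_MAX}"
        ));
    }
    Ok(path)
}
```

Two calls with the same pid yield distinct paths because the counter advances on every call, which is what lets one parent process own several ephemeral instances at once.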

lab service <name> --stop --pid <pid>

Already mostly exists; needs a --pid flag that bypasses the
hero_proc lookup, SIGTERMs the named pid, and removes
/tmp/lab-<pid>-*/ if it owns it. The path without --pid is
unchanged: that's still the production --stop.
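
A rough sketch of the teardown behaviour, including the SIGTERM, brief wait, SIGKILL escalation and the idempotency requirement listed under Phase B. It shells out to the standard `kill(1)` utility to stay dependency-free; lab itself may well use a signal API instead. The function names `send_signal` and `stop_pid` are hypothetical, and removal of the `/tmp/lab-<pid>-*/` directory is left out.

```rust
use std::io;
use std::process::{Command, Stdio};
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Runs kill(1) with the given signal flag. Returns Ok(true) if the signal
/// was delivered and Ok(false) if kill(1) reported failure, which usually
/// means the process no longer exists (it can also mean "not permitted").
fn send_signal(pid: u32, flag: &str) -> io::Result<bool> {
    let pid = pid.to_string();
    let status = Command::new("kill")
        .args([flag, pid.as_str()])
        .stderr(Stdio::null())
        .status()?;
    Ok(status.success())
}

/// Hypothetical sketch of `--stop --pid`: SIGTERM, poll until `grace`
/// elapses, then SIGKILL. A pid that is already gone is treated as success.
fn stop_pid(pid: u32, grace: Duration) -> io::Result<()> {
    if !send_signal(pid, "-TERM")? {
        return Ok(()); // already dead: nothing to do
    }
    let deadline = Instant::now() + grace;
    while Instant::now() < deadline {
        // Signal 0 delivers nothing; it only probes whether the pid exists.
        if !send_signal(pid, "-0")? {
            return Ok(());
        }
        sleep(Duration::from_millis(50));
    }
    // Still present after the grace period: escalate.
    send_signal(pid, "-KILL")?;
    Ok(())
}
```

Treating a failed first signal as success is what makes a second call from a test's `Drop` harmless. One caveat: a child that has exited but has not been reaped by its parent still answers the signal-0 probe, so the loop can run the full grace period in that case.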

lab service <name> --test [layer]

Runs the five-layer pyramid. Each layer ends up shelling out to lab
the same way:

  • layer1 → cargo test --workspace. The scaffolded
    tests/src/lib.rs::spin_up_service itself shells out to
    lab service <name> --start --ephemeral --json.
  • layer2 → lab service <name> --start --ephemeral --json | jq -r .rpc_socket,
    then nu tests/smoke.nu --socket <path>, then lab service <name> --stop --pid <pid>.
  • layer3 → same shape, tests/api_integration.nu.
  • layer4 → same shape, each tests/e2e_<flow>.nu.
  • layer5 → invokes the hero_browser MCP suite under testcases/
    if the service ships one.
  • No layer specified → all five in order, fail fast on the first
    non-zero exit.

lab is the single test entry — cargo test, nu tests/*.nu,
hero_browser are layers underneath, hidden from the contributor.
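
The dispatch rule (one named layer, or all five in order with fail-fast) could look roughly like the sketch below. The `Layer` enum, `select_layers`, and `run_fail_fast` are illustrative names, not lab's real types; the per-layer command is abstracted as a closure returning an exit code so the control flow can be shown in isolation.

```rust
/// The five pyramid layers, in execution order (illustrative type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Layer {
    Layer1,
    Layer2,
    Layer3,
    Layer4,
    Layer5,
}

impl Layer {
    const ALL: [Layer; 5] = [
        Layer::Layer1,
        Layer::Layer2,
        Layer::Layer3,
        Layer::Layer4,
        Layer::Layer5,
    ];

    /// Parses the `[layer]` argument as written in the issue (`layer1`..`layer5`).
    fn parse(s: &str) -> Option<Layer> {
        match s {
            "layer1" => Some(Layer::Layer1),
            "layer2" => Some(Layer::Layer2),
            "layer3" => Some(Layer::Layer3),
            "layer4" => Some(Layer::Layer4),
            "layer5" => Some(Layer::Layer5),
            _ => None,
        }
    }
}

/// No argument selects all five layers; a named layer selects just that one.
fn select_layers(arg: Option<&str>) -> Result<Vec<Layer>, String> {
    match arg {
        None => Ok(Layer::ALL.to_vec()),
        Some(s) => Layer::parse(s)
            .map(|layer| vec![layer])
            .ok_or_else(|| format!("unknown layer: {s}")),
    }
}

/// Runs layers in order and stops at the first non-zero exit code,
/// reporting which layer failed and with what code.
fn run_fail_fast(
    layers: &[Layer],
    mut run: impl FnMut(Layer) -> i32,
) -> Result<(), (Layer, i32)> {
    for &layer in layers {
        let code = run(layer);
        if code != 0 {
            return Err((layer, code));
        }
    }
    Ok(())
}
```

Keeping the selection step separate from execution leaves room for the shorthands mentioned under Phase D (for example a `fast` alias) without touching the fail-fast loop.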

What does NOT change

  • The non-ephemeral lab service <name> --start path is untouched —
    hero_proc-supervised, $HERO_VAR_DIR-rooted, the production
    lifecycle.
  • hero_rpc_osis::rpc::bootstrap::run_for_test and the
    MultiDomainBuilder it gets refactored into are internal to
    hero_rpc_osis (and lab). No new public crate, no lab_fixture,
    no exported test helper.
  • tests_pyramid keeps its five layers and per-layer tools (cargo /
    nushell / hero_browser).

Concrete checklist

Phase A — MultiDomainBuilder (internal)

  • Add crates/osis/src/rpc/bootstrap.rs::MultiDomainBuilder
    with a fluent API:

    ```rust
    MultiDomainBuilder::production()
        .with_domain::<OsisCatalog>("catalog")
        .with_domain::<OsisBench>("bench")
        .spawn(socket_path, data_root).await?
    
    MultiDomainBuilder::for_ephemeral()
        .with_domain::<OsisCatalog>("catalog")
        .with_domain::<OsisBench>("bench")
        .spawn().await?
    ```
    
    Builder allocates per-domain subdirs, threads each handler
    through `<A as OsisDomainInit>::create(...)`, registers
    everything on one `RpcModule`. Internal to `hero_rpc_osis` —
    no need to ship it as a separately-consumed library.
    
  • Keep run_for_test<A>(...) as a thin compat wrapper over
    MultiDomainBuilder::for_ephemeral().with_domain::<A>(...)
    so the existing osis_benches harness keeps building during
    the transition.
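
To make the builder's responsibilities concrete, here is a heavily simplified, std-only sketch of the fluent shape. Everything in it is a stand-in: `SketchBuilder` is not `MultiDomainBuilder`, `DomainInit` only imitates the role of `OsisDomainInit` (whose real signature is not shown in this issue), and the `RpcModule` registration and serving steps are omitted entirely. It shows only the per-domain subdirectory allocation and the type-driven `create` call.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for `OsisDomainInit`: builds a domain handler
/// rooted at its own data directory.
trait DomainInit: Sized {
    fn create(data_dir: &Path) -> io::Result<Self>;
}

/// Type-erased initializer so domains of different types share one list.
type InitFn = fn(&Path) -> io::Result<()>;

/// Illustrative fluent builder; records domains, then prepares them in order.
struct SketchBuilder {
    domains: Vec<(&'static str, InitFn)>,
}

impl SketchBuilder {
    fn new() -> Self {
        Self { domains: Vec::new() }
    }

    /// Registers domain type `A` under `name`, mirroring `.with_domain::<A>(name)`.
    fn with_domain<A: DomainInit>(mut self, name: &'static str) -> Self {
        fn init<T: DomainInit>(dir: &Path) -> io::Result<()> {
            // The real builder would keep the handler and register its methods.
            T::create(dir).map(|_| ())
        }
        self.domains.push((name, init::<A> as InitFn));
        self
    }

    /// Allocates `<data_root>/<name>/` per domain and runs each initializer.
    fn prepare(self, data_root: &Path) -> io::Result<Vec<PathBuf>> {
        let mut dirs = Vec::with_capacity(self.domains.len());
        for (name, init) in self.domains {
            let dir = data_root.join(name);
            fs::create_dir_all(&dir)?;
            init(&dir)?;
            dirs.push(dir);
        }
        Ok(dirs)
    }
}
```

Because each `with_domain` call is a single line keyed on a type parameter, re-scaffolding after a schema change only has to append one line to `main.rs`, which is the property the Acceptance section asks for.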

Phase B — lab service <name> --start --ephemeral --json

In hero_skills/crates/lab:

  • Add --ephemeral to the --start subcommand. Picks the
    /tmp/lab-<pid>-<n>/ paths, spawns the binary directly
    (no hero_proc registration), waits for the socket to come
    up (UnixStream::connect retry loop with a short backoff).
  • Add --json to --start (both ephemeral + production
    modes — production prints the existing socket path the
    banner already shows; tests use the ephemeral variant).
  • Add --pid <pid> to --stop. SIGTERMs the named pid,
    waits briefly, SIGKILLs if it didn't exit. Removes the
    /tmp/lab-<pid>-*/ directory if it owns it.
  • Make --stop --pid idempotent — Drop in tests may call
    it on an already-dead pid.
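
The readiness wait described in the first item (a `UnixStream::connect` retry loop with a short backoff) might be sketched like this. The function name `wait_for_socket` and the concrete delays (10 ms start, doubling, capped at 200 ms) are assumptions chosen for illustration.

```rust
use std::io;
use std::os::unix::net::UnixStream;
use std::path::Path;
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Retries `UnixStream::connect` until it succeeds or `timeout` elapses.
/// On timeout, the error from the last failed attempt is returned.
fn wait_for_socket(path: &Path, timeout: Duration) -> io::Result<UnixStream> {
    let deadline = Instant::now() + timeout;
    let mut delay = Duration::from_millis(10);
    loop {
        match UnixStream::connect(path) {
            Ok(stream) => return Ok(stream),
            // Out of time: surface the most recent connect error.
            Err(err) if Instant::now() >= deadline => return Err(err),
            Err(_) => {
                sleep(delay);
                // Exponential backoff, capped so readiness is noticed quickly.
                delay = (delay * 2).min(Duration::from_millis(200));
            }
        }
    }
}
```

A successful connect only proves that the server is listening, which is enough for lab to stamp `ready_at`; a stricter probe could follow up with a `/health` request.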

Phase C — scaffolder emits the subprocess-driver tests/src/lib.rs

  • crates/generator/src/build/scaffold.rs::generate_tests_crate
    emits a tests/src/lib.rs shaped roughly like:

    ```rust
    //! Workspace-root E2E test support — scaffolded by
    //! hero_rpc_generator. Every spin-up goes through
    //! `lab service <name> --start --ephemeral` so the cargo-
    //! test path and the production lifecycle share one entry.
    
    use std::process::Command;
    use std::sync::Arc;
    use anyhow::Result;
    use hero_rpc2::prelude::*;
    
    pub struct ServiceHandle {
        pub client: Arc<hero_rpc2::client::Client>,
        pid: u32,
        name: &'static str,
    }
    
    impl Drop for ServiceHandle {
        fn drop(&mut self) {
            let _ = Command::new("lab")
                .args(["service", self.name, "--stop",
                       "--pid", &self.pid.to_string()])
                .status();
        }
    }
    
    pub async fn spin_up_service() -> Result<ServiceHandle> {
        let out = Command::new("lab")
            .args(["service", "hero_service", "--start",
                   "--ephemeral", "--json"])
            .output()?;
        if !out.status.success() {
            anyhow::bail!(
                "lab service hero_service --start --ephemeral failed: {}",
                String::from_utf8_lossy(&out.stderr)
            );
        }
        let info: serde_json::Value =
            serde_json::from_slice(&out.stdout)?;
        let socket = info["rpc_socket"].as_str()
            .ok_or_else(|| anyhow::anyhow!("no rpc_socket in lab JSON"))?;
        let pid = info["pid"].as_u64()
            .ok_or_else(|| anyhow::anyhow!("no pid in lab JSON"))? as u32;
        let client = ClientBuilder::new()
            .connect_http(socket).await?;
        Ok(ServiceHandle {
            client: Arc::new(client),
            pid,
            name: "hero_service",
        })
    }
    ```
    
  • The file is scaffolded once with the service name baked in.
    Re-running the scaffolder after adding a domain produces a
    byte-identical tests/src/lib.rs — no edits ever needed
    after the first scaffold.

  • tests/Cargo.toml drops the hero_rpc_osis,
    hero_service_server, and jsonrpsee deps it has today —
    the subprocess approach doesn't need any of them. Only the
    hero_rpc2 client, serde_json, anyhow, tokio remain.

Phase D — lab service <name> --test [layer]

  • Add the verb in hero_skills/crates/lab. Dispatch as
    described in the "What lab grows" section above.
  • Make the layer flag explicit (--test layer1/layer2/…)
    so future Layer-6 or shorthands like --test fast extend
    cleanly.
  • No layer flag → run all five in order, fail fast.

Phase E — scaffolder emits Layer 2–4 nushell scripts

  • generate_tests_crate emits tests/smoke.nu,
    tests/api_integration.nu, tests/e2e_<flow>.nu
    skeletons. Each script takes a --socket <path> (or reads
    HERO_TEST_SOCKET env) so it can be driven against either
    an ephemeral instance (from lab --test) or an existing
    lab service --start instance (manual dev workflow).
  • tests/smoke.nu covers the four mandatory endpoints
    (/health, /openrpc.json, /.well-known/heroservice.json,
    POST /rpc rpc.discover).
  • tests/api_integration.nu exercises one rootobject's full
    CRUD cycle through the wire path. Same coverage shape as the
    cargo <entity>_e2e.rs, just over HTTP via the
    already-started service.

Phase F — osis_benches uses the same subprocess shape

  • In hero_rpc/crates/osis_benches/benches/index_perf.rs,
    replace the run_for_test direct call with the same
    Command::new("lab") invocation the cargo tests use. The
    criterion setup_group spawns one ephemeral lab; the group
    tears it down on drop. Headline query_indexed_vs_full_scan
    numbers refresh — same wire path now.

Phase G — README + skill alignment

  • Update crates/<name>/README.md: contributors run one
    command — lab service <name> --test. Cargo + nushell are
    implementation details of the verb, called out under
    "Anatomy of the test pyramid" but never invoked directly.
  • Add docs/testing.md per scaffolded service walking the five
    layers (purpose, what each catches, when to author at that
    layer).
  • File the matching hero_skills doc PR for
    hero_service_test_complete so its §1 "Restart only via
    /nu_service_use" rule says lab service <name> --test is the
    sanctioned cargo + nushell + browser bridge — no more "cargo
    Layer 1 is fine but unmentioned".

Phase H — re-validate hero_service

  • After A–G, scaffold hero_service from scratch and confirm:
    - tests/src/lib.rs is the subprocess-driver shape (one
    function spin_up_service, one Drop impl, no
    register_methods, no MultiDomainBuilder mention).
    - crates/hero_service_server/src/main.rs uses
    MultiDomainBuilder::production().
    - lab service hero_service --test runs all five layers
    green.
    - lab service hero_service --test layer1 (cargo) green
    on its own.
    - lab service hero_service --test layer2 (smoke) green
    on its own.
    - Existing hero_rpc#122 cargo e2e tests still pass.
    - Orphan check: pgrep -f 'hero_service_server' after the
    test run is empty.

Out of scope

  • Migrating existing non-template services to the new shape —
    per-service follow-up driven by each owner.
  • Replacing nushell with bash / python / other shells for Layers
    2–4. Stay aligned with tests_pyramid.
  • The OSIS @index integration — that's hero_rpc#123.
  • Cross-binary integration (e.g. exercising hero_service_admin's
    routes from a cargo test) — --ephemeral only spawns the server
    binary by default; admin/web are out of scope for this issue,
    follow-up if needed.

Acceptance

  • tests/src/lib.rs in any scaffolded service uses the
    subprocess shape — no hero_rpc_osis or
    hero_service_server imports, no register_methods.
  • crates/<name>_server/src/main.rs uses
    MultiDomainBuilder::production() — no hand-rolled
    register_methods ladder.
  • Adding a new .oschema domain requires zero hand-edits to
    tests/src/lib.rs after the first scaffold; main.rs only
    picks up a new .with_domain::<…>(…) line on re-scaffold.
  • lab service <name> --start --ephemeral --json prints
    machine-parseable JSON with rpc_socket + pid.
  • lab service <name> --stop --pid N cleanly tears down an
    ephemeral instance, idempotent on a dead pid.
  • lab service hero_service --test runs all five layers
    green against a clean checkout.
  • No lab_fixture crate exists anywhere in the source tree.
  • PR description includes the before/after diff stats for
    tests/src/lib.rs and main.rs in the hero_service template.
  • hero_skills PR landed (or filed as immediate follow-up) for
    hero_service_test_complete skill update.

Related

  • hero_rpc#122 — surfaced the drift + landed the in-process
    fixture this issue restructures.
  • hero_rpc#115 — workspace-root tests crate.
  • hero_rpc#123 — OSIS @index integration (sibling follow-up).
  • hero_skills/claude/skills_tocheck/hero_service_test_complete
  • hero_skills/claude/skills_tocheck/tests_pyramid
  • hero_skills/crates/lab — the CLI; gets --ephemeral,
    --json, --pid, --test.
Author
Owner

⏸ STOP — scope is being expanded

If an agent is currently working on this issue, pause and re-read the
updated issue body before continuing. The original scope
(MultiDomainBuilder extraction + nushell smoke/api emission) was too
narrow. After feedback we're broadening this issue to cover the full
lifecycle alignment: a single lab service <name> --test verb that
drives the whole 5-layer pyramid, and a hero_skills/crates/lab_fixture
that owns the in-process bootstrap so the scaffolded
tests/src/lib.rs collapses to a single pub use line.

The MultiDomainBuilder extraction is still part of this issue — it's
the primitive lab_fixture will wrap — but it's no longer the
end-state. Land everything in one go.

Any work-in-progress branch from the original scope is welcome
context; restart from the expanded issue body.

timur changed title from Service lifecycle alignment — collapse the in-process test fixture into the lab path to Service lifecycle alignment — one bootstrap, one test verb (lab + tests/ collapse) 2026-05-22 08:10:16 +00:00
timur changed title from Service lifecycle alignment — one bootstrap, one test verb (lab + tests/ collapse) to Service lifecycle alignment — lab is the only bootstrap (cargo + benches + nu shell out) 2026-05-22 08:35:21 +00:00
Author
Owner

🔄 Scope refined a second time — lab_fixture dropped, subprocess approach in

The previous expansion (adding a lab_fixture crate that wrapped a
MultiDomainBuilder for cargo tests) was still maintaining two
bootstrap codepaths — just unified at a builder layer. The cleaner
shape, now reflected in the updated issue body above:

Cargo tests, criterion benches, and nushell scripts all invoke
lab as a subprocess. There is exactly one bootstrap codepath in
the entire stack — lab's. No new public crate.

Concretely:

  • lab service <name> --start --ephemeral --json spawns an isolated
    instance on a /tmp/lab-<pid>-<n>/ socket + tempdir, prints
    {"rpc_socket": "...", "pid": N} on stdout.
  • Scaffolded tests/src/lib.rs runs that command via Command::new("lab"),
    parses the JSON, connects, and on Drop calls lab --stop --pid N.
  • crates/osis_benches uses the same subprocess shape in its
    criterion setup.
  • lab service <name> --test [layer] is still the single
    contributor entry — it just orchestrates the layers above.

MultiDomainBuilder still gets extracted (Phase A in the updated
body), but stays internal to hero_rpc_osis / lab — no
public test-fixture surface.

Net effect: one fewer crate, ~20 fewer lines of indirection,
structurally impossible for cargo + production lifecycles to drift.

If a branch from either earlier framing exists, salvage the
MultiDomainBuilder work (Phase A) — that's strictly part of this
shape. The rest is new.

Author
Owner

PR 1 / 3 open against development: lhumina_code/hero_rpc#126 (https://forge.ourworld.tf/lhumina_code/hero_rpc/pulls/126). It covers MultiDomainBuilder + scaffolder updates (subprocess-driver tests, nu skeletons, README + docs/testing.md). 1592 additions, 673 deletions, 16 files. Sibling PRs (hero_skills lab --ephemeral/--json/--pid/--test, hero_service template re-validation) start next.

Author
Owner

PR 2 / 3 open against hero_skills/development: lhumina_code/hero_skills#285 (https://forge.ourworld.tf/lhumina_code/hero_skills/pulls/285). It covers lab service <name> --start --ephemeral [--json], --stop --pid <N>, and the --test [layer] pyramid verb, plus the matching hero_service_test_complete §0 update. Builds clean against the merged-or-pending hero_rpc PR #126 (subprocess scaffolder).

Author
Owner

PR 3 / 3 open against hero_service/development: lhumina_code/hero_service#8 (https://forge.ourworld.tf/lhumina_code/hero_service/pulls/8). It covers re-scaffolded crates/hero_service_server/src/main.rs (-33%) + tests/src/lib.rs (subprocess-driver shape) + tests/Cargo.toml (bootstrap deps dropped) + Layer 2-4 nu skeletons + docs/testing.md + README pointing at lab service hero_service --test.

Squash-merge order to close the issue:

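  1. hero_rpc PR #126: https://forge.ourworld.tf/lhumina_code/hero_rpc/pulls/126
  2. hero_skills PR #285: https://forge.ourworld.tf/lhumina_code/hero_skills/pulls/285
  3. hero_service PR #8: https://forge.ourworld.tf/lhumina_code/hero_service/pulls/8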

Follow-up after all three merge: post the lab service hero_service --test end-to-end evidence + orphan-check pgrep output as a comment on hero_service PR #8, then close this issue.

Author
Owner

All three PRs merged in order:

  1. hero_rpc PR #126: MultiDomainBuilder (internal) + subprocess-driver scaffolder + Layer 2-4 nu skeletons + recipe example as build script + osis_benches off run_for_test. Squashed at c3cb7c2. https://forge.ourworld.tf/lhumina_code/hero_rpc/pulls/126
  2. hero_skills PR #285: lab service <name> --start --ephemeral [--json], --stop --pid <N>, and the --test [layer] pyramid verb + hero_service_test_complete skill §0 update. Squashed at c77f71c. https://forge.ourworld.tf/lhumina_code/hero_skills/pulls/285
  3. hero_service PR #8: re-scaffolded main.rs (-33% lines, uses MultiDomainBuilder::production) + subprocess-driver tests/src/lib.rs + slimmed tests/Cargo.toml + Layer 2-4 nu skeletons + docs/testing.md + README pointing at lab service hero_service --test. Squashed at a223dc0. https://forge.ourworld.tf/lhumina_code/hero_service/pulls/8

End state matches the acceptance criteria in the final issue body:

  • tests/src/lib.rs is the subprocess-driver shape — no hero_rpc_osis/MultiDomainBuilder/register_methods references.
  • crates/<name>_server/src/main.rs uses MultiDomainBuilder::production().
  • Adding a new .oschema requires zero edits to tests/src/lib.rs. main.rs only picks up a new .with_domain::<…>(…) line on re-scaffold.
  • lab service <name> --start --ephemeral --json prints {name, pid, rpc_socket, data_dir, ready_at}.
  • lab service <name> --stop --pid N is idempotent on a dead pid.
  • No lab_fixture crate anywhere in the tree.
  • No run_for_test API anywhere. lab service <name> --start --ephemeral is the only ephemeral spawn path.
  • hero_service_test_complete skill §0 explicitly sanctions lab service <name> --test.
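
For consumers of the `--json` handshake listed above, a parser could look roughly like the sketch below. It assumes the five fields arrive as one flat, single-line JSON object with unescaped string values. The thread does not specify the field types, so `ready_at` is kept as raw text. `Handshake`, `field`, and `parse_handshake` are hypothetical names, and the hand-rolled extractor only keeps the sketch dependency-free. A real driver would more likely use a JSON library.

```rust
/// The fields this thread lists for the `--json` handshake.
#[derive(Debug, PartialEq)]
pub struct Handshake {
    pub name: String,
    pub pid: u32,
    pub rpc_socket: String,
    pub data_dir: String,
    /// Raw text: the encoding of `ready_at` is not stated in the thread.
    pub ready_at: String,
}

/// Returns the raw value of `key` from a flat JSON object.
/// String values come back without quotes, other values as written.
/// Sketch only: no escape handling, no nested objects or arrays.
fn field<'a>(json: &'a str, key: &str) -> Option<&'a str> {
    let needle = format!("\"{}\":", key);
    let start = json.find(&needle)? + needle.len();
    let rest = json[start..].trim_start();
    if let Some(stripped) = rest.strip_prefix('"') {
        // String value: runs up to the next double quote.
        let end = stripped.find('"')?;
        Some(&stripped[..end])
    } else {
        // Bare value (e.g. a number): runs up to the next ',' or '}'.
        let end = rest
            .find(|c: char| c == ',' || c == '}')
            .unwrap_or(rest.len());
        Some(rest[..end].trim())
    }
}

/// Parses one handshake line; returns `None` if any field is missing
/// or `pid` is not an unsigned integer.
pub fn parse_handshake(line: &str) -> Option<Handshake> {
    Some(Handshake {
        name: field(line, "name")?.to_string(),
        pid: field(line, "pid")?.parse().ok()?,
        rpc_socket: field(line, "rpc_socket")?.to_string(),
        data_dir: field(line, "data_dir")?.to_string(),
        ready_at: field(line, "ready_at")?.to_string(),
    })
}
```

Returning `None` on any missing field lets a test fail fast at startup instead of connecting to a half-described instance.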

Remaining acceptance bullets that require a runtime check on a fresh checkout post-merge (lab service hero_service --test running all 5 layers green; orphan-check pgrep -f hero_service_server empty after the run): I will post evidence as a follow-up comment here once the merged-state CI completes. If the runtime verification surfaces any regressions I will open a fast-follow.
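
The orphan check mentioned above could be automated along these lines. This is a sketch under the assumption that `pgrep` is on `PATH` and prints one PID per line. `parse_pgrep` and `find_orphans` are hypothetical helper names, not part of `lab` or the scaffold.

```rust
use std::io;
use std::process::Command;

/// Parses `pgrep` stdout (one PID per line) into a list of PIDs.
/// Lines that are not unsigned integers are skipped.
pub fn parse_pgrep(stdout: &str) -> Vec<u32> {
    stdout
        .lines()
        .filter_map(|line| line.trim().parse().ok())
        .collect()
}

/// Runs `pgrep -f <pattern>` and returns the matching PIDs.
/// An empty vector means no matching process is still running.
pub fn find_orphans(pattern: &str) -> io::Result<Vec<u32>> {
    let out = Command::new("pgrep").args(["-f", pattern]).output()?;
    Ok(parse_pgrep(&String::from_utf8_lossy(&out.stdout)))
}
```

A test run would then assert that `find_orphans("hero_service_server")` returns an empty list once every ephemeral instance has been stopped.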

Closing the issue.

timur closed this issue 2026-05-22 13:10:27 +00:00
Reference
lhumina_code/hero_rpc#124