Consolidate herolib_core::base + hero_lifecycle into one crate; replace HeroLifecycle/HeroRpcServer/HeroUiServer with ServiceRpcServer/ServiceAdminServer builders #142
Reference
lhumina_code/hero_rpc#142
Background
Today Hero service authoring is split across three places:
- `herolib_core::base` (hero_lib): `ServiceToml` schema, `validate_service_toml`, `handle_info_flag`, `print_startup_banner`, `prepare_sockets`, socket/path resolution, `service_base!()` macro.
- `hero_lifecycle` (this repo): `HeroLifecycle` (hero_proc install/start/stop), `HeroRpcServer` / `HeroUiServer` (axum UDS serve helpers), `serve_unix`, `shutdown_signal`, mandatory-endpoint injection, `test::*` harness, and a `hero_lifecycle` inspection CLI binary.
- `lab` (hero_skills/crates/lab): calls `<bin> --info`, parses `service.toml`, registers with hero_proc, runs install/start/stop. This is the canonical lifecycle owner per `hero_service_implementation.md` and `hero_service_refactor.md`.

The `hero_service_refactor.md` skill explicitly names `HeroLifecycle`, `HeroRpcServer`, and `HeroUiServer` as legacy patterns to retire. `lab` already owns lifecycle, so the per-binary `--start` / `--stop` / `install` flags in `hero_lifecycle` are duplicating lab's job.

At the same time, `herolib_core::base` only covers startup primitives. Every Hero service still hand-rolls the UDS accept loop, socket permissions, shutdown signal handling, mandatory `/health` + `/openrpc.json` + `/.well-known/heroservice.json` endpoints, and per-test integration harness. That boilerplate is ~60 lines per service today and easy to drift on.

Goal
Merge everything service-authoring-related into one crate, `hero_lifecycle`, structured around two clean builder entry points (`ServiceRpcServer`, `ServiceAdminServer`). Drop the `hero_proc_sdk` dependency entirely (lab owns proc registration). Update the canonical skills to point at the consolidated crate.

Crate structure after the change
`hero_lifecycle` becomes the single "Hero service authoring + inspection kit".

No `hero_proc_sdk` dependency. No `lifecycle.rs`, no `service.rs`, no `cli.rs`, no `bin/main.rs`.

Builder API
Typestate two-stage builder.
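As a rough illustration of what a two-stage typestate builder looks like, here is a minimal, std-only sketch. It is not the crate's implementation: the stage markers `Configured` and `Prepared`, the `Result` return type of `build()`, the service name `hero_demo`, and the idea of `serve()` returning a route list instead of binding a socket are all assumptions made so the example stays self-contained.

```rust
use std::marker::PhantomData;

/// Stage markers (hypothetical names): a server is `Configured` until
/// `build()` succeeds, and only a `Prepared` server exposes `serve()`.
pub struct Configured;
pub struct Prepared;

/// The three endpoints the issue lists as mandatory for an RPC backend.
pub const MANDATORY_ROUTES: [&str; 3] =
    ["/health", "/openrpc.json", "/.well-known/heroservice.json"];

pub struct ServiceRpcServer<Stage> {
    name: String,
    openrpc_spec: String,
    _stage: PhantomData<Stage>,
}

impl ServiceRpcServer<Configured> {
    pub fn new(name: &str) -> Self {
        ServiceRpcServer {
            name: name.to_string(),
            openrpc_spec: String::new(),
            _stage: PhantomData,
        }
    }

    pub fn openrpc(mut self, spec: &str) -> Self {
        self.openrpc_spec = spec.to_string();
        self
    }

    /// Stand-in for the pre-startup chain. A real builder would validate
    /// service.toml, handle `--info`, and print the banner here; this
    /// sketch only checks that a name and a spec were supplied.
    pub fn build(self) -> Result<ServiceRpcServer<Prepared>, String> {
        if self.name.is_empty() {
            return Err("service name must not be empty".to_string());
        }
        if self.openrpc_spec.is_empty() {
            return Err("an OpenRPC spec is required".to_string());
        }
        Ok(ServiceRpcServer {
            name: self.name,
            openrpc_spec: self.openrpc_spec,
            _stage: PhantomData,
        })
    }
}

impl ServiceRpcServer<Prepared> {
    /// Stand-in for `serve(router)`: consumes the server and returns the
    /// merged route table instead of binding a socket. Mandatory routes
    /// come first and shadow user routes with the same path.
    pub fn serve(self, user_routes: &[&str]) -> Vec<String> {
        let mut routes: Vec<String> =
            MANDATORY_ROUTES.iter().map(|r| r.to_string()).collect();
        for route in user_routes {
            if !MANDATORY_ROUTES.contains(route) {
                routes.push(route.to_string());
            }
        }
        routes
    }
}
```

With this shape, `ServiceRpcServer::new("hero_demo").openrpc(spec).build()?.serve(&["/rpc"])` compiles, while calling `.serve()` on a server that has not gone through `build()` is rejected at compile time, because `serve` only exists on the `Prepared` stage.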
`.build()` runs the pre-startup chain (validate → handle-info → banner) and exits the process on `--info`. `.serve(router)` consumes the prepared server, then runs `prepare_sockets`, binds, sets 0o770 permissions, accepts connections, handles shutdown, and cleans up.

RPC backend (`kind = "server"`)

`ServiceRpcServer` internally merges `mandatory_router(name, spec)` (the three required endpoints) over the user router before binding.

Admin dashboard (`kind = "admin"`)

Same shape, no `openrpc()` (admin doesn't expose JSON-RPC), no mandatory-endpoint injection.

The name is `ServiceAdminServer`, not `ServiceUiServer`: it aligns with the canonical `kind = "admin"` closed-enum value. `_ui` is legacy per `hero_service_refactor.md`.

Raw escape hatch
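For a sense of what a service ends up doing by hand when it bypasses the builders, the bind half (create the directory, clear a stale socket, bind, set 0o770) can be sketched with the standard library alone. This is an illustrative sketch, not the crate's `prepare_sockets`; the function name `bind_service_socket` is hypothetical, and it targets Unix platforms only.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::os::unix::net::UnixListener;
use std::path::Path;

/// Hypothetical helper: bind a Unix domain socket at `socket_path`
/// and restrict it to owner + group (mode 0o770).
pub fn bind_service_socket(socket_path: &Path) -> io::Result<UnixListener> {
    // Make sure the per-service socket directory exists.
    if let Some(dir) = socket_path.parent() {
        fs::create_dir_all(dir)?;
    }

    // Remove a stale socket left behind by a previous run; a missing
    // file is fine, any other error is reported to the caller.
    match fs::remove_file(socket_path) {
        Ok(()) => {}
        Err(e) if e.kind() == io::ErrorKind::NotFound => {}
        Err(e) => return Err(e),
    }

    let listener = UnixListener::bind(socket_path)?;

    // Apply the 0o770 mode mentioned in the issue.
    fs::set_permissions(socket_path, fs::Permissions::from_mode(0o770))?;
    Ok(listener)
}
```

The accept loop, shutdown handling, and socket cleanup on exit are left out here; those are the remaining pieces the issue attributes to `.serve()`.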
Services that need raw control (e.g. `hero_proc_server` running under screen) call the base primitives directly; they stay public.

Cargo dependency aliasing for readability
Crate name stays `hero_lifecycle` (per the `naming_convention` skill, the `hero_*` prefix is mandatory for top-level packages). Consumer crates can alias the import for cleaner reads, then `use lifecycle::ServiceRpcServer;` everywhere.

Inspection commands → `lab`

The `hero_lifecycle` binary (`list`, `health`, `test`, `discover`, `run`, `stop`, `status`) is deleted. Cross-service introspection moves into `lab`:

- `lab service --list`: walk `$HERO_SOCKET_DIR`, print every `*.sock`
- `lab service --health [name|--all]`: GET `/health` on one or all
- `lab service --test [name|--all]`: `assert_standard_endpoints` on one or all
- `lab service --discover <name>`: pretty-print `/openrpc.json`

Lab depends on `hero_lifecycle` (just the library) and calls into `hero_lifecycle::test::*` for the assertions. No duplicated implementation.

`run` / `stop` / `status` are already in lab as `--start` / `--stop` / `--status`. No need to re-add.

Migration plan (file-by-file)
`hero_lib` side

- Move `crates/core/src/base/{mod.rs, service.rs, README.md}` into `hero_rpc/crates/hero_lifecycle/src/base.rs` (+ split as needed).
- Move the path helpers (`path_root`, `path_var`, `path_code`, `hero_socket_dir`, `hero_bin`, `hero_var_dir`) and the `service_base!()` macro into the same `hero_lifecycle` crate.
- Rename `print_startup_banner`'s `extras: &[(&str, &str)]` parameter to `metadata: &[(&str, &str)]`.
- Leave a `#[deprecated]` shim in `herolib_core::base` that re-exports from `hero_lifecycle::base` so the migration can be staged across consumer repos. Targeted for one release cycle, then deleted.

`hero_rpc/crates/hero_lifecycle/` side

- Delete `lifecycle.rs` (HeroLifecycle, CompanionBinary).
- Delete `service.rs` (HeroService, HeroServices, ServiceKind).
- Delete `cli.rs` (LifecycleCommand).
- Delete `src/bin/main.rs` (binary; functionality moves to lab).
- Remove `hero_proc_sdk` from `Cargo.toml`.
- Split `hero_server.rs` into `rpc.rs` + `admin.rs` + `serve.rs`: keep `serve_unix`, `shutdown_signal`, socket-perms 0o770 setup, mandatory-endpoint router builder. Drop `.run()` / `.run_simple()` / `.run_raw()` wrappers (lifecycle branching gone).
- Keep `test.rs` as-is: it's pure library, no proc dep.
- Update `service.toml` for the crate to drop the `[[binaries]]` entry (no more binary). Or remove `service.toml` entirely if no `kind = "cli"` is left.

Generator side (`crates/generator/src/build/scaffold.rs`)

- Replace `use hero_lifecycle::HeroLifecycle;` + `HeroLifecycle::new(...).start()` + `--start` / `--stop` arg sniffing in `generate_server_main_rs` (currently lines ~1103–1142) with the new `ServiceRpcServer::new(...).build().serve(...)` pattern.
- Apply the same change on the admin side (`ServiceAdminServer`).
- The `hero_lifecycle = { ... }` Cargo dep emission: `hero_rpc_osis::rpc::bootstrap::MultiDomainBuilder` already covers the canonical scaffolded server. Decide: do we keep `MultiDomainBuilder::production` or fold it into `ServiceRpcServer`? (Both do the same thing: the former is OSIS-specific glue, the latter is the new generic primitive. Likely keep MultiDomainBuilder calling into ServiceRpcServer internally so OSIS-specific orchestration stays in OSIS.)

Downstream consumers
Not in scope for this issue: each downstream repo migrates on its own cadence under `hero_service_refactor.md`. Confirmed consumers (`grep -r hero_lifecycle` across lhumina_code): `hero_proxy`, `hero_osis`, `hero_compute`, `hero_os`, `hero_router`, `hero_voice`, `hero_service` (template), `hero_inspector`, `hero_index_ui_old`, `hero_indexer_ui`, `hero_auth` (+ a few archived copies).

The `#[deprecated]` re-export shim in `herolib_core::base` means existing builds keep working while each repo gets its own refactor PR.

Skill / spec updates needed
Update after the code lands:
- `hero_service_implementation`: replace the hand-wired `main.rs` patterns with the new `ServiceRpcServer` / `ServiceAdminServer` builder examples.
- `hero_service_refactor`: add a step pointing at `ServiceRpcServer` / `ServiceAdminServer` as the replacement for `HeroLifecycle` / `HeroRpcServer` / `HeroUiServer`.
- `hero_service_check`: update the audit rules to accept the new builder pattern; flag any service still using `HeroLifecycle` directly.
- `hero_proc_service_singlebin`: currently documents per-binary `--start` / `--stop` flags. Rewrite to say the binary runs plain; lab owns install/start/stop.
- `hero_proc_service_selfstart`: same; multi-component registration moves to lab parsing each binary's `--info`.
- `hero_service_scaffold`: confirm the scaffolder emits the new pattern.

Alignment check (already validated against pulled `hero_skills@development`)

- `naming_convention`: `hero_lifecycle` keeps the `hero_*` prefix; `ServiceRpcServer` / `ServiceAdminServer` mirror `kind = "server"` / `"admin"`.
- `hero_service_implementation`: `base::*` fn in the mandated order; nothing hand-rolled.
- `hero_service_check`: no `fn print_info_json` / `fn print_startup_info` / hand-rolled socket cleanup in scaffolded main.rs.
- `hero_service_toml_info`: `ServiceToml` schema preserved verbatim; closed enums unchanged.
- `hero_sockets`: `prepare_sockets` honours the `$HERO_SOCKET_DIR` layout; admin gets `admin.sock`.
- `rust_shutdown`: signals `serve()`.
- `lab/SPECS_PROC_START`: `service.toml` via `--info`; lab parses and registers with hero_proc; no `--start` / `--stop` flags on the binary.

Acceptance criteria
- `hero_lifecycle` crate contains the consolidated module set (`base`, `serve`, `rpc`, `admin`, `test`).
- `hero_lifecycle/Cargo.toml` no longer depends on `hero_proc_sdk`.
- `crates/hero_lifecycle/src/{lifecycle.rs, service.rs, cli.rs, bin/main.rs}` deleted.
- `herolib_core::base` ships as a `#[deprecated]` re-export shim from `hero_lifecycle::base` (companion PR in `hero_lib`).
- `crates/generator/src/build/scaffold.rs` emits `ServiceRpcServer` / `ServiceAdminServer` builders, no `HeroLifecycle::new(...)`.
- `lab service --list / --health / --test / --discover` work; the `hero_lifecycle` binary is gone.
- `cargo build --workspace` is clean.
- `lab service <name> --install && lab service <name> --start`.

Out of scope
- Downstream consumer migrations (`hero_osis`, `hero_compute`, `hero_os`, etc.): those are individual PRs under `hero_service_refactor`.
- Removing the `#[deprecated]` shim in `herolib_core::base`: happens one release cycle after this lands.

Read this against the just-pulled `hero_skills@development`. Two notes after that pass:

1. `admin.sock` settled (no change needed)

The alignment-check line "admin gets `admin.sock`" is correct. Verified across `hero_sockets.md` §3.2, `hero_service_refactor.md` (lines 31/109/129), `hero_service_check.md`, `hero_service_toml_info.md:300`, `naming_convention.md`, `hero_router.md`, `hero_libs.md`, `hero_voice_widget.md`, `hero_ui_dashboard_implementation.md`, `hero_ui_openrpc_proxy.md`, `hero_browser.md`, `hs_service_scaffold.md`: every one uses `admin.sock`. The only `ui.sock` references are explicit legacy-migration callouts (`hero_service_refactor.md:31`). Outlier worth a cleanup pass: `rust_hero_repo_create.md:188` still mentions `ui.sock`.

2. Env-var spelling glitch in `hero_sockets.md`

`hero_sockets.md` §1 documents the root as `$PATH_VAR/sockets`, but the Rust snippet reads `std::env::var("PATH_VAR_sockets")`; that env var name doesn't appear anywhere in the implementation. The actual resolver hierarchy (per `lab/service/ephemeral.rs:178-187` and `herolib_core::base`) is `PATH_ROOT` → `PATH_VAR` → `HERO_SOCKET_DIR`, with lab clearing all four when it pivots into an ephemeral scratch dir.

Since this PR makes `hero_lifecycle::base` the single owner of the resolver, it's the right moment to pick a canonical env-var name (recommend `PATH_VAR` to match the §1 path) and update both code and skill in lockstep.

3. Suggested addition: `peer_socket_path` helper

The new `ServiceRpcServer` / `ServiceAdminServer` builders fix the bind half (everyone routes through `prepare_sockets`, so a binary can no longer hand-roll a socket directory). They don't cover the connect half, e.g. a UI binary connecting to its sibling RPC server.

Today the scaffolder (`crates/generator/src/build/scaffold.rs:3313-3314`) emits a connect path of the form `{svc}_server/rpc.sock`, which is wrong: the server's `service.toml` declares `{svc}/rpc.sock`, not `{svc}_server/rpc.sock` (the `_server` suffix violates `hero_sockets.md` §2). The UI as scaffolded today doesn't connect.

Adding a single helper next to `prepare_sockets` closes the loop. After this lands, the scaffolded `AppState::from_env` becomes `peer_socket_path(SERVICE_NAME, "rpc")` and consumer code never spells a directory name again. Worth pulling into #142 since `hero_lifecycle::base` is exactly the right home for it.

A companion follow-up issue (just opened) covers the orthogonal scaffolder / generator cleanups: the broken `AppState::from_env` connect, the hardcoded `/tmp/herozero_*_e2e` data dir in `generate/e2e.rs`, and the legacy `hero_indexer_server.sock` probe fallback in `find_tests_emit.rs` + `benches/index_perf.rs`. None of those overlap with the consolidation work; they're just stale.

No blocking concerns on the consolidation otherwise. Looking forward to it.
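To make the `peer_socket_path` suggestion from note 3 concrete, here is one possible shape for it, as a minimal sketch. Only the call form `peer_socket_path(SERVICE_NAME, "rpc")` and the `{svc}/rpc.sock` layout come from the discussion above; the `Option` return type, the pure helper `peer_socket_path_in`, the service name `hero_demo`, and the choice to read only `HERO_SOCKET_DIR` (without the `PATH_ROOT` / `PATH_VAR` fallbacks) are assumptions made for illustration.

```rust
use std::env;
use std::path::{Path, PathBuf};

/// Pure core (hypothetical name): `<socket_dir>/<service>/<socket>.sock`.
/// Keeping this free of environment access makes it easy to test.
pub fn peer_socket_path_in(socket_dir: &Path, service: &str, socket: &str) -> PathBuf {
    socket_dir.join(service).join(format!("{socket}.sock"))
}

/// Environment-backed wrapper. This sketch reads `HERO_SOCKET_DIR` only
/// and returns `None` when it is unset; the real resolver would apply
/// whatever env-var hierarchy the crate settles on.
pub fn peer_socket_path(service: &str, socket: &str) -> Option<PathBuf> {
    let dir = env::var_os("HERO_SOCKET_DIR")?;
    Some(peer_socket_path_in(Path::new(&dir), service, socket))
}
```

A UI binary would then call `peer_socket_path("hero_demo", "rpc")` to find its sibling RPC server, and the `_server` suffix can no longer creep into the directory name because the caller never builds the path by hand.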
Landed on development as `50d132f`.

Out of scope for the followup:
Incorporated from the review comment: