Better service manager #20
Reference
lhumina_code/hero_proc#20
Spec: Service Lifecycle Refactor — Job Provenance & Cleanup
Status
This document describes a planned refactor. As of the current codebase, most items are not yet implemented. See the gap analysis in section 10 for details.
1. Goals
Functional goals
Main outcome
A service should behave like a managed runtime unit — not a loose collection of actions. Restarting a service should cleanly reset its prior jobs.
2. Current Domain Model
Service (`ServiceSpec`)
A named logical container with a list of action names, a wanted status (`Start`/`Stop`/`Ignore`/`Spec`), class (`User`/`System`), and dependency definitions.
Action (`ActionSpec`)
An executable operation with interpreter, script, environment, timeouts, retry policy, health checks, schedule policy, and signal handling. Can be invoked standalone or as part of a service.
Job (`Job`)
A persisted runtime execution record. Each job stores:
- `id` (autoincrement)
- `context_name` (e.g. `"core"`)
- `action` (string: action name or `"{service}.{action}"` pattern)
- `spec` (embedded `ActionSpec`)
- `script`, `phase`, `attempt`, timestamps, `exit_code`, `error`, `tags`, `pid`
Missing fields (spec requirement):
`service_id`, `action_id`: job-to-service/action linkage is currently done via string pattern matching on the `action` field.
3. Current Implementation State
Service start (`service.start`)
Creates a single job for the service's first action. The job's `action` field is set to `"{service_name}.{action_name}"`. No cleanup of old jobs occurs.
Service stop (`service.stop`)
Finds running jobs matching the service name via string prefix matching (`"{name}."` or `"{name}:"`), then cancels them. Does not delete job records.
Service restart (`service.restart`)
Calls stop, then start. No cleanup of old terminated jobs.
Service-to-job matching
Pattern-based: a job belongs to a service if `job.action == name` or `job.action.starts_with("{name}.")` or `job.action.starts_with("{name}:")`. This is fragile and has no database-level enforcement.
4. Required Changes
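The changes in this section are motivated by the fragility of the current matching rule. A minimal sketch of that rule as section 3 describes it follows; the function name `job_matches_service` and the example action names are hypothetical and not taken from the codebase.

```rust
/// Sketch of the pattern-based rule from section 3 (hypothetical helper).
/// `job_action` stands for `job.action`, `name` for the service name.
fn job_matches_service(job_action: &str, name: &str) -> bool {
    let dot_prefix = format!("{}.", name);
    let colon_prefix = format!("{}:", name);
    job_action == name
        || job_action.starts_with(dot_prefix.as_str())
        || job_action.starts_with(colon_prefix.as_str())
}

fn main() {
    // A job started through the service matches, as intended.
    println!("{}", job_matches_service("web.start", "web"));
    // A different service sharing a name prefix does not match.
    println!("{}", job_matches_service("webhook.start", "web"));
    // A standalone action that happens to be named "web" is attributed
    // to the service "web" as well, which is the fragile part.
    println!("{}", job_matches_service("web", "web"));
}
```

Nothing in the stored record distinguishes the first and third cases, which is the gap the explicit `service_id` field is meant to close.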
4.1 Job provenance fields
Add to the `Job` struct and jobs table:
- `service_id` (nullable): links job to the service that created it
- `action_id` (nullable or required): links job to the action that created it
Add indexes:
- `idx_jobs_service_id`
- `idx_jobs_action_id`
- `idx_jobs_service_id_phase`
4.2 Service start with cleanup
Add `replace_existing_jobs: bool` parameter to `service.start` (default: `true`).
When `replace_existing_jobs = true`, cleanup applies to jobs where `job.service_id = service_id`.
4.3 Centralized job creation
Job creation must set `service_id` and `action_id` in one central path, not scattered across callers.
4.4 Standalone action execution
When an action runs outside a service:
- `action_id` is set
- `service_id` is null
5. OpenRPC Changes
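Before turning to the RPC surface, the provenance rules of sections 4.3 and 4.4 can be sketched as one function that every job-creation path would go through. The names `JobOrigin`, `JobProvenance`, and `provenance_for` are illustrative assumptions; the real `Job` struct carries many more fields.

```rust
/// Where a job comes from. Hypothetical type, not taken from the codebase.
#[derive(Debug, Clone, PartialEq)]
enum JobOrigin {
    /// Started through a service: both IDs are known.
    Service { service_id: String, action_id: String },
    /// Started directly (section 4.4): only the action is known.
    Standalone { action_id: String },
}

/// Reduced stand-in for the job record; only the provenance fields are shown.
#[derive(Debug, PartialEq)]
struct JobProvenance {
    service_id: Option<String>,
    action_id: Option<String>,
}

/// The single place that maps an origin to provenance fields (section 4.3).
fn provenance_for(origin: JobOrigin) -> JobProvenance {
    match origin {
        JobOrigin::Service { service_id, action_id } => JobProvenance {
            service_id: Some(service_id),
            action_id: Some(action_id),
        },
        JobOrigin::Standalone { action_id } => JobProvenance {
            service_id: None,
            action_id: Some(action_id),
        },
    }
}

fn main() {
    let from_service = provenance_for(JobOrigin::Service {
        service_id: "web".to_string(),
        action_id: "run".to_string(),
    });
    let standalone = provenance_for(JobOrigin::Standalone {
        action_id: "backup".to_string(),
    });
    println!("{:?}", from_service);
    println!("{:?}", standalone);
}
```

Encoding the origin as an enum means a caller cannot create a service job without an action, or forget to clear `service_id` for a standalone run.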
`service.start`
Add optional parameter: `replace_existing_jobs: bool` (default `true`).
`service.stop`
Add optional parameter: `remove_jobs: bool` (default `false`). Not required for initial implementation.
Existing methods (already implemented)
- `service.set`, `service.get`, `service.delete`, `service.list`, `service.list_full`
- `service.start`, `service.stop`, `service.restart`, `service.kill`
- `service.status`, `service.status_full`, `service.stats`
- `service.children`, `service.is_running`, `service.why`, `service.tree`
- `service.start_all`, `service.stop_all`
6. Database Migration
Schema changes
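A minimal sketch of the schema changes, assuming a SQLite `jobs` table with an existing `phase` column and using the column and index names from section 4.1. The function `provenance_migration` is hypothetical; in a real migration the list of existing columns would come from `PRAGMA table_info(jobs)`, since SQLite has no `ADD COLUMN IF NOT EXISTS`.

```rust
/// Returns the SQL statements needed to add the provenance columns and
/// indexes. Hypothetical helper; `existing_columns` lists the columns the
/// `jobs` table already has.
fn provenance_migration(existing_columns: &[&str]) -> Vec<String> {
    let mut stmts = Vec::new();

    // Add each nullable TEXT column only if it is not already present.
    for col in ["service_id", "action_id"] {
        if !existing_columns.contains(&col) {
            stmts.push(format!("ALTER TABLE jobs ADD COLUMN {} TEXT", col));
        }
    }

    // Index creation is idempotent, so it can always be emitted.
    for (index, cols) in [
        ("idx_jobs_service_id", "service_id"),
        ("idx_jobs_action_id", "action_id"),
        ("idx_jobs_service_id_phase", "service_id, phase"),
    ] {
        stmts.push(format!(
            "CREATE INDEX IF NOT EXISTS {} ON jobs({})",
            index, cols
        ));
    }
    stmts
}

fn main() {
    // A legacy table without provenance columns.
    for stmt in provenance_migration(&["id", "context_name", "action", "phase"]) {
        println!("{};", stmt);
    }
}
```

Columns added this way are `NULL` for rows that already exist, so jobs created before the migration carry no provenance until they are backfilled or removed.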
Legacy data
`service_id`/`action_id`
`service_id`
7. Backend Logic
Service start flow (target)
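- `service.start(name, replace_existing_jobs=true)` is called.
- If `replace_existing_jobs=true`: jobs whose `service_id` matches are stopped and deleted.
- The new job is created with `service_id` and `action_id` set.
Failure handling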
8. UI Changes
Already implemented
Still needed
- `replace_existing_jobs` toggle (default `true` is sufficient for most use cases)
9. Acceptance Criteria
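- Service-created jobs carry `service_id` and `action_id`.
- Standalone jobs carry `action_id` only (null `service_id`).
- `service.start` with default parameters removes old jobs for that service.
- `service.start` accepts `replace_existing_jobs` with default `true`.
10. Gap Analysis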
- `service_id` field
- `action_id` field
- `replace_existing_jobs` parameter
11. Implementation Order
- Add `service_id` and `action_id` columns + indexes (schema migration)
- Add `replace_existing_jobs` to OpenRPC spec and `service.start` handler
test
Implementation Spec for Issue #20: Service Lifecycle Refactor — Job Provenance & Cleanup
Objective
Add explicit `service_id` and `action_id` provenance fields to the `Job` struct and database schema so that every job can be traced back to the service and action that created it. Use these fields to implement clean service restart behavior: when starting a service, old jobs belonging to that service are stopped and deleted before new ones are created.
Requirements
- Add `service_id: Option<String>` and `action_id: Option<String>` fields to the `Job` struct
- Add `service_id TEXT` and `action_id TEXT` columns to the `jobs` SQLite table with appropriate indexes
- Set `service_id` and `action_id` at every job creation site (`service.start`, `service.start_all`, `service.restart`, `quick_service.start`, `job.create`)
- Add `replace_existing_jobs: bool` (default `true`) parameter to `service.start`: when true, stop/kill running jobs for this service and delete all completed jobs before starting fresh
- Add `remove_jobs: bool` (default `false`) parameter to `service.stop`: when true, delete terminated jobs after stopping active ones
- Use `service_id` lookups in place of string pattern matching
- Extend `JobFilter` to support filtering by `service_id`
- Standalone job creation (`job.create`) sets `action_id` only; `service_id` remains null
Files to Modify
- `crates/hero_proc_lib/src/db/jobs/model.rs`: `service_id`/`action_id` to Job struct, schema DDL, all SQL operations
- `crates/hero_proc_lib/src/db/factory.rs`: `list_by_service_id()` and `delete_terminated_by_service()` to JobsApi
- `crates/hero_proc_server/src/rpc/service.rs`
- `crates/hero_proc_server/src/rpc/job.rs`: `action_id` on standalone job creation
- `crates/hero_proc_server/src/rpc/quick_service.rs`
- `crates/hero_proc_server/openrpc.json`
- `crates/hero_proc_server/src/web.rs`: `service_id` filter instead of `hero_proc_service_name`
- `crates/hero_proc_server/src/rpc/debug.rs`: `service_id`
Implementation Plan
Step 1: Add `service_id` and `action_id` to Job schema and struct
- `Job`, `JobSummary`, `JobFilter` structs
- `ALTER TABLE` for existing databases
- `insert_job`, `update_job`, `row_to_job`, `list_jobs` SQL
Step 2: Add convenience methods to JobsApi
- `list_by_service_id()`: filter jobs by service
- `delete_terminated_by_service()`: delete completed/failed/cancelled jobs
Step 3: Update job creation sites to populate provenance fields
- `service.start` → set `service_id` + `action_id`
- `service.start_all` → set `service_id` + `action_id`
- `job.create` → set `action_id` only
- `quick_service.start` → set both fields
Step 4: Implement cleanup logic in service.start and service.stop
- `replace_existing_jobs` param (default true) in `handle_start`
- `remove_jobs` param (default false) in `handle_stop`
- `service_id` queries
Step 5: Update OpenRPC spec
- `replace_existing_jobs` to `service.start` params
- `remove_jobs` to `service.stop` params
Step 6: Update web.rs and debug.rs
- `service_id` filter instead of `hero_proc_service_name`
Step 7: Add tests for cleanup behavior
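One way to pin down the expected cleanup behavior before wiring it to the database is a small in-memory model. The sketch below assumes hypothetical `Store`, `Job`, and `Phase` types (the phase names are illustrative, not the codebase's); it only mirrors the rule that `replace_existing_jobs = true` cancels active jobs of the service and deletes its terminated ones before creating a new job.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Phase {
    Running,
    Cancelled,
}

#[derive(Debug, Clone)]
struct Job {
    id: u64,
    service_id: Option<String>,
    action_id: Option<String>,
    phase: Phase,
}

/// In-memory stand-in for the jobs table (hypothetical).
struct Store {
    jobs: Vec<Job>,
    next_id: u64,
}

impl Store {
    fn new() -> Self {
        Store { jobs: Vec::new(), next_id: 1 }
    }

    /// Central creation path: every job gets its provenance here.
    fn create(&mut self, service_id: Option<&str>, action_id: Option<&str>) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        self.jobs.push(Job {
            id,
            service_id: service_id.map(str::to_string),
            action_id: action_id.map(str::to_string),
            phase: Phase::Running,
        });
        id
    }

    fn start_service(&mut self, service_id: &str, action_id: &str, replace_existing_jobs: bool) -> u64 {
        if replace_existing_jobs {
            // Cancel jobs of this service that are still active.
            for job in self.jobs.iter_mut() {
                if job.service_id.as_deref() == Some(service_id) && job.phase == Phase::Running {
                    job.phase = Phase::Cancelled;
                }
            }
            // All jobs of this service are now terminated, so delete them.
            self.jobs.retain(|j| j.service_id.as_deref() != Some(service_id));
        }
        self.create(Some(service_id), Some(action_id))
    }

    fn jobs_for(&self, service_id: &str) -> Vec<u64> {
        self.jobs
            .iter()
            .filter(|j| j.service_id.as_deref() == Some(service_id))
            .map(|j| j.id)
            .collect()
    }
}

fn main() {
    let mut store = Store::new();
    store.create(None, Some("backup")); // standalone job, no service_id
    store.start_service("web", "run", true);
    store.start_service("web", "run", true); // replaces the previous job
    println!("web jobs: {:?}", store.jobs_for("web"));
    println!("total jobs: {}", store.jobs.len());
}
```

The same assertions (one surviving job per service after a restart, standalone jobs untouched) carry over to tests against the real store once the `service_id` column exists.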
Acceptance Criteria
- `Job` struct has `service_id: Option<String>` and `action_id: Option<String>` fields
- `service.start` populates provenance and cleans up old jobs by default
- `service.stop` can optionally delete terminated jobs
- `job.create` sets `action_id` only
- `JobFilter` supports `service_id` filtering
Test Results
Overall: FAILED
Ran `cargo test --workspace --exclude hero_proc_integration_tests` (the `hero_proc_integration_tests` crate was excluded due to 6 compilation errors: missing fields `replace_existing_jobs` and `remove_jobs` in `ServiceStartInput`/`ServiceStopInput` structs in `tests/integration/tests/dev_only.rs`).
Summary
Failure Details
`hero_proc_lib::db::actions::model::tests::test_detect_interpreter_nushell`
The nushell interpreter detection returns `Bash` instead of `Nushell`.
Compilation Errors (excluded crate)
`hero_proc_integration_tests` failed to compile with 6 errors: `ServiceStartInput` is missing field `replace_existing_jobs` and `ServiceStopInput` is missing field `remove_jobs` in `tests/integration/tests/dev_only.rs`.
Warnings
- `partition_path` in `crates/hero_proc_lib/src/db/logs/store.rs:9`
- `ActionSpec` in `crates/hero_proc_lib/src/db/integration_tests.rs:614`
- `timestamp` field in `crates/hero_proc_integration_test/src/tests/stress.rs:24`
- `shell_escape` fn in `tests/integration/src/fixtures.rs:84`
Implementation Summary
Changes Made
Core data model (`crates/hero_proc_lib/src/db/jobs/model.rs`):
- Added `service_id: Option<String>` and `action_id: Option<String>` to `Job`, `JobSummary`, and `JobFilter` structs
- Indexes (`idx_jobs_service_id`, `idx_jobs_action_id`, `idx_jobs_service_phase`)
- `ALTER TABLE` migration for existing databases
- `insert_job`, `update_job`, `row_to_job`, `list_jobs`, `get_job`
- `service_id` and `action_id` filter support in `list_jobs`
JobsApi convenience methods (`crates/hero_proc_lib/src/db/factory.rs`):
- `list_by_service_id()`: filter jobs by service
- `delete_terminated_by_service()`: delete completed/failed/cancelled jobs for a service
Service lifecycle cleanup (`crates/hero_proc_server/src/rpc/service.rs`):
- `service.start`: Added `replace_existing_jobs` parameter (default `true`), which cancels active jobs and deletes terminated jobs before starting fresh
- `service.stop`: Added `remove_jobs` parameter (default `false`), which deletes terminated jobs after stopping
- `service_id` queries in `service_running_jobs()`, `service_last_terminal_state()`, `count_restarts()`, and `handle_children()`
Job provenance at creation sites:
- `service.rs handle_start/start_all`: Sets both `service_id` and `action_id`
- `job.rs handle_create`: Sets `action_id` only (standalone)
- `job.rs handle_retry`: Preserves provenance from original job
OpenRPC spec (`openrpc.json` + generated client):
- `replace_existing_jobs` param to `service.start`
- `remove_jobs` param to `service.stop`
- `service_id` and `action_id` to `Job`, `JobSummary`, and `JobFilter` schemas
Other updates:
- `web.rs`, `debug.rs`: Use `service_id` filter instead of `hero_proc_service_name`
- `Makefile`: Exclude integration test crates from `build`/`installdev` targets
Test Results
- Provenance fields (`service_id`, `action_id`) correctly populated ✅
- `replace_existing_jobs=true` cleans up old terminated jobs on restart ✅
- `remove_jobs=true` on stop deletes all service jobs ✅
- `service_management.rs`:
  - `test_job_provenance_fields`
  - `test_replace_existing_jobs_on_restart`
  - `test_stop_with_remove_jobs`
  - `test_standalone_job_has_no_service_id`
Implementation committed: 49a6ff0
Browse: 49a6ff0