service.define hardcodes Interpreter::Exec, breaks any caller passing shell-shaped exec strings (uc12-16 root cause) #113
Labels
No labels
prio_critical
prio_low
type_bug
type_contact
type_issue
type_lead
type_question
type_story
type_task
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
lhumina_code/hero_proc#113
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Root cause for uc12-16 (and likely uc17+, etc.) — confirmed by isolated repro
When
service.defineRPC is called with a shell-shapedexecstring (e.g."while true; do sleep 1; done"), the spawned job fails immediately withspawn error: No such file or directory (os error 2)and the service never transitions torunning.Repro
The job record:
Daemon log:
Root cause
crates/hero_proc_server/src/service/define.rs:143(and lines 167, 188) hardcodesinterpreter: Interpreter::Execon the main / stop / check actions. The executor (supervisor/executor.rs:655-663) interpretsInterpreter::Execas literal execve:script.split_whitespace()→ first token is the binary, rest are args. So"while true; do sleep 1; done"becomesexecve("while", ["true;", "do", "sleep", "1;", "done"]), which fails with ENOENT because there's no binary literally namedwhile.The spec's own example (
crates/hero_proc_server/src/service/SPECS.md:96):would also be broken under
Interpreter::Execsemantics — the single quotes aren't stripped, songinxwould receive literal'daemonandoff;'as separate argv entries. So this defect is not test-only; it's a UX flaw of the high-level convenience API.Affects
service.definecallers passing anything more shell-shaped than"binary arg1 arg2"crates/hero_proc_test/src/tests/functional/uc_12_16_service_lifecycle.rs(uc12, uc13, uc14, uc16 — confirmed)uc_*_*tests that useservice_define(haven't fully checked)Proposed fix
Replace the hardcoded
Interpreter::Execwithcrate::db::actions::model::detect_interpreter(exec)in all 3 sites indefine.rs. The existingdetect_interpreter(crates/hero_proc_server/src/db/actions/model.rs:124-148) already exists and defaults toBashwhen no shebang is present — so:execvalue"while true; do sleep 1; done"bash -c "while true; do sleep 1; done"✓"#!/bin/sh\nrm -rf foo"sh -c "#!/bin/sh\nrm -rf foo"✓"sleep 100"bash -c "sleep 100"✓"nginx -g 'daemon off;'"Interpreter::Execwould still be accessible via the lower-leveldb.actions.set(...)for users who want literal execve semantics.Verification
After the fix,
cargo run -p hero_proc_test --filter functional::uc_12_16::uc12_define_service_start_stopshould pass on a clean PATH_ROOT sandbox. Expect 4 tests in the uc_12_16 cluster + possibly several others to flip from FAIL to PASS.Origin
Introduced in
cc836a7(despiegk, 2026-05-14,refactor: consolidate crates, rewrite logging/scheduler/service/secrets modules) — the rewrite that created theservice/define.rsmodule. TheInterpreter::Execchoice was likely an oversight given the spec's own example would also fail under those semantics.Surfacing rather than auto-patching since
service/define.rsis hot Kristof surface (cc836a7was 7 days ago).cc @despiegk
Verified the proposed fix works. Patched
service/define.rslines 143/167/188 to usedetect_interpreter()(the proposal in the OP), rebuilt, ran against clean PATH_ROOT sandbox:uc12_define_service_start_stopuc13_restart_produces_new_job_idsuc14_stop_with_remove_jobs_deletes_recordsservice.childrendefect, not this issue's scope)uc16_system_class_excluded_from_stop_allSo 3/4 of the uc12-16 cluster flip from FAIL→PASS with the 3-line change, and uc14 reaches a second-stage assertion exposing a separate
service.childrendefect (worth its own ticket; not blocking this fix).Patch reverted from my working tree (this verification was repro-only — leaving the actual change to whoever takes the issue, since
service/define.rsis your hot surface).