now get required jobs to autostart when we restart hero_proc #23
Reference
lhumina_code/hero_proc#23
See also #22.
When we restart hero_proc, it needs to check which jobs come from an action that has is_process set.
These need to be autostarted; we of course still do health checks and the like.
Also check this in the UI (it can also be tested using the browser MCP).
If we delete a job, will it do a stop first?
Maybe we need to pop up a modal that shows logs of how we stop the still-running jobs; this is only needed for jobs that are actually running.
Implementation Spec for Issue #23: Autostart `is_process` Jobs on Restart & Safe Job Deletion
Objective
When `hero_proc_server` restarts, it must automatically restart any jobs that came from services whose actions have `is_process = true`; these are long-running processes that should always be running. Additionally, when a user deletes a running job from the UI, the job must be stopped first, and if it is actively running a process, a modal showing stop progress/logs should appear.
Requirements
- On `hero_proc_server` startup, after `recover_running_jobs()`, query all jobs that are in a terminal phase (`Failed`, `Cancelled`), have `is_process = true`, have a non-null `service_id` and `action_id`, and whose referenced service still exists in the DB; auto-create a new `Pending` job for each such action
- Skip an action if a `Pending`/`Running` job already exists for it
- For `is_process` running jobs, show an extra warning that this is a long-running service process
- When deleting a running job, call `job.cancel` first, then `job.delete` after cancel succeeds
Files to Modify
- `crates/hero_proc_lib/src/db/jobs/model.rs`: `list_is_process_terminal_jobs()` raw SQL function
- `crates/hero_proc_lib/src/db/factory.rs`: `list_process_jobs_needing_restart()` on `JobsApi`
- `crates/hero_proc_server/src/supervisor/mod.rs`: `autostart_process_jobs()`, called after `recover_running_jobs()` in `run()`
- `crates/hero_proc_ui/static/js/dashboard.js`: update `deleteJob()`, `bulkDeleteJobs()`, `deleteJobFromModal()` to stop-then-delete
Implementation Plan
Step 1: DB query for `is_process` terminal jobs
File: `crates/hero_proc_lib/src/db/jobs/model.rs` and `factory.rs`
- `list_is_process_terminal_jobs()` returning jobs where `is_process=1` AND `phase IN ('failed','cancelled')` AND service/action IDs are set
- `list_process_jobs_needing_restart()` on `JobsApi`, following the `running_pids()` pattern
Dependencies: none
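The Step 1 filter can be illustrated without a database. The sketch below applies the same three conditions to an in-memory slice. The `JobRow` struct, the `Phase` enum (including its `Succeeded` variant), and both function names are hypothetical stand-ins, not the actual types in `hero_proc_lib`; the real implementation is described as a raw SQL query.

```rust
/// Hypothetical job phases; only Failed and Cancelled matter for the filter.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    Pending,
    Running,
    Succeeded,
    Failed,
    Cancelled,
}

/// Hypothetical, simplified job record (the real model has more fields).
#[derive(Debug, Clone, PartialEq)]
pub struct JobRow {
    pub id: u64,
    pub phase: Phase,
    pub is_process: bool,
    pub service_id: Option<u64>,
    pub action_id: Option<u64>,
}

/// Mirrors the Step 1 conditions:
/// is_process = 1 AND phase IN ('failed','cancelled') AND both IDs set.
pub fn is_restart_candidate(job: &JobRow) -> bool {
    job.is_process
        && matches!(job.phase, Phase::Failed | Phase::Cancelled)
        && job.service_id.is_some()
        && job.action_id.is_some()
}

/// In-memory stand-in for the SQL query: keeps only matching jobs.
pub fn list_restart_candidates(jobs: &[JobRow]) -> Vec<&JobRow> {
    jobs.iter().filter(|j| is_restart_candidate(j)).collect()
}
```

A job that is terminal but lacks an `action_id`, or that is not an `is_process` job, is excluded, which matches the wording of the filter above.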
Step 2: Implement `autostart_process_jobs()` in Supervisor
File: `crates/hero_proc_server/src/supervisor/mod.rs`
- Add `async fn autostart_process_jobs(&self)` that queries terminal `is_process` jobs, checks for existing non-terminal jobs for the same service+action, verifies that the service/action still exist, creates a new `Pending` job, and logs it
- Call it from `run()` after `recover_running_jobs().await`
Dependencies: Step 1
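The decision logic in Step 2 can be sketched as a pure function that returns which service+action pairs should receive a new `Pending` job. This is a minimal sketch under assumptions: `Job`, `Phase`, and `plan_autostart` are hypothetical names, the service-existence check is reduced to a set lookup, the action-existence check is omitted, and collapsing several terminal jobs for the same pair into one restart is an assumption rather than something the spec states.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    Pending,
    Running,
    Failed,
    Cancelled,
}

/// Hypothetical, simplified job record.
#[derive(Debug, Clone)]
pub struct Job {
    pub phase: Phase,
    pub is_process: bool,
    pub service_id: Option<u64>,
    pub action_id: Option<u64>,
}

/// Returns the (service_id, action_id) pairs that should get a new Pending job.
pub fn plan_autostart(jobs: &[Job], existing_services: &HashSet<u64>) -> Vec<(u64, u64)> {
    // Pairs that already have a Pending/Running job must not be duplicated.
    let active: HashSet<(u64, u64)> = jobs
        .iter()
        .filter(|j| matches!(j.phase, Phase::Pending | Phase::Running))
        .filter_map(|j| Some((j.service_id?, j.action_id?)))
        .collect();

    let mut seen = HashSet::new();
    let mut plan = Vec::new();
    for job in jobs {
        let terminal = matches!(job.phase, Phase::Failed | Phase::Cancelled);
        if !(job.is_process && terminal) {
            continue;
        }
        // Both IDs must be set, as in the Step 1 query.
        let (Some(service), Some(action)) = (job.service_id, job.action_id) else {
            continue;
        };
        let key = (service, action);
        // Skip deleted services, already-active pairs, and repeated pairs.
        if existing_services.contains(&service) && !active.contains(&key) && seen.insert(key) {
            plan.push(key);
        }
    }
    plan
}
```

Keeping the planning step free of I/O makes the skip rules easy to unit-test; the supervisor would then only need to create one `Pending` job per returned pair and log it.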
Step 3: UI stop-then-delete for running jobs
File: `crates/hero_proc_ui/static/js/dashboard.js`
- Update `deleteJob(id)` to fetch the job, detect the running phase, show the appropriate warning modal, and call `job.cancel` then `job.delete`
- Add a `stopAndDeleteJob(id, job)` helper
- Update `bulkDeleteJobs()` to count and warn about running jobs, and stop them first
- Update `deleteJobFromModal()` to use the same flow
Dependencies: none (independent of Steps 1-2)
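The stop-then-delete ordering in Step 3 lives in `dashboard.js`, but the decision itself is language-neutral. The sketch below expresses it in Rust purely as an illustration: `Phase`, `Rpc`, `DeletePlan`, `plan_delete`, and the warning strings are all hypothetical, and treating only `Running` as the phase that needs a stop is an assumption based on "detect running phase" above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    Pending,
    Running,
    Succeeded,
    Failed,
    Cancelled,
}

/// The RPC calls the UI would issue, in order (job.cancel, job.delete).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Rpc {
    Cancel,
    Delete,
}

#[derive(Debug, PartialEq, Eq)]
pub struct DeletePlan {
    /// Extra warning shown in the confirm modal, if any.
    pub extra_warning: Option<&'static str>,
    /// Calls to issue, in order.
    pub calls: Vec<Rpc>,
}

/// Decides what deleting a job involves, given its phase and is_process flag.
pub fn plan_delete(phase: Phase, is_process: bool) -> DeletePlan {
    if phase != Phase::Running {
        // Nothing to stop: delete directly.
        return DeletePlan { extra_warning: None, calls: vec![Rpc::Delete] };
    }
    let warning = if is_process {
        "This is a long-running service process; it will be stopped before deletion."
    } else {
        "This job is still running; it will be stopped before deletion."
    };
    DeletePlan {
        extra_warning: Some(warning),
        calls: vec![Rpc::Cancel, Rpc::Delete],
    }
}
```

A bulk delete could apply the same function to each selected job and count how many plans contain `Rpc::Cancel` to build its warning.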
Acceptance Criteria
Notes
- `handle_start` in `rpc/service.rs`
- `force: true` in `job.delete` remains for safety even after cancel
Test Results
Failure Details
The workspace build failed in `tests/integration/tests/hero_script.rs` due to 5 compiler errors. No test binaries were produced and no tests ran.
File: `tests/integration/tests/hero_script.rs`
- `hero_proc_sdk::JobLogsInput` has no field `attempt`
- no field `lines` on `JobLogsOutput`; available field is `value`
- `Interpreter` (three errors)
Root Cause
The integration test file `tests/integration/tests/hero_script.rs` references SDK types/fields that have since been renamed or restructured:
- `JobLogsInput.attempt` no longer exists
- `JobLogsOutput.lines` was renamed to `.value`
- The `Interpreter` type is not imported/declared in scope
The test file needs to be updated to match the current `hero_proc_sdk` API.
Implementation Complete
Changes Made
`crates/hero_proc_lib/src/db/jobs/model.rs`
- `list_is_process_terminal_jobs()`: raw SQL query returning all `is_process=1` jobs in `failed`/`cancelled` phase with `service_id` and `action_id` set
`crates/hero_proc_lib/src/db/factory.rs`
- `JobsApi::list_process_jobs_needing_restart()`: delegates to the new model function through the mutex-guarded connection (same pattern as `running_pids()`)
`crates/hero_proc_server/src/supervisor/mod.rs`
- `autostart_process_jobs()` async method on `Supervisor`:
  - queries terminal `is_process` jobs
  - skips any `service_id`+`action_id` pair that already has a non-terminal job
  - creates a new `Pending` job copying `context_name`, `action`, `description`, `is_process`, `service_id`, `action_id`, `spec`, and `script`
  - logs each autostarted job with `tracing::info!`
- Called from `run()` after `recover_running_jobs().await`
`crates/hero_proc_ui/static/js/dashboard.js`
- `stopAndDeleteJob(id, job)` async helper: detects the running phase, shows the appropriate warning modal (extra warning for `is_process` jobs), calls `job.cancel`, waits 500 ms, then calls `job.delete`
- `deleteJob(id)`: fetches the job, delegates to `stopAndDeleteJob`
- `bulkDeleteJobs()`: counts running jobs in the selection, warns in the confirm modal, stops them before deleting
- `deleteJobFromModal(id)`: uses `cachedJobs` to avoid a redundant fetch, delegates to `stopAndDeleteJob`
Test Results
- `hero_proc_lib` and `hero_proc_server` packages: ✅ all tests pass
- `tests/integration` package: ⚠️ pre-existing compile errors in `hero_script.rs` (unrelated to this change; `JobLogsInput`/`JobLogsOutput` field names are out of sync with the SDK)
Acceptance Criteria
- `is_process` jobs in `failed`/`cancelled` phase are auto-queued as `Pending`
- No duplicate `Pending` jobs created
- An `info` message is logged per autostarted job
- `is_process` running jobs get an extra warning message
- `hero_proc_lib` and `hero_proc_server` tests pass
Implementation committed: bf4235c