Tag jobs and actions for bulk cleanup in hero_proc #10
Parent: #8
Problem
Jobs and Actions triggered by hero_codescalers accumulate in hero_proc indefinitely. There is no mechanism to clean them up in bulk.
Solution
- codescalers_<feature>_<input_hash>)
- $HERO_SOCKET_DIR/hero_proc/rpc.sock)
- submit_or_dedup_docs_job with action key docs_<verb>_<hash>
Relevant paths
- hero_proc/crates/hero_proc_server/: where action/job management lives
- hero_proc/docs/hero_proc_openrpc/: check existing clear/delete methods
- hero_codescalers/crates/hero_codescalers_server/: where jobs are triggered
Acceptance Criteria
Implementation Spec for Issue #10 — jobs.cleanup RPC + CLI
Objective
Provide a single, admin-gated mechanism for bulk-removing every hero_proc job and action that hero_codescalers has ever enqueued, so hero_proc does not grow unbounded across repeated feature runs. Exposed as jobs.cleanup over JSON-RPC and as a hero_codescalers jobs-cleanup CLI subcommand. Tagging itself is already in place (build_tags() in jobs.rs); the missing piece is the cleanup operation that consumes those tags.
Requirements
- Add jobs::cleanup(state, predicate) in hero_codescalers_server.
- Only touch jobs that carry the codescaler tag: never let a caller use this method to nuke unrelated hero_proc state.
- Optional filters: kind, target, actor, older_than_ms.
- Delete matched jobs (force=true), then delete the unique action specs they originated from. Action deletion is best-effort because the same action_id may be referenced by other jobs we do not own.
- Return { jobs_deleted, actions_deleted, errors[] }.
- Support a dry_run: bool flag: list what would be deleted, change nothing.
- Admin-gated, same auth model as jobs.delete, jobs.bulk, jobs.cancel.
- Register the method in openrpc.json. The SDK regenerates automatically at build time via the openrpc_client! macro.
- CLI: hero_codescalers jobs-cleanup [--kind <k>] [--target <t>] [--actor <a>] [--older-than <duration>] [--dry-run].
Files to Modify/Create
- crates/hero_codescalers_server/src/jobs.rs: add CleanupPredicate struct, CleanupSummary struct, and cleanup() async function.
- crates/hero_codescalers_server/src/main.rs: add jobs.cleanup arm to dispatch, admin-gated.
- crates/hero_codescalers_server/openrpc.json: register jobs.cleanup method with params/result schema; hero_codescalers_sdk regenerates automatically on next cargo build because it consumes this file via the openrpc_client! macro.
- crates/hero_codescalers/src/main.rs: add JobsCleanup variant to Commands and its match arm in main().
Implementation Plan
Step 1: jobs::cleanup function
Files: crates/hero_codescalers_server/src/jobs.rs
New input/output types:
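The type definitions themselves are not shown on this page. As a rough sketch only, the two structs could look like the following; the field names come from the summary further down this issue, while the field types (String ids, usize counters, i64 milliseconds) and the derives are assumptions. The real structs presumably also derive serde traits so they can cross the RPC boundary.

```rust
/// Filters accepted by cleanup. Every field is optional; an empty predicate
/// matches every job that carries the `codescaler` tag.
/// Field types are assumptions, not taken from the actual source.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct CleanupPredicate {
    pub kind: Option<String>,
    pub target: Option<String>,
    pub actor: Option<String>,
    /// Only match jobs created at least this many milliseconds ago.
    pub older_than_ms: Option<i64>,
    /// When true, report what would be deleted and change nothing.
    pub dry_run: bool,
}

/// Result of a cleanup run. On a dry run the two counters stay at zero
/// while `job_ids` and `action_ids` list what would have been removed.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct CleanupSummary {
    pub jobs_deleted: usize,
    pub actions_deleted: usize,
    pub job_ids: Vec<String>,
    pub action_ids: Vec<String>,
    pub errors: Vec<String>,
    pub dry_run: bool,
}
```

Deriving Default keeps the "no filters" case cheap to construct with CleanupPredicate::default().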
Function signature:
Body, in order:
1. Connect via hero_proc_sdk::hero_proc_factory() (same pattern as list()).
2. Pick the most-specific filter tag, in priority order:
   - pred.actor → format!("codescaler_{actor}")
   - pred.kind → format!("codescaler_kind_{kind}")
   - pred.target → format!("codescaler_target_{target}")
   - otherwise "codescaler".
3. Call hp.job_list(JobListInput { filter: Some(JobFilter { tag: Some(filter_tag), limit: Some(10_000), ..Default::default() }) }).
4. For each JobSummary, post-filter in Rust:
   - Must have codescaler in tags (defense-in-depth).
   - If pred.kind set: tags must contain codescaler_kind_<kind>.
   - If pred.target set: tags must contain codescaler_target_<target>.
   - If pred.actor set: tags must contain codescaler_<actor>.
   - If pred.older_than_ms set: j.created_at_ms.unwrap_or(i64::MAX) <= chrono::Utc::now().timestamp_millis() - older_than_ms.
5. Collect (id, action_id) pairs from matched jobs. Build summary.job_ids and a HashSet<String> of unique action_id values.
6. If pred.dry_run: set summary.dry_run = true, leave deletion counts at 0, return.
7. For each matched job: hp.job_delete(JobDeleteInput { id, force: Some(true) }). On Ok bump jobs_deleted. On Err push into summary.errors.
8. For each unique action: hp.action_delete(ActionDeleteInput { name: action_id, context: None }). Best-effort; failures go into errors.
9. Return Ok(json!(summary)).
Dependencies: none.
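The tag selection and the post-filter in the body above are pure logic, so they can be sketched without hero_proc_sdk. The version below is an illustration under assumptions: JobView is a hypothetical stand-in for the fields of JobSummary that the filter reads, the current time is passed in as now_ms instead of calling chrono, and the function names follow the cleanup_filter_tag / cleanup_keep helpers named later in this issue without claiming to match their real signatures.

```rust
/// Hypothetical stand-in for the JobSummary fields the filter needs.
pub struct JobView {
    pub tags: Vec<String>,
    pub created_at_ms: Option<i64>,
}

#[derive(Default)]
pub struct CleanupPredicate {
    pub kind: Option<String>,
    pub target: Option<String>,
    pub actor: Option<String>,
    pub older_than_ms: Option<i64>,
}

/// Treat an empty string the same as None.
fn non_empty(v: &Option<String>) -> Option<&str> {
    v.as_deref().filter(|s| !s.is_empty())
}

/// Pick the single tag pushed down to the server-side filter:
/// actor > kind > target > "codescaler".
pub fn cleanup_filter_tag(pred: &CleanupPredicate) -> String {
    if let Some(a) = non_empty(&pred.actor) {
        return format!("codescaler_{a}");
    }
    if let Some(k) = non_empty(&pred.kind) {
        return format!("codescaler_kind_{k}");
    }
    if let Some(t) = non_empty(&pred.target) {
        return format!("codescaler_target_{t}");
    }
    "codescaler".to_string()
}

/// Client-side post-filter: true means the job is selected for cleanup.
pub fn cleanup_keep(job: &JobView, pred: &CleanupPredicate, now_ms: i64) -> bool {
    let has = |tag: &str| job.tags.iter().any(|t| t.as_str() == tag);
    // Defense-in-depth: never select a job without the literal tag.
    if !has("codescaler") {
        return false;
    }
    let required = [
        non_empty(&pred.kind).map(|k| format!("codescaler_kind_{k}")),
        non_empty(&pred.target).map(|t| format!("codescaler_target_{t}")),
        non_empty(&pred.actor).map(|a| format!("codescaler_{a}")),
    ];
    if required.iter().flatten().any(|tag| !has(tag.as_str())) {
        return false;
    }
    match pred.older_than_ms {
        None => true,
        // A job with no creation timestamp is skipped when an age is requested.
        Some(age) => job
            .created_at_ms
            .map_or(false, |created| created <= now_ms.saturating_sub(age)),
    }
}
```

Passing now_ms in explicitly keeps the age check deterministic in unit tests; the real function may read the clock directly.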
Step 2: Wire jobs.cleanup arm in dispatch
Files: crates/hero_codescalers_server/src/main.rs
Insert after "jobs.bulk" arm (~line 583):
Dependencies: Step 1.
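The arm itself is not reproduced on this page. The following is a minimal sketch of what an admin-gated arm next to "jobs.bulk" could look like. Only the method names and the name require_admin come from this issue; the Caller and RpcError types, the signature of require_admin, and the string results are hypothetical placeholders for the server's real request context and the call into jobs::cleanup.

```rust
#[derive(Debug, PartialEq)]
pub enum RpcError {
    Forbidden,
    MethodNotFound(String),
}

/// Hypothetical caller context; the real server derives this from the request.
pub struct Caller {
    pub is_admin: bool,
}

/// Assumed shape of the admin guard; the real helper may differ.
fn require_admin(caller: &Caller) -> Result<(), RpcError> {
    if caller.is_admin {
        Ok(())
    } else {
        Err(RpcError::Forbidden)
    }
}

/// Simplified dispatch: each destructive method checks the guard first,
/// so a non-admin caller is rejected before any cleanup logic runs.
pub fn dispatch(method: &str, caller: &Caller) -> Result<String, RpcError> {
    match method {
        "jobs.bulk" => {
            require_admin(caller)?;
            Ok("bulk accepted".to_string())
        }
        "jobs.cleanup" => {
            require_admin(caller)?;
            // The real arm would parse params into a predicate and
            // await jobs::cleanup(state, pred) here.
            Ok("cleanup accepted".to_string())
        }
        other => Err(RpcError::MethodNotFound(other.to_string())),
    }
}
```

Placing the guard as the first statement of the arm mirrors the "same auth model" requirement: the check cannot be skipped by malformed params.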
Step 3: Register jobs.cleanup in openrpc.json
Files: crates/hero_codescalers_server/openrpc.json
Insert after jobs.bulk (line 383), before actor.info (line 384):
The hero_codescalers_sdk regenerates JobsCleanupInput / JobsCleanupOutput and the jobs_cleanup() method automatically on next cargo build via the openrpc_client! macro. No manual SDK changes.
Dependencies: Step 2.
Step 4: CLI subcommand jobs-cleanup
Files: crates/hero_codescalers/src/main.rs
Add Commands::JobsCleanup variant with flags --kind, --target, --actor, --older-than, --dry-run. Add match arm calling client.jobs_cleanup(...) and printing the result as pretty JSON. Add a small parse_duration_ms() helper supporting ms, s, m, h, d.
Dependencies: Step 3 (so hero_codescalers_sdk::JobsCleanupInput exists).
Acceptance Criteria
- Every codescaler-triggered job carries the codescaler + per-feature tags (already done by build_tags() at jobs.rs:76).
- jobs.cleanup RPC method exists, is admin-gated, and is documented in openrpc.json.
- jobs::cleanup() removes all matched jobs, then deletes their unique action specs from hero_proc.
- hero_codescalers jobs-cleanup works end to end and prints the summary as pretty JSON.
- --dry-run returns the same summary shape with jobs_deleted=0, actions_deleted=0, and the would-be-deleted ids enumerated.
- Filters (--kind, --target, --actor, --older-than) all narrow correctly.
- After a full cleanup, jobs.list returns no codescaler jobs and stats.job_count drops accordingly.
Notes
- Whichever codescaler* tag is pushed down to the server, the Rust post-filter re-checks the literal codescaler tag before deleting, mirroring list() / job_value_has_codescaler_tag.
- action.delete may legitimately fail (another job still references that spec); failures land in summary.errors without aborting cleanup.
- The older_than_ms filter treats an absent created_at_ms conservatively (the job is skipped when the filter is set).
- JobFilter.tag is single-valued: most-specific-tag selection is what makes server-side narrowing work.
- Action ids are read from JobSummary.action_id; codescaler enqueue() always sets a unique action name.
Test Results
Workspace: cargo test -p hero_codescalers_server -p hero_codescalers --no-fail-fast
7 tests passed (2 in hero_codescalers, 5 in hero_codescalers_server)
hero_codescalers_server::jobs::cleanup_tests
- filter_tag_priority_actor_kind_target_default: verifies the server-side tag pushdown picks actor > kind > target > "codescaler".
- filter_tag_ignores_empty_strings: empty strings are treated the same as None.
- keep_requires_codescaler_tag: defense-in-depth; a job missing the literal codescaler tag is never kept, even if hero_proc returned it.
- keep_narrows_by_kind_target_actor: Rust-side post-filter excludes mismatching kind/target/actor.
- keep_older_than_ms_skips_recent_and_undated: recent jobs are skipped; jobs with no created_at_ms are skipped conservatively when older_than_ms is set, and left untouched when it is not.
hero_codescalers::tests
- duration_units: 0, 250, 250ms, 3s, 10m, 2h, 7d all parse to the right millisecond count.
- duration_errors: empty string, non-numeric, unknown unit (5y, 10x) all return errors.
Build
cargo build -p hero_codescalers_sdk -p hero_codescalers_server -p hero_codescalers succeeds: the SDK macro picked up jobs.cleanup from openrpc.json and generated JobsCleanupInput / jobs_cleanup() automatically.
No new warnings introduced; the 13 pre-existing dead-code warnings in geoip.rs, model/state.rs, and store.rs are unchanged.
Implementation Summary
Closes the jobs.cleanup half of issue #10. Tagging itself was already in place: build_tags() at crates/hero_codescalers_server/src/jobs.rs:76 already produces codescaler, codescaler_<actor>, codescaler_kind_<kind>, codescaler_target_<t> for every enqueued job. What was missing is the cleanup operation that consumes those tags. This change adds it end-to-end.
Changes
- crates/hero_codescalers_server/src/jobs.rs, added:
  - pub struct CleanupPredicate: kind, target, actor, older_than_ms, dry_run.
  - pub struct CleanupSummary: jobs_deleted, actions_deleted, job_ids, action_ids, errors, dry_run.
  - cleanup_filter_tag() (private, unit-tested): picks the most-specific tag for the server-side pushdown filter (actor > kind > target > "codescaler").
  - cleanup_keep() (private, unit-tested): Rust-side post-filter; defense-in-depth re-check that the literal codescaler tag is present, plus narrowing on kind/target/actor/older_than_ms.
  - pub async fn cleanup(): list via hp.job_list, post-filter, collect unique action ids into a BTreeSet, then job_delete(force=true) each job and action_delete each unique action. Action delete failures are best-effort and recorded in summary.errors. Honors dry_run (returns the matched set without deleting).
- crates/hero_codescalers_server/src/main.rs: added the "jobs.cleanup" arm to dispatch, admin-gated via require_admin (same auth model as jobs.delete, jobs.bulk, jobs.cancel).
- crates/hero_codescalers_server/openrpc.json: registered the jobs.cleanup method with full param/result schemas. The SDK regenerates automatically on next cargo build because hero_codescalers_sdk consumes this file via the openrpc_client! macro, so no manual SDK code changes were required.
- crates/hero_codescalers/src/main.rs: added the JobsCleanup variant and its handler. --older-than accepts Nms, Ns, Nm, Nh, Nd. Output is the summary as pretty JSON.
Tests
7 new unit tests, all passing (cargo test -p hero_codescalers_server -p hero_codescalers):
- cleanup_filter_tag and cleanup_keep: predicate priority, empty-string handling, the codescaler-tag guard, kind/target/actor narrowing, older_than_ms semantics.
- parse_duration_ms: every accepted unit and the four error paths.
Acceptance criteria
- Every codescaler-triggered job carries the codescaler + per-feature tags (already done by build_tags()).
- jobs.cleanup RPC method exists, admin-gated, documented in openrpc.json.
- jobs::cleanup() removes matched jobs, then deletes their unique action specs.
- hero_codescalers jobs-cleanup works end to end.
- --dry-run returns the same summary shape with deletion counts at zero and the matched ids enumerated.
- Filters (--kind, --target, --actor, --older-than) all narrow correctly.
Out of scope / follow-ups
summary.errors rather than aborting cleanup.
Closed: landed in PR #14 (merged)
Tagging itself was already in place via build_tags() (crates/hero_codescalers_server/src/jobs.rs:76); every codescaler-triggered job carries:
- codescaler
- codescaler_<actor>
- codescaler_kind_<kind>
- codescaler_target_<t> (when target arg present)
What was missing was the bulk-cleanup operation that consumes those tags. PR #14 adds:
- pub async fn cleanup(state, pred) -> Result<Value> in jobs.rs. Lists by tag (most-specific pushdown via JobFilter.tag), post-filters in Rust on kind/target/actor/older_than_ms, deletes matching jobs (force=true), then best-effort deletes unique action specs. Returns { jobs_deleted, actions_deleted, job_ids, action_ids, errors, dry_run }.
- jobs.cleanup arm in dispatch, admin-gated like jobs.bulk / jobs.delete.
- Method registration in openrpc.json; SDK regenerates at build time via the openrpc_client! macro.
- hero_codescalers jobs-cleanup [--kind] [--target] [--actor] [--older-than <Nms|s|m|h|d>] [--dry-run].
- Unit tests for tag filtering, older_than_ms semantics, and parse_duration_ms.
Bonus fix surfaced during verification:
hero_proc#50 (merged): spec.tags weren't being copied onto Job.tags at job.create time, so JobSummary.tags was always null and the tag filter returned nothing. Without that fix, jobs.cleanup couldn't actually find anything to clean.
Verified end-to-end on kristof4:
- service.start service_router → tagged job appears in hero_proc job.list filter and codescalers jobs.list.
- jobs.cleanup --dry-run correctly identifies the matching job + its action.
- After a real (non-dry-run) jobs.cleanup, the hero_proc row count drops accordingly.
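For reference, the --older-than flag listed above takes values such as 250ms, 3s, 10m, 2h or 7d. A helper of that kind could be sketched as follows; this is not the merged parse_duration_ms, only an illustration that matches the accepted inputs listed in the test results (a bare number is read as milliseconds). The error type, the messages and the overflow handling are assumptions.

```rust
/// Parse a duration such as "250", "250ms", "3s", "10m", "2h" or "7d"
/// into milliseconds. A bare number is interpreted as milliseconds.
pub fn parse_duration_ms(input: &str) -> Result<i64, String> {
    let s = input.trim();
    if s.is_empty() {
        return Err("empty duration".to_string());
    }
    // Split at the first non-digit: leading digits, then the unit suffix.
    let split = s.find(|c: char| !c.is_ascii_digit()).unwrap_or(s.len());
    let (digits, unit) = s.split_at(split);
    if digits.is_empty() {
        return Err(format!("no number in {input:?}"));
    }
    let n: i64 = digits
        .parse()
        .map_err(|e| format!("bad number in {input:?}: {e}"))?;
    let factor: i64 = match unit {
        "" | "ms" => 1,
        "s" => 1_000,
        "m" => 60_000,
        "h" => 3_600_000,
        "d" => 86_400_000,
        other => return Err(format!("unknown unit {other:?}")),
    };
    // Reject values that would not fit in an i64 millisecond count.
    n.checked_mul(factor)
        .ok_or_else(|| format!("duration overflows: {input:?}"))
}
```

The result feeds directly into older_than_ms, so keeping the unit table in one match arm makes it easy to see which suffixes the CLI accepts.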