Jobs appeared in the sidebar as jobs count but not in the jobs tab #22
We need to investigate this issue. We may need to call the appropriate hero_proc endpoint to fetch the jobs and list them, similar to how it’s done in the sidebar.
might be related to #20
Implementation Spec for Issue #22
Objective
The sidebar's "Jobs Total" badge displays a non-zero count while the Jobs tab shows "No jobs". The two views read from different fields of the same `hero_proc_sdk` `job_list` response: the sidebar uses `JobListResult.total`, which hero_proc computes from a SQL `COUNT(*)` that ignores the tag filter, while the Jobs tab consumes `JobListResult.jobs`, which IS post-filtered by tag in Rust. Make both views render the same codescaler-scoped count by sourcing it from the post-filtered list returned by `jobs::list`, not from `total`.

Root cause
- `crates/hero_codescalers_server/src/main.rs:676-689` — `get_hero_proc_job_count()` returns `r.total.unwrap_or(0)`. That `total` field is populated by hero_proc's `list_jobs` at `hero_proc/crates/hero_proc_lib/src/db/jobs/model.rs:670-678`, which builds the SQL `WHERE` clause from `context_name`, `phase`, `service_id`, `action_id`, `run_id`, and `hero_proc_service_name` only. Tag filtering is applied in Rust after the SELECT (model.rs:697-702), so `total` counts every job in the hero_proc DB regardless of tag. Sending `tag: Some("codescaler")` makes the `jobs` array tag-correct but leaves `total` tag-unaware.
- `crates/hero_codescalers_server/src/jobs.rs:226-269` — `jobs::list` returns the correct, post-filtered job set. Its only weakness is the `limit.or(Some(500))` default: the SQL page is taken first (most-recent 500 across ALL jobs in hero_proc), then narrowed to codescaler in Rust. On a busy shared hero_proc (foreign jobs from other services), this can drop legitimate codescaler jobs off the visible page even though the sidebar would still happily report them in `total`.
- `crates/hero_codescalers_ui/static/js/dashboard.js:850-856` — `pollSidebar` writes `stats.job_count` into `sb-jobs-total` (and incorrectly into `sb-jobs-running`, with `sb-jobs-failed` hard-coded to 0). The sidebar therefore re-displays the wrong number every poll cycle, even when `loadJobs()` briefly overwrites it with the correct count from the table fetch.
- `dashboard.js:618` reads `j.created_at`; `jobs::list` emits `j.created_at_ms` (jobs.rs:537). Every row's "Created" column renders as "—" — easy to confuse with "no data" while debugging this issue.

The Jobs tab is the correct view (it iterates the tag-filtered `jobs` array). The sidebar is the wrong view (it trusts `total`).
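To make the mismatch concrete, here is a minimal sketch of the two reads, assuming the `JobFilter`/`JobListResult` shapes quoted above; the `HeroProcClient` type name and the `anyhow` error handling are placeholders, not the real SDK surface:

```rust
// Sketch only: `hp` stands in for the daemon's hero_proc client handle
// (type name assumed); field names follow the spec above.
async fn show_divergence(hp: &HeroProcClient) -> anyhow::Result<()> {
    let r = hp
        .job_list(JobFilter {
            tag: Some("codescaler".into()),
            ..Default::default()
        })
        .await?;

    // Sidebar (wrong): SQL COUNT(*) whose WHERE clause never saw the tag,
    // so it counts every job in the hero_proc DB.
    let sidebar_total = r.total.unwrap_or(0);

    // Jobs tab (right): rows narrowed to the tag in Rust after the SELECT.
    let tab_total = r.jobs.len() as u64;

    // On any shared hero_proc these two numbers diverge, which is exactly
    // the mismatch reported in this issue.
    println!("sidebar={sidebar_total} tab={tab_total}");
    Ok(())
}
```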
Requirements
Files to Modify
- `crates/hero_codescalers_server/src/main.rs` — replace `get_hero_proc_job_count()` so it returns counts derived from the post-filtered `jobs` array, and surface a phase breakdown so the sidebar can render Running / Failed correctly without a separate `jobs.list` round-trip on every poll.
- `crates/hero_codescalers_server/src/jobs.rs` — add a small internal helper (e.g. `count_by_phase`) shared between the new stats path and the existing `list`, so there is one definition of "codescaler jobs". Optionally raise the default page size used by `jobs::list` to a higher cap (e.g. 2_000) and document why.
- `crates/hero_codescalers_ui/static/js/dashboard.js` — update `pollSidebar` to consume the new structured `job_stats` object (or a dedicated key) from the `stats` response instead of treating `job_count` as both total and running. Fix the `j.created_at` → `j.created_at_ms` rendering bug in `renderJobs`.

No template change, `openrpc.json` edit, or SDK regeneration is strictly required if we only enrich the existing `stats` result with extra keys — additive JSON changes don't break older clients.

Implementation Plan
Step 1: Add a shared codescaler-scoped counting helper in `jobs.rs`

Files: `crates/hero_codescalers_server/src/jobs.rs`
Dependencies: none

- Add `pub async fn stats(state: &AppState) -> Result<JobStatsSummary>` (or equivalent) that calls `hp.job_list` with `JobFilter { tag: Some("codescaler"), limit: Some(2_000), ..Default::default() }`, applies the same defense-in-depth `tags.contains("codescaler")` filter that `list()` already does, and returns `{ total: usize, running: usize, failed: usize, pending: usize, succeeded: usize, cancelled: usize }`. Returning a struct with `Serialize` is cleanest — keeps `main.rs` free of phase-string logic. (A sketch follows this list.)
- Raise the default page size in `list()` from `Some(500)` to `Some(2_000)` so the Jobs tab cannot be starved by foreign jobs sharing the hero_proc DB. Add an inline comment that explains: hero_proc applies LIMIT before the Rust-side tag post-filter, so the SQL page must be wide enough to include all codescaler jobs in the working set.
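A minimal sketch of the helper under the assumptions above; the `state.hp()` accessor and the job's `tags`/`phase` field names are guesses at the surrounding code, not verified signatures:

```rust
use serde::Serialize;
// AppState, Result, and JobFilter come from the surrounding crate and
// hero_proc_sdk, as described in the spec.

#[derive(Debug, Default, Serialize)]
pub struct JobStatsSummary {
    pub total: usize,
    pub pending: usize,
    pub running: usize,
    pub succeeded: usize,
    pub failed: usize,
    pub cancelled: usize,
}

pub async fn stats(state: &AppState) -> Result<JobStatsSummary> {
    let hp = state.hp(); // hypothetical accessor for the hero_proc client
    let r = hp
        .job_list(JobFilter {
            tag: Some("codescaler".into()),
            // hero_proc applies LIMIT before its Rust-side tag post-filter,
            // so the SQL page must be wide enough to hold every codescaler job.
            limit: Some(2_000),
            ..Default::default()
        })
        .await?;

    // Same defense-in-depth post-filter that list() already applies.
    let mut s = JobStatsSummary::default();
    for job in r.jobs.iter().filter(|j| j.tags.iter().any(|t| t == "codescaler")) {
        s.total += 1;
        match job.phase.as_str() {
            "pending" => s.pending += 1,
            "running" => s.running += 1,
            "succeeded" => s.succeeded += 1,
            "failed" => s.failed += 1,
            "cancelled" => s.cancelled += 1,
            _ => {}
        }
    }
    Ok(s)
}
```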
Step 2: Rewrite `get_hero_proc_job_count()` and enrich the `stats` RPC

Files: `crates/hero_codescalers_server/src/main.rs`
Dependencies: Step 1

- Delete `get_hero_proc_job_count()` (line 676) and replace its single call site in the `"stats"` arm (line 331) with `let job_stats = jobs::stats(state).await.unwrap_or_default();`.
- In the `Ok(json!({ … }))` block (lines 349-366), keep `"job_count"` for backwards compatibility but set it to `job_stats.total`. Add a sibling `"job_stats": job_stats` (which serializes to `{ total, running, failed, pending, succeeded, cancelled }`) for the sidebar's per-phase rendering.
- Derive `Default` on the new `JobStatsSummary` so `unwrap_or_default()` keeps the daemon serving `stats` even when hero_proc is unreachable (matching today's `unwrap_or(0)` behavior). (A sketch of the enriched arm follows this list.)
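A hedged sketch of the enriched response, extracted into a standalone function for readability; the real code lives inline in the `"stats"` match arm and its other keys are elided here:

```rust
use serde_json::{json, Value};

// Hypothetical extraction of the "stats" arm body (main.rs, around line 331);
// names follow the spec, the handler shape around it is assumed.
async fn stats_result(state: &AppState) -> Value {
    // An all-zero summary via unwrap_or_default() keeps `stats` serving even
    // when hero_proc is unreachable, matching today's unwrap_or(0) behavior.
    let job_stats = jobs::stats(state).await.unwrap_or_default();
    json!({
        // ...existing stats keys stay as they are...
        "job_count": job_stats.total, // kept for older clients
        "job_stats": job_stats,       // new: per-phase breakdown
    })
}
```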
Step 3: Fix the sidebar to consume real codescaler counts

Files: `crates/hero_codescalers_ui/static/js/dashboard.js`
Dependencies: Step 2

- In `pollSidebar` (lines 850-856), replace the three lines that pump `stats.job_count` into both `sb-jobs-total` and `sb-jobs-running` (and the hard-coded zero into `sb-jobs-failed`) with reads from the new structured object: `setText('sb-jobs-running', stats.job_stats?.running ?? 0)`, `setText('sb-jobs-failed', stats.job_stats?.failed ?? 0)`, `setText('sb-jobs-total', stats.job_stats?.total ?? stats.job_count ?? 0)`. The `?? stats.job_count` fallback covers a transient version skew where the server has not been restarted yet.
- The sidebar should read `stats.job_stats`, not call `jobs.list` directly, so each poll costs one RPC instead of two.
Step 4: Fix the `created_at` rendering bug in the Jobs tab

Files: `crates/hero_codescalers_ui/static/js/dashboard.js`
Dependencies: none (independent of Steps 1-3)

- In `renderJobs` (line 618), change `j.created_at` to `j.created_at_ms` and convert via `new Date(j.created_at_ms).toLocaleString()`, only when the field is a finite number. This is mechanically tiny, but if left in place the Jobs tab will look "broken" to a human eye even after Steps 1-3 land.

Step 5: Smoke-test parity
Files: none (test harness)
Dependencies: Steps 1-4
- Run `make build && make install`, restart the server (`service_codescalers start --instance 0 --root --reset`, or whatever the local instance is), and open the UI.
- Submit `proc job submit … --tag foreign` (or any non-codescaler job) and confirm the sidebar Total does NOT increase.

Acceptance Criteria
phase == "running"; "Failed" equalsphase == "failed". Neither is hard-coded.created_at_ms, instead of "—".cargo test --workspace --libandmake checkpass.Notes
- Pushing the tag filter into hero_proc's SQL `WHERE` so `total` and the page contents agree would be the cleaner upstream fix, but that is out of scope per the issue body and would force a coordinated hero_proc release.
- Backwards compatibility: we only add a key (`job_stats`) to the existing `stats` result; older callers still see `job_count`.
- Performance: `stats` is already cheap (in-memory counters except for the hero_proc round-trip we're keeping). The new structured `job_stats` adds zero extra RPCs because we're reusing the same `job_list` call that today populates `job_count`.
- Nothing changes in `jobs::cleanup`, the `enqueue` tag scheme, the OpenRPC SDK, the askama template, or the auth gating. The bug is fully contained in the count derivation and one JS rendering line.

Test Results
- `cargo check --workspace` passed. Finished in 1.4s (full check from clean: 1.34s; cached: 0.12s).
- `cargo test --workspace --lib` passed.
Notes
- The workspace has no library test targets (`hero_codescalers_sdk` and `nu_exec` both report `running 0 tests`); the binary crates (`hero_codescalers`, `hero_codescalers_server`, `hero_codescalers_ui`) have no `--lib` target.
- Pre-existing dead-code warnings remain in `crates/hero_codescalers_ui/src/main.rs` (1 warning) and `crates/hero_codescalers_server/src/geoip.rs` (11 warnings); these were present before this change and are not introduced by the patch.

Implementation Summary
Root cause confirmed and fixed:
`hero_proc_sdk::JobListResult.total` is a SQL `COUNT(*)` that ignores the `tag` filter (tag matching happens in Rust after the SELECT). The sidebar was reading `total` and therefore showing every job in hero_proc, codescaler-tagged or not. The Jobs tab was iterating the post-filtered `jobs` array and showing the correct (smaller) set, so the two views disagreed.

Fix: derive both views from the same post-filtered list.
Changes
`crates/hero_codescalers_server/src/jobs.rs`

- Added a `CODESCALER_TAG` constant; replaced inline `"codescaler"` literals in `list()` with the constant.
- Added `JobStatsSummary { total, pending, running, succeeded, failed, cancelled }` (`Default`, `Serialize`).
- Added `pub async fn stats(_state: &AppState) -> Result<JobStatsSummary>` — queries the same `tag: "codescaler"` filter `list()` uses, applies the same defense-in-depth post-filter, and counts per phase.
- Raised the `list()` default page from `Some(500)` to `Some(2_000)`. An inline comment explains why: hero_proc's `LIMIT` is applied before its Rust-side tag post-filter, so the SQL page must be wide enough to include every codescaler job in the working set.

`crates/hero_codescalers_server/src/main.rs`

- Replaced the call to `get_hero_proc_job_count()` in the `stats` arm with `jobs::stats(state).await.unwrap_or_default()`.
- In the `stats` JSON response: `job_count` is preserved (now sourced from `job_stats.total`) for backwards compatibility; the new sibling `job_stats` carries the per-phase breakdown.
- Deleted `get_hero_proc_job_count()`.

`crates/hero_codescalers_ui/static/js/dashboard.js`

- `pollSidebar`: now reads `stats.job_stats.{total,running,failed}` instead of duplicating `stats.job_count` into Total and Running and hard-coding Failed to 0. The `?? stats.job_count` fallback covers a transient version skew where the server hasn't been restarted yet.
- `renderJobs` (line 618) and `showJobDetail` (line 1659): fixed `j.created_at` → `Number.isFinite(j.created_at_ms) ? new Date(j.created_at_ms).toLocaleString() : '—'`. The server emits `created_at_ms` (verified in jobs.rs:602); the prior code rendered every row's "Created" cell as "—".

Deliberately not changed
- `j.updated_at` in `showJobDetail` — the server has no `updated_at_ms` field (only `created_at_ms`, `started_at_ms`, `finished_at_ms`); leaving it as today's "—" is correct until a separate cleanup decides whether to drop the row or repurpose it for `finished_at_ms`.
- `loadJobs` writing the same sidebar IDs from its locally computed `allJobs` array — already codescaler-scoped and correct, just redundant; out of scope.
- `hero_proc` itself — pushing the tag filter into SQL `WHERE` would be the cleaner upstream fix, but it's a coordinated release and out of scope per the issue body.
- `openrpc.json` / SDK regeneration — `job_stats` is an additive field, no breaking change.

Test results
- `cargo check --workspace`: passed.
- `cargo test --workspace --lib`: passed (0 tests; the workspace has no library test targets, so this only confirms compile health). Behavioral verification is the manual smoke test from Step 5 of the spec — restart the service, populate hero_proc with codescaler jobs and at least one foreign-tagged job, reload the page without opening the Jobs tab, and confirm sidebar Total / Running / Failed match the Jobs tab and that foreign jobs do NOT inflate the sidebar count.
- Pre-existing dead-code warnings in `geoip.rs` and `hero_codescalers_ui/src/main.rs` remain, untouched by this patch.

Correction — initial fix had the wrong polarity
The first round (`c4126a0`) made the sidebar agree with the (over-filtered) Jobs tab. On `proc job list` we have ~48 jobs visible, but none of them are tagged `codescaler` — they're hero_proc-supervised services, not jobs hero_codescalers itself enqueued. So the previous fix landed both views on 0, which is internally consistent but useless.

Re-reading the issue body — "fetch the jobs and list them, similar to how it's done in the sidebar" — the right reading is to widen the tab to match the sidebar's previously-broad view, not narrow the sidebar.
What changed in `1e0a829`

- `jobs::list()` and `jobs::stats()`: drop the implicit `tag: codescaler` filter. `extra_tag` still narrows when explicitly requested (per-user views via `codescaler_<user>` continue to work) — see the filter sketch after this list.
- `jobs::get()`: drop the tag gate. The codescaler UI is admin-only (`ADMIN_SECRETS` whitelist), so any visible hero_proc job is fair game to inspect from this daemon. `logs` / `cancel` / `delete` follow automatically since they delegate to `get()`.
- `jobs::cleanup()` and `build_tags()` stay scoped to the codescaler tag — bulk-delete must never touch system services, and codescaler-launched jobs still carry the canonical tag so the per-user filter and the cleanup gate keep working.
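A rough sketch of the polarity change, using hypothetical free functions to isolate the filter construction; the real diff edits `list()`, `stats()`, and `get()` in place, and `JobFilter`/`CODESCALER_TAG` come from hero_proc_sdk and jobs.rs as above:

```rust
// Before 1e0a829 (simplified): every list/stats query was pinned to the
// canonical tag, so only jobs this daemon enqueued were ever visible.
fn filter_before() -> JobFilter {
    JobFilter {
        tag: Some(CODESCALER_TAG.to_string()),
        limit: Some(2_000),
        ..Default::default()
    }
}

// After 1e0a829: list/stats see all hero_proc jobs; extra_tag still narrows
// when explicitly requested (per-user views via codescaler_<user>).
fn filter_after(extra_tag: Option<String>) -> JobFilter {
    JobFilter {
        tag: extra_tag,
        limit: Some(2_000),
        ..Default::default()
    }
}

// cleanup() stays pinned to the canonical tag: bulk-delete must never touch
// hero_proc-supervised system services.
fn cleanup_filter() -> JobFilter {
    JobFilter {
        tag: Some(CODESCALER_TAG.to_string()),
        ..Default::default()
    }
}
```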
Smoke test

- `make build && make install`
- `service_codescalers start --root --reset` (or your local equivalent)
- Open the UI: the Jobs tab should now list the jobs visible in `proc job list` (and the Created column keeps the `created_at_ms` rendering fix from `c4126a0`).