[infra] build perf quick wins — default -j auto, wire sccache, conditional nice/ionice #188
Summary
Three small changes in `tools/modules/services/lib.nu` that significantly speed up `service_install_all` for everyone. None of them are structural — just defaults that haven't kept up with the box's actual capacity. Together they cut typical build cycles by 2-3× before any work on the bigger CI-artifacts story (hero_demo#54).

Verified live during the 2026-05-01 herodemo deploy: setting `HERO_CARGO_JOBS=0` mid-deploy took the box from load avg 1.14 (1 active rustc) to 15.22 (10+ active rustc), with iowait staying at 0-4%. The box is CPU-bound, and we were leaving ~75% of the CPU on the floor.

A. Default `HERO_CARGO_JOBS` to 0 (auto = nproc)

File: `tools/modules/services/lib.nu` (`svc_install` helper, around line ~620 where `HERO_CARGO_JOBS` is read).

Current: `HERO_CARGO_JOBS` defaults to `4`.

Proposed: default to `0`, meaning cargo auto (= nproc).

Rationale: the existing `4` was set when hero_proc had a chatty SQLite log that fought cargo for I/O bandwidth. With the SQLite log backend replaced by hero_log (a2eff7c), that contention is gone. Modern Hero deploy targets (herodemo: 16 CPU; CI runners: similar) underutilise at `-j 4`. Defaulting to 0 (cargo auto = nproc) lets every box use all its cores. Operators on smaller VMs can still cap with `HERO_CARGO_JOBS=4` explicitly.

Win: 2-4× build speedup on 16-core boxes. Free.
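For illustration, the proposed default logic can be sketched in plain shell (the actual helper is Nushell in `tools/modules/services/lib.nu`; `resolve_cargo_jobs` is a hypothetical name used only for this sketch):

```shell
# Illustrative only: resolve HERO_CARGO_JOBS, where 0 (the proposed
# default) means "auto = use every core" and any other value is an
# explicit cap for smaller VMs.
resolve_cargo_jobs() {
    jobs="${1:-0}"           # HERO_CARGO_JOBS, defaulting to 0
    if [ "$jobs" = "0" ]; then
        nproc                # auto: all cores on this box
    else
        echo "$jobs"         # explicit cap, e.g. 4
    fi
}

# e.g. cargo build -j "$(resolve_cargo_jobs "${HERO_CARGO_JOBS:-0}")"
```

On a 16-core box this turns the old `-j 4` default into `-j 16` without any operator action, while `HERO_CARGO_JOBS=4` still behaves exactly as before.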
B. Wire `sccache` into the `cargo build` env

Goal: shared compile cache across all 22+ services. Today every repo recompiles its own copy of axum, tokio, serde, hyper, rustls, etc. — 50+ shared deps × 22 services = enormous redundant work.

Mechanism: set `RUSTC_WRAPPER=sccache` in the env passed to `^nice` in `svc_install`. sccache transparently caches each rustc invocation by input hash; identical compiles short-circuit to a copy.

Prerequisites (mostly already done in this org):

- `sccache` binary installed on deploy targets — already present per the `sccache.nu` skill in hero_skills
- `~/.config/sccache/config` with a reasonable size cap (e.g. 50G — much smaller than the 87G cargo target dir we've seen)

File: `tools/modules/services/lib.nu`, in `svc_install` where the build is invoked. Add `RUSTC_WRAPPER=sccache` to the env record passed to `^nice`. Also condition on `which sccache` so the build still works on hosts without sccache.

Win on first deploy after population: ~60-80% reduction on shared deps. The first build that populates sccache pays the same cost as today; every subsequent build benefits. The win compounds across services within a single `service_install_all` run, since each service shares ~80% of its dep tree with its neighbours.

Caveats:
- `sccache` and `RUSTC_WRAPPER` interact with build.rs / proc-macros in subtle ways for some crates. Test on a few services before defaulting it on.
- `~/.cache/sccache` size capping per `sccache.nu`; verify the cap is reasonable.

D. Make `nice`/`ionice` flags conditional

Today: every cargo build is wrapped in `nice -n 19 ionice -c 3 cargo build ...` unconditionally. This is correct for "deploying onto a live production box without disrupting users" but pointless for "fresh deploy where nothing else is running on the box."

Verified live tonight: with no other heavy workloads, `ionice -c 3` is essentially a no-op (no I/O contention to yield to), and `nice 19` slightly slows the build (it yields to even minor processes). Neither hurts, but they're cosmetic in the fresh-deploy case.

Proposed: a `--low-priority` flag on `service_install_all` (and propagated to `svc_install`) that wraps the build with nice/ionice. Default OFF for `service_install_all` (fresh-deploy assumption); ON for `service_complete --update` if the operator passes it explicitly.

Files: `tools/modules/services/lib.nu` and `tools/modules/services/packages.nu`.

Win: small. Mostly a clarity-of-intent change — explicit about when we're being polite to running services vs when we're going as fast as possible.
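Taken together, sections B and D amount to conditional assembly of the build invocation. A minimal shell sketch under stated assumptions (the real code is a Nushell env record in `svc_install`; `build_cmd`, `LOW_PRIORITY`, and the `SCCACHE_BIN` override are hypothetical names for this sketch — only the `which sccache` gating and the `nice`/`ionice` values come from the proposal):

```shell
# Illustrative: build the cargo command line conditionally.
build_cmd() {
    # sccache detection, overridable for testing; defaults to `which sccache`.
    sccache_bin="${SCCACHE_BIN-$(command -v sccache || true)}"
    cmd="cargo build --release"
    # HERO_CARGO_JOBS=0 (the proposed default) means cargo auto = nproc,
    # so we simply omit -j; any other value is an explicit cap.
    if [ "${HERO_CARGO_JOBS:-0}" != "0" ]; then
        cmd="$cmd -j ${HERO_CARGO_JOBS}"
    fi
    # Section B: route rustc through sccache only where it is installed,
    # so builds still work on hosts without it.
    if [ -n "$sccache_bin" ]; then
        cmd="env RUSTC_WRAPPER=sccache $cmd"
    fi
    # Section D: opt-in politeness for live boxes, off for fresh deploys.
    if [ "${LOW_PRIORITY:-0}" = "1" ]; then
        cmd="nice -n 19 ionice -c 3 $cmd"
    fi
    echo "$cmd"
}
```

The same branch structure maps directly onto a Nushell `if`/`append` over the argument list and env record.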
Combined ROI
For a fresh `service_install_all` on a 16-core box with cargo cache cold:

For typical "redeploy single service" cycles, after sccache is warm: ~30 sec to 2 min per service vs today's 30-60 sec to 5-10 min. An order of magnitude better dev-iteration cycle.
Out of scope
`lto = true` + `codegen-units = 1` in the release profile — covered by a separate proposed `--debug` install-path issue (the next one I'm filing). Those flags are correct for production binaries; relaxing them is for dev-iteration only.

Cross-refs

`tools/modules/sccache.nu` skill — already exists in hero_skills, just not wired into builds yet

Validation
Live demonstration of A from tonight's herodemo deploy:
The same change as a default would benefit every Hero deploy.
Merged. Closing — defaults are now `HERO_CARGO_JOBS=0` (auto = nproc), `HERO_CARGO_NICE=0`, `HERO_CARGO_IONICE_C=""`, plus `HERO_CARGO_SCCACHE=auto`. Operators on live boxes opt back into politeness via `HERO_CARGO_NICE=19 HERO_CARGO_IONICE_C=3`.

The next deploy after the current herodemo run will be the first to use the new defaults — it should reproduce the 13× load-avg gain we observed live during this session.