Smoke Tests, Production Deployment & Cleanup #13
Smoke Tests, Production Deployment & Cleanup
Follow-up from #12 (Infrastructure Sync — completed).
herodev and herodemo are deployed and working (31 services each, HTTP 200). This issue covers smoke testing, service bug fixes, and the remaining cleanup before production.
Build Pipeline
Issue #12 replaced the old Dockerfile.dev (SSH git cloning inside Docker, 40-60 min) with the local build pipeline:
make dist (docker/build-local.sh): compiles all service repos inside rust:1.93-bookworm containers with lhumina_code/ and geomind_code/ volume-mounted. Whatever is checked out on disk gets built. Persistent cargo caches: 1-3 min incremental, 10-15 min cold.
make pack (docker build -f Dockerfile.pack): copies pre-built dist/ into a thin debian:bookworm-slim image. No compilation.
make push: pushes hero_zero:dev to forge.ourworld.tf/lhumina_code/hero_zero
make update ENV=herodev
Promotion:
make demo: tags :dev → :demo, pushes, deploys to herodemo.
Stale files to delete:
Dockerfile (old hero_zero) and Dockerfile.prod (SSH-based BuildKit), both replaced by Dockerfile.pack + docker/build-local.sh.
Smoke Tests
Three bash/curl test suites verify the live gateway remotely. ~110 tests, ~30 seconds, zero dependencies beyond curl/bash/python3.
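A minimal sketch of the kind of check these suites rely on, using only curl and bash. The function names and the GATEWAY_URL variable are assumptions made for illustration; the real smoke_*.sh scripts may be structured differently.

```shell
#!/usr/bin/env bash
# Sketch of a curl-based smoke check. Function names and GATEWAY_URL are
# illustrative assumptions, not taken from the real smoke_*.sh scripts.

PASSED=0
FAILED=0

# Print the HTTP status code for a URL (curl prints 000 if the request fails).
http_status() {
    curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$1"
}

# Compare an actual status code against the expected one and record the result.
expect_status() {
    local name="$1" expected="$2" actual="$3"
    if [ "$actual" = "$expected" ]; then
        PASSED=$((PASSED + 1))
        echo "PASS $name"
    else
        FAILED=$((FAILED + 1))
        echo "FAIL $name (expected $expected, got $actual)"
    fi
}

# Print totals and return non-zero if any check failed.
summary() {
    echo "$PASSED passed, $FAILED failed"
    [ "$FAILED" -eq 0 ]
}
```

With GATEWAY_URL pointing at the gateway under test, a health check would be `expect_status "hero_redis_ui health" 200 "$(http_status "$GATEWAY_URL/hero_redis_ui/health")"` followed by `summary`. The non-zero return from `summary` lets a caller such as a make target fail when any check fails.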
smoke_gateway.sh
smoke_test.sh
smoke_theme.sh: hero:theme postMessage listener present in all 14 iframe-embedded services
How each test works (just curl):
curl /hero_redis_ui/health → expect HTTP 200
curl /hero_cloud_ui/ → expect HTTP 200 + content-type: text/html
curl -X POST /hero_osis_ui/rpc/root with JSON body → expect "jsonrpc" in response
rpc.discover, count methods ≥ threshold
inspector.services, check 33 services found, 12 have methods
curl -X OPTIONS /hero_fossil_server/webdav/ → expect 204
curl --max-time 3 → check text/event-stream header
/hero_inspector_ui/mcp/hero_redis_server
hero:theme listener
The smoke tests are the safety gate for :demo promotion. When make test ENV=herodev passes, all services are healthy, auth works, RPC responds, UI renders, themes sync. The fix→deploy→test cycle is 3-5 minutes.
Test Bug Fixes (completed)
6 bugs in the test scripts themselves (not service bugs):
smoke_gateway.sh | services.list doesn't exist | inspector.services
smoke_gateway.sh | methods doesn't exist | methods_count
smoke_gateway.sh | hero_indexer_ui in health list but no socket
smoke_test.sh | base href="/zinit_ui/" but zinit uses <meta name="base-path">
smoke_test.sh | rpc.discover not implemented | health method
smoke_test.sh
Coverage extended: gateway 20→47 tests, service 35→52 tests. All 15 services from services/user/*.toml now covered.
Service Bugs Found
Smoke tests against herodev.gent02.grid.tf (2026-03-11) found 3 genuine service bugs:
Template 'login.html' not found
hero:theme postMessage listener
Results before fixes:
DevOps Workflow
Branching & PRs
One issue = one commit on development. The PR squash merge handles this automatically.
Flow:
Create development_{name} branch from development (same name across all affected repos)
Open PRs with closes #13 in the description
Squash merge into development with the issue URL as message
No manual squashing, no force pushing, no rewriting history. Forgejo does it.
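The branch-creation step of this flow can be sketched as below. This is an illustration under stated assumptions, not the project's tooling: `create_feature_branches`, `run`, and the `DRY_RUN` switch are hypothetical helpers, and the repo directories are whatever local checkouts are affected.

```shell
#!/usr/bin/env bash
# Sketch: create the same development_{name} branch in several repos.
# create_feature_branches, run, and DRY_RUN are hypothetical helpers for
# illustration; they are not part of the project's tooling.

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: create_feature_branches <name> <repo-dir>...
create_feature_branches() {
    local name="$1" repo
    shift
    for repo in "$@"; do
        # Start from an up-to-date development, then branch off it.
        run git -C "$repo" checkout development || return 1
        run git -C "$repo" pull --ff-only || return 1
        run git -C "$repo" checkout -b "development_${name}" || return 1
    done
}
```

Setting `DRY_RUN=1` before calling `create_feature_branches mik6 <repo-dir>...` prints the git commands without running them, which is a cheap way to confirm that every affected repo gets the same branch name.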
Commit message format (set by Forgejo on squash):
https://forge.ourworld.tf/lhumina_code/home/issues/13 — Smoke Tests, Production Deployment & Cleanup
Rules:
Merge development INTO feature branch if needed
main: releases only via PR from development
development: Squash commit
main: Create merge commit (preserves release boundary)
11-Step Pipeline (3 human gates)
development_{name} in each repo
make deploy (1-3 min incremental)
make test ENV=herodev (30 sec, ~110 tests)
development_{name} → development with closes #N
git checkout development && git pull all repos, make deploy
make demo (tag :dev → :demo, deploy herodemo)
Why two builds (steps 2 and 7): Step 2 builds from the local feature branch (fast iteration). Step 7 builds from clean merged development (proves the pushed code works).
Critical rules:
:demo before human confirms clean build (step 8 gate)
make deploy
Tasks
1. Smoke Tests
make test target running all 3 suites
2. Infrastructure Cleanup
Dockerfile and Dockerfile.prod
dist/ WASM root-owned files in build-local.sh
3. Branch Cleanup
development_theme_sync → development
development_mik5 branches across repos
cargo test on zinit
4. Production Deployment (after smoke tests green)
:dev as :prod, push
5. Build Pipeline CI (nice-to-have)
development, run make dist + make pack
Environments
herodev.gent02.grid.tf | hero_zero:dev | herodev
herodemo.gent02.grid.tf | hero_zero:demo | herodemo
hero_zero:prod | heroprod
VM: Both containers on same VM at Mycelium IP 495:72fa:8ec3:9264:ff0f:c0a8:abad:234c
Registry: forge.ourworld.tf/lhumina_code/hero_zero
Current Status
Step 3 complete — smoke tests assessed, test script bugs fixed, 3 service bugs identified. Ready for step 1 (implement service fixes).
Step 5 complete: PRs created (branch: development_mik6)
3 service bugs fixed:
/build/... vs /src/...) | CARGO_MANIFEST_DIR
127.0.0.1:9753, no Unix socket created | ~/hero/var/sockets/hero_indexer_ui.sock
hero:theme postMessage listener | base.html
Smoke test results (107 passed, 0 failed):
smoke_gateway.sh: 47 passed, 1 skipped (SSE needs auth)
smoke_test.sh: 46 passed
smoke_theme.sh: 14 passed
Pipeline status:
development_mik6
make deploy
make test ENV=herodev: all green
development
make deploy from merged development
make demo
Pipeline complete: all 11 steps done
Final results
Steps completed
development_mik6
make deploy
make test ENV=herodev: 107 passed, 0 failed
development
make deploy from merged development: all green
make demo: :dev tagged as :demo, deployed
Additional fix
SSH_HOST/GATEWAY_FQDN override in app.env so make update ENV=herodemo works (herodemo shares the herodev VM; stale terraform state pointed to an unreachable IP)
Merged PRs