[prod] Hero OS as a versioned nu-shell distribution #38

Open · opened 2026-04-28 12:21:36 +00:00 by mik-tf (Owner) · 0 comments

Goal

Ship Hero OS as a coherent, versioned distribution that any Linux operator can deploy out of the box. Hosting decisions (backups, observability, HA, secrets vault, staging, SLO) stay with the operator. Hero OS provides the product; operators provide the ops.

This is the umbrella tracker for taking Hero OS from "working demo" (Phase 2, home#185) to "prod-level distribution".

Six pillars

All six must be green for "prod":

| # | Pillar | Concrete |
|---|--------|----------|
| 1 | Reproducible artifacts | Tag a commit → CI publishes versioned binaries. Deploys fetch them. No source build per VM. |
| 2 | Multi-platform install | install_core detects the platform (TFGrid btrfs, Hetzner ext4, Ubuntu/Debian, etc.) and adapts. No hardcoded 10.1.2.2 / btrfs assumptions. |
| 3 | Modular | Operator picks: full stack, --core only, --no-office, --no-voice, etc. Each service degrades gracefully if its dependency isn't running. |
| 4 | Quality gates | Every Hero repo has green CI on push and PR. Cross-repo integration test suite. |
| 5 | Auth out of the box | One canonical path (hero_proxy --auth-mode); demo helper retired. Tracked at home#186. |
| 6 | Docs for three audiences | Deployer (runbook, already strong), User (what each archipelago does), Developer (architecture + how to add a service). |
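
Pillar 2 implies that install_core decides at runtime what kind of host it is on. A minimal sketch of that idea follows, written in Python for illustration rather than in nu, and not taken from the actual install_core code. It reads the filesystem type for a data directory from /proc/mounts-style text and lets environment variables take precedence. The variable names HERO_DATA_DIR and HERO_FS_TYPE, the /data default, and both function names are hypothetical.

```python
def fs_type_for(path: str, mounts_text: str) -> str:
    """Return the filesystem type of the longest mount point containing `path`.

    `mounts_text` follows the /proc/mounts layout: device, mount point, fs type, ...
    """
    best_mount, best_type = "", "unknown"
    wanted = path.rstrip("/") + "/"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip malformed lines
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if wanted.startswith(prefix) and len(mount_point) > len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


def detect_platform(env: dict, mounts_text: str) -> dict:
    """Prefer explicit overrides from `env`; otherwise detect from the mount table."""
    data_dir = env.get("HERO_DATA_DIR", "/data")  # hypothetical variable and default
    fs_type = env.get("HERO_FS_TYPE") or fs_type_for(data_dir, mounts_text)
    return {"data_dir": data_dir, "fs_type": fs_type}


SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw,relatime 0 0
/dev/sdb1 /data btrfs rw,relatime 0 0
"""

print(detect_platform({}, SAMPLE_MOUNTS))
# → {'data_dir': '/data', 'fs_type': 'btrfs'}
```

On a real host the second argument would be the contents of /proc/mounts. Keeping detection a pure function of its inputs makes the override path easy to exercise without a VM.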

Distribution model

nu-shell-native. Each Hero repo publishes its binaries on tag push:

forge.ourworld.tf/lhumina_code/hero_<repo>/releases/v0.1.0-dev
  └── hero_<repo>-v0.1.0-dev-linux-x86_64.tar.gz
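
The naming scheme above is regular enough to derive from a repo name and a version. The following Python sketch is illustrative only: the function names and the default target string are assumptions, and it reproduces the release path exactly as written in this issue rather than the forge's actual asset download URL.

```python
FORGE_BASE = "forge.ourworld.tf/lhumina_code"


def release_path(repo: str, version: str) -> str:
    """Release location for one Hero repo, in the form shown in this issue."""
    return f"{FORGE_BASE}/hero_{repo}/releases/{version}"


def asset_name(repo: str, version: str, target: str = "linux-x86_64") -> str:
    """Tarball name for one Hero repo; `target` defaults to the only one listed here."""
    return f"hero_{repo}-{version}-{target}.tar.gz"


print(release_path("router", "v0.1.0-dev"))
# → forge.ourworld.tf/lhumina_code/hero_router/releases/v0.1.0-dev
print(asset_name("router", "v0.1.0-dev"))
# → hero_router-v0.1.0-dev-linux-x86_64.tar.gz
```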

A meta-release on hero_demo pins all sub-versions for a given Hero OS version:

forge.ourworld.tf/lhumina_code/hero_demo/releases/v0.1.0-dev
  └── release notes list which version of each Hero repo this bundles
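
One way to picture the meta-release is as a small mapping from repo name to pinned version, from which the release notes are rendered. This Python sketch is only an illustration under that assumption: the function names are hypothetical, the pins are example values rather than a real manifest, and the issue does not specify the format of the notes.

```python
def render_release_notes(os_version: str, pins: dict) -> str:
    """List which version of each Hero repo a given Hero OS version bundles."""
    lines = [f"Hero OS {os_version} bundles:"]
    for repo in sorted(pins):
        lines.append(f"- {repo} {pins[repo]}")
    return "\n".join(lines)


def missing_pins(pins: dict, repos: list) -> list:
    """Repos that the meta-release does not pin yet, in the order given."""
    return [repo for repo in repos if repo not in pins]


# Example values only; not an actual Hero OS release manifest.
EXAMPLE_PINS = {
    "hero_router": "v0.1.0-dev",
    "hero_skills": "v0.1.0-dev",
}

print(render_release_notes("v0.1.0-dev", EXAMPLE_PINS))
print(missing_pins(EXAMPLE_PINS, ["hero_router", "hero_proxy"]))
# → ['hero_proxy']
```

A check like missing_pins could run before publishing, so that a meta-release never goes out with a repo left unpinned.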

service_install_all learns a download mode:

service_install_all --release v0.1.0-dev   # fetch artifacts from forge
service_install_all --build-from-source    # explicit fallback for dev
service_install_all                        # current default — TBD which becomes default once trusted
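
The interaction between the two flags, plus the fall-through to a source build described later in this issue, can be sketched as follows. This is Python for illustration, not the nu implementation in hero_skills: resolve_mode, install, and the two callables are hypothetical names, and the default is left as a parameter because the issue has not decided it yet (the value "source" here is an assumption).

```python
from typing import Callable, Optional


def resolve_mode(release: Optional[str], build_from_source: bool,
                 default: str = "source") -> str:
    """Pick an install mode; an explicit --build-from-source always wins."""
    if build_from_source:
        return "source"
    if release is not None:
        return "release"
    return default


def install(release: Optional[str], build_from_source: bool,
            download: Callable[[str], bool], build: Callable[[], bool]) -> str:
    """Try the release artifacts first when asked to, otherwise build from source."""
    mode = resolve_mode(release, build_from_source)
    if mode == "release" and release is not None and download(release):
        return "installed from release"
    # Either a source build was requested, or the download failed: fall through.
    build()
    return "installed from source"


# A failed download (download returns False) falls back to the source build.
print(install("v0.1.0-dev", False, lambda version: False, lambda: True))
# → installed from source
```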

The wrapper format (Docker, k8s, nomad, or anything else) is the operator's call. The existing hero_demo/.forgejo/workflows/build-container.yaml (docker-era pipeline) stays where it is, with no active maintenance and no rework. Operators who need a container image can wrap the binaries themselves.

P0 — distribution bedrock (next 1-2 weeks)

  • Versioning convention: vMAJOR.MINOR.PATCH-dev until v1.0; semver after. One source of truth in each repo's Cargo.toml / buildenv.sh.
  • Per-repo release.yaml workflow: hero_router already implements this. Its release.yaml fires on a v* tag push, cross-compiles to x86_64-unknown-linux-musl (static-pie), and uploads each binary in $BINARIES as <BIN>-linux-amd64-musl. Verified to produce working 8.6 MB statically linked PIE assets at hero_router/releases/v0.2.2. Port this template to every other Hero repo as part of home#188; the pattern is uniform, not bespoke.
  • Cross-repo green CI — see home#188 for the per-repo tracker (15 red, 5 missing CI as of 2026-04-25). Same fix pattern as hero_skills #131-#132. Gates the release-artifacts work below — no point adding release.yaml to a red repo.
  • service_install_all --release <ver> download path — new code path in hero_skills. Per-service module gets a pkg_url(version) helper; falls through to source build if download fails or --build-from-source is given.
  • Cross-repo version coordination — script in hero_demo that bumps every Hero repo's version constant + tags them in coordinated order. Could be hero_demo/scripts/release.sh v0.1.1-dev.
  • Meta-release on hero_demo — single point that says "Hero OS v0.1.0-dev = these versions of each Hero repo". Release notes are auto-generated from the per-repo CHANGELOGs.
  • Multi-platform install_core — drop hardcoded /data/btrfs / 10.1.2.2 assumptions; detect or accept env overrides. Hetzner / DO / bare-metal Ubuntu 24 should all work.
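
The versioning convention and the coordinated bump in the list above lend themselves to a small check that a release script could run before tagging. The Python sketch below is one strict reading of "vMAJOR.MINOR.PATCH-dev until v1.0; semver after", not the project's actual rule: the function names are hypothetical, and under this reading an existing tag without the suffix, such as hero_router's v0.2.2, would be rejected.

```python
import re

# vMAJOR.MINOR.PATCH with an optional -dev suffix.
VERSION_RE = re.compile(r"v(\d+)\.(\d+)\.(\d+)(-dev)?")


def is_valid_version(tag: str) -> bool:
    """Require -dev before v1.0 and forbid it from v1.0 onwards."""
    match = VERSION_RE.fullmatch(tag)
    if match is None:
        return False
    major = int(match.group(1))
    has_dev_suffix = match.group(4) is not None
    return has_dev_suffix if major == 0 else not has_dev_suffix


def bump_patch(tag: str) -> str:
    """Increment PATCH and keep the suffix, e.g. for a coordinated release."""
    match = VERSION_RE.fullmatch(tag)
    if match is None:
        raise ValueError(f"not a version tag: {tag}")
    major, minor, patch = (int(match.group(i)) for i in (1, 2, 3))
    suffix = match.group(4) or ""
    return f"v{major}.{minor}.{patch + 1}{suffix}"


print(is_valid_version("v0.1.0-dev"))  # → True
print(is_valid_version("v0.2.2"))      # → False
print(bump_patch("v0.1.0-dev"))        # → v0.1.1-dev
```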

P1 — quality + hardening (next month)

  • Auth finalization: home#186 (filed)
  • Cross-repo integration test suite — black-box: deploy clean → curl every advertised endpoint. Catches the URL-scheme / port-mismatch / wrong-flag class of bugs before deploy.
  • Modular install — clean service_complete --no-office, --no-voice, --no-collab flags. Document supported subsets in the runbook.
  • Developer documentation — short companion to DEPLOYMENT_NU_HERO_OS.md: how the architecture fits together, how to add a new service module, how hero_proc action+service registration works in nu.
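
The black-box integration suite described above amounts to probing every advertised endpoint and collecting whatever fails. A minimal Python sketch of that loop follows; it is not the planned test suite, the function names are hypothetical, and the hero.example URLs are placeholders rather than real Hero OS endpoints. The status fetcher is injected so the loop can be tried without a deployed VM.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, List


def check_endpoints(endpoints: Dict[str, str],
                    fetch_status: Callable[[str], int]) -> List[str]:
    """Return one message per endpoint that is unreachable or not 2xx/3xx."""
    failures = []
    for name, url in endpoints.items():
        try:
            status = fetch_status(url)
        except OSError as err:  # connection refused, DNS failure, timeout, ...
            failures.append(f"{name}: {url} unreachable ({err})")
            continue
        if not 200 <= status < 400:
            failures.append(f"{name}: {url} returned {status}")
    return failures


def http_status(url: str) -> int:
    """Real fetcher for use against a deployed host (standard library only)."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx still carry a status code


# Placeholder endpoints and canned statuses, for illustration only.
ENDPOINTS = {
    "router": "http://hero.example/router",
    "proxy": "http://hero.example/proxy",
}
CANNED = {"http://hero.example/router": 200, "http://hero.example/proxy": 502}

print(check_endpoints(ENDPOINTS, CANNED.__getitem__))
# → ['proxy: http://hero.example/proxy returned 502']
```

Against a real deploy, http_status would be passed in place of the canned lookup, and a non-empty result would fail the run.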

P2 — architecture follow-throughs (next quarter)

  • ort crate version unification across hero_voice + hero_embedder: home#173. Multi-day cross-repo refactor; deploy-side workaround in place via two parallel ONNX installs.
  • hero_biz Hero0Config → OSIS per-domain refactor: home#180. Multi-day; native Business island works as alternative.

Out of scope (operator's call, not Hero OS's)

Hero OS exposes the hooks; operators build the stack they want around them.

  • Backup strategy / retention / off-site storage (operators pick: restic, borg, S3, TFGrid object store, etc.)
  • Observability stack (Grafana / Datadog / Prometheus / Loki / nothing)
  • HA topology / multi-region / failover / load balancing
  • Secrets vault backend (hero_secrets.nu is the hook; Vault / SOPS / cloud KMS / env is the operator's call)
  • Staging / dev / prod environment separation
  • Uptime SLO / SLI definitions / on-call rotation / incident response
  • Container images, k8s manifests, nomad jobspecs (operators wrap the binaries)

Acceptance — "Hero OS v0.1.0-dev shipped" means

  • Every Hero repo has a release tagged v0.1.0-dev with a binary .tar.gz asset
  • hero_demo has a meta-release v0.1.0-dev pointing at them
  • service_install_all --release v0.1.0-dev produces a working Hero OS deploy in <10 min on a fresh Ubuntu 24 VM (any provider — TFGrid, Hetzner, DO, bare metal)
  • The Quick path in DEPLOYMENT_NU_HERO_OS.md §0.1 works as written, with a --release flag where appropriate
  • All Hero repo CI is green on push and PR

Signed-off-by: mik-tf


Originally filed as home#187 on 2026-04-25 by mik-tf — moved to hero_demo as part of consolidating issue tracking.

Related trackers

  • home#185: Phase 2 (codify hot-fixes, mostly done). https://forge.ourworld.tf/lhumina_code/home/issues/185
  • home#186: auth finalization. https://forge.ourworld.tf/lhumina_code/home/issues/186
  • home#173: ort unification (P2). https://forge.ourworld.tf/lhumina_code/home/issues/173
  • home#180: hero_biz refactor (P2). https://forge.ourworld.tf/lhumina_code/home/issues/180
  • home#168: embedder build retry race. https://forge.ourworld.tf/lhumina_code/home/issues/168