v0.1 scope — admin tool to provision per-user Hero OS demo VMs #2

Open
opened 2026-05-20 21:42:06 +00:00 by mik-tf · 0 comments

Umbrella issue for the deployer admin tool's first cut. Builds directly on the meeting minutes and the s132 VM-bootstrap groundwork. Filing this as a scope-declaration so the team can object / coordinate before code starts.

Goal

A single admin-facing Rust binary that, given a username, produces a working Hero OS demo VM accessible to that user via a Forge OAuth-gated public HTTPS URL. The team uses this to hand out demo machines to investors, partners, community members, Bifrost contacts.

Explicitly NOT in v0.1:

  • Self-service onboarding (admin-only — team operator creates users)
  • Billing / payment processing (deferred per meeting decision)
  • Multi-tenancy per VM (1 user owns 1 VM)
  • AI inference cost coverage (BYO-key model — cockpit handles this on the VM side)

Architecture

Scaffold from hero_template so the deployer follows the canonical Hero service workspace shape:

| Crate | Binary | Socket | Purpose |
|---|---|---|---|
| hero_os_tfgrid_deployer | hero_os_tfgrid_deployer | - | Lifecycle CLI |
| hero_os_tfgrid_deployer_server | hero_os_tfgrid_deployer_server | hero_os_tfgrid_deployer/rpc.sock | OpenRPC backend: user provisioning, VM lifecycle orchestration |
| hero_os_tfgrid_deployer_sdk | (lib) | - | Auto-generated typed client |
| hero_os_tfgrid_deployer_admin | hero_os_tfgrid_deployer_admin | hero_os_tfgrid_deployer/admin.sock | Admin dashboard: user list, per-user state, deploy/delete actions |

The admin UI shape follows the /hero_ui_dashboard_admin skill; hero_admin_lib is mandatory.

State backend: sqlite. Schema in a sibling issue.

Provisioning flow (per user)

Receives: username + optional display_name + optional forge_email.

  1. Check Forge user — REST call to forge.ourworld.tf — does this user exist?
    • If yes: ensure we have a Forge access token for the user (may need to ask admin to generate one, or use the deployer's admin token to mint one if Forge allows it)
    • If no: create the user with a random password + minimal profile. Admin captures the password to share OOB with the user (initial credentials).
  2. Generate SSH key — per-user ED25519. Public + private stored encrypted at rest in deployer's sqlite. Public also injected into the VM at deploy time.
  3. Deploy VM: ComputeService.deploy_vm (see hero_compute#deployer-integration) with 16 CPU / 8 GB / 200 GB / 16 GB rootfs / Ubuntu 24.04 / mycelium-only / publicip=false / node_id pinned to a gateway-capable freefarm node (1/8/13/50; deployer picks the one with most headroom via gridproxy.grid.tf).
  4. Deploy gateway: ComputeService.deploy_webgateway mapping <username>.<node>.grid.tf → http://<vm_ip>:9988 (hero_router's listen port).
  5. Wait until VM is SSH-able (poll get_vm for mycelium_ip).
  6. Template + scp the per-user component manifest to the VM at ~driver/hero/cfg/cockpit/services.toml (format defined in hero_cockpit#1 §6). Default profile = demo (proxy + router + proc + cockpit + embedder-small + db + books).
  7. Run setup-binaries.sh via SSH (or vm_exec once that's confirmed working for long-running scripts). The script reads the manifest, installs the selected binaries via lab build <name> --download --install, starts the bootstrap-core daemons.
  8. Verify — hit the VM's public gateway URL, confirm HTTP 200 on /health from cockpit.
  9. Store state — sqlite: user record, VM id + spec, gateway URL, SSH key pair, provisioning timestamp, deployer version.
  10. Return — admin sees the user record with shareable URL + initial credentials.
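
Steps 3 to 5 above contain three small pieces of logic that can be sketched without touching the grid: choosing the pinned node, building the gateway mapping, and polling until the VM reports an address. The following is a minimal sketch under stated assumptions, not the deployer's implementation. `NodeHeadroom`, `pick_node`, `gateway_mapping` and `poll_until` are hypothetical names, "headroom" is assumed to mean free memory first and free CPU second, and `<vm_ip>` is assumed to be the VM's mycelium address, which would be IPv6 and therefore needs brackets inside a URL.

```rust
use std::net::{IpAddr, SocketAddr};
use std::time::{Duration, Instant};

/// Hypothetical summary of one candidate node. Field names are illustrative
/// and do not mirror the gridproxy response format.
#[derive(Debug, Clone, PartialEq)]
struct NodeHeadroom {
    node_id: u32,
    free_cpu: u32,
    free_memory_gb: u32,
}

/// Step 3: keep only nodes that fit the requested spec, then take the one
/// with the most free memory (free CPU breaks ties). This reading of
/// "headroom" is an assumption; the issue does not define it.
fn pick_node(candidates: &[NodeHeadroom], need_cpu: u32, need_memory_gb: u32) -> Option<u32> {
    candidates
        .iter()
        .filter(|n| n.free_cpu >= need_cpu && n.free_memory_gb >= need_memory_gb)
        .max_by_key(|n| (n.free_memory_gb, n.free_cpu))
        .map(|n| n.node_id)
}

/// Step 4: build the (public hostname, backend URL) pair for the gateway.
/// SocketAddr's Display wraps IPv6 addresses in brackets, which a URL needs.
fn gateway_mapping(username: &str, node: &str, vm_ip: IpAddr) -> (String, String) {
    let host = format!("{username}.{node}.grid.tf");
    let backend = format!("http://{}", SocketAddr::new(vm_ip, 9988));
    (host, backend)
}

/// Step 5: call `probe` until it yields a value or `timeout` elapses.
/// In the deployer, `probe` would wrap a get_vm call and return the
/// mycelium_ip once it is populated.
fn poll_until<T>(
    mut probe: impl FnMut() -> Option<T>,
    interval: Duration,
    timeout: Duration,
) -> Option<T> {
    let deadline = Instant::now() + timeout;
    loop {
        if let Some(value) = probe() {
            return Some(value);
        }
        if Instant::now() >= deadline {
            return None;
        }
        std::thread::sleep(interval);
    }
}
```

Returning `None` from `pick_node` when no candidate fits gives the admin UI a clear "no capacity" state instead of a failed deploy, and a bounded `poll_until` keeps step 5 from hanging the whole provisioning run if the VM never comes up.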

Out-of-band coordination needed (not in this scope)

  • Forge OAuth client registration for hero_proxy on each VM. Likely needs an admin step on forge.ourworld.tf once per deployer (or once per VM — TBD).
  • Mahmoud's hero_compute end-to-end stability confirmation. Today's s132 used OpenTofu directly as a fallback path; deployer v0.1 will be flag-switchable between the two backends (see sibling issue).
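
The flag-switchable backend mentioned above could look roughly like the sketch below. Everything here is an assumption for illustration: the `VmBackend` trait, its method set, the `VmSpec` fields and the flag spellings `opentofu` / `hero_compute` are hypothetical, and both implementations are stubs that only fabricate a placeholder id. The actual adapter shape is left to the sibling issue.

```rust
use std::str::FromStr;

/// The two backends named in the issue. The flag spellings are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BackendKind {
    OpenTofu,
    HeroCompute,
}

impl FromStr for BackendKind {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "opentofu" => Ok(Self::OpenTofu),
            "hero_compute" => Ok(Self::HeroCompute),
            other => Err(format!("unknown backend: {other}")),
        }
    }
}

/// Illustrative subset of the VM spec.
#[derive(Debug, Clone, PartialEq)]
struct VmSpec {
    cpu: u32,
    memory_gb: u32,
    node_id: u32,
}

/// Hypothetical adapter trait; the real method set is not defined here.
trait VmBackend {
    fn name(&self) -> &'static str;
    fn deploy_vm(&mut self, username: &str, spec: &VmSpec) -> Result<String, String>;
    fn delete_vm(&mut self, vm_id: &str) -> Result<(), String>;
}

/// Shared stub logic: reject an empty spec, then fabricate a placeholder id.
fn stub_deploy(backend: &str, username: &str, spec: &VmSpec) -> Result<String, String> {
    if spec.cpu == 0 || spec.memory_gb == 0 {
        return Err("spec must request at least 1 CPU and 1 GB".to_string());
    }
    Ok(format!("{backend}:{username}@{}", spec.node_id))
}

struct OpenTofuBackend;
struct HeroComputeBackend;

impl VmBackend for OpenTofuBackend {
    fn name(&self) -> &'static str {
        "opentofu"
    }
    fn deploy_vm(&mut self, username: &str, spec: &VmSpec) -> Result<String, String> {
        // A real implementation would drive OpenTofu here.
        stub_deploy(self.name(), username, spec)
    }
    fn delete_vm(&mut self, _vm_id: &str) -> Result<(), String> {
        Ok(())
    }
}

impl VmBackend for HeroComputeBackend {
    fn name(&self) -> &'static str {
        "hero_compute"
    }
    fn deploy_vm(&mut self, username: &str, spec: &VmSpec) -> Result<String, String> {
        // A real implementation would call ComputeService.deploy_vm here.
        stub_deploy(self.name(), username, spec)
    }
    fn delete_vm(&mut self, _vm_id: &str) -> Result<(), String> {
        Ok(())
    }
}

/// Turn the parsed flag into a boxed backend the rest of the flow can use.
fn make_backend(kind: BackendKind) -> Box<dyn VmBackend> {
    match kind {
        BackendKind::OpenTofu => Box::new(OpenTofuBackend),
        BackendKind::HeroCompute => Box::new(HeroComputeBackend),
    }
}
```

Keeping the selection in one `make_backend` function means the provisioning and decommission flows only ever see `Box<dyn VmBackend>`, so switching the default once hero_compute is confirmed stable would be a one-line change.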

Decommission flow

Reverse-order:

  1. ComputeService.delete_webgateway
  2. ComputeService.delete_vm
  3. Delete sqlite record (or mark as decommissioned for audit trail — TBD)
  4. Forge user is NOT deleted (they may want to retain their feedback / Forge presence)
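
The reverse-order teardown above can be sketched as a fixed step list with a runner that stops at the first failure. This is a sketch under assumptions, not a decided design: `TeardownStep`, `TEARDOWN_ORDER` and `decommission` are hypothetical names, and the stop-on-first-error policy is one possible choice. Its effect is that the sqlite record is only retired after both grid resources are confirmed gone, so a failed run can be retried from the stored state.

```rust
/// The three actions from the decommission list, in execution order.
/// There is deliberately no step for the Forge user, which is kept.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TeardownStep {
    DeleteGateway,
    DeleteVm,
    RetireRecord,
}

const TEARDOWN_ORDER: [TeardownStep; 3] = [
    TeardownStep::DeleteGateway,
    TeardownStep::DeleteVm,
    TeardownStep::RetireRecord,
];

/// Run each step through `run`, stopping at the first error and reporting
/// which step failed. Later steps are not attempted after a failure.
fn decommission(
    mut run: impl FnMut(TeardownStep) -> Result<(), String>,
) -> Result<(), (TeardownStep, String)> {
    for step in TEARDOWN_ORDER {
        run(step).map_err(|e| (step, e))?;
    }
    Ok(())
}
```

In the deployer, the closure passed to `decommission` would dispatch each step to the compute backend or to sqlite. `RetireRecord` stands for either deleting the row or marking it decommissioned, whichever the TBD resolves to.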

Implementation plan

| Sub-issue | Focus |
|---|---|
| D1: Scaffold + schema | Crate scaffold from hero_template, sqlite schema (users / vms / ssh_keys / gateway_mappings / events) |
| D2: Forge user lifecycle | Forge REST client + create/check/token-gen flow |
| D3: VM-deploy adapter | OpenTofu-now + hero_compute-later, behind a trait |
| D4: Post-deploy flow | SSH or vm_exec + setup-binaries dispatch + verify |
| D5: Admin UI | List users, deploy/delete actions, per-user state view |

Each sub-issue will be filed separately. This umbrella tracks overall progress.

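References

  • Meeting notes: hero_os_tfgrid_deployer#1 (https://forge.ourworld.tf/lhumina_code/hero_os_tfgrid_deployer/issues/1)
  • VM bootstrap: hero_demo setup-binaries.sh (https://forge.ourworld.tf/lhumina_code/hero_demo/src/branch/development/deploy/single-vm/scripts/setup-binaries.sh)
  • Cockpit spec: hero_cockpit#1 (https://forge.ourworld.tf/lhumina_code/hero_cockpit/issues/1)
  • hero_compute integration: hero_compute#deployer-integration (https://forge.ourworld.tf/lhumina_code/hero_compute/issues/?), confirmation + gap-discussion
  • Canonical template: hero_template (https://forge.ourworld.tf/lhumina_code/hero_template)
  • Skills: /hero_website · /hero_ui_dashboard_admin · /hero_service_check_fix
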
Sub-issues (filed 2026-05-20)

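| # | Issue | Title |
|---|---|---|
| D1 | hero_os_tfgrid_deployer#3 | scaffold + sqlite schema |
| D2 | hero_os_tfgrid_deployer#4 | Forge user lifecycle (REST client + create/check/token-gen flow) |
| D3 | hero_os_tfgrid_deployer#5 | VM-deploy adapter (OpenTofu fallback + hero_compute primary) |
| D4 | hero_os_tfgrid_deployer#6 | post-deploy flow (manifest + scp + setup-binaries dispatch + verify) |
| D5 | hero_os_tfgrid_deployer#7 | admin UI (list users, deploy/delete actions, per-user state view) |
| D6 | hero_os_tfgrid_deployer#8 | setup-binaries.sh refactor (per-user manifest-driven install loop, lives in hero_demo) |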

Coordination references

  • hero_compute integration: hero_compute#116 — confirms VM lifecycle methods available + flags small gaps
  • Cockpit spec (the user-facing counterpart to this admin tool): hero_cockpit#1