lhumina_code/hero_os_tfgrid_deployer

Converge deployer on development RPC stack #31

Merged

mik-tf merged 184 commits from development_mik into development

2026-06-24 04:33:07 +00:00

mik-tf commented

2026-06-24 04:32:46 +00:00

Owner

Normalizes deployer OpenRPC methods for the development macro stack, keeps generated params single-input, installs hero_components with the admin app set, and wires deployer secret operations to the restored context-aware hero_proc SDK.

Refs: https://forge.ourworld.tf/lhumina_code/hero_os_tfgrid_deployer/issues/30
Refs: https://forge.ourworld.tf/lhumina_code/hero_proc/issues/163

Signed-by: mik-tf <mik-tf@noreply.invalid>

mik-tf added 184 commits

2026-06-24 04:32:47 +00:00

deployer: scope the per-tester base bundle to the demo app set

lab publish / publish (push) Successful in 10m5s

895bb38bad

The base bundle a fresh tester installs is now the demo set — cockpit,
planner, slides, whiteboard, agent, voice — on top of the always-on base
(proxy, router, supervisor, data store, orchestrator). Drop hero_memory,
hero_books, and hero_biz, which are not part of this demo; they remain
install-on-demand.

Add hero_db ahead of the apps: slides persists to it, and without it
slides logs "hero_db socket unreachable" and can fail its first boot.

Also fix two install aborts the trim exposed:
- the shared-engine wiring no longer restarts hero_memory_server, which is
  no longer installed (restarting an absent service aborts the install).
- the default-library seed is forced empty when hero_books is not in the
  bundle, instead of falling back to a HERO_BOOKS_DEFAULT_REPOS value that
  may linger in the deployer's environment and re-trigger a hero_books
  restart.

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: drop hero_agent from the base bundle

lab publish / publish (push) Successful in 23m4s

55525543b5

hero_agent is the old assistant and is not the path forward (it does not
ship from the stable branch and its admin UI binary fails to register
because it depends on the bare hero_agent CLI, which cannot run as a
service). Remove it from the per-tester base bundle so a fresh tester does
not show a stopped agent tile; the new agent is added separately once it
ships from the stable branch. The shared-engine wiring no longer restarts
hero_agent_server or sets its routing-mode knob, since neither is installed.

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(deployer): add the Kimi assistant to the tester base bundle + wire its MCP and key

lab publish / publish (push) Successful in 23m29s

59e78914a2

Add hero_kimi (built from hero_kimi_rust) to the per-tester stack so every
tester gets the Kimi AI assistant. The component name is the binary prefix
hero_kimi; the install download resolves it to the hero_kimi_rust repo via the
now-activated COMPONENT_REPO map, while the start loop matches hero_kimi_web.

Write two per-tester files into ~/.kimi at install time: config.toml points the
agent at OpenRouter and reads the key from the OPENAI_API_KEY process env (never
on disk), and mcp.json registers the tester's own planner and whiteboard as MCP
servers reached through the router's local MCP gateway, addressed as the tester.

Proven live: install adds kimi to the bundle, both config files land, the router
MCP gateway exposes the planner (71 tools) and whiteboard (88 tools) surfaces and
tool calls create content on both.

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(deployer): write the Kimi assistant config with the working provider type

lab publish / publish (push) Successful in 23m40s

054fc12822

The per-tester ~/.kimi/config.toml declared the OpenRouter provider as
type "openai_legacy", but the assistant's agent implements only its built-in
"kimi" chat provider (which itself speaks the OpenAI-compatible
chat/completions API and so drives OpenRouter via base_url). The unsupported
type meant the agent never built an LLM client and every chat failed with
"LLM is not set". Switch the provider type to "kimi" and update the comment to
record that the key is read from the OPENROUTER_API_KEY / OPENAI_API_KEY
process env, with api_key left empty on disk.

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(deployer): never publish a tester without a login gate; self-heal the web gateway at install

lab publish / publish (push) Successful in 10m3s

6641a6a5e5

A tester provisioned through the admin add -> provision -> install flow
could end up live with no login gate. At provision time the web gateway
deploy can return an error to the deployer, or time out, after the gateway
contract was already created on the grid, so the deployer recorded no
gateway domain. With the domain empty, the per-tester OAuth app step was
skipped, and install pushed empty OAuth credentials to the tester, so its
cockpit was served with no Forge sign-in gate (the admin also showed no
Cockpit URL).

Changes:
- Extract the gateway deploy + persist into a shared ensure_webgateway
  helper, used by both provision and install.
- install_hero_stack now self-heals: if the row has no gateway domain it
  re-runs ensure_webgateway, and if the per-tester OAuth app is missing it
  creates it, then fails closed (refuses the install) unless a full login
  gate is present. A single Install/Reinstall click repairs a tester that
  came up without a Cockpit URL. An explicit DEPLOYER_ALLOW_INSECURE_INSTALL=1
  override remains for intentional debugging installs.
- Admin shows a "Set up gateway & install" action for a ready VM that has
  no gateway domain yet.

The underlying daemon contract gap (deploy returning an error after the
contract is live, and the lack of a gateway lookup) is filed at
lhumina_code/hero_compute#133.
Tracking: lhumina_code/home#253.

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(admin): show a loading spinner on VM action buttons

lab publish / publish (push) Successful in 23m50s

9700e33cee

The Provision / Install / Reinstall / Set up gateway / Destroy / Delete
buttons were plain synchronous form POSTs with no client feedback, so the
page looked frozen during the slow on-chain grid call. Add a delegated
submit handler that disables the clicked submit button and swaps in a
Bootstrap spinner with an active label (Provisioning..., Installing...,
Destroying..., etc.). The page stays on screen with the spinner until the
server returns the re-rendered page.

lhumina_code/home#252

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: enable whiteboard public share on tester install

lab publish / publish (push) Successful in 23m45s

b39a2dabf1

Set core/HERO_PROXY_PUBLIC_WHITEBOARD_SHARE=1 on a tester at install time
so a freshly provisioned sandbox tester comes up with whiteboard share
links reachable without a login, matching the live tester. The whiteboard
backend still re-scopes each shared call to its token.

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(deployer): simplify tester onboarding (no SSH key needed; register existing Forge users)

lab publish / publish (push) Successful in 23m53s

b232314bc2

Sandbox testers use the cockpit web apps only and never open a shell on
their VM, so provisioning no longer reads or requires a tester SSH key.
A provisioned VM gets only the shared installer key (so the install step
can SSH in and run setup) and the workspace admin keys. This removes the
"upload an SSH key first" onboarding step and the provisioning-disabled-
without-a-key failure mode; the admin UI no longer shows the SSH-key
warning or gates the Provision button on it.

Adds an "add existing user" path so the admin can onboard someone who
already has a forge.ourworld.tf account: a new deployer.add_existing_user
RPC (verifies the account on Forge, pulls display name and email from the
profile, writes the deployer row, creates no account and sets no password)
plus a matching admin form. They sign in with their existing credentials
over SSO.

lhumina_code/home#247

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(deployer): onboard existing users with mixed-case names + clearer create-user result

lab publish / publish (push) Successful in 23m29s

78c242e5a3

Registering an existing Forge account whose username contains uppercase letters
(or other characters not valid in a hostname) would have produced an invalid web
gateway name and a half-provisioned VM. The gateway name is now lowercased from the
username, and add_existing_user rejects up front any username that cannot form
a valid sandbox web address (letters and digits only), so we never register a
user we then cannot provision.

Also fixes the "Create user" result when the account already exists: it now
says clearly that the account already exists and no password was changed, and
points to "Add existing user", instead of a "User created" heading with an
empty initial-password field.

lhumina_code/home#247

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(deployer): welcome email on ready + unified add-user flow

lab publish / publish (push) Has been cancelled

88e761e1e4

Port an EmailProvider trait + Resend implementation (ureq 3.x, with a
dev-mode fallback when no key is set) into the deployer and send a
best-effort "your Hero sandbox is ready" email at the install ready
transition, the first point at which the cockpit URL and the login gate
both exist. The recipient is read from the user row; a send failure only
logs and never affects the install result.

Merge the two add-user admin cards into one "Add a user" form with an
Existing-Forge-account vs Create-new toggle, and add an optional email
override to add_existing_user (openrpc.json + regenerated client) so an
existing user whose Forge profile email is private still gets the mail.

Sender defaults to "Hero OS <noreply@hero.lhumina.org>", overridable via
the EMAIL_FROM_ADDRESS / EMAIL_FROM_NAME env without a rebuild; the key is
read from RESEND_API_KEY. The new [[env]] blocks need the service
re-registered (not just restarted) to take effect.

Signed-by: mik-tf <mik-tf@noreply.invalid>

chore(deployer): default email sender to noreply@mail.lhumina.org

lab publish / publish (push) Successful in 23m39s

8c2869b139

Align the baked EMAIL_FROM_ADDRESS default with the verified Resend
sending subdomain (mail.lhumina.org) so a fresh deploy is correct without
an env override. Still overridable via EMAIL_FROM_ADDRESS.

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(deployer): single welcome email states username + login, plus username slug normalization

lab publish / publish (push) Successful in 22m57s

b24c757c63

Welcome email: the one email sent at install->ready now states the login
explicitly and links straight at the gated cockpit app, not the bare domain.
A new account carries its one-time Forge password (generated at create time,
stashed on the user row via a new nullable temp_password column [migration
M8], and wiped after a real send); an existing Forge account is told to use
its existing password (no temp password is invented). Still a single email,
fired once at ready. Dev-mode sends (no API key) skip the wipe so the stashed
password is not lost when nothing was actually delivered.

Username slug: add gateway_slug() — lowercase then keep [a-z0-9] — as the
single canonical transform from a Forge username to the gateway / workload /
DNS label, used in ensure_webgateway and add_existing_user. A username like
mik-tf now onboards as miktf instead of being rejected; add_existing_user
only refuses a name with no letters or digits at all. The Forge username
itself is kept in its canonical case for SSO identity (proxy auth is
case-insensitive); only the web-address label is lowercased.
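
The slug rule described above (lowercase, then keep [a-z0-9]) can be sketched as follows. This is a minimal illustration written from the commit text, not the deployer's actual source; `slug_or_reject` is a hypothetical helper name for the "refuse only when nothing survives" check.

```rust
/// Sketch of the transform described in the commit message:
/// lowercase the Forge username, then keep only [a-z0-9].
fn gateway_slug(username: &str) -> String {
    username
        .chars()
        .map(|c| c.to_ascii_lowercase())
        .filter(|c| c.is_ascii_lowercase() || c.is_ascii_digit())
        .collect()
}

/// Hypothetical helper: a name is refused only when no letter or digit
/// survives the transform, mirroring the add_existing_user behaviour.
fn slug_or_reject(username: &str) -> Result<String, String> {
    let slug = gateway_slug(username);
    if slug.is_empty() {
        Err(format!("username {username:?} has no letters or digits"))
    } else {
        Ok(slug)
    }
}
```

Under this sketch `gateway_slug("mik-tf")` yields `miktf`, matching the example in the message, while the original username is left untouched for SSO identity.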

Refresh the create_user next_steps: step 2 no longer says the admin sends the
cockpit URL out of band — it now arrives by email once the sandbox is ready.

Tests: gateway_slug normalization, temp_password set/get/clear round-trip
(also exercises M8), welcome-email new-vs-existing-account branches and the
cockpit app link. 98 server tests green; fmt + clippy clean.

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(deployer): per-tester build identity + check-for-updates

lab publish / publish (push) Successful in 23m28s

094bccef8d

Record what build each tester VM is running and surface whether a newer
build is available, productizing the recurring "did the update land?" gap
from the hand-driven fast-path deploys.

- M9 adds vms.installed_releases: at each successful install the deployer
  snapshots the source commit (target_commitish) of every installed
  component's rolling `latest` Forgejo release and stores it as JSON.
  Source commits, not binary md5, so the snapshot is immune to the UPX
  pre-pack md5 wrinkle. When a repo publishes a branch name as
  target_commitish, the staleness diff falls back to the release publish
  time so those repos are still tracked.
- The snapshot is captured at install START and written only on ready, so
  it can never claim a build newer than what actually landed.
- The tracked set is derived from the install manifest
  (A30_STACK_COMPONENTS via a component->repo map mirroring
  setup-binaries.sh), so components added to the bundle later are tracked
  automatically.
- New read-only deployer.check_build_updates RPC diffs the stored snapshot
  against current `latest`, run on demand (an admin button) not on the
  status poll, to bound Forge load.
- Admin user_detail gains a Build column (per-component commits, capture
  date) and a "Check for updates" button; the reinstall action is
  relabeled "Update / Reinstall".
- setup-binaries.sh passes --force so a reinstall actually re-pulls current
  `latest` instead of skipping already-present binaries; first installs
  have no cache so it is a no-op there.
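
The staleness diff in the list above might look roughly like this. It is a sketch under stated assumptions: the `ReleaseRef` struct and its field names are illustrative, and treating a 40-hex-character value as a commit SHA (anything else as a branch name) is an assumption, since the commit message does not say how the two are told apart.

```rust
/// One component's release, as snapshotted at install or read from Forge now.
/// Field names are illustrative, not the deployer's schema.
#[derive(Clone, Debug, PartialEq)]
struct ReleaseRef {
    target_commitish: String, // commit SHA, or a branch name on some repos
    published_at: u64,        // release publish time, Unix seconds
}

/// Assumption: a full 40-hex-character value is a commit SHA; anything
/// else (for example "development") is treated as a branch name.
fn is_commit_sha(s: &str) -> bool {
    s.len() == 40 && s.chars().all(|c| c.is_ascii_hexdigit())
}

/// True when `latest` is a different (newer) build than the snapshot.
fn is_stale(installed: &ReleaseRef, latest: &ReleaseRef) -> bool {
    if is_commit_sha(&installed.target_commitish) && is_commit_sha(&latest.target_commitish) {
        // Both sides name a source commit: compare commits directly.
        installed.target_commitish != latest.target_commitish
    } else {
        // Branch-name fallback: compare release publish times instead.
        latest.published_at > installed.published_at
    }
}
```

Comparing source commits rather than binary hashes is what keeps the check independent of how the binaries were packed.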

Updating a machine reuses install_hero_stack (self-heals, fails closed,
holds the installing lock) rather than a bespoke partial-update path.

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(deployer): one-click Add & set up onboarding + node capacity awareness

lab publish / publish (push) Failing after 35s

be7312121b

Add a single "Add & set up" action to the admin users page that chains
register/create user, provision VM, and install the Hero stack in one
go, with staged progress (adding, provisioning, installing, ready) and a
copy-ready cockpit URL. The per-step buttons stay on the user detail
page as the resume path for a failed or deferred step. The browser
orchestrates the existing calls via three new JSON admin routes
(onboard-create / provision / install) that wrap the existing SDK
methods; sequencing stays client-side. A new account's one-time password
is surfaced inside the flow; the welcome email still fires only when the
install reaches ready.

Add node capacity awareness so a full or offline node is handled before
anything is created:
- New read-only deployer.node_capacity RPC reads ComputeService
  list_nodes (total + online status) and list_slices (per-slice
  free/used) and reports free/total slots plus "room for N more testers"
  (free slots divided by the fixed demo-bundle slice count).
- The users form shows a live "room for N more testers" readout, and the
  same capacity feeds a hard preflight inside provision_vm that refuses
  up front (no contract created) when the node is full or offline. The
  preflight fails open on unknown capacity, so it only ever turns a
  guaranteed failure into a fast, clean rejection, never a new blocker.
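
The capacity arithmetic and the fail-open preflight described above can be sketched as below. The `NodeCapacity` struct, the function names, and the error strings are hypothetical; only the rules (free slots divided by the per-tester slice count, refuse on full or offline, pass on unknown) come from the commit message.

```rust
/// Capacity as the deployer might see it. Illustrative shape only.
#[derive(Clone, Copy, Debug)]
struct NodeCapacity {
    online: bool,
    free_slots: u32,
}

/// "Room for N more testers": free slots divided by the fixed
/// per-tester slice count (integer division, so partial room is 0).
fn room_for_testers(free_slots: u32, slices_per_tester: u32) -> u32 {
    if slices_per_tester == 0 {
        0
    } else {
        free_slots / slices_per_tester
    }
}

/// Preflight: refuse only on a node known to be offline or full.
/// `None` means capacity could not be read, which passes (fail open).
fn preflight(capacity: Option<NodeCapacity>, slices_per_tester: u32) -> Result<(), String> {
    match capacity {
        None => Ok(()),
        Some(c) if !c.online => Err("node is offline".to_string()),
        Some(c) if room_for_testers(c.free_slots, slices_per_tester) == 0 => {
            Err("node is full".to_string())
        }
        Some(_) => Ok(()),
    }
}
```

Failing open on `None` is the design choice the message calls out: the check can only turn a guaranteed failure into an early rejection, never block a provision that might have succeeded.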

Proven end to end on the QA node: the one-click chain created,
provisioned and installed a throwaway tester to ready (login gate 302,
welcome email sent, build snapshot recorded), the capacity readout went
4, 3, 4 as that tester was added and removed, and teardown was clean.

Closes lhumina_code/home#255

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(deployer): set email sender and default Kimi key from the admin dashboard

lab publish / publish (push) Failing after 44s

5dfc626a85

Add a Service setup panel to the deployer admin dashboard so an operator can
configure the welcome-email sender and the default tester assistant key from
the browser instead of by hand on the server.

- email.rs: read the Resend sender config (api key, from address, from name)
  live from the secret store at send time instead of the start-time env
  snapshot, so a dashboard change applies on the next email with no deployer
  restart. Replaces from_env with from_parts plus send_welcome_with.
- web.rs: new deployer.get_service_config (presence booleans plus the
  non-secret from fields only, never the keys) and deployer.set_service_config
  (writes only the provided non-empty values), plus an admin_secret_value
  helper that reads the local hero_proc store.
- ssh.rs: the tester install now writes the Kimi subscription config
  (api.kimi.com/coding/v1, model kimi-for-coding) so the assistant has web
  search and fetch, and seeds the operator default key into the tester
  core/KIMI_API_KEY slot then restarts hero_kimi_web so the assistant works on
  first login. A tester can still override the key in the cockpit settings.
- admin: Service setup card with write-only key fields and set/not-set badges,
  wired to the new RPCs over the generated SDK client.
- openrpc.json: the two new methods and their output schemas.

Keys are write-only in the UI and never returned or logged. Seeded keys are
expected to be spend-capped and rotated.

Refs lhumina_code/home#256

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(deployer): per-tester assistant key seeding and welcome email controls

lab publish / publish (push) Successful in 23m55s

b76da1e9d6

Generalize the single default Kimi key into a provider registry (Groq,
OpenRouter, SambaNova, Kimi). The admin dashboard stores a default key per
provider; when adding a tester the operator seeds all configured keys or a
chosen subset, or none for a bring-your-own-key tester. Kimi keeps its
dedicated path (assistant config plus hero_kimi_web restart); the rest are
plain core secret slots a consumer reads from its process env, so seeding a
key whose consumer is not installed yet is harmless and future-proof.

Add welcome-email controls: a per-tester send toggle and an instance-wide
email master switch, both defaulting to on so existing machines are
unchanged with no migration. The operator can customize the welcome email
subject, opening paragraph, and sign-off (the login link and sign-in line
are always system-rendered, so a customization cannot break the email), and
send a test copy to a chosen address before any tester receives one.

Wire changes live in openrpc.json (the SDK client is regenerated by the
openrpc_client! macro): seed_providers and send_welcome_email params on
install_hero_stack, the provider and email fields on get/set_service_config,
and a new send_test_welcome method.

lhumina_code/home#256

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(deployer-admin): navbar plus dedicated Service setup page, honest per-tester key checkboxes

lab publish / publish (push) Successful in 23m49s

7e263fbf61

The Service setup card was buried on the home page with no navbar link, so
configuring keys and email wording was undiscoverable. Add a top nav with
Overview, Users, and Service setup (with active-page highlighting), move the
configuration to its own /settings page split into Email sender, Default
assistant keys, and Welcome email sections, and turn the home page into a
launcher with a Service setup card.

On the add-tester form, the per-provider seed checkboxes now read the live
configuration and disable any provider with no key set, with an inline link
to Service setup, so the selection reflects what will actually happen.

lhumina_code/home#256

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(deployer-admin): add an admin Manual page

lab publish / publish (push) Has been cancelled

1968f50616

Add a Manual tab to the deployer admin dashboard, mirroring the tester
cockpit's manual. A new /manual page renders a Markdown admin guide via the
shared markdown viewer component, served as text from /manual.md. The guide
covers service setup (keys, email sender and on/off, welcome email wording and
the test send), adding a tester (existing vs new account, the two buttons, and
the per-tester setup options), managing testers, and a short FAQ. Adds a Manual
link to the navbar and a Manual card on the overview.

lhumina_code/home#256

Signed-by: mik-tf <mik-tf@noreply.invalid>

style(deployer-admin): rename Service setup nav to Settings, drop nav icons

lab publish / publish (push) Has been cancelled

4bcd127977

Match the rest of the navbar (Overview and Users have no icons) by removing the
gear and book icons from the Settings and Manual links, and rename the "Service
setup" label to "Settings" everywhere user-facing (navbar, overview card, page
heading, and the manual) so it matches the /settings route.

lhumina_code/home#256

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(deployer-admin): align the Send test button with its input on the Settings preview

lab publish / publish (push) Failing after 41s

fc9d968bad

The preview row used align-items-end, which dropped the button to the help-text
line instead of lining it up with the email input. Put the input and button on
one row with the label above and help text below.

lhumina_code/home#256

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(deployer): manage admin SSH keys from the dashboard

lab publish / publish (push) Has been cancelled

a5a43240ac

Admin SSH keys (the operators' public keys injected into every tester for
shell access) can now be viewed, added, and removed from the admin Settings
page instead of only as a server-side secret set by hand.

- New deployer.set_admin_ssh_keys RPC writes the full list to the
  core/ADMIN_SSH_PUBKEYS secret (empty clears it); each entry is validated
  as an SSH public key. get_service_config returns the list (public keys)
  plus a master-installer-key presence flag.
- Provisioning reads ADMIN_SSH_PUBKEYS live from the secret store (env
  fallback) so a dashboard edit applies to the next tester with no deployer
  restart, matching the email-config live-read pattern.
- Settings page gains an "Admin SSH keys" editor; openrpc.json + SDK
  regenerated; unit tests for the pubkey validation.

Applies to newly provisioned testers; recreate a tester to refresh its keys
(sandbox tier). Part of lhumina_code/home#256

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(deployer): per-tester cockpit allowlist + opt-in tester SSH key

lab publish / publish (push) Successful in 23m19s

baa04fc4ba

Admins can now manage, from the user-detail page, which Forge accounts may
sign in to a tester's cockpit, and optionally grant a technical tester shell
access to their VM.

- M10 adds users.extra_allowed_users + users.tester_ssh_pubkey, keyed to the
  user so both survive a VM recreate (plain ALTER, not a vms rebuild).
- Cockpit allowlist: the effective list is force-unioned server-side from the
  admin set (ADMIN_FORGE_USERS) + the tester's own username + the operator
  extras, so editing the extras can never lock the team or the tester out of
  the cockpit (that slot is the sole gate for the whole cockpit). Folded into
  the install-time allowlist; applies on the next install/reinstall.
- Tester SSH key: opt-in, off by default; injected at the next provision when
  set. Sandbox/testing tier, so recreate the VM to apply. A tester shell can
  read the still-shared provider keys, so it is operator-set per tester.
- New RPCs get_tester_access / set_tester_allowlist / set_tester_ssh_keys
  (openrpc.json SSOT + regenerated SDK + smoke tests); user-detail "Access &
  keys" panel; unit tests for the force-union guard + username validation.
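
The force-union guard on the cockpit allowlist can be illustrated as follows. This is a sketch only: `effective_allowlist` is a hypothetical name, and the real code may normalize or order names differently. The point it shows is that the admin set and the tester are always included, whatever the operator extras contain.

```rust
use std::collections::BTreeSet;

/// Effective allowlist = admin set + the tester's own username + extras.
/// The union is computed server-side, so an edit to `extras` (even
/// clearing it) can never drop an admin or the tester.
fn effective_allowlist(admins: &[&str], tester: &str, extras: &[&str]) -> Vec<String> {
    let mut set: BTreeSet<String> = BTreeSet::new();
    for name in admins.iter().chain(extras.iter()) {
        let name = name.trim();
        if !name.is_empty() {
            set.insert(name.to_string());
        }
    }
    set.insert(tester.trim().to_string());
    // BTreeSet yields a deduplicated, sorted list.
    set.into_iter().collect()
}
```

Because the union is rebuilt from all three sources on every install, the stored extras stay a pure operator preference and never act as the sole gate.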

Completes lhumina_code/home#256 part 2.

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(admin): embed the assistant widget and voice bar in the navbar

lab publish / publish (push) Successful in 24m28s

dc811dfe53

Add the agent assistant widget (Kimi default) and the voice bar to the admin
dashboard navbar so an operator can use the assistant by chat or by voice.

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(admin): hide the provision node field (fixed to the default slice)

lab publish / publish (push) Successful in 23m17s

c9dead1ec8

The provision Node field is really a slice SID, and any value other than the
default silently fails to provision. Hide it and submit the default so an
operator cannot enter a non-existent slice. Explicit node selection returns
when the deployer supports more than one dedicated node.

Signed-by: mik-tf <mik-tf@noreply.invalid>

Make fresh-tester install reliable: route wait-retry, deterministic OAuth gate seed, resilient service starts

lab publish / publish (push) Successful in 28m9s

1e5b63323b

A fresh tester install was failing in three independent ways that compounded:

1. Mycelium route wait. A freshly-provisioned VM reports running on-chain well
   before its overlay route converges (minutes, sometimes longer). The bare scp
   died in ~3s with "No route to host" and the whole install failed. install_hero_stack
   now waits for SSH reachability before transferring, spending up to half the
   budget so a slow-but-eventual route still installs on the first attempt, with
   a clear error if it never comes up. The install timeout default rises to 1800s
   to cover the wait plus the ~10-15 min install (also the stale-lock window).

2. OAuth gate seed. hero_proxy seeds the forge OAuth provider once at boot by
   reading the client_id/secret from hero_proc. On a fresh tester that read is subject
   to a startup race and version skew, and silently returns "not set" even when the secret is present,
   so the provider is never created and every page returns 500. The deployer already
   holds the per-tester client_id/secret (it minted the OAuth app), so it now pushes
   the provider straight into hero_proxy via oauth.set_provider after the gateway is
   healthy. Success is confirmed by the response, not the HTTP status, and the install
   fails closed if it never succeeds. Skipped when there is no OAuth app (empty
   client_id), so the gate stays inert rather than seeding a broken record.

3. Cold-boot service-start races. hero_proc can still be registering its RPC
   surface when the first lab service --start round runs, so a component (e.g.
   hero_orchestrator) could fail with "method not found: action.set". setup-binaries.sh
   now waits for hero_proc to answer before starting services and retries any
   component that failed the first round. The deployer's hero_proxy restart also
   falls back to re-registering the service if a race left it unregistered, instead
   of aborting the whole install with "service not found".
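
The route wait in item 1 amounts to polling a reachability probe against a time budget. A minimal sketch, assuming a caller-supplied probe closure (the real install presumably probes SSH on the tester VM; the function name and error text here are hypothetical):

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Poll `probe` until it succeeds or `budget` is used up.
/// Returns the number of attempts on success, or a clear error.
fn wait_until_reachable<F>(mut probe: F, budget: Duration, interval: Duration) -> Result<u32, String>
where
    F: FnMut() -> bool,
{
    let deadline = Instant::now() + budget;
    let mut attempts = 0u32;
    loop {
        attempts += 1;
        if probe() {
            return Ok(attempts);
        }
        // Stop when another sleep would overrun the budget.
        if Instant::now() + interval > deadline {
            return Err(format!("host not reachable after {attempts} attempts"));
        }
        sleep(interval);
    }
}
```

To mirror the message, a caller could pass half of the install timeout as the budget (for example `Duration::from_secs(1800) / 2`) and use something like `std::net::TcpStream::connect_timeout` against port 22 as the probe.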

See lhumina_code/home#265

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: derive the gateway fqdn when the daemon returns ready without one

lab publish / publish (push) Failing after 5m48s

e6fd3ad7de

deploy_webgateway can reach on-chain ready while the grid SDK read-back returns
an empty fqdn (an intermittent miss). The deployer treated that as a hard error,
which skipped the per-tester OAuth app and blocked install, stalling onboarding
for that tester until manual intervention.

A name gateway's fqdn is deterministic: <gateway_name>.<zone>, where the zone is
the gateway node's domain suffix shared by every tester. When the daemon returns
ready without an fqdn, derive it (zone from TFGRID_GATEWAY_ZONE or inferred from
an existing tester's fqdn), persist it, and continue, so install + the login gate
proceed normally. install_hero_stack's repair path reuses this, so a single
Install recovers a tester that came up without a Cockpit URL.
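
The derivation rule above (`<gateway_name>.<zone>`, with the zone taken from configuration or inferred from an existing tester's fqdn) can be sketched like this. Function names and the example hostnames are hypothetical; reading TFGRID_GATEWAY_ZONE from the environment is left to the caller.

```rust
/// Infer the zone from an existing fqdn by dropping its first label.
fn zone_from_fqdn(fqdn: &str) -> Option<&str> {
    let (_, zone) = fqdn.split_once('.')?;
    if zone.is_empty() {
        None
    } else {
        Some(zone)
    }
}

/// Derive `<gateway_name>.<zone>`. The configured zone wins; otherwise
/// the zone is inferred from another tester's fqdn. Returns None when
/// neither source is available.
fn derive_fqdn(
    gateway_name: &str,
    configured_zone: Option<&str>,
    existing_fqdn: Option<&str>,
) -> Option<String> {
    let zone = match configured_zone {
        Some(z) if !z.is_empty() => z,
        _ => zone_from_fqdn(existing_fqdn?)?,
    };
    Some(format!("{gateway_name}.{zone}"))
}
```

The inference works only because, as the message states, every tester's gateway shares the same domain suffix.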

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: decouple gateway name from username + SSD-aware capacity

lab publish / publish (push) Successful in 10m12s

596553cb53

Two onboarding-reliability fixes for the single-node sandbox.

Gateway name: the web-gateway / DNS label is no longer the bare Forge
username, so re-provisioning a user never collides on a leftover
on-chain name contract. A fresh VM gets a name unique to that VM (the
username slug plus the VM short id, e.g. alice003x), and the operator
can pin a custom web-address label on the Add / Provision form
(lowercased and stripped to [a-z0-9]). A new vms.gateway_name column
holds it; existing testers are backfilled from their current fqdn label
so a reinstall keeps their address. ensure_webgateway resolves the name
once and persists it so the provision and install-repair paths agree.

Capacity: node_capacity now reports how many slices actually fit right
now, from the compute daemon's live free vCPU / RAM / SSD with the
deploy headroom (the binding constraint is usually SSD), instead of a
raw free-slot count that over-reported on a disk-bound node. It falls
back to the catalog free-slice count when talking to a daemon that
predates the new ComputeService.node_capacity, so the readout never
regresses across a version skew. The admin "room for N more testers"
banner and the pre-provision check use the honest figure.
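
The SSD-aware figure is a minimum across resources after headroom is subtracted. A sketch under assumptions: the `Resources` struct, its units, and both function names are illustrative, and the real daemon may apply headroom differently.

```rust
/// Free capacity, deploy headroom, or per-slice need. Units are illustrative.
#[derive(Clone, Copy, Debug)]
struct Resources {
    vcpu: u64,
    ram_mb: u64,
    ssd_gb: u64,
}

/// How many slices fit right now: the minimum over vCPU, RAM and SSD of
/// (free - headroom) / need. The scarcest resource is the binding one.
fn slices_that_fit(free: Resources, headroom: Resources, per_slice: Resources) -> u64 {
    fn fit(free: u64, headroom: u64, need: u64) -> u64 {
        if need == 0 {
            u64::MAX // a resource the slice does not need never binds
        } else {
            free.saturating_sub(headroom) / need
        }
    }
    fit(free.vcpu, headroom.vcpu, per_slice.vcpu)
        .min(fit(free.ram_mb, headroom.ram_mb, per_slice.ram_mb))
        .min(fit(free.ssd_gb, headroom.ssd_gb, per_slice.ssd_gb))
}

/// Version-skew fallback: use the live figure when the daemon provides
/// one, else the catalog free-slice count.
fn reported_capacity(live: Option<u64>, catalog_free_slices: u64) -> u64 {
    live.unwrap_or(catalog_free_slices)
}
```

With 16 vCPU, 64 GiB RAM and 120 GB SSD free, 20 GB of SSD headroom, and a slice needing 2 vCPU, 4 GiB and 50 GB, the result is 2: SSD binds even though CPU alone would allow 8.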

#22
#21

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(install): seed a catch-all deny route so unrouted hosts are refused

lab publish / publish (push) Successful in 23m58s

5c9a2565f7

The tester install already registers the public hostname as an
oauth-gated route on hero_proxy. Add a sibling "*" deny route so any
request that does not match the public hostname (for example one reaching
the VM by its raw mycelium or backend address) is refused with 404 rather
than served the cockpit unauthenticated. SSO is deliberately not used for
that path: the login redirect is bound to the public hostname and could
never complete off it, so a flat deny is the honest answer. The public
hostname keeps its own exact-match oauth gate.

Requires the matching hero_proxy change (catch-all "*" / deny support);
the route is inert against an older proxy.
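
The intended precedence (exact public hostname first, then the "*" catch-all) can be modelled as below. This is a toy model written from the commit text, not hero_proxy's matcher; the `Action` enum and `route` function are hypothetical.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Action {
    OauthGate, // public hostname: send through the Forge login gate
    Deny,      // anything else: refuse (404 in the commit's description)
}

/// Pick the action for a request host: an exact hostname match wins,
/// otherwise the "*" catch-all applies if one is registered.
fn route(routes: &[(&str, Action)], host: &str) -> Option<Action> {
    routes
        .iter()
        .find(|(pattern, _)| pattern.eq_ignore_ascii_case(host))
        .or_else(|| routes.iter().find(|(pattern, _)| *pattern == "*"))
        .map(|(_, action)| *action)
}
```

In this model a request arriving by a raw backend address falls through to `Deny`, and a table without the "*" entry returns `None`, which corresponds to the pre-change behaviour of having no catch-all.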

lhumina_code/home#271

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(deployer): multi-node and multi-chain provisioning across a fleet of compute daemons

lab publish / publish (push) Has been cancelled

85aad0ebd0

Replace the single HERO_COMPUTE_NODE_ADDR with a fleet of compute daemons, one
per TFGrid network, and aggregate their dedicated nodes. The deployer can now
see and place tester VMs across more than one node, and across more than one
chain; the runtime stays unified by mycelium regardless of which chain rented a
node. A single-chain deployment (no HERO_COMPUTE_DAEMONS set) is unchanged.

- Compute fleet built from HERO_COMPUTE_DAEMONS (JSON) or the single-daemon env.
- Each VM records its owning daemon (schema M12, additive column) so provision,
  delete and gateway repair route back to the right chain.
- New deployer.list_nodes RPC aggregating every node across every chain with a
  live SSD-aware capacity snapshot and a fleet summary.
- Node selection: an explicit (node, chain) pin or "auto", which picks the
  most-free fitting node server-side at provision time so it cannot go stale.
- Admin dashboard: a Nodes page, an Overview capacity strip and Nodes card, and
  a labelled node dropdown on the Add form (auto by default).
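
The "auto" placement described above picks the most-free fitting node at provision time. A minimal sketch, assuming a hypothetical `NodeView` record in which `fits` is the number of tester slices the node can still hold:

```rust
/// One node in the aggregated fleet view. Illustrative fields only.
#[derive(Clone, Debug, PartialEq)]
struct NodeView {
    chain: String,
    node_id: u32,
    online: bool,
    fits: u64, // slices that fit right now
}

/// "auto": among online nodes with room for at least one tester,
/// choose the one with the most room. None when nothing fits.
fn pick_auto(nodes: &[NodeView]) -> Option<&NodeView> {
    nodes
        .iter()
        .filter(|n| n.online && n.fits > 0)
        .max_by_key(|n| n.fits)
}
```

Running the selection server-side against a fresh capacity snapshot is what keeps the choice from going stale between rendering the form and submitting it.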

Proven live: deployed to the admin machine with no regression (M12 migrated
11->12, existing VMs intact); the aggregated view exposed a second QA node that
the single-node UI hid; a tester provisioned onto that second node and was then
torn down, with provision and delete both routing by the recorded chain.

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(deployer): per-daemon hero_router path so co-located daemons share one router port

lab publish / publish (push) Failing after 4m31s

2014b02556

The compute fleet distinguished daemons only by addr while using a fixed
/my_compute_zos/rpc/rpc path, which assumed each daemon sat behind its own
router port. hero_router is a single TCP entry point that routes to services by
path, so a second daemon co-located on one admin VM is registered under a
distinct socket name (e.g. my_compute_zos_main) and reached on the same port at
/my_compute_zos_main/rpc/rpc. Add an optional rpc_path to each
HERO_COMPUTE_DAEMONS entry (and ComputeAdapter), defaulting to the canonical
path so single-daemon and existing multi-daemon configs are unchanged.
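A minimal sketch of the optional `rpc_path` with its canonical default. The `DaemonEntry` struct, its other fields, and the `http://` scheme are assumptions for illustration; only the two paths and the defaulting behaviour come from the message above.

```rust
/// Canonical path named in the commit message.
pub const DEFAULT_RPC_PATH: &str = "/my_compute_zos/rpc/rpc";

/// Hypothetical shape of one HERO_COMPUTE_DAEMONS entry.
#[derive(Debug, Clone)]
pub struct DaemonEntry {
    pub label: String,
    pub addr: String,
    /// Optional per-daemon router path; `None` means "use the canonical path".
    pub rpc_path: Option<String>,
}

impl DaemonEntry {
    /// The path actually used to reach this daemon through hero_router.
    pub fn effective_rpc_path(&self) -> &str {
        self.rpc_path.as_deref().unwrap_or(DEFAULT_RPC_PATH)
    }

    /// Full RPC URL (the http scheme is an assumption of this sketch).
    pub fn rpc_url(&self) -> String {
        format!("http://{}{}", self.addr, self.effective_rpc_path())
    }
}
```

Two entries with the same `addr` then differ only by path, which is what lets co-located daemons share one router port.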

Signed-by: mik-tf <mik-tf@noreply.invalid>

ci: canonical lab-publish workflow (build main/development/integration)

lab publish / publish (push) Has been cancelled

Details

3928f5ef9e

Publishes musl-x86_64 binaries to per-branch releases (latest,
latest-dev, latest-integration) and installs lab from the matching
hero_skills branch (clone + build via --branch). Triggers only on push
to these three branches.

ci: canonical lab-publish workflow (build main/development/integration)

lab publish / publish (push) Has been cancelled

Details

4cc3493141

Publishes musl-x86_64 binaries to per-branch releases (latest,
latest-dev, latest-integration) and installs lab from the matching
hero_skills branch (clone + build via --branch). Triggers only on push
to these three branches.

ci: trigger lab-publish run

lab publish / publish (push) Waiting to run

Details

a8dc79b998

ci: trigger lab-publish run

lab publish / publish (push) Failing after 2m47s

Details

25fc0d454c

feat(admin): manage deployer admin access settings 89e3019ed5

Merge remote-tracking branch 'origin/main' into main_home_277_admin_access

lab publish / publish (push) Waiting to run

Details

bb1c149620

feat(deployer): apply access settings to existing testers

lab publish / publish (push) Waiting to run

Details

87c7fc140b

feat(admin): show the TFGrid network per VM on the Users page

lab publish / publish (push) Waiting to run

Details

2e5e5e5e76

The deployer now provisions across more than one TFGrid network from one admin
VM, but the Users page VM table showed only a node short id, which is ambiguous
(the same short id exists on more than one network). Thread the owning fleet
daemon (daemon_label) through list_vms (VmRow schema + response) into the admin
and add a Network column to the VM table, so each tester clearly shows which
network and dedicated node it lives on. Display only; the value is already
stored per VM from the multi-node work. An empty value (pre-multi-node rows) is
shown as `default` (the first-configured daemon).

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(admin): serve shared static assets before service static a7e423bd87

Merge remote-tracking branch 'origin/main' into main_home_277_admin_access

lab publish / publish (push) Successful in 8m58s

Details

8c32e45e35

feat(deployer): manage dedicated nodes from admin

lab publish / publish (push) Successful in 7m1s

Details

d468dc4cfb

fix(admin): polish dedicated node search UX

lab publish / publish (push) Successful in 7m9s

Details

8a9b51e59e

feat(admin): manage compute wallet mnemonics

lab publish / publish (push) Successful in 7m6s

Details

efb9141350

feat(deployer): add testnet compute fleet support

lab publish / publish (push) Successful in 7m2s

Details

b6db118dcb

fix(deployer): present networks in canonical order

lab publish / publish (push) Successful in 7m10s

Details

9f8490f605

fix(deployer): add onboarding retry controls

lab publish / publish (push) Successful in 7m11s

Details

68b2571d1d

feat(deployer): add node detail drawer

lab publish / publish (push) Has been cancelled

Details

6ae9db320e

feat(deployer): show active testers per node

lab publish / publish (push) Successful in 9m2s

Details

90d0141cc7

fix(deployer): default node search to rentable

lab publish / publish (push) Has been cancelled

Details

f6d42e8e4b

fix(deployer): remove users locally only

lab publish / publish (push) Successful in 10m13s

Details

5e3633afaa

feat(deployer): copy mycelium addresses

lab publish / publish (push) Successful in 7m7s

Details

b3ee76bfa5

fix(deployer): cap demo testers at one slice

lab publish / publish (push) Has been cancelled

Details

78b7e590e1

Set the demo tester profile default to one TFGrid slice, which is 4 GB RAM. Keep the OpenRPC description and SDK compile smoke aligned so the admin capacity math and the provision default use the same profile size.

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(deployer): require two-slice testers with disk floor

lab publish / publish (push) Has been cancelled

Details

9fabc02c44

Use the current hero_compute slice model honestly for demo testers: two slices gives 2 vCPU and 8 GB RAM today. Add a 50 GB SSD floor to node suitability so the Nodes search and auto-placement do not treat low-disk-per-slice nodes as viable for the expanded tester bundle.

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(deployer): lower tester disk floor

lab publish / publish (push) Successful in 10m46s

Details

05d8accd3f

feat(deployer): show node links and include foundry

lab publish / publish (push) Successful in 7m4s

Details

a282a3de4a

ci(deployer): publish main to latest

lab publish / publish (push) Successful in 23m47s

Details

0d454e6316

feat(deployer): include indexer in tester bundle

lab publish / publish (push) Has been cancelled

Details

e9c75d659b

fix(deployer): simplify users card label

lab publish / publish (push) Successful in 26m38s

Details

1768e90728

feat(deployer): add release-channel tester updates

lab publish / publish (push) Successful in 28m43s

Details

0478e581c3

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(deployer): restart services after tester updates

lab publish / publish (push) Has been cancelled

Details

ee0d443001

Reinstalling a tester VM already forced binary downloads, but the follow-up
service pass used idempotent starts. Existing processes therefore kept running
old binaries after an update. Restart hero_proc_server and reset each installed
service so update/reinstall actually applies the downloaded release channel.

Signed-by: mik-tf <mik-tf@noreply.invalid>

test(deployer): keep tfgrid deployer off tester bundle 055b20b1fd

Pin the tester base bundle so hero_tfgrid_deployer is never installed into tester VMs. The deployer belongs on the admin/control VM only.

Signed-by: mik-tf <mik-tf@noreply.invalid>

ci: lab-release workflow (prebaked lab-builder image) 3e03d4b8eb

Replaces lab-publish.yaml with a single lab-release workflow that pulls the
prebaked lab-builder image and publishes per-branch releases (main=stable,
development/integration=pre-release). No per-run toolchain/lab install.

ci: lab-release workflow (prebaked lab-builder image) cf82abf1cd

Replaces lab-publish.yaml with a single lab-release workflow that pulls the
prebaked lab-builder image and publishes per-branch releases (main=stable,
development/integration=pre-release). No per-run toolchain/lab install.

feat(deployer): add admin control surface

lab release / release (push) Successful in 7m41s

Details

f53ac75e56

Add a Control tab to the TFGrid deployer admin UI for admin-VM-only shared services and provider dashboards.

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(deployer): keep control page focused on shared providers

lab release / release (push) Successful in 6m24s

Details

06d80ca3c8

Remove redundant deployer and misleading voice entries from the admin VM Control page so it only exposes dedicated shared provider admin surfaces.

Signed-by: mik-tf <mik-tf@noreply.invalid>

ci: trigger lab-release (latest-main)

lab release / release (push) Successful in 19m41s

Details

474f7242d6

ci: trigger lab-release (latest-main)

lab release / release (push) Has been cancelled

Details

0dea3a52fb

fix(deployer): avoid proxy restart during admin allowlist save

lab release / release (push) Successful in 5m0s

Details

ce57ce8141

The admin allowlist save path writes ADMIN_FORGE_USERS and previously restarted hero_proxy_server before returning. That killed the in-flight dashboard response because the request itself is proxied through hero_proxy, so the browser saw a Bad Gateway body instead of JSON.

Keep the compatibility response field, but let hero_proxy pick up the new allowlist through its existing short TTL cache. Update the settings UI and OpenRPC description to match the no-restart behavior.

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(deployer): skip uninstalled VMs during access sync

lab release / release (push) Successful in 7m21s

Details

388e50c060

Applying admin access settings expects the tester VM to have a Hero stack so it can write hero_proc secrets and restart the tester proxy. Running that path against a provisioned but uninstalled VM reports a false failure even though the VM is not a valid sync target yet.

Skip VMs whose install_state is not ready so the settings page reports actionable failures only.

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(deployer): use branch-suffixed release channels

lab release / release (push) Successful in 19m18s

Details

02defc0937

Align tester install and update flows with the current lab-release tag contract: main uses latest-main, development uses latest-development, and integration uses latest-integration.
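The tag contract above amounts to a small lookup. A sketch, assuming a hypothetical `release_channel` helper (the function name and the `None` result for other branches are illustrative, not taken from the repository):

```rust
/// Map a branch name to its lab-release channel tag.
/// Branches outside the three release branches have no channel in this sketch.
pub fn release_channel(branch: &str) -> Option<&'static str> {
    match branch {
        "main" => Some("latest-main"),
        "development" => Some("latest-development"),
        "integration" => Some("latest-integration"),
        _ => None,
    }
}
```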

Signed-by: mik-tf <mik-tf@noreply.invalid>

ci: multi-arch lab-release (linux-musl x86_64 + arm64)

lab release / release (push) Successful in 14m25s

Details

28aa3f3142

ci: canonical lab-release (cargo check + multi-arch + hero.releaser)

lab release / release (push) Has been cancelled

Details

b36ed7aa29

ci: canonical lab-release (cargo check + multi-arch + hero.releaser)

lab release / release (push) Successful in 34m22s

Details

d353f649e4

fix(deployer): scope auto node selection by network 46abf6f5be

Auto provisioning can now be scoped by daemon/network, so a QA tester stays on a QA-capable node even when another chain has a duplicate node SID with more free capacity. The admin user forms submit the selected daemon for both explicit nodes and Auto, and the regression is covered by server tests.

Signed-by: mik-tf <mik-tf@noreply.invalid>

merge remote main before deployer auto-scope push

lab release / release (push) Has been cancelled

Details

eb08e9b120

Integrate the canonical lab-release workflow updates that landed on main while the deployer auto-selection fix was being prepared.

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(deployer): mark managed nodes in finder

lab release / release (push) Successful in 8m10s

Details

2a898e7738

The Nodes finder now detects TFGrid nodes already registered in the deployer and shows a Registered action that opens the managed-node detail drawer instead of offering a duplicate Register action.

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(deployer): require rent before node registration

lab release / release (push) Successful in 12m29s

Details

ad23d0384f

Dedicated TFGrid nodes can no longer be added to the deployer catalog unless Grid Proxy confirms the selected daemon's wallet twin owns the rent. The Nodes finder now shows Rent + register as the only action for rentable unrented nodes and keeps actions visibly busy while rent/register/unregister calls run.

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(deployer): clarify node unregister behavior

lab release / release (push) Successful in 12m20s

Details

6057596e80

The Nodes page now labels the trash action as deployer-catalog unregister only and asks for confirmation before running it. The server-side unregister guard also treats legacy empty-daemon VM rows as QA when a QA daemon exists, so mainnet node removal is not blocked by old QA tester rows sharing the same short SID.

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(deployer): add node unrent action

lab release / release (push) Successful in 8m9s

Details

d14d8cf20c

The deployer now exposes a guarded cancel-rent RPC that infers the active rent contract from Grid Proxy, verifies daemon ownership when available, and refuses while the node is still registered. The Nodes finder shows Unrent for rented unmanaged nodes, keeping catalog unregister and on-chain rent cancellation as separate operator actions.

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(deployer): add node adopt retire lifecycle

lab release / release (push) Successful in 12m10s

Details

a926403461

Add a managed-node retire RPC that refuses while tester VMs still use the node, unregisters it from the compute catalog, and cancels the TFGrid rent contract when owned by the deployer wallet. Update the Nodes UI terminology to Adopt node for rent/register and Retire node for unregister/cancel-rent, keeping Unregister only as an explicit advanced path.

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(deployer): use app modal for node lifecycle actions

lab release / release (push) Successful in 8m15s

Details

033f2dd948

Replace browser-native node lifecycle confirmations with the deployer UI modal and unwrap nested compute RPC errors so node retire/unregister failures show actionable text.

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(deployer): hide rpc prefix in admin json errors

lab release / release (push) Successful in 12m12s

Details

a534dfe0c6

Strip generated JSON-RPC transport prefixes from admin JSON error bodies so node lifecycle banners show the actionable message returned by the deployer server.

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(deployer): show compute-local node blockers

lab release / release (push) Successful in 8m13s

Details

c033a106e6

Add a node_compute_vms RPC so the Nodes drawer can show VMs that exist in
the compute daemon but are not assigned to a deployer user row. Retire and
unregister now refuse on those compute-local blockers before attempting node
catalog removal, and the drawer disables lifecycle actions with an explicit
warning table.

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(deployer): manage compute-local node blockers

lab release / release (push) Successful in 8m13s

Details

dcbb60c0d2

Add guarded cleanup for unassigned compute-daemon VMs from the Nodes drawer, refresh the operator manual, and align top-level deployer page guidance.

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(deployer): route manual sections

lab release / release (push) Successful in 29m44s

Details

0dec02b3ec

Show one manual section at a time through hash routes so sidebar navigation changes the docs panel instead of scrolling a long page.

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat: include aibroker in tester bundle

lab release / release (push) Successful in 8m26s

Details

aa23cb9f97

Add hero_aibroker to the deployer-managed tester stack and installer component map so fresh sandbox installs pull the current latest-main broker binaries alongside Books and Memory.

Signed-by: mik-tf <mik-tf@noreply.invalid>

Merge origin/main into integration

lab release / release (push) Successful in 11m38s

Details

e6460feeca

Bring current main changes into integration before release convergence.

Signed-by: mik-tf <mik-tf@noreply.invalid>

Default sandbox installs to integration channel

lab release / release (push) Has been cancelled

Details

d9fa9aca24

Make the deployer admin UI, install script, RPC defaults, and operator docs use latest-integration as the sandbox install default while keeping latest-main selectable for promoted demos and rollback.

Signed-by: mik-tf <mik-tf@noreply.invalid>

ci: canonical-only lab-release (+cargo test); remove other workflows

lab release / release (push) Successful in 18m39s

Details

b2e90b568a

feat(deployer): show linked tester build details

lab release / release (push) Has been cancelled

Details

1a9028e485

Record the selected release tag in new tester VM build snapshots and keep old snapshots compatible through serde defaults. Make the user VM Build badge open a details modal with per-component Forge repo, release, and commit links.

Local gates:
- cargo +1.96 check -p hero_tfgrid_deployer_admin -p hero_tfgrid_deployer_server
- cargo +1.96 test -p hero_tfgrid_deployer_server releases::tests --lib
- git diff --check

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(deployer): clarify whole-stack update channel

lab release / release (push) Has been cancelled

Details

a4b31654f4

The deployer update controls reinstall tester VMs from one selected stack channel.
Make the selector and action tooltips explicit so this is not confused with Cockpit per-service channel choices.

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(deployer): preserve service channels on tester updates

lab release / release (push) Successful in 45m13s

Details

b2410fd34e

Ready tester VM batch updates now delegate to the tester Cockpit and update each service bundle from its recorded channel.
First installs and retries keep using the selected whole-stack channel, and new installs copy the stack build snapshot to the tester for Cockpit build backfill.

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(deployer): report per-daemon health in the fleet listing 8d172d3727

deployer.list_nodes now returns ok and error per configured daemon plus
a daemons_unreachable count, so an unreachable chain daemon is
distinguishable from an empty fleet. Previously a downed daemon's nodes
silently vanished from the response and the dashboard showed
"no dedicated nodes configured" during an outage.

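A sketch of how a per-daemon result can be folded into health entries plus an unreachable count. The `DaemonHealth` struct and the `summarize` function are hypothetical; only the `ok`, `error`, and `daemons_unreachable` names come from the message above.

```rust
/// Hypothetical health entry for one configured daemon.
#[derive(Debug, Clone)]
pub struct DaemonHealth {
    pub label: String,
    pub ok: bool,
    pub error: Option<String>,
}

/// Turn per-daemon probe results (node count or error text) into health
/// entries and a `daemons_unreachable` count.
pub fn summarize(probes: &[(&str, Result<usize, String>)]) -> (Vec<DaemonHealth>, usize) {
    let daemons: Vec<DaemonHealth> = probes
        .iter()
        .map(|(label, result)| DaemonHealth {
            label: label.to_string(),
            ok: result.is_ok(),
            error: result.as_ref().err().cloned(),
        })
        .collect();
    let daemons_unreachable = daemons.iter().filter(|d| !d.ok).count();
    (daemons, daemons_unreachable)
}
```

With this shape, an empty node list with `daemons_unreachable == 0` means an empty fleet, while the same list with a non-zero count means an outage.
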
Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(admin): daemon health rendering, logs tab, services tile, channel copy

lab release / release (push) Has been cancelled

Details

ff3f35bce4

Nodes page and the overview fleet strip now show which chain daemon is
unreachable with its last error and point at the admin services page,
instead of rendering an outage as an empty fleet. New Logs page embeds
the shared logs viewer against a read-only relay to the supervisor's
logs.filter (the relay renames the component's src_prefix field to the
supervisor's src and refuses non-read methods). Control gains an
"Admin VM services" tile linking the admin Cockpit services page. The
Users page channel selector is labeled as the default for new installs
and the subtitle says updates preserve each service's recorded channel.

Signed-by: mik-tf <mik-tf@noreply.invalid>

chore(sdk): regenerate client snapshot for daemon health fields

lab release / release (push) Has been cancelled

Details

aa4912623f

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(admin): bound the Control page supervisor lookup

lab release / release (push) Has been cancelled

Details

edd3c5b871

The page awaited hero_proc service.status_all with no timeout, so a
wedged supervisor hung the browser tab forever. The lookup now times
out after 8s and the page renders the fallback card list with a
warning banner instead.
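The bounded lookup can be illustrated with the standard library alone. This is a sketch under stated assumptions: the real admin code is probably async and may use a different mechanism, and `with_timeout` is a hypothetical helper, not a function from the repository.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Run `work` on a helper thread and wait at most `timeout` for its result.
/// Returns `None` on timeout so the caller can render a fallback.
/// Note: the helper thread is not cancelled; it simply finishes unobserved.
pub fn with_timeout<T, F>(timeout: Duration, work: F) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone after a timeout; ignore that error.
        let _ = tx.send(work());
    });
    rx.recv_timeout(timeout).ok()
}
```

A caller would pass `Duration::from_secs(8)` and fall back to the static card list with a warning banner when the result is `None`.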

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(admin): route the services links through the router path

lab release / release (push) Has been cancelled

Details

e32e276945

The admin domain proxies everything to hero_router, which only routes
service-prefixed paths, so the bare /services link returned 404 for
authenticated users. Point the Control tile and the Nodes daemon
warning at /hero_cockpit/web/services.

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(admin): navbar link to this machine's own Cockpit

lab release / release (push) Successful in 14m54s

Details

6791cf4697

The deployer admin is the fleet surface; the machine surface is this
VM's Cockpit. Make the machine surface a first-class navbar entry
instead of a tile buried on Control.

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(admin): name the Control card Hero Cockpit and land on its main page

lab release / release (push) Has been cancelled

Details

72a6864acd

The Control card and the This-machine navbar link open this admin VM's own
Hero Cockpit, not a tester's; the card now says so explicitly and both
links land on the Cockpit main page instead of jumping straight to the
Services tab. The Nodes-page daemon-restart alert keeps its deep link to
the Services tab because it points at a specific restart action.

lhumina_code/home#282

Signed-by: mik-tf <mik-tf@noreply.invalid>

ui(admin): Control page tiles match the cockpit Apps page look

lab release / release (push) Has been cancelled

Details

4cf4a512ea

Same visual language as the tester-facing Apps page: gradient hero
banner, lift/glow tiles with an icon art banner, title row with the
status badge, and full-size Open buttons, so the two machine surfaces
read as one product.

Signed-by: mik-tf <mik-tf@noreply.invalid>

ui(admin): drop This-machine navbar link; Control heading matches sibling pages

lab release / release (push) Successful in 30m5s

Details

bb4d537fad

Control is the navbar home for reaching this machine's Hero Cockpit, so
the extra navbar link is redundant. The Control heading returns to the
standard admin layout (title, lead, refresh button) used by the Nodes
page; only the tiles keep the cockpit Apps look.

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(placement): sandbox VMs deploy only to nodes our wallet rents

lab release / release (push) Has been cancelled

Details

27f4cabb8a

Provisioning could place a tester VM on any registered node, including
shared (unrented) ones, where capacity can be consumed by any grid user
at any time and where paid chains bill per deployment instead of inside
the rent. The node registration guard also waved shared nodes through:
it only protected against registering a node rented by someone else.

One placement policy now covers both paths. Registration and provision
both verify the live rent status from Grid Proxy and refuse any node
that is not rented by this deployer's twin, failing closed when the
node cannot be verified at all. The provision-time check matters even
with registration guarded, because nodes can enter a daemon's catalog
without passing through the deployer.

The policy is deliberately a gate, not a wall: setting the
deployer/TFGRID_ALLOW_SHARED_NODES secret to true opens shared-node
placement fleet-wide with no restart or rebuild (the value is read live
per decision). A node rented by another twin and an unrented dedicated
node stay blocked even with the gate open, since the chain refuses our
deployments there regardless.
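The policy in this commit reduces to a small decision table. A sketch, assuming a hypothetical `RentStatus` enum and `may_place` function; the boolean stands for the live value of the deployer/TFGRID_ALLOW_SHARED_NODES secret.

```rust
/// Hypothetical classification of a node's live rent status from Grid Proxy.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RentStatus {
    /// Rented by this deployer's twin.
    RentedByUs,
    /// Rented by another twin.
    RentedByOther,
    /// Dedicated node that nobody rents yet.
    UnrentedDedicated,
    /// Shared (unrented, non-dedicated) node.
    Shared,
    /// Status could not be verified at all.
    Unverified,
}

/// Decide whether a tester VM may be placed on a node.
/// Fails closed: anything that is not explicitly allowed is refused.
pub fn may_place(status: RentStatus, allow_shared: bool) -> bool {
    match status {
        RentStatus::RentedByUs => true,
        RentStatus::Shared => allow_shared,
        RentStatus::RentedByOther
        | RentStatus::UnrentedDedicated
        | RentStatus::Unverified => false,
    }
}
```

Calling the same function from both registration and provision is one way to keep the two paths on a single policy.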

#24

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(placement): chain-identity rails — daemon network and gateway zone must match

lab release / release (push) Has been cancelled

Details

4a935cbd0c

Cross-chain mixing (a VM on one network with its public gateway on
another) is structurally prevented by the per-daemon provision flow, but
only as long as each daemon really is on the chain its label claims. A
stale daemon build proved that assumption can silently fail: it ignored
its config context and answered for QA while labelled main. Mycelium
spans all chains, so a wrong-chain gateway would even have worked,
publishing a tester on another network's domain.

Three rails, all fail-closed, none with an override (there is no
legitimate reason to mix networks):

The provision path now verifies the daemon's chain identity first: the
gateway zones a daemon reports come from its own grid view, so they are
a property of the chain its credentials point at; a mismatch with the
configured network refuses the provision and names the zone. The fleet
listing runs the same probe and excludes a mismatched daemon's nodes
from placement, surfacing the mismatch in daemon health (a probe
transport failure only logs; the provision-time check still protects).

The gateway mint refuses to persist-and-publish an fqdn whose DNS zone
belongs to a different network than the owning daemon. QA, testnet,
devnet and mainnet zones are distinct; mainnet is the bare grid.tf form.

The derived-fqdn recovery path (daemon returned ready without an fqdn)
now only accepts a zone belonging to the requesting daemon's network;
it previously inferred the zone from any tester's fqdn, which would
cross chains as soon as testers span networks.
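A sketch of a zone guard of the kind described above. Only "mainnet is the bare grid.tf form" comes from this message; the `qa`, `test`, and `dev` zone labels, the `Network` enum, and both function names are assumptions made for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Network {
    Qa,
    Testnet,
    Devnet,
    Mainnet,
}

/// Infer the network from an fqdn's DNS zone.
/// ASSUMPTION: the non-mainnet zones are `<label>.grid.tf` with the labels below.
pub fn network_of_fqdn(fqdn: &str) -> Option<Network> {
    let host = fqdn.trim_end_matches('.').to_ascii_lowercase();
    if host.ends_with(".qa.grid.tf") {
        Some(Network::Qa)
    } else if host.ends_with(".test.grid.tf") {
        Some(Network::Testnet)
    } else if host.ends_with(".dev.grid.tf") {
        Some(Network::Devnet)
    } else if host.ends_with(".grid.tf") {
        // Bare grid.tf form: mainnet. Checked last so sub-zones win.
        Some(Network::Mainnet)
    } else {
        None
    }
}

/// Fail-closed guard: an fqdn is accepted only if its zone belongs to the
/// owning daemon's network. Unknown zones are refused.
pub fn zone_matches(fqdn: &str, daemon_network: Network) -> bool {
    network_of_fqdn(fqdn) == Some(daemon_network)
}
```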

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(provision): normalize the VM name to the daemon's charset

lab release / release (push) Successful in 33m30s

Details

91583079ba

The compute daemon only accepts VM names made of lowercase letters,
digits, and hyphens, but the provision default derived the name from
the Forge username verbatim, so any mixed-case username (or one with
dots or underscores) failed its first provision with an invalid-name
error from the daemon. The gateway label already had this treatment
via gateway_slug; the VM name now gets the same: lowercased, common
separators mapped to hyphens, anything else dropped, and an explicit
name that ends up empty is rejected instead of silently renamed.
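The normalization rule can be sketched as below. The function name `normalize_vm_name` and the exact set of "common separators" (dot, underscore, space) are assumptions; the repository's helper may differ in both.

```rust
/// Normalize a VM name to lowercase letters, digits, and hyphens.
/// Separators become hyphens, everything else is dropped, and a name that
/// ends up empty is rejected (`None`) rather than silently replaced.
pub fn normalize_vm_name(raw: &str) -> Option<String> {
    let mut out = String::with_capacity(raw.len());
    for ch in raw.chars() {
        match ch {
            'a'..='z' | '0'..='9' | '-' => out.push(ch),
            'A'..='Z' => out.push(ch.to_ascii_lowercase()),
            // ASSUMPTION: these are the "common separators".
            '.' | '_' | ' ' => out.push('-'),
            _ => {}
        }
    }
    if out.is_empty() {
        None
    } else {
        Some(out)
    }
}
```

This sketch does not trim leading or trailing hyphens; whether the daemon accepts them is not stated in the message.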

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(placement): shared node support, liveness filter, gateway-name availability

lab release / release (push) Successful in 18m32s

Details

411d1b184a

Auto-placement prefers dedicated (rented) nodes and overflows to shared
(unrented) nodes only where the per-network shared gate is open. Candidate
selection and node adoption filter on Grid Proxy liveness (healthy + recent
heartbeat) instead of the rentable/status flags a dead node keeps advertising.
Shared placement is gated by TFGRID_ALLOW_SHARED_NODES with an extra mainnet
opt-in. Provisions onto one node are serialized across the capacity preflight
and the deploy that consumes the slices. A custom web address already registered
on-chain is rejected before the VM is deployed; a blank address gets a per-VM
auto name with a free-variant suffix. Nodes admin page reframed with a
Dedicated/Shared badge, a stale-node liveness flag, type-labelled capacity, and
split Rent & adopt vs Use shared actions.
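Serializing provisions onto one node across the preflight and the deploy can be pictured as a per-node lock. This is a synchronous sketch with hypothetical names (`NodeLocks`, `lock_for`, `with_node`); the server is likely async and may use a different primitive.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// One mutex per node sid, created on first use.
#[derive(Default)]
pub struct NodeLocks {
    locks: Mutex<HashMap<String, Arc<Mutex<()>>>>,
}

impl NodeLocks {
    /// Return the lock for `node_sid`, creating it if needed.
    pub fn lock_for(&self, node_sid: &str) -> Arc<Mutex<()>> {
        let mut map = self.locks.lock().unwrap();
        map.entry(node_sid.to_string()).or_default().clone()
    }

    /// Run `f` (capacity preflight plus deploy) while holding the node's
    /// lock, so two provisions cannot both pass preflight on the same slices.
    pub fn with_node<T>(&self, node_sid: &str, f: impl FnOnce() -> T) -> T {
        let lock = self.lock_for(node_sid);
        let _guard = lock.lock().unwrap();
        f()
    }
}
```

Provisions onto different nodes take different locks and still run in parallel.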

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(nodes-ui): explicit Dedicated/Shared badge, drop confusing "not reserved"

lab release / release (push) Successful in 33m41s

Details

1a9157d887

Add an explicit Dedicated (blue) / Shared (amber) badge next to each node SID in
the managed nodes table, and replace the "(not reserved)" capacity wording with a
plain slice count plus a "shared capacity, best-effort: other grid users can take
these slices" note on shared rows.

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(nodes-ui): shared nodes show "Remove", not "Retire" (no rent to cancel)

lab release / release (push) Has been cancelled

Details

33ce873787

Shared nodes are never rented, so "Retire" (which cancels the rent and stops
billing) was misleading. Shared nodes now show a single "Remove" action with a
confirm that states nothing is cancelled or billed and the node stays public;
dedicated nodes keep Retire plus Unregister-only.

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(nodes-ui): plain-English node removal verbs (drop Retire/Unregister jargon)

lab release / release (push) Has been cancelled

Details

833012a172

Shared node: "Remove" (nothing rented/billed). Dedicated node: "Remove & end
rental" (cancels the rental, stops billing) and "Remove, keep rental" (keeps
paying). Confirm dialogs and flash messages reworded to match.

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(nodes-ux): consistent Reserve/Unreserve + Shared vocabulary across the UI

lab release / release (push) Has been cancelled

Details

5eeaa34210

Adopt with an explicit choice (Reserve as dedicated / Add as shared) where both
apply; removal mirrors it (Remove and Unreserve / Remove, keep reserved / Remove
for shared). Add-user and Provision node dropdowns label each option Dedicated or
Shared; Auto is "dedicated first, then shared" with the preview matching the
server's dedicated-first placement.

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(nodes-ux): Find-Nodes badge reflects node type, not rent status

lab release / release (push) Has been cancelled

Details

93ffb39de0

Discovery badge now keys on the node's dedicated flag (Dedicated vs Shared) so an
unrented dedicated-type node reads Dedicated, not Shared. The managed-table badge
still keys on whether we rent it. Whether a dedicated node can also be used shared
is left to verify empirically; wording avoids over-asserting it.

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(nodes-ux): correct dedicated-node tooltip — shared deploy IS possible

lab release / release (push) Has been cancelled

Details

835e781948

A Dedicated+Rentable node can host a shared (per-VM) deployment without being
reserved (verified on the ThreeFold dashboard). The badge now reads
"Dedicated-capable", with a tooltip noting that the node can be reserved or used
shared per-VM (a later renter could evict a shared workload). Opening shared
placement on dedicated-capable nodes in the deployer guard is a follow-up product
decision.

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(placement): allow shared deploy on dedicated-capable nodes; offer both adopt options

lab release / release (push) Successful in 44m59s

Details

1c6361dc74

A single VM deploys shared (per-resource, no rent) on a Dedicated+Rentable node
(verified on the TF dashboard), so shared placement is not limited to
non-dedicated nodes. The placement guard now allows shared on any not-rented node
when the shared gate is open; Find Nodes offers "Add as shared" on every
not-rented node plus "Reserve as dedicated" when rentable. Eviction caveat shown
in the tooltip, not blocked. Guard tests updated.

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(gateway): route tester public-URL gateway to a decoupled gateway daemon

lab release / release (push) Successful in 42m35s

Details

0d99352e89

A tester VM's public-URL gateway-name-proxy is now minted on a designated
gateway daemon (the reliable QA chain by default), independent of the chain
the tester's compute VM runs on. A gateway-name-proxy is a reverse proxy to a
mycelium backend, and mycelium is one overlay across chains, so a QA gateway
can front a VM whose compute runs on another chain. This unblocks onboarding
while the foundation mainnet name-gateways are not converging to ready.

- new TFGRID_GATEWAY_DAEMON_LABEL knob (default "qa"); resolve_gateway_daemon
  falls back to the VM's own compute daemon when no such daemon is configured,
  so single-chain deployments stay byte-identical.
- ensure_webgateway now keys the gateway node sid, deploy adapter, on-chain
  name-contract availability check, fqdn-zone derivation, and the zone guard
  off the gateway daemon's network. The zone guard is re-pointed (still
  validates the fqdn belongs to the gateway daemon's chain), not removed.
- delete reaps the gateway name contract on the chain it was minted on,
  reverse-derived from the stored fqdn zone, so a cross-chain tester never
  orphans its gateway name contract; delete_vm stays on the compute daemon.
- 3 new unit tests for daemon routing + reverse-derive; no schema change.
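The fallback in the first bullet can be sketched as follows. The signature is an assumption for illustration; only the function name `resolve_gateway_daemon`, the "qa" default, and the fall-back-to-own-daemon behaviour come from the message.

```rust
/// Default for the TFGRID_GATEWAY_DAEMON_LABEL knob, per the commit message.
pub const DEFAULT_GATEWAY_DAEMON_LABEL: &str = "qa";

/// Choose the daemon that mints the gateway for a VM.
/// Uses the designated gateway daemon when it is configured, otherwise the
/// VM's own compute daemon, so single-chain deployments behave as before.
pub fn resolve_gateway_daemon<'a>(
    configured_labels: &[&str],
    gateway_label: &'a str,
    vm_daemon_label: &'a str,
) -> &'a str {
    if configured_labels.iter().any(|l| *l == gateway_label) {
        gateway_label
    } else {
        vm_daemon_label
    }
}
```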

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(deployer): composite VM identity (daemon_label + vm_sid) so two networks can share a vm id

lab release / release (push) Successful in 33m28s

Details

275cdbcf11

Each compute daemon (one per TFGrid network) mints VM short ids from its
own sequence, so two networks can hand out the same vm_sid (for example a
test VM and a main VM both numbered 005o). The vms table keyed uniqueness
on vm_sid alone, so the second provision insert was rejected even though
the VM had already deployed on chain, leaving a running VM the deployer never
recorded.

Make stored VM identity composite on (daemon_label, vm_sid):

- M13 recreates vms with UNIQUE(daemon_label, vm_sid) instead of
  UNIQUE(vm_sid). Safe on existing data (vm_sid is currently globally
  unique, so the composite holds for every existing row).
- insert_vm writes daemon_label in the same INSERT (not a follow-up
  UPDATE), so the composite key is complete the instant the row exists;
  a separate update would let both networks' rows land under ('', vm_sid)
  and still collide before the label was ever set.
- every vms mutator (state, webgateway, gateway_name, install_state,
  oauth, tenant_token, installed_releases, delete) keys on the composite,
  so a write can never touch the wrong network's row.
- find_vm(vm_sid, optional daemon_label) resolves a VM exactly when the
  label is supplied; without a label it looks up by vm_sid alone and errors
  if more than one network has that id (never a silent wrong-row read or
  delete).
- delete_vm / install_hero_stack / update_vm_services /
  check_build_updates gain an optional daemon_label param (openrpc + sdk)
  so a caller can disambiguate a shared vm_sid.
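
The composite key and the lookup rule above can be sketched with SQLite from the Python standard library. This is a minimal illustration, not the deployer's code: the trimmed-down table, the column defaults, and the Python function bodies are assumptions; only the `UNIQUE(daemon_label, vm_sid)` constraint and the exact/ambiguous lookup behaviour come from the commit message.

```python
import sqlite3


def open_db():
    db = sqlite3.connect(":memory:")
    # Hypothetical, trimmed-down vms table; the real one has more columns.
    db.execute(
        "CREATE TABLE vms ("
        " daemon_label TEXT NOT NULL DEFAULT '',"
        " vm_sid TEXT NOT NULL,"
        " UNIQUE(daemon_label, vm_sid))"
    )
    return db


def insert_vm(db, daemon_label, vm_sid):
    # The label goes into the same INSERT, so the composite key is complete
    # as soon as the row exists.
    db.execute(
        "INSERT INTO vms (daemon_label, vm_sid) VALUES (?, ?)",
        (daemon_label, vm_sid),
    )


def find_vm(db, vm_sid, daemon_label=None):
    if daemon_label is not None:
        rows = db.execute(
            "SELECT daemon_label, vm_sid FROM vms"
            " WHERE daemon_label = ? AND vm_sid = ?",
            (daemon_label, vm_sid),
        ).fetchall()
    else:
        rows = db.execute(
            "SELECT daemon_label, vm_sid FROM vms WHERE vm_sid = ?",
            (vm_sid,),
        ).fetchall()
    if len(rows) > 1:
        # Ambiguous id: refuse rather than pick a row silently.
        raise LookupError(f"vm_sid {vm_sid!r} exists on more than one network")
    return rows[0] if rows else None
```

With two rows sharing `005o` under different labels, `find_vm(db, "005o", "main")` returns the main row, while `find_vm(db, "005o")` raises.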

Tests: same vm_sid under two daemons both insert; exact and ambiguous
lookup; mutators target only the owning row. 172 server + 13 sdk green.

#26

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(deployer): install never leaves hero-home dirs root-owned

lab release / release (push) Has been cancelled

Details

7b98bd78e3

The install SSH payload runs as root, but every Hero service runs as the
driver user. The stack-snapshot block hand-wrote a plain root
`mkdir -p /home/driver/hero/var/cockpit` and chowned only the file
inside it, leaving the directory itself root-owned. The cockpit (driver)
then could not create its own per-action upgrade/install log files there,
so every dashboard service upgrade failed with "create log file failed:
Permission denied (os error 13)". Regressed in b2410fd; before that the
cockpit created the dir itself as driver.

Fix the whole class, not just the one dir:
- Route hero-home dir creation through a `driver_owned_dir` helper
  (`install -d -o driver -g driver`), used by the cockpit-state and .kimi
  blocks. `install -d` is idempotent and re-asserts ownership, so a dir an
  earlier install left root-owned self-heals on the next install.
- Add a tail ownership backstop: re-assert driver ownership over the hero
  state tree (`chown -R driver:driver /home/driver/hero/var`) as the last
  filesystem step, so a future slip can never ship a root-owned dir.
- Tests: pin the driver-owned cockpit dir, assert the backstop, and add a
  class-wide guard that fails if any new write-site reintroduces a root
  `mkdir` under /home/driver.
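
As a rough sketch of the three pieces above (helper, tail backstop, class-wide guard), assuming the install payload is assembled as shell lines: the Python function names, the regex, and `build_payload` are illustrative assumptions; the `install -d -o driver -g driver` and `chown -R driver:driver /home/driver/hero/var` commands are the ones quoted in this message.

```python
import re
import shlex


def driver_owned_dir(path):
    """Return the shell command that creates `path` owned by driver."""
    # `install -d` creates the directory if missing and applies the given
    # owner and group each time it runs.
    return f"install -d -o driver -g driver {shlex.quote(path)}"


def build_payload(dirs):
    lines = [driver_owned_dir(d) for d in dirs]
    # Tail backstop: the last filesystem step re-asserts ownership of the
    # hero state tree.
    lines.append("chown -R driver:driver /home/driver/hero/var")
    return "\n".join(lines)


# Simplified guard (assumption): flag any `mkdir` that targets /home/driver.
ROOT_MKDIR = re.compile(r"\bmkdir\b[^\n;&|]*\s/home/driver\b")


def root_mkdir_lines(payload):
    return [line for line in payload.splitlines() if ROOT_MKDIR.search(line)]
```

A test in this style would assert that `root_mkdir_lines(build_payload(...))` is empty and that the final payload line is the `chown` backstop.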

Existing testers installed by an affected deployer self-heal on their
next install; a live tester can be fixed without reinstall by
`chown -R driver:driver /home/driver/hero/var/cockpit`.

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(install): drop hero_biz, hero_slides, hero_whiteboard from the tester bundle

lab release / release (push) Has been cancelled

Details

112835af0a

Remove the three apps from the tester install set so new tester VMs no
longer install or enable them. Adds a test asserting they stay out.

Signed-by: mik-tf <mik-tf@noreply.invalid>

Revert "feat(install): drop hero_biz, hero_slides, hero_whiteboard from the tester bundle"

lab release / release (push) Successful in 31m15s

Details

c802d30be1

This reverts commit 112835af0a.

ci(release): clear stale release assets before lab upload

lab release / release (push) Successful in 42m42s

Details

39513ae6d9

lab build --upload skips release assets that already exist by name, so the
rolling latest-* release froze at its first-uploaded binaries (assets dated
2026-06-12 while the release record kept refreshing on every push). A
downloaded binary was therefore stale (pre-M13), and it panicked opening the
migrated deployer DB, which broke dashboard-driven self-upgrade of the
deployer.

Delete the tag's existing assets via the Forge API before the upload so each
push republishes fresh binaries. See
#29.

Signed-by: mik-tf <mik-tf@noreply.invalid>

ci: drop per-repo delete-before-upload workaround from lab-release

lab release / release (push) Successful in 36m19s

Details

42abc3599a

The shared lab-builder image now carries the fixed lab that re-uploads a
release asset whenever its md5 changes, so the manual asset-delete step
this repo carried is redundant. This restores the canonical publish step
shared by the other hero repos.

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(deployer): configurable gateway daemon + ordered gateway-node fallback

lab release / release (push) Successful in 16m3s

Details

4de3387caa

Mint each tester's public gateway-name-proxy on a configurable compute
daemon and an ordered list of gateway-capable nodes, most-preferred
first. The first node that mints a ready gateway on that daemon's
network wins; a failed attempt rolls back its own on-chain workload
before the next node is tried.

Two new optional, secret-backed env vars (defaults empty, so an
unconfigured or single-chain/QA deployment is byte-identical):

  TFGRID_GATEWAY_DAEMON_LABEL  which daemon mints gateways (e.g. main)
  TFGRID_GATEWAY_NODE_SIDS     ordered node ids, e.g. 8,1,13,50
                               (prefer gent02, fall back gent01/03/04)

This lets a mainnet sandbox front testers on the mainnet gent nodes
(gent02 verified converging) with automatic fallback, instead of the
single hardcoded gateway node, without a rebuild to retune.
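
The ordered fallback can be pictured roughly as below. This is a sketch under assumptions: `try_mint` and `rollback` are hypothetical callbacks standing in for the deploy and cleanup steps, and node ids are kept as strings; only the comma-separated format and the "first ready node wins, failed attempt rolls back" rule come from the text above.

```python
def parse_node_sids(raw):
    """Parse a TFGRID_GATEWAY_NODE_SIDS-style value such as "8,1,13,50"."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def mint_gateway(node_sids, try_mint, rollback):
    """Try nodes most-preferred first; return the first one that is ready.

    `try_mint(node)` returns True when the gateway converged; `rollback(node)`
    removes whatever the failed attempt left behind. Both are hypothetical.
    """
    for node in node_sids:
        try:
            if try_mint(node):
                return node
        except Exception:
            pass
        # This attempt failed: clean it up before moving to the next node.
        rollback(node)
    return None
```

An empty list yields `None`, which a caller could treat as "fall back to the unconfigured behaviour".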

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(deployer): always-on login-gate floor + one-click web gateway repair

lab release / release (push) Successful in 45m55s

Details

68977980e6

A tester whose web gateway domain was never recorded ended up with zero
hero_proxy domain routes, so hero_proxy fell through to ungated path-prefix
forwarding and served the cockpit with no Forge sign-in. Close that hole and
make it repairable from the admin.

- Always seed the catch-all "*"/deny route in the install payload, regardless
  of whether the tester has a public fqdn, so an unrouted host is refused with a 404
  rather than served open. The per-tester OAuth route is still added only when
  a real fqdn + OAuth app exist.
- Push the deny floor first (before the OAuth provider seed and the fqdn
  route), so a slow or failing OAuth step can never leave a window where
  hero_proxy is up but has no deny route. Factored the gate seeding into one
  shared builder so install and repair push byte-identical routes.
- New deployer.retry_vm_webgateway RPC: recover the gateway fqdn (re-derive it
  from the already-minted gateway, or mint a fresh one), create the per-tester
  OAuth app if missing, then re-push the gate routes onto the live tester over
  SSH. Idempotent (re-adds are hero_proxy no-ops; the route domain is unique).
- Admin one-click "Repair gateway & login" / "Repair login gate" buttons on
  the user detail page.
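
A shared gate builder of the kind described in the bullets above might look like this sketch. The route dictionary shape and the function name are assumptions for illustration; what is taken from the message is the ordering (deny floor first) and the condition for the per-tester OAuth route.

```python
def build_gate_routes(fqdn=None, oauth_app=None):
    """Return login-gate routes in push order (route shape is hypothetical)."""
    # The catch-all deny route always comes first, so an unrouted host is
    # refused even if a later step is slow or fails.
    routes = [{"domain": "*", "action": "deny"}]
    # The per-tester OAuth route is added only when both a real fqdn and an
    # OAuth app exist.
    if fqdn and oauth_app:
        routes.append({"domain": fqdn, "action": "oauth", "app": oauth_app})
    return routes
```

Because install and repair would both call the same builder, they produce identical routes for identical inputs.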

lhumina_code/home#253

Signed-by: mik-tf <mik-tf@noreply.invalid>

docs(admin): name the operating model (admin instance / member instance / organization)

lab release / release (push) Has been cancelled

Details

ff8cf1d622

The deployer admin manual's Operating Model now uses the Hero platform naming
convention: the deployer runs on the admin instance (the organization's control
plane) and manages member instances. One organization is one admin instance
plus its member instances.

lhumina_code/home#285

Signed-by: mik-tf <mik-tf@noreply.invalid>

docs(admin): make the Model page a real Hero platform explainer

lab release / release (push) Successful in 43m6s

Details

8b9b93f2e2

Expand the manual's Model section into a full "What is the Hero platform"
explainer: a visual organization diagram (admin instance + member instances
behind the Forge login gate), a what-runs-where service list with one line per
service, how members sign in (Forge SSO + optional 2FA, gated-or-404), and how
each organization gets its own instantiation of a Hero platform (Acme example).
Fix the manual header subtitle to the same naming.

lhumina_code/home#285

Signed-by: mik-tf <mik-tf@noreply.invalid>

docs(admin): add a Services section + per-service pages to the manual

lab release / release (push) Has been cancelled

Details

b0766a803e

The operator manual now has a Services section listing every service across the
Hero platform, grouped by where it runs (every instance, member instance which
is the Hero stack, admin instance), each card opening a full per-service page
rendered from the platform single-source docs (vendored into the crate). The
Architecture page now names the Hero stack and lists hero_os, and the manual's
em dashes are cleaned. Mirrors the cockpit member manual; the renderer can be
promoted into hero_admin_lib later to de-duplicate.

lhumina_code/home#285

Signed-by: mik-tf <mik-tf@noreply.invalid>

docs(admin): place the services submenu directly under Services

lab release / release (push) Has been cancelled

Details

c34a9d6b73

The deployer manual has nine top-level sections, so the services submenu
appeared at the bottom of the list. Split the sidebar so the submenu sits
inline directly under the Services item: on the SPA it reveals only when
Services is active (leading with an Overview link to the card grid), and on the
per-service pages it stays open with the current service highlighted.

lhumina_code/home#285

Signed-by: mik-tf <mik-tf@noreply.invalid>

docs(admin): capitalize Hero Platform throughout the admin manual

lab release / release (push) Successful in 26m59s

Details

d1c89d631e

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(deployer): app catalog with a strong default for member-instance install

lab release / release (push) Has been cancelled

Details

0b8b319ae8

Turn the flat, always-full install set into an operator-facing app catalog
(home#286). The canonical install list now splits into an always-on base
(proxy, router, supervisor, data store, member cockpit) and toggleable catalog
apps, each with a strong-default on/off. The operator picks apps on the Add and
set up form; the selection threads to install via a new optional enabled_apps
array on deployer.install_hero_stack.

Server:
- base/app catalog model (BASE_COMPONENTS, DEFAULT_OFF_APPS) + resolve_enabled_
  components(): base is always forced on, unknown/base entries ignored, result
  in canonical install order.
- render_cockpit_services_toml() now takes the resolved set; the books default-
  repos seed, the build-identity snapshot, the stack_components count, and the
  check_build_updates target snapshot all read the selection so a trimmed member
  records and checks exactly its own apps.
- absent enabled_apps preserves the member's currently installed apps (the plain
  reinstall / update-to-latest path) instead of resetting to the default.
- new read-only deployer.app_catalog RPC returns the toggleable apps (component,
  label, default_on) plus the base list.
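
The resolution rule can be sketched as follows. The component names are placeholders (the real lists live in the deployer), and the behaviour when neither a selection nor an installed set is known, falling back to the strong default, is an assumption; the rest follows the bullets above.

```python
# Placeholder names; not the deployer's real component list.
INSTALL_ORDER = ["proxy", "router", "supervisor", "data_store", "cockpit",
                 "app_a", "app_b", "app_c"]
BASE_COMPONENTS = {"proxy", "router", "supervisor", "data_store", "cockpit"}
DEFAULT_OFF_APPS = {"app_c"}


def resolve_enabled_components(enabled_apps=None, installed_apps=None):
    catalog = [c for c in INSTALL_ORDER if c not in BASE_COMPONENTS]
    if enabled_apps is not None:
        chosen = set(enabled_apps)
    elif installed_apps is not None:
        # Absent selection on a reinstall: keep what the member already has.
        chosen = set(installed_apps)
    else:
        # Assumption: a fresh install with no selection gets the default.
        chosen = {c for c in catalog if c not in DEFAULT_OFF_APPS}
    # Unknown and base entries in the selection are ignored; base is forced on.
    chosen = (chosen & set(catalog)) | BASE_COMPONENTS
    # Result is always in canonical install order.
    return [c for c in INSTALL_ORDER if c in chosen]
```

For example, `resolve_enabled_components(["app_b", "bogus", "proxy"])` yields the five base components followed by `app_b`.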

Admin: /app_catalog.json route + the Add and set up form renders the catalog as
checkboxes (strong default pre-checked, select-all / reset-to-default) and posts
the selection in the install body.

186 server tests (9 new on the catalog model, manifest render, and RPC shape);
fmt + clippy -D warnings + musl release build clean.

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(deployer): saved setups (named app selection plus release channel)

lab release / release (push) Has been cancelled

Details

a11af0005c

Persist a chosen app selection plus a release channel as a named,
reusable setup the operator builds once on the add form and reapplies
to later member instances, so a consistent organization is repeatable.
Builds directly on the app catalog: a setup is a saved enabled_apps
selection plus a channel, both already first-class on the add form.

Server: new standalone setups table (schema M14, additive CREATE TABLE,
no recreate of users/vms) with name UNIQUE; upsert-by-name CRUD;
save_setup / list_setups / delete_setup RPCs. save_setup normalizes the
chosen apps against the catalog (drops base and unknown names, canonical
install order) so a stored setup is always a valid, base-free app set.
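
Upsert-by-name on a table with a UNIQUE name can be illustrated with SQLite's `ON CONFLICT ... DO UPDATE` clause (available since SQLite 3.24). The column set and the JSON encoding of the app list here are assumptions, not the M14 schema.

```python
import json
import sqlite3


def open_db():
    db = sqlite3.connect(":memory:")
    # Hypothetical columns; only the UNIQUE name is taken from the text above.
    db.execute(
        "CREATE TABLE setups ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL UNIQUE,"
        " enabled_apps TEXT NOT NULL,"
        " channel TEXT NOT NULL)"
    )
    return db


def save_setup(db, name, enabled_apps, channel):
    # A second save under the same name updates the existing row.
    db.execute(
        "INSERT INTO setups (name, enabled_apps, channel) VALUES (?, ?, ?)"
        " ON CONFLICT(name) DO UPDATE SET"
        " enabled_apps = excluded.enabled_apps, channel = excluded.channel",
        (name, json.dumps(enabled_apps), channel),
    )


def list_setups(db):
    rows = db.execute(
        "SELECT name, enabled_apps, channel FROM setups ORDER BY name"
    )
    return [(name, json.loads(apps), channel) for name, apps, channel in rows]
```

Saving "demo" twice leaves one row carrying the second save's apps and channel.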

Admin: a setup picker on the Add and set up form (apply checks the
matching app boxes and sets the release channel) plus Save as setup and
Delete, fronted by /setups.json, /setups/save.json, /setups/delete.json.

190 server tests (db CRUD round-trip, upsert-by-name, normalize drops
base/unknown), SDK type smoke test, fmt and clippy -D warnings and musl
release build clean. Proven live on the testing organization admin
instance: save/list/upsert/delete round-trip, normalize dropped base and
unknown, M14 ran on the live DB.

lhumina_code/home#287

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(admin): move the release-channel picker into Setup options, per setup

lab release / release (push) Has been cancelled

Details

3d7854b3be

The release channel was a single selector at the top of the Users page,
next to Update all testers but labelled "Default for new installs". That
adjacency was misleading (it does not drive the fleet update) and the
saved-setups block had no channel control of its own even though a setup
stores a channel.

Move the picker down into the Add a user form's Setup options, right by
the saved-setups picker and the app checkboxes, so a setup is built and
read in one place: applying a setup checks its apps and sets this channel,
and Save as setup captures the apps and channel shown there. Default stays
latest-integration; main and development remain selectable.

Update all testers no longer reads this picker: a ready member instance
refreshes on its own recorded release channel, and a non-ready install
falls back to the form's fixed latest-integration. Admin UI only, no
server or storage change. fmt, clippy -D warnings, build, and the 190
server tests stay green; proven live on the testing organization admin
instance (picker now in Setup options, top bar is the button alone).

lhumina_code/home#287

Signed-by: mik-tf <mik-tf@noreply.invalid>

fix(admin): order release channels main, integration, development

lab release / release (push) Has been cancelled

Details

55904d8ab1

Present the channel options in the natural progression main then
integration then development, instead of integration first. Integration
stays the pre-selected default. Applied to both the Users add form picker
and the per-user VM page so the dropdown reads the same everywhere.

lhumina_code/home#287

Signed-by: mik-tf <mik-tf@noreply.invalid>

feat(deployer): create a group of member instances from a saved setup and a name list

lab release / release (push) Successful in 33m47s

Details

4c387d07d4

Add a "Create a group" admin form that stands up many member instances at
once: pick a saved setup (apps plus release channel), name the organization,
and paste a list of people (existing Forge accounts or new ones). The form
loops the existing per-member onboard RPCs (create or add-existing, then
provision, then install), provisioning and kicking install for each member,
then polling them all to ready, with a per-member progress row. Re-running the
same list retries failed members and skips ones already up (create is
idempotent but provision is not, so the loop guards on an existing good VM).

Tag each member at provision with its organization, the setup it was built
from, and its release channel, so the organization can later be managed and
refreshed as one unit without a schema change:

- schema M15: three additive ALTER ADD COLUMN on vms (org, setup,
  release_channel; constant '' defaults, no recreate), with map_vm_row and
  every explicit vms SELECT list extended in lockstep.
- provision_vm takes optional org/setup/release_channel and records them in the
  same RPC right after insert (all known up front for a group), so a member is
  never left untagged even if its install later fails.
- install_hero_stack confirms the channel a member actually installed on, so the
  recorded channel reflects the running build for every member.
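
An additive migration in the spirit of M15 can be sketched with SQLite as below. The column types and the idempotence check via `PRAGMA table_info` are assumptions added for the example; the three column names and the constant `''` defaults come from the bullet above.

```python
import sqlite3

NEW_COLUMNS = ("org", "setup", "release_channel")


def migrate_add_tags(db):
    """Add the tag columns with a constant '' default, without a recreate."""
    existing = {row[1] for row in db.execute("PRAGMA table_info(vms)")}
    for col in NEW_COLUMNS:
        if col not in existing:
            # Existing rows read back '' for the new column.
            db.execute(
                f"ALTER TABLE vms ADD COLUMN {col} TEXT NOT NULL DEFAULT ''"
            )
```

Rows that existed before the migration then read back `('', '', '')` for the three tags, which is what lets untagged members be told apart later.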

lhumina_code/home#288

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: organization object + Launcher (home#291 Part 1)

lab release / release (push) Successful in 15m7s

Details

c9f58d921a

Consolidate the deployer admin around composable building blocks and add a
real, savable organization entity, per the Launcher arc.

- Schema M16 (additive, no recreate): enrich `setups` to the full install
  bundle with `seed_providers` + `send_welcome_email`; add `organizations`
  and `organization_members` tables (a named org, its setup, and a granular
  member roster where each member is independently existing or new with its
  own email).
- Server RPCs: deployer.save_organization / list_organizations /
  get_organization / delete_organization; save_setup / list_setups carry the
  two new setup fields. openrpc.json is the SSOT; the SDK regenerates.
- Admin: the users page becomes the Launcher (the old /users route redirects).
  The inline Setup options move into a dedicated Setups card; the group-create
  card becomes the Organizations card with a granular mixed-member roster
  (add rows or paste a list), Save and Save-and-deploy, and a saved-orgs list.
  Deploy reuses the per-member onboard loop and tags each member at provision.

198 server + 17 SDK tests; cargo fmt + clippy -D warnings + musl release clean.

lhumina_code/home#291

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: Launcher information architecture (home#291 stage 2)

lab release / release (push) Successful in 16m43s

Details

b4b96b69e2

Reshape the deployer console into the create/operate split from the
Launcher plan, before the deeper building blocks land.

- Top nav gains Organizations. Launcher is now a sidebar shell with three
  surfaces (Organizations to compose and launch, Setups, Infrastructure as
  a "coming next" stub); the member roster moves off it.
- New Organizations registry page: every member instance grouped by its
  organization, with untagged members folded into a default Testers
  organization, a context switcher, cockpit links, and an "Add a member"
  action that opens the Launcher pre-filled with that organization.
- list_vms now reports each VM's org and setup tag (additive to the VmRow
  result), so the registry can group by organization.

198 server + 17 SDK tests; cargo fmt + clippy -D warnings + musl release
clean across server, SDK, and admin.

lhumina_code/home#291

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: refine the Launcher into one hub (home#291 stage 2 refinement)

lab release / release (push) Successful in 14m22s

Details

a7ad898eaf

Fold the operator console into a single Launcher hub per the refined plan.

- One hub: drop the separate Organizations top-nav. The Launcher sidebar is
  in build-dependency order (Infrastructure, Setups, Organizations), and the
  Organizations surface both composes new organizations and lists existing
  ones, every member grouped by its organization.
- Operate in place: per-organization "Update all instances" and a global
  "Update all organizations" (re-install each member on its recorded channel,
  apps preserved); the misplaced global "Update all testers" is gone from the
  header.
- Add a member in context: "Add a member" on an organization reveals the
  onboard form inline and tags the new member with that organization, no page
  bounce.
- Setups now lists every saved setup with its contents (apps, channel,
  assistant keys, welcome email), so you can see what a setup is.
- list_vms reports release_channel so the per-org update knows each channel.

198 server + 17 SDK tests; cargo fmt + clippy -D warnings + musl release clean.

lhumina_code/home#291

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: the unified Deploy flow (home#291 stage 3)

lab release / release (push) Has been cancelled

Details

13946be143

Collapse the two deploy forms into one, restoring a clean add-person to
deploy path and the every-member-is-in-an-organization invariant.

- One Deploy form: choose who (member rows, 1 to N), which organization
  (an existing one or a new one, since every member lives in one), what
  (a setup picker), and where (auto for now, infrastructure groups next),
  then Deploy. A standalone person is just a one-member organization; ad
  hoc deploys land in the default organization.
- The per-org "Add a member" and "New organization" both open this one
  form, scoped to the right organization, with no page or form hop.
- Every saved setup carries a "Use in a deployment" action that opens the
  Deploy form pre-selected, so a setup is never a dead end.
- Deploying to an existing org tags the new members without replacing its
  saved roster; a new org is saved then deployed.

198 server + 17 SDK tests; cargo fmt + clippy -D warnings + musl release clean.

lhumina_code/home#291

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: fix stale Setups copy referencing the removed single-add flow

lab release / release (push) Has been cancelled

Details

716819c806

The Setups section still told the operator to use a setup "for a single
member (Add a member)" and said its apps "apply to Add & set up only",
both of which named buttons that the unified Deploy flow removed. Point the
copy at the real path instead: save a setup, then Use in a deployment (or
pick it in Organizations, Deploy).

lhumina_code/home#291

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: stack presets + member display name (home#291 stage 3 cont.)

lab release / release (push) Has been cancelled

Details

bb1cc155ec

A first cut of the locked layered model: differentiate stack and setup,
and restore the member display name.

- Stack presets (Default / Base / Full) are first-class. The Deploy "what"
  picker offers them ahead of the saved setups, so a deploy always has a
  default and never needs a setup built first. The Setups builder labels its
  app shortcuts as the Default / Base / Full stack presets.
- A STACK is which apps; a SETUP is a stack plus configs (channel, assistant
  keys, welcome email). The picker resolves either to the install config; the
  clean stack/setup name is what is stored on the organization and tagged on
  each member instance.
- Member rows carry an optional display name again (username + display name +
  email), threaded to account creation.

cargo fmt + clippy -D warnings + musl release clean (admin).

lhumina_code/home#291

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: stack labels Core/Default/Full + fix Use-in-a-deployment

lab release / release (push) Has been cancelled

Details

4bbc6cc175

- Stack presets read Core / Default / Full (clean labels, that order), in
  both the Deploy picker and the Setups builder shortcuts.
- "Use in a deployment" did nothing because it revealed the Deploy form in
  the hidden Organizations section; it now switches to that section first.

lhumina_code/home#291

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: edit a saved setup in place

lab release / release (push) Successful in 15m25s

Details

4d916de59c

Selecting a saved setup now pre-fills its name and loads its stack and
configs into the builder, so adjusting them and clicking Save updates that
setup instead of forcing a re-type or a duplicate. The picker button reads Edit.

lhumina_code/home#291

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: per-organization settings, own email + assistant keys (home#291 stage 2)

lab release / release (push) Has been cancelled

Details

3dfa5b81cb

Each organization can now carry its own Resend key, email sender
(from-address and from-name), and assistant keys (Kimi, Groq, OpenRouter,
SambaNova), used when deploying that organization's members and falling back
to the General defaults when blank, so one organization can send from its own
address with its own keys while another inherits the shared defaults.

- Storage: per-organization overrides live alongside the General defaults in
  the deployer's hero_proc secret store, namespaced by the organization's
  stable row id (ORG_<id>_<SLOT>), so renaming an organization keeps its
  settings and two organizations never collide on a slot. No schema migration.
- RPCs: deployer.get_organization_settings (presence booleans only for the
  keys, raw sender fields with empty meaning inherit) and
  deployer.set_organization_settings (a field present and non-empty sets the
  override, present and empty clears it back to inherit, absent leaves it).
- Resolution at install: the member's organization tag resolves to its id, and
  the assistant-key seeding plus the welcome-email sender resolve
  org-override-else-General, so the setting actually takes effect on deploy.
- Admin: a Settings panel on each saved organization card (an in-app modal)
  loads and writes the above; the General Settings page is unchanged.
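
The slot namespacing, the set/clear/leave rule, and the override-else-General resolution can be sketched over a plain dict standing in for the secret store. The slot names (for example `RESEND_KEY`) and the convention that a General default is stored under the bare slot name are assumptions; the `ORG_<id>_<SLOT>` pattern is from the Storage bullet.

```python
def org_slot(org_id, slot):
    """Secret name for an organization override, e.g. ORG_7_RESEND_KEY."""
    return f"ORG_{org_id}_{slot}"


def apply_settings(store, org_id, fields):
    """Apply the set / clear / leave rule to a dict-like secret store."""
    for slot, value in fields.items():
        key = org_slot(org_id, slot)
        if value:
            store[key] = value      # present and non-empty: set the override
        else:
            store.pop(key, None)    # present and empty: clear back to inherit
    # Slots absent from `fields` are left untouched.


def resolve(store, org_id, slot):
    """Organization override if set, else the General default."""
    return store.get(org_slot(org_id, slot)) or store.get(slot, "")
```

Keying on the stable row id rather than the name is what keeps the overrides attached when an organization is renamed.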

Per-network grid wallets are intentionally not included: a wallet is bound to a
compute daemon, not an organization, so per-organization wallets are a
deploy-path change rather than admin UI, tracked at
lhumina_code/home#294.

Tests: org slot namespacing and the set/clear/leave field rule (server); the
regenerated settings types compile (SDK). Full suites green (200 server, 18
SDK); fmt, clippy -D warnings, and the musl release build clean. Deployed and
confirmed live on the testing admin instance.

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: fix the Stack preset Core link (id collision) and trim its note

lab release / release (push) Has been cancelled

Details

395e3d9758

The Core preset link and the note below it both used id onboard-apps-base, so
getElementById returned the link and the catalog JS overwrote the word "Core"
with the long "Core services (...) are always installed..." sentence. The
preset line rendered as that whole sentence instead of "Core". Drop the
duplicate id, make the note a short static line without the service-list
parenthetical, and remove the now-unused JS that set it. The preset line now
reads Core / Default / Full as intended.

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: assign members to an organization, enforce one org per member (home#291)

lab release / release (push) Has been cancelled

Details

4f64c63fb7

Existing members deployed before the organization concept stay untagged and fold
into a default bucket; this adds a way to graduate them into a real organization,
and guarantees a member belongs to exactly one.

- M17: a UNIQUE index on organization_members(username) makes it impossible for a
  member to appear in two organization rosters. Additive (the table is empty until
  an organization is composed).
- New deployer.assign_members_to_organization(org, usernames): move semantics. For
  each member it removes them from any other organization, adds them to this one's
  roster, and sets the organization tag on all their instances. The target
  organization is created if it does not exist. No redeploy; the organization's own
  keys and email apply on a member's next install.
- save_organization now also moves a composed member out of any other organization
  before adding them, so both write paths uphold the invariant.
- Admin: the default bucket of untagged members is renamed "Unassigned" (a view,
  not an organization, so it never collides with a real name) and gains per-member
  checkboxes plus a "Move selected into an organization" action.
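
The one-organization invariant and the move semantics can be sketched with SQLite. The roster table is trimmed down and keyed by organization name for brevity, which is an assumption; updating the organization tag on a member's instances is left out of the sketch.

```python
import sqlite3


def open_db():
    db = sqlite3.connect(":memory:")
    # Hypothetical, trimmed-down roster table.
    db.execute(
        "CREATE TABLE organization_members ("
        " org TEXT NOT NULL, username TEXT NOT NULL)"
    )
    # The UNIQUE index is what makes a second roster entry impossible.
    db.execute(
        "CREATE UNIQUE INDEX idx_member_one_org"
        " ON organization_members(username)"
    )
    return db


def assign_members(db, org, usernames):
    """Move semantics: each member ends up in exactly one roster."""
    with db:  # one transaction per call
        for username in usernames:
            db.execute(
                "DELETE FROM organization_members WHERE username = ?",
                (username,),
            )
            db.execute(
                "INSERT INTO organization_members (org, username)"
                " VALUES (?, ?)",
                (org, username),
            )


def roster(db, org):
    rows = db.execute(
        "SELECT username FROM organization_members"
        " WHERE org = ? ORDER BY username",
        (org,),
    )
    return [row[0] for row in rows]
```

A blind second `INSERT` for a username already on a roster fails with `sqlite3.IntegrityError`, while `assign_members` moves the member instead.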

Tests: the one-org invariant (move keeps a member in exactly one organization, the
UNIQUE index rejects a blind second insert) and the narrow org-tag update that
preserves setup and channel; the regenerated assign types compile. Full suites
green (202 server, 19 SDK); fmt, clippy -D warnings, musl release clean. Deployed
to the testing admin instance, where the existing tester fleet was moved into a
real "Hero Testers" organization and confirmed.

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: fix org Settings modal (script order) + Cockpit URL copy button

lab release / release (push) Successful in 23m38s

Details

b42de4b6db

Two launcher fixes from live use:

- The per-organization Settings button did nothing: the modal script runs in the
  content block, which the layout renders before bootstrap.bundle.min.js, so
  `bootstrap` was undefined at parse time and the script returned early, never
  attaching the click handler. Attach the handler unconditionally and resolve the
  Bootstrap Modal lazily at click time, by which point the bundle has loaded.

- The organization table's Cockpit column showed only a bare "cockpit" link. Show
  the full Cockpit URL as a clickable link plus a copy-to-clipboard button, matching
  the member detail page, and add the shared [data-copy] copy handler.

Template-only; admin rebuilt and confirmed live on the testing admin instance.

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: fold the Nodes page into the Launcher and add node groups

lab release / release (push) Successful in 13m38s

Details

19ae6bc362

Move the whole Nodes page into the Launcher's Infrastructure section as
three sub-tabs (Managed Nodes, Add Nodes, Group Nodes) and drop the
top-nav Nodes entry; /nodes now redirects to /launcher#infrastructure
(query preserved so a node deep-link still opens its detail drawer). The
moved view initializes lazily the first time Infrastructure is shown, so
the Launcher no longer fetches the fleet on every page view.

Group Nodes (new): a named placement pool scoped to one network, by farm
ids and/or specific node sids. New node_groups table (additive migration),
deployer.save_node_group / list_node_groups / delete_node_group RPCs, and
a Group Nodes sub-tab to create, list, and delete groups.

Deploy: the previously-disabled Infrastructure picker now lists the groups
(plus Auto). Picking a group scopes placement client-side to that group's
network and farms/nodes (freest node first, dedicated before shared), and
the whole organization is checked to fit the group's free capacity before
any account is created; it refuses with a clear message otherwise. Auto is
unchanged (the server picks the most free node).

Server node RPCs are unchanged; only the admin UI and the new group RPCs
are added. 205 server + 20 SDK tests; fmt and clippy clean.

lhumina_code/home#291

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: build a node group by picking from Managed Nodes, not typing ids

lab release / release (push) Has been cancelled

Details

2981ff227e

The Group Nodes builder now lists your Managed Nodes for the chosen
network, grouped by farm, instead of free-text farm/node id inputs. Tick a
whole farm (every managed node in it, including ones added later) and/or
individual nodes; ticking a farm disables its node checkboxes so the two
never double-count. This makes a group always a slice of the nodes you
actually manage: an organization can be pinned to its own nodes, or to a
specific farm it has a contract on.

Storage and placement are unchanged (farm_ids + node_sids; a node is in
the group if its farm or its sid matches), so deploy still only ever
places on managed nodes.

lhumina_code/home#291

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: let a node group span networks (network-qualified entries)

lab release / release (push) Has been cancelled

Details

43cc85227c

Farm ids and node sids are not unique across TFChains (mainnet farm 1 is
not QAnet farm 1), so a group's single network column forced every entry
onto one network. Drop it (migration M19) and store each entry
network-qualified ("net:id", e.g. "main:5"), so one group can mix nodes
and farms from different networks. The Group Nodes builder drops the
network select and lists your managed nodes across all networks,
sectioned network then farm; ticking a farm or node records its network
with it. Placement and the capacity preflight match per-entry network.

This is the natural model: a group is just "these nodes, wherever they
live", so an organization can run on its own nodes across networks, or be
pinned to one farm it has a contract on.
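
Matching against network-qualified entries could look like the sketch below. The node dictionary keys and both function names are assumptions; the `"net:id"` form (for example `"main:5"`) and the rule that a node belongs to a group when its farm or its own sid matches are from the commit messages.

```python
def parse_entry(entry):
    """Split a network-qualified entry such as "main:5" into (network, id)."""
    network, sep, ident = entry.partition(":")
    if not sep or not network or not ident:
        raise ValueError(f"not a network-qualified entry: {entry!r}")
    return network, ident


def node_in_group(node, farm_entries, node_entries):
    """True if the node's farm or its own sid matches on the node's network.

    `node` is a hypothetical dict with "network", "farm_id" and "sid" keys.
    """
    farms = {parse_entry(e) for e in farm_entries}
    nodes = {parse_entry(e) for e in node_entries}
    return ((node["network"], str(node["farm_id"])) in farms
            or (node["network"], str(node["sid"])) in nodes)
```

With this shape, farm 1 on main and farm 1 on another network are distinct entries, so a group listing `"main:1"` does not pick up the other network's nodes.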

206 server + 20 SDK tests; fmt and clippy clean.

lhumina_code/home#291

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: in-app confirmations and Update selected in the Launcher

lab release / release (push) Has been cancelled

Details

a6e4dc69ce

Replace the remaining browser confirm() dialogs in the Launcher with one
reusable in-app modal: a shared launcher-confirm-modal plus a window.heroConfirm
helper now backs deleting a setup, forgetting a saved organization, and the
update-all-instances / update-all-organizations actions, matching the modal
pattern the per-organization settings and node actions already use.

Add row-selection batch update: the per-member selection checkboxes now render
on every organization card, and a new "Update selected" button beside "Update
all instances" re-installs only the ticked members on their recorded channels.

UI only; no schema, RPC, or settings change. Closes the cross-cutting polish of
lhumina_code/home#291.

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: edit a node group after creating it

lab release / release (push) Has been cancelled

Details

825c359965

A node group could be created but never changed, so a group could not grow
as an organization took on more members. Each saved group now has an Edit
button that loads its name and current farm/node selection back into the
builder; saving the same name updates the group in place (the save already
upserts by name). Entries that are no longer among the managed nodes are not
re-ticked, so an update keeps only nodes still managed.

UI only; reuses the existing save_node_group upsert. Part of
lhumina_code/home#291.

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: renaming a node group while editing moves it, not clones it

lab release / release (push) Has been cancelled

Details

8c04c21d14

Editing a group and changing its name left both the old and new names, because
save upserts by name and only created the new one. Track the name loaded for
editing and, when the saved name differs, delete the old group after the save so
a rename moves the group instead of cloning it.
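
A minimal sketch of the rename-as-move logic, assuming an in-memory map stands in for the stored groups. `Groups` and `save_edited_group` are hypothetical names; only the upsert-by-name behaviour of the save comes from the commit message.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory stand-in for stored node groups: name -> entries.
pub type Groups = BTreeMap<String, Vec<String>>;

/// The save upserts by name: it creates the group or overwrites it in place.
pub fn save_node_group(groups: &mut Groups, name: &str, entries: Vec<String>) {
    groups.insert(name.to_string(), entries);
}

/// Save an edited group. `loaded_name` is the name that was loaded into the
/// builder for editing (None when creating a new group). When the saved name
/// differs, the old group is deleted after the save, so a rename moves the
/// group instead of cloning it.
pub fn save_edited_group(
    groups: &mut Groups,
    loaded_name: Option<&str>,
    new_name: &str,
    entries: Vec<String>,
) {
    save_node_group(groups, new_name, entries);
    if let Some(old) = loaded_name {
        if old != new_name {
            groups.remove(old);
        }
    }
}
```

Deleting only after the save succeeds means a failed save leaves the original group in place.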

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: "Add members" (plural) and a display-name-aware bulk paste

lab release / release (push) Has been cancelled

Details

8fffe4a59b

The per-organization action is now "Add members" since the Deploy form already
takes many at once. The "Add several at once" paste box accepts an optional
display name per line (username, email, Display Name) and splits on comma,
semicolon, or tab so a spreadsheet column pastes cleanly, instead of splitting
on spaces (which broke display names).
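
The line format described above could be parsed roughly like this. It is a sketch under assumptions: the real paste box lives in the Launcher UI, the names `PastedMember` and `parse_member_lines` are hypothetical, and treating the email as optional is an assumption.

```rust
/// One parsed line of the bulk paste (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub struct PastedMember {
    pub username: String,
    pub email: Option<String>,
    pub display_name: Option<String>,
}

/// Parse a pasted block, one member per line. Blank lines are skipped.
pub fn parse_member_lines(paste: &str) -> Vec<PastedMember> {
    paste.lines().filter_map(parse_member_line).collect()
}

/// Split on comma, semicolon, or tab (never on spaces), so a display name
/// such as "Alice Van Der Berg" stays in one piece.
fn parse_member_line(line: &str) -> Option<PastedMember> {
    let mut fields = line
        .split(|c: char| c == ',' || c == ';' || c == '\t')
        .map(str::trim);
    let username = fields.next().filter(|s| !s.is_empty())?.to_string();
    let email = fields.next().filter(|s| !s.is_empty()).map(str::to_string);
    let display_name = fields.next().filter(|s| !s.is_empty()).map(str::to_string);
    Some(PastedMember { username, email, display_name })
}
```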

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: an existing member takes only a username, not a display name

lab release / release (push) Has been cancelled

Details

8b848debcc

For a member added as an existing Forge account, the display name comes from the
Forge profile and a typed one is ignored server-side, so the field is now disabled
(and cleared) when the account kind is Existing, and re-enabled for New. Email
stays an optional override for existing members (the Forge email can be private or
empty), with the placeholder relabelled to say so.

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: removing a dedicated node always unreserves; New organization toggles

lab release / release (push) Successful in 38m23s

Details

e93288baed

Drop the "Remove, keep reserved" trash button from a dedicated node (table row and
detail drawer): keeping a node reserved while removing it from the deployer means
it silently keeps billing, which is misleading. A dedicated node now has one
action, "Remove and Unreserve"; shared nodes keep their single "Remove". The
unregister handler is shared-only now.

The top "New organization" button now toggles the Deploy form: a second click
folds it away instead of leaving it taking up the page.

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: edit an organization in place + propagate its setup's apps

lab release / release (push) Has been cancelled

Details

b08c39a7b5

Add deployer.update_organization to rename a saved organization and/or
switch its setup without disturbing its roster. A rename cascades in one
transaction to every member instance's org tag (which keys the
per-organization secret resolution) and to the organization-owned setup,
so a renamed organization keeps its email identity and editable bundle;
a rename that collides with another organization is rejected.

Each organization now owns a uniquely-named setup (created from the
picked stack when it is saved), so editing it affects only that
organization. Update all instances now sends the organization's setup's
current apps to each member, so adding a service to the setup and
updating the organization installs it on everyone. Guard delete_setup
against removing a setup an organization still uses.

lhumina_code/home#295

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: rewrite the Manual around the Launcher and organizations

lab release / release (push) Has been cancelled

Details

c0084f7b55

Replace the sandbox-era Users and Nodes tabs with Organizations, Setups,
and Infrastructure, mirroring the Launcher, and refresh Overview,
Architecture, Updates, Settings, Control, and Troubleshooting to the
organization model: deploy a list of members on a setup behind their own
logins, manage them as one unit (update all, edit, per-organization keys
and email), build reusable setups, and group the grid nodes members run
on. The Manual now reads as the finished product an operator drives.

lhumina_code/home#291

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: richer Manual Overview and Updates

lab release / release (push) Has been cancelled

Details

5ef8e7c936

Overview now shows the whole picture from both sides: what the operator
does (the Launcher journey) and what each member gets (their own private,
login-gated Hero), with organizations tying them together. Updates lays
out every granularity: a member updating one service or all their
services from their Cockpit, and the operator updating one member,
selected members, a whole organization, or every organization at once,
each applying the organization's setup.

lhumina_code/home#291

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: precise member wording in the Manual Overview

lab release / release (push) Has been cancelled

Details

f5aa84e645

A member lands on their own member instance (their machine, running their
Hero stack), where the Cockpit is the console and Hero OS is the shell
that presents their apps as one view. Replaces the vague "their own Hero",
matching the Architecture and platform-overview vocabulary.

lhumina_code/home#291

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: Manual member view — Cockpit is the console, hero_os is an app

lab release / release (push) Has been cancelled

Details

343e3d9bee

Correct the member overview: a member lands on their member instance and
their console is the Cockpit (hero_cockpit); hero_os is one app among the
others (the dock-and-islands desktop), not the shell that presents
everything.

lhumina_code/home#291

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: Manual — Cockpit and Hero OS as two distinct surfaces

lab release / release (push) Successful in 32m41s

Details

61722b6563

A member instance runs a Hero stack (a compilation of services). The
Cockpit is the control surface for running and managing those services;
Hero OS is the more integrated environment, the apps brought together as
one desktop of islands you can sign in to and work in. Reframe the
Overview and the hero_os service page away from calling hero_os "the
shell", so the two surfaces are described distinctly.

lhumina_code/home#291

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: setup ownership column (setups.org_id) + Setups page CRUD

lab release / release (push) Has been cancelled

Details

fcca243f88

Make setup ownership a queryable column instead of an implicit name
match, and turn the Setups page into create / edit-in-place / duplicate
actions (home#295).

- M20 adds setups.org_id (0 = reusable template, >0 = owning org), with a
  name-join backfill so setups that were owned by naming convention
  before this column are stamped correctly and never read as templates.
- save_organization stamps the owned setup server-side (the client never
  passes org_id); upsert_setup preserves org_id on conflict so editing an
  owned setup's apps never reclassifies it; forgetting an org reaps its
  owned setup.
- Setups page: New / Edit / Duplicate / Delete per row; the per-row
  "Use in a deployment" button is removed (deploy from Organizations).
- org_id filters owned setups out of the Deploy picker, restricts the
  Edit-organization setup switch to templates plus the org's own setup,
  and the Settings modal shows the org's setup with an edit shortcut.
- openrpc.json + SDK regenerated; 212 server + 20 SDK tests (3 new).
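
The ownership rule in the list above (0 = reusable template, >0 = owning org, preserved on conflict) can be illustrated with an in-memory sketch. The `Setup` struct and the map-backed store are assumptions; the real implementation is a database upsert that is not shown here.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory setup row. org_id 0 marks a reusable template,
/// any value above 0 is the owning organization.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Setup {
    pub apps: Vec<String>,
    pub org_id: u64,
}

/// Upsert by name. On conflict only the apps change and org_id is preserved,
/// so editing an owned setup never reclassifies it as a template.
/// A newly created setup starts as a template; ownership is stamped
/// separately on the server, never passed by the client.
pub fn upsert_setup(setups: &mut HashMap<String, Setup>, name: &str, apps: Vec<String>) {
    if let Some(existing) = setups.get_mut(name) {
        existing.apps = apps;
        return;
    }
    setups.insert(name.to_string(), Setup { apps, org_id: 0 });
}

/// Templates are the setups offered in the Deploy picker.
pub fn is_template(setup: &Setup) -> bool {
    setup.org_id == 0
}
```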

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: Manual renders the final organization-editing vision

lab release / release (push) Successful in 38m33s

Details

9c56d6adb4

Update the Manual to the locked model (home#296): an organization is
edited in one place, remembers where it runs, and sets its own settings
at create.

- Deploy flow: the chosen infrastructure group is remembered on the
  organization (reused on updates and added members), and you can set the
  organization's own settings (bring your own key) at create.
- Manage: merge the separate Edit and Settings steps into one
  "Edit organization" surface (name, setup, infrastructure, own settings).
- Vocabulary: a setup is a stack plus its non-secret settings; rename the
  per-organization "configs" wording to "Settings"; settings (keys and
  sender) live in Settings as a General default or per-organization
  override and never inside a setup.

Vision-only doc change (the Manual describes the complete product; home#296
brings the implementation in line with it). Live on the testing organization admin instance.

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: one Edit-organization place — persisted node group + settings at create

lab release / release (push) Successful in 18m29s

Details

319451dc86

home#296: an organization now remembers where it runs and is edited in one
screen.

- M21: organizations.node_group (additive, empty = Auto). Threaded through
  OrganizationRow, save_organization, update_organization, get/list, openrpc.json
  (Organization + save/update params and outputs) and the regenerated SDK.
- The chosen Infrastructure node group is saved onto the organization at deploy,
  shown and changeable in Edit, and reused for updates and added members instead
  of being re-picked each time.
- delete_node_group is guarded: a node group an organization runs on cannot be
  deleted (or renamed via delete) out from under it, so a stored placement name
  never silently dangles to Auto. Mirrors the owned-setup delete guard.
- The separate Edit (name + setup) and Settings (keys + email) dialogs merge into
  one Edit-organization screen covering name, setup, infrastructure, and the
  organization's own settings.
- Per-organization settings can be set at create (bring your own keys/sender) in
  an optional section on the Deploy form; blank inherits the General default.
- delete_organization reaps the organization's own core/ORG_<id>_* settings
  secrets so a forgotten organization leaves no orphaned key.
- Vocabulary: the per-organization "configs" are now "Settings"; a setup is the
  stack plus non-secret settings, and keys/email live on the organization, never
  in a setup.
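
The delete guard described in the list above could look roughly like this. It is a sketch with hypothetical types: the real delete_node_group works against the database and its signature is not shown in the source.

```rust
/// Hypothetical organization row; an empty node_group means Auto placement.
pub struct Organization {
    pub name: String,
    pub node_group: String,
}

/// Refuse to delete a node group that an organization still runs on, so a
/// stored placement name never silently dangles to Auto.
pub fn delete_node_group(
    groups: &mut Vec<String>,
    orgs: &[Organization],
    name: &str,
) -> Result<(), String> {
    if let Some(org) = orgs.iter().find(|o| o.node_group.as_str() == name) {
        return Err(format!(
            "node group '{}' is in use by organization '{}'",
            name, org.name
        ));
    }
    groups.retain(|g| g.as_str() != name);
    Ok(())
}
```

Repointing the organization to Auto (an empty node_group) frees the group for deletion, which matches the live verification noted below.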

215 server tests (3 new: node_group round-trip, delete-node-group guard, M21
additive) + 20 SDK smoke; fmt/clippy/musl-release clean. Live-verified on the
testing organization admin instance: M21 applied, the 7-member org untouched,
node group persisted + the delete guard refused an in-use group + repoint to Auto
freed it.

lhumina_code/home#296

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: auto-load an organization's remembered node group into the Deploy picker

lab release / release (push) Successful in 40m30s

Details

199e602061

home#296 follow-on: when you add members to an existing organization, the
Infrastructure picker now defaults to the node group the organization runs on, so
added members reuse its placement without re-picking. launcherDeployTo() fetches
the organization and pre-selects its stored node group when the group still
exists (otherwise it stays on Auto; a transient fetch error leaves the picker
untouched). Changing the group only ever updates the stored placement; existing
member instances are never moved.

lhumina_code/home#296

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: split setups into stacks and settings profiles

lab release / release (push) Has been cancelled

Details

d98cb37abc

Adds Stack and Settings profile storage/API/UI so organizations reference what runs separately from how it is configured. Profile secrets live in shared profile slots and install resolution now prefers org override, then Settings profile, then General defaults.
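
The resolution order can be sketched as a simple fallback chain. `resolve_setting` is a hypothetical name, and treating a blank value as "not set" is an assumption based on the earlier note that a blank per-organization setting inherits the General default.

```rust
/// Resolve one setting: organization override first, then the Settings
/// profile, then the General default. Blank values count as unset
/// (assumption) and fall through to the next level.
pub fn resolve_setting<'a>(
    org_override: Option<&'a str>,
    profile: Option<&'a str>,
    general: Option<&'a str>,
) -> Option<&'a str> {
    org_override
        .filter(|v| !v.is_empty())
        .or(profile.filter(|v| !v.is_empty()))
        .or(general.filter(|v| !v.is_empty()))
}
```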

Deployed and verified on the live testing admin instance 0069.

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: allow profile keys to select providers

lab release / release (push) Has been cancelled

Details

5d86558317

Entering a provider key in a Settings profile now enables and checks that provider for the profile. Missing General keys are shown as hints instead of disabling profile provider choices.

Deployed and verified on the live testing admin instance 0069.

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: uncheck auto-selected provider when key is cleared

lab release / release (push) Successful in 16m19s

Details

0545fec977

Provider key inputs now auto-select their matching Settings profile provider only while the key field remains populated. Clearing a key field reverses that auto-selection without overriding manual checkbox choices.

Deployed and verified on the live testing admin instance 0069.

Signed-by: mik-tf <mik-tf@noreply.invalid>

docs: align the Manual with the Stack and Settings profile model

lab release / release (push) Successful in 41m13s

Details

b97d927d3d

The operator-facing Manual still described a single combined setup
object. The deployer now exposes two reusable building blocks: a Stack
(apps plus release channel) and a Settings profile (provider keys,
email sender, welcome behavior). An organization references one Stack
and an optional Settings profile, values resolve organization override
then Settings profile then General default, and a referenced Stack or
profile cannot be deleted while an organization still uses it.

Rewrites the Manual "Setups" section into "Stacks & Settings", fixes
the overview, organizations, updates, and settings copy, renames the
manual route and the per-service page back-link to match, and updates
the vendored hero_tfgrid_deployer service page.

lhumina_code/home#297

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: every deployment lands in a real, named organization

lab release / release (push) Successful in 30m44s

Details

bb397957b7

The Launcher compose deploy already required an organization, but it seeded a
fake default option ("Testers organization") that the client mapped to an empty
vms.org tag, dropping those members into the Unassigned view; and the deploy
picker scraped every org card, so the Unassigned recovery bucket was itself
offered as a deploy target.

Make a deployment self-consistent at the one seam that writes vms.org. When an
organization name is present, handle_provision_vm now ensures a real
organizations row exists for that name and the member is in its roster, before
writing the tag. Without this, a name written to vms.org with no backing row
would strand a member as neither a real organization (no settings, no roster)
nor Unassigned (that view keys on an empty tag). A new idempotent
ensure_member_in_org leaves an existing same-org roster row untouched (so the
compose flow, which saves the organization first, is a no-op) and otherwise
moves the member in, upholding the one-member-one-organization invariant.
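
A minimal in-memory sketch of the idempotent behaviour described above. The `Roster` type and its fields are hypothetical; only the function name ensure_member_in_org and its contract come from the commit message, and the real version works on database rows.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for the organizations table and roster.
#[derive(Default)]
pub struct Roster {
    organizations: Vec<String>,
    member_org: HashMap<String, String>,
}

impl Roster {
    /// Ensure the organization exists and the member is in its roster.
    /// Returns true when the roster changed. An existing same-org row is
    /// left untouched; otherwise the member is moved, which keeps the
    /// one-member-one-organization invariant (one map entry per member).
    pub fn ensure_member_in_org(&mut self, member: &str, org: &str) -> bool {
        if !self.organizations.iter().any(|o| o.as_str() == org) {
            self.organizations.push(org.to_string());
        }
        if self.member_org.get(member).map(String::as_str) == Some(org) {
            return false;
        }
        self.member_org.insert(member.to_string(), org.to_string());
        true
    }

    /// The organization a member currently belongs to, if any.
    pub fn org_of(&self, member: &str) -> Option<&str> {
        self.member_org.get(member).map(String::as_str)
    }

    /// Whether an organization row exists for this name.
    pub fn has_org(&self, org: &str) -> bool {
        self.organizations.iter().any(|o| o.as_str() == org)
    }
}
```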

Client: drop the fake default organization option and the name-to-empty
normalization, offer only saved organizations in the deploy picker (the
Unassigned bucket and any orphaned-tag group are recovery views, never deploy
targets), and prefill the "+ New organization" name with a renameable
suggestion derived from the first member, so a solo deploy still gets a real,
settings-capable organization from the start.

lhumina_code/home#298

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: add a Launcher Overview landing page with live counts

lab release / release (push) Successful in 31m52s

Details

4c64a27b4e

Open the Launcher on a new Overview section, now the default landing route:
six tiles counting organizations, members (with a ready/installing/failed
rollup), fleet capacity, Stacks, Settings profiles, and node groups; a
"Needs attention" block (failed installs to retry, instances with a newer
build available checked on demand, and any Stack or Settings profile no
organization references) that collapses to one "all good" line; and an
"Organizations at a glance" list, one row per organization with its member
count and health.

A new read-only /overview.json aggregate serves the counts and signals by
reusing the Organizations member rollup. Fleet capacity loads lazily from
/nodes.json (a live grid query); the newer-build check is on demand so it
never hammers the build host on a page view. The top navigation page becomes
navigation only, with the fleet capacity strip moved onto the Overview. The
Manual now describes the Overview.

lhumina_code/home#299

Signed-by: mik-tf <mik-tf@noreply.invalid>

Seed the admin console URL onto each member instance at provision

lab release / release (push) Has been cancelled

Details

7c78868bbe

A member instance now receives its admin instance's own console URL
(core/ADMIN_CONSOLE_URL) plus the admin allowlist mirrored into the core
context (core/ADMIN_FORGE_USERS, previously seeded only in the deployer
context), so the member cockpit can render an admin-only link back to the
control plane. The URL is taken from an explicit ADMIN_CONSOLE_URL env or
derived from the admin OAuth callback host; it is empty when neither is
available, which leaves the member-side link hidden rather than wrong.
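
The URL selection could be sketched as below. The function names are hypothetical, and deriving "scheme://host" from the callback URL is an assumption about what "derived from the admin OAuth callback host" means; the caller is assumed to pass in the env values.

```rust
/// Pick the admin console URL: an explicit value wins, otherwise derive it
/// from the OAuth callback URL; return an empty string when neither is
/// usable, so the member-side link stays hidden rather than wrong.
pub fn admin_console_url(explicit: Option<&str>, oauth_callback: Option<&str>) -> String {
    if let Some(url) = explicit.map(str::trim).filter(|u| !u.is_empty()) {
        return url.to_string();
    }
    oauth_callback.and_then(origin_of).unwrap_or_default()
}

/// Keep only "scheme://host[:port]" of a URL (assumed derivation).
fn origin_of(url: &str) -> Option<String> {
    let (scheme, rest) = url.trim().split_once("://")?;
    let host = rest.split('/').next()?;
    if scheme.is_empty() || host.is_empty() {
        return None;
    }
    Some(format!("{}://{}", scheme, host))
}
```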

lhumina_code/home#300

Signed-by: mik-tf <mik-tf@noreply.invalid>

Gate the admin-console link on an admin-only list, seed it on update

lab release / release (push) Successful in 38m18s

Details

e49b6da141

The member cockpit now reads core/ADMIN_CONSOLE_USERS (the workspace admins
only) instead of the proxy login allowlist. The login allowlist also contains
the member's own username, so gating on it would have shown the admin-console
link to the member themselves. The install path and the Update all instances
path both seed core/ADMIN_CONSOLE_USERS plus the console URL, so the link
works on the existing fleet after a routine update, not only on fresh
installs. Seeds on the update path are non-fatal so they never abort the
binary update, which is the critical path.

lhumina_code/home#300

Signed-by: mik-tf <mik-tf@noreply.invalid>

Seed the tester Kimi config to drop the shell and file-read tools

lab release / release (push) Successful in 34m39s

Details

f1eddc4a0a

A sandbox tester VM is pre-fed shared provider API keys for the assistant. The
Kimi agent ships a shell tool that runs as the service user, so a browser-only
tester could read those shared keys out of the agent's environment or files. The
tester's seeded ~/.kimi/config.toml now sets exclude_tools to drop the shell and
the arbitrary-path file readers, which the agent honors, closing that read path
without affecting a developer's own coding agent.

lhumina_code/home#249

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: install the Hero OS desktop bundle on provision and update

lab release / release (push) Has been cancelled

Details

0d065b1e4e

The hero_os_admin service serves the Dioxus desktop from
~/hero/share/hero_os/public and refuses to start without it, but that
desktop is a WASM bundle (built with dx), not a musl service binary, so
the lab build/download loop never installs it. A fresh member therefore
booted hero_os_admin asset-less and the desktop only existed where it
had been copied by hand.

Add a step after the lab build loop that, when hero_os is in the enabled
set, fetches the published hero_os_app-web-dist.tar.gz for the active
release channel and unpacks it into ~/hero/share/hero_os/public. Runs on
a fresh provision and on "Update all instances", so members self-deliver
and self-update the desktop.

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: refresh the Hero OS desktop bundle on the update path too

lab release / release (push) Successful in 21m11s

Details

b0e1914526

The full install delivers the desktop bundle via setup-binaries.sh, but
"Update all instances" on an already-ready member runs update_vm_services,
which refreshes the service binaries through cockpit.upgrade_service and
never runs setup-binaries.sh. So a normal fleet update advanced the hero_os
binaries while leaving the WASM desktop bundle stale (or absent on a member
that never had it), the exact drift this pipeline is meant to remove.

Append a shared bundle-refresh block to the binary-update payload: when
hero_os is in the member's install set, fetch the published bundle for the
member's release channel, unpack it into ~/hero/share/hero_os/public, and
restart hero_os_admin. It runs after the binary update and is non-fatal, so
a missing asset can never block the critical binary path. Both delivery
paths (fresh install and update) now self-deliver the current desktop.

Signed-by: mik-tf <mik-tf@noreply.invalid>

docs(manual): vendor hero_office shared-engine service page

lab release / release (push) Successful in 32m39s

Details

5cdca0bdbf

Sync the deployer Manual's hero_office page with the platform docs: the
OnlyOffice editor is a shared engine on the admin instance (not a per-member
container), reaching back into each member to fetch and save documents that
never leave that member.

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: wire members to the shared OnlyOffice editor on install and update

lab release / release (push) Successful in 34m24s

Details

195e598311

When the admin instance runs a shared OnlyOffice Document Server (hub mode with
ONLYOFFICE_JWT_SECRET set) and a member's install set includes hero_office, seed
that member's OnlyOffice slots so the engine can fetch and save through the
member without bypassing its login gate: ONLYOFFICE_JWT_SECRET,
CONNECTOR_EXTERNAL_URL (the member's own hero_proxy at mycelium:9997),
OO_UPSTREAM_BASE (the admin engine at mycelium:80), and the
HERO_PROXY_PUBLIC_HERO_OFFICE carve-out flag.

The install path seeds the slots and restarts hero_office; the update path
("Update all instances") self-gates on hero_office being present, then seeds and
re-registers hero_office plus hero_proxy with --reset so the updated binaries'
new env blocks take effect. Read the shared secret from the deployer's own
ONLYOFFICE_JWT_SECRET env (a new service.toml env block; adding it needs a
deployer re-register).

Tracked at lhumina_code/home#304

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: point the OnlyOffice connector at the member FQDN, not mycelium

lab release / release (push) Successful in 14m56s

Details

4e636e1abb

The shared OnlyOffice Document Server fetches each member's document and
posts the save callback by calling the connector URL we seed into the
member (core/CONNECTOR_EXTERNAL_URL). We were seeding the member's
mycelium address as http://[<ipv6>]:9997, but the Document Server's URL
parser rejects bracketed IPv6 literals (ERR_INVALID_URL), so it could
never download the document or post the callback: editing failed with
"Download failed" the moment a real edit ran. A curl-based check passed
because curl handles bracketed IPv6, which hid the gap.

Seed the connector as the member's https FQDN instead, at both the
install and the update wiring sites. The engine then reaches the same two
carve-out paths (/files + /callback) through the member's public gateway,
still JWT-gated and still behind the login floor for everything else. The
seed falls back to the mycelium form only if the FQDN is unset.
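
The choice of connector URL can be sketched as follows. `connector_external_url` is a hypothetical helper; only the two URL shapes (the https FQDN and the bracketed mycelium address on port 9997) come from the commit message.

```rust
/// Build the value seeded into core/CONNECTOR_EXTERNAL_URL.
/// Prefer the member's https FQDN; fall back to the bracketed mycelium
/// address only when the FQDN is unset or blank. The fallback form is the
/// one the Document Server's URL parser was reported to reject.
pub fn connector_external_url(member_fqdn: Option<&str>, mycelium_ip: &str) -> String {
    match member_fqdn.map(str::trim).filter(|f| !f.is_empty()) {
        Some(fqdn) => format!("https://{}", fqdn),
        None => format!("http://[{}]:9997", mycelium_ip),
    }
}
```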

Proven on a live member: minting a valid JWT and fetching the document
over https://<member-fqdn>/hero_office/ui/files/... returns 200 with the
real document bytes (and 404 for a forged token).

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: seed each member its own Hero OS context at install and update

lab release / release (push) Successful in 14m24s

Details

c558c5bab5

Add a per-member Hero OS context seed to both the install SSH payload and
the "Update all instances" payload: set the member's core/HERO_OS_MEMBER_CONTEXT
slot (its forge username) and re-register hero_os_admin with --reset so the
new [[env]] enters the stored service def and takes effect. The desktop then
redirects every browser entry into the member's own workspace, so a member
lands in a context named for themselves and a support admin signing in lands
in the member's context too, instead of a build-time demo default.

Self-gated on hero_os being installed and best-effort (|| true) so it never
fails the install or update. Adds tester_context to TesterServiceUpdateParams
(populated from the member's forge username at the update-all call site) and
unit tests asserting both payloads seed the slot and re-register before the
restart.

lhumina_code/home#305

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: install hero_browser per member so Slides can export PDF/PPTX

lab release / release (push) Successful in 31m32s

Details

5d9e17555c

hero_slides renders its PDF and PPTX exports by driving a local headless
Chrome (hero_browser) over 127.0.0.1:8884. Add hero_browser to the member
install catalog (opt-in, off by default since it brings in a full Chrome),
pull it in automatically whenever Slides is enabled so an export never
silently fails, and install Google Chrome in the member setup script when
hero_browser is in the install set. Fresh provision and "Update all
instances" share the same resolver and install path.
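
A minimal sketch of the dependency rule, assuming a set-based resolver. `resolve_install_set` and `needs_chrome` are hypothetical names; the deployer's shared resolver is not shown in the commit message.

```rust
use std::collections::BTreeSet;

/// Resolve the member install set from the enabled apps. hero_browser is
/// pulled in automatically whenever hero_slides is enabled, so a PDF or
/// PPTX export never silently fails for lack of a headless Chrome.
pub fn resolve_install_set(enabled: &[&str]) -> BTreeSet<String> {
    let mut set: BTreeSet<String> = enabled.iter().map(|s| s.to_string()).collect();
    if set.contains("hero_slides") {
        set.insert("hero_browser".to_string());
    }
    set
}

/// The setup script installs Google Chrome only when hero_browser is in
/// the install set.
pub fn needs_chrome(set: &BTreeSet<String>) -> bool {
    set.contains("hero_browser")
}
```

Routing both fresh provision and update through one resolver keeps the two paths from drifting apart.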

lhumina_code/home#279

Signed-by: mik-tf <mik-tf@noreply.invalid>

deployer: default a member's Office to their own workspace

lab release / release (push) Successful in 46m21s

Details

0ae0a8be8c

hero_office stores documents under the active Hero context and falls
back to DEFAULT_CONTEXT when none is given. Seed that default to the
member's own context (its forge username) at install and update so a
member's Office lands in their own workspace instead of the build-time
"demo" default. DEFAULT_CONTEXT has a non-empty default in the office
service.toml, which shadows a plain secret set, so it is passed on the
lab re-register to take effect. Self-gated on hero_office being
installed and best-effort so it never fails an install.

Refs lhumina_code/home#306

Signed-by: mik-tf <mik-tf@noreply.invalid>

Make Browser, Slides, Planner, and Whiteboard default-on member apps

lab release / release (push) Successful in 33m33s

Details

2531140400

Remove hero_browser, hero_slides, hero_planner, and hero_whiteboard from
DEFAULT_OFF_APPS so a default member instance installs them out of the box.
DEFAULT_OFF_APPS is the single source feeding the provision default,
resolve_enabled_components, and the app_catalog default_on flag (and thus the
Launcher "Default" preset and the New-Stack checkbox state), so the provision
default and the UI preset never drift apart. hero_browser brings a headless
Chrome (the accepted cost of having Slides export work everywhere); the setup
script already installs Chrome when hero_browser is in the set, and the cockpit
already surfaces these apps on the Admin and Services pages. Office, Collab,
Biz, Code, Orchestrator, and the parked AI Broker stay opt-in.

lhumina_code/home#279

Signed-by: mik-tf <mik-tf@noreply.invalid>

chore(deps): align to the development stack (branch + herolib_macros)

0883830439

Convergence step 1 (mechanical): base = integration's functional state (D-45
gateway routing etc.); point all hero_* git deps at the development branch and
rename herolib_derive -> herolib_macros to match the development libs. Source
build/API drift fixes follow in the next step.

Signed-by: mik-tf <mik-tf@noreply.invalid>

chore: regenerate Cargo.lock for the development stack deps

b62864ec81

Signed-by: mik-tf <mik-tf@noreply.invalid>

Converge deployer on development RPC stack

eb16beb945

Normalize deployer OpenRPC methods for the development macro stack, keep method params single-input, install hero_components with the admin app set, and wire deployer secret operations to the restored context-aware hero_proc SDK. Pin hero_proc_sdk to the merged development revision containing the context secrets API.

Refs: #30
Refs: lhumina_code/hero_proc#163

Signed-by: mik-tf <mik-tf@noreply.invalid>

Merge development into deployer convergence branch

a2460d7593

Resolve the development merge by adopting the branch's ignored Cargo.lock policy while preserving the deployer SDK socket override needed for local admin tooling.

Refs: #30

Signed-by: mik-tf <mik-tf@noreply.invalid>

mik-tf merged commit b533e06972 into development

2026-06-24 04:33:07 +00:00


Reference

lhumina_code/hero_os_tfgrid_deployer!31
