deploy/single-vm scaffolding never end-to-end tested; multiple gaps #72

Closed
opened 2026-05-20 09:43:40 +00:00 by zaelgohary · 0 comments
Member

Bringing up `deploy/single-vm` (branch `development_feat_deploy_single_vm`) end-to-end against the `dev` grid required ~10 patches. Each is a blocker for a clean `make all ENV=dev`.

**Scaffolding bugs:**

  1. `tf/main.tf` is missing the `yggdrasil` arg in `grid_scheduler.requests`, which `threefoldtech/grid` v1.11.6 requires.
  2. `scripts/setup.sh` writes a systemd unit and calls `systemctl`, but the TFGrid VM entrypoint is `/sbin/zinit init`. Replace with a zinit unit + wrapper script that sources `/root/app.env` (zinit YAML has no `EnvironmentFile` equivalent).
  3. `envs/{dev,prod}/app.env.example` pins `HERO_PROC_VERSION=0.5.0-rc1`, but those registry artifacts are named `-x86_64-unknown-linux-musl`, while `setup.sh` fetches `-linux-amd64`. Pin to 0.4.4 or align names.
  4. `envs/dev/app.env.example` pins `HERO_DB_VERSION=dev`, but only the CLI is published at `dev`; there is no `_server` artifact. Pin to 0.3.2.
  5. `scripts/{setup,update}.sh` fetch `hero_db_admin-linux-amd64`; the registry only has `hero_db_ui-linux-amd64`, and slides doesn't use it. Drop the fetch.
  6. No TCP front. `hero_slides_admin` only binds a unix socket, but the TFGrid gateway proxies HTTPS to VM:8883, so the gateway returns 502. Add a socat zinit unit forwarding `:8883` to `/data/hero/var/sockets/hero_slides/admin.sock`, plus `apt-get install -y socat`.
  7. The `hero_db` 0.3.2 SDK uses the old socket name `hero_proc_server.sock`; hero_proc 0.4.4+ creates `hero_proc/rpc.sock`. Add a transitional symlink in `setup.sh`.
  8. `hero_slides_server` expects `/data/hero/var/sockets/hero_db/rpc.sock`, but hero_db 0.3.2 publishes a flat `hero_db_server.sock`, so `/rpc` returns `Backend unavailable`. A second transitional symlink is needed.
  9. `envs/*/app.env.example` doesn't document `FORGE_TOKEN` and `WEBROOT`; hero_proc 0.4.4 hard-refuses to start without them (see hero_proc issue).
  10. No `.forgejo/workflows/`. The `dev` binaries used here were published manually; CI would catch (3)–(5) immediately.
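
The transitional symlinks from items 7 and 8 could look roughly like the sketch below. This is not the actual patch: the function name `link_legacy_sockets` is hypothetical, and it assumes both legacy sockets sit directly under a single socket root (the issue only spells out `/data/hero/var/sockets` for the hero_slides and hero_db paths).

```shell
#!/bin/sh
# Sketch only: transitional socket symlinks for items 7 and 8.
# Assumption: every socket lives under the root directory passed as $1.

link_legacy_sockets() {
    sock_root="$1"

    # Item 7: the hero_db 0.3.2 SDK still looks for hero_proc_server.sock,
    # while hero_proc 0.4.4+ creates hero_proc/rpc.sock.
    ln -sfn "$sock_root/hero_proc/rpc.sock" "$sock_root/hero_proc_server.sock"

    # Item 8: hero_slides_server looks for hero_db/rpc.sock,
    # while hero_db 0.3.2 publishes the flat hero_db_server.sock.
    mkdir -p "$sock_root/hero_db"
    ln -sfn "$sock_root/hero_db_server.sock" "$sock_root/hero_db/rpc.sock"
}
```

In `setup.sh` this might be called as `link_legacy_sockets /data/hero/var/sockets` before the services start. The links may dangle until the target sockets are created, which is fine for a symlink, and both can be dropped once hero_db ships with the new socket layout.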

Deployed successfully after applying all patches manually: https://devslides.gent02.dev.grid.tf/

Reference
lhumina_code/hero_slides#72