CI/CD Improvements for All 3 Environments #72

Open
opened 2026-04-06 13:57:36 +00:00 by peter · 2 comments
Owner

Current State

  • Dev (migrate.dev.projectmycelium.com): deploy-dev.yml deploys on push to development via rsync + SSH to 185.69.166.168. Works but uses old pattern.
  • Testnet (migrate.test.projectmycelium.com): No CI/CD. Manual docker compose pull && up on 46.225.220.102.
  • Production (migrate.projectmycelium.com): Two overlapping workflows (deploy-production.yml on push to main, build-container.yml on tag push) both deploy to the same k3s cluster — redundant and potentially conflicting.

Since v0.1.14, HEROLEDGER_NETWORK is a runtime env var. One image works for all environments — just set HEROLEDGER_NETWORK=dev/test/main at container start. No separate Dockerfile per network needed.
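Since the network is selected at container start, a single compose service can target any environment by changing one variable. A minimal sketch (the image reference, service name, and port are placeholders, not taken from the repo):

```yaml
# docker-compose.yml sketch: one image for all environments, network chosen
# at runtime. Image path and port are hypothetical.
services:
  portal:
    image: forge.example.com/mycelium/portal:latest
    environment:
      - HEROLEDGER_NETWORK=test   # dev | test | main
    ports:
      - "8080:8080"
    restart: unless-stopped
```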


Option A: Tag-Based Promotion

Each environment has its own trigger: development branch for dev, -rc tags for testnet, release tags for production.

Workflows

| Workflow | Trigger | Image tag | Deploy to |
|----------|---------|-----------|-----------|
| `deploy-dev.yml` | push to `development` | `:dev` | Dev VM (185.69.166.168) |
| `deploy-testnet.yml` | tag `v*-rc*` | `:<version>` (e.g. `:0.1.15-rc1`) | Testnet VM (46.225.220.102) |
| `deploy-production.yml` | tag `v*` (not `-rc`) | `:<version>` + `:latest` | k3s cluster |

Flow

Push to development
  └─> deploy-dev.yml: build → push :dev → deploy to dev VM

git tag v0.1.15-rc1 && git push origin v0.1.15-rc1
  └─> deploy-testnet.yml: build → push :0.1.15-rc1 → pull on testnet VM → health check

git tag v0.1.15 && git push origin v0.1.15
  └─> deploy-production.yml: build → push :0.1.15 + :latest → kubectl set image → health check → create release
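The tag filters above could be written in the workflow triggers roughly as follows (Forgejo Actions follows GitHub Actions syntax; the negated pattern for the production workflow is an assumption about how the `-rc` exclusion would be expressed):

```yaml
# deploy-testnet.yml (sketch): fire only on release-candidate tags
on:
  push:
    tags:
      - 'v*-rc*'

# deploy-production.yml (sketch): fire on release tags, excluding RCs
# on:
#   push:
#     tags:
#       - 'v*'
#       - '!v*-rc*'
```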

Pros

  • Testnet can be held at a specific version while dev moves forward
  • Clear version history per environment
  • Testnet and production deploys are deliberate actions (tag push)
  • Easy rollback: kubectl set image portal=...:0.1.14

Cons

  • Different actions for different environments (push vs tag)
  • Redeploying same code to testnet requires a new tag (-rc2, -rc3)
  • Image is rebuilt for each environment

Option B: Single Build, Promote to Production

One build on push to development deploys to both dev and testnet. Production promotion is a tag that re-tags the existing image (no rebuild).

Workflows

| Workflow | Trigger | What happens |
|----------|---------|--------------|
| `build.yml` | push to `development` | Build → push `:dev` + `:<short-sha>` → deploy to dev VM → deploy to testnet VM |
| `release.yml` | tag `v*` | Re-tag `:dev` as `:<version>` + `:latest` → deploy to k3s → create release |

Flow

Push to development
  └─> build.yml:
        ├─ build image
        ├─ push :dev + :abc1234
        ├─ deploy to dev VM (rsync binary + restart)
        └─ deploy to testnet VM (docker compose pull + restart + health check)

git tag v0.1.15 && git push origin v0.1.15
  └─> release.yml:
        ├─ re-tag :dev as :0.1.15 + :latest (no rebuild)
        ├─ kubectl set image portal=...:0.1.15
        ├─ health check
        └─ create Forgejo release
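The re-tag step can be done entirely against the registry, without pulling layers. One way is `docker buildx imagetools create`, which copies an existing manifest to new tags; the registry path and variable names below are placeholders:

```yaml
# release.yml step (sketch): promote the already-built :dev image by
# re-tagging it in the registry. No rebuild, no local pull.
- name: Re-tag :dev as release
  run: |
    docker buildx imagetools create \
      --tag "$REGISTRY/portal:${VERSION}" \
      --tag "$REGISTRY/portal:latest" \
      "$REGISTRY/portal:dev"
```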

Pros

  • One build, same image everywhere — what you test on testnet is exactly what goes to production
  • Dev and testnet always in sync automatically
  • Production deploy is fast (re-tag + kubectl, no rebuild)
  • Simpler tag management (no -rc tags needed)

Cons

  • Testnet always gets latest development (cannot hold at a specific version independently)
  • Dev and testnet are coupled — every push to development deploys to both

Recommendation

Both options work. The choice depends on how the team uses testnet:

  • If testnet is a staging gate where you test specific versions before promoting to production → Option A (tag-based)
  • If testnet is a preview environment that should always reflect the latest development branch work → Option B (single build)

Either way, the existing deploy-production.yml (push to main) and build-container.yml should be consolidated to avoid two workflows deploying to the same k3s cluster.

Shared requirements for both options

  1. Secrets needed in repo:

    • REGISTRY_USERNAME + REGISTRY_TOKEN — Forgejo registry (already configured)
    • KUBE_CONFIG — base64 kubeconfig for k3s ci-deployer (already configured)
    • TESTNET_SSH_KEY — SSH key for testnet VM (46.225.220.102, port 34022, not yet configured)
    • OURWORLD_IT_SECRETS_SSH_KEY — SSH key for dev VM (already configured)
  2. Testnet VM docker-compose.yml already uses registry image with HEROLEDGER_NETWORK=test — just needs docker compose pull && docker compose up -d

  3. Dev VM stays as native binary via rsync (no Docker)
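The health checks mentioned in both flows boil down to retrying an endpoint a few times before failing the job. A sketch of that loop (the `/health` path and URL are assumptions; substitute the real endpoint):

```shell
# Retry a command up to N times with a 1s delay between attempts.
# Returns 0 on the first success, 1 if all attempts fail.
retry() {
  attempts=$1; shift
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# In CI (hypothetical endpoint):
# retry 10 curl -fsS https://migrate.test.projectmycelium.com/health
```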

Author
Owner

Implemented Option A (Tag-Based Promotion) in branch feature/cicd-option-a — PR #73, pending merge to development.

What changed

| File | Change |
|------|--------|
| `deploy-testnet.yml` | New — triggers on `v*-rc*` tags, builds image, pushes to registry, SSH to testnet VM, `docker compose pull && up`, health check + smoke tests |
| `deploy-production.yml` | Renamed from `build-container.yml` — keeps the existing working production logic, adds `-rc` tag exclusion so RC tags only go to testnet |
| `build-container.yml` | Removed — renamed to `deploy-production.yml` |
| `deploy-dev.yml` | No changes |
| `test.yml` | No changes |

CI/CD flow after merge

Push to development
  └─> deploy-dev.yml → build from source → rsync to dev VM (185.69.166.168)

git tag v0.1.15-rc1 && git push origin v0.1.15-rc1
  └─> deploy-testnet.yml → build image → push :0.1.15-rc1 → pull on testnet VM (46.225.220.102) → health check

git tag v0.1.15 && git push origin v0.1.15
  └─> deploy-production.yml → build image → push :0.1.15 + :latest → kubectl set image on k3s → health check → create release
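The routing rule in this flow (RC tags to testnet, plain version tags to production) reduces to a glob match. A sketch of the decision logic as it might appear in a workflow guard:

```shell
# Classify a git tag the way the two workflows' triggers do:
# v*-rc* goes to testnet, any other v* tag goes to production.
route_tag() {
  case "$1" in
    v*-rc*) echo "testnet" ;;
    v*)     echo "production" ;;
    *)      echo "ignored" ;;
  esac
}

route_tag "v0.1.15-rc1"   # testnet
route_tag "v0.1.15"       # production
```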

Action required

Add TESTNET_SSH_KEY secret to the repo settings.

Member

@peter Nice cleanup!
The per-environment separation is much clearer than the old overlapping workflows.

One concern with Option A: both testnet and production trigger fresh Docker builds from different tags. Docker builds aren't reproducible: floating base images (node:20-alpine), apt-get update, and npm ci can all pull different versions between builds. So the image tested on testnet isn't guaranteed to be the same binary that goes to production.

My suggestion is build once, promote by retagging:

  1. Push to development → build image, tag with commit SHA → deploy to dev VM
  2. Tag v0.1.15 on development → retag the dev image as :<version> (no rebuild) → deploy to testnet VM
  3. Merge to main → retag the same image as :latest (no rebuild) → deploy to production k8s

This gives you:

  • Identical binary across all three environments
  • main always reflects what's running in production
  • Version tags give history and rollback points
  • No -rc suffix needed; testnet is the RC environment (also, Thabet previously pushed back against the -rc suffix, but you can confirm with him too)
  • Faster deployments (retag is instant vs multi-minute Rust build)
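The promotion steps (2 and 3) don't even need a Docker daemon; a registry-to-registry copy tool such as skopeo can move the tag server-side. A sketch as a workflow step, with hypothetical image paths and variables:

```yaml
# Promotion step (sketch): copy the SHA-tagged manifest to the version tag
# without pulling layers. Registry path and variables are placeholders.
- name: Promote image to version tag
  run: |
    skopeo copy \
      docker://forge.example.com/mycelium/portal:sha-${SHORT_SHA} \
      docker://forge.example.com/mycelium/portal:${VERSION}
```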