#3 - feat(test): E2E test suite via browser-deployed SPA — multi-user, multi-project, real-time - lhumina_code/hero_assistance

mik-tf commented

2026-05-01 02:14:23 +00:00

Owner

Why

Phase 16 (session 24) demonstrated that the SPA path (_ui --dist + socat TCP→UDS bridge) is fully drivable from a headless Chromium. This unlocks the missing top of the test pyramid: actual user-experience tests that exercise enrollment, ticket flows, presence, live updates, attachments, and cross-project isolation against the real wire stack — not just unit + integration tests against the JSON-RPC layer.

Goal: "OK it works" confidence. Catch regressions like Phase 17's WebSocket-reactive-dep bug or session-23's Bootstrap-load-mechanism breakage at PR time, not in the user's hands.

Sequencing — depends on #5 (Phase 18)

This issue is Phase 19 in the roadmap, gated by #5 (Phase 18: vision + e2e_checklist + architecture + testing docs). Reason: Phase 18 lands the row-by-row spec matrix (freezone-pattern), and every Playwright spec written here will reference a row in that checklist. Writing tests before the spec rows risks orphan tests with no clear acceptance criterion.

Demo / test infrastructure (8083-8086 scheme)

Per session-24's rebrand:

Port	Company	Role	Sign-in email
8083	Mycelium	support	`support@mycelium.com`
8084	Mycelium	customer	`alice@example.com`
8085	Hero	support	`support@hero.com`
8086	Hero	customer	`bob@example.com`

Each port = separate origin = isolated localStorage = independent logged-in identity (drivable from a single Chromium with multiple BrowserContexts).

Add carol@example.com / dave@example.com per project for two-customer presence-comparison tests.
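
The four-port scheme above could be captured as a typed fixture that a test suite iterates over, one origin per identity. This is a minimal sketch, not code from the repository: `DEMO_IDENTITIES`, `originFor`, and `sameProject` are hypothetical names, and the `http://localhost` host and scheme are assumptions, since the issue only specifies ports.

```typescript
type Role = "support" | "customer";
type Company = "Mycelium" | "Hero";

interface DemoIdentity {
  port: number;
  company: Company;
  role: Role;
  email: string;
}

// Values copied from the port table in this issue.
const DEMO_IDENTITIES: readonly DemoIdentity[] = [
  { port: 8083, company: "Mycelium", role: "support", email: "support@mycelium.com" },
  { port: 8084, company: "Mycelium", role: "customer", email: "alice@example.com" },
  { port: 8085, company: "Hero", role: "support", email: "support@hero.com" },
  { port: 8086, company: "Hero", role: "customer", email: "bob@example.com" },
];

// Assumption: the demo is served over plain HTTP on localhost.
function originFor(id: DemoIdentity, host: string = "localhost"): string {
  return `http://${host}:${id.port}`;
}

// Two identities belong to the same project when their company matches.
// A cross-project isolation test would pick a pair where this is false.
function sameProject(a: DemoIdentity, b: DemoIdentity): boolean {
  return a.company === b.company;
}
```

In a Playwright spec, each entry would typically get its own context, for example via `browser.newContext({ baseURL: originFor(id) })`, so that every identity keeps a separate localStorage.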

Deliverables

Seed simplification — port the rebrand from session-24's stopgap rebrand.py into crates/hero_assistance_server/examples/phase13_seed.rs. Rename acme profile → mycelium and globex → hero. Update workspace names + user emails to the simplified scheme.
Demo orchestrator — extend scripts/phase13_demo.sh to also start the 2 _ui --dist instances + 4 socat bridges; new --browser flag (default --app for backward compat).
Test framework: Playwright (recommended over hero_browser MCP — parallel drive is more reliable; supports BrowserContext-per-origin natively; built-in trace/video). Layer at tests/e2e/ with playwright.config.ts + per-scenario .spec.ts files.
Phase 1 scenarios:
- Enrollment per role (support + customer) — sign-in flow; localStorage["hero_assistance.identity"] persists; F5 keeps you signed in
- Single-user ticket lifecycle — file → comment → optimistic insert reconciled with server row
- Two-user real-time — customer + support both view the same ticket; presence shows in "Viewing now" sidebar; back-and-forth comments arrive live without refresh
- Cross-project isolation — Bob (Hero) posts a comment; Alice (Mycelium) does NOT see it
- Image attachment — synthesize a small PNG via Uint8Array → File → fire programmatic paste event; verify upload + inline thumbnail render
- Modal flow — "New ticket" Modal opens, submits, dismisses on success
- Error states — wrong magic-link token shows -32002 "Request a new link" affordance
- Responsive — 375 / 768 / 1920 viewports; no horizontal overflow
CI integration — make e2e target; Forgejo workflow runs E2E on every PR (or nightly if too slow).
Visual regression — pixelmatch screenshot comparison, <1% perceptual diff threshold per D-09. Replaces the manual baseline-recapture step in Phase 16b.
Documentation — docs/dev/testing.md (test-pyramid layering); docs/dev/demo.md (manual exploration runbook for the 4-port browser scheme).
New decision file D-XX-e2e-test-architecture.md — Playwright + multi-context-per-origin pattern; the seed-rebrand convention; visual-regression policy.
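
The `<1%` threshold in the visual-regression deliverable can be read as a budget on the fraction of mismatched pixels. The sketch below assumes that reading (D-09 itself is not quoted here) and uses a naive exact RGBA comparison as a stand-in for pixelmatch's perceptual metric. In a real suite the mismatch count would come from pixelmatch, which returns the number of differing pixels. `countMismatchedPixels` and `withinDiffBudget` are hypothetical names.

```typescript
// Stand-in for pixelmatch: counts pixels whose RGBA bytes differ exactly.
// Both buffers must hold width * height pixels at 4 bytes each.
function countMismatchedPixels(
  a: Uint8Array,
  b: Uint8Array,
  width: number,
  height: number,
): number {
  const expected = width * height * 4;
  if (a.length !== expected || b.length !== expected) {
    throw new Error("buffers must be width * height * 4 bytes (RGBA)");
  }
  let mismatched = 0;
  for (let p = 0; p < expected; p += 4) {
    if (
      a[p] !== b[p] ||
      a[p + 1] !== b[p + 1] ||
      a[p + 2] !== b[p + 2] ||
      a[p + 3] !== b[p + 3]
    ) {
      mismatched++;
    }
  }
  return mismatched;
}

// True when the mismatched fraction is strictly below the budget.
// Assumption: the 1% default applies to the share of differing pixels.
function withinDiffBudget(
  mismatched: number,
  width: number,
  height: number,
  maxRatio: number = 0.01,
): boolean {
  return mismatched / (width * height) < maxRatio;
}
```

Because the comparison is strict, one differing pixel in a 10x10 capture (exactly 1%) fails, while one differing pixel in a 20x10 capture (0.5%) passes.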

Out of scope

Auth split (separate issue — see #4)
Visual polish beyond Phase 16 (separate issue — see #2)
Vision + executable-spec docs (separate issue — see #5; this issue references those rows)

Estimated effort

2-3 sessions:

Session A: Playwright bootstrap + seed rename + 3-4 baseline scenarios
Session B: cross-project + image-paste + responsive scenarios
Session C: CI integration + visual regression + docs + decision file

Success criterion

make e2e exits 0 on a clean checkout. Sanity check: intentionally introduce a regression (e.g., revert the Phase 17 WebSocket-reactive-dep fix) and confirm the suite goes red.


mik-tf referenced this issue

2026-05-01 02:14:55 +00:00

feat(auth): split auth — magic-link for customers, email+password for support agents (v0.3.0) #4

mik-tf referenced this issue

2026-05-01 02:16:02 +00:00

docs(spec): Phase 18 — vision + executable spec + architecture + testing pyramid (freezone pattern) #5

mik-tf referenced this issue from a commit

2026-05-01 02:29:08 +00:00

feat(session-24): Phase 16a — UI polish foundation + D-18 + issues #3/#4/#5

mik-tf referenced this issue from a commit

2026-05-01 02:29:08 +00:00

chore(sessions): record session-24 manifest + CLAUDE.md/prompt.md handoff

mik-tf referenced this issue

2026-05-01 02:41:13 +00:00

feat(auth): split auth — magic-link for customers, email+password for support agents (v0.3.0) #4

mik-tf referenced this issue from a commit

2026-05-01 03:43:16 +00:00

docs(session-25): Phase 18a — vision + executable-spec skeleton + sections A-D

mik-tf referenced this issue

2026-05-01 03:44:01 +00:00

docs(spec): Phase 18 — vision + executable spec + architecture + testing pyramid (freezone pattern) #5

mik-tf referenced this issue from a commit

2026-05-01 04:15:32 +00:00

chore(sessions): record session-26 manifest + CLAUDE.md/prompt.md handoff

mik-tf referenced this issue

2026-05-01 04:16:13 +00:00

docs(spec): Phase 18 — vision + executable spec + architecture + testing pyramid (freezone pattern) #5

mik-tf referenced this issue

2026-05-01 14:48:15 +00:00

Phase 16 — UI polish: from "functional" to "amazing" #2

mik-tf referenced this issue from a commit

2026-05-01 17:53:07 +00:00

feat(tests,scripts)(session-28): Phase 18c — Playwright suite + 4-port browser scheme + L2/L4/L5/L6 substrate

mik-tf referenced this issue from a commit

2026-05-01 18:53:32 +00:00

feat(tests,docs)(session-29): Phase 18c-release — un-skip enrol+breadcrumb specs, land 6 SPA L7 baselines, flip ~31 checklist cells, v0.3.0

mik-tf closed this issue

2026-05-01 18:53:32 +00:00

mik-tf commented

2026-05-01 18:54:01 +00:00

Author

Owner

v0.3.0 shipped — Phase 18c-release closes this issue.

Commit: 39d85e5 on development; tag v0.3.0 (annotated).

What landed in s29:

Un-skipped tests/playwright/regression/enrol.spec.ts — three concrete fixes (none of which were the s28-theorised hydration race): log-grep path was relative to test cwd not repo root; Token-step submit button accessible name has a leading space from the icon glyph (regex /^sign in$/i → /sign in/i); parsed.email assertion replaced with user_id > 0 + role.length > 0 since IdentityState::Authenticated.email is intentionally None post-enrolment per enrollment.rs:226. Also added a defensive clickAndWaitForRpc helper that gates click+RPC on page.waitForResponse with a method-name post-data filter and 5-attempt retry — hydration-race insurance even though hydration wasn't the root cause.
Un-skipped tests/playwright/regression/breadcrumb.spec.ts — root cause was stale WASM dist (target/dx/hero_assistance_ui_wasm/release/web/public was Apr 30 18:50, predating s27's W8 Breadcrumb landing on May 1). The running _ui --dist served the prior ← Back to tickets ad-hoc link instead of the typed Breadcrumb component, so nav[aria-label="breadcrumb"] never matched. Resolved by make dist; spec body unchanged.
Landed 6 L7 SPA baseline PNGs via scripts/capture-spa-baselines.sh against the running 4-port demo + hero_browser_server on :8884:
- tests/baselines/spa-enroll-{375,768,1920}.png (EnrollmentModal)
- tests/baselines/spa-home-{375,768,1920}.png (Anonymous CTA)
  Detail-page capture deferred per script comment — needs identity pre-seed that respects Dioxus 0.7 hydration order.
Flipped 31 Browser?/MCP? cells in docs/dev/e2e_checklist.md Sections A/C/D/G/L with <test_file>:<test_name> evidence per §Process invariants (every ✅ on a Wired/Browser/MCP cell points at a specific test or capture). Plus 8 Wired? enhancements with tests/e2e_journey.sh:Phase<N> evidence on Section L methods. Plus 6 new rows in Section M (M-15..M-20) for the SPA captures with MCP ✅ via hero_browser MCP capture.
Updated docs/dev/testing.md §2 layer counts: L5 9 active+2 skipped → 11 passing; L7 14 baselines → 20 baselines; total ~381 → ~389.
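
The click-then-wait pattern described for `clickAndWaitForRpc` might look roughly like the sketch below. It is not the helper from the repository: the function is named `clickAndWaitForRpcSketch` to make that clear, the `PageLike`, `ResponseLike`, and `RequestLike` interfaces are structural stand-ins for the parts of Playwright's `Page`, `Response`, and `Request` it touches, and the per-attempt timeout and the JSON-RPC body shape (a top-level `method` field) are assumptions.

```typescript
// Minimal structural stand-ins for the Playwright objects used here.
interface RequestLike {
  postData(): string | null;
}
interface ResponseLike {
  request(): RequestLike;
}
interface PageLike {
  waitForResponse(
    predicate: (response: ResponseLike) => boolean,
    options?: { timeout?: number },
  ): Promise<ResponseLike>;
}

// True when the request body is JSON whose "method" field equals `method`.
function isRpcCall(postData: string | null, method: string): boolean {
  if (postData === null) return false;
  try {
    const body = JSON.parse(postData) as { method?: unknown };
    return body.method === method;
  } catch {
    return false; // not JSON, so not a JSON-RPC call
  }
}

async function clickAndWaitForRpcSketch(
  page: PageLike,
  click: () => Promise<void>,
  method: string,
  attempts: number = 5,
  timeoutMs: number = 2000, // assumed value, not taken from the issue
): Promise<ResponseLike> {
  let lastError: unknown = new Error("attempts must be >= 1");
  for (let i = 0; i < attempts; i++) {
    // Start waiting before the click so a fast response is not missed.
    const pending = page.waitForResponse(
      (r) => isRpcCall(r.request().postData(), method),
      { timeout: timeoutMs },
    );
    // Keep a rejected wait from surfacing as an unhandled rejection
    // if the click itself throws first.
    pending.catch(() => undefined);
    try {
      await click();
      return await pending;
    } catch (err) {
      lastError = err; // retry: click again and wait for the RPC again
    }
  }
  throw lastError;
}
```

With Playwright, a call site could pass `() => page.getByRole("button", { name: /sign in/i }).click()` as the `click` argument, together with whichever RPC method name the step is expected to trigger.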

Test posture:

L5 Playwright regression: 11 passing (was 9 active + 2 skipped)
L6 Playwright adversarial: 7 passing (unchanged)
L7 Visual regression: 20 baselines (5 desktop manual + 9 askama-auto + 6 SPA-auto from s29)
L1 unit / L2 smoke / L3 integration / L4 E2E: unchanged (237 / 16 / 80 / 18)
3 L-03 inherited failures unchanged; 1 phase10 transient flake unchanged; 0 ignored Playwright (was 2 in s28)

No new decisions; no new limitations. B.5 skip held (doc + spec polish; no wire/protocol/invariant change).

Closing — substrate is engineering-complete pre-customer at the test-pyramid layer.


mik-tf referenced this issue from a commit

2026-05-01 19:00:44 +00:00

chore(sessions): record session-29 manifest + CLAUDE.md/prompt.md handoff

mik-tf referenced this issue from a commit

2026-05-02 01:11:17 +00:00

feat(session-24): Phase 16a — UI polish foundation + D-18 + issues #3/#4/#5

mik-tf referenced this issue from a commit

2026-05-02 01:11:17 +00:00

chore(sessions): record session-24 manifest + CLAUDE.md/prompt.md handoff

mik-tf referenced this issue from a commit

2026-05-02 01:11:17 +00:00

docs(session-25): Phase 18a — vision + executable-spec skeleton + sections A-D