lhumina_code/hero_collab

Fork 0

UX polish iteration — dogfood-driven refinement of the Slack-parity surface #10

New issue

Open

opened 2026-04-17 21:44:15 +00:00 by sameh-farouk · 8 comments

sameh-farouk commented

2026-04-17 21:44:15 +00:00

Member

Context

Following closure of the Slack feature parity implementation plan (#9), hero_collab now covers roughly 85–90% of Slack's daily-use feature surface. The features work. What we haven't done is use the product for a week straight.

The recent work was feature-by-feature: ship reactions, ship threads, ship canvases, ship huddles. Each feature was verified in isolation — unit tests, integration tests, a browser smoke for the specific flow. What that approach doesn't catch is the friction that only shows up when you string several flows together in the rhythm a real user has: switch channels, read a thread, open a canvas, start a huddle, come back, find the message you lost in the scroll.

This iteration is for that.

Goal

Surface and fix UX friction that only appears during actual extended use of the product. Flow-level, not feature-level. The measure of success is "a new user can do a full workday in hero_collab without feeling like it's fighting them" — not a ticked checkbox list.

Approach

Usage-driven, not plan-driven.

Dogfood hero_collab for at least a week in normal workflow — real messages, real threads, real canvases, real DMs, multiple users if we can rope in testers.
Capture every moment of friction as it happens. "I expected X and got Y." "I had to click 3 times for what should be 1." "I didn't know that feature existed." "That error message didn't help me."
Each captured item becomes a scoped follow-up issue linked back here. Some get fixed in this iteration; some get deferred if they turn out to be larger than UX polish.
Close this issue when the friction log stabilises — i.e., a day of use produces zero new entries that look like UX rather than feature gaps or real bugs.

Intentionally NOT doing: a fixed checklist of UX items to audit. Checklists front-load the conclusions; the point here is to discover what actually hurts.

Likely focus areas (non-exhaustive)

Based on what the codebase already suggests might benefit from attention, without pre-committing:

Onboarding / empty states — first load, empty channel, empty search results, fresh workspace. What does a brand-new user see and do next?
Error messages — when an RPC fails, does the toast/banner actually help the user, or just dump a trace ID?
Loading + latency — skeletons, spinners, optimistic UI on message send, channel switch latency, canvas-open delay.
Keyboard shortcuts — discoverability (Ctrl+K / ⌘+K works, but does anyone know it exists?), cheatsheet, consistent behavior across panels.
Context menus and affordances — right-click reveals options, but are the most common actions visible without right-click?
Notification tuning — @mention notifications fire when the tab is backgrounded, but do they respect a "do not disturb" intent? Are they too aggressive?
Thread navigation — the panel works, but does "return to channel" flow feel right?
Attachment flow — upload progress, multi-file, drag-and-drop target clarity, error recovery.
Huddle UX — join latency, mute-on-join default, participant list visibility, the red "leave" affordance.
Canvas UX — creation flow, sharing flow, when to open in-pane vs. new tab, how cards surface in chat.
Multi-user friction — the user-picker flow for dev mode works but is awkward; real multi-user testing revealed that some flows expect a single logged-in user and break or misrender otherwise.
Responsive layout — does it survive a narrow browser window / second monitor at odd aspect?
Typography + density — is the message list readable without leaning in? Are timestamps + usernames + reactions + edits + attachments competing for visual weight?

None of these are commitments. The real list comes from using the product.

Non-goals

New features. If something missing is clearly a feature gap rather than UX polish (a Slack feature we never shipped), it goes into a separate issue, not this one.
Architecture refactors. The chat-app.js ES-module split is tracked separately; if UX work surfaces cleanly-separable chunks it may prompt a partial split, but we're not doing the full 3759-line refactor under this banner.
Backend work. Server-side bugs that surface during use get filed as their own issues. This issue tracks UX polish that lives in the UI layer.
Slack-copying for its own sake. If Slack does something and we don't, that's fine as long as what we do works and makes sense to a user.

What "done" looks like

Friction log stabilizes (a day of sustained use yields no new entries of the "this is awkward" variety).
Follow-up issues filed for every captured item, with resolution status: fixed-here, deferred-with-reason, or scope-escalated-to-new-issue.
Closure comment on this issue summarising the specific changes shipped.

#9 — Slack Feature Parity plan (closed)
plan/known-issues.md — items tracked as open at closure of #9; several (bootstrap ownership in proxy mode, ES-module split, admin-dashboard coverage) may intersect with items surfaced during this pass.

## Context Following closure of the [Slack feature parity implementation plan (#9)](https://forge.ourworld.tf/lhumina_code/hero_collab/issues/9#issuecomment-20111), hero_collab now covers roughly 85–90% of Slack's daily-use feature surface. The features work. What we haven't done is *use the product for a week straight*. The recent work was feature-by-feature: ship reactions, ship threads, ship canvases, ship huddles. Each feature was verified in isolation — unit tests, integration tests, a browser smoke for the specific flow. What that approach doesn't catch is the friction that only shows up when you string several flows together in the rhythm a real user has: switch channels, read a thread, open a canvas, start a huddle, come back, find the message you lost in the scroll. This iteration is for that. ## Goal Surface and fix UX friction that only appears during actual extended use of the product. Flow-level, not feature-level. The measure of success is "a new user can do a full workday in hero_collab without feeling like it's fighting them" — not a ticked checkbox list. ## Approach **Usage-driven, not plan-driven.** 1. Dogfood hero_collab for at least a week in normal workflow — real messages, real threads, real canvases, real DMs, multiple users if we can rope in testers. 2. Capture every moment of friction as it happens. "I expected X and got Y." "I had to click 3 times for what should be 1." "I didn't know that feature existed." "That error message didn't help me." 3. Each captured item becomes a scoped follow-up issue linked back here. Some get fixed in this iteration; some get deferred if they turn out to be larger than UX polish. 4. Close this issue when the friction log stabilises — i.e., a day of use produces zero new entries that look like UX rather than feature gaps or real bugs. Intentionally NOT doing: a fixed checklist of UX items to audit. Checklists front-load the conclusions; the point here is to discover what actually hurts. ## Likely focus areas (non-exhaustive) Based on what the codebase already suggests might benefit from attention, without pre-committing: - **Onboarding / empty states** — first load, empty channel, empty search results, fresh workspace. What does a brand-new user see and do next? - **Error messages** — when an RPC fails, does the toast/banner actually help the user, or just dump a trace ID? - **Loading + latency** — skeletons, spinners, optimistic UI on message send, channel switch latency, canvas-open delay. - **Keyboard shortcuts** — discoverability (Ctrl+K / ⌘+K works, but does anyone know it exists?), cheatsheet, consistent behavior across panels. - **Context menus and affordances** — right-click reveals options, but are the most common actions visible without right-click? - **Notification tuning** — @mention notifications fire when the tab is backgrounded, but do they respect a "do not disturb" intent? Are they too aggressive? - **Thread navigation** — the panel works, but does "return to channel" flow feel right? - **Attachment flow** — upload progress, multi-file, drag-and-drop target clarity, error recovery. - **Huddle UX** — join latency, mute-on-join default, participant list visibility, the red "leave" affordance. - **Canvas UX** — creation flow, sharing flow, when to open in-pane vs. new tab, how cards surface in chat. - **Multi-user friction** — the user-picker flow for dev mode works but is awkward; real multi-user testing revealed that some flows expect a single logged-in user and break or misrender otherwise. - **Responsive layout** — does it survive a narrow browser window / second monitor at odd aspect? - **Typography + density** — is the message list readable without leaning in? Are timestamps + usernames + reactions + edits + attachments competing for visual weight? None of these are commitments. The real list comes from using the product. ## Non-goals - **New features.** If something missing is clearly a feature gap rather than UX polish (a Slack feature we never shipped), it goes into a separate issue, not this one. - **Architecture refactors.** The chat-app.js ES-module split is tracked separately; if UX work surfaces cleanly-separable chunks it may prompt a partial split, but we're not doing the full 3759-line refactor under this banner. - **Backend work.** Server-side bugs that surface during use get filed as their own issues. This issue tracks UX polish that lives in the UI layer. - **Slack-copying for its own sake.** If Slack does something and we don't, that's fine as long as what we do works and makes sense to a user. ## What "done" looks like - Friction log stabilizes (a day of sustained use yields no new entries of the "this is awkward" variety). - Follow-up issues filed for every captured item, with resolution status: fixed-here, deferred-with-reason, or scope-escalated-to-new-issue. - Closure comment on this issue summarising the specific changes shipped. ## Related - [#9 — Slack Feature Parity plan (closed)](https://forge.ourworld.tf/lhumina_code/hero_collab/issues/9) - `plan/known-issues.md` — items tracked as open at closure of #9; several (bootstrap ownership in proxy mode, ES-module split, admin-dashboard coverage) may intersect with items surfaced during this pass.

sameh-farouk referenced this issue from a commit

2026-04-18 01:32:12 +00:00

fix(ui): two live-refresh bugs surfaced in dogfooding (#10)

sameh-farouk referenced this issue from a commit

2026-04-18 01:32:12 +00:00

fix(ui): four live-session bugs from dogfooding round 2 (#10)

sameh-farouk referenced this issue from a commit

2026-04-18 01:32:12 +00:00

fix(ui): consolidate connection status + mention notification via poll (#10)

sameh-farouk referenced this issue from a commit

2026-04-18 01:32:12 +00:00

fix(ui): DM display shows OTHER participant, not stored channel.name (#10)

sameh-farouk referenced this issue from a commit

2026-04-18 01:32:12 +00:00

fix(dm): cross-channel render bug + inline DM members in channel.list (#10)

sameh-farouk commented

2026-04-18 01:33:05 +00:00

Author

Member

Round 3 — DM symptoms + fixes

Reported:

DM name shows wrong on receiver side. Alice creates a DM with Bob → Bob's sidebar reads "Bob" instead of "Alice". (Channel row was rendering the stored channel.name, which is the creator-side label.)
Duplicate DM row on explicit + pick. Clicking + → picking an existing DM peer added a second row; it merged after a refresh.
DM briefly renders in the wrong channel. Alice DMs Bob while Bob is viewing #general → the DM flashes into #general, then disappears on refresh. Message is actually persisted in the DM; #general was rendering a stale WS event during the channel-switch race.
Dead delegated click handler (data-start-dm-user-id) left over from a prior "No DMs yet" design.
Typing indicator strip is very thin and easy to miss — noted, deferred.

Fixed (commits 017ab16, 0a17a52):

DM sidebar display now resolves the other participant per-viewer, regardless of who created the DM. Server-side, channel.list inlines members: [user_id, ...] for DM / group-DM channels; the client reads ch.members to pick the peer id and renders the peer's name + avatar. Removed the N-per-DM channel.member.list fan-out that the earlier client-only fix used.
startDm dedupes against an existing local DM row (peer-id match, not name match) before pushing a new entry, and before calling channel.find_dm.
message.created WS handler now drops events whose channel_id doesn't match the currently-viewed channel — closes the async-close race that let the prior channel's events render into the new view.
Removed dead data-start-dm-user-id branch from the DM-list delegate.

Deferred:

Typing indicator visual prominence.

## Round 3 — DM symptoms + fixes **Reported:** 1. *DM name shows wrong on receiver side.* Alice creates a DM with Bob → Bob's sidebar reads "Bob" instead of "Alice". (Channel row was rendering the stored `channel.name`, which is the creator-side label.) 2. *Duplicate DM row on explicit `+` pick.* Clicking + → picking an existing DM peer added a second row; it merged after a refresh. 3. *DM briefly renders in the wrong channel.* Alice DMs Bob while Bob is viewing #general → the DM flashes into #general, then disappears on refresh. Message is actually persisted in the DM; #general was rendering a stale WS event during the channel-switch race. 4. Dead delegated click handler (`data-start-dm-user-id`) left over from a prior "No DMs yet" design. 5. Typing indicator strip is very thin and easy to miss — noted, deferred. **Fixed (commits `017ab16`, `0a17a52`):** - DM sidebar display now resolves the **other participant** per-viewer, regardless of who created the DM. Server-side, `channel.list` inlines `members: [user_id, ...]` for DM / group-DM channels; the client reads `ch.members` to pick the peer id and renders the peer's name + avatar. Removed the N-per-DM `channel.member.list` fan-out that the earlier client-only fix used. - `startDm` dedupes against an existing local DM row (peer-id match, not name match) before pushing a new entry, and before calling `channel.find_dm`. - `message.created` WS handler now drops events whose `channel_id` doesn't match the currently-viewed channel — closes the async-close race that let the prior channel's events render into the new view. - Removed dead `data-start-dm-user-id` branch from the DM-list delegate. **Deferred:** - Typing indicator visual prominence.

sameh-farouk referenced this issue from a commit

2026-04-18 02:51:26 +00:00

fix(ui+server): round 4 — membership/auth/reactions/UX polish (#10)

sameh-farouk commented

2026-04-18 02:51:56 +00:00

Author

Member

Round 4 — dogfooding symptoms + fixes

Reported:

Sidebar leak. Bob saw every channel in the workspace (including Alice's private-ish ones); Browse showed all channels as "Joined"; Bob could post to channels he hadn't joined.
Upload "Authentication required" in --auth-mode=dev. Happened after switching to CLI-flag launch (env var drift between attachment.rs / document.rs / canvas.rs).
Photo-only message fails with field 'content': must not be empty or whitespace-only.
DM badges. Plain number beside name; not realtime — required refresh.
Pinned panel is read-only — no jump-to-message, no unpin.
Channel details panel. Avatar stretched full-width; role rendered as a second word; no Add People / Leave / Remove.
Huddle anchor shows green "Join" even when you're already in the call.
Thread reactions require refresh to reflect; main feed works live.
Create channel modal has Voice / Video options that do nothing (not Slack-shaped; were Discord-style placeholder scaffolding from an earlier infra commit).

Fixed (commit 2e8869d):

channel.list gained optional member_of; sidebar filters to joined; Browse still shows all public.
message.send membership gate now runs on the resolved user_id (not just caller_id), so dev mode enforces membership like proxy.
attachment.rs + document.rs is_dev_mode delegate to permissions::is_dev_mode OnceLock (same fix canvas.rs already had). COLLAB_AUTH_MODE env read was broken under hero_proc's clean-env spawn; argv path works now.
MessageSend.content is optional; handler requires content OR attachments and still length-caps content when present.
Badge styles moved into chat.html's inline <style> — static/css/chat.css is never loaded, so all my earlier edits to it were silent dead code. DM + channel rows now render proper pill + bold name on unread.
Background WS subscriptions per joined channel → unread bumps are realtime instead of waiting 30–120s. Flagged as shortcut; proper fix is a user-scoped multiplexed WS (task #110, scoped for Phase 6 hardening — the N-sockets-per-user model won't scale past small workspaces).
Pinned panel cards are clickable (jump + highlight in feed); each row has an unpin button that syncs local state + main-feed pin indicator.
Channel details panel rewired: 32×32 avatar circles + role pill (amber for admin/owner), Add People modal (search + per-row Add), Leave (confirms, drops sidebar, selects next), per-row Remove on private channels only (Slack-parity — public remove is a speed-bump without a real admin role system).
Huddle card swaps to red Leave when the viewer is in the call and re-renders live via HuddleManager.setOnStateChange.
Thread reactions live-update now. Added findMessageAnywhere(msgId) helper across toggleReaction, updateReactionsDOM, and the message.reacted WS handler — thread replies live in state.threadReplies, which the old path ignored.
Dropped Voice / Video Channel options from the end-user Create modal. Admin dashboard (hero_collab_app) still has them pending separate decision.

## Round 4 — dogfooding symptoms + fixes **Reported:** 1. *Sidebar leak.* Bob saw every channel in the workspace (including Alice's private-ish ones); Browse showed all channels as "Joined"; Bob could post to channels he hadn't joined. 2. *Upload "Authentication required"* in `--auth-mode=dev`. Happened after switching to CLI-flag launch (env var drift between attachment.rs / document.rs / canvas.rs). 3. *Photo-only message fails* with `field 'content': must not be empty or whitespace-only`. 4. *DM badges.* Plain number beside name; not realtime — required refresh. 5. *Pinned panel is read-only* — no jump-to-message, no unpin. 6. *Channel details panel.* Avatar stretched full-width; role rendered as a second word; no Add People / Leave / Remove. 7. *Huddle anchor* shows green "Join" even when you're already in the call. 8. *Thread reactions* require refresh to reflect; main feed works live. 9. *Create channel modal* has Voice / Video options that do nothing (not Slack-shaped; were Discord-style placeholder scaffolding from an earlier infra commit). **Fixed (commit `2e8869d`):** - `channel.list` gained optional `member_of`; sidebar filters to joined; Browse still shows all public. - `message.send` membership gate now runs on the resolved `user_id` (not just `caller_id`), so dev mode enforces membership like proxy. - `attachment.rs` + `document.rs` `is_dev_mode` delegate to `permissions::is_dev_mode` OnceLock (same fix canvas.rs already had). `COLLAB_AUTH_MODE` env read was broken under hero_proc's clean-env spawn; argv path works now. - `MessageSend.content` is optional; handler requires content OR attachments and still length-caps content when present. - Badge styles moved into chat.html's inline `<style>` — `static/css/chat.css` is never loaded, so all my earlier edits to it were silent dead code. DM + channel rows now render proper pill + bold name on unread. - Background WS subscriptions per joined channel → unread bumps are realtime instead of waiting 30–120s. **Flagged as shortcut**; proper fix is a user-scoped multiplexed WS (task #110, scoped for Phase 6 hardening — the N-sockets-per-user model won't scale past small workspaces). - Pinned panel cards are clickable (jump + highlight in feed); each row has an unpin button that syncs local state + main-feed pin indicator. - Channel details panel rewired: 32×32 avatar circles + role pill (amber for admin/owner), **Add People** modal (search + per-row Add), **Leave** (confirms, drops sidebar, selects next), per-row **Remove** on private channels only (Slack-parity — public remove is a speed-bump without a real admin role system). - Huddle card swaps to red **Leave** when the viewer is in the call and re-renders live via `HuddleManager.setOnStateChange`. - Thread reactions live-update now. Added `findMessageAnywhere(msgId)` helper across `toggleReaction`, `updateReactionsDOM`, and the `message.reacted` WS handler — thread replies live in `state.threadReplies`, which the old path ignored. - Dropped Voice / Video Channel options from the end-user Create modal. Admin dashboard (`hero_collab_app`) still has them pending separate decision.

sameh-farouk referenced this issue from a commit

2026-04-18 02:59:33 +00:00

fix(ui): round 5 — DM panel, canvas back-refresh, DM button scope (#10)

sameh-farouk commented

2026-04-18 02:59:59 +00:00

Author

Member

Round 5 — DM panel, canvas back-refresh, DM button scope

Reported:

DM Channel Details header shows MY name instead of the other person. Sub-header reads "Private channel · 2 members" which is nonsense for a 1:1 DM.
Canvas back button leaves chat-tab stale. Rename a canvas → click Back → sidebar still shows "Untitled canvas" until I manually refresh.
DM Channel Details panel shows Add People / Leave / Remove buttons that don't make sense for a DM.

Fixed (commit e908265):

DM Channel Details now resolves the peer name via state.dmPeers (same logic as the sidebar row). Icon → bi-chat-dots-fill. Sub-header → "Direct message" (1:1) or "Group DM · N members" (3+).
Canvas ↔ chat coordination via BroadcastChannel('hero-collab-canvas'). canvas-app posts canvas.changed on delete + on goBack (which also flushes any pending title edit via saveTitle first). chat-app subscribes once at load; any signal triggers loadCanvases(). No server round-trip for the notification. Falls back silently on browsers without the API (baseline = today's manual-refresh behavior).
Add / Leave / Remove buttons hidden when kind is dm or group_dm. Slack parity: 1:1 DMs have no member management, group DMs use "close conversation" (separate feature, not built). Per-row × Remove is scoped to private non-DM channels only.

Still tracked (deferred):

Task #110 — proper user-scoped WS. Current per-channel-fanout implementation works for MVP scale but won't scale past ~250 channels/user due to browser WS caps. Proper fix is /ws/user/{user_id} multiplexed relay in hero_collab_ui; ~150 server + 50 client lines; scheduled for Phase 6 hardening.
Typing-indicator visual prominence (thin strip above composer, easy to miss).

## Round 5 — DM panel, canvas back-refresh, DM button scope **Reported:** 1. *DM Channel Details header shows MY name* instead of the other person. Sub-header reads "Private channel · 2 members" which is nonsense for a 1:1 DM. 2. *Canvas back button leaves chat-tab stale.* Rename a canvas → click Back → sidebar still shows "Untitled canvas" until I manually refresh. 3. *DM Channel Details panel shows Add People / Leave / Remove buttons* that don't make sense for a DM. **Fixed (commit `e908265`):** - DM Channel Details now resolves the peer name via `state.dmPeers` (same logic as the sidebar row). Icon → `bi-chat-dots-fill`. Sub-header → "Direct message" (1:1) or "Group DM · N members" (3+). - Canvas ↔ chat coordination via `BroadcastChannel('hero-collab-canvas')`. canvas-app posts `canvas.changed` on delete + on goBack (which also flushes any pending title edit via `saveTitle` first). chat-app subscribes once at load; any signal triggers `loadCanvases()`. No server round-trip for the notification. Falls back silently on browsers without the API (baseline = today's manual-refresh behavior). - Add / Leave / Remove buttons hidden when `kind` is `dm` or `group_dm`. Slack parity: 1:1 DMs have no member management, group DMs use "close conversation" (separate feature, not built). Per-row × Remove is scoped to private non-DM channels only. **Still tracked (deferred):** - Task #110 — **proper user-scoped WS**. Current per-channel-fanout implementation works for MVP scale but won't scale past ~250 channels/user due to browser WS caps. Proper fix is `/ws/user/{user_id}` multiplexed relay in hero_collab_ui; ~150 server + 50 client lines; scheduled for Phase 6 hardening. - Typing-indicator visual prominence (thin strip above composer, easy to miss).

sameh-farouk commented

2026-04-20 02:40:42 +00:00

Author

Member

Canvas editor UX improvements landed on development (5e0d9c2):

Bubble menu — text styles, sub/super, color, font, alignment
Typography auto-replace (smart quotes, em-dashes, etc.)
Tables — Notion-style hover size picker + contextual toolbar (add/delete row-col, merge, header row)
Callouts + internal embeds for message/canvas references
Images — upload via button or drag-drop, with resize/align/replace/alt/delete via custom NodeView
Canvas-scoped attachment storage — new canvas_id column, HTTP byte route with access control, cascade on canvas delete, cleanup reaper protects canvas rows

Tests green (40 unit + 33 integration). Pushed.

**Canvas editor UX improvements** landed on `development` ([`5e0d9c2`](https://forge.ourworld.tf/lhumina_code/hero_collab/commit/5e0d9c2)): - **Bubble menu** — text styles, sub/super, color, font, alignment - **Typography auto-replace** (smart quotes, em-dashes, etc.) - **Tables** — Notion-style hover size picker + contextual toolbar (add/delete row-col, merge, header row) - **Callouts + internal embeds** for message/canvas references - **Images** — upload via button or drag-drop, with resize/align/replace/alt/delete via custom NodeView - **Canvas-scoped attachment storage** — new `canvas_id` column, HTTP byte route with access control, cascade on canvas delete, cleanup reaper protects canvas rows Tests green (40 unit + 33 integration). Pushed.

sameh-farouk commented

2026-04-26 19:14:42 +00:00

Author

Member

Canvas read-only viewer — client-side gating gap

Found while dogfooding: for users with read-only access to a canvas (channel guests / non-editors), the client still renders hover menus and editing toolbars. Clicking them is harmless — the server fails the request — but it's a confusing UX and looks broken.

Not a security issue. The server enforces correctly via check_canvas_access(write_required: true) at 8 sites in canvas.rs, so write attempts are rejected regardless of UI state.

Scope of the fix: role-gate the canvas UI affordances (toolbars, hover menus, drag handles) on the client based on the user's effective canvas role, mirroring how the server gates the RPCs.

Queueing for the next pass — not today.

### Canvas read-only viewer — client-side gating gap Found while dogfooding: for users with read-only access to a canvas (channel guests / non-editors), the client still renders hover menus and editing toolbars. Clicking them is harmless — the server fails the request — but it's a confusing UX and looks broken. **Not a security issue.** The server enforces correctly via `check_canvas_access(write_required: true)` at 8 sites in `canvas.rs`, so write attempts are rejected regardless of UI state. **Scope of the fix:** role-gate the canvas UI affordances (toolbars, hover menus, drag handles) on the client based on the user's effective canvas role, mirroring how the server gates the RPCs. Queueing for the next pass — not today.

sameh-farouk commented

2026-04-26 19:42:03 +00:00

Author

Member

Update: empirical test changes the framing — UX is worse than "harmless"

Followed up on this with a live test in proxy mode against 127.0.0.1:

Created a viewer_test user with viewer role on canvas 1 (sameh's canvas, populated with content).
Mapped 127.0.0.1/32, ::1/128, ::ffff:127.0.0.1/128 in proxy.user_networks to the viewer.
Hard-refreshed the browser (auto-logged in as the viewer via proxy IP-auth).

Server side — defends correctly (two layers, both fire)

WS canvas handler — routes.rs:1072 (hero_collab_ui): viewer's Sync(Update) and Sync(SyncStep2) frames are dropped before apply_update / broadcast / persist. is_editor is computed from the canvas role at WS-upgrade time (routes.rs:825).
canvas.save_state RPC — canvas.rs:740: write-required check_canvas_access returns -32002 View-only access.

Direct rpc.sock probes as viewer_test (X-Hero-User injection bypassed, simulating a malicious client):

Method	Result
`canvas.update` (title, is_public)	`-32002 View-only access: cannot edit canvas 1`
`canvas.save_state` (CRDT BLOB)	`-32002 View-only access: cannot edit canvas 1`
`canvas.share` / `canvas.unshare` / `canvas.update_role`	`-32002 View-only access`
`canvas.delete`	`-32002 Only the canvas creator can delete it`

DB verified unchanged after the probes and after the viewer's UI session: canvas_state.updated_at did not advance, canvases.title/is_public unchanged, canvas_collaborators intact.

UX side — strictly worse than "broken-looking buttons"

When the viewer clicks toolbar buttons (color, format, etc.), the local Yjs Doc accepts the txn and renders the change. The server-side WS handler then silently drops the broadcast, so the change is never persisted, never echoed to other clients, and disappears on refresh.

This is worse than disabled buttons. The viewer sees:

Click "bold" → text turns bold immediately.
Refresh → text is back to normal.

That reads as data loss ("the system saved my edit and then lost it"), not as read-only enforcement ("you don't have permission to edit"). For a collaborative tool this is the worse failure mode — it erodes trust in saves overall, not just on this canvas.

Implication for the fix

Scope is broader than "role-gate the toolbars":

Open the Yjs Doc in read-only mode for viewers (so local txns don't apply either) — this is the Yjs-idiomatic answer.
AND hide / dim the toolbars + hover menus + drag handles; surface a "View only" badge.
Both layers: read-only Doc prevents the data-loss illusion even if a toolbar slips through; the visual gating prevents the user from trying in the first place.

Still a UX-only bug — server is correct. But the urgency is higher than my original comment implied; queueing it for the next pass stands, but it should be near the top of the queue.

— Tested 2026-04-26.

### Update: empirical test changes the framing — UX is worse than "harmless" Followed up on this with a live test in proxy mode against `127.0.0.1`: - Created a `viewer_test` user with `viewer` role on canvas 1 (sameh's canvas, populated with content). - Mapped `127.0.0.1/32`, `::1/128`, `::ffff:127.0.0.1/128` in `proxy.user_networks` to the viewer. - Hard-refreshed the browser (auto-logged in as the viewer via proxy IP-auth). #### Server side — defends correctly (two layers, both fire) 1. **WS canvas handler** — `routes.rs:1072` (`hero_collab_ui`): viewer's `Sync(Update)` and `Sync(SyncStep2)` frames are dropped before `apply_update` / broadcast / persist. `is_editor` is computed from the canvas role at WS-upgrade time (`routes.rs:825`). 2. **`canvas.save_state` RPC** — `canvas.rs:740`: write-required `check_canvas_access` returns `-32002 View-only access`. Direct rpc.sock probes as `viewer_test` (X-Hero-User injection bypassed, simulating a malicious client): | Method | Result | |---|---| | `canvas.update` (title, is_public) | `-32002 View-only access: cannot edit canvas 1` | | `canvas.save_state` (CRDT BLOB) | `-32002 View-only access: cannot edit canvas 1` | | `canvas.share` / `canvas.unshare` / `canvas.update_role` | `-32002 View-only access` | | `canvas.delete` | `-32002 Only the canvas creator can delete it` | DB verified unchanged after the probes and after the viewer's UI session: `canvas_state.updated_at` did not advance, `canvases.title`/`is_public` unchanged, `canvas_collaborators` intact. #### UX side — strictly worse than "broken-looking buttons" When the viewer clicks toolbar buttons (color, format, etc.), the **local Yjs Doc accepts the txn and renders the change**. The server-side WS handler then silently drops the broadcast, so the change is never persisted, never echoed to other clients, and **disappears on refresh**. This is *worse than disabled buttons*. The viewer sees: 1. Click "bold" → text turns bold immediately. 2. Refresh → text is back to normal. That reads as **data loss** ("the system saved my edit and then lost it"), not as **read-only enforcement** ("you don't have permission to edit"). For a collaborative tool this is the worse failure mode — it erodes trust in saves overall, not just on this canvas. #### Implication for the fix Scope is broader than "role-gate the toolbars": - Open the Yjs Doc in **read-only mode** for viewers (so local txns don't apply either) — this is the Yjs-idiomatic answer. - AND hide / dim the toolbars + hover menus + drag handles; surface a "View only" badge. - Both layers: read-only Doc prevents the data-loss illusion even if a toolbar slips through; the visual gating prevents the user from trying in the first place. Still a UX-only bug — server is correct. But the urgency is higher than my original comment implied; queueing it for the next pass stands, but it should be near the top of the queue. — Tested 2026-04-26.

sameh-farouk referenced this issue

2026-04-27 16:34:04 +00:00

Canvas read-only viewer: data-loss illusion (local edits vanish on refresh) #26

sameh-farouk referenced this issue

2026-04-27 16:36:22 +00:00

fix(canvas): enforce read-only viewer mode at editor layer #27

sameh-farouk commented

2026-04-27 16:37:08 +00:00

Author

Member

Update on the canvas read-only viewer thread

The canvas read-only viewer issue from the comments above is now fixed and being tracked in #26 (PR #27).

Empirical verification with the merged code, end-to-end, in proxy mode against 127.0.0.1 mapped to a viewer_test user with viewer role on canvas 1:

Bubble menu doesn't appear on selection (setup IIFE skipped for viewers).
Top toolbar / share button / table-toolbar / popovers all hidden via body[data-canvas-mode="viewer"] CSS.
"View only" badge visible in the topbar.
Format/color/bold buttons (where reachable via console) blocked by filterTransaction ProseMirror plugin — even programmatic editor.chain().toggleBold().run() from DevTools no longer mutates the local Doc. Yjs sync transactions from collaborators flow through unaffected.
Live updates from editors visible in real time.
DB canvas_state.updated_at unchanged after click-spam → refresh shows no reverted state because there was no state to revert.

Two side-bugs surfaced in this thread, status:

loadChannels auto-creates #general for any zero-channel user (chat-app.js:962) — separate from canvas, not addressed by this PR. Worth filing as its own issue with the proposed fix from above (move default-channel seeding into workspace.create, remove the client fallback).
Stale hero_proxy_server binary (Apr 25 build, predating PR #30 + #33) — a workstation-environment issue that surfaced during testing, not a code bug. Resolved by rebuilding + reinstalling the proxy. Worth keeping in mind that "stack restart" should sometimes mean "rebuild + restart" after merging cross-repo PRs.

The original "queued for next pass" comment is now resolved.

### Update on the canvas read-only viewer thread The canvas read-only viewer issue from the comments above is now fixed and being tracked in #26 (PR #27). Empirical verification with the merged code, end-to-end, in proxy mode against `127.0.0.1` mapped to a `viewer_test` user with `viewer` role on canvas 1: - Bubble menu doesn't appear on selection (setup IIFE skipped for viewers). - Top toolbar / share button / table-toolbar / popovers all hidden via `body[data-canvas-mode="viewer"]` CSS. - "View only" badge visible in the topbar. - Format/color/bold buttons (where reachable via console) blocked by `filterTransaction` ProseMirror plugin — even programmatic `editor.chain().toggleBold().run()` from DevTools no longer mutates the local Doc. Yjs sync transactions from collaborators flow through unaffected. - Live updates from editors visible in real time. - DB `canvas_state.updated_at` unchanged after click-spam → refresh shows no reverted state because there was no state to revert. #### Two side-bugs surfaced in this thread, status: 1. **`loadChannels` auto-creates `#general` for any zero-channel user** (chat-app.js:962) — separate from canvas, not addressed by this PR. Worth filing as its own issue with the proposed fix from above (move default-channel seeding into `workspace.create`, remove the client fallback). 2. **Stale `hero_proxy_server` binary** (Apr 25 build, predating PR #30 + #33) — a workstation-environment issue that surfaced during testing, not a code bug. Resolved by rebuilding + reinstalling the proxy. Worth keeping in mind that "stack restart" should sometimes mean "rebuild + restart" after merging cross-repo PRs. The original "queued for next pass" comment is now resolved.

sameh-farouk referenced this issue

2026-04-27 16:48:38 +00:00

Canvas image GET returns 401 in proxy mode (X-Hero-User parsed as u64, not external_id) #28

sameh-farouk referenced this issue

2026-04-27 16:48:39 +00:00

fix(attachment): resolve string X-Hero-User in /api/attachment/{id} #29

sameh-farouk referenced this issue

2026-04-27 17:01:45 +00:00

DM creation blocked for non-admin workspace members (channel.create gate too strict) #30

sameh-farouk referenced this issue

2026-04-27 17:01:46 +00:00

Reaction count shows 2 instead of 1 for own message (optimistic-update vs WS-broadcast race) #31

sameh-farouk commented

2026-04-27 17:25:51 +00:00

Author

Member

Update — three more dogfooding bugs, all fixed and PRs filed

While testing the canvas read-only fix from earlier, three additional bugs surfaced. All have separate issues + PRs (kept scoped so each is reviewable independently):

#	Issue	PR	Symptom
#28	Canvas image GET 401 in proxy mode	#29	Every canvas image upload renders as a broken placeholder; `GET /api/attachment/{id}` returns 401
#30	DM creation blocked for non-admins	#32	Clicking a user in DM picker → `caller '<id>' lacks permission for 'channel.create'`
#31	Reaction count = 2 instead of 1	#33	Reacting to own message renders 2; client optimistic update + WS broadcast race

Common pattern

All three are honest behavioural bugs (not security issues): server is authoritative and correct in each case, but the client UI or a server-side gate disagrees with the actual contract. Pre-existing in main, surfaced by viewer-account dogfooding.

Root causes summary

#28 (attachment 401): attachment_http.rs parsed X-Hero-User as u64, dropping the proxy-mode email-string form. Now mirrors the JSON-RPC dispatcher's resolution chain (parse::<u64> → caller_id_cache → external_id → email/alias).
#30 (DM permission): channel.create admin-gated every kind including DMs. Now splits by kind: public/private remain admin-gated; DM/group_dm only require workspace membership.
#31 (reaction race): optimistic push after await rpc() was not idempotent against onMessageReacted's WS-driven authoritative replace. Now checks for presence before pushing.

The DM PR also drops two redundant channel.member.add calls in startDm that were leftover from before channel.create started auto-adding members for DMs — they were tripping the server's "user_id matches caller_id" guard and surfacing as a separate console error even though DMs worked.

Status of stack on the test workstation

hero_collab_server rebuilt + restarted with the attachment fix + DM permission fix.
hero_collab_ui rebuilt + restarted with the reaction idempotency fix + startDm cleanup.
All three behaviours verified end-to-end as viewer_test.

### Update — three more dogfooding bugs, all fixed and PRs filed While testing the canvas read-only fix from earlier, three additional bugs surfaced. All have separate issues + PRs (kept scoped so each is reviewable independently): | # | Issue | PR | Symptom | |---|---|---|---| | #28 | [Canvas image GET 401 in proxy mode](https://forge.ourworld.tf/lhumina_code/hero_collab/issues/28) | [#29](https://forge.ourworld.tf/lhumina_code/hero_collab/pulls/29) | Every canvas image upload renders as a broken placeholder; `GET /api/attachment/{id}` returns 401 | | #30 | [DM creation blocked for non-admins](https://forge.ourworld.tf/lhumina_code/hero_collab/issues/30) | [#32](https://forge.ourworld.tf/lhumina_code/hero_collab/pulls/32) | Clicking a user in DM picker → `caller '<id>' lacks permission for 'channel.create'` | | #31 | [Reaction count = 2 instead of 1](https://forge.ourworld.tf/lhumina_code/hero_collab/issues/31) | [#33](https://forge.ourworld.tf/lhumina_code/hero_collab/pulls/33) | Reacting to own message renders 2; client optimistic update + WS broadcast race | #### Common pattern All three are honest behavioural bugs (not security issues): server is authoritative and correct in each case, but the client UI or a server-side gate disagrees with the actual contract. Pre-existing in main, surfaced by viewer-account dogfooding. #### Root causes summary - **#28** (attachment 401): `attachment_http.rs` parsed `X-Hero-User` as `u64`, dropping the proxy-mode email-string form. Now mirrors the JSON-RPC dispatcher's resolution chain (`parse::<u64>` → `caller_id_cache` → `external_id` → `email`/`alias`). - **#30** (DM permission): `channel.create` admin-gated every kind including DMs. Now splits by kind: public/private remain admin-gated; DM/group_dm only require workspace membership. - **#31** (reaction race): optimistic `push` after `await rpc()` was not idempotent against `onMessageReacted`'s WS-driven authoritative replace. Now checks for presence before pushing. The DM PR also drops two redundant `channel.member.add` calls in `startDm` that were leftover from before `channel.create` started auto-adding members for DMs — they were tripping the server's "user_id matches caller_id" guard and surfacing as a separate console error even though DMs worked. #### Status of stack on the test workstation - `hero_collab_server` rebuilt + restarted with the attachment fix + DM permission fix. - `hero_collab_ui` rebuilt + restarted with the reaction idempotency fix + startDm cleanup. - All three behaviours verified end-to-end as `viewer_test`.