hero_router AI agent: chained workspace/board creation on hero_whiteboard fails #107

New issue

Closed

opened 2026-04-11 02:10:24 +00:00 by mik-tf · 2 comments

mik-tf commented

2026-04-11 02:10:24 +00:00

Owner

Problem

When the hero_router per-service dashboard's AI Agent tab is used against hero_whiteboard, chained operations that involve workspaces and boards are unreliable.

Reported by @kristof on 2026-04-09 with two concrete reproductions (screenshots in chat).

Reproduction 1 — FOREIGN KEY failure

Prompt:

create new whiteboard and make 5 boxes in different colors
do this in workspace "Test Board"

Observed: Agent completes "successfully" creating a whiteboard called Test Board, but then:

FOREIGN KEY constraint failed

when adding the 5 boxes. Whiteboard's SQL schema has boards.workspace_id REFERENCES workspaces(id) — the AI-generated Python passed a workspace_id that doesn't exist (it conflated "Test Board" as both workspace and board name, or hallucinated the id).

Reproduction 2 — Boxes land on wrong board

Follow-up prompt: i dont see them / is for board 2

Response: "Looks like the 10 colored boxes you requested were successfully added, but they were placed on 'Board 1' instead of 'Board 2'"

The agent had no cross-turn context; its Python re-resolved board names from scratch and picked the wrong id.

Root cause

This is not a routing/URL issue — the router fix from session 18 doesn't touch it. The failures are in:

hero_router/crates/hero_router/src/server/agent.rs — the prompt template and Python execution harness
The LLM prompt doesn't force a lookup-or-create pattern before inserting dependent rows
The executor runs Python per-request with no memory, so follow-up prompts lose context ("last board created")
hero_whiteboard RPC methods accept only numeric workspace_id / board_id, so string names from the AI get coerced to 0 / hallucinated

Fix levers (pick one or more)

Prompt engineering (easiest, low risk) — rewrite the agent's system prompt to:
- Always resolve a workspace/board by name first (workspace.list → filter by name → use id)
- Create the workspace if not found, THEN the board, THEN the objects
- Include 1–2 worked examples for the whiteboard domain
Convenience API on hero_whiteboard (medium) — add name-based helper methods that auto-resolve and auto-create:
- workspace.get_or_create(name) -> id
- board.create_by_name(workspace_name, board_name) -> id
- object.create_on_board(board_name, ...)
  Then the AI's naïve "do X in Test Board" just works.
Session state in hero_router agent (medium-hard) — keep a per-session cache of last_workspace_id, last_board_id so follow-up prompts inherit context instead of starting fresh.

Kristof also mentioned in chat: "the navigation is still off e.g. dropdown" and "workspaces have boards". This is a separate UI bug in the whiteboard island (hero_whiteboard_app / hero_whiteboard_ui) — dropdown not respecting the workspace → board hierarchy. Needs a separate investigation, possibly its own issue.

#104 (Dioxus SPA migration)
hero_whiteboard/crates/hero_whiteboard_server/src/db/queries.rs (schema enforces FK)
hero_whiteboard/crates/hero_whiteboard_server/src/migrations/004_u64_ids.sql (INTEGER ids, strict FKs)

Suggested order

Start with lever 1 (prompt engineering) — fast win, single-file change in hero_router/crates/hero_router/src/server/agent.rs. Measure with the exact two prompts above. If still flaky, add lever 2 for the whiteboard-specific convenience surface.

## Problem When the hero_router per-service dashboard's **AI Agent** tab is used against `hero_whiteboard`, chained operations that involve workspaces and boards are unreliable. Reported by @kristof on 2026-04-09 with two concrete reproductions (screenshots in chat). ### Reproduction 1 — FOREIGN KEY failure **Prompt:** ``` create new whiteboard and make 5 boxes in different colors do this in workspace "Test Board" ``` **Observed:** Agent completes "successfully" creating a whiteboard called `Test Board`, but then: ``` FOREIGN KEY constraint failed ``` when adding the 5 boxes. Whiteboard's SQL schema has `boards.workspace_id REFERENCES workspaces(id)` — the AI-generated Python passed a workspace_id that doesn't exist (it conflated "Test Board" as both workspace and board name, or hallucinated the id). ### Reproduction 2 — Boxes land on wrong board **Follow-up prompt:** `i dont see them / is for board 2` **Response:** *"Looks like the 10 colored boxes you requested were successfully added, but they were placed on 'Board 1' instead of 'Board 2'"* The agent had no cross-turn context; its Python re-resolved board names from scratch and picked the wrong id. ## Root cause This is not a routing/URL issue — the router fix from session 18 doesn't touch it. The failures are in: - **`hero_router/crates/hero_router/src/server/agent.rs`** — the prompt template and Python execution harness - The LLM prompt doesn't force a *lookup-or-create* pattern before inserting dependent rows - The executor runs Python per-request with no memory, so follow-up prompts lose context ("last board created") - `hero_whiteboard` RPC methods accept only numeric `workspace_id` / `board_id`, so string names from the AI get coerced to 0 / hallucinated ## Fix levers (pick one or more) 1. **Prompt engineering (easiest, low risk)** — rewrite the agent's system prompt to: - Always resolve a workspace/board by name first (`workspace.list` → filter by name → use id) - Create the workspace if not found, THEN the board, THEN the objects - Include 1–2 worked examples for the whiteboard domain 2. **Convenience API on hero_whiteboard** (medium) — add name-based helper methods that auto-resolve and auto-create: - `workspace.get_or_create(name) -> id` - `board.create_by_name(workspace_name, board_name) -> id` - `object.create_on_board(board_name, ...)` Then the AI's naïve "do X in Test Board" just works. 3. **Session state in hero_router agent** (medium-hard) — keep a per-session cache of `last_workspace_id`, `last_board_id` so follow-up prompts inherit context instead of starting fresh. ## Navigation/dropdown issue (separate) Kristof also mentioned in chat: *"the navigation is still off e.g. dropdown"* and *"workspaces have boards"*. This is a separate UI bug in the whiteboard island (hero_whiteboard_app / hero_whiteboard_ui) — dropdown not respecting the workspace → board hierarchy. Needs a separate investigation, possibly its own issue. ## Related - https://forge.ourworld.tf/lhumina_code/home/issues/104 (Dioxus SPA migration) - `hero_whiteboard/crates/hero_whiteboard_server/src/db/queries.rs` (schema enforces FK) - `hero_whiteboard/crates/hero_whiteboard_server/src/migrations/004_u64_ids.sql` (INTEGER ids, strict FKs) ## Suggested order Start with lever 1 (prompt engineering) — fast win, single-file change in `hero_router/crates/hero_router/src/server/agent.rs`. Measure with the exact two prompts above. If still flaky, add lever 2 for the whiteboard-specific convenience surface.

mik-tf commented

2026-04-11 03:20:13 +00:00

Author

Owner

Status — deferred to a dedicated session

Explored lever 1 (prompt engineering in hero_router/crates/hero_router/src/server/agent.rs) during the session 18 work. Drafted and applied a RESOURCE RESOLUTION rules block + a whiteboard-specific worked Python example. Reverted before shipping on the following grounds:

A whiteboard-specific worked example in a generic agent prompt pollutes every other service's system prompt and wastes tokens.
It would address Bug #1 (FK failure) in roughly 80% of cases but does nothing for Bug #2 (follow-up prompts place boxes on the wrong board, because each POST /:service/agent runs a fresh Python interpreter with zero memory of prior requests).
Kristof's third complaint — the whiteboard UI dropdown/navigation bugs — is a completely separate front-end investigation in hero_whiteboard_app.

Session 18 shipped the infrastructure around this (router /api/services JSON endpoint, IslandContext::forge_url() fix so the whiteboard island can actually reach its backend through the new socket-type router pattern). The underlying whiteboard plumbing is now reachable — the AI agent behaviour and UX polish are the remaining work.

Recommended scope for the next session

Per-service hints mechanism in RouterCache — add an optional service_hints: String to ServiceEntry and inject it into codegen_system only when building the prompt for that service. Whiteboard gets a lookup-or-create hint with a worked example; other services get nothing.
Bug #2 — cross-request memory: add a session id scoped conversation history in RouterCache so follow-up prompts on the same agent tab get replayed as chat turns to call_llm. Alternative: frontend-side context injection where the agent tab tracks last-created ids per service and prepends them to follow-up prompts.
Lever 2 — name-based convenience API on hero_whiteboard: workspace.get_or_create(name), board.create_by_name(workspace_name, board_name), object.create_on_board_by_name. Reduces the number of chained calls the AI has to get right and helps non-agent SDK users too.
Whiteboard UI dropdown/navigation bug — file as a separate issue once reproduced; needs a fresh look at hero_whiteboard_app and hero_whiteboard_ui components.
Full whiteboard UX redesign — per Kristof's feedback the current workspace → board → object model doesn't translate well to the current navigation. Scope this as a separate design pass before touching code.

Tracking this issue as the umbrella; split into smaller issues once the next session starts and the scope per item is clearer.

Signed-off-by: mik-tf

## Status — deferred to a dedicated session Explored lever 1 (prompt engineering in `hero_router/crates/hero_router/src/server/agent.rs`) during the session 18 work. Drafted and applied a `RESOURCE RESOLUTION` rules block + a whiteboard-specific worked Python example. **Reverted before shipping** on the following grounds: - A whiteboard-specific worked example in a generic agent prompt pollutes every other service's system prompt and wastes tokens. - It would address Bug #1 (FK failure) in roughly 80% of cases but does nothing for Bug #2 (follow-up prompts place boxes on the wrong board, because each POST `/:service/agent` runs a fresh Python interpreter with zero memory of prior requests). - Kristof's third complaint — the whiteboard UI dropdown/navigation bugs — is a completely separate front-end investigation in `hero_whiteboard_app`. Session 18 shipped the infrastructure around this (router `/api/services` JSON endpoint, `IslandContext::forge_url()` fix so the whiteboard island can actually reach its backend through the new socket-type router pattern). The underlying whiteboard plumbing is now reachable — the AI agent behaviour and UX polish are the remaining work. ## Recommended scope for the next session 1. **Per-service hints mechanism in `RouterCache`** — add an optional `service_hints: String` to `ServiceEntry` and inject it into `codegen_system` only when building the prompt for that service. Whiteboard gets a lookup-or-create hint with a worked example; other services get nothing. 2. **Bug #2 — cross-request memory**: add a session id scoped conversation history in `RouterCache` so follow-up prompts on the same agent tab get replayed as chat turns to `call_llm`. Alternative: frontend-side context injection where the agent tab tracks last-created ids per service and prepends them to follow-up prompts. 3. **Lever 2 — name-based convenience API on hero_whiteboard**: `workspace.get_or_create(name)`, `board.create_by_name(workspace_name, board_name)`, `object.create_on_board_by_name`. Reduces the number of chained calls the AI has to get right and helps non-agent SDK users too. 4. **Whiteboard UI dropdown/navigation bug** — file as a separate issue once reproduced; needs a fresh look at `hero_whiteboard_app` and `hero_whiteboard_ui` components. 5. **Full whiteboard UX redesign** — per Kristof's feedback the current workspace → board → object model doesn't translate well to the current navigation. Scope this as a separate design pass before touching code. Tracking this issue as the umbrella; split into smaller issues once the next session starts and the scope per item is clearer. Signed-off-by: mik-tf