Multimodal image attachments + history restore regressions on development HEAD #58

Closed
opened 2026-05-10 23:33:28 +00:00 by mik-tf · 1 comment
Owner

Two regressions surfaced during manual UI testing on origin/development (HEAD ff1fca7) after the recent generator.rs collapse + aibroker param-forwarding changes. Filing this so next session has a clear scope to investigate and fix.

Context — what is known to work

Validated end-to-end on local from-remote clean rebuild (see PR #57):

  • Single-slide text-only generation via CLI (hero_slides --run generate_slide <deck> 04_sacha → real Gemini Flash Image PNG, 19 s, 4.8 MB).
  • ADR-0007 hash-based staleness — is_stale=false after generate, content_edited=true after .md edit.
  • deck.list JSON carries first_slide (the field added in #57).
  • folder.pick returns ~/Documents allowlist fallback correctly.
  • Aibroker reads provider keys from hero_proc secrets context core at startup (Initialized 3 cloud providers, Loaded 35 models).
  • Browser admin UI at http://127.0.0.1:9988/hero_slides/admin/ renders deck listings + generated slide thumbnails correctly.

Regression 1 — image attachments are ignored, model fabricates

Symptom (reported by despiegk): generating a slide that references a logo image (inline ![logo](path/to/logo.png), theme image ref, or background image) produces a slide where the model invents a plausible-looking logo instead of placing the attached image.

Quote: "can't get the logo to show up... I think still wrong how we ask to generate images. doesn't work any more, used to. it just makes up logos so not working. we carefully need to check how we ask to generate images with links of logos in there or in context images. damned this all worked. need to carefully check how we pass data of image, goes wrong there. see it didn't get the image, made it up."

Likely failure mode: the multimodal request reaches OpenRouter with the framing prompt ("ATTACHED IMAGES — READ THE SLIDE CONTENT BELOW...") but with no image_url content parts attached, so the model receives "use them AS-IS" instructions with nothing to use, and hallucinates a logo from the surrounding prompt text.

Code paths to inspect (in order of likelihood):

  1. herolib (hero_lib) recent fix from despiegk — boss noted he made fixes "on broker, as well as herolib and in slides". The herolib change is the most likely culprit because it sits on the image-data wire: image_request().add_image_ref_with() + parse_all_images() build/parse the multimodal message. Need to inspect what shipped on hero_lib/development after our last pull.

  2. hero_slides 3b89ec2 refactor(generator): remove text-only fallback path, always use multimodal builder — collapsed generator.rs to a single client.image_request()....execute() builder path. Currently uses framed_prompt unconditionally. Behaviour change vs the prior has_any_images branch shouldn't affect WITH-attachments case, but worth verifying the for-loops over inline_attachments / theme_ref_attachments / background_images still actually call add_image_ref_with.

  3. hero_aibroker 2dca2ea fix(server): pass all caller-supplied image params through to provider — replaced hardcoded forwarding of size/n/modalities with a "copy every param except model/prompt" generic loop on the ai.image RPC. Less likely (ai.chat is the multimodal path), but a generic-loop param forwarder can drop fields it shouldn't if the iteration is wrong.

Reproduction recipe for next session:

  1. Pick or create a slide in examples/sample_deck with an inline image ref like ![logo](./content/somelogo.png).
  2. Run RUST_LOG=debug ~/hero/bin/hero_slides --run generate_slide examples/sample_deck <slide_name>.
  3. Tail hero_proc service logs hero_aibroker for the outgoing HTTP body to OpenRouter — confirm whether messages[0].content is a string or an array with {type: 'image_url', image_url: {url: 'data:image/png;base64,...'}} content parts.
  4. Compare against OpenRouter docs: https://openrouter.ai/docs/guides/overview/multimodal/image-generation
  5. Walk the chain backwards from wherever the image data disappears.

Regression 2 — history restore broken

Symptom (reported by despiegk): "hmmm restore of history does no longer work"

Suspected scope: the slide version restore endpoint (whichever RPC reads from output/.versions/<slide_name>/v<NNN>.png and copies the chosen version back to output/<slide_name>.png).

Code paths to inspect:

  • crates/hero_slides_server/src/rpc.rs — handlers for version listing + restore (search for versions, restore, .versions)
  • crates/hero_slides_lib/src/deck.rs — versioning helpers
  • Could be a symptom of the fallback_marker_path cleanup in 3b89ec2 if any restore path was reading that marker.

Reproduction recipe:

  1. Generate a slide twice (regenerate forces a new v<NNN>.png snapshot to .versions/).
  2. Try to restore the prior version via the UI (browser dashboard has a versions panel) or via JSON-RPC against ~/hero/var/sockets/hero_slides/rpc.sock.
  3. Observe the failure mode (error response, silent no-op, etc.).

Other untested paths flagged by despiegk for the next pass

"lets check other things — copy, duplicate... test a lot please"

The following are confirmed not exercised in either PR #56/#57 or this session's validation:

  • Slide copy / duplicate (whichever RPCs back the UI "copy" action)
  • Slide Wizard end-to-end on a fresh deck (boss exercised this manually but result wasn't checked for regressions vs prior behaviour)
  • PDF export (export_pdf built-in Rhai script + slides.pdf generation)
  • Multi-slide --generate <collection/deck> (whole-deck Generate All, not just single-slide)
  • Background folder selection (the wizard step that lets you pick which background subfolders apply to which slides)
  • Theme extraction (bg.extractTheme)
  • Per-slide image-model override (slide.setImageModel)

Where this came from

During this session we cleaned up a build break on hero_slides_server caused by a bad merge (7c3c42a left duplicate handle_folder_pick definitions), and Kristof in parallel did a deeper refactor of generator.rs + aibroker param forwarding + a herolib fix. Our PR #57 added first_slide field + --info JSON cleanup, squash-merged at commit ff1fca7. Validated text-only generation end-to-end but did NOT exercise the multimodal image-input or history-restore paths Kristof later flagged broken.

What works on the live local stack right now

  • ~/hero/bin/hero_slides_server running on ff1fca7 build
  • ~/hero/bin/hero_aibroker_server running on 2dca2ea build with hero_proc secrets loaded
  • Admin UI at http://127.0.0.1:9988/hero_slides/admin/
  • Boss's manual UI test confirmed: deck listing, slide regenerate (text-only), Slide Wizard step 3, slide presentation rendering at e.g. /hero_slides/ui/#collections/ourworld/decks/nile_people

Signed-off-by: mik-tf

Two regressions surfaced during manual UI testing on `origin/development` (HEAD `ff1fca7`) after the recent generator.rs collapse + aibroker param-forwarding changes. Filing this so next session has a clear scope to investigate and fix. ## Context — what is known to work Validated end-to-end on local from-remote clean rebuild (see PR #57): - Single-slide **text-only** generation via CLI (`hero_slides --run generate_slide <deck> 04_sacha` → real Gemini Flash Image PNG, 19 s, 4.8 MB). - ADR-0007 hash-based staleness — `is_stale=false` after generate, `content_edited=true` after `.md` edit. - `deck.list` JSON carries `first_slide` (the field added in #57). - `folder.pick` returns `~/Documents` allowlist fallback correctly. - Aibroker reads provider keys from `hero_proc` secrets context `core` at startup (`Initialized 3 cloud providers`, `Loaded 35 models`). - Browser admin UI at `http://127.0.0.1:9988/hero_slides/admin/` renders deck listings + generated slide thumbnails correctly. ## Regression 1 — image attachments are ignored, model fabricates **Symptom (reported by despiegk):** generating a slide that references a logo image (inline `![logo](path/to/logo.png)`, theme image ref, or background image) produces a slide where the model **invents** a plausible-looking logo instead of placing the attached image. Quote: *"can't get the logo to show up... I think still wrong how we ask to generate images. doesn't work any more, used to. it just makes up logos so not working. we carefully need to check how we ask to generate images with links of logos in there or in context images. damned this all worked. need to carefully check how we pass data of image, goes wrong there. see it didn't get the image, made it up."* **Likely failure mode:** the multimodal request reaches OpenRouter with the framing prompt ("ATTACHED IMAGES — READ THE SLIDE CONTENT BELOW...") but with **no image_url content parts attached**, so the model receives "use them AS-IS" instructions with nothing to use, and hallucinates a logo from the surrounding prompt text. **Code paths to inspect (in order of likelihood):** 1. **herolib (`hero_lib`) recent fix from despiegk** — boss noted he made fixes "on broker, as well as herolib and in slides". The herolib change is the most likely culprit because it sits on the image-data wire: `image_request().add_image_ref_with()` + `parse_all_images()` build/parse the multimodal message. Need to inspect what shipped on `hero_lib/development` after our last pull. 2. **`hero_slides` `3b89ec2 refactor(generator): remove text-only fallback path, always use multimodal builder`** — collapsed generator.rs to a single `client.image_request()....execute()` builder path. Currently uses `framed_prompt` unconditionally. Behaviour change vs the prior `has_any_images` branch shouldn't affect WITH-attachments case, but worth verifying the for-loops over `inline_attachments` / `theme_ref_attachments` / `background_images` still actually call `add_image_ref_with`. 3. **`hero_aibroker` `2dca2ea fix(server): pass all caller-supplied image params through to provider`** — replaced hardcoded forwarding of `size/n/modalities` with a "copy every param except model/prompt" generic loop on the `ai.image` RPC. Less likely (`ai.chat` is the multimodal path), but a generic-loop param forwarder can drop fields it shouldn't if the iteration is wrong. **Reproduction recipe for next session:** 1. Pick or create a slide in `examples/sample_deck` with an inline image ref like `![logo](./content/somelogo.png)`. 2. Run `RUST_LOG=debug ~/hero/bin/hero_slides --run generate_slide examples/sample_deck <slide_name>`. 3. Tail `hero_proc service logs hero_aibroker` for the outgoing HTTP body to OpenRouter — confirm whether `messages[0].content` is a string or an array with `{type: 'image_url', image_url: {url: 'data:image/png;base64,...'}}` content parts. 4. Compare against OpenRouter docs: https://openrouter.ai/docs/guides/overview/multimodal/image-generation 5. Walk the chain backwards from wherever the image data disappears. ## Regression 2 — history restore broken **Symptom (reported by despiegk):** *"hmmm restore of history does no longer work"* **Suspected scope:** the slide version restore endpoint (whichever RPC reads from `output/.versions/<slide_name>/v<NNN>.png` and copies the chosen version back to `output/<slide_name>.png`). **Code paths to inspect:** - `crates/hero_slides_server/src/rpc.rs` — handlers for version listing + restore (search for `versions`, `restore`, `.versions`) - `crates/hero_slides_lib/src/deck.rs` — versioning helpers - Could be a symptom of the `fallback_marker_path` cleanup in `3b89ec2` if any restore path was reading that marker. **Reproduction recipe:** 1. Generate a slide twice (regenerate forces a new `v<NNN>.png` snapshot to `.versions/`). 2. Try to restore the prior version via the UI (browser dashboard has a versions panel) or via JSON-RPC against `~/hero/var/sockets/hero_slides/rpc.sock`. 3. Observe the failure mode (error response, silent no-op, etc.). ## Other untested paths flagged by despiegk for the next pass *"lets check other things — copy, duplicate... test a lot please"* The following are confirmed not exercised in either PR #56/#57 or this session's validation: - Slide copy / duplicate (whichever RPCs back the UI "copy" action) - Slide Wizard end-to-end on a fresh deck (boss exercised this manually but result wasn't checked for regressions vs prior behaviour) - PDF export (`export_pdf` built-in Rhai script + `slides.pdf` generation) - Multi-slide `--generate <collection/deck>` (whole-deck Generate All, not just single-slide) - Background folder selection (the wizard step that lets you pick which background subfolders apply to which slides) - Theme extraction (`bg.extractTheme`) - Per-slide image-model override (`slide.setImageModel`) ## Where this came from During this session we cleaned up a build break on `hero_slides_server` caused by a bad merge (`7c3c42a` left duplicate `handle_folder_pick` definitions), and Kristof in parallel did a deeper refactor of generator.rs + aibroker param forwarding + a herolib fix. Our PR #57 added `first_slide` field + `--info` JSON cleanup, squash-merged at commit `ff1fca7`. Validated text-only generation end-to-end but did NOT exercise the multimodal image-input or history-restore paths Kristof later flagged broken. ## What works on the live local stack right now - `~/hero/bin/hero_slides_server` running on `ff1fca7` build - `~/hero/bin/hero_aibroker_server` running on `2dca2ea` build with hero_proc secrets loaded - Admin UI at `http://127.0.0.1:9988/hero_slides/admin/` - Boss's manual UI test confirmed: deck listing, slide regenerate (text-only), Slide Wizard step 3, slide presentation rendering at e.g. `/hero_slides/ui/#collections/ourworld/decks/nile_people` Signed-off-by: mik-tf
Author
Owner

Closing #58

Regression 2 (history restore broken) — fixed. Backend was always restoring the file correctly (verified via curl + on-disk md5). Visible failure was the dashboard restoreVersion toast reading result.new_version, which the handler never returned, so the toast displayed Restored vX as undefined. Fixed in PR #59 by extending the handler to list versions after restore and return the new tag. Verified end-to-end via Hero Browser MCP — toast now reads e.g. Restored v005 as v014.

Regression 1 (multimodal images ignored) — not reproducible on this HEAD. After pulling herolib f6621dee (feat(ai): extend ImageConfig with OpenRouter image-gen parameters — also flattens image_config from a nested key to top-level body) which was not yet local when this issue was filed, all three attachment paths render correctly:

  • Inline ![logo](path/to/png) refs — tested with synthetic ZORBLAXOMEGA logo. Output PNG renders the input image faithfully.
  • Theme [[refs]] in <deck>/themes/<name>/theme.toml — tested with synthetic KWENDARIA brand mark. Output PNG renders the input image faithfully.
  • Background images path — same add_image_ref_with() call shape as the other two; not separately re-tested.

Verified by temporarily instrumenting the broker ([s91-debug] log line on OpenRouterProvider::chat, since reverted) — the outbound HTTP body to OpenRouter contains messages[0].content[].image_url parts with valid data:image/png;base64,... URLs of the expected byte length. The model (google/gemini-3.1-flash-image-preview) renders the attached image correctly.

@despiegk — request: please cd lhumina_code/hero_lib && git pull then re-test multimodal generation on your end. If the issue is gone, this is closed; if it persists in your environment, reopen with the broker's outbound body for the failing call (the [s91-debug] instrumentation pattern at crates/hero_aibroker_lib/src/providers/openrouter.rs:368 makes this a one-liner to add back).

Out of scope (filed separately as #60)

While reproducing #58 in the dashboard, two unrelated noise sources surfaced — neither user-visible (UI catches in try/catch) but both pollute server logs on every dashboard load:

  • slide.getIncomingLinks — Method not found (UI calls 11x per dashboard load).
  • bg.listFolders — errors when <deck>/backgrounds/ does not exist (should return empty list).

Tracked at #60 for next session's consideration.

Signed-off-by: mik-tf

## Closing #58 **Regression 2 (history restore broken) — fixed.** Backend was always restoring the file correctly (verified via curl + on-disk md5). Visible failure was the dashboard `restoreVersion` toast reading `result.new_version`, which the handler never returned, so the toast displayed `Restored vX as undefined`. Fixed in [PR #59](https://forge.ourworld.tf/lhumina_code/hero_slides/pulls/59) by extending the handler to list versions after restore and return the new tag. Verified end-to-end via Hero Browser MCP — toast now reads e.g. `Restored v005 as v014`. **Regression 1 (multimodal images ignored) — not reproducible on this HEAD.** After pulling herolib `f6621dee` (`feat(ai): extend ImageConfig with OpenRouter image-gen parameters` — also flattens `image_config` from a nested key to top-level body) which was not yet local when this issue was filed, all three attachment paths render correctly: - Inline `![logo](path/to/png)` refs — tested with synthetic ZORBLAXOMEGA logo. Output PNG renders the input image faithfully. - Theme `[[refs]]` in `<deck>/themes/<name>/theme.toml` — tested with synthetic KWENDARIA brand mark. Output PNG renders the input image faithfully. - Background images path — same `add_image_ref_with()` call shape as the other two; not separately re-tested. Verified by temporarily instrumenting the broker (`[s91-debug]` log line on `OpenRouterProvider::chat`, since reverted) — the outbound HTTP body to OpenRouter contains `messages[0].content[].image_url` parts with valid `data:image/png;base64,...` URLs of the expected byte length. The model (`google/gemini-3.1-flash-image-preview`) renders the attached image correctly. @despiegk — request: please `cd lhumina_code/hero_lib && git pull` then re-test multimodal generation on your end. If the issue is gone, this is closed; if it persists in your environment, reopen with the broker's outbound body for the failing call (the `[s91-debug]` instrumentation pattern at `crates/hero_aibroker_lib/src/providers/openrouter.rs:368` makes this a one-liner to add back). ## Out of scope (filed separately as [#60](https://forge.ourworld.tf/lhumina_code/hero_slides/issues/60)) While reproducing #58 in the dashboard, two unrelated noise sources surfaced — neither user-visible (UI catches in `try/catch`) but both pollute server logs on every dashboard load: - `slide.getIncomingLinks` — Method not found (UI calls 11x per dashboard load). - `bg.listFolders` — errors when `<deck>/backgrounds/` does not exist (should return empty list). Tracked at #60 for next session's consideration. Signed-off-by: mik-tf
Sign in to join this conversation.
No labels
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
lhumina_code/hero_slides#58
No description provided.