Multimodal image attachments + history restore regressions on development HEAD #58
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Two regressions surfaced during manual UI testing on
origin/development(HEADff1fca7) after the recent generator.rs collapse + aibroker param-forwarding changes. Filing this so next session has a clear scope to investigate and fix.Context — what is known to work
Validated end-to-end on local from-remote clean rebuild (see PR #57):
hero_slides --run generate_slide <deck> 04_sacha→ real Gemini Flash Image PNG, 19 s, 4.8 MB).is_stale=falseafter generate,content_edited=trueafter.mdedit.deck.listJSON carriesfirst_slide(the field added in #57).folder.pickreturns~/Documentsallowlist fallback correctly.hero_procsecrets contextcoreat startup (Initialized 3 cloud providers,Loaded 35 models).http://127.0.0.1:9988/hero_slides/admin/renders deck listings + generated slide thumbnails correctly.Regression 1 — image attachments are ignored, model fabricates
Symptom (reported by despiegk): generating a slide that references a logo image (inline
, theme image ref, or background image) produces a slide where the model invents a plausible-looking logo instead of placing the attached image.Quote: "can't get the logo to show up... I think still wrong how we ask to generate images. doesn't work any more, used to. it just makes up logos so not working. we carefully need to check how we ask to generate images with links of logos in there or in context images. damned this all worked. need to carefully check how we pass data of image, goes wrong there. see it didn't get the image, made it up."
Likely failure mode: the multimodal request reaches OpenRouter with the framing prompt ("ATTACHED IMAGES — READ THE SLIDE CONTENT BELOW...") but with no image_url content parts attached, so the model receives "use them AS-IS" instructions with nothing to use, and hallucinates a logo from the surrounding prompt text.
Code paths to inspect (in order of likelihood):
herolib (
hero_lib) recent fix from despiegk — boss noted he made fixes "on broker, as well as herolib and in slides". The herolib change is the most likely culprit because it sits on the image-data wire:image_request().add_image_ref_with()+parse_all_images()build/parse the multimodal message. Need to inspect what shipped onhero_lib/developmentafter our last pull.hero_slides3b89ec2 refactor(generator): remove text-only fallback path, always use multimodal builder— collapsed generator.rs to a singleclient.image_request()....execute()builder path. Currently usesframed_promptunconditionally. Behaviour change vs the priorhas_any_imagesbranch shouldn't affect WITH-attachments case, but worth verifying the for-loops overinline_attachments/theme_ref_attachments/background_imagesstill actually calladd_image_ref_with.hero_aibroker2dca2ea fix(server): pass all caller-supplied image params through to provider— replaced hardcoded forwarding ofsize/n/modalitieswith a "copy every param except model/prompt" generic loop on theai.imageRPC. Less likely (ai.chatis the multimodal path), but a generic-loop param forwarder can drop fields it shouldn't if the iteration is wrong.Reproduction recipe for next session:
examples/sample_deckwith an inline image ref like.RUST_LOG=debug ~/hero/bin/hero_slides --run generate_slide examples/sample_deck <slide_name>.hero_proc service logs hero_aibrokerfor the outgoing HTTP body to OpenRouter — confirm whethermessages[0].contentis a string or an array with{type: 'image_url', image_url: {url: 'data:image/png;base64,...'}}content parts.Regression 2 — history restore broken
Symptom (reported by despiegk): "hmmm restore of history does no longer work"
Suspected scope: the slide version restore endpoint (whichever RPC reads from
output/.versions/<slide_name>/v<NNN>.pngand copies the chosen version back tooutput/<slide_name>.png).Code paths to inspect:
crates/hero_slides_server/src/rpc.rs— handlers for version listing + restore (search forversions,restore,.versions)crates/hero_slides_lib/src/deck.rs— versioning helpersfallback_marker_pathcleanup in3b89ec2if any restore path was reading that marker.Reproduction recipe:
v<NNN>.pngsnapshot to.versions/).~/hero/var/sockets/hero_slides/rpc.sock.Other untested paths flagged by despiegk for the next pass
"lets check other things — copy, duplicate... test a lot please"
The following are confirmed not exercised in either PR #56/#57 or this session's validation:
export_pdfbuilt-in Rhai script +slides.pdfgeneration)--generate <collection/deck>(whole-deck Generate All, not just single-slide)bg.extractTheme)slide.setImageModel)Where this came from
During this session we cleaned up a build break on
hero_slides_servercaused by a bad merge (7c3c42aleft duplicatehandle_folder_pickdefinitions), and Kristof in parallel did a deeper refactor of generator.rs + aibroker param forwarding + a herolib fix. Our PR #57 addedfirst_slidefield +--infoJSON cleanup, squash-merged at commitff1fca7. Validated text-only generation end-to-end but did NOT exercise the multimodal image-input or history-restore paths Kristof later flagged broken.What works on the live local stack right now
~/hero/bin/hero_slides_serverrunning onff1fca7build~/hero/bin/hero_aibroker_serverrunning on2dca2eabuild with hero_proc secrets loadedhttp://127.0.0.1:9988/hero_slides/admin//hero_slides/ui/#collections/ourworld/decks/nile_peopleSigned-off-by: mik-tf
Closing #58
Regression 2 (history restore broken) — fixed. Backend was always restoring the file correctly (verified via curl + on-disk md5). Visible failure was the dashboard
restoreVersiontoast readingresult.new_version, which the handler never returned, so the toast displayedRestored vX as undefined. Fixed in PR #59 by extending the handler to list versions after restore and return the new tag. Verified end-to-end via Hero Browser MCP — toast now reads e.g.Restored v005 as v014.Regression 1 (multimodal images ignored) — not reproducible on this HEAD. After pulling herolib
f6621dee(feat(ai): extend ImageConfig with OpenRouter image-gen parameters— also flattensimage_configfrom a nested key to top-level body) which was not yet local when this issue was filed, all three attachment paths render correctly:refs — tested with synthetic ZORBLAXOMEGA logo. Output PNG renders the input image faithfully.[[refs]]in<deck>/themes/<name>/theme.toml— tested with synthetic KWENDARIA brand mark. Output PNG renders the input image faithfully.add_image_ref_with()call shape as the other two; not separately re-tested.Verified by temporarily instrumenting the broker (
[s91-debug]log line onOpenRouterProvider::chat, since reverted) — the outbound HTTP body to OpenRouter containsmessages[0].content[].image_urlparts with validdata:image/png;base64,...URLs of the expected byte length. The model (google/gemini-3.1-flash-image-preview) renders the attached image correctly.@despiegk — request: please
cd lhumina_code/hero_lib && git pullthen re-test multimodal generation on your end. If the issue is gone, this is closed; if it persists in your environment, reopen with the broker's outbound body for the failing call (the[s91-debug]instrumentation pattern atcrates/hero_aibroker_lib/src/providers/openrouter.rs:368makes this a one-liner to add back).Out of scope (filed separately as #60)
While reproducing #58 in the dashboard, two unrelated noise sources surfaced — neither user-visible (UI catches in
try/catch) but both pollute server logs on every dashboard load:slide.getIncomingLinks— Method not found (UI calls 11x per dashboard load).bg.listFolders— errors when<deck>/backgrounds/does not exist (should return empty list).Tracked at #60 for next session's consideration.
Signed-off-by: mik-tf