hero_agent: LLM provider fallback cascade + error UX #93
Problem
When the primary LLM provider (OpenRouter) fails (e.g. HTTP 402, credits exhausted), the agent returns
"I'm here to help!", a generic fallback that looks like a working response. The user has no idea the AI is broken and thinks it is just dumb.
Discovered when OpenRouter credits hit $0: the agent appeared completely non-functional but gave no error indication.
Three fixes needed
1. Provider fallback cascade (high priority)
When the primary model/provider fails, automatically try the next one:
Groq offers free inference for llama-3.3-70b-versatile, so the agent should always have a working fallback even when paid providers are exhausted. The model list already includes it; the agent just needs to cascade to the next provider on provider errors (402, 429, 500, 503).
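A minimal sketch of the cascade, assuming providers are tried in order (the types and names here are hypothetical, not the existing llm_client.rs API). On a retryable status the next provider is tried; if every provider fails, the last error is returned to the caller instead of a canned reply, which also lines up with fix 2 below:

```rust
// Hypothetical error type; the real client presumably wraps HTTP calls.
#[derive(Debug, Clone, PartialEq)]
enum LlmError {
    // HTTP-style status from the provider (402, 429, 500, 503, ...).
    Provider { name: &'static str, status: u16 },
}

// Statuses that should trigger a fallback to the next provider.
fn is_retryable(status: u16) -> bool {
    matches!(status, 402 | 429 | 500 | 503)
}

// Try each provider in order; return the first success, or the last
// error so callers can surface it instead of a generic response.
fn complete_with_fallback(
    providers: &[(&'static str, fn(&str) -> Result<String, u16>)],
    prompt: &str,
) -> Result<String, LlmError> {
    let mut last_err = None;
    for &(name, call) in providers {
        match call(prompt) {
            Ok(text) => return Ok(text),
            Err(status) if is_retryable(status) => {
                last_err = Some(LlmError::Provider { name, status });
            }
            // Non-retryable errors (e.g. 400 bad request) fail fast.
            Err(status) => return Err(LlmError::Provider { name, status }),
        }
    }
    Err(last_err.expect("no providers configured"))
}

fn main() {
    // Stub providers: OpenRouter is out of credits (402), Groq works.
    fn openrouter(_p: &str) -> Result<String, u16> { Err(402) }
    fn groq(p: &str) -> Result<String, u16> { Ok(format!("groq: {p}")) }

    let providers: &[(&'static str, fn(&str) -> Result<String, u16>)] =
        &[("OpenRouter", openrouter), ("Groq", groq)];

    let reply = complete_with_fallback(providers, "hello").unwrap();
    assert_eq!(reply, "groq: hello");
}
```

The cascade is deliberately ordered (paid, higher-quality providers first; the free Groq model last), so fallback only changes behavior when something is already broken.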
2. Error message passthrough (quick fix)
Stop masking errors as responses. Instead of
"I'm here to help!", surface the underlying provider error to the user. This applies to both
quick_response() and agent_loop() in agent.rs.
3. Dashboard LLM health indicator (nice to have)
Add LLM provider status to the agent dashboard stats bar:
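One possible way to render such an indicator (ProviderStatus and status_line are hypothetical names for illustration, not the existing dashboard code):

```rust
// Hypothetical per-provider health record; routes.rs could expose a
// list of these from /api/llm-status or /api/stats.
struct ProviderStatus {
    name: &'static str,
    // Status code of the most recent failure, if any.
    last_error: Option<u16>,
}

// Format the stats-bar line, e.g. "LLM: OpenRouter ✗ (402) | Groq ✓".
fn status_line(providers: &[ProviderStatus]) -> String {
    let parts: Vec<String> = providers
        .iter()
        .map(|p| match p.last_error {
            None => format!("{} ✓", p.name),
            Some(code) => format!("{} ✗ ({})", p.name, code),
        })
        .collect();
    format!("LLM: {}", parts.join(" | "))
}

fn main() {
    let s = status_line(&[
        ProviderStatus { name: "OpenRouter", last_error: Some(402) },
        ProviderStatus { name: "Groq", last_error: None },
    ]);
    assert_eq!(s, "LLM: OpenRouter ✗ (402) | Groq ✓");
}
```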
LLM: OpenRouter ✓ | Groq ✓ or LLM: OpenRouter ✗ (402) | Groq ✓
Files to modify
- agent.rs: quick_response() and agent_loop()
- llm_client.rs
- routes.rs: new /api/llm-status endpoint, or include in /api/stats
Why this matters
Signed-off-by: mik-tf
Fixed in v0.7.3-dev (https://forge.ourworld.tf/lhumina_code/hero_services/releases/tag/v0.7.3-dev). Deployed to herodev, visually verified via Hero Browser MCP. All E2E tests passing.