hero_agent: LLM provider fallback cascade + error UX #93
Problem
When the primary LLM provider (OpenRouter) fails (e.g. HTTP 402, credits exhausted), the agent returns
"I'm here to help!", a generic fallback that looks like a working response. The user has no idea the AI is broken and thinks it is just dumb.
Discovered when OpenRouter credits hit $0: the agent appeared completely non-functional but gave no error indication.
Three fixes needed
1. Provider fallback cascade (high priority)
When the primary model/provider fails, automatically try the next one:
Groq offers free inference for llama-3.3-70b-versatile, so the agent should always have a working fallback even when paid providers are exhausted. The model list already includes it; the agent just needs to cascade to the next provider on provider errors (402, 429, 500, 503).
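A minimal sketch of the cascade, assuming providers are tried in order (the types and names here are hypothetical, not the existing llm_client.rs API). On a retryable status the next provider is tried; if every provider fails, the last error is returned to the caller instead of a canned reply, which also lines up with fix 2 below:

```rust
// Hypothetical error type; the real client presumably wraps HTTP calls.
#[derive(Debug, Clone, PartialEq)]
enum LlmError {
    // HTTP-style status from the provider (402, 429, 500, 503, ...).
    Provider { name: &'static str, status: u16 },
}

// Statuses that should trigger a fallback to the next provider.
fn is_retryable(status: u16) -> bool {
    matches!(status, 402 | 429 | 500 | 503)
}

// Try each provider in order; return the first success, or the last
// error so callers can surface it instead of a generic response.
fn complete_with_fallback(
    providers: &[(&'static str, fn(&str) -> Result<String, u16>)],
    prompt: &str,
) -> Result<String, LlmError> {
    let mut last_err = None;
    for &(name, call) in providers {
        match call(prompt) {
            Ok(text) => return Ok(text),
            Err(status) if is_retryable(status) => {
                last_err = Some(LlmError::Provider { name, status });
            }
            // Non-retryable errors (e.g. 400 bad request) fail fast.
            Err(status) => return Err(LlmError::Provider { name, status }),
        }
    }
    Err(last_err.expect("no providers configured"))
}

fn main() {
    // Stub providers: OpenRouter is out of credits (402), Groq works.
    fn openrouter(_p: &str) -> Result<String, u16> { Err(402) }
    fn groq(p: &str) -> Result<String, u16> { Ok(format!("groq: {p}")) }

    let providers: &[(&'static str, fn(&str) -> Result<String, u16>)] =
        &[("OpenRouter", openrouter), ("Groq", groq)];

    let reply = complete_with_fallback(providers, "hello").unwrap();
    assert_eq!(reply, "groq: hello");
}
```

The cascade is deliberately ordered (paid, higher-quality providers first; the free Groq model last), so fallback only changes behavior when something is already broken.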
2. Error message passthrough (quick fix)
Stop masking errors as responses. Instead of
"I'm here to help!", surface the underlying provider error to the user. This applies to both
quick_response() and agent_loop() in agent.rs.
3. Dashboard LLM health indicator (nice to have)
Add LLM provider status to the agent dashboard stats bar:
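One possible way to render such an indicator (ProviderStatus and status_line are hypothetical names for illustration, not the existing dashboard code):

```rust
// Hypothetical per-provider health record; routes.rs could expose a
// list of these from /api/llm-status or /api/stats.
struct ProviderStatus {
    name: &'static str,
    // Status code of the most recent failure, if any.
    last_error: Option<u16>,
}

// Format the stats-bar line, e.g. "LLM: OpenRouter ✗ (402) | Groq ✓".
fn status_line(providers: &[ProviderStatus]) -> String {
    let parts: Vec<String> = providers
        .iter()
        .map(|p| match p.last_error {
            None => format!("{} ✓", p.name),
            Some(code) => format!("{} ✗ ({})", p.name, code),
        })
        .collect();
    format!("LLM: {}", parts.join(" | "))
}

fn main() {
    let s = status_line(&[
        ProviderStatus { name: "OpenRouter", last_error: Some(402) },
        ProviderStatus { name: "Groq", last_error: None },
    ]);
    assert_eq!(s, "LLM: OpenRouter ✗ (402) | Groq ✓");
}
```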
LLM: OpenRouter ✓ | Groq ✓ or LLM: OpenRouter ✗ (402) | Groq ✓
Files to modify
- agent.rs: quick_response() and agent_loop()
- llm_client.rs
- routes.rs: new /api/llm-status endpoint, or include in /api/stats
Why this matters
Signed-off-by: mik-tf
Fixed in v0.7.3-dev (https://forge.ourworld.tf/lhumina_code/hero_services/releases/tag/v0.7.3-dev). Deployed to herodev, visually verified via Hero Browser MCP. All E2E tests passing.