aibroker: fail-fast at startup when configured providers have no usable API key #55

Open
opened 2026-05-02 01:57:19 +00:00 by mik-tf · 0 comments
Owner

Problem

hero_aibroker_server boots successfully even when none of the configured providers has a usable API key in the environment. The first user request then fails with a 5xx and a confusing error chain, so the operator only learns the deploy was misconfigured when an end user hits the AI Assistant.

This is the failure mode an integrator (freezone) hit on a fresh VM deploy: aibroker started healthy, the Unix socket bound, and the backend poller connected, but every chat request returned an "instance arm requires aibroker_endpoint" / "no providers available" error. The root cause was a missing ALIBABA_MODELSTUDIO_API_KEY env var that aibroker silently passed through as an empty string.

Proposal

At hero_aibroker_server startup, after provider registration, count the providers that have a non-empty API key (or whatever credential the provider needs). If the count is zero AND the config registers at least one provider, abort with a non-zero exit code and a clear error message naming each unconfigured provider.

Suggested error format:

FATAL: hero_aibroker has 0 usable providers.
  - alibaba: configured but ALIBABA_MODELSTUDIO_API_KEY is unset
  - openai:  configured but OPENAI_API_KEY is unset
Set at least one provider key in the environment, or remove the
provider from config, then restart.
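
A minimal sketch of the proposed check, written in Python for brevity; it is not the server's actual code, and the language hero_aibroker_server is written in is not stated here. The `PROVIDER_KEY_ENV` table and the function names are hypothetical; only the two env var names and the message text come from this issue. A whitespace-only value is treated as unset, which is an assumption that covers the "empty string passed through" case described above.

```python
import os
import sys

# Hypothetical provider -> env var table. The two env var names are taken
# from the suggested error format; the table itself is an assumption.
PROVIDER_KEY_ENV = {
    "alibaba": "ALIBABA_MODELSTUDIO_API_KEY",
    "openai": "OPENAI_API_KEY",
}


def unconfigured_providers(configured, env):
    """Return (provider, env_var) pairs whose key is unset or blank."""
    missing = []
    for name in configured:
        var = PROVIDER_KEY_ENV[name]
        # An empty or whitespace-only value counts as unset.
        if not env.get(var, "").strip():
            missing.append((name, var))
    return missing


def startup_error(configured, env):
    """Return the FATAL message if no provider is usable, else None."""
    missing = unconfigured_providers(configured, env)
    usable = len(configured) - len(missing)
    # Only fail when at least one provider is configured and none is usable.
    if not configured or usable > 0:
        return None
    lines = ["FATAL: hero_aibroker has 0 usable providers."]
    for name, var in missing:
        lines.append(f"  - {name}: configured but {var} is unset")
    lines.append("Set at least one provider key in the environment, or remove the")
    lines.append("provider from config, then restart.")
    return "\n".join(lines)


def check_or_exit(configured, env=os.environ):
    """Run after provider registration; exit non-zero on zero usable providers."""
    message = startup_error(configured, env)
    if message is not None:
        print(message, file=sys.stderr)
        sys.exit(1)
```

Keeping `startup_error` free of side effects makes the rule easy to unit-test; only `check_or_exit` touches stderr and the process exit code.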

Why this matters

  • Misconfigurations surface at deploy time (visible to the operator) instead of request time (visible to end users).
  • Container orchestrators (docker-compose depends_on: condition: service_healthy, k8s liveness probes, systemd) can react correctly. Today the container reports "healthy" but is functionally broken.
  • Mirrors the contract hero_indexer already enforces (refuses to bind its socket when its config is invalid).

Out of scope

  • Whether to fail-fast in test configurations (e.g. mock providers, CI smoke). A --allow-zero-providers flag or env opt-out would preserve those paths if the team wants to keep them.
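
If the team does keep an opt-out, one possible shape is sketched below in Python. The `--allow-zero-providers` flag name comes from the bullet above; the `AIBROKER_ALLOW_ZERO_PROVIDERS` env var and the `zero_providers_allowed` function are hypothetical names chosen for illustration, not existing options.

```python
import argparse


def zero_providers_allowed(argv, env):
    """Return True if the flag or the (hypothetical) env opt-out is set."""
    # allow_abbrev=False so that a shortened form such as "--allow" is not
    # silently accepted as the opt-out.
    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    parser.add_argument("--allow-zero-providers", action="store_true")
    # parse_known_args leaves the server's other arguments untouched.
    args, _unknown = parser.parse_known_args(argv)
    # AIBROKER_ALLOW_ZERO_PROVIDERS is an assumed name, not an existing variable.
    raw = env.get("AIBROKER_ALLOW_ZERO_PROVIDERS", "")
    env_opt_out = raw.strip().lower() in {"1", "true", "yes"}
    return args.allow_zero_providers or env_opt_out
```

A startup path would consult this before running the fail-fast check, so mock-provider and CI smoke configurations skip it explicitly while production deploys keep the default.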
mik-tf added this to the ACTIVE project 2026-05-06 17:31:59 +00:00
Reference
lhumina_code/hero_aibroker#55