[arch] hero_aibroker has zero per-context awareness — billing/rate-limit global per-IP #54

Closed
opened 2026-05-01 04:09:19 +00:00 by mik-tf · 1 comment
Owner

Summary

hero_aibroker has zero per-context awareness. X-Hero-Context is not honoured anywhere in the broker; billing and rate-limiting are bucketed globally per source-IP rather than per (ip, context). For multi-tenant deployments this means LLM spend, rate limits, and quotas leak across contexts at the broker layer.

What the code does today

  • All 5 providers (openai, openrouter, groq, sambanova, alibaba) share one OpenAIProvider struct with different base URLs.
  • Default model is auto with ROUTING_STRATEGY=cheapest.
  • X-Hero-Context header is not parsed or threaded through any handler.
  • Three sockets exposed: rpc.sock (admin), rest.sock (OpenAI-compatible v1), web_v1.sock (proxy variant). Plus hero_aibroker_ui/ui.sock.

Why this matters

Sovereignty + multi-tenancy is a load-bearing property of the Hero stack (hero_demo#52 vision). Context isolation needs to hold at every layer the user's data or actions cross. LLM calls cross the broker on every chat / tool invocation. Today, a high-spend context can starve other contexts because they share the per-IP rate bucket.

Proposed fix

  1. Parse X-Hero-Context at the REST/RPC entry handlers in hero_aibroker_server.
  2. Bucket usage / rate-limit / spend by (ip, context) rather than ip alone.
  3. Surface per-context spend in the admin UI (or on rpc.sock admin methods).
  4. Optionally: per-context provider routing (some contexts pinned to specific providers / models).
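
A minimal sketch of steps 1 to 3, assuming nothing about the actual hero_aibroker_server internals: `BucketKey`, `context_from_headers`, `Limiter`, the "default" fallback context, and the fixed-window counter are all hypothetical names and choices made for illustration, not the broker's real API. It only shows the shape of keying rate limits and spend on (ip, context) instead of ip alone.

```rust
use std::collections::HashMap;
use std::net::IpAddr;

/// Hypothetical bucket key: source IP plus the `X-Hero-Context` value.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct BucketKey {
    pub ip: IpAddr,
    pub context: String,
}

/// Step 1: pull the context out of (name, value) header pairs.
/// Header names are matched case-insensitively. Falling back to
/// "default" for a missing or blank header is an assumption.
pub fn context_from_headers(headers: &[(&str, &str)]) -> String {
    headers
        .iter()
        .find(|(name, _)| name.eq_ignore_ascii_case("x-hero-context"))
        .map(|(_, value)| value.trim())
        .filter(|value| !value.is_empty())
        .unwrap_or("default")
        .to_string()
}

/// Steps 2 and 3: a fixed-window request counter and a spend ledger,
/// both keyed by (ip, context). A real limiter would also track time.
pub struct Limiter {
    max_requests_per_window: u32,
    requests: HashMap<BucketKey, u32>,
    spend_micro_usd: HashMap<BucketKey, u64>,
}

impl Limiter {
    pub fn new(max_requests_per_window: u32) -> Self {
        Limiter {
            max_requests_per_window,
            requests: HashMap::new(),
            spend_micro_usd: HashMap::new(),
        }
    }

    /// Counts the request and returns true if this bucket still has capacity.
    pub fn try_acquire(&mut self, key: &BucketKey) -> bool {
        let count = self.requests.entry(key.clone()).or_insert(0);
        if *count >= self.max_requests_per_window {
            return false;
        }
        *count += 1;
        true
    }

    /// Adds spend (in micro-USD, an assumed unit) to this bucket.
    pub fn record_spend(&mut self, key: &BucketKey, micro_usd: u64) {
        *self.spend_micro_usd.entry(key.clone()).or_insert(0) += micro_usd;
    }

    /// Total spend for one context across all source IPs,
    /// e.g. for an admin view.
    pub fn spend_for_context(&self, context: &str) -> u64 {
        self.spend_micro_usd
            .iter()
            .filter(|(key, _)| key.context == context)
            .map(|(_, spend)| *spend)
            .sum()
    }

    /// Starts a new rate-limit window; spend totals are kept.
    pub fn reset_window(&mut self) {
        self.requests.clear();
    }
}
```

With this keying, two contexts arriving from the same IP get separate request budgets, so one context exhausting its window no longer blocks the other. Step 4 (per-context provider routing) is not covered by the sketch.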

Severity

Design-level. Not a security boundary today (no isolation is claimed at the broker layer), but contradicts the sovereignty story the demo pitch leans on.

Cross-refs

  • Same-shape sovereignty gap on hero_embedder: discards X-Hero-Context (separate issue)
  • Same-shape on hero_indexer (separate issue)
  • hero_demo#52 (vision): https://forge.ourworld.tf/lhumina_code/hero_demo/issues/52

Spotted during docs_hero Phase 1 source-grounded read (session 52). Reconciliation memo: memory/investigation_roadmap_reconciliation.md.

Owner

#54
I reworked the broker quite thoroughly; models are now dynamic.

About billing: I believe it should be per hero-os, so keying on the source IPv6 address is OK.

This means whoever hosts hero-os pays. For now that is probably good enough, because anything finer-grained would add a lot of complications.

Reference
lhumina_code/hero_aibroker#54