Consolidate herolib_ai into hero_aibroker_sdk (single AI client) #63

Open
opened 2026-05-08 10:19:49 +00:00 by timur · 4 comments
Owner

Goal

Collapse the two parallel AI clients in the workspace into one. hero_aibroker_sdk becomes the only Rust AI client; herolib_ai (in hero_lib/crates/ai) is retired. Every Hero service that wants to do chat/voice/embeddings/images talks to hero_aibroker over UDS.

Why

Right now every Hero service that does AI has a choice:

  • herolib_ai::AiClient::from_env() — talks directly to providers (Groq / OpenRouter / SambaNova / DeepInfra) using *_API_KEY env vars, owns its own model catalog (Model::Llama3_3_70B, etc.), no routing/cost-tracking.
  • hero_aibroker_sdk::HeroAibrokerClient — talks to the broker daemon over UDS; broker owns provider auth, routing, fallback, cost tracking, MCP integration, the unified modelsconfig.yml.

Two clients that overlap. Concrete pain we just hit:

  • hero_books's "Ask the Librarian" used herolib_ai::AiClient::from_env() with Model::Llama3_3_70B. The broker had keys in ~/hero/var/hero_aibroker/.env (plural GROQ_API_KEYS); herolib_ai reads process env (singular GROQ_API_KEY); both empty in hero_books_server → "Model Llama 3.3 70B not available on any configured provider" banner in the UI.
  • Two model-id mappings to keep in sync: model.rs::Model::Llama3_3_70B vs broker's modelsconfig.yml.
  • Two cost/usage trackers: herolib_ai::usage vs metrics.{get,detailed} on the broker. Neither sees the other's traffic.
  • Adding a new provider means editing provider.rs AND the broker's provider list.

Single broker = single .env, single model catalog, single cost ledger.

Surface mapping

herolib_ai modules → broker target:

| `herolib_ai` | Broker today | Action |
|---|---|---|
| `client::AiClient::chat` | `ai.chat` (with `response_format`, `tools`, OpenRouter passthrough) | use as-is via SDK |
| `embedding::*` | `ai.embed` | use as-is via SDK |
| `transcription::*` | `ai.transcribe` | use as-is via SDK |
| `tts::*` | `ai.tts` | use as-is via SDK |
| `model::Model` enum + `ProviderMapping` | `modelsconfig.yml` + `models.list` / `models.get` | drop the enum; consumers pass model alias strings (`groq-strong`, `claude-sonnet`, `autocheapest`, etc.) |
| `provider::{Provider, ProviderConfig}` | broker manages this | drop |
| `usage::*` | `metrics.{get,detailed}` | wrap as a typed view in the SDK |
| `prompt::{PromptBuilder, VerifyFn}` | not in broker | promote into `hero_aibroker_sdk` as a client-side ergonomic helper — no server change needed |
| `image_generation::*` | gap | add `ai.images.generate` RPC + SDK method |

Net broker work:

  1. Add ai.images.generate (the only herolib_ai feature with no broker counterpart).
  2. (Optional, ergonomic) Add prompt::{PromptBuilder, VerifyFn} to hero_aibroker_sdk.

Net SDK work:
3. Provide a from_default_socket() constructor matching AiClient::from_env's zero-arg ergonomics — let client = HeroAibrokerClient::default()?;.
4. Provide a typed Model alias type (just &str newtype) + helper constants for the broker's pre-configured aliases (MODEL_LLAMA_33_70B = "groq-strong", etc.) so consumers don't all hardcode strings.

Migration

Per-consumer:

```rust
// before
use herolib_ai::{AiClient, Message, Model};
let client = AiClient::from_env();
client.chat(Model::Llama3_3_70B, messages)?;

// after
use hero_aibroker_sdk::{HeroAibrokerClient, ChatRequest, Message};
let client = HeroAibrokerClient::default()?;
client.chat(ChatRequest { model: "groq-strong".into(), messages, ..Default::default() }).await?;
```

Known consumers (non-exhaustive — grep -rln 'use herolib_ai\|herolib_ai::' .):

  • hero_books_server — ai-summary handler, ai-cleanup, transcription, MCP chat
  • hero_rpc_osis — embedded chat helper (already broken with newer herolib_ai API per earlier session notes)
  • Probably hero_indexer, hero_agent, hero_compute_* — needs an inventory pass
  • All the in-repo Rhai bindings under hero_lib_rhai that surface Model::*

Each migration is small (typically <30 LOC per file) but touches many repos.

Phasing

  1. Broker side (this repo)

    • Add ai.images.generate (port the OpenAI/together image-gen wrapper from herolib_ai/image_generation/).
    • Add from_default_socket() to hero_aibroker_sdk.
    • Optionally promote PromptBuilder into the SDK.
    • Document migration recipes in this repo's README + the SDK README.
  2. hero_lib — mark hero_lib/crates/ai as deprecated. Keep it building for now so consumers can migrate at their own pace.

  3. Consumer migration (one PR per repo, parallel)

    • hero_books — librarian + transcription + MCP. Already on a refactor branch (development_hero_memory), good first candidate.
    • Rest of the workspace, opportunistically.
  4. Removal — once consumers are off herolib_ai, drop the crate.

Open questions

  • Streaming: does broker ai.chat support SSE/streaming yet? herolib_ai's direct path does. If not, chat-streaming consumers block until streaming lands on the broker side.
  • Tool/function calling: broker's ai.chat already accepts tools per its openrpc.json. Need to confirm it works through SambaNova/Groq route translations.
  • Rhai bindings: hero_lib's Rhai bindings expose Model::*. Does Rhai consume hero_aibroker_sdk directly, or does it need a thin shim?

Out of scope

  • Replacing the broker with the herolib_ai pattern (broker stays).
  • Wire-format changes to ai.chat request/response (already OpenAI-compatible).
  • Multi-tenant or per-user billing — separate concern.

cc whoever owns hero_lib/crates/ai

Author
Owner

Consumer inventory

grep -rln "use herolib_ai\|herolib_ai::" across ~/code/forge.ourworld.tf/lhumina_code/, excluding hero_lib/ itself and target/:

| Repo | Files |
|---|---|
| `hero_researcher` | 15 |
| `hero_books` | 3 |
| `hero_rpc` | 2 |
| `hero_agent` | 1 |
| `hero_editor` | 1 |
| `hero_lib_rhai` | 1 |
| `hero_skills` | 1 |
| `hero_sync` | 1 |
| `hero_voice` | 1 |
| `hero_webbuilder` | 1 |

Total: 27 files across 10 repos. hero_researcher is the heaviest user (likely a multi-step agent loop that calls the client from many places); the rest are 1–3 file touches per repo, which fits the "small migration per consumer" estimate in the issue body.

hero_rpc_osis (mentioned in the original body) was a guess from an earlier session — it's actually hero_rpc (2 files) and not in the OSIS subset specifically.

hero_skills has 1 file — likely an example or doc snippet rather than runtime code; worth verifying it's not just markdown.

Member

Blocks lhumina_code/hero_biz#15

The minimum viable prerequisite for hero_biz#15 ("Route AiClient through local AI Broker") is step 1 of this issue:

  • Add from_default_socket() constructor to HeroAibrokerClient
  • Add a non-streaming chat method that POSTs to rest.sock at /v1/chat/completions without forcing stream: true

hero_biz uses non-streaming chat exclusively (analyze_intent, generate_suggestions, all assistant calls). Until this lands, hero_biz#15 is fully blocked.

Author
Owner

Phase 1 (broker side) — first slice landed in #64

#64 covers two of the four phase-1 sub-tasks from the issue body:

  • ✅ hero_aibroker_sdk::default_socket_path() + AIBrokerAdminAPIClient::connect_default() / AIBrokerRawClient::connect_default(). Resolution order: HERO_AIBROKER_SOCKET → $HERO_SOCKET_DIR/hero_aibroker/rpc.sock → $HOME/hero/var/sockets/hero_aibroker/rpc.sock → /tmp/.... Matches the convention already used elsewhere (e.g. hero_db's default_socket_path).
  • ✅ Migration recipes documented inline in hero_aibroker_sdk/src/lib.rs — chat shape, the alias-string model field replacing Model::*, embed/transcribe/tts/image, streaming, and AIBrokerRawClient for OpenRouter passthrough.

Status check on the rest of phase 1 against current development:

  • 🟡 ai.images.generate — broker already has ai.image (openrpc.json line 1127) routing through ai.chat against image-capable models, returning image_base64 + image_data_url. Compared to herolib_ai's image_generation module, the gap is: no aspect_ratio / image_size params, no text companion field, no model echoed in the response. Open question: do consumers actually need those, or is the leaner ai.image enough? hero_books-style use cases probably don't, but the Gemini 3.1 Flash extended ratios from herolib_ai would matter for image-first apps.
  • ⏳ PromptBuilder + VerifyFn promotion to the SDK — still pending, marked optional in the issue. Reasonable to defer until a consumer actively needs it during migration (i.e. once hero_books or hero_researcher ports come up).

Suggested next step

Take a single consumer end-to-end through the migration before generalising. hero_books is the trigger case from the issue, has 3 files in the inventory, and an existing refactor branch — ideal first port. That migration will surface whether PromptBuilder actually needs to land in the SDK, and whether the ai.image gap is real.

cc @timur

Author
Owner

Update: herolib_ai 0.6.0 is already broker-first

Looking at origin/development after merging hero_books PR #123, herolib_ai's most recent commit (bb69b40a refactor(ai): rewrite as broker-first architecture) has already done a lot of what this issue proposed:

  • AiClient is now async and wraps hero_aibroker_sdk::AIBrokerRawClient — AiClient::default_socket().await? instead of AiClient::from_env().
  • Model::Llama3_3_70B enum + ProviderMapping table → gone, replaced by string-based Model::new("groq-strong") / ModelRef::parse(…). The catalog now lives in the broker's modelsconfig.yml.
  • Provider / ProviderConfig types are gone — broker owns provider auth.
  • The dual-auth bug we hit (broker's ~/hero/var/hero_aibroker/.env carrying GROQ_API_KEYS while herolib_ai couldn't see it) goes away in 0.6.0: there's no direct-to-provider path, all chat goes through the broker, only the broker needs keys.
  • Streaming chat is supported via chat_stream(req).await returning ChatChunkStream.

So the original framing of this issue (collapse two parallel clients) is partially obsolete. What remains:

Remaining work

  1. Consumer migration — bump herolib_ai = "0.5.0" → "0.6.0" in every consumer Cargo.toml and rewrite the callsites:

    ```rust
    // before
    let client = AiClient::from_env();
    client.chat(Model::Llama3_3_70B, messages)?;

    // after
    let client = AiClient::default_socket().await?;
    client.chat(Model::new("groq-strong"), messages).await?;
    ```

    Done in PR #123 for hero_books_server (5 callsites: 2× ai-summary, 1× transcribe, 1× MCP tool_ask, 1× cleanup chat). "Done" here means the librarian in the UI now renders an actual summary instead of "Llama 3.3 70B not available on any configured provider".

  2. Decide the long-term shape — does herolib_ai stay as an ergonomic layer (Model catalog, PromptBuilder, VerifyFn, ImageGenerationRequest builder) on top of hero_aibroker_sdk, or do those helpers move into the SDK and herolib_ai retires? My read: the helpers are useful enough that a thin wrapper crate can justify itself. Either is fine — the duplication concern is gone now that there's only one transport.

  3. ai.images.generate — still missing in the broker. herolib_ai/image_generation/ builds the OpenAI-style request shape but the broker has no endpoint to receive it. (See original issue body.)

Per-repo migration backlog (post-0.6.0 bump)

| Repo | herolib_ai files | Status |
|---|---|---|
| `hero_books` | 3 | ✅ migrated (PR #123) |
| `hero_researcher` | 15 | not started |
| `hero_rpc` | 2 | not started |
| `hero_agent` | 1 | not started |
| `hero_editor` | 1 | not started |
| `hero_lib_rhai` | 1 | not started |
| `hero_skills` | 1 | not started (likely doc-only) |
| `hero_sync` | 1 | not started |
| `hero_voice` | 1 | not started |
| `hero_webbuilder` | 1 | not started |

Each is a small mechanical migration (sync → async, enum → string, error type tweaks). hero_researcher is the only chunky one.
