Consolidate herolib_ai into hero_aibroker_sdk (single AI client) #63
Blocks: lhumina_code/hero_biz#15 — Route AiClient through local AI Broker instead of direct provider calls
Goal
Collapse the two parallel AI clients in the workspace into one.
`hero_aibroker_sdk` becomes the only Rust AI client; `herolib_ai` (in `hero_lib/crates/ai`) is retired. Every Hero service that wants to do chat/voice/embeddings/images talks to `hero_aibroker` over UDS.

Why

Right now every Hero service that does AI has a choice:

- `herolib_ai::AiClient::from_env()` — talks directly to providers (Groq / OpenRouter / SambaNova / DeepInfra) using `*_API_KEY` env vars, owns its own model catalog (`Model::Llama3_3_70B`, etc.), no routing/cost-tracking.
- `hero_aibroker_sdk::HeroAibrokerClient` — talks to the broker daemon over UDS; the broker owns provider auth, routing, fallback, cost tracking, MCP integration, and the unified `modelsconfig.yml`.

Two clients that overlap. Concrete pain we just hit:
hero_books's "Ask the Librarian" usedherolib_ai::AiClient::from_env()withModel::Llama3_3_70B. The broker had keys in~/hero/var/hero_aibroker/.env(pluralGROQ_API_KEYS);herolib_aireads process env (singularGROQ_API_KEY); both empty inhero_books_server→ "Model Llama 3.3 70B not available on any configured provider" banner in the UI.model.rs::Model::Llama3_3_70Bvs broker'smodelsconfig.yml.herolib_ai::usagevsmetrics.{get,detailed}on the broker. Neither sees the other's traffic.provider.rsAND the broker's provider list.Single broker = single
.env, single model catalog, single cost ledger.Surface mapping
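The key mismatch in one illustrative snippet — `herolib_ai`'s direct path reads the process environment, so a key sitting only in the broker's `.env` file (and under a different name) is invisible to it:

```rust
fn main() {
    // herolib_ai's direct path checks the process env, singular name.
    match std::env::var("GROQ_API_KEY") {
        Ok(_) => println!("direct provider path has a key"),
        // hero_books_server hit this branch: the only key on the machine was
        // the plural GROQ_API_KEYS inside ~/hero/var/hero_aibroker/.env,
        // which the broker daemon loads but the service process never sees.
        Err(_) => println!("Model Llama 3.3 70B not available on any configured provider"),
    }
}
```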
Surface mapping

`herolib_ai` modules → broker target:

- `client::AiClient::chat` → `ai.chat` (with `response_format`, `tools`, OpenRouter passthrough)
- `embedding::*` → `ai.embed`
- `transcription::*` → `ai.transcribe`
- `tts::*` → `ai.tts`
- `model::Model` enum + `ProviderMapping` → `modelsconfig.yml` + `models.list`/`models.get` (aliases: `groq-strong`, `claude-sonnet`, `auto`, `cheapest`, etc.)
- `provider::{Provider, ProviderConfig}` → retired; the broker owns provider auth
- `usage::*` → `metrics.{get,detailed}`
- `prompt::{PromptBuilder, VerifyFn}` → `hero_aibroker_sdk`, as a client-side ergonomic helper — no server change needed
- `image_generation::*` → new `ai.images.generate` RPC + SDK method
Net broker work:

1. `ai.images.generate` (the only `herolib_ai` feature with no broker counterpart).
2. Move `prompt::{PromptBuilder, VerifyFn}` to `hero_aibroker_sdk`.

Net SDK work:

3. Provide a `from_default_socket()` constructor matching `AiClient::from_env`'s zero-arg ergonomics — `let client = HeroAibrokerClient::default()?;`.
4. Provide a typed `Model` alias type (just a `&str` newtype) + helper constants for the broker's pre-configured aliases (`MODEL_LLAMA_33_70B = "groq-strong"`, etc.) so consumers don't all hardcode strings. A sketch follows this list.
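A rough sketch of what items 3–4 could look like on the SDK side — every name here (`ModelAlias`, the constants, the struct body) is illustrative, not the shipped `hero_aibroker_sdk` surface:

```rust
/// Typed alias: a thin &'static str newtype so consumers don't scatter
/// hardcoded broker alias strings across callsites.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ModelAlias(pub &'static str);

/// Helper constants mirroring the broker's pre-configured aliases.
pub const MODEL_LLAMA_33_70B: ModelAlias = ModelAlias("groq-strong");
pub const MODEL_CLAUDE_SONNET: ModelAlias = ModelAlias("claude-sonnet");

pub struct HeroAibrokerClient; // UDS connection elided

impl HeroAibrokerClient {
    /// Zero-arg constructor matching AiClient::from_env's ergonomics:
    /// resolve the default socket path, connect over UDS.
    pub fn from_default_socket() -> std::io::Result<Self> {
        // A real implementation would resolve the socket path and open
        // the UnixStream here; elided in this sketch.
        Ok(HeroAibrokerClient)
    }
}
```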
Migration

The migration happens per-consumer. Known consumers (non-exhaustive — `grep -rln 'use herolib_ai\|herolib_ai::' .`):

- `hero_books_server` — ai-summary handler, ai-cleanup, transcription, MCP chat
- `hero_rpc_osis` — embedded chat helper (already broken with the newer `herolib_ai` API per earlier session notes)
- `hero_indexer`, `hero_agent`, `hero_compute_*` — needs an inventory pass
- `hero_lib_rhai` — Rhai bindings that surface `Model::*`

Each migration is small (typically <30 LOC per file) but touches many repos.
Phasing
Broker side (this repo)
1. `ai.images.generate` (port the OpenAI/Together image-gen wrapper from `herolib_ai/image_generation/`).
2. Add `from_default_socket()` to `hero_aibroker_sdk`.
3. Promote `PromptBuilder` into the SDK.
4. `hero_lib` — mark `hero_lib/crates/ai` as deprecated (one possible mechanism is sketched below). Keep it building for now so consumers can migrate at their own pace.
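On sub-task 4: Rust has no crate-level deprecation, so one low-effort option — an assumption, not a decided mechanism — is `#[deprecated]` on the crate's public entry points, which warns every consumer at compile time without breaking their builds:

```rust
// hero_lib/crates/ai — sketch, not committed code. Annotating the public
// types makes every downstream `herolib_ai::AiClient` use emit a compiler
// warning while the crate itself keeps compiling.
#[deprecated(note = "herolib_ai is being retired; use hero_aibroker_sdk over UDS instead")]
pub struct AiClient {
    // fields elided
}
```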
Consumer migration (one PR per repo, parallel)

- `hero_books` — librarian + transcription + MCP. Already on a refactor branch (`development_hero_memory`), good first candidate.

Removal

Once consumers are off `herolib_ai`, drop the crate.
Open questions

- Does `ai.chat` support SSE/streaming yet? `herolib_ai`'s direct path does. If not, chat-streaming consumers are blocked until streaming lands on the broker side.
- `ai.chat` already accepts `tools` per its openrpc.json. Need to confirm it works through the SambaNova/Groq route translations.
- `hero_lib`'s Rhai bindings expose `Model::*`. Does Rhai consume `hero_aibroker_sdk` directly, or does it need a thin shim?

Out of scope

- Changes to the `ai.chat` request/response shape (already OpenAI-compatible).

cc whoever owns `hero_lib/crates/ai`
Consumer inventory
grep -rln "use herolib_ai\|herolib_ai::"across~/code/forge.ourworld.tf/lhumina_code/, excludinghero_lib/itself andtarget/:hero_researcherhero_bookshero_rpchero_agenthero_editorhero_lib_rhaihero_skillshero_synchero_voicehero_webbuilderTotal: 27 files across 10 repos.
hero_researcheris the heaviest user (likely a multi-step agent loop that calls the client from many places); the rest are 1–3 file touches per repo, which fits the "small migration per consumer" estimate in the issue body.hero_rpc_osis(mentioned in the original body) was a guess from an earlier session — it's actuallyhero_rpc(2 files) and not in the OSIS subset specifically.hero_skillshas 1 file — likely an example or doc snippet rather than runtime code; worth verifying it's not just markdown.Blocks lhumina_code/hero_biz#15
Blocks lhumina_code/hero_biz#15

The minimum viable prerequisite for hero_biz#15 (Route AiClient through local AI Broker) is step 1 of this issue:

- the `from_default_socket()` constructor on `HeroAibrokerClient`
- `rest.sock` serving `/v1/chat/completions` without forcing `stream: true`

`hero_biz` uses non-streaming chat exclusively (`analyze_intent`, `generate_suggestions`, all assistant calls). Until this lands, hero_biz#15 is fully blocked.
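To make the second bullet concrete, a minimal non-streaming smoke test over the UNIX socket, std-only. The `rest.sock` location is an assumption (mirroring the `rpc.sock` convention below), and the endpoint shape is just the OpenAI-compatible contract the issue describes:

```rust
use std::io::{Read, Write};
use std::os::unix::net::UnixStream;

fn main() -> std::io::Result<()> {
    // Assumed location, mirroring the rpc.sock convention — adjust as needed.
    let home = std::env::var("HOME").expect("HOME not set");
    let sock = format!("{home}/hero/var/sockets/hero_aibroker/rest.sock");

    // OpenAI-compatible body; note: no "stream": true — hero_biz never streams.
    let body = r#"{"model":"groq-strong","messages":[{"role":"user","content":"ping"}]}"#;
    let req = format!(
        "POST /v1/chat/completions HTTP/1.1\r\nHost: localhost\r\nContent-Type: application/json\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{}",
        body.len(),
        body
    );

    let mut stream = UnixStream::connect(sock)?;
    stream.write_all(req.as_bytes())?;
    let mut resp = String::new();
    stream.read_to_string(&mut resp)?; // raw HTTP response, fine for a smoke test
    println!("{resp}");
    Ok(())
}
```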
Phase 1 (broker side) — first slice landed in #64

#64 covers two of the four phase-1 sub-tasks from the issue body:
- `hero_aibroker_sdk::default_socket_path()` + `AIBrokerAdminAPIClient::connect_default()` / `AIBrokerRawClient::connect_default()`. Resolution order: `HERO_AIBROKER_SOCKET` → `$HERO_SOCKET_DIR/hero_aibroker/rpc.sock` → `$HOME/hero/var/sockets/hero_aibroker/rpc.sock` → `/tmp/...`. Matches the convention already used elsewhere (e.g. `hero_db`'s `default_socket_path`).
- `hero_aibroker_sdk/src/lib.rs` — chat shape, the alias-string `model` field replacing `Model::*`, embed/transcribe/tts/image, streaming, and `AIBrokerRawClient` for OpenRouter passthrough.
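The documented fallback chain as a sketch — not the shipped `default_socket_path()` source, and the final `/tmp/...` component is elided in #64, so that leg is a placeholder:

```rust
use std::path::PathBuf;

// Sketch of the resolution order above, not the actual SDK implementation.
fn default_socket_path() -> PathBuf {
    if let Ok(p) = std::env::var("HERO_AIBROKER_SOCKET") {
        return PathBuf::from(p); // explicit override wins
    }
    if let Ok(dir) = std::env::var("HERO_SOCKET_DIR") {
        return PathBuf::from(dir).join("hero_aibroker/rpc.sock");
    }
    if let Ok(home) = std::env::var("HOME") {
        return PathBuf::from(home).join("hero/var/sockets/hero_aibroker/rpc.sock");
    }
    // "/tmp/..." fallback — exact path elided in #64; placeholder only.
    PathBuf::from("/tmp/hero_aibroker/rpc.sock")
}
```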
Status check on the rest of phase 1 against current `development`:

- `ai.images.generate` — the broker already has `ai.image` (openrpc.json line 1127) routing through `ai.chat` against image-capable models, returning `image_base64` + `image_data_url`. Compared to `herolib_ai`'s `image_generation` module, the gap is: no `aspect_ratio`/`image_size` params, no `text` companion field, no `model` echoed in the response. Open question: do consumers actually need those, or is the leaner `ai.image` enough? `hero_books`-style use cases probably don't, but the Gemini 3.1 Flash extended ratios from `herolib_ai` would matter for image-first apps.
- `PromptBuilder` + `VerifyFn` promotion to the SDK — still pending, marked optional in the issue. Reasonable to defer until a consumer actively needs it during migration (i.e. once the `hero_books` or `hero_researcher` ports come up).
Suggested next step

Take a single consumer end-to-end through the migration before generalising. `hero_books` is the trigger case from the issue, has 3 files in the inventory, and an existing refactor branch — ideal first port. That migration will surface whether `PromptBuilder` actually needs to land in the SDK, and whether the `ai.image` gap is real.

cc @timur
Update: herolib_ai 0.6.0 is already broker-first
Looking at `origin/development` after merging hero_books PR #123, `herolib_ai`'s most recent commit (bb69b40a, "refactor(ai): rewrite as broker-first architecture") has already done a lot of what this issue proposed:

- `AiClient` is now async and wraps `hero_aibroker_sdk::AIBrokerRawClient` — `AiClient::default_socket().await?` instead of `AiClient::from_env()`.
- The `Model::Llama3_3_70B` enum + `ProviderMapping` table are gone, replaced by string-based `Model::new("groq-strong")` / `ModelRef::parse(…)`. The catalog now lives in the broker's `modelsconfig.yml`.
- `Provider`/`ProviderConfig` types are gone — the broker owns provider auth.
- The original failure mode (`~/hero/var/hero_aibroker/.env` carrying `GROQ_API_KEYS` while `herolib_ai` couldn't see it) goes away in 0.6.0: there's no direct-to-provider path, all chat goes through the broker, only the broker needs keys.
- Streaming: `chat_stream(req).await` returning `ChatChunkStream`.

So the original framing of this issue (collapse two parallel clients) is partially obsolete.
Remaining work
- Consumer migration — bump `herolib_ai = "0.5.0"` → `"0.6.0"` in every consumer Cargo.toml and rewrite the callsites (a sketch follows this list). Done in PR #123 for `hero_books_server` (5 callsites: 2× ai-summary, 1× transcribe, 1× MCP `tool_ask`, 1× cleanup chat). "Done" means the librarian in the UI now renders an actual summary instead of "Llama 3.3 70B not available on any configured provider".
- Decide the long-term shape — does `herolib_ai` stay as an ergonomic layer (Model catalog, PromptBuilder, VerifyFn, ImageGenerationRequest builder) on top of `hero_aibroker_sdk`, or do those helpers move into the SDK and `herolib_ai` retires? My read: the helpers are useful, and they are what makes a thin wrapper crate justify itself. Either is fine — the duplication concern is gone now that there's only one transport.
- `ai.images.generate` — still missing in the broker. `herolib_ai/image_generation/` builds the OpenAI-style request shape but the broker has no endpoint to receive it. (See the original issue body.)
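A rough before/after of one callsite to size the mechanical change. Only the constructors and the model API are confirmed by commit bb69b40a; `chat`'s exact signature and the message type are assumptions:

```rust
// before — herolib_ai 0.5.x: sync, process-env keys, enum catalog
let client = AiClient::from_env()?;
let reply = client.chat(Model::Llama3_3_70B, &messages)?; // assumed shape

// after — herolib_ai 0.6.0: async, broker-only transport, string aliases
let client = AiClient::default_socket().await?;
let reply = client.chat(Model::new("groq-strong"), &messages).await?; // assumed shape
```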
Per-repo migration backlog (post-0.6.0 bump)

- `hero_books`
- `hero_researcher`
- `hero_rpc`
- `hero_agent`
- `hero_editor`
- `hero_lib_rhai`
- `hero_skills`
- `hero_sync`
- `hero_voice`
- `hero_webbuilder`

Each is a small mechanical migration (sync → async, enum → string, error type tweaks). `hero_researcher` is the only chunky one.