[nu-demo] CANONICAL — Intelligence pipeline documented end-to-end: hero_agent ↔ aibroker ↔ embedder ↔ hero_books ↔ docs_hero #27
Scope
This issue consolidates everything we learned during the heronu demo build about how Hero OS's AI pipeline actually works in production. It's the single place a dev/ops person new to the stack should land before touching any of: triage classifier, tool routing, LLM dispatch, MCP wiring, library indexing, or semantic tool selection.
The equivalent user-facing doc has been committed as `docs_hero/collections/hero_os_guide/ai_pipeline.md` on the `development_mik_nu_demo` branch — so the AI Assistant can cite its own architecture via `search_hero_docs` once indexed. Closed loop.

Short version
1. Triage — the hidden branch point
Source: `hero_agent/crates/hero_agent/src/agent.rs` — the triage classifier decides whether the LLM gets called WITH or WITHOUT tools.

- triage = Knowledge: the LLM receives only the system prompt + message. No tools. No `search_hero_docs`. The model answers from training priors.
- triage = Tools: the LLM receives the system prompt + message + tool list (after selection in §2). Tool calls are possible; grounding is possible.

Demo bug (see lhumina_code/home#152): "What is Hero OS?" triages as Knowledge, so grounding never happens even when the rest of the stack is perfectly wired. The MANDATORY system prompt directive in `prompt.rs` is only half of the fix — the triage classifier has to route hero-keyword messages to Tools.

2. Tool selection — the key insight you MUST know
Source: `hero_agent/crates/hero_agent/src/semantic_router.rs` + `config.rs`.

Three routing modes, switchable via `HERO_AGENT_ROUTING_MODE`: `pattern`, `semantic`, and `hybrid`; an `always_include` allowlist pins chosen tools regardless of mode.

When `mcp.json` wires 7 MCP services (hero_osis, hero_books, hero_aibroker, hero_foundry, hero_indexer, hero_embedder, hero_proc), the total is ≈ 58 built-in + 107 MCP = 165 tools. Pattern mode ships all 165, which triggers:

- `Invalid 'tools': array too long. Expected maximum length 128, got 165` (hard limit, documented).
- `tools.60.custom.name: String should match pattern ^[a-zA-Z0-9_-]{1,128}$` — MCP tool names like `hero_osis.contact.list` contain dots, which violate the regex.
- `Duplicate function declaration found: agent_run` — adapter name-mangling collision.

Every single LLM target failed because of this. See lhumina_code/home#153.
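The two name-related failures can be illustrated with a small sketch — not hero_agent's actual code, just the validation rule from the error message and the proposed dot-to-underscore sanitizer (helper names here are made up):

```rust
use std::collections::HashMap;

/// Check a tool name against the documented pattern ^[a-zA-Z0-9_-]{1,128}$
/// (simple enough that no regex crate is needed).
fn is_valid_tool_name(name: &str) -> bool {
    !name.is_empty()
        && name.len() <= 128
        && name
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-')
}

/// Sanitize dotted MCP method names for the LLM, keeping a map back to the
/// original so dispatch still uses the real JSON-RPC method name.
/// (A real implementation should also detect sanitized-name collisions.)
fn sanitize(tools: &[&str]) -> HashMap<String, String> {
    tools
        .iter()
        .map(|t| (t.replace('.', "_"), t.to_string()))
        .collect()
}

fn main() {
    assert!(is_valid_tool_name("search_hero_docs"));
    assert!(!is_valid_tool_name("hero_osis.contact.list")); // dots rejected

    let map = sanitize(&["hero_osis.contact.list", "collections.list"]);
    // The LLM sees the sanitized name; dispatch looks up the original.
    assert_eq!(map["hero_osis_contact_list"], "hero_osis.contact.list");
    println!("ok");
}
```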
Why semantic routing is the architectural answer
On hero_agent startup, `SemanticRouter`:

- collects the built-in tools from `hero_agent_tools` plus the MCP-discovered tools from `mcp_client`;
- if the stored index fingerprint (`tool_index_hash`) is unchanged, skips re-embedding;
- otherwise embeds each tool description via `hero_embedder` (Q1 Fast, 384-dim BGE-small) and upserts into namespace `hero_agent_tools`.

At runtime, each user message is embedded, the nearest tools are retrieved, and a similarity threshold (default 0.3) drops low-relevance matches.

Result: the tool count stays well under 128, only the most relevant tools get offered, and tool-selection quality goes up (the LLM isn't distracted by 150 irrelevant options).
The env vars that matter
- `HERO_AGENT_ROUTING_MODE`: default `pattern`; heronu demo: `hybrid`
- `HERO_AGENT_EMBEDDER_SOCKET`: `$HOME/hero/var/sockets/hero_embedder/rpc.sock`
- `HERO_AGENT_SEMANTIC_TOP_K`: default `5`; heronu demo: `10`
- `HERO_AGENT_SEMANTIC_THRESHOLD`: default `0.3`; heronu demo: `0.25`

Applied on heronu via `hero_proc action set /home/driver/hero_agent_action.json`.

Still-remaining gap
Even with semantic routing, if any of the top-K tools is an MCP tool with dots in its name, Anthropic still rejects the request. Proper fix: sanitize `.` → `_` when emitting tool names to the LLM, and preserve the original name for dispatch. ~5 lines in `mcp_client.rs` + `tool_router.rs`. Should be bundled with lhumina_code/home#153.

3. LLM dispatch — always via hero_aibroker
Source: `hero_agent/crates/hero_agent/src/llm_client.rs`.

hero_agent never calls a provider API directly in production. All traffic routes through hero_aibroker's OpenAI-compatible REST API at `/v1/chat/completions`.

Why this matters: provider keys change, models get deprecated, rate limits need centralizing. aibroker owns all of that. One config file (`modelsconfig.yml`) controls the provider-routing table, and every Hero service (agent, voice STT, TTS, future embeddings fallback) sees the same interface.
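For orientation, a hypothetical request body in the usual OpenAI-compatible shape — the model name, message contents, and tool schema are illustrative assumptions, not taken from `modelsconfig.yml`:

```json
{
  "model": "some-claude-model",
  "messages": [
    {"role": "system", "content": "...MANDATORY grounding directive..."},
    {"role": "user", "content": "What is Hero OS?"}
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "search_hero_docs",
        "description": "Search the Hero documentation library",
        "parameters": {
          "type": "object",
          "properties": {"query": {"type": "string"}},
          "required": ["query"]
        }
      }
    }
  ]
}
```

Because aibroker presents this one interface, swapping the backing provider is a config change in `modelsconfig.yml`, not a code change in hero_agent.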
tool_choicehints, MANDATORY system prompt directives — best for grounded answers.The hero_zero Docker precedent always sets
AIBROKER_API_ENDPOINT— this was the ONE env var most missing from the nu-shell service_agent.nu (which is itself missing — lhumina_code/home#135).4. Tool execution — built-in vs MCP
Built-in (hero_agent_tools crate, ~58 tools)

Names are regex-clean and underscored. Dispatch is local — no network hop. Examples:

- `search_hero_docs` — Unix socket to `hero_books/rpc.sock`, method `search.query`
- `list_files` / `read_file` / `write_file` — WebDAV via `hero_foundry/rpc.sock`
- `osis_contact_list` / `osis_event_create` / ... — per-domain OSIS sockets
- `run_command` — shell, sandbox-gated via `SafetyLevel`
- `python_exec` — ephemeral Python via `uv`
- `browse_url` — headless Chromium via the hero_browser service

MCP (via hero_router `/mcp/{service_name}`)
Wired through `~/hero/var/agent/mcp.json`:

hero_router exposes `POST /mcp/{service}` for every service whose `rpc.sock` serves an OpenRPC spec. On startup, hero_agent's `mcp_client` does MCP `initialize`, reads the tool list (derived from the OpenRPC spec), and registers each tool.
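The actual `mcp.json` contents aren't reproduced in this issue; a hypothetical minimal shape wiring the 7 services from §2 (the field names and socket paths are assumptions, modeled on the `$HOME/hero/var/sockets/<service>/rpc.sock` convention seen elsewhere in this issue):

```json
{
  "services": {
    "hero_osis":     {"socket": "~/hero/var/sockets/hero_osis/rpc.sock"},
    "hero_books":    {"socket": "~/hero/var/sockets/hero_books/rpc.sock"},
    "hero_aibroker": {"socket": "~/hero/var/sockets/hero_aibroker/rpc.sock"},
    "hero_foundry":  {"socket": "~/hero/var/sockets/hero_foundry/rpc.sock"},
    "hero_indexer":  {"socket": "~/hero/var/sockets/hero_indexer/rpc.sock"},
    "hero_embedder": {"socket": "~/hero/var/sockets/hero_embedder/rpc.sock"},
    "hero_proc":     {"socket": "~/hero/var/sockets/hero_proc/rpc.sock"}
  }
}
```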
Tool names from MCP: `<namespace>.<verb>` (e.g. `collections.list`, `search.query`). Valid JSON-RPC method names, invalid Anthropic function names. See §2.

5. The grounding subloop — `search_hero_docs`

The critical tool for Hero-related questions. Full trace:
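As a stand-in for the full trace, a sketch of the request the tool ultimately sends to hero_books — assuming plain JSON-RPC 2.0 with method `search.query`; the parameter names are assumptions, the JSON is assembled by hand (no escaping), and the Unix-socket I/O is omitted so the sketch runs standalone:

```rust
/// Build the JSON-RPC 2.0 request a search_hero_docs call would send to
/// hero_books' rpc.sock. "query" and "namespace" param names are
/// illustrative assumptions; real code would use a JSON library.
fn search_request(id: u32, query: &str, namespace: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{id},"method":"search.query","params":{{"query":"{query}","namespace":"{namespace}"}}}}"#
    )
}

fn main() {
    let req = search_request(1, "What is Hero OS?", "hero");
    assert!(req.contains(r#""method":"search.query""#));
    assert!(req.contains(r#""namespace":"hero""#));
    println!("{req}");
}
```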
Per-library search gap: the current tool hardcodes `namespace="hero"`. To search geomind / mycelium / ourworld, we need either (a) an optional `namespace`/`library` param on the tool, or (b) a separate tool per library. The hero_books `search.query` RPC does take a namespace param; the agent-side tool just doesn't expose it.

6. Library → embedder indexing (how docs get into the store)
On hero_books startup:
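A sketch of the cache-check logic this startup sequence implies (and that lhumina_code/home#158 gets wrong): the stored hash must be computed over the same representation that gets re-hashed later. Helper names are hypothetical, and `DefaultHasher` stands in for whatever stable content hash hero_books actually uses:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in content hash (real code needs a hash that is stable across
/// processes and versions, which DefaultHasher is not).
fn content_hash(text: &str) -> u64 {
    let mut h = DefaultHasher::new();
    text.hash(&mut h);
    h.finish()
}

/// Re-embed only when the cached hash doesn't match the exported .md.
fn needs_reembedding(exported_md: &str, cached_hash: Option<u64>) -> bool {
    cached_hash != Some(content_hash(exported_md))
}

fn main() {
    let source = "# Page\nsome *source* form";
    let exported = "Page\nsome source form (exported .md)";

    // Bug shape (#158): the stored hash was taken over the source
    // representation, so it never matches the exported .md and the
    // cache is bypassed on every startup.
    let stale_cache = Some(content_hash(source));
    assert!(needs_reembedding(exported, stale_cache));

    // Fix shape: store the hash of the exported .md itself.
    let good_cache = Some(content_hash(exported));
    assert!(!needs_reembedding(exported, good_cache));
    println!("ok");
}
```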
See lhumina_code/home#158 — the pre-shipped `.ai/` cache is currently bypassed because `content_hash` is computed on the exported `.md` but the stored hash is from the source (pre-export) representation.

7. Service topology matrix
```
cargo build -p hero_agent_server --release
cargo build -p hero_books_server
cargo build -p hero_router
```

8. Diagnostic commands
9. Cross-references — everything this issue touches
`.ai/<page>.toml` cache

10. The killer demo — what it should look like
With all the above working (hybrid routing on, Claude-backed aibroker, hero_books libraries indexed, triage override for hero-keywords, tool-name sanitizer):
- `search_hero_docs`: Claude calls it; the response cites (hero_os_guide, overview/quickstart/ai_pipeline) with ~1s latency.
- `search_hero_docs` hits `ai_pipeline.md` (this very doc); the response explains the pipeline from the doc's own text. Closed loop: the AI reads its own documentation.
- `list_files`: the response shows WebDAV contents.

11. What's on development_mik_nu_demo

Docs_hero branch
`development_mik_nu_demo` (NOT pushed) — commits:

- `2c88aa2` — refreshed architecture.md + services.md + overview.md + quickstart.md for the nu-shell / hero_proc / hero_agent era
- `6deb8df` — added ai_pipeline.md (this content in user-doc form)

heronu VM state:
- `HERO_AGENT_ROUTING_MODE=hybrid` + top_k=10 + threshold=0.25 applied via `hero_proc action set`
- `AIBROKER_API_ENDPOINT` + `HERO_AGENT_AIBROKER_MODELS` + `OSIS_URL`/`OSIS_CONTEXT` in agent env

Signed-off-by: mik-tf
Originally filed as home#159 on 2026-04-24 by mik-tf — moved to hero_demo as part of consolidating issue tracking.