[nu-demo] CANONICAL — Intelligence pipeline documented end-to-end: hero_agent ↔ aibroker ↔ embedder ↔ hero_books ↔ docs_hero #159

Closed · opened 2026-04-24 02:48:10 +00:00 by mik-tf · 1 comment

## Scope

This issue consolidates everything we learned during the heronu demo build about how Hero OS's AI pipeline actually works in production. It's the single place a dev/ops person new to the stack should land before touching any of: triage classifier, tool routing, LLM dispatch, MCP wiring, library indexing, or semantic tool selection.

The equivalent user-facing doc has been committed as [`docs_hero/collections/hero_os_guide/ai_pipeline.md`](https://forge.ourworld.tf/lhumina_code/docs_hero/src/branch/development_mik_nu_demo/collections/hero_os_guide/ai_pipeline.md) on the `development_mik_nu_demo` branch — so the AI Assistant can cite its own architecture via `search_hero_docs` once indexed. **Closed loop.**

## Short version

```
┌──────────────────────────────────────────────────────────────────────────┐
│ Browser → POST /hero_agent/rpc method=agent.chat                        │
│                                                                          │
│   hero_agent_server                                                      │
│     1. triage         (Knowledge vs Tools)                               │
│     2. tool selection  pattern | semantic | hybrid (← via embedder)     │
│     3. LLM call        via hero_aibroker → Claude / GPT / Groq / OR     │
│     4. tool exec       built-in (58) OR MCP via hero_router /mcp/{svc}  │
│     5. answer → browser                                                  │
│                                                                          │
│   Grounding subloop: search_hero_docs →                                  │
│     hero_books.search.query →                                            │
│     hero_embedder.embed + cosine + rerank →                              │
│     docs_hero/docs_geomind/docs_mycelium/docs_owh Q&A corpus             │
└──────────────────────────────────────────────────────────────────────────┘
```

## 1. Triage — the hidden branch point

Source: `hero_agent/crates/hero_agent/src/agent.rs` — triage classifier decides whether the LLM gets called WITH or WITHOUT tools.

  • `triage = Knowledge`: LLM receives only system prompt + message. No tools. No `search_hero_docs`. The model answers from training priors.
  • `triage = Tools`: LLM receives system prompt + message + tool list (after selection in §2). Tool calls are possible; grounding is possible.

**Demo bug** (see #152): "What is Hero OS?" triages as Knowledge, so grounding never happens even when the rest of the stack is perfectly wired. The MANDATORY system prompt directive in `prompt.rs` is only half of the fix — the triage classifier has to route hero-keyword messages to Tools.
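To observe the branch point from outside, compare two phrasings over the `agent.chat` RPC from §8. This is a probe sketch; the per-phrasing triage outcomes reflect the demo behavior described above, not a guarantee:

```bash
# Triages as Knowledge today (#152): answer comes from training priors, no tool calls.
curl -s --unix-socket ~/hero/var/sockets/hero_agent/rpc.sock -X POST http://localhost/rpc \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"agent.chat","params":{"message":"What is Hero OS?"}}'

# Explicitly tool-shaped phrasing should triage as Tools and ground via search_hero_docs.
curl -s --unix-socket ~/hero/var/sockets/hero_agent/rpc.sock -X POST http://localhost/rpc \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":2,"method":"agent.chat","params":{"message":"Search the hero docs for: what is Hero OS, then answer from the results"}}'
```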

## 2. Tool selection — the key insight you MUST know

Source: `hero_agent/crates/hero_agent/src/semantic_router.rs` + `config.rs`.

Three routing modes, switchable via `HERO_AGENT_ROUTING_MODE`:

| Mode | Behavior | env value |
|---|---|---|
| Pattern (default!) | Ships ALL registered tools to the LLM | `pattern` |
| Semantic | Embeds user query, cosine-ranks tools, sends top-K | `semantic` |
| Hybrid | Semantic + `always_include` allowlist | `hybrid` |

When `mcp.json` wires 7 MCP services (hero_osis, hero_books, hero_aibroker, hero_foundry, hero_indexer, hero_embedder, hero_proc), total tools ≈ 58 built-in + 107 MCP = **165**. Pattern mode ships all 165, which triggers:

  • **OpenAI**: `Invalid 'tools': array too long. Expected maximum length 128, got 165` (hard limit, documented).
  • **Anthropic Claude**: `tools.60.custom.name: String should match pattern ^[a-zA-Z0-9_-]{1,128}$` — MCP tool names like `hero_osis.contact.list` contain dots, which violate the regex.
  • **Gemini-2.5-flash**: `Duplicate function declaration found: agent_run` — an adapter name-mangling collision.

**Every single LLM target failed** because of this. See #153.
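You can reproduce the tool count without reading code: ask each wired MCP endpoint for its `tools/list` and sum the results (sketch assumes `jq` is available; the service list mirrors the mcp.json from §4):

```bash
# Count MCP-exposed tools per service via the documented tools/list method.
total=0
for svc in hero_osis hero_books hero_aibroker hero_foundry hero_indexer hero_embedder hero_proc; do
  n=$(curl -s -X POST "http://10.1.2.2:9988/mcp/$svc" \
        -H 'Content-Type: application/json' \
        -d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' \
      | jq '.result.tools | length')
  echo "$svc: $n tools"
  total=$((total + n))
done
echo "MCP total: $total (plus ~58 built-in; pattern mode ships the whole sum)"
```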

### Why semantic routing is the architectural answer

On hero_agent startup, `SemanticRouter`:

  1. Walks built-in tools from `hero_agent_tools` + MCP-discovered tools from `mcp_client`.
  2. For each tool, emits a text like:

         Tool: search_hero_docs
         Description: Search Hero OS documentation via hero_books. Returns relevant doc snippets...

  3. SHA-256 hashes the full tool index. If it matches the previously persisted hash (stored in embedder KVS under `tool_index_hash`), re-embedding is skipped (see the sketch after this list).
  4. Else: batch-embeds via `hero_embedder` (Q1 Fast, 384-dim BGE-small) and upserts into namespace `hero_agent_tools`.
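The hash gate in step 3 is plain content-addressing. A shell analogy of the same logic, with hypothetical file locations (the real check runs in-process against the embedder KVS):

```bash
# Recompute the index fingerprint and compare with the last persisted one.
new_hash=$(sha256sum /tmp/tool_index.txt | cut -d' ' -f1)
old_hash=$(cat ~/hero/var/agent/tool_index_hash 2>/dev/null)

if [ "$new_hash" = "$old_hash" ]; then
  echo "tool index unchanged, skipping re-embed"
else
  echo "tool index changed, re-embed and persist the new hash"
  printf '%s' "$new_hash" > ~/hero/var/agent/tool_index_hash
fi
```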

At runtime, each user message:

  1. Is embedded identically (same model, same namespace).
  2. Is cosine-searched against the tool index, returning top-K with scores.
  3. Passes the threshold filter (default `0.3`), which drops low-relevance matches.
  4. The survivors — typically 5-10 — are the only tools shipped to the LLM.

**Result:** tool count stays well under 128, only the most relevant tools get offered, and tool-selection quality goes up (the LLM isn't distracted by 150 irrelevant options).

### The env vars that matter

| Var | Default | Recommended |
|---|---|---|
| `HERO_AGENT_ROUTING_MODE` | `pattern` | `hybrid` |
| `HERO_AGENT_EMBEDDER_SOCKET` | `$HOME/hero/var/sockets/hero_embedder/rpc.sock` | unchanged |
| `HERO_AGENT_SEMANTIC_TOP_K` | `5` | `10` |
| `HERO_AGENT_SEMANTIC_THRESHOLD` | `0.3` | `0.25` |

Applied on heronu via `hero_proc action set /home/driver/hero_agent_action.json`.
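Off heronu (e.g. a local dev run), the same knobs can be exported directly before starting the server. A minimal sketch, assuming the binary from the §7 rebuild command and that these vars are read at startup:

```bash
# Recommended semantic-routing knobs from the table above.
export HERO_AGENT_ROUTING_MODE=hybrid
export HERO_AGENT_SEMANTIC_TOP_K=10
export HERO_AGENT_SEMANTIC_THRESHOLD=0.25
# HERO_AGENT_EMBEDDER_SOCKET keeps its default ($HOME/hero/var/sockets/hero_embedder/rpc.sock).
./target/release/hero_agent_server
```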

### Still-remaining gap

Even with semantic routing, if any of the top-K tools is an MCP tool with dots in its name, Anthropic still rejects the request. Proper fix: sanitize `.` → `_` when emitting tool names to the LLM while preserving the original name for dispatch — roughly 5 lines in `mcp_client.rs` + `tool_router.rs`. Should be bundled with #153.
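The mapping itself is trivial; the real work is keeping a reverse map so dispatch still uses the original dotted JSON-RPC name. A shell illustration of the rename (the actual fix belongs in the Rust files named above):

```bash
orig='hero_osis.contact.list'
safe=$(printf '%s' "$orig" | tr '.' '_')   # hero_osis_contact_list, matches ^[a-zA-Z0-9_-]{1,128}$
echo "advertise '$safe' to the LLM, dispatch '$orig' over MCP"
```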

## 3. LLM dispatch — always via hero_aibroker

Source: `hero_agent/crates/hero_agent/src/llm_client.rs`.

hero_agent **never calls a provider API directly in production**. All traffic routes through hero_aibroker's OpenAI-compatible REST at `/v1/chat/completions`.

Why this matters: provider keys change, models get deprecated, and rate limits need centralizing. aibroker owns all of that. One config file (`modelsconfig.yml`) controls the provider-routing table, and every Hero service (agent, voice STT, TTS, future embeddings fallback) sees the same interface.
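Because the surface is OpenAI-compatible, any standard chat-completions payload should work against it. A probe sketch, where the `rest.sock` path (inferred from the socket layout in §7) and the model id (borrowed from §8) are assumptions:

```bash
# Minimal chat-completions probe against aibroker's REST socket.
curl -s --unix-socket ~/hero/var/sockets/hero_aibroker/rest.sock \
  -X POST http://localhost/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model":"claude-haiku-4.5","messages":[{"role":"user","content":"ping"}]}'
```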

**Provider selection matters for demo quality:**

  • **Claude** (Haiku 4.5, Sonnet via the anthropic provider): honors `tool_choice` hints and MANDATORY system prompt directives — best for grounded answers.
  • **Gemini-Flash via OpenRouter**: ignores the tool-use preamble for questions it thinks it already knows. If aibroker is down and hero_agent falls back to OpenRouter-direct with Gemini as default, answers hallucinate.
  • **GPT-4o-mini**: good tool use, but subject to the 128-tool cap.

The hero_zero Docker precedent always sets `AIBROKER_API_ENDPOINT` — the one env var most conspicuously absent from the nu-shell service_agent.nu (which is itself missing entirely — #135).

## 4. Tool execution — built-in vs MCP

### Built-in (hero_agent_tools crate, ~58 tools)

Names are underscore-separated and regex-clean. Dispatch is local — no network hop. Examples:

  • `search_hero_docs` — Unix socket to `hero_books/rpc.sock`, method `search.query`
  • `list_files` / `read_file` / `write_file` — WebDAV via `hero_foundry/rpc.sock`
  • `osis_contact_list` / `osis_event_create` / ... — per-domain OSIS sockets
  • `run_command` — shell, sandbox-gated via `SafetyLevel`
  • `python_exec` — ephemeral Python via `uv`
  • `browse_url` — headless Chromium via the hero_browser service

### MCP (via hero_router /mcp/{service_name})

Wired through `~/hero/var/agent/mcp.json`:

```json
{
  "mcpServers": {
    "hero_books": {
      "url": "http://${HERO_ROUTER_HOST:-10.1.2.2}:${HERO_ROUTER_PORT:-9988}/mcp/hero_books",
      "transport": "http"
    },
    "hero_osis": { "url": "http://.../mcp/hero_osis", "transport": "http" }
  }
}
```

hero_router exposes `POST /mcp/{service}` for every service whose `rpc.sock` serves an OpenRPC spec. On startup, hero_agent's `mcp_client` performs the MCP `initialize` handshake, reads the tool list (derived from OpenRPC), and registers each tool.

Tool names from MCP follow `<namespace>.<verb>` (e.g. `collections.list`, `search.query`) — valid JSON-RPC method names, invalid Anthropic function names. See §2.
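To replay by hand what `mcp_client` does at startup (request shapes follow the MCP JSON-RPC spec; the `protocolVersion` value below is an assumption, use whatever the server accepts):

```bash
# 1. MCP handshake.
curl -s -X POST 'http://10.1.2.2:9988/mcp/hero_books' -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-03-26","capabilities":{},"clientInfo":{"name":"probe","version":"0.0.0"}}}'

# 2. Tool discovery (same call as §8); note the dotted names in the result.
curl -s -X POST 'http://10.1.2.2:9988/mcp/hero_books' -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":2,"method":"tools/list","params":{}}'
```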

## 5. The grounding subloop — `search_hero_docs`

The critical tool for Hero-related questions. Full trace:

```
1. LLM emits: call search_hero_docs({"query": "what is hero os", "top_k": 5})
                               │
2. hero_agent_tools::search_hero_docs.execute()
                               │
3. POST to hero_books/rpc.sock, method=search.query
                               │
4. hero_books_server dispatches:
   a. hero_embedder.embed(query, Q1 Fast) → 384-dim vector
      └─ hero_embedderd TCP 127.0.0.1:8092 → ONNX BGE-small
   b. cosine search in namespace="hero" (or param-filtered)
      └─ redb-backed HNSW index, 163 docs in hero_os_guide
   c. hero_embedder.rerank(query, top-20 candidates, Q1)
      └─ cross-encoder reranker for final relevance scores
                               │
5. Return top-K Q&A pairs with { question, answer, page, score, content_hash }
                               │
6. hero_agent injects as tool_result, LLM composes final grounded answer
   citing (book, page) — e.g. "(hero_os_guide, overview)"
```

**Per-library search gap**: the current tool hardcodes `namespace="hero"`. To search `geomind`/`mycelium`/`ourworld`, we need either (a) an optional `namespace`/`library` param on the tool, or (b) a separate tool per library. The hero_books `search.query` RPC does take a namespace param; the agent-side tool just doesn't expose it.
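Since the RPC layer already accepts a namespace, per-library search works today below the agent; only the tool hides it. A direct probe, where the exact param name `namespace` is inferred from the note above and should be treated as an assumption:

```bash
# Query the mycelium library directly over the hero_books RPC socket.
curl -s --unix-socket ~/hero/var/sockets/hero_books/rpc.sock -X POST http://localhost/rpc \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"search.query","params":{"query":"mycelium hosting plan","top_k":5,"namespace":"mycelium"}}'
```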

## 6. Library → embedder indexing (how docs get into the store)

On hero_books startup:

```
~/hero/var/books/libraries.txt  — one `name URL` per line
  │
  ├─ hero      https://forge.ourworld.tf/lhumina_code/docs_hero
  ├─ geomind   https://forge.ourworld.tf/geomind/docs_geomind
  ├─ mycelium  https://forge.ourworld.tf/coopcloud/docs_mycelium
  └─ ourworld  https://forge.ourworld.tf/ourworld/docs_owh
                               │
   For each entry:
     git clone/pull → /home/driver/code/docs_<name>/
                               │
   Scan collections/<coll>/*.md
                               │
     For each page:
       ├─ Check .ai/<page>.toml (pre-shipped Q&A cache)
       │    ├─ content_hash matches? → reuse, no LLM call
       │    └─ mismatch? → re-extract via aibroker (slow!)
       │
       ├─ Upsert Q&A pairs → hero_embedder namespace=<library>
       └─ (if .vectors.bin exists) reuse pre-computed embeddings
```

See #158 — the pre-shipped `.ai/` cache is currently bypassed because `content_hash` is computed on the exported `.md`, while the stored hash comes from the source (pre-export) representation.
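The shape of the #158 mismatch is visible on disk. Paths follow the diagram above; the page name and TOML field location are illustrative assumptions:

```bash
cd /home/driver/code/docs_hero
# Hash of the *exported* .md, which is what hero_books computes at index time...
sha256sum collections/hero_os_guide/overview.md
# ...vs the hash persisted in the pre-shipped Q&A cache (pre-export source).
grep content_hash collections/hero_os_guide/.ai/overview.toml
# While these disagree, the cache is treated as stale and aibroker re-extracts.
```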

## 7. Service topology matrix

| Component | Role | Socket / TCP | Rebuild command |
|---|---|---|---|
| hero_agent | AI orchestrator | rpc.sock | `cargo build -p hero_agent_server --release` |
| hero_aibroker | LLM router | rest.sock, ui.sock, rpc.sock | (binary) |
| hero_embedder | Vector DB + reranker | rpc.sock + daemon TCP:8092 | (binary, plus ONNX runtime install) |
| hero_books | Doc indexer + search | rpc.sock, ui.sock, web_admin.sock | `cargo build -p hero_books_server` |
| hero_router | TCP ingress :9988 | TCP only | `cargo build -p hero_router` |
| hero_foundry | WebDAV + git forge | rpc.sock (serves /api/files/{ctx}/{path}) | (binary) |
| hero_osis | 17 per-domain data services | hero_osis_<domain>/rpc.sock each | (17 binaries) |

## 8. Diagnostic commands

```bash
# Is the routing mode active?
cat /proc/$(pgrep -f hero_agent_server | head -1)/environ | tr '\0' '\n' | grep HERO_AGENT_ROUTING

# Are tools indexed into the embedder?
curl -s --unix-socket ~/hero/var/sockets/hero_embedder/rpc.sock -X POST http://localhost/rpc \
  -d '{"jsonrpc":"2.0","id":1,"method":"namespace.list","params":{}}'
# Look for `hero_agent_tools` namespace with >0 docs

# End-to-end grounded query:
curl -s --unix-socket ~/hero/var/sockets/hero_agent/rpc.sock -X POST http://localhost/rpc \
  -H 'Content-Type: application/json' \
  -H 'X-Hero-Context: default' \
  -d '{"jsonrpc":"2.0","id":1,"method":"agent.chat","params":{"message":"Call search_hero_docs with query=Hero OS pipeline and top_k=3, then summarize","model":"claude-haiku-4.5"}}'
# Expect response citing (hero_os_guide, ai_pipeline) or similar

# MCP tool discovery:
curl -s -X POST 'http://10.1.2.2:9988/mcp/hero_books' \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}'
```

## 9. Cross-references — everything this issue touches

  • #148 — nu-demo architecture index (topology, deploy flow)
  • #149 — prompt.rs directive strengthening + rebuild verification
  • #150 — tool_choice=required support (needs turn-1-only gating; pragmatic patch caused infinite loops)
  • #152 — **triage classifier routes hero questions to Knowledge path (the hallucination root cause)**
  • #153 — **tools payload sanitization (count > 128, dots-in-names regex, duplicates)**
  • #156 — Photos seed storage_path leading slash
  • #157 — Books island URL double-slash + server-side slash-collapse middleware demo fix
  • #158 — **hero_books re-runs LLM Q&A extraction despite pre-shipped `.ai/<page>.toml` cache**
  • #125 — hero_router X-Hero-Context header clobbering
  • #135 — service_agent.nu missing entirely
  • #136 — OPENROUTER_API_KEYS plural name
  • #145 — hero_embedder blocking reqwest in tokio async

## 10. The killer demo — what it should look like

With all the above working (hybrid routing on, Claude-backed aibroker, hero_books libraries indexed, triage override for hero-keywords, tool-name sanitizer):

  • User: "What is Hero OS?" → triage→Tools, semantic pulls search_hero_docs, Claude calls it, response cites (hero_os_guide, overview/quickstart/ai_pipeline) with ~1s latency.
  • User: "How does the AI Assistant work?" → search_hero_docs hits ai_pipeline.md (this very doc), response explains the pipeline from the doc's own text. Closed loop: the AI reads its own documentation.
  • User: "List files in my geomind workspace" → triage→Tools, semantic pulls list_files, response shows webdav contents.
  • User: "Summarize the mycelium hosting plan" → triage→Tools, semantic pulls a namespace-aware search tool (once https://forge.ourworld.tf/lhumina_code/home/issues/XXX lands), response grounds in docs_mycelium.

## 11. What's on `development_mik_nu_demo`

docs_hero branch `development_mik_nu_demo` (NOT pushed) — commits:

  • 2c88aa2 — refreshed architecture.md + services.md + overview.md + quickstart.md for the nu-shell / hero_proc / hero_agent era
  • 6deb8df — added ai_pipeline.md (this content in user-doc form)

heronu VM state:

  • prompt.rs MANDATORY directive compiled into hero_agent_server binary (strings count = 1 for MANDATORY / Hero Vired / sovereign digital)
  • HERO_AGENT_ROUTING_MODE=hybrid + top_k=10 + threshold=0.25 applied via hero_proc action set
  • AIBROKER_API_ENDPOINT + HERO_AGENT_AIBROKER_MODELS + OSIS_URL/OSIS_CONTEXT in agent env
  • mcp.json trimmed to hero_books only (pending name-sanitizer patch before widening)
  • 4 libraries cloned + indexed (hero=163, geomind=1733, mycelium/ourworld in progress)
  • WASM release + brotli sidecars (211MB → 1.8MB — #140)
  • 30 photo/video/song seed records rewritten to strip leading slash (#156)
  • 49 .docx files seeded into Office archipelago (default + geomind + incubaid + root + threefold contexts)

Signed-off-by: mik-tf

**mik-tf** (Author/Owner) commented:

Moved to hero_demo#27 — see https://forge.ourworld.tf/lhumina_code/hero_demo/issues/27