[arch] Curated MCP tool surface + semantic discovery via embedder/indexer (stop the brute-force) #15

Closed
opened 2026-05-01 02:05:20 +00:00 by mik-tf · 6 comments

Premise

The AI Assistant on herodemo can already query hero_biz and run Python — but the how is fragile and visibly ungainly. Live screenshot evidence from 2026-04-30:

"Let me try this... Let me check the environment for the socket path... Let me list available services... Let me search for the RPC socket... Let me check the service configuration... I see — the hero_biz service has a UI socket. Let me search for the RPC socket or check the service configuration. I notice there's hero_osis_business. Let me try that instead. Let me query it. Let me try using Python to interact with the socket. Let me check what methods are available... Perfect! I can see there's a contact.list method..."

That's ~30 lines of trial-and-error before the agent lands on a working call. It got there. But it brute-forced through socket paths and method names instead of having a tight, well-defined tool surface.

The pieces to fix this are already built:

  • hero_aibroker — LLM provider routing, MCP host
  • hero_embedder — semantic embeddings (already used by books library)
  • hero_indexer — triage + ranking
  • hero_agent — orchestrator

This issue is the wiring. We're not building infra; we're connecting infra that exists.

Vision

Every domain in Hero OS exposes a curated MCP tool surface with semantic descriptions. The agent does not enumerate sockets, does not guess RPC method names, does not try Python as a fallback. It calls one tool per intent.

When the agent receives a prompt:

  1. Embed the prompt via hero_embedder.
  2. Query hero_indexer for the top-N matching tool descriptions.
  3. The agent reads the descriptions, picks the right tool, calls it.
  4. hero_aibroker handles the LLM round-trip.
  5. Per-context routing is automatic — the tool client carries X-Hero-Context: <current> based on the user's active context. The agent doesn't have to think about it.
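The five steps above can be sketched as follows. This is a toy sketch with every hero_* call stubbed out; the function names (embed_prompt, top_tools, llm_pick) are illustrative, not the real client APIs:

```python
# Sketch of the per-prompt flow; all hero_* calls are stubs.

def embed_prompt(prompt):             # 1. hero_embedder (stub)
    return [0.1, 0.2, 0.3]

def top_tools(vector, n=5):           # 2. hero_indexer (stub)
    return ["biz.list_contacts"]

def llm_pick(tools, prompt):          # 3. LLM picks tool + args via hero_aibroker (stub)
    return tools[0], {"limit": 5}

def run(prompt, session):
    tool, args = llm_pick(top_tools(embed_prompt(prompt)), prompt)
    headers = {"X-Hero-Context": session["context"]}   # 5. auto-injected, never LLM-visible
    return tool, args, headers                         # 4. round-trip handled by hero_aibroker

result = run("who are my contacts?", {"context": "geomind"})
```

The point of the shape: the LLM only ever sees steps 2-3; embedding and context injection happen around it.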

Tool surface (per domain)

Initial set, ship in priority order:

Phase 1 — biz (the demo headliner)

biz.list_contacts(limit?, search?)
biz.list_persons(limit?, search?)
biz.list_companies(limit?, search?)
biz.list_deals(stage?, limit?)
biz.list_opportunities(stage?, limit?)
biz.add_contact(name, email?, company?, person_sid?)
biz.add_person(name, email?)
biz.add_company(name, domain?)
biz.add_deal(title, stage?, amount?, person_sid?)
biz.update_deal(sid, stage?, amount?)
biz.summarize_pipeline()      # convenience: aggregate counts + AI summary

Each tool has a structured description:

  • One-line purpose
  • Parameter schema with semantic names (not RPC field names)
  • Example prompts that should trigger it
  • Per-context behaviour (always reads/writes the active context)
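As a sketch, one registry entry might look like the following. The field names are illustrative; the actual mcp_tools.toml schema is not yet fixed:

```toml
[[tool]]
name = "biz.list_contacts"
purpose = "List contacts in the user's active context."
example_prompts = [
  "who are my contacts?",
  "show my 5 most recent contacts",
]
context = "active"   # always reads/writes the active context

[tool.params]
limit  = { type = "integer", required = false, doc = "max results" }
search = { type = "string",  required = false, doc = "substring filter on name/email" }
```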

Phase 2 — calendar / projects / tasks

calendar.list_events(date?, range?)
calendar.create_event(title, datetime, attendees?)
projects.list_active()
projects.create_task(project, title, due?)
tasks.list_today()

Phase 3 — content (books / files / photos / videos)

books.search(query, library?)
books.summarize_book(book_sid)
books.summarize_library(library_sid)
files.list_path(path, context?)
files.read_doc(path)
photos.list_recent(limit?)
videos.list_recent(limit?)

Phase 4 — OS-level (hero_route style)

os.open_island(island_id)               # routes the OS shell
os.switch_context(context_name)
os.set_theme(mode)                      # dark/light
os.run_python(script)                   # uv-isolated subprocess (already partially working)

Architecture

                 ┌──────────────┐
   user prompt ──▶  hero_agent   │
                 └───────┬──────┘
                         │ embed prompt
                         ▼
                 ┌──────────────┐
                 │ hero_embedder│
                 └───────┬──────┘
                         │ vectors
                         ▼
                 ┌──────────────┐
                 │ hero_indexer │ ←── tool registry (MCP descriptions, embeddings cached)
                 └───────┬──────┘
                         │ ranked tool list
                         ▼
                 ┌──────────────┐
                 │  hero_agent  │ ← LLM picks tool + args via hero_aibroker
                 └───────┬──────┘
                         │ tool call (X-Hero-Context auto-injected)
                         ▼
                 ┌──────────────┐
                 │ hero_osis_*  │  (or hero_books, hero_foundry, ...)
                 └──────────────┘

Tool registration mechanism

Each service ships a mcp_tools.toml (or generates one from its OpenRPC schema) describing its curated tool surface. On startup, hero_indexer:

  1. Reads each service's tool registry.
  2. Embeds each tool's description via hero_embedder.
  3. Stores the (tool, embedding, service-endpoint) triple in a queryable index.
  4. Exposes mcp.discover(prompt) which returns the top-N matching tools.

When a new service is added: drop in its mcp_tools.toml, re-embed, agent sees it. No agent-side code change.
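The four startup steps can be modeled as a toy in-memory index. embed() is a stand-in for hero_embedder (a bag-of-words counter, not a real embedding), and the socket endpoints are hypothetical:

```python
# Toy indexer: read registries, embed descriptions, store
# (tool, embedding, endpoint) triples, answer mcp.discover.
import re
from collections import Counter

INDEX = []  # (tool_name, embedding, service_endpoint)

def embed(text):
    return Counter(re.findall(r"\w+", text.lower()))

def register(registry, endpoint):
    for name, desc in registry.items():
        INDEX.append((name, embed(desc), endpoint))

def discover(prompt, top_n=3):
    qv = embed(prompt)
    scored = sorted(INDEX, key=lambda t: -sum((qv & t[1]).values()))
    return [(name, ep) for name, _, ep in scored[:top_n]]

register({"biz.list_contacts": "list contacts in the active context"},
         "unix:///run/hero/biz.sock")    # hypothetical endpoint
# Adding a new service later is just another register() call, no agent change:
register({"books.search": "search books and libraries by query"},
         "unix:///run/hero/books.sock")  # hypothetical endpoint
```

With this shape, "drop in its mcp_tools.toml, re-embed, agent sees it" is literally one register() call per service.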

Per-context routing

Every tool call carries the active context as a header. The user's "current context" is tracked by hero_os shell and pushed into the agent's session state. The agent's tool clients read this state and automatically inject the header. The LLM never sees or thinks about contexts — it sees biz.list_contacts(limit=5) and the right context is wired in transparently.
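A minimal sketch of that transparency, assuming the shell keeps a shared session dict up to date (names illustrative): the LLM-facing call signature has no context argument; the client wrapper injects the header.

```python
# Context injection lives in the tool client, not in the LLM's view of the tool.
class ToolClient:
    def __init__(self, session):
        self.session = session  # kept in sync by the hero_os shell

    def call(self, tool, **args):
        headers = {"X-Hero-Context": self.session["context"]}
        # A real client would POST this via hero_router; here we just return it.
        return {"tool": tool, "args": args, "headers": headers}

session = {"context": "geomind"}
client = ToolClient(session)
r1 = client.call("biz.list_contacts", limit=5)
session["context"] = "freezone"   # user switches context in the shell
r2 = client.call("biz.list_contacts", limit=5)
```

Same call, different context: the switch changes the header without the prompt or tool arguments changing at all.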

Acceptance

  • Phase 1 biz tools shipped and registered with hero_indexer.
  • AI Assistant on herodemo can answer "who are my contacts?" with a single clean tool call (no scratchpad of trial-and-error).
  • AI Assistant can answer "add Alice Smith from Acme" by calling biz.add_contact exactly once.
  • Switching context in the OS shell automatically changes which contacts the AI returns (no manual context juggling in the prompt).
  • mcp.discover RPC works against herodemo and returns ranked tools for arbitrary prompts.
  • Adding a new tool to a service's mcp_tools.toml and re-running the indexer makes it agent-visible without any hero_agent code change.
  • Phase 2-4 tools follow the same pattern; ship as time allows.

Why this matters

This is the single biggest leverage point for AI Assistant quality alongside docs_hero coverage. Every prompt the agent answers depends on it picking the right action. With curated tools + semantic discovery, the agent looks like it understands the OS. Without them, it looks like a smart-but-confused junior who's never seen this codebase.

Cross-references

  • Vision: https://forge.ourworld.tf/lhumina_code/hero_demo/issues/52
  • 24-hour demo plan: https://forge.ourworld.tf/lhumina_code/hero_demo/issues/53 (calls for Phase 1 in the first 4 hours)
  • docs_hero coverage: filed alongside (the agent grounds on docs_hero — this issue is the action layer; docs_hero is the knowledge layer)
  • Per-context routing: https://forge.ourworld.tf/lhumina_code/hero_osis/issues/43 (architectural — contexts as data)

Signed-off-by: mik-tf

mik-tf self-assigned this 2026-05-01 02:05:20 +00:00

Read AND write — the closed loop is the point

The Phase 1-4 tool list above already mixes *.list_* (read) and *.add_* / *.update_* (write) tools, but it's worth calling out the architectural loop explicitly, because the read+write closed loop is what makes Hero OS feel intelligent rather than merely informative.

The closed loop

   user prompt
       │
       ▼
   hero_agent  ─── reads ──────► hero_indexer (ranked tools)
       │                            ▲
       ▼                            │
   biz.add_contact(...)             │
       │                            │
       ▼                            │
   hero_osis_business               │
       │ (writes record)            │
       ▼                            │
   hero_embedder (re-embeds)        │
       │                            │
       └──── new vectors ───────────┘

When the agent writes to OSIS:

  1. The record lands in hero_osis_business (or whatever domain).
  2. hero_embedder picks it up on the next index cycle and produces vectors for the new content.
  3. hero_indexer updates its rankings — both for "what tools to call" (tool-discovery layer) and "what content matches a query" (content-retrieval layer).
  4. Next time the user asks a related question, the agent sees its own write reflected back through the index. "You just added Alice. Here are your contacts now: …Alice."

This loop is the difference between an OS that has an AI bolted on and an OS that is operated by AI. Without write-side tools wired into the same indexer pipeline, the agent would have to remember its own writes in conversation context, which is fragile and resets every session. With the loop, the system itself is the memory.
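The four-step loop above can be exercised end-to-end in miniature. This is a toy model, not the real pipeline: embed() stands in for hero_embedder, index_cycle() for the embedder+indexer pass, and the record shape is invented:

```python
# Toy closed loop: a write lands in the store, the next index cycle
# re-embeds it, and a follow-up query sees the agent's own write.
import re
from collections import Counter

STORE, CONTENT_INDEX = [], []

def embed(text):
    return Counter(re.findall(r"\w+", text.lower()))

def add_contact(name, company=None):          # write-side tool
    STORE.append({"name": name, "company": company})

def index_cycle():                            # hero_embedder + hero_indexer pass
    CONTENT_INDEX[:] = [(r, embed(f"{r['name']} {r['company'] or ''}"))
                        for r in STORE]

def search(query):                            # content-retrieval layer
    qv = embed(query)
    return [r for r, v in CONTENT_INDEX if sum((qv & v).values()) > 0]

add_contact("Alice Smith", "Acme")
index_cycle()                                 # the write is now system memory
results = search("who works at Acme?")
```

Notice that nothing in the conversation context had to remember Alice; the index did.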

Write-side requirements per domain

Every domain that exposes *.list_* tools should also expose corresponding *.add_* / *.update_* / *.delete_* tools where the underlying RPC supports them:

  • hero_osis_business: contact, person, company, deal, opportunity, instrument, contract, transaction — all CRUD via existing RPCs
  • hero_osis_calendar: events, registrations
  • hero_osis_projects: projects, tasks, issues
  • hero_osis_files (via hero_foundry): create folder, upload, rename, delete — webdav already supports all of these via /api/files/{ctx}/{path} PUT/POST/DELETE
  • hero_osis_communication: messages, channels (where applicable)
  • hero_osis_settings: user preferences

Read-only OK for: hero_books (libraries are external repos cloned via libraries.txt), hero_indexer (it indexes others), hero_embedder (it serves embeddings).

Per-context write isolation

Every write tool carries X-Hero-Context automatically, exactly like reads. A write to biz.add_contact while in Geomind context creates the contact in Geomind only. There's no "global add" — that would break the multi-tenancy story.

Confirmation UX

Writes through the AI Assistant should always emit a brief confirmation that the user can read and roll back if needed:

"I added Alice Smith from Acme to your Geomind contacts. (Undo)"

The undo path uses the corresponding *.delete_* tool. Tracked in the Ambient AI issue (#16, https://forge.ourworld.tf/lhumina_code/hero_agent/issues/16) as part of the widget UX.
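A sketch of the confirmation + undo pairing: each write tool records which *.delete_* tool reverses it. The pairing table and the result shape (name, sid) are assumptions for illustration:

```python
# Map each write tool to its delete counterpart so the (Undo) link
# can be wired without the LLM reasoning about rollback.
UNDO_TOOL = {
    "biz.add_contact": "biz.delete_contact",   # hypothetical delete_* counterpart
    "biz.add_deal": "biz.delete_deal",         # hypothetical delete_* counterpart
}

def confirm_write(tool, result, context):
    return {
        "message": f"I added {result['name']} to your {context} contacts. (Undo)",
        "undo": {"tool": UNDO_TOOL[tool], "args": {"sid": result["sid"]}},
    }

c = confirm_write("biz.add_contact",
                  {"name": "Alice Smith", "sid": "c_123"}, "Geomind")
```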

Acceptance addition

Add to the existing acceptance:

  • Every read tool in Phase 1-4 has its write counterpart shipped together (where the underlying RPC supports it).
  • On herodemo, the agent can call biz.add_contact and the next biz.list_contacts query reflects the new contact (closed loop verified end-to-end).
  • hero_embedder re-embeds new records within one index cycle (default 60s).
  • hero_indexer's tool ranking + content ranking both pick up new writes without restart.

Signed-off-by: mik-tf


Update from source-grounded read (session 52)

After reading hero_router/crates/hero_router/src/server/mcp.rs (https://forge.ourworld.tf/lhumina_code/hero_router/src/branch/development/crates/hero_router/src/server/mcp.rs) and the hero_agent MCP client, it's clear that the original framing of this issue ("agent brute-forces socket paths; need to build curated MCP per-service") describes work that is largely already implemented, so the issue needs reshaping.

What already exists:

  • hero_router exposes every healthy service as an MCP server at POST /mcp/{service_name} (mcp.rs:30). Tools auto-derived from the service's OpenRPC via openrpc_to_mcp_tools (mcp.rs:130).
  • agent_run is auto-appended as a free Python-LLM tool on every service (mcp.rs:132-158) — natural-language → generated Python → executed against the service.
  • claude mcp add --transport http <slug> <endpoint> snippet rendered in the per-service Quick Setup card (mcp.rs:591); .mcp.json snippet at mcp.rs:607-631. Claude Code is a first-class consumer.
  • hero_agent uses an explicit ~/hero/var/agent/mcp.json registry with three transports (socket/url/command); url is preferred and routes through hero_router. X-Hero-Context + claims propagate to outbound MCP HTTP via tokio::task_local! FORWARDED_HEADERS (hero_agent/.../mcp_client.rs:14-19, :683-690). The "brute-force socket guessing" framing is not what the code does.

What's still real work, differently shaped:

  1. Tool-description quality. Auto-derived descriptions from OSchema are weak. OSchema annotations / per-method description fields would lift agent tool-selection precision.
  2. Tool-selection ranking when the surface is N services × M tools. hero_agent already has Specific/Semantic/Hybrid modes; needs evaluation against real workloads.
  3. Per-context curation. Which auto-MCP tools surface to which context — sovereignty + UX gate.
  4. Write-side gating. Read methods (*.list, *.get, *.find) are safe to auto-expose. Write methods (*.set, *.delete) need confirmation gates / per-context allowlists before going live to the agent.
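The gating rule in item 4 can be sketched as a name-prefix classifier. The prefix lists mirror the examples above (plus the add/update verbs from the Phase 1 tool list); treating unknown prefixes as gated is an assumption (fail closed):

```python
# Classify auto-derived methods: reads auto-expose, writes get a
# confirmation gate. Unknown verbs fail closed.
READ_PREFIXES = ("list", "get", "find")
WRITE_PREFIXES = ("set", "delete", "add", "update")

def gate(method: str) -> str:
    verb = method.split(".")[-1].split("_")[0]
    if verb in READ_PREFIXES:
        return "auto-expose"
    return "confirm"   # writes and unknowns both land here

decisions = {m: gate(m) for m in
             ["contact.list", "deal.get", "contact.delete", "theme.set"]}
```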

Suggested next step: rescope this issue body to those four items (or split into four issues). The "build MCP" framing is misleading — close out that narrative and keep (1)-(4) as the live work.

Reconciliation memo for session 52: memory/investigation_roadmap_reconciliation.md (private workspace).


Status update — 2026-05-01 live audit on herodemo

After working through the AI Assistant slowness on the demo (trace: ~10 tool iterations / ~30-60s for "who are my contacts in hero_biz"), I audited the actual MCP wiring on the live VM. The infrastructure is more mature than this issue's original framing — the gap is narrower than building "curated tool surface from scratch."

What's already built

hero_router/crates/hero_router/src/server/mcp.rs implements a complete MCP gateway:

  • POST /mcp/{service_name} per service — full MCP protocol (initialize, tools/list, tools/call)
  • Auto-derives the tool catalog from each service's OpenRPC spec (no separate mcp_tools.toml needed for the deriver path)
  • Per-service inspector pages render the catalog at /service/<svc>/mcp
  • Logging via router.mcp_logs RPC + UI

So the registration mechanism this issue proposed (mcp_tools.toml → hero_indexer embedding → semantic discovery) is one valid path, but the simpler path (let router auto-derive from OpenRPC) is already live.

Wiring gap — empirical, on herodemo VM 2026-05-01

Gap A — hero_agent only registers ONE MCP server. Live config:

// /home/driver/hero/var/agent/mcp.json
{
  "mcpServers": {
    "hero_books": { "url": "http://.../mcp/hero_books", "transport": "http" }
  }
}

hero_biz / hero_calendar / hero_photos / hero_videos / hero_collab / hero_slides — none registered. The agent has no awareness those gateway endpoints exist, so it falls back to search_hero_docs + shell_run to reverse-engineer everything. One file edit fixes this.
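The one-file fix would look roughly like the following, assuming each service follows the same /mcp/{service_name} gateway path as hero_books. Hosts are elided ("...") exactly as in the live config above; treat the full list as a sketch, not the final registry:

```json
{
  "mcpServers": {
    "hero_books":    { "url": "http://.../mcp/hero_books",    "transport": "http" },
    "hero_biz":      { "url": "http://.../mcp/hero_biz",      "transport": "http" },
    "hero_calendar": { "url": "http://.../mcp/hero_calendar", "transport": "http" },
    "hero_photos":   { "url": "http://.../mcp/hero_photos",   "transport": "http" },
    "hero_videos":   { "url": "http://.../mcp/hero_videos",   "transport": "http" },
    "hero_collab":   { "url": "http://.../mcp/hero_collab",   "transport": "http" },
    "hero_slides":   { "url": "http://.../mcp/hero_slides",   "transport": "http" }
  }
}
```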

Gap B — OpenRPC specs only expose agent_run, not domain ops. Live tools/list counts via the router gateway:

Service                                          MCP tool count
hero_books                                       1 (agent_run)
hero_biz                                         1 (agent_run)
hero_collab                                      1 (agent_run)
hero_slides                                      1 (agent_run)
hero_calendar / photos / videos / archipelagos   0

So even if the agent had hero_biz registered as an MCP server, the only tool it would see is the generic agent_run shell-exec sandbox — not biz.list_contacts, biz.add_contact etc. Those domain operations are not in the OpenRPC schemas the gateway derives from.

Revised scope — narrower than this issue's Phase 1-4 framing

The original Phase 1 spec listed 11 biz tools. To make the agent fast, we don't need a parallel mcp_tools.toml registry or semantic discovery via embedder/indexer as a precondition. We need:

(a) OpenRPC schema additions per service. For hero_biz: add biz.list_contacts, biz.list_persons, biz.list_companies, biz.list_deals, biz.add_contact etc. to the OpenRPC spec. The router auto-exposes them as MCP tools the next time tools/list is called.
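For illustration, the kind of OpenRPC method entry the deriver would pick up might look like this. It is a minimal sketch; the real schema fields, parameter names, and result shapes are whatever hero_biz's spec actually uses:

```json
{
  "methods": [
    {
      "name": "biz.list_contacts",
      "summary": "List contacts in the active context.",
      "params": [
        { "name": "limit",  "required": false, "schema": { "type": "integer" } },
        { "name": "search", "required": false, "schema": { "type": "string" } }
      ],
      "result": { "name": "contacts", "schema": { "type": "array" } }
    }
  ]
}
```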

(b) Extend hero_agent's mcp.json to register all per-domain MCP endpoints. ~24 lines of JSON.

That's it for the "agent answers in ~3-5s instead of ~30-60s" outcome. The semantic-discovery layer (embedder + indexer with mcp.discover(prompt)) is still architecturally desirable for the "agent picks the right tool from 100+ tools" scaling story, but is not a blocker for the demo-level speedup. With ~30-50 tools across the active services, the LLM's native tool-selection (just from the tools/list JSON) handles selection well within Claude Sonnet's tool-use budget.

Effort estimate (revised)

  • Gap A fix — extend mcp.json for the agent: ~30 min including a sanity test against each endpoint.
  • Gap B fix per domain — add list/add/update operations to OpenRPC spec, regenerate codegen, verify tools/list returns them: ~2-3 hr/domain. For Phase 1 (biz alone) that's a half-day. For all of biz + calendar + projects + tasks + photos + videos + books-search: 2-3 days of focused work.
  • Per-context routing via X-Hero-Context header: confirm the router's MCP proxy forwards the header end-to-end (it should already — it just proxies the JSON-RPC body); if not, ~1 hr to add header forwarding.

So the demo-shipping version of this issue (agent picks domain tools cleanly) is ~3 days end-to-end if focused, not the multi-week build implied by the original framing. The semantic-discovery + mcp_tools.toml registry can ship in a follow-up issue once the simpler path is proven.

Adversarial caveats

  • Tool-list bloat: if all 24 services each expose 5-10 RPCs, the agent's tools/list payload grows. Modern Claude handles this fine up to ~100 tools, but past that the LLM's selection accuracy drops, at which point semantic pre-filtering via the embedder (the original plan) becomes load-bearing. So the original Phase 1-4 architecture is still the right end-state — just not the precondition for the demo win.
  • Context propagation: X-Hero-Context must reach the underlying service's RPC handler. Verify the router's mcp_handler doesn't strip incoming headers (most proxies preserve them; quick check needed).
  • Schema drift: if RPCs are added to OpenRPC but the agent_run filter logic in router/mcp.rs explicitly excludes them, the auto-derive won't pick them up. Easy to verify by re-running tools/list after adding one method.

Recommend keeping the existing Phase 1-4 plan as the long-term roadmap, but adding a "Phase 0 — minimal viable wiring" section above it covering the two gaps. That sequencing ships the demo speedup in ~3 days, then lets Phase 1+ inherit the working baseline.

Signed-off-by: mik-tf

## Status update — 2026-05-01 live audit on herodemo

After working through the AI Assistant slowness on the demo (trace: ~10 tool iterations / ~30-60s for "who are my contacts in hero_biz"), I audited the actual MCP wiring on the live VM. The infrastructure is more mature than this issue's original framing — the gap is narrower than building a "curated tool surface from scratch."

### What's already built

`hero_router/crates/hero_router/src/server/mcp.rs` implements a complete MCP gateway:

- `POST /mcp/{service_name}` per service — full MCP protocol (`initialize`, `tools/list`, `tools/call`)
- Auto-derives the tool catalog from each service's **OpenRPC spec** (no separate `mcp_tools.toml` needed for the deriver path)
- Per-service inspector pages render the catalog at `/service/<svc>/mcp`
- Logging via `router.mcp_logs` RPC + UI

So the registration mechanism this issue proposed (`mcp_tools.toml` → hero_indexer embedding → semantic discovery) is **one valid path**, but the simpler path (let the router auto-derive from OpenRPC) is already live.

### Wiring gap — empirical, on herodemo VM 2026-05-01

**Gap A — `hero_agent` only registers ONE MCP server.** Live config:

```json
// /home/driver/hero/var/agent/mcp.json
{
  "mcpServers": {
    "hero_books": { "url": "http://.../mcp/hero_books", "transport": "http" }
  }
}
```

hero_biz / hero_calendar / hero_photos / hero_videos / hero_collab / hero_slides — none registered. The agent has no awareness that those gateway endpoints exist, so it falls back to `search_hero_docs` + `shell_run` to reverse-engineer everything. **One file edit fixes this.**

**Gap B — OpenRPC specs only expose `agent_run`, not domain ops.** Live `tools/list` counts via the router gateway:

| Service | MCP tool count |
|---|---|
| hero_books | 1 (`agent_run`) |
| hero_biz | 1 (`agent_run`) |
| hero_collab | 1 (`agent_run`) |
| hero_slides | 1 (`agent_run`) |
| hero_calendar / photos / videos / archipelagos | 0 |

So even if the agent had hero_biz registered as an MCP server, the only tool it would see is the generic `agent_run` shell-exec sandbox — **not** `biz.list_contacts`, `biz.add_contact`, etc. Those domain operations are not in the OpenRPC schemas the gateway derives from.

### Revised scope — narrower than this issue's Phase 1-4 framing

The original Phase 1 spec listed 11 biz tools. To make the agent fast, we don't need a parallel `mcp_tools.toml` registry or semantic discovery via embedder/indexer **as a precondition**. We need:

**(a) OpenRPC schema additions per service.** For hero_biz: add `biz.list_contacts`, `biz.list_persons`, `biz.list_companies`, `biz.list_deals`, `biz.add_contact`, etc. to the OpenRPC spec. The router auto-exposes them as MCP tools the next time `tools/list` is called.

**(b) Extend `hero_agent`'s `mcp.json` to register all per-domain MCP endpoints.** ~24 lines of JSON.

That's it for the "agent answers in ~3-5s instead of ~30-60s" outcome. The semantic-discovery layer (embedder + indexer with `mcp.discover(prompt)`) is still architecturally desirable for the "agent picks the right tool from 100+ tools" scaling story, but is **not a blocker** for the demo-level speedup. With ~30-50 tools across the active services, the LLM's native tool selection (just from the `tools/list` JSON) handles selection well within Claude Sonnet's tool-use budget.

### Effort estimate (revised)

- **Gap A fix** — extend `mcp.json` for the agent: ~30 min including a sanity test against each endpoint.
- **Gap B fix per domain** — add list/add/update operations to the OpenRPC spec, regenerate codegen, verify `tools/list` returns them: ~2-3 hr/domain. For Phase 1 (biz alone) that's a half-day. For all of biz + calendar + projects + tasks + photos + videos + books-search: 2-3 days of focused work.
- **Per-context routing via `X-Hero-Context` header** — confirm the router's MCP proxy forwards the header end-to-end (it should already — it just proxies the JSON-RPC body); if not, ~1 hr to add header forwarding.

So the demo-shipping version of this issue (agent picks domain tools cleanly) is **~3 days end-to-end** if focused, not the multi-week build implied by the original framing. The semantic-discovery + `mcp_tools.toml` registry can ship in a follow-up issue once the simpler path is proven.

### Adversarial caveats

- **Tool-list bloat**: if all 24 services each expose 5-10 RPCs, the agent's `tools/list` payload grows. Modern Claude handles this fine up to ~100 tools, but past that the LLM's selection accuracy drops — at which point semantic pre-filtering via embedder (the original plan) becomes load-bearing. So the original Phase 1-4 architecture is still the right end-state, just not the precondition for the demo win.
- **Context propagation**: `X-Hero-Context` must reach the underlying service's RPC handler. Verify the router's mcp_handler doesn't strip incoming headers (most proxies preserve them; quick check needed).
- **Schema drift**: if RPCs are added to OpenRPC but the `agent_run` filter logic in router/mcp.rs explicitly excludes them, the auto-derive won't pick them up. Easy to verify by re-running `tools/list` after adding one method.

Recommend keeping the existing Phase 1-4 plan as the long-term roadmap, but adding a "Phase 0 — minimal viable wiring" section above it covering the two gaps. That sequencing ships the demo speedup in ~3 days, then lets Phase 1+ inherit the working baseline.

Signed-off-by: mik-tf
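Concretely, the Gap A fix is extending the one registered entry to cover the other gateway endpoints. A sketch, assuming the same URL scheme as the existing `hero_books` entry — the `<router-host>` placeholder and the exact service list are illustrative, not taken from the live VM:

```json
// /home/driver/hero/var/agent/mcp.json — extended (sketch)
{
  "mcpServers": {
    "hero_books":    { "url": "http://<router-host>/mcp/hero_books",    "transport": "http" },
    "hero_biz":      { "url": "http://<router-host>/mcp/hero_biz",      "transport": "http" },
    "hero_calendar": { "url": "http://<router-host>/mcp/hero_calendar", "transport": "http" },
    "hero_collab":   { "url": "http://<router-host>/mcp/hero_collab",   "transport": "http" }
  }
}
```

The remaining services follow the same pattern, which is where the ~24-lines-of-JSON estimate comes from.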
## Architectural principle to make load-bearing

> **Every user-facing capability in Hero OS must be exposed via the RPC layer first. From there it flows automatically into MCP via the router gateway.**

This is the contract — not a nice-to-have. The router's MCP gateway already auto-derives `tools/list` from each service's OpenRPC. So "RPC-complete" implies "MCP-complete" implies "agent-callable" with zero additional plumbing. Anything that bypasses RPC (e.g., direct webhook handlers, server-side templates that call internal Rust APIs but never expose them as RPC, magic shell scripts the UI invokes) is a contract violation that breaks the agent's ability to perform that action.

This sharpens this issue's framing: the work is **not** "build a curated agent tool surface." It's "audit every service's user-facing capabilities and make sure each one has a corresponding RPC method." Once that's done, the agent's tool catalog is automatic.

## Inventory of the gap (live observation, herodemo 2026-05-01)

Each row is "what the user can do in the OS shell" vs "what's currently exposed as an RPC method".

| Service | UI capability | RPC method exists? | MCP-visible (auto-derived)? |
|---|---|---|---|
| hero_biz | List contacts in Biz iframe | UI loads them → so handler exists in some form | ❌ no `biz.list_contacts` in OpenRPC |
| hero_biz | Add a contact | UI POST works | ❌ |
| hero_biz | List companies / deals / pipeline | UI works | ❌ |
| hero_calendar | List events for the active context | UI works | ❌ |
| hero_calendar | Create event | UI works | ❌ |
| hero_photos | List recent photos | Archipelago renders | ❌ |
| hero_videos | List recent videos | Archipelago renders | ❌ |
| hero_books | Search the library | ✅ `books.search` is there | ✅ |
| hero_books | Summarize a book / library | UI works (we just verified AI summary on demo) | ❌ |
| hero_collab | List channels in the active context | UI loads them | ❌ |
| hero_collab | Send a message to a channel | UI works | ❌ |
| hero_slides | List decks | UI works | ❌ |
| hero_office | Open a doc for editing | UI works (post-fix 2026-05-01) | ❌ |
| hero_voice | Run STT on an upload | Wake-word loop uses it | ❌ |
| hero_os | Switch context | URL-driven | ❌ no `os.switch_context` |
| hero_os | Open an island | URL-driven | ❌ no `os.open_island` |

So the **majority** of user-facing capabilities are currently behind UI handlers that **don't have a public RPC method**. The handlers exist (the UI calls them); they're just not in the OpenRPC schema, so the router's auto-deriver doesn't surface them, so the agent can't see them.

**This is the single load-bearing fix.** Once each handler has an RPC method declared, MCP coverage falls out for free.
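"Declaring" a handler means giving it an entry in the service's OpenRPC `methods` array. A hedged sketch for a hypothetical `biz.list_contacts` — the parameter names and result shape are illustrative, not the actual hero_biz schema:

```json
{
  "name": "biz.list_contacts",
  "summary": "List contacts visible in the caller's active context",
  "params": [
    { "name": "limit", "required": false, "schema": { "type": "integer" } }
  ],
  "result": {
    "name": "contacts",
    "schema": { "type": "array", "items": { "type": "object" } }
  }
}
```

Once an entry like this is in the spec, the router's deriver surfaces it as an MCP tool on the next `tools/list` with no further plumbing.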

## Phase 0 — Wiring sweep (recommended sequencing)

Before tackling any "Phase 1-4" tool surface design (the original framing), do a flat audit pass that answers, per service, three questions:

  1. **What handlers exist that the UI actually uses?** (read the iframe / WASM call sites, or search for the routes the UI hits)
  2. **Of those, which are exposed as named RPC methods today?** (read OpenRPC + grep)
  3. **Of the un-exposed ones, which need to be RPC-callable for an AI agent to do anything useful?**

The output is a per-service punch list. That punch list IS the work.

## Phasing — per-service, NOT per-tool-category

The original Phase 1-4 framing organized by tool category (biz tools / calendar tools / content tools / OS tools). Reorganize by service, because each service's punch list is owned by one team / one repo, and the natural unit of work is "make hero_biz RPC-complete," not "ship 11 biz tools across multiple repos."

Suggested ordering (effort × demo value):

- **hero_biz** — half day. Highest demo value (CRM is the canonical "AI assistant does business stuff" demo). RPC additions: `list_contacts`, `list_persons`, `list_companies`, `list_deals`, `list_opportunities`, `add_contact`, `add_person`, `add_company`, `add_deal`, `update_deal`, `summarize_pipeline`.
- **hero_books** — 2 hr. Low effort because most of it is there. RPC additions: `summarize_book`, `summarize_library`. `books.search` already exists. **Plus**: confirm `agent_run` is the *only* thing in `tools/list` and figure out why the OpenRPC-derived methods aren't surfacing (this is a router or codegen bug, not a books bug).
- **hero_calendar** — half day. RPC additions: `list_events`, `create_event`, `update_event`, `delete_event`.
- **hero_photos / hero_videos** — half day combined. RPC additions: `list_recent`, `list_album`, `get_metadata`.
- **hero_collab** — half day. RPC additions: `list_channels`, `list_messages`, `post_message`. Also fixes the FD leak observed 2026-05-01 if we audit the WS handlers in the same pass.
- **hero_voice** — half day. RPC additions: `transcribe(audio_blob)`, `tts(text, voice?)`. Likely already mostly there, since the wake-word loop uses internal versions.
- **hero_office / hero_slides / hero_whiteboard** — half day each. RPC additions to surface the file-list and create operations (the editor open is a different concern — that's UI navigation, not really an agent action).
- **hero_os shell** — half day. The trickiest one, because hero_os is a Dioxus shell, not a backend service. RPC additions go through hero_router or a new `os.*` endpoint set: `os.switch_context`, `os.open_island`, `os.set_theme`. This makes "open my photos" voice-controllable.

**Total: ~5 working days** if all owners are aligned and OpenRPC codegen is unblocked. Can be parallelized across people if multiple services are owned independently.

## Phase 1 — semantic discovery (still relevant, defer)

The original issue's `mcp.discover(prompt)` via embedder + indexer becomes load-bearing once `tools/list` exceeds ~50-100 tools (Claude's tool-selection accuracy degrades past that point). At ~5-10 tools/service × 8 services = 40-80 tools, we're at the ceiling. So the semantic-discovery layer is the **next** issue to ship, but only after Phase 0 lands.

Defer the `mcp_tools.toml` registry too — OpenRPC carries enough description metadata for now. Re-evaluate when we hit the tool-count ceiling.
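When the ceiling is hit, the pre-filter the original issue describes reduces to: embed the prompt, score every tool description by similarity, and hand only the top-N to the LLM. A toy sketch under the assumption that embeddings are plain `f32` vectors (in the real system they would come from hero_embedder; `top_tools` is a hypothetical helper, not an existing API):

```rust
// Toy sketch of the deferred semantic pre-filter: score tool-description
// embeddings against a prompt embedding and keep only the top-N for the
// tools/list handed to the LLM. Embeddings are hard-coded placeholder
// vectors here; in the real system they come from hero_embedder.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Names of the `top_n` tools most similar to the prompt embedding.
fn top_tools<'a>(prompt: &[f32], tools: &'a [(&'a str, Vec<f32>)], top_n: usize) -> Vec<&'a str> {
    let mut scored: Vec<(&str, f32)> = tools
        .iter()
        .map(|(name, emb)| (*name, cosine(prompt, emb)))
        .collect();
    scored.sort_by(|x, y| y.1.partial_cmp(&x.1).unwrap()); // descending similarity
    scored.into_iter().take(top_n).map(|(name, _)| name).collect()
}

fn main() {
    let tools = [
        ("biz.list_contacts", vec![0.9, 0.1]),
        ("photos.list_recent", vec![0.1, 0.9]),
        ("calendar.list_events", vec![0.0, 1.0]),
    ];
    // "who are my contacts" embeds near the biz axis in this toy space.
    assert_eq!(top_tools(&[1.0, 0.0], &tools, 1), vec!["biz.list_contacts"]);
}
```

The agent's prompt loop stays unchanged; only the tool catalog it is shown shrinks, which is why this layer can be bolted on after Phase 0 without touching the gateway.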

## Acceptance for Phase 0

- [ ] Every service in the per-service list above has all listed RPC methods in its OpenRPC schema.
- [ ] `curl /mcp/<svc>` `tools/list` returns the full set for each service (not just `agent_run`).
- [ ] hero_agent's `mcp.json` registers all per-domain MCP endpoints.
- [ ] Live test: "who are my contacts in hero_biz" answers in < 5s with one tool call (no shell_run / file_read fallback).
- [ ] Live test: "list my photos from this week" answers via `photos.list_recent` directly.
- [ ] Live test: switching context in the OS shell changes which contacts the agent returns (`X-Hero-Context` propagates end-to-end).
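The last acceptance item is an invariant every hop must keep: the header value that leaves the agent must be the one that reaches the service's RPC handler. A toy sketch of that forwarding rule, using std maps rather than hero_router's actual header types (which are not reproduced here):

```rust
use std::collections::HashMap;

// Sketch of the context-propagation invariant: a proxy hop copies the
// X-Hero-Context header from the inbound request to the outbound RPC call
// unchanged, so per-context routing survives the MCP gateway.
fn forward_headers(inbound: &HashMap<String, String>, forwarded: &[&str]) -> HashMap<String, String> {
    let mut out = HashMap::new();
    for name in forwarded {
        if let Some(v) = inbound.get(*name) {
            out.insert(name.to_string(), v.clone()); // preserve the value byte-for-byte
        }
    }
    out
}

fn main() {
    let mut inbound = HashMap::new();
    inbound.insert("X-Hero-Context".to_string(), "family".to_string());
    let out = forward_headers(&inbound, &["X-Hero-Context"]);
    assert_eq!(out.get("X-Hero-Context"), Some(&"family".to_string()));
}
```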

## Cross-references

- [hero_router source](https://forge.ourworld.tf/lhumina_code/hero_router) — the MCP gateway that auto-derives from OpenRPC. Already built.
- [hero_osis#43](https://forge.ourworld.tf/lhumina_code/hero_osis/issues/43) — silent context fallback. Phase 0 acceptance test #6 depends on this being fixed for the X-Hero-Context propagation to be observable.
- [home#203](https://forge.ourworld.tf/lhumina_code/home/issues/203) — codegen-drift policy. Relevant when adding RPC methods at the generator layer.
- The original Phase 1-4 framing in the issue body remains valid as the long-term tool-surface vision; Phase 0 above is the **prerequisite wiring pass**.

Signed-off-by: mik-tf

## Addendum — confirming the deriver is unconditional

Reviewed `hero_router/crates/hero_router/src/server/mcp.rs:409` (`fn openrpc_to_mcp_tools`):

```rust
pub fn openrpc_to_mcp_tools(spec: &Value) -> Vec<Value> {
    let methods = match spec.get("methods").and_then(|m| m.as_array()) {
        Some(arr) => arr,
        None => return vec![],
    };
    methods.iter().filter_map(|method| { ... }).collect()
}
```

**No filter, no allowlist, no `mcp: true` annotation requirement.** It iterates every entry in the OpenRPC spec's `methods` array and emits one MCP tool per entry. The mapping is purely structural: name → tool name, summary/description → tool description, params → JSON Schema for `inputSchema`.

So the live observation that `tools/list` returns only `agent_run` per service has exactly one explanation: **the OpenRPC specs themselves only declare `agent_run`.** The domain operations (`biz.list_contacts`, `calendar.list_events`, `photos.list_recent`, etc.) are missing from the schemas — likely because those handlers are served as ad-hoc HTTP UI routes from each `*_ui` server, not as RPC methods on the OServer (`*_server`'s `rpc.sock`).

This sharpens the Phase 0 punch list: per service, the work is **(a)** move handlers from "plain HTTP UI route" to "RPC method on rpc.sock with an OpenRPC schema entry", or **(b)** if the handler is already on rpc.sock, just declare it in the OpenRPC schema. Once that's done, the router's auto-deriver picks it up the next time `tools/list` is called — zero changes to hero_router needed.

The contract is correct as built. Phase 0 is "fill the schemas to use the contract."
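The structural mapping the elided `filter_map` body performs can be illustrated with a plain-struct reimplementation. This is a sketch, not the router's actual `serde_json`-based code, and the real body of `mcp.rs` is not reproduced here:

```rust
// Illustrative reimplementation of the structural OpenRPC → MCP derive:
// one tool per declared method, no allowlist, no annotation gate.
#[derive(Debug, PartialEq)]
struct McpTool {
    name: String,
    description: String,
}

struct OpenRpcMethod {
    name: String,
    summary: Option<String>,
}

/// Every entry in the spec's `methods` array becomes a tool verbatim.
fn derive_tools(methods: &[OpenRpcMethod]) -> Vec<McpTool> {
    methods
        .iter()
        .map(|m| McpTool {
            name: m.name.clone(),
            description: m.summary.clone().unwrap_or_default(),
        })
        .collect()
}

fn main() {
    let methods = vec![
        OpenRpcMethod { name: "agent_run".into(), summary: Some("Run a sandboxed task".into()) },
        OpenRpcMethod { name: "biz.list_contacts".into(), summary: Some("List contacts".into()) },
    ];
    // Adding a method to the spec is sufficient for it to surface as a tool.
    assert_eq!(derive_tools(&methods).len(), 2);
}
```

This is the mechanical reason declaring the method in the schema is the whole fix: there is no second gate between the spec and the tool catalog.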

Signed-off-by: mik-tf

Owner
It's not like this. See https://forge.ourworld.tf/lhumina_code/hero_router/src/branch/development/docs/agentic_calling.md for how we want it.
Reference: lhumina_code/hero_agent#15