lhumina_code/hero_books

Fork 0

feat: delegate ingest/search/ontology to hero_memory; migrate herolib_ai to 0.6.0 #123

Merged

timur merged 13 commits from development_hero_memory into development

2026-05-08 14:46:26 +00:00

timur commented

2026-05-08 14:46:01 +00:00

Owner

Summary

The hero_books refactor branch landed across this session: ~13.5k LOC removed, ingest+search+ontology delegated to hero_memory, the dead embedder probe replaced, the in-process Q&A pipeline deleted, and the herolib_ai callsites migrated to its new 0.6.0 broker-first async API (which itself is what hero_aibroker#63 proposes for the long term).

End-to-end verified through the JSON-RPC API and the Dioxus UI:

server.health reports memoryConnected: true
memory.{ingest,search,qaSearch,extractOntology,stats} work
search.query (legacy endpoint) routes through memory.search
The "Ask the Librarian" feature renders summaries instead of the "Llama 3.3 70B not available" error

cargo check --workspace clean. Auto-prepare runs at startup over all on-disk libraries (scan + convert only — no LLM at boot).

What's deleted

hero_books_lib::ontology/ (~1.9k LOC) — replaced by memory.extractOntology
hero_books_lib::vectorsdk/ + embedder/ (~2.7k LOC) — replaced by memory.search
hero_books_lib::ai::{processor,topics,metadata} + book::{processor,ai_book} (~2.4k LOC) — replaced by memory.ingest
hero_books_lib::convert/ (~180 LOC) — replaced by memory.ingest's convert.run
Startup auto-Q&A + auto-index in axum_server.rs (~430 LOC)
Legacy import pipeline + a string of dead helpers in web/server.rs

What's added

hero_books_lib::memory_client::MemoryClient — raw JSON-RPC wrapper for hero_memory
web/rpc.rs::handle_memory_* — five new RPCs (memory.ingest, memory.search, memory.qaSearch, memory.extractOntology, memory.stats)
Light-weight startup prepare() that registers all libraries with hero_memory + converts markdown (no LLM at boot)

What's still local

PDF generation, book TOC, Docusaurus site generation, the Dioxus UI — all unchanged.

Test plan

cargo check --workspace
cargo test -p hero_books_lib
/tmp/hero_books_smoke.sh walks the seven RPCs through the live stack
hero_browser MCP drove the UI (/hero_books/ui/library/geomind) and verified search returns 20 hits with no dead warning
Librarian banner now renders an AI summary instead of the model-not-available error

🤖 Generated with Claude Code

## Summary The hero_books refactor branch landed across this session: ~13.5k LOC removed, ingest+search+ontology delegated to hero_memory, the dead embedder probe replaced, the in-process Q&A pipeline deleted, and the `herolib_ai` callsites migrated to its new 0.6.0 broker-first async API (which itself is what hero_aibroker#63 proposes for the long term). End-to-end verified through the JSON-RPC API and the Dioxus UI: * `server.health` reports `memoryConnected: true` * `memory.{ingest,search,qaSearch,extractOntology,stats}` work * `search.query` (legacy endpoint) routes through `memory.search` * The "Ask the Librarian" feature renders summaries instead of the "Llama 3.3 70B not available" error cargo check --workspace clean. Auto-prepare runs at startup over all on-disk libraries (scan + convert only — no LLM at boot). ## What's deleted * `hero_books_lib::ontology/` (~1.9k LOC) — replaced by `memory.extractOntology` * `hero_books_lib::vectorsdk/` + `embedder/` (~2.7k LOC) — replaced by `memory.search` * `hero_books_lib::ai::{processor,topics,metadata}` + `book::{processor,ai_book}` (~2.4k LOC) — replaced by `memory.ingest` * `hero_books_lib::convert/` (~180 LOC) — replaced by `memory.ingest`'s `convert.run` * Startup auto-Q&A + auto-index in `axum_server.rs` (~430 LOC) * Legacy import pipeline + a string of dead helpers in `web/server.rs` ## What's added * `hero_books_lib::memory_client::MemoryClient` — raw JSON-RPC wrapper for hero_memory * `web/rpc.rs::handle_memory_*` — five new RPCs (`memory.ingest`, `memory.search`, `memory.qaSearch`, `memory.extractOntology`, `memory.stats`) * Light-weight startup `prepare()` that registers all libraries with hero_memory + converts markdown (no LLM at boot) ## What's still local * PDF generation, book TOC, Docusaurus site generation, the Dioxus UI — all unchanged. ## Test plan - [x] `cargo check --workspace` - [x] `cargo test -p hero_books_lib` - [x] `/tmp/hero_books_smoke.sh` walks the seven RPCs through the live stack - [x] hero_browser MCP drove the UI (`/hero_books/ui/library/geomind`) and verified search returns 20 hits with no dead warning - [x] Librarian banner now renders an AI summary instead of the model-not-available error 🤖 Generated with [Claude Code](https://claude.com/claude-code)

timur added 13 commits

2026-05-08 14:46:01 +00:00

chore(deps): wire hero_memory_sdk for upcoming RAG delegation f57dad862b

Workspace + hero_books_lib + hero_books_server now declare a dependency
on hero_memory_sdk (development branch). This is the first commit on
the development_hero_memory refactor branch — no functional change yet,
just the dependency in place so the next commits can swap the local
qa/ontology/search pipeline for hero_memory RPC calls.

Plan (incremental, each its own commit):
1. memory_client wrapper in hero_books_lib (this branch).
2. Delegate Q&A extraction to qa.extract.
3. Delegate ontology extraction to ontology.extract.
4. Delegate vector search to memory.search.
5. Drop unpdf / anytomd / pptx-to-md / html-to-markdown-rs / herolib_ai
   once the legacy pipeline is removed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

feat(memory_client): raw JSON-RPC client for hero_memory b4230fa04d

Adds `hero_books_lib::memory_client::MemoryClient` — a thin async
client over hyperlocal's UDS HTTP transport that targets hero_memory's
JSON-RPC endpoint at /rpc.

Why raw rather than hero_memory_sdk? The typed SDK transitively pulls
the latest herolib_* tip, which has incompatible API changes for the
old herolib_ai surface that hero_books_lib still uses. Migrating off
herolib_ai is the next step on this branch — once that's done we can
reintroduce the typed SDK. Until then, ~280 LOC of hand-rolled JSON
keeps the rest of the workspace stable.

Surface (parsing kept loose; returns serde_json::Value where the
result shape isn't part of the contract):
  - new() / with_socket()
  - call(method, params) — generic typed call
  - ensure_collection(name, root, kind, quality) — idempotent
    namespace.create + collections.create
  - ingest(collection, force) — collections.scan → convert.run →
    qa.extract, returns IngestSummary
  - extract_ontology(collection, ontology, force) — ontology.attach +
    ontology.extract
  - ontology_stats(collection)
  - search(query, collections, top_k) — memory.search
  - qa_search(query, collection, top_k)

Socket resolution: HERO_MEMORY_SOCKET → $HERO_SOCKET_DIR/hero_memory/
rpc.sock → ~/hero/var/sockets/hero_memory/rpc.sock.

No call sites are rewired yet — this is the foundation. Subsequent
commits on this branch:
- Replace hero_books_lib::ai (Q&A) call sites with memory_client.ingest
- Replace hero_books_lib::ontology call sites with memory_client.extract_ontology
- Replace vector search via memory_client.search / qa_search
- Delete the unused modules + drop unpdf, anytomd, pptx-to-md,
  html-to-markdown-rs, herolib_ai

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

feat(server): expose memory.* RPCs that delegate to hero_memory fd10bdd1bc

Adds five new methods on the hero_books JSON-RPC dispatcher:

  memory.ingest(name, root, kind?, quality?, ontology?, force?)
    → ensure_collection + ingest [+ extract_ontology if ontology]
    → returns { scan_added, converted, qa_pairs, ontology? }

  memory.search(query, collections, top_k?)
    → memory.search across one or more collections, returns the raw
      hits payload (id, score, namespace, question, answer, dimension,
      provenance{_src_*}).

  memory.qaSearch(query, collection, top_k?)
    → qa.search in a single collection.

  memory.extractOntology(collection, ontology, force?)
    → ontology.attach + ontology.extract; returns concept and edge
      counts.

  memory.stats(collection)
    → ontology.stats; per-ontology, per-concept node counts.

ServerConfig gains an optional `memory: MemoryClient` field that's
populated unconditionally at startup (the underlying hyper client
dials lazily, so this is a no-op until the first call). The
dispatcher gates each new method on `MemoryClient::socket_available()`
and returns a structured -32001 error when hero_memory isn't
configured, instead of letting connection failures surface raw.

Implementation note: the dispatcher is sync because it runs inside
`tokio::task::spawn_blocking` (web/axum_server.rs). Each handler
hops back into async via `tokio::runtime::Handle::current().block_on`,
which is safe from a blocking-pool thread.

This is the parallel-path step. Existing local pipelines
(process_collection, vectorsdk search, etc.) are untouched.
Subsequent commits delete those once the UI / external callers
migrate to memory.*.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

refactor(ontology): delete the local ontology module — delegate to hero_memory 7c81032bee

Removes ~3,000 lines of duplicate ontology infrastructure that's now
the responsibility of `hero_memory` (called via `memory.extractOntology`
and `memory.stats`):

- crates/hero_books_lib/src/ontology/        (~1,968 LOC)
  classifier, embedded ontology data, extractor, processor,
  prompts, types, SPEC.md.
- crates/hero_books_lib/src/doctree/ai.rs    (~80 LOC)
- DocTree::process_ai{,_reset,_internal} +
  process_collection_ai{,_reset}            (~80 LOC)
- lib.rs ontology re-exports.

All the deleted code was unreachable from any external caller — the
`process_ai*` methods on `DocTree` weren't called anywhere outside the
library, and `doctree::ai` was the sole consumer of the ontology
module. Confirmed with workspace-wide grep before deletion.

The new path is `hero_books_server::web::rpc::handle_memory_*`, added
in the previous commit, which delegates to hero_memory's
`ontology.attach` + `ontology.extract` + `ontology.stats` over UDS.

Net: 13 files changed, 11 insertions, 2,996 deletions. cargo check
--workspace still green.

Next deletions on this branch (in roughly increasing risk order):
- vectorsdk/indexer.rs callers of herolib_ai (Q&A indexing path).
- ai/{processor,book}.rs (legacy Q&A extraction).
- book/{processor,ai_book}.rs (book-level AI processing).
- Then drop herolib_ai, unpdf, anytomd, pptx-to-md,
  html-to-markdown-rs, hero_embedder_sdk from Cargo.toml.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

refactor(server): remove startup auto-Q&A and auto-index 7dbd8cf6f5

The startup task tree at the top of `axum_server::background_init`
ran four heavy steps unconditionally when `auto_index` was set:

  process_collections_for_qa  → ai::process_collection per book
  re_export_books_with_metadata
  index_books_for_search      → VectorStore::add per book
  persist_books_state         → cache content hashes

All of that is now hero_memory's job — `memory.ingest` (added in
fd10bdd1) does the equivalent on-demand and atomically per
collection. Trying to keep both side-by-side leaves two indexes that
can drift, so the parallel path goes away entirely on this branch.

Removed (~430 LOC):
- process_collections_for_qa  (legacy Q&A driver)
- compute_book_content_hash   (only called by persist_books_state)
- persist_books_state         (cached the auto-index state)
- load_books_state            (read it back at startup)
- index_books_for_search      (vector indexing driver)
- the startup auto_index block + 4 callers in axum_server.rs

Kept:
- `re_export_books_with_metadata` — also called from the import job
  runner, still valid for the export-to-disk flow.
- `auto_index` on ServerConfig — still read by the import flow.

cargo check --workspace is green. Net 7 insertions, 443 deletions.

Next on this branch:
- Delete `hero_books_lib::ai::{processor,book,topics,metadata}`
  Q&A modules — only the legacy import_*_pipeline functions still
  reference them. Either delete the import pipelines or rewire them
  to memory.ingest.
- Delete `vectorsdk/` once perform_search_with_namespace is rewired
  to memory.search.
- Drop heavy deps (unpdf, anytomd, pptx-to-md, html-to-markdown-rs,
  herolib_ai, hero_embedder_sdk).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

refactor(search): perform_search_with_namespace delegates to memory.search cfac4805e9

Replaces the in-process VectorStore search loop (lazy namespace
indexing, per-namespace store pool, server-side filter dict, score
normalization across the result window, multi-topic local filter) with
a single memory.search call to hero_memory.

The function keeps its signature so existing callers — `search.query`
RPC, the books.search wrapper, axum_server's handler closure — don't
need to change. New behaviour:

- Single collection per call (memory.search supports many but the
  legacy contract was always-one).
- Default collection: namespace_filter → book_filter → first known
  book's namespace.
- Topic filter: applied locally on hits[].dimension.
- Returns empty when hero_memory isn't reachable; the legacy fallback
  is gone on this branch.

Result-shape mapping (memory.search → SearchResultDisplay):
  id "{coll}::{path}::{anchor}::{dim}::{idx}" → book / page
  page (.md stripped) → display_title
  dimension → topic + topic_color
  question / answer pass through
  score → distance + relevance_percent (40-95 band, same as before)
  page_number, line_number → 0 / false (memory doesn't track those
  yet — line numbers can come back via Provenance._src_chunk in a
  follow-up).

Net 67 fewer lines in server.rs. vectorsdk's `search_text_*` and
`get_vectorstore_for_namespace` no longer have a hot caller — the
lazy ensure_namespace_indexed / vectorstore_pool plumbing is on the
chopping block once the import pipelines are rewired too.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

refactor(server): drop lazy-index VectorStore plumbing + admin reembed 19858991b9

`perform_search_with_namespace` now goes through hero_memory, so the
lazy-indexing path nobody calls anymore goes too:

- vectorstore_pool / indexed_namespaces / mark_namespace_indexed
- ensure_namespace_indexed (lazy-on-first-search trigger)
- get_vectorstore_for_namespace (per-namespace VectorStore cache)

The only surviving caller — `admin.library.reembed` — is rewired to
spawn a `memory.ingest` with `force=true` instead. Same UX (admin
clicks "re-embed"), same outcome (Q&A and ontology rebuild), but the
work happens in hero_memory's process and lives in its data dir.

Net 137 fewer lines on top of the prior commit. vectorsdk's
search-side surface (`search_text_with_filters`,
`search_text_with_rerank`, `VectorStore::connect` in the search path,
plus the `vectorstore_pool` global) now has zero callers — dropping
the module is two more commits away (need to rewire
import_local_pipeline / import_collection_pipeline first, then ditto
in book/processor.rs).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

refactor(import): rewrite import_*_pipeline to delegate to memory.ingest e0e61ec4c6

Replaces the two big import drivers in web/server.rs:

* import_local_pipeline (~170 LOC):
  scan a path → AI process collections → upload embeddings.
  Now: ensure_collection + ingest, single hero_memory call.

* import_collection_pipeline (~590 LOC):
  clone/pull git → AI process → push .ai/ back to remote → upload
  embeddings.
  Now: clone/pull (kept — that's hero_books's own git wiring), then
  one memory.ingest per .collection found inside the repo. The
  push-ai-back-to-remote step is dropped — Q&A and ontology now live
  in hero_memory's data dir, not the source repo.

Companion cleanup in web/axum_server.rs:

* Delete `push_ai_to_all_repos` and the startup branch that ran it
  when `bg_config.push_ai` was set. Same rationale: nothing in hero_books
  produces .ai/ artifacts to push anymore.

Stub the now-orphaned `convert_ebooks_to_toml` callers in
`ensure_library_repos` and `discover_and_convert_ebooks` — auto-conversion
of `ebooks/*.{pdf,epub}` → `book.toml` relied on hero_books_lib::ai. The
loops still place hand-authored `.ai/ebooks/*.toml` into the library tree,
so users with a working corpus aren't blocked.

Net 626 deletions, 125 insertions on top of the prior commits — the
two pipeline functions go from ~760 LOC to ~150 LOC of memory.ingest
delegation.

Surviving callers of vectorsdk are now confined to `book/processor.rs`
and the lib.rs / ai/mod.rs re-exports. `hero_books_lib::ai`'s legacy
processor / book / topics / metadata are unused from outside the crate
itself — both modules can be deleted in the next pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

refactor: delete vectorsdk, embedder, ai/llm, convert — net 7,500 LOC 2f27376e7c

Sweeps the leftover legacy modules from hero_books_lib now that every
caller of vectorsdk + embedder + the ai LLM pipeline has been rewired
to hero_memory.

Deleted:

* `hero_books_lib/src/vectorsdk/`           (~2,150 LOC)
  store, indexer, embedding, vector_cache, error.
* `hero_books_lib/src/embedder/`            (~530 LOC)
  hero_embedder UDS HTTP wrapper.
* `hero_books_lib/src/ai/{processor,topics,metadata}.rs`
                                            (~1,300 LOC)
  Q&A extraction, topic enum, metadata sidecar.
* `hero_books_lib/src/book/{processor,ai_book}.rs`
                                            (~1,100 LOC)
  Book-level vector / semantic processing.
* `hero_books_lib/src/convert/`             (~180 LOC)
  PDF/DOCX/PPTX/HTML conversion wrappers.
* `hero_books_lib_rhai/src/convert_module.rs`
                                            (~42 LOC)
  Rhai bindings for the deleted convert helpers.
* `hero_books_examples/examples/convert_documents.rs`
* Plus a string of dead helpers in `hero_books_server/src/web/server.rs`:
  - delete_library / delete_book — embedder KVS cleanup blocks gone
  - list_url_pdf_items / delete_url_pdf — KVS metadata + vector cleanup gone
  - store_pdf_markdown_snapshot / append_pdf_version — KVS-backed PDF
    version log, now no-ops
  - serve_pdf_data — KVS regeneration fallback removed
* `library::sync_library_configs` / `save_library_config` — KVS sync
  via hero_embedder gone; disk is now the only source of truth for
  library configs.
* startup `verify_hero_embedder_health` probe — handle_memory_*
  surfaces a clear -32001 instead.

Rewired:

* `web/rpc.rs::handle_collections_process` and `handle_books_reindex`
  now call `memory.ingest` (with `force` derived from the request).
* `web/rpc.rs::handle_server_health` now reports `memoryConnected`
  (probes the hero_memory socket file).
* `web/server.rs::discover_namespaces_from_embedder` now reads
  namespaces from disk only — extra discovery should come via
  hero_memory's `namespace.list` if needed.

ai/ trimmed:

* Kept `ai::{BookConfig, PageConfig, ResolvedPage, ScanConfig,
  load_book_config, resolve_book_pages, find_books, AiError, AiResult}`
  — pure config-loading utilities used by `export_books_for_serving`.
* Removed `BookProcessingStats`, `process_book*`, `extract_qa_for_topic`,
  `DocumentMetadata`, `Topic`, etc.

Cargo.toml:

* hero_books_lib drops: unpdf, anytomd, pptx-to-md, html-to-markdown-rs,
  herolib_ai, hero_embedder_sdk.
* hero_books_server drops: hero_embedder_sdk.
* herolib_ai stays in hero_books_server — chat/voice/MCP handlers in
  axum_server.rs and mcp.rs still use it.

cargo check --workspace is green. Net 7,562 deletions, 228 insertions.

The branch is functionally complete:
- ingest, search, ontology all delegated to hero_memory via memory.*
- the legacy local pipeline is gone
- heavy doc-conversion deps are gone

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

fix(memory_client): two end-to-end bring-up bugs 35c8c5b496

End-to-end smoke against a live hero_memory exposed two bugs in the
raw client that only show up when the same MemoryClient instance is
reused across calls or inspects a collections.get result.

1. hyper14's connection pool reused sockets that hero_memory's
   server had already closed after the previous response, surfacing
   on the second-and-later calls as "connection closed before
   message completed". `Client::builder().pool_max_idle_per_host(0)`
   disables the pool — one connection per request over a UDS is
   fine and matches what the server expects.

2. `ensure_collection` checked `Result::is_err()` to decide whether
   to call collections.create. hero_memory's collections.get returns
   200-OK with `{"collection": null}` when the collection isn't
   found — i.e. an `Ok(Value)` with a null inner — so the create
   step was skipped and ingest then failed at collections.scan with
   `collection 'X' not found`. Now we inspect the result value:
   collection != null → exists; null or Rpc error → create.

Smoke test (hero_books → hero_memory → hero_aibroker + hero_db) now
walks all the way through:
  - server.health returns memoryConnected: true
  - memory.ingest creates the namespace + collection, scans 1 file,
    converts it, attempts qa.extract
  - memory.search returns hits (empty when no Q&A — the LLM is
    flaky on the free `autocheapest` route, dropping back to
    markdown output that the schema parser rejects; pipeline itself
    is sound, the prior hero_memory direct smoke showed full Q&A)
  - memory.extractOntology / memory.stats both return correctly
    shaped responses
  - search.query routes through memory.search via
    `perform_search_with_namespace`

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

fix(server): drop dead embedder probe + auto-ingest libraries to memory 8ee35facef

Two follow-ups from the e2e bring-up:

1. `check_embedder_warning` was probing the now-removed
   hero_embedder UDS socket and appending "Search unavailable:
   embedder service not running. Start it with 'make embedder'…" to
   every empty `search.query` response — even when search itself
   succeeded through hero_memory. Replaced with a hero_memory-aware
   warning that only fires when the memory client is missing or the
   socket file is absent.

2. The startup auto-ingest path was deleted in 7dbd8cf6 along with
   `process_collections_for_qa`. The UI's search returned empty for
   any pre-existing library because nothing was registered with
   hero_memory. Added a slim replacement: when `auto_index` is on
   and hero_memory is reachable, walk `library::list_library_dirs()`
   and call `memory.ingest` for each one. Idempotent — memory.ingest
   skips files at the same content_hash.

Now `geomind` (or any other on-disk library) surfaces in the UI
without a manual ingest step.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

fix(server): startup auto-prepare is scan+convert only (no LLM) 76324c82db

Auto-ingesting every library at startup was unworkable — 700 markdown
files × 8 dimensions × ~2s/LLM call = several hours per fresh boot.

`memory_client::MemoryClient` gains a sibling helper, `prepare()`,
that runs only `collections.scan` + `convert.run` (markdown
passthrough is hash-cheap, no LLM). The startup task in
`web/axum_server.rs` now calls `prepare` instead of `ingest`. All
libraries become known hero_memory collections at boot, but
`qa.extract` only runs when an operator (or the UI) explicitly calls
the `memory.ingest` RPC for a specific library.

End-to-end on this machine (4 libraries):
  default:       0 new, 0 converted
  geomind:     223 new, 124 converted
  mycelium:    482 new, 316 converted
  ourworld:    191 new, 110 converted
  znzcybercity: 63 new,  59 converted
Total ≈ 1.0s per library (md → md is essentially fs::copy).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Merge origin/development + migrate herolib_ai callsites to 0.6.0 async API

Test / test (pull_request) Failing after 16s

Details

Test / integration (pull_request) Has been skipped

Details

39c9d0b334

Brings the branch up to date with `development` (deps bumped 0.5→0.6,
hero_books_web added to workspace, server main.rs cleanup, askama 0.16
upgrade, etc.).

The herolib_ai 0.6.0 release on `development` is itself a "broker-first
rewrite" — the new client wraps `hero_aibroker_sdk` and is fully async.
That's the same direction as #63 on hero_aibroker. Five callsites in
hero_books_server are migrated:

* `axum_server.rs::handler_api_ai_summary_get` — drop spawn_blocking,
  call `AiClient::default_socket().await?.chat(Model::new("groq-strong"), …).await`.
* `axum_server.rs::handler_api_ai_summary` — same pattern.
* `axum_server.rs::handler_api_transcribe` — drop spawn_blocking,
  inline the ffmpeg + transcribe + cleanup chat. Uses
  `Model::new("groq/whisper-large-v3-turbo")` for STT.
* `web/mcp.rs::tool_ask` — sync MCP dispatcher, so bridge via
  `tokio::runtime::Handle::current().block_on(…)` (already inside a
  spawn_blocking thread). Uses `Model::new("groq-strong")`.

`Model::Llama3_3_70B` (enum variant on the old `herolib_ai`) is replaced
by string-based `Model::new("groq-strong")` — that's the alias the
broker's `modelsconfig.yml` carries for "Groq Llama 3.3 70B Versatile".

`AiClient::from_env()` (read provider keys from env, talk direct to
provider) is gone in 0.6.0; everything goes through the broker UDS now,
which means the librarian no longer needs `GROQ_API_KEY` /
`OPENROUTER_API_KEY` baked into `hero_books_server`'s action env. The
dual-auth issue we hit live evaporates.

cargo check --workspace clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

timur merged commit 7bdaa25787 into development

2026-05-08 14:46:26 +00:00

timur referenced this pull request from a commit

2026-05-08 14:46:27 +00:00

Merge pull request 'feat: delegate ingest/search/ontology to hero_memory; migrate herolib_ai to 0.6.0' (#123) from development_hero_memory into development