[nu-demo] hero_embedder_server panics with blocking reqwest inside tokio async context; namespace.create rejects Q1 in daemon mode #22
Symptom

Fresh deploy:

- `hero_embedder_server` would not start or serve any request.
- After patching startup, every `embed` / `rerank` RPC call would hang forever (connection accepted, no response, no error in the log).
- After patching the per-request calls, `namespace.create {name, quality: 1}` consistently returned an error, even though `hero_embedderd` had all 4 models (Q1/Q2/Q3/Q4) loaded and `namespace.list` showed an existing Q1 namespace.

Net effect: `hero_books.search.query` always returned `count: 0 warning: 'embedder service not running'`, the AI Assistant could not use `search_hero_docs`, and the LLM hallucinated citations.

Root cause
In `hero_embedder/crates/hero_embedder_lib/src/embedderd_client.rs`: the `EmbedderdClient` holds a `reqwest::blocking::Client` and uses blocking `.send()?.error_for_status()?.json()?` chains inside `embed()` and `rerank()`. These are called from `axum` async handlers in `hero_embedder_server` via `state.rs`. `reqwest::blocking::Client` spawns its own tokio runtime internally; dropping that inner runtime while the outer (main) tokio runtime is live triggers tokio's guardrail panic.

In `hero_embedder/crates/hero_embedder_lib/src/api/namespace.rs:62-71`: `namespace.create` rejects the request if `state.embedders` doesn't contain the requested quality's `EmbedderModel`. In daemon-delegation mode (`state.embedderd_client.is_some()`), the server holds no local embedders; all models live in `hero_embedderd`. The check looks in an empty local map and therefore rejects every quality.

Demo workaround (applied 2026-04-23 on `development_mik_nu_demo` branch)

Four local patches on `hero_embedder`:

- `hero_embedder_server/src/main.rs`: wrap the `discover_embedderd()` call in `tokio::task::block_in_place(|| …)`. This is legal because `main` is `#[tokio::main]` (multi-thread by default).
- `hero_embedder_lib/src/embedderd_client.rs::embed()`: wrap the `self.http.post(…).send()?.error_for_status()?.json()?` chain in `tokio::task::block_in_place(|| -> Result<_, reqwest::Error> { … })?`.
- `EmbedderdClient::rerank()`: the same wrap.
- `hero_embedder_lib/src/api/namespace.rs:62-71`: change the guard to `if state.embedderd_client.is_none() && !state.embedders.contains_key(&model) { error }`, i.e. trust the daemon.

Result after rebuild + restart: `embed` returns 384-dim vectors, `namespace.create` succeeds, `hero_books` indexed docs_hero (163 docs / 7 pages), `search.query` returns real hits, and the AI Assistant quotes verbatim from the hero_os_guide overview.

Commit on `development_mik_nu_demo` branch: `[nu-demo] wrap blocking reqwest calls in block_in_place` (3 files, ~22 ins / 17 del). Not pushed; it stays local until reviewers opt in.

Proper upstream fix
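The guard change from the workaround above is small enough to sketch in isolation, since any upstream fix has to keep the same behaviour. This is a minimal, standard-library-only illustration, not the code in `namespace.rs`: the `State`, `LocalEmbedder` and `EmbedderdClient` definitions are hypothetical stand-ins, `check_model_available` is an invented helper name, and the error string is a placeholder.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the real hero_embedder_lib types;
// the names and fields here are assumptions for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum EmbedderModel {
    Q1,
    Q2,
    Q3,
    Q4,
}

pub struct LocalEmbedder; // placeholder for a locally loaded model
pub struct EmbedderdClient; // placeholder for the daemon client

pub struct State {
    pub embedders: HashMap<EmbedderModel, LocalEmbedder>,
    pub embedderd_client: Option<EmbedderdClient>,
}

/// Returns Ok(()) if `namespace.create` may proceed for `model`.
/// (Hypothetical helper; the error text is a placeholder.)
pub fn check_model_available(state: &State, model: EmbedderModel) -> Result<(), String> {
    // Consult the local map only when there is no daemon to delegate to.
    // In daemon-delegation mode the local map is empty by design.
    if state.embedderd_client.is_none() && !state.embedders.contains_key(&model) {
        return Err(format!("embedder model {:?} is not loaded", model));
    }
    Ok(())
}
```

With a daemon client present the check passes for every quality, which is the "trust the daemon" behaviour; without one, it falls back to the local map exactly as before.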
The `block_in_place` wraps work but are fragile (multi-thread-runtime-only, and they block a whole worker thread). The clean answer is an async client:

- Change `EmbedderdClient` to hold `reqwest::Client` (async) instead of `reqwest::blocking::Client`. Remove the builder's blocking import.
- Make `embed()` and `rerank()` async fn, with `.send().await?.error_for_status()?.json().await?`.
- Keep `is_reachable()` sync by replacing its `self.http.get(…).send()` with a plain `std::net::TcpStream::connect_timeout` probe: no runtime involved, so it is safe to call at startup from sync code.
- Update `state.rs` and `api.rs` (~5 sites total) to `.await` the now-async methods. Any fn that calls `.embed(…)` / `.rerank(…)` is already `async`, so adding `.await` is a one-token change.
- Remove the `tokio::task::block_in_place` wraps from main.rs + embedderd_client.rs.
- `namespace.create`: either keep our guard change (trust the daemon when present) OR ask the daemon via its `/info` endpoint which qualities it has loaded and populate `state.embedders_available_qualities: HashSet<u8>` at startup. The guard change is simpler; the daemon check is more robust.

Why this is worth doing

Without this, hero_books has no vector search, the AI Assistant has no retrieval, and the entire “semantic grounding via MCP/OpenRPC” story falls apart on any non-trivial deploy. Our local patches unblock the demo, but the async refactor is the clean answer: probably a 1-2 hour PR against the `hero_embedder` repo.

Tracking
Related:
Filed 2026-04-23 (late evening) nu-shell demo bring-up. Signed-off-by: mik-tf
Originally filed as home#145 on 2026-04-24 by mik-tf — moved to hero_demo as part of consolidating issue tracking.
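
For the `is_reachable()` item under "Proper upstream fix", a runtime-free probe could look roughly like the sketch below. It uses only `std::net::TcpStream::connect_timeout`. The free-function signature is an assumption made for illustration; the real method lives on `EmbedderdClient` and its signature is not shown in this issue.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Returns true if a TCP connection to `addr` succeeds within `timeout`.
/// Purely synchronous: no tokio or reqwest runtime is created or dropped,
/// so it can be called from sync startup code.
/// (Hypothetical free-function form of the `is_reachable()` method.)
pub fn is_reachable(addr: &SocketAddr, timeout: Duration) -> bool {
    // connect_timeout returns an error on refusal, on timeout, or when
    // `timeout` is zero; all of these are treated as "not reachable".
    TcpStream::connect_timeout(addr, timeout).is_ok()
}
```

A probe like this only shows that something accepts connections on the port, not that the daemon's HTTP API is healthy, so it fits a startup discovery check rather than a full health check.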