fix(osis): make OsisIndexer async, bridge sync handlers via dedicated runtime #130
No reviewers
Labels
No labels
prio_critical
prio_low
type_bug
type_contact
type_issue
type_lead
type_question
type_story
type_task
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
lhumina_code/hero_rpc!130
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "fix-osis-indexer-async"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
OsisIndexeris now async throughout — dropped the ownedRuntime, public methods (index_document,delete_document,search,try_initialize) areasync fnand.awaithero_indexer_sdkdirectly. Schema registration runs once viatokio::sync::OnceCell::get_or_try_init.Added sync shims (
index_document_blocking,delete_document_blocking,search_blocking,try_initialize_blocking) that dispatch async work onto a sharedOnceLock<Runtime>bridge and block the caller on a stdlibsync_channel. Safe from any caller context (inside or outside another tokio runtime). The async surface remains the source of truth; the shims exist only to bridge the existing sync OSIS handler boundary.Generator (
rust_osis.rs) emits the*_blockingshims from the generated<root>_new/_set/_deletewrite-through and<root>_findarm. Before/after at the write-through call site:Smoke test →
#[tokio::test]+.await. Addedblocking_shims_safe_inside_runtimeto exercise the sync shims from inside an outer tokio runtime — exactly the pre-fix panic site — to lock the regression in.Bench wires the now-async
try_initializethrough the bench-ownedRuntimeviart.block_on(...).Why
hero_rpc#127 wired the per-domain Tantivy indexer into the generated OSIS CRUD methods via a sync facade —
OsisIndexerowned a privatetokio::runtime::Runtimeandblock_on-ed everyhero_indexer_sdkcall. That panicked with "Cannot start a runtime from within a runtime" the moment the generated handler was invoked from inside an outer tokio context (jsonrpsee server worker, or hero_service#[tokio::test]/#[tokio::main]driving CRUD).The architectural choice was wrong:
hero_indexer_sdkis generated viaopenrpc_client!and is async-only. Wrapping it in a blocking facade is the bug. Principle locked in via codegen:The future-state migration (when the generated
<root>_new/_set/_deletethemselves becomeasync fn) drops the shims entirely and.awaits the async surface — a one-line change in the codegen at that time.Test plan
cargo test -p hero_rpc_osis --test indexer_smoke --features rpc -- --nocapture— 2/2 pass includingblocking_shims_safe_inside_runtime(the pre-fix panic site, now safe).BENCH_LARGE=1000 cargo bench -p hero_rpc_osis_benches --bench index_perf -- --quick 'query_indexed_vs_full_scan/wire_find.title/1000'— preflight asserts333 hits over 1000 rowsas expected; bench completes in~3.2 msper iter with no nested-runtime panic.cargo test -p hero_rpc_osis --lib --features rpc— 85 unit tests pass.cargo test -p hero_rpc_generator --lib— 137 generator unit tests pass.cargo build --workspaceclean.Related: hero_rpc#127 (original indexer wiring), hero_rpc#128 (codegen-emitted
_find_*E2E tests that surfaced the panic when invoked from hero_service's outer tokio runtime).The generated `<root>_new`/`_set`/`_delete`/`_find` methods on `Osis<Domain>` are sync `pub fn` — they're invoked from sync `*_rpc_new` dispatch arms and (over the wire) from the sync jsonrpsee `register_method` closures. They can't `.await` the async `OsisIndexer` surface directly. Switch the emitted calls to the `*_blocking` shims hero_rpc#129 adds on `OsisIndexer`. These dispatch the async work onto a shared bridge runtime and block on a stdlib channel — safe to call from any caller context (inside or outside another tokio runtime), which is exactly the panic the previous `block_on`-in-place facade hit. Before (panicked when called from inside an outer tokio runtime): let _ = self.indexer.index_document("recipe", &sid, fields); After (safe from any context): self.indexer.index_document_blocking("recipe", &sid, fields); Same change for the delete path (`delete_document_blocking`) and the `_find` arm (`search_blocking`). The async surface on `OsisIndexer` remains the source of truth; future async OSIS handlers should `.await` it directly and skip the shims entirely.Cross-repo validation note
hero_service
developmentcannot currently build against this PR (or against hero_rpcdevelopmenteither, independently of this work): it importshero_rpc_osis::rpc::bootstrap::MultiDomainBuilder, which only exists on the unmergedissue-124-lifecycle-alignmentbranch in this repo (commit5e72817). Both before and after this PR,cargo build --workspacein hero_service fails with the sameMultiDomainBuildernot-found error.The full cross-repo
cargo test --workspacevalidation the task spec called for is therefore blocked on hero_rpc#124 landing first — it is not regressed by this PR. The OSIS indexer panic this PR fixes is covered by the in-repo smoke test (blocking_shims_safe_inside_runtime, which reproduces the exact pre-fix scenario from inside an outer#[tokio::test]runtime) and by the bench wire-arm preflight (333 hits over 1000 rowsoverwire_find.title/1000).Closing in favor of #133, which implements the ideal architectural fix instead of bridging the panic at the sync->async boundary.
#133 makes every link in the chain async: the generated OSIS CRUD handlers (_new / _set / _delete / _find / _get / _list / _list_full / _exists) are emitted as
async fn, theOsisAppRpcHandlertrait is async (RPITIT),rpc2_adapterusesregister_async_method, andOsisIndexerkeeps the async surface this branch salvaged. The result is one continuous.awaitstack from the network closure to the indexer socket — nospawn_blockingon the CRUD hot path, no bridge runtime, no*_blockingshims.The async OsisIndexer code in
crates/osis/src/index/remote.rsfrom this branch was salvaged as the starting point of #133. The bench file + smoke test changes from this branch landed too (minus the now-obsoleteblocking_shims_safe_inside_runtimetest, whose premise goes away once the shims do).Not deleting the branch yet — #133's history references it.
Pull request closed