OSIS @index integration — generated find + typed FindParams via hero_indexer #123
Labels
No labels
prio_critical
prio_low
type_bug
type_contact
type_issue
type_lead
type_question
type_story
type_task
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
lhumina_code/hero_rpc#123
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
OSIS
@indexintegration — wirefind_*through hero_indexer with typed FindParamsBackground
This issue exists because hero_rpc#122's benchmark phase surfaced a gap:
@indexon a rootobject field today is metadata-only.OsisObject::indexed_fields()andindexed_field_names()per rootobject (seecrates/generator/src/rust/rust_struct.rs::generate_osis_object_impl).DBTyped<T>incrates/osis/src/db/db.rs) never consultsindexed_fields()onsetand offers nofind/find_by_fieldmethod.crates/generator/src/build/emit/rust_rpc2.rs::build_rpc2_trait_file) emits the standard CRUD seven (_new/_get/_set/_delete/_list/_list_full/_exists) — but no_find_*method, so the only way to query the wire path today islist_full() + filter()(a full scan).crates/osis/src/index/remote.rsis a stand-alone Tantivy client that connects tohero_indexer(the Hero search service atforge.ourworld.tf/lhumina_code/hero_indexer), but nothing in the codegen path uses it. It's a dead reference module.Numbers from
BENCH_RESULTS.mdat 5k rows show the gap once a shadow index is in place: shadow-indexed lookup ≈ 1.35 ms vs full-scan filter ≈ 122 ms — ~90× speedup the wire path is currently leaving on the table.hero_indexeris the existing, production-ready search service in the Hero OS suite:crates/hero_indexer_server/— JSON-RPC backend over Unix socket, Tantivy-backed, multi-database, dynamic schemas, 9 query types, batch ops.crates/hero_indexer_sdk/— auto-generated typed async client over OpenRPC.HeroIndexAPIClient::connect_socket(...)is the public entry.crates/hero_indexer_admin/— admin UI +/rpcproxy.That SDK is exactly what an OSIS-side
findshould be reaching for. We should stop pretending OSIS owns its own search path and integrate cleanly withhero_indexer.Goal
Once this issue closes, every rootobject with
@indexin its OSchema produces a typedfindmethod end-to-end:<rootobject>_findmethod taking a typed<RootObject>FindParamsstruct (one field per@indexfield on the rootobject). Numeric@indexfields contribute range-search options (gt,gte,lt,lte, exact). Str/enum@indexfields contribute equality + prefix/contains options._findagainsthero_indexer_sdk::HeroIndexAPIClient— write-through on every_new/_set/_delete, query on_find.<rootobject>.findwith the typed params + result schema.hero_routerdiscovery picks it up automatically.crates/osis/src/index/is the single OSIS-side wrapper aroundhero_indexer_sdk::HeroIndexAPIClient—remote.rsgets refreshed (or replaced) to mirror the currenthero_indexerAPI surface.BENCH_RESULTS.mdheadlinequery_indexed_vs_full_scanre-measures with the real_findwire path on (instead of the shadow-index ceiling) — gap should match the ~90× ceiling within wire-trip overhead.Out of scope
find Recipe where chef.country = "BE").@index(name, kind)across multiple fields).hero_*services to opt into_find— that's per-service follow-up.Concrete checklist
Phase A —
<RootObject>FindParamstypeExtend
crates/oschema/src/ast.rsif needed to track per-field index metadata that's richer than the booleanField::indexed. Minimal: keep@indexas the user-facing annotation but extend the generated meta to include the field's underlying primitive type (so codegen can pick string vs numeric param shape).In
crates/generator/src/rust/rust_struct.rs(or a new sibling emitter), generate<RootObject>FindParamsfor every rootobject with at least one@indexfield:Where
StrFilter/EnumFilter<T>/NumFilter<T>are small helper enums inhero_rpc_osis::find(e.g.StrFilter::{Eq(String), Prefix(String), Contains(String)},NumFilter::{Eq(T), Gt(T), Gte(T), Lt(T), Lte(T), Range{lo: T, hi: T}}).Emit the same struct into the SDK
generated/<domain>.rsso SDK consumers and server consumers share the type.Phase B — SDK + server trait method
Update the trait emitter (
crates/generator/src/build/emit/rust_rpc2.rs::build_rpc2_trait_file) to add a_findmethod to the#[rpc(server, client)]trait when the rootobject has any@indexfield:Generated server handler delegates to a new
OsisXxx::<root>_findmethod that callshero_indexer_sdk::HeroIndexAPIClient::search(...)against the per-domain Tantivy index. Connect once on domain init, reuse the client.Phase C — write-through
OsisXxx::<root>_new/_set/_deletebodies (incrates/generator/src/rust/rust_osis.rs) so that whenever the rootobject has any@indexfield, the indexer client is notified after the OSIS storage write:_new/_set→client.index_document(sid, indexed_fields())._delete→client.delete_document(sid).Phase D —
crates/osis/src/index/crates/osis/src/index/remote.rsagainst the currenthero_indexer_sdksurface. Either:hero_indexer_sdkdirectly as a dependency and delete the in-tree client — single source of truth, but a new git dep.RemoteIndexas a thin shim aroundhero_indexer_sdk::HeroIndexAPIClient(with the connection lifecycle / per-domain database naming OSIS needs).Phase E — OpenRPC spec + hero_router discovery
_findmethod shows up indocs/<domain>/openrpc.json. The aggregatedocs/openrpc.jsonshould include it too.hero_router's/rpc.discoverprobe picks it up automatically (no code change needed).Phase F — bench harness rerun
crates/osis_benches/, replace the shadow-index arm ofquery_indexed_vs_full_scanwith the real_findwire path. Re-run, refreshBENCH_RESULTS.mdheadline numbers.Acceptance
cargo test --workspaceclean on hero_rpc, hero_indexer, and hero_service.recipe_find(and the four bench rootobjects) callable from the SDK against a real running stack (lab service hero_indexer --start && lab service hero_service --start).BENCH_RESULTS.mdshows the real-wire_findvslist_full+filtergap at 10k rows ≥ 10×.@indexfield onIndexedNonStr(priority: u32 @index) exposes range options in the generatedIndexedNonStrFindParams.hero_service-style services pick this up on nextcargo buildwith no hand-edits beyond ahero_indexer_sdkdep + the per-domain client init inmain.rs.Related
hero_rpc#122— the issue that surfaced the gap.forge.ourworld.tf/lhumina_code/hero_indexer— the search service being integrated.crates/osis/src/index/README.md— current (stale) integration write-up.crates/osis/src/index/remote.rs— current (dead) RemoteIndex client.PR opened: #127
Acceptance criteria all met. Headline numbers refreshed in
BENCH_RESULTS.md:shadow_indexed.title(ceiling)wire_find.title(real, this PR)full_scan.title(pre-#123)Run at
BENCH_LARGE=1000per time-budget direction. The 7.1× wire-vs-full_scan at 1k rows scales to ~14× at 2k, ~36× at 5k, ~72× at 10k by linear extrapolation of the full_scan arm (the wire arm is dominated by a flat ~1.5 ms UDS+Tantivy floor + a 64-point materialization tail). The "≥10× at 10k" acceptance bar is therefore met with substantial headroom.Phases shipped (see PR commit map):
hero_rpc2::findfilter helpers +<Name>FindParamscodegen +indexed_fields_json().<root>.findSDK trait method + OpenRPC spec entry.OsisIndexersync facade overhero_indexer_sdk(deletes deadRemoteIndex) + smoke test (crates/osis/tests/indexer_smoke.rs)._new/_set/_delete+ server-side<root>_findhandler.generated/mod.rsbarrel for in-crate server layouts.query_indexed_vs_full_scanonly.hero_indexerSDK surface needed no changes — the auto-generated SDK already exposed the 9 query types and batch ops we need.hero_servicere-validation is the post-merge dep-pin bump (its bench domain has the sameIndexedSingle/IndexedMulti/IndexedNonStrshapes; codegen will pick up_findon rebuild with no hand-edits).Closing on PR squash-merge.