Implement hero_memory #1

Open
opened 2026-05-05 11:40:37 +00:00 by timur · 4 comments
Owner

Master tracking issue for hero_memory — the memory infrastructure for the Hero stack: a single service that ingests documents and code, organises them into collections, extracts Q&A pairs and ontological structure, generates embeddings, and serves retrieval to agents.

Spec: docs/prd/ and docs/adr/ on the development branch.

Strategy

The PRD describes the product. This issue tracks the build path. Each phase below produces a working subsystem with tests; phases land via commits referenced from this issue.

Phase 1 — Dimensions, provenance, extractor registry

Ground the schemas before anything depends on them.

  • hero_memory_lib::dimensions — canonical Rust enum per ADR-0004 with description, scope-guard, and target-count for each entry
  • hero_memory_lib::provenance — _src_* key constants and helpers per ADR-0003
  • hero_memory_lib::extractors — versioned extractor registry per ADR-0008

Phase 2 — Collection registry

Build the org unit and change-detection plumbing.

  • hero_memory_lib::collections — registry, file→hash map, scan, change detection, purge-by-source helpers
  • redb schemas: collections.redb, per-collection files.redb
  • RPC: collections.create | list | get | scan | process | delete
  • Wire _src_* keys into the existing index.add path so every embedding doc carries provenance

Phase 3 — Document conversion

  • hero_memory_lib::convert — PDF, PPTX, DOCX, HTML, code, plain (per PRD §04)
  • Cached markdown output keyed by content_hash
  • convert@v1 extractor registered

Phase 4 — Q&A extraction

  • hero_memory_lib::ai — hero_aibroker client wrapper per ADR-0006
  • hero_memory_lib::qa — per-dimension Q&A extraction with JSON-schema-constrained output
  • Embed Q: ... A: ... per ADR-0007; doc id {collection}::{path}::{anchor}::{dimension}::{idx} — see the sketch after this list
  • RPC: qa.extract | list | search | purge
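
For illustration, a minimal Rust sketch of that doc-id layout and the embedded text shape; the collection, path, anchor, dimension, and Q&A values below are made up, not taken from any real collection:

```rust
// Illustrative only: the doc-id scheme and "Q: ...\nA: ..." embed text described above.
fn example_doc_id_and_text() -> (String, String) {
    let (collection, path, anchor, dimension, idx) =
        ("handbook", "ops/onboarding.md", "setup", "howto", 0); // hypothetical values
    let doc_id = format!("{collection}::{path}::{anchor}::{dimension}::{idx}");
    // -> "handbook::ops/onboarding.md::setup::howto::0"
    let embed_text = format!(
        "Q: {}\nA: {}",
        "How do I request access to the staging cluster?",
        "Open a ticket with the ops team and reference the onboarding guide."
    );
    (doc_id, embed_text)
}
```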

Phase 5 — Ontology extraction

  • hero_memory_lib::ontology — load ontology via hero_db SDK, build prompt, validate response, write nodes/edges with _src_* provenance
  • ontology@v1 extractor registered
  • RPC: ontology.attach | extract | stats | purge

Phase 6 — Memory search

  • memory.search RPC — high-level Q&A retrieval with provenance per PRD §07

Phase 7 — Service script + deployment

  • service_memory — three modes (all-in-one, --inference --root, --userspace)
  • hero_proc registration per the hero_proc_service_selfstart pattern

Phase 8 — UI

  • hero_memory_ui updated to show collections, dimensions, extractor stats, and source-purge tools

Cross-repo dependencies

These will become separate issues in their respective repos when the dependent phase is unblocked:

  • hero_db — add GRAPH.PURGE_BY_SOURCE helper; document the _src_* reserved-key convention; report row counts per _src_extractor from GRAPH.STATS
  • hero_aibroker — expose chat.complete(messages, model_hint?, json_schema?, max_tokens?, timeout?) with provider-agnostic JSON-schema-constrained output (sketched after this list)
  • hero_books — once Phase 4 lands, switch ingestion to call hero_memory and slim the crate to publishing concerns
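
For reference, a hedged sketch (Rust + serde_json) of what a chat.complete request with those parameters could carry. Only the parameter names come from the bullet above; the payload values and the schema are invented:

```rust
use serde_json::{json, Value};

// Hypothetical request body for the requested chat.complete surface; everything
// except the top-level parameter names is made up for illustration.
fn example_chat_complete_params() -> Value {
    json!({
        "messages": [
            { "role": "system", "content": "Extract Q&A pairs for the given dimension." },
            { "role": "user", "content": "<converted markdown goes here>" }
        ],
        "model_hint": "small-instruct",   // hypothetical hint
        "json_schema": {                  // provider-agnostic constrained output
            "type": "object",
            "properties": {
                "pairs": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": { "q": { "type": "string" }, "a": { "type": "string" } },
                        "required": ["q", "a"]
                    }
                }
            },
            "required": ["pairs"]
        },
        "max_tokens": 1024,
        "timeout": 60
    })
}
```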

Acceptance criteria (from PRD §01)

  • A new collection can be ingested end-to-end (convert → Q&A → embed → ontology) with one CLI command, and re-running with no source changes is a no-op
  • An agent can call memory.search(query, dimension?) and get answers with full provenance
  • hero_db reports row counts per _src_extractor and supports purging by source
  • Service runs under hero_proc in all three deployment modes

Updates

This issue is the strategy log. Each commit / PR that lands a piece of the work above gets a comment summarising what changed and which acceptance bullets it advances.

Author
Owner

Phase 1 landed — dimensions, provenance, extractor registry

Commit: dbefcb9 (and lockfile sync 532ee95)

crates/hero_memory_lib/src/:

  • dimensions.rs — canonical 17-variant Dimension enum per ADR-0004: 8 document, 3 code, 6 agent_memory. Each variant exposes id(), description(), scope_guard(), target_count(), kind(). Dimension::for_kind(kind) filters the catalogue. Snake-case ids round-trip through serde and FromStr.
  • provenance.rs — Provenance struct + _src_* key constants per ADR-0003. Builder API (new + with_chunk / with_topic / with_model). to_property_pairs() emits ordered (key, value) pairs ready for graph node/edge inserts; from_properties() reads them back, validating required keys.
  • extractors.rs — ExtractorId enum (Qa | Ontology | Embed | Convert, version per variant) per ADR-0008. Display renders as name@vN; FromStr parses; serde uses the string form. Constants QA_V1 / ONTOLOGY_V1 / EMBED_V1 / CONVERT_V1 and ALL_V1 slice.

Re-exported via pub use as hero_memory_lib::{Dimension, DimensionKind, ExtractorId, Provenance}.
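
A minimal usage sketch of this surface, assuming the method names above; the exact arguments to Provenance::new and the return type of Dimension::for_kind are assumptions, not the committed signatures:

```rust
use hero_memory_lib::{Dimension, DimensionKind, ExtractorId, Provenance};

fn phase1_sketch() {
    // Walk the document-scoped dimensions from the hardcoded catalogue.
    for dim in Dimension::for_kind(DimensionKind::Document) {
        println!("{}: {} (target {})", dim.id(), dim.description(), dim.target_count());
    }

    // Extractor ids render as name@vN and round-trip through FromStr.
    let qa: ExtractorId = "qa@v1".parse().expect("valid extractor id");
    assert_eq!(qa.to_string(), "qa@v1");

    // Provenance builder: collection, source path, extractor (argument order assumed).
    let prov = Provenance::new("handbook", "ops/onboarding.md", qa).with_topic("onboarding");
    let _pairs = prov.to_property_pairs(); // ordered (key, value) pairs for graph inserts
}
```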

Tests: 17 unit tests, all passing — id roundtrips, kind partition (8+3+6=17), serde format, provenance property-pair roundtrip, missing/invalid key handling, prefix detection.

Build: cargo check -p hero_memory_lib --tests clean.

What this unblocks

Every later phase builds on these three modules. Specifically:

  • Phase 2 (collections) writes Provenance into index.add so embedding docs already carry source metadata.
  • Phase 3 (conversion) records convert@v1 against the file registry.
  • Phase 4 (Q&A) uses Dimension::for_kind to iterate per collection-kind and writes provenance with topic set.
  • Phase 5 (ontology) writes provenance onto every GRAPH.NODE.ADD / GRAPH.EDGE.ADD.

Next: Phase 2 — Collection registry

The hero_memory_lib::collections module: redb-backed registry, file→hash map, scan, change detection, _src_*-aware purge helpers. Will surface as collections.* RPC methods on hero_memory_server.

Author
Owner

Phase 2 landed — collection registry, scan, provenance threading

Commit: db053c0

Collections subsystem (hero_memory_lib::collections)

  • types.rs — Collection, CollectionKind (Document/Code/AgentLog), CollectionState (Ready/Scanning/Processing), FileRecord, FileKind, ScanReport, NewCollection, CollectionsError, validate_name. CollectionKind::dimension_kind() and Collection::applicable_dimensions() connect a collection to the dimension catalogue from Phase 1.
  • store.rs — CollectionsStore (registry redb at <data>/collections.redb) and FilesStore (per-collection redb at <data>/collections/<name>/files.redb). Public ops: create | get | list | delete, state transitions (update_state, mark_scanned, mark_processed), list_pending_for(extractor) for "what files still need this extractor."
  • scan.rs — scan(root, files) walks the filesystem, sha256-hashes every file, and reconciles the FilesStore: added / changed / unchanged / removed. A hash change clears extractor_runs and converted_hash so downstream rows are known stale and Phase 3+ extractors will re-run. Hidden files and standard build dirs (target, node_modules, .git, __pycache__, …) are skipped.
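
A minimal sketch of the lookup-and-rescan flow under the API described above; the constructor arguments, the root field on Collection, and the assumption that ScanReport derives Debug are mine, not the committed signatures:

```rust
use std::path::Path;
use hero_memory_lib::collections::{scan, CollectionsStore, FilesStore};

fn rescan(data_dir: &Path, name: &str) -> Result<(), Box<dyn std::error::Error>> {
    let registry = CollectionsStore::open(data_dir)?;   // <data>/collections.redb
    let collection = registry.get(name)?.ok_or("unknown collection")?;
    let files = FilesStore::open(data_dir, name)?;       // <data>/collections/<name>/files.redb
    let report = scan(&collection.root, &files)?;         // reconcile file hashes against disk
    println!("{report:?}");                                // added / changed / unchanged / removed
    Ok(())
}
```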

Server wiring

  • AppState gains Arc<CollectionsStore>.
  • run_server initialises it from HERO_MEMORY_DATA (renamed from EMBEDDER_DATA), default ~/hero/var/memory/data.
  • New RPC dispatch entries on hero_memory_server:
    • collections.create(name, root, kind, dimensions?, namespace?) → Collection
    • collections.list() → {collections: [Collection]}
    • collections.get(name) → {collection: Collection?}
    • collections.scan(name) → {name, report: ScanReport}
    • collections.delete(name) → {name, deleted: true}
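
For illustration, hypothetical parameter objects for two of these calls; the field names follow the signatures above, while the concrete values and the serde casing of CollectionKind are assumptions:

```rust
use serde_json::json;

fn example_collection_calls() {
    let create_params = json!({
        "name": "handbook",
        "root": "/srv/docs/handbook",
        "kind": "document",                    // CollectionKind, casing assumed
        "dimensions": ["definition", "howto"], // hypothetical dimension ids
        "namespace": "acme"
    });
    let scan_params = json!({ "name": "handbook" });
    let _ = (create_params, scan_params);
}
```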

Provenance threading (per ADR-0003)

  • Doc gains optional provenance: Provenance.
  • index.add merges _src_* keys into metadata via Provenance::merge_into_metadata before storing — every embedding row now carries its origin.
  • Provenance::to_json_object() returns a typed JSON map (preserves i64 for _src_extracted_at) for embedding metadata, alongside the string-only to_property_pairs() used for graph node/edge inserts.
  • Reserved _src_* keys win on merge collision (per the ADR).
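
A small sketch of that merge rule, assuming plain serde_json maps; the real logic lives in Provenance::merge_into_metadata, this only illustrates the collision behaviour:

```rust
use serde_json::{json, Map, Value};

// Reserved _src_* keys coming from the Provenance overwrite any colliding
// caller-supplied metadata keys; everything else in metadata is preserved.
fn merge_src_keys(provenance: Map<String, Value>, mut metadata: Map<String, Value>) -> Map<String, Value> {
    for (key, value) in provenance {
        metadata.insert(key, value); // insert() replaces an existing entry, so _src_* wins
    }
    metadata
}

fn example() {
    let mut metadata = Map::new();
    metadata.insert("title".into(), json!("Onboarding"));
    metadata.insert("_src_extractor".into(), json!("stale-value"));
    let mut prov = Map::new();
    prov.insert("_src_extractor".into(), json!("qa@v1"));
    prov.insert("_src_extracted_at".into(), json!(1767225600i64)); // i64 preserved, as to_json_object does
    let merged = merge_src_keys(prov, metadata);
    assert_eq!(merged["_src_extractor"], json!("qa@v1"));
}
```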

Verification

  • 58 lib unit tests passing — collections registry roundtrips, name validation, dimension-kind enforcement, state transitions, files-store roundtrips, scan against tempdirs (fresh, rescan, content-change, vanished, hidden-skip, file-kinds), provenance JSON-object types and merge semantics.
  • Full workspace cargo check clean.

What this unblocks

  • Phase 3 (conversion) will read FileRecord.kind to pick a converter, write the cached markdown to <data>/collections/<name>/cache/<rel>.md, and stamp extractor_runs["convert"] = 1 + converted_hash.
  • Phase 4 (Q&A) will iterate Collection::applicable_dimensions() per file, call aibroker.chat.complete, and embed each pair via index.add with provenance set — the _src_* keys are already wired through.
  • Phase 5 (ontology) will use Provenance::to_property_pairs() directly when calling GRAPH.NODE.ADD / GRAPH.EDGE.ADD.

Next: Phase 3 — Document conversion

hero_memory_lib::convert — PDF / PPTX / DOCX / HTML / code / text → cached normalised markdown, keyed by content_hash. Registers as convert@v1 in the extractor map. New RPC: collections.process will start to mean something (initially: run conversion over dirty files).

Author
Owner

Phase 3 landed — document conversion (convert@v1)

Commit: aa5643a

Conversion subsystem (hero_memory_lib::convert)

  • cache.rs — ConvertCache rooted at <collection_dir>/cache. markdown_path("a/b/c.md") → <cache>/a/b/c.md.md; images_dir("a/b/c.md") → <cache>/a/b/c.md.images. Suffixing rather than replacing the source extension keeps foo.md and foo.rs distinct in the cache. Parent-traversal in source paths is dropped defensively (see the path sketch after this list).
  • converter.rs — convert_file(input, kind, md_path, images_dir) dispatches on FileKind:
    • Markdown / Text — passthrough with \r\n/\r normalisation and a trailing newline.
    • Code — wrapped in a 4-backtick fenced block tagged with the source extension; 4 backticks tolerate nested triple-backtick blocks in the source.
    • PDF / PPTX / DOCX — unpdf / pptx-to-md / anytomd with image extraction to the per-source images_dir.
    • HTML — html-to-markdown-rs.
    • Other — ConvertError::UnsupportedFormat.
  • Output: markdown text in memory, written path, sha256 of the markdown (the converted_hash recorded on the FileRecord), optional images dir.
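
A sketch of the cache-path rule from the cache.rs bullet above; this illustrates the rule, it is not the actual ConvertCache implementation:

```rust
use std::path::{Component, Path, PathBuf};

// The source extension is suffixed rather than replaced, so foo.md and foo.rs
// map to distinct cache entries.
fn markdown_path(cache_root: &Path, rel: &str) -> PathBuf {
    // Drop parent-traversal and other non-normal components defensively.
    let safe: PathBuf = Path::new(rel)
        .components()
        .filter(|c| matches!(c, Component::Normal(_)))
        .collect();
    cache_root.join(format!("{}.md", safe.display()))
}

// markdown_path(cache, "a/b/c.md")   -> <cache>/a/b/c.md.md
// markdown_path(cache, "src/foo.rs") -> <cache>/src/foo.rs.md
```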

RPC handler

  • convert.run(collection, path?, force?) — if path is set, runs that one file; otherwise iterates the registry. Skips records that already have convert@v1 stamped at the current content_hash unless force is true. Per-file outcome (converted | skipped | failed) returned in details with reason / converted_hash / markdown_path. On success, FileRecord.converted_hash and extractor_runs["convert"] = 1 are persisted.
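
A minimal sketch of the skip/force decision described above; the FileRecord field types are assumptions based on the Phase 2 comment:

```rust
use std::collections::BTreeMap;

struct FileRecord {
    extractor_runs: BTreeMap<String, u32>, // cleared by scan when content_hash changes
    converted_hash: Option<String>,
}

fn should_convert(record: &FileRecord, force: bool) -> bool {
    // convert@v1 already stamped for the current content_hash -> skip unless forced.
    let already = record.extractor_runs.contains_key("convert") && record.converted_hash.is_some();
    force || !already
}
```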

Workspace deps added

unpdf 0.2, pptx-to-md 0.4, anytomd 1.2, html-to-markdown-rs 2.

Verification

  • 10 new convert unit tests — cache paths, distinct stems, traversal rejection, read/remove roundtrip, markdown passthrough, text line-ending normalisation, code fenced wrap, HTML, deterministic hashing, unsupported-format error.
  • 68 lib unit tests passing total (was 58). Full workspace cargo check clean.

What this unblocks

  • Phase 4 (Q&A) reads cached markdown via ConvertCache::read_markdown(rel) for each file, then calls aibroker.chat.complete per applicable dimension. Re-running qa@v1 on an unchanged file is free because conversion is already done and the markdown text is content-addressed.
  • Phase 5 (ontology) reads the same cached markdown and walks ontology TOMLs from hero_db.

Next: Phase 4 — Q&A extraction

hero_memory_lib::ai (hero_aibroker client wrapper) + hero_memory_lib::qa (per-dimension Q&A pair generation with JSON-schema-constrained output). New RPCs: qa.extract | list | search | purge. Pairs embed via index.add with provenance set, so the _src_* keys flow through automatically.

Author
Owner

Phase 7 landed — three deploy modes via CLI flags

Commit: 1b1d0a8

The hero_memory CLI is now the single entry point for all three deploy modes (per PRD §08). No Nu script — the binary itself owns service registration via hero_proc_sdk (the hero_proc_service_selfstart pattern).

Surface

```sh
# All-in-one (dev): inference + server + UI
hero_memory --start

# Shared inference daemon only (heavy, root)
hero_memory --start --inference

# Per-tenant userspace
hero_memory --start --userspace --inference-url http://127.0.0.1:8092
hero_memory --start --userspace --inference-url http://[<mycelium_ipv6>]:8092

# Stop
hero_memory --stop
```

Implementation

  • DeployMode { AllInOne, InferenceOnly, Userspace } selected by --inference / --userspace (mutually exclusive); default is AllInOne.
  • build_service_definition(mode, inference_url_override) composes the hero_proc service definition with only the actions that mode needs.
  • Userspace mode without --inference-url or HERO_MEMORY_INFERENCE_URL fails fast (exit code 2).
  • For userspace, the server's HERO_MEMORY_INFERENCE_URL env is set to the override URL; for AllInOne it points at the locally registered daemon.
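
A sketch of the mode-selection and fail-fast rules above, with flag handling written out by hand rather than through the CLI crate the binary actually uses:

```rust
#[derive(Debug, PartialEq)]
enum DeployMode {
    AllInOne,
    InferenceOnly,
    Userspace,
}

fn select_mode(inference: bool, userspace: bool) -> Result<DeployMode, String> {
    match (inference, userspace) {
        (true, true) => Err("--inference and --userspace are mutually exclusive".into()),
        (true, false) => Ok(DeployMode::InferenceOnly),
        (false, true) => Ok(DeployMode::Userspace),
        (false, false) => Ok(DeployMode::AllInOne),
    }
}

fn userspace_inference_url(flag: Option<String>) -> Result<String, i32> {
    // Userspace mode without --inference-url or HERO_MEMORY_INFERENCE_URL fails fast (exit 2).
    flag.or_else(|| std::env::var("HERO_MEMORY_INFERENCE_URL").ok())
        .ok_or(2)
}
```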

Cleanup of stale references

  • Cargo.toml: dropped a bogus "Linux: service_memory install_ort --root" line (no such Nu script in this repo) → replaced with the actual download-and-extract instruction.
  • Makefile check-deps matches.
  • crates/hero_memory_inference/src/main.rs: doc + bind-list comment no longer reference service_memory.nu.
  • docs/prd/08-deployment-and-ops.md: the deployment section now describes the actual hero_memory --start CLI rather than a Nu script that does not exist.

Verification

Full workspace cargo check clean. No remaining service_memory / service_embedder text in the repo.

Cross-repo: hero_aibroker issue filed

Phases 4 and 5 need JSON-schema-constrained chat completion in hero_aibroker, which is missing today. Filed as hero_aibroker#59 — small change per the audit (add json_schema: Option<Value> to ChatRequest, pass through to OpenAI-compatible providers).

Next

Phase 4 (Q&A) and Phase 5 (ontology) are blocked on aibroker#59. While that lands, Phase 8 (UI updates for collections + convert) can proceed.

mik-tf added this to the ACTIVE project 2026-05-06 17:32:08 +00:00