feat(server): expose docusaurus generation over OpenRPC as async jobs #94

Merged
mahmoud merged 9 commits from development_expose_docusaurus_openrpc_async_jobs into development 2026-04-20 15:55:32 +00:00
Member

Summary

  • Added three new JSON-RPC methods (docs.new, docs.generate, docs.jobStatus) to hero_books_server
  • Exposes hero_books_docusaurus scaffolding and site generation as async background jobs
  • Jobs return immediately with a job ID; status is polled via docs.jobStatus
  • Concurrent calls for the same input hash share a job (dedup)
  • Output cached by input hash under .docusaurus_cache/
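The input-hash caching described above can be sketched as a pure helper (a rough illustration only — the helper name and hash scheme here are hypothetical; the real cache/hash helpers live in `server.rs`):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::path::PathBuf;

// Hypothetical sketch: derive a per-input cache directory under
// .docusaurus_cache/ by hashing the request inputs. The real helper
// names and hash function in server.rs may differ.
fn docs_cache_dir(cache_root: &str, inputs: &[&str]) -> PathBuf {
    let mut hasher = DefaultHasher::new();
    for input in inputs {
        input.hash(&mut hasher);
    }
    PathBuf::from(cache_root).join(format!("{:016x}", hasher.finish()))
}
```

Because the directory is a deterministic function of the inputs, concurrent calls with the same parameters naturally land on the same cache entry.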

Closes #92

Changes

  • crates/hero_books_server/Cargo.toml -- Added hero_books_docusaurus and uuid dependencies
  • crates/hero_books_server/src/web/server.rs -- Added DocsJobState enum, DocsJob struct, docs_jobs() registry, cache/hash helpers, 2 unit tests
  • crates/hero_books_server/src/web/rpc.rs -- Added dispatch entries and 3 handler functions (handle_docs_new, handle_docs_generate, handle_docs_job_status)
  • crates/hero_books_server/src/web/mod.rs -- Added re-exports for new public items
  • crates/hero_books_server/openrpc.json -- Added 3 new method definitions
  • crates/hero_books_server/src/web/rpc_spec.rs -- Added typed request/response structs, updated inline schema

Test Results

  • 11 tests passed, 0 failed
  • 2 new tests: test_docs_job_registry, test_docs_job_dedup
feat(server): expose docusaurus generation over OpenRPC as async jobs
All checks were successful
Test / test (pull_request) Successful in 6m44s
Test / integration (pull_request) Successful in 5m0s
bb5466f7c0
#92
fix(server): improve docs job dedup to cover completed jobs
All checks were successful
Test / test (pull_request) Successful in 6m40s
Test / integration (pull_request) Successful in 4m8s
ef04dd0189
Return existing job_id for pending, running, and done jobs with the
same input hash. Only allow retry when a previous job failed.

#92
Author
Member

End-to-End Test Results

Manually tested all three new RPC methods against a running server instance.

| Test | Result |
|------|--------|
| `docs.new` returns job_id immediately | Pass |
| `docs.jobStatus` shows `running` while in flight | Pass |
| Dedup: same params return same job_id | Pass |
| Build completes and generates static output | Pass |
| Invalid job_id returns error | Pass |
| Missing params returns error | Pass |
| `docs.generate` with bad path fails gracefully | Pass |
| Server never panics on failure | Pass |

Notes

  • bun run build exits with SIGABRT after successfully generating the static files. This is a known bun runtime issue, not related to this PR. The build output is complete and valid.
  • Dedup was fixed in a follow-up commit to also cover completed (done) jobs, not just pending/running ones. Failed jobs allow retry.
Owner

Review

The PR matches issue #92 on the mechanical bits — dep added, three methods wired, in-memory job registry, dedup by input hash, generated client and inline schema updated. Parallel jobs for distinct inputs work. Happy to merge this as a stepping stone so callers can start using the RPC shape.

Required follow-ups before/after merge

  1. Add a smoke test that exercises docs.new → docs.jobStatus polling end-to-end. Issue #92 explicitly asked for this. The two unit tests (test_docs_job_registry, test_docs_job_dedup) only poke the HashMap directly — they never go through the RPC handlers, so a bug in the handler layer wouldn't be caught.

  2. Collapse the dedup logic to the already-exported helper. find_running_docs_job is exported from server.rs but both handle_docs_new and handle_docs_generate re-implement dedup inline. The two paths also have slightly different semantics: the inline version dedups Done jobs too, the helper only dedups Pending | Running. Pick one policy and use the helper from both handlers.
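For illustration, one such unified policy could look like the following (a sketch, not this PR's actual code — the state names mirror the `DocsJobState` enum from the change list, and the registry shape here is hypothetical):

```rust
#[derive(Clone, Copy, PartialEq, Eq)]
enum DocsJobState {
    Pending,
    Running,
    Done,
    Failed,
}

// Sketch of a single shared dedup policy: reuse any non-failed job for
// the same input hash; only a Failed job falls through to a retry.
fn find_existing_docs_job(
    jobs: &[(u64, u64, DocsJobState)], // (job_id, input_hash, state)
    input_hash: u64,
) -> Option<u64> {
    jobs.iter()
        .find(|(_, hash, state)| *hash == input_hash && *state != DocsJobState::Failed)
        .map(|(id, _, _)| *id)
}
```

Whichever policy is chosen, having both handlers call one helper keeps the semantics from drifting apart again.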

  3. docs.generate should also accept a book id, not only a heroscript path. Issue #92 said "from an existing book (by book id or path)." Current impl only accepts path. Minor, but worth closing the gap.

  4. std::thread::spawn vs tokio::spawn. The issue specified tokio::spawn. For a build that shells out to bun, OS threads are defensible, but call it out in a comment so the next reader knows it was deliberate.

Architectural follow-up (separate issue, not blocking this PR)

After #92 was filed we agreed the Hero-native shape is to run docusaurus generation as a hero_proc action, with docs.* methods creating and polling runs of that action. This PR keeps jobs inside hero_books_server with an in-memory registry, which means:

  • Job state dies on server restart.
  • Logs only live in server stdout, not in hero_proc's log store.
  • Nothing shows up in the standard hero_proc admin UIs alongside other long-running work.
  • Duplicates infrastructure (job tracking, status, cancellation) that hero_proc already provides.

The external RPC shape (docs.new, docs.generate, docs.jobStatus) can stay stable when the guts are swapped, so this doesn't need to block merge — but please open a follow-up issue titled something like "Migrate docs.* jobs to hero_proc actions" and link it here.

Bottom line

Merge once (1) and (2) are addressed — (3) and (4) can fold in, and the architectural follow-up above becomes a new issue. The RPC surface is good; the internals are the part that needs to evolve.

fix(server): address review feedback on docs API
All checks were successful
Test / test (pull_request) Successful in 6m29s
Test / integration (pull_request) Successful in 4m11s
1673db34ca
- Add e2e smoke test through RPC handler (docs.new -> docs.jobStatus)
- Consolidate dedup logic into find_existing_docs_job helper (covers
  pending, running, and done jobs; failed jobs allow retry)
- Accept book_id param in docs.generate to resolve by book name
- Add comments explaining std::thread::spawn choice over tokio::spawn

#92
fix(server): address review feedback and add comprehensive tests
Some checks failed
Test / test (pull_request) Successful in 7m46s
Test / integration (pull_request) Has been cancelled
0fbafa724e
- Add e2e smoke test through RPC handler (docs.new -> docs.jobStatus)
- Consolidate dedup logic into find_existing_docs_job helper (covers
  pending, running, and done jobs; failed jobs allow retry)
- Accept book_id param in docs.generate to resolve by book name
- Return specific "book not found" error for invalid book_id
- Add comments explaining std::thread::spawn choice over tokio::spawn
- Add tests for missing params, invalid book_id, bad path graceful
  failure, and docs.generate with path (17 tests total)

#92
style(server): apply cargo fmt formatting
All checks were successful
Test / test (pull_request) Successful in 8m37s
Test / integration (pull_request) Successful in 4m55s
90d855c656
#92
fix(server): sync docs.generate spec with book_id support and replace test sleeps with polling
All checks were successful
Test / test (pull_request) Successful in 6m42s
Test / integration (pull_request) Successful in 4m14s
fccd8c2e86
- rpc_spec.rs: update DocsGenerateRequest struct and inline OpenRPC schema
  to match openrpc.json (both path and book_id optional, summary updated)
- openrpc.client.generated.rs: regenerated to reflect the new input shape
- rpc.rs tests: add wait_for_terminal_state() polling helper; replace two
  fixed 2s sleeps with bounded polling (5s timeout, 50ms step) so tests
  finish as soon as the background thread reaches a terminal state
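The bounded-polling idea reads roughly like this (a sketch of the approach, not the PR's `wait_for_terminal_state` verbatim — the string states and closure signature are assumptions):

```rust
use std::time::{Duration, Instant};

// Sketch: poll a status closure every 50ms until it reports a terminal
// state ("done" or "failed") or a 5s deadline passes, mirroring the
// bounded-polling test helper described above.
fn wait_for_terminal(mut status: impl FnMut() -> String) -> Option<String> {
    let deadline = Instant::now() + Duration::from_secs(5);
    loop {
        let state = status();
        if state == "done" || state == "failed" {
            return Some(state);
        }
        if Instant::now() >= deadline {
            return None;
        }
        std::thread::sleep(Duration::from_millis(50));
    }
}
```

Unlike a fixed sleep, this returns as soon as the background work finishes, so the common case is much faster than the timeout.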

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
refactor(server): run docs.* as hero_proc jobs instead of in-process threads
All checks were successful
Test / test (pull_request) Successful in 7m59s
Test / integration (pull_request) Successful in 4m12s
494ecd56d6
Delete the in-memory DocsJob / docs_jobs() registry and submit each
docs.new / docs.generate call as a hero_proc job via HeroProcRPCAPIClient.
The server no longer owns job lifecycle state; hero_proc does.

Why:
- job state now persists across restarts
- cancel / retry / logs / listing via hero_proc's native APIs
- docs jobs show up alongside other Hero work in admin UIs
- removes duplicate infrastructure

Shape:
- docs.new / docs.generate build an ActionSpec whose script shells out to
  the sibling `hero_docs` binary (resolved from current_exe().parent(),
  matching the hero_books service registration convention). The action is
  tagged `docs#<input_hash>` so the next call with the same inputs
  deduplicates via job.list(tag = ...) — no local HashMap needed.
- docs.jobStatus parses job_id as i64, calls job.status, maps hero_proc
  phases (pending|waiting → pending, running|retrying → running,
  succeeded → done, failed|cancelled → failed), and for failed jobs pulls
  the last ~10 log lines via job.logs.
- output_path is deterministic: get_docusaurus_cache_dir()/<hash>/build.
  Recovered from the job's tags in docs.jobStatus.
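The phase mapping above can be written as a small pure function (a sketch of the documented behaviour of `map_hero_proc_phase`; the fallback for unknown phases is an assumption):

```rust
// Maps hero_proc job phases onto the docs.* job states described above.
// Unknown phases fall back to "failed" here; the real helper's fallback
// behaviour is an assumption, not confirmed by this PR.
fn map_hero_proc_phase(phase: &str) -> &'static str {
    match phase {
        "pending" | "waiting" => "pending",
        "running" | "retrying" => "running",
        "succeeded" => "done",
        _ => "failed", // includes "failed" and "cancelled"
    }
}
```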

ServerConfig gains hero_proc_socket (env HERO_PROC_SOCKET, default
~/hero/var/sockets/hero_proc/rpc.sock) and hero_docs_bin (sibling to the
running server binary). Populated once in main.rs.

Tests: removed the six tests that relied on the in-process background
thread (they'd now need a live hero_proc). Kept the four input-validation
tests that short-circuit before any RPC call, and added unit tests for
the new pure helpers (map_hero_proc_phase, shell_quote,
calculate_docs_input_hash). 16 tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fix(server): dedup docs jobs by action_id and recover output path from job name
All checks were successful
Test / test (pull_request) Successful in 6m32s
Test / integration (pull_request) Successful in 4m58s
47039e5645
Switch from tag-based dedup to action_id-based dedup. hero_proc does not
currently persist ActionSpec.tags onto the job record — every JobSummary
came back with tags: null, so the tag filter never matched and dedup
silently failed (identical docs.new calls produced new job ids).

The hero_proc action_id field equals the action name we construct from
the input hash, is always persisted, and uniquely identifies the input —
so it's the right dedup key. Switch to filter.action_id and drop the tag
plumbing that wasn't doing anything.

Also fix docs.jobStatus output_path recovery: it was reading the hash
from summary.tags (null) too. Now parses the hash out of summary.name by
stripping the known prefix — works consistently with how we construct
the action name.

Use the full input hash in the action name instead of a 12-char prefix
— avoids theoretical collisions and makes name → hash round-trip exact.
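With the full hash in the name, the round-trip amounts to prefix stripping (a sketch; the real parsing code and prefix constants may differ):

```rust
// Sketch: recover the input hash from a hero_proc job name by stripping
// the known docs_new_ / docs_generate_ prefixes, mirroring the
// output_path recovery in docs.jobStatus.
fn hash_from_job_name(name: &str) -> Option<&str> {
    name.strip_prefix("docs_new_")
        .or_else(|| name.strip_prefix("docs_generate_"))
}
```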

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
docs: Update docstrings for async jobs
All checks were successful
Test / test (pull_request) Successful in 6m28s
Test / integration (pull_request) Successful in 4m54s
6ed816df74
- Clarify that docs.new submits a hero_proc job
- Clarify that docs.generate submits a hero_proc job
- Update docs.job_status docstring
Owner

Live validation against a running hero_proc

Restarted hero_books with commit 47039e5 installed, then drove the full flow over the Unix socket.

Results

| Check | Expected | Actual |
|---|---|---|
| Fresh `docs.new` | new numeric job id (as string) | `"826"` ✓ |
| Same args again (dedup) | same job id | `"826"` ✓ |
| `docs.jobStatus` when done | includes deterministic `output_path` | `/Users/.../.docusaurus_cache/f977b4cfcdf29197/build` ✓ |
| Different args | new job id | `"827"` ✓ |

Dedup fix along the way (47039e5)

First pass of live testing showed identical docs.new calls were producing new ids (821 → 822). Root cause: hero_proc does not currently persist ActionSpec.tags onto the job record — every JobSummary.tags came back null, so the filter.tag query silently matched nothing.

Fix: switch dedup to filter.action_id, which equals the action name we construct from the input hash, and is always persisted. Same fix applied to the output_path recovery in docs.jobStatus — it now parses the hash out of summary.name by stripping the docs_new_ / docs_generate_ prefix.

Also switched to using the full input hash in the action name (was a 12-char prefix) to rule out theoretical collisions and keep the name ↔ hash round-trip exact.

Cross-checked against hero_proc directly

```json
// job.status 821 (from hero_proc socket)
{
  "action_id": "docs_new_f977b4cfcdf2",
  "context_name": "core",
  "exit_code": 0,
  "id": 821,
  "name": "docs_new_f977b4cfcdf2",
  "phase": "succeeded"
}
```

Jobs show up under hero_proc's normal job.list / job.status / job.logs surface, which was the whole point of the refactor.

Outstanding

Not blocking this PR:

  1. Issue #91 (hero_books) — hero_docs new --path nesting bug still untouched.
  2. Issue #76 (hero_skills) — bun/node installer + service_books.nu install priming of the docusaurus template still pending.
  3. ActionSpec env passthrough — not biting here because bun/node are already on hero_proc's inherited PATH, but once #76 lands and we test on a fresh box we may need explicit .env("PATH", …) / .env("HOME", …) on the submitted spec.
  4. hero_proc bug to flag upstream — ActionSpec.tags aren't propagated to JobSummary.tags. Not our bug; worth filing against hero_proc so admin UI tag filtering works in future. We're now using action_id instead, so not blocked.

Recommendation

From my side this PR is merge-ready. External RPC surface is stable, all four behaviours validated live, 16 unit tests green, net ~200 LOC deleted overall vs the original in-process design.

mahmoud merged commit d46c1b76d3 into development 2026-04-20 15:55:32 +00:00