feat(11-E): seed service_agent_v3 Python flow on startup #22

Merged
mik-tf merged 7 commits from feat/11-phase-e-seed-agent-v3 into development 2026-05-06 17:35:15 +00:00

Summary

  • Phase E — seed service_agent_v3 Python flow: replaces the deleted DAG template with a @flow-decorated Python file embedded via include_str! and seeded as a Workflow + WorkflowVersion on every server startup.
  • The flow exercises the full Phase C executor stack: hero_router for service discovery, hero_aibroker REST for chat completions, generated stubs for selected services, subprocess for the generated script, and a retry-with-debug-feedback loop on failure.
  • Uses instrument(client) so every RPC call appears in the Play.spans tree.
  • Builds on top of #21 (Phase D — DAG deletion).

Changes

  • crates/hero_logic/src/seed_flows/service_agent_v3.py — flow source, ~320 LOC. @flow(name="service_agent_v3", inputs={prompt, model}).
  • crates/hero_logic/src/seed.rs — replaces the no-op stub with a BUILT_IN_FLOWS table + JSON-RPC-over-UDS upsert. Idempotent on Workflow.name so user edits survive restart.
  • crates/hero_logic/tests/e2e_create_event.rs — integration test that runs the full pipeline against live router + aibroker + osis_calendar. Skips cleanly when any prerequisite is missing.
  • .gitignore — add __pycache__/ and *.pyc.
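The `@flow`-decorated shape referenced above can be sketched as follows. The decorator here is a hypothetical stand-in defined inline so the snippet runs standalone — the real `flow` decorator lives in hero_logic's Python runtime, and only the flow name, input names, and the default model are taken from this PR; everything else is illustrative.

```python
# Stand-in for hero_logic's @flow decorator -- invented here purely so
# the sketch is self-contained; the real decorator registers the flow
# with the executor.
def flow(name, inputs):
    def wrap(fn):
        fn.flow_name = name
        fn.flow_inputs = inputs
        return fn
    return wrap

@flow(name="service_agent_v3", inputs={"prompt", "model"})
def service_agent_v3(prompt, model="deepseek/deepseek-v3.1-terminus"):
    # Real flow: discover services via hero_router, ask hero_aibroker to
    # generate a script, run it via subprocess, and retry with debug
    # feedback on failure. Body elided; return shape is an assumption.
    return {"prompt": prompt, "model": model}
```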

Test plan

  • [x] `cargo build --workspace` — clean
  • [x] `cargo test -p hero_logic --lib` — 48 tests pass (4 new in `seed.rs`)
  • [x] `cargo test -p hero_logic --test e2e_create_event` — skips cleanly without live aibroker/osis_calendar
  • [ ] Manual end-to-end with `hero_proc` + `hero_aibroker` + `hero_osis_calendar` running — verifies the agent creates an event with the expected title

How to verify locally (when services are up)

hero_proc start hero_router hero_aibroker hero_osis_calendar
cargo test -p hero_logic --test e2e_create_event -- --nocapture

🤖 Generated with Claude Code

Replaces the deleted templates/service_agent_v3.json with a
@flow-decorated Python file embedded via include_str! and seeded as a
Workflow + WorkflowVersion on every server startup. The flow exercises
the full Phase C executor stack — hero_router for service discovery,
hero_aibroker REST for chat completions, generated stubs for selected
services, subprocess for the generated script, retry-with-feedback on
failure — and uses instrument(client) so every RPC call appears in the
Play span tree.

Files:
- crates/hero_logic/src/seed_flows/service_agent_v3.py — the flow
  source. 320 LOC. @flow(name="service_agent_v3", inputs={prompt, model})
- crates/hero_logic/src/seed.rs — replaces the no-op stub with a
  BUILT_IN_FLOWS table + JSON-RPC-over-UDS upsert. Idempotent on
  Workflow.name so user edits survive restart.
- crates/hero_logic/tests/e2e_create_event.rs — integration test that
  runs the full pipeline against live router+aibroker+osis_calendar.
  Skips cleanly when any prerequisite is missing.
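The "idempotent on Workflow.name" seeding behavior can be sketched with a dict-backed stand-in for the JSON-RPC `workflow.set` path — `seed_flow` and the store shape are hypothetical; only the keyed-on-name, create-if-missing semantics come from the commit:

```python
def seed_flow(workflows: dict, name: str, source: str) -> None:
    """Idempotent upsert keyed on Workflow.name: the first seed creates
    the record; later restarts leave a user-edited copy alone."""
    if name not in workflows:
        workflows[name] = {"name": name, "python_source": source}

db = {}
seed_flow(db, "service_agent_v3", "v1 source")        # first startup: seeded
db["service_agent_v3"]["python_source"] = "user edit"  # user tweaks the flow
seed_flow(db, "service_agent_v3", "v1 source")        # restart: edit survives
assert db["service_agent_v3"]["python_source"] == "user edit"
```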

Tests:
- 4 new unit tests in seed.rs (parser, source-vs-seed name match)
- 48 hero_logic unit tests still green
- E2E test skips cleanly without live aibroker/osis_calendar

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase E commit accidentally tracked a .pyc — remove it from the index
and add __pycache__/ + *.pyc to the root .gitignore so future runs of
`python3 -m py_compile` on seed_flows/ don't show up in git status.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Discovery from running the live E2E test against a real
hero_proc/hero_router/hero_aibroker/hero_osis_calendar stack
(per the hero_running + hero_sockets + herolib_ai skills):

1. Aibroker chat is JSON-RPC `ai.chat` on rpc.sock — NOT REST
   `/v1/chat/completions` on rest.sock (rest.sock is SSE-only per
   `herolib_ai`). Rewrote `_AibrokerChat.chat()` to post a JSON-RPC
   2.0 envelope to `/rpc` with method `ai.chat`.
2. Default model bumped from `qwen/qwen-2.5-coder-32b-instruct`
   (truncates output around 100 chars in this broker config — the
   model emits stub scripts that compile-and-print-nothing) to
   `deepseek/deepseek-v3.1-terminus` which consistently emits
   complete code blocks.
3. OSIS `*_set` system-prompt guidance: the auto-generated
   `event_set(data: str)` wrapper sends `{"data": "<json string>"}`
   which the OSIS server rejects (it requires `data` to be a JSON
   object and EVERY field of the type to be present). Added an
   explicit pattern in the system prompt steering the LLM to call
   `client._call("event.set", {"data": <dict>})` with all defaults
   filled in. Stable across multiple runs.
4. Test diagnostics: print Play.spans tree on assertion (truncated
   per-line, full code visible) so future debugging doesn't require
   poking at the tempdir before it's torn down. Added matching
   `cg_span.log(code)` in the flow so the generated script lands in
   the span tree alongside its `code_chars` tag.

Result: end-to-end test passes in ~8s. Agent reads the prompt,
discovers 34 services, picks hero_osis_calendar, generates a
complete dict-based event_set call, executes it, and the resulting
event with the expected unique title is visible via `event.list`.

Tests:
- 48 hero_logic unit tests still green
- e2e_create_event passes against live stack; skips cleanly without
  live aibroker/osis_calendar

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Discovered while running the live seeded play through the UI: the OTOML
serializer used by OSIS storage chooses `"""..."""` (basic multiline)
format for `WorkflowVersion.python_source`, and that format interprets
backslash escapes on read. Two manifestations:

1. `\n` inside a Python raw string (`r"\n?"`) round-trips as a literal
   newline, breaking re.sub patterns and any other backslash-bearing
   raw string in the source.
2. Literal `"""` inside the source serialises as `\"\"` (only two of
   three quotes escaped), producing un-parseable TOML.

Both surface only after the source has gone through the JSON-RPC
`workflow.set` → OTOML store → OTOML load → executor pipeline, so the
unit-test loopback (which executes from the in-memory string) never
catches them.

Fixes in this commit, ordered narrowest-first:

- crates/hero_logic/src/engine/python_executor.rs:
  Add `encode_python_source_b64` / `decode_python_source` helpers and
  apply decode in `execute()`. A `b64:`-prefixed source is base64-decoded
  before being written to the per-Play workdir; plain sources pass
  through unchanged so direct UI/RPC uploads (which round-trip safely in
  the same process memory) still work.
- crates/hero_logic/src/seed.rs:
  Wrap the embedded python_source in `encode_python_source_b64` before
  POSTing to `workflowversion.set`. This is the path that otherwise
  triggers the OTOML round-trip bug.
- crates/hero_logic/src/seed_flows/service_agent_v3.py:
  Convert all `"""..."""` triple-double-quote string literals to
  `'''...'''` (Python treats them identically). Belt-and-braces — the
  base64 wrapper makes this unnecessary, but keeping the source
  round-trip-safe lets future direct uploads land cleanly without
  re-encoding.
- crates/hero_logic/Cargo.toml: pull in `base64 = "0.22"`.
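The `b64:` wrapper pair can be sketched in Python (the real helpers are the Rust `encode_python_source_b64` / `decode_python_source` in `python_executor.rs`; the prefix constant and pass-through behavior are from this commit, the Python rendering is a mirror for illustration):

```python
import base64

B64_PREFIX = "b64:"

def encode_python_source_b64(src: str) -> str:
    # Armor the source so OTOML storage never sees backslashes or quotes.
    return B64_PREFIX + base64.b64encode(src.encode("utf-8")).decode("ascii")

def decode_python_source(src: str) -> str:
    # Plain sources pass through unchanged, so direct UI/RPC uploads
    # that never hit the OTOML round-trip still work.
    if src.startswith(B64_PREFIX):
        return base64.b64decode(src[len(B64_PREFIX):]).decode("utf-8")
    return src

tricky = 'pat = re.compile(r"\\n?")\ndoc = """three quotes"""\n'
assert decode_python_source(encode_python_source_b64(tricky)) == tricky
assert decode_python_source("plain source") == "plain source"
```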

Verified end-to-end:
- restart hero_logic via hero_proc → seed creates `service_agent_v3`
  workflow + version + pinned current_version
- `logicservice.play_run_async` returns play sid
- play completes status=success in 23.8s with 14 spans
- target event lands in hero_osis_calendar with the correct title and
  start_time
- play visible in UI at /hero_logic/ui/plays/<sid>
- 48 hero_logic unit tests still green

Known issue (out of scope): the OTOML serializer in herolib_core
produces malformed escapes for `"""` inside `"""..."""` blocks, and
silently mangles `\n`. The right long-term fix is to teach the
serializer to pick `'''...'''` (literal multiline) for backslash- or
triple-quote-bearing content. Until then, all paths that round-trip
arbitrary Python source through OSIS storage must use the `b64:` wrapper.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The play detail page was rendering an empty cytoscape DAG (the old
Node/Edge graph deleted in Phase D) — `GRAPH_DATA = {edges:[], nodes:[]}`
because there are no nodes anymore. The data was all there in Play.spans
but the UI didn't know how to render it.

This commit replaces the DAG canvas with a span-tree view so users can
actually see what their Python flow did:

- Input pane (top): pretty-printed JSON of the @flow function's inputs
- Step tree (middle): one row per span with name, status chip,
  duration_ms. Indentation matches parent_id depth so nested
  flow.step("...") and instrument()-wrapped RPC calls visibly nest under
  their parent. Click a span to expand and see:
    * RPC service.method (for instrument()-wrapped spans)
    * Link to child play (for play_run_async sub-flows)
    * Error traceback (for failed spans)
    * Full tags JSON (parameters, results, code, etc.)
    * Captured logs (stdout/stderr the flow logged via span.log)
- Output pane (bottom): pretty-printed JSON return value
- Side panel: status, duration, tokens, top-level error, workflow link
- Live polling every 800ms while non-terminal — both server and client
  use the same depth algorithm so the running tree updates without a
  template re-render
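The shared depth algorithm can be sketched as a `parent_id` walk with a cycle cap — the cap of 32 is from this commit; the span dict shape and function name here are assumptions:

```python
def span_depths(spans: list[dict], cap: int = 32) -> dict:
    """Depth per span id by walking parent_id links upward, capped to
    defend against pathological cycles."""
    parents = {s["id"]: s.get("parent_id") for s in spans}
    depths = {}
    for sid in parents:
        depth, cur = 0, parents[sid]
        while cur is not None and depth < cap:
            depth += 1
            cur = parents.get(cur)
        depths[sid] = depth
    return depths

spans = [
    {"id": "root", "parent_id": None},
    {"id": "select", "parent_id": "root"},
    {"id": "chat1", "parent_id": "select"},
]
assert span_depths(spans) == {"root": 0, "select": 1, "chat1": 2}
```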

Removed:
- All cytoscape / dagre / cytoscape-dagre / node-html-label CDN imports
- The retry button (LogicService.play_retry was deleted in Phase D)
- node_runs / node_logs paths in the play handler — replaced by spans

Files:
- crates/hero_logic_ui/src/routes.rs:
    PlayDetailTemplate now carries input_pretty / output_pretty /
    spans_json. New helpers `pretty_json_or_string` and `build_spans_view`
    pre-shape the data with depth resolved from parent_id walks (capped
    at 32 to defend against pathological cycles).
- crates/hero_logic_ui/templates/play_detail.html:
    Drop the cytoscape canvas + drawer; render the input/steps/output
    sections inline with a JS-driven expand/collapse for span detail.

Verified: http://localhost:9988/hero_logic/ui/plays/00gg now shows the
prompt, 14 spans nested by parent_id (3 chat calls under select/code/debug
attempts, RPC spans under each composite step), full generated Python
code in the code_generation span's tags, and the agent's summary in the
root span's result tag.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mik-tf added this to the ACTIVE project 2026-05-06 17:32:06 +00:00
mik-tf changed target branch from feat/11-phase-d-delete-dag to development 2026-05-06 17:33:11 +00:00
mik-tf merged commit cbfd90e6b1 into development 2026-05-06 17:35:15 +00:00
Reference
lhumina_code/hero_logic!22