hero_runner_py

A pre-fork Python script execution service for the Hero stack. It manages isolated Python virtual environments (via uv) and executes scripts with live log streaming, persistent session tracking, and timeout enforcement — all with zero interpreter startup overhead thanks to a pre-forked worker pool.


What it does

hero_runner_py lets callers submit a Python script (inline code or a file path) and get back:

  • Live stdout/stderr streamed as Server-Sent Events while the script runs
  • A final result (exit code, stdout, stderr, success flag) stored persistently
  • Session history browsable via the JSON-RPC API or the admin UI
  • Timeout enforcement — scripts are hard-killed after timeout_ms milliseconds, even across a POSIX fork boundary

All of this is delivered over a Unix Domain Socket using JSON-RPC 2.0.


Architecture

                ┌──────────────────────────────────────┐
                │            parent process            │
                │  (multi-threaded tokio/axum server)  │
                │                                      │
                │  JSON-RPC 2.0 over UDS               │
                │  SessionManager   RuntimeManager     │
                │  SSE fan-out      hero_log persist   │
                └──────────────────┬───────────────────┘
                                   │  UnixStream socket pair
                    ┌──────────────▼───────────────┐
                    │        worker process        │  ← pre-forked, single-threaded
                    │  (× HERO_RUNNER_PY_WORKERS)  │
                    └──────────────┬───────────────┘
                                   │  fork per job
                    ┌──────────────▼───────────────┐
                    │      grandchild process      │  ← isolated per script
                    │  runner::execute()           │
                    │  select(2) I/O loop          │
                    │  SIGALRM timeout handler     │
                    └──────────────┬───────────────┘
                                   │  spawn
                    ┌──────────────▼───────────────┐
                    │        Python process        │  ← own process group (setpgid)
                    │  uv venv interpreter         │
                    └──────────────────────────────┘

Three-level fork hierarchy

| Level | Process | Responsibility |
|---|---|---|
| Parent | tokio/axum | Accept connections, route RPC, manage sessions, fan out SSE |
| Worker | single-threaded | Hold a socket pair; fork a grandchild per job; forward frames |
| Grandchild | single-threaded | Run runner::execute; stream WorkerFrame records back |

IPC protocol

All pipes and UnixStream channels use length-prefixed JSON:

[4 bytes: body length as u32 little-endian][JSON body bytes]

The grandchild emits a sequence of WorkerFrame records per job:

Started  { pid: i32 }             ← first frame
Log      { line: LogLine }        ← zero or more, as stdout/stderr arrives
Result   { result: ScriptResult } ← final, terminal frame
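
For reference, a minimal sketch of the framing in Rust. The canonical implementation is write_msg / read_msg in ipc.rs; the signatures below are illustrative, not the actual API:

use std::io::{Read, Result, Write};

// Write one message: 4-byte little-endian length prefix, then the JSON body.
fn write_msg<W: Write>(w: &mut W, body: &[u8]) -> Result<()> {
    w.write_all(&(body.len() as u32).to_le_bytes())?;
    w.write_all(body)?;
    w.flush()
}

// Read one message: the 4-byte length header first, then exactly that many body bytes.
fn read_msg<R: Read>(r: &mut R) -> Result<Vec<u8>> {
    let mut len = [0u8; 4];
    r.read_exact(&mut len)?;
    let mut body = vec![0u8; u32::from_le_bytes(len) as usize];
    r.read_exact(&mut body)?;
    Ok(body)
}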

Runtime layout

$HERO_PYTHON_RUNTIMES_DIR/
  <name>/
    runtime.toml   ← metadata (name, python_version, modules, paths, timestamps,
                               installing, operational, smoke_session_id)
    bin/python     ← uv-created venv interpreter
    lib/           ← installed packages
    ...

Each runtime is a complete uv virtual environment created in its own named directory.

The runtime.toml tracks three lifecycle flags:

| Field | Meaning |
|---|---|
| installing | true while uv is downloading Python / creating the venv / installing packages |
| operational | true after a successful smoke test (print('hello world')) |
| smoke_session_id | Session ID of a failed smoke test, kept for log inspection |
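
As a concrete illustration, the metadata could be modelled with a serde struct like the one below. This is a hypothetical mirror of the fields listed above and in RuntimeInfo; the authoritative definition lives in runtime.rs:

use serde::{Deserialize, Serialize};

// Hypothetical shape of runtime.toml; timestamp formats are assumed, not verified.
#[derive(Debug, Serialize, Deserialize)]
struct RuntimeToml {
    name: String,
    python_version: String,
    modules: Vec<String>,
    venv_path: String,
    created_at: String,
    updated_at: String,
    installing: bool,              // true while uv is still setting the venv up
    operational: bool,             // true after a passing smoke test
    smoke_session_id: Option<u64>, // set only when a smoke test failed
}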

Socket layout

$HERO_SOCKET_DIR/hero_runner_py/
  rpc.sock       ← JSON-RPC 2.0 endpoint
  discovery.sock ← service discovery (name, version)

Crates

| Crate | Binary | Purpose |
|---|---|---|
| hero_runner_py_server | hero_runner_py_server | Core service: worker pool, JSON-RPC, SSE, runtime management |
| hero_runner_py_admin | hero_runner_py_admin | Web UI proxy that serves a dashboard over HTTP |
| hero_runner_py_tests | (none) | Integration test suite |

Source map

hero_runner_py_server/src/

| File | Purpose |
|---|---|
| main.rs | Entry point: uv check → pool fork → tokio runtime → default runtime init → smoke test → axum serve |
| lib.rs | Crate root, module declarations, public re-exports |
| types.rs | ScriptRequest, ScriptResult, WorkerFrame, LogLine, LogKind, ScriptKind |
| runner.rs | Fork-safe Python subprocess execution (select(2) loop, SIGALRM, process-group kill) |
| worker.rs | Worker process loop: receive requests, fork grandchildren, forward frames |
| pool.rs | WorkerPool: manages worker sockets, dispatches jobs, collects results |
| session.rs | SessionManager: session lifecycle, hero_db IDs, tokio broadcast for SSE, delete support |
| ipc.rs | write_msg / read_msg: length-prefixed JSON framing |
| runtime.rs | RuntimeManager: stateless reader/writer of HERO_PYTHON_RUNTIMES_DIR; tracks installing/operational state |
| uv.rs | uv wrappers: create_venv, pip_install, ensure_python, python_interpreter |
| openrpc.rs | AppState, rpc_router, all JSON-RPC method handlers |
| sockets.rs | UDS helpers: socket_dir, service_socket_dir, socket_path, bind_unix_socket |
| sse.rs | GET /sse/session/{id}: SSE fan-out from the SessionManager broadcast channel |
| proc_log.rs | Forwards LogLine records to hero_proc / hero_log |
| discovery.rs | GET /discovery: service metadata endpoint |
| assets.rs | rust-embed wrapper for compiled-in assets |

Environment variables

| Variable | Default | Effect |
|---|---|---|
| HERO_RUNNER_PY_WORKERS | 4 | Number of worker processes to pre-fork |
| HERO_RUNNER_PY_FORWARD_LOGS | false | Forward each log line to hero_proc (set 1/true/yes/on) |
| HERO_SOCKET_DIR | ~/hero/var/sockets | Root directory for all Hero UDS sockets |
| HERO_PYTHON_RUNTIMES_DIR | ~/hero/python_runtimes | Where Python venvs are stored |
| UV_PATH | (searched in PATH) | Override path to the uv binary |

Building and running

# Build everything
cargo build --release

# Run the server (uv must be in PATH or UV_PATH set)
./target/release/hero_runner_py_server

# Optional: run the admin web UI (connects to the server via UDS)
./target/release/hero_runner_py_admin

On first startup the server automatically creates the default runtime (Python 3.12 with a standard set of packages) if it does not already exist, then runs a smoke test to confirm the runtime is operational. The smoke test session is cleaned up automatically on success; on failure it is kept so you can inspect its logs.


Quick start

Check health

echo '{"jsonrpc":"2.0","id":1,"method":"health","params":{}}' \
  | nc -U ~/hero/var/sockets/hero_runner_py/rpc.sock
{"jsonrpc":"2.0","id":1,"result":{"status":"ok","service":"hero_runner_py","version":"0.1.0"}}

Create a Python runtime

echo '{"jsonrpc":"2.0","id":1,"method":"runtime_init","params":{"name":"default","python_version":"3.12","modules":["requests","numpy"]}}' \
  | nc -U ~/hero/var/sockets/hero_runner_py/rpc.sock

This installs Python 3.12 (via uv python install) and creates a venv with requests and numpy. The runtime.toml is written immediately with installing: true so callers can see progress; installing is cleared to false once all packages are installed.

Run a script

echo '{"jsonrpc":"2.0","id":2,"method":"session_start","params":{"script":"print(\"hello from Python!\")","runtime_name":"default","timeout_ms":5000}}' \
  | nc -U ~/hero/var/sockets/hero_runner_py/rpc.sock
{"jsonrpc":"2.0","id":2,"result":{"session_id":1}}

If the runtime is still installing, session_start fails immediately with:

Runtime 'default' is still being installed. Please wait for it to finish.

Stream live output

curl --unix-socket ~/hero/var/sockets/hero_runner_py/rpc.sock \
  http://localhost/sse/session/1
data: {"kind":"log","line":{"session_id":1,"timestamp_ms":1715000000000,"seq":1,"kind":"stdout","line":"hello from Python!"}}

data: {"kind":"result","result":{"success":true,"exit_code":0,"stdout":"hello from Python!","stderr":"","error":null}}

Get the final result

echo '{"jsonrpc":"2.0","id":3,"method":"session_result","params":{"session_id":1}}' \
  | nc -U ~/hero/var/sockets/hero_runner_py/rpc.sock
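
The same calls can be made from code. A minimal Rust client sketch, assuming (as the nc examples above suggest) that the socket accepts one newline-terminated JSON-RPC request and answers with a single response line; rpc_call is an illustrative helper, not part of the service:

use std::io::{BufRead, BufReader, Write};
use std::os::unix::net::UnixStream;

// Hypothetical helper: one request line out, one response line back.
fn rpc_call(sock: &str, request: &str) -> std::io::Result<String> {
    let mut stream = UnixStream::connect(sock)?;
    stream.write_all(request.as_bytes())?;
    stream.write_all(b"\n")?;
    let mut line = String::new();
    BufReader::new(stream).read_line(&mut line)?;
    Ok(line)
}

fn main() -> std::io::Result<()> {
    let sock = format!(
        "{}/hero/var/sockets/hero_runner_py/rpc.sock",
        std::env::var("HOME").expect("HOME not set")
    );
    let req = r#"{"jsonrpc":"2.0","id":1,"method":"session_result","params":{"session_id":1}}"#;
    println!("{}", rpc_call(&sock, req)?);
    Ok(())
}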

JSON-RPC API

All requests follow JSON-RPC 2.0 over the UDS at $HERO_SOCKET_DIR/hero_runner_py/rpc.sock.

health

Liveness probe.

{"jsonrpc":"2.0","id":1,"method":"health","params":{}}

Response: {"status":"ok","service":"hero_runner_py","version":"0.1.0"}


session_start

Start a Python script execution. Returns a session_id immediately; execution runs asynchronously.

| Param | Type | Required | Default | Description |
|---|---|---|---|---|
| script | string | yes | | Python source code or a file path |
| script_kind | "code" or "file" | no | "code" | Inline code or a path to a .py file |
| runtime_name | string | no | "default" | Which runtime venv to use |
| working_dir | string | no | "." | Working directory for the Python process |
| env_vars | object | no | {} | Extra environment variables |
| timeout_ms | integer | no | 0 (no limit) | Hard-kill timeout in milliseconds |
| forward_logs | boolean | no | server default | Override per-session log forwarding |

{
  "jsonrpc":"2.0","id":1,"method":"session_start",
  "params":{
    "script": "import time\nfor i in range(5):\n    print(i)\n    time.sleep(0.1)",
    "runtime_name": "default",
    "timeout_ms": 10000
  }
}

Response: {"session_id": 42}


session_stop

Request cancellation of a running session. The Python process group is killed immediately.

{"jsonrpc":"2.0","id":2,"method":"session_stop","params":{"session_id":42}}

Response: {"ok": true}


session_delete

Stop a session if still running, then remove it from the session list entirely.

{"jsonrpc":"2.0","id":2,"method":"session_delete","params":{"session_id":42}}

Response: {"ok": true}


session_list

List all known sessions (running and finished).

{"jsonrpc":"2.0","id":3,"method":"session_list","params":{}}

Response: {"sessions": [{"id":1,"status":"succeeded",...}, ...]}


session_get

Get metadata for a single session.

{"jsonrpc":"2.0","id":4,"method":"session_get","params":{"session_id":42}}

Response: SessionInfo object with id, status, runtime_name, started_at_ms, ended_at_ms.


session_logs

Page through the captured log history for a session. Supports cursor-based pagination.

| Param | Type | Required | Description |
|---|---|---|---|
| session_id | integer | yes | Session to query |
| after_ts_ms | integer | no | Return only lines after this timestamp |
| after_seq | integer | no | Return only lines after this sequence number |
| limit | integer | no | Maximum lines to return (default: 5000) |

{
  "jsonrpc":"2.0","id":5,"method":"session_logs",
  "params":{"session_id":42,"limit":50}
}

Response:

{
  "lines": [
    {"session_id":42,"timestamp_ms":1715000001000,"seq":1,"kind":"stdout","line":"0"},
    {"session_id":42,"timestamp_ms":1715000001100,"seq":2,"kind":"stdout","line":"1"}
  ],
  "next_ts_ms": 1715000001100,
  "next_seq": 2
}
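
When a session has more lines than limit, feed the returned cursors back as after_ts_ms / after_seq and repeat. A sketch of that cursor bookkeeping, with a transport-agnostic call closure (hypothetical, not part of the API):

use serde_json::{json, Value};

// `call` performs one session_logs request and returns the parsed "result"
// object; how it talks to the socket is out of scope for this sketch.
fn fetch_all_logs(call: impl Fn(Value) -> Value, session_id: u64) -> Vec<Value> {
    let limit: usize = 5000;
    let mut params = json!({ "session_id": session_id, "limit": limit });
    let mut all = Vec::new();
    loop {
        let result = call(params.clone());
        let lines = result["lines"].as_array().cloned().unwrap_or_default();
        let done = lines.len() < limit;
        all.extend(lines);
        if done {
            return all;
        }
        // Resume exactly after the last line we received.
        params["after_ts_ms"] = result["next_ts_ms"].clone();
        params["after_seq"] = result["next_seq"].clone();
    }
}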

session_result

Get the final ScriptResult for a completed session. Returns null if the session is still running.

{"jsonrpc":"2.0","id":6,"method":"session_result","params":{"session_id":42}}

Response:

{
  "success": true,
  "exit_code": 0,
  "stdout": "0\n1\n2\n3\n4",
  "stderr": "",
  "error": null
}

runtime_init

Create a new Python runtime (venv + packages). Installs the requested Python version if not already present. Writes runtime.toml immediately with installing: true; clears it once all steps complete.

| Param | Type | Required | Description |
|---|---|---|---|
| name | string | yes | Unique runtime name |
| python_version | string | yes | Python version, e.g. "3.12" or "3.12.3" |
| modules | string[] | no | pip packages to install immediately |

{
  "jsonrpc":"2.0","id":7,"method":"runtime_init",
  "params":{"name":"ml","python_version":"3.11","modules":["torch","numpy"]}
}

Response: RuntimeInfo object.


runtime_list

List all registered runtimes.

{"jsonrpc":"2.0","id":8,"method":"runtime_list","params":{}}

Response: {"runtimes": [{"name":"default","python_version":"3.12","installing":false,"operational":true,...}, ...]}


runtime_get

Get metadata for a single runtime.

{"jsonrpc":"2.0","id":9,"method":"runtime_get","params":{"name":"default"}}

Response: RuntimeInfo with name, python_version, modules, venv_path, created_at, updated_at, installing, operational, smoke_session_id.


runtime_install

Install additional pip packages into an existing runtime. Sets installing: true during the install, clears it on completion.

{
  "jsonrpc":"2.0","id":10,"method":"runtime_install",
  "params":{"name":"default","modules":["pandas","matplotlib"]}
}

Response: {"modules": ["requests","pandas","matplotlib"]} (full updated module list).


runtime_delete

Delete a runtime and its virtual environment directory.

{"jsonrpc":"2.0","id":11,"method":"runtime_delete","params":{"name":"old-runtime"}}

Response: {"ok": true}


runtime_test

Run a smoke test against a runtime (print('hello world') + sys.version). Waits for completion, cleans up the internal session, and marks the runtime operational on success.

{"jsonrpc":"2.0","id":12,"method":"runtime_test","params":{"name":"default"}}

Response:

{
  "success": true,
  "exit_code": 0,
  "stdout": "hello world\npython: 3.12.3 ...",
  "stderr": ""
}

SSE streaming

Subscribe to live output from a session:

GET /sse/session/{session_id}

The response is a standard text/event-stream. Each event carries a JSON-encoded WorkerFrame:

data: {"kind":"log","line":{"session_id":1,"timestamp_ms":...,"seq":3,"kind":"stdout","line":"hello"}}

data: {"kind":"result","result":{"success":true,"exit_code":0,...}}

The stream closes after the Result frame is delivered. If the session is already complete when you subscribe, all stored log lines are replayed first, then the result is sent, and the stream closes.


Runtime management

Runtimes are isolated Python virtual environments managed by uv. Each runtime:

  1. Has a unique name used in session_start as runtime_name
  2. Uses a pinned Python version installed by uv python install
  3. Has its own pip-installed packages that do not affect other runtimes
  4. Is stored on disk and survives server restarts (metadata in runtime.toml)

Lifecycle states

runtime_init called
      │
      ▼
installing: true   ← visible immediately via runtime_list
      │  uv downloads Python, creates venv, installs packages
      ▼
installing: false, operational: false
      │
      ▼  smoke test runs automatically (or manually via runtime_test)
      │
   ┌──┴──────────────────┐
   │ passed              │ failed
   ▼                     ▼
operational: true    smoke_session_id set  ← session kept for log inspection

Default runtime

On first startup, the server automatically initialises a runtime named "default" (Python 3.12, standard packages) if it doesn't already exist, then runs the smoke test. Subsequent restarts skip init if the runtime already exists.

The default module set:

ipython, requests, httpx, pydantic, rich, pandas, numpy, python-dotenv, toml

Admin UI

hero_runner_py_admin serves a dashboard on its configured HTTP port (by default reached through the hero_router proxy).

Sessions tab

  • Live table of all sessions with status badges, runtime name, duration
  • Per-row: view detail, view logs, stop (running only)
  • Bulk selection → Stop selected or Delete selected (stops if running, removes from list)
  • Search by ID or status

Runtimes tab

  • Table with Name, Python Version, Modules, Status, Created
  • Status badges: installing… (animated spinner), operational, smoke failed (with link to session logs), pending
  • Right-click context menu on any row:
    • Test it works — opens a modal that runs runtime_test and shows exact script + output
    • Install modules — opens the install modal
    • Delete runtime — confirms then deletes
  • Action buttons in each row (test, install modules, delete)
  • New Runtime button opens a creation modal

Admin → Maintenance tab

  • Stop all running sessions
  • Clear local UI state
  • Performance benchmark — configurable N sessions, concurrency level, and target runtime:
    • Runs print('hello world') N times
    • Live progress bar + elapsed timer
    • Results: total time, avg/session (ms), sessions/sec
    • Stop button to abort mid-run

Timeout behavior

Timeout is enforced at the POSIX level with zero threading:

  1. The grandchild process calls alarm(secs) before starting I/O
  2. A custom SIGALRM handler runs in the grandchild:
    • Calls kill(-pgid, SIGKILL) to kill the entire Python process group
    • Calls kill(pid, SIGKILL) to ensure the direct child is killed
    • Resets SIGALRM to SIG_DFL so the grandchild itself can die on the next alarm
  3. The Python interpreter is placed in its own process group via setpgid(0, 0) in pre_exec, so the kill signal reaches all child processes the script may have spawned

This design is fork-safe on macOS: no threads are spawned inside the grandchild, avoiding the pthread_create / mutex deadlock that can occur when a multithreaded parent is forked.
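
A condensed sketch of that mechanism using raw libc calls. The real implementation lives in runner.rs; the helper below is illustrative and assumes the libc crate:

use std::os::unix::process::CommandExt;
use std::process::{Child, Command};
use std::sync::atomic::{AtomicI32, Ordering};

// Pgid of the Python process group; written after spawn, read by the handler.
static CHILD_PGID: AtomicI32 = AtomicI32::new(0);

extern "C" fn on_alarm(_sig: libc::c_int) {
    let pgid = CHILD_PGID.load(Ordering::SeqCst);
    unsafe {
        if pgid > 0 {
            libc::kill(-pgid, libc::SIGKILL); // the whole Python process group
            libc::kill(pgid, libc::SIGKILL); // the direct child, for good measure
        }
        // Back to the default action so a later alarm kills this process too.
        libc::signal(libc::SIGALRM, libc::SIG_DFL);
        libc::alarm(1);
    }
}

fn spawn_with_timeout(python: &str, script: &str, timeout_secs: u32) -> std::io::Result<Child> {
    let mut cmd = Command::new(python);
    cmd.arg("-c").arg(script);
    unsafe {
        // New process group for the interpreter, so kill(-pgid, ...) also
        // reaches any children the script itself spawns.
        cmd.pre_exec(|| {
            libc::setpgid(0, 0);
            Ok(())
        });
        let handler = on_alarm as extern "C" fn(libc::c_int);
        libc::signal(libc::SIGALRM, handler as libc::sighandler_t);
    }
    let child = cmd.spawn()?;
    CHILD_PGID.store(child.id() as i32, Ordering::SeqCst);
    let _ = unsafe { libc::alarm(timeout_secs) }; // arm the hard-kill timer
    Ok(child)
}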


Logging

Live stream

While a script runs, each stdout/stderr line is emitted as a WorkerFrame::Log frame on the worker → parent pipe and then broadcast to all SSE subscribers via a tokio::broadcast channel.

Persistent storage

Each log line is also written to hero_log using the source path:

runner.session.<session_id>

Content format per line:

<kind>:<timestamp_ms>:<seq>|<text>

For example:

stdout:1715000001000:3|processing item 7
stderr:1715000001050:4|warning: low memory

This format allows exact cursor-based pagination in session_logs using after_ts_ms + after_seq.
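
A hypothetical parser for this line format, mainly to make the field boundaries concrete (the header never contains |, so the text after it may safely contain colons or pipes):

// Parse "<kind>:<timestamp_ms>:<seq>|<text>" into its four fields.
fn parse_log_line(raw: &str) -> Option<(&str, u64, u64, &str)> {
    let (header, text) = raw.split_once('|')?;
    let mut parts = header.splitn(3, ':');
    let kind = parts.next()?;
    let ts: u64 = parts.next()?.parse().ok()?;
    let seq: u64 = parts.next()?.parse().ok()?;
    Some((kind, ts, seq, text))
}

fn main() {
    assert_eq!(
        parse_log_line("stdout:1715000001000:3|processing item 7"),
        Some(("stdout", 1715000001000, 3, "processing item 7"))
    );
}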

Process logging

Service-level events (startup, socket bind, runtime init, smoke test results) are emitted to herolib_core::Logger with source "hero_runner_py" and forwarded to hero_log in the usual Hero stack way.


Testing

The hero_runner_py_tests crate contains integration tests split into two groups:

| Group | Requires | Run condition |
|---|---|---|
| Pure-Rust (IPC, types, runner basics) | Nothing | Always |
| uv-dependent (pool, sessions, runtimes) | uv + Python | --include-ignored |

Running tests

# Pure-Rust tests only
cargo test -p hero_runner_py_tests

# All tests including uv-dependent ones
cargo test -p hero_runner_py_tests -- --include-ignored --test-threads=1

--test-threads=1 is required because tests set HERO_PYTHON_RUNTIMES_DIR via std::env::set_var and fork a WorkerPool per test — concurrent tests would race on the env var.


Hero stack integration

hero_runner_py integrates with the standard Hero stack services:

| Service | Integration |
|---|---|
| hero_log | Stores session log lines at runner.session.<id>, service events at hero_runner_py |
| hero_db | INCR for persistent, monotonically increasing session IDs that survive restarts |
| hero_proc | Optional live log forwarding per line when HERO_RUNNER_PY_FORWARD_LOGS=true |
| hero_sockets | UDS conventions: $HERO_SOCKET_DIR/hero_runner_py/rpc.sock, mode 0o660 |
| hero_runner_py_admin | Dashboard UI that proxies API calls and renders session/runtime/benchmark views |