hero_runner_py
A pre-fork Python script execution service for the Hero stack. It manages isolated Python virtual environments (via uv) and executes scripts with live log streaming, persistent session tracking, and timeout enforcement — all with zero interpreter startup overhead thanks to a pre-forked worker pool.
Table of Contents
- What it does
- Architecture
- Crates
- Source map
- Environment variables
- Building and running
- Quick start
- JSON-RPC API
- SSE streaming
- Runtime management
- Admin UI
- Timeout behavior
- Logging
- Testing
- Hero stack integration
What it does
hero_runner_py lets callers submit a Python script (inline code or a file path) and get back:
- Live stdout/stderr streamed as Server-Sent Events while the script runs
- A final result (exit code, stdout, stderr, success flag) stored persistently
- Session history browsable via the JSON-RPC API or the admin UI
- Timeout enforcement — scripts are hard-killed after `timeout_ms` milliseconds, even across a POSIX fork boundary
All of this is delivered over a Unix Domain Socket using JSON-RPC 2.0.
Architecture
┌─────────────────────────────────────┐
│ parent process │
│ (multi-threaded tokio/axum server) │
│ │
│ JSON-RPC 2.0 over UDS │
│ SessionManager RuntimeManager │
│ SSE fan-out hero_log persist │
└──────────────┬──────────────────────┘
│ UnixStream socket pair
┌─────────────▼──────────────┐
│ worker process │ ← pre-forked, single-threaded
│ (one per HERO_RUNNER_PY_WORKERS) │
└──────────────┬─────────────┘
│ fork per job
┌─────────────▼──────────────┐
│ grandchild process │ ← isolated per script
│ runner::execute() │
│ select(2) I/O loop │
│ SIGALRM timeout handler │
└──────────────┬──────────────┘
│ spawn
┌─────────────▼──────────────┐
│ Python process │ ← own process group (setpgid)
│ uv venv interpreter │
└─────────────────────────────┘
Three-level fork hierarchy
| Level | Process | Responsibility |
|---|---|---|
| Parent | tokio/axum | Accept connections, route RPC, manage sessions, fan-out SSE |
| Worker | single-threaded | Hold a socket pair; fork a grandchild per job; forward frames |
| Grandchild | single-threaded | Run runner::execute; stream WorkerFrame records back |
IPC protocol
All pipes and UnixStream channels use length-prefixed JSON:
[4 bytes: body length as u32 little-endian][JSON body bytes]
The grandchild emits a sequence of WorkerFrame records per job:
Started { pid: i32 } ← first frame
Log { line: LogLine } ← zero or more, as stdout/stderr arrives
Result { result: ScriptResult } ← final, terminal frame
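The framing is easy to implement in any language. The Python sketch below mirrors the `write_msg`/`read_msg` names from `ipc.rs`, but it is an illustrative re-implementation, not the server's code:

```python
import json
import struct

def write_msg(stream, obj) -> None:
    """Encode obj as JSON and write it with a u32 little-endian length prefix."""
    body = json.dumps(obj).encode("utf-8")
    stream.write(struct.pack("<I", len(body)) + body)
    stream.flush()

def read_msg(stream):
    """Read one length-prefixed JSON message; return None on clean EOF."""
    header = stream.read(4)
    if len(header) < 4:
        return None
    (length,) = struct.unpack("<I", header)
    return json.loads(stream.read(length))
```

The same pair of helpers works over a pipe, a `socket.makefile()`, or an in-memory buffer, since only `read`/`write`/`flush` are used.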
Runtime layout
$HERO_PYTHON_RUNTIMES_DIR/
<name>/
runtime.toml ← metadata (name, python_version, modules, paths, timestamps,
installing, operational, smoke_session_id)
bin/python ← uv-created venv interpreter
lib/ ← installed packages
...
Each runtime is a complete uv virtual environment created in its own named directory.
The runtime.toml tracks three lifecycle flags:
| Field | Meaning |
|---|---|
| installing | true while uv is downloading Python / creating the venv / installing packages |
| operational | true after a successful smoke-test (print('hello world')) |
| smoke_session_id | Session ID of a failed smoke-test, kept for log inspection |
Socket layout
$HERO_SOCKET_DIR/hero_runner_py/
rpc.sock ← JSON-RPC 2.0 endpoint
discovery.sock ← service discovery (name, version)
Crates
| Crate | Binary | Purpose |
|---|---|---|
| hero_runner_py_server | hero_runner_py_server | Core service: worker pool, JSON-RPC, SSE, runtime management |
| hero_runner_py_admin | hero_runner_py_admin | Web UI proxy — serves a dashboard over HTTP |
| hero_runner_py_tests | — | Integration test suite |
Source map
hero_runner_py_server/src/
| File | Purpose |
|---|---|
| main.rs | Entry point: uv check → pool fork → tokio runtime → default runtime init → smoke test → axum serve |
| lib.rs | Crate root, module declarations, public re-exports |
| types.rs | ScriptRequest, ScriptResult, WorkerFrame, LogLine, LogKind, ScriptKind |
| runner.rs | Fork-safe Python subprocess execution (select(2) loop, SIGALRM, process group kill) |
| worker.rs | Worker process loop: receive requests, fork grandchildren, forward frames |
| pool.rs | WorkerPool — manages worker sockets, dispatches jobs, collects results |
| session.rs | SessionManager — session lifecycle, hero_db IDs, tokio broadcast for SSE, delete support |
| ipc.rs | write_msg / read_msg — length-prefixed JSON framing |
| runtime.rs | RuntimeManager — stateless reader/writer of HERO_PYTHON_RUNTIMES_DIR; tracks installing/operational state |
| uv.rs | uv wrappers: create_venv, pip_install, ensure_python, python_interpreter |
| openrpc.rs | AppState, rpc_router, all JSON-RPC method handlers |
| sockets.rs | UDS helpers: socket_dir, service_socket_dir, socket_path, bind_unix_socket |
| sse.rs | GET /sse/session/{id} — SSE fan-out from SessionManager broadcast channel |
| proc_log.rs | Forwards LogLine records to hero_proc / hero_log |
| discovery.rs | GET /discovery — service metadata endpoint |
| assets.rs | rust-embed wrapper for compiled-in assets |
Environment variables
| Variable | Default | Effect |
|---|---|---|
| HERO_RUNNER_PY_WORKERS | 4 | Number of worker processes to pre-fork |
| HERO_RUNNER_PY_FORWARD_LOGS | false | Forward each log line to hero_proc (set 1/true/yes/on) |
| HERO_SOCKET_DIR | ~/hero/var/sockets | Root directory for all Hero UDS sockets |
| HERO_PYTHON_RUNTIMES_DIR | ~/hero/python_runtimes | Where Python venvs are stored |
| UV_PATH | (searched in PATH) | Override path to the uv binary |
Building and running
# Build everything
cargo build --release
# Run the server (uv must be in PATH or UV_PATH set)
./target/release/hero_runner_py_server
# Optional: run the admin web UI (connects to the server via UDS)
./target/release/hero_runner_py_admin
On first startup the server automatically creates the default runtime (Python 3.12 with a standard set of packages) if it does not already exist, then runs a smoke test to confirm the runtime is operational. The smoke test session is cleaned up automatically on success; on failure it is kept so you can inspect its logs.
Quick start
Check health
echo '{"jsonrpc":"2.0","id":1,"method":"health","params":{}}' \
| nc -U ~/hero/var/sockets/hero_runner_py/rpc.sock
{"jsonrpc":"2.0","id":1,"result":{"status":"ok","service":"hero_runner_py","version":"0.1.0"}}
Create a Python runtime
echo '{"jsonrpc":"2.0","id":1,"method":"runtime_init","params":{"name":"default","python_version":"3.12","modules":["requests","numpy"]}}' \
| nc -U ~/hero/var/sockets/hero_runner_py/rpc.sock
This installs Python 3.12 (via uv python install) and creates a venv with requests and numpy. The runtime.toml is written immediately with installing: true so callers can see progress; installing is cleared to false once all packages are installed.
Run a script
echo '{"jsonrpc":"2.0","id":2,"method":"session_start","params":{"script":"print(\"hello from Python!\")","runtime_name":"default","timeout_ms":5000}}' \
| nc -U ~/hero/var/sockets/hero_runner_py/rpc.sock
{"jsonrpc":"2.0","id":2,"result":{"session_id":1}}
If the runtime is still installing, session_start fails immediately with:
Runtime 'default' is still being installed. Please wait for it to finish.
Stream live output
curl --unix-socket ~/hero/var/sockets/hero_runner_py/rpc.sock \
http://localhost/sse/session/1
data: {"kind":"log","line":{"session_id":1,"timestamp_ms":1715000000000,"seq":1,"kind":"stdout","line":"hello from Python!"}}
data: {"kind":"result","result":{"success":true,"exit_code":0,"stdout":"hello from Python!","stderr":"","error":null}}
Get the final result
echo '{"jsonrpc":"2.0","id":3,"method":"session_result","params":{"session_id":1}}' \
| nc -U ~/hero/var/sockets/hero_runner_py/rpc.sock
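For programmatic access, the `nc` calls above can be replaced by a small Python client. This is a sketch: it assumes the server answers each request with a single newline-terminated JSON line, as the `nc` examples suggest; verify the framing against your server version before relying on it.

```python
import json
import os
import socket

def build_request(method: str, params: dict, req_id: int = 1) -> bytes:
    """Serialise a JSON-RPC 2.0 request as one newline-terminated line."""
    req = {"jsonrpc": "2.0", "id": req_id, "method": method, "params": params}
    return (json.dumps(req) + "\n").encode("utf-8")

def rpc_call(method: str, params: dict, timeout: float = 30.0) -> dict:
    """Send one request over the hero_runner_py UDS and parse the reply.

    Assumes a newline-delimited response; adjust if the server frames
    replies differently.
    """
    sock_path = os.path.expanduser("~/hero/var/sockets/hero_runner_py/rpc.sock")
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        s.connect(sock_path)
        s.sendall(build_request(method, params))
        data = b""
        while not data.endswith(b"\n"):
            chunk = s.recv(65536)
            if not chunk:
                break
            data += chunk
    return json.loads(data)

# With a running server:
#   rpc_call("health", {})
#   rpc_call("session_start", {"script": "print('hi')", "timeout_ms": 5000})
```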
JSON-RPC API
All requests follow JSON-RPC 2.0 over the UDS at $HERO_SOCKET_DIR/hero_runner_py/rpc.sock.
health
Liveness probe.
{"jsonrpc":"2.0","id":1,"method":"health","params":{}}
Response: {"status":"ok","service":"hero_runner_py","version":"0.1.0"}
session_start
Start a Python script execution. Returns a session_id immediately; execution runs asynchronously.
| Param | Type | Required | Default | Description |
|---|---|---|---|---|
| script | string | yes | — | Python source code or file path |
| script_kind | "code" \| "file" | no | "code" | Inline code or path to .py file |
| runtime_name | string | no | "default" | Which runtime venv to use |
| working_dir | string | no | "." | Working directory for the Python process |
| env_vars | object | no | {} | Extra environment variables |
| timeout_ms | integer | no | 0 (no limit) | Hard kill timeout in milliseconds |
| forward_logs | boolean | no | server default | Override per-session log forwarding |
{
"jsonrpc":"2.0","id":1,"method":"session_start",
"params":{
"script": "import time\nfor i in range(5):\n print(i)\n time.sleep(0.1)",
"runtime_name": "default",
"timeout_ms": 10000
}
}
Response: {"session_id": 42}
session_stop
Request cancellation of a running session. The Python process group is killed immediately.
{"jsonrpc":"2.0","id":2,"method":"session_stop","params":{"session_id":42}}
Response: {"ok": true}
session_delete
Stop a session if still running, then remove it from the session list entirely.
{"jsonrpc":"2.0","id":2,"method":"session_delete","params":{"session_id":42}}
Response: {"ok": true}
session_list
List all known sessions (running and finished).
{"jsonrpc":"2.0","id":3,"method":"session_list","params":{}}
Response: {"sessions": [{"id":1,"status":"succeeded",...}, ...]}
session_get
Get metadata for a single session.
{"jsonrpc":"2.0","id":4,"method":"session_get","params":{"session_id":42}}
Response: SessionInfo object with id, status, runtime_name, started_at_ms, ended_at_ms.
session_logs
Page through the captured log history for a session. Supports cursor-based pagination.
| Param | Type | Required | Description |
|---|---|---|---|
| session_id | integer | yes | Session to query |
| after_ts_ms | integer | no | Return only lines after this timestamp |
| after_seq | integer | no | Return only lines after this sequence number |
| limit | integer | no | Maximum lines to return (default: 5000) |
{
"jsonrpc":"2.0","id":5,"method":"session_logs",
"params":{"session_id":42,"limit":50}
}
Response:
{
"lines": [
{"session_id":42,"timestamp_ms":1715000001000,"seq":1,"kind":"stdout","line":"0"},
{"session_id":42,"timestamp_ms":1715000001100,"seq":2,"kind":"stdout","line":"1"}
],
"next_ts_ms": 1715000001100,
"next_seq": 2
}
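The `next_ts_ms`/`next_seq` cursor supports a simple fetch-everything loop. The sketch below is parameterised over a `call(method, params)` transport helper (a hypothetical function returning the JSON-RPC `result` object), so it is independent of how you reach the socket:

```python
def iter_session_logs(call, session_id: int, page_size: int = 500):
    """Yield every stored log line for a session, page by page.

    `call(method, params)` is any transport returning the `result`
    object; it is an assumed helper, not part of the server API.
    """
    after_ts = after_seq = None
    while True:
        params = {"session_id": session_id, "limit": page_size}
        if after_ts is not None:
            params["after_ts_ms"] = after_ts
            params["after_seq"] = after_seq
        page = call("session_logs", params)
        lines = page["lines"]
        if not lines:
            return  # empty page: cursor is past the last stored line
        yield from lines
        after_ts, after_seq = page["next_ts_ms"], page["next_seq"]
```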
session_result
Get the final ScriptResult for a completed session. Returns null if the session is still running.
{"jsonrpc":"2.0","id":6,"method":"session_result","params":{"session_id":42}}
Response:
{
"success": true,
"exit_code": 0,
"stdout": "0\n1\n2\n3\n4",
"stderr": "",
"error": null
}
runtime_init
Create a new Python runtime (venv + packages). Installs the requested Python version if not already present. Writes runtime.toml immediately with installing: true; clears it once all steps complete.
| Param | Type | Required | Description |
|---|---|---|---|
| name | string | yes | Unique runtime name |
| python_version | string | yes | Python version, e.g. "3.12" or "3.12.3" |
| modules | string[] | no | pip packages to install immediately |
{
"jsonrpc":"2.0","id":7,"method":"runtime_init",
"params":{"name":"ml","python_version":"3.11","modules":["torch","numpy"]}
}
Response: RuntimeInfo object.
runtime_list
List all registered runtimes.
{"jsonrpc":"2.0","id":8,"method":"runtime_list","params":{}}
Response: {"runtimes": [{"name":"default","python_version":"3.12","installing":false,"operational":true,...}, ...]}
runtime_get
Get metadata for a single runtime.
{"jsonrpc":"2.0","id":9,"method":"runtime_get","params":{"name":"default"}}
Response: RuntimeInfo with name, python_version, modules, venv_path, created_at, updated_at, installing, operational, smoke_session_id.
runtime_install
Install additional pip packages into an existing runtime. Sets installing: true during the install, clears it on completion.
{
"jsonrpc":"2.0","id":10,"method":"runtime_install",
"params":{"name":"default","modules":["pandas","matplotlib"]}
}
Response: {"modules": ["requests","pandas","matplotlib"]} (full updated module list).
runtime_delete
Delete a runtime and its virtual environment directory.
{"jsonrpc":"2.0","id":11,"method":"runtime_delete","params":{"name":"old-runtime"}}
Response: {"ok": true}
runtime_test
Run a smoke test against a runtime (print('hello world') + sys.version). Waits for completion, cleans up the internal session, and marks the runtime operational on success.
{"jsonrpc":"2.0","id":12,"method":"runtime_test","params":{"name":"default"}}
Response:
{
"success": true,
"exit_code": 0,
"stdout": "hello world\npython: 3.12.3 ...",
"stderr": ""
}
SSE streaming
Subscribe to live output from a session:
GET /sse/session/{session_id}
The response is a standard text/event-stream. Each event carries a JSON-encoded WorkerFrame:
data: {"kind":"log","line":{"session_id":1,"timestamp_ms":...,"seq":3,"kind":"stdout","line":"hello"}}
data: {"kind":"result","result":{"success":true,"exit_code":0,...}}
The stream closes after the Result frame is delivered. If the session is already complete when you subscribe, all stored log lines are replayed first, then the result is sent, and the stream closes.
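A minimal consumer can extract frames from a buffered stream body. This sketch handles only single-line `data:` fields, which is all the examples above use; a full SSE parser would also handle multi-line `data` fields and `:` comment lines:

```python
import json

def parse_sse_frames(stream_text: str) -> list:
    """Extract JSON-encoded WorkerFrame dicts from a text/event-stream body."""
    frames = []
    for line in stream_text.splitlines():
        if line.startswith("data:"):
            frames.append(json.loads(line[len("data:"):].strip()))
    return frames
```

A client would typically read until it has seen a frame with `"kind": "result"`, since that is the terminal frame.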
Runtime management
Runtimes are isolated Python virtual environments managed by uv. Each runtime:
- Has a unique name used in `session_start` as `runtime_name`
- Uses a pinned Python version installed by `uv python install`
- Has its own pip-installed packages that do not affect other runtimes
- Is stored on disk and survives server restarts (metadata in `runtime.toml`)
Lifecycle states
runtime_init called
│
▼
installing: true ← visible immediately via runtime_list
│ uv downloads Python, creates venv, installs packages
▼
installing: false, operational: false
│
▼ smoke test runs automatically (or manually via runtime_test)
│
┌──┴──────────────────┐
│ passed │ failed
▼ ▼
operational: true smoke_session_id set ← session kept for log inspection
Default runtime
On first startup, the server automatically initialises a runtime named "default" (Python 3.12, standard packages) if it doesn't already exist, then runs the smoke test. Subsequent restarts skip init if the runtime already exists.
The default module set:
ipython, requests, httpx, pydantic, rich, pandas, numpy, python-dotenv, toml
Admin UI
hero_runner_py_admin serves a dashboard at the configured HTTP port (default: proxied through hero_router).
Sessions tab
- Live table of all sessions with status badges, runtime name, duration
- Per-row: view detail, view logs, stop (running only)
- Bulk selection → Stop selected or Delete selected (stops if running, removes from list)
- Search by ID or status
Runtimes tab
- Table with Name, Python Version, Modules, Status, Created
- Status badges: `installing…` (animated spinner), `operational`, `smoke failed` (with link to session logs), `pending`
- Right-click context menu on any row:
  - Test it works — opens a modal that runs `runtime_test` and shows the exact script + output
  - Install modules — opens the install modal
  - Delete runtime — confirms then deletes
- Action buttons in each row (test, install modules, delete)
- New Runtime button opens a creation modal
Admin → Maintenance tab
- Stop all running sessions
- Clear local UI state
- Performance benchmark — configurable N sessions, concurrency level, and target runtime:
  - Runs `print('hello world')` N times
  - Live progress bar + elapsed timer
  - Results: total time, avg/session (ms), sessions/sec
  - Stop button to abort mid-run
Timeout behavior
Timeout is enforced at the POSIX level with zero threading:
- The grandchild process calls `alarm(secs)` before starting I/O
- A custom `SIGALRM` handler runs in the grandchild:
  - Calls `kill(-pgid, SIGKILL)` to kill the entire Python process group
  - Calls `kill(pid, SIGKILL)` to ensure the direct child is killed
  - Resets `SIGALRM` to `SIG_DFL` so the grandchild itself can die on the next alarm
- The Python interpreter is placed in its own process group via `setpgid(0, 0)` in `pre_exec`, so the kill signal reaches all child processes the script may have spawned
This design is fork-safe on macOS: no threads are spawned inside the grandchild, avoiding the pthread_create / mutex deadlock that can occur when a multithreaded parent is forked.
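The core pattern (put the child in its own process group, then kill the whole group) can be sketched in Python. This is illustrative only, not the server's Rust `runner.rs`; for brevity it uses `Popen.wait(timeout=...)` where the grandchild uses a `SIGALRM` handler:

```python
import os
import signal
import subprocess

def run_with_group_kill(args, timeout_s: float) -> int:
    """Run a command in its own process group; on timeout, SIGKILL the group.

    Mirrors the setpgid(0, 0) + kill(-pgid, SIGKILL) pattern: preexec_fn
    runs in the forked child before exec, and after setpgrp the child's
    pid equals its pgid, so killpg(proc.pid, ...) hits every descendant.
    """
    proc = subprocess.Popen(args, preexec_fn=os.setpgrp)
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        os.killpg(proc.pid, signal.SIGKILL)
        return proc.wait()  # negative value: terminated by signal
```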
Logging
Live stream
While a script runs, each stdout/stderr line is emitted as a WorkerFrame::Log frame on the worker → parent pipe, broadcast to all SSE subscribers via a tokio::broadcast channel.
Persistent storage
Each log line is also written to hero_log using the source path:
runner.session.<session_id>
Content format per line:
<kind>:<timestamp_ms>:<seq>|<text>
For example:
stdout:1715000001000:3|processing item 7
stderr:1715000001050:4|warning: low memory
This format allows exact cursor-based pagination in session_logs using after_ts_ms + after_seq.
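Parsing a persisted line back into its fields is straightforward; this helper is illustrative, not part of the service. Splitting the header on at most two `:` and the record on the first `|` keeps colons and pipes inside the text intact:

```python
def parse_log_line(raw: str) -> dict:
    """Parse one persisted line: <kind>:<timestamp_ms>:<seq>|<text>."""
    header, _, text = raw.partition("|")   # first '|' ends the header
    kind, ts_ms, seq = header.split(":", 2)
    return {"kind": kind, "timestamp_ms": int(ts_ms), "seq": int(seq), "line": text}
```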
Process logging
Service-level events (startup, socket bind, runtime init, smoke test results) are emitted to herolib_core::Logger with source "hero_runner_py" and forwarded to hero_log in the usual Hero stack way.
Testing
The hero_runner_py_tests crate contains integration tests split into two groups:
| Group | Requires | Run condition |
|---|---|---|
| Pure-Rust (IPC, types, runner basics) | Nothing | Always |
| uv-dependent (pool, sessions, runtimes) | uv + Python | --include-ignored |
Running tests
# Pure-Rust tests only
cargo test -p hero_runner_py_tests
# All tests including uv-dependent ones
cargo test -p hero_runner_py_tests -- --include-ignored --test-threads=1
--test-threads=1 is required because tests set HERO_PYTHON_RUNTIMES_DIR via std::env::set_var and fork a WorkerPool per test — concurrent tests would race on the env var.
Hero stack integration
hero_runner_py integrates with the standard Hero stack services:
| Service | Integration |
|---|---|
| hero_log | Stores session log lines at runner.session.<id>, service events at hero_runner_py |
| hero_db | INCR for persistent, monotonically increasing session IDs that survive restarts |
| hero_proc | Optional live log forwarding per line when HERO_RUNNER_PY_FORWARD_LOGS=true |
| hero_sockets | UDS conventions: $HERO_SOCKET_DIR/hero_runner_py/rpc.sock, mode 0o660 |
| hero_runner_py_admin | Dashboard UI — proxies API calls and renders session/runtime/benchmark views |