# Hero Shrimp
Hero Shrimp is a single-user autonomous work runtime built as a Rust workspace with SQLite persistence, a live admin control plane, and a phased autonomy loop.
Today it is strongest as an autonomous repo worker. The direction is to extend that same executor-grounded, restart-safe, inspectable loop to a broader personal assistant — chat, memory, notes, briefings — without losing the "proof over model claims" stance that makes the repo-worker mode trustworthy.
## What It Does Today
- runs locally as a `shrimpd` daemon (engine + channels + RPC + admin socket) with a `shrimp` thin client over a Unix-socket RPC
- persists conversations, runs, tasks, subagents, memories, playbooks, dreams, and operator artifacts in SQLite
- executes tools through a policy-heavy executor with audit logging and idempotent invocation reuse
- runs `/doai` work as phased sub-agent execution with retry, review-repair, and replan behavior
- exposes runtime, doctor, autonomy, memory, and artifact APIs through a Unix-socket admin plane
## What It Is Growing Into
- general-chat quality outside the coding path, sharing the same memory and instruction stack
- personal memory scoped per user (facts, preferences, context) routed through the existing tiered-memory spine
- cross-session continuity on CLI and Telegram, not just inside one open session
- selective breadth (web search, briefings, notes) — each new capability routed through the executor with typed plans, persisted artifacts, and review
## The Durable Stance
Breadth is layered on top of a loop that insists on:
- explicit proof instead of model-reported completion
- visible recovery instead of silent retries
- inspectable state via a single Unix-socket control plane
- compact deploy footprint (one binary, one SQLite file)
- single-user coherence — never a multi-tenant platform
## Quick Start
Three minutes, one terminal. The smallest working setup is a single API key — no config files needed.
```sh
make build                            # builds target/release/shrimpd + target/release/shrimp
export OPENROUTER_API_KEYS=sk-or-...  # one key is enough; comma-separated for rotation
./target/release/shrimpd &            # daemon boots on bundled defaults
./target/release/shrimp send "hello"  # thin client → one-shot prompt → reply → exit
```
That's it — the daemon ships with a bundled `default.yml` baked in, so the first launch works without a user config file. Add a `.env` and a `shrimp.yml` only when you want to override behaviour:
```sh
cp .env.example .env              # all honored env vars, grouped by purpose
cp shrimp.yml.example shrimp.yml  # channels, models, backends, routing, autonomy
```
Two binaries, one SQLite file under `$SHRIMP_DATA_DIR`, one admin socket and one RPC socket under `$SHRIMP_SOCKET_DIR`. No port to open.
### Configuration discovery
Neither file is passed as a CLI flag. The daemon auto-locates each at startup:
| File | Lookup order (first hit wins) | Fallback |
|---|---|---|
| `.env` | `./.env` (cwd), then `$SHRIMP_HOME/secrets.env` | process env only |
| `shrimp.yml` | `$SHRIMP_CONFIG_PATH`, then `./shrimp.yml`, then `$XDG_CONFIG_HOME/hero_shrimp/shrimp.yml` | bundled `default.yml` |
If both files are absent the daemon still boots — `shrimp.yml` falls back to the compiled-in default catalog (OpenRouter / AI broker / Groq backends, the bundled model aliases, default channels), and credentials come from whatever the shell that ran `shrimpd` exported. If no LLM keys are set anywhere, `shrimpd` boots cleanly and the first chat message fails at request time with a clear "no providers configured" error — startup never silently swallows a missing key. See SECRETS.md for the canonical credentials layout (`$SHRIMP_HOME/secrets.env` + `$SHRIMP_HOME/secrets/<file>`).
## Running It
Two binaries: `shrimpd` (daemon) and `shrimp` (thin client). The daemon does the work; the thin client is what you reach for from a script, cron job, or editor binding.
### `shrimpd` — long-lived daemon
Hosts the engine, the channels configured in `runtime.channels` (`cli`, `admin`, `telegram`, `whatsapp`), and a Unix-socket RPC server for thin clients. Writes a pidfile so double-starts fail loudly rather than racing.
```sh
./target/release/shrimpd  # blocks until SIGTERM/SIGINT
```
A systemd unit template lives at deploy/shrimp-daemon.service. `make run` and `make dev` are convenience wrappers around `cargo run -p shrimpd`.
For local development, `make fixtures` populates the DB with realistic demo data via the `seed` cargo feature. The feature is off by default — production release binaries do not include the seed code path and refuse `shrimpd seed` with a clear error if invoked.
### `shrimp send <text>` — one-shot thin client
Connects to a running `shrimpd` over the RPC socket, sends a prompt on a fresh session, prints the reply, exits. Does not boot the engine, the DB, or any channels. Useful for scripts, cron jobs, editor keybindings.
```sh
./target/release/shrimp send "what's my status?"
./target/release/shrimp send --json "..."  # structured output
```
### `shrimp tui` — interactive terminal UI (in progress)
`shrimp tui` is reserved for an RPC-backed interactive TUI that attaches to a running daemon. The implementation is currently a stub — until it lands, use `shrimp send <text>` for one-shots and the admin dashboard for interactive inspection.
## Configuration
Two files, clean split: `.env` holds secrets only; `shrimp.yml` holds behaviour.
- `.env` — API keys, bot tokens, admin token. Never committed. See `.env.example`.
- `shrimp.yml` — channels, safety level, model catalog, LLM routing, custom backends. See `shrimp.yml.example` for a heavily-commented reference of every knob.
Precedence on startup: env var > `shrimp.yml` > bundled default (compiled in). Omit a section in `shrimp.yml` to keep the bundled default; set the matching `SHRIMP_*` env var to override a single knob from a deploy script.
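The precedence rule above can be sketched in a few lines — a minimal illustration with assumed names (`resolve_knob` is not the runtime's actual API):

```rust
use std::env;

// Sketch of the startup precedence: env var > shrimp.yml > bundled default.
// The function name and signature here are illustrative only.
fn resolve_knob(env_key: &str, yaml_value: Option<String>, bundled_default: &str) -> String {
    env::var(env_key)                                   // 1. env var wins
        .ok()
        .or(yaml_value)                                 // 2. then the shrimp.yml value
        .unwrap_or_else(|| bundled_default.to_string()) // 3. then the compiled-in default
}

fn main() {
    // No env var set, YAML section present -> the YAML value wins over the default.
    assert_eq!(
        resolve_knob("SHRIMP_DEMO_UNSET_VAR", Some("relaxed".to_string()), "standard"),
        "relaxed"
    );
    // Neither env var nor YAML section -> the bundled default applies.
    assert_eq!(resolve_knob("SHRIMP_DEMO_UNSET_VAR", None, "standard"), "standard");
    println!("precedence ok");
}
```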
The four knobs that matter on day one:
| Where | Key | What it does |
|---|---|---|
| `.env` | `OPENROUTER_API_KEYS` | Primary LLM backend. Comma-separated for failover. |
| `shrimp.yml` | `runtime.channels` | List: `cli`, `admin`, `telegram`, `whatsapp`. Single source of truth for which channels start. |
| `shrimp.yml` | `runtime.safety` | `strict` / `standard` / `relaxed`. Gates tool policies. |
| `shrimp.yml` | `llm.role_candidates.*` | Per-role candidate pools (`fast`, `balanced`, `deep`, `vision`, …) that drive the router. |
### Custom backends
`shrimp.yml` exposes an open-ended `backends:` list so you can point Hero Shrimp at any OpenAI-compatible or Anthropic-compatible endpoint — private coding plans, enterprise gateways, a local vLLM, SambaNova, Together, Fireworks, etc. A backend entry only references the env var name holding the key; the value is resolved from `std::env::var()` at request time so secrets never land in YAML:
```yaml
backends:
  - name: my-anthropic-coding-plan
    kind: anthropic_compatible
    base_url: https://api.anthropic.com
    api_key_env: MY_ANTHROPIC_CODING_KEY
  - name: my-local-vllm
    kind: openai_compatible
    base_url: http://192.168.1.42:8000/v1
    # no api_key_env — local, auth-free
models:
  - alias: my-private-sonnet
    tier: balanced
    caps: [text, vision, tools, long_ctx, json_mode]
    serve:
      - { backend: my-anthropic-coding-plan, model: claude-sonnet-4-6 }
```
Then put `my-private-sonnet` in one of the `llm.role_candidates` pools and it flows through the same routing, circuit-breaker, and telemetry stack as the bundled aliases.
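For instance, a hypothetical pool entry might look like this (the exact pool shape is illustrative; see `shrimp.yml.example` for the real reference):

```yaml
llm:
  role_candidates:
    balanced:
      - my-private-sonnet  # a custom alias participates like any bundled one
```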
## Channels
- Telegram: set `TELEGRAM_TOKEN` and (for allowlist) `TELEGRAM_ALLOWED_USERS` (numeric user IDs, not usernames).
- WhatsApp: first launch prints a QR code to the admin event stream — pair once, storage persists.
- Admin auth: optionally set `ADMIN_TOKEN`, then pass it as `?token=…` or `Authorization: Bearer …`.
### Live reload
`GET /api/config` returns the active config with secrets redacted — `apiKeyEnv` is the env var name, `apiKeyPresent` is a bool, backends + models are included with capabilities and serving resolution. `POST /api/config/reload` re-reads `shrimp.yml` from disk, rebuilds the catalog, and emits `config:reloaded` on the event bus — no restart required.
## Using Hero Shrimp
What the daemon actually does for you, day to day.
### Chat — single message, single reply
```sh
./target/release/shrimp send "what's a Rust BTreeMap good for?"
./target/release/shrimp send --json "summarize $(cat README.md)"  # structured envelope (id, session, reply)
```
`shrimp send` opens a fresh session per call, gets one reply, and exits. The session, the user message, the model reply, and any tool calls are all persisted to SQLite — visible later under the admin Sessions tab.
For a continuing conversation pinned to a session id, the admin chat surface (or a future `shrimp tui`) is the right shape. `shrimp send` is for scripts and one-shots.
### Autonomous work — `/doai`
Inside any session — CLI, admin chat, Telegram, WhatsApp — trigger a phased autonomous run with:
```
/doai add a CHANGELOG.md and wire it into the release Makefile target
/doai status          # what's in flight
/doai promote <file>  # atomically swap a sandbox artifact into the workspace
```
Hero Shrimp produces a typed plan (Run → Phases), executes each phase with proof capture (`verify` + `unified_diff` artifacts), runs a reviewer that can repair-or-replan on failure, and writes everything to the timeline. Inspectable end-to-end via the admin Autonomy tab.
### Skills — markdown bundles the agent can route into
Every shipped skill lives under `skills/<category>/<name>/SKILL.md` and is embedded into the binary at compile time (`include_dir!`). Discovery is automatic on boot. Operator-authored skills land under `$SHRIMP_HOME/skills/learned/` and are layered on top.
To add one without writing code, ask the agent to do it (the skill-forge meta-skill calls `pattern_mine_audit` over your audit log, clusters recurring tool sequences, and writes a new SKILL.md autonomously by default). To add one by hand, create the file and `POST /api/skills/reload` — no daemon restart.
Skills with executable helpers ship Python scripts alongside the markdown and run via `uv run` with PEP 723 inline metadata. No global pip, no per-skill venv to manage. See docs/ADR-011-runtime-conventions.md for the rationale.
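As an illustration, a hypothetical helper in that style — the `# /// script` header is the PEP 723 metadata block that `uv run` reads; the helper function itself is invented for this example:

```python
# /// script
# requires-python = ">=3.10"
# dependencies = []
# ///
"""Hypothetical skill helper: dedupe a mined tool sequence before clustering."""


def dedupe_preserve_order(tools: list[str]) -> list[str]:
    """Drop repeated tool names while keeping first-seen order."""
    seen: set[str] = set()
    # set.add returns None (falsy), so repeats are filtered and firsts recorded.
    return [t for t in tools if not (t in seen or seen.add(t))]


if __name__ == "__main__":
    print(dedupe_preserve_order(["read", "edit", "read", "test"]))  # ['read', 'edit', 'test']
```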
### Web search — registry with ordered fallback
The `web_search` tool resolves through a provider registry rather than a single hardcoded backend. Configure it in the `web:` block of `shrimp.yml`:
```yaml
web:
  search:
    primary: duckduckgo
    fallback: [exa, serper]
    cache_ttl_secs: 600
```
DuckDuckGo is the default primary because it requires no key. Add `EXA_API_KEYS=…` (or `SERPER_API_KEYS=…`) and the fallback engages on primary failure. Both LLM and search keys honor the plural form (`<NAME>S`, comma-separated) for round-robin rotation.
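The plural-key convention is just comma-separated values rotated round-robin; a sketch under that assumption (the struct and method names are illustrative, not Hero Shrimp's rotation code):

```rust
// Parse a comma-separated `<NAME>S` value into a pool and rotate round-robin.
struct KeyRing {
    keys: Vec<String>,
    next: usize,
}

impl KeyRing {
    /// Split `KEY_A,KEY_B` into a rotation pool, trimming and skipping blanks.
    fn from_csv(raw: &str) -> Self {
        let keys = raw
            .split(',')
            .map(str::trim)
            .filter(|k| !k.is_empty())
            .map(String::from)
            .collect();
        KeyRing { keys, next: 0 }
    }

    /// Each call hands out the next key, wrapping around at the end.
    fn next_key(&mut self) -> Option<&str> {
        if self.keys.is_empty() {
            return None;
        }
        let idx = self.next % self.keys.len();
        self.next += 1;
        Some(self.keys[idx].as_str())
    }
}

fn main() {
    let mut ring = KeyRing::from_csv("sk-or-aaa, sk-or-bbb");
    assert_eq!(ring.next_key(), Some("sk-or-aaa"));
    assert_eq!(ring.next_key(), Some("sk-or-bbb"));
    assert_eq!(ring.next_key(), Some("sk-or-aaa")); // wrapped around
    println!("rotation ok");
}
```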
### Memory — what the agent remembers
Every persisted fact, preference, lesson, and playbook is in the `memories` table with provenance (`source`, `confidence`, `source_run_id`, `source_phase_id`, TTL). Recall is hybrid (FTS + vector + graph + recency, score-explainable via `/api/memories/explain`). The Memories tab in the admin dashboard shows everything; `/api/memories/feedback` lets the operator nudge ranking.
To save something deliberately: `/remember <key>: <value>` in any chat. To inspect what was used to build the last prompt: open the admin Sessions tab → pick the session → check the recall block in the trace.
### Long-running operator surfaces
| Surface | Shape | When to reach for it |
|---|---|---|
| Admin dashboard | HTTP over Unix socket | Live monitoring, audit drilldowns, autonomy timeline, manual chat |
| `shrimp send` | One-shot RPC | Scripts, cron, editor keybindings |
| Telegram | Bot channel | Mobile / phone-of-record use |
| WhatsApp | Bot channel | Same as Telegram, different account graph |
| `shrimp tui` | RPC-backed TUI | (in progress) interactive terminal session against a running daemon |
All of these surfaces share the same engine, the same memory, the same audit trail. There is no "primary" channel — they're peers attached to one daemon.
### Per-session trace bundles
For postmortem or "what did the agent actually do" investigations:
```sh
curl --unix-socket "$SHRIMP_SOCKET_DIR/hero_shrimp/ui.sock" \
  http://localhost/api/sessions/<session-id>/trace > trace.json
```
Returns the full session: messages, audit rows filtered by session_id, runs, tasks, conversation metadata. Single JSON file you can attach to an issue or replay locally.
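The bundle is plain JSON, so standard tools work on it; a sketch with `jq`, assuming a top-level `messages` array (the field name is illustrative — check the actual bundle shape):

```sh
# A tiny stand-in trace file so the command is self-contained; in practice
# use the trace.json downloaded from the API above.
printf '{"messages":[{"role":"user"},{"role":"assistant"}]}' > trace.json
jq '.messages | length' trace.json
```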
## The Admin Surface
The admin dashboard is served over a Unix socket (`$SHRIMP_SOCKET_DIR/hero_shrimp/ui.sock`) — no port, not reachable from the network. To browse it:
```sh
# socat is one way; anything that forwards a Unix socket to a TCP port works
socat TCP-LISTEN:8123,fork UNIX-CONNECT:$HOME/.local/run/hero_shrimp/ui.sock
open http://localhost:8123
```
What the dashboard shows: live messages, audit rows filterable by session/run/phase, token usage per model, memories, cron + one-shot jobs, channel health, SSE stream of runtime events. This is also the operator debugger — the autonomy timeline, briefings, queue snapshot, and session graph all live here.
## Autonomous Work (`/doai`)
Inside any session — CLI, admin chat, Telegram, WhatsApp — trigger phased autonomous execution with:
```
/doai implement the README improvements discussed
```
That enters the plan → execute → review → recover loop: the planner produces a typed Run with phases, the executor runs each phase with proof capture (`verify` artifacts, `unified_diff` artifacts), the reviewer accepts/rejects/repairs, and the supervisor replans or blocks with a diagnosis. Every state transition hits the timeline and is inspectable through the admin surface. Check `/doai status` for in-flight runs; `/doai promote <file>` atomically swaps sandbox artifacts into the workspace.
## Verifying It Works
```sh
make test           # cargo test --workspace
make lint           # cargo clippy, -D warnings
make fmt            # cargo fmt --all
make install-hooks  # pre-commit: fmt-check + workspace tests
```
Pre-commit hooks fail the commit on formatting drift or test regressions — the guardrail has fired and blocked real drift multiple times this cycle.
## Docker
```sh
make docker-build
make docker-run     # mounts shrimp-data + shrimp-workspace volumes
```
Useful when you don't want the runtime poking at $HOME. The workspace sandbox still enforces path confinement inside the container.
## Main Docs
- PROJECT-STATE.md: single source of truth — direction, current state, architecture summary, work-engine entities, comparison summary, upcoming milestones, deferred items, non-goals
- HERMES_SOURCE_NOTES.md: deep local-source notes from Hermes inspection
- OPENCLAW_SOURCE_NOTES.md: deep local-source notes from OpenClaw inspection
- BENCHMARKS.md: benchmark rubric and scoring model
- BENCHMARK_RUNNER.md: current lightweight benchmark harness
- DEMO.md: demo flow for showing the runtime as an autonomous repo worker
## Architecture Snapshot
Five workspace crates, two binaries:
- `crates/shrimp-types`: pure types — wire protocol (`proto`) plus the domain spine (`domain`: Run/Phase/Timeline, contract types, strong ids, `RunStateStore`)
- `crates/shrimp-store`: config, DB, LLM routing, memory, queueing, events, reliability, runtime maintenance
- `crates/shrimp-engine`: tool catalog, executor, shell runtime, autonomy, subagents, agent loop, request pipeline, channel adapters (`gateways::{admin, telegram, whatsapp}` — feature-gated)
- `crates/shrimpd` (binary `shrimpd`): daemon — boots the engine, channels, admin socket, and RPC server
- `crates/shrimp` (binary `shrimp`): thin client — `send` today, RPC-backed TUI scheduled
Channel surface is gated by cargo features on `shrimp-engine` (and propagated through `shrimpd`): `admin`, `telegram`, `whatsapp` are default-on; `--no-default-features` builds drop all three. See docs/ADR-007-channel-feature-gates.md and docs/ADR-010-design-b-crate-collapse.md for the architecture rationale.
## Operational Reality
Current local verified state:
- `cargo fmt --all` passes
- `cargo test --workspace` passes
The current repo also now includes source-backed notes for Hermes and OpenClaw so the comparison work does not have to be rediscovered in later sessions.
## Product Thesis
Hero Shrimp should continue to win on a narrow axis:
single-user autonomous coding and repo operations with explicit proof, visible recovery, and compact self-hosting.
It should not chase:
- channel breadth for its own sake
- generic assistant-platform sprawl
- giant plugin ecosystems before the core work engine is airtight