initial setup & implementation #1
Purpose
The crates
Implementation
This story is to implement all the basics so we can develop and test further. It should not be tested nor complete; the point is to have all the basics in place: the APIs, the complete OpenRPC interface, and the UI.
Implementation Spec for Issue #1
Objective
Scaffold the `hero_gpu` repository as a Hero-conventions-compliant Cargo workspace that exposes an OpenRPC-based gateway in front of an SGLang (https://docs.sglang.io) Python backend, providing OpenAI- and Anthropic-compatible HTTP endpoints, an installer that uses `uv` to set up the Python runtime, an SDK, an admin UI dashboard, and a CLI binary that self-registers with `hero_proc`. This is a basics-only scaffold: every file, route, RPC method, UI tab, and Makefile target must exist and compile, but handlers may return placeholder/stub data. No tests or runtime correctness are required.

Requirements
- Workspace at `/Volumes/T7/code0/hero_gpu` with the five crates from the issue, plus the mandatory `hero_gpu_examples` crate per `hero_crates_best_practices_check`.
- `hero_gpu_server` exposes JSON-RPC 2.0 over `~/hero/var/sockets/hero_gpu/rpc.sock` (no TCP) and ships a complete `openrpc.json` covering: backend lifecycle, model management, chat/completions, embeddings, health/status, and configuration.
- `hero_gpu_api` mounts OpenAI-compatible (`/v1/chat/completions`, `/v1/completions`, `/v1/models`, `/v1/embeddings`) and Anthropic-compatible (`/v1/messages`, `/v1/models`) HTTP routes, mapped onto the OpenRPC backend, served on `~/hero/var/sockets/hero_gpu/openapi.sock`.
- `hero_gpu_installer` provides `install`, `status`, `health`, `update` operations using `uv` to provision the Python runtime + SGLang dependencies, in a sandboxed directory under `~/hero/var/hero_gpu/`.
- `hero_gpu_sdk` is a thin client generated from `crates/hero_gpu_server/openrpc.json` using the `openrpc_client!` macro.
- `hero_gpu_ui` is an Askama + Bootstrap 5.3.3 admin dashboard following `hero_ui_dashboard`, served on `~/hero/var/sockets/hero_gpu/ui.sock`, with all standard tabs (Models, Backend, Playground, Logs, Stats, Admin, Docs) wired to stub data.
- `hero_gpu` CLI binary handles `--start`/`--stop` and registers all components (server, UI, installer-managed SGLang process) with `hero_proc` via `hero_proc_sdk`, per `hero_proc_service_selfstart`.
- Conventions: `naming_convention` (snake_case throughout), `cargo_deps` (git URLs with `branch = "development"`, no `[patch]` committed), `rust_toolchain` (edition 2024, no `rust-version` pin), and `herolib_import` (version `0.5.0` pinned for hero crates).
- Build automation: `Makefile` + `buildenv.sh` + `scripts/build_lib.sh` + `scripts/install.sh` + `scripts/download-assets.sh`, all binaries installed to `~/hero/bin/`.

Files to Modify/Create
Workspace root
- `Cargo.toml` — workspace manifest, members, shared dependencies
- `Makefile` — build, install, run, stop, status, logs targets
- `buildenv.sh` — INSTALL_DIR, BINARIES, RELEASE_DIR
- `scripts/build_lib.sh`, `scripts/install.sh`, `scripts/download-assets.sh`, `scripts/sglang_runner.sh`
- `.gitignore`, `README.md`, `.forgejo/workflows/build.yaml`
- `crates/hero_gpu_server/` — server binary + OpenRPC spec
- `crates/hero_gpu_api/` — library crate for OpenAI + Anthropic adapters
- `crates/hero_gpu_sdk/` — generated client via `openrpc_client!`
- `crates/hero_gpu_installer/` — uv-based sandbox installer (lib + thin CLI)
- `crates/hero_gpu_ui/` — Askama admin dashboard binary with all required tabs
- `crates/hero_gpu/` — CLI binary handling `--start`/`--stop` + hero_proc registration
- `crates/hero_gpu_examples/` — examples + integration test skeleton

Implementation Plan
Step 1: Workspace skeleton, root configuration, and build automation
Files: workspace `Cargo.toml`, `Makefile`, `buildenv.sh`, `scripts/*.sh`, `.gitignore`, `README.md`, `.forgejo/workflows/build.yaml`.

- `Cargo.toml`: `resolver = "3"`, `edition = "2024"`, `version = "0.5.0"`, members for all workspace crates, full `[workspace.dependencies]` block (hero_rpc_derive, hero_rpc_openrpc, hero_proc_sdk, tokio, axum, hyper, tower, tower-http, serde, serde_json, anyhow, thiserror, tracing, reqwest, askama, askama_axum, rust-embed, dirs, clap, etc.). Internal path entries for the three library crates. NO `[patch]` sections.
- `buildenv.sh`: `BINARIES="hero_gpu hero_gpu_server hero_gpu_ui hero_gpu_installer"`, `INSTALL_DIR=$HOME/hero/bin`.
- `Makefile`: standard targets (`help`, `version`, `status`, `build`, `builddev`, `check`, `fmt`, `lint`, `test`, `download-assets`, `install`, `installdev`, `run`, `stop`, `restart`, `logs`, `clean`, `all`).
- `scripts/sglang_runner.sh`: bash entrypoint launched by hero_proc — `exec uv run --project ~/hero/var/hero_gpu/python python -m sglang.launch_server --host 127.0.0.1 --port 30000 "$@"`.
- Verification: `cargo check --workspace`, clippy, build.

Dependencies: none
Step 2: `hero_gpu_sdk` — generated client crate

Files: `crates/hero_gpu_sdk/Cargo.toml`, `src/lib.rs`.

- Crate dependencies: `hero_rpc_derive`, `hero_rpc_openrpc` (transport feature).
- `lib.rs`: `openrpc_client!("../hero_gpu_server/openrpc.json");` plus `pub fn default_socket_path()` resolving `$HERO_SOCKET_DIR/hero_gpu/rpc.sock`.

Dependencies: Step 1, Step 3 (needs openrpc.json to exist for macro expansion).
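The socket-path helper can be sketched with the standard library alone. This is a minimal, hypothetical version — the real crate may resolve the fallback directory differently:

```rust
use std::env;
use std::path::PathBuf;

/// Resolve the hero_gpu RPC socket: $HERO_SOCKET_DIR/hero_gpu/rpc.sock,
/// falling back to ~/hero/var/sockets when the variable is unset
/// (fallback location is an assumption, not confirmed by the spec).
pub fn default_socket_path() -> PathBuf {
    let base = env::var("HERO_SOCKET_DIR")
        .map(PathBuf::from)
        .unwrap_or_else(|_| {
            let home = env::var("HOME").unwrap_or_else(|_| ".".into());
            PathBuf::from(home).join("hero/var/sockets")
        });
    base.join("hero_gpu").join("rpc.sock")
}

fn main() {
    println!("{}", default_socket_path().display());
}
```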
Step 3: `hero_gpu_server` — JSON-RPC core, OpenRPC spec, SGLang client, sockets

Files: `Cargo.toml`, `openrpc.json`, `src/main.rs`, `src/lib.rs`, `src/state.rs`, `src/socket.rs`, `src/discovery.rs`, `src/sglang/{mod,types}.rs`, `src/rpc/mod.rs`, `src/rpc/methods/{health,models,chat,completions,embeddings,backend}.rs`, `src/openapi/mod.rs`.

- RPC methods: `rpc.discover`, `system.health`, `system.stats`, `models.list`, `models.load`, `models.unload`, `models.info`, `chat.completions`, `chat.completions_stream`, `text.completions`, `embeddings.create`, `backend.start/stop/status/config_get/config_set`. All result schemas wrapped in objects (never bare arrays — herolib_openrpc requirement).
- Sockets: `rpc.sock` and `openapi.sock` under `~/hero/var/sockets/hero_gpu/`.
- SGLang client targeting `http://127.0.0.1:30000`, with typed request/response shapes mirroring the SGLang docs.

Dependencies: Step 1
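The wrapped-object rule means a method such as `models.list` answers with `{"models": [...]}` rather than a top-level array. A std-only illustration of the envelope (JSON is hand-built here for self-containment; the real server would presumably serialize with serde):

```rust
/// Build a JSON-RPC 2.0 success response whose result is a wrapped object
/// ({"models": [...]}) — never a bare array, per the herolib_openrpc rule.
fn wrap_models_result(id: u64, models: &[&str]) -> String {
    let items: Vec<String> = models.iter().map(|m| format!("\"{}\"", m)).collect();
    format!(
        "{{\"jsonrpc\":\"2.0\",\"id\":{},\"result\":{{\"models\":[{}]}}}}",
        id,
        items.join(",")
    )
}

fn main() {
    println!("{}", wrap_models_result(1, &["llama-3.1-8b"]));
}
```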
Step 4: `hero_gpu_api` — OpenAI + Anthropic compatible HTTP routers

Files: `crates/hero_gpu_api/Cargo.toml`, `src/lib.rs`, `src/openai/{mod,types,handlers}.rs`, `src/anthropic/{mod,types,handlers}.rs`, `src/mapping.rs`.

- A `Backend` trait that `hero_gpu_server::AppState` implements → keeps the dependency arrow one-directional (no cycle).
- OpenAI routes: `/v1/chat/completions`, `/v1/completions`, `/v1/embeddings`, `/v1/models`, `/v1/models/:id`.
- Anthropic routes: `/v1/messages`, `/v1/models`.

Dependencies: Step 3 (the `Backend` trait must exist before the server can implement it)
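The one-directional dependency can be sketched std-only. Names are simplified and the trait is synchronous here, whereas the real one is presumably async and richer:

```rust
// hero_gpu_api defines the trait; hero_gpu_server implements it.
// The dependency arrow points server -> api, so there is no cycle.
mod hero_gpu_api {
    pub trait Backend {
        fn list_models(&self) -> Vec<String>;
    }
    // Handlers are generic over B: Backend, never over a concrete server type.
    pub fn handle_models<B: Backend>(backend: &B) -> String {
        format!("{{\"data\":{:?}}}", backend.list_models())
    }
}

mod hero_gpu_server {
    use super::hero_gpu_api::Backend;
    pub struct AppState;
    impl Backend for AppState {
        fn list_models(&self) -> Vec<String> {
            vec!["stub-model".to_string()]
        }
    }
}

fn main() {
    let state = hero_gpu_server::AppState;
    println!("{}", hero_gpu_api::handle_models(&state));
}
```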
Step 5: `hero_gpu_installer` — uv-based sandbox installer

Files: `Cargo.toml`, `src/lib.rs`, `src/sandbox.rs`, `src/uv.rs`, `src/components.rs`, `src/health.rs`, `src/bin/hero_gpu_installer.rs`.

- `Sandbox` rooted at `~/hero/var/hero_gpu/python/` (env override `HERO_GPU_SANDBOX_ROOT`).
- `Uv` wrapper shelling out via `tokio::process::Command`: `uv venv`, `uv pip install "sglang[all]"`, `uv run`.
- Components exposing `is_installed()` and `install()`.
- CLI subcommands: `install`, `status`, `health`, `update`.

Dependencies: Step 1
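A std-only sketch of the `Uv` wrapper idea. The spec calls for `tokio::process::Command`; the synchronous variant and the exact arguments shown here are illustrative assumptions:

```rust
use std::ffi::OsString;
use std::path::Path;
use std::process::Command;

/// Build (but do not run) the `uv venv` invocation rooted in the sandbox.
/// The .venv location is a hypothetical choice for this sketch.
fn uv_venv_command(sandbox: &Path) -> Command {
    let mut cmd = Command::new("uv");
    cmd.arg("venv").arg(sandbox.join(".venv"));
    cmd
}

fn main() {
    let cmd = uv_venv_command(Path::new("/tmp/hero_gpu/python"));
    // Inspect the composed argv without spawning anything.
    let args: Vec<OsString> = cmd.get_args().map(|a| a.to_os_string()).collect();
    println!("{:?}", args);
}
```

Keeping command construction separate from spawning makes the wrapper unit-testable without `uv` installed.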
Step 6: `hero_gpu_ui` — admin dashboard binary

Files: `Cargo.toml`, `src/{main,state,assets,routes}.rs`, `src/handlers/{index,fragments}.rs`, `templates/{base,index}.html`, `templates/partials/sidebar.html`, `templates/partials/tabs/{models,backend,playground,logs,stats,admin,docs}.html`, `static/css/dashboard.css`, `static/js/dashboard.js`, `static/favicon.svg`, `docs/{overview,cli,sdk,api}.md`.

- Served on `~/hero/var/sockets/hero_gpu/ui.sock`.
- Uses the `hero_gpu_sdk` client; `/fragments/*` Unpoly partial endpoints.
- `BASE_PATH` resolved from `X-Forwarded-Prefix` for hero_router embedding.

Dependencies: Step 2 (SDK), Step 3 (openrpc.json)
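Prefix resolution might look like this. The helper name and normalization rules are hypothetical; the real handler reads the header from the incoming request:

```rust
/// Resolve the UI base path from an optional X-Forwarded-Prefix header value,
/// as a reverse proxy like hero_router would set when embedding the dashboard.
fn resolve_base_path(forwarded_prefix: Option<&str>) -> String {
    match forwarded_prefix {
        // Normalize away a trailing slash so templates can write
        // {{ base_path }}/models without doubling separators.
        Some(p) if !p.is_empty() && p != "/" => p.trim_end_matches('/').to_string(),
        // Served at root: links are relative to "/".
        _ => String::new(),
    }
}

fn main() {
    println!("{:?}", resolve_base_path(Some("/services/hero_gpu/")));
    println!("{:?}", resolve_base_path(None));
}
```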
Step 7: `hero_gpu` CLI — service selfstart + hero_proc registration

Files: `crates/hero_gpu/Cargo.toml`, `src/main.rs`, `src/service.rs`.

- `--start`/`--stop` flags only (per hero_proc_service_selfstart).
- `build_service_definition()` registers three actions:
  - `hero_gpu_server` — exec sibling binary, `kill_other` cleans rpc.sock + openapi.sock, openrpc_socket health check.
  - `hero_gpu_ui` — exec sibling binary, `kill_other` cleans ui.sock, openrpc_socket health check.
  - `hero_gpu_sglang` — bash interpreter running `~/hero/var/hero_gpu/scripts/sglang_runner.sh`, env vars (`HERO_GPU_MODEL`, `HERO_GPU_PORT=30000`), `kill_other`, `port = vec![30000]`, http health check at `http://127.0.0.1:30000/health`.
- `--start` calls `hero_proc_factory().restart_service(...)`, `--stop` calls `stop_service(...)`.

Dependencies: Step 1, Step 2, Step 3
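A simplified, hypothetical mirror of what `build_service_definition()` assembles — the real types come from `hero_proc_sdk` and differ in detail:

```rust
// Stand-in for the hero_proc_sdk action type; fields are illustrative only.
#[derive(Debug)]
struct Action {
    name: &'static str,
    ports: Vec<u16>,
    health_check: &'static str,
}

fn build_service_definition() -> Vec<Action> {
    vec![
        Action { name: "hero_gpu_server", ports: vec![], health_check: "openrpc_socket" },
        Action { name: "hero_gpu_ui", ports: vec![], health_check: "openrpc_socket" },
        // Only the external SGLang Python process binds TCP.
        Action {
            name: "hero_gpu_sglang",
            ports: vec![30000],
            health_check: "http://127.0.0.1:30000/health",
        },
    ]
}

fn main() {
    for action in build_service_definition() {
        println!("{:?}", action);
    }
}
```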
Step 8: `hero_gpu_examples` — examples + integration test skeleton

Files: `Cargo.toml`, `examples/{health,list_models,openai_chat}.rs`, `tests/integration.rs`.

- Examples connect via the SDK (`default_socket_path()`) and exercise the basic methods.
- `openai_chat.rs` posts to the OpenAI-compat route via the hero_router URL.
- `tests/integration.rs` per `hero_crates_best_practices_check` Check 13 — setup/teardown with `~/hero/bin/hero_gpu --start/--stop`, three `#[ignore]` tests (`test_server_health`, `test_rpc_discover`, `test_basic_api_operations`).

Dependencies: Step 2, Step 3, Step 7
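The Check 13 setup/teardown could be sketched like this. Helper names are hypothetical, and the real test functions carry `#[test]` `#[ignore]` attributes:

```rust
use std::path::PathBuf;
use std::process::Command;

/// Path of the installed CLI used by setup/teardown (hypothetical helper).
fn cli_path() -> PathBuf {
    let home = std::env::var("HOME").unwrap_or_else(|_| ".".into());
    PathBuf::from(home).join("hero/bin/hero_gpu")
}

/// Failures are deliberately ignored so the skeleton runs
/// even when the binary is not installed yet.
fn setup() {
    let _ = Command::new(cli_path()).arg("--start").status();
}

fn teardown() {
    let _ = Command::new(cli_path()).arg("--stop").status();
}

fn main() {
    // test_server_health / test_rpc_discover / test_basic_api_operations
    // would each run between setup() and teardown() in the real file.
    println!("CLI expected at {}", cli_path().display());
    let _ = (setup, teardown); // referenced to avoid dead-code warnings
}
```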
Acceptance Criteria
- `cargo check --workspace` succeeds.
- `cargo build --release --workspace` produces four binaries: `hero_gpu`, `hero_gpu_server`, `hero_gpu_ui`, `hero_gpu_installer`.
- `cargo clippy --workspace -- -D warnings` passes (allow `dead_code`/`unused_variables` on stub handlers if necessary).
- `make install` copies all four binaries to `~/hero/bin/`.
- `~/hero/bin/hero_gpu --help` lists `--start` and `--stop`.
- `crates/hero_gpu_server/openrpc.json` is valid OpenRPC 1.3.0 (`jq` parses) and lists every implemented method.
- `hero_gpu_sdk::HeroGpuClient` compiles (proves macro expansion).
- `hero_gpu_ui` serves HTML on `~/hero/var/sockets/hero_gpu/ui.sock` with all standard tabs in the rendered DOM.
- `hero_gpu_server` serves `POST /rpc`, `GET /openrpc.json`, `GET /health`, `GET /.well-known/heroservice.json` on rpc.sock; OpenAI/Anthropic routes on openapi.sock.
- `hero_gpu_installer status`/`health` exit cleanly when components are missing (report "not installed", no panic).
- No `[patch]` sections committed; no TCP listeners in any Rust binary (only the Python SGLang process binds TCP).
- `crates/hero_gpu_examples/tests/integration.rs` exists with the three required `#[ignore]` test stubs.

Notes
- Stub handlers return placeholder data or `RpcError::not_implemented(...)`. The OpenRPC spec, route table, UI tab structure, Makefile targets, and CLI lifecycle wiring are the load-bearing artifacts.
- `hero_gpu_api` as a separate crate: honored per the issue. A `Backend` trait inside `hero_gpu_api`, implemented by `hero_gpu_server::AppState`, keeps the dependency arrow one-directional and avoids cycles.
- SGLang runs under `uv run` as a third hero_proc action managed by the `hero_gpu` CLI. Rust talks to SGLang over `127.0.0.1:30000` (SGLang doesn't speak UDS). This is the documented exception — only the external Python process binds TCP.
- All RPC results are wrapped objects (e.g. `{"models": [...]}`) — bare arrays break `openrpc_client!` macro expansion.
- The service directory under `~/hero/var/sockets/` is `hero_gpu/` (the service name). All binaries share the directory with sockets `rpc.sock`, `ui.sock`, `openapi.sock`.
- Only the `hero_gpu` CLI handles `--start`/`--stop`. Server, UI, and installer binaries are plain foreground processes (the installer CLI's `install/status/health/update` subcommands are unrelated to lifecycle).

Build & Check Results
Per the issue scope ("basics only — should not be tested nor complete"), no functional tests were run. Verification was limited to type-checking and release build of the full workspace.
Commands run:

- `cargo check --workspace`
- `cargo build --release --workspace`

Binaries produced:

- `target/release/hero_gpu` (4.4 MB)
- `target/release/hero_gpu_server` (4.8 MB)
- `target/release/hero_gpu_ui` (4.0 MB)
- `target/release/hero_gpu_installer` (1.5 MB)

Crates checked
- `hero_gpu_api` (library)
- `hero_gpu_sdk` (library, generated from openrpc.json)
- `hero_gpu_installer` (lib + bin)
- `hero_gpu_server` (lib + bin)
- `hero_gpu_ui` (bin)
- `hero_gpu` (bin — CLI)
- `hero_gpu_examples` (examples + ignored integration tests)

Notes
Verification was limited to `cargo check` and `cargo build --release`. The `hero_gpu_examples` integration tests remain `#[ignore]`-gated and were not run.

Implementation Summary
Scaffolded the `hero_gpu` workspace per the spec — every artifact required for further development is in place, all crates type-check, and all four release binaries build cleanly. Stub data is returned where SGLang would normally answer; the surface area (RPC methods, HTTP routes, UI tabs, lifecycle hooks) is complete.

Crates created (7 workspace members)
- `hero_gpu` — CLI: `--start`/`--stop`, registers all actions with hero_proc
- `hero_gpu_server` — JSON-RPC 2.0 server + OpenRPC spec
- `hero_gpu_api` — OpenAI/Anthropic adapters behind a `Backend` trait
- `hero_gpu_sdk` — generated client (`HeroGPUClient`) via the `openrpc_client!` macro
- `hero_gpu_installer` — uv-based sandbox installer (`install/status/health/update`)
- `hero_gpu_ui` — Askama admin dashboard
- `hero_gpu_examples` — examples + `#[ignore]` integration test skeleton

OpenRPC interface (16 methods)
All method results are wrapped objects (no bare arrays — required for `openrpc_client!` macro expansion).

- `rpc.discover`, `system.health`, `system.stats`
- `models.list`, `models.info`, `models.load`, `models.unload`
- `chat.completions`, `chat.completions_stream`, `text.completions`, `embeddings.create`
- `backend.start`, `backend.stop`, `backend.status`, `backend.config_get`, `backend.config_set`

HTTP API surface (`openapi.sock`)

- OpenAI-compatible: `POST /v1/chat/completions`, `POST /v1/completions`, `POST /v1/embeddings`, `GET /v1/models`, `GET /v1/models/{id}`
- Anthropic-compatible: `POST /v1/messages`, `POST /anthropic/v1/messages`, `GET /anthropic/v1/models`
- Discovery: `GET /health`, `GET /.well-known/heroservice.json`

Sockets (under `~/hero/var/sockets/hero_gpu/`)

- `rpc.sock` — JSON-RPC 2.0 (consumed by `hero_gpu_sdk`)
- `openapi.sock` — OpenAI/Anthropic HTTP
- `ui.sock` — admin dashboard

No Rust binary opens a TCP listener. The Python SGLang process is the sole exception (binds `127.0.0.1:30000`).

Service lifecycle (`hero_gpu --start/--stop`)

Three hero_proc actions registered by the CLI:

- `hero_gpu_server` — exec sibling binary, `kill_other` cleans `rpc.sock` + `openapi.sock`, openrpc_socket health check.
- `hero_gpu_ui` — exec sibling binary, `kill_other` cleans `ui.sock`, openrpc_socket health check.
- `hero_gpu_sglang` — bash interpreter running `~/hero/var/hero_gpu/scripts/sglang_runner.sh`, env `HERO_GPU_MODEL` + `HERO_GPU_PORT=30000`, `kill_other`, `port = vec![30000]`, http health check on `:30000/health`.

Per the `hero_proc_service_selfstart` pattern: only the `hero_gpu` CLI handles `--start`/`--stop`. The server, UI, and installer binaries are plain foreground processes (the installer CLI's own subcommands are unrelated to lifecycle).

Build automation
- `Makefile` — `help`, `version`, `status`, `build`, `builddev`, `check`, `fmt`, `lint`, `test`, `download-assets`, `install`, `installdev`, `run`, `stop`, `restart`, `logs`, `logs-ui`, `clean`, `all`
- `buildenv.sh` — `BINARIES`, `INSTALL_DIR=$HOME/hero/bin`
- `scripts/install.sh`, `scripts/build_lib.sh`, `scripts/download-assets.sh` (Bootstrap 5.3.3, Bootstrap Icons 1.13.1, Unpoly 3.12.1), `scripts/sglang_runner.sh`
- `.forgejo/workflows/build.yaml` — workspace check + clippy + release build

Verification
- `cargo check --workspace`
- `cargo build --release --workspace`

Four release binaries produced (`hero_gpu` 4.4 MB, `hero_gpu_server` 4.8 MB, `hero_gpu_ui` 4.0 MB, `hero_gpu_installer` 1.5 MB).

Notes / out-of-scope
- Integration tests in `crates/hero_gpu_examples/tests/integration.rs` are present but `#[ignore]`d.
- The `backend.start`/`backend.stop` RPCs currently return guidance to use `hero_proc` directly; tying them into the `hero_proc_sdk` factory from inside the server is a follow-up.
- `stream: true` paths compile but emit a single "streaming not yet implemented" SSE event. Real token streaming will plug in once the SGLang-side wiring is done.

Ready for further development.