implement iroh #1

Closed
opened 2026-04-16 07:16:56 +00:00 by despiegk · 3 comments
Owner

See below: a practical spec for a fully replicated KVS over Iroh in Rust.

The right base is iroh-docs on top of iroh, iroh-blobs, and iroh-gossip. iroh gives you authenticated/encrypted peer-to-peer QUIC transport with relay fallback; iroh-docs gives you a mutable replicated document model; and iroh-docs already depends on iroh-blobs plus iroh-gossip for content transfer and live sync. In iroh-docs, each entry is keyed by key + author + namespace, and the entry value is a BLAKE3 hash + size + timestamp for the content, while the actual bytes are stored/transferred separately. (docs.iroh.computer)

What to build

Use one shared namespace secret as the initial “shared secret”.

That maps very naturally to Iroh’s model:

  • the namespace secret is the write capability for the replicated document;
  • every node imports the same namespace secret;
  • every node keeps a full local copy of all document entries and all blob values;
  • every node is both client and replica holder. (GitHub)

So the first version should be:

System goal

  • N nodes
  • 1 logical KVS
  • full replication on every participating node
  • shared write secret for all nodes
  • eventually consistent
  • last-write-wins by application policy
  • no sharding
  • no quorum/raft in v1

That is an important point: this is not Raft. iroh-docs is a replicated sync substrate using reconciliation and live sync, not a linearizable consensus system. So this design gives you local-first replicated state, not strict serializable consensus. (Docs.rs)

Recommended data model

Use exactly one Iroh document namespace for one KVS.

Key layout

Store application keys as plain UTF-8 bytes, for example:

config/db/url
users/alice/profile
jobs/123/status

Value layout

Store the actual value bytes in blobs, and store the KVS mapping in the doc.

Conceptually:

doc entry key   = "users/alice/profile"
doc entry value = hash(blob) + len + timestamp + author
blob content    = raw value bytes (JSON / msgpack / binary)

This matches how iroh-docs is designed: the doc entry points to content by hash, and the content is handled separately. (GitHub)

Delete semantics

Represent delete as a tombstone. iroh-docs already has prefix delete semantics at replica level, but for a KVS I would make deletes explicit in your app layer:

{ "kind": "tombstone" }

That is simpler than physically removing history at the start.

Replication model

Every node does all of this:

  1. starts an iroh::Endpoint
  2. starts iroh-blobs
  3. starts iroh-gossip
  4. starts iroh-docs
  5. opens/imports the shared namespace
  6. connects to at least one known peer
  7. subscribes to live updates
  8. stores all received blobs locally
  9. rebuilds in-memory KVS index from local doc state on startup

This fits the documented stack exactly: Docs is spawned with an Endpoint, blobs store, and gossip protocol, then attached to an Iroh router with the docs/blob/gossip ALPNs. (Docs.rs)

Membership model for v1

Keep it simple.

v1

  • one shared secret distributed out of band
  • one static peer list or seed list
  • any node with the shared secret can read/write
  • every node connects to one or more seeds and syncs everything

later

  • per-node identities
  • read-only tickets
  • rotating namespace secrets
  • application ACLs
  • signed membership records inside the KVS itself

Iroh already has a DocTicket type containing a document capability plus peer addresses; sharing can be read or write capability depending on how you construct/export it. (Docs.rs)


Rust spec

Crates

[dependencies]
anyhow = "1"
tokio = { version = "1", features = ["full"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"

iroh = "0.97"
iroh-blobs = "0.99"
iroh-gossip = "0.97"
iroh-docs = "0.97"

Those versions reflect the docs currently published for the crates I checked. (Docs.rs)

Main components

1. Node runtime

pub struct KvNode {
    pub endpoint: iroh::Endpoint,
    pub docs: iroh_docs::protocol::Docs,
    pub namespace_id: iroh_docs::NamespaceId,
    pub author: iroh_docs::Author,
}

2. Persistent local storage

Use file-backed storage for both:

  • doc metadata
  • blobs

iroh-docs supports persistent file-based storage backed by redb, and persists all replicas to a single file. (Docs.rs)

3. Shared-secret bootstrap

Your config should contain:

  • namespace_secret
  • known_peers[]

Example config:

{
  "namespace_secret": "base32-or-hex-encoded-secret",
  "known_peers": [
    "...serialized endpoint addr..."
  ]
}

4. KVS API

Expose a small local API:

async fn put(key: &str, value: Vec<u8>) -> anyhow::Result<()>;
async fn get(key: &str) -> anyhow::Result<Option<Vec<u8>>>;
async fn delete(key: &str) -> anyhow::Result<()>;
async fn list(prefix: &str) -> anyhow::Result<Vec<String>>;
async fn sync_once() -> anyhow::Result<()>;

Conflict policy

Because iroh-docs entries are keyed by key + author + namespace, the same logical app key can have multiple authored entries. Your KVS layer should collapse those into one visible value by policy. (GitHub)

For v1, use:

Visible value for a logical key = newest timestamp wins

  • if timestamps tie, break by author id lexicographically
  • tombstone beats older value

That gives deterministic convergence.
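
As a concrete sketch of that policy in pure Rust (no crate APIs): the ordering compares (timestamp, author id), and a winning tombstone simply makes the key invisible. EntryMeta here is an illustrative shape, not a crate type.

#[derive(Debug, Clone)]
struct EntryMeta {
    author: Vec<u8>, // author id bytes, used only as a tie-breaker
    timestamp: u64,  // writer timestamp in milliseconds
    tombstone: bool, // true if this entry is a delete marker
}

/// Newest timestamp wins, ties broken by author id; if the winner is a
/// tombstone, the logical key is treated as deleted (no visible value).
fn visible(entries: &[EntryMeta]) -> Option<EntryMeta> {
    let winner = entries.iter().cloned().max_by(|a, b| {
        a.timestamp
            .cmp(&b.timestamp)
            .then_with(|| a.author.cmp(&b.author))
    })?;
    if winner.tombstone {
        None
    } else {
        Some(winner)
    }
}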

Full replication guarantee

Your app-level rule should be:

a node is “healthy” only if it stores every doc entry and every referenced blob for the namespace.

Implementation:

  • on startup, scan all current doc entries
  • for each live entry, ensure blob exists locally
  • subscribe to updates and fetch/store missing blobs immediately
  • periodically reconcile all keys and missing blobs

Because iroh-docs tracks the content hash/length but not the content itself, your KVS code must treat “entry present but blob missing” as an incomplete replica and repair it. (GitHub)
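
A minimal sketch of that repair pass under stated assumptions: ReplicaView and its three methods are illustrative stand-ins for the docs/blobs handles, not the iroh-docs or iroh-blobs API.

use anyhow::Result;

/// Illustrative abstraction over what the repair pass needs to see;
/// the real implementation maps these onto the docs and blobs stores.
trait ReplicaView {
    fn referenced_hashes(&self) -> Vec<[u8; 32]>;        // hashes referenced by live doc entries
    fn blob_present(&self, hash: &[u8; 32]) -> bool;     // is the content stored locally?
    fn fetch_blob(&self, hash: &[u8; 32]) -> Result<()>; // pull it from a peer
}

/// A replica is complete only when every referenced blob is stored locally;
/// this pass finds and repairs any gap.
fn repair_missing_blobs(view: &impl ReplicaView) -> Result<()> {
    for hash in view.referenced_hashes() {
        if !view.blob_present(&hash) {
            view.fetch_blob(&hash)?;
        }
    }
    Ok(())
}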


Suggested architecture

One namespace, many authors

Do not use one author for the whole cluster.

Better:

  • one shared namespace secret for write capability
  • one author keypair per node
  • each node writes under its own author

Why:

  • better audit trail
  • deterministic conflict debugging
  • easier future ACLs
  • closer to Iroh’s intended model, where author identity is separate from namespace write capability. (GitHub)

Seed peers

Each node should know a few seed peers:

  • node A, B, C as bootstrap peers
  • after connecting, peer exchange can be added later

No consensus in v1

Do not promise:

  • strict ordering
  • single leader
  • linearizable reads
  • immediate read-your-writes across partitions

Promise instead:

  • authenticated transport
  • encrypted transport
  • live sync
  • eventual convergence
  • full replica on every healthy node. (docs.iroh.computer)

Example: node setup

This example follows the documented pattern for standing up Iroh + blobs + gossip + docs.

use anyhow::Result;
use iroh::{endpoint::presets, protocol::Router, Endpoint};
use iroh_blobs::{store::mem::MemStore, BlobsProtocol, ALPN as BLOBS_ALPN};
use iroh_docs::{protocol::Docs, ALPN as DOCS_ALPN};
use iroh_gossip::{net::Gossip, ALPN as GOSSIP_ALPN};

pub struct Runtime {
    pub endpoint: Endpoint,
    pub docs: Docs,
}

pub async fn start_runtime() -> Result<Runtime> {
    let endpoint = Endpoint::bind(presets::N0).await?;

    let blobs = MemStore::default();
    let gossip = Gossip::builder().spawn(endpoint.clone());

    let docs = Docs::memory()
        .spawn(endpoint.clone(), (*blobs).clone(), gossip.clone())
        .await?;

    let _router = Router::builder(endpoint.clone())
        .accept(BLOBS_ALPN, BlobsProtocol::new(&blobs, None))
        .accept(GOSSIP_ALPN, gossip)
        .accept(DOCS_ALPN, docs.clone())
        .spawn();

    Ok(Runtime { endpoint, docs })
}

That matches the published setup flow for iroh-docs. (Docs.rs)

For production, switch memory stores to persistent file-backed stores.


Example: logical KVS interface

Below is the shape I would implement, even if exact method names may need minor adjustment against the current crate API.

use anyhow::Result;
use serde::{Deserialize, Serialize};

#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum KvValue {
    Bytes(Vec<u8>),
    Tombstone,
}

pub struct KvStore {
    // wraps docs + blob store + namespace + author + local materialized index
}

impl KvStore {
    pub async fn put(&self, key: &str, value: Vec<u8>) -> Result<()> {
        // 1. add blob content locally
        // 2. get hash + len
        // 3. insert/update doc entry for key using this node's author
        Ok(())
    }

    pub async fn get(&self, key: &str) -> Result<Option<Vec<u8>>> {
        // 1. resolve latest visible entry for logical key
        // 2. if tombstone => None
        // 3. fetch/read blob content by hash
        Ok(None)
    }

    pub async fn delete(&self, key: &str) -> Result<()> {
        // write tombstone value as a normal entry
        Ok(())
    }
}

Materialized index

Maintain a local map:

BTreeMap<String, ResolvedEntry>

Where:

struct ResolvedEntry {
    author: Vec<u8>,
    timestamp: u64,
    hash: Option<[u8; 32]>,
    len: u64,
    tombstone: bool,
}

This index is rebuilt from the doc on startup and updated from subscriptions during runtime.
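
A sketch of the incremental update, reusing the ResolvedEntry shape above (restated with Clone so the snippet stands alone); the same LWW comparison used for conflict resolution decides whether an incoming entry replaces the current winner.

use std::collections::BTreeMap;

#[derive(Debug, Clone)]
struct ResolvedEntry {
    author: Vec<u8>,
    timestamp: u64,
    hash: Option<[u8; 32]>,
    len: u64,
    tombstone: bool,
}

/// Apply one entry (local write or remote insert event) to the index,
/// keeping only the LWW winner per logical key.
fn apply_entry(
    index: &mut BTreeMap<String, ResolvedEntry>,
    key: String,
    incoming: ResolvedEntry,
) {
    let keep_existing = index.get(&key).is_some_and(|current| {
        (current.timestamp, &current.author) >= (incoming.timestamp, &incoming.author)
    });
    if !keep_existing {
        index.insert(key, incoming);
    }
}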


Example: using one shared secret

This is the important bootstrapping pattern.

Node 1

  • generate namespace secret once
  • persist it
  • import/open doc
  • export/share ticket or publish endpoint address

iroh-docs exposes namespace secrets as the write capability, and DocTicket can carry either read or write capability plus peer addresses. (GitHub)

Pseudo-flow:

// first cluster creation
let namespace_secret = generate_once();
save(namespace_secret);

// every node
let namespace_secret = load_shared_secret();
let namespace_id = namespace_secret.id();

// import namespace into docs store
// open the doc/replica

Node 2..N

  • receive the same namespace secret out of band
  • know one or more seed EndpointAddrs
  • import namespace
  • connect and sync
  • from then on, live updates

Example: app-level write path

pub async fn put_json<T: serde::Serialize>(
    kv: &KvStore,
    key: &str,
    value: &T,
) -> Result<()> {
    let bytes = serde_json::to_vec(value)?;
    kv.put(key, bytes).await
}

Under the hood:

put(key, value):
  blob_hash = store_blob(value)
  docs.insert(key, author, blob_hash, value.len)
  publish/sync

This is aligned with Replica::insert and Replica::hash_and_insert, which insert records referencing content by hash/len. (Docs.rs)


Example: conflict resolution

Suppose:

  • node A writes foo=1
  • node B writes foo=2
  • both offline
  • later they reconnect

Your visible KVS result should be resolved deterministically:

fn resolve(entries: &[EntryMeta]) -> Option<EntryMeta> {
    entries.iter().cloned().max_by(|a, b| {
        a.timestamp
            .cmp(&b.timestamp)
            .then_with(|| a.author.cmp(&b.author))
    })
}

That gives convergence without consensus.


Example project layout

iroh_kv/
  Cargo.toml
  src/
    main.rs
    config.rs
    runtime.rs
    kv.rs
    sync.rs
    index.rs
    api.rs

config.rs

#[derive(Debug, serde::Serialize, serde::Deserialize)]
pub struct Config {
    pub namespace_secret: String,
    pub author_secret: Option<String>,
    pub known_peers: Vec<String>,
    pub data_dir: String,
}

runtime.rs

  • start endpoint
  • start blobs/gossip/docs
  • load persistent stores

kv.rs

  • put/get/delete/list

sync.rs

  • connect to seeds
  • repair missing blobs
  • subscribe to live updates

index.rs

  • rebuild materialized key index
  • apply timestamp resolution

api.rs

  • expose local HTTP or UDS API

Operational spec

Durability

  • persist namespace secret
  • persist per-node author secret
  • persist doc store
  • persist blob store

Recovery

On restart:

  1. open local persistent stores
  2. open shared namespace
  3. rebuild visible KVS index from all entries
  4. verify referenced blobs exist
  5. connect to peers and repair missing data

Health check

Return healthy only if:

  • Iroh endpoint is bound
  • docs engine running
  • blob store accessible
  • namespace opened
  • all referenced blobs present locally
  • at least one peer connected recently, or single-node mode enabled
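
A minimal sketch of that check as plain Rust; the field names are illustrative (how each flag is computed depends on the endpoint, docs engine, blob scan, and peer tracking).

/// Snapshot of the conditions listed above, gathered by the runtime.
struct HealthSnapshot {
    endpoint_bound: bool,
    docs_running: bool,
    blob_store_ok: bool,
    namespace_open: bool,
    all_blobs_present: bool,
    peer_seen_recently: bool,
    single_node_mode: bool,
}

fn is_healthy(h: &HealthSnapshot) -> bool {
    h.endpoint_bound
        && h.docs_running
        && h.blob_store_ok
        && h.namespace_open
        && h.all_blobs_present
        && (h.peer_seen_recently || h.single_node_mode)
}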

Anti-entropy

Even with live updates, also run periodic full reconciliation:

  • every 30s–5m depending on size
  • compare local doc state with peers
  • fetch missing blobs
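
A sketch of that loop using tokio::time::interval; reconcile stands in for "sync with peers, then fetch missing blobs" and is shown as a synchronous closure for brevity (the real pass is async and gets cancelled on shutdown).

use std::time::Duration;
use anyhow::Result;

/// Periodic anti-entropy: even with live sync running, reconcile with peers
/// and repair missing blobs on a fixed interval; failures are logged and
/// retried on the next tick.
async fn anti_entropy_loop<F>(period: Duration, mut reconcile: F)
where
    F: FnMut() -> Result<()>,
{
    let mut ticker = tokio::time::interval(period);
    loop {
        ticker.tick().await;
        if let Err(err) = reconcile() {
            eprintln!("anti-entropy pass failed: {err}");
        }
    }
}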

Security model for v1

What the shared secret means

Your shared secret should be the namespace secret.

That means:

  • anyone holding it can write
  • anyone without it should at most get read capability
  • transport itself is still peer-authenticated and encrypted by Iroh endpoint identity. (Docs.rs)

Good enough for first version

  • distribute namespace secret manually
  • pin trusted peer endpoint ids
  • use per-node author keys for auditability

Not good enough long term

  • one secret for all writers is coarse
  • no revocation without rotation
  • membership changes require secret rotation

What not to do

Do not start with:

  • raft
  • leader election
  • quorum writes
  • sharding
  • partial replicas
  • per-key permissions

Those all fight the simplicity of your stated goal.

Your stated goal is:

“all participating nodes have all keys and values”

That is full-replication event-synced KVS, and Iroh docs/blobs/gossip are a better fit than Raft for the first version.


Minimal design decision summary

Best first implementation

  • one namespace = one KVS
  • shared namespace secret = shared write secret
  • one author per node
  • full data replication to every node
  • persistent local stores
  • deterministic LWW resolution
  • static seed peers
  • periodic repair + live sync

Why this fits Iroh

Because Iroh is explicitly built for direct authenticated/encrypted P2P connections with relay fallback, and iroh-docs is explicitly a replicated multi-dimensional key-value document protocol using set reconciliation, backed by blobs and gossip. (docs.iroh.computer)

License

iroh-docs is dual-licensed MIT OR Apache-2.0. (Docs.rs)

If you want, I can turn this into a compilable Rust starter project with main.rs, kv.rs, and a simple local HTTP API.

make the KVS part of our codescalers

then make sure we store all info we need to remember into that KVS

make the KVS a higher-level library we can use in the ...codescalers_server

the manual sync we implemented can be removed


References

  • What is iroh? (docs.iroh.computer): https://docs.iroh.computer/what-is-iroh
  • iroh-docs repository (GitHub): https://github.com/n0-computer/iroh-docs
  • iroh_docs crate docs (Docs.rs): https://docs.rs/iroh-docs/latest/iroh_docs/
  • DocTicket in iroh_docs (Docs.rs): https://docs.rs/iroh-docs/latest/iroh_docs/struct.DocTicket.html
  • Replica in iroh_docs::sync (Docs.rs): https://docs.rs/iroh-docs/latest/iroh_docs/sync/struct.Replica.html
  • iroh crate docs (Docs.rs): https://docs.rs/iroh
Author
Owner

Implementation Spec for Issue #1 — Implement Iroh KVS

Objective

Build a fully-replicated, node-local, eventually-consistent key/value store backed by iroh-docs (on top of iroh, iroh-blobs, iroh-gossip) as a new workspace crate hero_codescalers_kvs. Migrate all persistent state currently held in hero_codescalers_server's SQLite DB (nodes, admins, groups, group-members, per-node stats) into the KVS, and rip out the existing "manual sync" machinery (the sync_queue table, sync_worker, JSON-RPC *.apply* methods, and the proxy forwarder). The namespace secret is shared out of band and grants write capability; every node signs with its own author keypair; last-write-wins (LWW) by (timestamp, author_id); deletes are tombstones; anti-entropy + live gossip sync.

Requirements

  • New workspace crate crates/hero_codescalers_kvs that compiles independently and is reusable by other Hero services.
  • KvStore high-level async API: put(key, value) -> Result<()>, get(key) -> Result<Option<Vec<u8>>>, delete(key) -> Result<()>, list(prefix) -> Result<Vec<(String, Vec<u8>)>>, list_keys(prefix) -> Result<Vec<String>>, sync_once() -> Result<()>, plus a subscribe() event stream and shutdown() method.
  • Backing: iroh::Endpoint + iroh_blobs::store::fs::Store + iroh_gossip::net::Gossip + iroh_docs::protocol::Docs; single namespace per store. Values are blobs; entries record blob_hash + len + timestamp + author.
  • Persistent default (file-backed redb + fs-blobs under data_dir); in-memory variant (iroh_blobs::store::mem, iroh_docs::store::memory) for tests.
  • Deterministic LWW: for each logical key, pick the entry with the highest (timestamp_ms, author_id) tuple.
  • Tombstones: explicit entries with a typed sentinel value (KvValue::Tombstone { ts } serialised via serde_json) so get/list skip them; a background reaper may prune tombstones older than a TTL.
  • Static seed peers list for bootstrap (node ids + direct addrs); periodic anti-entropy verifies every referenced blob exists locally and fetches missing ones; gossip provides live sync.
  • hero_codescalers_server is migrated to back its nodes, admins, groups, group_members data in the KVS instead of SQLite; sessions and users (read from /etc/passwd / w) stay as they are; jobs/logs (owned by hero_proc) stay as they are.
  • Full deletion of the existing manual sync path (see "Files to Modify/Create").
  • Unit tests in the KVS crate and an integration test that spins up two in-memory nodes, puts/deletes on one, and asserts convergence on the other.

Files to Modify/Create

New crate crates/hero_codescalers_kvs/:

  • crates/hero_codescalers_kvs/Cargo.toml — package manifest; deps iroh = "0.97", iroh-blobs = "0.99", iroh-gossip = "0.97", iroh-docs = "0.97", plus workspace tokio, serde, serde_json, anyhow, thiserror, tracing, parking_lot, chrono, futures, hex, rand, data-encoding, tempfile (dev-dep).
  • crates/hero_codescalers_kvs/src/lib.rs — re-exports; crate docs.
  • crates/hero_codescalers_kvs/src/config.rs — KvConfig (namespace secret, author secret, data_dir, seed_peers, anti_entropy_interval, tombstone_ttl, persistence mode), plus a KvConfigBuilder and helpers to parse/encode namespace/author secrets.
  • crates/hero_codescalers_kvs/src/value.rs — KvValue { Live { bytes }, Tombstone } with serde_json envelope { v: 1, kind: "live"|"tombstone", ts_ms, data?: base64 }.
  • crates/hero_codescalers_kvs/src/store.rs — KvStore struct; holds Endpoint, Docs, Doc, Author, NamespaceSecret, blob store handle, seed-peers list, shutdown handle, events broadcaster. Implements new_persistent, new_memory, put, get, delete, list, list_keys, sync_once, subscribe, shutdown.
  • crates/hero_codescalers_kvs/src/lww.rs — per-key entry reducer: given an iterator of doc entries for a logical key, pick the winner via (ts_ms, author_id_bytes).
  • crates/hero_codescalers_kvs/src/anti_entropy.rs — background task: iterate all entries, verify blob locally, fetch missing from seed peers; optional tombstone reaper.
  • crates/hero_codescalers_kvs/src/events.rs — KvEvent { Put { key }, Delete { key }, RemotePut { key, author }, ... } over tokio::sync::broadcast.
  • crates/hero_codescalers_kvs/src/keys.rs — helpers for keyspace (bytes <-> UTF-8 strings, prefix encoding).
  • crates/hero_codescalers_kvs/src/error.rs — KvError enum with thiserror; pub type Result<T> = std::result::Result<T, KvError>;
  • crates/hero_codescalers_kvs/tests/roundtrip.rs — single-node put/get/delete/list tests on the in-memory variant.
  • crates/hero_codescalers_kvs/tests/two_node_sync.rs — two in-memory nodes, dial each other, assert LWW convergence on conflicting writes, assert tombstone wins over older put, assert sync_once fetches missing blobs.

Workspace root:

  • Cargo.toml — add crates/hero_codescalers_kvs to [workspace] members; add iroh, iroh-blobs, iroh-gossip, iroh-docs, thiserror, hex, data-encoding, rand, tempfile under [workspace.dependencies].
  • Cargo.lock — regenerated.

Server migration (crates/hero_codescalers_server/):

  • Cargo.toml — add hero_codescalers_kvs = { path = "../hero_codescalers_kvs" }; remove rusqlite (kept only if jobs/logs still need it — they don't, so remove).
  • src/main.rs — replace Db with a new KvState facade; remove sync_worker module; remove proxy module; delete every *.apply*, sync.status, sync.pending, remote.rpc dispatch arm; remove pending_syncs from stats; drop the mycelium TCP listener (no longer needed since replication is over iroh); load KVS config from env/data_dir; call KvStore::new_persistent at startup and register self-node into KVS.
  • src/model/mod.rs — rewrite exports; the Db type disappears, replaced by KvState (or similar) which wraps KvStore and exposes the same high-level Rust methods (node_list, admin_add, group_create, etc.) but reads/writes through the KVS.
  • src/model/db.rs — delete and replace with src/model/state.rs (new) containing the KvState facade; migrations vanish; sync_* functions vanish.
  • src/model/node.rs — rewrite: Node struct stays (serde_json in/out of KVS); all methods reimplemented against KvStore (kvs.put("nodes/<ipv6>", json) / kvs.delete(...) / kvs.list("nodes/")). Remove all sync_enqueue_rpc calls and all *_apply_remote / *_apply_stats / *_apply_delete methods — LWW + replication is handled inside KVS.
  • src/model/admin.rs — same treatment: persist under admins/<ipv6>; drop apply_* methods.
  • src/model/group.rs — persist groups under groups/<name> and members under group_members/<name>/<ipv6>; drop apply_* methods.
  • src/proxy.rs — delete.
  • src/sync_worker.rs — delete (the periodic stats task moves into main.rs or a tiny new stats_worker.rs; it no longer enqueues syncs — it just calls state.node_update_stats(...) which writes to KVS and lets replication propagate).
  • src/sessions.rs, src/users.rs — unchanged.
  • openrpc.json — drop sync.status, sync.pending, node.apply, node.apply_delete, node.apply_stats, admin.apply, admin.apply_delete, group.apply, group.apply_delete, group.apply_member, group.apply_member_delete, remote.rpc; drop pending_syncs from the stats result schema.
  • openrpc.client.generated.rs — regenerated by the client macro; no manual edits.
  • heroservice.json — unchanged.

UI (crates/hero_codescalers_ui/):

  • templates/index.html — remove the "sync pending" stat tile (lines 231 and 445).
  • static/js/dashboard.js — remove stats-sync-pending / stat-sync-pending wiring (lines 592, 886).

Docs:

  • README.md — short note that replication is now via iroh-docs.

Implementation Plan

Step 1: Add workspace members and dependencies

Files: Cargo.toml (root)

  • Add crates/hero_codescalers_kvs to [workspace] members.
  • Add iroh = "0.97", iroh-blobs = "0.99", iroh-gossip = "0.97", iroh-docs = "0.97" to [workspace.dependencies].
  • Add thiserror, hex, data-encoding, rand, tempfile to [workspace.dependencies] (tempfile as a dev-dep convention).
  • Do NOT remove rusqlite from workspace deps yet (kept until Step 7).
    Dependencies: none.

Step 2: Scaffold hero_codescalers_kvs crate (types, config, errors, value envelope)

Files: crates/hero_codescalers_kvs/Cargo.toml, crates/hero_codescalers_kvs/src/lib.rs, crates/hero_codescalers_kvs/src/config.rs, crates/hero_codescalers_kvs/src/error.rs, crates/hero_codescalers_kvs/src/value.rs, crates/hero_codescalers_kvs/src/keys.rs, crates/hero_codescalers_kvs/src/events.rs

  • Package manifest with iroh-ecosystem deps and workspace deps.
  • Error enum (KvError::{Io, Iroh, Docs, Blobs, Serde, NotFound, Shutdown, ...}).
  • KvValue envelope with serde impls; tombstones carry ts_ms; live values base64-encode bytes (one possible serde shape is sketched after this list).
  • KvConfig struct + builder: namespace_secret: NamespaceSecret, author_secret: Option<AuthorSecret> (generated if None), data_dir: PathBuf, seed_peers: Vec<NodeAddr>, anti_entropy_interval: Duration (default 60s), tombstone_ttl: Option<Duration> (default 7d), persistence: Persistence::{File, Memory}.
  • Helpers to parse secrets from z-base-32 / hex strings.
  • KvEvent enum + broadcaster type alias.
  • Key helpers (validate UTF-8, prefix match).
    Dependencies: Step 1.
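
One possible shape for that envelope, using serde/serde_json from the dependency list. Field names follow the { v, kind, ts_ms, data } layout above; the exact encoding is up to the implementer.

use serde::{Deserialize, Serialize};

/// Blob payload written for every KVS entry. `data` holds the base64-encoded
/// value for live entries and is omitted for tombstones.
#[derive(Debug, Clone, Serialize, Deserialize)]
struct Envelope {
    v: u32,     // envelope version, fixed at 1 for now
    kind: Kind, // live or tombstone
    ts_ms: u64, // writer timestamp in milliseconds
    #[serde(default, skip_serializing_if = "Option::is_none")]
    data: Option<String>,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(rename_all = "lowercase")]
enum Kind {
    Live,
    Tombstone,
}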

Step 3: Implement persistent KvStore constructor + lifecycle

Files: crates/hero_codescalers_kvs/src/store.rs

  • KvStore::new_persistent(cfg: KvConfig) -> Result<Self>:
    • Create data_dir, subdirs blobs/, docs/, keys/.
    • Load or generate node secret key (persisted at keys/node.secret).
    • Build Endpoint::builder().secret_key(sk).bind().
    • Open iroh_blobs::store::fs::Store at blobs/.
    • Open iroh_gossip::net::Gossip bound to endpoint.
    • Open iroh_docs::protocol::Docs with fs store at docs/.
    • Register all three protocols on the endpoint via a Router.
    • Import the provided namespace_secret to get a writable Doc handle (create if not present).
    • Load or create author, call docs.author_create / import, store author.id().
    • Add seed peers so sync can bootstrap.
    • Start live sync on the doc (doc.start_sync(peers)).
    • Spawn anti-entropy task and a shutdown oneshot.
  • KvStore::new_memory(cfg) — same but using iroh_blobs::store::mem::MemStore and iroh_docs::store::memory::Store, no on-disk persistence.
  • shutdown() — cancel background tasks, close doc, close endpoint.
    Dependencies: Step 2.

Step 4: Implement put / get / delete / list / list_keys

Files: crates/hero_codescalers_kvs/src/store.rs, crates/hero_codescalers_kvs/src/lww.rs

  • put(key, value):
    1. Build KvValue::Live { ts_ms: now_ms(), data: value }, serialise to JSON.
    2. blobs.add_bytes(json_bytes) -> hash.
    3. doc.set_hash(author, key_bytes, hash, len).
    4. Emit KvEvent::Put { key }.
  • delete(key):
    1. Build KvValue::Tombstone { ts_ms: now_ms() }, serialise.
    2. Same pipeline as put.
    3. Emit KvEvent::Delete.
  • get(key):
    1. Query all entries for key_bytes.
    2. Collapse via lww::pick_winner.
    3. If winner is None or Tombstone -> Ok(None).
    4. Else fetch blob, deserialise envelope, decode base64 data, return.
  • list(prefix) and list_keys(prefix) — prefix-range scan + LWW + skip tombstones.
  • sync_once() — call doc.sync_with_peers(seed_peers.clone()).await.
    Dependencies: Step 3.

Step 5: Events subscription

Files: crates/hero_codescalers_kvs/src/events.rs, crates/hero_codescalers_kvs/src/store.rs

  • KvStore::subscribe() -> broadcast::Receiver<KvEvent>.
  • Hook into iroh-docs' own event stream and convert LiveEvent::InsertRemote / InsertLocal / NeighborUp / NeighborDown into KvEvent and re-broadcast.
  • Emit local variant inside put/delete.
    Dependencies: Step 4.

Step 6: Anti-entropy + tombstone reaper

Files: crates/hero_codescalers_kvs/src/anti_entropy.rs, crates/hero_codescalers_kvs/src/store.rs

  • Spawn tokio::time::interval(anti_entropy_interval) task.
  • Each tick: iterate entries, verify each referenced blob locally; download missing from seed peers via iroh-blobs downloader; call doc.sync_with_peers as a safety belt.
  • Optional tombstone reaper if tombstone_ttl set.
  • CancellationToken for shutdown.
    Dependencies: Step 3, Step 4.

Step 7: Single-node unit tests

Files: crates/hero_codescalers_kvs/tests/roundtrip.rs

  • put/get, overwrite, delete, list, list_keys — all on new_memory.
    Dependencies: Step 4.

Step 8: Two-node integration test

Files: crates/hero_codescalers_kvs/tests/two_node_sync.rs

  • Two new_memory nodes, mutual seeds; assert LWW, tombstone semantics, offline->online sync via sync_once (a test sketch follows this step).
    Dependencies: Step 4, 5, 6.
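
A sketch of what that test asserts, written against the KvStore API this spec defines (new_memory, put, get, delete, sync_once); make_test_configs is a hypothetical helper that builds two configs sharing one namespace secret and listing each other as seed peers.

#[tokio::test]
async fn two_nodes_converge() -> anyhow::Result<()> {
    // Hypothetical helper: two KvConfigs sharing a namespace secret,
    // each listing the other node as a seed peer.
    let (cfg_a, cfg_b) = make_test_configs();
    let a = KvStore::new_memory(cfg_a).await?;
    let b = KvStore::new_memory(cfg_b).await?;

    // A write on node a becomes visible on node b after a sync.
    a.put("nodes/abc", b"one".to_vec()).await?;
    b.sync_once().await?;
    assert_eq!(b.get("nodes/abc").await?, Some(b"one".to_vec()));

    // A newer tombstone on node b wins over the older value on node a.
    b.delete("nodes/abc").await?;
    a.sync_once().await?;
    assert_eq!(a.get("nodes/abc").await?, None);
    Ok(())
}

In practice the assertions may need a short retry loop while blob downloads complete.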

Step 9: Server state facade (KvState) built on top of KvStore

Files: crates/hero_codescalers_server/src/model/state.rs (new), crates/hero_codescalers_server/src/model/mod.rs, crates/hero_codescalers_server/src/model/node.rs, crates/hero_codescalers_server/src/model/admin.rs, crates/hero_codescalers_server/src/model/group.rs, crates/hero_codescalers_server/Cargo.toml

  • Add hero_codescalers_kvs path dep; remove rusqlite.
  • New KvState { kvs: Arc<KvStore>, self_ipv6: String }.
  • Keyspace: nodes/<ipv6>, admins/<ipv6>, groups/<name>, group_members/<group>/<ipv6>.
  • Rewrite node/admin/group modules — same public method signatures, implemented against KvStore (a facade sketch follows this step). Drop all *_apply_* methods.
  • Delete model/db.rs.
    Dependencies: Steps 2–6.
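
A sketch of the facade shape under the keyspace above. Node is the existing server struct (assumed here to expose an ipv6 field and serde derives), and KvStore's error type is assumed to convert into anyhow::Error.

use std::sync::Arc;
use anyhow::Result;

/// Replaces the old SQLite `Db`: every read/write goes through the
/// replicated KvStore, keyed as nodes/<ipv6>, admins/<ipv6>, groups/<name>.
pub struct KvState {
    kvs: Arc<KvStore>,
    self_ipv6: String, // stable identifier of this node, used in keys
}

impl KvState {
    pub async fn node_save(&self, node: &Node) -> Result<()> {
        let key = format!("nodes/{}", node.ipv6); // keyspace: nodes/<ipv6>
        self.kvs.put(&key, serde_json::to_vec(node)?).await?;
        Ok(())
    }

    pub async fn node_list(&self) -> Result<Vec<Node>> {
        let mut nodes = Vec::new();
        for (_key, bytes) in self.kvs.list("nodes/").await? {
            nodes.push(serde_json::from_slice(&bytes)?);
        }
        Ok(nodes)
    }
}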

Step 10: Gut the manual sync machinery in the server

Files: crates/hero_codescalers_server/src/main.rs, crates/hero_codescalers_server/src/sync_worker.rs, crates/hero_codescalers_server/src/proxy.rs, crates/hero_codescalers_server/openrpc.json

  • Delete sync_worker.rs, proxy.rs.
  • main.rs: remove mycelium TCP listener, remove mod decls, add KvConfig construction from env vars, replace Db::open with KvState::open, drop all *.apply* / sync.* / remote.rpc dispatch arms, drop pending_syncs from stats, replace sync_worker::spawn with a small stats-only worker.
  • openrpc.json: remove the listed methods and pending_syncs field.
    Dependencies: Step 9.

Step 11: UI cleanup

Files: crates/hero_codescalers_ui/templates/index.html, crates/hero_codescalers_ui/static/js/dashboard.js

  • Remove sync-pending tile + wiring.
    Dependencies: Step 10.

Step 12: End-to-end validation + docs

Files: README.md

  • Short note on iroh-docs replication and required env vars.
  • Run fmt/clippy/tests.
    Dependencies: all prior steps.

Parallelisation notes

  • Steps 2–6 sequential within the KVS crate.
  • Steps 7, 8 can parallelise with Step 9.
  • Steps 10 and 11 can parallelise.

Acceptance Criteria

  • cargo build --workspace succeeds on a clean checkout.
  • cargo test -p hero_codescalers_kvs passes, including the two-node integration test.
  • cargo test --workspace --lib passes.
  • cargo clippy --workspace --all-targets -- -D warnings passes.
  • crates/hero_codescalers_server/src/sync_worker.rs and crates/hero_codescalers_server/src/proxy.rs no longer exist.
  • crates/hero_codescalers_server/src/model/db.rs no longer exists; KvState is the only state facade.
  • grep -r sync_queue crates/, grep -r sync_enqueue_rpc crates/, grep -r node_apply crates/, grep -r admin_apply crates/, grep -r group_apply crates/, grep -r rusqlite crates/ all return no matches.
  • openrpc.json contains no methods under sync.*, no *.apply*, no remote.rpc; the stats schema has no pending_syncs.
  • Starting the server with a namespace secret env var produces a persistent iroh data_dir containing a blobs/ dir and a docs/ redb file.
  • Creating a node/admin/group on one server node becomes visible on another node sharing the same namespace secret within 10 seconds, with no manual sync step.
  • Killing and restarting a node preserves all state; entries are re-synced from peers on restart.
  • The UI dashboard no longer shows a "sync pending" tile and no console errors appear.

Notes

  • iroh 0.97 API churn. The iroh ecosystem moves fast. Implementers must consult each crate's 0.97/0.99 changelog before wiring protocols — do not copy patterns from older iroh examples verbatim.
  • Namespace secret is a shared write credential. Anyone with it can write. Distribute out of band (e.g., via existing Hero secret store or an env var set by deployment tooling).
  • Author secret should be per-node and persistent. Store under data_dir/kvs/keys/author.secret, 0600 perms.
  • Mycelium listener removal. The server currently binds [mycelium_ipv6]:9955 to receive sync RPC. That binding is no longer needed. Keep mycelium address detection only because self_ipv6 remains the stable identifier used in keys.
  • Stats churn. node_update_stats runs every 30s and now writes into the KVS each tick. Fine for a few nodes; flag if the cluster grows.
  • Tombstone semantics. iroh-docs has its own delete, but semantics for LWW across deletes and re-creates are easier to reason about with typed tombstone envelopes. Reaper prunes old tombstones.
  • rpc-openrpc client. hero_codescalers_sdk regenerates its client from openrpc.json via a proc macro; no manual edits required.
  • Do not write hero_codescalers_kvs as an OpenRPC service. It is a plain Rust library crate; no heroservice.json, no socket.

Critical Files for Implementation

  • Cargo.toml (workspace root)
  • crates/hero_codescalers_kvs/src/store.rs
  • crates/hero_codescalers_server/src/main.rs
  • crates/hero_codescalers_server/src/model/state.rs
  • crates/hero_codescalers_server/openrpc.json
## Implementation Spec for Issue #1 — Implement Iroh KVS ### Objective Build a fully-replicated, node-local, eventually-consistent key/value store backed by `iroh-docs` (on top of `iroh`, `iroh-blobs`, `iroh-gossip`) as a new workspace crate `hero_codescalers_kvs`. Migrate all persistent state currently held in `hero_codescalers_server`'s SQLite DB (nodes, admins, groups, group-members, per-node stats) into the KVS, and rip out the existing "manual sync" machinery (the `sync_queue` table, `sync_worker`, JSON-RPC `*.apply*` methods, and the `proxy` forwarder). The namespace secret is shared out of band and grants write capability; every node signs with its own author keypair; last-write-wins (LWW) by `(timestamp, author_id)`; deletes are tombstones; anti-entropy + live gossip sync. ### Requirements - New workspace crate `crates/hero_codescalers_kvs` that compiles independently and is reusable by other Hero services. - `KvStore` high-level async API: `put(key, value) -> Result<()>`, `get(key) -> Result<Option<Vec<u8>>>`, `delete(key) -> Result<()>`, `list(prefix) -> Result<Vec<(String, Vec<u8>)>>`, `list_keys(prefix) -> Result<Vec<String>>`, `sync_once() -> Result<()>`, plus a `subscribe()` event stream and `shutdown()` method. - Backing: `iroh::Endpoint` + `iroh_blobs::store::fs::Store` + `iroh_gossip::net::Gossip` + `iroh_docs::protocol::Docs`; single namespace per store. Values are blobs; entries record `blob_hash + len + timestamp + author`. - Persistent default (file-backed redb + fs-blobs under `data_dir`); in-memory variant (`iroh_blobs::store::mem`, `iroh_docs::store::memory`) for tests. - Deterministic LWW: for each logical key, pick the entry with the highest `(timestamp_ms, author_id)` tuple. - Tombstones: explicit entries with a typed sentinel value (`KvValue::Tombstone { ts }` serialised via serde_json) so `get`/`list` skip them; a background reaper may prune tombstones older than a TTL. - Static seed peers list for bootstrap (node ids + direct addrs); periodic anti-entropy verifies every referenced blob exists locally and fetches missing ones; gossip provides live sync. - `hero_codescalers_server` is migrated to back its `nodes`, `admins`, `groups`, `group_members` data in the KVS instead of SQLite; sessions and users (read from `/etc/passwd` / `w`) stay as they are; jobs/logs (owned by hero_proc) stay as they are. - Full deletion of the existing manual sync path (see "Files to Modify/Create"). - Unit tests in the KVS crate and an integration test that spins up two in-memory nodes, puts/deletes on one, and asserts convergence on the other. ### Files to Modify/Create New crate `crates/hero_codescalers_kvs/`: - `crates/hero_codescalers_kvs/Cargo.toml` — package manifest; deps `iroh = "0.97"`, `iroh-blobs = "0.99"`, `iroh-gossip = "0.97"`, `iroh-docs = "0.97"`, plus workspace `tokio`, `serde`, `serde_json`, `anyhow`, `thiserror`, `tracing`, `parking_lot`, `chrono`, `futures`, `hex`, `rand`, `data-encoding`, `tempfile` (dev-dep). - `crates/hero_codescalers_kvs/src/lib.rs` — re-exports; crate docs. - `crates/hero_codescalers_kvs/src/config.rs` — `KvConfig` (namespace secret, author secret, data_dir, seed_peers, anti_entropy_interval, tombstone_ttl, persistence mode), plus a `KvConfigBuilder` and helpers to parse/encode namespace/author secrets. - `crates/hero_codescalers_kvs/src/value.rs` — `KvValue { Live { bytes }, Tombstone }` with serde_json envelope `{ v: 1, kind: "live"|"tombstone", ts_ms, data?: base64 }`. 
- `crates/hero_codescalers_kvs/src/store.rs` — `KvStore` struct; holds `Endpoint`, `Docs`, `Doc`, `Author`, `NamespaceSecret`, blob store handle, seed-peers list, shutdown handle, events broadcaster. Implements `new_persistent`, `new_memory`, `put`, `get`, `delete`, `list`, `list_keys`, `sync_once`, `subscribe`, `shutdown`. - `crates/hero_codescalers_kvs/src/lww.rs` — per-key entry reducer: given an iterator of doc entries for a logical key, pick the winner via `(ts_ms, author_id_bytes)`. - `crates/hero_codescalers_kvs/src/anti_entropy.rs` — background task: iterate all entries, verify blob locally, fetch missing from seed peers; optional tombstone reaper. - `crates/hero_codescalers_kvs/src/events.rs` — `KvEvent { Put { key }, Delete { key }, RemotePut { key, author }, ... }` over `tokio::sync::broadcast`. - `crates/hero_codescalers_kvs/src/keys.rs` — helpers for keyspace (bytes <-> UTF-8 strings, prefix encoding). - `crates/hero_codescalers_kvs/src/error.rs` — `KvError` enum with `thiserror`; `pub type Result<T> = std::result::Result<T, KvError>;` - `crates/hero_codescalers_kvs/tests/roundtrip.rs` — single-node put/get/delete/list tests on the in-memory variant. - `crates/hero_codescalers_kvs/tests/two_node_sync.rs` — two in-memory nodes, dial each other, assert LWW convergence on conflicting writes, assert tombstone wins over older put, assert `sync_once` fetches missing blobs. Workspace root: - `Cargo.toml` — add `crates/hero_codescalers_kvs` to `[workspace] members`; add `iroh`, `iroh-blobs`, `iroh-gossip`, `iroh-docs`, `thiserror`, `hex`, `data-encoding`, `rand`, `tempfile` under `[workspace.dependencies]`. - `Cargo.lock` — regenerated. Server migration (`crates/hero_codescalers_server/`): - `Cargo.toml` — add `hero_codescalers_kvs = { path = "../hero_codescalers_kvs" }`; remove `rusqlite` (kept only if jobs/logs still need it — they don't, so remove). - `src/main.rs` — replace `Db` with a new `KvState` facade; remove `sync_worker` module; remove `proxy` module; delete every `*.apply*`, `sync.status`, `sync.pending`, `remote.rpc` dispatch arm; remove `pending_syncs` from `stats`; drop the mycelium TCP listener (no longer needed since replication is over iroh); load KVS config from env/data_dir; call `KvStore::new_persistent` at startup and register self-node into KVS. - `src/model/mod.rs` — rewrite exports; the `Db` type disappears, replaced by `KvState` (or similar) which wraps `KvStore` and exposes the same high-level Rust methods (`node_list`, `admin_add`, `group_create`, etc.) but reads/writes through the KVS. - `src/model/db.rs` — **delete** and replace with `src/model/state.rs` (new) containing the `KvState` facade; migrations vanish; sync_* functions vanish. - `src/model/node.rs` — rewrite: `Node` struct stays (serde_json in/out of KVS); all methods reimplemented against `KvStore` (`kvs.put("nodes/<ipv6>", json)` / `kvs.delete(...)` / `kvs.list("nodes/")`). Remove all `sync_enqueue_rpc` calls and all `*_apply_remote` / `*_apply_stats` / `*_apply_delete` methods — LWW + replication is handled inside KVS. - `src/model/admin.rs` — same treatment: persist under `admins/<ipv6>`; drop apply_* methods. - `src/model/group.rs` — persist groups under `groups/<name>` and members under `group_members/<name>/<ipv6>`; drop apply_* methods. - `src/proxy.rs` — **delete**. 
- `src/sync_worker.rs` — **delete** (the periodic stats task moves into `main.rs` or a tiny new `stats_worker.rs`; it no longer enqueues syncs — it just calls `state.node_update_stats(...)` which writes to KVS and lets replication propagate). - `src/sessions.rs`, `src/users.rs` — unchanged. - `openrpc.json` — drop `sync.status`, `sync.pending`, `node.apply`, `node.apply_delete`, `node.apply_stats`, `admin.apply`, `admin.apply_delete`, `group.apply`, `group.apply_delete`, `group.apply_member`, `group.apply_member_delete`, `remote.rpc`; drop `pending_syncs` from the `stats` result schema. - `openrpc.client.generated.rs` — regenerated by the client macro; no manual edits. - `heroservice.json` — unchanged. UI (`crates/hero_codescalers_ui/`): - `templates/index.html` — remove the "sync pending" stat tile (lines 231 and 445). - `static/js/dashboard.js` — remove `stats-sync-pending` / `stat-sync-pending` wiring (lines 592, 886). Docs: - `README.md` — short note that replication is now via iroh-docs. ### Implementation Plan #### Step 1: Add workspace members and dependencies Files: `Cargo.toml` (root) - Add `crates/hero_codescalers_kvs` to `[workspace] members`. - Add `iroh = "0.97"`, `iroh-blobs = "0.99"`, `iroh-gossip = "0.97"`, `iroh-docs = "0.97"` to `[workspace.dependencies]`. - Add `thiserror`, `hex`, `data-encoding`, `rand`, `tempfile` to `[workspace.dependencies]` (tempfile as a dev-dep convention). - Do NOT remove `rusqlite` from workspace deps yet (kept until Step 7). Dependencies: none. #### Step 2: Scaffold `hero_codescalers_kvs` crate (types, config, errors, value envelope) Files: `crates/hero_codescalers_kvs/Cargo.toml`, `crates/hero_codescalers_kvs/src/lib.rs`, `crates/hero_codescalers_kvs/src/config.rs`, `crates/hero_codescalers_kvs/src/error.rs`, `crates/hero_codescalers_kvs/src/value.rs`, `crates/hero_codescalers_kvs/src/keys.rs`, `crates/hero_codescalers_kvs/src/events.rs` - Package manifest with iroh-ecosystem deps and workspace deps. - Error enum (`KvError::{Io, Iroh, Docs, Blobs, Serde, NotFound, Shutdown, ...}`). - `KvValue` envelope with `serde` impls; tombstones carry `ts_ms`; live values base64-encode bytes. - `KvConfig` struct + builder: `namespace_secret: NamespaceSecret`, `author_secret: Option<AuthorSecret>` (generated if `None`), `data_dir: PathBuf`, `seed_peers: Vec<NodeAddr>`, `anti_entropy_interval: Duration` (default 60s), `tombstone_ttl: Option<Duration>` (default 7d), `persistence: Persistence::{File, Memory}`. - Helpers to parse secrets from z-base-32 / hex strings. - `KvEvent` enum + broadcaster type alias. - Key helpers (validate UTF-8, prefix match). Dependencies: Step 1. #### Step 3: Implement persistent `KvStore` constructor + lifecycle Files: `crates/hero_codescalers_kvs/src/store.rs` - `KvStore::new_persistent(cfg: KvConfig) -> Result<Self>`: - Create `data_dir`, subdirs `blobs/`, `docs/`, `keys/`. - Load or generate node secret key (persisted at `keys/node.secret`). - Build `Endpoint::builder().secret_key(sk).bind()`. - Open `iroh_blobs::store::fs::Store` at `blobs/`. - Open `iroh_gossip::net::Gossip` bound to endpoint. - Open `iroh_docs::protocol::Docs` with `fs` store at `docs/`. - Register all three protocols on the endpoint via a `Router`. - Import the provided `namespace_secret` to get a writable `Doc` handle (create if not present). - Load or create author, call `docs.author_create` / `import`, store `author.id()`. - Add seed peers so sync can bootstrap. - Start live sync on the doc (`doc.start_sync(peers)`). 
  - Spawn the anti-entropy task and a shutdown oneshot.
- `KvStore::new_memory(cfg)` — same, but using `iroh_blobs::store::mem::MemStore` and `iroh_docs::store::memory::Store`; no on-disk persistence.
- `shutdown()` — cancel background tasks, close the doc, close the endpoint.

Dependencies: Step 2.

#### Step 4: Implement `put` / `get` / `delete` / `list` / `list_keys`

Files: `crates/hero_codescalers_kvs/src/store.rs`, `crates/hero_codescalers_kvs/src/lww.rs`

- `put(key, value)`:
  1. Build `KvValue::Live { ts_ms: now_ms(), data: value }`, serialise to JSON.
  2. `blobs.add_bytes(json_bytes)` -> hash.
  3. `doc.set_hash(author, key_bytes, hash, len)`.
  4. Emit `KvEvent::Put { key }`.
- `delete(key)`:
  1. Build `KvValue::Tombstone { ts_ms: now_ms() }`, serialise.
  2. Same pipeline as `put`.
  3. Emit `KvEvent::Delete`.
- `get(key)`:
  1. Query all entries for `key_bytes`.
  2. Collapse via `lww::pick_winner` (a minimal sketch of the reducer appears after Step 9 below).
  3. If the winner is `None` or a `Tombstone` -> `Ok(None)`.
  4. Else fetch the blob, deserialise the envelope, decode the base64 data, return it.
- `list(prefix)` and `list_keys(prefix)` — prefix-range scan + LWW + skip tombstones.
- `sync_once()` — call `doc.sync_with_peers(seed_peers.clone()).await`.

Dependencies: Step 3.

#### Step 5: Events subscription

Files: `crates/hero_codescalers_kvs/src/events.rs`, `crates/hero_codescalers_kvs/src/store.rs`

- `KvStore::subscribe() -> broadcast::Receiver<KvEvent>`.
- Hook into iroh-docs' own event stream, convert `LiveEvent::InsertRemote` / `InsertLocal` / `NeighborUp` / `NeighborDown` into `KvEvent`, and re-broadcast.
- Emit the local variant inside `put`/`delete`.

Dependencies: Step 4.

#### Step 6: Anti-entropy + tombstone reaper

Files: `crates/hero_codescalers_kvs/src/anti_entropy.rs`, `crates/hero_codescalers_kvs/src/store.rs`

- Spawn a `tokio::time::interval(anti_entropy_interval)` task.
- Each tick: iterate entries, verify each referenced blob locally; download missing blobs from seed peers via the iroh-blobs downloader; call `doc.sync_with_peers` as a safety belt.
- Optional tombstone reaper if `tombstone_ttl` is set.
- `CancellationToken` for shutdown.

Dependencies: Step 3, Step 4.

#### Step 7: Single-node unit tests

Files: `crates/hero_codescalers_kvs/tests/roundtrip.rs`

- `put`/`get`, overwrite, delete, list, list_keys — all on `new_memory`.

Dependencies: Step 4.

#### Step 8: Two-node integration test

Files: `crates/hero_codescalers_kvs/tests/two_node_sync.rs`

- Two `new_memory` nodes with mutual seeds; assert LWW, tombstone semantics, and offline->online sync via `sync_once`.

Dependencies: Step 4, 5, 6.

#### Step 9: Server state facade (`KvState`) built on top of `KvStore`

Files: `crates/hero_codescalers_server/src/model/state.rs` (new), `crates/hero_codescalers_server/src/model/mod.rs`, `crates/hero_codescalers_server/src/model/node.rs`, `crates/hero_codescalers_server/src/model/admin.rs`, `crates/hero_codescalers_server/src/model/group.rs`, `crates/hero_codescalers_server/Cargo.toml`

- Add the `hero_codescalers_kvs` path dep; remove `rusqlite`.
- New `KvState { kvs: Arc<KvStore>, self_ipv6: String }`.
- Keyspace: `nodes/<ipv6>`, `admins/<ipv6>`, `groups/<name>`, `group_members/<group>/<ipv6>`.
- Rewrite the node/admin/group modules — same public method signatures, implemented against `KvStore`. Drop all `*_apply_*` methods.
- Delete `model/db.rs`.

Dependencies: Steps 2–6.
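
The reducer at the heart of Step 4 is small enough to sketch. The snippet below is a minimal illustration, assuming a crate-local `KeyEntry` projection instead of the real iroh-docs entry type (whose timestamp/author accessors may differ); only the deterministic ordering and the tombstone collapse are the point.

```rust
// Sketch of the Step 4 LWW reducer. `KeyEntry` is a hypothetical, crate-local
// projection of an iroh-docs entry for one logical key; the real code would
// read the timestamp and author id from the iroh-docs entry type instead.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct KeyEntry {
    pub ts_ms: u64,
    pub author_id: [u8; 32],
    pub is_tombstone: bool,
}

/// Highest timestamp wins; ties break deterministically on author id bytes.
pub fn pick_winner<I: IntoIterator<Item = KeyEntry>>(entries: I) -> Option<KeyEntry> {
    entries
        .into_iter()
        .max_by(|a, b| (a.ts_ms, a.author_id).cmp(&(b.ts_ms, b.author_id)))
}

/// `get`-style collapse: no winner, or a tombstone winner, reads as absent.
pub fn visible(entries: Vec<KeyEntry>) -> Option<KeyEntry> {
    pick_winner(entries).filter(|e| !e.is_tombstone)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn newer_tombstone_hides_older_put() {
        let put = KeyEntry { ts_ms: 1, author_id: [1; 32], is_tombstone: false };
        let del = KeyEntry { ts_ms: 2, author_id: [2; 32], is_tombstone: true };
        assert!(visible(vec![put, del]).is_none());
    }
}
```
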
#### Step 10: Gut the manual sync machinery in the server

Files: `crates/hero_codescalers_server/src/main.rs`, `crates/hero_codescalers_server/src/sync_worker.rs`, `crates/hero_codescalers_server/src/proxy.rs`, `crates/hero_codescalers_server/openrpc.json`

- Delete `sync_worker.rs` and `proxy.rs`.
- `main.rs`: remove the mycelium TCP listener, remove the mod decls, add `KvConfig` construction from env vars, replace `Db::open` with `KvState::open`, drop all `*.apply*` / `sync.*` / `remote.rpc` dispatch arms, drop `pending_syncs` from `stats`, replace `sync_worker::spawn` with a small stats-only worker.
- `openrpc.json`: remove the listed methods and the `pending_syncs` field.

Dependencies: Step 9.

#### Step 11: UI cleanup

Files: `crates/hero_codescalers_ui/templates/index.html`, `crates/hero_codescalers_ui/static/js/dashboard.js`

- Remove the sync-pending tile + wiring.

Dependencies: Step 10.

#### Step 12: End-to-end validation + docs

Files: `README.md`

- Short note on iroh-docs replication and the required env vars.
- Run fmt/clippy/tests.

Dependencies: all prior steps.

#### Parallelisation notes

- Steps 2–6 are sequential within the KVS crate.
- Steps 7 and 8 can parallelise with Step 9.
- Steps 10 and 11 can parallelise.

### Acceptance Criteria

- [ ] `cargo build --workspace` succeeds on a clean checkout.
- [ ] `cargo test -p hero_codescalers_kvs` passes, including the two-node integration test.
- [ ] `cargo test --workspace --lib` passes.
- [ ] `cargo clippy --workspace --all-targets -- -D warnings` passes.
- [ ] `crates/hero_codescalers_server/src/sync_worker.rs` and `crates/hero_codescalers_server/src/proxy.rs` no longer exist.
- [ ] `crates/hero_codescalers_server/src/model/db.rs` no longer exists; `KvState` is the only state facade.
- [ ] `grep -r sync_queue crates/`, `grep -r sync_enqueue_rpc crates/`, `grep -r node_apply crates/`, `grep -r admin_apply crates/`, `grep -r group_apply crates/`, and `grep -r rusqlite crates/` all return no matches.
- [ ] `openrpc.json` contains no methods under `sync.*`, no `*.apply*`, and no `remote.rpc`; the `stats` schema has no `pending_syncs`.
- [ ] Starting the server with a namespace secret env var produces a persistent iroh data_dir containing a `blobs/` dir and a `docs/` redb file.
- [ ] Creating a node/admin/group on one server node becomes visible on another node sharing the same namespace secret within 10 seconds, with no manual sync step.
- [ ] Killing and restarting a node preserves all state; entries are re-synced from peers on restart.
- [ ] The UI dashboard no longer shows a "sync pending" tile, and no console errors appear.

### Notes

- **iroh 0.97 API churn.** The iroh ecosystem moves fast. Implementers must consult each crate's 0.97/0.99 changelog before wiring protocols — do not copy patterns from older iroh examples verbatim.
- **Namespace secret is a shared write credential.** Anyone with it can write. Distribute it out of band (e.g., via the existing Hero secret store or an env var set by deployment tooling).
- **Author secret should be per-node and persistent.** Store it under `data_dir/kvs/keys/author.secret` with 0600 perms.
- **Mycelium listener removal.** The server currently binds `[mycelium_ipv6]:9955` to receive sync RPC. That binding is no longer needed. Keep mycelium address detection only because `self_ipv6` remains the stable identifier used in keys.
- **Stats churn.** `node_update_stats` runs every 30s and now writes into the KVS on each tick. Fine for a few nodes; flag it if the cluster grows.
- **Tombstone semantics.** iroh-docs has its own delete, but LWW semantics across deletes and re-creates are easier to reason about with typed tombstone envelopes (a sketch of such an envelope follows the critical-files list below). The reaper prunes old tombstones.
- **`rpc-openrpc` client.** `hero_codescalers_sdk` regenerates its client from `openrpc.json` via a proc macro; no manual edits required.
- **Do not write `hero_codescalers_kvs` as an OpenRPC service.** It is a plain Rust library crate; no heroservice.json, no socket.

### Critical Files for Implementation

- `Cargo.toml` (workspace root)
- `crates/hero_codescalers_kvs/src/store.rs`
- `crates/hero_codescalers_server/src/main.rs`
- `crates/hero_codescalers_server/src/model/state.rs`
- `crates/hero_codescalers_server/openrpc.json`
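
As a concrete illustration of the tombstone note, here is a minimal sketch of the typed envelope. It assumes `serde`/`serde_json` plus the `data-encoding` crate from the workspace deps for the base64 payload; the variant names and `kind` tag follow the spec above, while the exact field names and JSON layout are illustrative only.

```rust
// Sketch of the typed tombstone envelope, assuming serde + data-encoding.
// Variant and field names follow the spec (`KvValue::{Live, Tombstone}`,
// `ts_ms`, `kind` tag); the exact JSON layout here is illustrative only.
use serde::{Deserialize, Serialize};

#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(tag = "kind", rename_all = "snake_case")]
pub enum KvValue {
    /// Live value: wall-clock timestamp plus base64-encoded payload bytes.
    Live { ts_ms: u64, data_b64: String },
    /// Delete marker: carries only the timestamp so LWW can order it.
    Tombstone { ts_ms: u64 },
}

impl KvValue {
    pub fn live(ts_ms: u64, bytes: &[u8]) -> Self {
        KvValue::Live { ts_ms, data_b64: data_encoding::BASE64.encode(bytes) }
    }

    /// Timestamp used by the LWW reducer, identical for both variants.
    pub fn ts_ms(&self) -> u64 {
        match self {
            KvValue::Live { ts_ms, .. } | KvValue::Tombstone { ts_ms } => *ts_ms,
        }
    }

    /// Decoded payload bytes, or `None` for tombstones (and bad base64).
    pub fn payload(&self) -> Option<Vec<u8>> {
        match self {
            KvValue::Live { data_b64, .. } => {
                data_encoding::BASE64.decode(data_b64.as_bytes()).ok()
            }
            KvValue::Tombstone { .. } => None,
        }
    }
}
```
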
Author
Owner

## Test Results

All tests passing across the new KVS crate and the updated server.

### `cargo test --workspace`

| Suite | Tests | Passed | Failed |
|---|---|---|---|
| `hero_codescalers_kvs` — `roundtrip.rs` (single-node memory CRUD) | 5 | 5 | 0 |
| `hero_codescalers_kvs` — `two_node_sync.rs` (two-node convergence) | 3 | 3 | 0 |
| workspace unit tests | 0 | — | — |
| doc-tests (`hero_codescalers_sdk`, `nu_exec`) | 3 | 3 | 0 |

Single-node tests exercise put/get/overwrite/delete semantics and LWW ordering against the in-memory blob + docs store. The two-node tests bring up two full KvStore instances in the same process, wire them bidirectionally, and verify that writes, tombstones, and LWW-resolved overwrites propagate through the iroh-docs replica within 30 seconds.
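
The "converge within 30 seconds" assertions reduce to a poll-until-deadline loop. The sketch below shows the pattern; it assumes `KvStore::get(&str)` returns `Result<Option<Vec<u8>>, KvError>` as described in the issue body, so treat it as an outline rather than the exact test helper.

```rust
// Sketch of the convergence check used in the two-node tests: poll a read
// until it matches the expected bytes or the 30-second deadline expires.
use std::time::Duration;
use tokio::time::{sleep, timeout};

pub async fn wait_for_value(
    kvs: &hero_codescalers_kvs::KvStore, // crate under construction; API assumed
    key: &str,
    expected: &[u8],
) -> bool {
    timeout(Duration::from_secs(30), async {
        loop {
            if let Ok(Some(v)) = kvs.get(key).await {
                if v == expected {
                    return; // converged
                }
            }
            sleep(Duration::from_millis(200)).await;
        }
    })
    .await
    .is_ok()
}
```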

### Multi-instance sync script (`make test-sync`)

The new scripts/test-kvs-sync.sh spawns N full hero_codescalers_server processes, each with its own UDS, data directory, and node secret, sharing only a generated namespace secret. All communication happens over the UDS OpenRPC interface exactly as a real admin client would use it.

Scenarios exercised (NODES=2, SETTLE_MS=2000, TIMEOUT=30):

  1. kv.put on node 1 propagates to node 2 via gossip/sync
  2. kv.put on node 2 overwrites node 1's value under LWW
  3. kv.delete on node 1 produces a tombstone that node 2 observes
  4. Three keys written under users/ and groups/ prefixes, kv.list_keys on node 2 returns only the two users/ keys in sorted order

Output of a clean run:

[cfg] namespace secret: 45df6e2d...422fe063
[spawn] node 1: sock=.../node1/sockets/hero_codescalers/rpc.sock
[spawn] node 2: sock=.../node2/sockets/hero_codescalers/rpc.sock
[ready] node 1 healthy
[ready] node 2 healthy
[info] node 1 endpoint: endpointadsarbce...
[info] node 2 endpoint: endpointabgaukkx...
[wire] node 1 → node 2: {"id":1,"jsonrpc":"2.0","result":{"ok":true}}
[wire] node 2 → node 1: {"id":1,"jsonrpc":"2.0","result":{"ok":true}}
[wire] all nodes cross-seeded
[wait] settling for 2.0s so gossip neighbors are established

[test 1] put key=greeting on node 1 → expect all others to converge
  node 2 saw aGVsbG8tZnJvbS1ub2RlMQ==

[test 2] put key=greeting on node 2 → expect node 1 to converge (LWW)
  node 1 saw b3ZlcndyaXR0ZW4tYnktbGFzdC1ub2Rl

[test 3] delete key=greeting on node 1 → expect node 2 to see null
  node 2 saw tombstone

[test 4] multi-key prefix list
  node 2 list_keys users/ = users/alice,users/bob

[PASS] all convergence checks succeeded across 2 instances

The script is wired into the Makefile as make test-sync and respects NODES=, TIMEOUT=, and SETTLE_MS= environment overrides.

Author
Owner

## Implementation Summary

The issue is fully implemented. All server state now lives in a fully-replicated Iroh-backed KVS; the previous manual TCP/mycelium sync path has been removed.

### New crate: `hero_codescalers_kvs`

A higher-level library over iroh, iroh-docs, iroh-blobs, and iroh-gossip (all at 0.97/0.99). Exposes a KvStore with put, get, delete, list, list_keys, sync_once, start_sync_with, subscribe, and shutdown.

Files added:

  • crates/hero_codescalers_kvs/Cargo.toml
  • crates/hero_codescalers_kvs/src/lib.rs — public re-exports
  • crates/hero_codescalers_kvs/src/config.rs — KvConfig builder, namespace/author secret helpers, Persistence::{Memory, File}
  • crates/hero_codescalers_kvs/src/error.rs — KvError / Result aliases
  • crates/hero_codescalers_kvs/src/events.rs — KvEvent broadcast channel (PutLocal/DeleteLocal/PutRemote/DeleteRemote/SyncStarted/SyncFinished); sketched just after this list
  • crates/hero_codescalers_kvs/src/keys.rs — key-namespace helpers
  • crates/hero_codescalers_kvs/src/value.rs — KvValue::{Live, Tombstone} JSON envelope with base64-encoded payloads and wall-clock timestamps
  • crates/hero_codescalers_kvs/src/lww.rs — deterministic LWW reducer (timestamp, author_id_bytes)
  • crates/hero_codescalers_kvs/src/anti_entropy.rs — periodic reconcile task backed by tokio_util::CancellationToken
  • crates/hero_codescalers_kvs/src/store.rs — KvStore — wires Endpoint (N0 preset) + FsStore / MemStore blobs + Gossip + Docs + ALPN-dispatched Router, opens or imports the namespace, loads or creates an author, emits live events, starts initial sync, and exposes the public API

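A minimal sketch of the KvEvent surface from events.rs (referenced above): the variant names follow the summary, while the payload fields and the channel capacity are assumptions.

```rust
// Sketch of the events.rs surface; variant names come from the summary above,
// payload fields and capacity are illustrative assumptions.
use tokio::sync::broadcast;

#[derive(Debug, Clone)]
pub enum KvEvent {
    PutLocal { key: String },
    DeleteLocal { key: String },
    PutRemote { key: String, author: String },
    DeleteRemote { key: String, author: String },
    SyncStarted,
    SyncFinished,
}

/// What `KvStore::subscribe()` hands out: each caller gets its own receiver.
pub type KvEvents = broadcast::Receiver<KvEvent>;

fn _usage_example() {
    // Slow subscribers observe `RecvError::Lagged` rather than losing the channel.
    let (tx, mut rx) = broadcast::channel::<KvEvent>(128);
    let _ = tx.send(KvEvent::PutLocal { key: "nodes/2001:db8::1".into() });
    let _ = rx.try_recv();
}
```
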
Tests added:

  • crates/hero_codescalers_kvs/tests/roundtrip.rs — 5 single-node CRUD tests
  • crates/hero_codescalers_kvs/tests/two_node_sync.rs — 3 two-node convergence tests (put propagation, LWW overwrite, tombstone win)

### Server integration: `hero_codescalers_server`

  • src/model/state.rs (new): KvState — wraps Arc<KvStore> and the local IPv6, provides async CRUD helpers over the nodes/, admins/, groups/, and group_members/ prefixes
  • src/model/{node,admin,group}.rs: rewritten to async methods on KvState; all data now flows through the KVS
  • src/model/mod.rs: re-exports for the new state module, drops db
  • src/main.rs: reads HERO_CODESCALERS_KVS_{NAMESPACE_SECRET,AUTHOR_SECRET,SEEDS,DATA_DIR} (env parsing sketched after this list), constructs a single KvStore, drops the mycelium TCP listener, replaces sync_worker with an inline 30-second stats task, and routes 8 new OpenRPC methods through a UDS-only admin gate:
    • kv.put, kv.get, kv.delete, kv.list, kv.list_keys, kv.info, kv.peer_add, kv.sync_once
  • openrpc.json + openrpc.client.generated.rs: extended with the 8 new methods, param schemas, and result envelopes
  • Removed: src/model/db.rs, src/proxy.rs, src/sync_worker.rs, and the legacy rusqlite dependency

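For orientation, reading the HERO_CODESCALERS_KVS_* variables in main.rs can be as simple as the sketch below; the parsing and defaults shown here are assumptions, not the shipped code.

```rust
// Sketch of reading the HERO_CODESCALERS_KVS_* variables listed above;
// the defaults and error handling here are assumptions, not the actual code.
use std::{env, path::PathBuf};

pub struct KvsEnv {
    pub namespace_secret: String,
    pub author_secret: Option<String>,
    pub seeds: Vec<String>,
    pub data_dir: PathBuf,
}

pub fn kvs_env() -> Result<KvsEnv, String> {
    let namespace_secret = env::var("HERO_CODESCALERS_KVS_NAMESPACE_SECRET")
        .map_err(|_| "HERO_CODESCALERS_KVS_NAMESPACE_SECRET is required".to_string())?;
    let author_secret = env::var("HERO_CODESCALERS_KVS_AUTHOR_SECRET").ok();
    // Comma-separated seed peer addresses; empty or unset means "no seeds yet".
    let seeds: Vec<String> = env::var("HERO_CODESCALERS_KVS_SEEDS")
        .map(|s| {
            s.split(',')
                .map(str::trim)
                .filter(|p| !p.is_empty())
                .map(String::from)
                .collect()
        })
        .unwrap_or_default();
    // Assumed fallback directory for illustration only.
    let data_dir = env::var("HERO_CODESCALERS_KVS_DATA_DIR")
        .map(PathBuf::from)
        .unwrap_or_else(|_| PathBuf::from("data"));
    Ok(KvsEnv { namespace_secret, author_secret, seeds, data_dir })
}
```
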
### Admin model over UDS

require_admin treats any caller with caller_ip == None (i.e. UDS) as an admin, consistent with the existing OpenRPC-over-UDS design. Only HTTP/TCP callers have to pass the IP-based admin check.
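
A minimal sketch of that gate, with the existing IP-based check stood in by a closure (the names here are illustrative, not the server's actual signatures):

```rust
// Sketch of the UDS admin gate described above: a caller with no IP (UDS) is
// implicitly admin; any TCP/HTTP caller must pass the IP-based admin check,
// represented here by the `is_admin_ip` closure.
use std::net::IpAddr;

pub fn require_admin(caller_ip: Option<IpAddr>, is_admin_ip: impl Fn(IpAddr) -> bool) -> bool {
    match caller_ip {
        None => true,                // UDS caller: trusted as admin
        Some(ip) => is_admin_ip(ip), // remote caller: must be on the admin list
    }
}
```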

### UI

crates/hero_codescalers_ui/templates/index.html and static/js/dashboard.js no longer reference the obsolete stats.sync_pending / stat-sync-pending fields that the old sync worker populated.

### Tooling

  • Makefile: added test-kvs (crate tests) and test-sync (multi-instance script) targets
  • scripts/test-kvs-sync.sh: spawns N server instances, cross-wires them over the UDS OpenRPC interface with kv.peer_add, and asserts propagation, LWW, tombstones, and prefix listing within a bounded timeout

### Test Results

All tests pass:

running 5 tests — roundtrip.rs           ... 5 passed, 0 failed
running 3 tests — two_node_sync.rs       ... 3 passed, 0 failed
running 3 tests — doc-tests              ... 3 passed, 0 failed
make test-sync (NODES=2)                 ... 4 scenarios, all passed

### Notes

  • Namespace secret (32-byte hex) is the shared write capability; each node has its own author key.
  • KvStore returns entries through pick_winner so clients always see LWW-resolved reads.
  • Anti-entropy interval defaults to 30 s; callers can also request an immediate round through kv.sync_once.
  • EndpointAddr from iroh 0.97 replaces the old NodeAddr alias — both names are re-exported from the crate for downstream ergonomics.
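
As an illustration of the last point, the crate-root re-export amounts to the two lines below, assuming `iroh::EndpointAddr` is exported as the note states:

```rust
// Hypothetical lib.rs re-export: keep both names usable downstream.
pub use iroh::EndpointAddr;
pub use iroh::EndpointAddr as NodeAddr;
```

Downstream crates can keep importing `NodeAddr` while they migrate to the new name.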