Per-caller rate limit for typing.relay (A3 — WS refactor follow-up) #14

Closed
opened 2026-04-23 19:40:53 +00:00 by sameh-farouk · 1 comment
Member

Follow-up to #13. Closes gap A3 from the post-refactor architectural assessment.

Problem: the refactor exempted typing.relay from the global rate-limit cap (60/min) so legitimate typing doesn't starve user actions. The exemption is total — a client sending 1000 typing.start/sec would fan out to every channel member, amplifying by channel size. The refactor's belt-and-suspenders user_id override prevents spoofing but does nothing about volume.

Solution: add a third TokenBucket to PerCallerLimits for typing specifically. 60/min per caller — bypasses global (so heavy typing is tolerated) but has its own ceiling. Fails soft via the existing RpcError::RateLimited path.

Docs:

Size: ~60 lines production + ~80 lines tests. 2 tasks.

Branch: lands on feat/ws-refactor (same PR as the refactor) — closes a vulnerability the refactor opened, so the two belong together.

Follow-up to #13. Closes gap A3 from the post-refactor architectural assessment. **Problem:** the refactor exempted `typing.relay` from the global rate-limit cap (60/min) so legitimate typing doesn't starve user actions. The exemption is total — a client sending 1000 `typing.start`/sec would fan out to every channel member, amplifying by channel size. The refactor's belt-and-suspenders `user_id` override prevents spoofing but does nothing about volume. **Solution:** add a third `TokenBucket` to `PerCallerLimits` for typing specifically. 60/min per caller — bypasses global (so heavy typing is tolerated) but has its own ceiling. Fails soft via the existing `RpcError::RateLimited` path. **Docs:** - Spec: [`plan/feature-ws-typing-ratelimit.md`](../src/branch/feat/ws-refactor/plan/feature-ws-typing-ratelimit.md) - Impl plan: [`plan/impl-ws-typing-ratelimit.md`](../src/branch/feat/ws-refactor/plan/impl-ws-typing-ratelimit.md) **Size:** ~60 lines production + ~80 lines tests. 2 tasks. **Branch:** lands on `feat/ws-refactor` (same PR as the refactor) — closes a vulnerability the refactor opened, so the two belong together.
Author
Member

Implementation landed

Two commits on feat/ws-refactor:

  • ec91bda feat(server): per-caller typing.relay rate limit (60/min default)
    Extends PerCallerLimits with a third TokenBucket (typing) alongside the existing global and send. New RateLimiter::new_with_typing(global, send, typing) constructor; new(global, send) still works and defaults typing to 60. check_method routes "typing.relay" through a dedicated branch that returns Ok(()) BEFORE touching the global bucket — confirmed by reading the control flow. typing.relay removed from the with_exempt([...]) call in main.rs because the new branch handles it. 3 new unit tests.

  • f66e952 test(server): typing.relay floods get rejected after bucket drains
    Integration test fires 100 rapid typing.relay calls; asserts 50–70 successes + ≥30 rejections with JSON-RPC error code matching RpcError::RateLimited.

Plan corrections caught during implementation

The spec assumed the RateLimited JSON-RPC error code was -32029 (generic JSON-RPC range). Actual code is -32005 (per crates/hero_collab_server/src/rpc_error.rs). Similarly the spec assumed raw_rpc returned a JSON-RPC envelope with result/error keys; it actually returns Result<Value, OpenRpcError>. Both fixed in b6bb40d so future executors don't repeat the stumble.

Architectural notes

  • exempt_methods allowlist mechanism stays registered, but after removing typing.relay from it, it currently lists zero methods. Kept as future-proofing for any genuine "bypass everything" use case (none currently planned).
  • Pattern precedent: future high-frequency relay methods should get their own bucket (like typing) rather than go on the exempt list. The exempt list is for unlimited things; buckets are for rate-limited things.
  • Memory footprint per caller went from 120 bytes (2 buckets) to 180 bytes (3 buckets). At 10k concurrent callers that's 1.8 MB — trivial.

Resolves against the amplification vector introduced by the ws-refactor (events.sock × channel-size fanout per typing.relay call).

Tests impact

cargo test -p hero_collab_server: unit 56, integration 63 (was 62 in pre-A3, +1 new flood test). Zero regressions.

## Implementation landed Two commits on `feat/ws-refactor`: - [`ec91bda`](../commit/ec91bda) `feat(server): per-caller typing.relay rate limit (60/min default)` Extends `PerCallerLimits` with a third `TokenBucket` (`typing`) alongside the existing `global` and `send`. New `RateLimiter::new_with_typing(global, send, typing)` constructor; `new(global, send)` still works and defaults typing to 60. `check_method` routes `"typing.relay"` through a dedicated branch that returns `Ok(())` BEFORE touching the global bucket — confirmed by reading the control flow. `typing.relay` removed from the `with_exempt([...])` call in `main.rs` because the new branch handles it. 3 new unit tests. - [`f66e952`](../commit/f66e952) `test(server): typing.relay floods get rejected after bucket drains` Integration test fires 100 rapid `typing.relay` calls; asserts 50–70 successes + ≥30 rejections with JSON-RPC error code matching `RpcError::RateLimited`. ## Plan corrections caught during implementation The spec assumed the `RateLimited` JSON-RPC error code was `-32029` (generic JSON-RPC range). Actual code is **`-32005`** (per `crates/hero_collab_server/src/rpc_error.rs`). Similarly the spec assumed `raw_rpc` returned a JSON-RPC envelope with `result`/`error` keys; it actually returns `Result<Value, OpenRpcError>`. Both fixed in [`b6bb40d`](../commit/b6bb40d) so future executors don't repeat the stumble. ## Architectural notes - `exempt_methods` allowlist mechanism stays registered, but after removing `typing.relay` from it, it currently lists zero methods. Kept as future-proofing for any genuine "bypass everything" use case (none currently planned). - **Pattern precedent:** future high-frequency relay methods should get their own bucket (like typing) rather than go on the exempt list. The exempt list is for unlimited things; buckets are for rate-limited things. - Memory footprint per caller went from 120 bytes (2 buckets) to 180 bytes (3 buckets). At 10k concurrent callers that's 1.8 MB — trivial. Resolves against the amplification vector introduced by the ws-refactor (events.sock × channel-size fanout per `typing.relay` call). ## Tests impact `cargo test -p hero_collab_server`: unit 56, integration 63 (was 62 in pre-A3, +1 new flood test). Zero regressions.
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
lhumina_code/hero_collab#14
No description provided.