Per-caller rate limit for typing.relay (A3 — WS refactor follow-up) #14
lhumina_code/hero_collab#14
Follow-up to #13. Closes gap A3 from the post-refactor architectural assessment.
Problem: the refactor exempted `typing.relay` from the global rate-limit cap (60/min) so that legitimate typing doesn't starve user actions. The exemption is total: a client sending 1000 `typing.start`/sec would fan out to every channel member, amplifying by channel size. The refactor's belt-and-suspenders `user_id` override prevents spoofing but does nothing about volume.

Solution: add a third `TokenBucket` to `PerCallerLimits` for typing specifically, capped at 60/min per caller. It bypasses the global bucket (so heavy typing is tolerated) but has its own ceiling. Fails soft via the existing `RpcError::RateLimited` path.

Docs: `plan/feature-ws-typing-ratelimit.md`, `plan/impl-ws-typing-ratelimit.md`

Size: ~60 lines production + ~80 lines tests. 2 tasks.
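A minimal sketch of the proposed shape, for orientation only. The type and method names (`TokenBucket`, `PerCallerLimits`, `RateLimiter::new`, `new_with_typing`, `check_method`, `RpcError::RateLimited`) come from this issue; everything else is an assumption: the field layout, the linear refill, the injected `now` parameter, the caller key being a `String`, and the `"message.send"` method name used to exercise the `send` bucket are all hypothetical and not taken from the real crate.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical token bucket: holds up to `capacity` tokens and refills
/// linearly at `capacity` tokens per minute.
#[derive(Debug)]
pub struct TokenBucket {
    capacity: f64,
    tokens: f64,
    last_refill: Instant,
}

impl TokenBucket {
    pub fn new(per_minute: u32, now: Instant) -> Self {
        let capacity = f64::from(per_minute);
        Self { capacity, tokens: capacity, last_refill: now }
    }

    /// Refill for the time elapsed since the last call, then try to spend one token.
    pub fn try_take(&mut self, now: Instant) -> bool {
        let elapsed = now.saturating_duration_since(self.last_refill).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.capacity / 60.0).min(self.capacity);
        self.last_refill = now;
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum RpcError {
    RateLimited,
}

/// Three independent buckets per caller (assumed layout).
pub struct PerCallerLimits {
    global: TokenBucket,
    send: TokenBucket,
    typing: TokenBucket,
}

pub struct RateLimiter {
    global_per_min: u32,
    send_per_min: u32,
    typing_per_min: u32,
    callers: HashMap<String, PerCallerLimits>,
}

impl RateLimiter {
    /// Two-argument constructor: typing defaults to 60/min.
    pub fn new(global: u32, send: u32) -> Self {
        Self::new_with_typing(global, send, 60)
    }

    pub fn new_with_typing(global: u32, send: u32, typing: u32) -> Self {
        Self {
            global_per_min: global,
            send_per_min: send,
            typing_per_min: typing,
            callers: HashMap::new(),
        }
    }

    pub fn check_method(&mut self, caller: &str, method: &str, now: Instant) -> Result<(), RpcError> {
        let (g, s, t) = (self.global_per_min, self.send_per_min, self.typing_per_min);
        let limits = self.callers.entry(caller.to_owned()).or_insert_with(|| PerCallerLimits {
            global: TokenBucket::new(g, now),
            send: TokenBucket::new(s, now),
            typing: TokenBucket::new(t, now),
        });
        // Dedicated branch: typing.relay is decided by its own bucket and
        // returns before the global bucket is touched.
        if method == "typing.relay" {
            return if limits.typing.try_take(now) { Ok(()) } else { Err(RpcError::RateLimited) };
        }
        if !limits.global.try_take(now) {
            return Err(RpcError::RateLimited);
        }
        // "message.send" is a placeholder name; the issue does not say which
        // methods the `send` bucket gates.
        if method == "message.send" && !limits.send.try_take(now) {
            return Err(RpcError::RateLimited);
        }
        Ok(())
    }
}
```

In this sketch, draining the typing bucket for one caller leaves that caller's global budget and every other caller's typing budget untouched, which is the isolation the proposal is after.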
Branch: lands on `feat/ws-refactor` (same PR as the refactor). It closes a vulnerability the refactor opened, so the two belong together.

Implementation landed
Two commits on `feat/ws-refactor`:

`ec91bda` feat(server): per-caller typing.relay rate limit (60/min default)

Extends `PerCallerLimits` with a third `TokenBucket` (`typing`) alongside the existing `global` and `send`. New `RateLimiter::new_with_typing(global, send, typing)` constructor; `new(global, send)` still works and defaults typing to 60. `check_method` routes `"typing.relay"` through a dedicated branch that returns `Ok(())` BEFORE touching the global bucket (confirmed by reading the control flow). `typing.relay` removed from the `with_exempt([...])` call in `main.rs` because the new branch handles it. 3 new unit tests.

`f66e952` test(server): typing.relay floods get rejected after bucket drains

Integration test fires 100 rapid `typing.relay` calls; asserts 50–70 successes + ≥30 rejections with JSON-RPC error code matching `RpcError::RateLimited`.

Plan corrections caught during implementation
The spec assumed the `RateLimited` JSON-RPC error code was `-32029` (generic JSON-RPC range). The actual code is `-32005` (per `crates/hero_collab_server/src/rpc_error.rs`). Similarly, the spec assumed `raw_rpc` returned a JSON-RPC envelope with `result`/`error` keys; it actually returns `Result<Value, OpenRpcError>`. Both fixed in `b6bb40d` so future executors don't repeat the stumble.

Architectural notes
The `exempt_methods` allowlist mechanism stays registered, but after removing `typing.relay` from it, it currently lists zero methods. Kept as future-proofing for any genuine "bypass everything" use case (none currently planned).

Closes the amplification vector introduced by the ws-refactor (events.sock × channel-size fanout per `typing.relay` call).

Tests impact
`cargo test -p hero_collab_server`: unit 56, integration 63 (was 62 pre-A3, +1 new flood test). Zero regressions.
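The flood test's acceptance band (50–70 successes and at least 30 rejections out of 100 calls) can be sanity-checked with a small model of a 60/min bucket. This is a sketch under assumptions, not the actual test: `simulate_flood` and `in_asserted_band` are hypothetical helpers, the bucket is assumed to start full and refill linearly, and the issue does not state why a band was chosen over an exact count. One plausible reading is that a band tolerates the few tokens that refill while the test runs.

```rust
/// One token expressed in integer units so the refill math stays exact:
/// at `per_minute` tokens per 60_000 ms, one millisecond refills `per_minute` units.
const TOKEN_UNITS: u64 = 60_000;

/// Fire `calls` requests spaced `spacing_ms` apart at a bucket that starts
/// full with `per_minute` tokens. Returns (accepted, rejected).
pub fn simulate_flood(calls: u32, per_minute: u64, spacing_ms: u64) -> (u32, u32) {
    let capacity = per_minute * TOKEN_UNITS;
    let mut level = capacity; // assumed: bucket starts full
    let (mut accepted, mut rejected) = (0u32, 0u32);
    for i in 0..calls {
        if i > 0 {
            // Linear refill for the gap since the previous call, capped at capacity.
            level = (level + spacing_ms * per_minute).min(capacity);
        }
        if level >= TOKEN_UNITS {
            level -= TOKEN_UNITS;
            accepted += 1;
        } else {
            rejected += 1;
        }
    }
    (accepted, rejected)
}

/// The band asserted by the integration test as described in this issue.
pub fn in_asserted_band(accepted: u32, rejected: u32) -> bool {
    (50..=70).contains(&accepted) && rejected >= 30
}
```

With zero spacing the model accepts exactly 60 calls and rejects 40. Slower call rates let a few extra tokens refill, and once calls are a full second apart nothing is rejected, so the band only holds when the 100 calls really are rapid.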