osis dispatch flattens typed handler errors → all surface as -32603 "Redis operation error" (validation errors should be -32602) #83
Labels
No labels
prio_critical
prio_low
type_bug
type_contact
type_issue
type_lead
type_question
type_story
type_task
No milestone
No project
No assignees
2 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
lhumina_code/hero_rpc#83
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
Every business/validation error returned by an OSIS service handler surfaces to JSON-RPC clients as
-32603 Internal errorwith a misleadingdatastring prefixedRedis operation error:, regardless of the actual cause. Typed error variants (e.g. a service'sInvalidInputvsInternal) are discarded by the generated dispatch glue, so clients can't distinguish a caller mistake (should be-32602 Invalid params) from a real server fault (-32603).Seen across every osis-based service (hero_db, hero_proc, hero_compute, …). Concrete example from
hero_compute'sComputeService.deploy_vm, which returnsComputeServiceError::InvalidInput("insufficient capacity …"):The service did the right thing (typed
InvalidInput), but the wire response saysInternal error/Redis operation error, neither of which is true.Mechanism (where the type info is lost)
ComputeServiceError::InvalidInput(msg)(Display:"Invalid input: <msg>").*_server_generated.rs,handle_service_call/handle_rpc_call) flattens it to a string and re-wraps it: The typed variant is gone here — only the Display string remains.RpcError::Operation'sDisplayis hardcoded (crates/osis/src/rpc/error.rs:13): → the misleadingRedis operation error:prefix, even when nothing Redis-related happened.crates/osis/src/rpc/dispatch.rs) maps the failed call to a fixed JSON-RPC code:JsonRpcError::internal_error(&rpc_resp.data)→-32603JsonRpcError::application_error(...)→-32000There is no branch that produces
invalid_params(-32602) for caller/validation errors.Impact
-32603.Redis operation error:prefix is actively misleading during debugging (suggests a storage fault that didn't occur).Proposed fix
Proper (preferred): preserve the handler error category through the generated glue so dispatch can map it:
ComputeServiceError) a smallcategory()(or a sharedRpcErrorKind:Invalid,NotFound,PermissionDenied,Internal).RpcErrorit constructs (add variants likeRpcError::Invalid(String)/RpcError::NotFound(String)/RpcError::PermissionDenied(String)instead of funneling everything throughOperation).dispatch.rs:Invalid → invalid_params(-32602),NotFound → -32602(or app error -32000),PermissionDenied → app error,Internal/Operation → internal_error(-32603).RpcError::Operation's message so non-Redis failures don't claimRedis operation error.Pragmatic interim (smaller, fragile): in
dispatch.rs, when building theJsonRpcErrorfrom a failed response, classify by the message prefix produced by the standard error Displays —"Invalid input:" → invalid_params(-32602),"Not found:" → -32602,"Permission denied:" → application_error. Also fix theOperationDisplay prefix. String-matching is brittle and should be a stopgap only.Acceptance
InvalidInputerror surfaces as JSON-RPC-32602with a clean message (noRedis operation error:/Internal errorwording).-32603.Context
Found while adding a capacity precheck to
hero_computedeploy_vm (lhumina_code/hero_compute PR #113). The precheck correctly returnsInvalidInput, but it reaches the client as-32603 Internal errorbecause of the above. Filing here because the fix is framework + codegen, not per-service.Working on this in
issue-83-error-categories. First commit (7d37514) lands the framework + codegen changes:RpcErrorKind { Invalid, NotFound, PermissionDenied, Internal }added incrates/osis/src/rpc/error.rs, plus matchingRpcErrorvariants. RenamedRpcError::OperationDisplay prefix away fromRedis operation error:.protocol::RpcResponseand the per-domain generatedRpcResponsenow carryerror_kind.handle_request,handle_rpc_call, andhandle_service_callpropagate the category instead of collapsing toRpcError::Operation(String).crates/generator/src/rust/rust_rpc.rs) emitscategory()on every<Service>Errorenum; the OSIS emit (crates/generator/src/rust/rust_osis.rs) wires it into the service-method RPC wrappers and intohandle_rpc_call/handle_service_call.crates/osis/src/rpc/dispatch.rs,crates/server/src/server/{domain,unified}_server.rs) now map by typed variant:InvalidandNotFound→-32602,PermissionDenied→-32000,Internal→-32603.Next: regenerate
example/recipe_server, then add a unit/integration test exercisingInvalidInput→-32602.All three pieces landed on
issue-83-error-categories(4 commits, last is46601e1). Summary of how the proposed-fix design plays out:Shared category:
RpcErrorKind { Invalid, NotFound, PermissionDenied, Internal }incrates/osis/src/rpc/error.rs, plus matchingRpcError::Invalid/NotFound/PermissionDeniedvariants.RpcError::OperationDisplay fixed (no moreRedis operation error:for non-Redis failures).Codegen carries it through: every generated
<Service>Errorenum now hascategory()returning anRpcErrorKind. The OSIS server glue (*_server_generated.rs) tagsRpcResponse.error_kindfrome.category()for service-method errors and fromRpcErrorKind::Invalidfor missing-parameter errors;handle_rpc_call/handle_service_callreconstruct the typedRpcErrorvariant fromerror_kindinstead of funnelling throughRpcError::Operation.Dispatch maps by variant:
RpcError::InvalidandNotFound→JsonRpcError::invalid_params(...)(-32602)RpcError::PermissionDenied→JsonRpcError::application_error(...)(-32000)RpcError::Operationand other Internal-kind variants →JsonRpcError::internal_error(...)(-32603)Applied at
crates/osis/src/rpc/dispatch.rsand atcrates/server/src/server/{domain,unified}_server.rs. The old string-matching classification (msg.contains("Unknown method"),msg.contains("not found"), …) is gone in favor of typedRpcErrormatching.Acceptance verification: regenerated
example/recipe_servershows the new glue (verified by grep onosis_server_generated.rs). Three integration tests intests_error_category.rsdriveOsisRecipes::handle_service_calland assert:RpcError::Invalid(...), message contains noRedis operation error:prefixRecipeServiceError::NotFound→RpcError::NotFound(...), same Redis-prefix guardUnknown method:marker dispatch keys on for-32601Unit tests in
error.rs(3) anddispatch.rs(6) pin the kind→code mapping. All 77hero_rpc_osislib tests + 6 recipe_server tests pass. The branch is ready for review at https://forge.ourworld.tf/lhumina_code/hero_rpc/compare/development...issue-83-error-categories