Default 10 second upstream timeout cuts off substrate-awaiting compute calls #111
Reference
lhumina_code/hero_router#111
hero_router's default upstream timeout (10 seconds) is shorter than the legitimate response time of substrate-awaiting RPCs. Concretely, deployer.provision_vm calls ComputeService.deploy_vm, which now (per the D-27 fix at hero_compute 39d9b8a) waits for on-chain ack before returning, so a successful response can take 30 to 300 seconds.

Through hero_router the call returns "compute.deploy_vm: invalid response shape: json parse: expected value at line 1 column 1; raw: upstream timeout" after exactly 10 seconds while the upstream daemon is still legitimately working. Bypassing the router and hitting the daemon UDS directly returns the real result in 70+ seconds with no client-side timeout.

Two reasonable fix options: lift the default to something like 600 seconds for all routes that route to a substrate-backed daemon, or accept a per-route timeout config so the compute path gets 600 seconds while everything else keeps the existing default. The current behaviour silently truncates the deployer-mediated flow even when the underlying call eventually succeeds.

Signed-by: mik-tf mik-tf@noreply.invalid
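
To make the per-route option above more concrete, here is a minimal sketch of how a timeout lookup could work. It is not hero_router's actual code: the `TimeoutPolicy` type, its methods, and the idea of keying overrides by a method-name prefix such as `compute.` are all assumptions made for illustration. Only the 10 second default and the 600 second compute value come from this issue.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Hypothetical timeout policy: one default plus per-route overrides.
/// Routes are matched by method-name prefix (an assumption, not the
/// router's real matching scheme).
struct TimeoutPolicy {
    default: Duration,
    overrides: HashMap<String, Duration>,
}

impl TimeoutPolicy {
    fn new(default: Duration) -> Self {
        TimeoutPolicy {
            default,
            overrides: HashMap::new(),
        }
    }

    /// Register a longer (or shorter) timeout for every method that
    /// starts with `route_prefix`.
    fn with_override(mut self, route_prefix: &str, timeout: Duration) -> Self {
        self.overrides.insert(route_prefix.to_string(), timeout);
        self
    }

    /// Pick the timeout for a method: the longest matching prefix wins,
    /// otherwise the default applies.
    fn resolve(&self, method: &str) -> Duration {
        self.overrides
            .iter()
            .filter(|(prefix, _)| method.starts_with(prefix.as_str()))
            .max_by_key(|(prefix, _)| prefix.len())
            .map(|(_, timeout)| *timeout)
            .unwrap_or(self.default)
    }
}

fn main() {
    // Keep the existing 10 second default, give the compute path 600 seconds.
    let policy = TimeoutPolicy::new(Duration::from_secs(10))
        .with_override("compute.", Duration::from_secs(600));

    for method in ["compute.deploy_vm", "deployer.status"] {
        println!("{} -> {} s", method, policy.resolve(method).as_secs());
    }
}
```

With a lookup like this, only routes explicitly listed as substrate-backed would wait longer, and every other route would still fail fast on the existing default. Note that in the flow described above the router-facing call is deployer.provision_vm, so the override would need to cover whichever route the router actually sees.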