fix(terminal): abort sibling PTY tasks on disconnect to avoid task leak (#50) #56
No reviewers
Labels
No labels
prio_critical
prio_low
type_bug
type_contact
type_issue
type_lead
type_question
type_story
type_task
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
lhumina_code/hero_router!56
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "development_fix_pty_task_leak"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
pty_proxyspawns four concurrent tasks (browser_to_server,bytes_bridge,forwarder,server_to_browser) and races them withtokio::select!. When one finishes theselect!used to return and the other threeJoinHandles were dropped — which does NOT cancel the tasks. Under rapid reconnect, three tasks piled up per dead session, each holding a WebSocket sink, buffers, andArcreferences.This PR pins the handles, aborts the non-winners on each arm, and awaits the aborted handles so their captured stack frames drop before
pty_proxyreturns. No new crate dependency (tokio_util::sync::CancellationTokenwas considered and rejected for the 4-task case).Closes #50.
Changes
crates/hero_router/src/server/terminal.rs(+50 / -5)tokio::select!tail ofpty_proxywith:tokio::pin!(browser_to_server, bytes_bridge, forwarder, server_to_browser);select!arms, each of shape_ = &mut X => { abort(others); let _ = others.await; }.attached().remove(&name)block AFTER the select.inject_reply(called from theterminal.replyRPC) and the internal resize handler both read the map entry during normal session operation — moving the remove earlier would silently break CPR/DA reply forwarding and resize updates.Correctness notes
select!. Awaiting it again can deadlock or panic depending on the tokio version. This PR only awaits the three non-winners in each arm.let _ = h.awaitafter the aborts ensures each cancelled task finishes unwinding beforepty_proxyreturns, so capturedSplitSinks, sources, and channel senders drop promptly.Cargo.tomlis untouched.Why no unit test
The four tasks close over
WebSocketsinks /SplitSink<WebSocketStream<..>>types that are not trivially mockable, andattached()is a process-globalOnceLock. A meaningful end-to-end test would require an in-process axum WebSocket harness plus a tokio-tungstenite echo server — heavier than the fix itself. The change is verifiable by inspection of the four arms.Verification
cargo build --workspace: pass.cargo test --workspace: 71 passed / 0 failed / 0 ignored.cargo clippy --workspace --all-targets -- -D warnings: clean.cargo fmt --all --check: clean.Suggested manual smoke test for reviewer