fix(web): single-instance pidfile guard to stop stray web procs on redeploy #115

Merged
salmaelsoly merged 1 commit from development_web_pidfile_guard into integration 2026-06-14 12:30:52 +00:00

Summary

Fixes redeploy fragility in hero_shrimp_web: repeated relaunches left several stray processes because run_ui() blindly unlinked web.sock and bound a fresh inode with no liveness check, so the old process kept serving its orphaned inode. This adds a single-instance pidfile guard, reusing the daemon's proven PidFile (now shared).
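The failure mode described above can be reproduced in isolation. The following is a minimal sketch, assuming a Unix host; `blind_rebind` and `has_pending` are hypothetical helpers written for illustration and are not taken from `hero_shrimp_web`. It shows that unlinking a socket path and binding again leaves the first listener alive on an inode that no path reaches any more.

```rust
use std::fs;
use std::io;
use std::os::unix::net::UnixListener;
use std::path::Path;

/// Hypothetical stand-in for the pre-fix startup: unlink whatever sits at
/// `path`, then bind a fresh listener there, with no liveness check.
fn blind_rebind(path: &Path) -> io::Result<UnixListener> {
    match fs::remove_file(path) {
        Ok(()) => {}
        Err(e) if e.kind() == io::ErrorKind::NotFound => {}
        Err(e) => return Err(e),
    }
    UnixListener::bind(path)
}

/// Non-blocking probe: true if a connection is waiting on `listener`.
fn has_pending(listener: &UnixListener) -> io::Result<bool> {
    listener.set_nonblocking(true)?;
    match listener.accept() {
        Ok(_) => Ok(true),
        Err(e) if e.kind() == io::ErrorKind::WouldBlock => Ok(false),
        Err(e) => Err(e),
    }
}
```

Calling `blind_rebind` twice on the same path yields two live listeners. A client connecting to the path reaches only the second one, while the first stays open on its orphaned inode until its owning process exits.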

Closes #46

Changes

  • Extracted PidFile into the shared hero_shrimp_types::pidfile (single implementation for daemon + web); generalised the adjacent-socket-to-clear into a parameter.
  • Added hero_shrimp_types::paths::default_web_pidfile() (web.pid, distinct from daemon.pid).
  • hero_shrimp_server re-exports the shared guard; dropped its now-unused libc dep.
  • hero_shrimp_web::run_ui() acquires the guard (force-takeover) before binding and holds it across axum::serve; refuses loudly if a predecessor can't be evicted.
  • Documented the hardened redeploy in docs/getting-started.md.
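To make the guard concrete, here is a minimal sketch of a single-instance pidfile guard with the adjacent socket passed as a parameter. It is not the real `hero_shrimp_types::pidfile::PidFile`: the name `PidGuard`, its signature, and the injected `is_alive` probe are all assumptions made for illustration. It also deliberately omits force-takeover: where the PR evicts a live predecessor, this sketch only refuses to start.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical guard; holds the pidfile for as long as the value lives.
pub struct PidGuard {
    pid_path: PathBuf,
}

impl PidGuard {
    /// Claim `pid_path` for this process.
    /// `is_alive` is an injected liveness probe (an assumption of this
    /// sketch, since std has no portable "is this pid running" call).
    pub fn acquire<F: Fn(u32) -> bool>(
        pid_path: &Path,
        adjacent_socket: &Path,
        is_alive: F,
    ) -> io::Result<PidGuard> {
        // If a pidfile exists and names a live foreign process, refuse.
        if let Ok(text) = fs::read_to_string(pid_path) {
            if let Ok(pid) = text.trim().parse::<u32>() {
                if pid != std::process::id() && is_alive(pid) {
                    return Err(io::Error::new(
                        io::ErrorKind::AddrInUse,
                        format!("instance with pid {} is still running", pid),
                    ));
                }
            }
        }
        // Pidfile absent or stale: clear the leftover socket, if any.
        match fs::remove_file(adjacent_socket) {
            Ok(()) => {}
            Err(e) if e.kind() == io::ErrorKind::NotFound => {}
            Err(e) => return Err(e),
        }
        // Record our own pid so the next launch can find us.
        fs::write(pid_path, format!("{}\n", std::process::id()))?;
        Ok(PidGuard { pid_path: pid_path.to_path_buf() })
    }
}

impl Drop for PidGuard {
    /// Best-effort cleanup when the guard goes out of scope.
    fn drop(&mut self) {
        let _ = fs::remove_file(&self.pid_path);
    }
}
```

In this sketch the caller would acquire the guard before binding the socket and keep it alive for the lifetime of the server, mirroring how `run_ui()` is described as holding the guard across `axum::serve`. Note that the read-then-write sequence is not atomic, so two launches racing at the same instant could both pass the check; a production guard would need file locking or an exclusive create to close that window.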

Test Results

  • cargo test -p hero_shrimp_types: 24 unit + 6 doctests pass, incl. new web-socket takeover and web-pidfile path tests.
  • cargo build for the three crates succeeds; server pidfile behaviour unchanged.
  • Reproduced the issue: three back-to-back relaunches previously left all three alive; now they deterministically collapse to one live instance, with an eviction banner logged per predecessor.
salmaelsoly changed title from fix(web): single-instance pidfile guard to stop stray web procs on redeploy to WIP: fix(web): single-instance pidfile guard to stop stray web procs on redeploy 2026-06-14 09:48:34 +00:00
salmaelsoly changed title from WIP: fix(web): single-instance pidfile guard to stop stray web procs on redeploy to fix(web): single-instance pidfile guard to stop stray web procs on redeploy 2026-06-14 09:48:41 +00:00
salmaelsoly force-pushed development_web_pidfile_guard from f87d3727fa to 774c2eaae5 2026-06-14 10:55:46 +00:00
Some checks failed: lab release / release (pull_request) failing after 1s
salmaelsoly force-pushed development_web_pidfile_guard from 774c2eaae5 to 8ce767e8b8 2026-06-14 11:54:52 +00:00
Some checks failed: lab release / release (pull_request) failing after 1s
salmaelsoly merged commit 98af83d67c into integration 2026-06-14 12:30:52 +00:00
salmaelsoly deleted branch development_web_pidfile_guard 2026-06-14 12:30:58 +00:00
Reference
lhumina_code/hero_shrimp!115