fix(engine): harden recovery regex compilation against panics #56

Merged
rawan merged 1 commit from development_recovery_panic_hygiene into development 2026-05-25 14:33:45 +00:00
Member

Summary

P1 panic-hygiene fix for the recovery hot path. The only remaining production panic surface in tool_call_recovery.rs was three Regex::new(...).expect(...) calls on constant regex literals. Each is now cached in a OnceLock<Option<Regex>> and degrades gracefully on compile failure — a tracing::warn! plus the function's documented neutral fallback. Public signatures unchanged; callers compile without modification.

Closes #42

Changes

  • crates/hero_shrimp_engine/src/agent_core/agent/tool_call_recovery.rs
    • strip_bracket_tool_calls: regex cached in static BRACKET_RE: OnceLock<Option<Regex>>. On compile failure: tracing::warn!(target: "tool_call_recovery", ...) and return (content.to_string(), Vec::new()) — same as the existing "no markers found" branch.
    • normalize_arrow_syntax: both regexes cached in their own OnceLock<Option<Regex>>. On compile failure of either: tracing::warn! and return None — same as the existing "shape didn't match" branch.
    • Side effect: regexes are compiled once per process instead of on every recovery turn.
  • crates/hero_shrimp_engine/src/agent_core/agent/force_write.rs — audited, no production changes required. All unwrap() calls (lines 210–451) are inside #[cfg(test)] mod tests (boundary at line 198).

Test Results

  • cargo test -p hero_shrimp_engine --lib tool_call_recovery40 passed, 0 failed, including the proptest fuzz suite and recovery_never_panics_on_adversarial_input.
  • cargo test -p hero_shrimp_engine --lib force_write14 passed, 0 failed.
  • cargo test -p hero_shrimp_engine (full crate) — 1639 passed, 9 failed. The 9 failures are pre-existing on development and unrelated to this change — they live in verification::runner::*, tools::external_cmd::*, tools::tool_catalog::verify::e2e_datetime_server::*, and tests::autonomy_*, all of which spawn shell processes or open sockets and fail in the local sandboxed environment regardless of branch. Verified by re-running on development: identical 1639/9 split, identical test names.
  • cargo build -p hero_shrimp_engine — clean, no new warnings.

Notes

The regex panic path is unreachable in practice (constant literals known to compile), so this is defense-in-depth. It formalises the contract already asserted by the adversarial test at tool_call_recovery.rs lines 696-707 (which wraps recovery in std::panic::catch_unwind): no input or environmental condition may panic the recovery routine.

## Summary P1 panic-hygiene fix for the recovery hot path. The only remaining production panic surface in `tool_call_recovery.rs` was three `Regex::new(...).expect(...)` calls on constant regex literals. Each is now cached in a `OnceLock<Option<Regex>>` and degrades gracefully on compile failure — a `tracing::warn!` plus the function's documented neutral fallback. Public signatures unchanged; callers compile without modification. ## Related Issue Closes https://forge.ourworld.tf/lhumina_code/hero_shrimp/issues/42 ## Changes - `crates/hero_shrimp_engine/src/agent_core/agent/tool_call_recovery.rs` - `strip_bracket_tool_calls`: regex cached in `static BRACKET_RE: OnceLock<Option<Regex>>`. On compile failure: `tracing::warn!(target: "tool_call_recovery", ...)` and return `(content.to_string(), Vec::new())` — same as the existing "no markers found" branch. - `normalize_arrow_syntax`: both regexes cached in their own `OnceLock<Option<Regex>>`. On compile failure of either: `tracing::warn!` and return `None` — same as the existing "shape didn't match" branch. - Side effect: regexes are compiled once per process instead of on every recovery turn. - `crates/hero_shrimp_engine/src/agent_core/agent/force_write.rs` — audited, no production changes required. All `unwrap()` calls (lines 210–451) are inside `#[cfg(test)] mod tests` (boundary at line 198). ## Test Results - `cargo test -p hero_shrimp_engine --lib tool_call_recovery` — **40 passed, 0 failed**, including the proptest fuzz suite and `recovery_never_panics_on_adversarial_input`. - `cargo test -p hero_shrimp_engine --lib force_write` — **14 passed, 0 failed**. - `cargo test -p hero_shrimp_engine` (full crate) — **1639 passed, 9 failed**. The 9 failures are pre-existing on `development` and unrelated to this change — they live in `verification::runner::*`, `tools::external_cmd::*`, `tools::tool_catalog::verify::e2e_datetime_server::*`, and `tests::autonomy_*`, all of which spawn shell processes or open sockets and fail in the local sandboxed environment regardless of branch. Verified by re-running on `development`: identical 1639/9 split, identical test names. - `cargo build -p hero_shrimp_engine` — clean, no new warnings. ## Notes The regex panic path is unreachable in practice (constant literals known to compile), so this is defense-in-depth. It formalises the contract already asserted by the adversarial test at `tool_call_recovery.rs` lines 696-707 (which wraps recovery in `std::panic::catch_unwind`): no input or environmental condition may panic the recovery routine.
fix(engine): harden recovery regex compilation against panics
Some checks failed
Verify / verify (push) Failing after 9s
Verify / verify (pull_request) Failing after 11s
5f84e049dc
Replace three Regex::new(...).expect(...) calls in tool_call_recovery.rs
with OnceLock<Option<Regex>> patterns so a regex compile failure can no
longer crash the recovery hot path. On failure each callsite emits a
tracing::warn! and returns its documented neutral fallback
((content, Vec::new()) for strip_bracket_tool_calls, None for
normalize_arrow_syntax), letting the existing downstream escalation
machinery take over. Side effect: the regexes are now compiled once
per process instead of on every recovery turn.

force_write.rs required no production changes — every unwrap() in
that file is inside #[cfg(test)] mod tests.

#42
rawan force-pushed development_recovery_panic_hygiene from 5f84e049dc
Some checks failed
Verify / verify (push) Failing after 9s
Verify / verify (pull_request) Failing after 11s
to f140a2ca8a
Some checks failed
Verify / verify (pull_request) Failing after 1m34s
Verify / verify (push) Failing after 1m37s
2026-05-25 14:33:18 +00:00
Compare
rawan merged commit 48ddd5d927 into development 2026-05-25 14:33:45 +00:00
rawan deleted branch development_recovery_panic_hygiene 2026-05-25 14:33:45 +00:00
Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
lhumina_code/hero_shrimp!56
No description provided.