Steering an autonomy job in its final/summary turn is silently dropped

rawan commented

2026-06-07 13:44:48 +00:00

Member

Symptom

Steered a running autonomy job ("add docs and tests") while it was wrapping up. The guidance was acknowledged ("will be picked up at the next checkpoint") but never applied — the job finished without docs/tests.

Root cause

There are two steering paths, and message-send steering uses the one with a timing gap:

Live inbox (session.steer -> global_steering_inbox): drained every iteration at iteration_shell.rs L193-211 and folded in as a [steering #N] user message. Immediate.
Autonomy operator-guidance: message.send to a live autonomy job routes to steer_existing_job_from_message (session_autonomy.rs L139-147), which does NOT touch the live inbox — it writes pending_operator_guidance. That is only surfaced by OperatorGuidanceProvider (job_context.rs L171-201), which runs inside build_history — i.e. only when the orchestrator builds the prompt for the next turn/checkpoint.

If the job is already in its finalization/summary turn ("deliverable is complete, let me just report the summary and be done"), there is no next checkpoint, so the queued guidance is never rendered into a prompt and is dropped.

Secondary: pending_operator_guidance is read but never cleared after consumption (job_context.rs), so on a job that does continue it can be re-injected every turn.

Proposed fix

Gate completion on pending guidance: before an autonomy job finalizes, check pending_operator_guidance; if present, force one more turn/replan instead of ending. (Fixes the exact case.)
Optionally also route message-send steering into the live inbox so it is drained per-iteration like session.steer.
Clear pending_operator_guidance once consumed.

## Symptom Steered a running autonomy job ("add docs and tests") while it was wrapping up. The guidance was acknowledged ("will be picked up at the next checkpoint") but never applied — the job finished without docs/tests. ## Root cause There are two steering paths, and message-send steering uses the one with a timing gap: 1. **Live inbox** (`session.steer` -> `global_steering_inbox`): drained every iteration at `iteration_shell.rs` L193-211 and folded in as a `[steering #N]` user message. Immediate. 2. **Autonomy operator-guidance**: message.send to a live autonomy job routes to `steer_existing_job_from_message` (`session_autonomy.rs` L139-147), which does NOT touch the live inbox — it writes `pending_operator_guidance`. That is only surfaced by `OperatorGuidanceProvider` (`job_context.rs` L171-201), which runs inside `build_history` — i.e. only when the orchestrator builds the prompt for the *next* turn/checkpoint. If the job is already in its finalization/summary turn ("deliverable is complete, let me just report the summary and be done"), there is no next checkpoint, so the queued guidance is never rendered into a prompt and is dropped. Secondary: `pending_operator_guidance` is read but never cleared after consumption (`job_context.rs`), so on a job that does continue it can be re-injected every turn. ## Proposed fix 1. Gate completion on pending guidance: before an autonomy job finalizes, check `pending_operator_guidance`; if present, force one more turn/replan instead of ending. (Fixes the exact case.) 2. Optionally also route message-send steering into the live inbox so it is drained per-iteration like `session.steer`. 3. Clear `pending_operator_guidance` once consumed.

rawan commented

2026-06-07 13:51:52 +00:00

Author

Member

Update: fails from BOTH UI entry points

Confirmed steering does nothing from either UI control — and they call different RPCs, so the failure spans both mechanisms, not just the checkpoint-timing gap above:

UI control	Handler	RPC	Backend path
Chat composer “Steer” button	`MessageInput.tsx:290` -> `steerActiveTurn` (`store.ts:1126-1135`)	`session.steer`	live inbox, drained every iteration (`iteration_shell.rs:193-211`)
Job panel “steer →”	`LiveJobPanel.tsx:271-275`	`job.steer`	operator-guidance / checkpoint
Job drawer SteeringInput	`SteeringInput.tsx:37`	`job.steer`	operator-guidance / checkpoint
Activity “nudge it”	`ChatActivity.tsx:96`	`session.steer`	live inbox

The original issue only explained the job.steer / checkpoint path. But the chat Steer button uses session.steer (the live inbox that’s drained every iteration) and that also doesn’t land — so the live-inbox path is broken too.

Lead to verify: the live-inbox drain is gated on options.session_id being Some (iteration_shell.rs:193 — if let Some(sid) = state.options.session_id ...). If an autonomy job’s agent loop runs with session_id: None, session.steer pushes to a key nothing ever drains, so it silently no-ops. Need to confirm whether autonomy turns thread the session id into AgentOptions.

Net: all four steering entry points fail for autonomy jobs — job.steer ones because guidance is only read at a next checkpoint that never comes, and session.steer ones likely because the loop has no session_id to drain against (to confirm).

## Update: fails from BOTH UI entry points Confirmed steering does nothing from either UI control — and they call **different RPCs**, so the failure spans both mechanisms, not just the checkpoint-timing gap above: | UI control | Handler | RPC | Backend path | |---|---|---|---| | Chat composer **“Steer”** button | `MessageInput.tsx:290` -> `steerActiveTurn` (`store.ts:1126-1135`) | `session.steer` | **live inbox**, drained every iteration (`iteration_shell.rs:193-211`) | | **Job panel** “steer →” | `LiveJobPanel.tsx:271-275` | `job.steer` | operator-guidance / checkpoint | | **Job drawer** SteeringInput | `SteeringInput.tsx:37` | `job.steer` | operator-guidance / checkpoint | | Activity “nudge it” | `ChatActivity.tsx:96` | `session.steer` | live inbox | The original issue only explained the `job.steer` / checkpoint path. But the chat **Steer** button uses `session.steer` (the live inbox that’s drained every iteration) and **that also doesn’t land** — so the live-inbox path is broken too. **Lead to verify:** the live-inbox drain is gated on `options.session_id` being `Some` (`iteration_shell.rs:193` — `if let Some(sid) = state.options.session_id ...`). If an autonomy job’s agent loop runs with `session_id: None`, `session.steer` pushes to a key nothing ever drains, so it silently no-ops. Need to confirm whether autonomy turns thread the session id into `AgentOptions`. Net: **all four steering entry points fail for autonomy jobs** — `job.steer` ones because guidance is only read at a next checkpoint that never comes, and `session.steer` ones likely because the loop has no `session_id` to drain against (to confirm).

rawan commented

2026-06-11 12:42:20 +00:00

Author

Member

Implementation Spec for Issue #93

Findings (confirmed against code)

Live-inbox drain & session_id gating — iteration_shell.rs:193-211: drain is gated on if let Some(sid) = state.options.session_id..., calls global_steering_inbox().drain_with_ids(sid, mode), folds each in as a [steering #N] user message. Confirmed.
Steering inbox keying — steering.rs:77-78: inbox is by_session: HashMap<i64, VecDeque<QueuedMessage>>, keyed by i64 session id. session.steer pushes via global_steering_inbox().push(sid, text) (session.rs:172-174). Confirmed.
steer_existing_job_from_message — session_autonomy.rs:455-519: writes pending_operator_guidance into the job row details_json via queue_pending_operator_guidance (L571-611). Does NOT touch the live inbox. Reply says "picked up at the next checkpoint." Confirmed.
OperatorGuidanceProvider — job_context.rs:163-229: runs inside build_history, reads guidance from workspace task_state.json and/or DB details_json. Confirmed.
build_history runs once per turn — loop_setup.rs:224: providers run once at turn setup, not per iteration. If the job is in its final turn, no new prompt is built and queued guidance is dropped. This is the root cause. Confirmed.
pending_operator_guidance never cleared — written at session_autonomy.rs:588, read at job_context.rs:187,228, no DB write removes it. A continuing job re-injects the same guidance every turn. Confirmed.
The issue's session_id "lead" is FALSE for the in-process path — proof_run.rs:658-696 -> in_process.rs:204-207 -> pipeline.rs:385,405: the autonomy loop runs with options.session_id = Some(real_session_id). The chat and its autonomy job share the session id, so the iteration_shell drain DOES run for autonomy jobs. The session.steer path is not no-op'd by a missing session id. Confirmed.

Objective

Ensure operator guidance steered into a running autonomy job is applied even when the job is in its final/summary turn, and stop re-injecting already-consumed guidance on jobs that continue.

Requirements

Before an autonomy agent loop finalizes, if there is pending operator guidance not yet folded into the current turn, fold it in and continue for at least one more iteration instead of finalizing.
Clear consumed guidance from its source so it is not re-applied on subsequent turns/iterations.
Scope strictly to autonomy jobs; chat turns must be byte-identical.
Bound the gate so a failed clear cannot loop the final turn forever.

Files to Modify

crates/hero_shrimp_engine/src/agent_core/agent/job_context.rs — add read + clear helpers for pending_operator_guidance.
crates/hero_shrimp_engine/src/agent_core/agent/no_tool_calls.rs — add the gate before record_final_answer.
crates/hero_shrimp_engine/src/agent_core/agent/loop_support.rs — add a bounded per-turn fold counter to AgentLoopState.
crates/hero_shrimp_server/src/rpc/methods/session_autonomy.rs — (optional hardening) mirror message.send steer into the live inbox.

Implementation Plan

Step 1: Read + clear helpers in `job_context.rs`

Add pending_operator_guidance(database, job_id) exposing the existing DB read (L215-229), and clear_pending_operator_guidance(database, job_id) that re-resolves the row (i64 -> get_autonomy_job, fallback get_autonomy_job_by_artifact_job_id), removes pending_operator_guidance/_at from details_json, retains pending_operator_guidance_events as audit, and writes via upsert_autonomy_job.
Dependencies: none.

Step 2: Bounded fold counter in `AgentLoopState`

Add operator_guidance_folds: u32 (init 0) and a MAX_OPERATOR_GUIDANCE_FOLDS cap (e.g. 3). Initialize at the construction site in loop_setup.rs.
Dependencies: none (Step 3 reads it).

Step 3: Gate finalization in `no_tool_calls.rs`

Immediately before record_final_answer, add: if options.job_kind().is_autonomy() && counter < cap && pending guidance exists -> push guidance as a high-priority user message, increment counter, clear the DB guidance, return Continue. Refund one iteration if needed so the forced turn actually runs.
Dependencies: Steps 1, 2.

Step 4 (optional): Mirror message.send steer into live inbox

In session_autonomy.rs::steer_existing_job_from_message, also global_steering_inbox().push(session.as_i64(), text) so the running loop drains it per-iteration.
Dependencies: independent.

Step 5: Tests

clear_pending_operator_guidance round-trip (write, clear, assert key gone, events retained) co-located with existing job_context.rs tests.
Loop test: autonomy job + pending guidance -> handle_no_tool_calls returns Continue and clears the row; non-autonomy chat turn returns Break.

Acceptance Criteria

Autonomy job steered during its final/summary turn applies the guidance instead of finishing without it.
pending_operator_guidance is removed once folded in; pending_operator_guidance_events retained as audit.
A continuing autonomy job no longer re-injects the same guidance every turn.
Gate fires only for autonomy jobs; chat turns are byte-identical (still Break).
Gate is bounded so a failed clear cannot loop the final turn forever.
New tests cover the clear-helper round-trip and autonomy-vs-chat gate behavior; existing tests pass.

Notes

The issue's central session_id lead is wrong for the in-process path (Finding 7); the confirmed failure is the checkpoint-timing gap plus never-cleared guidance. The RPC-client topology (separate worker process) was not traced — if autonomy jobs dispatch to another process, the inbox is process-global and Step 4 alone would not bridge processes.
OperatorGuidanceProvider reads guidance from both the workspace task_state.json and the DB row; Step 1 clears only the DB key. Whether the on-disk operator_guidance field is auto-cleared after a replan was not fully confirmed.
If the iteration budget is already exhausted when the gate fires, the forced Continue needs an iteration refund to produce a real extra turn.

## Implementation Spec for Issue #93 ### Findings (confirmed against code) 1. **Live-inbox drain & session_id gating** — `iteration_shell.rs:193-211`: drain is gated on `if let Some(sid) = state.options.session_id...`, calls `global_steering_inbox().drain_with_ids(sid, mode)`, folds each in as a `[steering #N]` user message. Confirmed. 2. **Steering inbox keying** — `steering.rs:77-78`: inbox is `by_session: HashMap<i64, VecDeque<QueuedMessage>>`, keyed by i64 session id. `session.steer` pushes via `global_steering_inbox().push(sid, text)` (`session.rs:172-174`). Confirmed. 3. **`steer_existing_job_from_message`** — `session_autonomy.rs:455-519`: writes `pending_operator_guidance` into the job row `details_json` via `queue_pending_operator_guidance` (L571-611). Does NOT touch the live inbox. Reply says "picked up at the next checkpoint." Confirmed. 4. **`OperatorGuidanceProvider`** — `job_context.rs:163-229`: runs inside `build_history`, reads guidance from workspace `task_state.json` and/or DB `details_json`. Confirmed. 5. **`build_history` runs once per turn** — `loop_setup.rs:224`: providers run once at turn setup, not per iteration. If the job is in its final turn, no new prompt is built and queued guidance is dropped. **This is the root cause.** Confirmed. 6. **`pending_operator_guidance` never cleared** — written at `session_autonomy.rs:588`, read at `job_context.rs:187,228`, no DB write removes it. A continuing job re-injects the same guidance every turn. Confirmed. 7. **The issue's session_id "lead" is FALSE for the in-process path** — `proof_run.rs:658-696` -> `in_process.rs:204-207` -> `pipeline.rs:385,405`: the autonomy loop runs with `options.session_id = Some(real_session_id)`. The chat and its autonomy job share the session id, so the iteration_shell drain DOES run for autonomy jobs. The `session.steer` path is not no-op'd by a missing session id. Confirmed. ### Objective Ensure operator guidance steered into a running autonomy job is applied even when the job is in its final/summary turn, and stop re-injecting already-consumed guidance on jobs that continue. ### Requirements - Before an autonomy agent loop finalizes, if there is pending operator guidance not yet folded into the current turn, fold it in and continue for at least one more iteration instead of finalizing. - Clear consumed guidance from its source so it is not re-applied on subsequent turns/iterations. - Scope strictly to autonomy jobs; chat turns must be byte-identical. - Bound the gate so a failed clear cannot loop the final turn forever. ### Files to Modify - `crates/hero_shrimp_engine/src/agent_core/agent/job_context.rs` — add read + clear helpers for `pending_operator_guidance`. - `crates/hero_shrimp_engine/src/agent_core/agent/no_tool_calls.rs` — add the gate before `record_final_answer`. - `crates/hero_shrimp_engine/src/agent_core/agent/loop_support.rs` — add a bounded per-turn fold counter to `AgentLoopState`. - `crates/hero_shrimp_server/src/rpc/methods/session_autonomy.rs` — (optional hardening) mirror message.send steer into the live inbox. ### Implementation Plan #### Step 1: Read + clear helpers in `job_context.rs` Add `pending_operator_guidance(database, job_id)` exposing the existing DB read (L215-229), and `clear_pending_operator_guidance(database, job_id)` that re-resolves the row (i64 -> `get_autonomy_job`, fallback `get_autonomy_job_by_artifact_job_id`), removes `pending_operator_guidance`/`_at` from `details_json`, retains `pending_operator_guidance_events` as audit, and writes via `upsert_autonomy_job`. Dependencies: none. #### Step 2: Bounded fold counter in `AgentLoopState` Add `operator_guidance_folds: u32` (init 0) and a `MAX_OPERATOR_GUIDANCE_FOLDS` cap (e.g. 3). Initialize at the construction site in `loop_setup.rs`. Dependencies: none (Step 3 reads it). #### Step 3: Gate finalization in `no_tool_calls.rs` Immediately before `record_final_answer`, add: if `options.job_kind().is_autonomy()` && counter < cap && pending guidance exists -> push guidance as a high-priority user message, increment counter, clear the DB guidance, return `Continue`. Refund one iteration if needed so the forced turn actually runs. Dependencies: Steps 1, 2. #### Step 4 (optional): Mirror message.send steer into live inbox In `session_autonomy.rs::steer_existing_job_from_message`, also `global_steering_inbox().push(session.as_i64(), text)` so the running loop drains it per-iteration. Dependencies: independent. #### Step 5: Tests - `clear_pending_operator_guidance` round-trip (write, clear, assert key gone, events retained) co-located with existing `job_context.rs` tests. - Loop test: autonomy job + pending guidance -> `handle_no_tool_calls` returns `Continue` and clears the row; non-autonomy chat turn returns `Break`. ### Acceptance Criteria - [ ] Autonomy job steered during its final/summary turn applies the guidance instead of finishing without it. - [ ] `pending_operator_guidance` is removed once folded in; `pending_operator_guidance_events` retained as audit. - [ ] A continuing autonomy job no longer re-injects the same guidance every turn. - [ ] Gate fires only for autonomy jobs; chat turns are byte-identical (still `Break`). - [ ] Gate is bounded so a failed clear cannot loop the final turn forever. - [ ] New tests cover the clear-helper round-trip and autonomy-vs-chat gate behavior; existing tests pass. ### Notes - The issue's central session_id lead is wrong for the in-process path (Finding 7); the confirmed failure is the checkpoint-timing gap plus never-cleared guidance. The RPC-client topology (separate worker process) was not traced — if autonomy jobs dispatch to another process, the inbox is process-global and Step 4 alone would not bridge processes. - `OperatorGuidanceProvider` reads guidance from both the workspace `task_state.json` and the DB row; Step 1 clears only the DB key. Whether the on-disk `operator_guidance` field is auto-cleared after a replan was not fully confirmed. - If the iteration budget is already exhausted when the gate fires, the forced `Continue` needs an iteration refund to produce a real extra turn.

rawan self-assigned this

2026-06-11 12:44:33 +00:00

rawan commented

2026-06-11 12:56:04 +00:00

Author

Member

Test Results

Ran cargo test -p hero_shrimp_engine -p hero_shrimp_server after implementing the fix. All suites green.

hero_shrimp_engine lib: 1667 passed, 0 failed, 1 ignored
hero_shrimp_server lib + integration suites: 315 + 49 + 13 + 7 + 3 + 2 passed, 0 failed
Total: 2056+ passed, 0 failed

New tests added in job_context.rs:

clear_pending_operator_guidance_round_trip — writes a job row with pending_operator_guidance + pending_operator_guidance_at + pending_operator_guidance_events, asserts the read helper returns the guidance, clears it, then asserts the guidance keys are gone, the read returns None, and pending_operator_guidance_events is retained as audit.
clear_pending_operator_guidance_missing_row_is_noop — clearing with an unresolvable numeric id, a non-numeric id, a whitespace id, and None does not panic.

The full job_context test group: 8 passed; 0 failed.

## Test Results Ran `cargo test -p hero_shrimp_engine -p hero_shrimp_server` after implementing the fix. All suites green. - `hero_shrimp_engine` lib: 1667 passed, 0 failed, 1 ignored - `hero_shrimp_server` lib + integration suites: 315 + 49 + 13 + 7 + 3 + 2 passed, 0 failed - Total: 2056+ passed, 0 failed New tests added in `job_context.rs`: - `clear_pending_operator_guidance_round_trip` — writes a job row with `pending_operator_guidance` + `pending_operator_guidance_at` + `pending_operator_guidance_events`, asserts the read helper returns the guidance, clears it, then asserts the guidance keys are gone, the read returns `None`, and `pending_operator_guidance_events` is retained as audit. - `clear_pending_operator_guidance_missing_row_is_noop` — clearing with an unresolvable numeric id, a non-numeric id, a whitespace id, and `None` does not panic. The full job_context test group: `8 passed; 0 failed`.

rawan commented

2026-06-11 12:57:24 +00:00

Author

Member

Implementation Summary

Fixed the dropped-steering bug for autonomy jobs in their final/summary turn, plus the never-cleared guidance re-injection, plus added per-iteration delivery for message.send steering.

Root cause (confirmed)

The job.steer / message.send path only writes pending_operator_guidance into the job row. That field is read by OperatorGuidanceProvider inside build_history, which runs once per turn at setup. If the job is already in its finalization turn, no new prompt is built, so the queued guidance is never rendered and is silently dropped. It was also never cleared, so on a job that did continue it re-injected every turn.

Note on the secondary lead: the comment's hypothesis that session.steer no-ops because the autonomy loop runs with session_id: None does not hold for the in-process path — the loop runs with session_id: Some(real_session_id) (pipeline.rs), so the live-inbox drain does fire for autonomy jobs.

Changes

crates/hero_shrimp_engine/src/agent_core/agent/job_context.rs — added pending_operator_guidance(...) (read helper) and clear_pending_operator_guidance(...) (removes the pending_operator_guidance / _at keys from the job row details_json, retaining pending_operator_guidance_events as audit).
crates/hero_shrimp_engine/src/agent_core/agent/loop_support.rs — added a bounded per-turn counter operator_guidance_folds on AgentLoopState and MAX_OPERATOR_GUIDANCE_FOLDS = 3.
crates/hero_shrimp_engine/src/agent_core/agent/no_tool_calls.rs — added a finalization gate just before record_final_answer: for autonomy jobs, if pending guidance exists and the fold cap is not reached, fold the guidance in as a high-priority user message, clear it, and force one more turn instead of finishing. Scoped strictly to autonomy jobs; chat turns are unchanged.
crates/hero_shrimp_engine/src/agent_core/agent/loop_setup.rs — initialize the new counter.
crates/hero_shrimp_server/src/rpc/methods/session_autonomy.rs — steer_existing_job_from_message now also pushes the steer text into the live steering inbox so the running loop drains it per-iteration (matches the session.steer path), in addition to the checkpoint guidance.

Tests

New: clear_pending_operator_guidance_round_trip, clear_pending_operator_guidance_missing_row_is_noop in job_context.rs.
cargo test -p hero_shrimp_engine -p hero_shrimp_server: all green, 0 failures (engine 1667 passed, server suites 315+ passed).

Caveats

The clear helper addresses the DB pending_operator_guidance key. OperatorGuidanceProvider also reads an on-disk operator_guidance field in the workspace task_state.json; whether that on-disk field is auto-cleared after a replan was not changed here.
The RPC-client topology (autonomy job dispatched to a separate worker process) was not traced; the steering inbox is process-global, so the live-inbox mirroring helps the in-process path. The checkpoint-gate fix is process-independent.

Changes are committed locally on the working branch; no push.

## Implementation Summary Fixed the dropped-steering bug for autonomy jobs in their final/summary turn, plus the never-cleared guidance re-injection, plus added per-iteration delivery for message.send steering. ### Root cause (confirmed) The `job.steer` / message.send path only writes `pending_operator_guidance` into the job row. That field is read by `OperatorGuidanceProvider` inside `build_history`, which runs once per turn at setup. If the job is already in its finalization turn, no new prompt is built, so the queued guidance is never rendered and is silently dropped. It was also never cleared, so on a job that did continue it re-injected every turn. Note on the secondary lead: the comment's hypothesis that `session.steer` no-ops because the autonomy loop runs with `session_id: None` does not hold for the in-process path — the loop runs with `session_id: Some(real_session_id)` (`pipeline.rs`), so the live-inbox drain does fire for autonomy jobs. ### Changes - `crates/hero_shrimp_engine/src/agent_core/agent/job_context.rs` — added `pending_operator_guidance(...)` (read helper) and `clear_pending_operator_guidance(...)` (removes the `pending_operator_guidance` / `_at` keys from the job row `details_json`, retaining `pending_operator_guidance_events` as audit). - `crates/hero_shrimp_engine/src/agent_core/agent/loop_support.rs` — added a bounded per-turn counter `operator_guidance_folds` on `AgentLoopState` and `MAX_OPERATOR_GUIDANCE_FOLDS = 3`. - `crates/hero_shrimp_engine/src/agent_core/agent/no_tool_calls.rs` — added a finalization gate just before `record_final_answer`: for autonomy jobs, if pending guidance exists and the fold cap is not reached, fold the guidance in as a high-priority user message, clear it, and force one more turn instead of finishing. Scoped strictly to autonomy jobs; chat turns are unchanged. - `crates/hero_shrimp_engine/src/agent_core/agent/loop_setup.rs` — initialize the new counter. - `crates/hero_shrimp_server/src/rpc/methods/session_autonomy.rs` — `steer_existing_job_from_message` now also pushes the steer text into the live steering inbox so the running loop drains it per-iteration (matches the `session.steer` path), in addition to the checkpoint guidance. ### Tests - New: `clear_pending_operator_guidance_round_trip`, `clear_pending_operator_guidance_missing_row_is_noop` in `job_context.rs`. - `cargo test -p hero_shrimp_engine -p hero_shrimp_server`: all green, 0 failures (engine 1667 passed, server suites 315+ passed). ### Caveats - The clear helper addresses the DB `pending_operator_guidance` key. `OperatorGuidanceProvider` also reads an on-disk `operator_guidance` field in the workspace `task_state.json`; whether that on-disk field is auto-cleared after a replan was not changed here. - The RPC-client topology (autonomy job dispatched to a separate worker process) was not traced; the steering inbox is process-global, so the live-inbox mirroring helps the in-process path. The checkpoint-gate fix is process-independent. Changes are committed locally on the working branch; no push.

rawan commented

2026-06-11 12:57:32 +00:00

Author

Member

Implementation Summary

Fixed the dropped-steering bug for autonomy jobs in their final/summary turn, plus the never-cleared guidance re-injection, plus added per-iteration delivery for message.send steering.

Root cause (confirmed)

The job.steer / message.send path only writes pending_operator_guidance into the job row. That field is read by OperatorGuidanceProvider inside build_history, which runs once per turn at setup. If the job is already in its finalization turn, no new prompt is built, so the queued guidance is never rendered and is silently dropped. It was also never cleared, so on a job that did continue it re-injected every turn.

Note on the secondary lead: the comment's hypothesis that session.steer no-ops because the autonomy loop runs with session_id: None does not hold for the in-process path — the loop runs with session_id: Some(real_session_id) (pipeline.rs), so the live-inbox drain does fire for autonomy jobs.

Changes

crates/hero_shrimp_engine/src/agent_core/agent/job_context.rs — added pending_operator_guidance(...) (read helper) and clear_pending_operator_guidance(...) (removes the pending_operator_guidance / _at keys from the job row details_json, retaining pending_operator_guidance_events as audit).
crates/hero_shrimp_engine/src/agent_core/agent/loop_support.rs — added a bounded per-turn counter operator_guidance_folds on AgentLoopState and MAX_OPERATOR_GUIDANCE_FOLDS = 3.
crates/hero_shrimp_engine/src/agent_core/agent/no_tool_calls.rs — added a finalization gate just before record_final_answer: for autonomy jobs, if pending guidance exists and the fold cap is not reached, fold the guidance in as a high-priority user message, clear it, and force one more turn instead of finishing. Scoped strictly to autonomy jobs; chat turns are unchanged.
crates/hero_shrimp_engine/src/agent_core/agent/loop_setup.rs — initialize the new counter.
crates/hero_shrimp_server/src/rpc/methods/session_autonomy.rs — steer_existing_job_from_message now also pushes the steer text into the live steering inbox so the running loop drains it per-iteration (matches the session.steer path), in addition to the checkpoint guidance.

Tests

New: clear_pending_operator_guidance_round_trip, clear_pending_operator_guidance_missing_row_is_noop in job_context.rs.
cargo test -p hero_shrimp_engine -p hero_shrimp_server: all green, 0 failures (engine 1667 passed, server suites 315+ passed).

Caveats

The clear helper addresses the DB pending_operator_guidance key. OperatorGuidanceProvider also reads an on-disk operator_guidance field in the workspace task_state.json; whether that on-disk field is auto-cleared after a replan was not changed here.
The RPC-client topology (autonomy job dispatched to a separate worker process) was not traced; the steering inbox is process-global, so the live-inbox mirroring helps the in-process path. The checkpoint-gate fix is process-independent.

Changes are committed locally on the working branch; no push.

## Implementation Summary Fixed the dropped-steering bug for autonomy jobs in their final/summary turn, plus the never-cleared guidance re-injection, plus added per-iteration delivery for message.send steering. ### Root cause (confirmed) The `job.steer` / message.send path only writes `pending_operator_guidance` into the job row. That field is read by `OperatorGuidanceProvider` inside `build_history`, which runs once per turn at setup. If the job is already in its finalization turn, no new prompt is built, so the queued guidance is never rendered and is silently dropped. It was also never cleared, so on a job that did continue it re-injected every turn. Note on the secondary lead: the comment's hypothesis that `session.steer` no-ops because the autonomy loop runs with `session_id: None` does not hold for the in-process path — the loop runs with `session_id: Some(real_session_id)` (`pipeline.rs`), so the live-inbox drain does fire for autonomy jobs. ### Changes - `crates/hero_shrimp_engine/src/agent_core/agent/job_context.rs` — added `pending_operator_guidance(...)` (read helper) and `clear_pending_operator_guidance(...)` (removes the `pending_operator_guidance` / `_at` keys from the job row `details_json`, retaining `pending_operator_guidance_events` as audit). - `crates/hero_shrimp_engine/src/agent_core/agent/loop_support.rs` — added a bounded per-turn counter `operator_guidance_folds` on `AgentLoopState` and `MAX_OPERATOR_GUIDANCE_FOLDS = 3`. - `crates/hero_shrimp_engine/src/agent_core/agent/no_tool_calls.rs` — added a finalization gate just before `record_final_answer`: for autonomy jobs, if pending guidance exists and the fold cap is not reached, fold the guidance in as a high-priority user message, clear it, and force one more turn instead of finishing. Scoped strictly to autonomy jobs; chat turns are unchanged. - `crates/hero_shrimp_engine/src/agent_core/agent/loop_setup.rs` — initialize the new counter. - `crates/hero_shrimp_server/src/rpc/methods/session_autonomy.rs` — `steer_existing_job_from_message` now also pushes the steer text into the live steering inbox so the running loop drains it per-iteration (matches the `session.steer` path), in addition to the checkpoint guidance. ### Tests - New: `clear_pending_operator_guidance_round_trip`, `clear_pending_operator_guidance_missing_row_is_noop` in `job_context.rs`. - `cargo test -p hero_shrimp_engine -p hero_shrimp_server`: all green, 0 failures (engine 1667 passed, server suites 315+ passed). ### Caveats - The clear helper addresses the DB `pending_operator_guidance` key. `OperatorGuidanceProvider` also reads an on-disk `operator_guidance` field in the workspace `task_state.json`; whether that on-disk field is auto-cleared after a replan was not changed here. - The RPC-client topology (autonomy job dispatched to a separate worker process) was not traced; the steering inbox is process-global, so the live-inbox mirroring helps the in-process path. The checkpoint-gate fix is process-independent. Changes are committed locally on the working branch; no push.

rawan referenced this issue

2026-06-11 14:31:58 +00:00

int_steering #112

rawan referenced this issue

2026-06-11 14:34:08 +00:00

fix: steering #113

rawan closed this issue

2026-06-11 14:36:20 +00:00

Rows
Columns

Steering an autonomy job in its final/summary turn is silently dropped #93

Symptom

Root cause

Proposed fix

Update: fails from BOTH UI entry points

Implementation Spec for Issue #93

Findings (confirmed against code)

Objective

Requirements

Files to Modify

Implementation Plan

Step 1: Read + clear helpers in job_context.rs

Step 2: Bounded fold counter in AgentLoopState

Step 3: Gate finalization in no_tool_calls.rs

Step 4 (optional): Mirror message.send steer into live inbox

Step 5: Tests

Acceptance Criteria

Notes

Test Results

Implementation Summary

Root cause (confirmed)

Changes

Tests

Caveats

Implementation Summary

Root cause (confirmed)

Changes

Tests

Caveats

Step 1: Read + clear helpers in `job_context.rs`

Step 2: Bounded fold counter in `AgentLoopState`

Step 3: Gate finalization in `no_tool_calls.rs`