lhumina_code/hero_shrimp

thinking is getting shown as normal text #104

Closed

opened 2026-06-10 11:40:40 +00:00 by salmaelsoly · 4 comments

salmaelsoly commented

2026-06-10 11:40:40 +00:00

Member

![image](/attachments/84b4e664-2707-46ae-9b94-d42db340dd02)

salmaelsoly self-assigned this

2026-06-11 09:57:15 +00:00

salmaelsoly commented

2026-06-11 11:19:08 +00:00

Author

Member

Implementation Spec for Issue #104

Objective

Inline <think>...</think> reasoning segments emitted by models inside the assistant content field must be routed into the existing reasoning lane (reasoning events, LlmResponse.reasoning, messages.reasoning_json, the UI ThinkingPane) instead of being rendered as plain text in the chat. The fix must work for streaming (tags may be split across SSE chunks), non-streaming responses, and already-persisted history that still contains raw tags.

Requirements

Strip <think>...</think> segments from assistant content at the runtime completion layer so every downstream consumer (events, persistence, turn:end reply, job narration) receives clean content.
Route the extracted text into the existing reasoning lane: llm:delta with kind: "reasoning" during streaming, and ReasoningBlock entries on the final LlmResponse (so persist_message_with_reasoning stores it in reasoning_json and multi-turn rehydration keeps working).
Handle tags split across stream chunks (e.g. one chunk ends with <thi, the next starts with nk>): no tag fragment may leak into the content lane, and held-back text must be flushed when it turns out not to be a tag.
Handle an unclosed <think> at end of stream (treat the remainder as reasoning, not content).
Handle multiple <think> blocks interleaved with prose in one response (as shown in the issue screenshot).
Leave the Anthropic-compatible path untouched (thinking arrives as structured blocks there) and leave trajectory export untouched (it intentionally wraps reasoning in <think> for distillation JSONL).
Frontend fallback: messages already persisted with raw <think> tags (and replies from older daemons) must render with the thinking content moved into the collapsible ThinkingPane, never as body text.
Settled (non-streaming) assistant messages that have reasoning text must offer a collapsed "show reasoning" pane instead of hiding the reasoning entirely.

Files to Modify/Create

crates/hero_shrimp_runtime/src/llm/completion/think_tags.rs - New module: incremental stream filter plus whole-string splitter for <think> tags, with unit tests.
crates/hero_shrimp_runtime/src/llm/completion/mod.rs - Wire the filter into the streaming SSE loop and the non-streaming OpenAI-compatible path.
crates/hero_shrimp_web/ui/src/store.ts - Client-side fallback: extract <think> segments in flushStreamBuffer, the turn:end handler, and loadConversationMessages history mapping.
crates/hero_shrimp_web/ui/src/components/ChatThread.tsx - Render a collapsed ThinkingPane on settled assistant messages that carry reasoningText.
crates/hero_shrimp_web/static/ - Regenerated Vite bundle (output of npm run build in crates/hero_shrimp_web/ui/; embedded via rust-embed).

Implementation Plan

Step 1: Add a think-tag parser module to the runtime

Files: crates/hero_shrimp_runtime/src/llm/completion/think_tags.rs (new), crates/hero_shrimp_runtime/src/llm/completion/mod.rs (module declaration only)

Create think_tags.rs containing a stateful incremental parser ThinkTagFilter with push(&mut self, chunk: &str) -> Vec<DeltaPiece> and finish(&mut self) -> Vec<DeltaPiece>, where each DeltaPiece is either Content or Reasoning text. The state machine toggles on exact <think> / </think> markers. When a chunk ends with a proper prefix of a potential marker, the filter holds that prefix back and resolves it on the next push. finish() flushes held-back text into whichever lane is active (content outside a think section, reasoning inside one) and treats any still-open think section as reasoning.
Add split_think_tags(content: &str) -> (Option<String>, Vec<String>) for non-streaming responses: returns content with all <think>...</think> spans removed (collapsing excess blank lines, trimming; None if nothing remains) plus extracted reasoning texts. An unclosed trailing <think> consumes to end-of-string as reasoning.
Unit tests covering: tag split across two and three chunks; multiple think blocks interleaved with prose; a lone < that is not a tag; unclosed <think> at stream end; input with no tags passes through byte-identical; split_think_tags on the issue's example transcript.
Declare mod think_tags; in completion/mod.rs.
Dependencies: none
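
The incremental filter described in Step 1 could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the code that landed in think_tags.rs: the names ThinkTagFilter, DeltaPiece, push and finish come from the spec, while the pending field, the emit and held_back_len helpers, and the choice to skip empty pieces are hypothetical. Held-back text is flushed into whichever lane is active when finish() is called.

```rust
/// One routed fragment of model output.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum DeltaPiece {
    Content(String),
    Reasoning(String),
}

const OPEN: &str = "<think>";
const CLOSE: &str = "</think>";

/// Stateful filter; create one per stream.
#[derive(Default)]
pub struct ThinkTagFilter {
    in_think: bool,
    /// Tail of the previous chunk that may be the start of a marker (assumed field name).
    pending: String,
}

/// Length of the longest suffix of `text` that is a proper prefix of `marker`.
fn held_back_len(text: &str, marker: &str) -> usize {
    (1..marker.len())
        .rev()
        .find(|&k| text.ends_with(&marker[..k]))
        .unwrap_or(0)
}

/// Append `text` to `out` in the lane selected by `in_think`, skipping empty text.
fn emit(in_think: bool, out: &mut Vec<DeltaPiece>, text: &str) {
    if text.is_empty() {
        return;
    }
    out.push(if in_think {
        DeltaPiece::Reasoning(text.to_string())
    } else {
        DeltaPiece::Content(text.to_string())
    });
}

impl ThinkTagFilter {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn push(&mut self, chunk: &str) -> Vec<DeltaPiece> {
        // Re-attach whatever was held back from the previous chunk.
        let mut buf = std::mem::take(&mut self.pending);
        buf.push_str(chunk);
        let mut out = Vec::new();
        let mut rest = buf.as_str();
        loop {
            // Outside a think section only the opener matters, inside only the closer.
            let marker = if self.in_think { CLOSE } else { OPEN };
            match rest.find(marker) {
                Some(idx) => {
                    emit(self.in_think, &mut out, &rest[..idx]);
                    self.in_think = !self.in_think;
                    rest = &rest[idx + marker.len()..];
                }
                None => {
                    // Hold back a trailing partial marker; emit everything before it.
                    let split = rest.len() - held_back_len(rest, marker);
                    emit(self.in_think, &mut out, &rest[..split]);
                    self.pending = rest[split..].to_string();
                    return out;
                }
            }
        }
    }

    pub fn finish(&mut self) -> Vec<DeltaPiece> {
        // A held-back fragment was not a marker after all: flush it to the active lane.
        let mut out = Vec::new();
        let tail = std::mem::take(&mut self.pending);
        emit(self.in_think, &mut out, &tail);
        out
    }
}
```

In this sketch an unclosed <think> needs no special casing in finish(): text after the opener is already emitted as Reasoning by push, and only a held-back fragment remains to be flushed.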

Step 2: Filter inline think tags in the streaming dispatch path

Files: crates/hero_shrimp_runtime/src/llm/completion/mod.rs

In the streaming loop, instantiate one ThinkTagFilter per stream next to acc_content / acc_reasoning.
Route each delta.content chunk through filter.push(c); append returned pieces to acc_content or acc_reasoning and, when emit_stream is true, emit llm:delta with kind: "content" or kind: "reasoning" respectively, reusing the existing JSON shapes. Structured delta.reasoning / delta.reasoning_content handling stays unchanged.
After the stream loop ends and before final assembly, call filter.finish() and append the resulting pieces to the accumulators.
Keep filtering and accumulation unconditional even when emit_stream is false (background phases); only event emission stays gated, matching current behavior.
Dependencies: Step 1
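
The routing rule in Step 2 (always accumulate, emit only when emit_stream is true) can be sketched as follows. The DeltaEvent struct and the route_pieces function are hypothetical stand-ins for the runtime's real llm:delta JSON emission, which is not shown in this issue; only acc_content, acc_reasoning, emit_stream and the two kind values come from the spec.

```rust
/// One routed fragment, as returned by the think-tag filter.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum DeltaPiece {
    Content(String),
    Reasoning(String),
}

/// Hypothetical stand-in for an `llm:delta` event payload.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DeltaEvent {
    pub kind: &'static str,
    pub text: String,
}

/// Append each piece to its accumulator; record an event only when streaming to chat.
pub fn route_pieces(
    pieces: Vec<DeltaPiece>,
    acc_content: &mut String,
    acc_reasoning: &mut String,
    emit_stream: bool,
    events: &mut Vec<DeltaEvent>,
) {
    for piece in pieces {
        let (kind, text) = match piece {
            DeltaPiece::Content(t) => {
                acc_content.push_str(&t);
                ("content", t)
            }
            DeltaPiece::Reasoning(t) => {
                acc_reasoning.push_str(&t);
                ("reasoning", t)
            }
        };
        // Accumulation above is unconditional; only emission is gated.
        if emit_stream {
            events.push(DeltaEvent { kind, text });
        }
    }
}
```

The same function would be called once more with the output of filter.finish() after the stream loop ends, so a held-back fragment still reaches the accumulators before final assembly.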

Step 3: Filter inline think tags in the non-streaming OpenAI-compatible path

Files: crates/hero_shrimp_runtime/src/llm/completion/mod.rs

In the non-streaming JSON branch, after content is extracted and reasoning is built via reasoning_from_assistant_message, run split_think_tags on the content; replace content with the cleaned remainder and push each extracted segment as a ReasoningBlock onto the reasoning vec.
Do this before the phase_emits_to_chat_stream block so the existing agent:reasoning emission and the llm:response preview pick up the changes.
Leave the Anthropic-compatible branch untouched.
Dependencies: Step 1
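
For the non-streaming path, a whole-string splitter with the signature given in Step 1 might look like the sketch below. Only the signature and the stated behaviour (spans removed, excess blank lines collapsed, trimming, None when nothing remains, unclosed trailing <think> consumed to end of string) come from the spec. The collapse_blank_lines helper, trimming each extracted segment, and the early return for tag-free input are assumptions.

```rust
const OPEN: &str = "<think>";
const CLOSE: &str = "</think>";

/// Collapse runs of three or more newlines down to two (assumed helper).
fn collapse_blank_lines(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    let mut newlines = 0;
    for ch in text.chars() {
        if ch == '\n' {
            newlines += 1;
            if newlines > 2 {
                continue;
            }
        } else {
            newlines = 0;
        }
        out.push(ch);
    }
    out
}

/// Returns (cleaned content, extracted reasoning segments).
pub fn split_think_tags(content: &str) -> (Option<String>, Vec<String>) {
    // Tag-free input is returned untouched so existing behaviour does not change.
    if !content.contains(OPEN) {
        let kept = (!content.is_empty()).then(|| content.to_string());
        return (kept, Vec::new());
    }

    let mut kept = String::new();
    let mut reasoning = Vec::new();
    let mut rest = content;
    while let Some(start) = rest.find(OPEN) {
        kept.push_str(&rest[..start]);
        let after_open = &rest[start + OPEN.len()..];
        // An unclosed opener consumes the remainder of the string as reasoning.
        let (thought, tail) = match after_open.find(CLOSE) {
            Some(end) => (&after_open[..end], &after_open[end + CLOSE.len()..]),
            None => (after_open, ""),
        };
        let thought = thought.trim();
        if !thought.is_empty() {
            reasoning.push(thought.to_string());
        }
        rest = tail;
    }
    kept.push_str(rest);

    let cleaned = collapse_blank_lines(kept.trim());
    ((!cleaned.is_empty()).then_some(cleaned), reasoning)
}
```

Each returned segment would then be pushed as a ReasoningBlock onto the reasoning vec, and the Option<String> would replace content before the phase_emits_to_chat_stream block runs.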

Step 4: Client-side fallback extraction in the web UI store

Files: crates/hero_shrimp_web/ui/src/store.ts

Add a helper next to stripPseudoToolCalls: extractThinkBlocks(text) removes all complete <think>...</think> spans plus a trailing unclosed <think>..., returning the cleaned text and the concatenated reasoning.
Apply it in three places: flushStreamBuffer (run on the accumulated text before stripPseudoToolCalls, appending extracted reasoning to the message's reasoningText/reasoningChars); the turn:end handler (clean reply, merge reasoning); and loadConversationMessages (clean persisted history so old messages with raw tags render correctly).
This is defense in depth: it covers messages already in the database, replies from older daemons, and any path the backend filter misses.
Dependencies: none (independent of Steps 1-3)

Step 5: Show a collapsed reasoning pane on settled messages

Files: crates/hero_shrimp_web/ui/src/components/ChatThread.tsx

ThinkingPane currently renders only during streaming fallback and job-narration think segments; a finalized message with reasoningText shows nothing. Render a collapsed ThinkingPane above the message body whenever the message is an assistant message with reasoningText and is not in the streaming-fallback state, so extracted thinking from history reload and turn:end is reachable via the existing "show reasoning" toggle.
Dependencies: Step 4

Step 6: Rebuild the embedded UI bundle and run verification

Files: crates/hero_shrimp_web/ui/ (build), crates/hero_shrimp_web/static/assets/ (generated output)

Run npm run build in crates/hero_shrimp_web/ui/ and commit the regenerated static/assets (the hashed bundle is committed and embedded via rust-embed at compile time).
Run cargo test -p hero_shrimp_runtime, cargo clippy -p hero_shrimp_runtime -p hero_shrimp_web -- -D warnings, and a workspace cargo check.
Dependencies: Steps 1-5

Acceptance Criteria

A streamed response containing <think>...</think> renders the thinking text only in the collapsible ThinkingPane; the message body contains no <think> or </think> markers.
A <think> or </think> marker split across SSE chunk boundaries never leaks fragments (e.g. <thi) into the visible body, and non-tag text starting with < is not swallowed.
An unclosed <think> at end of stream is treated as reasoning, not content.
Multiple <think> blocks interleaved with prose all land in the reasoning lane; the prose between them remains intact and in order.
Non-streaming OpenAI-compatible responses get the same treatment; LlmResponse.content is clean and extracted segments appear in LlmResponse.reasoning (and hence messages.reasoning_json).
The persisted assistant message text contains no think tags; reopening a conversation persisted before this fix (raw tags in DB) also renders clean, with reasoning available behind "show reasoning".
Settled assistant messages with reasoning show a collapsed ThinkingPane toggle; responses without think tags are unchanged from current behavior.
Anthropic-compatible structured thinking and trajectory/ShareGPT export behavior are unchanged.
New unit tests in think_tags.rs pass; cargo test -p hero_shrimp_runtime and workspace clippy are green; the regenerated static/assets bundle is committed.

Notes

The runtime already has the full reasoning plumbing (events, ReasoningBlock, reasoning_json persistence, ThinkingPane); this issue is purely that inline-tag models bypass it. No new event type is needed; reuse llm:delta kind:"reasoning".
Match only exact lowercase <think> / </think>. Some DeepSeek-R1 deployments omit the opening tag and emit only a closing </think>; handling a leading orphan </think> is a worthwhile hardening but should be a deliberate, tested choice in Step 1.
trajectory.rs intentionally wraps reasoning in <think> for ShareGPT export; no change there and the filter must not run on that output.
The store's job narration lane is fed by the same llm:delta events, so the backend fix automatically cleans background-job narration too.
The active frontend is crates/hero_shrimp_web/ui/ (Vite build into static/assets); v2-src/ must not be modified.

salmaelsoly commented

2026-06-11 12:11:12 +00:00

Author

Member

Test Results

cargo test --workspace (all 14 member crates, unit + integration + doc tests):

Total: 2758
Passed: 2740
Failed: 0
Ignored: 18 (16 doc-test examples marked ignore, 2 unit tests)

Clippy: clean (cargo clippy --workspace -- -D warnings exits 0, no warnings)

New think_tags unit tests: 13 passed (hero_shrimp_runtime::llm::completion::think_tags)

All 13 think_tags tests verified individually:

held_back_partial_close_at_finish_flushes_as_reasoning
held_back_partial_marker_at_finish_flushes_as_content
held_back_prefix_that_is_not_a_tag_flushes_as_content
lone_lt_followed_by_non_marker_stays_content
marker_split_across_three_chunks
marker_split_across_two_chunks
multiple_blocks_interleaved_with_prose
no_tags_passes_through_byte_identical
split_entirely_one_think_block_returns_none_content
split_multiple_blocks
split_no_tags_returns_original_unchanged
split_unclosed_trailing_think
unclosed_think_at_end_of_stream_is_reasoning

salmaelsoly commented

2026-06-11 12:14:29 +00:00

Author

Member

Implementation Summary

Inline <think>...</think> reasoning emitted by models inside the assistant content is now routed into the existing reasoning lane instead of being rendered as plain text in the chat. The thinking content still appears, but in the collapsible ThinkingPane where reasoning normally shows, and it is persisted as reasoning rather than message text.

Changes

crates/hero_shrimp_runtime/src/llm/completion/think_tags.rs (new): incremental ThinkTagFilter for streaming (handles markers split across SSE chunks, held-back partial markers, unclosed trailing <think>) and split_think_tags for whole-string responses; 13 unit tests.
crates/hero_shrimp_runtime/src/llm/completion/mod.rs: streaming path routes every delta.content chunk through the filter, emitting llm:delta with kind: "content" or kind: "reasoning" and accumulating into the existing acc_content/acc_reasoning; filter is flushed after the stream ends. Non-streaming OpenAI-compatible path runs split_think_tags on the extracted content and pushes extracted segments as ReasoningBlock entries before event emission. Anthropic-compatible branch and trajectory export untouched.
crates/hero_shrimp_web/ui/src/store.ts: client-side fallback extractThinkBlocks applied in flushStreamBuffer, the turn:end handler, and loadConversationMessages, so history persisted with raw tags (and replies from older daemons) renders clean with the reasoning attached to the message.
crates/hero_shrimp_web/ui/src/components/ChatThread.tsx: settled assistant messages with reasoning now render a collapsed ThinkingPane above the body (default closed, existing toggle), so extracted thinking remains reachable after the message finalizes.
crates/hero_shrimp_web/static/: regenerated Vite bundle (app.CKaPpq6t.js, index.DypXtPlZ.css, index.html references).

Test Results

cargo test --workspace: 2740 passed, 0 failed, 18 ignored (2758 total)
cargo clippy --workspace -- -D warnings: clean
New think_tags unit tests: 13 passed

Notes

Only exact lowercase <think> / </think> markers are matched. The DeepSeek variant that omits the opening tag (orphan leading </think>) is deliberately not handled; this is documented in the parser module.
Reasoning extracted from inline tags is persisted to reasoning_json like structured reasoning, so it survives conversation reloads.

salmaelsoly referenced this issue from a commit

2026-06-11 12:16:04 +00:00

fix(runtime,web): route inline think tags into reasoning lane

salmaelsoly referenced this issue

2026-06-11 12:16:23 +00:00

fix(runtime,web): route inline think tags into reasoning lane #109

salmaelsoly commented

2026-06-11 12:16:41 +00:00

Author

Member

Pull request opened: #109

This PR implements the changes discussed in this issue.