[nu-demo] hero_agent should support tool_choice="required" for grounded modes in llm_client.rs #150

Closed
opened 2026-04-24 01:20:01 +00:00 by mik-tf (Owner) · 1 comment

Why

Even with the right system prompt (see #149) and the right tool in the always_include list, well-behaved LLMs still don't always call the tool — "use this tool first" reads as a hint when it's phrased as a guideline, and models weigh it against their own confidence.

For the Hero OS demo, when a user asks about Hero-stack topics, we want deterministic grounding: the model MUST call search_hero_docs, period. The OpenAI tool-use API (and its OpenAI-compatible cousins) supports this directly via tool_choice:

| Value | Behavior |
|---|---|
| `"none"` | Tools not offered |
| `"auto"` (default) | Model decides |
| `"required"` | Model MUST call some tool (any offered) |
| `{"type":"function","function":{"name":"search_hero_docs"}}` | Model MUST call this specific tool |

Claude's tool-use extension (via Anthropic's OpenAI-compat endpoint — which is what aibroker is doing) honors all four.
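As a minimal sketch of what goes over the wire (hand-rolled JSON strings for illustration only; real code would use serde_json, and `tool_choice_value` is a hypothetical helper, not anything in llm_client.rs):

```rust
// Sketch: the four tool_choice wire values an OpenAI-compatible API
// accepts. "none"/"auto"/"required" are plain JSON strings; forcing a
// specific tool takes a small JSON object naming the function.
fn tool_choice_value(mode: &str, tool_name: Option<&str>) -> String {
    match (mode, tool_name) {
        ("specific", Some(name)) => format!(
            "{{\"type\":\"function\",\"function\":{{\"name\":\"{}\"}}}}",
            name
        ),
        // "none", "auto", "required" serialize as bare JSON strings
        (m, _) => format!("\"{}\"", m),
    }
}

fn main() {
    assert_eq!(tool_choice_value("required", None), "\"required\"");
    println!("{}", tool_choice_value("specific", Some("search_hero_docs")));
}
```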

Current state

hero_agent/crates/hero_agent/src/llm_client.rs::try_completion (line ~343) builds the request body as:

let mut body = serde_json::json!({
    "model": target.model,
    "messages": messages,
    "max_tokens": options.max_tokens.unwrap_or(self.max_tokens),
    "temperature": options.temperature.unwrap_or(self.temperature),
});

if let Some(tools) = tools
    && !tools.is_empty()
{
    body["tools"] = serde_json::to_value(tools)?;
}

tool_choice is never emitted — the field is entirely missing from the request body.

Same gap in try_stream around line 505.

Proposed change

  1. Add a tool_choice: Option<ToolChoice> field to LlmOptions:

pub enum ToolChoice {
    Auto,
    Required,
    Specific(String), // tool name
}

  2. In try_completion + try_stream, after setting tools, emit:

if let Some(choice) = &options.tool_choice {
    body["tool_choice"] = match choice {
        ToolChoice::Auto => json!("auto"),
        ToolChoice::Required => json!("required"),
        ToolChoice::Specific(name) => json!({
            "type": "function",
            "function": { "name": name }
        }),
    };
}

  3. In the agent's routing layer (agent.rs or routes.rs), heuristically set tool_choice = Required on the first turn of any conversation whose message contains hero-stack keywords (same list as the prompt.rs MANDATORY block — hero_*, hero os, etc.).

  4. Expose a force_tools: bool param in agent.chat so callers (test harnesses, evaluation scripts) can override.
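The step-3 heuristic can be sketched as follows (hypothetical function name and a reduced trigger list for illustration; the real keyword set lives alongside the prompt.rs MANDATORY block):

```rust
// Sketch: case-insensitive substring scan of a user message against
// hero-stack trigger keywords. Reduced list for illustration only.
const HERO_KEYWORDS: &[&str] = &["hero os", "hero_"];

fn contains_hero_keyword(message: &str) -> bool {
    let lower = message.to_lowercase();
    HERO_KEYWORDS.iter().any(|kw| lower.contains(kw))
}

fn main() {
    assert!(contains_hero_keyword("What is Hero OS?"));
    assert!(contains_hero_keyword("tell me about hero_books"));
    // No false trigger on the bare word "hero":
    assert!(!contains_hero_keyword("Who is your hero?"));
}
```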

Rollback / safety

  • tool_choice = "required" forces a tool call but the model still picks which tool — if the model picks the wrong one (e.g. list_services instead of search_hero_docs), we get junk grounding. Use Specific("search_hero_docs") for the strongest guarantee.
  • On follow-up turns (after a tool_result), set tool_choice = "auto" so the model can compose the final answer without being forced back into another tool call.
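The per-turn policy in the two bullets above can be sketched as (hypothetical function name; `ToolChoice` mirrors the enum proposed in this issue):

```rust
// Sketch: force the grounding tool on the first turn only, then relax
// so the model can compose the final answer from tool_results.
#[derive(Debug, PartialEq)]
enum ToolChoice {
    Auto,
    Required,
    Specific(String),
}

fn choice_for_turn(turn: usize, grounding_needed: bool) -> Option<ToolChoice> {
    if turn == 0 && grounding_needed {
        // Strongest guarantee: pin the specific grounding tool.
        Some(ToolChoice::Specific("search_hero_docs".to_string()))
    } else {
        // Follow-up turns: let the model answer freely.
        Some(ToolChoice::Auto)
    }
}

fn main() {
    assert_eq!(
        choice_for_turn(0, true),
        Some(ToolChoice::Specific("search_hero_docs".to_string()))
    );
    assert_eq!(choice_for_turn(1, true), Some(ToolChoice::Auto));
    assert_eq!(choice_for_turn(0, false), Some(ToolChoice::Auto));
}
```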

Verification

Before (confirmed 2026-04-24): plain "What is Hero OS?" with model=claude-haiku-4.5 returns generic OS description.

After: same query must return content citing hero_os_guide and describing Hero OS as a "sovereign digital workspace" matching the docs_hero corpus.

Related

  • #148 — flow-documentation index
  • #149 (or next available) — prompt.rs directive strengthening

Signed-off-by: mik-tf

mik-tf (Author, Owner)

Fixed in hero_agent commit 54ba5d5 on development. Implements steps 1–3 from the issue body; step 4 is deferred (see below).

Step 1+2 — LlmOptions.tool_choice + body emission (llm_client.rs):

pub enum ToolChoice {
    Auto,
    Required,
    Specific(String),
}

pub struct LlmOptions {
    // …
    pub tool_choice: Option<ToolChoice>,
}

In build_request_body, after setting tools:

if let Some(choice) = &options.tool_choice {
    body["tool_choice"] = match choice {
        ToolChoice::Auto => json!("auto"),
        ToolChoice::Required => json!("required"),
        ToolChoice::Specific(name) => json!({
            "type": "function",
            "function": { "name": name }
        }),
    };
}

The emission is gated inside the existing if let Some(tools) = tools && !tools.is_empty() block — tool_choice without tools is meaningless and would draw a 400 from the provider, so the field is omitted.
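The gating rule can be sketched as a small predicate (hypothetical helper name, not the actual llm_client.rs code):

```rust
// Sketch: tool_choice is only worth emitting when a non-empty tools
// array is also present; otherwise the provider would reject the body.
fn should_emit_tool_choice(tools: Option<&[&str]>, choice_set: bool) -> bool {
    choice_set && tools.map_or(false, |t| !t.is_empty())
}

fn main() {
    assert!(should_emit_tool_choice(Some(&["search_hero_docs"]), true));
    assert!(!should_emit_tool_choice(None, true)); // no tools => omit
    assert!(!should_emit_tool_choice(Some(&[]), true)); // empty => omit
    assert!(!should_emit_tool_choice(Some(&["x"]), false)); // nothing set
}
```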

Step 3 — agent.rs heuristic for forced grounding:

New helper message_contains_hero_keyword(messages) scans the LATEST user message only, case-insensitive, against the same hero_* trigger set used in prompt.rs (home#149). Earlier turns are explicitly ignored — otherwise we'd pin grounding on every iteration after the first match.
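The "latest user message only" behavior can be sketched like so (hypothetical types and a reduced keyword list for illustration; the real helper and trigger set are in agent.rs / prompt.rs):

```rust
// Sketch: walk the transcript backwards to the most recent user
// message and scan only that one, so an early hero-stack match does
// not pin grounding on every later iteration.
struct Msg {
    role: &'static str,
    content: &'static str,
}

const KEYWORDS: &[&str] = &["hero os", "hero_"];

fn latest_user_matches(messages: &[Msg]) -> bool {
    messages
        .iter()
        .rev()
        .find(|m| m.role == "user") // earlier turns are ignored
        .map_or(false, |m| {
            let lower = m.content.to_lowercase();
            KEYWORDS.iter().any(|kw| lower.contains(kw))
        })
}

fn main() {
    let msgs = [
        Msg { role: "user", content: "What is Hero OS?" }, // earlier match...
        Msg { role: "assistant", content: "..." },
        Msg { role: "user", content: "thanks — what about the weather?" },
    ];
    assert!(!latest_user_matches(&msgs)); // ...does not pin grounding
}
```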

In agent_loop(), on iteration 0 only:

let tool_choice = if iteration == 0
    && tools.iter().any(|t| t.function.name == "search_hero_docs")
    && message_contains_hero_keyword(messages)
{
    Some(ToolChoice::Specific("search_hero_docs".to_string()))
} else {
    None
};

Matches the issue body's "Rollback / safety": iterations >= 1 carry tool_choice = None so the model can compose the final answer from tool_results without being forced into another tool call.

Step 4 — force_tools agent.chat param: deferred. The tool_choice field on LlmOptions is already public and can be threaded through agent.chat if a test harness needs to force a specific tool — adding the explicit user-facing param can land in a follow-up if/when an integration test demands it.

Tests added (9 new):

llm_client::tests::tool_choice (4):

  • body_omits_tool_choice_when_none — preserves provider default ("auto")
  • body_emits_required_when_required
  • body_emits_specific_when_specific
  • body_omits_tool_choice_without_tools_even_if_set — guard against 400

agent::tests (5):

  • basic match (Hero OS, hero_books, hero_aibroker)
  • case-insensitive (HERO OS, Hero_Books)
  • unrelated content rejected ("Who is your hero?", weather)
  • only the latest user message is scanned
  • empty messages handled

Verification: cargo fmt, cargo check -p hero_agent clean, all 98 hero_agent lib tests pass (89 pre-existing + 9 new).

Pairing: This is the belt to home#149's prompt-directive suspenders. The prompt directive guides the model; tool_choice = Specific(name) is a hard constraint at the API level. Together they ensure deterministic grounding for hero-stack questions without breaking the agent's freedom on non-hero queries.

Meta-tracker: home#193.

Signed-off-by: mik-tf

Reference
lhumina_code/home#150