Kimi assistant: trim the MCP tool surface so chat actions stay fast #249
Reference
lhumina_code/home#249
When the Kimi assistant is wired to planner, whiteboard and slides over MCP, it loads every tool from all three services at startup (about 91 + 102 + 152, roughly 345 tools in total). Reads (list and summarize) are reliable. Write actions that need several tool calls are slow on a cold start: they can take over a minute or time out, because every turn carries the full tool surface and the model has to reason over all of it.
In a live chat the session stays warm, so the first message pays the load cost and later messages are fast. The slowness shows up mainly on the first action and on scripted single-shot calls.
Proposal: curate or subset the MCP tools exposed to Kimi (for example, a small high-value set per app, or lazy tool loading) so that a single action stays fast even on a cold start, while all three apps remain reachable.
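A minimal sketch of the "small set per app" option, to make the proposal concrete. It assumes a simplified `Tool` shape and a per-app allow-list; the `Tool` struct, the `curate` and `tool` helpers, and any tool names passed to them are hypothetical and are not taken from the planner, whiteboard or slides services.

```rust
use std::collections::{HashMap, HashSet};

/// A tool as advertised by an MCP server (hypothetical, simplified shape).
#[derive(Debug, Clone, PartialEq)]
pub struct Tool {
    pub app: String,
    pub name: String,
    pub description: String,
}

/// Convenience constructor used for illustration only.
pub fn tool(app: &str, name: &str) -> Tool {
    Tool {
        app: app.to_string(),
        name: name.to_string(),
        description: String::new(),
    }
}

/// Keep only the tools named in a per-app allow-list.
/// An app with no entry in `allow` exposes nothing, so the surface is opt-in.
/// Input order is preserved.
pub fn curate(all: &[Tool], allow: &HashMap<&str, HashSet<&str>>) -> Vec<Tool> {
    all.iter()
        .filter(|t| {
            allow
                .get(t.app.as_str())
                .map_or(false, |names| names.contains(t.name.as_str()))
        })
        .cloned()
        .collect()
}
```

With an allow-list that has at least one entry for each of the three apps, every app stays reachable while the per-turn tool list shrinks to the curated set. The trade-off is that any tool left off the list is unavailable until the list is edited.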
Related deploy readiness findings while wiring this up (so they are tracked):
Verified on the live demo tester: Kimi connected on the moonshotai/kimi-k2.6 model, read planner, whiteboard and slides over MCP and returned real data. Write actions succeed but are slow on a cold start as described above.
The Kimi assistant change that implements this is in hero_kimi_rust (lhumina_code/hero_kimi_rust#3). Instead of sending every connected tool with its full description on every turn, it keeps the tools out of the list and lets the model pull only the ones it needs through a search tool. That removes the large per-turn cost that made the first action slow. The change has merged into that repo's main branch (commit b9210c9), so deploying that build to the tester should make assistant actions fast and close this issue. We will verify on the tester after deploying.
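The search-tool idea can be sketched roughly as follows. This is not the code from hero_kimi_rust#3; it is an illustration that assumes a simple keyword match. `ToolSpec`, `ToolRegistry`, `spec`, and the `search_tools` name are all hypothetical.

```rust
/// Name and description of one tool (hypothetical, simplified shape).
#[derive(Debug, Clone, PartialEq)]
pub struct ToolSpec {
    pub name: String,
    pub description: String,
}

/// Convenience constructor used for illustration only.
pub fn spec(name: &str, description: &str) -> ToolSpec {
    ToolSpec {
        name: name.to_string(),
        description: description.to_string(),
    }
}

/// Holds every connected tool but advertises only a search tool.
pub struct ToolRegistry {
    tools: Vec<ToolSpec>,
}

impl ToolRegistry {
    pub fn new(tools: Vec<ToolSpec>) -> Self {
        Self { tools }
    }

    /// What would be sent on every turn: just the search tool,
    /// regardless of how many tools are registered.
    pub fn advertised(&self) -> Vec<ToolSpec> {
        vec![spec("search_tools", "Find available tools by keyword")]
    }

    /// Rank tools by how many query words appear in their name or
    /// description (case-insensitive) and return at most `limit` matches.
    pub fn search(&self, query: &str, limit: usize) -> Vec<&ToolSpec> {
        let words: Vec<String> = query
            .split_whitespace()
            .map(|w| w.to_lowercase())
            .collect();
        let mut scored: Vec<(usize, &ToolSpec)> = self
            .tools
            .iter()
            .map(|t| {
                let hay = format!("{} {}", t.name, t.description).to_lowercase();
                let score = words.iter().filter(|w| hay.contains(w.as_str())).count();
                (score, t)
            })
            .filter(|(score, _)| *score > 0)
            .collect();
        // Stable sort: ties keep registration order.
        scored.sort_by(|a, b| b.0.cmp(&a.0));
        scored.into_iter().take(limit).map(|(_, t)| t).collect()
    }
}
```

In this sketch the per-turn payload is constant (one tool), and the model pays for full descriptions only for the handful of tools a search returns. A real implementation would likely need better ranking than substring counting.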
Signed-by: mik-tf mik-tf@noreply.invalid