[nu-demo] hero_aibroker_server crash-loops on missing /var/ modelsconfig.yml — agent gets 401, Settings → Environment page broken, AI Assistant 'network error' #198
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Symptom (user-visible on herodemo, 2026-04-27)
Error: expected value at line 1 column 1— endpoint returns empty/HTML, frontend JSON parse fails. No API keys visible.Stream read error: JsValue(TypeError: network error TypeError: network error)after sending a message. Worked one minute earlier (different LLM provider routed direct).LLM streaming API error 401 Unauthorized: <html>for every Claude call (claude-3-5-sonnet-latest,claude-haiku-4.5).Root cause
The
_serverbinary has been down on herodemo since 2026-04-24. Onlyhero_aibroker_uiis running (the UI alone can't relay LLM calls, hence the 401s).The config file does exist in the source repo at
/home/driver/hero/code0/hero_aibroker/modelsconfig.yml, and in the cargo registry checkout, but it was never seeded into/home/driver/hero/var/hero_aibroker/— that directory doesn't exist at all on herodemo.Downstream fan-out
expected value at line 1 column 1Fix options
service_aibroker.nushouldcpthe bundledmodelsconfig.ymlfrom the repo into/home/driver/hero/var/hero_aibroker/on first install, with idempotent skip-if-present.modelsconfig.ymlif the var/ path is missing, and log a warning rather than failing startup.mkdir -p /home/driver/hero/var/hero_aibroker && cp /home/driver/hero/code0/hero_aibroker/modelsconfig.yml /home/driver/hero/var/hero_aibroker/recovery step in the demo runbook.Companion to #137
#137 covers the content of
modelsconfig.yml(groq-only models with no fallbacks). This issue is one layer below: the file isn't present at the expected path at all on herodemo, so hero_aibroker can't even load any routing config.Verification path
Filed during the rc.12 herodemo validation, surfaced when user tested AI Assistant. Pre-existing bug, ~3 days old; NOT caused by the rc.12 work.
Signed-off-by: mik-tf
Root cause + fix landed
The seeding logic in
service_aibroker.nuwas correct all along — it copiesmodelsconfig.ymlfrom the cloned source into$(svc_home $root)/var/hero_aibroker/. It silently no-op'd on herodemo becauseCODEROOTwas empty in the SSH login shell that drove the install.Why: Debian's default
~/.profiledoesn't source~/.bashrc. install.sh only writes the[ -f ~/hero/cfg/init.sh ] && source …snippet to~/.bashrc. SSH login shells (su - driver -c '…', cron, hero_proc child processes) read.profileand skip.bashrc, soCODEROOT/ROOTDIR/BUILDDIRnever reach them.forge_ensure_local\sensure_env_varserrors out, the seed step is skipped, but the action is still registered and crash-loops on the missing config.install.sh's
patch_shell_rc_filesalready adds a.profile → .bashrcbridge on first bootstrap (lines 741-758). Hosts upgraded viagit pullonly — like herodemo — don't pick up later additions to that logic.Fix landed
hero_skills
7ce11b3on development:ensure_shell_initfunction ininstallers.nuthat mirrors the install.sh.profilebridge logic.install_coreso everyinstall_corerun is self-healing on existing hosts (idempotent).Pragmatic patch on herodemo (used during today's session):
service_aibroker start --resetwith~/hero/cfg/init.shexplicitly sourced — broker is up and serving 42 models.Long-term verification
Next herodemo redeploy from clean development branches will exercise the
install_core → ensure_shell_initpath. If.profilealready has a hero bridge or sources.bashrc, the function logsalready bridges to hero initand skips. If neither, it appends the bash bridge snippet.Closing.
Signed-off-by: mik-tf