my_compute_zos_server service.toml missing [[env]] for TFGRID_NETWORK / TFGRID_NODE_IDS / TFGRID_MNEMONIC — supervised path defaults to mainnet regardless of hero_proc secret values #127
Reference
lhumina_code/hero_compute#127
`crates/my_compute_zos_server/service.toml` only declares `[[env]]` blocks for `PATH_ROOT`, `HERO_SOCKET_DIR`, and `RUST_LOG`. The chain-targeting env vars `TFGRID_NETWORK`, `TFGRID_NODE_IDS`, and `TFGRID_MNEMONIC` are not declared, so when hero_proc supervises the daemon, those values are not injected into the process environment (the executor composes env from context secrets + service spec env; missing keys are not added).

Result: setting `hero_proc secret set core/TFGRID_NETWORK qa` then `hero_proc service restart my_compute_zos_server` does NOT switch the daemon to QA. The daemon reads `std::env::var("TFGRID_NETWORK")` at startup, gets `Err(NotPresent)`, and falls back to the default `main` per `crates/my_compute_zos_server/src/config.rs:42`.

Workaround used in s158: stop the supervised daemon, launch manually via `nohup` with explicit env (mirrors the `TFGRID_DEBUG=1` workaround from s157d). This loses the supervisor + restart-on-crash guardrails.

Right fix: add three `[[env]]` blocks in service.toml with `default = ""`, mirroring Lessons #17 + #19 (any env var the daemon reads must be declared in service.toml so hero_proc + lab propagate it). Then the existing hero_proc supervisor context-secrets injection at `crates/hero_proc_server/src/supervisor/executor.rs:61-71` will route the secret values into the process env at spawn time.

Caught during s158 admin-on-TFGrid setup when pivoting from mainnet to QAnet.
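The silent fallback described above can be illustrated with a minimal Rust sketch. `resolve_network` is a hypothetical helper written for this example, not the code at `config.rs:42`; it assumes only the behavior stated in this issue (a missing `TFGRID_NETWORK` yields `main`).

```rust
use std::env::{self, VarError};

/// Hypothetical helper: maps the result of an env lookup to a network name.
/// Any lookup error (including `NotPresent`) falls back to "main",
/// matching the default described in this issue.
fn resolve_network(raw: Result<String, VarError>) -> String {
    raw.unwrap_or_else(|_| "main".to_string())
}

fn main() {
    // If the supervisor never injects TFGRID_NETWORK, this lookup errors
    // and the daemon targets mainnet without any warning.
    let network = resolve_network(env::var("TFGRID_NETWORK"));
    println!("targeting network: {network}");
}
```

Because the fallback is silent, a secret that never reaches the process environment is indistinguishable from a deliberate choice of mainnet.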
Correcting the diagnosis on this issue after re-reading the canonical hero_proc architecture.
The fix is NOT to add `[[env]]` blocks to `service.toml`. Per the `hero_proc_secrets_and_meta` skill (canonical), every `_admin` and `_server` process must source all configuration exclusively from the hero_proc secret store via `hero_proc_sdk::secret_get`, not from OS env, not from a `.env` file, not from `service.toml` `[[env]]`.

The actual root cause is in `crates/my_compute_zos_server/src/util.rs::load_env()` and `crates/hero_compute_sdk/src/lib.rs::load_env()`. Both read `~/hero/var/.env` into process env at startup. `Config::from_env()` then calls `std::env::var("TFGRID_NETWORK")` etc. This whole chain is off-pattern: `.env` files are not a supported config source for hero-supervised processes.

The proper fix is a `Config::from_hero_proc()` method that connects to the hero_proc UDS and calls `secret_get` for each TFGrid-namespaced key (`TFGRID_MNEMONIC`, `TFGRID_NETWORK`, plus operational settings). The `Config` struct already has a `hero_proc_client()` helper and a `set_secret()` writer; we just need the symmetric reader. Standalone CLI modes (`sign_auth`, `cancel_contracts`, `delete_all_vms`) should use the same client.

Plan: ship the `from_hero_proc()` refactor for `my_compute_zos_server` in this session, then file follow-ups for the four sibling binaries (`my_compute_mos_server`, `my_compute_explorer_server`, `my_compute_explorer_admin`, `my_compute_zos_admin`) that all currently call `hero_compute_sdk::load_env()`. After the refactor lands, `util::load_env()` and `hero_compute_sdk::load_env()` become deprecated and can be removed across the workspace in a sweep.

Reference: hero_proc_secrets_and_meta skill, Hero Compute Suite docs.
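A rough sketch of the shape such a reader could take is shown below. Everything in it is an assumption made for illustration: the `SecretStore` trait, `InMemoryStore`, `from_store`, and the two-field `Config` are hypothetical and do not reflect the real `hero_proc_sdk` client API or the actual `Config` struct. The sketch is synchronous and uses an in-memory map instead of the hero_proc UDS so that it runs standalone. Treating `TFGRID_MNEMONIC` as required and defaulting `TFGRID_NETWORK` to `main` are also assumptions.

```rust
use std::collections::HashMap;

/// Hypothetical abstraction over the secret store; the real
/// hero_proc_sdk client and its `secret_get` signature may differ.
trait SecretStore {
    fn secret_get(&self, key: &str) -> Option<String>;
}

/// In-memory stand-in used only so the example is runnable.
struct InMemoryStore(HashMap<String, String>);

impl SecretStore for InMemoryStore {
    fn secret_get(&self, key: &str) -> Option<String> {
        self.0.get(key).cloned()
    }
}

/// Simplified config; the real struct carries more settings.
#[derive(Debug, PartialEq)]
struct Config {
    network: String,
    mnemonic: String,
}

impl Config {
    /// Reads every setting from the secret store, never from process env.
    fn from_store(store: &dyn SecretStore) -> Result<Config, String> {
        // Assumption: a missing mnemonic is a hard error.
        let mnemonic = store
            .secret_get("TFGRID_MNEMONIC")
            .ok_or_else(|| "missing secret TFGRID_MNEMONIC".to_string())?;
        // Assumption: the network keeps its "main" default when unset.
        let network = store
            .secret_get("TFGRID_NETWORK")
            .unwrap_or_else(|| "main".to_string());
        Ok(Config { network, mnemonic })
    }
}

/// Builds a demo store from key/value pairs.
fn demo_store(pairs: &[(&str, &str)]) -> InMemoryStore {
    InMemoryStore(
        pairs
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect(),
    )
}

fn main() {
    let store = demo_store(&[
        ("TFGRID_MNEMONIC", "example words only"),
        ("TFGRID_NETWORK", "qa"),
    ]);
    match Config::from_store(&store) {
        Ok(cfg) => println!("network = {}", cfg.network),
        Err(e) => eprintln!("config error: {e}"),
    }
}
```

Putting the lookup behind a trait keeps the reader testable without a running hero_proc instance; the production implementation would wrap whatever client `hero_proc_client()` returns.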
Closed by hero_compute
32a2e2e: the `Config::from_hero_proc()` async method reads `TFGRID_MNEMONIC` / `TFGRID_NETWORK` / slice settings from the hero_proc secret store via `hero_proc_sdk::secret_get`. It replaces the off-pattern `util::load_env()` + `Config::from_env()` chain that read `~/hero/var/.env` into process env. This follows the canonical `hero_proc_secrets_and_meta` skill, as already explained in comment 36979. Sibling binaries (`my_compute_mos_server`, `my_compute_explorer_server`, `my_compute_explorer_admin`, `my_compute_zos_admin`) still call `hero_compute_sdk::load_env()` and need the same treatment; this is tracked as a new follow-up below.