hero_office: shared OnlyOffice editing engine on the admin instance #31

Open
opened 2026-06-18 02:53:48 +00:00 by mik-tf · 4 comments
Owner

hero_office needs an OnlyOffice Document Server to edit documents. We run one OnlyOffice engine on the admin instance, shared by every member, the same way the embedding and voice engines work, instead of a container on each member (members are small and the image is heavy). Documents stay in each member's own file storage; the engine only renders and converts.

This is now proven on the admin instance itself: the engine runs (host networking, JWT enabled), hero_office is pointed at it, and a real document uploads, opens an editor session, and is fetched and converted by the engine end to end. The container runtime needed two grid-VM tweaks: using the native containerd snapshotter on the virtiofs light VMs, because the default one cannot create whiteout device nodes there, and starting containerd by hand, because grid VMs have no systemd. Both are written up in the admin instance deployment runbook (Step 10.6).

Two things remain. First, serve the one engine across separate member machines: members reach it over the overlay, and because editing is callback-driven the engine has to reach back into each member to fetch and save, which must go through that member's authenticated front door and never around the login gate. Second, have the deployer set the engine address and the shared secret on each member automatically at install, the same way it already seeds the embedding and voice engine settings.

Tracking under lhumina_code/home#304

Signed-by: mik-tf <mik-tf@noreply.invalid>

Author
Owner

Looked into the cross-member save callback (the engine reaching back into each member). It is a real code change, not a config tweak, and the path is now clear:

  1. hero_office: the editor's file-read and save-callback endpoints currently do not check the OnlyOffice token they already receive. Have them validate that signed token (the same shared secret the engine uses), so the endpoints authenticate themselves regardless of where the request comes from.

  2. hero_proxy: today the login gate is per host, with no per-path exception except one existing, env-gated carve-out for public whiteboard shares. Add the same kind of narrow, env-gated carve-out for exactly the two office endpoints (file read and save callback), matched by path so it works even though the engine addresses the member by its overlay address rather than its public name. Everything else stays gated. The office endpoints are then reachable but authenticated by the token check from step 1, so the login gate is never bypassed.

  3. deployer: at install, point each member's office at the shared engine and the engine back at that member (engine address, the member's own callback address, and the shared token), the same way it already seeds the embedding and voice engine settings.
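
As a rough illustration of step 2, a path-matched, env-gated carve-out could look like the sketch below. The path prefixes and the environment variable name are hypothetical placeholders, since the real hero_proxy values are not given in this thread, and the sketch assumes the path has already been percent-decoded by the proxy.

```python
import os
import posixpath

# Hypothetical values for illustration only; not the real hero_proxy settings.
OFFICE_CARVE_OUT_PATHS = ("/office/file", "/office/callback")
CARVE_OUT_ENV = "OFFICE_CARVE_OUT"


def bypasses_login_gate(path, env=None):
    """Return True only for the two office endpoints, and only when opted in.

    The decision looks at the path alone, never at the Host header, so it
    behaves the same whether the member is addressed by its public name or
    by its overlay address.
    """
    env = os.environ if env is None else env
    if env.get(CARVE_OUT_ENV) != "1":
        return False  # carve-out disabled: everything stays gated
    # Drop the query string and collapse "." / ".." segments before matching,
    # so "/office/file/../../admin" cannot ride on the carve-out.
    clean = posixpath.normpath(path.split("?", 1)[0])
    return any(
        clean == prefix or clean.startswith(prefix + "/")
        for prefix in OFFICE_CARVE_OUT_PATHS
    )
```

A request that passes this check is still unauthenticated at the proxy; it only becomes trusted once hero_office validates the token as described in step 1.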

The engine, the same-machine editing loop, and the runbook setup are already proven (see the issue body). This callback work is the remaining piece and is a focused next step.

Signed-by: mik-tf <mik-tf@noreply.invalid>

Author
Owner

The cross-member save callback is implemented and proven live on one member.

The member side now authenticates the two endpoints the shared editor calls back into: the document fetch validates the OnlyOffice token in the Authorization header, and the save callback validates the token embedded in its JSON body before trusting any of its fields. The proxy on each member carries a narrow, opt-in carve-out so only those two paths reach the editor without a login (every other path stays behind the login gate), and the editor re-validates the token, so the login gate is never actually bypassed. The provisioner seeds the editor wiring per member both on install and on "Update all instances".
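
The two checks described above can be sketched with the standard library alone. This is a minimal illustration, not the hero_office implementation: it assumes the token is an HS256 JWT, that the fetch carries it as `Authorization: Bearer <token>`, and that the callback body carries it in a field named `token`. The function names are made up for the example, and expiry checks are left out.

```python
import base64
import hashlib
import hmac
import json


def _b64url_decode(part):
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def _b64url_encode(raw):
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(payload, secret):
    """Build an HS256 JWT (used here only to produce test tokens)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64url_encode(json.dumps(payload).encode())
    sig = hmac.new(secret.encode(), f"{header}.{body}".encode(), hashlib.sha256)
    return f"{header}.{body}.{_b64url_encode(sig.digest())}"


def verify_hs256(token, secret):
    """Return the signed payload if the signature checks out, else None."""
    try:
        header_b64, body_b64, sig_b64 = token.split(".")
        if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
            return None  # refuse "none" and any other algorithm
        expected = hmac.new(
            secret.encode(), f"{header_b64}.{body_b64}".encode(), hashlib.sha256
        ).digest()
        if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
            return None
        payload = json.loads(_b64url_decode(body_b64))
        return payload if isinstance(payload, dict) else None
    except (ValueError, AttributeError, TypeError):
        return None  # malformed token, bad base64, bad JSON, wrong type


def authenticate_fetch(authorization, secret):
    """Document fetch: token comes from the Authorization header."""
    scheme, _, token = (authorization or "").partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None
    return verify_hs256(token.strip(), secret)


def authenticate_callback(raw_body, secret):
    """Save callback: token comes from the JSON body.

    Only the signed payload is returned; the unsigned outer fields are
    never handed back to the caller.
    """
    try:
        outer = json.loads(raw_body)
    except ValueError:
        return None
    if not isinstance(outer, dict) or not isinstance(outer.get("token"), str):
        return None
    return verify_hs256(outer["token"], secret)
```

Returning the signed payload instead of a boolean makes it harder for a caller to act on the unsigned outer fields of the callback by accident.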

Verified over the overlay against a live member: a request with no token and a request with a wrong-secret token are both refused, a request with a valid token fetches the document and returns its bytes, and a save callback with a valid token is accepted while one without is rejected.

One residual to track for after this phase: the editor secret is shared across the whole organization, so a valid token from one member is replayable against another member's endpoints. The login gate still blocks anyone without the secret, and documents never leave their member, so this stays inside a single trust boundary. The follow-up is a per-member secret (optionally scoped to a single document), noted here so it is not lost.
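
To make the residual concrete, the sketch below shows why a token signed with one organization-wide secret is accepted by every member, and how a per-member secret would stop that. It uses a bare HMAC tag as a stand-in for the signed token, and the derivation in `member_secret` is only one hypothetical way to obtain per-member secrets, not a decided design.

```python
import hashlib
import hmac


def tag(secret, message):
    """HMAC-SHA256 tag, standing in for the token signature."""
    return hmac.new(secret, message, hashlib.sha256).digest()


def accepts(secret, message, presented_tag):
    """What a member endpoint does: recompute the tag and compare."""
    return hmac.compare_digest(tag(secret, message), presented_tag)


def member_secret(org_secret, member_id):
    # Hypothetical derivation: one secret per member from the org secret.
    return tag(org_secret, b"office-member:" + member_id.encode())


ORG_SECRET = b"org-wide-secret"      # illustrative value
MESSAGE = b'{"key":"doc1"}'          # illustrative signed payload

# Today: both members hold ORG_SECRET, so a token minted for member A
# also verifies on member B.
token_shared = tag(ORG_SECRET, MESSAGE)
replay_with_shared_secret = accepts(ORG_SECRET, MESSAGE, token_shared)

# Follow-up: each member verifies with its own secret, so the same
# replay against member B fails.
secret_a = member_secret(ORG_SECRET, "member-a")
secret_b = member_secret(ORG_SECRET, "member-b")
token_a = tag(secret_a, MESSAGE)
replay_with_member_secret = accepts(secret_b, MESSAGE, token_a)
```

Scoping to a single document would follow the same pattern, for example by requiring the signed payload to name the document it was issued for.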

Author
Owner

Live status from end-to-end testing on a member instance.

Merged and live (integration): hero_foundry creates a member's per-context document store on first access (previously a 404 for every real member), and the deployer wires the connector to the member's https FQDN instead of mycelium [ipv6]:9997 (OnlyOffice rejects bracketed IPv6 URLs with ERR_INVALID_URL). With these, the engine successfully downloads and converts a real document from the member FQDN, the editor config and token validate, and the carve-out stays JWT-gated with the login floor intact for everything else.

Remaining blocker before a member can open and edit in the desktop: the browser-to-engine realtime channel does not complete through the member login floor, so the engine never gets the open command and the editor reports "Download failed". Tracked at #32.
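
A deployer-side guard against the bracketed IPv6 case could look like the sketch below. The function names are hypothetical and the check is only an illustration of the idea, not the deployer's actual code.

```python
import ipaddress
from urllib.parse import urlsplit


def is_ipv6_literal_url(url):
    """True if the URL's host is an IPv6 literal such as https://[400::1]:9997/."""
    host = urlsplit(url).hostname  # brackets are stripped by urlsplit
    if host is None:
        return False
    try:
        return ipaddress.ip_address(host).version == 6
    except ValueError:
        return False  # a DNS name, not an IP literal


def connector_base_url(member_fqdn):
    """Build the https base URL the engine will call back on.

    Refuses IPv6 literals, since the thread reports that OnlyOffice
    rejects bracketed IPv6 URLs with ERR_INVALID_URL.
    """
    url = f"https://{member_fqdn}"
    if is_ipv6_literal_url(url):
        raise ValueError(f"connector URL must use an FQDN, got {url!r}")
    return url
```

Failing at provisioning time surfaces the problem before the engine ever tries to fetch a document.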

Author
Owner

The shared editing engine is now confirmed working live from a member instance end to end (open and edit), with the document fetch and save callback authenticated through the member's login-gate carve-out. The last blocker (the editor showing "Download failed") is fixed: see #32.
