[nu-demo] Roadmap: seed Office archipelago with per-library generated PDFs/docs (match context to library) #21

Closed

opened 2026-04-28 12:20:28 +00:00 by mik-tf · 0 comments

Vision

When a user opens Office → PDF / Documents / Spreadsheets / Presentations in a given context, they should see realistic content that matches the context's library. Generate office files from the already-cloned markdown docs:

| Context | Library | Office content |
|---|---|---|
| `root` / `hero` | `docs_hero` | PDFs of quickstart / architecture / services / contexts / configuration / overview / local_first. Cheatsheets as slides. API reference as spreadsheets. |
| `geomind` | `docs_geomind` | (once Forge auth sorted) Geomind research PDFs, roadmap slides |
| `mycelium` | `docs_mycelium` | Mycelium peer docs PDFs, network architecture diagrams |
| `ourworld` | `docs_owh` | OurWorld operations PDFs |
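
The context-to-library mapping above can also be kept as data so that seeding scripts look it up instead of hard-coding paths. A minimal sketch: the names `CONTEXT_LIBRARY` and `library_for` are illustrative and not existing Hero code, and it assumes that `root` and `hero` both draw from `docs_hero`, as the first table row suggests.

```python
# Illustrative only: these names are assumptions, not part of any Hero crate.
CONTEXT_LIBRARY = {
    "root": "docs_hero",
    "hero": "docs_hero",
    "geomind": "docs_geomind",
    "mycelium": "docs_mycelium",
    "ourworld": "docs_owh",
}


def library_for(context: str) -> str:
    """Return the library that seeds the given context, or raise KeyError."""
    try:
        return CONTEXT_LIBRARY[context]
    except KeyError:
        raise KeyError(f"no library mapped to context {context!r}") from None


if __name__ == "__main__":
    for ctx in sorted(CONTEXT_LIBRARY):
        print(f"{ctx} -> {library_for(ctx)}")
```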

Why this is the right demo

Today Office islands show empty states or 404. A fresh visitor opening Office > PDF has nothing to click. With this seeded content, each context feels like a real workspace with domain-specific knowledge already loaded.

Combined with AI grounding (home#130 pending embedder), you get: user opens geomind → Office > PDF → sees geomind-specific docs → AI Assistant can answer geomind-specific questions grounded in those same docs. Complete loop.

Pipeline

Library markdown (already cloned by hero_books to `/home/driver/code/<lib>/`) → conversion → drop into `~/hero/var/hero_foundry/webdav/<context>/Documents/` and/or `Office/`.

Conversion candidates:

- **PDF**: `pandoc` (installed via apt, works cleanly on Ubuntu). One PDF per markdown file, preserving heading structure.
- **Slides** (.pptx or HTML reveal): `pandoc -t pptx` or `pandoc -t revealjs`. A markdown file with level-1 headings becomes a slide deck.
- **Spreadsheets**: any markdown tables → xlsx via a small Python script with `openpyxl`. Pandoc has no xlsx output format, so it cannot produce the spreadsheet directly.
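
For the spreadsheet candidate, the markdown tables have to be pulled out of the text before anything can be written to xlsx. The sketch below is one possible approach, not an existing Hero tool: `extract_tables` and `write_xlsx` are hypothetical names, the parser handles only simple pipe tables whose lines start with `|`, and `write_xlsx` assumes the third-party `openpyxl` package is installed.

```python
def extract_tables(markdown: str) -> list[list[list[str]]]:
    """Collect pipe tables from markdown as lists of rows of cell strings.

    Simplified on purpose: a table is a run of consecutive lines starting
    with '|'; the separator row (|---|---|) is dropped; escaped pipes and
    tables without leading pipes are not handled.
    """
    tables, current = [], []
    for line in markdown.splitlines() + [""]:  # sentinel flushes the last table
        stripped = line.strip()
        if stripped.startswith("|"):
            cells = [c.strip() for c in stripped.strip("|").split("|")]
            # Skip the header separator row, e.g. |---|:--:|
            if all(c and set(c) <= set("-:") for c in cells):
                continue
            current.append(cells)
        elif current:
            tables.append(current)
            current = []
    return tables


def write_xlsx(tables: list[list[list[str]]], path: str) -> None:
    """Write one worksheet per table. Requires openpyxl (pip install openpyxl)."""
    if not tables:
        raise ValueError("no tables to write")
    from openpyxl import Workbook  # imported lazily so parsing works without it

    wb = Workbook()
    wb.remove(wb.active)  # drop the default empty sheet
    for i, rows in enumerate(tables, start=1):
        ws = wb.create_sheet(title=f"Table{i}")
        for row in rows:
            ws.append(row)
    wb.save(path)
```

Keeping the parsing separate from the writing means the table detection can be checked on its own, and the xlsx step can later be swapped for another writer without touching the parser.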

All three pipelines are short scripts that iterate `/home/driver/code/<lib>/` and write to `~/hero/var/hero_foundry/webdav/<context>/Documents/` (or a dedicated `Office/` subdir if the archipelagos expect that).
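
A rough sketch of such a script, under stated assumptions: `plan_conversions` and `run` are hypothetical names, nested files are flattened into one output name (a choice made here, not something the issue specifies), and only the PDF and PPTX targets are planned. Pandoc infers the output format from the `-o` file extension, and `.pdf` targets additionally need a PDF engine such as a LaTeX distribution on the machine.

```python
import subprocess
from pathlib import Path

# Webdav root as given in the issue text.
WEBDAV_ROOT = Path.home() / "hero/var/hero_foundry/webdav"


def plan_conversions(library_dir: Path, context: str,
                     webdav_root: Path = WEBDAV_ROOT) -> list[list[str]]:
    """Return one pandoc command per markdown file and target format."""
    out_dir = webdav_root / context / "Documents"
    commands = []
    for md in sorted(library_dir.rglob("*.md")):
        # Flatten nested paths so outputs land in one directory without
        # colliding: guides/intro.md -> guides_intro
        stem = "_".join(md.relative_to(library_dir).with_suffix("").parts)
        for ext in ("pdf", "pptx"):
            commands.append(
                ["pandoc", str(md), "-o", str(out_dir / f"{stem}.{ext}")]
            )
    return commands


def run(commands: list[list[str]]) -> None:
    """Execute the planned commands, creating output directories as needed."""
    for cmd in commands:
        Path(cmd[-1]).parent.mkdir(parents=True, exist_ok=True)
        subprocess.run(cmd, check=True)  # raises if pandoc exits non-zero
```

Planning the commands separately from running them allows a dry run: printing the result of `plan_conversions(Path("/home/driver/code/docs_hero"), "hero")` shows what would be written before anything touches the webdav tree.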

Proper implementation

Add a new hero_books method (or a new `hero_office` method) `office.scaffold --library X --context Y` that:

1. Reads all `*.md` in the library's cloned repo.
2. Runs `pandoc` (or a pure-Rust equivalent) to generate PDF + PPTX + XLSX variants where appropriate.
3. Writes them to `webdav/<context>/Documents/`.
4. Optionally creates corresponding hero_osis_office entries so `list_documents` returns metadata.
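
Two of these steps leave room for interpretation: which variants are "appropriate" for a given file, and what a metadata entry contains. The sketch below shows one possible reading. Both the heuristics in `variants_for` and the field names in `document_entry` are assumptions made for illustration; the real hero_osis_office schema is not described in this issue.

```python
import re
from pathlib import Path

# ATX headings such as "# Title" or "## Section".
HEADING = re.compile(r"^#{1,6}[ \t]+\S", re.MULTILINE)
# Separator row of a pipe table with at least two columns, e.g. |---|:--:|
TABLE_SEPARATOR = re.compile(
    r"^[ \t]*\|?[ \t]*:?-{3,}:?[ \t]*(\|[ \t]*:?-{3,}:?[ \t]*)+\|?[ \t]*$",
    re.MULTILINE,
)


def variants_for(markdown: str) -> list[str]:
    """Pick output formats for one markdown file (heuristic, an assumption)."""
    variants = ["pdf"]                      # every file gets a PDF
    if len(HEADING.findall(markdown)) >= 2:
        variants.append("pptx")             # enough headings to split into slides
    if TABLE_SEPARATOR.search(markdown):
        variants.append("xlsx")             # contains at least one pipe table
    return variants


def document_entry(source: Path, output: Path, context: str) -> dict:
    """Build a metadata record; field names are placeholders, not a real schema."""
    return {
        "context": context,
        "title": source.stem.replace("_", " ").title(),
        "path": str(output),
        "format": output.suffix.lstrip("."),
        "source": str(source),
    }
```

The heading check does not skip fenced code blocks, so a file full of shell comments could be misread as slide material; a real implementation would want a proper markdown parser.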

Invoke it once per (library, context) pair on first deploy, or expose a button for it in the hero_office UI.
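
Running "once per (library, context) pair" implies some record of which pairs have already been seeded. One simple way, sketched here as an assumption rather than a design decision, is a marker file next to the generated documents. The name `scaffold_once`, the `.seeded_<library>` file name, and the `scaffold` callback are all hypothetical, and a marker stored inside `Documents/` may show up in webdav listings, so a real implementation might keep it elsewhere.

```python
from pathlib import Path
from typing import Callable


def scaffold_once(library: str, context: str, webdav_root: Path,
                  scaffold: Callable[[str, str], None]) -> bool:
    """Run scaffold(library, context) only if this pair was not seeded before.

    Returns True when the scaffold ran, False when it was skipped.
    """
    docs_dir = webdav_root / context / "Documents"
    marker = docs_dir / f".seeded_{library}"  # hypothetical marker file name
    if marker.exists():
        return False
    docs_dir.mkdir(parents=True, exist_ok=True)
    scaffold(library, context)
    marker.touch()  # written only after scaffold returned without raising
    return True
```

Because the marker is written last, a scaffold that fails halfway is retried on the next deploy instead of being recorded as done.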

Demo workaround

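(To apply once embedder + books + AI are fully wired.) A shell script on the VM that runs `find /home/driver/code/docs_hero -name '*.md' -exec sh -c 'pandoc "$1" -o "webdav/hero/Documents/$(basename "$1" .md).pdf"' _ {} \;`. It is crude, but it fills the office with real content for the demo. `basename` keeps each output name flat inside `Documents/`, so same-named files from different subdirectories overwrite each other.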

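Related

- hero_books already converts markdown → book.toml format (see `discover_and_convert_ebooks` in hero_books_server/src/web/server.rs). Same mental model.
- Foundry webdav already mounted ([home#139](https://forge.ourworld.tf/lhumina_code/home/issues/139) fix applied).
- AI grounding needs this to feel real ([home#130](https://forge.ourworld.tf/lhumina_code/home/issues/130)).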

Filed 2026-04-23 nu-shell demo. Signed-off-by: mik-tf

Originally filed as [home#144](https://forge.ourworld.tf/lhumina_code/home/issues/144) on 2026-04-23 by mik-tf, then moved to hero_demo as part of consolidating issue tracking.
