# Yumi — Master Design & Gap Document

**Status:** living document · **Authored:** 2026-06-22 · **Owner:** Ben Hippler
**Audience:** Shadmehr (Shad) + the build team. This is the single source of truth for what
Yumi *is*, what it *has today* (consolidated from the baobab/MMD stack), what is *broken or
fragmented*, and what must be *built* — including the Encryption/TEE module with a detailed
plan. Read top to bottom; everything is numbered so nothing is lost.

> **Hindsight:** Yumi knowledge lives in the MMD Hindsight bank under tag `yumi` and
> `context: yumi-*`. A dedicated `yumi` **bank** is planned (§6.1, §10.4), but creating it is an
> auth-router provisioning action that has not been carried out yet (see §10).

---

## Table of Contents

1. Purpose & scope
2. Product vision
3. Current state (what was consolidated)
4. Architecture in one read
5. Design system consolidation ("fix all designs")
6. Naming consolidation
7. Central User Administration (CUA)
8. Surface coverage gaps (iOS / Android / Windows / macOS / web / Office)
9. Integrations & connectors
10. Memory architecture (Hindsight) — topology, separation, admin
11. LLM gateway
12. Security & trust (excluding Encryption/TEE → §14)
13. UX
14. Encryption & TEE module (detailed implementation plan)
15. File storage & processing
16. Billing & subscription
17. Observability & operations
18. Real gaps register (excluding Encryption/TEE)
19. Roadmap (phases)
20. Open decisions
21. Appendix (provenance, glossary, links)

---

## 1. Purpose & scope

This document defines Yumi end to end: the **design**, the **naming**, the **central user
administration**, the **real gaps** (not nice-to-haves), the **memory/bank architecture**,
and a **detailed implementation plan for the Encryption/TEE module**. It is written *from
scratch* from the consolidated workspace at `~/yumi` so Shad and future agents share one
mental model. Encryption/TEE is called out separately (§14) because Ben has declared it a
separate module; it is *excluded* from the "real gaps" register (§18) by design.

**Non-goals here:** the actual MMD→Yumi rename (designed in §6, execution deferred), and
live production mutations (this document plans them; none are executed in this pass).

---

## 2. Product vision

Yumi is a **subscription AI colleague** that remembers the user, understands their context,
accesses connected information, drafts content, performs tasks, executes workflows, and
interacts with connected systems — continuously improving through accumulated memory.

Pillars:
- **Frontier models** through one router (§11).
- **Persistent memory** + personal & organizational knowledge banks (§10).
- **Workflow execution** (FlowMaster engine).
- **Integrations** across email, calendar, files, cloud storage, business apps (§9).
- **Every surface, one identity, one memory, one session** — desktop, web, mobile (§8).
- **Trust** — encryption, hardware-backed security, secure enclaves (§14), so sensitive
  client/business data is safe.
- **Multi-provider SSO** — Google, Apple, Microsoft (§7).

---

## 3. Current state (what was consolidated)

On 2026-06-21 the baobab/MMD Cowork stack was copied **verbatim** into `~/yumi` (creds
scrubbed; provenance in `PROVENANCE.md`). Eight subsystems:

| Path | Role |
|---|---|
| `apps/open-cowork-mmd` | Electron desktop (macOS/Windows), local SQLite cache |
| `apps/mmd-cowork-mobile` | Web app + mobile PWA; hosts shared `/api/conversations` |
| `services/mmd-cowork-core` | `@mmd/cowork-core` — shared engine (conversations, auth, tools, connectors) |
| `services/mmd-cowork-config` | `@mmd/cowork-config` — shared static config (MCP servers, skills, surfaces) |
| `services/mmd-cowork-m365` | M365 manifests + tenant packaging |
| `services/mmd-cowork-office` | Outlook/Teams add-in runtime flavor |
| `services/mmd-llm-gateway` | LiteLLM router → `llm.baobab-ts.com` |
| `services/hindsight` | Memory system (MIT, `vectorize-io/hindsight`) |

**Honest caveat:** the cross-app backbone exists but is *not fully rolled out across every
flavor* (per the source architecture doc, 2026-06-18). Several subsystems were copied with
uncommitted local changes. Treat each subdir as its own repository until §6 is executed.

---

## 4. Architecture in one read

```
            surfaces:  desktop (Electron) · web/PWA · Office add-in · (iOS/Android: MISSING)
                              │  shared engine @mmd/cowork-core  ·  shared config @mmd/cowork-config
                              │  shared API host: mmd-cowork-mobile /api/conversations
                              ▼
        ┌─────────────────────┼─────────────────────────┐
        │                     │                         │
   LLM router         Hindsight memory           MCP connectors
   (LiteLLM)          per-user + org banks       M365 Graph · FlowMaster · SAP · GLPI
        │                     │                  (Google/Apple/cloud: MISSING, §9)
        ▼                     ▼
   Z.ai GLM-5.2       auth-router + PATs + vault-portal  (identity: Entra only today, §7)
```

Shared pillars (from the source architecture doc):
1. **Hindsight = memory, not transcript storage.** Facts, decisions, learnings, guidance.
2. **Static config (`@mmd/cowork-config`)** is the single source for MCP servers, skills,
   surfaces, credential references.
3. **Shared conversations API** (`mmd-cowork-mobile /api/conversations`) — the seam Yumi
   exploits so desktop + web + mobile share sessions.

---

## 5. Design system consolidation ("fix all designs")

### 5.1 The fragmentation (concrete)
Five distinct styling systems coexist today:

| # | System | Where | Tokens | Status |
|---|---|---|---|---|
| 1 | `mmd-portal-letterbox-v1` | cowork mobile + office (`public/_design-system/`), mmd-portal-shell | paper `#FAFAF6`/`#F3F0E6`, ink `#0E0E0C`, baobab `#C8102E`, Fraunces/PP Neue Machina/JetBrains Mono | ✅ aligned, **strict manifest linter** enforced |
| 2 | `flowmaster-theme-letterbox-v1` | FM public/marketing surfaces | same letterbox tokens | ✅ the canonical theme #1 derives from |
| 3 | Tailwind + semantic `--color-*` + literal letterbox | `open-cowork-mmd` desktop | `--color-background/surface/accent/mcp…` + duplicated `paper/ink/baobab` literals | ❌ outlier — own scheme, not manifest-driven |
| 4 | `fm-shell` + `@flowmaster/shared` | FM portal app | Geist/Geist_Mono/Inter, `--sans: Inter` | different product surface; out of Yumi scope unless unified later |
| 5 | per-tenant brand templates | DHGS/MMD tenant templates | various | explicit brand exceptions |

### 5.2 Target: one Yumi design system
Adopt **`mmd-portal-letterbox-v1`** (to be renamed `yumi-letterbox-v1` at rebrand, §6) as the
**single Yumi DS**, because it already (a) carries the canonical letterbox tokens, (b) has a
working **strict enforcement linter** (`scripts/check-design-system.mjs`: no `styles.css`, no
inline `style=`/hex/rgb/hsl, every class must be in `design-system.manifest.json`), and (c) is
already shared by 3 of the surfaces.

Canonical tokens (the contract every surface must use):
```
--paper #FAFAF6  --paper-2 #F3F0E6  --paper-3 #ECE6D0
--ink #0E0E0C    --ink-soft #2B2A26  --ink-mute #6E6B62  --ink-faint #A6A39A
--hairline #D6CFB6   --baobab #C8102E  --baobab-bg #FBE9EC
--font-display "Fraunces"  --font-ui "PP Neue Machina"  --font-mono "JetBrains Mono"
radii ≤ 4px · no drop shadows · baobab is a SCARCE accent · dark theme via body[data-theme=dark]
```
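The enforcement rules named in this section (no inline `style=`, no raw hex/rgb/hsl values, every class listed in the manifest) can be illustrated with a minimal sketch. This is not the real `scripts/check-design-system.mjs`, which is JavaScript and is not reproduced here; the function name `lint_markup`, the regular expressions, and the in-memory class set are all assumptions for illustration, and the "no `styles.css`" file-level rule is left out.

```python
import re

# Hypothetical stand-in for the class list in design-system.manifest.json.
MANIFEST_CLASSES = {"lb-page", "lb-card", "lb-button"}

INLINE_STYLE = re.compile(r"\bstyle\s*=", re.IGNORECASE)
RAW_COLOR = re.compile(r"#[0-9a-fA-F]{3,8}\b|\b(?:rgb|hsl)a?\(", re.IGNORECASE)
CLASS_ATTR = re.compile(r'\bclass\s*=\s*"([^"]*)"')


def lint_markup(markup, manifest=MANIFEST_CLASSES):
    """Return the list of design-system violations found in one markup string."""
    problems = []
    if INLINE_STYLE.search(markup):
        problems.append("inline style attribute")
    if RAW_COLOR.search(markup):
        problems.append("raw hex/rgb/hsl color")
    for match in CLASS_ATTR.finditer(markup):
        for name in match.group(1).split():
            if name not in manifest:
                problems.append(f"class not in manifest: {name}")
    return problems


if __name__ == "__main__":
    print(lint_markup('<div class="lb-card rogue" style="color:#C8102E"></div>'))
```

A CI job would run a check like this over every surface's markup and fail the build on a non-empty result. The naive hex pattern also flags fragment links such as `href="#abc"`, so a production linter needs a real parser rather than regular expressions.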

### 5.3 Fix plan (numbered)
1. **Promote** `packages/design-system/` (created this pass) to the canonical token + manifest
   source; surfaces consume it, never redefine tokens.
2. **Migrate `open-cowork-mmd`** off its bespoke `--color-*` Tailwind scheme onto the shared
   `_design-system/` + manifest pattern (port its `--color-mcp` accent into the manifest as a
   documented component token). Keep Tailwind for layout utilities only; colors/fonts via tokens.
3. **Port the strict linter** (`check-design-system.mjs`) into every surface's CI so regressions
   to local styling fail the build.
4. **Font licensing:** PP Neue Machina / PP Mori were never found locally (caveat since
   2026-06-03) — license them or pick a documented fallback in the token contract.
5. **Accessibility baseline** for Yumi (Ben skipped WCAG for the MMD pilot, but Yumi is a
   product): define AA targets on the letterbox palette and add to the linter.
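For the AA baseline in step 5, token pairs can be checked mechanically. The sketch below applies the WCAG 2.1 relative-luminance and contrast-ratio definitions to two hex colors; the function names are illustrative, and the 4.5:1 (normal text) and 3:1 (large text) thresholds are the WCAG AA values.

```python
def _linear(channel):
    """Convert one 8-bit sRGB channel to linear light (WCAG 2.1 definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    """Relative luminance of a color given as '#RRGGBB'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(fg, bg):
    """WCAG contrast ratio, always >= 1 regardless of argument order."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(fg, bg, large_text=False):
    """True if the pair meets WCAG AA (4.5:1 normal text, 3:1 large text)."""
    return contrast_ratio(fg, bg) >= (3.0 if large_text else 4.5)


if __name__ == "__main__":
    # Token values taken from the canonical contract in section 5.2.
    for name, fg in {"--ink": "#0E0E0C", "--ink-faint": "#A6A39A"}.items():
        print(name, round(contrast_ratio(fg, "#FAFAF6"), 2))
```

Running `passes_aa` over every foreground/background pair in the token contract would show which pairs need adjusting before the AA target is added to the linter.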

---

## 6. Naming consolidation

Rebrand execution is deferred (Ben, 2026-06-21), but the **naming rules** are fixed now so all
new code is consistent and the eventual rename is mechanical.

### 6.1 Rules
- **Product name:** Yumi (user-visible). **Org/infra names** (baobab, flow-master.ai,
  gitea.mmd01) stay until the rename phase.
- **Package scope:** `@yumi/*` for shared packages (replaces `@mmd/*` at rename). Until then,
  keep `@mmd/cowork-core`, `@mmd/cowork-config`.
- **One concept = one name.** Today "cowork", "Manzi", "sidekick", "Yumi agent" overlap.
  Canonical: the assistant product = **Yumi**; the embedded panel = **Yumi panel**; the
  back-end agent loop = **Yumi agent** (in `@mmd/cowork-core` today). Retire "Manzi/sidekick"
  naming in new code.
- **Bank naming:** per-user bank slug = identity-local-part (e.g. `kelvin-amboso`); org banks
  = `<org-slug>`; the product memory bank = `yumi` (planned, §10.4).

### 6.2 Rename map (for the rebrand phase; do not execute yet)
| Current | Target | Scope |
|---|---|---|
| `mmd-cowork-*` repos/dirs | `yumi-*` | repos, packages, dirs |
| `@mmd/cowork-core` `@mmd/cowork-config` | `@yumi/core` `@yumi/config` | package names |
| `mmd-portal-letterbox-v1` | `yumi-letterbox-v1` | DS manifest `name` |
| `MMD Cowork` app title / `open-cowork-mmd` | `Yumi` / `yumi-desktop` | desktop build identity |
| Entra app `MMD-Open-Cowork` (e99cd268) | new Yumi Entra app | identity (§7) |
| `mmd_pat_<bank>_<hex>` | `yumi_pat_<bank>_<hex>` | PAT scheme (increment version) |

### 6.3 Safe consolidation done this pass
- `packages/design-system/` created as the canonical token source (§5).
- No MMD identifiers renamed (deferred).

---

## 7. Central User Administration (CUA)

Yumi needs one place to manage users, identities, groups, per-user resources, and lifecycle —
across Google, Apple, and Microsoft. Today these capabilities exist but are **Entra-only and
scattered** across vault-portal, auth-router, CIO-Agent, oauth2-proxy, and group-based RBAC.
CUA unifies them.

### 7.1 What exists today (ground truth)
- **Entra tenant** `28621512-cef2-429b-a1fe-6ad79963f197` (primary `Fabrimetal.net`).
- **MMD-CIO-Agent** service principal (appId `dcdf6e10…`) — Cloud App Admin, Privileged Auth
  Admin, Exchange Admin, Teams Admin, etc.; Graph `User/Group/Directory.ReadWrite.All`,
  `Mail.Send`. **Can create/configure app registrations** (proven). This is the provisioning
  engine.
- **oauth2-proxy** cluster SSO at `auth.mmd01.flow-master.ai` (Entra OAuth), injects
  `X-Auth-Request-Email` / `X-Forwarded-Access-Token`.
- **Group-based RBAC:** permissions derive from group membership (e.g. `MMD-AI-Users`);
  `MMD_ENTRA_GROUP_MAP` maps groups to connector access. No per-entity branding; everyone is in
  multiple groups.
- **Per-user provisioning automation** (`mmd-ops/provisioning/opencowork/provision.py`):
  group-add → per-user Hindsight bank + PAT + auth-router-config + per-user gateway credential
  + per-user Open Cowork config + Intune deploy. Idempotent. Currently Entra-group-triggered,
  manual/cron (auto-trigger not wired).
- **vault-portal** (`hindsight.baobab-ts.com`): `/me` (user PATs), `/admin/users` + `/admin/users/new`
  (validate via CIO-Agent, create bank, gen PAT, email setup link via Graph), `/admin/banks`
  (all banks + stats + holders), `/install/<token>` (one-time PAT reveal). Admin = `ADMIN_EMAILS`
  + Group IT headers.
- **portal `/app/admin`** = Group-IT-gated entry to Hindsight admin.
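The group-based RBAC bullet above can be pictured as a simple lookup. The real shape of `MMD_ENTRA_GROUP_MAP` is not shown in this document, so the dictionary layout, the `MMD-Finance` and `MMD-IT` group names, and the connector identifiers below are assumptions; only `MMD-AI-Users` comes from the text.

```python
# Hypothetical shape: Entra group name -> connectors that membership unlocks.
GROUP_MAP = {
    "MMD-AI-Users": {"m365-graph", "hindsight"},
    "MMD-Finance": {"sap"},             # assumed group name
    "MMD-IT": {"glpi", "flowmaster"},   # assumed group name
}


def allowed_connectors(user_groups, group_map=GROUP_MAP):
    """Union of connector access across every group the user belongs to."""
    allowed = set()
    for group in user_groups:
        # Unknown groups grant nothing rather than raising.
        allowed |= group_map.get(group, set())
    return allowed


if __name__ == "__main__":
    print(sorted(allowed_connectors(["MMD-AI-Users", "MMD-Finance"])))
```

Because users sit in several groups at once, access is the union of all matching entries; removing a user from a group narrows access on the next evaluation without any per-user state.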

### 7.2 CUA target design
A **Yumi Admin Console** (web, letterbox DS) that is provider-agnostic and the single control
plane for:

1. **Identity providers** — Google, Apple, Microsoft. One Yumi user can link multiple IdPs to
   one Yumi identity (account linking). OIDC for all three (Apple = Sign-in-with-Apple → OIDC).
   A **Yumi Identity Service** normalizes IdP subjects → canonical `user_id`.
2. **Directory:** users, groups, orgs/tenants. The Yumi directory is the source of truth and is
   **synced from** each connected IdP: Entra for MS tenants and the Google Workspace directory
   for Google tenants. Apple is consumer-only, so it has no directory to sync. The CIO-Agent
   pattern is generalized into a per-provider provisioning adapter.
3. **Lifecycle** — joiner (auto-provision bank + gateway key + storage quota on group-add),
   mover (group change re-evaluates RBAC + bank policies), leaver (disable → revoke PATs,
   freeze bank, retain per policy).
4. **RBAC** — group → role → permission; roles include `user`, `org-admin`, `platform-admin`,
   `billing-admin`. Connector/tool access derived from groups (extend `MMD_ENTRA_GROUP_MAP` →
   `YUMI_GROUP_MAP`).
5. **Per-user resource provisioning** — Hindsight bank (§10), gateway virtual key + budget
   (§11), storage quota (§15). All driven from CUA, all idempotent.
6. **Admin RBAC & audit** — who can provision/revoke/read banks; every admin action audited
   (Ben: "keep everything" for the pilot — formal retention later).
7. **Self-service** — users see their banks/PATs/keys (`/me`), manage linked IdPs, view usage.
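Item 1 above (normalizing IdP subjects to one canonical `user_id`, with account linking) can be sketched as an in-memory service. The Yumi Identity Service is not built yet, so the class name, method names, and the use of random UUIDs as `user_id` are all assumptions; a real service would persist links and verify the OIDC tokens before trusting a subject.

```python
import uuid


class IdentityService:
    """In-memory sketch: maps (provider, subject) pairs to one canonical user_id."""

    def __init__(self):
        self._links = {}  # (provider, subject) -> user_id

    def sign_in(self, provider, subject):
        """Return the user_id for a verified IdP subject, creating one on first sight."""
        key = (provider, subject)
        if key not in self._links:
            self._links[key] = str(uuid.uuid4())
        return self._links[key]

    def link(self, user_id, provider, subject):
        """Attach another IdP identity to an existing Yumi user."""
        key = (provider, subject)
        existing = self._links.get(key)
        if existing is not None and existing != user_id:
            raise ValueError("identity is already linked to another user")
        self._links[key] = user_id

    def linked_providers(self, user_id):
        """Sorted provider names currently linked to the user."""
        return sorted(p for (p, _), uid in self._links.items() if uid == user_id)


if __name__ == "__main__":
    svc = IdentityService()
    uid = svc.sign_in("microsoft", "entra-oid-1")
    svc.link(uid, "google", "google-sub-9")
    print(svc.linked_providers(uid))
```

Keying on the pair (provider, subject) rather than on email avoids merging two people who happen to share an address across providers; linking is an explicit action by an already signed-in user.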

### 7.3 What's missing / real gaps (CUA)
- **G7-1** No Google or Apple IdP support — Entra only. (§7.2.1)
- **G7-2** No canonical Yumi identity / account linking across IdPs.
- **G7-3** Auto-provisioning trigger not wired (manual/cron today).
- **G7-4** No leaver/mover automation (banks/keys not auto-frozen on disable).
- **G7-5** Admin console is Hindsight-only (`vault-portal`); no unified admin for identity +
  banks + gateway + storage + billing.
- **G7-6** Offline/background agent actions need a **BFF token broker** (`getGraphToken(user,
  scopes)` via OBO/auth-code, encrypted refresh tokens) + an app-only daemon app for truly
  offline actions — designed but not built. Reuse Entra app `e99cd268`.
- **G7-7** Cross-surface single sign-on not unified: 4 surfaces need 1 auth (Office=NAA,
  web/mobile=auth-code+PKCE, portal iframe=token-pass/postMessage) — designed (2026-06-21), not
  implemented end-to-end.

---

## 8. Surface coverage gaps

| Surface | Status | Gap |
|---|---|---|
| Web app (`mmd-cowork-mobile`) | ✅ exists (PWA) | UX/consistency (§13) |
| macOS desktop (`open-cowork-mmd`) | ✅ exists (Electron) | DS migration (§5); builds unsigned |
| **Windows desktop** | ⚠️ Electron codebase builds for Windows (`MMD Cowork-3.3.1-win-x64.exe`, 132 MB, **unsigned**), Intune retarget committed but **not deployed** | signing/notarization, auto-update, Intune pipeline |
| **iOS native** | ❌ **missing** | must be an API client (shared conversations + memory), not a parallel store |
| **Android native** | ❌ **missing** | same — API client only |
| Office add-in (`mmd-cowork-office`) | ✅ exists | NAA auth (§7.3 G7-7) |
| PWA on mobile | ✅ exists | native shell (camera/push/background) → reason for native apps |

**G8-1 (real gap): no iOS app. G8-2: no Android app.** Yumi's "mobile-enabled" promise requires
native clients (push notifications, background sync, secure enclave for key storage, camera/voice
input). Plan: native iOS/Android as **thin clients** over the shared `/api/conversations` +
Hindsight + gateway — explicitly *not* a parallel chat/memory store (per the architecture doc's
flavor rule).

**G8-3:** Windows desktop build is unsigned and Intune delivery incomplete.

---

## 9. Integrations & connectors

### 9.1 Today
Connectors (MCP) configured in `@mmd/cowork-config`: **M365 Graph** (Mail, Calendar, Teams,
OneDrive/Files, Directory), **FlowMaster** (workflow engine, EA2), **SAP** (B1 / Angola MIS),
**GLPI** (helpdesk). Hindsight (memory). Model via `mmd-llm-gateway`.

### 9.2 Target + gaps
- **G9-1** **No Google Workspace connector** (Gmail, Calendar, Drive, Directory) — required for
  Google SSO users to get the same value as M365 users.
- **G9-2** **No Apple integration** (iCloud Mail/Calendar/Files are limited; Apple value is
  mostly identity + native device APIs).
- **G9-3** **No generic cloud-storage connector** beyond OneDrive (Google Drive, Dropbox, Box,
  S3) — needed for the "files/cloud storage" pillar.
- **G9-4** **No business-app connector framework** beyond the hand-wired set. Yumi should have
  a pluggable connector SDK (manifest + OAuth + scopes + tool allowlist) so that new
  integrations are declarative, as an extension of the existing `@mmd/cowork-config` MCP server
  definitions.
- **G9-5** **Connector auth per user:** today the desktop build embeds a static Hindsight token;
  the per-user delegated-token model (§7.3 G7-6) must extend to every connector, not just the
  model gateway.
- **G9-6** **No email/calendar processing** beyond passthrough MCP — e.g. digesting mail into
  memory, drafting (currently agent-driven, not a first-class pipeline).
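To make G9-4 concrete, a declarative connector manifest (manifest + OAuth + scopes + tool allowlist) might look like the sketch below. No such SDK exists yet, so the class, its field names, and the tool names are hypothetical; the Google authorization endpoint and the `drive.readonly` scope are real values used only as sample data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectorManifest:
    """Declarative description of one connector (all field names are hypothetical)."""
    name: str
    authorize_url: str         # OAuth 2.0 authorization endpoint
    scopes: tuple              # OAuth scopes requested on the user's behalf
    tool_allowlist: frozenset  # the only tools the agent may call

    def validate(self):
        """Return a list of problems; an empty list means the manifest is usable."""
        errors = []
        if not self.name:
            errors.append("name is required")
        if not self.authorize_url.startswith("https://"):
            errors.append("authorize_url must be https")
        if not self.scopes:
            errors.append("at least one scope is required")
        if not self.tool_allowlist:
            errors.append("tool_allowlist must not be empty")
        return errors

    def permits(self, tool):
        """True only for tools explicitly listed in the manifest."""
        return tool in self.tool_allowlist


GOOGLE_DRIVE = ConnectorManifest(
    name="google-drive",
    authorize_url="https://accounts.google.com/o/oauth2/v2/auth",
    scopes=("https://www.googleapis.com/auth/drive.readonly",),
    tool_allowlist=frozenset({"drive.search", "drive.read_file"}),  # hypothetical tools
)
```

With this shape, adding an integration means adding one manifest entry, and the agent runtime can refuse any tool call for which `permits` returns `False`.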

---

## 10. Memory architecture (Hindsight)

### 10.1 Topology today (ground truth, 2026-06-21)
Two **separate Hindsight deployments**, each `ghcr.io/vectorize-io/hindsight:latest` with
**embedded Postgres 18.1** (data at `/home/hindsight/.pg0/instances/hindsight/data`, DB
`hindsight`, image-managed DB password — only `GPG_KEY` in pod env):
1. **FlowMaster bank `vault`** at `hindsight.flow-master.ai/mcp/vault/` (build01): the large
   ~6,600-unit bank. **`get_bank` timed out after 120 s on 2026-06-21, so treat it as
   slow/unhealthy.**
2. **MMD instance** on `mmd01` ns `hindsight-mmd` — healthy. Serves per-person banks
   (`adilson-bartolomeu`, `kelvin-amboso`, `ben`, `ben-hippler`, `it`, `mmd`, …) + the shared
   `mmd` bank (15,967 units / 938 docs). Backed by auth-router + vault-portal + nightly
   maintenance.

Ben asked (2026-06-21) to **merge `mmd` + `flowmaster` into one bank** — open, not done.

### 10.2 Auth model (the part that works well)
`auth-router` (~90-line Starlette) intercepts `/mcp/<bank>/*`, validates `Bearer <PAT>` against
`pats.json` (Secret `auth-router-config`), **enforces URL-bank == PAT-bank (403 on mismatch)**,
then rewrites to the server bearer + `X-Bank-Id` + `X-Forwarded-User`. PAT format
`mmd_pat_<bank>_<hex>`. **This is the bank-separation primitive Yumi will reuse.**
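The decision the router makes can be sketched as one pure function. This is not the actual ~90-line Starlette service: the layout of `pats.json`, the `authorize` name, the sample PAT, and the 401/404 responses are assumptions. Only the 403 on a bank mismatch and the three rewritten headers come from the description above.

```python
import re

# Hypothetical pats.json content: PAT -> bank and user it was issued for.
PATS = {
    "mmd_pat_kelvin-amboso_0123abcd": {"bank": "kelvin-amboso", "user": "kelvin"},
}
SERVER_BEARER = "server-side-secret"  # placeholder, never a real credential

MCP_PATH = re.compile(r"^/mcp/(?P<bank>[^/]+)/")


def authorize(path, authorization, pats=PATS):
    """Return (status, upstream_headers) for one incoming request."""
    match = MCP_PATH.match(path)
    if match is None:
        return 404, {}
    if not authorization or not authorization.startswith("Bearer "):
        return 401, {}
    record = pats.get(authorization[len("Bearer "):])
    if record is None:
        return 401, {}
    # Bank-separation primitive: the bank in the URL must equal the PAT's bank.
    if record["bank"] != match.group("bank"):
        return 403, {}
    return 200, {
        "Authorization": f"Bearer {SERVER_BEARER}",
        "X-Bank-Id": record["bank"],
        "X-Forwarded-User": record["user"],
    }


if __name__ == "__main__":
    token = "Bearer mmd_pat_kelvin-amboso_0123abcd"
    print(authorize("/mcp/kelvin-amboso/tools", token)[0])
    print(authorize("/mcp/mmd/tools", token)[0])
```

The client's PAT never reaches the Hindsight server; it is replaced by the server bearer, so a leaked PAT exposes one bank at most.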

### 10.3 Bank separation model for Yumi
- **Personal bank** per user (slug = identity local-part) — private memory.
- **Org/team bank** per group (`it`, `finance`, …) — shared, group-gated.
- **Product/system bank `yumi`** — Yumi's own canonical knowledge (architecture, rules,
  methodology) — *the dedicated section Ben asked for*.
- **Isolation enforced at the router:** a PAT is bank-scoped; cross-bank reads require a
  separate, audited privileged token.

### 10.4 The `yumi` bank / "dedicated Hindsight section"
Ben asked for a dedicated Hindsight section called **yumi**. Reality:
- My MCP client has **no `create_bank` tool** exposed (only `get_bank/update_bank/delete_bank` +
  directives/mental-models). The backend *does* expose `create_bank` via
  `HINDSIGHT_API_MCP_ENABLED_TOOLS`, but not to this client.
- So a real `yumi` bank is an **ops provisioning action** (vault-portal `/admin/users/new`
  mechanism / `PUT /v1/default/banks/yumi` + PAT in `auth-router-config`) — a **live multi-user
  mutation that needs Ben's go**.
- **Done now (safe equivalent):** all Yumi knowledge is retained under **tag `yumi`** and
  **`context: yumi-*`** in the MMD bank, so it is scoped and recallable as a section.
- **Planned (needs go):** provision a real `yumi` bank on the mmd01 instance (healthy), wire a
  PAT, point Yumi surfaces + this doc's memory at it. Track in §19 P1.

### 10.5 Hindsight admin — what exists, what Yumi needs
Exists: vault-portal (`/me`, `/admin/users`, `/admin/banks`, `/install`), CIO-Agent
provisioning, nightly maintenance, `/app/admin` entry. Missing for Yumi:
- **G10-1** No **bank policies/quotas** (size, retention, who-can-write).
- **G10-2** No **billing-aware provisioning** (bank create gated by plan/seat).
- **G10-3** No admin UI for **non-Entra** users (Google/Apple) — vault-portal identity is
  Entra/oauth2-proxy only.
- **G10-4** The two deployments are **drifting/split** (vault unhealthy); the merge decision is
  unresolved.
- **G10-5** **Hindsight customization layer for Yumi** not extracted — Yumi should ship a thin
  custom layer (bank policies, Yumi directives, retention) on top of upstream MIT Hindsight,
  forked cleanly (not vendored ad hoc). Currently `services/hindsight/src` is the raw upstream
  tree.
- **G10-6** **Privacy regression fix not live:** the `mmd-cowork-mobile` per-user Hindsight
  routing fix exists in source (2026-06-20) but is **not rolled out**. The live runtime still
  uses a global Hindsight token, which creates a cross-user memory leak risk. Must ship before
  Yumi is multi-user.

### 10.6 Telepathy (for completeness)
Telepathy (structured Hindsight-backed views between agents) is **our own work, part of
SideQuest** (not third-party, not license-blocked — corrected 2026-06-21). It enters Yumi with
SideQuest (§19 P4). It is the future cross-agent memory-sharing primitive.

---

## 11. LLM gateway

`services/mmd-llm-gateway` = **LiteLLM** proxy, config-driven (`config/config.yaml`), public at
`llm.baobab-ts.com`. Today routes `glm-5.2`, `claude-sonnet-4-6`, `claude-haiku-4-5`,
`claude-opus-4-7` → **Z.ai GLM-5.2** (Anthropic dialect). Provider keys via env, never inline.

Provides: virtual keys, **per-key $ budgets**, spend/token metering (Postgres),
routing/fallback/retry/caching, provider-key encryption at rest. Chosen 2026-06-21 over
Bifrost/Portkey/Helicone because budgets+keys are in the MIT core. Supersedes the old
`cowork-gateway` (`ai-gw`) and Portkey.
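The two behaviors Yumi relies on here, alias routing and per-key budgets, can be pictured with a toy model. LiteLLM implements both itself; the sketch below is not its API or configuration format, and the upstream identifier `zai-glm-5.2`, the class names, and the dollar amounts are made up for illustration.

```python
from decimal import Decimal

# Every public alias currently resolves to the same upstream model.
ALIAS_TO_UPSTREAM = {
    alias: "zai-glm-5.2"  # hypothetical upstream identifier
    for alias in ("glm-5.2", "claude-sonnet-4-6", "claude-haiku-4-5", "claude-opus-4-7")
}


class BudgetExceeded(Exception):
    """Raised when a charge would push a key past its budget."""


class VirtualKey:
    """One per-user key with a hard dollar budget."""

    def __init__(self, budget_usd):
        self.budget = Decimal(budget_usd)
        self.spend = Decimal("0")

    def charge(self, cost_usd):
        cost = Decimal(cost_usd)
        if self.spend + cost > self.budget:
            raise BudgetExceeded(f"budget {self.budget} would be exceeded")
        self.spend += cost


def route(alias):
    """Resolve a requested model alias; unknown aliases raise KeyError."""
    return ALIAS_TO_UPSTREAM[alias]


if __name__ == "__main__":
    key = VirtualKey("5.00")
    key.charge("1.25")
    print(route("claude-sonnet-4-6"), key.spend)
```

`Decimal` is used instead of floats so that repeated small charges do not drift from the metered total.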

Gaps:
- **G11-1** Real **billing** (plans/invoices) layered later on an OSS metering engine (OpenMeter,
  Apache-2.0) — not built.
- **G11-2** Per-user **virtual keys not auto-provisioned** from CUA (must be wired to §7).
- **G11-3** Provider expansion (real Anthropic/OpenAI, not just Z.ai aliases) pending keys.
- **G11-4** Confidential inference (keys + user data protected in TEE) — §14.

---

## 12. Security & trust (excluding Encryption/TEE → §14)

What's in place: oauth2-proxy SSO, Entra JWT validation + group-gate at the gateway, PAT-based
bank isolation (URL==PAT), SealedSecrets, cert-manager TLS, per-user gateway metering.

Real gaps (non-TEE):
- **G12-1** **Secrets sprawl:** `.env`/`.sesskey`/embedded-cred remotes found across the copied
  repos (scrubbed on copy, but the source trees still carry local secrets). Need a central secret
  store + rotation.
- **G12-2** **Cross-user memory isolation not enforced live** (G10-6) — highest-priority trust
  gap.
- **G12-3** **Audit logging** is partial (gateway per-turn log; vault-portal actions) — no
  unified, tamper-evident audit across identity/banks/gateway/storage.
- **G12-4** **Data residency / multi-region** not designed (single mmd01 cluster today).
- **G12-5** **Key management** for Hindsight-at-rest and file encryption is ad hoc, not a KMS.
- **G12-6** **Supply chain:** container images mostly pinned, but no signed-image / admission
  control policy (becomes critical with TEE, §14).
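One common way to make an audit log tamper-evident (G12-3) is a hash chain, where each entry commits to the hash of the previous one. The sketch below is an assumption about how that could be done, not an existing Yumi component; the field names and the all-zero genesis value are illustrative.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder for "no previous entry"
FIELDS = ("actor", "action", "target", "prev")


def _digest(body):
    """SHA-256 over a canonical JSON encoding of the entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_entry(log, actor, action, target):
    """Append one audit entry that commits to the previous entry's hash."""
    prev = log[-1]["hash"] if log else GENESIS
    body = {"actor": actor, "action": action, "target": target, "prev": prev}
    entry = {**body, "hash": _digest(body)}
    log.append(entry)
    return entry


def verify(log):
    """True if no entry was altered, removed from the middle, or reordered."""
    prev = GENESIS
    for entry in log:
        body = {k: entry[k] for k in FIELDS}
        if entry["prev"] != prev or entry["hash"] != _digest(body):
            return False
        prev = entry["hash"]
    return True


if __name__ == "__main__":
    log = []
    append_entry(log, "admin@example.com", "revoke_pat", "kelvin-amboso")
    append_entry(log, "admin@example.com", "freeze_bank", "kelvin-amboso")
    print(verify(log))
```

A chain only detects tampering if the latest hash is also stored somewhere the writer cannot rewrite, since truncating the tail of the log leaves a shorter but still valid chain.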

---

## 13. UX

The letterbox system is strong and consistent on web/mobile/office. Gaps:
- **G13-1** Desktop UX divergence (§5 — different token system) breaks cross-surface consistency.
- **G13-2** No **onboarding/first-run** flow for a brand-new Yumi user (the desktop has a
  Welcome/Sign-in; web/PWA do not uniformly).
- **G13-3** Mobile **native affordances** missing (push, background, voice input) → tied to G8-1/2.
- **G13-4** **Accessibility** not a first-class target (Ben skipped WCAG for the MMD pilot);
  Yumi-as-product needs AA on the letterbox palette + the linter (§5.3.5).
- **G13-5** **Voice** (product vision) — no surface exists.
- **G13-6** **Empty-state / connection-management UX** for connectors is inconsistent across
  surfaces.

---

## 14. Encryption & TEE module (detailed implementation plan)

> Separate module per Ben. Goal: "rent GPUs as a service with access to enclave TEE, and secure
> the environment" — so sensitive client/business data is protected in transit, at rest, **and
> in use**, and so Yumi can offer confidential AI compute as a billed capability.

### 14.1 Threat model & scope
- **Adversary:** a compromised host/hypervisor, a cloud operator, or a co-tenant on shared
  hardware must **not** read user data, model provider keys, or decrypted memory.
- **In use** is the new surface: RAG/transcript/file data is exposed during inference and
  indexing, which is exactly Yumi's sensitive payload.
- **In scope for the TEE:** (a) LLM inference, (b) the agent runtime when handling decrypted
  user data/files, (c) file processing/indexing, (d) Hindsight write paths that touch decrypted
  memory. **Out of scope (now):** the entire control plane (identity, billing, routing) — those
  stay on normal confidential-friendly nodes.

### 14.2 Technology selection (decision required, recommendation given)
- **CPU TEE:** AMD **SEV-SNP** or Intel **TDX** confidential VMs (Azure confidential computing,
  GCP Confidential VMs; AWS Nitro Enclaves for isolation only). Recommendation: **SEV-SNP/TDX
  confidential containers** for portability.
- **GPU TEE:** NVIDIA **Hopper H100/H200 Confidential Computing** (CC mode) or AMD MI300X. This
  is what "GPU-as-a-service with enclave TEE" means concretely. Recommendation: Hopper CC for
  inference, with attestation-gated key release.
- **Attestation:** a **Verdictd/attestation-service** that verifies TEE measurement (launch
  measurement, attestation report) before releasing keys.
- **KMS:** a KMS (HashiCorp Vault / cloud KMS) that releases data keys **only** to an attested
  TEE matching a tenant policy.
- **Packaging:** confidential containers (Kata + confidential hardware), signed images,
  encrypted image pulls, image-digest pinning at admission.

### 14.3 Phased implementation plan

**T1 — Threat model, scope freeze, tech decision (1–2 wks)**
- T1.1 Write the threat model + data-flow diagram (where plaintext user data, provider keys,
  decrypted memory exist).
- T1.2 Decide CPU TEE (SEV-SNP vs TDX) and GPU TEE (Hopper CC vs MI300X); pick cloud(s).
- T1.3 Define the **attestation policy** schema (what measurements/claims a TEE must present per
  tenant).
- T1.4 Exit gate: signed-off threat model + tech decision.

**T2 — Attestation service + KMS key-release (2–3 wks)**
- T2.1 Stand up the attestation service (verify SEV-SNP/TDX/GPU reports).
- T2.2 KMS integration: key release only on valid attestation matching policy; per-tenant
  Customer-Managed Keys.
- T2.3 Define the **key hierarchy**: master (KMS) → tenant data key → object/file key →
  session/inference key (all inside TEE).
- T2.4 Exit gate: a key is refused outside an attested TEE; granted inside one.
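The T2.4 exit gate (a key is refused outside an attested TEE and granted inside one) can be expressed as a small policy check. This is a sketch under strong assumptions: `signature_valid` stands in for verifying the hardware vendor's certificate chain, the dictionary stands in for the KMS, and the policy fields are one possible answer to the schema question in T1.3, not a decided design.

```python
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class AttestationReport:
    tee_type: str          # e.g. "sev-snp" or "tdx"
    measurement: str       # hex launch measurement reported by the TEE
    signature_valid: bool  # stand-in for real vendor-chain verification


@dataclass(frozen=True)
class TenantPolicy:
    allowed_tee_types: frozenset
    allowed_measurements: frozenset


class KeyReleaseDenied(Exception):
    """Raised whenever the attestation evidence does not satisfy the policy."""


def release_tenant_key(report, policy, kms_keys, tenant):
    """Return the tenant data key only if the report satisfies the tenant policy."""
    if not report.signature_valid:
        raise KeyReleaseDenied("attestation report failed verification")
    if report.tee_type not in policy.allowed_tee_types:
        raise KeyReleaseDenied("TEE type not allowed for this tenant")
    # Constant-time comparison against each allowlisted measurement.
    if not any(hmac.compare_digest(report.measurement, m)
               for m in policy.allowed_measurements):
        raise KeyReleaseDenied("measurement not in tenant policy")
    return kms_keys[tenant]  # hypothetical KMS lookup


if __name__ == "__main__":
    policy = TenantPolicy(frozenset({"sev-snp"}), frozenset({"ab" * 24}))
    good = AttestationReport("sev-snp", "ab" * 24, True)
    print(release_tenant_key(good, policy, {"acme": b"k" * 32}, "acme") == b"k" * 32)
```

Keeping the check fail-closed (every path other than a full match raises) is what lets T2.4 be tested as a hard gate rather than a best-effort filter.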

**T3 — Confidential containers for agent runtime + router (3–4 wks)**
- T3.1 Confidential-container base image; signed; encrypted pull.
- T3.2 Admission control: only signed, attested images on confidential nodes.
- T3.3 Move the **agent runtime** (the `@mmd/cowork-core` loop when decrypting files/processing)
  into a confidential container; provider keys injected via T2 key-release.
- T3.4 Network: mTLS into the enclave; no plaintext egress of user data (allow only model calls +
  encrypted writes).
- T3.5 Exit gate: agent runs in TEE; secrets/data unreadable from host.
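The admission rule in T3.2 combines two checks: the image must be pinned by digest, and that digest must carry a verified signature. A minimal sketch follows; the `signed_digests` set is a stand-in for whatever signature verification is chosen, and the function name and sample repository path are hypothetical.

```python
import re

# Accept only references of the form <repo>@sha256:<64 hex chars>.
DIGEST_REF = re.compile(r"^(?P<repo>[^@\s]+)@sha256:(?P<digest>[0-9a-f]{64})$")


def admit(image_ref, signed_digests):
    """Return (allowed, reason) for one image reference on a confidential node."""
    match = DIGEST_REF.match(image_ref)
    if match is None:
        return False, "image must be pinned by sha256 digest"
    if match.group("digest") not in signed_digests:
        return False, "digest has no verified signature"
    return True, "ok"


if __name__ == "__main__":
    trusted = {"a" * 64}  # stand-in for digests whose signatures were verified
    print(admit("registry.example/yumi-agent@sha256:" + "a" * 64, trusted))
    print(admit("ghcr.io/vectorize-io/hindsight:latest", trusted))
```

Under this rule a floating tag such as `:latest` is rejected outright, because a tag can be repointed after signing while a digest cannot.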

**T4 — Confidential GPU inference + per-tenant key release (3–4 wks)**
- T4.1 Provision Hopper-CC GPU nodes; enable CC mode.
- T4.2 Provider model key + per-tenant context released into the GPU TEE on attestation.
- T4.3 Route inference for sensitive tenants through the confidential GPU path (others via normal
  `mmd-llm-gateway`).
- T4.4 Exit gate: inference of tenant data on GPU TEE; host/GPU memory encrypted, attested.

**T5 — Confidential file processing + Hindsight-at-rest encryption (2–3 wks)**
- T5.1 Decrypt + process/index user files **only inside the TEE** (ties to §15).
- T5.2 Hindsight bank at-rest encryption with TEE-only decryption for sensitive banks.
- T5.3 Exit gate: files/memory decrypt only in attested enclave.

**T6 — GPU-as-a-service control plane (3 wks)**
- T6.1 Scheduling: allocate confidential GPU capacity; isolate tenants (MIG / CCS / time-slicing
  with attestation per slice).
- T6.2 Metering → billing: feed confidential-GPU usage into LiteLLM/OpenMeter metering (§11/§16)
  as a billable resource.
- T6.3 Quotas from CUA (§7) gate who can request confidential compute.
- T6.4 Exit gate: a tenant can request, run, and be billed for confidential GPU.
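T6.2 and T6.3 together amount to a quota gate in front of the scheduler and a usage roll-up behind it. The sketch below uses GPU-seconds as the billable unit, which is an assumption, as are the record layout `(tenant, start, end)` and the function names; feeding the totals into LiteLLM/OpenMeter is not shown.

```python
from collections import defaultdict


def can_request(tenant, requested_gpu_seconds, used, quotas):
    """Quota gate: allow only if the request fits in the tenant's remaining quota."""
    remaining = quotas.get(tenant, 0) - used.get(tenant, 0)
    return 0 < requested_gpu_seconds <= remaining


def billable_usage(records):
    """Sum GPU-seconds per tenant from (tenant, start_epoch_s, end_epoch_s) records."""
    totals = defaultdict(int)
    for tenant, start, end in records:
        if end < start:
            raise ValueError(f"record for {tenant} ends before it starts")
        totals[tenant] += end - start
    return dict(totals)


if __name__ == "__main__":
    usage = billable_usage([("acme", 0, 600), ("acme", 900, 1000), ("beta", 50, 80)])
    print(usage)
    print(can_request("acme", 400, usage, {"acme": 1000}))
```

Tenants with no quota entry are denied by default, which matches the intent that CUA explicitly grants access to confidential compute.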

**T7 — Verification, audit, compliance posture (2 wks)**
- T7.1 Log attestation evidence (what ran, where, measured) to the audit log (G12-3).
- T7.2 Continuous attestation; alert on measurement drift.
- T7.3 Document the confidentiality posture (what is protected in use / at rest / in transit) —
  note: do **not** fabricate certifications (SOC2/ISO/HIPAA) — state actual capabilities only.

### 14.4 Dependencies & risks
- Depends on §15 (file storage), §11 (gateway metering), and §7 (CUA quotas).
- **Risk:** GPU TEE availability/cost (Hopper CC supply); have a CPU-TEE fallback path.
- **Risk:** attestation key-release latency on every cold start — cache session keys inside the
  enclave, not outside.
- **Risk:** don't let "confidential" become a false sense of security — the app code inside the
  TEE must not leak data via logs/telemetry (side-channel hygiene).

---

## 15. File storage & processing

**Today:** no dedicated Yumi file store/processing. Files are handled via the **OneDrive** MCP
connector (M365) only; processing is ad hoc inside agent turns.

**Gap (real):**
- **G15-1** No **centralized, encrypted object store** for user/org files independent of M365
  (needed for Google/Apple users and for Yumi-native data).
- **G15-2** No **document-processing pipeline** (ingest → extract → chunk → embed/index) feeding
  memory — RAG-over-files is not a first-class capability.
- **G15-3** No **per-user/org storage quotas** tied to CUA/billing.
- **G15-4** No **confidential processing** path (resolved by §14 T5).

**Design:** a Yumi **Object Store** (S3-compatible, per-tenant buckets, CMK encryption) + a
**Processing service** that runs inside the TEE (§14 T5) for sensitive content. Files become
memory via the processing pipeline (with explicit user consent + bank tagging).
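The chunk stage of the ingest → extract → chunk → embed/index pipeline, plus the consent and bank-tagging rule from the design above, can be sketched as follows. Word-based chunking with overlap is one common choice, not a decided design; the sizes, function names, and the shape of a memory item are assumptions, and extraction and embedding are out of scope here.

```python
def chunk_text(text, size=200, overlap=40):
    """Split text into overlapping chunks of at most `size` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def to_memory_items(text, bank, consent_given, size=200, overlap=40):
    """Turn extracted file text into bank-tagged items, refusing without consent."""
    if not consent_given:
        raise PermissionError("user consent is required before files become memory")
    return [
        {"bank": bank, "tags": ["file"], "content": chunk}
        for chunk in chunk_text(text, size, overlap)
    ]


if __name__ == "__main__":
    sample = " ".join(str(n) for n in range(10))
    print(chunk_text(sample, size=4, overlap=1))
```

The overlap keeps a sentence that straddles a boundary retrievable from either neighbor, at the cost of indexing some words twice.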

---

## 16. Billing & subscription

Metering exists (LiteLLM spend/token per key; per-user gateway Prometheus metering). Real billing
does not.
- **G16-1** No plan/seat/invoice system — planned on **OpenMeter** (Apache-2.0) consuming
  gateway + storage + confidential-GPU meters.
- **G16-2** No subscription lifecycle (trial → paid → cancel) tied to CUA (§7) and per-user
  provisioning.
- **G16-3** No usage dashboards for users/orgs.

---

## 17. Observability & operations

In place: k3s + ArgoCD GitOps (`mmd-ops`), SealedSecrets, cert-manager, Prometheus/Grafana,
nightly Hindsight maintenance. Known ops facts: ArgoCD watches a separate bare git repo from
gitea (they diverge); Gitea Actions CI is dead (manual kaniko/docker builds on build01).

Gaps:
- **G17-1** CI is broken (no act-runner) — must restore for Yumi build/test/deploy.
- **G17-2** No SLOs / on-call runbooks for Yumi surfaces.
- **G17-3** Two Hindsight deployments drifting (§10) — ops risk.

---

## 18. Real gaps register (excluding Encryption/TEE)

Only real gaps (not nice-to-haves), each with severity and owning phase.

| ID | Gap | Severity | Phase |
|---|---|---|---|
| G7-1 | No Google/Apple IdP (Entra only) | Critical | P1 |
| G7-2 | No canonical Yumi identity / account linking | Critical | P1 |
| G7-3 | Auto-provisioning trigger not wired | High | P1 |
| G7-4 | No leaver/mover automation | High | P2 |
| G7-5 | No unified admin console (identity+banks+gateway+storage+billing) | High | P2 |
| G7-6 | No BFF token broker for delegated/offline connector access | High | P2 |
| G7-7 | Cross-surface SSO not unified (NAA/PKCE/token-pass) | High | P2 |
| G8-1 | No iOS native app | Critical | P2 |
| G8-2 | No Android native app | Critical | P2 |
| G8-3 | Windows build unsigned + Intune incomplete | High | P1 |
| G9-1 | No Google Workspace connector | High | P2 |
| G9-3 | No generic cloud-storage connector (Drive/Dropbox/Box/S3) | High | P2 |
| G9-4 | No pluggable connector SDK | Medium | P2 |
| G9-5 | Per-user delegated auth not on all connectors | High | P2 |
| G10-1 | No Hindsight bank policies/quotas | Medium | P2 |
| G10-3 | Hindsight admin not multi-IdP | High | P2 |
| G10-4 | Two Hindsight deployments drifting; merge unresolved | High | P1 |
| G10-5 | No clean Yumi Hindsight customization layer (raw upstream) | Medium | P2 |
| G10-6 | Per-user Hindsight routing fix **not live** (cross-user leak risk) | **Critical** | P1 |
| G11-2 | Gateway virtual keys not auto-provisioned from CUA | High | P2 |
| G12-1 | Secrets sprawl / no central secret store + rotation | High | P1 |
| G12-3 | No unified tamper-evident audit log | High | P2 |
| G12-5 | No KMS (ad hoc key management) | High | P2 |
| G13-1 | Desktop DS divergence | Medium | P1 |
| G13-4 | No accessibility (AA) target | Medium | P2 |
| G15-1 | No central encrypted object store | High | P2 |
| G15-2 | No file→memory processing pipeline | High | P2 |
| G16-1 | No billing/subscription system | High | P3 |
| G17-1 | CI broken (no act-runner) | High | P1 |

**Nice-to-haves deliberately excluded** (per Ben): WCAG-for-MMD-pilot, AMEX/ECB-FX (MMD-specific),
Moodle LMS path, multi-entity branding.
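
Because the register carries an owning phase per gap, it can be cross-checked mechanically against the roadmap in §19. The sketch below parses register rows in the table format used above and groups gap IDs by phase. The helper names (`parse_register`, `ids_by_phase`) are hypothetical, and the sample contains only four rows copied from the register.

```python
from collections import defaultdict

# Four rows copied verbatim from the register above (sample only).
REGISTER_SAMPLE = """\
| ID | Gap | Severity | Phase |
|---|---|---|---|
| G7-1 | No Google/Apple IdP (Entra only) | Critical | P1 |
| G7-4 | No leaver/mover automation | High | P2 |
| G10-6 | Per-user Hindsight routing fix **not live** (cross-user leak risk) | **Critical** | P1 |
| G16-1 | No billing/subscription system | High | P3 |
"""


def parse_register(table: str) -> list[dict[str, str]]:
    """Parse markdown table rows into dicts, skipping header and separator."""
    rows = []
    for line in table.splitlines():
        # Split on pipes, then drop surrounding whitespace and bold markers.
        cells = [c.strip().strip("*") for c in line.strip().strip("|").split("|")]
        if len(cells) != 4 or not cells[0].startswith("G"):
            continue
        gap_id, gap, severity, phase = cells
        rows.append({"id": gap_id, "gap": gap, "severity": severity, "phase": phase})
    return rows


def ids_by_phase(rows: list[dict[str, str]]) -> dict[str, list[str]]:
    """Group gap IDs by owning phase, preserving register order."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["phase"]].append(row["id"])
    return dict(grouped)
```

Running `ids_by_phase(parse_register(...))` over the full table gives a per-phase list that can be compared with the gap IDs named in each roadmap bullet.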

---

## 19. Roadmap (phases)

- **P0 — Consolidation ✅ (2026-06-21/22).** Stack copied into `~/yumi`; docs; canonical DS
  tokens package; `yumi` Hindsight tag/context established.
- **P1 — Trust + identity foundation.** Ship G10-6 (live per-user Hindsight isolation —
  critical). Provision the real `yumi` bank (§10.4, needs Ben's go). Add Google+Apple IdP +
  canonical identity (G7-1/2). Restore CI (G17-1). Central secrets (G12-1). Resolve the
  Hindsight merge (G10-4). Windows signing (G8-3). Desktop DS migration (G13-1).
- **P2 — Product completeness.** iOS + Android native (G8-1/2). Unified admin console (G7-5).
  BFF token broker (G7-6). Cross-surface SSO (G7-7). Google Workspace + cloud-storage
  connectors + connector SDK (G9-1/3/4/5). File store + processing (G15-1/2). Hindsight
  policies/admin/customization layer (G10-1/3/5). Unified audit + KMS (G12-3/5). Accessibility (G13-4).
- **P3 — Monetize + harden.** Billing/subscription (G16). Lifecycle automation (G7-4).
- **P4 — Confidential + agentic.** Encryption/TEE module (§14). Voice (G13-5). SideQuest +
  Telepathy integration.
- **Later — Rebrand.** Execute §6.2 rename.

---

## 20. Open decisions (need Ben / Shad)

1. **Hindsight topology:** merge `mmd`+`vault` into one bank (Ben asked 2026-06-21)? And provision
   a real `yumi` bank now (live mutation — needs go)?
2. **IdP scope:** single Yumi Entra tenant, or new tenant? Same redirect-domain set for
   Google/Apple?
3. **TEE tech:** SEV-SNP vs TDX; Hopper CC vs MI300X; which cloud(s) for confidential GPU?
4. **Native apps:** React Native (one codebase) vs separate native iOS/Android?
5. **Share with Shad:** push `~/yumi` to Gitea, or send this doc via email/Teams? If pushed,
   does Shad get push (collaborator) access or reviewer access?
6. **Connector SDK:** build vs adopt an existing MCP-based framework.

---

## 21. Appendix

- **Provenance:** `~/yumi/PROVENANCE.md`.
- **Architecture deep-dive:** `~/yumi/docs/architecture.md`, `services/mmd-cowork-core/docs/COWORK_ARCHITECTURE.md`.
- **Design tokens:** `~/yumi/packages/design-system/`.
- **SSO runbook:** `~/yumi/config/sso/README.md`.
- **Glossary:** Yumi (product) · Cowork (the assistant family, pre-rebrand) · Hindsight (memory)
  · auth-router (bank-scoped PAT gateway) · vault-portal (Hindsight admin UI) · CIO-Agent
  (Entra provisioning principal) · LiteLLM gateway (model router) · letterbox (the DS) ·
  Telepathy (our cross-agent memory views, part of SideQuest).
- **Key infra:** `mmd01` (k3s), `build01` (registry `65.21.71.186:30500`), Gitea
  `gitea.mmd01.flow-master.ai`, `hindsight.baobab-ts.com`, `llm.baobab-ts.com`,
  `portal.baobab-ts.com`.
