Architecture
Sister docs: PRD (intent), Implementation (deep-dive), Notes (decision log).
System view
flowchart TB
classDef client fill:#cce0e8,stroke:#1a1a1d,color:#1a1a1d,stroke-width:2px
classDef edge fill:#e0d5ed,stroke:#1a1a1d,color:#1a1a1d,stroke-width:2px
classDef server fill:#faedd6,stroke:#1a1a1d,color:#1a1a1d,stroke-width:2px
classDef store fill:#f4d6db,stroke:#1a1a1d,color:#1a1a1d,stroke-width:2px
subgraph Clients["👤 User devices (Mac + iPhone)"]
Desktop["Claude Desktop
(legacy bearer)"]
Web["Claude.ai web
(OAuth 2.0)"]
iOS["Claude iOS app
(inherits web)"]
Bridge["mcp-remote bridge
(npx)"]
Desktop --> Bridge
end
Bridge -->|localhost:8080| Server
Web --> CF
iOS --> CF
subgraph CF["☁️ Cloudflare (optional — remote-device access)"]
Tunnel["Named Tunnel
DNS CNAME · TLS · Bot mgmt"]
end
subgraph Host["🖥️ MacBook Pro M2 Max · launchd"]
CFD["cloudflared agent"]
Server["uvicorn + Starlette
FastMCP server (tako 0.6.0-s3)"]
Tools["MCP Tools
kb_search_ll · kb_search_mindx
kb_search_personal · kb_search_shared
kb_search_canon · kb_ingest · kb_stats · kb_health"]
Embed["bge-m3 (multilingual)
+ bge-reranker-v2-m3"]
OAuth["OAuth 2.0
/authorize /token /register"]
Blob["BlobStore abstraction
(s3:// + file:// dual-scheme)"]
Watcher["tako-mount-watcher
(fswatch → kb-ingest-file.py)"]
CFD --> Server
Server --> OAuth
Server --> Tools
Tools --> Embed
Tools --> Blob
end
Tunnel --> CFD
subgraph Storage["🗄️ Local stores"]
PG["Postgres 16 + pgvector
HNSW · 42K sources · 182K chunks · 3.2 GB"]
MinIO["MinIO S3
tako-kb bucket
primary blob store"]
FS["~/Documents/KB-s3/ mount
filesystem mirror (hourly)"]
end
Tools -.->|asyncpg| PG
Blob -.->|s3 SDK| MinIO
Blob -.->|fallback| FS
Watcher -.watches.-> FS
subgraph Backup["📦 Backup jobs (launchd)"]
PGBackup["tako-pg-backup
daily pg_dump"]
FSBackup["tako-fs-backup
hourly MinIO → FS mirror"]
end
PG -.dump.-> PGBackup
MinIO -.mirror.-> FSBackup
class Desktop,Web,iOS,Bridge client
class Tunnel edge
class CFD,Server,Tools,Embed,OAuth,Blob,Watcher server
class PG,MinIO,FS,PGBackup,FSBackup store
Workspaces
The corpus is partitioned into six workspaces. Five have a dedicated, scoped MCP search tool; _secrets has none. Routing happens at the client: Claude reasons about which workspace fits the query and calls the matching tool.
| Workspace | Purpose | MCP tool | Notes |
|---|---|---|---|
| ll | Work knowledge (PRDs, Jira, Confluence, meetings, email, Slack) | kb_search_ll | Multi-tenant: client_filter ∈ {PCF, NewLife, Ilham, BBL} |
| mindx | Consulting engagement | kb_search_mindx | |
| _personal | Side projects, finance, health, learning, memory | kb_search_personal | |
| _shared | Cross-cutting research notes, cheatsheets, prompt patterns | kb_search_shared | Subjective curation |
| _canon | External vendor authoritative docs (Anthropic, xAI, HuggingFace, arxiv) | kb_search_canon | Crawled by the AI-Canon-Crawler |
| _secrets | Encrypted credentials | (no MCP tool) | Vault flow only, never exposed via RAG |
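The workspace table can also be read as data. The sketch below is only an illustration of that mapping: in the real system Claude picks the tool by reasoning, not by running code. `build_tool_call` is a hypothetical helper, and the rule that `client_filter` applies only to `ll` is an assumption drawn from the Notes column.

```python
from typing import Optional

# Workspace -> scoped MCP tool, copied from the table above.
# _secrets maps to None because it has no MCP tool.
WORKSPACE_TOOLS: dict[str, Optional[str]] = {
    "ll": "kb_search_ll",
    "mindx": "kb_search_mindx",
    "_personal": "kb_search_personal",
    "_shared": "kb_search_shared",
    "_canon": "kb_search_canon",
    "_secrets": None,
}

# Tenants listed for kb_search_ll's client_filter.
LL_CLIENTS = {"PCF", "NewLife", "Ilham", "BBL"}


def build_tool_call(
    workspace: str,
    query: str,
    client_filter: Optional[str] = None,
    top_k: int = 5,
) -> dict:
    """Hypothetical helper: build the name and arguments of a scoped search call."""
    if workspace not in WORKSPACE_TOOLS:
        raise ValueError(f"unknown workspace: {workspace}")
    tool = WORKSPACE_TOOLS[workspace]
    if tool is None:
        # _secrets is never exposed via RAG.
        raise PermissionError(f"workspace {workspace} has no search tool")
    arguments: dict = {"query": query, "top_k": top_k}
    if client_filter is not None:
        # Assumption: only the multi-tenant ll workspace accepts a client filter.
        if workspace != "ll" or client_filter not in LL_CLIENTS:
            raise ValueError(f"invalid client_filter for {workspace}: {client_filter}")
        arguments["client_filter"] = client_filter
    return {"name": tool, "arguments": arguments}
```

For example, `build_tool_call("ll", "roadmap", client_filter="PCF")` yields a call to `kb_search_ll`, while any attempt to search `_secrets` is refused.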
Data flow — Ingest
[1] Trigger event (one of):
- AI session ends → user-shell hook archives transcript
- Chat exporter / wiki crawler runs → writes file
- Mail digest task runs → writes file
- Meeting transcriber runs → writes file
- Ticket crawler renders MD → writes file
- AI-Canon-Crawler renders MD → writes file (→ _canon workspace)
│
▼
[2] Markdown file written to <kb-mount>/<workspace>/<folder>/<slug>.md
(filesystem mount = canonical source-of-truth; URI scheme s3://tako-kb/...
if under MinIO mount, file://... if under legacy FS path)
│
▼
[3] tako-mount-watcher (launchd + fswatch) detects write:
python3 <hooks-dir>/kb-ingest-file.py <full-path>
│
▼
[4] Helper:
- reads file → text
- classify workspace + source_type from path
- auto-tag from folder hierarchy
- HTTP POST to localhost:8080/mcp (tools/call → kb_ingest)
│
▼
[5] MCP server kb_ingest tool:
- SHA-256(text) → content_hash
- SELECT WHERE source_uri = :u
- row exists + same hash → return {skipped: true} ← idempotency
- row exists + diff hash → DELETE old chunks, UPDATE source
- new → INSERT source RETURNING id
- chunk_text(text) → list[str]
- bge-m3 encode → list[float[1024]]
- INSERT INTO kb_chunks ... (executemany)
- persist blob to BlobStore (s3:// primary, file:// fallback)
- COMMIT
│
▼
[6] Return {source_id, chunks_inserted, skipped, hash, blob_uri}
Logged to <hooks-dir>/kb-ingest-file.log
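The idempotency branch in step [5] can be sketched as follows. This is a minimal illustration, not the server's code: an in-memory dict stands in for the Postgres tables, `Source` and `ingest` are hypothetical names, and `chunk_text` here is a placeholder fixed-size splitter because the real chunking strategy is not described in this document. Embedding and blob persistence are omitted.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Source:
    """Stand-in for one source row plus its chunks."""
    content_hash: str
    chunks: list[str] = field(default_factory=list)


def chunk_text(text: str, size: int = 200) -> list[str]:
    """Placeholder chunker: fixed-size character windows (assumption)."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def ingest(store: dict[str, Source], source_uri: str, text: str) -> dict:
    """Apply the skip / replace / insert decision keyed on source_uri."""
    content_hash = hashlib.sha256(text.encode("utf-8")).hexdigest()
    existing = store.get(source_uri)

    # Row exists with the same hash: nothing to do.
    if existing is not None and existing.content_hash == content_hash:
        return {"skipped": True, "chunks_inserted": 0, "hash": content_hash}

    # Row exists with a different hash, or no row yet: in both cases the
    # stored chunks are replaced by the chunks of the new text.
    chunks = chunk_text(text)
    store[source_uri] = Source(content_hash=content_hash, chunks=chunks)
    return {"skipped": False, "chunks_inserted": len(chunks), "hash": content_hash}
```

Because the decision depends only on `source_uri` and the content hash, the watcher can fire repeatedly for the same file without creating duplicate chunks.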
Data flow — Query
[1] User → Claude (any client):
"Anything new about <topic> in PCF this week?"
│
▼
[2] Claude routes by workspace → calls scoped MCP tool, e.g.
kb_search_ll(query="...", client_filter="PCF", top_k=5)
POST https://<host>/mcp (or localhost:8080 direct)
Authorization: Bearer <access_token>
Accept: text/event-stream
│
▼
[3] Cloudflare tunnel (if remote) → uvicorn → middleware auth check
- validates Bearer token via FileOAuthProvider.load_access_token()
│
▼
[4] FastMCP routes to kb_search_<workspace> handler:
- qvec = bge_m3.encode("query: <user-query>") # 1024-dim
- SQL: SELECT c.text, s.source_uri, s.source_type, s.title, s.tags,
c.chunk_idx,
1 - (c.embedding <=> :q) AS score -- pgvector cosine
FROM kb_chunks c JOIN kb_sources s ON s.id = c.source_id
WHERE s.workspace = :ws
[AND s.client IN (:client_filter, '_generic', '_multi')]
ORDER BY c.embedding <=> :q
LIMIT 20 -- candidate pool
(HNSW index on c.embedding)
- bge-reranker-v2-m3 cross-encode → re-score top 20
- return top_k after rerank
│
▼
[5] Format results:
[
{score: 0.91, text: "...", source_uri: "s3://tako-kb/ll/pcf/...",
source_type: "meeting", title: "...", tags: [...], chunk_idx: 0},
...
]
│
▼
[6] Streamable HTTP returns SSE event:
event: message
data: {"jsonrpc": "2.0", "id": N, "result": {...}}
│
▼
[7] Claude reads results, synthesizes natural-language answer with citations.
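The two-stage retrieval in step [4] (nearest-neighbour candidate pool, then cross-encoder rerank) can be sketched in plain Python. This is an assumption-laden illustration: a linear scan replaces the HNSW index, `rerank` is a caller-supplied stand-in for bge-reranker-v2-m3, and the field names `vector_score` and `rerank_score` are hypothetical. Vectors are assumed to be non-zero.

```python
import math
from typing import Callable, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine distance, the quantity pgvector's <=> operator returns."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def search(
    query: str,
    query_vec: Sequence[float],
    chunks: list[tuple[str, Sequence[float]]],
    rerank: Callable[[str, str], float],
    top_k: int = 5,
    pool: int = 20,
) -> list[dict]:
    # Stage 1: take the `pool` nearest chunks by cosine distance.
    by_distance = sorted(chunks, key=lambda c: cosine_distance(query_vec, c[1]))
    candidates = by_distance[:pool]

    # Stage 2: re-score every candidate with the reranker, then keep top_k.
    results = [
        {
            "text": text,
            "vector_score": 1.0 - cosine_distance(query_vec, vec),
            "rerank_score": rerank(query, text),
        }
        for text, vec in candidates
    ]
    results.sort(key=lambda r: r["rerank_score"], reverse=True)
    return results[:top_k]
```

The pool size bounds the reranker's cost: it only ever scores `pool` pairs, however large the corpus is.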
Data flow — Backup
[Hourly] launchd tako-fs-backup.plist
│
▼
[fs-backup script]
rclone sync minio:tako-kb → ~/Documents/KB-s3/
│
▼
[Filesystem mirror] ~/Documents/KB-s3/
[Daily 03:00 local] launchd tako-pg-backup.plist
│
▼
[pg_dump]
pg_dump ragkb → .sql.gz, rotate, keep last 14
│
▼
[Local snapshot folder] ~/Documents/KB-s3-backups/
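The hourly mirror keeps a filesystem copy of MinIO that is at most one hour stale. The daily Postgres dump captures the vector index and metadata. Between them, every store is recoverable: MinIO from the filesystem mirror, and Postgres from the latest dump or by re-ingesting the mirrored Markdown.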
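The "rotate, keep last 14" step of the daily dump could look like the sketch below. It assumes dump file names embed a sortable date, such as `ragkb-2025-01-31.sql.gz`; that naming scheme and the function `dumps_to_delete` are hypothetical, since the actual backup script is not shown here.

```python
def dumps_to_delete(names: list[str], keep: int = 14) -> list[str]:
    """Return the dump file names to remove, keeping the `keep` newest.

    Assumes names sort chronologically as strings (hypothetical naming).
    """
    newest_first = sorted(names, reverse=True)
    # Everything past the first `keep` entries is older than the retention window.
    return sorted(newest_first[keep:])
```

With 16 daily dumps on disk, the function returns the two oldest names; with 14 or fewer it returns an empty list, so nothing is deleted.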
Data flow — Disaster recovery
Scenario A: MinIO corrupt / wiped
│
▼
rclone sync ~/Documents/KB-s3/ → minio:tako-kb
(FS mirror is canonical source for blobs)
│
▼
Restart tako-mount-watcher → re-emits ingest events for any drift
Scenario B: Postgres corrupt / wiped
│
▼
gunzip -c ~/Documents/KB-s3-backups/<latest>.sql.gz | psql -d ragkb
│ OR re-ingest from FS:
│ find ~/Documents/KB-s3/ -name "*.md" -print0 | xargs -0 -n1 python3 <hooks-dir>/kb-ingest-file.py
▼
Restart MCP server
Component responsibilities
| Component | Owns | Doesn’t own |
|---|---|---|
| Filesystem mount (~/Documents/KB-s3/) | Source of truth, plain markdown, dual-scheme URI mapping | Search, embedding, indexing |
| Sync agents (skills/hooks/crawlers) | Pulling from external sources, writing MD files | Embedding, storage |
| tako-mount-watcher | fswatch on mount path → invokes kb-ingest-file.py | Classification, embedding |
| kb-ingest-file.py | Path → workspace + source_type classification → kb_ingest API call | Embedding, vector storage |
| MCP server (tako) | Auth, chunking, embedding, rerank, Postgres + BlobStore coordination | Source acquisition |
| bge-m3 + bge-reranker-v2-m3 | Vectorize text + cross-encoder rerank | Storage, retrieval orchestration |
| Postgres 16 + pgvector | Persist sources + chunks + vectors, ANN search via HNSW | Embedding, business logic |
| MinIO S3 | Primary blob store | Search, embedding |
| FS mirror | Hourly snapshot of MinIO + fallback blob store | Live serving |
| Cloudflare tunnel | Public HTTPS endpoint (optional, for remote-device access only) | Auth (delegated to MCP server) |
| Claude clients | UI, LLM reasoning, tool routing | Embeddings, retrieval |
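The path classification owned by kb-ingest-file.py can be sketched as below, following the `<kb-mount>/<workspace>/<folder>/<slug>.md` layout. The actual rules are not documented here, so several parts are assumptions: the `SOURCE_TYPE_BY_FOLDER` lookup and its entries, the `"note"` default, the refusal to classify `_secrets`, and the function name `classify`.

```python
from pathlib import PurePosixPath

KNOWN_WORKSPACES = {"ll", "mindx", "_personal", "_shared", "_canon", "_secrets"}

# Hypothetical folder-name -> source_type lookup; the real mapping is not documented.
SOURCE_TYPE_BY_FOLDER = {"meetings": "meeting", "slack": "chat", "email": "email"}


def classify(mount: str, full_path: str) -> dict:
    """Derive workspace, source_type, tags, and slug from a file path under the mount."""
    # relative_to raises ValueError if the file is not under the mount.
    rel = PurePosixPath(full_path).relative_to(mount)
    parts = rel.parts
    if len(parts) < 2 or parts[0] not in KNOWN_WORKSPACES:
        raise ValueError(f"cannot classify path: {full_path}")
    workspace = parts[0]
    if workspace == "_secrets":
        # Assumption: credentials are never sent to kb_ingest.
        raise PermissionError("_secrets is not ingested")

    folders = list(parts[1:-1])  # folder hierarchy between workspace and file
    source_type = next(
        (SOURCE_TYPE_BY_FOLDER[f] for f in folders if f in SOURCE_TYPE_BY_FOLDER),
        "note",  # assumed default when no folder matches
    )
    return {
        "workspace": workspace,
        "source_type": source_type,
        "tags": folders,  # auto-tag from folder hierarchy
        "slug": rel.stem,
    }
```

Keeping this logic in the helper, outside the MCP server, matches the split in the table: the helper decides what a file is, and the server decides how to embed and store it.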
Failure modes & recovery
| Failure | Detect | Recovery | Time |
|---|---|---|---|
| MCP server crash | launchd KeepAlive | Auto-restart | <10s |
| cloudflared crash | launchd | Auto-restart, tunnel re-establishes | <30s |
| MinIO down | Blob writes fail | BlobStore falls back to file:// mirror path automatically; restart MinIO | ~1 min |
| Postgres corruption | kb_health returns error | Restore from latest pg dump | ~5 min |
| Filesystem mirror drift | Daily reconcile job | Re-run rclone sync from MinIO | <2 min |
| Mount-watcher hung | Healthcheck script | Restart launchd job | <1 min |
| Cloudflare tunnel UUID lost | DNS resolves but no backend | Recreate tunnel, update DNS | ~5 min |
| OAuth state file corrupted | All logins fail | Delete state file → all clients re-register on next use | <1 min user impact |
| Password forgotten | /login always 401 | Regenerate .oauth_env, restart server | <2 min |
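The "MinIO down" row relies on the BlobStore falling back from s3:// to file://. A minimal sketch of that behaviour follows. It is not the project's BlobStore: the class shape, the injected `s3_put` callable, and the choice to treat `OSError` (which includes `ConnectionError`) as the outage signal are all assumptions. A real S3 SDK client may raise its own exception types, which would need to be caught as well.

```python
from pathlib import Path
from typing import Callable


class BlobStore:
    """Write to S3 first; on a connection failure, write to the local mirror."""

    def __init__(
        self,
        s3_put: Callable[[str, bytes], None],  # hypothetical uploader: (key, data)
        mirror_root: Path,
        bucket: str = "tako-kb",
    ) -> None:
        self._s3_put = s3_put
        self._mirror_root = mirror_root.resolve()  # absolute, so as_uri() works
        self._bucket = bucket

    def put(self, key: str, data: bytes) -> str:
        """Persist a blob and return the URI it was actually written to."""
        try:
            self._s3_put(key, data)
            return f"s3://{self._bucket}/{key}"
        except OSError:
            # Assumed outage signal: fall back to the filesystem mirror path.
            path = self._mirror_root / key
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_bytes(data)
            return path.as_uri()  # file://...
```

Returning the URI that was actually used lets the caller record whether a blob landed in MinIO or in the mirror, which matters when reconciling after MinIO comes back.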
Why these choices
| Decision | Alternative considered | Why this won |
|---|---|---|
| MCP over Telegram bot | Build custom Telegram bot | Native Claude integration, multi-client (web/desktop/iOS) for free |
| Postgres + pgvector over Oracle ADB | ADB free tier | Local control, no cloud quota / eviction risk, pgvector mature, HNSW available |
| bge-m3 over multilingual-e5-small | e5-small | Stronger VN+EN+code-mixed retrieval; reranker stack natively pairs with it |
| Reranker (bge-reranker-v2-m3) on top-N | Single-stage | +3.2pp Hit@1 on held-out eval; latency cost negligible at N=20 |
| Local MBP over cloud VM | Cloud VM | $0 ongoing, lower latency, hardware already paid for; daemon footprint is 500 MB |
| MinIO over managed S3 | AWS S3 | Local-first; same API; FS fallback when MinIO down |
| Workspace-scoped tools over single kb_search | One global tool | Forces Claude to route correctly; prevents cross-workspace bleed; enables per-workspace policy |
| Filesystem-mount-as-canonical | DB-first | Filesystem is portable, durable, human-inspectable; DB is re-buildable |
| Cloudflare named tunnel over public IP | Direct IP exposure | No firewall management, free TLS, easy DNS, persistent URL |
| FastMCP SDK auth over hand-rolled OAuth | Hand-roll RFC 8414 + 7591 | SDK provides metadata + DCR + token endpoints out-of-box |
See also