● M1 done P0 Size S Foundation

Personal-RAG — Architecture

System diagrams, data flows, component responsibilities, failure modes.


Sister docs: PRD (intent), Implementation (deep-dive), Notes (decision log).

System view

flowchart TB
    classDef client fill:#cce0e8,stroke:#1a1a1d,color:#1a1a1d,stroke-width:2px
    classDef edge fill:#e0d5ed,stroke:#1a1a1d,color:#1a1a1d,stroke-width:2px
    classDef server fill:#faedd6,stroke:#1a1a1d,color:#1a1a1d,stroke-width:2px
    classDef store fill:#f4d6db,stroke:#1a1a1d,color:#1a1a1d,stroke-width:2px

    subgraph Clients["👤 User devices (Mac + iPhone)"]
        Desktop["Claude Desktop<br/>(legacy bearer)"]
        Web["Claude.ai web<br/>(OAuth 2.0)"]
        iOS["Claude iOS app<br/>(inherits web)"]
        Bridge["mcp-remote bridge<br/>(npx)"]
        Desktop --> Bridge
    end

    Bridge -->|localhost:8080| Server
    Web --> CF
    iOS --> CF

    subgraph CF["☁️ Cloudflare (optional, for remote-device access)"]
        Tunnel["Named Tunnel<br/>DNS CNAME · TLS · Bot mgmt"]
    end

    subgraph Host["🖥️ MacBook Pro M2 Max · launchd"]
        CFD["cloudflared agent"]
        Server["uvicorn + Starlette<br/>FastMCP server (tako 0.6.0-s3)"]
        Tools["MCP Tools<br/>kb_search_ll · kb_search_mindx<br/>kb_search_personal · kb_search_shared<br/>kb_search_canon · kb_ingest · kb_stats · kb_health"]
        Embed["bge-m3 (multilingual)<br/>+ bge-reranker-v2-m3"]
        OAuth["OAuth 2.0<br/>/authorize /token /register"]
        Blob["BlobStore abstraction<br/>(s3:// + file:// dual-scheme)"]
        Watcher["tako-mount-watcher<br/>(fswatch → kb-ingest-file.py)"]
        CFD --> Server
        Server --> OAuth
        Server --> Tools
        Tools --> Embed
        Tools --> Blob
    end

    Tunnel --> CFD

    subgraph Storage["🗄️ Local stores"]
        PG["Postgres 16 + pgvector<br/>HNSW · 42K sources · 182K chunks · 3.2 GB"]
        MinIO["MinIO S3<br/>tako-kb bucket<br/>primary blob store"]
        FS["~/Documents/KB-s3/ mount<br/>filesystem mirror (hourly)"]
    end

    Tools -.->|asyncpg| PG
    Blob -.->|s3 SDK| MinIO
    Blob -.->|fallback| FS
    Watcher -.watches.-> FS

    subgraph Backup["📦 Backup jobs (launchd)"]
        PGBackup["tako-pg-backup<br/>daily pg_dump"]
        FSBackup["tako-fs-backup<br/>hourly MinIO → FS mirror"]
    end

    PG -.dump.-> PGBackup
    MinIO -.mirror.-> FSBackup

    class Desktop,Web,iOS,Bridge client
    class Tunnel edge
    class CFD,Server,Tools,Embed,OAuth,Blob,Watcher server
    class PG,MinIO,FS,PGBackup,FSBackup store

Workspaces

The corpus is partitioned into 6 workspaces. Five of them have a dedicated scoped MCP tool; _secrets deliberately has none. Routing happens at the client (Claude reasons about which workspace fits the query).

| Workspace | Purpose | MCP tool | Notes |
|---|---|---|---|
| ll | Work knowledge (PRDs, Jira, Confluence, meetings, email, Slack) | kb_search_ll | Multi-tenant: client_filter ∈ {PCF, NewLife, Ilham, BBL} |
| mindx | Consulting engagement | kb_search_mindx | |
| _personal | Side projects, finance, health, learning, memory | kb_search_personal | |
| _shared | Cross-cutting research notes, cheatsheets, prompt patterns | kb_search_shared | Subjective curation |
| _canon | External vendor authoritative docs (Anthropic, xAI, HuggingFace, arxiv) | kb_search_canon | Crawled by the AI-Canon-Crawler |
| _secrets | Encrypted credentials | (no MCP tool) | Vault flow only, never exposed via RAG |
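
The workspace table can be read as a small registry. The following is a minimal sketch, not the server's actual registration code: `WORKSPACE_TOOLS`, `LL_CLIENTS`, and `tool_for` are hypothetical names introduced here for illustration. It shows how leaving `_secrets` without a tool makes that workspace unreachable through search.

```python
from typing import Optional

# Hypothetical registry mirroring the workspace table; the real server may
# register its tools differently.
WORKSPACE_TOOLS: dict[str, Optional[str]] = {
    "ll": "kb_search_ll",
    "mindx": "kb_search_mindx",
    "_personal": "kb_search_personal",
    "_shared": "kb_search_shared",
    "_canon": "kb_search_canon",
    "_secrets": None,  # vault flow only, never exposed via RAG
}

# Tenants accepted by client_filter on the ll workspace (from the table).
LL_CLIENTS = {"PCF", "NewLife", "Ilham", "BBL"}


def tool_for(workspace: str) -> str:
    """Return the scoped search tool for a workspace, or raise if none exists."""
    if workspace not in WORKSPACE_TOOLS:
        raise KeyError(f"unknown workspace: {workspace}")
    tool = WORKSPACE_TOOLS[workspace]
    if tool is None:
        raise PermissionError(f"workspace {workspace!r} has no MCP tool")
    return tool
```

Calling `tool_for("ll")` yields `kb_search_ll`, while `tool_for("_secrets")` raises instead of returning a tool name.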

Data flow — Ingest

[1] Trigger event (one of):
    - AI session ends                    → user-shell hook archives transcript
    - Chat exporter / wiki crawler runs  → writes file
    - Mail digest task runs              → writes file
    - Meeting transcriber runs           → writes file
    - Ticket crawler renders MD          → writes file
    - AI-Canon-Crawler renders MD        → writes file (→ _canon workspace)



[2] Markdown file written to <kb-mount>/<workspace>/<folder>/<slug>.md
    (filesystem mount = canonical source-of-truth; URI scheme s3://tako-kb/...
     if under MinIO mount, file://... if under legacy FS path)



[3] tako-mount-watcher (launchd + fswatch) detects write:
    python3 <hooks-dir>/kb-ingest-file.py <full-path>



[4] Helper:
    - reads file → text
    - classifies workspace + source_type from path
    - auto-tags from folder hierarchy
    - HTTP POST localhost:8080/mcp tools/call kb_ingest



[5] MCP server kb_ingest tool:
    - SHA-256(text) → content_hash
    - SELECT WHERE source_uri = :u
       - row exists + same hash → return {skipped: true} ← idempotency
       - row exists + diff hash → DELETE old chunks, UPDATE source
       - new                    → INSERT source RETURNING id
    - chunk_text(text)                              → list[str]
    - bge-m3 encode → list[float[1024]]
    - INSERT INTO kb_chunks ... (executemany)
    - persist blob to BlobStore (s3:// primary, file:// fallback)
    - COMMIT



[6] Return {source_id, chunks_inserted, skipped, hash, blob_uri}
    Logged to <hooks-dir>/kb-ingest-file.log
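
The idempotency branch in step [5] can be sketched as a pure decision function. This is an illustration under stated assumptions, not the server's code: `content_hash` and `ingest_action` are hypothetical names, and the stored hash is passed in directly rather than read from Postgres.

```python
import hashlib
from typing import Optional


def content_hash(text: str) -> str:
    """SHA-256 of the UTF-8 encoded text, as a hex digest."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def ingest_action(existing_hash: Optional[str], text: str) -> str:
    """Decide what kb_ingest should do for one source_uri.

    existing_hash is the content_hash stored for that URI,
    or None when no row exists yet.
    """
    new_hash = content_hash(text)
    if existing_hash is None:
        return "insert"   # new source: INSERT source, then chunks
    if existing_hash == new_hash:
        return "skip"     # unchanged content: return {skipped: true}
    return "replace"      # changed content: DELETE old chunks, UPDATE source
```

Because the decision depends only on the URI's stored hash and the new text, re-running the watcher over an unchanged file costs one hash and one lookup, with no re-embedding.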

Data flow — Query

[1] User → Claude (any client):
    "Anything new about <topic> in PCF this week?"



[2] Claude routes by workspace → calls scoped MCP tool, e.g.
    kb_search_ll(query="...", client_filter="PCF", top_k=5)

    POST https://<host>/mcp  (or localhost:8080 direct)
    Authorization: Bearer <access_token>
    Accept: text/event-stream



[3] Cloudflare tunnel (if remote) → uvicorn → middleware auth check
    - validates Bearer token via FileOAuthProvider.load_access_token()



[4] FastMCP routes to kb_search_<workspace> handler:
    - qvec = bge_m3.encode("query: <user-query>")     # 1024-dim
    - SQL: SELECT c.text, s.source_uri, s.source_type, s.title, s.tags,
                  c.chunk_idx,
                  1 - (c.embedding <=> :q) AS score   -- pgvector cosine
             FROM kb_chunks c JOIN kb_sources s ON s.id = c.source_id
            WHERE s.workspace = :ws
              [AND s.client IN (:client_filter, '_generic', '_multi')]
            ORDER BY c.embedding <=> :q
            LIMIT 20                                   -- candidate pool
      (HNSW index on c.embedding)
    - bge-reranker-v2-m3 cross-encode → re-score top 20
    - return top_k after rerank



[5] Format results:
    [
      {score: 0.91, text: "...", source_uri: "s3://tako-kb/ll/pcf/...",
       source_type: "meeting", title: "...", tags: [...], chunk_idx: 0},
      ...
    ]



[6] Streamable HTTP returns SSE event:
    event: message
    data: {"jsonrpc": "2.0", "id": N, "result": {...}}



[7] Claude reads results, synthesizes natural-language answer with citations.
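
The two-stage retrieval in step [4] (candidate pool by cosine similarity, then cross-encoder rerank) can be sketched in plain Python. This is a toy model under assumptions: the brute-force scan stands in for the HNSW index, the `rerank` callable stands in for bge-reranker-v2-m3, and `search` and `cosine_similarity` are hypothetical names.

```python
import math
from typing import Callable, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Same quantity as 1 - (a <=> b) with pgvector's cosine distance."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(
    qvec: Sequence[float],
    chunks: Sequence[tuple[str, Sequence[float]]],
    rerank: Callable[[str], float],
    pool: int = 20,
    top_k: int = 5,
) -> list[str]:
    """Return the top_k chunk texts after reranking a candidate pool."""
    # Stage 1: nearest neighbours by cosine similarity (brute force here).
    candidates = sorted(
        chunks, key=lambda c: cosine_similarity(qvec, c[1]), reverse=True
    )[:pool]
    # Stage 2: re-score only the candidate pool with the stubbed reranker.
    reranked = sorted(candidates, key=lambda c: rerank(c[0]), reverse=True)
    return [text for text, _ in reranked[:top_k]]
```

The reranker only ever sees the pool, so a chunk that misses the first-stage cut cannot be recovered in the second stage; the pool size of 20 is the knob that trades recall against rerank latency.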

Data flow — Backup

[Hourly]                                  [Daily 03:00 local]
    │ launchd tako-fs-backup.plist            │ launchd tako-pg-backup.plist
    ▼                                         ▼
[fs-backup script]                        [pg_dump]
    │ rclone sync minio:tako-kb            │ pg_dump ragkb → .sql.gz
    │   → ~/Documents/KB-s3/               │ rotate, keep last 14
    ▼                                         ▼
[Filesystem mirror]                       [Local snapshot folder]
    ~/Documents/KB-s3/                    ~/Documents/KB-s3-backups/

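The hourly mirror keeps a filesystem copy of MinIO that is at most one hour stale. The daily Postgres dump captures sources, chunks, vectors, and metadata (the HNSW index is rebuilt on restore). Between them, MinIO can be restored from the mirror, and Postgres can be restored from the latest dump or rebuilt by re-ingesting the mirrored markdown files.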
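
The "rotate, keep last 14" step of the daily dump can be sketched as follows. This is a minimal illustration, assuming dumps are files matching `*.sql.gz` in one folder and that modification time reflects dump order; `rotate_dumps` is a hypothetical helper, not the actual backup script.

```python
from pathlib import Path


def rotate_dumps(
    backup_dir: Path, keep: int = 14, pattern: str = "*.sql.gz"
) -> list[Path]:
    """Delete all but the `keep` newest dumps and return the removed paths."""
    dumps = sorted(
        backup_dir.glob(pattern),
        key=lambda p: p.stat().st_mtime,
        reverse=True,  # newest first
    )
    stale = dumps[keep:]
    for path in stale:
        path.unlink()
    return stale
```

A call such as `rotate_dumps(Path.home() / "Documents" / "KB-s3-backups")` would leave the 14 most recent dumps in place.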

Data flow — Disaster recovery

Scenario A: MinIO corrupt / wiped


    rclone sync ~/Documents/KB-s3/ → minio:tako-kb
    (FS mirror is canonical source for blobs)


    Restart tako-mount-watcher → re-emits ingest events for any drift

Scenario B: Postgres corrupt / wiped


    gunzip -c <latest-dump>.sql.gz | psql -d ragkb   (dump from ~/Documents/KB-s3-backups/)
    │ OR re-ingest from FS:
    │   find ~/Documents/KB-s3/ -name "*.md" -print0 | xargs -0 -n1 python3 <hooks-dir>/kb-ingest-file.py

    Restart MCP server

Component responsibilities

| Component | Owns | Doesn’t own |
|---|---|---|
| Filesystem mount (~/Documents/KB-s3/) | Source of truth, plain markdown, dual-scheme URI mapping | Search, embedding, indexing |
| Sync agents (skills/hooks/crawlers) | Pulling from external sources, writing MD files | Embedding, storage |
| tako-mount-watcher | fswatch on mount path → invokes kb-ingest-file.py | Classification, embedding |
| kb-ingest-file.py | Path → workspace + source_type classification → kb_ingest API call | Embedding, vector storage |
| MCP server (tako) | Auth, chunking, embedding, rerank, Postgres + BlobStore coordination | Source acquisition |
| bge-m3 + bge-reranker-v2-m3 | Vectorize text + cross-encoder rerank | Storage, retrieval orchestration |
| Postgres 16 + pgvector | Persist sources + chunks + vectors, ANN search via HNSW | Embedding, business logic |
| MinIO S3 | Primary blob store | Search, embedding |
| FS mirror | Hourly snapshot of MinIO + fallback blob store | Live serving |
| Cloudflare tunnel | Public HTTPS endpoint (optional, for remote-device access only) | Auth (delegated to MCP server) |
| Claude clients | UI, LLM reasoning, tool routing | Embeddings, retrieval |
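
The path classification owned by kb-ingest-file.py can be sketched from the `<kb-mount>/<workspace>/<folder>/<slug>.md` layout. This is an assumption-heavy illustration: `classify` is a hypothetical function, the `s3://tako-kb/` prefix is taken from the example URIs in this document, and source_type is omitted because the real classification rules are not described here.

```python
from pathlib import PurePosixPath


def classify(path: str, mount: str) -> dict:
    """Derive workspace, folder tags, and source_uri from a file path.

    Assumes the layout <mount>/<workspace>/<folder...>/<slug>.md.
    """
    rel = PurePosixPath(path).relative_to(mount)  # raises ValueError if outside mount
    parts = rel.parts
    if len(parts) < 2:
        raise ValueError(f"expected <workspace>/.../<slug>.md, got {rel}")
    return {
        "workspace": parts[0],
        "tags": list(parts[1:-1]),          # auto-tags from the folder hierarchy
        "source_uri": f"s3://tako-kb/{rel}",
    }
```

Keeping this step free of embedding and storage concerns matches the ownership split in the table: the helper only turns a path into metadata and hands the rest to kb_ingest.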

Failure modes & recovery

| Failure | Detect | Recovery | Time |
|---|---|---|---|
| MCP server crash | launchd KeepAlive | Auto-restart | <10s |
| cloudflared crash | launchd | Auto-restart, tunnel re-establishes | <30s |
| MinIO down | Blob writes fail | BlobStore falls back to file:// mirror path automatically; restart MinIO | ~1 min |
| Postgres corruption | kb_health returns error | Restore from latest pg dump | ~5 min |
| Filesystem mirror drift | Daily reconcile job | Re-run rclone sync from MinIO | <2 min |
| Mount-watcher hung | Healthcheck script | Restart launchd job | <1 min |
| Cloudflare tunnel UUID lost | DNS resolves but no backend | Recreate tunnel, update DNS | ~5 min |
| OAuth state file corrupted | All logins fail | Delete state file → all clients re-register on next use | <1 min user impact |
| Password forgotten | /login always 401 | Regenerate .oauth_env, restart server | <2 min |
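
The "MinIO down" row relies on the BlobStore falling back from s3:// to file://. A minimal sketch of that behaviour, assuming a hypothetical `put_blob` function and an injected `s3_put` callable in place of a real S3 client (the actual BlobStore interface is not shown in this document):

```python
from pathlib import Path
from typing import Callable


def put_blob(
    key: str,
    data: bytes,
    s3_put: Callable[[str, bytes], None],
    mirror_root: Path,
) -> str:
    """Write to the primary store; on failure write to the mirror instead.

    Returns the URI the blob was stored under (s3:// or file://).
    """
    try:
        s3_put(key, data)
        return f"s3://tako-kb/{key}"
    except Exception:
        # Broad catch for the sketch only; real code would catch the
        # S3 client's connection and timeout errors specifically.
        target = mirror_root / key
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
        return target.resolve().as_uri()
```

Returning the URI that was actually used lets the caller record it as blob_uri, so later reads know which scheme to resolve.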

Why these choices

| Decision | Alternative considered | Why this won |
|---|---|---|
| MCP over Telegram bot | Build custom Telegram bot | Native Claude integration, multi-client (web/desktop/iOS) for free |
| Postgres + pgvector over Oracle ADB | ADB free tier | Local control, no cloud quota / eviction risk, pgvector mature, HNSW available |
| bge-m3 over multilingual-e5-small | e5-small | Stronger VN+EN+code-mixed retrieval; reranker stack natively pairs with it |
| Reranker (bge-reranker-v2-m3) on top-N | Single-stage | +3.2pp Hit@1 on held-out eval; latency cost negligible at N=20 |
| Local MBP over cloud VM | Cloud VM | $0 ongoing, lower latency, hardware already paid for; daemon footprint is 500 MB |
| MinIO over S3-managed | AWS S3 | Local-first; same API; FS fallback when MinIO down |
| Workspace-scoped tools over single kb_search | One global tool | Forces Claude to route correctly; prevents cross-workspace bleed; enables per-workspace policy |
| Filesystem-mount-as-canonical | DB-first | Filesystem is portable, durable, human-inspectable; DB is re-buildable |
| Cloudflare named tunnel over public IP | Direct IP exposure | No firewall management, free TLS, easy DNS, persistent URL |
| FastMCP SDK auth over hand-rolled OAuth | Hand-roll RFC 8414 + 7591 | SDK provides metadata + DCR + token endpoints out-of-box |

See also