● M1 done P0 Size S Foundation

Personal-RAG — Notes

Chronological decision log, gotchas, and working-session hours.

Notes & Decision Log

Format: YYYY-MM-DD — context — decision/finding.

Decisions

  • 2026-05-25 — Added _canon workspace + kb_search_canon MCP tool to host authoritative external vendor docs (Anthropic, xAI, HuggingFace, arXiv). Populated by the AI-Canon-Crawler (Mode C of the audit pipeline). Distinct from _shared (Marc’s own subjective research notes) — _canon is objective vendor truth. Workspace registered server-side; classifier label + playbook section updated. Extensible pattern for pm_canon / design_canon later.
  • 2026-05-24 — S3 migration complete. A LocalStack Hobby trial showed that persistence is license-gated in practice (freemium behavior didn’t match the docs), so pivoted to MinIO. Shipped tako 0.6.0-s3 with BlobStore abstraction (MinIO primary + filesystem mirror fallback) and dual-scheme URI support (s3://tako-kb/<key> for mount path, file://<abs> for legacy). 1000-row regression test ran with 100% parity vs pre-migration. Three launchd jobs in production: tako-mount-watcher (fswatch auto-ingest), tako-pg-backup (daily), tako-fs-backup (hourly MinIO → FS mirror). Post-migration canonical write target is the mount path ~/Documents/KB-s3/.
  • 2026-05-21 — Phase 2 dojo eval complete on the personal workspace. Built held-out 93-query eval set; measured Hit@3 = 97.8%, MRR = 0.948 with bge-m3 alone. Reranker swap (bge-reranker-v2-m3 cross-encoder on top-20) gave marginal +3.2pp Hit@1 at negligible latency cost — shipped as default. Routing test: 100% no-miss (every query found via correct scoped tool). Decision: skip the embedding bake-off — bge-m3 is near-SOTA, swap lift would be 1–3pp without clear ROI. Reranker is the better marginal investment.
  • 2026-05-x — Multi-workspace refactor. Single global kb_search → 6 workspaces with scoped tools (kb_search_ll, kb_search_mindx, kb_search_personal, kb_search_shared, kb_search_canon, kb_search_all). LL workspace adds multi-tenant client_filter for PCF / NewLife / Ilham / BBL with _generic + _multi shared buckets. Orchestration playbook moved server-side (sent in serverInfo.instructions on init handshake) so all clients get consistent routing rules without per-device CLAUDE.md drift.
  • 2026-05-x — Embedder + DB swap. Migrated from cloud ADB 23ai → local Postgres 16 + pgvector; multilingual-e5-small → bge-m3 (1024-dim). Reasons: zero cloud dependency, MPS-accelerated embedder on M2 Max, stronger VN+EN+code-mixed retrieval. HNSW index rebuilt; latency p95 dropped to 840 ms warm with the local hop.
  • 2026-04-29 (Day 8) — OAuth 2.0 via the MCP SDK’s OAuthAuthorizationServerProvider Protocol. Single-user, file-backed JSON store, PBKDF2 200K rounds. Legacy bearer token still works through the provider’s load_access_token() fallback (Claude Desktop unbroken).
  • 2026-04-29 (Day 7) — Backup architecture v1: weekly dump + restore script. (Later superseded by daily tako-pg-backup + hourly tako-fs-backup launchd jobs in the S3 milestone.)
  • 2026-04-27 (Day 6) — Persistent tunnel via Cloudflare named tunnel on a custom domain. DNS CNAME → tunnel UUID. Free tier sufficient. Tunnel is optional, used only when accessing from a device other than the host.
  • 2026-04-27 (Day 4-5) — Filesystem-first ingest rule: skill writes .md → calls kb-ingest-file.py. Filesystem = canonical, vector store = derived index. Sync 1-way only.
  • 2026-04-27 (Day 4-5) — Generic helper kb-ingest-file.py. Auto-detects workspace + source_type + tags from path rules. Idempotent via server hash check.
  • 2026-04-27 (Day 3b) — First multilingual swap (BGE-en → e5-small). VN query verbatim score jumped 0.69–0.77 → 0.85+. (Later upgraded again to bge-m3 in the local-Postgres migration.)
  • 2026-04-27 (Day 3) — Bulk migrate strategy: rsync .md files only, then run script with local DB connection — much faster than HTTPS-per-call.
  • 2026-04-27 (Day 3) — Chunk size 600 → 256 tokens + 32 overlap. Reason: the CPU embedder in use at the time had cost scaling ~quadratically with sequence length; 256-token chunks embedded 3× faster.
  • 2026-04-27 (Day 3) — Cap chunks/file. Huge transcripts only embed the head (≈12.8K tokens, ~25 dense pages).
  • 2026-04-27 (Day 2) — Schema final: kb_sources (1 row/file) + kb_chunks (FK, vector). Used source_uri UNIQUE + content_hash for idempotency rather than a synthetic doc_id hash.
  • 2026-04-27 (Day 1) — Pivot Telegram bot → MCP Streamable HTTP server. Reason: Claude clients (Desktop/web/iOS) call tools natively, no Bot UI needed. Same/better UX, less code.
  • 2026-04-27 (Day 1) — Bearer token auth first; OAuth deferred to Day 8.
  • 2026-04-26 — Foundation: chose vector-built-in DB pattern over Qdrant + local LLM serving.
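
The Day 3 chunking decisions above (256-token windows, 32-token overlap, a per-file chunk cap) can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the function name chunk_tokens, the pre-tokenized list input, and the max_chunks=50 default (guessed from ≈12.8K tokens / 256) are all assumptions.

```python
from typing import List, Sequence


def chunk_tokens(
    tokens: Sequence[int],
    size: int = 256,        # chunk length from the Day 3 decision
    overlap: int = 32,      # tokens shared between neighbouring chunks
    max_chunks: int = 50,   # hypothetical cap; the real value is not in the notes
) -> List[List[int]]:
    """Split a token sequence into overlapping windows, keeping only the head."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    stride = size - overlap  # how far each window advances
    chunks: List[List[int]] = []
    for start in range(0, len(tokens), stride):
        if len(chunks) >= max_chunks:
            break  # cap reached: the tail of a huge file is not embedded
        chunks.append(list(tokens[start:start + size]))
        if start + size >= len(tokens):
            break  # this window already reached the end of the input
    return chunks
```

With these defaults a 600-token file yields three chunks starting at offsets 0, 224, and 448, and each chunk begins with the last 32 tokens of the previous one.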

Gotchas

  • 2026-05-24 — rclone config create echoes credentials plaintext by default — must redirect stdout/stderr to /dev/null when scripting MinIO bootstrap to avoid leaking via shell history / process listing. Rotate creds if leaked.
  • 2026-05-x — launchd jobs that run binaries inside ~/Documents/ hang silently until Full Disk Access is granted per-binary in System Settings. Fix: grant FDA to /bin/bash (and any other interpreter launchd invokes) or copy scripts outside ~/Documents/.
  • 2026-05-x — Memory files referenced by the MCP server must live under the workspace they’re served from. Moved memory/ into ~/Documents/KB-s3/_personal/memory/ with a reverse-compat symlink; smoke-test P@3 went 73% → 100%.
  • Day 8 — OAuthAuthorizationServerProvider is a Protocol (Generic), not an ABC. Implement via duck-typing — no need to subclass. Its data types (OAuthClientInformationFull, AuthorizationCode, etc.) are Pydantic models.
  • Day 4-5 — Cloudflare bot management blocked the default Python urllib User-Agent with 403. Fix: set a custom User-Agent header. (curl works out of the box.)
  • Day 4-5 — Hook timeout of 30s wasn’t enough for large transcripts (300+ chunks). Bumped to 120s.
  • Day 4-5 — Files with a newline character in the filename made the reconcile script falsely flag an orphan. Hash check confirmed validity. Reconcile script needed escape-aware comparison.
  • Day 3 — torch defaults to 2 threads on a 4-core machine. Fix: torch.set_num_threads(N) + OMP_NUM_THREADS=N. (Replaced by MPS acceleration on M2 Max in the local-Postgres era.)
  • Day 1 — MCP SDK DNS rebinding protection defaults to True → blocked through Cloudflare proxy. Disable: TransportSecuritySettings(enable_dns_rebinding_protection=False). Bearer token is the security boundary.
  • Day 1 — FastMCP streamable_http_app() mount path is /mcp, not /mcp/. Mounting at /mcp/ causes a 307 redirect that drops the Authorization header.
  • Day 1 — Claude.ai web custom connectors REQUIRE OAuth — they don’t accept a raw bearer token. Worked around with Claude Desktop + mcp-remote bridge until Day 8 added proper OAuth.
  • Day 1 — mcp-remote bridge --header "Name:value" syntax has issues with spaces. Workaround: ${AUTH_HEADER} env var injection in Claude Desktop config.
  • Day 1 — Print buffering: bulk migrate script print() calls invisible until process exits. Fix: sys.stdout.reconfigure(line_buffering=True) + PYTHONUNBUFFERED=1.
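
The Cloudflare 403 gotcha above (default urllib User-Agent blocked) can be avoided with an explicit header, roughly as sketched here. This uses only the standard library; the endpoint URL, the User-Agent string, and the build_request helper are hypothetical placeholders, not values from the real setup. The URL deliberately ends in /mcp with no trailing slash, in line with the 307-redirect gotcha.

```python
import urllib.request

# Hypothetical values for illustration only.
USER_AGENT = "kb-ingest/0.1"
ENDPOINT = "https://kb.example.invalid/mcp"  # no trailing slash


def build_request(url: str, token: str, payload: bytes) -> urllib.request.Request:
    """Build a POST request that does not use urllib's default User-Agent."""
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={
            "User-Agent": USER_AGENT,            # replaces "Python-urllib/x.y"
            "Authorization": f"Bearer {token}",  # bearer token auth (Day 1)
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    req = build_request(ENDPOINT, "dummy-token", b"{}")
    # Nothing is sent here; pass req to urllib.request.urlopen() to call the server.
    print(req.get_method(), req.full_url, req.header_items())
```

Note that urllib normalizes header names with str.capitalize(), so the stored key is "User-agent" when reading it back with get_header().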

Working-session log

| Date | Hours | What | Outcome |
|---|---|---|---|
| 2026-04-26 | ~3 h | Cloud foundation (early VM + ADB attempt) | Compute + DB ready. |
| 2026-04-27 morning | ~2 h | Day 1 MCP scaffold + tunnel | kb_health live |
| 2026-04-27 midday | ~3 h | Day 2 schema + tools + bench | 4 tools working |
| 2026-04-27 afternoon | ~2 h (+ ~95 min bg) | Day 3 bulk migrate | initial 5K+ sources |
| 2026-04-27 late | ~3 h (+ ~131 min bg) | Day 3b first multilingual upgrade | VN parity |
| 2026-04-27 evening | ~3 h | Day 4-5 sync refactor | filesystem-first auto-ingest |
| 2026-04-28 | ~1 h | Day 6 persistent tunnel | named tunnel live |
| 2026-04-29 morning | ~2 h | Day 7 backup + DR | restore tested |
| 2026-04-29 afternoon | ~3 h | Day 8 OAuth | mobile + web access |
| M1 subtotal | ~22–25 h | | Foundation ready |
| 2026-05-x | ongoing | Multi-workspace refactor + local Postgres + bge-m3 | 6 workspaces, scoped tools |
| 2026-05-21 | ~4 h | Dojo eval Phase 2 | Hit@3=97.8%, reranker shipped |
| 2026-05-24 | ~5 h | S3 migration (LocalStack→MinIO pivot) | BlobStore, dual-URI, 100% regression parity |
| 2026-05-25 | ~2 h | _canon workspace + crawler integration | vendor docs queryable via kb_search_canon |