● Shipped P0 Size M Vertical app

Mail-Assistant — Architecture

System diagrams, data flow (pull → classify → surface → action), component responsibilities, failure modes.


Sister docs: PRD (intent), Implementation (deep-dive), Notes (decision log).

System view

flowchart TB
    classDef client fill:#cce0e8,stroke:#1a1a1d,color:#1a1a1d,stroke-width:2px
    classDef edge fill:#e0d5ed,stroke:#1a1a1d,color:#1a1a1d,stroke-width:2px
    classDef server fill:#faedd6,stroke:#1a1a1d,color:#1a1a1d,stroke-width:2px
    classDef store fill:#f4d6db,stroke:#1a1a1d,color:#1a1a1d,stroke-width:2px
    classDef ext fill:#d6e8d6,stroke:#1a1a1d,color:#1a1a1d,stroke-width:2px

    subgraph Mac["💻 Operator Mac"]
        App["Mail-Assistant.app<br/>(SwiftUI · NSPanel)"]
    end

    App --> CF

    subgraph CF["☁️ Cloudflare (private tunnel)"]
        Tunnel["Named tunnel<br/>TLS · optional CF Access"]
    end

    subgraph VM["🖥️ cron-host VM · systemd"]
        CFD["cloudflared agent"]
        API["FastAPI<br/>/threads /actions /healthz"]
        T1["timer: poll-imap<br/>every 5 min"]
        T2["timer: classify-batch<br/>every 5 min"]
        T3["timer: sync-actions<br/>every 2 min"]
        CFD --> API
    end

    Tunnel --> CFD

    subgraph DB["🗄️ Postgres 16 (same VM)"]
        Threads["threads"]
        Class["classifications"]
        Actions["actions"]
    end

    API -.-> DB
    T1 -.-> DB
    T2 -.-> DB
    T3 -.-> DB

    subgraph Ext["🌐 External APIs"]
        IMAP["IMAP × 3 accounts<br/>(work + 2 personal)"]
        Gmail["Gmail API<br/>(label + archive)"]
        Jira["Jira REST<br/>(ticket status)"]
        Claude["Claude Haiku 4.5<br/>(classifier)"]
    end

    T1 -.pull.-> IMAP
    T2 -.read latest.-> IMAP
    T2 -.lookup.-> Jira
    T2 -.classify.-> Claude
    T3 -.apply.-> Gmail

    class App client
    class Tunnel edge
    class CFD,API,T1,T2,T3 server
    class Threads,Class,Actions store
    class IMAP,Gmail,Jira,Claude ext

Three timers, single DB

The backend deliberately splits work across three systemd timers rather than a single long-running loop. Each timer job is idempotent, finishes in under 30 s, and records its progress in Postgres, so a failed run in one timer leaves the other two unaffected and is retried on the next fire.

| Timer | Cadence | Job |
|---|---|---|
| poll-imap.timer | every 5 min | Connect IMAP × 3 → fetch new UIDs → upsert into threads (latest message snapshot per thread) |
| classify-batch.timer | every 5 min | Select unclassified threads → run share-noti hard-rule → fetch Jira status + Sent staleness → call Haiku → write into classifications |
| sync-actions.timer | every 2 min | Read unsynced actions rows → Gmail API: archive INBOX + apply Mail-Assistant/<class> label → mark synced |
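
To make "idempotent" concrete for a job like poll-imap, here is a minimal sketch of a latest-only upsert. It uses an in-memory SQLite database as a stand-in for Postgres (the `ON CONFLICT ... DO UPDATE ... WHERE` form is also valid Postgres). The reduced column set and the `upsert_latest` helper are assumptions for illustration, not the project's actual schema or code.

```python
import sqlite3

# Reduced, hypothetical version of the threads table.
SCHEMA = """
CREATE TABLE threads (
    thread_id      TEXT PRIMARY KEY,
    latest_uid     INTEGER NOT NULL,
    latest_subject TEXT NOT NULL,
    latest_date    TEXT NOT NULL
)
"""

# Insert a new thread, or overwrite the snapshot only when the incoming
# message is strictly newer. Dates are ISO-8601 UTC strings, so string
# comparison matches chronological order.
UPSERT = """
INSERT INTO threads (thread_id, latest_uid, latest_subject, latest_date)
VALUES (:thread_id, :latest_uid, :latest_subject, :latest_date)
ON CONFLICT (thread_id) DO UPDATE SET
    latest_uid     = excluded.latest_uid,
    latest_subject = excluded.latest_subject,
    latest_date    = excluded.latest_date
WHERE excluded.latest_date > threads.latest_date
"""


def upsert_latest(conn: sqlite3.Connection, message: dict) -> None:
    """Apply one fetched message; replaying the same message is a no-op."""
    with conn:  # commit on success, roll back on error
        conn.execute(UPSERT, message)
```

Replaying a message, or receiving an older one late, leaves the row unchanged, which is why a timer that crashes halfway can just run again.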

Data flow — Pull

[poll-imap.timer fires every 5 min]


For each of 3 IMAP accounts:
    │ open IMAPS connection
    │ select INBOX
    │ UID SEARCH UID <last_seen_uid + 1>:*


For each new UID:
    │ UID FETCH (BODY.PEEK[HEADER] BODY.PEEK[TEXT])
    │ parse message-id, in-reply-to, references, from, subject, date
    │ resolve thread_id (RFC 5322 references chain; fall back to subject normalize)


UPSERT threads
    (thread_id, account, latest_uid, latest_from, latest_subject,
     latest_snippet, latest_date, jira_ticket_extracted)
    WHERE latest_date > existing.latest_date  ← thread-latest-only


INSERT INTO threads_audit (raw_message_id, fetched_at)  ← idempotency log
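
The thread-resolution step above can be sketched with the standard library alone. The fallback order (first `References` entry, then `In-Reply-To`, then the message's own `Message-ID`, then a hash of the normalized subject) is an assumption, as are the function names and the `subj:` prefix. The doc only states "references chain; fall back to subject normalize".

```python
import hashlib
import re
from email import message_from_string
from email.message import Message

# Leading reply/forward markers such as "Re:", "RE: Fwd:", "fw:".
_REPLY_PREFIX = re.compile(r"^\s*((re|fwd?)\s*:\s*)+", re.IGNORECASE)
# A message-id token as it appears in References / In-Reply-To headers.
_MSG_ID = re.compile(r"<[^<>\s]+>")


def normalize_subject(subject: str) -> str:
    """Strip reply prefixes, collapse whitespace, lowercase."""
    stripped = _REPLY_PREFIX.sub("", subject or "")
    return " ".join(stripped.split()).lower()


def resolve_thread_id(msg: Message) -> str:
    """Return a stable thread key for one parsed message."""
    # 1. The first id in References is the root of the conversation.
    refs = _MSG_ID.findall(msg.get("References", ""))
    if refs:
        return refs[0]
    # 2. Some clients only send In-Reply-To.
    parent = _MSG_ID.findall(msg.get("In-Reply-To", ""))
    if parent:
        return parent[0]
    # 3. A message with no parents starts its own thread.
    own = _MSG_ID.findall(msg.get("Message-ID", ""))
    if own:
        return own[0]
    # 4. Last resort: group by normalized subject.
    digest = hashlib.sha256(normalize_subject(msg.get("Subject", "")).encode()).hexdigest()
    return "subj:" + digest[:16]
```

A root message and every reply that carries its id in `References` resolve to the same key, so the upsert lands on one row per conversation.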

Data flow — Classify

[classify-batch.timer fires every 5 min]


SELECT thread_id FROM threads
  WHERE NOT EXISTS (
    SELECT 1 FROM classifications c
    WHERE c.thread_id = threads.thread_id
      AND c.classified_for_latest_uid = threads.latest_uid
  )
LIMIT 50;                                    ← bounded batch (cost guardrail)


For each thread:

    ├── [Hard-rule pass] share-notification detector
    │   regex: r'\(via (Google Sheets|Notion|Dropbox|Figma|...)\)' on sender
    │   body contains "shared with you"
    │   → write classification=NOISE, reason="share-noti", skip LLM

    ├── [Cross-channel pass] enrich context
    │   - jira_ticket_extracted? → GET /rest/api/3/issue/<key>?fields=status
    │     if status in ('Done','Closed','Cancelled') → bias toward NOISE
    │   - check Sent folder for replies in last 30 min on same thread
    │     if found → bias toward NOISE
    │   - future-tense gate: if subject contains "scheduled for"/"will happen"
    │     and target_date > now() → bias toward P2 (not P0)

    ├── [LLM pass] call Claude Haiku 4.5
    │   system_prompt: "Default class = NOISE. Promote only if a human reply
    │                   is required within 24h. Output JSON:
    │                   {class: NOISE|P0|P1|P2, action_summary, reason}"
    │   input: latest_snippet + sender + jira_status + sent_staleness
    │   ← parse JSON response


INSERT INTO classifications
    (thread_id, classified_for_latest_uid, class, action_summary,
     reason, tokens_used, cost_usd, classified_at)
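
A minimal sketch of the hard-rule pass and the JSON parsing step from the flow above. Several things here are assumptions: the sender regex and the "shared with you" check are combined with AND, the regex lists only the four services named in the doc (the original list is elided), and unparseable or invalid model output falls back to NOISE. `call_llm` is a hypothetical callable standing in for the Haiku request.

```python
import json
import re
from typing import Callable

# Only the services spelled out in the doc; the real list is longer.
SHARE_SENDER = re.compile(r"\(via (Google Sheets|Notion|Dropbox|Figma)\)")
VALID_CLASSES = ("NOISE", "P0", "P1", "P2")


def is_share_notification(sender: str, body: str) -> bool:
    """Hard rule: sender looks like '(via <service>)' AND body mentions sharing."""
    return bool(SHARE_SENDER.search(sender)) and "shared with you" in body.lower()


def _noise(reason: str) -> dict:
    return {"class": "NOISE", "action_summary": "", "reason": reason}


def parse_classification(raw: str) -> dict:
    """Validate the model's JSON; fall back to NOISE (assumed policy) if invalid."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return _noise("unparseable-llm-output")
    if not isinstance(data, dict) or data.get("class") not in VALID_CLASSES:
        return _noise("invalid-llm-class")
    return {
        "class": data["class"],
        "action_summary": str(data.get("action_summary", "")),
        "reason": str(data.get("reason", "")),
    }


def classify(sender: str, body: str, call_llm: Callable[[str, str], str]) -> dict:
    """Run the hard rule first; only call the model when it does not match."""
    if is_share_notification(sender, body):
        return _noise("share-noti")
    return parse_classification(call_llm(sender, body))
```

Keeping the hard rule in front means share notifications never reach the model. A stricter variant could leave invalid output unclassified for the next batch instead of writing NOISE.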

Data flow — Surface

[Mac app opens]


GET https://<tunnel-host>/threads
Authorization: Bearer <token>


[FastAPI handler]
    SELECT t.thread_id, t.latest_from, t.latest_subject,
           c.class, c.action_summary, c.reason
      FROM threads t
      JOIN classifications c USING (thread_id)
     WHERE c.classified_for_latest_uid = t.latest_uid
       AND c.class IN ('P0','P1','P2')                  ← NOISE never surfaces
       AND NOT EXISTS (
         SELECT 1 FROM actions a
         WHERE a.thread_id = t.thread_id AND a.kind IN ('done','archive')
       )
     ORDER BY CASE c.class WHEN 'P0' THEN 0 WHEN 'P1' THEN 1 ELSE 2 END,
              t.latest_date DESC;

[SwiftUI renders 3 collapsible sections]
    P0 ▼   (2 rows)
      ▸ Reply to vendor re: SLA breach           [D] [A]
      ▸ Confirm interview slot with candidate    [D] [A]
    P1 ▼   (3 rows)
    P2 ▶   (2 rows, collapsed by default)
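
The `Authorization: Bearer <token>` header on the request above has to be checked before the query runs. A minimal, framework-agnostic sketch of that check follows. The function name is hypothetical and the doc does not show how the FastAPI handler actually implements it. The comparison uses `hmac.compare_digest` so that timing does not leak how much of the token matched.

```python
import hmac
from typing import Optional


def bearer_token_ok(authorization_header: Optional[str], expected_token: str) -> bool:
    """Return True only for 'Bearer <token>' with the exact expected token."""
    if not authorization_header or not expected_token:
        return False
    scheme, _, presented = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison on bytes.
    return hmac.compare_digest(presented.strip().encode(), expected_token.encode())
```

In a FastAPI app this would typically sit in a dependency that returns 401 when the check fails.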

Data flow — Action

[User presses D on a row]


POST https://<tunnel-host>/actions
{ thread_id, kind: "done" }


INSERT INTO actions
    (thread_id, kind, requested_at, synced_at: NULL)

Row disappears from UI immediately (optimistic)

    ─── async ───

[sync-actions.timer fires every 2 min]


SELECT * FROM actions WHERE synced_at IS NULL;

For each row:
    │ Gmail API:
    │   - users.messages.modify
    │       removeLabelIds: ["INBOX"]
    │       addLabelIds: ["Label_MailAssistant_Done"]
    │ UPDATE actions SET synced_at = now()
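
The sync loop above can be sketched as follows, again with SQLite standing in for Postgres. `modify_labels` is a hypothetical callable that wraps the Gmail modify request, including mapping the thread to Gmail's own ids. The `id` column and the choice to add a label only for `done` are assumptions. A row is marked synced only after the call succeeds, so failures stay queued with `synced_at` NULL.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Callable

# Label id taken from the flow above; other kinds only leave the inbox here.
LABEL_IDS = {"done": "Label_MailAssistant_Done"}


def sync_pending_actions(
    conn: sqlite3.Connection,
    modify_labels: Callable[[str, dict], None],
) -> int:
    """Push unsynced actions to Gmail; return how many were synced."""
    pending = conn.execute(
        "SELECT id, thread_id, kind FROM actions WHERE synced_at IS NULL ORDER BY id"
    ).fetchall()
    synced = 0
    for action_id, thread_id, kind in pending:
        body = {"removeLabelIds": ["INBOX"]}
        label = LABEL_IDS.get(kind)
        if label:
            body["addLabelIds"] = [label]
        try:
            modify_labels(thread_id, body)
        except Exception:
            # Leave synced_at NULL so the next timer fire retries this row.
            continue
        with conn:
            conn.execute(
                "UPDATE actions SET synced_at = ? WHERE id = ?",
                (datetime.now(timezone.utc).isoformat(), action_id),
            )
        synced += 1
    return synced
```

If the process dies between the Gmail call and the UPDATE, the row is sent again on the next run. Removing a label that is already gone should be harmless, which is what makes the retry safe.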

Component responsibilities

| Component | Owns | Doesn’t own |
|---|---|---|
| IMAP accounts | Source of truth for raw email | Classification, action state |
| poll-imap.timer | Pulling new messages, thread resolution, latest-only upsert | Classification, LLM calls |
| classify-batch.timer | Share-noti hard-rules, cross-channel enrich, Haiku calls, writing classifications | IMAP fetching, action sync |
| sync-actions.timer | Propagating Done/Archive back to Gmail | LLM, classification |
| FastAPI server | HTTP surface for the Mac app, auth, read-after-write consistency | Background work (delegated to timers) |
| Postgres 16 | All persistent state (threads, classifications, actions) | Email content (kept ephemeral; only snippet stored) |
| Mail-Assistant.app | Surface, keyboard shortcuts, optimistic action UI | Background polling, classification |
| Cloudflare tunnel | Public HTTPS endpoint, TLS, optional CF Access | Auth (delegated to FastAPI bearer check) |
| Claude Haiku 4.5 | Per-thread classification given enriched context | Cross-channel enrichment (done by timer before call) |
| Jira REST + Gmail API | External truth (ticket status, label state) | App state |

Failure modes & recovery

| Failure | Detect | Recovery | Time |
|---|---|---|---|
| IMAP timeout / connection drop | Timer log + threads_audit gap | Next timer fire retries; UID-based dedupe prevents double-ingest | <10 min |
| Jira API down | Enrich call returns 5xx | Classifier skips Jira signal for this thread, uses degraded context; flag in reason | Self-heals |
| Haiku rate-limit or 5xx | Timer logs the exception | Batch row stays unclassified; next timer fire retries | <5 min |
| Postgres down | API 500, timers fail | systemd restarts pg; timers re-run | <2 min |
| FastAPI crash | systemd Restart=on-failure | Auto-restart in 5s | <10s |
| cloudflared crash | systemd | Auto-restart, tunnel re-establishes | <30s |
| Gmail token expired | sync-actions logs 401 | Manual refresh-token rotation; actions stay queued (synced_at NULL) | <5 min ops |
| Bad classifier prompt change | False-positive P0 spike | Rollback prompt; subset-test next time (see Notes) | hours of triage pain |
| Mac app stale data | User pulls to refresh | API is the source of truth; no client cache beyond session | <1s |
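
The "Jira API down" row describes degrading rather than failing. A minimal sketch of that pattern follows. `fetch_status` is a hypothetical callable wrapping the Jira request, and the shape of the returned context dict is an assumption.

```python
from typing import Callable, Optional

# Statuses the classify flow treats as a signal toward NOISE.
CLOSED_STATUSES = ("Done", "Closed", "Cancelled")


def enrich_with_jira(
    ticket_key: Optional[str],
    fetch_status: Callable[[str], str],
) -> dict:
    """Return Jira context for one thread, degrading instead of raising."""
    context = {"jira_status": None, "bias_noise": False, "degraded": False}
    if ticket_key is None:
        return context  # no ticket extracted, nothing to look up
    try:
        status = fetch_status(ticket_key)
    except Exception:
        # Jira unreachable or 5xx: classify without the signal and flag it,
        # so the flag can be copied into the classification's reason field.
        context["degraded"] = True
        return context
    context["jira_status"] = status
    context["bias_noise"] = status in CLOSED_STATUSES
    return context
```

Because the thread is still classified, a Jira outage costs some accuracy on ticket-related mail but does not stall the batch.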

Why these choices

| Decision | Alternative considered | Why this won |
|---|---|---|
| 3 systemd timers over 1 long-running loop | Single asyncio loop | Each timer fails independently; idempotent; easy to debug in isolation |
| Postgres over SQLite | SQLite single-file | 3 timers + 1 API process = concurrent writers; pg handles that natively |
| IMAP over Gmail API for pull | Gmail API across 3 accounts | IMAP is uniform across the 3 different providers I use; Gmail API only used for the write side (label + archive) |
| Haiku 4.5 over local Llama 3.2 | Local LLM | Local model false-positive on share-notis even after prompt tuning; Haiku gets share-noti + ambiguity right at $0.04/day |
| NOISE-default over P2-default | 4 visible classes | First version landed 70% in P2 = same triage problem. Default NOISE forces the classifier to justify promotion |
| Thread-latest-only over full thread | Full conversation history | Old messages bias the classifier toward stale states; latest-only gave +12% accuracy on a 200-row eval |
| Share-noti hard-rules over LLM | Trust the LLM | Hard-rules are deterministic, free, and remove 22% of daily volume before any LLM call |
| Native SwiftUI over Electron | React + Electron | Daily-driver tool: wanted NSPanel + Liquid Glass aesthetic, no resource overhead, fast keyboard shortcuts |
| Single surface (Mac only) over Mac+Slack+iOS | Multi-surface notifications | Each extra surface re-creates the original noise problem; less surface = less triage debt |
| Done/Archive only over Done/Archive/Snooze | Add Snooze button | Snooze is “anxiety as a feature”: it surfaces the same row again tomorrow with no new context |
| Action verb row title over raw subject | Show the email subject | The row should be the action I’d take, not the marketing email’s subject line |
| Cloudflare tunnel over local-only | Bind to localhost | Occasional triage from a different laptop; bearer + optional CF Access is enough |
| Private tunnel over public name | Subdomain on a personal zone | Single-user system; not advertising the endpoint |

See also