← Back to project
● Shipped P0 Size M Foundation

Knowledge-Audit — Enterprise

Five enterprise adaptations: B2B SaaS doc consistency, fintech compliance drift, edtech policy sync, healthcare clinical protocol drift, CX KB hygiene.

Enterprise patterns

The personal version proves the drift mechanism is universal: any corpus that accumulates across multiple authoring surfaces (memory, runbooks, configs, playbooks, recordings) will silently contradict itself within months. Every enterprise vertical has its own version of “the Oracle A1 hallucination”. This page documents five adaptations — same 3-layer × 4-tier × judge architecture, different source mix, different compliance constraints.

What stays vs. what changes

The hot path — scope → static check → cross-source LLM scan → judge verify → severity-tagged digest — is identical across every use case below. The deltas are around source mix, isolation, ingest velocity, compliance, and remediation policy, not detection mechanics.

Migration matrix: Personal → Enterprise

AspectPersonalEnterprise
Sources auditedMemory + CLAUDE.md + project NOTES + email/Slack/meetings + KB notes (~80 files)Policy docs + runbooks + configs + Confluence + Salesforce + Jira + ticketing + transcripts (10K–500K docs)
Trigger cadence4 tiers (launchd)4 tiers + per-source webhook (CDC); SLA on event-tier <60s
Detector modelQwen 4B/8B/32B local on M2 MaxGPU pool with batched inference; per-tenant model selection
JudgeGrok 4.3 (99% LLM-judged)Grok 4.3 or per-tenant choice; audit log of judge verdict for every finding
Auto-fixSafety heuristics + git snapshot, opt-in per fileWorkflow integration (Jira ticket / GitHub PR / ServiceNow change record) + 4-eye approval gate
ComplianceNone (personal)SOC2, GDPR DSR (purge subject’s history on request), audit-log retention, region pinning, BYOK
IdentitySingle userSSO (SAML/OIDC) + RBAC on which sources each role can audit
DeliveryTelegramSlack / Teams / email + dashboard + JIRA ticket creation
Cost~$5/mo$500–$50K/mo depending on corpus + scan frequency
SLABest-effortFindings surface to ticketing within X hours of source write

The detection layers don’t change. Only the scope assembly, the delivery integration, and the remediation workflow scale up.


Use case A — B2B SaaS multi-tenant documentation consistency

Problem

A B2B SaaS like LittleLives serves multiple distinct customer-clients (PCF, NewLife, Ilham, BBL) with overlapping but not identical feature requirements. Each client has: client-specific config (tax rate, calendar, billing rules), shared product docs, internal playbook (BA → Eng → CS handover), Confluence space. A BA writes “PCF uses tax rate 7%” in 3 places (client config, playbook, training doc), one updates to 8% after a regulation change, the others stay stale. CS quotes the wrong rate in an escalation. Customer files a complaint. Real impact: per LittleLives’ multi-tenant client requirements pattern (PCF/NewLife/Ilham/BBL differ on same feature), drift between client config and shared docs surfaces as customer-visible bugs.

Why audit matters

  • Each client has 5–20 places the same fact appears (config + 2–3 docs + 1–2 playbooks + 1 training video transcript)
  • Cross-client feature toggles drift differently per tenant
  • Quarterly regulation changes (tax, holiday calendar, billing terms) propagate at different speeds across sources
  • A single stale fact in the BA→Eng→CS chain costs ~5 person-hours to debug + customer trust

Stack mapping

  • Sources: per-client Confluence space, shared product Confluence, Salesforce contract terms, BA handover playbook, CS macro library, training video transcripts (Zoom + Otter)
  • Detection scope assembly: cross-source scan grouped by (client, fact_topic) — Layer 2 within client, Layer 3 across clients to catch “PCF says 7%, NewLife says 7.5%, but the regulation applies to both”
  • Judge: Grok 4.3 with client-context prompt (“verify this contradiction is on the same fact, not different products”)
  • Auto-fix: opens Jira ticket assigned to source owner; never auto-edits client config (high-risk)
  • Delivery: Slack channel #kb-audit-<client> + weekly digest to CPO

Cost estimate (mid-sized B2B SaaS, 4 clients × 5K docs)

  • Corpus: 20K docs, ~200K chunks
  • Detection: ~$200/mo (Grok 4.3 judge dominates)
  • Storage: re-use existing Confluence + vector index (~$300/mo managed)
  • Integration eng: 1 month one-time + 0.2 FTE ongoing
  • All-in ~$600/mo vs ~$30K/mo cost of customer-trust incidents prevented = 50× ROI

Compliance angle

  • GDPR: per-tenant data residency for EU clients; redact PII before LLM judge call
  • SOC2: full audit log of every finding + judge verdict + remediation action (90-day retention)
  • Per-client RBAC: PCF auditors cannot see NewLife findings

Use case B — Fintech compliance drift (PCI-DSS, T+N settlement, audit trails)

Problem

A payment platform maintains: PCI-DSS controls documentation (auditor-facing), implementation runbooks (eng-facing), product PRDs, customer-facing settlement SLA, internal incident playbook. An auditor on-site finds the runbook says “T+1 settlement for merchant tier X” but the PRD says “T+2”, and the customer contract attached to a sample merchant says “T+1”. Which is true? If the answer is “T+2 but the runbook is stale”, the auditor flags it as a control gap. If the answer is “T+1 and the PRD is stale”, that’s a finding too. Either way, drift between policy + implementation + customer commitment = compliance risk before the auditor walks in the door.

Why audit matters

  • PCI-DSS Annual Report on Compliance (RoC) cycle = quarterly internal sweep that’s mostly manual today
  • Settlement SLA drift between contract + product + ops = direct revenue + regulatory exposure
  • Encryption key rotation policy across docs (KMS rotation docs vs runbook vs incident playbook) drift silently
  • “Auditor finds drift” = expensive remediation; “audit catches drift internally” = cheap, scheduled fix

Stack mapping

  • Sources: SharePoint policy docs, Confluence runbooks, PRD repo (Notion or Markdown in git), Salesforce CPQ for customer contracts, ServiceNow change records, key-rotation log
  • Detection scope assembly: pre-RoC quarterly sweep + weekly full + daily on changed-in-last-24h
  • Judge: Grok 4.3 with regulatory prompt (“is this contradiction on the same control objective, or different controls that look similar?”)
  • Auto-fix: NEVER auto-applies; opens a ServiceNow change record with proposed reconciliation + assigns to control owner
  • Delivery: dashboard for compliance team; email digest to CISO; pre-audit “control health” report 30 days before audit

Cost estimate (mid-sized fintech, 50K compliance-relevant docs)

  • Corpus: 50K docs, ~500K chunks
  • Detection: ~$1,500/mo (larger context windows for regulation text)
  • Storage: managed Postgres + pgvector tenant-pinned (~$600/mo)
  • Integration eng: 2 months + 0.3 FTE ongoing
  • All-in ~$3K/mo vs a single RoC remediation finding = ~$200K + reputational; preventing one/quarter = 200× ROI

Compliance angle

  • PCI-DSS req 12.1 (documented policies + procedures): direct support
  • SOC2 CC1.4 (commits to control awareness): audit log of findings = evidence
  • Data residency: detector + judge both regional; no cross-border flow of regulated docs
  • BYOK: customer-managed KMS keys encrypt every cached chunk + finding

Use case C — EdTech multi-campus policy synchronization

Problem

A multi-campus EdTech (think 15 campuses × 200 staff) runs: PowerSchool SIS for student data + role-based access, Odoo for accounting + procurement, Jamf MDM for device permissions, internal data-privacy policy doc, FERPA training material, parent-facing privacy notice. Roles are defined in 4+ places that must stay consistent. A new role “Tutor-Substitute” is added in PowerSchool with grade-view access; the data-privacy policy is updated weeks later; the Jamf permission profile never gets updated, so substitute tutors can install random apps; the parent-facing privacy notice never mentions substitutes have access at all. Drift = a FERPA violation waiting to be filed by a parent.

Why audit matters

  • FERPA compliance requires consistency between stated policy + actual technical enforcement + parent disclosure
  • Each campus may have local overrides that drift from corporate norms
  • Quarterly role-permission audits today are manual + spotty
  • A single stale config = potential regulatory complaint with student/family PII implications

Stack mapping

  • Sources: PowerSchool role config (export to JSON), Jamf permission profiles (export), Odoo permission groups, data-privacy policy doc, FERPA training transcript, parent-facing notices (PDF + web)
  • Detection scope assembly: cross-source within campus + cross-campus to catch corporate vs local drift
  • Judge: Grok 4.3 with FERPA prompt (“verify role permission stated in source A matches access granted in source B for the same student data category”)
  • Auto-fix: never; opens Jira ticket to campus IT lead + corporate compliance copy
  • Delivery: per-campus Slack + corporate compliance weekly digest + pre-audit campus health-check

Cost estimate (15-campus EdTech, ~10K policy/config docs)

  • Corpus: 10K docs, ~100K chunks
  • Detection: ~$300/mo
  • Storage: ~$200/mo
  • Integration eng: 1.5 months one-time + 0.2 FTE
  • All-in ~$600/mo vs a single FERPA finding = $5–35K fine + reputational + remediation = 50–500× ROI per incident prevented

Compliance angle

  • FERPA: role/permission consistency = direct support
  • COPPA (under-13 students): parent-disclosure consistency
  • State-level privacy laws (CCPA-equivalents): regional source pinning
  • Data minimization principle: audit log itself must be scoped to compliance team only

Use case D — Healthcare clinical protocol drift

Problem

A hospital network maintains: clinical pathway docs (per condition, written by physicians), EHR order-set templates (what the EHR pre-populates for a doctor placing orders), nurse-station SOPs (printed and laminated), pharmacy formulary, junior-doctor onboarding curriculum. The clinical pathway for “community-acquired pneumonia” says “first-line: amoxicillin”; an EHR template was updated 6 months ago to “first-line: doxycycline” based on local resistance data, but the pathway doc never got updated, and the SOP at one nurse station still says amoxicillin. A junior doctor reads the pathway, places the amoxicillin order, the pharmacist dispenses it, the patient does not improve. Drift in clinical decision support = patient safety incident.

Why audit matters

  • Clinical guidelines evolve quarterly; multiple authoring surfaces lag asymmetrically
  • Junior staff and night-shift staff rely on documentation more than experienced day-shift staff (no senior to ask)
  • Joint Commission audits include clinical-decision-support consistency
  • Patient safety events from documentation drift are reportable + investigable

Stack mapping

  • Sources: clinical pathway doc repo (markdown or Word in SharePoint), EHR (Epic/Cerner) order-set export, nurse-station SOP repo, pharmacy formulary, onboarding LMS content
  • Detection scope assembly: per-condition cross-source bundle; weekly + post-EHR-config-change event trigger
  • Judge: Grok 4.3 with clinical prompt (“verify this is the same patient-care decision in both sources, and the disagreement is on the same intervention not different conditions”); strict bias toward false-positive over false-negative
  • Auto-fix: NEVER for clinical content; opens a ticket to the relevant clinical-informatics committee + escalates to medical director if not triaged in 48h
  • Delivery: clinical informatics dashboard + on-call escalation for RED findings + monthly Joint-Commission-prep report

Cost estimate (mid-sized hospital network, ~20K clinical docs)

  • Corpus: 20K docs, ~200K chunks (clinical text is dense, longer chunks)
  • Detection: ~$800/mo (longer context windows for clinical narrative)
  • Storage: HIPAA-compliant managed Postgres + region-pinned (~$1,200/mo)
  • Integration eng: 3 months one-time (EHR integration is the slowest) + 0.5 FTE ongoing
  • All-in ~$3K–4K/mo vs a single preventable patient safety event = $100K–$10M + reputational; preventing one/year = 30–3,000× ROI

Compliance angle

  • HIPAA: PHI redaction mandatory before LLM call; audit log immutable
  • Joint Commission: documented consistency-check process = direct support
  • Patient safety / risk management: findings feed into Failure Modes and Effects Analysis (FMEA) program
  • Clinical governance: physician + pharmacist + nursing co-sign for any cross-source reconciliation

Use case E — Customer-support knowledge-base hygiene (CX platforms)

Problem

A CX platform like CX Genie powers support for hundreds of merchants. Each merchant has: KB articles (customer-facing), macro responses (agent-facing template library), agent training material, an AI chatbot grounded on the KB. A merchant updates the KB article for “return policy” from 14 days to 30 days; the macro library still uses “14 days”; the chatbot prompt hasn’t been re-grounded; the agent training module still says 14 days. Now the chatbot tells customer A “14 days”, the agent tells customer B “14 days from macro”, the new KB article says 30 days — three different answers from the same merchant, and the customer-facing source is the only correct one. Trust in the support stack erodes per conversation.

Why audit matters

  • AI chatbots ground on the same KB; stale answers = direct customer harm
  • Macro libraries are how agents scale; drift between macros + KB = inconsistent answers
  • Training content updated annually; product changes weekly; built-in drift
  • Customer experience metrics (CSAT, FCR) directly impacted

Stack mapping

  • Sources: KB CMS (Zendesk Guide / Intercom Articles / custom), macro library, agent training LMS, chatbot prompt + grounding config, release notes
  • Detection scope assembly: per-merchant scope (multi-tenant isolation) + per-product-area within merchant
  • Judge: Grok 4.3 with customer-impact prompt (“which of these answers is customer-facing? if they disagree, which is most recent + most authoritative?”); RED severity for any contradiction where chatbot grounding is the stale source
  • Auto-fix: macro library = auto-update with merchant approval; chatbot grounding = auto-regenerate after KB article update detected; agent training = ticket to L&D
  • Delivery: per-merchant dashboard + Slack to CX ops + weekly digest to merchant success manager + chatbot re-grounding webhook

Cost estimate (CX platform, 500 merchants × ~500 docs each = 250K docs)

  • Corpus: 250K docs, ~2.5M chunks
  • Detection: ~$3K/mo (volume scales linearly)
  • Storage: managed multi-tenant Postgres + pgvector ($2K/mo)
  • Integration eng: 2 months one-time + 0.5 FTE ongoing
  • All-in ~$6K/mo for the platform = $12/merchant/mo absorbed into SaaS pricing; protects CSAT scores that drive renewal = effectively zero marginal cost vs renewal value preserved

Compliance angle

  • GDPR DSR: when merchant offboards, full purge of audit findings + cached chunks
  • Per-merchant data isolation: row-level security on every finding query
  • AI transparency (EU AI Act, where applicable): audit log of “which version of the KB grounded which chatbot answer” = direct support for explainability requirements
  • Customer PII in tickets: mandatory redaction before LLM judge call

Cross-cutting patterns

These appear in 3+ use cases above and form a second-tier reusable layer:

  1. Per-tenant scope assembly: source bundling grouped by (tenant, topic) with cross-tenant Layer 3 sweep for corporate-vs-local drift
  2. Compliance-grade audit log: every finding + judge verdict + remediation action stored append-only with 90-day to 7-year retention depending on regulation
  3. Workflow integration not direct apply: Jira / ServiceNow / GitHub PR ticket creation instead of git-snapshot-and-apply for high-stakes verticals (fintech, healthcare)
  4. Pre-audit health-check report: scheduled before regulatory audit cycles (PCI-DSS RoC, Joint Commission, FERPA, SOC2)
  5. Per-source-type judge prompts: clinical contradiction prompt ≠ tax-rate contradiction prompt; tune the judge per vertical
  6. PII / PHI redaction is mandatory not optional: every LLM call goes through the redaction layer; CI gate fails the build if redaction tests don’t pass

Building these once = 8–12 weeks engineering. Then each new vertical adapter = 4–6 weeks instead of 16+.

Go-to-market thinking

The architecture supports 3 plausible business models:

ModelTargetPricingSales motion
B2B SaaS (per-tenant)100–10K-employee companies with regulated or multi-team docsPer-source-volume + tier (compliance-pack add-on)PLG signup → 30-day trial → upgrade. AE for regulated industries
Vertical bundleFintech / healthcare / edtech specific packagePer-tenant + integration feeDirect enterprise sales, 6-month cycles, partnership with EHR / SIS vendors
Embedded in CX/HR/Compliance platformOEM into existing platforms (Zendesk, ServiceNow, Workday)Revenue share or platform licensePartner sales, slower but stickiest

The vertical bundle route has the cleanest premium-pricing story (healthcare / fintech can support $50K+/yr per tenant). B2B SaaS is highest-velocity. Embedded is brand-leveraging but slowest revenue.

What’s NOT in the personal version that enterprise needs

GapEffortPriority
SSO / SAML for auditor role-based access2–4 weeksP0
Multi-tenant scope isolation tests1 weekP0
Audit log + retention policy + query UI2 weeksP0
GDPR DSR (export + purge subject’s findings)2 weeksP0 (EU)
HIPAA PHI redaction (beyond personal PII patterns)1 weekP0 (healthcare)
SOC2 Type II controls3–6 monthsP1 (mid-market+)
Workflow integration (Jira / ServiceNow / GitHub)3 weeks per integrationP1
Per-tenant judge prompt + model selection2 weeksP1
Pre-audit health-check report generator2 weeksP1
Dashboard (drift trend, finding-by-source, MTTR)4 weeksP1
Customer-managed encryption keys (BYOK)2 weeksP2 (regulated)
Multi-region deployment2–4 weeksP2
White-label / embed mode4 weeksP2 (OEM go-to-market)

Total to enterprise-ready MVP: ~4 months of 1 engineer + 1 month design + ~$10K compliance audit prep (varies sharply by vertical).

See also

  • Architecture — the unchanged 3-layer × 4-tier × judge backbone that scales across all verticals
  • Implementation — the code that ships personal; enterprise version extends, doesn’t rewrite
  • PRD — original problem framing; enterprise framing is a superset
  • Personal-RAG enterprise patterns — companion enterprise doc; the audit layer sits on top of the RAG layer