5 Innate Biases of Cloud LLMs & 5 Levers to Overcome Them

Why Claude / GPT / Gemini get it wrong so often when writing workflow docs, PRDs, BRDs — and why you can't 'fix' the weights. You can only build guardrails around them.

Scope of this post: Cloud-hosted LLMs (Claude, GPT, Gemini, hosted Mistral, etc.) — meaning you only have API access and cannot fine-tune the weights. If you’re running a Local LLM (Llama base, self-hosted Qwen base), you get an additional fine-tuning option that this post doesn’t cover. See the final section “How are Local LLMs different?”.

Context: 12 errors in one session

This week I had Claude Code draft a workflow doc for a client (a Google Sheet with 6 columns: Module · Flow · Current workflow · Workflow V2 · Key rules). Sources: a Krisp meeting transcript, email, Confluence, and workshop notes.

110 turns later, I caught 12 errors in the cell content. Not typos. Structural errors:

  • Took an internal team demo statement from the meeting → recorded it as the client describing the current workflow
  • Converted a client requirement (“avoid X”) into a description of the Current state (“X is happening”)
  • Auto-filled state machines, approval levels, pre-conditions even when the source said nothing about them
  • Invented a “Pending Review status” for a feature the meeting notes never mentioned
  • Misread one entity ↔ vendor mapping → propagated it across multiple unrelated cells

I asked Claude: “Why so many errors?” The answer was so honest that I’m writing this post to share it.


5 innate biases of Cloud LLMs

These aren’t accidents. They are default behaviors baked into the weights when the vendor trains the model (especially via RLHF / Constitutional AI). Cloud LLMs — Claude / GPT / Gemini accessed via API — share roughly the same bias profile because they share the same training paradigm.

1. Smooth-prose bias

LLMs are trained on data full of polished prose. “Pretty” output = reward. So the model automatically converts a raw quote "Yeah, we are using it." into "The feature is actively used today across all sites". Smoothing → drops nuance → claims more than the source (the raw quote only confirms “we’re using it” — it says nothing about “all sites” or “actively”).

The trade-off being silently made: aesthetics > accuracy.

2. Fill-the-gap bias

The LLM sees a gap in information → fills it with “plausible” content. Example:

  • Source says: “1 batch per month, value date is day X”
  • LLM adds: “send to bank a few days before day X” — NOT in the source

The model assumes that detail is plausible because it’s an industry-standard pattern. Plausible ≠ true. This is the hallucination tendency — filling gaps with common knowledge instead of admitting “the source doesn’t say.”

3. Pattern-complete bias

LLMs are accustomed to templates (a PRD has Pre-conditions / Approval flow / State machine / Error codes). When writing a workflow doc, the model auto-fills template fields even when the source is silent. The meeting never mentioned “Pending Review status” → the LLM adds it anyway because PRD templates usually include it.

4. Confident-assertion bias

LLMs are trained to answer confidently, not to say “I don’t know.” Given the choice, the model avoids the first answer below and picks the second:

  • ❌ “Cell empty: no info available” (admitting a gap)
  • ✅ “Per workflow…” (asserting even without a source)

To the model, an empty cell looks like unfinished work, so its training pushes it toward filling the cell.

5. Please-the-user bias (sycophancy)

The LLM wants to deliver complete output the user can use immediately. Empty cells = “I haven’t finished” → over-deliver → over-claim.


Summary: a 3-step pipeline that compounds errors

The LLM’s default behavior:

extract → smooth → assert

Each step adds errors:

  • Extract from multiple sources: mixes speakers / templates / interpretations (source blending)
  • Smooth raw quotes into prose: drops nuance, paraphrases incorrectly
  • Assert with confidence: claims more than the source, fills gaps, completes patterns

→ Errors compound across the 3 steps. 12 errors per session is a predictable consequence, not an accident.
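
A back-of-the-envelope sketch of why compounding matters. It assumes each step independently keeps a claim faithful with some probability; the 0.9 figures are made-up illustration values, not measurements from the session.

```python
from math import prod


def faithful_rate(step_rates: list[float]) -> float:
    """Chance a claim survives every step intact, assuming independent steps."""
    return prod(step_rates)


# Hypothetical per-step rates for extract, smooth, assert (not measured).
rates = [0.9, 0.9, 0.9]
overall = faithful_rate(rates)
print(f"faithful after all steps: {overall:.3f}")  # 0.729 under these assumed rates
```

Even if every step looks fairly reliable on its own, the chain as a whole is noticeably less reliable than any single step.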


The cold truth: you cannot “fix” a Cloud LLM

With Cloud LLMs (Claude, GPT, Gemini), you only have API access. The weights live in the vendor’s data center — you cannot:

  • Re-train with your own data
  • Adjust the RLHF reward (the bias was baked in during training)
  • Strip the “smooth-prose” preference

Some vendors offer a “fine-tune API” (OpenAI for GPT-3.5/4, Anthropic Claude fine-tune in rollout), but:

  • Expensive (hundreds to thousands of USD per run)
  • Limited scope (instruction tuning, not a full retrain)
  • Core biases (smoothing, sycophancy) are hard to remove because they’re baked in from pre-training

→ For 99% of Cloud LLM users, bias is inherent and immutable. All you can do is: box / dilute / bypass the bias.


5 levers to overcome it (by strength)

Lever 1: Architectural constraint — strongest, mechanical

Instead of asking the LLM to do it right, force the LLM into a tool-driven pipeline:

Hard pipeline (Python script):
1. grep transcript → extract quotes by speaker
2. Filter [CLIENT] only
3. Pass to LLM with ONE task: "categorize quote into module X / Y / Z"
4. Output: structured JSON

The LLM only handles one narrow task (categorize). Extraction and filtering are done in code, not by the LLM, so bias can’t reach those steps: the code is deterministic.
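
The four steps above can be sketched roughly as follows. This is a minimal sketch, not the script from the session: it assumes a transcript where each line looks like `[SPEAKER] text`, the module names are hypothetical, and `categorize` stands in for whatever single-purpose LLM call you use (a keyword stub takes its place here so the example runs offline).

```python
import json
import re
from typing import Callable

# Assumed transcript format: "[SPEAKER] utterance" on each line.
LINE_RE = re.compile(r"^\[(?P<speaker>[A-Z_]+)\]\s*(?P<text>.+)$")

MODULES = ["Payments", "Approvals", "Reporting"]  # hypothetical module names


def extract_quotes(transcript: str) -> list[dict]:
    """Step 1: deterministic extraction of (speaker, text) pairs."""
    quotes = []
    for line_no, line in enumerate(transcript.splitlines(), start=1):
        match = LINE_RE.match(line.strip())
        if match:
            quotes.append({"line": line_no, **match.groupdict()})
    return quotes


def client_only(quotes: list[dict]) -> list[dict]:
    """Step 2: keep only what the client said."""
    return [q for q in quotes if q["speaker"] == "CLIENT"]


def run_pipeline(transcript: str, categorize: Callable[[str, list[str]], str]) -> str:
    """Steps 3-4: the LLM only labels each quote; the text is never rewritten."""
    rows = []
    for quote in client_only(extract_quotes(transcript)):
        label = categorize(quote["text"], MODULES)
        if label not in MODULES:  # reject anything outside the closed label set
            label = "UNCATEGORIZED"
        rows.append({"module": label, "quote": quote["text"], "line": quote["line"]})
    return json.dumps(rows, indent=2)


def stub_categorize(text: str, modules: list[str]) -> str:
    """Placeholder for the real LLM call: naive keyword match."""
    for module in modules:
        if module.lower().rstrip("s") in text.lower():
            return module
    return "no idea"
```

The quote text in the output is copied from the transcript by code, so the only thing the model can get wrong is the label, and an unknown label is rejected rather than trusted.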

Lever 2: Multi-agent verification — strong, expensive

Two separate LLMs: a writer + a critic.

The writer drafts a cell. The critic receives the draft + the source → checks “is every claim verbatim from the source?” → flags mismatches → returns to the writer. Loop until the critic passes.

Bias in the writer ≠ bias in the critic (different prompts, different roles). The critic catches errors the writer misses. Costs 2× tokens but catches 30-50% more errors.
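
A minimal sketch of the writer/critic loop, under stated assumptions: `writer` and `critic` are hypothetical callables wrapping two separately prompted LLM calls, and the stubs below only imitate them so the loop can run offline. The stub critic uses the strictest possible rule (the draft must appear verbatim in the source); a real critic prompt would be looser.

```python
from typing import Callable, Optional


def write_with_critic(
    source: str,
    writer: Callable[[str, list[str]], str],
    critic: Callable[[str, str], list[str]],
    max_rounds: int = 3,
) -> Optional[str]:
    """Loop writer -> critic until the critic has no objections.

    writer(source, feedback) returns a draft cell.
    critic(draft, source) returns a list of flagged claims (empty = pass).
    Returns None if no draft passes, so the cell stays empty rather than wrong.
    """
    feedback: list[str] = []
    for _ in range(max_rounds):
        draft = writer(source, feedback)
        feedback = critic(draft, source)
        if not feedback:
            return draft
    return None


# Stand-ins for two separately prompted LLM calls (hypothetical).
def stub_writer(source: str, feedback: list[str]) -> str:
    if not feedback:
        return "The feature is actively used today across all sites."
    return "Yeah, we are using it."


def stub_critic(draft: str, source: str) -> list[str]:
    # Strictest possible check: the draft must appear verbatim in the source.
    return [] if draft in source else [f"not in source: {draft!r}"]
```

Capping the rounds and returning `None` on failure is a deliberate choice here: a cell that never passes review is left empty for a human instead of shipping the last rejected draft.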

Lever 3: Mode lock — medium, prompt-engineering

A single explicit mode prompt LOCKS the LLM into one behavior:

SYSTEM: You are in EXTRACTION MODE.
- You may ONLY paste verbatim strings from source files.
- You may NOT generate any new prose.
- Penalty for any new prose: stop processing, return error.

Bias gets suppressed within the mode’s scope. But bias can still leak when the prompt is ambiguous → needs monitoring.
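
The monitoring can be mechanical. The sketch below assumes the extraction-mode output is one quote per line and that you have the source texts as strings; `find_leaks` is a hypothetical helper name, not part of any library.

```python
import re


def _normalize(text: str) -> str:
    """Collapse whitespace so line wrapping does not cause false alarms."""
    return re.sub(r"\s+", " ", text).strip()


def find_leaks(output: str, sources: list[str]) -> list[str]:
    """Return every non-empty output line that is not a verbatim substring of any source."""
    normalized_sources = [_normalize(s) for s in sources]
    leaks = []
    for line in output.splitlines():
        candidate = _normalize(line)
        if candidate and not any(candidate in src for src in normalized_sources):
            leaks.append(line)
    return leaks
```

Any line this returns is prose the model generated despite the mode lock, which is the signal to re-run the cell or tighten the prompt.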

Lever 4: Extended thinking + self-critique — medium

Before output, the LLM must:

  1. Generate a draft
  2. Self-critique: “Is each claim from the source? Where exactly?”
  3. Revise based on the self-critique
  4. Output

The LLM critiques itself → catches ~30-50% of errors. Not 100% (the bias also applies to the critique step).
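
The four steps can be wired together as three calls to the same model. This is a sketch under assumptions: `llm` is a hypothetical callable that takes a prompt string and returns text, and the prompt wording and the `UNSUPPORTED` marker are illustrative choices, not a tested recipe.

```python
from typing import Callable

CRITIQUE_PROMPT = (
    "For each claim in the DRAFT, quote the exact source passage that supports it. "
    "Write UNSUPPORTED next to any claim with no such passage.\n\n"
    "SOURCE:\n{source}\n\nDRAFT:\n{draft}"
)
REVISE_PROMPT = (
    "Rewrite the DRAFT, deleting every claim marked UNSUPPORTED in the CRITIQUE. "
    "Do not add anything.\n\nDRAFT:\n{draft}\n\nCRITIQUE:\n{critique}"
)


def draft_critique_revise(task: str, source: str, llm: Callable[[str], str]) -> dict:
    """Run the draft -> self-critique -> revise passes with the same model."""
    draft = llm(f"{task}\n\nSOURCE:\n{source}")
    critique = llm(CRITIQUE_PROMPT.format(source=source, draft=draft))
    if "UNSUPPORTED" not in critique:
        return {"final": draft, "critique": critique, "revised": False}
    final = llm(REVISE_PROMPT.format(draft=draft, critique=critique))
    return {"final": final, "critique": critique, "revised": True}
```

Returning the critique alongside the final text keeps the model's reasoning available for a human spot check, which matters because the critique pass shares the same biases as the draft.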

Lever 5: Slow-down + reduce-surface — weak but cumulative

Bias activates strongly when:

  • The prompt is ambiguous (LLM fills the gap with assumptions)
  • Context is long (LLM loses track of source vs interpretation)
  • Creative freedom is high (LLM smooths prose)
  • Time pressure exists (LLM skips verification)

Counter: narrow + specific prompts. Source files passed inline (not RAG-retrieved). Output format constraints (template, JSON schema).

→ Bias becomes less active but doesn’t vanish.
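
One way to apply the output format constraint is to require JSON for every cell and reject anything that doesn't fit. The sketch below uses only the standard library; the field names (`module`, `flow`, `text`, `source_file`, `source_quote`) are assumptions made for illustration, not the schema used in the session.

```python
import json

REQUIRED_KEYS = {"module", "flow", "text", "source_file", "source_quote"}


def validate_cell(raw: str, sources: dict[str, str]) -> list[str]:
    """Check one LLM-produced cell; an empty list means the cell is acceptable."""
    try:
        cell = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(cell, dict):
        return ["top-level value must be an object"]

    problems = []
    missing = REQUIRED_KEYS - cell.keys()
    extra = cell.keys() - REQUIRED_KEYS
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    if extra:
        problems.append(f"unexpected keys: {sorted(extra)}")
    if problems:
        return problems
    if not all(isinstance(cell[key], str) for key in REQUIRED_KEYS):
        return ["all fields must be strings"]

    if cell["text"] == "":
        return []  # an empty cell is a valid answer
    source = sources.get(cell["source_file"], "")
    if not cell["source_quote"] or cell["source_quote"] not in source:
        problems.append("source_quote not found verbatim in source_file")
    return problems
```

This only proves that the cited quote exists in the named file. Whether `text` faithfully reflects that quote still needs a critic or a human.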


Key insight: bias is a feature in 80% of cases

Smoothing prose is useful when writing blogs, marketing, or stylish emails. It’s harmful when writing a precision-critical workflow doc for a client.

Don’t try to “fix” the LLM globally. Identify the contexts where the bias becomes a bug, and switch modes:

  • Default: smooth + helpful (bias is fine)
  • Workflow doc / PRD draft: extract-only mode (bias suppressed)
  • Final published doc: human polish, no LLM

Implications for AI assistant users

  1. Constraint > training/feedback. You can tell the LLM “don’t interpret” → next session it’ll interpret anyway. A mechanical constraint (for example a Quote-or-Empty rule: each cell holds a verbatim quote or stays empty) works better than a pep talk.

  2. Architectural guardrails > behavioral instructions. A Python script that extracts verbatim quotes eliminates one risk surface. A CLAUDE.md rule saying “be careful” does not.

  3. Multi-agent > single-agent for high-stakes docs. Pay 2× tokens, save N hours of rework.

  4. Mode switching is a real capability. Treat the AI assistant as a tool with several distinct modes, not a general-purpose intern. Pick the right mode for the task.


How are Local LLMs different?

If you self-host a Local LLM (Llama 3 base, Qwen base, Mistral base on Ollama / vLLM / llama.cpp), this post applies partially, but you get an extra Lever 0 that Cloud doesn’t have:

Lever 0: Direct fine-tuning (Local LLM only)

You can:

  • LoRA fine-tune with an extract-only dataset (only verbatim quotes from sources) → the model learns the “don’t synthesize” habit instead of needing external constraints
  • DPO / ORPO with preference data: prefer “I don’t know” over “fill plausible” → reduce hallucination tendency
  • Self-hosted Constitutional AI: train the model against your own principles (e.g., “always cite source”)

Effort: moderate if you have a GPU (RTX 4090+) or rented cloud GPU. Takes 4-12 hours of training for a 5-10K-example LoRA dataset.
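
For the DPO / ORPO option, the preference data can start as a plain JSONL file. The sketch below assumes the `prompt` / `chosen` / `rejected` record layout that common preference-tuning tooling accepts (check your trainer's documentation before relying on it); the prompt wording and helper names are illustrative.

```python
import json


def make_preference_pair(source: str, question: str, fabricated: str) -> dict:
    """Build one preference record that rewards admitting a gap.

    `fabricated` is a plausible-sounding answer that the source does not support.
    """
    prompt = (
        "Answer using only the SOURCE. If the SOURCE does not say, reply exactly: "
        "The source does not say.\n\n"
        f"SOURCE:\n{source}\n\nQUESTION:\n{question}"
    )
    return {
        "prompt": prompt,
        "chosen": "The source does not say.",
        "rejected": fabricated,
    }


def to_jsonl(records: list[dict]) -> str:
    """Serialize records as one JSON object per line."""
    return "\n".join(json.dumps(record, ensure_ascii=False) for record in records)
```

A dataset made only of refusals would teach the model to refuse everything, so in practice you would mix in pairs where the source does contain the answer and the verbatim quote is the chosen response.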

Local LLM bias profile differs from Cloud

  • Base model (no instruction-tuning): smoothing is weaker, fill-gap is comparable, sycophancy is weaker — because it hasn’t been through heavy RLHF
  • Instruction-tuned (Llama-3-Instruct, Qwen-Chat): bias profile approaches Cloud — because they also use RLHF/DPO
  • Reasoning models (DeepSeek-R1, QwQ): have an explicit “thinking” trace, easier to catch self-contradiction → partially reduces fill-gap

Local vs Cloud trade-off for precision tasks

                       | Cloud LLM                 | Local LLM
Default bias           | Strong (heavy RLHF)       | Weaker base / similar instruct
Fine-tune              | Expensive + limited       | Free + full control
Capability ceiling     | High (Claude Opus, GPT-4) | Lower (Llama 70B ≈ GPT-3.5+)
For workflow doc tasks | Needs strong guardrails   | Can fine-tune to reduce bias

→ If you frequently work on accuracy-critical tasks + have hardware → a fine-tuned Local LLM is worth the investment. If task variety is high + capability matters → Cloud LLM + guardrails is the pragmatic choice.


Conclusion

Cloud LLM bias is inherent. You can’t fix the weights. But you can design the system around them — pipelines, multi-agent setups, mode locks — so the bias doesn’t hurt.

12 errors per session is a sign you’re letting the LLM run unconstrained. Add mechanical constraints. Test for 2 weeks. Measure error rate. Iterate.

That’s how you work with AI assistants as a mature operator — not “AI is amazing, will do everything,” and not “AI hallucinates, useless.” It’s a tool with inherent bias — know the bias, design around it, ship work.


This post is drawn from a real working session with Claude Code (Opus 4.7, Cloud) drafting a workflow doc for a client. The patterns and insights apply to every Cloud LLM assistant (Claude, GPT, Gemini). Local LLM users, see “How are Local LLMs different?” for Lever 0 (fine-tune).