Hung Nguyen Van
The tautology problem: AI confirming itself

Yesterday I posted about senior devs spending 25 minutes reviewing a single AI-generated PR. Someone DMed me: "Just replace the senior with an AI reviewer." That's the trap.

AI writes the code. AI writes the tests. AI reviews the code. Three layers, each one "smart." The problem: all three share the same source of reasoning.

If the AI misreads the spec, the code is wrong, the tests pass against that wrong code, and the review approves it. All three layers show green. The spec is still violated.
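A minimal sketch of how that failure looks in practice. The spec text, function name, and boundary are all hypothetical; the point is that one misreading propagates into every layer the same AI generates:

```python
# Spec (what the human wrote): "Apply a 10% discount to orders OVER $100."
# The AI misreads "over $100" as "at least $100" -- and that single
# misreading shows up in the code, the test, and the review.

def apply_discount(total: float) -> float:
    """AI-generated code: wrong boundary (>= instead of >)."""
    if total >= 100:  # spec says strictly greater than 100
        return total * 0.9
    return total

def test_apply_discount():
    """AI-generated test: encodes the SAME misreading, so it passes."""
    assert apply_discount(100) == 90.0  # per the spec, 100 should NOT be discounted
    assert apply_discount(50) == 50

test_apply_discount()  # green -- and the spec is still violated
```

An AI reviewer reading this diff sees code and tests that agree with each other. Nothing in the loop ever re-reads the original sentence "over $100".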

This is the tautology problem — AI confirming itself.

In April 2026, Anthropic published a postmortem most people didn't read carefully. They admitted: AI-generated regressions in their own codebase slipped past human review, automated review, unit tests, end-to-end tests, automated verification, and dogfooding. Anthropic's full stack — still missed it.

If Anthropic's stack can't catch it, the honest question for any team shipping AI-assisted code is: how much is your stack actually catching?


The industry has tried several approaches. None of them solves the tautology:

  • Test frameworks (Jest, Pytest…) — tests written by the same AI, same source
  • Linters / SAST (SonarQube, Semgrep) — don't read the spec, only pattern-match code
  • AI code review (Copilot, CodeRabbit, Qodo) — review code-vs-codebase, not code-vs-original-spec
  • Manual senior review — doesn't scale, returns you to 25 min/PR (see yesterday's post)

This is why we built DQA — a Trust Layer for AI-generated code. Not a fifth review tool. A structurally different layer.

DQA compiles rules directly from the spec document, with no AI interpretation in the loop. Every commit the AI ships gets cross-checked:

  • Does this feature trace back to an original requirement?
  • Does it violate any structural constraint?
  • Is there a signed, timestamped evidence chain for audit?

It sits between "AI writes code" and "code merges to production." A third party, structurally independent — not sharing the same source of reasoning as code-AI, test-AI, or review-AI.
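To make the pattern concrete, here is a rough sketch of that kind of check. This is not DQA's actual implementation; the rule format, function names, and the hash-as-signature shortcut are all my assumptions, just illustrating "mechanically compiled rules + traceability check + evidence entry":

```python
import hashlib
import json
import time

# Hypothetical compiled spec: one entry per requirement ID, produced
# mechanically from the spec document -- no LLM interprets the text.
SPEC_RULES = {
    "REQ-001": "Discount applies only to orders strictly over $100",
    "REQ-002": "All API responses must include a request ID",
}

def check_commit(commit_msg: str, touched_requirements: list) -> dict:
    """Cross-check one commit against the compiled spec rules."""
    # Traceability: does every claimed change map to a real requirement?
    untraced = [r for r in touched_requirements if r not in SPEC_RULES]
    verdict = {
        "commit": commit_msg,
        "traced": not untraced,
        "untraced_ids": untraced,
        "timestamp": int(time.time()),
    }
    # Evidence entry for the audit chain (a content hash stands in
    # here for a real cryptographic signature).
    verdict["evidence"] = hashlib.sha256(
        json.dumps(verdict, sort_keys=True).encode()
    ).hexdigest()
    return verdict

print(check_commit("add premium discount", ["REQ-001"])["traced"])  # True
print(check_commit("add telemetry", ["REQ-999"])["traced"])         # False
```

The design point is the independence: the checker's inputs are the spec document and the commit, never the model's own interpretation, so it can't inherit the misreading.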



If you're actively shipping AI-assisted code in production and want to compare notes on the verification patterns your team is hitting — DM me.

I'm in conversations with three dev teams this week, ~30 min each. No pitch deck. You share your pain, I share patterns from other teams. If it fits, I'll suggest a next step. If not, you walk away with 30 minutes of insight into how others are handling this.

👉 DM me or comment "DM" — I'll message you first.
