DEV Community

Alan West

How to Catch Hallucinated Dependencies Before They Break Production

The bug that cost me three hours

Last month, a PR landed in our repo with a single new import: lodash-utils. Not lodash. The author was a senior engineer I trust. The diff was small. The tests passed locally. CI was green. Two days later, a customer reported that our debouncing logic on a search input had stopped working.

Three hours of bisecting later, I realized what happened. The package lodash-utils exists on npm. It's a squatter — an empty shell someone registered hoping for exactly this scenario. The AI assistant suggested it, the autocomplete accepted it, and our review process never caught it because the name looked fine.

This is not a one-off. After running into variations of this across three different projects, I'm convinced hallucinated dependencies are one of the quieter problems creeping into codebases right now. Here's how to catch them before they cost you a weekend.

Root cause: LLMs predict, they don't verify

Language models don't query npm or PyPI before suggesting an import. They generate token sequences that look statistically plausible based on training data. If the model has seen from utils import debounce and from lodash import debounce and import { debounce } from 'lodash-es' thousands of times, it will happily produce import { debounce } from 'lodash-utilities' — a package that may or may not exist.

There's a name for the security flavor of this problem: slopsquatting. Attackers register packages using names that LLMs commonly hallucinate, then wait. When you install one, you're running their code in your build pipeline. Security researchers have shown this is reproducible and that certain hallucinated names get suggested consistently across model versions.

The scariest part is that hallucinated imports often pass tests. If the squatter package exports nothing, a CommonJS import still succeeds and the missing symbol simply comes back undefined, so nothing fails until that symbol is actually called. Worse, the build can look healthy by coincidence, because tree-shaking strips the never-used symbol before anything complains.
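The silent-failure mode is easy to reproduce. Here's a minimal sketch, with a plain empty object standing in for the squatter package (a real `require('lodash-utils')` would behave the same way if the package exports nothing):

```javascript
// Stand-in for an empty squatter package: module.exports = {}
const squatter = {};

// CommonJS destructuring of a missing export does not throw...
const { debounce } = squatter;
console.log(typeof debounce); // 'undefined'

// ...so nothing fails until the symbol is actually called
try {
  debounce(() => {}, 200);
} catch (err) {
  console.log(err instanceof TypeError); // true: "debounce is not a function"
}
```

With native ES modules the failure is usually louder (a missing named export throws at link time), but bundlers interoperating with CommonJS frequently reproduce the silent behavior above.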

Step 1: Lock your registry and freeze your lockfile

First, make sure nothing gets installed without going through your lockfile. This sounds obvious but I keep finding projects where it isn't enforced.

For Node, use npm ci (or pnpm install --frozen-lockfile / yarn install --immutable) in CI. This refuses to install if package.json and the lockfile disagree:

```yaml
# .github/workflows/ci.yml
- name: Install deps
  run: npm ci  # fails if package-lock.json is out of sync
```

For Python, pin everything:

```shell
# Generate fully-pinned requirements with hashes
pip-compile --generate-hashes requirements.in

# In CI, require hashes
pip install --require-hashes -r requirements.txt
```

The --require-hashes flag is the part most people skip. Without it, a malicious package update can swap content under the same version.
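For reference, here's roughly what a hash-pinned entry in the generated file looks like (the package, version, and digest below are illustrative placeholders, not real pip-compile output):

```
# requirements.txt, generated by pip-compile; do not edit by hand
requests==2.31.0 \
    --hash=sha256:<64-hex-digest-emitted-by-pip-compile>
```

With hash-checking mode on, pip rejects any downloaded artifact whose digest doesn't match, so a content swap under the same version number fails loudly instead of installing.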

Step 2: Verify package legitimacy in pre-commit

The trick is catching the bad import before it gets installed locally and committed. Here's a small pre-commit hook I've been using. It scans staged JavaScript and TypeScript files for new imports and checks them against npm's registry API.

```javascript
// scripts/check-imports.js
import { execSync } from 'node:child_process';
import fs from 'node:fs';

// Get all staged JS/TS files
const staged = execSync('git diff --cached --name-only --diff-filter=ACM')
  .toString()
  .split('\n')
  .filter(f => /\.(js|ts|jsx|tsx)$/.test(f));

// Match bare imports (skip relative ./ and absolute /)
const importRegex = /(?:from|require\()\s*['"]([^./'"][^'"]*)['"]/g;
const seen = new Set();

for (const file of staged) {
  const content = fs.readFileSync(file, 'utf8');
  let m;
  while ((m = importRegex.exec(content)) !== null) {
    // Normalize deep imports to the package name:
    // '@scope/pkg/sub' → '@scope/pkg', 'lodash/fp' → 'lodash'
    const pkg = m[1].startsWith('@')
      ? m[1].split('/').slice(0, 2).join('/')
      : m[1].split('/')[0];
    if (pkg.startsWith('node:')) continue; // skip Node builtins
    seen.add(pkg);
  }
}

for (const pkg of seen) {
  try {
    // Hit the registry; a 404 makes npm view exit non-zero
    execSync(`npm view ${pkg} name`, { stdio: 'pipe' });
  } catch {
    console.error(`Unknown package: ${pkg}`);
    process.exit(1);
  }
}
```
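To sanity-check the regex and the scope normalization, you can run them against a few representative import styles (sample strings only; nothing gets installed):

```javascript
// Same regex and normalization as the hook, run on sample source text
const importRegex = /(?:from|require\()\s*['"]([^./'"][^'"]*)['"]/g;

const sample = `
import { debounce } from 'lodash-es';
import styles from './app.css';
const babel = require('@babel/core/lib/index');
`;

const found = [];
let m;
while ((m = importRegex.exec(sample)) !== null) {
  const pkg = m[1].startsWith('@')
    ? m[1].split('/').slice(0, 2).join('/')
    : m[1].split('/')[0];
  found.push(pkg);
}

console.log(found); // [ 'lodash-es', '@babel/core' ]; the relative import is skipped
```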

This catches the obvious case — the package doesn't exist at all. It won't catch a real but malicious squatter, which is why step 3 matters.

Step 3: Cross-check against a known-good source

For packages that do exist, you want signals that they're legitimate. Weekly downloads, age, and maintainer history are decent proxies. The official npm registry returns this metadata, and so does the PyPI JSON API.

A quick reputation check:

```javascript
// scripts/check-reputation.js
async function checkPackage(name) {
  const res = await fetch(`https://registry.npmjs.org/${name}`);
  if (!res.ok) {
    console.error(`[missing] ${name}: not on the registry at all`);
    return;
  }
  const meta = await res.json();

  // First-published date; squatters tend to be young
  const created = new Date(meta.time.created);
  const ageDays = (Date.now() - created) / 86400000;

  // Downloads as a sanity check
  const dl = await fetch(
    `https://api.npmjs.org/downloads/point/last-week/${name}`
  ).then(r => r.json());

  if (ageDays < 30 || dl.downloads < 1000) {
    console.warn(
      `[suspicious] ${name}: age=${ageDays.toFixed(0)}d, weekly=${dl.downloads}`
    );
  }
}
```

This is not foolproof — sometimes a brand-new package is exactly what you want. But it forces a conversation. If a freshly created package with 12 downloads shows up in a PR, a human should approve it explicitly.

Step 4: Treat your AI suggestions like an untrusted PR

The higher-order fix is cultural. When AI suggests an import you don't recognize, do the same thing you'd do with a junior dev's PR:

  • Look up the package on its official registry page
  • Check the GitHub repo (does it exist? when was it last updated?)
  • Read the actual exports — does the symbol you're importing exist?
  • For Python, run pip show <package> and check the homepage

This takes 30 seconds per unfamiliar import. Skipping it on the wrong one is what costs you hours.

Prevention: defaults that catch the next one

A few things I now set up on every new project:

  • Enforce lockfile parity in CI. No silent installs from package.json drift.
  • Pin Python deps with hashes. pip-compile --generate-hashes is the lowest-friction path.
  • Run a registry check in pre-commit. Even the naive "does this package exist" check catches the common hallucinations.
  • Audit on a schedule. npm audit signatures for npm; pip-audit for Python. Run them weekly, not just on install.
  • Mirror your registry if you can. A private registry with allow-listed upstreams kills most of the slopsquatting risk outright.
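The audit bullet is easy to forget without automation. One way to schedule it with GitHub Actions (the file name and cron cadence here are my own choices, not requirements):

```yaml
# .github/workflows/audit.yml
name: weekly-audit
on:
  schedule:
    - cron: '0 6 * * 1'  # every Monday, 06:00 UTC
jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm audit signatures
```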

I haven't tested this thoroughly yet, but I've also been experimenting with running a socket.dev-style supply-chain scanner on every PR. It catches things like sudden permission changes between versions, which is another vector worth watching.

The bigger lesson, honestly, is that AI-assisted code needs the same skepticism we apply to any external contribution. The convenience of autocomplete makes us forget that a suggestion is just a suggestion — not a verified fact. The package you're about to install might not exist, might be a trap, or might just be wrong. A few minutes of guardrails up front beats a three-hour debug session every time.
