Why your Vercel bill is higher than expected (hint: AI bots)

#webdev #vercel #devops #javascript

I run a small side project on Vercel. Last month
my bandwidth bill was 3x higher than usual. No
viral traffic, no new features. Just bots.

Here is what I found and how to fix it.

The "Hidden" Bandwidth Tax

In 2026, the internet is no longer just for humans.
AI crawlers — GPTBot, ClaudeBot, Bytespider, and
PerplexityBot — are aggressively indexing the web
to train new models and power real-time search
engines.

While traditional search engines like Google are
relatively polite, AI bots are often greedy. They
don't just index your home page — they hit your
dynamic routes, parse your JSON, and re-crawl your
assets multiple times a day.

Why this kills you on Vercel

Vercel is incredible for performance, but bandwidth
egress (the data sent from your server to the
visitor) is where costs scale.

A polite bot: Hits your page once a week = negligible cost
An AI swarm: Hits your high-data routes 20 times an hour = potentially a $50 bill by the end of the week
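
To make that arithmetic concrete, here is a rough sketch of how a weekly figure in that range could come about. Every input is an assumption picked for illustration: the route count, the 2 MB average response, and the $0.15 per GB rate are hypothetical, and the function ignores any bandwidth already included in your plan. Swap in your own numbers.

```typescript
// Rough weekly egress estimate. All inputs are hypothetical placeholders,
// not measured values and not an official Vercel price.
interface CrawlProfile {
  routes: number;              // number of heavy routes being crawled
  hitsPerHourPerRoute: number; // crawler requests per hour on each route
  avgResponseBytes: number;    // average bytes served per request
  pricePerGb: number;          // USD per GB of egress on your plan
}

function weeklyEgressCost(p: CrawlProfile): { gb: number; usd: number } {
  const hitsPerWeek = p.routes * p.hitsPerHourPerRoute * 24 * 7;
  const gb = (hitsPerWeek * p.avgResponseBytes) / 1e9; // decimal gigabytes
  return { gb, usd: gb * p.pricePerGb };
}

// Assumed scenario: 50 routes, 20 hits per hour each, 2 MB per response.
const estimate = weeklyEgressCost({
  routes: 50,
  hitsPerHourPerRoute: 20,
  avgResponseBytes: 2_000_000,
  pricePerGb: 0.15,
});
console.log(`${estimate.gb.toFixed(0)} GB, about $${estimate.usd.toFixed(2)} per week`);
```

Under these assumptions the estimate lands near $50 a week. With a single route or small responses it would be far lower, so response size and route count matter as much as hit rate.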

How to see the "ghost" traffic

Most developers just look at total request counts.
You need to look at the User-Agent strings in your
logs.

If you see Bytespider (ByteDance's crawler) or
GPTBot appearing thousands of times, you are
effectively paying Vercel to serve data that trains
someone else's AI model.

Export your Vercel logs from your dashboard
(Project → Logs → Export) and open them in a text
editor. Search for these strings:

GPTBot — OpenAI training crawler
ClaudeBot — Anthropic training crawler
Bytespider — ByteDance, known to ignore robots.txt
CCBot — Common Crawl, used by many AI companies
PerplexityBot — Perplexity search crawler

If any of these show up hundreds or thousands of
times, that is very likely your bill spike.
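
If searching by hand gets tedious, the same check can be scripted. This is a minimal sketch that assumes nothing about the export format beyond it being line-oriented text. It does not parse fields; it only counts lines that mention each bot name. The function name countBotHits and the sample lines are made up for illustration.

```typescript
// Bot names from the list above.
const AI_BOTS = ["GPTBot", "ClaudeBot", "Bytespider", "CCBot", "PerplexityBot"];

// Counts how many lines of a text log mention each bot name.
// Matching is a case-insensitive substring test on the whole line.
function countBotHits(logText: string, bots: string[] = AI_BOTS): Map<string, number> {
  const counts = new Map<string, number>();
  for (const bot of bots) counts.set(bot, 0);

  for (const line of logText.split(/\r?\n/)) {
    const lower = line.toLowerCase();
    for (const bot of bots) {
      if (lower.includes(bot.toLowerCase())) {
        counts.set(bot, (counts.get(bot) ?? 0) + 1);
      }
    }
  }
  return counts;
}

// Made-up sample lines, only to show the shape of the result.
const sampleLog = [
  'GET /api/items 200 "Mozilla/5.0 (compatible; GPTBot/1.0)"',
  'GET /api/items 200 "Mozilla/5.0 (compatible; Bytespider)"',
  'GET / 200 "Mozilla/5.0 (Windows NT 10.0) Firefox/120.0"',
].join("\n");

for (const [bot, hits] of countBotHits(sampleLog)) {
  console.log(`${bot}: ${hits}`);
}
```

In Node you could load a real export with readFileSync(path, "utf8") from node:fs and pass the string in. A line count is only a proxy for bandwidth, since it says nothing about response sizes.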

How to stop the leak

Option 1 — robots.txt

The first line of defense. Add this to your
public/robots.txt:

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

Warning: robots.txt is only a request, and
compliance is voluntary. Many aggressive AI bots in
2026 ignore it entirely; Bytespider is a known offender.

Option 2 — Next.js middleware

Add this to middleware.ts in your project root:

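import { NextResponse } from 'next/server'
import type { NextRequest } from 'next/server'

// Lowercased once, so each request only lowercases its own User-Agent.
const BLOCKED_BOTS = [
  'GPTBot', 'ClaudeBot', 'Bytespider',
  'CCBot', 'PerplexityBot'
].map(bot => bot.toLowerCase())

export function middleware(req: NextRequest) {
  // A missing User-Agent header is treated as an empty string.
  const ua = (req.headers.get('user-agent') || '').toLowerCase()
  if (BLOCKED_BOTS.some(bot => ua.includes(bot))) {
    // Reply with a short 403 instead of rendering the page.
    return new NextResponse('Forbidden', { status: 403 })
  }
  return NextResponse.next()
}

// Run on every path except Next.js static assets, optimized images, and the favicon.
export const config = {
  matcher: ['/((?!_next/static|_next/image|favicon.ico).*)'],
}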

This answers known AI bots with a tiny 403 at the
edge instead of a full page. The matcher skips
_next/static and _next/image, so requests for those
paths are not checked.

Option 3 — Cloudflare WAF

If you use Cloudflare in front of Vercel, create a
custom WAF rule:

(http.user_agent contains "GPTBot") or
(http.user_agent contains "ClaudeBot") or
(http.user_agent contains "Bytespider") or
(http.user_agent contains "CCBot")

Set action to Block. This is the most reliable
method since it stops bots before they even reach
Vercel.
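
If your block list changes often, you can generate the rule expression from the same list the middleware uses. This is a small sketch: buildWafExpression is a hypothetical helper name, and it only emits the contains form shown above. It rejects names with unexpected characters rather than trying to escape them.

```typescript
// Builds a Cloudflare rule expression of the form
//   (http.user_agent contains "A") or (http.user_agent contains "B")
// from a list of bot names. Hypothetical helper, not part of any SDK.
function buildWafExpression(bots: string[]): string {
  if (bots.length === 0) {
    throw new Error("Need at least one bot name");
  }
  for (const bot of bots) {
    // Only plain names are accepted, so no quoting or escaping is needed.
    if (!/^[A-Za-z0-9._-]+$/.test(bot)) {
      throw new Error(`Unsupported characters in bot name: ${bot}`);
    }
  }
  return bots
    .map((bot) => `(http.user_agent contains "${bot}")`)
    .join(" or ");
}

console.log(buildWafExpression(["GPTBot", "ClaudeBot", "Bytespider", "CCBot"]));
```

Paste the output into the rule's expression editor. Keep in mind that contains is case-sensitive in Cloudflare's rules language, so the names need to match the casing the bot actually sends.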

The faster way

Manually hunting through log files is tedious.
I built a free browser-based tool that does it
automatically.

Drop your Vercel log export into botcost.dev and
it shows you exactly which bots are hitting you,
how much bandwidth each one consumed, and generates
the exact WAF rule or middleware code to block them.

Everything runs locally in your browser. Your log
file never leaves your device. Free, no account
required.

botcost.dev

Have you seen unusual bot traffic in your Vercel
logs? Drop the bot names in the comments — curious
what people are seeing in the wild.