manja316

Posted on • Originally published at github.com

Your Polymarket Bot Is Lying To You About P&L. I Built A Free Auditor. Here's What I Found.

If your Polymarket bot tells you it's up $34 on the week, there's a decent chance your wallet is actually down $90. The CLOB doesn't fill the way your post_order response suggests it does, and most bots never reconcile the difference.

I ran into this problem on my own crash-recovery bot, lost real money to it, and built a tool to find the gap. It's a one-line install. You give it a wallet address and it tells you how much your fill slippage is actually costing you.

pip install pnl-truthteller
pnl-truthteller --wallet 0xYourPolymarketProxy --output report.md

Read-only. No API keys. No private key. Wallet address only.

The actual gap, on my own bot

My bot logged every closed trade into SQLite with a theoretical_pnl column computed from (exit_price - entry_price) × shares. After 320 trades it claimed +$34.31. The on-chain reality:

| Source | Trades | DB-reported P&L | On-chain P&L | Hidden slippage |
|---|---|---|---|---|
| My bot | 320 | +$34.31 | -$90.72 | -$125.03 |
| Random wallet 0x1417… | 65 | +$32.36 | -$30.29 | -$62.66 |

The second row is the part that matters. I picked a stranger's wallet off the public CLOB feed, reconstructed what their bot's DB would have said, and got the same shape: small positive reported, real negative on-chain, and a gap larger than the reported profit itself that nobody is watching.

That gap is the entire thesis of this tool. It's not just my bot. It's the default behavior of every bot that records P&L on order submission instead of fill confirmation.

Why it happens

Polymarket's CLOB fills in stages, and each stage can diverge from what you submitted: FOK rejects, partial fills, sweep retries at deeper prices, dust left behind, and idempotency issues on retry loops. If your code is shaped like this:

resp = client.post_order(args)
if resp["success"]:
    db.execute("INSERT INTO trades (theoretical_pnl, ...) VALUES (?, ...)", ...)

…then theoretical_pnl is computed against the intended fill price. The actual fills can be 5–25% worse, especially on sweep retries. Your DB shows the intention. The chain shows the execution. The delta is your real cost of doing business.
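A minimal sketch of the alternative: compute P&L from confirmed fills instead of the intended price. The fill shape here (dicts with `price` and `size` keys) and the helper names are assumptions for illustration, not the CLOB's actual response schema, and the numbers are made up.

```python
from decimal import Decimal

def notional(fills):
    """Total cash value of a list of fills (hypothetical shape: price, size)."""
    return sum((Decimal(f["price"]) * Decimal(f["size"]) for f in fills), Decimal("0"))

def theoretical_pnl(entry_price, exit_price, shares):
    """What the bot believes: intended exit minus intended entry, times shares."""
    return (Decimal(exit_price) - Decimal(entry_price)) * Decimal(shares)

def realized_pnl(buy_fills, sell_fills):
    """What actually happened: cash received on sells minus cash paid on buys."""
    return notional(sell_fills) - notional(buy_fills)

# Illustrative numbers only: the exit was intended at 0.50, but 40 shares
# only filled on a retry at a deeper price.
buys = [{"price": "0.40", "size": "100"}]
sells = [{"price": "0.50", "size": "60"}, {"price": "0.40", "size": "40"}]

intended = theoretical_pnl("0.40", "0.50", "100")
realized = realized_pnl(buys, sells)
slippage = realized - intended
print(f"theoretical={intended} realized={realized} slippage={slippage}")
```

In this example the DB would record a gain of 10.00 while the fills add up to 6.00, so the hidden cost is 4.00 on a single trade.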

What the report actually looks like

# Slippage Report — 2026-04-28T14:30:00+00:00

## TL;DR
- Closed trades total: 308
- Lifetime theoretical P&L: +$33.49
- Lifetime actual P&L (on-chain fills): -$89.01
- Total slippage cost: -$122.50 (-365.8% of theoretical)
- Trades with stranded dust on-chain: 31 (47.3 shares dust)

## By exit reason
| Reason | n | Theoretical | Actual | Slippage |
|---|---|---|---|---|
| TIMEOUT | 142 | -$18.00 | -$84.50 | -$66.50 |
| TARGET | 71 | +$28.40 | +$22.10 | -$6.30 |
| RECOVERY_TRAILED | 50 | +$15.20 | +$12.40 | -$2.80 |
| STOP | 39 | +$7.89 | +$0.99 | -$6.90 |

The "by exit reason" breakdown is the most actionable part of the report. In my case TIMEOUT exits were the entire bleed: my bot was waiting too long to flatten and the book moved against it before each forced close. Once I knew that, the fix was an hour of work. Without the report I would have kept blaming "noise."
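If you want the same breakdown from your own logs, the grouping is a short aggregation. This is a sketch under assumptions: the per-trade keys (`exit_reason`, `theoretical`, `actual`) are hypothetical names, the sample trades are invented, and it is not the tool's own code.

```python
from collections import defaultdict
from decimal import Decimal

def slippage_by_reason(trades):
    """Sum theoretical and actual P&L per exit reason, worst slippage first."""
    totals = defaultdict(lambda: {"n": 0, "theoretical": Decimal("0"), "actual": Decimal("0")})
    for t in trades:
        row = totals[t["exit_reason"]]
        row["n"] += 1
        row["theoretical"] += Decimal(t["theoretical"])
        row["actual"] += Decimal(t["actual"])
    rows = [
        {"reason": reason, **row, "slippage": row["actual"] - row["theoretical"]}
        for reason, row in totals.items()
    ]
    # Most negative slippage first, so the biggest leak is at the top.
    return sorted(rows, key=lambda r: r["slippage"])

# Invented sample trades, for illustration only.
sample = [
    {"exit_reason": "TARGET", "theoretical": "0.40", "actual": "0.31"},
    {"exit_reason": "TIMEOUT", "theoretical": "-0.10", "actual": "-0.60"},
    {"exit_reason": "TIMEOUT", "theoretical": "-0.15", "actual": "-0.55"},
]
report = slippage_by_reason(sample)
for r in report:
    print(r["reason"], r["n"], r["theoretical"], r["actual"], r["slippage"])
```

Sorting by slippage rather than by trade count puts the costliest exit path first, which is usually the one worth fixing.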

How the matching works (the only non-obvious part)

For each closed trade the tool:

  1. Finds the actual BUY fills by matching token_id + a timestamp window, deduplicated by orderID.
  2. Finds the actual SELL fills that closed the position the same way.
  3. Computes theoretical = (exit_price - entry_price) × shares (what the bot thinks).
  4. Computes actual = sum(sell_takingAmount) - sum(buy_makingAmount) (what the chain says).
  5. Slippage = actual - theoretical. Negative = your fill ladder walked the book down.
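The five steps above can be sketched roughly as follows. This is an illustration, not the tool's source: the record layout, the 60-second window, Unix-second timestamps, and amounts already expressed in dollars are all assumptions, and the function names are hypothetical.

```python
from decimal import Decimal

def match_fills(fills, token_id, side, start_ts, end_ts):
    """Steps 1-2: fills for one token and side inside a window, one per orderID."""
    seen, matched = set(), []
    for f in fills:
        if f["token_id"] != token_id or f["side"] != side:
            continue
        if not (start_ts <= f["timestamp"] <= end_ts):
            continue
        if f["orderID"] in seen:
            continue  # same order logged twice by a retry loop
        seen.add(f["orderID"])
        matched.append(f)
    return matched

def reconcile(trade, fills, window=60):
    """Steps 3-5: return (theoretical, actual, slippage) for one closed trade."""
    buys = match_fills(fills, trade["token_id"], "BUY",
                       trade["entry_ts"] - window, trade["entry_ts"] + window)
    sells = match_fills(fills, trade["token_id"], "SELL",
                        trade["exit_ts"] - window, trade["exit_ts"] + window)
    theoretical = (Decimal(trade["exit_price"]) - Decimal(trade["entry_price"])) * Decimal(trade["shares"])
    received = sum((Decimal(f["takingAmount"]) for f in sells), Decimal("0"))
    paid = sum((Decimal(f["makingAmount"]) for f in buys), Decimal("0"))
    actual = received - paid
    return theoretical, actual, actual - theoretical

# Invented data: one buy, two sells, and a duplicate log line for the second sell.
trade = {"token_id": "T1", "entry_ts": 1000, "exit_ts": 5000,
         "entry_price": "0.40", "exit_price": "0.50", "shares": "100"}
fills = [
    {"token_id": "T1", "side": "BUY", "timestamp": 1002, "orderID": "b1", "makingAmount": "40.00"},
    {"token_id": "T1", "side": "SELL", "timestamp": 5001, "orderID": "s1", "takingAmount": "30.00"},
    {"token_id": "T1", "side": "SELL", "timestamp": 5010, "orderID": "s2", "takingAmount": "16.00"},
    {"token_id": "T1", "side": "SELL", "timestamp": 5011, "orderID": "s2", "takingAmount": "16.00"},
]
theoretical, actual, slippage = reconcile(trade, fills)
print(theoretical, actual, slippage)
```

The window size is the main tuning knob in a sketch like this: too narrow and late retry fills are missed, too wide and fills from a neighbouring trade on the same token get pulled in.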

The dedup-by-orderID step is the one most roll-your-own scripts miss. Sweep retries, where your bot tries 5%, 15%, 25% off the reference price, frequently log the same orderID multiple times if you call post_order from a retry loop without idempotency checks. Without dedup you double-count proceeds and your slippage looks fine when it isn't.
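The same problem can also be avoided at write time. One possible approach, sketched here with a hypothetical `order_log` table and the assumption that each response carries an `orderID` key, is to make the orderID a primary key and let SQLite's `INSERT OR IGNORE` drop the repeats.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the primary key makes a second insert of the same order a no-op.
conn.execute(
    "CREATE TABLE order_log (order_id TEXT PRIMARY KEY, raw_response TEXT NOT NULL)"
)

def log_response(conn, resp):
    """Store a response once per orderID. Returns True only when a new row was written."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO order_log (order_id, raw_response) VALUES (?, ?)",
        (resp["orderID"], json.dumps(resp)),
    )
    conn.commit()
    return cur.rowcount == 1

# A retry loop that sees the same response twice only logs it once.
resp = {"success": True, "orderID": "0xabc"}  # invented response for illustration
first = log_response(conn, resp)
second = log_response(conn, resp)
row_count = conn.execute("SELECT COUNT(*) FROM order_log").fetchone()[0]
print(first, second, row_count)
```

With the write side idempotent, any later reconciliation no longer depends on remembering to dedup.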

Three input modes, depending on how you log

1. Wallet address only (zero setup):

pnl-truthteller --wallet 0xYourProxyAddress --output report.md

Pulls every fill for the wallet from Polymarket's public CLOB API, groups by token + direction, produces the report. Works on any wallet — yours, a competitor's, a random one off the feed.

2. From your bot's SQLite (if you've been logging raw CLOB responses):

pnl-truthteller --sqlite ~/bot/trades.db \
                --positions ~/bot/positions.json \
                --output report.md

Expects a live_trades table with token_id, side, timestamp, raw_response. The raw_response should be the JSON string returned by client.post_order().
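If you are not logging this yet, a table of that shape might look like the sketch below. Only the table name and the four column names come from the description above; the column types, the `BUY`/`SELL` values, the Unix-second timestamps, and the `log_trade` helper are assumptions, so check them against what the tool actually accepts.

```python
import json
import sqlite3
import time

SCHEMA = """
CREATE TABLE IF NOT EXISTS live_trades (
    token_id     TEXT    NOT NULL,
    side         TEXT    NOT NULL,   -- 'BUY' or 'SELL' (assumed values)
    timestamp    INTEGER NOT NULL,   -- Unix seconds (assumed unit)
    raw_response TEXT    NOT NULL    -- JSON string returned by client.post_order()
)
"""

def log_trade(conn, token_id, side, resp, ts=None):
    """Hypothetical helper: store the raw response exactly as returned."""
    when = int(ts if ts is not None else time.time())
    conn.execute(
        "INSERT INTO live_trades (token_id, side, timestamp, raw_response) VALUES (?, ?, ?, ?)",
        (token_id, side, when, json.dumps(resp)),
    )
    conn.commit()

conn = sqlite3.connect(":memory:")  # use your bot's trades.db path in practice
conn.execute(SCHEMA)
log_trade(conn, "token-123", "BUY", {"success": True, "orderID": "0xabc"}, ts=1700000000)
row = conn.execute(
    "SELECT token_id, side, timestamp, raw_response FROM live_trades"
).fetchone()
print(row)
```

Storing the response unmodified, rather than a few parsed fields, keeps everything available for later reconciliation.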

3. From JSONL (for non-Python stacks):

pnl-truthteller --trades trades.jsonl --fills fills.jsonl --output report.md

What this is not

  • It does not execute trades. Read-only.
  • It does not need your private key. Wallet address only.
  • It does not store your data anywhere. Local file in, local file out.
  • It does not promise to fix your bot. It finds the gap; you fix the bot.

Get it

If you're running anything on the Polymarket CLOB and have not reconciled DB vs chain in the last 30 days, run this against your wallet today. The 30-second answer is worth more than the next feature you were going to add.
