Maksat Ramazan
Why Your Retry Logic Is Silently Charging Customers Twice

You've seen this bug. Maybe you've shipped it.

Client sends a payment request. Network hiccups. Client doesn't get a response, so it retries. Your handler runs twice. Two charges. One very angry customer.

This isn't a race condition. It's not a bug in your database layer. It's a missing contract between the client and the server: idempotency.

What idempotency actually means in HTTP

An HTTP endpoint is idempotent if calling it multiple times with the same input produces the same result — and more importantly, the same effect.

GET is idempotent by definition. DELETE usually is. POST /payments is not — unless you build it that way.

The pattern Stripe, Adyen, and most serious payment APIs use is the Idempotency-Key header: the client generates a unique key (usually a UUID) and sends it with every request. The server caches the response and returns it on any subsequent request with the same key, without re-executing the handler.

Simple idea. Surprisingly few Go libraries implement it correctly.

I built the once library because I wanted something that handles the edge cases right — per-key locking, the double-check pattern, configurable caching — without pulling in a framework. If you already have a solution you're happy with, this article may still be useful for the reasoning behind the implementation. If you don't, here's one that works.


The problems that make this non-trivial

Before jumping to code, here's what makes a naive implementation break in production.

1. Thundering herd on the first request

Ten retries arrive simultaneously. The key doesn't exist in the store yet. All ten pass the cache check, and all ten reach the handler. You've now processed the request ten times.

The fix is a per-key lock: the first goroutine acquires it, executes the handler, stores the result. The other nine wait, then hit the cache. But locking introduces its own problem — you need a double-check after acquiring the lock, because another goroutine may have already written the result while you were waiting.

2. What to cache when the handler fails

If your handler returns a 500, do you cache that? If yes, retries get a cached error instead of a real retry. If no, the thundering herd problem is back for any request that fails.

The right answer: make it configurable. Cache only 200, 201, 204 by default. Let the caller decide what counts as a final response.

3. Same key, different body

Client sends key abc with amount=100. Something goes wrong client-side. It resends key abc with amount=200. If you just return the cached response, you've silently ignored a changed request.

Stripe returns 422 in this case. You should too — optionally, since the hash check has a cost.

4. Duplicate request while the original is still in flight

If a request is actively being processed (lock held), a duplicate arriving at that exact moment shouldn't wait — it should get 409 Conflict immediately, so the client knows to back off and retry later rather than pile up.


Building it as middleware

I wanted something that drops into any net/http stack without dependencies, with a pluggable store interface — the same approach I used for circuitbreaker.

The core interface:

type Store interface {
    Get(ctx context.Context, key string) (*Response, bool)
    Set(ctx context.Context, key string, resp *Response, ttl time.Duration) error
    Lock(ctx context.Context, key string) (unlock func(), err error)
}

The middleware logic, simplified:

func New(store Store, opts ...Option) func(http.Handler) http.Handler {
    cfg := newConfig(opts...) // defaults, then user options
    return func(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            ctx := r.Context()
            key := r.Header.Get(cfg.header)

            // No key: pass through or reject, depending on config
            if key == "" {
                if cfg.requireKey {
                    http.Error(w, "Idempotency-Key header is required", http.StatusBadRequest)
                    return
                }
                next.ServeHTTP(w, r)
                return
            }

            // Hash the body up front if request-hash checking is enabled
            var requestHash string
            if cfg.hashCheck {
                requestHash = hashRequestBody(r)
            }

            // Cache hit: return stored response
            // (the 422 hash comparison is elided here for brevity)
            if cached, found := store.Get(ctx, key); found {
                cached.writeTo(w)
                return
            }

            // Acquire per-key lock
            unlock, err := store.Lock(ctx, key)
            if errors.Is(err, ErrLocked) {
                http.Error(w, "Request in progress", http.StatusConflict)
                return
            }
            if err != nil {
                http.Error(w, "idempotency store error", http.StatusInternalServerError)
                return
            }
            defer unlock()

            // Double-check after lock (another goroutine may have written it)
            if cached, found := store.Get(ctx, key); found {
                cached.writeTo(w)
                return
            }

            // Execute handler, capture response
            rw := newResponseWriter(w)
            next.ServeHTTP(rw, r)

            // Store only if status is considered final
            if cfg.cacheableStatus[rw.statusCode] {
                store.Set(ctx, key, rw.toResponse(requestHash), cfg.ttl)
            }
        })
    }
}

The double-check after acquiring the lock is what separates a correct implementation from one that looks correct in testing and fails at 3am.


Response capture

To cache what the handler wrote, you need to intercept writes to http.ResponseWriter. The standard approach — wrap it:

type responseWriter struct {
    http.ResponseWriter
    statusCode int
    body       []byte
}

func (rw *responseWriter) WriteHeader(code int) {
    rw.statusCode = code
    rw.ResponseWriter.WriteHeader(code)
}

func (rw *responseWriter) Write(b []byte) (int, error) {
    rw.body = append(rw.body, b...)
    return rw.ResponseWriter.Write(b)
}

This captures both the status code and body, so you can store a complete response and replay it exactly on cache hits.


Usage

store := once.NewMemoryStore()
defer store.Stop()

middleware := once.New(store,
    once.WithTTL(24*time.Hour),
    once.WithRequireKey(true),
    once.WithRequestHashCheck(true),
)

http.Handle("/payments", middleware(paymentHandler))

Client side:

POST /payments HTTP/1.1
Idempotency-Key: 550e8400-e29b-41d4-a716-446655440000
Content-Type: application/json

{"amount": 100, "currency": "USD"}

First request: handler executes, response cached, X-Idempotent-Replayed: false.

Any repeat with the same key: cached response returned, X-Idempotent-Replayed: true, handler never touched.


HTTP semantics

| Status | Meaning |
| --- | --- |
| 200 | Cached response replayed |
| 400 | Key required but missing |
| 409 | Same key currently in flight |
| 422 | Key exists but request body differs |

Request body validation

When WithRequestHashCheck(true) is set, the middleware computes a SHA-256 of the request body and stores it alongside the response. On subsequent requests with the same key, it compares hashes — if they differ, it returns 422 instead of the cached result.

func hashRequestBody(r *http.Request) string {
    body, _ := io.ReadAll(r.Body)
    r.Body = io.NopCloser(bytes.NewReader(body)) // restore for handler
    h := sha256.Sum256(body)
    return hex.EncodeToString(h[:])
}

One detail: after reading the body for hashing, you must restore r.Body — otherwise the handler sees an empty body. Easy to miss, painful to debug.


Benchmarks

Measured on Intel Core i7-1355U:

| Scenario | ns/op |
| --- | --- |
| Cache hit | 2,834 |
| Cache hit (parallel) | 3,427 |
| Cache miss (first write) | 4,429 |
| Passthrough (no key) | 2,716 |
| With body hash check | 4,632 |
| Thundering herd (100 goroutines) | 618,231 total / 6,182 per goroutine |

Cache hits are cheap. The thundering herd test launches 100 concurrent goroutines for the same key — only one executes the handler, the rest wait and read from cache. The total time looks large; the per-goroutine cost is reasonable.


C shared library

There's a cgo subpackage that compiles once into a shared library (libonce.so) with a C API:

void once_init(void);
int  once_check(char* key, char* response, int responseLen);
int  once_store(char* key, int statusCode, char* body, int bodyLen, int ttlSeconds);
int  once_lock(char* key);
void once_unlock(int lockId);

This lets you use the same idempotency logic from PHP, Python, or any language with FFI — useful if you have a mixed-stack service where you want consistent behavior without reimplementing the semantics.


What this doesn't cover

In-memory store means the cache doesn't survive restarts and isn't shared across instances. For multi-instance deployments, you need a Redis-backed store. The Store interface is designed for that — a Redis implementation is a thin wrapper around SET NX PX for locking and regular GET/SET for caching.

Also: this middleware handles HTTP-level idempotency. If your handler does external calls (third-party payment APIs, email sends), those need their own idempotency handling. The middleware can't protect you from non-idempotent side effects outside your service.


The library

github.com/aqylsoft/once

go get github.com/aqylsoft/once

Zero external dependencies for the core package. MIT license.

If you're building anything that takes money, sends notifications, or mutates state in response to external events — you probably need this.
