Auction systems look simple on the surface.
They are not. Working on real-time bidding and pricing systems at Bolt — where delivery promotions compete for budget in milliseconds — gave me a deep appreciation for how hard concurrent state updates really are.
They combine:
- High write concurrency
- Real-time correctness requirements
- Financial incentives (which attract fraud)
- Global latency sensitivity
- Hard consistency boundaries
This is not a CRUD system.
This is a correctness-under-load system.
In this article, I’ll walk through how I would design it — focusing on trade-offs, consistency models, and scaling strategy.
1. Clarifying Requirements
Functional
- Create auctions
- Place bids
- Highest valid bid wins
- Auctions have start/end times
- Notify users when outbid or auction ends
- Admin moderation support
Non-Functional
- Low latency (<100ms bid response)
- Strong consistency for highest bid
- Horizontal scalability
- Auditability
- High availability
- Fraud resistance
2. The First Senior Insight: Identify the True Bottleneck
The hardest part of the system is:
Concurrent bid updates on the same auction.
Everything else is standard microservices work.
So we optimize around:
- Atomicity
- Serialization of competing bids
- Low latency validation
3. High-Level Architecture
```mermaid
flowchart LR
    User([User]) --> AG[API Gateway]
    AG --> BS[Bid Service]
    BS --> Redis[(Redis)]
    BS --> Kafka[[Kafka]]
    Kafka --> PW[Persistence Worker]
    PW --> DB[(Database)]
    BS --> NS[Notification Service]
```
BS --> NS[Notification Service]
Why this architecture?
- Redis handles real-time, atomic bid updates.
- Kafka decouples durability from latency (for a deeper dive into Kafka ingestion patterns and streaming pipelines, see the Ad Click Aggregator post).
- DB stores immutable audit history.
- Services scale horizontally.
4. The Critical Path: Placing a Bid
This is the heart of the system.
Request
```
POST /api/auctions/{auctionId}/bids

{
  "amount": 150.00
}
```
5. TypeScript Implementation (Core Logic)
This is simplified but production-oriented.
```typescript
interface PlaceBidRequest {
  auctionId: string;
  userId: string;
  amount: number;
}

class BidService {
  constructor(
    private redis: RedisClient,
    private kafka: KafkaProducer
  ) {}

  async placeBid(req: PlaceBidRequest) {
    // Atomic check-and-set: validate auction status and the current
    // highest bid, then update, all inside one Lua script execution.
    const script = `
      local status = redis.call("GET", KEYS[1])
      if status ~= "ACTIVE" then
        return "INVALID_AUCTION"
      end

      local current = tonumber(redis.call("GET", KEYS[2]) or "0")
      local newBid = tonumber(ARGV[1])

      if newBid <= current then
        return "BID_TOO_LOW"
      end

      redis.call("SET", KEYS[2], newBid)
      redis.call("SET", KEYS[3], ARGV[2])
      return "OK"
    `;

    const result = await this.redis.eval(script, {
      keys: [
        `auction:${req.auctionId}:status`,
        `auction:${req.auctionId}:highestBid`,
        `auction:${req.auctionId}:highestBidder`
      ],
      arguments: [req.amount.toString(), req.userId]
    });

    if (result !== "OK") {
      throw new Error(result);
    }

    // Durability is handled off the hot path: the accepted bid is
    // published to Kafka and persisted asynchronously by a worker.
    await this.kafka.publish("bid_created", {
      auctionId: req.auctionId,
      userId: req.userId,
      amount: req.amount,
      timestamp: Date.now()
    });

    return { success: true };
  }
}
```
Why Lua?
Because Redis guarantees:
Lua scripts execute atomically.
That eliminates race conditions without database locking.
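The guarantee is easiest to see with an in-memory stand-in for the three Redis keys (a sketch of the logic only, not the real client): because validation and update happen in one uninterruptible step, no other caller can slip in between the read and the write.

```typescript
// In-memory stand-in for the three Redis keys the Lua script touches.
// placeBid runs as a single synchronous step, so no other caller can
// interleave between the validation read and the update write, which is
// exactly what Redis guarantees for the Lua script on the server side.
class InMemoryAuctionState {
  status: "ACTIVE" | "ENDED" = "ACTIVE";
  highestBid = 0;
  highestBidder = "";

  placeBid(amount: number, userId: string): string {
    if (this.status !== "ACTIVE") return "INVALID_AUCTION";
    if (amount <= this.highestBid) return "BID_TOO_LOW";
    this.highestBid = amount;
    this.highestBidder = userId;
    return "OK";
  }
}
```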
6. What Happens if Two Bids Arrive at the Same Millisecond?
Redis serializes execution.
One script runs first. The second runs after.
Only one wins.
This guarantees:
- No double highest bid
- No inconsistent state
- Deterministic behavior
If bids are equal, business logic defines tie-breaking (timestamp or deterministic ID ordering).
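One such tie-break (an illustrative policy, not a fixed rule) can be written as a comparator: higher amount wins, then earlier timestamp, then lexicographically smaller bid ID.

```typescript
interface Bid {
  amount: number;
  timestamp: number; // epoch millis
  bidId: string;     // unique, e.g. a UUID
}

// Returns true if bid a beats bid b. Higher amount wins; ties fall back
// to earlier timestamp, then to lexicographic bid ID, so the outcome is
// deterministic even when two bids share the same millisecond.
function beats(a: Bid, b: Bid): boolean {
  if (a.amount !== b.amount) return a.amount > b.amount;
  if (a.timestamp !== b.timestamp) return a.timestamp < b.timestamp;
  return a.bidId < b.bidId;
}
```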
7. Sequence Diagram (Full Bid Flow)
```mermaid
sequenceDiagram
    actor User
    participant AG as API Gateway
    participant BS as Bid Service
    participant Redis
    participant Kafka
    participant Worker
    participant DB as Database

    User->>AG: POST /bids
    AG->>BS: placeBid()
    BS->>Redis: atomic Lua validation
    alt Valid Bid
        BS->>Kafka: publish bid_created
        BS-->>AG: 200 OK
    else Invalid Bid
        BS-->>AG: 400 Error
    end
    Worker->>Kafka: consume event
    Worker->>DB: insert bid record
```
8. Scaling to 1 Million Concurrent Bids
You do not scale this with bigger machines.
You scale it horizontally.
Strategy
1. Shard by auction_id
```typescript
const shard = hash(auctionId) % totalShards;
```
Each shard has:
- Dedicated Redis instance
- Dedicated BidService cluster
This prevents hot auctions from blocking others.
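The routing function can be sketched as follows, using FNV-1a as an example hash (any stable string hash with reasonable distribution works):

```typescript
// FNV-1a: a simple, deterministic string hash (an illustrative choice,
// not a requirement of the design).
function hashString(s: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h >>> 0;
}

// All bids for one auction always land on the same shard, so a single
// Redis instance serializes that auction's competing writes.
function shardFor(auctionId: string, totalShards: number): number {
  return hashString(auctionId) % totalShards;
}
```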
2. Cache-First Architecture
- Redis = source of truth for active auctions
- Database = historical durability layer
You trade immediate durability for throughput.
That’s intentional.
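The write-behind side of this trade can be sketched as a drain step. Here `store` is a hypothetical database client interface and the queue stands in for the Kafka consumer's buffered events:

```typescript
interface BidEvent {
  auctionId: string;
  userId: string;
  amount: number;
  timestamp: number;
}

// Hypothetical write-behind worker step: drain up to batchSize buffered
// events and persist them in one batch. Durability lags Redis by the
// queue depth, which is the intentional trade for throughput.
async function drainBatch(
  queue: BidEvent[],
  store: { insertMany(rows: BidEvent[]): Promise<void> },
  batchSize = 100
): Promise<number> {
  const batch = queue.splice(0, batchSize);
  if (batch.length > 0) {
    await store.insertMany(batch);
  }
  return batch.length;
}
```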
3. Backpressure Strategy
Under extreme load:
- Rate limit per user
- Reject bids if queue depth exceeds threshold
- Apply request TTL (e.g., 2 seconds)
- Use circuit breakers if downstream fails
Fail fast > fail catastrophically.
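Per-user rate limiting, the first item above, can be sketched as a token bucket (the parameters here are illustrative, not recommendations):

```typescript
// Token bucket per user: capacity caps bursts, refillPerSec caps the
// sustained bid rate. Time is injected so behavior is deterministic.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private capacity: number,
    private refillPerSec: number,
    now: number = Date.now()
  ) {
    this.tokens = capacity;
    this.lastRefill = now;
  }

  // Returns true if the request may proceed, false if it should be rejected.
  tryAcquire(now: number = Date.now()): boolean {
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```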
9. Fraud & Shill Bidding Prevention
Shill bidding = fake bids to inflate price.
This is not a backend-only problem. This is a data science + behavioral problem.
Detection Approaches
- Same IP/device fingerprint across accounts
- Bid clustering on single seller
- Abnormal bid escalation patterns
- Graph-based account relationship analysis
- ML anomaly detection
Every bid must be immutably logged.
Never trust surface-level heuristics alone.
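The first heuristic above (shared device fingerprints across accounts) reduces to a grouping pass; this is one signal feeding a scoring pipeline, not a verdict on its own:

```typescript
interface BidRecord {
  userId: string;
  fingerprint: string; // device/browser fingerprint captured at bid time
}

// Flags fingerprints used by more than one account. A shared fingerprint
// is a signal, not proof: households share devices, so this should feed
// a scoring pipeline rather than trigger an automatic ban.
function sharedFingerprints(bids: BidRecord[]): Map<string, Set<string>> {
  const byFingerprint = new Map<string, Set<string>>();
  for (const bid of bids) {
    const users = byFingerprint.get(bid.fingerprint) ?? new Set<string>();
    users.add(bid.userId);
    byFingerprint.set(bid.fingerprint, users);
  }
  const flagged = new Map<string, Set<string>>();
  for (const [fp, users] of byFingerprint) {
    if (users.size > 1) flagged.set(fp, users);
  }
  return flagged;
}
```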
10. Multi-Region Global Scaling
Now it gets interesting.
The Problem:
- Users are global
- Auctions are time-sensitive
- Latency matters
Strategy:
- Geo-partition auctions by origin region
- Route bids to home region
- Replicate bid events asynchronously
- Use timestamp-based conflict resolution
- Region-local notifications
You do NOT want cross-region locking.
That kills latency.
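Timestamp-based conflict resolution over asynchronously replicated events can be sketched as a pure merge; the region ID as final tie-break is an illustrative policy that keeps the result deterministic in every region without any coordination:

```typescript
interface ReplicatedBid {
  amount: number;
  timestamp: number; // origin-region clock, epoch millis
  region: string;    // e.g. "eu-west", "us-east"
  userId: string;
}

// Merge replicated bid events into one winner with no cross-region lock:
// highest amount wins; ties resolve by earlier timestamp, then by region
// ID, so every region converges on the same answer independently.
function mergeReplicated(events: ReplicatedBid[]): ReplicatedBid | undefined {
  return events.reduce<ReplicatedBid | undefined>((best, e) => {
    if (!best) return e;
    if (e.amount !== best.amount) return e.amount > best.amount ? e : best;
    if (e.timestamp !== best.timestamp) return e.timestamp < best.timestamp ? e : best;
    return e.region < best.region ? e : best;
  }, undefined);
}
```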
11. Trade-Off Discussion (Senior-Level Framing)
| Decision | Trade-Off |
|---|---|
| Redis as live state | In-memory risk vs ultra-low latency |
| Async persistence | Eventual durability vs throughput |
| Partition by auction | Complexity vs isolation |
| Region-based ownership | Simpler consistency vs flexibility |
Every design decision here is about:
Reducing contention while preserving correctness.
12. Final Thoughts
Auction systems are deceptively complex because the core challenge isn’t throughput — it’s correctness under contention. Two users bidding on the same item at the same millisecond must produce a deterministic, auditable result every single time.
That’s what makes this different from a typical high-write system. Financial incentives mean every bid must be immutably logged, every race condition eliminated, and every edge case (shill bidding, region failover, tie-breaking) explicitly handled. You can’t paper over bugs with eventual consistency when real money is on the line.
The architecture here — Redis Lua for atomic validation, Kafka for durable event streaming, sharding by auction ID — is designed around one principle: serialize the contention point, parallelize everything else.