MONOPOLY — Design Patterns Deep Dive
Table of Contents
- CQRS
- Event Sourcing
- Saga Pattern
- Circuit Breaker
- Bulkhead
- Strangler Fig
- Sidecar / Service Mesh
- Outbox Pattern
- Consistent Hashing
- Backpressure
- Leader Election
- Two-Phase Commit
1. CQRS (Command Query Responsibility Segregation)
What it is: Separate the read model (Query) from the write model (Command) into distinct services, databases, or code paths.
When to use:
- Read load is 10×+ write load (most web apps)
- Read queries are complex aggregations over write data
- Need to optimize read and write paths independently
- Domain model is complex (DDD contexts)
Implementation:
Write Path: Client → Command API → Write DB (normalized, PostgreSQL)
Read Path: Client → Query API → Read DB (denormalized, Redis / Elasticsearch)
Sync: Write DB → CDC (Debezium) → Message Queue → Read DB updater
Trade-offs:
- ✅ Independent scaling of read and write
- ✅ Optimized schemas for each operation type
- ❌ Eventual consistency between write and read models
- ❌ Increased complexity; two models to maintain
Real-world users: Amazon (order service), LinkedIn (feed)
2. Event Sourcing
What it is: Store state as a sequence of immutable events rather than current state. Rebuild current state by replaying events.
When to use:
- Full audit trail is a regulatory requirement (fintech, healthcare)
- Need to replay history for debugging or analytics
- Complex domain with many state transitions
- Need to derive multiple read projections from same data
Implementation:
Event Store: append-only log (Kafka, EventStoreDB)
Snapshots: periodic snapshots to speed up state rebuild
Projections: consumers build read models from events
Trade-offs:
- ✅ Complete audit history; perfect for compliance
- ✅ Replay and time-travel debugging
- ❌ Querying current state requires projection maintenance
- ❌ Event schema evolution is hard
- ❌ High storage overhead over time
3. Saga Pattern
What it is: Manage distributed transactions across microservices via a sequence of local transactions, each publishing an event. If a step fails, compensating transactions undo previous steps.
Two variants:
- Choreography: Services react to events autonomously (decentralized)
- Orchestration: A central Saga Orchestrator coordinates steps (centralized)
When to use:
- Multi-service workflows where ACID across services is impossible
- Long-running business transactions (order → payment → inventory → shipping)
- Need rollback across service boundaries
Choreography Example:
OrderService creates order →
[event: OrderCreated] →
PaymentService charges card →
[event: PaymentProcessed] →
InventoryService reserves stock →
[event: StockReserved] →
ShippingService books courier
Compensating Transactions (on failure):
ShippingService fails →
[event: ShippingFailed] →
InventoryService releases stock →
PaymentService refunds card →
OrderService marks order failed
Trade-offs:
- ✅ No distributed locking; high availability
- ✅ Scales well across services
- ❌ Hard to debug; distributed trace required
- ❌ Compensating transactions are complex to implement correctly
4. Circuit Breaker
What it is: A proxy that monitors calls to a service. If failure rate exceeds threshold, the circuit "opens" and calls fail fast instead of waiting for timeout.
States:
CLOSED → calls pass through; monitor failure rate
OPEN → calls fail immediately; no calls to downstream
HALF-OPEN → let a probe call through; if success, close; if fail, stay open
When to use:
- Calling any external service (payment gateway, SMS, email)
- Microservices calling each other
- Preventing timeout cascade when downstream is slow
Implementation tools: Hystrix (deprecated), Resilience4j, Polly (.NET), Envoy proxy
Thresholds (starting point):
- Open after 50% failure rate over 10 requests
- Stay open for 30 seconds
- Half-open: allow 1 probe request
Trade-offs:
- ✅ Prevents cascade failures
- ✅ Gives downstream time to recover
- ❌ Adds latency overhead for monitoring
- ❌ Requires fallback behavior when circuit is open
5. Bulkhead
What it is: Isolate components so a failure in one doesn't consume resources of others. Named after the watertight compartments in ship hulls.
Types:
- Thread Pool Bulkhead: Separate thread pools per service call
- Semaphore Bulkhead: Limit concurrent calls per service
- Process Bulkhead: Separate processes/containers per service type
When to use:
- Multiple tenants sharing infrastructure (SaaS)
- One slow service consuming all connection pool slots
- Protecting critical services from being starved by non-critical ones
Example:
Without bulkhead:
[Recommendation Service hangs] → fills shared thread pool → [Payment Service starves]
With bulkhead:
[Recommendation Service hangs] → fills its own thread pool (10 threads) → [Payment Service unaffected, has its own 50 threads]
6. Strangler Fig Pattern
What it is: Incrementally replace a legacy monolith by routing new functionality to new microservices, while keeping the monolith alive for unchanged features.
Migration steps:
Phase 1: Deploy proxy in front of monolith (no user impact)
Phase 2: Route one feature to new microservice
Phase 3: Verify; deprecate that feature in monolith
Phase 4: Repeat for each feature
Phase 5: Monolith is empty; decommission
When to use:
- Migrating legacy monolith to microservices
- Can't do a big-bang rewrite (too risky)
- Need to ship new features during migration
Trade-offs:
- ✅ Zero downtime migration
- ✅ Incremental risk
- ❌ Dual maintenance burden during migration (monolith + new services)
- ❌ Proxy adds latency; must be managed carefully
7. Outbox Pattern
What it is: Solve the dual-write problem (write to DB AND publish to queue atomically) by writing the event to an "outbox" table in the same DB transaction, then having a separate process relay it to the queue.
Problem it solves:
❌ WRONG (dual-write race):
BEGIN;
UPDATE orders SET status='paid';
COMMIT;
// Crash here → event never published, DB and queue are inconsistent
publish(PaymentProcessed);
✅ CORRECT (outbox):
BEGIN;
UPDATE orders SET status='paid';
INSERT INTO outbox (event_type, payload) VALUES ('PaymentProcessed', {...});
COMMIT;
// Relay process reads outbox and publishes to Kafka
// At-least-once delivery guaranteed; make consumers idempotent
Relay options: Debezium (CDC), polling relay, transaction log tailing
8. Consistent Hashing
What it is: A hashing scheme where adding or removing nodes requires only K/N keys to be remapped (K = keys, N = nodes), instead of remapping all keys.
When to use:
- Distributing cache keys across Redis cluster nodes
- Routing requests to servers in a distributed system
- Partitioning data across database nodes
Virtual nodes: Assign multiple positions per physical node on the hash ring to ensure even distribution even with few nodes.
9. Backpressure
What it is: A mechanism for consumers to signal producers to slow down when they can't keep up, preventing memory exhaustion and cascade failures.
Strategies:
- Drop: Discard overflow messages (acceptable for metrics, logs)
- Buffer: Queue up to a limit, then block or drop
- Block: Producer waits until consumer catches up (simplest, may cause timeout)
- Rate Limit: Throttle producers at ingestion point
When to use:
- Message queue consumers are slower than producers
- Real-time data pipeline ingestion spikes
- API rate limiting for upstream clients
10. Leader Election
What it is: In a distributed system, elect a single node to perform a privileged task (e.g., writing to DB, sending scheduled jobs, coordinating work).
Algorithms:
- Raft: Used by etcd, CockroachDB, Consul. Practical and well-understood.
- ZooKeeper (ZAB): Used by Kafka, HBase. Mature but operationally heavy.
- Bully Algorithm: Simple; highest ID wins. Not fault-tolerant.
When to use:
- Scheduled jobs that should only run once (cron replacement)
- Primary/replica database failover coordination
- Distributed lock management
Tools: etcd, ZooKeeper, Consul, Redis (Redlock — use with caution)
11. Two-Phase Commit (2PC)
What it is: A distributed algorithm that ensures all participants in a transaction either all commit or all abort.
Phases:
Phase 1 (Prepare): Coordinator asks all participants "can you commit?"
All say YES → proceed to Phase 2
Any says NO → abort
Phase 2 (Commit): Coordinator tells all participants to commit
When to use (sparingly):
- Strong consistency is an absolute requirement across services
- Data loss is catastrophic (financial settlements)
Why to avoid:
- Coordinator is a SPOF
- Blocks on participant failure
- Very low throughput under contention
- Prefer Saga Pattern in most microservice architectures
12. Read-Through / Write-Through / Write-Behind Cache
Read-Through:
Client → Cache (miss) → Cache fetches from DB → Returns to client
Cache is always populated on miss. Simple for clients. Risk: cold start.
Write-Through:
Client → Cache → Cache writes to DB synchronously → Confirms
Strong consistency. Higher write latency. Good for read-heavy with consistency need.
Write-Behind (Write-Back):
Client → Cache → Confirms immediately → Async flush to DB
Very low write latency. Risk of data loss if cache fails before flush. Good for high-throughput counters, analytics.
Cache-Aside (Lazy Loading):
Client → Cache (miss) → Client fetches from DB → Client writes to Cache
Most common. Application owns cache logic. Risk: thundering herd on cold start.
Limitations
- This is a reference document and may not cover all edge cases. Always verify architectures before production.