Topic 1

Cache-Aside vs Write-Through


Cache-Aside: The Industry Default

Cache-aside (also called lazy loading or look-aside) is the most widely used caching pattern. The application owns the caching logic: it checks the cache before querying the database, and populates the cache on a miss. The cache is a passive store — it does not know about the database and does not load data on its own.

Figure 1: Cache-aside detailed flow — the application manages all cache interactions; the cache is populated lazily on first access

Read Path (Step by Step)

  1. Application receives a read request for user:42.
  2. Application calls Redis: GET user:42.
  3. Cache HIT: Redis returns the data. Application sends it to the client. Done in <1ms.
  4. Cache MISS: Redis returns null. Application queries PostgreSQL: SELECT * FROM users WHERE id = 42.
  5. Application stores the result in Redis: SET user:42 <data> EX 3600 (1-hour TTL).
  6. Application returns the data to the client. Next request for user:42 will hit cache.
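The read path above can be sketched in a few lines of Python. This is a minimal illustration, not a production client: plain dicts stand in for Redis and PostgreSQL, and the key and data are hypothetical.

```python
import time

# Toy stand-ins for Redis and PostgreSQL (illustrative data).
db = {"user:42": {"id": 42, "name": "Ada"}}
cache = {}            # key -> (value, expires_at)
TTL_SECONDS = 3600    # mirrors SET user:42 <data> EX 3600

def cache_get(key):
    entry = cache.get(key)
    if entry is None:
        return None
    value, expires_at = entry
    if time.time() >= expires_at:   # TTL expired -> treat as a miss
        del cache[key]
        return None
    return value

def get_user(key):
    value = cache_get(key)          # steps 1-2: check the cache first
    if value is not None:
        return value                # step 3: cache HIT
    value = db[key]                 # step 4: cache MISS -> query the database
    cache[key] = (value, time.time() + TTL_SECONDS)  # step 5: populate with TTL
    return value                    # step 6: the next call hits the cache
```

The first call populates the cache; every subsequent call within the TTL is served without touching the database.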

Write Path: Delete, Do Not Update

When data changes (UPDATE, DELETE), the application writes to the database first, then deletes the cache key. It does NOT update the cache with the new value. Why? Because deleting is simpler and safer: if the database write succeeds but the cache update fails, the cache has stale data. With delete, the worst case is a cache miss on the next read, which repopulates from the authoritative database.
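The write path is deliberately tiny — write the database, then delete the key. A sketch, again with dicts standing in for the real stores:

```python
db = {"user:42": {"name": "Ada"}}
cache = {"user:42": {"name": "Ada"}}   # currently cached copy

def update_user(key, new_value):
    db[key] = new_value        # 1. write the authoritative store first
    cache.pop(key, None)       # 2. then DELETE the cache key -- never update it
```

The next read misses and repopulates from the database, which is always the source of truth.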

Why Delete Instead of Update? The Race Condition

Imagine two concurrent writes: Thread A updates price to $99, Thread B updates price to $129.

With cache UPDATE: Thread A writes DB ($99), Thread B writes DB ($129), Thread A updates cache ($99), Thread B updates cache ($129). Cache shows $129, DB shows $129 — correct. But if Thread B's cache update happens before Thread A's: cache shows $99, DB shows $129 — WRONG.

With cache DELETE: Both threads delete the key. Next read gets $129 from DB. Always correct.


Write-Through: Always-Fresh Cache

In write-through caching, every write operation updates both the cache and the database synchronously. The client does not receive confirmation until both writes succeed. This guarantees the cache is always consistent with the database — there is never a stale read. The trade-offs are higher write latency (two writes per operation) and that every written item is cached, even data that may never be read.

Figure 2: Write-through — every write goes to both cache and database synchronously, guaranteeing zero stale reads at the cost of 2x write latency
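The synchronous double write can be sketched as follows (a minimal illustration — dicts stand in for the database and cache, and the key name is hypothetical):

```python
class WriteThroughCache:
    """Every write goes to both the backing store and the cache
    before the caller gets confirmation -- no stale reads possible."""

    def __init__(self, store):
        self.store = store     # dict standing in for the database
        self.cache = {}

    def write(self, key, value):
        self.store[key] = value   # synchronous write #1: database
        self.cache[key] = value   # synchronous write #2: cache
        # only now does the caller's write() return

    def read(self, key):
        return self.cache.get(key)   # cache is always warm for written keys
```

Note that `write` returns only after both assignments complete — that is the 2x write latency the pattern pays for zero stale reads.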

When to use Write-Through over Cache-Aside:

  • Data must never be stale (user permissions, feature flags, pricing)
  • Write frequency is low
  • You need the cache to always be warm

Stick with Cache-Aside when:

  • You have a read-heavy workload (90%+ reads)
  • You cannot afford the write latency overhead
  • You only want to cache data that is actually requested (memory efficiency)
Figure 3: All four caching patterns side by side — read path, write path, consistency guarantee, and best use case for each
| Pattern | Read | Write | Consistency | Latency | Best For |
| --- | --- | --- | --- | --- | --- |
| Cache-Aside | App → Cache → DB | App → DB → DEL cache | Eventual (TTL) | Low reads, normal writes | 90% of systems (default) |
| Write-Through | App → Cache | App → Cache + DB | Strong | Normal reads, 2x writes | Consistency-critical data |
| Write-Behind | App → Cache | App → Cache (async DB) | Weak | Fast writes | Metrics, view counts |
| Read-Through | App → Cache (auto-load) | Varies | Depends | Low reads | Simplified app code |
| Refresh-Ahead | App → Cache (pre-warmed) | N/A | Strong | Zero misses | Predictable access patterns |
Interview Tip

Explicitly name the pattern and state your read:write ratio reasoning. "I'll use cache-aside because this is a read-heavy workload — 95% reads, 5% writes — and I want to only cache what's actually requested." This specificity separates strong candidates from average ones.

Topic 2

Cache Invalidation Deep Dive

Figure 4: Five invalidation strategies — from TTL expiry (simplest) to stale-while-revalidate (best UX)

Five Invalidation Strategies Ranked

1. TTL Expiry (Simplest)
Every cache key has a Time-To-Live. After the TTL expires, the key is automatically deleted. The stale window is exactly the TTL. Set shorter TTLs for frequently changing data (inventory: 10s) and longer TTLs for stable data (product descriptions: 24h). No code needed beyond setting the TTL at write time.

2. Delete on Write (Most Common)
When the application writes to the database, it also deletes the corresponding cache key. The stale window is essentially zero. This is the standard approach for cache-aside and should be your default for any data that changes via your application's write path.

3. Event-Driven Invalidation (Best for Microservices)
When data changes in the database, a change event is published to a message queue (Kafka). A cache invalidation consumer listens for these events and deletes the corresponding cache keys. This decouples the writing service from the caching logic.

LinkedIn Example

When the Profile Service updates a user's job title in PostgreSQL, it publishes a profile.updated event to Kafka. The Feed Service, Search Service, and Messaging Service each have consumers that invalidate their respective cache entries for that user. No service needs to know about the others' caches.
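The fan-out described above can be sketched with a toy in-process pub/sub standing in for Kafka. Everything here is illustrative — the topic name, caches, and key formats are assumptions, not a real Kafka client:

```python
from collections import defaultdict

subscribers = defaultdict(list)   # topic -> list of handlers (toy Kafka)

def publish(topic, event):
    for handler in subscribers[topic]:
        handler(event)

# Each service keeps its own cache and registers its own invalidation consumer.
feed_cache = {"user:42:feed": "cached feed"}
search_cache = {"user:42:doc": "cached search doc"}

def invalidate(event, cache, key_fmt):
    cache.pop(key_fmt.format(event["user_id"]), None)

subscribers["profile.updated"].append(
    lambda e: invalidate(e, feed_cache, "user:{}:feed"))
subscribers["profile.updated"].append(
    lambda e: invalidate(e, search_cache, "user:{}:doc"))
```

One `publish` call clears both caches; the Profile Service never needs to know which services cache its data.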

4. Version Stamping (No Explicit Deletion)
Append a version number to the cache key: user:42:v7. When the user's data changes, increment the version to v8. The application always reads from the latest version key. The old key user:42:v7 expires naturally via its TTL. This approach never requires explicit cache deletion — the old data simply becomes unreachable.
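Version stamping can be sketched like this (a toy illustration: dicts stand in for Redis and the database, and no TTL is simulated — in real Redis the orphaned `v7` key would simply expire):

```python
db = {7: "Senior Engineer"}   # authoritative store (illustrative data)
cache = {}                    # versioned keys; old versions expire via TTL
versions = {7: 1}             # current version per entity

def current_key(user_id):
    return f"user:{user_id}:v{versions[user_id]}"

def update_title(user_id, title):
    db[user_id] = title
    versions[user_id] += 1    # bump the version -- no cache delete needed

def read_title(user_id):
    key = current_key(user_id)
    if key not in cache:      # old-version keys are simply never read again
        cache[key] = db[user_id]
    return cache[key]
```

After an update, reads go straight to the new key; the stale entry becomes unreachable garbage awaiting TTL expiry.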

5. Stale-While-Revalidate (Best User Experience)
The cache serves the stale value immediately (zero latency for the user) while triggering a background refresh. The user gets the old value instantly, and the next request gets the fresh value. HTTP's Cache-Control: stale-while-revalidate header implements this at the CDN/browser level.


Cache Stampede Solutions

A cache stampede (thundering herd) occurs when a popular cache key expires and hundreds or thousands of concurrent requests all experience a cache miss simultaneously, potentially overwhelming the database.

Figure 5: Cache stampede — a single key expiry can generate thousands of simultaneous DB queries. Solutions: lock + single refill, jittered TTL, stale-while-revalidate.

Solution 1: Lock + Single Refill
First request acquires a Redis SETNX lock, queries DB, repopulates cache. Other requests wait 10–50ms and retry from cache. Result: 1 database query instead of 5,000. This is the most effective solution for high-concurrency keys.
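A single-process sketch of the lock-and-refill pattern. The dict-based `setnx` here only imitates Redis `SETNX` — the real command is atomic across all clients, which this stand-in is not; key names and timings are illustrative:

```python
import time

cache, db, locks = {}, {"product:1": "v1"}, {}

def setnx(key):
    """Stand-in for Redis SETNX (the real command is atomic)."""
    if key in locks:
        return False
    locks[key] = True
    return True

def get_with_refill(key, max_wait=0.5):
    deadline = time.time() + max_wait
    while time.time() < deadline:
        if key in cache:
            return cache[key]            # HIT (fresh, or just refilled by the winner)
        if setnx("lock:" + key):         # only ONE caller wins the lock
            try:
                cache[key] = db[key]     # a single DB query refills the cache
                return cache[key]
            finally:
                locks.pop("lock:" + key, None)
        time.sleep(0.01)                 # losers wait briefly, then re-check the cache
    return db[key]                       # fallback: go to the DB after the timeout
```

With 5,000 concurrent callers, one wins the lock and queries the database; the rest find the refilled key on retry.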

Solution 2: Jittered TTL
TTL = 3600 + random(0, 300). Keys expire at slightly different times, preventing simultaneous mass-expiry events. Very low complexity — just add a random offset when setting the TTL. Best for systems with many similar keys (product pages, user profiles).
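Jitter is a one-liner; the base TTL and jitter window below match the formula in the text and are otherwise arbitrary:

```python
import random

BASE_TTL = 3600         # 1 hour
JITTER_MAX = 300        # spread expiry over an extra 5 minutes

def jittered_ttl():
    return BASE_TTL + random.randint(0, JITTER_MAX)
```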

Solution 3: Stale-While-Revalidate
Track a "soft TTL" (data is stale) and "hard TTL" (data is deleted). Return stale data + trigger background refresh. The key never truly expires, so there is no stampede. Best user experience — the user always gets a fast response.
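The soft/hard TTL bookkeeping can be sketched as follows (illustrative only: dicts stand in for Redis and the database, and the background refresh uses a bare thread rather than a task queue):

```python
import time
import threading

cache = {}                    # key -> (value, soft_expiry, hard_expiry)
db = {"feed:home": "v1"}      # stand-in for the database

def refill(key, soft_ttl, hard_ttl):
    value = db[key]           # single authoritative fetch
    now = time.time()
    cache[key] = (value, now + soft_ttl, now + hard_ttl)
    return value

def swr_get(key, soft_ttl=60, hard_ttl=3600):
    now = time.time()
    entry = cache.get(key)
    if entry and now < entry[2]:               # within hard TTL: serve something
        value, soft_expiry, _ = entry
        if now >= soft_expiry:                 # stale: serve it anyway, refresh in background
            threading.Thread(target=refill, args=(key, soft_ttl, hard_ttl)).start()
        return value
    return refill(key, soft_ttl, hard_ttl)     # truly absent: synchronous fill
```

Because the key only disappears at the hard TTL (and is refreshed well before that), there is no expiry moment for a stampede to pile onto.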

| Solution | Complexity | Effectiveness | Use When |
| --- | --- | --- | --- |
| Lock + Single Refill | Medium | Excellent | High-concurrency keys (product pages, feeds) |
| Jittered TTL | Low | Good for mass expiry | Many keys expiring at similar times |
| Stale-While-Revalidate | High | Best UX | Latency-critical user-facing data |
| Pre-warming | Low | Prevents cold start | Predictable traffic patterns (scheduled) |
Interview Tip

Proactively mention stampede prevention when discussing caching — it shows production awareness. Say: "I'd add jitter to TTLs to prevent mass simultaneous expiry, and use a lock-based refill pattern for the most popular keys."

Topic 3

The Hot Key Problem


When One Key Gets All the Traffic

The hot key problem occurs when a single cache key receives disproportionately high traffic, overwhelming the Redis node that stores it. Even though Redis can handle 100,000+ ops/sec, a single hot key during a flash sale or viral event can generate 1,000,000+ reads/sec to one key.

Hot Key vs Cache Stampede — Key Difference

Cache Stampede: occurs when a key expires and many requests miss simultaneously. The key is absent.
Hot Key: occurs when a key exists but is read so frequently it overwhelms the Redis node. The key is always present — the node is overloaded by volume of reads.

Figure 6: Hot key problem — one key generates millions of reads per second to a single Redis node, overwhelming its capacity
Real-World Hot Key Examples
  • Flash Sale (Amazon/Flipkart): product:iphone15 gets 1 million reads/sec during a sale launch.
  • Viral Content (Twitter/Instagram): tweet:12345 for a viral post gets 500K reads/sec.
  • Global Config (Feature Flags): feature_flags:global is read by EVERY request in the system. With 100K RPS, this single key gets 100K reads/sec on one Redis node.
Figure 7: Three solutions for hot keys — two-tier local cache (best default), key replication across nodes, and key splitting for extreme traffic

Solution 1: Two-Tier Caching (Local + Distributed) — The Default
Add a small in-process LRU cache (local cache) within each application server, in front of Redis. Hot keys are cached locally with a very short TTL (5–10 seconds). With 10 app servers, Redis receives at most 2 reads/second for that key (10 servers × 1 miss per 5 seconds) instead of 1,000,000 reads/sec. That is a 500,000x reduction in Redis load.
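A minimal two-tier lookup, assuming an in-process LRU in front of Redis. The LRU here is built on `OrderedDict`; the dict `redis_stub` and the key are illustrative stand-ins:

```python
import time
from collections import OrderedDict

class LocalLRU:
    """Tiny in-process LRU with a short TTL, sitting in front of Redis."""

    def __init__(self, max_keys=1000, ttl=5.0):
        self.max_keys, self.ttl = max_keys, ttl
        self.data = OrderedDict()            # key -> (value, expires_at)

    def get(self, key):
        entry = self.data.get(key)
        if entry is None or time.time() >= entry[1]:
            self.data.pop(key, None)         # expired or absent -> miss
            return None
        self.data.move_to_end(key)           # mark as most recently used
        return entry[0]

    def put(self, key, value):
        self.data[key] = (value, time.time() + self.ttl)
        self.data.move_to_end(key)
        if len(self.data) > self.max_keys:
            self.data.popitem(last=False)    # evict least recently used

redis_stub = {"product:iphone15": "in stock"}   # stand-in for Redis
local = LocalLRU()

def get(key):
    value = local.get(key)
    if value is None:                # local miss (at most once per ~5s per server)
        value = redis_stub.get(key)  # only now does the request reach Redis
        local.put(key, value)
    return value
```

Each app server hits Redis at most once per TTL window per key — the rest of the traffic is absorbed in-process.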

Solution 2: Key Replication (Read Replicas)
Replicate the hot key to multiple Redis nodes: product:iphone15:r0 through product:iphone15:r4, all containing the same data. On each read, the application appends a random suffix (r0–r4) to pick a replica, so with 5 replicas each node handles roughly 1/5th of the traffic. The trade-off: every update must be written to all 5 replica keys.
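The suffix scheme is a few lines; key names and replica count below are illustrative:

```python
import random

N_REPLICAS = 5

def replica_key(base):
    """Reads pick a random replica, spreading load 1/N per node."""
    return f"{base}:r{random.randrange(N_REPLICAS)}"

def write_all_replicas(cache, base, value):
    """Writes must fan out to every replica to keep them consistent."""
    for i in range(N_REPLICAS):
        cache[f"{base}:r{i}"] = value
```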

Solution 3: Key Splitting (Sharding a Single Key)
Split the hot key's data across multiple keys: product:iphone15:0 through product:iphone15:9. Each sub-key maps to a different Redis node via consistent hashing. The application randomly picks a sub-key for each read. Trade-off: writes must update all 10 sub-keys.

| Solution | Read Reduction | Write Impact | Staleness | Complexity | Best For |
| --- | --- | --- | --- | --- | --- |
| Local LRU Cache | ~99.9% | None | 5–10 seconds | Low | Default solution for all hot keys |
| Key Replication | 1/N per replica | N writes per update | Milliseconds | Medium | Known hot keys with fast updates |
| Key Splitting | 1/N per shard | N writes per update | Milliseconds | High | Extreme traffic (1M+ RPS) |
| Rate Limiting | Controlled | None | None | Low | Protect Redis from abuse |
Instagram's Hot Key Detection

When a key exceeds 10K reads/sec, it is automatically promoted to a 'hot key list' that gets cached locally on every app server with a 5-second TTL. The local cache is an in-process Python dict with LRU eviction (max 1,000 keys). This happens transparently without any manual configuration.

Interview Tip

Proactively raise the hot key problem for any system with caching. Say: "One concern I'd flag is hot keys — during a flash sale, a single product key could get millions of reads per second on one Redis node. I'd add a local in-process cache with a 5-second TTL on each app server to absorb 99% of that traffic." Raising problems before the interviewer asks shows senior-level thinking.

Topic 4

CDN Architecture Deep Dive


Inside a CDN Edge: What Happens at Each Step

A CDN (Content Delivery Network) is a globally distributed network of edge servers (Points of Presence or PoPs) that cache content close to users. A CDN PoP handles TLS termination, security filtering, caching, compression, and intelligent routing — all in the 5–10ms between receiving a request and returning a response.

Figure 8: CDN PoP internals — Anycast routing, TLS termination, WAF/DDoS filtering, edge cache lookup, and response optimization in sequence

Step-by-Step CDN Request Flow:

  1. DNS Resolution (Anycast): CDN uses Anycast routing so the DNS query returns the IP of the nearest PoP (e.g., Mumbai PoP at 10ms vs. Virginia origin at 150ms).
  2. TLS Termination at Edge: The PoP decrypts the TLS connection at the edge, eliminating the TLS round-trip to the origin. Saves 40–80ms for users far from origin.
  3. WAF and DDoS Filtering: Web Application Firewall blocks SQL injection, XSS, and known attack patterns. DDoS mitigation absorbs attack traffic at the edge. Cloudflare's edge can absorb multi-terabit attacks.
  4. Edge Cache Lookup: PoP checks local cache. Cache HIT: response returned immediately (~5ms total). Cache MISS: request forwarded to origin. Edge hit rate is typically 60–80% for well-configured sites.
  5. Origin Fetch (on miss): Edge forwards to origin server. Origin response is stored in edge cache (per Cache-Control headers) and returned to user.
  6. Response Optimization: PoP compresses (gzip/Brotli), converts images (WebP), minifies HTML/CSS/JS, and adds security headers (HSTS, CSP).

CDN Cache Invalidation

CDN caches are distributed across hundreds of PoPs worldwide. CDN providers offer a Purge API that invalidates content across all PoPs simultaneously, but propagation takes 5–30 seconds.

Figure 9: CDN invalidation — purge API propagates to all PoPs in 5–30 seconds. Versioned URLs eliminate the need for purging static assets entirely.

Best Practice: Versioned URLs + TTL
Use fingerprinted URLs for all static assets (e.g., style.a3f2b1.css). When the file changes, the filename changes and CDN naturally fetches the new version. Set max-age=31536000 (1 year) for static assets. For dynamic content, use s-maxage=60 to s-maxage=300 with stale-while-revalidate.
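Fingerprinting is typically done by the build tool, but the idea fits in a few lines — hash the file's content and embed a short digest in the name. A sketch (function name and digest length are arbitrary choices, not a standard):

```python
import hashlib

def fingerprinted_name(path: str, content: bytes) -> str:
    """style.css + its bytes -> style.<6-hex-digest>.css.
    Any change to the bytes changes the digest, so the CDN sees a brand-new
    URL and the old copy can be cached for a full year, immutably."""
    digest = hashlib.md5(content).hexdigest()[:6]
    stem, ext = path.rsplit(".", 1)
    return f"{stem}.{digest}.{ext}"
```

Because the URL changes with the content, no purge is ever needed for static assets — a deploy simply references the new filenames.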

| Content Type | CDN Strategy | TTL | Invalidation |
| --- | --- | --- | --- |
| JS / CSS / Fonts | Fingerprinted URL | 1 year | New deploy = new filename |
| Images | Fingerprinted or versioned | 1 year | New upload = new URL |
| Product page HTML | TTL + stale-while-revalidate | 5 min + 1hr stale | Auto-refresh on access |
| API response (public) | Cache-Control: s-maxage | 1–10 min | TTL expiry or Purge API |
| API response (private) | Cache-Control: private, no-store | 0 | Never cached at CDN |
| User-specific data | Never cache at CDN | N/A | Cache-Control: private |
Cloudflare's Architecture: 300+ PoPs

Cloudflare operates 300+ PoPs in 100+ countries. Each PoP runs: Anycast BGP routing, TLS termination (with 0-RTT session resumption), a Lua-based WAF, a tiered cache (RAM → SSD → origin), Argo Smart Routing, and Workers (serverless functions at the edge). A typical Cloudflare-served request completes in under 10ms globally.

Putting It All Together

Production Caching Architecture

In a production system, caching is a pipeline of progressively more expensive lookups. Each layer catches a fraction of requests, dramatically reducing load on layers below. Only ~2% of all read requests ever reach the database.

Figure 10: Complete production caching architecture — browser → CDN → reverse proxy → local cache → Redis → database. Each layer's hit rate, latency, and TTL strategy.
| Layer | Technology | Hit Rate | Latency | TTL Strategy | Eviction |
| --- | --- | --- | --- | --- | --- |
| Browser | HTTP Cache-Control | ~50% | 0ms | max-age per content type | N/A |
| CDN Edge | Cloudflare / CloudFront | ~25% | 1–5ms | s-maxage + stale-while-revalidate | TTL + purge API |
| Reverse Proxy | Nginx proxy_cache | ~10% | 1–2ms | proxy_cache_valid | LRU (keys_zone) |
| Local Cache | In-process LRU (dict) | ~5% | <0.01ms | 5–10 seconds (short!) | LRU (fixed size) |
| Redis | Redis Cluster | ~8% | 0.5–1ms | 1 hour (data-dependent) | allkeys-lru |
| Database | PostgreSQL buffer pool | ~2% | 5–50ms | Automatic | Automatic (LRU pages) |

Class Summary

Four Topics, One Framework

Cache-Aside vs Write-Through: Cache-aside is the default for 90% of systems (lazy loading, delete on write). Write-through is for consistency-critical data (synchronous double write). Write-behind is for write-heavy metrics (async flush). Always delete cache keys on write, never update — avoid race conditions.

Cache Invalidation: Five strategies: TTL expiry (simplest), delete-on-write (most common), event-driven (best for microservices), version stamping (no explicit deletion), stale-while-revalidate (best UX). Combine TTL + delete-on-write as the production default.

Cache Stampede: Occurs when a popular key expires and thousands of requests hit DB simultaneously. Best solutions: lock + single refill (most effective), jittered TTL (prevents mass expiry), stale-while-revalidate (best UX). Always add jitter to TTLs in production.

Hot Key Problem: One key overwhelms a Redis node with extreme read traffic. Default solution: two-tier cache with local in-process LRU (5s TTL) in front of Redis — reduces Redis load by 99.9%. Also consider key replication and key splitting for extreme cases (>1M RPS).

CDN Architecture: Edge PoPs handle Anycast DNS, TLS termination, WAF/DDoS, caching, and compression. Use fingerprinted URLs for static assets (1-year TTL), s-maxage for dynamic content, and Cache-Control: private for user-specific data. CDN offloads 60–80% of origin traffic and absorbs DDoS attacks.
