Part 1

Complete Caching Quiz — 30 Questions

30 questions covering all Class 6 caching topics — target 24+ correct (80%+) to be interview-ready
Q1 In cache-aside, when data is written to the database, the application should:
  • A Update the cache with the new value
  • B Delete the cache key and let the next read repopulate
  • C Do nothing to the cache
  • D Write to the cache first, then the database
Correct Answer: B

In cache-aside, on write: update DB, then DELETE the cache key. The next read triggers a cache miss and repopulates from DB. Deleting avoids race conditions from concurrent updates where two threads could leave stale data in the cache.
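The read and write paths above can be sketched in a few lines. This is a minimal illustration: plain dicts stand in for Redis and the database, where production code would use a redis-py client and a SQL driver.

```python
# Cache-aside sketch: dicts stand in for Redis and the database.
cache = {}
db = {"user:1": {"name": "Alice"}}

def get_user(key):
    if key in cache:                 # 1. check the cache first
        return cache[key]
    value = db[key]                  # 2. on miss, read the source of truth
    cache[key] = value               # 3. repopulate the cache
    return value

def update_user(key, value):
    db[key] = value                  # 1. write the database first
    cache.pop(key, None)             # 2. DELETE the cache key (do not update it)
```

The next `get_user` after an update misses the cache and repopulates it with the fresh value.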

Q2 Write-through caching is best for:
  • A Write-heavy analytics workloads
  • B Data that must never be stale (permissions, config)
  • C Reducing write latency
  • D Systems with no caching at all
Correct Answer: B

Write-through writes to cache + DB synchronously, ensuring the cache is always consistent. Best for data that must never be stale: user permissions, config, feature flags.

Q3 Write-behind (write-back) caching risks:
  • A Slower read performance
  • B Data loss if the cache crashes before flushing to the database
  • C Higher read latency
  • D Increased database load
Correct Answer: B

Write-behind writes to cache first and flushes to DB asynchronously. If the cache crashes before flushing, those writes are lost. Only use for data where brief loss is acceptable (view counts, metrics).

Q4 Which caching pattern is used by approximately 90% of production systems?
  • A Write-through
  • B Write-behind
  • C Cache-aside (lazy loading)
  • D Refresh-ahead
Correct Answer: C

Cache-aside (lazy loading) is used by ~90% of production systems. The app checks cache, on miss queries DB and populates cache. Simple, effective, and fault-tolerant (cache failure just means more DB reads).

Q5 Redis is approximately how much faster than a PostgreSQL disk read?
  • A 2x faster
  • B 10x faster
  • C 100x faster
  • D 10,000x faster
Correct Answer: D

A Redis read is served from RAM: a memory access takes roughly 100 nanoseconds, while a disk seek takes 1-10ms, which is where the 10,000x to 100,000x figure comes from. End-to-end over a network, a Redis GET is ~0.5ms versus ~5ms (SSD) to 50ms+ (cold data) for PostgreSQL. This is why caching is the #1 performance optimization in system design.

Q6 In cache-aside, why is DELETE preferred over UPDATE for the cache on writes?
  • A DELETE is faster than UPDATE
  • B DELETE avoids race conditions where concurrent updates leave stale data
  • C UPDATE is not supported by Redis
  • D There is no difference
Correct Answer: B

With concurrent writes, UPDATE can result in stale data if writes execute out of order: Thread A writes DB (v1), Thread B writes DB (v2), Thread B updates cache (v2), Thread A updates cache (v1 — stale!). DELETE avoids this: the next read always gets the latest from DB.

Q7 A 95% cache hit rate with 10,000 requests/sec means the database receives:
  • A 10,000 requests/sec
  • B 9,500 requests/sec
  • C 500 requests/sec
  • D 0 requests/sec
Correct Answer: C

95% hit rate means 95% of 10,000 = 9,500 served by cache. Only 5% (500) reach the database. This 20x reduction is why caching is transformative for database scaling.

Q8 Refresh-ahead caching proactively refreshes cache entries:
  • A After they expire
  • B Before they expire, preventing any cache misses
  • C Only when the database changes
  • D Never — it relies on TTL only
Correct Answer: B

Refresh-ahead proactively refreshes cache entries before they expire, based on predicted access patterns. This eliminates cache misses entirely for predictable workloads (e.g., trending content refreshed every 30s).

Q9 TTL-based cache invalidation means:
  • A The cache key is deleted when the database changes
  • B The cache key auto-deletes after a configured time period
  • C The cache never expires
  • D The cache is deleted when memory is full
Correct Answer: B

TTL (Time-To-Live) sets an automatic expiry on cache keys. After the TTL, the key is deleted. The stale window equals the TTL. Simplest invalidation strategy — no code needed beyond setting TTL at write time.

Q10 Event-driven cache invalidation using Kafka is best for:
  • A Single-service applications
  • B Microservices where multiple services cache the same data
  • C Systems with no message queue
  • D Reducing cache hit rates
Correct Answer: B

In microservices, multiple services cache overlapping data. Event-driven invalidation via Kafka lets each service manage its own cache independently — the writing service publishes an event, each caching service handles its own invalidation. Fully decoupled.

Q11 LRU eviction removes the key that was:
  • A Inserted first (oldest)
  • B Accessed least frequently overall
  • C Accessed least recently (longest time since last access)
  • D Randomly selected
Correct Answer: C

LRU = Least Recently Used. It evicts the key that has not been accessed for the longest time. Based on temporal locality — recently used data is likely to be used again soon.
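LRU is straightforward to implement on top of an ordered map. A minimal sketch (Redis uses an approximated LRU internally, but the semantics are the same):

```python
from collections import OrderedDict

class LRUCache:
    """Evicts the entry that has gone longest without being accessed."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()          # insertion order = recency order

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)         # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used
```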

Q12 LFU eviction is better than LRU when:
  • A All keys are accessed equally
  • B There is a stable set of very popular keys that must survive eviction
  • C The cache is very small
  • D Keys are never accessed more than once
Correct Answer: B

LFU preserves frequently accessed keys even if they have not been accessed recently. A product viewed 100K times survives eviction over a product viewed twice yesterday. Better for stable hot datasets.

Q13 The recommended Redis eviction policy for production is:
  • A noeviction (return errors when full)
  • B allkeys-lru
  • C volatile-random
  • D allkeys-random
Correct Answer: B

allkeys-lru evicts the least recently used key across all keys when memory is full. noeviction (the Redis default!) returns errors when full — dangerous in production. Always explicitly configure your eviction policy.

Q14 Stale-while-revalidate serves:
  • A An error while fetching fresh data
  • B The stale cached value immediately while refreshing in the background
  • C Only fresh data, blocking until the database responds
  • D Random data from the cache
Correct Answer: B

Stale-while-revalidate immediately returns the stale cached value (zero latency for user) and refreshes in the background. The user gets an instant response, and the next request gets fresh data. Best user experience for latency-sensitive content.

Q15 Cache-Control: no-cache means:
  • A Never store this response in any cache
  • B The response can be cached but must be revalidated before serving
  • C Cache forever with no expiry
  • D Only the CDN can cache this response
Correct Answer: B

no-cache does NOT mean 'don't cache.' It means 'you can cache, but must revalidate with the origin before serving.' To actually prevent caching, use no-store. This is one of the most common interview misconceptions.

Q16 A cache stampede occurs when:
  • A The cache is too small
  • B A popular key expires and thousands of requests hit the database simultaneously
  • C Redis crashes
  • D Too many keys are added to the cache
Correct Answer: B

Cache stampede = popular key expires + thousands of concurrent requests all miss cache + all query DB simultaneously = DB overloaded. Different from hot key (key exists but overwhelms one node).

Q17 The most effective solution for cache stampede is:
  • A Increasing the TTL to infinity
  • B Lock + single refill: one request repopulates cache while others wait
  • C Removing the cache entirely
  • D Using a bigger database
Correct Answer: B

Lock + single refill: first request acquires a distributed lock (SETNX), queries DB, repopulates cache. Others wait ~20ms and retry from cache. Result: 1 DB query instead of thousands.
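A sketch of lock + single refill, assuming a SETNX-style lock. The lock here is simulated with a guarded set so the example runs without a Redis server; in production the `setnx`/`unlock` pair would be `SET lock:key 1 NX EX 10` and `DEL` via redis-py.

```python
import threading
import time

cache = {}
db_queries = []                      # records every "database" hit
_locks = set()
_guard = threading.Lock()            # makes the fake SETNX atomic

def setnx(lock_key):
    """Stand-in for Redis SET key 1 NX EX 10: True if the lock was acquired."""
    with _guard:
        if lock_key in _locks:
            return False
        _locks.add(lock_key)
        return True

def unlock(lock_key):
    with _guard:
        _locks.discard(lock_key)

def get_with_single_refill(key, retries=200, wait=0.005):
    if key in cache:
        return cache[key]
    if setnx("lock:" + key):
        try:
            if key not in cache:             # double-check after winning the lock
                db_queries.append(key)       # only the lock holder queries the DB
                time.sleep(0.01)             # simulate DB latency
                cache[key] = f"value-for-{key}"
        finally:
            unlock("lock:" + key)
        return cache[key]
    for _ in range(retries):                 # losers wait briefly, then retry cache
        time.sleep(wait)
        if key in cache:
            return cache[key]
    raise TimeoutError(f"cache never refilled for {key}")
```

With 20 concurrent requests for an expired key, exactly one reaches the database; the rest are served from the refilled cache.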

Q18 Adding random jitter to TTL (e.g., TTL = 3600 + random(0,300)) prevents:
  • A Hot key problems
  • B Mass key expiry at the same time (reducing stampede risk)
  • C Cache misses entirely
  • D Database writes
Correct Answer: B

Jittered TTL prevents mass expiry: instead of all keys expiring at exactly 3600s, they expire at 3600+random(0,300)s. This spreads expirations over 5 minutes, preventing simultaneous stampedes.
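Jitter is a one-liner to apply at write time:

```python
import random

BASE_TTL = 3600          # 1 hour
JITTER = 300             # spread expirations over an extra 5 minutes

def jittered_ttl():
    return BASE_TTL + random.randint(0, JITTER)

# With redis-py this would be used as: r.set(key, value, ex=jittered_ttl())
```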

Q19 The hot key problem is different from cache stampede because:
  • A Hot key: the key exists but overwhelms one Redis node. Stampede: the key expired.
  • B They are the same problem
  • C Hot key only affects databases, not caches
  • D Stampede only happens with CDNs
Correct Answer: A

Hot key: key EXISTS in cache but gets so many reads it overwhelms the Redis node. Stampede: key EXPIRED and many requests simultaneously miss cache and hit DB. Different problems, different solutions.

Q20 The best solution for a hot key getting 1M reads/sec is:
  • A Increase Redis memory
  • B Add a local in-process LRU cache (5s TTL) on each app server
  • C Delete the key from cache
  • D Increase the TTL to 24 hours
Correct Answer: B

Local in-process LRU with short TTL (5s) on each app server. 95%+ of reads served from local memory at 0.01ms. Redis sees only 1 miss per server per 5 seconds instead of millions of reads.
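A minimal two-tier sketch: a tiny in-process TTL cache in front of a stubbed Redis tier that counts how many reads actually reach it. In production the local tier might be `cachetools.TTLCache` (Python) or Caffeine (Java), and `redis_get` would be a real client call.

```python
import time

class LocalTTLCache:
    """Minimal in-process cache with per-entry expiry (LRU eviction omitted)."""
    def __init__(self, ttl_seconds=5.0):
        self.ttl = ttl_seconds
        self.store = {}                       # key -> (value, expires_at)

    def get(self, key):
        entry = self.store.get(key)
        if entry and time.monotonic() < entry[1]:
            return entry[0]
        return None

    def set(self, key, value):
        self.store[key] = (value, time.monotonic() + self.ttl)

redis_reads = []                              # counts hits on the shared Redis tier
def redis_get(key):                           # stand-in for a redis-py GET
    redis_reads.append(key)
    return f"value-for-{key}"

local = LocalTTLCache(ttl_seconds=5.0)

def get_hot_key(key):
    value = local.get(key)
    if value is None:                         # local miss: fall through to Redis
        value = redis_get(key)
        local.set(key, value)
    return value
```

Ten thousand reads of a hot key within one TTL window produce a single Redis read per app server.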

Q21 With two-tier caching (local + Redis) for a hot key, if you have 10 app servers and a 5-second local TTL, Redis receives approximately:
  • A 1M reads/sec (no reduction)
  • B 100K reads/sec
  • C ~2 reads/sec (10 servers × 1 miss per 5 seconds)
  • D 0 reads/sec
Correct Answer: C

10 servers, each caching locally for 5 seconds. Each server misses once every 5 seconds. 10 × (1/5) = 2 Redis reads/sec. Down from 1,000,000 reads/sec — a 500,000x reduction in Redis load.

Q22 Key splitting (e.g., key_0, key_1, ..., key_9) solves hot keys by:
  • A Making the key shorter
  • B Distributing reads across multiple Redis nodes via different hash slots
  • C Encrypting the key
  • D Deleting the key faster
Correct Answer: B

Splitting key into key_0 through key_9 distributes these sub-keys across different Redis hash slots (and therefore different nodes). Reads are spread across 10 nodes instead of hammering one.

Q23 A CDN reduces latency primarily by:
  • A Compressing data to 1% of original size
  • B Serving cached content from edge servers geographically close to users
  • C Upgrading the user's internet connection
  • D Running faster database queries
Correct Answer: B

CDNs cache content at edge servers in 300+ global locations. A user in Delhi gets content from Mumbai (10ms) instead of Virginia (150ms). The geographic proximity is the primary latency benefit.

Q24 A Pull CDN fetches content from the origin:
  • A Before any user requests it
  • B On the first user request (cache miss), then caches for subsequent requests
  • C Never — content must be manually uploaded
  • D Only when the origin server crashes
Correct Answer: B

Pull CDN fetches from origin on the first request (cache miss), caches the response at the edge, and serves subsequent requests from cache. This is the default mode for Cloudflare and CloudFront.

Q25 Fingerprinted URLs (e.g., app.a3f2b1.js) are used with CDNs to:
  • A Encrypt the file content
  • B Ensure cache busting: new content gets a new URL, bypassing stale cache
  • C Reduce file size
  • D Improve SEO ranking
Correct Answer: B

Fingerprinted URLs contain a hash of the file content (app.a3f2b1.js). When the content changes, the hash changes, creating a new URL. The CDN treats it as new content, bypassing any stale cache. This lets you safely set 1-year TTLs on static assets.
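Build tools like webpack and Vite do this automatically; the core idea is just a content hash in the filename. A sketch (the hash length and algorithm are illustrative choices):

```python
import hashlib

def fingerprint_url(filename: str, content: bytes) -> str:
    """Embed a short content hash in the filename: app.js -> app.<hash>.js."""
    digest = hashlib.sha256(content).hexdigest()[:6]
    base, _, ext = filename.rpartition(".")
    return f"{base}.{digest}.{ext}"
```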

Q26 Cache-Control: private means:
  • A No one can cache this response
  • B Only the user's browser can cache it, not CDN or shared caches
  • C Only the CDN can cache it
  • D The response requires a password
Correct Answer: B

private means only the end user's browser can cache this response. CDN, proxy servers, and shared caches must NOT cache it. Use for user-specific data (dashboards, profiles, account pages).

Q27 ETag headers help caching by:
  • A Encrypting the response
  • B Allowing the server to return 304 Not Modified if content has not changed
  • C Setting the TTL automatically
  • D Blocking DDoS attacks
Correct Answer: B

ETag is a content fingerprint. On subsequent requests, the client sends If-None-Match: {etag}. If content has not changed, the server returns 304 Not Modified (no body), saving bandwidth while still validating freshness.
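The server-side half of this exchange can be sketched as follows. This is a simplified model, not a full HTTP implementation: real frameworks also handle weak validators and `Vary`.

```python
import hashlib

def make_etag(body: bytes) -> str:
    # Quoted hash of the response body, per the ETag header format
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def handle_get(body: bytes, if_none_match=None):
    """Return (status, body, etag). A 304 carries no body: the client reuses its cached copy."""
    etag = make_etag(body)
    if if_none_match == etag:
        return 304, None, etag
    return 200, body, etag
```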

Q28 In a complete caching architecture, what percentage of requests typically reach the database?
  • A 50%
  • B 25%
  • C ~2–5% (most served by cache layers above)
  • D 0% (the database is never needed)
Correct Answer: C

With browser cache (~50%), CDN (~25%), proxy cache (~10%), local cache (~5%), and Redis (~8%), approximately 98% of reads are served by caches. Only ~2–5% reach the database.

Q29 Which caching layer has the highest hit rate but the smallest capacity?
  • A CDN edge
  • B Redis cluster
  • C Browser cache
  • D Database buffer pool
Correct Answer: C

Browser cache has the highest hit rate (~50% of all requests) because it serves repeat visits at 0ms with zero server resources. But it is limited to one user's data on one device — smallest capacity scope.

Q30 For user-specific data (dashboards, profiles), the correct Cache-Control header is:
  • A Cache-Control: public, max-age=86400
  • B Cache-Control: private, no-store
  • C Cache-Control: s-maxage=3600
  • D No Cache-Control header needed
Correct Answer: B

User-specific data should use Cache-Control: private (browser only, not CDN) and no-store for sensitive data. public would let CDNs cache personalized content, potentially serving user A's data to user B.

Part 2

Design Cache Layer for Instagram Feed

The Challenge: Serve 600K Feed Requests/Second

Instagram's feed is one of the most demanding caching problems in the industry. Every time a user opens the app, the feed service must assemble a personalized list of 50 posts from hundreds of followed accounts — complete with author info, like counts, comment previews, and a machine-learning-ranked ordering — all within 200 milliseconds. At 500 million daily active users, this translates to approximately 600,000 feed requests per second at peak. Without caching, this would require millions of database queries per second, which is physically impossible.

Figure 1: Instagram feed — each request assembles 50 posts, each with ~10 data fields, at 600K requests/sec peak
Step 1: Feed Generation Strategy

The first design decision is how feeds are generated. There are two fundamental approaches: fan-out on write (pre-compute the feed when a user posts) and fan-out on read (compute the feed when a user opens the app). Instagram uses a hybrid of both.

Figure 2: Fan-out on write pushes post IDs to followers at write time. Fan-out on read computes the feed at request time. Each has different trade-offs.

Fan-Out on Write (Push Model): When Alice posts a photo, a fan-out service immediately pushes Alice's post ID into the cached feed of every one of Alice's followers. When Bob opens Instagram, his feed is already pre-computed in Redis — just read and return. Reads are extremely fast (one Redis read) but writes are expensive for users with many followers. If Alice has 10,000 followers, her single post triggers 10,000 cache writes.

Fan-Out on Read (Pull Model): When Bob opens Instagram, the feed service fetches the latest posts from each account Bob follows, merges them, ranks them, and returns the top 50. No pre-computation, no wasted work. But reads are slower: if Bob follows 500 accounts, the service must fetch from 500 post lists, merge, and rank in real-time.

Figure 3: Hybrid approach — fan-out on write for normal users (<10K followers), fan-out on read for celebrities (>10K followers), merged at read time
Instagram's Hybrid Approach
  • Normal users (<10K followers): Fan-out on write. Post ID pushed to all followers' feed caches immediately. Bounded fan-out (at most 10K cache writes per post).
  • Celebrities (>10K followers): NO fan-out on write. Posts stored centrally. At read time, the feed service fetches celebrity posts the user follows and merges them in.
  • Read-time merge: Service reads Bob's pre-computed feed (normal-user posts), fetches celebrity posts Bob follows, merges, applies ML ranking, returns top 50. The merge adds ~5ms — worthwhile to avoid writing to millions of caches per celebrity post.
Interview Tip: Always Mention the Hybrid

'I use fan-out on write for normal users because it gives O(1) read time from cache. For celebrity users with millions of followers, I skip fan-out and fetch their posts at read time to avoid writing to millions of caches. The feed service merges both at read time.' This shows you understand the celebrity problem and its solution.

Step 2: Cache Architecture
Figure 4: Complete feed cache architecture — CDN for images, three separate Redis clusters (Feed, Post, User), PostgreSQL and Cassandra as sources of truth

The feed cache architecture uses three separate Redis clusters, each optimized for a different data type:

  • Feed Cache: Stores the pre-computed list of post IDs per user (feed:{user_id})
  • Post Cache: Stores post metadata — caption, image URL, timestamp (post:{post_id})
  • User Cache: Stores author profiles — name, avatar, verified badge (user:{user_id})

This separation allows independent scaling, TTL tuning, and eviction policies per data type. The feed cache needs aggressive LRU eviction and large capacity. The post cache needs a medium TTL. The user cache needs event-driven invalidation on profile updates.

Step 3: Feed Load Flow
Figure 5: Six-step feed load — MGET for batch fetching drives the 10ms total latency, well within the 200ms target

When Bob opens Instagram, the feed service executes six steps:

  1. Read feed:bob from Feed Cache to get 200 pre-computed post IDs (<1ms).
  2. Slice the top 50 for this page using Bob's pagination cursor (<0.1ms).
  3. MGET all 50 post details from Post Cache in a single Redis round-trip (~1ms, ~45 cache hits).
  4. Fetch the ~5 cache-miss posts from PostgreSQL (~5ms).
  5. MGET unique author profiles from User Cache (~0.5ms).
  6. Merge, rank with ML model, and return JSON (~2ms).

Total: approximately 10ms — well within the 200ms target.
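The six steps can be sketched end to end. Dicts stand in for the three Redis clusters, and step 4 (the PostgreSQL fallback for cache misses) plus the ML ranking are omitted to keep the sketch short:

```python
# Stand-ins for the Feed, Post, and User Redis clusters
feed_cache = {"feed:bob": [f"p{i}" for i in range(200)]}
post_cache = {f"p{i}": {"id": f"p{i}", "author_id": "u1"} for i in range(200)}
user_cache = {"u1": {"name": "Alice", "verified": True}}

def load_feed(user, cursor=0, page_size=50):
    post_ids = feed_cache[f"feed:{user}"]                   # 1. cached ID list
    page_ids = post_ids[cursor:cursor + page_size]          # 2. slice for pagination
    posts = [post_cache[pid] for pid in page_ids]           # 3. batch fetch (MGET in Redis)
    author_ids = {p["author_id"] for p in posts}
    authors = {uid: user_cache[uid] for uid in author_ids}  # 5. batch fetch authors
    return [{**p, "author": authors[p["author_id"]]} for p in posts]  # 6. assemble
```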

The Power of MGET

Redis MGET fetches multiple keys in a single round-trip. Instead of 50 individual GET commands (50 round-trips at 0.5ms each = 25ms), one MGET retrieves all 50 posts in a single 1ms round-trip. That is a 25x latency improvement. Always use MGET/MSET for batch operations in production Redis.
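The round-trip arithmetic can be made concrete with a counter standing in for the network: each call to the "server" costs one round trip, regardless of how many keys it carries.

```python
store = {f"post:{i}": f"data-{i}" for i in range(50)}
round_trips = 0

def single_get(key):
    global round_trips
    round_trips += 1                 # one round trip per key
    return store.get(key)

def mget(keys):
    global round_trips
    round_trips += 1                 # one round trip regardless of key count
    return [store.get(k) for k in keys]

keys = [f"post:{i}" for i in range(50)]
naive = [single_get(k) for k in keys]   # 50 round trips
trips_naive = round_trips
round_trips = 0
batched = mget(keys)                    # 1 round trip, same data
```

At ~0.5ms per round trip, that is 25ms versus 1ms for the same 50 posts.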

Step 4: Cache Key Design
Figure 6: Cache key patterns with TTLs and invalidation triggers for each data type in Instagram's feed system
| Key Pattern | Value | TTL | Invalidation Trigger |
| --- | --- | --- | --- |
| feed:{user_id} | List of 200 post IDs | 24 hours | New post from followee (LPUSH) |
| post:{post_id} | JSON: caption, image_url, timestamp | 1 hour | Post edited/deleted (DEL) |
| user:{user_id} | JSON: name, avatar, is_verified | 1 hour | Profile updated (DEL) |
| likes:{post_id}:{user_id} | 1 or 0 (boolean) | 6 hours | User likes/unlikes (SET/DEL) |
| counts:{post_id} | JSON: like_count, comment_count | 30 seconds | Any like/comment (INCR or TTL refresh) |
| story:{user_id} | List of active story IDs | 30 minutes | New story posted / story expires |
Why counts Need a Short TTL

Like and comment counts change every second on popular posts. A 1-hour TTL would show stale counts. Use a 30-second TTL for counts, or implement a pub/sub approach where counts are updated in real-time via Redis INCR. For the exact count shown on a post detail page, always read from the database.

The feed cache stores only post IDs (not full post data) because post IDs are tiny (~8 bytes each) while full post data is large (~500 bytes). Storing 200 post IDs per user costs 1.6 KB per user. With 500 million users, the feed cache needs approximately 800 GB — achievable with a 50-node Redis cluster at 16 GB per node.
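The sizing arithmetic from the paragraph above, written out:

```python
POST_ID_BYTES = 8            # a post ID is a small integer
IDS_PER_USER = 200           # cached feed length per user
USERS = 500_000_000          # daily active users
NODE_GB = 16                 # memory per Redis node

bytes_per_user = POST_ID_BYTES * IDS_PER_USER       # 1,600 bytes = 1.6 KB
total_gb = USERS * bytes_per_user / 1e9             # 800 GB for the feed cache
nodes = total_gb / NODE_GB                          # 50 Redis nodes
```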

Step 5: Cache Invalidation
Figure 7: Five events that trigger cache invalidation — all routed through Kafka for async, decoupled invalidation

All cache invalidation is event-driven via Kafka. When an action occurs (new post, like, delete, profile update, unfollow), the responsible service publishes an event to Kafka. Cache Invalidation Workers consume these events and perform the appropriate Redis operations. This decouples the action from the cache update — the write path is never slowed down by cache operations.

| Event | Kafka Topic | Cache Operation | Latency Impact |
| --- | --- | --- | --- |
| New post (normal user) | post.created | LPUSH to feed:{each_follower} + LTRIM to 200 | Async, ~100ms for 10K followers |
| New post (celebrity) | post.created | Store in posts table only (no fan-out) | ~1ms (no cache write) |
| Post liked | post.liked | INCR counts:{post_id}:likes + SET likes:{post}:{user} | Async, <1ms |
| Post deleted | post.deleted | DEL post:{id} + LREM from affected feeds | Async, ~50ms |
| Profile updated | user.updated | DEL user:{user_id} | Async, <1ms |
| Unfollowed | user.unfollowed | Remove author's posts from feed:{follower} | Async, ~10ms |
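The invalidation worker reduces to a dispatch from event topic to Redis operation. A sketch where a dict stands in for Redis and events are plain dicts rather than Kafka messages (a real worker would consume via a Kafka client such as confluent-kafka):

```python
# Dict stands in for Redis; events are plain dicts instead of Kafka messages.
cache = {
    "post:42": {"caption": "old"},
    "user:7": {"name": "Old Name"},
    "counts:42": {"likes": 10},
}

def handle_event(event):
    topic = event["topic"]
    if topic == "post.deleted":
        cache.pop(f"post:{event['post_id']}", None)           # DEL post:{id}
    elif topic == "user.updated":
        cache.pop(f"user:{event['user_id']}", None)           # DEL user:{id}
    elif topic == "post.liked":
        counts = cache.setdefault(f"counts:{event['post_id']}", {"likes": 0})
        counts["likes"] += 1                                  # INCR the like count
```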
Step 6: Hot Key Handling
Figure 8: Three Instagram hot key scenarios and their solutions — celebrity posts, trending hashtags, and global config each require a different approach

Instagram's feed has three hot key scenarios:

  • Celebrity posts: A viral post from a 100M-follower account generates millions of reads to post:{id}. Solution: local LRU cache on every feed service instance with 5-second TTL reduces Redis load by 99.9%.
  • Trending hashtags: Generate 500K reads/sec to trending:reels. Solution: replicate the key across 10 Redis read replicas.
  • Global config: Feature flags and the ML model version are read on every single request. Solution: local cache refreshed every 10 seconds — zero Redis reads in steady state.
Interview Tip: Hot Key Is Your Differentiator

Most candidates mention caching. Few mention hot keys. Proactively say: 'For celebrity posts that go viral, post:{id} could receive 1M reads/sec. I add a local in-process LRU with 5s TTL on every feed service instance. Redis sees 2 reads/sec instead of 1M. For global config read on every request, I cache locally with 10s refresh — zero Redis overhead.' This shows production-level thinking.

Step 7: Design Checklist
Figure 9: 12-point design checklist — use this framework to validate any feed cache design in an interview
| Aspect | Design Decision | Why |
| --- | --- | --- |
| Feed Strategy | Hybrid: push (normal) + pull (celebrity) | Avoids writing to 10M caches per celebrity post |
| Feed Cache | Redis: feed:{user_id} = List<post_id> | Stores 200 IDs per user, 1.6 KB each. 50-node cluster. |
| Post Cache | Redis: post:{post_id} = JSON | Denormalized metadata. MGET for batch fetch. |
| User Cache | Redis: user:{user_id} = JSON | Author profile denormalized. 1-hour TTL. |
| Count Cache | Redis: counts:{post_id} = JSON | 30-second TTL. INCR for likes. Short TTL for freshness. |
| Invalidation | Kafka events → Invalidation Workers | Async, decoupled. No write-path latency impact. |
| Hot Keys | Local LRU (5s) for viral posts + config | 99.9% read reduction on hot keys. |
| Stampede | Lock + single refill on feed:{id} | Prevents 10K DB queries when popular feed expires. |
| CDN | Images only. NOT feed API responses. | Feed is personalized — cannot cache at CDN. |
| Pagination | Cursor-based on cached ID list | Client sends cursor, server slices from cache. |
| Eviction | allkeys-lru on all Redis clusters | Best general-purpose eviction for web workloads. |
| Scale | 50+ Redis nodes, consistent hashing | ~800GB for feed cache, ~200GB for post cache. |
This Template Applies to Any Feed System

The Instagram feed cache design is a template for Twitter timelines, Facebook news feeds, LinkedIn feeds, TikTok For-You pages, and any content discovery system. The patterns are identical: hybrid fan-out, multi-key Redis caching (feed IDs + entity cache), MGET for batch reads, event-driven invalidation via Kafka, local LRU for hot keys, and cursor-based pagination. Master this design and you can apply it to any feed-based interview question.
