System Design · India 2026

System Design Interview Patterns for Indian Product Companies 2026: HLD, LLD, Scalability & Top Questions

Pranjal Jain · Ex-Microsoft, IIT Kanpur · May 27, 2026 · 22 min read · SDE2 / SDE3 / Senior

System design is the single biggest differentiator between SDE1 and SDE2 offers at Indian product companies. DSA gets you in the door; system design determines your level and compensation. Yet most engineers prepare for it the wrong way — memorizing "design Twitter" answers without understanding the underlying patterns that repeat across every question.

This guide teaches you the 12 core patterns that appear in roughly 90% of system design questions at product companies hiring in India: Google, Amazon, Flipkart, Swiggy, Razorpay, Zomato, and others. Learn these patterns, and you can reason about any system, not just the ones you memorized.

12: Core patterns that cover 90% of questions
SDE2+: Level where system design is mandatory
45 min: Typical system design round duration
7 steps: Universal HLD framework

1. The Universal System Design Framework (Use This Every Time)

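Before learning patterns, internalize this 7-step framework. Every system design interview, regardless of company or problem, should follow this structure. It shows the interviewer you think like a senior engineer: requirements first, implementation last. Treat the time budgets below as upper bounds: they add up to 53 minutes, so in a typical 45-minute round trim the estimation, API, and data-model steps to fit.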

Step 1. Clarify Requirements (5 min)

Functional requirements: what must the system do? Non-functional: scale (DAU, QPS), latency SLAs, availability target (99.9% vs 99.999%). Ask: "Is this read-heavy or write-heavy? Strong or eventual consistency?"

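Step 2. Capacity Estimation (3 min)

Back-of-envelope: average QPS = DAU × requests per user per day ÷ 86,400 (seconds in a day); peak traffic is usually a multiple of that average, so size for the peak. Storage = new entities per day × size per entity × retention in days. Bandwidth = QPS × payload size. These numbers drive every architectural decision.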
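
To make the arithmetic concrete, here is a minimal sketch of the three formulas as a Python helper. The function name, the input numbers, and the `peak_factor` multiplier are illustrative assumptions, not figures from any real system.

```python
SECONDS_PER_DAY = 86_400

def estimate_capacity(dau, requests_per_user_per_day, payload_bytes,
                      new_entities_per_day, entity_bytes, retention_days,
                      peak_factor=2.0):
    """Return rough QPS, bandwidth, and storage figures for a design round.

    peak_factor is an assumed rule-of-thumb multiplier, not a measured value.
    """
    avg_qps = dau * requests_per_user_per_day / SECONDS_PER_DAY
    bandwidth_bytes_per_sec = avg_qps * payload_bytes
    storage_bytes = new_entities_per_day * entity_bytes * retention_days
    return {
        "avg_qps": avg_qps,
        "peak_qps": avg_qps * peak_factor,
        "bandwidth_mb_per_sec": bandwidth_bytes_per_sec / 1e6,
        "storage_tb": storage_bytes / 1e12,
    }

# Hypothetical inputs: 10M DAU, 20 requests per user per day, 2 KB responses,
# 5M new 1 KB records per day kept for one year.
estimate = estimate_capacity(
    dau=10_000_000, requests_per_user_per_day=20, payload_bytes=2_000,
    new_entities_per_day=5_000_000, entity_bytes=1_000, retention_days=365,
)
for name, value in estimate.items():
    print(f"{name}: {value:,.2f}")
```

With these inputs the average load is a little over 2,300 QPS and a year of data is under 2 TB, which already suggests a single well-provisioned primary plus a cache could be a reasonable starting point.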

Step 3. Define Core APIs (5 min)

Identify the 3–5 most important endpoints. Define request/response structure. This forces clarity on what the system actually does before you design how it does it.

Step 4. Design the Data Model (5 min)

Identify core entities and relationships. Choose SQL vs NoSQL with justification. Define the primary key and most important indexes. Schema decisions have long-term consequences — make them explicit.

Step 5. High-Level Architecture (15 min)

Draw the system: Client → CDN → Load Balancer → Services → Cache → DB → Message Queue. Explain every component choice. Don't just list technologies — explain why you chose Redis over Memcached, or Kafka over RabbitMQ.

Step 6. Deep Dive (15 min)

The interviewer will guide you to 1–2 specific components to design in detail. This is where you demonstrate pattern depth. Common deep-dives: the caching strategy, the database sharding approach, the message queue design.

Step 7. Trade-offs & Failure Scenarios (5 min)

What breaks first under load? What's your single point of failure? How do you handle a database failure? What's the consistency trade-off you've accepted? Senior engineers think about failure modes — this section shows maturity.

2. The 12 Core System Design Patterns

Pattern 1. Read-Heavy Systems: Caching Strategy
Redis · CDN · Cache Invalidation · 80% read traffic pattern

Most user-facing systems are 80–90% reads. The pattern: put a cache layer in front of the database. Cache-aside (lazy loading) is the most common strategy — check cache first, miss → fetch from DB → populate cache.

  • L1 Cache: In-process memory cache (Guava Cache, Caffeine) — sub-millisecond, but per-instance, inconsistent
  • L2 Cache: Redis cluster — shared across instances, millisecond latency, supports rich data types
  • L3 Cache: CDN (CloudFront, Akamai) — for static assets and API responses that change rarely
  • Cache Invalidation and Write Strategies: TTL (simple, stale risk) | Delete-on-write (remove the key when the DB row changes; the next read repopulates it) | Write-through (consistent, extra write) | Write-behind (async, faster writes, data loss risk)
  • Cache Stampede Problem: Thundering herd on cache miss → use probabilistic early expiration or request coalescing
🎯 Use when asked: Design Twitter timeline, design news feed, design product catalog, design any dashboard with repeated queries
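
The cache-aside flow described above can be sketched in a few lines. This is a minimal illustration, assuming an in-memory `TTLCache` class as a stand-in for Redis and a plain dict as the database; `ProductRepository` and its method names are hypothetical.

```python
import time

class TTLCache:
    """In-memory stand-in for a shared cache such as Redis (assumption)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, self._clock() + ttl_seconds)

    def delete(self, key):
        self._data.pop(key, None)


class ProductRepository:
    """Cache-aside reads, delete-on-write invalidation."""

    def __init__(self, cache, db, ttl_seconds=60):
        self.cache = cache
        self.db = db              # a dict standing in for the database
        self.ttl_seconds = ttl_seconds
        self.db_reads = 0         # counter kept only for demonstration

    def get_product(self, product_id):
        key = f"product:{product_id}"
        cached = self.cache.get(key)
        if cached is not None:
            return cached                     # cache hit
        self.db_reads += 1                    # cache miss: go to the DB
        value = self.db[product_id]
        self.cache.set(key, value, self.ttl_seconds)
        return value

    def update_product(self, product_id, value):
        self.db[product_id] = value                   # write to the DB first
        self.cache.delete(f"product:{product_id}")    # then invalidate


repo = ProductRepository(TTLCache(), {42: {"name": "sneakers", "price": 1999}})
repo.get_product(42)   # miss, reads the DB
repo.get_product(42)   # hit, served from cache
print(repo.db_reads)
```

Deleting the key on write, rather than updating it in place, keeps the sketch simple and avoids writing a value to the cache that a concurrent writer has already superseded.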
Pattern 2. Write-Heavy Systems: Message Queue + Async Processing
Kafka · RabbitMQ · Event-Driven · Fan-out · Backpressure

When writes burst (order placement, payment events, user actions) you can't synchronously write to all downstream systems. Decouple with a message queue: write fast to the queue, process asynchronously.

  • Kafka (preferred for high-throughput, durable, replay): Order events, payment events, analytics pipelines. Retention enables replay for debugging and backfill.
  • RabbitMQ (preferred for task queues, routing complexity): Email/SMS sending, background jobs, flexible message routing with exchanges.
  • Fan-out Pattern: One event → multiple consumers. Example: "order placed" → [inventory service, notification service, analytics service, warehouse service] all consume independently.
  • Backpressure: If consumers are slow, queue grows. Solutions: auto-scaling consumers, circuit breaker on producer side, dead-letter queue for failed messages.
🎯 Use when asked: Design notification system, design order processing, design payment system, design analytics pipeline, design activity feed
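
The fan-out and backpressure ideas above can be illustrated with a toy append-only log in which every consumer group tracks its own offset. This is a sketch of the concept only; `EventLog` is a hypothetical class, not the Kafka client API.

```python
class EventLog:
    """Toy append-only log: each consumer group reads at its own pace."""

    def __init__(self):
        self._events = []
        self._offsets = {}   # consumer group -> next index to read

    def publish(self, event):
        self._events.append(event)
        return len(self._events) - 1   # offset of the new event

    def poll(self, group, max_events=10):
        start = self._offsets.get(group, 0)
        batch = self._events[start:start + max_events]
        self._offsets[group] = start + len(batch)
        return batch

    def lag(self, group):
        """Events published but not yet read by this group."""
        return len(self._events) - self._offsets.get(group, 0)


log = EventLog()
for order_id in (101, 102, 103):
    log.publish({"type": "order_placed", "order_id": order_id})

log.poll("inventory")                    # fast consumer reads everything
log.poll("notifications", max_events=1)  # slow consumer reads one event
print(log.lag("inventory"), log.lag("notifications"), log.lag("analytics"))
```

Consumer lag is the number to watch for backpressure: a group whose lag keeps growing needs more consumers or a slower producer.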
Pattern 3. Horizontal Scaling: Database Sharding
Consistent Hashing · Shard Key · Hotspot Problem · Cross-Shard Queries

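When a single database can't handle the load even after indexing, caching, and read replicas, horizontally partition data across multiple shards. The breaking point depends on hardware and workload: row count alone rarely forces it, while a dataset that no longer fits on one machine or sustained write traffic beyond what one primary can absorb usually does. Each shard holds a subset of the data.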

  • Range-based sharding: Users A–M on shard 1, N–Z on shard 2. Simple but creates hotspots (most users might be A–M).
  • Hash-based sharding: shard = hash(userId) % N. Uniform distribution but rebalancing when adding shards is expensive.
  • Consistent hashing: Virtual ring — adding/removing a shard only affects neighboring shards. Preferred for dynamic scaling.
  • Shard Key Selection: Choose a key with high cardinality and even distribution. Bad shard key = hot shard. E.g., sharding tweets by userId distributes load evenly; sharding by creation date creates temporal hot shards.
  • Cross-shard queries: JOINs across shards are expensive. Denormalize or maintain a global index for cross-shard lookups.
🎯 Use when asked: Design URL shortener with scale, design user service at WhatsApp scale, design DynamoDB, design any system with 100M+ users
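
A consistent hashing ring with virtual nodes fits in about thirty lines. The sketch below is illustrative: the class name, the choice of SHA-256 as the ring hash, and 100 virtual nodes per shard are all assumptions.

```python
import bisect
import hashlib

def ring_position(value: str) -> int:
    """Map a string to a position on the ring (first 8 bytes of SHA-256)."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class ConsistentHashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._ring = []          # sorted list of (position, node)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each physical node owns `vnodes` positions to even out the load.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (ring_position(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def get_node(self, key):
        if not self._ring:
            raise ValueError("ring is empty")
        # First virtual node at or after the key's position, wrapping around.
        idx = bisect.bisect_left(self._ring, (ring_position(key),))
        return self._ring[idx % len(self._ring)][1]


ring = ConsistentHashRing(["shard-a", "shard-b", "shard-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.get_node(k) for k in keys}
ring.add_node("shard-d")
moved = [k for k in keys if ring.get_node(k) != before[k]]
print(f"{len(moved)} of {len(keys)} keys moved")
```

Every key that moves lands on the new shard; the other shards keep the rest of their keys, which is the property that makes rebalancing cheap compared with hash(userId) % N.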
Pattern 4. Rate Limiting
Token Bucket · Sliding Window Counter · Redis · API Gateway

Rate limiting controls how many requests a client can make in a time window. Essential for API protection, fair usage, DDoS mitigation.

  • Token Bucket: Each user gets N tokens/second. Each request consumes 1 token. Tokens accumulate up to a max bucket size. Allows burst traffic within burst limit. Most commonly used.
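  • Sliding Window Log: Store a timestamp per request (sorted set in Redis per user) and count the entries in the last N seconds. More accurate than fixed window, but storage grows with request volume. The Sliding Window Counter variant approximates the same result with two fixed-window counters, weighted by how far the window has slid, and uses far less memory.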
  • Fixed Window Counter: Reset counter every minute. Simple but boundary attack: 100 req in last second of minute 1 + 100 req in first second of minute 2 = 200 req in 2 seconds.
  • Distributed Rate Limiting: Use Redis with atomic INCR + TTL. For multi-node: central Redis cluster acts as the rate limit state store. Lua scripts for atomic check-and-increment.
🎯 Use when asked: Design a rate limiter, design an API gateway, design any service that needs abuse protection
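
Interviewers often ask for the token bucket in code. Here is a minimal single-process sketch; the class name and the injectable `clock` parameter are illustrative choices, and a distributed version would keep the same state in Redis behind an atomic script.

```python
import time

class TokenBucket:
    """Refills at rate_per_sec, holds at most `capacity` tokens."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)   # start full so bursts are allowed
        self._last = clock()

    def allow(self, cost=1):
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Lazy refill: add the tokens earned since the last call, up to the cap.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


# Demonstration with a fake clock so the outcome does not depend on timing.
fake_now = [0.0]
bucket = TokenBucket(rate_per_sec=1, capacity=3, clock=lambda: fake_now[0])
burst = [bucket.allow() for _ in range(4)]      # burst of 4 at t=0
fake_now[0] = 2.0                               # two seconds later
after_wait = [bucket.allow() for _ in range(3)]
print(burst, after_wait)
```

Refilling lazily on each call avoids a background timer per user, which matters when the limiter tracks millions of buckets.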
Pattern 5. Real-Time Systems: WebSockets & Server-Sent Events
WebSocket · SSE · Long Polling · Live Updates · Chat

For live data (chat messages, order tracking, stock prices, collaborative docs), polling every few seconds is wasteful. Use persistent connections for server-push.

  • WebSocket: Full-duplex persistent connection. Client and server can both send messages anytime. Best for: chat, real-time gaming, collaborative editing, live bidding.
  • Server-Sent Events (SSE): Server pushes to client over HTTP. Client can't send back. Best for: live dashboards, order status updates, notification streams. Simpler than WebSocket.
  • Long Polling: Client makes request, server holds it open until data is available, then responds. Client immediately makes another request. Simple fallback when WebSocket isn't available.
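  • Connection management at scale: Each WebSocket connection is a stateful socket. Capacity per server depends on memory and message rate; tuned event-loop servers commonly hold tens of thousands of mostly idle connections, so 1M concurrent connections needs on the order of tens to a hundred servers rather than one server per thousand connections. State your per-server assumption explicitly in the interview. Use a connection service with Redis for routing messages to the right server.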
🎯 Use when asked: Design WhatsApp/Slack chat, design Swiggy order tracking, design collaborative doc editor, design live sports score updates
Pattern 6. Search Systems: Inverted Index & Full-Text Search
Elasticsearch · Inverted Index · TF-IDF · Faceted Search · Geo Search

When users type a query and need to find matching documents across millions of records, traditional SQL LIKE queries don't scale. Use a search engine with an inverted index.

  • Inverted Index: Maps each word → list of documents containing that word. Query "red sneakers" → intersect documents containing "red" and documents containing "sneakers".
  • Elasticsearch: Distributed inverted index built on Lucene. Handles full-text search, faceted filtering (price range, brand), geo-spatial queries, and aggregations.
  • Relevance Ranking: TF-IDF (term frequency × inverse document frequency) as baseline. Layer on: freshness boost, personalisation signals, click-through rates.
  • Synchronization challenge: Primary data lives in MySQL/PostgreSQL. Sync to Elasticsearch via CDC (Change Data Capture) with Debezium → Kafka → Elasticsearch consumer. Eventual consistency is acceptable for search.
  • Spell correction: Edit distance (Levenshtein) for "did you mean?" features. Trie for autocomplete prefix matching.
🎯 Use when asked: Design Flipkart/Amazon product search, design LinkedIn job search, design Google autocomplete, design Twitter trending topics
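
The inverted index itself is a small data structure. The sketch below shows the term → documents mapping and the AND-style intersection described above; the class name and the simple regex tokenizer are assumptions, and real engines add stemming, ranking, and compressed posting lists.

```python
import re
from collections import defaultdict

class InvertedIndex:
    def __init__(self):
        self._postings = defaultdict(set)   # term -> set of doc ids

    @staticmethod
    def tokenize(text):
        return re.findall(r"[a-z0-9]+", text.lower())

    def add(self, doc_id, text):
        for term in self.tokenize(text):
            self._postings[term].add(doc_id)

    def search(self, query):
        """Return ids of documents containing every query term."""
        terms = self.tokenize(query)
        if not terms:
            return set()
        postings = [self._postings.get(term, set()) for term in terms]
        postings.sort(key=len)              # intersect smallest list first
        result = set(postings[0])
        for posting in postings[1:]:
            result &= posting
        return result


index = InvertedIndex()
index.add(1, "Red running sneakers")
index.add(2, "Blue sneakers")
index.add(3, "Red leather wallet")
print(index.search("red sneakers"))
```

Starting the intersection from the shortest posting list keeps the work proportional to the rarest term, which is why rare words make queries fast.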
Pattern 7. Idempotency & Exactly-Once Processing
Idempotency Key · Exactly-Once · Distributed Transactions · Payment Safety

In distributed systems, network failures cause retries. Without idempotency, a retry can cause double charges, duplicate orders, or double sends. Every financial/critical operation must be idempotent.

  • Idempotency Key: Client generates a unique key per operation (UUID). Server stores processed keys in Redis with TTL. On retry with same key: return cached result instead of reprocessing.
  • Pattern in Payment: Client sends POST /payment { amount: 100, idempotency_key: "uuid-123" }. Server processes payment and stores uuid-123 → {success, txn_id}. On retry: return the stored result — no double charge.
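  • Exactly-Once in Kafka: Kafka supports exactly-once semantics (EOS) via idempotent producers and transactions. The broker de-duplicates retried sends using a producer ID and per-partition sequence numbers; a producer configured with a transactional.id can commit its output records and the consumed offsets atomically, so a consume-transform-produce pipeline either applies fully or not at all. The guarantee covers Kafka-to-Kafka flows; side effects in external systems still need idempotency keys.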
  • Saga Pattern: For distributed transactions across services (order → payment → inventory), use a saga: sequence of local transactions with compensating transactions on failure. Avoids distributed locks.
🎯 Use when asked: Design payment gateway, design order checkout, design money transfer system, any financial or critical-operation system
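
The idempotency-key flow for payments can be sketched as follows. This is an in-memory illustration with hypothetical names (`PaymentService`, `create_payment`); a production version would make the check-and-store step atomic, for example with a unique constraint in the database or a Redis SET with the NX option.

```python
import uuid

class PaymentService:
    def __init__(self):
        self._results = {}   # idempotency_key -> stored response
        self.charges = []    # side effects, recorded for demonstration

    def create_payment(self, idempotency_key, amount):
        # Retry with a known key: return the stored result, charge nothing.
        if idempotency_key in self._results:
            return self._results[idempotency_key]

        txn_id = f"txn_{len(self.charges) + 1}"
        self.charges.append((txn_id, amount))        # the real side effect
        response = {"status": "success", "txn_id": txn_id, "amount": amount}
        self._results[idempotency_key] = response
        return response


service = PaymentService()
key = str(uuid.uuid4())                       # generated once by the client
first = service.create_payment(key, 100)
retry = service.create_payment(key, 100)      # e.g. after a network timeout
print(first == retry, len(service.charges))
```

The client must reuse the same key for every retry of one logical operation and generate a fresh key for a new operation.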
Pattern 8. News Feed & Social Graph: Fan-Out
Fan-out on Write · Fan-out on Read · Push vs Pull · Timeline Cache

When a user posts content that needs to appear in all their followers' feeds, you face the fan-out problem. Two strategies, each with trade-offs.

  • Fan-out on Write (Push): When user A posts → immediately write to all followers' feed caches. Pro: fast feed reads. Con: huge write amplification for celebrities (an account like Sachin Tendulkar's has tens of millions of followers; at 20M followers, one tweet = 20M cache writes). Solution: hybrid approach below.
  • Fan-out on Read (Pull): When user reads feed → fetch posts from all followees in real-time. Pro: simple writes. Con: slow reads for users following many people.
  • Hybrid (used by Twitter/Instagram): Fan-out on write for regular users (<1K followers). Fan-out on read for celebrity accounts (verified/high-follower users). Merge the two at read time.
  • Timeline Cache: Pre-built feed stored in a Redis sorted set per user (member = post_id, score = timestamp). Feed read = ZREVRANGE on the sorted set (ZRANGE with the REV option on Redis 6.2+, where ZREVRANGE is deprecated). Extremely fast.
🎯 Use when asked: Design Twitter/Instagram feed, design LinkedIn posts, design notification delivery, design activity feed in any social app
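
The hybrid push/pull strategy is easier to explain with code on the whiteboard. The sketch below keeps everything in memory; `FeedService`, its method names, and the follower threshold are illustrative assumptions rather than how any particular company implements it.

```python
import heapq
from collections import defaultdict

class FeedService:
    def __init__(self, celebrity_threshold=1000):
        self.threshold = celebrity_threshold
        self.followers = defaultdict(set)         # author -> followers
        self.following = defaultdict(set)         # user -> authors
        self.timelines = defaultdict(list)        # user -> pushed (ts, post_id)
        self.posts_by_author = defaultdict(list)  # author -> (ts, post_id)

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) >= self.threshold

    def post(self, author, post_id, ts):
        self.posts_by_author[author].append((ts, post_id))
        if not self.is_celebrity(author):
            # Fan-out on write: push into every follower's timeline.
            for follower in self.followers[author]:
                self.timelines[follower].append((ts, post_id))

    def read_feed(self, user, limit=10):
        # Fan-out on read: pull celebrity posts only when the feed is read.
        pulled = [post
                  for author in self.following[user] if self.is_celebrity(author)
                  for post in self.posts_by_author[author]]
        # A set removes duplicates if an author crossed the threshold midway.
        candidates = set(self.timelines[user]) | set(pulled)
        return [post_id for _, post_id in heapq.nlargest(limit, candidates)]


feed = FeedService(celebrity_threshold=2)   # tiny threshold for the demo
feed.follow("u1", "celeb")
feed.follow("u2", "celeb")
feed.follow("u1", "friend")
feed.post("friend", "p1", ts=1)
feed.post("celeb", "p2", ts=2)
feed.post("friend", "p3", ts=3)
print(feed.read_feed("u1"))
```

Only the friend's posts were written into u1's timeline; the celebrity post is merged in at read time, so posting stays cheap no matter how many followers the celebrity has.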
Pattern 9. Geo-Spatial Systems: Location & Proximity
Geohash · Quadtree · Spatial Index · Real-Time Location

Finding the nearest driver, restaurant, or store requires efficient geo-spatial queries. Standard SQL queries on (lat, lng) don't scale — you need spatial indexing.

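  • Geohash: Encode (lat, lng) into a short string (e.g., "ttnq"). Points that share a long prefix are close together, but two nearby points on opposite sides of a cell boundary can have different prefixes, so queries also check the neighboring cells. Redis has native geo commands built on geohash-encoded sorted sets: GEOADD, GEODIST, GEOSEARCH (which replaces GEORADIUS, deprecated since Redis 6.2).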
  • Quadtree: Recursively divide the map into 4 quadrants. Each node represents a region. Leaf nodes hold location data. Good for static data; expensive to update for moving objects.
  • Real-Time Location Updates: Driver sends GPS location every 5 seconds → Kafka → Location service → Redis geohash. Rider app queries Redis for drivers within N km. For 1M drivers: 1M updates/5s = 200K writes/sec to Redis — use Redis Cluster.
  • Proximity Search on Maps: Grid-based partitioning → divide the city into N×N cells. Assign entities to cells. Query: find all entities in current cell + adjacent 8 cells.
🎯 Use when asked: Design Ola/Uber driver matching, design Swiggy delivery radius, design "stores near me" for Flipkart Quick, design Zomato restaurant discovery
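
Grid-based proximity search (current cell plus the 8 adjacent cells) can be sketched as below. The class name, the 0.01-degree cell size, and the coordinates are assumptions for illustration; the 3×3 scan is only complete when the search radius is no larger than one cell.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


class GridIndex:
    def __init__(self, cell_deg=0.01):      # about 1.1 km of latitude per cell
        self.cell_deg = cell_deg
        self._cells = defaultdict(dict)     # cell -> {entity_id: (lat, lng)}
        self._where = {}                    # entity_id -> current cell

    def _cell(self, lat, lng):
        return (math.floor(lat / self.cell_deg), math.floor(lng / self.cell_deg))

    def upsert(self, entity_id, lat, lng):
        old, new = self._where.get(entity_id), self._cell(lat, lng)
        if old is not None and old != new:
            del self._cells[old][entity_id]     # entity moved to another cell
        self._cells[new][entity_id] = (lat, lng)
        self._where[entity_id] = new

    def nearby(self, lat, lng, radius_km):
        cx, cy = self._cell(lat, lng)
        hits = []
        for dx in (-1, 0, 1):                   # current cell + 8 neighbours
            for dy in (-1, 0, 1):
                for eid, (elat, elng) in self._cells.get((cx + dx, cy + dy), {}).items():
                    dist = haversine_km(lat, lng, elat, elng)
                    if dist <= radius_km:
                        hits.append((dist, eid))
        return [eid for _, eid in sorted(hits)]  # nearest first


drivers = GridIndex()
drivers.upsert("a", 12.9720, 77.5950)   # tens of metres from the rider
drivers.upsert("b", 12.9780, 77.5946)   # roughly 0.7 km north
drivers.upsert("c", 13.0500, 77.6000)   # several km away
print(drivers.nearby(12.9716, 77.5946, radius_km=1.0))
```

Checking the exact distance after the cell lookup matters: the grid only narrows the candidate set, and a square of cells always covers more area than the circle being searched.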
Pattern 10. Unique ID Generation at Scale
Snowflake · UUID · Auto-increment · Clock Skew · Sortable IDs

Distributed systems need globally unique IDs for orders, users, transactions. Auto-increment in a single DB doesn't scale. UUID is unique but not sortable by time.

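  • UUID v4: 128-bit value with 122 random bits. Collisions are possible in principle but so improbable that IDs are treated as globally unique without any coordination. Downside: not sortable by time, large storage, bad for DB index locality.
  • Twitter Snowflake: 64-bit ID = 1 unused sign bit + 41 bits timestamp (milliseconds since a custom epoch) + 10 bits machine ID + 12 bits sequence. Sortable by creation time. 4096 IDs/millisecond per machine. Used by Twitter, Discord, many others.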
  • ULID (Universally Unique Lexicographically Sortable ID): 128-bit, URL-safe, sortable. Better than UUID for databases — maintains insert locality.
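  • Clock Skew Problem: Distinct machine IDs keep two machines from colliding, but if one machine's clock steps backwards (for example after an NTP correction) it can reissue a timestamp + sequence pair it has already used. Skew between machines also means IDs are only roughly time-ordered across the fleet. Solutions: refuse or wait while the clock is behind the last issued timestamp, use a logical clock, or use a centralized ID service (expensive but safe).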
  • Instagram Approach: Postgres stored procedure generates IDs using epoch + shard ID + sequence. Avoids centralized bottleneck while maintaining sortability.
🎯 Use when asked: Design a URL shortener, design order IDs for Razorpay, design any system that needs distributed unique identifiers
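
A Snowflake-style generator is a common deep-dive request. The sketch below follows the 41 + 10 + 12 bit layout described above; the epoch value is a hypothetical choice, and refusing to issue IDs when the clock moves backwards is one of several possible policies.

```python
import threading
import time

class SnowflakeGenerator:
    EPOCH_MS = 1_700_000_000_000   # hypothetical custom epoch (assumption)
    MACHINE_BITS = 10
    SEQUENCE_BITS = 12
    MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1   # 4095

    def __init__(self, machine_id, clock_ms=lambda: int(time.time() * 1000)):
        if not 0 <= machine_id < (1 << self.MACHINE_BITS):
            raise ValueError("machine_id must fit in 10 bits")
        self.machine_id = machine_id
        self._clock_ms = clock_ms
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            now = self._clock_ms()
            if now < self._last_ms:
                # Clock stepped backwards: issuing an ID could repeat an old one.
                raise RuntimeError("clock moved backwards; refusing to generate IDs")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & self.MAX_SEQUENCE
                if self._sequence == 0:
                    # 4096 IDs already issued this millisecond: wait for the next.
                    while now <= self._last_ms:
                        now = self._clock_ms()
            else:
                self._sequence = 0
            self._last_ms = now
            timestamp = now - self.EPOCH_MS
            return ((timestamp << (self.MACHINE_BITS + self.SEQUENCE_BITS))
                    | (self.machine_id << self.SEQUENCE_BITS)
                    | self._sequence)


generator = SnowflakeGenerator(machine_id=3)
print(generator.next_id())
```

Because the timestamp occupies the high bits, sorting IDs numerically sorts them by creation time to millisecond precision on a single machine.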
Pattern 11. Circuit Breaker & Graceful Degradation
Hystrix · Resilience4j · Fallback · Bulkhead · Timeout

In microservices, one slow service can cascade failures across the entire system. Circuit breakers prevent this — they stop calling a failing service and return a fallback response instead.

  • Circuit Breaker States: CLOSED (normal) → OPEN (too many failures, fast-fail all requests) → HALF-OPEN (let a few through to test if service recovered).
  • Failure Threshold: Open the circuit when the error rate exceeds N% over the last M requests, or when the last K consecutive requests all failed.
  • Fallback Strategies: Return cached result (stale but available); return a default/empty response; return a simplified response; redirect to a backup service.
  • Timeout: Never let a service call run indefinitely. Set timeouts at every layer (HTTP client timeout, DB query timeout, Kafka consumer timeout).
  • Bulkhead Pattern: Limit the thread pool size for each downstream service. If service A is slow, it can only consume its allocated threads — it won't starve calls to services B and C.
🎯 Use when asked: Any microservices design question, design a payment gateway (what if the bank is down?), design resilient order processing
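
The three circuit breaker states can be shown in a short sketch. This version opens after a number of consecutive failures and allows a single trial call after a cool-down; the class and parameter names are illustrative and do not mirror the Hystrix or Resilience4j APIs.

```python
import time

class CircuitOpenError(Exception):
    """Raised when the circuit is open and no fallback was supplied."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self.state = "CLOSED"
        self._failures = 0
        self._opened_at = None

    def call(self, func, *args, fallback=None, **kwargs):
        if self.state == "OPEN":
            if self._clock() - self._opened_at >= self.reset_timeout:
                self.state = "HALF_OPEN"        # let one trial request through
            elif fallback is not None:
                return fallback()               # fast-fail without calling func
            else:
                raise CircuitOpenError("circuit is open")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._record_failure()
            if fallback is not None:
                return fallback()
            raise
        self._failures = 0                      # success closes the circuit
        self.state = "CLOSED"
        return result

    def _record_failure(self):
        self._failures += 1
        if self.state == "HALF_OPEN" or self._failures >= self.failure_threshold:
            self.state = "OPEN"
            self._opened_at = self._clock()


# Demonstration with a fake clock and a dependency that always fails.
fake_time = [0.0]
breaker = CircuitBreaker(failure_threshold=3, reset_timeout=30.0,
                         clock=lambda: fake_time[0])
attempts = []

def call_bank():
    attempts.append(1)
    raise ConnectionError("bank is down")

for _ in range(4):
    breaker.call(call_bank, fallback=lambda: "cached balance")
print(breaker.state, len(attempts))
```

The fourth call never reaches the bank: once the circuit is open, the fallback is returned immediately, which is what protects the caller's threads during an outage.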
Pattern 12. Data Pipelines: Lambda & Kappa Architecture
Batch · Streaming · Spark · Flink · Data Lake · OLAP

Analytics and ML features need both real-time data (for live dashboards) and historical data (for trend analysis, model training). Lambda Architecture handles both.

  • Lambda Architecture: Batch layer (historical accuracy) + Speed layer (real-time, approximate) + Serving layer (merges both). Pro: robust. Con: dual code paths, complex maintenance.
  • Kappa Architecture: Stream-only with reprocessing. Keep all events in Kafka (long retention). Reprocess historical data by replaying from the beginning. Simpler but requires Kafka storage for months/years.
  • Batch Processing: Apache Spark (large-scale ETL, daily reporting). Run nightly or hourly. High latency but can process petabytes.
  • Stream Processing: Apache Flink or Spark Structured Streaming (real-time analytics). Process events as they arrive. Use for: live dashboards, real-time fraud detection, surge pricing.
  • Data Lake vs Data Warehouse: Lake = raw events in S3/HDFS (schema-on-read, cheap, flexible). Warehouse = processed, structured data in Redshift/BigQuery (schema-on-write, fast queries, expensive).
🎯 Use when asked: Design Flipkart's analytics platform, design real-time fraud detection, design Netflix recommendation pipeline, design any analytics/ML feature system

3. Company-Specific System Design Questions in India 2026

Company | Most Asked System Design Questions | Key Patterns to Emphasize
Google India | Design Google Search, Design YouTube, Design Google Maps, Design Distributed File System | MapReduce, Bigtable, consistent hashing, inverted index, geo-spatial
Microsoft India | Design OneDrive, Design Teams Chat, Design Azure Service Bus, Design CI/CD Pipeline | WebSockets, distributed storage, message queues, microservices
Amazon India | Design Order Management, Design Product Search, Design DynamoDB, Design Rate Limiter | Exactly-once, idempotency, consistent hashing, Dynamo paper concepts
Flipkart | Design Cart & Checkout, Design Big Billion Days infrastructure, Design Delivery Tracking | Flash sale queue, inventory locking, event-driven, geo-spatial
Swiggy / Zomato | Design Delivery Partner Matching, Design Restaurant Search, Design Surge Pricing, Design ETA prediction | Geo-spatial, real-time location, ML pipeline, Kafka fan-out
Razorpay / PhonePe | Design Payment Gateway, Design UPI system, Design Reconciliation Engine, Design Fraud Detection | Idempotency, saga pattern, exactly-once, circuit breaker, real-time ML
Walmart Global Tech | Design Inventory Management, Design Price Engine, Design Store Locator, Design Supply Chain | Caching, sharding, geo-spatial, batch processing, eventual consistency
Salesforce India | Design Multi-Tenant CRM, Design Workflow Engine, Design API Rate Limiter, Design Audit Trail | Multi-tenancy, row-level security, event sourcing, rate limiting

4. HLD vs LLD — What Each Level Requires

Aspect | HLD (High-Level Design) | LLD (Low-Level Design)
Focus | System architecture, component interactions, technology choices | Class hierarchy, methods, design patterns, API contracts
Output | Architecture diagram with services, DBs, queues, CDN | Class diagram, interface definitions, database schema
Scale concern | How does this handle 10M requests/day? | How does the OrderProcessor class handle concurrent requests?
Tools discussed | Redis, Kafka, Elasticsearch, Kubernetes, S3, CDN | Design patterns, SOLID principles, thread safety, OOP
Level required | SDE2 and above, all companies | SDE1+ at Flipkart (MCR), SDE2+ at most companies
Example question | "Design Swiggy's order tracking at 1M orders/day" | "Design the class structure for a parking lot system"

5. The 6 Most Common System Design Interview Mistakes in India

❌ Mistake 1: Starting with Architecture Without Clarifying Requirements

The #1 failure. "Design Twitter" → immediately starts drawing microservices. But is it 1000 users or 100M? Read-heavy or write-heavy? Without asking, your design has no foundation and the interviewer will redirect you anyway — wasting time and creating a bad first impression.

❌ Mistake 2: Name-Dropping Technologies Without Justifying Them

"I'll use Kafka, Redis, Elasticsearch, Kubernetes, Cassandra, and gRPC." Why? What problem does each solve? A design with fewer, well-justified technologies beats a buzzword salad. Interviewers test: "Why Kafka over RabbitMQ here?" If you can't answer, the technology choice is a red flag.

❌ Mistake 3: Ignoring Non-Functional Requirements

Most candidates design the happy path but never discuss: What happens if Redis goes down? What's the consistency model when two users edit simultaneously? What's the latency SLA and how do you enforce it? Senior engineers obsess over failure modes. NFRs are where you demonstrate seniority.

❌ Mistake 4: Designing the Perfect System Instead of the Right System

"I'll use a CQRS event-sourced microservices architecture with a two-phase commit distributed transaction." That's over-engineered for a startup. Great engineers design the simplest system that meets the requirements, not the most technically impressive one. Ask: "What's the MVP architecture, and what would we add at 10× scale?"

❌ Mistake 5: Not Drawing the Architecture

Verbal descriptions of complex systems are impossible to follow. Always draw — even in a virtual interview using a whiteboard tool. A diagram forces clarity, helps the interviewer follow your thinking, and gives you a shared reference for the deep-dive discussion.

❌ Mistake 6: Solving the Full Problem Before Getting Buy-In

Design the whole system, then say "any questions?" — and discover the interviewer wanted you to go deeper on the search component, not the notification system. Check in after each step: "Does this approach make sense? Should I go deeper anywhere here?" Interviewers want collaboration, not a monologue.

6. Frequently Asked Questions

What are the most important system design topics for SDE2 interviews at Indian product companies?
The most critical topics for SDE2 system design interviews in India are: caching with Redis (cache-aside, invalidation strategies), message queues with Kafka (async processing, fan-out), database sharding and consistent hashing, rate limiting (token bucket algorithm), CAP theorem and consistency trade-offs, and API design (REST, pagination, idempotency). Cover these 6 topics deeply and you can handle 80% of system design questions at Flipkart, Swiggy, Razorpay, Amazon, and similar companies.
What is the difference between HLD and LLD in system design interviews?
HLD (High-Level Design) covers system architecture — which services, databases, caches, and queues you use and how they connect. LLD (Low-Level Design) covers object-oriented design — class hierarchy, interfaces, design patterns, and database schema. Most SDE2 interviews in India focus on HLD with some LLD for the data model. Flipkart's Machine Coding Round is a practical LLD test where you implement the design in code. SDE3+ interviews often include both HLD and LLD depth.
How do I answer a system design question from scratch in an interview?
Follow this 7-step framework: (1) Clarify functional and non-functional requirements — 5 min; (2) Estimate capacity: QPS, storage, bandwidth — 3 min; (3) Define the core APIs — 5 min; (4) Design the data model — 5 min; (5) Draw the high-level architecture — 15 min; (6) Deep-dive into the most complex components — 15 min; (7) Discuss trade-offs and failure scenarios — 5 min. Never start drawing before step 1. Always check in with the interviewer before moving to the next step. Treat it as a collaborative design session, not a presentation.
What is consistent hashing and when should I use it in system design?
Consistent hashing places both nodes and keys on a virtual ring; each key belongs to the first node clockwise from its position. When you add or remove a node, only the keys in the adjacent arc are remapped: about K/N keys move on average, versus nearly all K keys with hash(key) % N. Use it when: designing a distributed cache (for example client-side sharding across Memcached nodes; note that Redis Cluster itself uses 16,384 fixed hash slots rather than a hash ring), designing horizontal database sharding, designing a CDN edge node selection system, or any system that needs to distribute load across multiple nodes and must handle nodes being added/removed gracefully. Mention virtual nodes (vnodes) to handle uneven distribution: each physical node owns multiple positions on the ring.
How do I prepare for system design interviews in 4 weeks?
4-week system design prep plan: Week 1 — fundamentals: caching, databases (SQL vs NoSQL), load balancing, CAP theorem. Week 2 — messaging and async: Kafka, message queues, event-driven patterns, fan-out. Week 3 — design practice: design 3 full systems using the 7-step framework (URL shortener, Twitter feed, ride-sharing). Week 4 — company-specific prep and mock interviews. Resources: System Design Primer (GitHub), Gaurav Sen YouTube, Grokking System Design course. Practice speaking your design out loud with a 45-minute timer — silence in a system design interview is a red flag.
