EXERCISE

The Exercise: Design a URL Shortener API

URL shorteners (like bit.ly, TinyURL, and short.io) are one of the most popular system design interview questions because they touch on every concept we have learned: API design, database design, caching, rate limiting, idempotency, load balancing, and scalability. This post-class exercise walks through a complete, production-grade API design that you can use as a template for interviews.

The challenge: design an API that can handle the following scale requirements:

  • 100 million short URLs created per month (~40 writes/second)
  • 10 billion redirects per month — a 100:1 read-to-write ratio (~4,000 reads/second)
  • Redirects return in under 10ms for cached URLs
  • Must scale horizontally without downtime

This is a read-heavy system where caching is king. The redirect path must be optimized for speed above everything else.
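The per-second figures above follow from simple division. A minimal sketch of that arithmetic, assuming a 30-day month (the constant and helper names are illustrative, not part of the original exercise):

```python
# Back-of-the-envelope throughput from the stated monthly volumes.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: a 30-day month


def per_second(events_per_month: int) -> float:
    """Average events per second over one 30-day month."""
    return events_per_month / SECONDS_PER_MONTH


writes_per_sec = per_second(100_000_000)       # about 38.6, rounded to ~40
reads_per_sec = per_second(10_000_000_000)     # about 3,858, rounded to ~4,000
read_write_ratio = reads_per_sec / writes_per_sec

print(f"writes/s ~ {writes_per_sec:.1f}, reads/s ~ {reads_per_sec:.0f}")
```

These are averages. Real traffic is bursty, so it is reasonable to provision for a peak several times higher.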

Figure 1: Two core flows — CREATE (generate short URL) and REDIRECT (resolve and send user to original URL)
STEP 1

Define the API Endpoints

A well-designed URL shortener needs exactly four endpoints. Each endpoint follows REST principles: resources are nouns, HTTP methods are verbs, and status codes communicate the result clearly.

Figure 2: Four endpoints with complete request and response specifications

Endpoint 1: Create Short URL

POST /api/v1/urls — creates a new short URL. The client sends the long URL in the request body. Optionally, the client can request a custom alias (e.g., short.ly/my-brand) and set an expiration date.

    // Request
    POST /api/v1/urls
    {
      "long_url": "https://example.com/very/long/url",
      "custom_alias": "my-brand",             // optional
      "expires_at": "2026-12-31T23:59:59Z"    // optional
    }

    // Response: 201 Created
    {
      "short_url": "https://short.ly/my-brand",
      "short_code": "my-brand",
      "long_url": "https://example.com/very/long/url",
      "created_at": "2026-03-14T10:00:00Z",
      "expires_at": "2026-12-31T23:59:59Z"
    }

    // 409 Conflict if custom_alias is taken
    // 400 Bad Request if long_url is invalid

Endpoint 2: Redirect

GET /{short_code} — the redirect endpoint. When a user visits short.ly/abc123, the server looks up abc123, finds the corresponding long URL, and returns a redirect response. This endpoint must be extremely fast because it handles 100× more traffic than the create endpoint.

Returns one of three outcomes: a 301 or 302 redirect on success (see Step 4 for the critical design decision), 404 Not Found if the short code does not exist, or 410 Gone if the URL has expired or been deleted.
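The status-code decision can be sketched as a small pure function. This is an illustrative sketch only: the `UrlRecord` type and `resolve_redirect` name are hypothetical, and 302 is assumed as the success code (Step 4 discusses when 301 is the better choice).

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional, Tuple


@dataclass
class UrlRecord:
    """Hypothetical in-memory view of one stored URL mapping."""
    long_url: str
    expires_at: Optional[datetime] = None  # None = never expires
    is_active: bool = True                 # False after DELETE


def resolve_redirect(record: Optional[UrlRecord],
                     now: datetime) -> Tuple[int, Optional[str]]:
    """Return (status_code, Location header value) for a redirect lookup."""
    if record is None:
        return 404, None   # short code was never created
    if not record.is_active:
        return 410, None   # URL was deleted
    if record.expires_at is not None and now >= record.expires_at:
        return 410, None   # URL has expired
    return 302, record.long_url


status, location = resolve_redirect(
    UrlRecord("https://example.com/very/long/url"),
    datetime.now(timezone.utc),
)
```

Keeping the decision free of I/O makes it easy to unit test separately from the cache and database lookups.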

Endpoint 3: Get Analytics

GET /api/v1/urls/{short_code}/stats — returns click analytics: total clicks, clicks by day/hour, top referrers, top countries, and device breakdown. Analytics data is collected asynchronously (click events published to Kafka and processed by a separate analytics service) so it does not slow down the redirect flow.

Endpoint 4: Delete URL

DELETE /api/v1/urls/{short_code} — deactivates a short URL. Requires authentication (only the URL creator can delete it). After deletion, the redirect endpoint returns 410 Gone. This endpoint must be idempotent: deleting an already-deleted URL should return 204 (not an error).

| Endpoint | Method | Auth? | Success Code |
|---|---|---|---|
| /api/v1/urls | POST | Optional | 201 Created |
| /{short_code} | GET | No | 301 or 302 |
| /api/v1/urls/{short_code}/stats | GET | Yes | 200 OK |
| /api/v1/urls/{short_code} | DELETE | Yes | 204 No Content |

STEP 2

Short Code Generation

The short code is the heart of the system. It must be short (7 characters), unique (no collisions), and generated quickly (sub-millisecond). We use Base62 encoding (a–z, A–Z, 0–9 = 62 characters), giving us 62^7 ≈ 3.52 trillion possible codes, enough for roughly 3,500 years at 1 billion URLs per year.
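The size of the code space is a one-line calculation. A quick sketch (the variable names are illustrative):

```python
ALPHABET_SIZE = 62   # a-z, A-Z, 0-9
CODE_LENGTH = 7

keyspace = ALPHABET_SIZE ** CODE_LENGTH          # 3,521,614,606,208 codes
years_at_1b_per_year = keyspace / 1_000_000_000  # about 3,521 years

print(f"{keyspace:,} codes, ~{years_at_1b_per_year:,.0f} years at 1B URLs/year")
```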

Figure 3: Three code generation strategies — Hash-Based, Counter-Based, and Key Generation Service

Strategy 1: Hash the Long URL

Take the MD5 or SHA-256 hash of the long URL, Base62-encode it, and keep the first 7 characters. The same long URL always produces the same short code (deterministic). The problem: hash collisions. Two different URLs might map to the same 7-character code. Solution: on collision, append a counter (hash of URL + "1", then URL + "2") and retry. With 3.5 trillion possible codes, the chance that any single new URL collides is low (about n / 62^7 once n codes are stored), but across billions of URLs some collisions will occur, so every insert still needs a uniqueness check against the database. For new systems, prefer the Key Generation Service, which removes that check from the request path.
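A minimal sketch of the hash-and-retry idea, assuming SHA-256, a 0-9a-zA-Z alphabet, and a plain dict standing in for the database (all function names here are hypothetical):

```python
import hashlib
from typing import Dict

BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def base62(n: int) -> str:
    """Encode a non-negative integer in Base62."""
    if n == 0:
        return BASE62[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)
        digits.append(BASE62[rem])
    return "".join(reversed(digits))


def hash_code(long_url: str, attempt: int = 0) -> str:
    """First 7 Base62 characters of SHA-256(url + counter suffix)."""
    salted = long_url if attempt == 0 else f"{long_url}{attempt}"
    digest = hashlib.sha256(salted.encode("utf-8")).digest()
    return base62(int.from_bytes(digest, "big"))[:7].rjust(7, "0")


def shorten(long_url: str, store: Dict[str, str], max_attempts: int = 10) -> str:
    """Insert long_url into store, retrying with a counter on collision."""
    for attempt in range(max_attempts):
        code = hash_code(long_url, attempt)
        existing = store.get(code)
        if existing is None:
            store[code] = long_url   # free slot: claim it
            return code
        if existing == long_url:
            return code              # same URL was already shortened
    raise RuntimeError("no free code found after retries")


store: Dict[str, str] = {}
code = shorten("https://example.com/very/long/url", store)
```

In a real database the "check then insert" step would need to be atomic (for example a unique constraint plus retry on conflict), which this single-threaded sketch does not model.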

Strategy 2: Auto-Increment Counter

Use a database auto-increment ID and convert it to Base62. ID 1 becomes "1", ID 62 becomes "10", ID 1,000,000 becomes "4c92". Zero collisions by definition. The problem: codes are sequential and predictable — a malicious user can enumerate all URLs by incrementing the code. Solutions: use a random starting offset, XOR the ID with a secret before encoding, or use a range-based approach where each server takes a range of IDs.
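The conversions quoted above can be checked with a short Base62 encoder. This sketch assumes the digit order 0-9, a-z, A-Z, which is what the "4c92" example implies. The XOR mask is a made-up illustrative value.

```python
BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def encode_base62(n: int) -> str:
    """Encode a non-negative integer in Base62."""
    if n == 0:
        return BASE62[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)
        digits.append(BASE62[rem])
    return "".join(reversed(digits))


def decode_base62(code: str) -> int:
    """Inverse of encode_base62."""
    n = 0
    for ch in code:
        n = n * 62 + BASE62.index(ch)
    return n


SECRET_MASK = 0x5DEECE66D  # hypothetical secret; keep the real one out of source control


def obfuscated_code(row_id: int) -> str:
    """XOR the sequential ID with a secret before encoding."""
    return encode_base62(row_id ^ SECRET_MASK)


def row_id_from_code(code: str) -> int:
    """Recover the sequential ID from an obfuscated code."""
    return decode_base62(code) ^ SECRET_MASK


print(encode_base62(1), encode_base62(62), encode_base62(1_000_000))
```

XOR keeps the mapping one-to-one, so there are still no collisions. It is only light obfuscation, though: neighbouring IDs differ in their low bits, so their codes still look related to anyone who studies a few of them.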

Strategy 3: Pre-Generated Key Service (Recommended)

A separate Key Generation Service (KGS) pre-generates millions of random unique 7-character codes and stores them in a database. When an API server needs a new code, it fetches one (or a batch of 1,000) from the KGS. The KGS marks the code as "used" atomically. This approach has zero collisions, is not predictable, decouples code generation from the main request path, and is highly scalable — multiple API servers can fetch batches in parallel without coordination.
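A minimal in-memory sketch of the idea. The class and method names are hypothetical, and Python sets guarded by a lock stand in for the KGS database and its atomic "mark as used" update:

```python
import secrets
import threading
from typing import List, Set

BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


class KeyGenerationService:
    """Pre-generates random 7-character codes and hands them out in batches."""

    def __init__(self) -> None:
        self._unused: Set[str] = set()
        self._used: Set[str] = set()
        self._lock = threading.Lock()  # stands in for a DB transaction

    def pregenerate(self, count: int) -> None:
        """Add `count` new codes that are neither unused nor already used."""
        with self._lock:
            target = len(self._unused) + count
            while len(self._unused) < target:
                code = "".join(secrets.choice(BASE62) for _ in range(7))
                if code not in self._used:
                    self._unused.add(code)  # set.add ignores duplicates

    def fetch_batch(self, size: int) -> List[str]:
        """Atomically move up to `size` codes from unused to used."""
        with self._lock:
            take = min(size, len(self._unused))
            batch = [self._unused.pop() for _ in range(take)]
            self._used.update(batch)
            return batch


kgs = KeyGenerationService()
kgs.pregenerate(2_000)
server_a_codes = kgs.fetch_batch(1_000)
server_b_codes = kgs.fetch_batch(1_000)
```

One trade-off worth mentioning: if an API server crashes while holding a batch, its unassigned codes are lost. With 3.5 trillion codes available, that waste is usually acceptable.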

Interview Answer: Code Generation

When asked "how do you generate short codes?", answer: "I'll use a pre-generated Key Generation Service that creates random Base62 7-character codes in batches. API servers fetch a batch of 1,000 codes on startup and replenish when running low. This gives us zero collision risk, unpredictable codes, and sub-microsecond code assignment without database round-trips on the critical path."

STEP 3

Database Schema

Figure 4: Two tables — urls (mapping) and click_events (analytics)

    -- URLs table (PostgreSQL)
    CREATE TABLE urls (
        short_code  VARCHAR(10) PRIMARY KEY,     -- B-tree index, O(log n) lookup
        long_url    TEXT NOT NULL,
        user_id     UUID,                        -- NULL for anonymous
        created_at  TIMESTAMPTZ DEFAULT NOW(),
        expires_at  TIMESTAMPTZ,                 -- NULL = never expires
        is_active   BOOLEAN DEFAULT TRUE
    );

    -- Click events (ClickHouse / Cassandra for scale).
    -- Written in PostgreSQL-style DDL for readability; adapt the types
    -- and defaults to whichever store you choose.
    CREATE TABLE click_events (
        id          UUID DEFAULT gen_random_uuid(),
        short_code  VARCHAR(10),
        clicked_at  TIMESTAMPTZ DEFAULT NOW(),
        ip_address  INET,
        user_agent  TEXT,
        referrer    TEXT,
        country     VARCHAR(2)
    );

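The schema is intentionally simple. The urls table stores the mapping between short codes and long URLs. The short_code column is the primary key. In PostgreSQL a primary key is backed by a B-tree index, so lookups are O(log n) rather than O(1), which in practice means only a handful of page reads even at billions of rows. It is the most performance-critical index in the entire system.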

Use the Right Database for Click Events

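For the click_events table, avoid PostgreSQL at scale. Click events are write-heavy (about 4,000 inserts/second on average, one per redirect), append-only, and queried by time range, which makes them a good fit for ClickHouse or Cassandra. ClickHouse is a columnar store built for this workload: it can ingest well over 100K rows/second when inserts are batched, and aggregate queries over time-series data are often orders of magnitude faster than on a row-oriented database such as PostgreSQL. The /stats endpoint should query ClickHouse, not PostgreSQL.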

STEP 4

The 301 vs 302 Decision

This is one of the most important design decisions in a URL shortener, and interviewers love to ask about it. The answer depends entirely on whether you need analytics.

Figure 5: 301 caches the redirect (less server load, no analytics). 302 routes every click through your server (analytics, flexibility).

| Criteria | 301 Moved Permanently | 302 Found (Temporary) |
|---|---|---|
| Browser caching | Yes (cached forever) | No (every click hits server) |
| Server load | Very low (browser handles repeat visits) | Higher (every click = server request) |
| Analytics | No repeat-click tracking | Every click tracked |
| Can change destination | No (browser has it cached) | Yes |
| Expiration enforcement | No (browser ignores expiry) | Yes (server checks on each request) |
| Best for | High-traffic, no analytics needed | Analytics, campaigns, A/B testing |

The hybrid approach: use 302 by default for analytics-rich URLs, and offer 301 as an option for high-traffic URLs where the creator does not need analytics. A service could, for example, serve 301 on a free tier (reducing infrastructure cost) and 302 on a paid tier (providing click analytics as a premium feature).

Interview Answer: 301 vs 302

"I'll use 302 by default because our system promises click analytics. With 301, the browser caches the redirect and subsequent visits never reach our server — we'd lose analytics data. The trade-off is higher server load, but our multi-layer cache (CDN → Redis) means most requests don't reach the application server anyway."

STEP 5

Multi-Layer Caching

With a 100:1 read-to-write ratio, caching is the single most important optimization. The goal is for 95%+ of redirect requests to be served without touching the database.

Figure 6: Four-layer cache — Browser (301), CDN edge, Redis, and PostgreSQL. 95% of requests never reach the DB.

Layer 1: Browser Cache (301 Redirects)

If you use 301 redirects, the browser caches the mapping permanently. The second time a user visits the short URL, their browser resolves it locally without any network request — 0ms and costs you nothing. For 302 redirects, this layer does not apply.

Layer 2: CDN Edge (Cloudflare)

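Configure your CDN to cache redirect responses at the edge. When 1,000 different users access the same short URL through the same edge location, only the first request reaches your origin server. The remaining 999 are served from the CDN edge (~5ms). This is especially effective for viral URLs receiving thousands of clicks per minute. Note the trade-off: a click served from the edge never reaches your application, so if you need per-click analytics, count those clicks from CDN logs or keep the edge TTL short.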

Layer 3: Redis In-Memory Cache

For requests that reach your application server, check Redis first (key: short_code, value: long_url). Redis lookup takes <1ms compared to 5–10ms for a PostgreSQL query. Use an LRU (Least Recently Used) eviction policy: popular URLs stay in cache, rarely-accessed URLs are evicted. With even a modest Redis cluster (16 GB), you can cache 50–100 million URL mappings.

Layer 4: PostgreSQL (Source of Truth)

Only ~5% of requests reach the database (cold URLs not in any cache). On a cache miss, the application queries PostgreSQL, stores the result in Redis with a 24-hour TTL, and returns the redirect. Subsequent requests for the same URL are served from Redis.
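The read path described in Layers 3 and 4 is the cache-aside pattern. Here is a self-contained sketch in which a small LRU cache with TTL stands in for Redis and a dict stands in for PostgreSQL (the class and function names are hypothetical):

```python
import time
from collections import OrderedDict
from typing import Callable, Dict, Optional, Tuple


class LruTtlCache:
    """Tiny stand-in for Redis: LRU eviction plus a per-entry TTL."""

    def __init__(self, capacity: int, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._data: "OrderedDict[str, Tuple[str, float]]" = OrderedDict()
        self._capacity = capacity
        self._ttl = ttl_seconds
        self._clock = clock

    def get(self, key: str) -> Optional[str]:
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if self._clock() >= expires_at:
            del self._data[key]          # entry outlived its TTL
            return None
        self._data.move_to_end(key)      # mark as most recently used
        return value

    def set(self, key: str, value: str) -> None:
        self._data[key] = (value, self._clock() + self._ttl)
        self._data.move_to_end(key)
        if len(self._data) > self._capacity:
            self._data.popitem(last=False)  # evict least recently used

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


def lookup_long_url(code: str, cache: LruTtlCache,
                    database: Dict[str, str]) -> Optional[str]:
    """Cache-aside read: try the cache, fall back to the DB, then populate."""
    cached = cache.get(code)
    if cached is not None:
        return cached
    long_url = database.get(code)
    if long_url is not None:
        cache.set(code, long_url)
    return long_url
```

A common refinement, not shown here, is to cache "not found" results briefly as well, so repeated lookups of nonexistent codes do not hit the database every time.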

Cache Invalidation on Delete

When a URL is deleted, you must invalidate the cache: delete the Redis key immediately and send a cache purge request to Cloudflare. Otherwise, users will continue to be redirected even after deletion. This is a classic cache invalidation problem — always address it in interviews.
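One possible ordering of the delete steps, sketched with dicts standing in for PostgreSQL and Redis and a caller-supplied `purge_cdn` callable standing in for the CDN purge request. These names are hypothetical and no real CDN API is modelled here.

```python
from typing import Callable, Dict


def delete_url(code: str,
               database: Dict[str, dict],
               redis_cache: Dict[str, str],
               purge_cdn: Callable[[str], None]) -> int:
    """Deactivate a short URL and invalidate every cache layer.

    Returns 204 whether or not the URL was already deleted, which keeps
    the endpoint idempotent.
    """
    record = database.get(code)
    if record is not None and record["is_active"]:
        record["is_active"] = False          # 1. update the source of truth
    redis_cache.pop(code, None)              # 2. drop the Redis entry
    purge_cdn(f"https://short.ly/{code}")    # 3. ask the CDN to purge the edge copy
    return 204


purged = []
db = {"abc123": {"long_url": "https://example.com/a", "is_active": True}}
cache = {"abc123": "https://example.com/a"}

first = delete_url("abc123", db, cache, purged.append)
second = delete_url("abc123", db, cache, purged.append)  # retry is harmless
```

Updating the database before the caches matters. If the cache entry were removed first, a concurrent redirect could re-populate it from the still-active row.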

STEP 6

Rate Limiting

Rate limiting is essential for a URL shortener because short codes are a finite resource and creation can be abused for spam. The redirect endpoint needs much higher limits since blocking a redirect means a broken link.

Figure 7: Different rate limits for creation (strict) vs redirect (lenient) — with proper HTTP headers

Creation Endpoint (POST /urls) — Strict Limits

  • Anonymous users: 10 URLs per hour
  • Authenticated users: 100 per hour
  • Premium: 1,000 per hour

Implement using a Token Bucket algorithm backed by Redis (key: rate_limit:{api_key}, value: remaining tokens, TTL: 1 hour). Return 429 Too Many Requests with Retry-After header when exceeded.
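To make the algorithm concrete, here is a single-process token bucket sketch. It keeps state in memory rather than Redis, and the class name and injectable clock are illustrative assumptions. A Redis-backed version would need the refill-and-take step to run atomically, for example inside a Lua script.

```python
import math
import time
from typing import Callable, Tuple


class TokenBucket:
    """Allows bursts up to `capacity`, refilled at `refill_per_second`."""

    def __init__(self, capacity: int, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._capacity = capacity
        self._refill = refill_per_second
        self._clock = clock
        self._tokens = float(capacity)   # start with a full bucket
        self._last = clock()

    def try_acquire(self) -> Tuple[bool, int]:
        """Return (allowed, retry_after_seconds)."""
        now = self._clock()
        elapsed = now - self._last
        self._tokens = min(self._capacity, self._tokens + elapsed * self._refill)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True, 0
        # Seconds until one whole token is available, rounded up so it can
        # be sent as the Retry-After header on a 429 response.
        return False, math.ceil((1 - self._tokens) / self._refill)


# Anonymous tier from the list above: 10 creations per hour.
anonymous_bucket = TokenBucket(capacity=10, refill_per_second=10 / 3600)
allowed, retry_after = anonymous_bucket.try_acquire()
```

A token bucket permits a burst of up to `capacity` requests, then settles to the steady refill rate.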

Redirect Endpoint (GET /{code}) — Lenient Limits

1,000 requests per minute per IP. This throttles abusive clients and scrapers without interfering with legitimate traffic; large distributed attacks are absorbed at the CDN rather than by this limit. If a viral URL gets 10,000 clicks per second, the CDN cache handles it without rate limiting individual users.

Idempotency for Creation

All POST requests accept an Idempotency-Key header. If the client retries with the same key (e.g., after a network timeout), the server returns the original short URL without creating a duplicate. Keys are stored in Redis with a 24-hour TTL.

    // Request with idempotency key
    POST /api/v1/urls
    Idempotency-Key: 550e8400-e29b-41d4-a716-446655440000
    {
      "long_url": "https://example.com/page"
    }

    // If key is new → create URL, cache key + response in Redis
    // If key exists → return cached response (no duplicate)

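The server-side logic can be sketched as a thin wrapper around the create handler. A dict stands in for Redis here (so the 24-hour TTL is not modelled), and the class and function names are hypothetical:

```python
from typing import Callable, Dict


class IdempotentCreator:
    """Replays the stored response when an Idempotency-Key is reused."""

    def __init__(self, create_fn: Callable[[str], dict]) -> None:
        self._create_fn = create_fn
        self._responses: Dict[str, dict] = {}  # stand-in for Redis (no TTL here)

    def create(self, idempotency_key: str, long_url: str) -> dict:
        cached = self._responses.get(idempotency_key)
        if cached is not None:
            return cached                      # retry: no duplicate is created
        response = self._create_fn(long_url)   # first time: do the real work
        self._responses[idempotency_key] = response
        return response


calls = []


def fake_create(long_url: str) -> dict:
    """Hypothetical create handler that records how often it runs."""
    calls.append(long_url)
    return {"short_code": f"code{len(calls)}", "long_url": long_url}


creator = IdempotentCreator(fake_create)
key = "550e8400-e29b-41d4-a716-446655440000"
first = creator.create(key, "https://example.com/page")
retry = creator.create(key, "https://example.com/page")
```

This sketch does not handle two retries arriving at the same moment. With Redis, one way to close that gap is to reserve the key first using `SET` with the `NX` and `EX` options, so only one request proceeds to create the URL.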
STEP 7

Error Handling

A well-designed API returns consistent, helpful errors. Every error response follows the same JSON structure regardless of the type of error.

Figure 8: Consistent error format with machine-readable error codes and human-readable messages

    // Standardized error response
    {
      "error": {
        "code": "URL_NOT_FOUND",                      // machine-readable
        "message": "Short code 'abc123' not found",   // human-readable
        "status": 404,                                // redundant but convenient
        "request_id": "req_8f3k2x9p"                  // for log correlation
      }
    }

| Error Code | HTTP Status | When It Occurs |
|---|---|---|
| INVALID_URL | 400 | long_url is malformed or not HTTP/HTTPS |
| ALIAS_TAKEN | 409 | custom_alias is already in use |
| URL_NOT_FOUND | 404 | short_code does not exist |
| URL_EXPIRED | 410 | URL existed but has passed expiry date |
| UNAUTHORIZED | 401 | Auth required but no token provided |
| FORBIDDEN | 403 | Token valid but user does not own this URL |
| RATE_LIMIT_EXCEEDED | 429 | Too many requests, Retry-After header included |

Interview Tip: The request_id field

Always include a unique request_id in error responses. This lets users and support teams correlate a specific failed request with your server logs. Without it, a user reporting "I got an error" is nearly impossible to debug at scale. Generate a UUID per request in your middleware and attach it to every response.
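One way to keep the format consistent is to build every error body through a single helper. In this sketch the status mapping mirrors the error-code table in this step. The helper names and the `req_` prefix format are illustrative assumptions.

```python
import uuid
from typing import Dict, Optional

# Machine-readable error code -> HTTP status, as listed in the table above.
ERROR_STATUS: Dict[str, int] = {
    "INVALID_URL": 400,
    "UNAUTHORIZED": 401,
    "FORBIDDEN": 403,
    "URL_NOT_FOUND": 404,
    "ALIAS_TAKEN": 409,
    "URL_EXPIRED": 410,
    "RATE_LIMIT_EXCEEDED": 429,
}


def new_request_id() -> str:
    """Generate a per-request ID; middleware would attach it to every response."""
    return "req_" + uuid.uuid4().hex


def error_response(code: str, message: str,
                   request_id: Optional[str] = None) -> dict:
    """Build the standard error body for a known error code."""
    return {
        "error": {
            "code": code,
            "message": message,
            "status": ERROR_STATUS[code],  # raises KeyError for unknown codes
            "request_id": request_id or new_request_id(),
        }
    }


body = error_response("URL_NOT_FOUND", "Short code 'abc123' not found")
```

Failing loudly on an unknown code is deliberate. It prevents ad hoc error codes from reaching clients without being documented first.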

STEP 8

Complete Architecture

Figure 9: Production architecture — CDN, ALB, API servers, Redis, PostgreSQL, Key Generation Service, and Analytics pipeline

Component by Component

CDN (Cloudflare) — Sits in front of everything. Caches redirect responses at 300+ global edge locations. Absorbs DDoS attacks. Provides SSL termination at the edge for lowest-latency TLS handshakes.

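Load Balancer (AWS ALB): Layer 7 load balancer. Set the target group's routing algorithm to least outstanding requests (ALB's counterpart to Least Connections; the default is round robin). Routes POST /api/* to the API service and GET /{code} to the redirect service. Health checks hit /health every 30 seconds. ALB is a managed service that runs across multiple Availability Zones, so it is not a SPOF; a self-hosted load balancer (for example HAProxy or NGINX) would instead need an Active-Passive pair behind a virtual IP.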

API Servers (3+ instances) — Stateless application servers (Node.js or Go). Handle create, redirect, analytics, and delete. Horizontally scalable: add more servers behind the ALB for more capacity. Each server holds a local in-memory LRU cache for the hottest URLs.

Redis Cluster — Serves two purposes: (1) caching URL mappings for <1ms redirect lookups, and (2) storing rate limit counters and idempotency keys. 3-node cluster with replication for high availability.

PostgreSQL (Primary + Replica): Source of truth for URL data. Primary handles writes (~40 QPS). Read replica handles cache-miss reads for redirects; analytics queries go to ClickHouse instead. The redirect path almost never hits PostgreSQL (95%+ cache hit rate).

Key Generation Service (KGS) — Pre-generates random unique 7-character Base62 codes in batches of 10,000. API servers fetch a batch when they run low. Decouples code generation from the request path and eliminates collision checking on the hot path.

Analytics Pipeline (Kafka → ClickHouse) — Click events are published to Kafka asynchronously (does not slow the redirect). A Kafka consumer writes events to ClickHouse (columnar time-series database). The /stats endpoint queries ClickHouse, not PostgreSQL.
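The key property is that publishing a click never blocks the redirect. A minimal sketch of that contract, with a bounded in-process `queue.Queue` standing in for Kafka and a batch-draining function standing in for the consumer that writes to ClickHouse (all names are hypothetical):

```python
import queue
from typing import List


def publish_click(events: "queue.Queue[dict]", short_code: str, referrer: str) -> bool:
    """Fire-and-forget publish; returns False if the event had to be dropped."""
    try:
        events.put_nowait({"short_code": short_code, "referrer": referrer})
        return True
    except queue.Full:
        # Dropping one analytics event is preferable to delaying a redirect.
        return False


def drain_batch(events: "queue.Queue[dict]", max_batch: int = 1000) -> List[dict]:
    """Consumer side: collect up to max_batch events for one bulk insert."""
    batch: List[dict] = []
    while len(batch) < max_batch:
        try:
            batch.append(events.get_nowait())
        except queue.Empty:
            break
    return batch


events: "queue.Queue[dict]" = queue.Queue(maxsize=2)
accepted = [publish_click(events, "abc123", "https://news.example") for _ in range(3)]
batch = drain_batch(events)
```

Batching on the consumer side matters because columnar stores generally prefer fewer, larger inserts over many single-row writes.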

Interview Tip: How to Present This Architecture

In an interview, draw the architecture top-down: Client → CDN → Load Balancer → API Servers → Redis → PostgreSQL. Then add the async analytics side branch: API Servers → Kafka → ClickHouse. Highlight the two critical design decisions: (1) the multi-layer cache eliminates 95% of DB reads, and (2) analytics are processed asynchronously so they never block the redirect. This structure shows end-to-end systems thinking.

CHECKLIST

API Design Checklist

Use this checklist to verify your URL shortener design covers all important aspects. In an interview, touching on each of these topics demonstrates comprehensive API design skills.

Figure 10: 12-point checklist covering every aspect of the URL shortener API design

| # | Checklist Item | Covered in Step |
|---|---|---|
| 1 | REST endpoints defined (nouns, HTTP verbs, status codes) | Step 1 |
| 2 | Short code generation strategy (Base62, collision-free) | Step 2 |
| 3 | Database schema with correct indexes | Step 3 |
| 4 | Redirect type justified (301 vs 302) | Step 4 |
| 5 | Multi-layer caching (browser, CDN, Redis, DB) | Step 5 |
| 6 | Cache invalidation on delete | Step 5 |
| 7 | Rate limiting (different limits per tier and endpoint) | Step 6 |
| 8 | Idempotency keys for safe retries | Step 6 |
| 9 | Consistent error format with request_id | Step 7 |
| 10 | API versioning (/v1/) | Step 1 |
| 11 | Load balancer HA (Active-Passive, VIP) | Step 8 |
| 12 | Async analytics pipeline (Kafka → ClickHouse) | Step 8 |
