Day 4: Caching & CDN
What You'll Learn Today
- Why caching is essential for system performance
- Cache strategies (cache-aside, write-through, write-behind, read-through)
- Cache eviction policies (LRU, LFU, TTL)
- Redis vs Memcached comparison
- CDN architecture and how it accelerates content delivery
- Cache invalidation challenges and solutions
Why Caching Matters
Caching stores frequently accessed data in a fast storage layer to reduce latency and database load. The performance difference is dramatic.
| Storage Layer | Latency | Throughput |
|---|---|---|
| L1 Cache (CPU) | ~1 ns | Fastest |
| L2 Cache (CPU) | ~10 ns | Very fast |
| RAM (in-process) | ~100 ns | Fast |
| Redis/Memcached (same datacenter) | ~0.5-1 ms | High |
| SSD random read | ~0.1 ms | Moderate |
| HDD seek | ~10 ms | Slow |
| Network (cross-region) | ~100 ms | Slowest |
A raw SSD read is faster than a network round trip to Redis, but a database query adds parsing, planning, locking, and often several disk reads, so a full query typically costs 10-50 ms. A system reading from Redis instead of the database can therefore see a 10-100x improvement in response time. For read-heavy systems (which most systems are), caching is the single most impactful optimization.
flowchart LR
Client["Client"]
Cache["Cache<br>(Redis)<br>~1ms"]
DB["Database<br>(PostgreSQL)<br>~10-50ms"]
Client --> Cache
Cache -->|"Cache Miss"| DB
style Cache fill:#22c55e,color:#fff
style DB fill:#3b82f6,color:#fff
Cache Strategies
Cache-Aside (Lazy Loading)
The application manages the cache explicitly. On a read, check the cache first. If it is a miss, read from the database and populate the cache.
flowchart TB
App["Application"]
Cache["Cache"]
DB["Database"]
App -->|"1. Check cache"| Cache
Cache -->|"2a. Cache HIT"| App
App -->|"2b. Cache MISS β Read DB"| DB
DB -->|"3. Return data"| App
App -->|"4. Write to cache"| Cache
style Cache fill:#22c55e,color:#fff
style DB fill:#3b82f6,color:#fff
style App fill:#f59e0b,color:#fff
Pros: Only requested data is cached; cache failure does not break the system. Cons: Cache miss causes three round trips; data can become stale.
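A minimal cache-aside read in Python using the redis-py client. The function and key names (`get_user`, `user:{id}`) and the 300-second TTL are illustrative assumptions, and `fetch_user_from_db` stands in for your real database query:

```python
import json
import redis  # assumes the redis-py client is installed

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def get_user(user_id: int) -> dict:
    """Cache-aside read: check Redis first, fall back to the DB on a miss."""
    key = f"user:{user_id}"
    cached = r.get(key)                    # 1. check the cache
    if cached is not None:
        return json.loads(cached)          # 2a. hit: return the cached copy
    user = fetch_user_from_db(user_id)     # 2b. miss: read the database
    r.set(key, json.dumps(user), ex=300)   # 4. populate the cache with a 5-minute TTL
    return user

def fetch_user_from_db(user_id: int) -> dict:
    return {"id": user_id, "name": "example"}  # placeholder for a real query
```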
Read-Through
The cache itself is responsible for loading data from the database on a miss. The application always reads from the cache.
Pros: Simpler application code; consistent read path. Cons: Cache library must support data loading; initial requests are slow.
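Real read-through behavior usually lives inside the caching layer itself (a library or proxy), but a toy in-process sketch shows the shape: callers only ever talk to the cache, which owns the loader:

```python
from typing import Callable

class ReadThroughCache:
    """Toy read-through cache: the cache, not the application,
    loads missing data via the loader it was given."""

    def __init__(self, loader: Callable[[str], str]):
        self._loader = loader
        self._store: dict[str, str] = {}

    def get(self, key: str) -> str:
        if key not in self._store:
            self._store[key] = self._loader(key)  # miss: the cache loads the data
        return self._store[key]

# The application never touches the database directly.
cache = ReadThroughCache(loader=lambda key: f"value-for-{key}")
print(cache.get("user:42"))  # first call loads; later calls hit the cache
```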
Write-Through
Data is written to both the cache and the database. The write is acknowledged only after both succeed.
flowchart LR
App["Application"]
Cache["Cache"]
DB["Database"]
App -->|"1. Write"| Cache
Cache -->|"2. Write"| DB
DB -->|"3. Acknowledge"| Cache
Cache -->|"4. Acknowledge"| App
style Cache fill:#22c55e,color:#fff
style DB fill:#3b82f6,color:#fff
style App fill:#f59e0b,color:#fff
Pros: Cache is always consistent with the database. Cons: Higher write latency (two writes); caches data that may never be read.
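A hedged write-through sketch in Python. Here the application coordinates both writes (in some setups the cache layer itself forwards to the database, as in the diagram above); `write_user_to_db` is a placeholder for the real persistence call:

```python
import json
import redis  # assumes the redis-py client

r = redis.Redis(decode_responses=True)

def save_user(user: dict) -> None:
    """Write-through: acknowledge only after both stores are updated."""
    write_user_to_db(user)                         # system of record first
    r.set(f"user:{user['id']}", json.dumps(user))  # cache kept in step

def write_user_to_db(user: dict) -> None:
    pass  # placeholder for a real INSERT/UPDATE
```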
Write-Behind (Write-Back)
Data is written to the cache immediately. The cache asynchronously writes to the database later.
flowchart LR
App["Application"]
Cache["Cache"]
DB["Database"]
App -->|"1. Write"| Cache
Cache -->|"2. Acknowledge"| App
Cache -.->|"3. Async write later"| DB
style Cache fill:#22c55e,color:#fff
style DB fill:#3b82f6,color:#fff
style App fill:#f59e0b,color:#fff
Pros: Very low write latency; batches writes to the database. Cons: Risk of data loss if the cache fails before writing to the database.
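A write-behind sketch under the same assumptions, using an in-process queue and a background thread as the asynchronous writer (production systems typically rely on the cache product's own write-behind support or a durable queue instead):

```python
import json
import queue
import threading
import redis

r = redis.Redis(decode_responses=True)
pending: "queue.Queue[dict]" = queue.Queue()

def save_user(user: dict) -> None:
    """Write-behind: acknowledge after the cache write; persist later."""
    r.set(f"user:{user['id']}", json.dumps(user))  # fast path: cache only
    pending.put(user)                              # enqueue the DB write

def flush_worker() -> None:
    # If the process dies before the queue drains, queued writes are
    # lost: exactly the data-loss risk noted above.
    while True:
        user = pending.get()
        write_user_to_db(user)  # could batch several writes per round trip
        pending.task_done()

def write_user_to_db(user: dict) -> None:
    pass  # placeholder for the real database write

threading.Thread(target=flush_worker, daemon=True).start()
```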
Strategy Comparison
| Strategy | Read Latency | Write Latency | Consistency | Data Loss Risk | Complexity |
|---|---|---|---|---|---|
| Cache-Aside | Miss: high, Hit: low | N/A (app writes to DB) | Eventual | None | Low |
| Read-Through | Miss: high, Hit: low | N/A | Eventual | None | Medium |
| Write-Through | Hit: low | High (two writes) | Strong | None | Medium |
| Write-Behind | Hit: low | Very low | Eventual | Cache failure | High |
Interview Tip: Cache-aside is the most commonly used strategy and the safest default answer. Combine it with a TTL for expiration.
Cache Eviction Policies
When the cache is full, an eviction policy determines which entries to remove.
LRU (Least Recently Used)
Evicts the entry that has not been accessed for the longest time. The most widely used policy.
Access order: A, B, C, D, A, E (cache size = 4)
[A]          ← access A
[A, B]       ← access B
[A, B, C]    ← access C
[A, B, C, D] ← access D (cache full)
[B, C, D, A] ← access A (moved to most-recent position)
[C, D, A, E] ← access E (B evicted: least recently used)
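A compact LRU implementation in Python backed by an OrderedDict, with the trace above replayed at the bottom (class and variable names are illustrative):

```python
from collections import OrderedDict

class LRUCache:
    """LRU cache: most recently used keys sit at the right end of the
    OrderedDict; eviction pops from the left."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._data: OrderedDict[str, str] = OrderedDict()

    def get(self, key: str):
        if key not in self._data:
            return None
        self._data.move_to_end(key)          # mark as most recently used
        return self._data[key]

    def put(self, key: str, value: str) -> None:
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)   # evict the least recently used

# Replay the access pattern above: B is evicted when E arrives.
cache = LRUCache(capacity=4)
for k in ["A", "B", "C", "D"]:
    cache.put(k, k)
cache.get("A")         # A becomes most recently used
cache.put("E", "E")    # evicts B
```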
LFU (Least Frequently Used)
Evicts the entry with the fewest accesses. Better for workloads with clear popularity patterns but more complex to implement.
TTL (Time To Live)
Entries expire after a fixed duration. Simple and effective for data that has a known staleness tolerance.
| Policy | Mechanism | Best For | Drawback |
|---|---|---|---|
| LRU | Evict least recently accessed | General purpose | Not scan-resistant (one-time reads can flush useful data) |
| LFU | Evict least frequently accessed | Stable popularity patterns | Slow to adapt to changing patterns |
| TTL | Expire after fixed time | Data with known staleness tolerance | Thundering herd on mass expiration |
| FIFO | Evict oldest entry | Simple scenarios | Ignores access patterns |
| Random | Evict random entry | Unpredictable patterns | No intelligence |
Redis vs Memcached
Both are in-memory data stores used for caching, but they serve different needs.
flowchart TB
subgraph Redis_Features["Redis"]
R1["Data Structures<br>(strings, lists, sets, hashes, sorted sets)"]
R2["Persistence<br>(RDB, AOF)"]
R3["Replication + Clustering"]
R4["Pub/Sub + Streams"]
end
subgraph Memcached_Features["Memcached"]
M1["Simple Key-Value<br>(strings only)"]
M2["No Persistence"]
M3["Multi-threaded"]
M4["Simple and Fast"]
end
style Redis_Features fill:#ef4444,color:#fff
style Memcached_Features fill:#3b82f6,color:#fff
style R1 fill:#ef4444,color:#fff
style R2 fill:#ef4444,color:#fff
style R3 fill:#ef4444,color:#fff
style R4 fill:#ef4444,color:#fff
style M1 fill:#3b82f6,color:#fff
style M2 fill:#3b82f6,color:#fff
style M3 fill:#3b82f6,color:#fff
style M4 fill:#3b82f6,color:#fff
| Feature | Redis | Memcached |
|---|---|---|
| Data types | Strings, lists, sets, hashes, sorted sets, streams | Strings only |
| Persistence | Yes (RDB snapshots, AOF logs) | No |
| Replication | Built-in leader-follower | Not built-in |
| Clustering | Redis Cluster (auto-sharding) | Client-side sharding |
| Threading | Mostly single-threaded command execution (event loop; I/O threads since Redis 6) | Multi-threaded |
| Memory efficiency | Higher overhead per key | More memory-efficient for simple strings |
| Max value size | 512 MB | 1 MB (default, configurable) |
| Use case | Sessions, leaderboards, queues, pub/sub | Simple caching |
Choose Redis when you need rich data structures, persistence, or pub/sub. Choose Memcached when you need simple, high-throughput string caching with multi-threaded performance.
Interview Tip: In most system design interviews, Redis is the safer choice because of its versatility. Mention Memcached if the use case is purely simple key-value caching.
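To make "rich data structures" concrete, here is a sorted-set leaderboard in redis-py, one of the patterns Memcached's string-only model cannot express directly (the player names and scores are made up):

```python
import redis

r = redis.Redis(decode_responses=True)

# A sorted set keeps members ordered by score: ideal for leaderboards.
r.zadd("leaderboard", {"alice": 3200, "bob": 4100, "carol": 2800})
r.zincrby("leaderboard", 150, "carol")  # carol gains 150 points

# Top two players, highest score first, with scores attached.
top = r.zrevrange("leaderboard", 0, 1, withscores=True)
print(top)  # [('bob', 4100.0), ('alice', 3200.0)]
```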
CDN Architecture
A Content Delivery Network (CDN) caches static content at edge servers geographically close to users, reducing latency and offloading traffic from origin servers.
flowchart TB
User1["User (Tokyo)"]
User2["User (New York)"]
User3["User (London)"]
Edge1["Edge Server<br>Tokyo"]
Edge2["Edge Server<br>New York"]
Edge3["Edge Server<br>London"]
Origin["Origin Server<br>US-West"]
User1 --> Edge1
User2 --> Edge2
User3 --> Edge3
Edge1 -->|"Cache miss"| Origin
Edge2 -->|"Cache miss"| Origin
Edge3 -->|"Cache miss"| Origin
style Edge1 fill:#22c55e,color:#fff
style Edge2 fill:#22c55e,color:#fff
style Edge3 fill:#22c55e,color:#fff
style Origin fill:#3b82f6,color:#fff
style User1 fill:#f59e0b,color:#fff
style User2 fill:#f59e0b,color:#fff
style User3 fill:#f59e0b,color:#fff
How a CDN Works
- User requests a resource (image, CSS, JS)
- DNS resolves to the nearest CDN edge server
- If the edge has the resource cached (cache hit), it returns it immediately
- If not (cache miss), the edge fetches from the origin, caches it, and returns it
- Subsequent requests from nearby users are served from the edge
Push vs Pull CDN
| Type | How It Works | Pros | Cons |
|---|---|---|---|
| Push | You upload content to CDN proactively | Full control over what is cached | Must manage uploads; wastes space for unpopular content |
| Pull | CDN fetches from origin on first request | Automatic; only popular content cached | First request is slow (cache miss) |
Most modern CDNs use the pull model. Popular providers include CloudFront, Cloudflare, Akamai, and Fastly.
What to Cache on a CDN
- Static assets (images, CSS, JavaScript, fonts)
- Video and audio files
- API responses that rarely change
- HTML pages (for static sites)
Cache Invalidation
Cache invalidation is famously one of the "two hard things in computer science" (the other being naming things). When the source data changes, the cache must be updated or removed.
Strategies
| Strategy | How It Works | Consistency | Complexity |
|---|---|---|---|
| TTL-based | Cache expires after a set time | Eventual (stale for TTL duration) | Low |
| Event-based | Publish event on data change; invalidate cache | Near real-time | Medium |
| Version-based | Include version in cache key; new version = new key | Immediate (old key orphaned) | Medium |
| Write-through | Update cache on every write | Strong | Low |
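A short sketch of version-based invalidation with redis-py; the key scheme (`article:{id}:version`) is an assumption for illustration. Bumping the version makes every reader miss the old key, which simply ages out via its TTL:

```python
import redis

r = redis.Redis(decode_responses=True)

def article_cache_key(article_id: int) -> str:
    """Build a versioned key: readers always use the current version."""
    version = r.get(f"article:{article_id}:version") or "1"
    return f"article:{article_id}:v{version}"

def invalidate_article(article_id: int) -> None:
    # Old keys become orphans and expire on their own; no delete needed.
    r.incr(f"article:{article_id}:version")
```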
Common Challenges
Thundering herd: When a popular cache entry expires, thousands of requests simultaneously hit the database. Solutions (a lock-based sketch follows this list):
- Lock/mutex: Only one request fetches from DB; others wait
- Stale-while-revalidate: Serve stale data while refreshing in the background
- Pre-warming: Refresh cache before it expires
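A hedged sketch of the lock/mutex approach, using Redis's atomic SET with NX and EX as the lock; the key names, TTLs, and `fetch_product_from_db` placeholder are assumptions:

```python
import json
import time
import redis

r = redis.Redis(decode_responses=True)

def get_product(product_id: int) -> dict:
    """Cache-aside with a mutex: on a miss, only the lock holder rebuilds
    the entry; everyone else briefly waits and retries the cache."""
    key = f"product:{product_id}"
    lock_key = key + ":lock"
    while True:
        cached = r.get(key)
        if cached is not None:
            return json.loads(cached)
        # SET NX EX acts as a lock with a timeout, so a crashed holder
        # cannot block rebuilds forever.
        if r.set(lock_key, "1", nx=True, ex=10):
            try:
                product = fetch_product_from_db(product_id)
                r.set(key, json.dumps(product), ex=300)
                return product
            finally:
                r.delete(lock_key)
        time.sleep(0.05)  # lost the race: wait for the winner to fill the cache

def fetch_product_from_db(product_id: int) -> dict:
    return {"id": product_id}  # placeholder for the real query
```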
Cache stampede: Similar to the thundering herd, but triggered by an empty cache (cold start) rather than a single expiring key. Solution: pre-warm the cache before going live.
Inconsistency: Cache and database are out of sync. Solutions:
- Keep TTLs short (30 seconds to 5 minutes)
- Use event-driven invalidation
- Accept eventual consistency where appropriate
flowchart TB
subgraph Problem["Thundering Herd Problem"]
Expire["Cache Key Expires"]
R1["Request 1"]
R2["Request 2"]
R3["Request N..."]
DB["Database<br>Overwhelmed"]
end
Expire --> R1
Expire --> R2
Expire --> R3
R1 --> DB
R2 --> DB
R3 --> DB
style Problem fill:#ef4444,color:#fff
style Expire fill:#ef4444,color:#fff
style DB fill:#ef4444,color:#fff
style R1 fill:#f59e0b,color:#fff
style R2 fill:#f59e0b,color:#fff
style R3 fill:#f59e0b,color:#fff
Caching at Every Layer
A well-designed system caches at multiple levels.
flowchart TB
Browser["Browser Cache<br>(HTTP headers)"]
CDN["CDN<br>(Edge servers)"]
LB["Load Balancer<br>(Connection caching)"]
AppCache["Application Cache<br>(In-process, local)"]
DistCache["Distributed Cache<br>(Redis / Memcached)"]
DB["Database<br>(Query cache, buffer pool)"]
Browser --> CDN --> LB --> AppCache --> DistCache --> DB
style Browser fill:#22c55e,color:#fff
style CDN fill:#22c55e,color:#fff
style AppCache fill:#f59e0b,color:#fff
style DistCache fill:#8b5cf6,color:#fff
style DB fill:#3b82f6,color:#fff
| Layer | What to Cache | TTL | Example |
|---|---|---|---|
| Browser | Static assets, API responses | Minutes to days | Cache-Control: max-age=3600 |
| CDN | Images, CSS, JS, videos | Hours to days | CloudFront, Cloudflare |
| Application | Computed results, config | Seconds to minutes | In-memory HashMap |
| Distributed Cache | DB query results, sessions | Seconds to hours | Redis, Memcached |
| Database | Query plans, buffer pool | Automatic | InnoDB buffer pool |
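At the browser and CDN layers, caching is driven by response headers set at the origin. A minimal sketch using Flask (an assumed framework choice; any web framework exposes the same headers):

```python
from flask import Flask, jsonify

app = Flask(__name__)

@app.get("/api/articles/<int:article_id>")
def article(article_id: int):
    resp = jsonify({"id": article_id, "title": "example"})
    # Browsers may reuse this for 1 hour; shared caches (CDNs) for 1 day.
    resp.headers["Cache-Control"] = "public, max-age=3600, s-maxage=86400"
    return resp
```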
Practice Problems
Exercise 1: Basics
A news website serves 50,000 articles. The top 1,000 articles account for 80% of all traffic. Design a caching strategy that specifies:
- Which cache strategy to use (cache-aside, write-through, etc.)
- Which eviction policy to use and why
- What TTL to set and why
- How to handle cache invalidation when an article is updated
Exercise 2: Applied
Design a caching architecture for a news website with these requirements:
- 10 million DAU
- 100,000 articles
- Articles are updated infrequently (average once per day)
- Home page shows trending articles (changes every 5 minutes)
- Article pages include comments (change frequently)
Specify: caching layers, Redis vs Memcached, cache-aside vs write-through, TTLs for different content types, and CDN strategy.
Challenge
An e-commerce platform experiences a thundering herd problem during flash sales. When a popular product page cache expires, thousands of requests hit the database simultaneously, causing timeouts. Design a solution that:
- Prevents the thundering herd
- Keeps product information fresh (within 10 seconds)
- Handles 100,000 QPS for popular products
- Maintains data consistency when inventory changes
Summary
| Concept | Description |
|---|---|
| Cache-Aside | App checks cache first; loads from DB on miss |
| Write-Through | Write to cache and DB simultaneously |
| Write-Behind | Write to cache; async write to DB |
| LRU | Evict least recently used entry |
| TTL | Entries expire after a fixed time |
| Redis | Rich data structures, persistence, pub/sub |
| Memcached | Simple, fast, multi-threaded string caching |
| CDN | Edge servers caching static content near users |
| Cache Invalidation | Keeping cache consistent with source data |
Key Takeaways
- Cache-aside with TTL is the safest default for most interview scenarios
- Cache at every layer - browser, CDN, application, distributed cache, database
- Redis is the go-to choice for most caching needs in interviews
- CDN is essential for any system serving static content globally
- Cache invalidation is hard - use TTL + event-based invalidation for best results
- Watch for thundering herd - use locks, stale-while-revalidate, or pre-warming
References
- Xu, Alex. System Design Interview - An Insider's Guide. Byte Code LLC, 2020.
- Redis Documentation
- Cloudflare - What is a CDN?
- Facebook's Memcached Paper
Next up: On Day 5, we explore Message Queues and Async Processing - covering Kafka, RabbitMQ, event-driven architecture, and delivery guarantees for building resilient distributed systems.