Learn System Design in 10 Days

Day 9: Design Video Streaming & File Storage

What You'll Learn Today

  • Video upload and processing pipeline
  • Video transcoding and encoding formats
  • Adaptive bitrate streaming (HLS/DASH)
  • CDN strategy for video delivery
  • Thumbnail generation
  • File chunking and deduplication
  • Sync conflict resolution
  • Block storage vs object storage

Part 1: Video Streaming Platform (YouTube-like)

Requirements

  • Upload: Users upload videos of various sizes and formats
  • Processing: Transcode to multiple resolutions and bitrates
  • Streaming: Adaptive playback based on network speed
  • Scale: Billions of views per day, petabytes of storage

High-Level Architecture

flowchart TB
    U["User Upload"]
    API["API Server"]
    OS["Original Storage (S3)"]
    MQ["Message Queue"]
    TP["Transcoding Pipeline"]
    TS["Transcoded Storage (S3)"]
    TG["Thumbnail Generator"]
    MD[("Metadata DB")]
    CDN["CDN"]
    V["Viewer"]

    U --> API
    API -->|"Store original"| OS
    API -->|"Trigger processing"| MQ
    MQ --> TP
    MQ --> TG
    TP --> TS
    TG --> TS
    API --> MD
    TS --> CDN
    V --> CDN

    style API fill:#3b82f6,color:#fff
    style MQ fill:#f59e0b,color:#fff
    style TP fill:#8b5cf6,color:#fff
    style CDN fill:#22c55e,color:#fff
    style OS fill:#ef4444,color:#fff

Video Upload Pipeline

sequenceDiagram
    participant U as User
    participant API as API Server
    participant S3 as Object Storage
    participant Q as Message Queue
    participant T as Transcoder

    U->>API: Upload video (multipart)
    API->>S3: Store original file
    API->>Q: Enqueue transcoding job
    API->>U: Upload accepted (202)
    Q->>T: Pick up job
    T->>S3: Read original
    T->>T: Transcode to multiple formats
    T->>S3: Store transcoded versions
    T->>API: Notify completion
    API->>U: Video ready notification

Upload considerations:

  • Use multipart upload for large files (resume on failure)
  • Pre-signed URLs: Let clients upload directly to S3, bypassing the API server
  • Upload progress: Client tracks chunk upload progress
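As a minimal sketch of the client side of a resumable multipart upload: the file is cut into numbered parts, and parts already confirmed before an interruption are skipped on retry. `upload_part` is a hypothetical stand-in for the real transfer call (e.g. a PUT to a pre-signed URL), and the 5 MB part size mirrors S3's minimum for multipart uploads.

```python
import io

PART_SIZE = 5 * 1024 * 1024  # 5 MB: the minimum part size S3 allows for multipart uploads

def iter_parts(fileobj, part_size=PART_SIZE):
    """Yield (part_number, bytes) pairs; part numbers start at 1, as in S3."""
    part_number = 1
    while True:
        chunk = fileobj.read(part_size)
        if not chunk:
            break
        yield part_number, chunk
        part_number += 1

def upload_resumable(fileobj, upload_part, part_size=PART_SIZE, completed=None):
    """Upload parts, skipping any already in `completed` (a set of part numbers).

    `upload_part(part_number, data)` is a placeholder for the actual transfer.
    Returns the updated set of completed part numbers.
    """
    completed = set(completed or ())
    for part_number, data in iter_parts(fileobj, part_size):
        if part_number in completed:
            continue  # already uploaded before the interruption
        upload_part(part_number, data)
        completed.add(part_number)
    return completed
```

Tracking `completed` on the client (or querying the server for received parts) is what makes the upload resumable rather than restart-from-zero.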

Video Transcoding

Transcoding converts the original video into multiple resolutions and bitrates.

| Resolution | Bitrate | Use Case |
|---|---|---|
| 240p | 300 Kbps | Very slow networks |
| 360p | 500 Kbps | Mobile data saving |
| 480p | 1 Mbps | Standard mobile |
| 720p | 2.5 Mbps | HD mobile/tablet |
| 1080p | 5 Mbps | Desktop HD |
| 4K | 20 Mbps | Large screens |

Encoding formats: H.264 (most compatible), H.265/HEVC (better compression), VP9 (open source), AV1 (newest, best compression)

Transcoding pipeline:

flowchart LR
    O["Original Video"]
    S["Split into segments"]
    T1["Transcode 240p"]
    T2["Transcode 480p"]
    T3["Transcode 720p"]
    T4["Transcode 1080p"]
    M["Generate manifest"]

    O --> S
    S --> T1 & T2 & T3 & T4
    T1 & T2 & T3 & T4 --> M

    style O fill:#ef4444,color:#fff
    style S fill:#f59e0b,color:#fff
    style T1 fill:#3b82f6,color:#fff
    style T2 fill:#3b82f6,color:#fff
    style T3 fill:#3b82f6,color:#fff
    style T4 fill:#3b82f6,color:#fff
    style M fill:#22c55e,color:#fff

Parallelization: Split video into segments and transcode each in parallel using distributed workers (e.g., AWS Lambda, Kubernetes jobs).
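The fan-out above can be sketched as one job per (segment, resolution) pair dispatched to a worker pool. This uses a local thread pool purely for illustration; `transcode_segment` is a hypothetical placeholder for the real encoder invocation on a distributed worker.

```python
from concurrent.futures import ThreadPoolExecutor

RESOLUTIONS = ["240p", "480p", "720p", "1080p"]

def transcode_segment(segment_id, resolution):
    # Placeholder for the real encoder job (e.g. an ffmpeg run on a worker node).
    return f"{segment_id}_{resolution}.mp4"

def transcode_all(segment_ids, resolutions=RESOLUTIONS, max_workers=8):
    """Fan out one job per (segment, resolution) pair and collect the outputs."""
    jobs = [(s, r) for s in segment_ids for r in resolutions]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda j: transcode_segment(*j), jobs))
```

In production the same fan-out shape maps onto a queue of jobs consumed by Lambda functions or Kubernetes pods, with the manifest generated once every pair completes.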


Adaptive Bitrate Streaming (ABR)

flowchart TB
    subgraph HLS["HLS (HTTP Live Streaming)"]
        direction TB
        H1["Master Playlist (.m3u8)"]
        H2["720p Playlist"]
        H3["480p Playlist"]
        H4["240p Playlist"]
        H1 --> H2 & H3 & H4
        H2 --> HS1["Segment 1"] & HS2["Segment 2"]
    end
    P["Player"]
    P -->|"Fetch playlist"| H1
    P -->|"Monitor bandwidth"| P
    P -->|"Switch quality"| H2

    style HLS fill:#8b5cf6,color:#fff
    style P fill:#3b82f6,color:#fff

| Protocol | Standard Body | Segment Format | Typical Latency |
|---|---|---|---|
| HLS | Apple | .ts or .fmp4 | 6-30s |
| DASH | MPEG | .mp4 segments | 2-10s |
| CMAF | Apple + MPEG (joint) | .fmp4 | 2-5s |
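To make the manifest concrete, here is a sketch of a helper that emits a minimal HLS master playlist in the RFC 8216 format; the variant bitrates and URIs passed in are illustrative.

```python
def master_playlist(variants):
    """Build a minimal HLS master playlist (RFC 8216) from
    (bandwidth_bps, resolution, uri) tuples."""
    lines = ["#EXTM3U"]
    for bandwidth, resolution, uri in variants:
        lines.append(f"#EXT-X-STREAM-INF:BANDWIDTH={bandwidth},RESOLUTION={resolution}")
        lines.append(uri)
    return "\n".join(lines) + "\n"
```

The player fetches this file first, then requests segments from whichever variant playlist fits its current bandwidth.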

How ABR works:

  1. Video is split into small segments (2-10 seconds each)
  2. Each segment is encoded at multiple bitrates
  3. A manifest file lists all available qualities
  4. The player monitors network speed and switches quality per-segment
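Step 4 is the heart of ABR. A minimal quality-selection heuristic, assuming the bitrate ladder from the table above and a safety margin so the player doesn't saturate the measured throughput:

```python
BITRATES_KBPS = [300, 500, 1000, 2500, 5000]  # the ladder from the table above

def pick_bitrate(measured_kbps, ladder=BITRATES_KBPS, safety=0.8):
    """Pick the highest rung whose bitrate fits within a safety margin of
    measured throughput; fall back to the lowest rung on very slow links."""
    budget = measured_kbps * safety
    eligible = [b for b in ladder if b <= budget]
    return max(eligible) if eligible else min(ladder)
```

Real players (hls.js, ExoPlayer) add buffer-level signals and hysteresis to avoid oscillating between qualities, but the core decision is this comparison, made once per segment.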

CDN for Video Delivery

flowchart TB
    O["Origin (S3)"]
    E1["Edge Server (US)"]
    E2["Edge Server (EU)"]
    E3["Edge Server (Asia)"]
    U1["US Viewer"]
    U2["EU Viewer"]
    U3["Asia Viewer"]

    O --> E1 & E2 & E3
    E1 --> U1
    E2 --> U2
    E3 --> U3

    style O fill:#8b5cf6,color:#fff
    style E1 fill:#22c55e,color:#fff
    style E2 fill:#22c55e,color:#fff
    style E3 fill:#22c55e,color:#fff

  • Popular videos: Cached at edge servers worldwide
  • Long-tail videos: Served from origin or regional caches
  • Cache strategy: Push popular content to edges; pull on demand for others
  • Cost optimization: Compress video, use efficient codecs, limit cache TTL for unpopular content
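The pull-on-demand behavior can be sketched as a toy pull-through cache: an edge serves from memory while an entry is fresh, and falls back to the origin on a miss or after the TTL expires. The class and its names are illustrative, not a real CDN API.

```python
import time

class EdgeCache:
    """Toy pull-through edge cache: serve from memory if fresh, else pull from origin."""

    def __init__(self, fetch_origin, ttl_seconds=3600, clock=time.monotonic):
        self.fetch_origin = fetch_origin  # callable that retrieves content from origin
        self.ttl = ttl_seconds
        self.clock = clock
        self.store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self.store.get(key)
        now = self.clock()
        if entry and entry[0] > now:
            return entry[1]              # cache hit: no origin traffic
        value = self.fetch_origin(key)   # cache miss: pull from origin and cache it
        self.store[key] = (now + self.ttl, value)
        return value
```

A shorter TTL for long-tail content is exactly the "limit cache TTL for unpopular content" lever from the list above: it frees edge capacity for videos that are actually being watched.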

Thumbnail Generation

  • Extract frames at regular intervals (e.g., every 5 seconds)
  • Use ML to select the most visually interesting frame
  • Generate multiple sizes: small (list view), medium (card), large (player preview)
  • Store thumbnails alongside video segments in object storage
  • Serve through CDN
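Frame extraction is typically delegated to ffmpeg. A sketch that builds the command for the "one frame every N seconds" step, using ffmpeg's `fps` and `scale` filters (the output pattern and width are illustrative defaults):

```python
def thumbnail_cmd(video_path, out_pattern="thumb_%04d.jpg", every_seconds=5, width=320):
    """Build an ffmpeg command that grabs one frame every `every_seconds`,
    scaled to `width` px wide (`-2` keeps the aspect ratio, rounded to even)."""
    vf = f"fps=1/{every_seconds},scale={width}:-2"
    return ["ffmpeg", "-i", video_path, "-vf", vf, out_pattern]
```

The ML-based "most interesting frame" step would then score the extracted candidates and keep the winner at each target size.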

Part 2: Distributed File Storage (Google Drive / Dropbox)

Requirements

  • Upload and download files
  • Sync files across devices
  • Share files with other users
  • Version history
  • Conflict resolution

Architecture

flowchart TB
    C1["Desktop Client"] & C2["Mobile Client"] & C3["Web Client"]
    API["API Gateway"]
    MS["Metadata Service"]
    BS["Block Service"]
    SS["Sync Service"]
    NS["Notification Service"]
    MD[("Metadata DB (SQL)")]
    BK[("Block Storage (S3)")]
    MQ["Message Queue"]

    C1 & C2 & C3 --> API
    API --> MS & BS & SS
    MS --> MD
    BS --> BK
    SS --> MQ
    MQ --> NS
    NS --> C1 & C2 & C3

    style API fill:#3b82f6,color:#fff
    style MS fill:#8b5cf6,color:#fff
    style BS fill:#ef4444,color:#fff
    style SS fill:#22c55e,color:#fff
    style NS fill:#f59e0b,color:#fff

File Chunking and Deduplication

Instead of uploading entire files, split them into fixed-size chunks (typically 4 MB).

flowchart LR
    F["File (16 MB)"]
    C1["Chunk 1 (4 MB)\nhash: a1b2c3"]
    C2["Chunk 2 (4 MB)\nhash: d4e5f6"]
    C3["Chunk 3 (4 MB)\nhash: a1b2c3"]
    C4["Chunk 4 (4 MB)\nhash: g7h8i9"]

    F --> C1 & C2 & C3 & C4

    style F fill:#3b82f6,color:#fff
    style C1 fill:#22c55e,color:#fff
    style C2 fill:#8b5cf6,color:#fff
    style C3 fill:#22c55e,color:#fff
    style C4 fill:#f59e0b,color:#fff

Deduplication: Chunks 1 and 3 have the same hash, so that content is stored only once. This saves significant storage.
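Content-addressed chunking and dedup fit in a few lines: each chunk is keyed by its SHA-256 digest, and only unseen digests enter the store. The `store` dict here stands in for the real block storage (S3).

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MB, as in the example above

def chunk_and_dedup(data, store, chunk_size=CHUNK_SIZE):
    """Split `data` into fixed-size chunks keyed by SHA-256; only unseen
    chunks enter `store`. Returns the ordered list of chunk hashes that
    reconstructs the file (the file's chunk manifest)."""
    manifest = []
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:      # deduplication: identical content stored once
            store[digest] = chunk
        manifest.append(digest)
    return manifest
```

The manifest (the `chunks` array in the metadata record) preserves order, so duplicate content costs one stored chunk plus one extra hash in the list.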

Benefits of chunking:

| Benefit | Description |
|---|---|
| Incremental sync | Only upload changed chunks |
| Deduplication | Same content stored once |
| Resume upload | Retry individual chunks |
| Parallel transfer | Upload/download chunks in parallel |
| Efficient versioning | Store only changed chunks per version |

Sync and Conflict Resolution

flowchart TB
    subgraph Normal["No Conflict"]
        direction TB
        N1["Device A edits file"]
        N2["Sync to server"]
        N3["Notify Device B"]
        N4["Device B pulls update"]
        N1 --> N2 --> N3 --> N4
    end
    subgraph Conflict["Conflict"]
        direction TB
        C1["Device A and B edit same file offline"]
        C2["Both sync to server"]
        C3["Server detects conflict (version mismatch)"]
        C4["Keep both versions, let user resolve"]
        C1 --> C2 --> C3 --> C4
    end
    style Normal fill:#22c55e,color:#fff
    style Conflict fill:#ef4444,color:#fff

Conflict detection: Each file has a version number. When a client uploads, it sends its known version. If the server version differs, there's a conflict.
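The version check can be sketched as a compare-and-set on the server: the upload succeeds only if the client's base version matches the server's current version; otherwise the upload is preserved as a conflicted copy. The dict-based store and the `.conflicted` naming are illustrative.

```python
def apply_upload(server_files, file_id, client_base_version, new_content):
    """Accept the upload if the client edited the version the server holds;
    otherwise keep both copies (Dropbox-style) and report the conflict."""
    current = server_files.get(file_id, {"version": 0, "content": None})
    if client_base_version != current["version"]:
        # Concurrent edit detected: save as a conflicted copy, don't overwrite.
        server_files[f"{file_id}.conflicted"] = {"version": 1, "content": new_content}
        return "conflict"
    server_files[file_id] = {"version": current["version"] + 1,
                             "content": new_content}
    return "ok"
```

Because the check and the write happen on the server, two devices that both edited version 5 offline can never silently clobber each other: the second upload always lands as a conflicted copy.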

Resolution strategies:

| Strategy | How It Works | Notes / Example |
|---|---|---|
| Last write wins | Overwrite with the latest upload | Simple but lossy |
| Keep both | Save as a "conflicted copy" | Dropbox |
| Merge | Auto-merge changes (text files) | Google Docs |
| User resolves | Present both versions to the user | Git |

Metadata Service

The metadata database stores file information, not the file content itself.

File Metadata:
{
  "file_id": "f_123",
  "name": "report.pdf",
  "path": "/documents/2025/",
  "size": 2456789,
  "chunks": ["a1b2c3", "d4e5f6", "g7h8i9"],
  "version": 5,
  "owner": "user_456",
  "shared_with": ["user_789"],
  "created_at": "2025-01-15",
  "modified_at": "2025-03-20"
}

Use a relational database (PostgreSQL) for metadata because:

  • ACID transactions for consistency
  • Complex queries (search, sharing permissions)
  • Hierarchical structure (folders)
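A relational schema matching the metadata record above might look like the following sketch. It uses SQLite in memory as a stand-in for PostgreSQL, with the chunk list and sharing set normalized into their own tables; table and column names are illustrative.

```python
import sqlite3

SCHEMA = """
CREATE TABLE files (
    file_id  TEXT PRIMARY KEY,
    name     TEXT NOT NULL,
    path     TEXT NOT NULL,
    version  INTEGER NOT NULL DEFAULT 1,
    owner    TEXT NOT NULL
);
CREATE TABLE file_chunks (           -- ordered chunk manifest per file
    file_id    TEXT REFERENCES files(file_id),
    position   INTEGER NOT NULL,
    chunk_hash TEXT NOT NULL,
    PRIMARY KEY (file_id, position)
);
CREATE TABLE shares (                -- sharing permissions
    file_id TEXT REFERENCES files(file_id),
    user_id TEXT NOT NULL,
    PRIMARY KEY (file_id, user_id)
);
"""

def open_metadata_db():
    conn = sqlite3.connect(":memory:")  # stand-in for the real PostgreSQL instance
    conn.executescript(SCHEMA)
    return conn
```

Queries like "all files shared with user X" become straightforward joins, which is exactly the kind of access pattern that motivates SQL over a key-value store here.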

Block Storage vs Object Storage

| Feature | Block Storage | Object Storage |
|---|---|---|
| Access | Read/write by blocks | Read/write whole objects |
| Performance | Low latency, high IOPS | Higher latency |
| Scalability | Limited | Virtually unlimited |
| Cost | Higher | Lower |
| Use case | Databases, VMs | Files, media, backups |
| Examples | EBS, SAN | S3, GCS, Azure Blob |

For file storage: Use object storage (S3) for file chunks. It's cost-effective, durable (99.999999999% / "eleven 9s"), and scales without limit.


Summary

| Concept | Description |
|---|---|
| Video upload | Multipart upload, pre-signed URLs |
| Transcoding | Convert to multiple resolutions in parallel |
| ABR streaming | HLS/DASH, player switches quality per segment |
| CDN | Edge caching for low-latency video delivery |
| File chunking | Split files into 4 MB chunks for incremental sync |
| Deduplication | Hash-based, same content stored once |
| Sync conflicts | Version-based detection, keep both copies |
| Metadata service | Relational DB for file structure and permissions |
| Object storage | S3 for durable, scalable file chunk storage |

Key Takeaways

  1. Video streaming requires a pipeline approach: upload, transcode, store, deliver via CDN
  2. Adaptive bitrate streaming lets the player choose quality based on network conditions
  3. File chunking enables incremental sync, deduplication, and parallel transfer
  4. Conflict resolution is one of the hardest problems; keep it simple (Dropbox's "conflicted copy" approach works well)

Practice Problems

Problem 1: Basic

Design an image hosting service (like Imgur). Consider upload, storage, resizing to multiple dimensions, and delivery. What's the key difference from video hosting?

Problem 2: Intermediate

Your video platform needs to support live streaming alongside pre-recorded content. How does the architecture change? Consider latency requirements, segment duration, and viewer scalability.

Challenge

Design a collaborative document editing system (like Google Docs). How do you handle real-time co-editing by multiple users? Research Operational Transformation (OT) or CRDTs and explain which you'd choose and why.



Next up: In Day 10, we'll design an E-Commerce Platform and a Rate Limiter, and wrap up with final interview tips.