Design a scalable Video Streaming System

Video streaming platforms like YouTube, Netflix, Hotstar, Amazon Prime, and Vimeo serve millions of hours of video daily. Designing such a system is challenging because video streaming is bandwidth-heavy and latency-sensitive, and it has to scale to very large audiences.

Unlike text or images, video data is large, continuous, and must be delivered smoothly even under unstable network conditions. In this post, we will design a realistic, scalable video streaming system and explain why each architectural decision is made.


Index

  1. Understanding the Video Streaming Problem
  2. Functional Requirements
  3. Non-Functional Requirements (Traffic, Bandwidth & Scale)
  4. High-Level Architecture Overview
  5. Video Upload & Ingestion Flow
  6. Video Processing Pipeline
  7. Video Storage Architecture
  8. Content Delivery Network (CDN) & Edge Caching
  9. Video Playback Flow (End-to-End Lifecycle)
  10. API Design for Video Streaming
  11. Database Design & Metadata Modeling
  12. Scalability Strategy & Traffic Justification
  13. Reliability, Fault Tolerance & Failover
  14. Security & Access Control
  15. Trade-offs & Design Decisions
  16. Real-World Architecture Summary

1. Understanding the Video Streaming Problem

A video streaming system must:

  • Accept video uploads
  • Process videos into multiple formats
  • Store massive video files efficiently
  • Deliver videos smoothly to users worldwide
  • Adapt video quality based on network conditions

The core challenge is scale:

  • Relatively few uploads (writes)
  • Extremely high read volume (views)
  • Massive bandwidth consumption

This read/write imbalance shapes the entire system design.


2. Functional Requirements

The system should allow users to:

  • Upload videos
  • Watch videos with minimal buffering
  • Resume playback
  • Support multiple resolutions (240p → 4K)
  • Support mobile and web clients

Optional but realistic:

  • Subtitles
  • Thumbnails
  • Recommendations
  • Live streaming (out of scope here)

3. Non-Functional Requirements (Traffic, Bandwidth & Scale)

Example Traffic Assumptions

  • 10 million daily active users
  • Average video length: 10 minutes
  • Average bitrate: 3 Mbps
  • Peak concurrent viewers: 1 million

➡️ Bandwidth requirement:
1M × 3 Mbps = 3 Tbps peak
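
The arithmetic above can be checked with a few lines of Python. This is a back-of-envelope sketch using only the traffic assumptions listed in this section; the per-video size assumes a single rendition at the average bitrate, which is a simplification.

```python
# Back-of-envelope capacity numbers from the traffic assumptions above.
PEAK_CONCURRENT_VIEWERS = 1_000_000
AVG_BITRATE_BPS = 3_000_000      # 3 Mbps
AVG_VIDEO_SECONDS = 10 * 60      # 10 minutes

# Peak egress: every concurrent viewer pulls the average bitrate.
peak_bps = PEAK_CONCURRENT_VIEWERS * AVG_BITRATE_BPS
peak_tbps = peak_bps / 1e12

# Size of one average video at the average bitrate (bits -> bytes -> MB).
video_bytes = AVG_BITRATE_BPS * AVG_VIDEO_SECONDS / 8
video_mb = video_bytes / 1e6

print(f"Peak egress: {peak_tbps:.1f} Tbps")
print(f"One 10-minute video at 3 Mbps: {video_mb:.0f} MB")
```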

Key Non-Functional Goals

  • Low startup latency
  • Minimal buffering
  • High availability (99.9%+)
  • Horizontal scalability
  • Global delivery

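These numbers justify CDNs, chunking, and adaptive bitrate streaming: no single origin can serve 3 Tbps, so traffic has to be spread across edge caches, and viewers on slower links need lower-bitrate renditions.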


4. High-Level Architecture Overview

At a high level, the system consists of:

  • Client (Web/Mobile/TV)
  • Load Balancer
  • Upload API Service
  • Video Processing Service
  • Metadata Service
  • Object Storage
  • CDN
  • Databases
  • Message Queue

Each component is designed to scale independently.


5. Video Upload & Ingestion Flow

Step-by-Step Flow

  1. User selects a video in the client, which asks the Upload API to start an upload
  2. Upload API generates a pre-signed URL
  3. Client uploads the video directly to object storage
  4. Metadata is stored in the database
  5. An event is sent to the message queue to trigger processing

Why Direct Upload to Storage?

  • Avoids overloading API servers
  • Supports large file uploads
  • Improves reliability
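
As a rough illustration, the ingestion steps above might look like the sketch below. Everything here is a stand-in: `videos_db` and `processing_queue` replace a real database and message queue, `presign_upload_url` returns a placeholder instead of a real object-store signature, and the `on_upload_complete` callback (for example, triggered by a storage notification) is an assumption about how the system learns that the upload finished.

```python
import uuid
from collections import deque

videos_db = {}               # stand-in for the metadata database
processing_queue = deque()   # stand-in for the message queue


def presign_upload_url(video_id: str) -> str:
    # Hypothetical placeholder: a real system would ask the object store
    # to issue a time-limited pre-signed upload URL.
    return f"https://storage.example.com/uploads/{video_id}?signature=PLACEHOLDER"


def initiate_upload() -> dict:
    """Step 2: hand the client a URL so the bytes bypass the API servers."""
    video_id = uuid.uuid4().hex
    return {"upload_url": presign_upload_url(video_id), "video_id": video_id}


def on_upload_complete(video_id: str, user_id: str, title: str) -> None:
    """Steps 4 and 5: record metadata, then notify the processing pipeline."""
    videos_db[video_id] = {"user_id": user_id, "title": title, "status": "UPLOADED"}
    processing_queue.append({"event": "video_uploaded", "video_id": video_id})
```

The API server only handles two small requests per upload; the video bytes themselves go straight from the client to object storage.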

6. Video Processing Pipeline

Once uploaded, the video enters the processing pipeline.

Components

  • Transcoding Service
  • Thumbnail Generator
  • Subtitle Processor (optional)

Encoding & Formats

  • H.264 / H.265
  • VP9 / AV1 (optional)

Adaptive Bitrate Streaming (ABR)

Videos are split into small chunks (2–6 seconds) and encoded at multiple resolutions.

Why ABR?

  • Adapts to changing network conditions
  • Reduces buffering by dropping to a lower bitrate instead of stalling
  • Improves user experience

Protocols:

  • HLS
  • MPEG-DASH
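
To make ABR more concrete, here is a minimal sketch that builds an HLS master playlist listing one variant per rendition. The rendition ladder (resolutions and bitrates) and the `{name}/playlist.m3u8` paths are illustrative assumptions, not values from a real encoder configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    name: str
    width: int
    height: int
    bandwidth_bps: int


# Illustrative ladder; real ladders are tuned per content type.
LADDER = [
    Rendition("240p", 426, 240, 400_000),
    Rendition("480p", 854, 480, 1_200_000),
    Rendition("720p", 1280, 720, 3_000_000),
    Rendition("1080p", 1920, 1080, 6_000_000),
]


def master_playlist(ladder) -> str:
    """Build an HLS master playlist with one variant stream per rendition."""
    lines = ["#EXTM3U"]
    for r in ladder:
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={r.bandwidth_bps},"
            f"RESOLUTION={r.width}x{r.height}"
        )
        # Each variant points at its own media playlist of 2-6 second chunks.
        lines.append(f"{r.name}/playlist.m3u8")
    return "\n".join(lines) + "\n"


print(master_playlist(LADDER))
```

The player downloads this file once, then picks which variant playlist to follow based on the bandwidth it measures.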

7. Video Storage Architecture

Storage Type

Object Storage

  • AWS S3
  • Google Cloud Storage
  • Azure Blob Storage

Why Object Storage?

  • Massive scalability
  • Cost-effective
  • High durability (for example, Amazon S3 is designed for 99.999999999%, i.e. eleven 9s)

Storage Layout

/videos/{video_id}/{resolution}/{chunk_id}

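This layout simplifies CDN integration: each chunk has a stable, predictable path that can map directly to a CDN URL and be cached independently.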
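
A small helper shows how the layout turns into object keys and CDN URLs. The `chunk_{index}.ts` naming and the `cdn.example.com` host are hypothetical; the source only fixes the `/videos/{video_id}/{resolution}/{chunk_id}` shape.

```python
CDN_HOST = "https://cdn.example.com"  # hypothetical CDN hostname


def chunk_key(video_id: str, resolution: str, chunk_index: int) -> str:
    """Object-storage key following /videos/{video_id}/{resolution}/{chunk_id}."""
    # Zero-padding keeps keys lexicographically sorted in playback order.
    return f"/videos/{video_id}/{resolution}/chunk_{chunk_index:05d}.ts"


def chunk_url(video_id: str, resolution: str, chunk_index: int) -> str:
    """The CDN URL is simply the host plus the storage key."""
    return CDN_HOST + chunk_key(video_id, resolution, chunk_index)


print(chunk_url("vid123", "720p", 7))
```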


8. Content Delivery Network (CDN) & Edge Caching

CDNs are critical for video streaming.

CDN Responsibilities

  • Cache video chunks at edge locations
  • Serve content closest to users
  • Reduce latency and origin load

Popular CDNs

  • CloudFront
  • Akamai
  • Cloudflare

Why CDN?

  • Video traffic is read-heavy
  • Origin servers cannot handle global load alone

To learn more about ABR and CDNs, read How ABR and CDN together define modern video streaming


9. Video Playback Flow (End-to-End Lifecycle)

Playback Lifecycle Example

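  1. User opens the video page
  2. Client requests playback metadata
  3. Metadata Service returns the streaming (manifest) URL
  4. Client requests the first video chunk from the CDN
  5. CDN serves the chunk from cache (or fetches it from origin on a miss)
  6. Client switches bitrate dynamically as its measured throughput changes

This flow keeps startup fast, because the first chunk is small and usually served from a nearby edge, and keeps playback smooth as network conditions change.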
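
The bitrate switch in the last step can be approximated with a simple throughput-based rule. This is one common heuristic, sketched under assumptions: the `safety_factor` of 0.8 and the ladder values are illustrative, and production players typically also consider buffer level.

```python
def pick_bitrate(available_bps, measured_throughput_bps, safety_factor=0.8):
    """Pick the highest bitrate that fits within a fraction of measured throughput."""
    budget = measured_throughput_bps * safety_factor
    affordable = [b for b in available_bps if b <= budget]
    # If even the lowest rendition exceeds the budget, use it anyway
    # rather than stopping playback.
    return max(affordable) if affordable else min(available_bps)


ladder = [400_000, 1_200_000, 3_000_000, 6_000_000]
print(pick_bitrate(ladder, 5_000_000))  # 3000000
print(pick_bitrate(ladder, 300_000))    # 400000
```

The client would re-run this after each chunk download, using the download time of the previous chunk as the throughput estimate.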


10. API Design for Video Streaming

Upload Initialization

POST /videos/initiate-upload

Response

{
  "upload_url": "signed_url",
  "video_id": "vid123"
}

Get Video Metadata

GET /videos/{video_id}

Response

{
  "title": "System Design Explained",
  "stream_url": "cdn_url/playlist.m3u8"
}
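
A minimal sketch of the metadata endpoint's logic is shown below, with no web framework. The in-memory `VIDEOS` table, the `READY`/`PROCESSING` status values, the HTTP status codes, and the `cdn.example.com` host are all assumptions made for illustration.

```python
CDN_BASE = "https://cdn.example.com"  # hypothetical CDN hostname

# Stand-in for the metadata database.
VIDEOS = {
    "vid123": {"title": "System Design Explained", "status": "READY"},
    "vid456": {"title": "Still Encoding", "status": "PROCESSING"},
}


def get_video(video_id: str):
    """Handle GET /videos/{video_id}; returns (http_status, json_body)."""
    video = VIDEOS.get(video_id)
    if video is None:
        return 404, {"error": "video not found"}
    if video["status"] != "READY":
        # Do not hand out a stream URL until processing has finished.
        return 409, {"error": "video is still processing"}
    return 200, {
        "title": video["title"],
        "stream_url": f"{CDN_BASE}/videos/{video_id}/playlist.m3u8",
    }
```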

11. Database Design & Metadata Modeling

Video Metadata Table

Video(
  video_id,
  user_id,
  title,
  duration,
  status,
  created_at
)

Processing Status Table

VideoProcessing(
  video_id,
  resolution,
  status
)
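
The two tables above could be expressed as the following schema, sketched here with Python's built-in `sqlite3` so it runs anywhere. The column types, the composite primary key, and the `DONE`/`PENDING` status values are assumptions; a production system would use MySQL/PostgreSQL or a similar store.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE video (
    video_id   TEXT PRIMARY KEY,
    user_id    TEXT NOT NULL,
    title      TEXT NOT NULL,
    duration   INTEGER,              -- seconds
    status     TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE video_processing (
    video_id   TEXT NOT NULL REFERENCES video(video_id),
    resolution TEXT NOT NULL,
    status     TEXT NOT NULL,
    PRIMARY KEY (video_id, resolution)  -- one row per rendition
);
""")

conn.execute(
    "INSERT INTO video (video_id, user_id, title, duration, status) VALUES (?, ?, ?, ?, ?)",
    ("vid123", "user1", "System Design Explained", 600, "PROCESSING"),
)
conn.executemany(
    "INSERT INTO video_processing VALUES (?, ?, ?)",
    [("vid123", "720p", "DONE"), ("vid123", "1080p", "PENDING")],
)


def is_fully_processed(conn, video_id):
    """True only if the video has renditions and all of them are DONE."""
    rows = conn.execute(
        "SELECT status FROM video_processing WHERE video_id = ?", (video_id,)
    ).fetchall()
    return bool(rows) and all(status == "DONE" for (status,) in rows)
```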

Why Separate Metadata?

  • Small, frequently accessed
  • Independent scaling from video files

Databases:

  • MySQL/PostgreSQL (metadata)
  • DynamoDB/Cassandra (high scale)

12. Scalability Strategy & Traffic Justification

As traffic grows:

  • CDN absorbs read traffic
  • Object storage scales automatically
  • Processing workers scale horizontally
  • Databases shard by video_id

Example
If viewers double:

  • The CDN absorbs most of the additional load
  • The database and API see only a modest increase, because video bytes never pass through them

13. Reliability, Fault Tolerance & Failover

  • Retry video processing jobs
  • Multi-region object storage
  • CDN failover
  • Graceful degradation (lower quality)

Together, these measures keep playback running through most component failures.
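
Retrying processing jobs is usually paired with exponential backoff so a struggling dependency is not hammered. The sketch below is one way to do it; the attempt limit, base delay, and the injectable `sleep` parameter (used to make the function testable) are illustrative choices.

```python
import time


def run_with_retries(job, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Run job(); on failure wait 1s, 2s, 4s, ... and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return job()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** (attempt - 1))
```

A job that still fails after the last attempt would typically be moved to a dead-letter queue for inspection rather than retried forever.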


14. Security & Access Control

  • Signed URLs with expiry
  • DRM (Widevine, FairPlay)
  • Token-based access control
  • Rate limiting uploads

Together, these controls limit unauthorized access and abuse.
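
Signed URLs with expiry can be illustrated with an HMAC over the path and expiry time. This is a generic sketch, not the signing scheme of any particular CDN or cloud provider; the `expires` and `sig` parameter names and the hard-coded `SECRET_KEY` are assumptions for the example.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

SECRET_KEY = b"demo-secret"  # assumption: in practice, load from a secret store


def _signature(path: str, expires: int) -> str:
    message = f"{path}:{expires}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def sign_url(path: str, ttl_seconds: int, now=None) -> str:
    """Append an expiry timestamp and an HMAC signature to the path."""
    now = time.time() if now is None else now
    expires = int(now) + ttl_seconds
    query = urlencode({"expires": expires, "sig": _signature(path, expires)})
    return f"{path}?{query}"


def verify_url(url: str, now=None) -> bool:
    """Accept only unexpired URLs whose signature matches the path."""
    now = time.time() if now is None else now
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    if now > expires:
        return False
    expected = _signature(parsed.path, expires)
    # Constant-time comparison avoids leaking the signature via timing.
    return hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8"))
```

Because the path is part of the signed message, a URL issued for one video cannot be edited to fetch another.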


15. Trade-offs & Design Decisions

  • Storage cost vs quality
  • Latency vs consistency
  • Preprocessing vs on-demand encoding

Real systems constantly balance cost, performance, and experience.


16. Real-World Architecture Summary

A scalable video streaming system:

  • Uses object storage + CDN
  • Relies on chunked, adaptive streaming
  • Separates metadata from video data
  • Scales horizontally at every layer
  • Optimizes for bandwidth efficiency

What’s Next?

👉 Design a scalable Rate Limiter

Where we’ll design a critical system protection component used across APIs and distributed systems.
