System Design - News Feed

Catalogue
  1. 1S - Scenario
    1.1. 1 Enumerate Features
    1.2. 2 Key Features
    1.3. QPS Analysis & Prediction
  2. 2S - Service
  3. 3S - Storage
    3.1. 1 Select Database
    3.2. 2 Design Schema
    3.3. News Feed
      3.3.1. Pull Model
      3.3.2. Push Model
    3.4. Trade Off
  4. 4S - Scale
    4.1. Optimize
      4.1.1. Pull Model
      4.1.2. Push Model
    4.2. Maintenance
  5. Separated Services

4S Analysis, taking “Design Twitter” as the example!

1S - Scenario

1 Enumerate Features

  • Register, Login
  • User Profile Display, Edit
  • Upload Image, Video
  • Search
  • Post, Share a tweet
  • Timeline, News Feed
  • Follow, Unfollow a user

2 Key Features

  • Post a Tweet
  • Timeline
  • News Feed
  • Follow, Unfollow a User
  • Register, Login

QPS Analysis & Prediction

QPS (Queries Per Second)

  • Concurrent User (Peak, Fast Growing)
  • Read QPS
  • Write QPS

Range of QPS

  • QPS = 100: one laptop is enough
  • QPS = 1k: one web server (but it is a single point of failure)
  • QPS = 1M: about 1,000 web servers (maintenance becomes a concern)
  • NoSQL, rough figures: Cassandra about 10k QPS, Memcached about 1M QPS
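
The QPS estimate is plain arithmetic, so a small sketch may help. The user count, the per-user request rates, and the peak factor of 3 below are all made-up assumptions used only to make the numbers concrete; they are not figures from any real service.

```python
def average_qps(daily_active_users: int, requests_per_user_per_day: float) -> float:
    """Average queries per second, spread evenly over a 24-hour day."""
    seconds_per_day = 24 * 60 * 60  # 86,400
    return daily_active_users * requests_per_user_per_day / seconds_per_day


def peak_qps(avg_qps: float, peak_factor: float = 3.0) -> float:
    """Peak QPS, assuming the peak is peak_factor times the average (an assumption)."""
    return avg_qps * peak_factor


# Hypothetical inputs, chosen only for illustration.
dau = 150_000_000
reads_per_user = 60   # news feed / timeline reads per user per day
writes_per_user = 1   # tweets posted per user per day

read_avg = average_qps(dau, reads_per_user)
write_avg = average_qps(dau, writes_per_user)
print(f"read:  avg {read_avg:,.0f} QPS, peak {peak_qps(read_avg):,.0f} QPS")
print(f"write: avg {write_avg:,.0f} QPS, peak {peak_qps(write_avg):,.0f} QPS")
```

Comparing the read and write results against the ranges above gives a first guess at how many servers and which kind of storage the design needs.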

2S - Service

Split / Application / Module

  • User Service
  • Tweet Service
  • Media Service
  • Friendship Service

3S - Storage

  • Schema / Data / SQL / NoSQL / File System
  • System = Service + Storage

1 Select Database

  • SQL Database
    User Table, User Service
  • NoSQL Database
    Tweets, Social Graph, Tweet Service
  • File System
    Media

2 Design Schema

  • id & columns
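
One possible shape for the "id & columns" of each table, written as Python dataclasses. Every column name here (`password_hash`, `from_user_id`, `to_user_id`, and so on) is an assumption for illustration; the notes do not prescribe a schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class User:            # SQL database, owned by User Service
    id: int
    username: str
    email: str
    password_hash: str


@dataclass
class Tweet:           # NoSQL database, owned by Tweet Service
    id: int
    user_id: int       # author, refers to User.id
    content: str
    created_at: datetime


@dataclass
class Friendship:      # social graph: one row per "follows" edge
    id: int
    from_user_id: int  # the follower
    to_user_id: int    # the user being followed


alice = User(id=1, username="alice", email="alice@example.com", password_hash="...")
bob = User(id=2, username="bob", email="bob@example.com", password_hash="...")
edge = Friendship(id=1, from_user_id=bob.id, to_user_id=alice.id)  # bob follows alice
tweet = Tweet(id=1, user_id=alice.id, content="hello", created_at=datetime(2020, 1, 1))
```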

News Feed

  • Facebook, Twitter, WeChat Moments, ByteDance
  • Everyone has a different news feed!!!
  • Follow and Unfollow

Pull Model

  • Merge K Sorted Arrays
  • News Feed: N DB reads + merging N arrays in memory (N = number of users you follow)
  • Post a tweet: 1 DB write
  • But reading N DBs is very slow when you fetch your news feed!!!
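
A minimal sketch of the pull model, with in-memory dicts standing in for the database. The data, the function names, and the `(timestamp, author, text)` tuple layout are assumptions for illustration. The merge step uses the standard library's `heapq.merge`, which does a K-way merge of already sorted inputs.

```python
import heapq
from itertools import islice

# Hypothetical stand-ins for the tweet table and the social graph.
# Each timeline is kept sorted newest first: (timestamp, author, text).
TIMELINES = {
    "alice": [(105, "alice", "a3"), (101, "alice", "a1")],
    "bob": [(104, "bob", "b2"), (102, "bob", "b1")],
    "carol": [(103, "carol", "c1")],
}
FOLLOWING = {"dave": ["alice", "bob", "carol"]}


def post_tweet(user, timestamp, text):
    """1 DB write. Assumes timestamp is newer than the user's earlier tweets."""
    TIMELINES.setdefault(user, []).insert(0, (timestamp, user, text))


def get_news_feed(user, limit=3):
    """N DB reads (one per followed user), then a K-way merge in memory."""
    timelines = [TIMELINES.get(f, [])[:limit] for f in FOLLOWING.get(user, [])]
    merged = heapq.merge(*timelines, key=lambda t: t[0], reverse=True)
    return list(islice(merged, limit))


print(get_news_feed("dave"))
```

Only the newest `limit` tweets of each timeline are read, because older ones can never make it into the first `limit` entries of the merged feed.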

Push Model

  • Fanout: when a user posts a tweet, push the tweet to every user who follows him or her
  • News Feed: 1 DB read
  • Post a tweet: N DB writes (N = number of followers; done asynchronously, so the user does not need to wait!!!)
  • But when a user has too many followers, the fanout takes a long time!
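
A minimal sketch of the push model under the same kind of assumptions: dicts stand in for the database, and a `deque` stands in for whatever asynchronous task queue a real system would use. All names are hypothetical.

```python
from collections import defaultdict, deque

FOLLOWERS = {"alice": ["bob", "carol"]}  # author -> followers (hypothetical data)
TWEETS = []                              # tweet table: (tweet_id, author, text)
NEWS_FEEDS = defaultdict(list)           # user -> precomputed feed of tweet ids, newest first
FANOUT_QUEUE = deque()                   # stand-in for an async task queue


def post_tweet(author, text):
    """1 DB write plus enqueueing a fanout task; the author does not wait for the fanout."""
    tweet_id = len(TWEETS)
    TWEETS.append((tweet_id, author, text))
    FANOUT_QUEUE.append(tweet_id)
    return tweet_id


def run_fanout_worker():
    """Background job: N DB writes per tweet, one per follower."""
    while FANOUT_QUEUE:
        tweet_id = FANOUT_QUEUE.popleft()
        author = TWEETS[tweet_id][1]
        for follower in FOLLOWERS.get(author, []):
            NEWS_FEEDS[follower].insert(0, tweet_id)


def get_news_feed(user, limit=10):
    """1 DB read of the precomputed feed."""
    return [TWEETS[i] for i in NEWS_FEEDS[user][:limit]]


post_tweet("alice", "hello")
print(get_news_feed("bob"))  # still empty: the fanout has not run yet
run_fanout_worker()
print(get_news_feed("bob"))  # now contains alice's tweet
```

The gap between the two prints is the fanout delay; with a very large follower list the worker loop is what takes a long time.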

Trade Off

  • Facebook - Pull
  • Instagram - Push + Pull
  • Twitter - Pull

4S - Scale

Sharding / Optimize / Special Case

Optimize

Pull Model

  • Cache before DB Query
  • Cache every user’s timeline
  • Cache every user’s news feed
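
"Cache before DB query" can be sketched as a cache-aside lookup: check the cache first, fall back to the DB on a miss, then fill the cache. The class below is a hypothetical illustration that uses plain dicts for both the cache and the DB; in a real deployment the cache would be something like Memcached, and invalidating on write is only one of several possible policies.

```python
class TimelineCache:
    """Cache-aside wrapper around a timeline store (dict: user -> list of tweets)."""

    def __init__(self, db):
        self.db = db
        self.cache = {}
        self.db_reads = 0  # counts how often we actually hit the DB

    def get_timeline(self, user):
        if user in self.cache:           # cache hit: no DB query
            return self.cache[user]
        self.db_reads += 1               # cache miss: query the DB
        timeline = list(self.db.get(user, []))
        self.cache[user] = timeline      # fill the cache for later reads
        return timeline

    def post_tweet(self, user, tweet):
        self.db.setdefault(user, []).insert(0, tweet)
        self.cache.pop(user, None)       # invalidate the stale cached timeline


store = TimelineCache({"alice": ["a1"]})
store.get_timeline("alice")
store.get_timeline("alice")
print(store.db_reads)  # the second read was served from the cache
```

Caching every user's news feed works the same way, with the merged feed as the cached value instead of a single timeline.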

Push Model

  • Disk is cheap
  • Inactive Users
  • Followers much larger than Following
    Fanout will take several hours!!!
  • Separate Star Users and Normal Users
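
One way to read "separate star users and normal users" is a hybrid: normal users' tweets are pushed to followers, while star users' tweets are pulled and merged at read time. The sketch below assumes exactly that; the threshold of 2 followers, the data, and all names are hypothetical, and it is not a description of how any particular company does it.

```python
import heapq
from collections import defaultdict
from itertools import islice

STAR_THRESHOLD = 2  # hypothetical cutoff; a real system would use a far larger number

FOLLOWERS = {"star": ["u1", "u2", "u3"], "u1": ["u2"]}
FOLLOWING = {"u1": ["star"], "u2": ["star", "u1"], "u3": ["star"]}
TIMELINES = defaultdict(list)   # author -> own tweets, newest first
NEWS_FEEDS = defaultdict(list)  # user -> tweets pushed by normal users, newest first


def is_star(user):
    return len(FOLLOWERS.get(user, [])) > STAR_THRESHOLD


def post_tweet(author, timestamp, text):
    """Assumes timestamps increase with every call."""
    tweet = (timestamp, author, text)
    TIMELINES[author].insert(0, tweet)
    if not is_star(author):                       # push only for normal users
        for follower in FOLLOWERS.get(author, []):
            NEWS_FEEDS[follower].insert(0, tweet)


def get_news_feed(user, limit=10):
    """Merge the pushed feed with the timelines of followed star users (pull)."""
    star_timelines = [TIMELINES[f][:limit] for f in FOLLOWING.get(user, []) if is_star(f)]
    merged = heapq.merge(NEWS_FEEDS[user][:limit], *star_timelines,
                         key=lambda t: t[0], reverse=True)
    return list(islice(merged, limit))


post_tweet("u1", 1, "hi")
post_tweet("star", 2, "hello")
print(get_news_feed("u2"))
```

The star's tweet costs one write no matter how many followers there are, at the price of one extra read per followed star when a feed is loaded.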

Maintenance

(Explain it later!)

  • Robust
  • Scalability

Separated Services

  • Follow and Unfollow
  • Likes
  • Thundering Herd