4S Analysis: Taking “Design a Twitter” as an Example
1S - Scenario
1 Enumerate Features
- Register, Login
- User Profile Display, Edit
- Upload Image, Video
- Search
- Post, Share a tweet
- Timeline, News Feed
- Follow, Unfollow a user
2 Key Features
- Post a Tweet
- Timeline
- News Feed
- Follow, Unfollow a User
- Register, Login
QPS Analysis & Prediction
QPS (Queries Per Second)
- Concurrent Users (Peak, Fast Growing)
- Read QPS
- Write QPS
Range of QPS
- QPS = 100: one laptop is enough
- QPS = 1k: one web server is enough (but consider the Single Point of Failure)
- QPS = 1M: about 1,000 web servers (consider Maintenance)
- NoSQL: roughly 10k QPS for Cassandra, roughly 1M QPS for Memcached
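A back-of-the-envelope estimate of read and write QPS can be sketched as below. The daily active user count, the requests per user, and the peak factor are illustrative assumptions, not figures from any real service.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_qps(daily_active_users: int, requests_per_user_per_day: int) -> float:
    """Average QPS = total daily requests spread evenly over one day."""
    return daily_active_users * requests_per_user_per_day / SECONDS_PER_DAY


def peak_qps(avg_qps: float, peak_factor: float = 3.0) -> float:
    """Peak traffic is assumed to be a small multiple of the average."""
    return avg_qps * peak_factor


# Hypothetical inputs, chosen only for illustration.
DAU = 150_000_000
READS_PER_USER = 60    # assumed timeline / news feed requests per user per day
WRITES_PER_USER = 1    # assumed tweets posted per user per day

read_qps = average_qps(DAU, READS_PER_USER)
write_qps = average_qps(DAU, WRITES_PER_USER)
print(f"read  QPS: avg {read_qps:,.0f}, peak {peak_qps(read_qps):,.0f}")
print(f"write QPS: avg {write_qps:,.0f}, peak {peak_qps(write_qps):,.0f}")
```

Comparing the resulting numbers against the ranges above gives a first guess at how many servers and which kind of storage the system needs.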
2S - Service
Split/ Application/ Module
- User Service
- Tweet Service
- Media Service
- Friendship Service
3S - Storage
- Schema/ Data/ SQL/ NoSQL/ File System
- System = Service + Storage
1 Select Database
- SQL Database
  User Table, User Service
- NoSQL Database
  Tweets, Social Graph, Tweet Service
- File System
  Media
2 Design Schema
- id & columns
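One possible shape for the main tables is sketched below as Python dataclasses. Every column name here is a hypothetical choice made for illustration; the notes only say that each table has an id and columns.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class User:
    """Row of the user table (SQL, owned by the User Service)."""
    id: int
    username: str
    email: str
    password_hash: str


@dataclass
class Friendship:
    """One edge of the social graph: from_user_id follows to_user_id."""
    id: int
    from_user_id: int
    to_user_id: int


@dataclass
class Tweet:
    """Row of the tweet table (NoSQL, owned by the Tweet Service)."""
    id: int
    user_id: int          # author, refers to User.id
    content: str
    created_at: datetime
```

Follow and Unfollow then map to inserting and deleting a Friendship row, and a user's timeline is the set of Tweet rows with that user_id, ordered by created_at.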
News Feed
- Facebook, Twitter, WeChat Moments, ByteDance
- Everyone has a different news feed!
- Follow and Unfollow
Pull Model
- Merge K Sorted Arrays
- News Feed: N DB reads + merging N arrays in memory (N = number of users you follow)
- Post a tweet: 1 DB write
- But reading N timelines from the DB every time you fetch your news feed is very slow!
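The Pull Model above can be sketched as follows. The dictionaries are hypothetical in-memory stand-ins for the database, and the sample users and tweets are made up. Python's heapq.merge performs the K-way merge of the sorted timelines.

```python
import heapq
from itertools import islice

# Hypothetical in-memory stand-ins for the database.
# following[user] -> users that `user` follows
following = {"alice": ["bob", "carol"]}
# timelines[user] -> that user's own tweets as (timestamp, text), newest first
timelines = {
    "bob": [(105, "bob: lunch"), (101, "bob: hello")],
    "carol": [(110, "carol: news"), (103, "carol: hi")],
}


def post_tweet(user, timestamp, text):
    """Posting costs one write: prepend to the author's own timeline.

    Assumes timestamps only increase, so the list stays newest first.
    """
    timelines.setdefault(user, []).insert(0, (timestamp, text))


def get_news_feed(user, limit=10):
    """N reads (one per followed user), then a K-way merge in memory."""
    lists = [timelines.get(followee, []) for followee in following.get(user, [])]
    # Inputs are sorted newest first, so merge in descending timestamp order.
    merged = heapq.merge(*lists, key=lambda tweet: tweet[0], reverse=True)
    return list(islice(merged, limit))
```

The merge itself is cheap because it happens in memory; the cost that hurts is the N separate reads that happen while the user is waiting for the feed.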
Push Model
- Fanout: when a user posts a tweet, push this tweet to every user who follows him or her
- News Feed: 1 DB read
- Post a tweet: N DB writes (but the user does not need to wait, the fanout runs asynchronously!)
- But when a user has too many followers, the fanout takes a long time!
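A minimal sketch of the Push Model, assuming an in-memory queue in place of a real message queue and dictionaries in place of database tables (all names are hypothetical):

```python
from collections import defaultdict, deque

# Hypothetical in-memory stand-ins for the database tables.
followers = {"bob": ["alice", "dave"]}  # followers[user] -> who follows `user`
tweets = []                             # tweet table: (author, tweet)
news_feeds = defaultdict(list)          # news_feeds[user] -> precomputed feed
fanout_queue = deque()                  # pending asynchronous fanout tasks


def post_tweet(user, timestamp, text):
    """One synchronous write, then enqueue the fanout and return at once."""
    tweet = (timestamp, text)
    tweets.append((user, tweet))
    fanout_queue.append((user, tweet))


def run_fanout_worker():
    """Background job: N writes, one per follower of the author."""
    while fanout_queue:
        author, tweet = fanout_queue.popleft()
        for follower in followers.get(author, []):
            news_feeds[follower].insert(0, tweet)  # newest first


def get_news_feed(user, limit=10):
    """Reading is a single lookup of the precomputed feed."""
    return news_feeds[user][:limit]
```

Because post_tweet only enqueues the task, the author gets a response immediately; followers see the tweet once the worker has processed it, which is where the delay for users with very many followers comes from.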
Trade Off
- Facebook - Pull
- Instagram - Push + Pull
- Twitter - Pull
4S - Scale
Sharding/ Optimize/ Special Case
Optimize
Pull Model
- Cache before DB Query
- Cache every user’s timeline
- Cache every user’s news feed
Push Model
- Disk is cheap
- Inactive Users
- Followers much larger than Following (Star Users)
  Fanout will take several hours!
- Separate Star Users and Normal Users
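One way to separate star users from normal users is a hybrid: push tweets from normal users at write time, and pull tweets from star users at read time. The sketch below assumes a hypothetical follower-count threshold (set tiny here only to keep the demo small) and in-memory dictionaries instead of a database.

```python
import heapq
from collections import defaultdict
from itertools import islice

STAR_FOLLOWER_THRESHOLD = 2  # assumed cutoff; a real one would be far larger

followers = defaultdict(set)    # followers[user] -> users following `user`
following = defaultdict(set)    # following[user] -> users `user` follows
timelines = defaultdict(list)   # each user's own tweets, newest first
news_feeds = defaultdict(list)  # tweets pushed to each reader, newest first


def follow(from_user, to_user):
    followers[to_user].add(from_user)
    following[from_user].add(to_user)


def is_star(user):
    return len(followers[user]) >= STAR_FOLLOWER_THRESHOLD


def post_tweet(user, timestamp, text):
    """Always write to the author's timeline; fan out only for normal users."""
    tweet = (timestamp, text)
    timelines[user].insert(0, tweet)
    if not is_star(user):
        for follower in followers[user]:
            news_feeds[follower].insert(0, tweet)


def get_news_feed(user, limit=10):
    """Merge the pushed feed with the timelines of followed star users."""
    star_lists = [timelines[u] for u in following[user] if is_star(u)]
    merged = heapq.merge(news_feeds[user], *star_lists,
                         key=lambda tweet: tweet[0], reverse=True)
    return list(islice(merged, limit))
```

This sketch ignores the case where a user crosses the threshold after tweets were already pushed, which would show those tweets twice; a real system would need to fix the star status per user or de-duplicate on read.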
Maintenance
(Explain it later!)
- Robust
- Scalability
Separate Services
- Follow and Unfollow
- Likes
- Thundering Herd