SXStudio

System Design Reading Notes 10: Design a News Feed System

These are my reading notes for Chapter 11 of the book “System Design Interview – An insider’s guide (Vol. 1)”.

1. Problem Understanding

A news feed system is a personalized content feed that dynamically updates for users as their friends and connections post new content (e.g., Facebook feed, Instagram feed, or Twitter timeline). The primary challenge is building a system that can efficiently handle a large number of posts and a massive user base.

Clarifying Questions:

Example: Suppose we are designing Facebook’s news feed. Facebook has over 2 billion users, with some users having up to 5000 friends. Each of these users can post media like images and videos that need to be served to their friends’ feeds in real time.


2. High-Level Design


The news feed system consists of two major flows:

  1. Feed Publishing: This occurs when a user creates a new post. The post data is written into the database and cache, and the post is propagated to the feeds of the user’s friends or followers.
  2. News Feed Building: This occurs when a user opens the app and requests to view their news feed. The system retrieves posts from friends, sorts them (e.g., by reverse chronological order), and displays the results to the user.
High-Level Architecture Components:
Example:

3. API Design

Two key APIs drive the news feed system:

  1. Feed Publishing API:
    • Endpoint: POST /v1/me/feed
    • Parameters:
      • content: The post content (text, images, videos, etc.)
      • auth_token: User authentication token
    • Function: This API allows a user to publish a post, which gets stored in the database and cache.
  2. News Feed Retrieval API:
    • Endpoint: GET /v1/me/feed
    • Parameters:
      • auth_token: User authentication token
    • Function: This API retrieves a user’s news feed by fetching posts from friends and ordering them appropriately.
Example response from GET /v1/me/feed:
{
  "userId": "1234",
  "posts": [
    {
      "postId": "5678",
      "content": "Hello, world!",
      "createdAt": "2023-01-01T12:00:00Z"
    },
    {
      "postId": "5679",
      "content": "This is another post",
      "createdAt": "2023-01-02T14:00:00Z"
    }
  ]
}
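
The two endpoints above can be sketched as plain Python functions. This is only an illustration under stated assumptions: the token table, the friend table, the in-memory post store, and the function names `post_feed` and `get_feed` are all hypothetical stand-ins, not part of the book's design. The feed is ordered newest first, following the reverse chronological ordering mentioned in the high-level design.

```python
from datetime import datetime, timezone

# Hypothetical lookup tables; a real system would call an auth service
# and a friend/graph service instead.
TOKENS = {"token-me": "1234", "token-friend": "42"}
FRIENDS = {"1234": ["42"], "42": ["1234"]}

# Hypothetical in-memory post store: author ID -> list of post dicts.
POSTS = {}
_next_post_id = 5678


def authenticate(auth_token):
    """Map an auth_token to a user ID, or reject the request."""
    user_id = TOKENS.get(auth_token)
    if user_id is None:
        raise PermissionError("invalid auth_token")
    return user_id


def post_feed(auth_token, content, created_at=None):
    """Sketch of POST /v1/me/feed: store a new post for the caller."""
    global _next_post_id
    user_id = authenticate(auth_token)
    created_at = created_at or datetime.now(timezone.utc)
    post = {
        "postId": str(_next_post_id),
        "content": content,
        "createdAt": created_at.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    _next_post_id += 1
    POSTS.setdefault(user_id, []).append(post)
    return post


def get_feed(auth_token):
    """Sketch of GET /v1/me/feed: collect friends' posts, newest first."""
    user_id = authenticate(auth_token)
    posts = []
    for friend_id in FRIENDS.get(user_id, []):
        posts.extend(POSTS.get(friend_id, []))
    # Timestamps share one fixed UTC format, so string order matches time order.
    posts.sort(key=lambda p: p["createdAt"], reverse=True)
    return {"userId": user_id, "posts": posts}
```

Calling `post_feed("token-friend", "Hello, world!")` and then `get_feed("token-me")` returns a dictionary with the same shape as the JSON example above.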

4. Detailed Design

To make the system scalable and efficient, we break down the process into several key components:

  1. Feed Publishing Flow:
    • When a user publishes a post, the post data is first written to a database and also cached for quick access.
    • The post is then pushed to the news feeds of the user’s friends via the fanout service.
  2. News Feed Retrieval Flow:
    • When a user requests their news feed, the system pulls the relevant post IDs from the cache or database, aggregates them, sorts them (typically by time or ranking), and sends the fully populated feed back to the user.
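
The two flows above can be sketched end to end with in-memory dictionaries standing in for the database and caches. The class name `NewsFeedService`, the `Post` fields, and the `limit` parameter are assumptions made for illustration; the notes do not prescribe them.

```python
from dataclasses import dataclass


@dataclass
class Post:
    post_id: str
    author_id: str
    content: str
    created_at: float  # Unix timestamp


class NewsFeedService:
    def __init__(self, friends):
        self.friends = friends   # user ID -> list of friend IDs
        self.post_db = {}        # stand-in for the post database
        self.post_cache = {}     # stand-in for the post cache
        self.feed_cache = {}     # user ID -> list of post IDs

    def publish(self, post):
        """Publishing flow: persist, cache, then push the ID to friends."""
        self.post_db[post.post_id] = post
        self.post_cache[post.post_id] = post
        for friend_id in self.friends.get(post.author_id, []):
            self.feed_cache.setdefault(friend_id, []).append(post.post_id)

    def get_feed(self, user_id, limit=20):
        """Retrieval flow: fetch IDs, hydrate them, sort newest first."""
        posts = []
        for post_id in self.feed_cache.get(user_id, []):
            post = self.post_cache.get(post_id)
            if post is None:
                # Cache miss: fall back to the database.
                post = self.post_db.get(post_id)
            if post is not None:
                posts.append(post)
        posts.sort(key=lambda p: p.created_at, reverse=True)
        return posts[:limit]
```

Sorting here is by timestamp only; a ranking score could replace the sort key without changing the rest of the flow.
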
Fanout Service:

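The fanout service distributes each new post to friends’ news feeds. Two models exist: fanout on write (push), which adds the post to each friend’s feed at publish time, and fanout on read (pull), which assembles the feed only when a user requests it.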

Example: A user publishes a post, and the fanout service pushes the post to the feeds of their 2000 friends. When any of these friends open their app, the post is already in their feed and loads instantly.
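
To make the difference between the two fanout models concrete, here is a minimal sketch that contrasts them. The friend graph, the store names, and the function names are hypothetical; the point is only where the work happens. Fanout on write does one feed update per friend at publish time, while fanout on read does one lookup per friend at request time.

```python
from collections import defaultdict

# Hypothetical friend graph and stores, for illustration only.
friends = {"alice": ["bob", "carol"], "bob": ["alice"], "carol": ["alice"]}
posts_by_author = defaultdict(list)  # author -> [(timestamp, post_id)]
feeds = defaultdict(list)            # user -> [(timestamp, post_id)], precomputed


def publish_fanout_on_write(author, post_id, ts):
    """Push model: pay the cost now, once per friend."""
    posts_by_author[author].append((ts, post_id))
    for friend in friends.get(author, []):
        feeds[friend].append((ts, post_id))


def read_precomputed_feed(user, limit=10):
    """Push model read: the feed already exists, just order and trim it."""
    return [pid for _, pid in sorted(feeds[user], reverse=True)[:limit]]


def publish_fanout_on_read(author, post_id, ts):
    """Pull model: publishing only records the post."""
    posts_by_author[author].append((ts, post_id))


def read_feed_on_read(user, limit=10):
    """Pull model read: gather friends' posts at request time."""
    merged = []
    for friend in friends.get(user, []):
        merged.extend(posts_by_author[friend])
    return [pid for _, pid in sorted(merged, reverse=True)[:limit]]
```

With 2000 friends, `publish_fanout_on_write` performs 2000 feed appends per post, which is why very popular accounts are the expensive case for the push model.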


5. Caching Strategy

Caching is essential for performance. The system uses multiple caches to minimize latency and improve efficiency:

Example:

When User A posts, their friends’ news feed caches are updated with the new post ID. When User B requests their feed, the cache is hit, and post data is retrieved from the content cache.
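
The example above can be sketched with two dictionaries acting as the news feed cache and the content cache, plus a fallback to the database on a content miss. The class name `FeedReader` and the `db_reads` counter are assumptions added to make the cache behavior observable; real caches would also need eviction and size limits, which are omitted here.

```python
class FeedReader:
    def __init__(self, post_db):
        self.post_db = post_db      # source of truth: post ID -> post data
        self.news_feed_cache = {}   # user ID -> [post ID, ...], newest first
        self.content_cache = {}     # post ID -> post data
        self.db_reads = 0           # counts database fallbacks

    def on_post(self, friend_ids, post_id):
        """When a user posts, prepend the post ID to each friend's feed cache."""
        for friend_id in friend_ids:
            self.news_feed_cache.setdefault(friend_id, []).insert(0, post_id)

    def get_post(self, post_id):
        """Read from the content cache; on a miss, load from the DB and cache it."""
        post = self.content_cache.get(post_id)
        if post is None:
            self.db_reads += 1
            post = self.post_db[post_id]
            self.content_cache[post_id] = post
        return post

    def get_feed(self, user_id):
        """Resolve the cached post IDs into post data."""
        return [self.get_post(pid) for pid in self.news_feed_cache.get(user_id, [])]
```

The first feed request for a post touches the database once; repeated requests are served from the content cache.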


6. Scaling and Optimization

Handling millions of users requires efficient scaling strategies:


Key Takeaways

  1. Trade-offs in Fanout Strategies: Fanout on write offers faster retrieval but can be computationally expensive for users with many friends. Fanout on read is less resource-intensive but slower to deliver content.
  2. Caching Is Critical: A multi-layered caching strategy can drastically reduce latency by avoiding database calls for frequently accessed data.
  3. Sharding and Database Scaling: Horizontal scaling and sharding help distribute the load across multiple servers, preventing bottlenecks in any single database.
  4. Message Queues for Decoupling: Decoupling write and read processes via message queues allows for better handling of peak loads, especially when processing tasks like distributing posts to thousands of friends.
  5. User Experience: Optimizing for low-latency feed retrieval ensures a smooth and responsive user experience, which is crucial for user retention in social platforms.
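
As an illustration of the message queue takeaway, the sketch below uses Python's standard `queue` and `threading` modules as a stand-in for a real message broker. The function names and the use of `None` as a shutdown signal are assumptions for this example. The publish call only enqueues one task per friend and returns; worker threads apply the feed updates in the background.

```python
import queue
import threading
from collections import defaultdict

fanout_queue = queue.Queue()     # stand-in for a message queue
feeds = defaultdict(list)        # user ID -> list of post IDs
feeds_lock = threading.Lock()    # protects feeds across worker threads


def publish(post_id, friend_ids):
    """Write path: enqueue one fanout task per friend and return immediately."""
    for friend_id in friend_ids:
        fanout_queue.put((friend_id, post_id))


def fanout_worker():
    """Worker: drain tasks and update feeds until a None sentinel arrives."""
    while True:
        task = fanout_queue.get()
        try:
            if task is None:
                return
            friend_id, post_id = task
            with feeds_lock:
                feeds[friend_id].append(post_id)
        finally:
            fanout_queue.task_done()


def start_workers(num_workers=2):
    workers = [threading.Thread(target=fanout_worker) for _ in range(num_workers)]
    for worker in workers:
        worker.start()
    return workers


def shutdown(workers):
    """Wait for queued tasks to finish, then stop every worker."""
    fanout_queue.join()
    for _ in workers:
        fanout_queue.put(None)
    for worker in workers:
        worker.join()
```

During a traffic spike the queue absorbs the backlog, and the number of workers can be raised independently of the web tier.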

These are the detailed notes for designing a news feed system, incorporating the key components, considerations, examples, and trade-offs necessary to handle real-world scale and performance demands.
