
In this lesson, we will learn about the Facebook newsfeed and its architecture. We'll explore these key points:

  1. How the Facebook newsfeed works.
  2. A detailed system design of Facebook's newsfeed.
  3. The algorithms used in the system.

Facebook is one of the most popular social networking platforms today. Its popularity has grown alongside its global influence and the engaging features of the application.

One such feature of Facebook is its newsfeed system, which displays updates from friends and followers for each specific user. The users are exposed to all kinds of content from their connections on this newsfeed, and are also able to interact with each piece of content. In this lesson, we will discuss the structure behind this Facebook newsfeed in detail.

What is Facebook Newsfeed?

The newsfeed updates Facebook users about activities and stories from the friends, family, businesses, and news sources they are connected to on Facebook. A key feature of the newsfeed is that it orders content by its relevance to each particular user.

This kind of newsfeed is not unique to Facebook: YouTube, Instagram, and Twitter are all built around a similar feed. Social media sites aim to engage users by displaying content and updates from other users, and the newsfeed system works very well for this purpose.

Understanding the Newsfeed Architecture

The architecture of a newsfeed such as Facebook's is a chain of interconnected components. Let's explore this system step by step and understand how each part contributes to the final result.

1. The User's Action: Adding or Updating a Post

When a user adds or updates a post on Facebook, it triggers the entire process. This is the starting point that leads to a sequence of actions within the system.

2. Web Server: Receiving the Post

Once the post is created or updated, it's received by the web server. This server acts as a gateway, passing the post information to the next component in the chain.

3. Facebook Application Servers: Coordination and Processing

The Facebook application servers play a vital role in processing the post. They coordinate with the back-end data of users to perform the following functions:

  • Ranking Algorithm: Determines what content to show based on various factors like relevance, user preferences, and more.
  • Generate a Newsfeed: Combines various posts and content to create a personalized newsfeed for each user.

4. Cache Feed: Storing the Generated Newsfeed

After the newsfeed is generated, it's stored in a cache feed. This temporary storage allows for quicker retrieval and more efficient delivery to users.

5. Feed Notification and Retrieval

When a user requests a more recent feed, a feed notification is sent to the servers. The servers then retrieve the feed from the cache and send it to the users who requested it.
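
The five steps above can be sketched as a tiny in-memory pipeline. This is only an illustration under stated assumptions, not Facebook's implementation: the names `handle_post`, `generate_feed`, `get_feed`, and `FEED_CACHE` are hypothetical, and the ranking step is reduced to "newest first".

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for the back-end stores.
FOLLOWERS = {"alice": ["bob", "carol"], "bob": ["alice"]}  # author -> followers
POSTS = []                                                 # all posts, in arrival order
FEED_CACHE = defaultdict(list)                             # user -> cached feed


def generate_feed(user):
    """Application server step: collect posts by authors that `user` follows."""
    followed = {author for author, fans in FOLLOWERS.items() if user in fans}
    items = [p for p in POSTS if p["author"] in followed]
    # Ranking stand-in (assumption): newest first.
    return sorted(items, key=lambda p: p["ts"], reverse=True)


def handle_post(author, text, ts):
    """Web server step: accept the post, then refresh each follower's cached feed."""
    POSTS.append({"author": author, "text": text, "ts": ts})
    for follower in FOLLOWERS.get(author, []):
        FEED_CACHE[follower] = generate_feed(follower)


def get_feed(user):
    """Retrieval step: serve the feed straight from the cache."""
    return FEED_CACHE[user]


handle_post("alice", "hello", ts=1)
handle_post("alice", "second post", ts=2)
print([p["text"] for p in get_feed("bob")])  # ['second post', 'hello']
```

Here the cache is refreshed whenever a followed author posts, so reading a feed never triggers generation.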

Understanding the architecture of a newsfeed system helps in grasping the complexity and coordination involved in delivering personalized content. From the initial user action to the delivery of the feed, each step is essential, and every component has a specific role.

Exploring the System Design of a Newsfeed

The system design of a newsfeed may seem simple at first glance, but it's a well-orchestrated combination of various entities and relationships. Let's break down the key components and understand how they interconnect to create a functional newsfeed system.

1. Primary Entities

a. User

The user is the core entity in the design of the newsfeed system. Each user is assigned:

  • Unique ID: Identifies the user within the system.
  • Account Information: Includes details like birthday, email, etc., required to create an account.

b. The Feed Entity

The Feed entity represents the collection of feed items that make up a user's newsfeed. It acts as a container for various posts, images, videos, and other content that a user sees when they browse through their newsfeed.

  • Unique Feed ID: Identifies the specific feed within the system.
  • User Association: Links the feed to a specific user or group of users.
  • Content Collection: Aggregates the various feed items, including posts, images, and videos.

c. Feed Item

The feed item is another crucial entity, representing individual pieces of content within the newsfeed. Each feed item is associated with:

  • Unique Feed Item ID and Feed ID: Identify the item itself and the feed it belongs to.
  • Content and Metadata Attributes: Describe the content and support different types such as images and videos.

d. Media Item

Each media item, such as an image or video, is considered an entity on its own. This separation allows for more flexibility in handling different types of media.

2. Key Relationships

a. User-User Relationship

The relationship between users represents friendships or followings. This modeling enables users to connect with each other and shapes the content displayed in their newsfeeds.

b. Feed-User Relationship

The Feed entity is closely associated with the User entity. Each user may have one or more feeds, and the feed content is tailored based on user preferences, connections, and activities.

c. Feed-Feed Item Relationship

The Feed entity has a direct relationship with individual Feed Items. It combines different feed items to create a cohesive and engaging newsfeed.

d. Feed Item-Media Relationship

Each feed item corresponds to different media sources. This relationship ensures that the content can be presented in various forms, including images and videos.
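
One way to picture the entities and relationships described above is as plain Python data classes. This is a minimal sketch; every field name (`friend_ids`, `media`, `url`, and so on) is an assumption chosen for illustration, not a real schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MediaItem:
    media_id: int
    kind: str  # e.g. "image" or "video"
    url: str


@dataclass
class FeedItem:
    item_id: int
    author_id: int
    text: str
    media: List[MediaItem] = field(default_factory=list)  # Feed Item-Media relationship


@dataclass
class Feed:
    feed_id: int
    user_id: int                                          # Feed-User relationship
    items: List[FeedItem] = field(default_factory=list)   # Feed-Feed Item relationship


@dataclass
class User:
    user_id: int
    email: str
    birthday: str
    friend_ids: List[int] = field(default_factory=list)   # User-User relationship


alice = User(1, "alice@example.com", "1990-01-01", friend_ids=[2])
photo = MediaItem(10, "image", "https://example.com/cat.jpg")
post = FeedItem(100, author_id=2, text="My cat", media=[photo])
feed = Feed(1000, user_id=alice.user_id, items=[post])
print(feed.items[0].media[0].kind)  # image
```

Keeping `MediaItem` separate from `FeedItem` mirrors the point made above: a feed item can reference several pieces of media without changing its own shape.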

Are you sure you're getting this? Fill in the missing part by typing it in.

A newsfeed system can be represented as a ____ database.

Write the missing line below.

Breaking Down the Newsfeed Process

Understanding how a newsfeed is generated, published, and viewed by users on a platform like Facebook requires a careful examination of several key steps. The process is meticulously optimized to ensure speed, relevance, and efficiency. Let's explore each phase in detail.

1. Feed Generation

a. Querying the Feed Database

To create a personalized feed for each user, the system begins by querying the feed database. This involves:

  • Fetching feed items from friends and followers.
  • Collecting content that may be of interest to the user.

b. Sorting and Ranking

The fetched feed items are then sorted and ranked to ensure that the feed is tailored to the user's preferences:

  • Recency: Newer items are prioritized to keep the feed fresh.
  • Relevance: Content is ranked based on how relevant it is to the user, using a specialized ranking algorithm.
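
The query and ranking steps can be sketched as follows. The data in `FEED_DB` and `FRIENDS` is made up, and the `relevance` score is assumed to be computed elsewhere; the sketch only shows items being fetched for a user's friends and then ordered by relevance, with recency breaking ties.

```python
# Hypothetical feed database: each row is one feed item.
FEED_DB = [
    {"id": 1, "author": "bob",   "ts": 100, "relevance": 0.2},
    {"id": 2, "author": "carol", "ts": 300, "relevance": 0.9},
    {"id": 3, "author": "dave",  "ts": 200, "relevance": 0.9},
    {"id": 4, "author": "erin",  "ts": 400, "relevance": 0.5},
]
FRIENDS = {"alice": {"bob", "carol", "dave"}}


def build_feed(user, limit=10):
    # Step a: query items written by the user's friends.
    friends = FRIENDS.get(user, set())
    items = [row for row in FEED_DB if row["author"] in friends]
    # Step b: rank by relevance first, then break ties by recency (newer first).
    items.sort(key=lambda row: (row["relevance"], row["ts"]), reverse=True)
    return [row["id"] for row in items[:limit]]


print(build_feed("alice"))  # [2, 3, 1]
```

Item 4 is skipped because its author is not among alice's friends, and item 2 beats item 3 only because it is newer.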

c. Challenges and Limitations

The feed generation process is complex and can be time-consuming, especially for users with many followers or friends. Doing this in real-time, at the moment a user accesses Facebook, could lead to slow performance.

2. Optimization: Pre-Generation and Caching

a. Pre-Generating the Feed

To overcome the time constraints, the feed is often pre-generated. This means:

  • Creating the feed before the user requests it.
  • Storing the feed in a way that allows for quick retrieval.

b. Utilizing Cache Memory

The pre-generated feed is stored in cache memory, a high-speed storage area that allows for faster access to data:

  • Faster Retrieval: By keeping the feed in cache memory, the system can quickly serve it to the user when requested.
  • Offline and Slow Connection Support: Because the feed is already assembled, users on poor connections wait only for the transfer, not for generation. A copy kept on the user's device can also be shown while the user is offline.
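
Pre-generation plus caching can be sketched with a small time-to-live cache. The `FeedCache` class, the 60-second TTL, and `expensive_generate` are all assumptions made for the example; a production system would typically use a dedicated cache service rather than a Python dictionary.

```python
import time


class FeedCache:
    """Tiny in-memory cache with a time-to-live per entry (illustrative only)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # user -> (expires_at, feed)

    def put(self, user, feed):
        self._store[user] = (self.clock() + self.ttl, feed)

    def get(self, user):
        entry = self._store.get(user)
        if entry is None or entry[0] <= self.clock():
            return None  # miss or expired
        return entry[1]


def expensive_generate(user):
    # Stand-in for the slow query-and-rank step.
    return [f"post-for-{user}-1", f"post-for-{user}-2"]


cache = FeedCache(ttl_seconds=60)

# Pre-generation: a background job fills the cache before anyone asks.
for user in ["alice", "bob"]:
    cache.put(user, expensive_generate(user))


def read_feed(user):
    feed = cache.get(user)
    if feed is None:  # fall back to on-demand generation
        feed = expensive_generate(user)
        cache.put(user, feed)
    return feed


print(read_feed("alice"))
```

The expiry time keeps a pre-generated feed from being served long after it has gone stale.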

The process of generating, publishing, and viewing a newsfeed is a masterful blend of database management, algorithms, and optimization techniques. By carefully querying, ranking, and caching the feed, platforms like Facebook ensure that users receive a personalized and timely experience.

How Social Feeds Actually Show Up (Without Melting Your Servers)

When you open a social app and bam—new posts—there’s a whole system hustling behind the scenes to get that feed to you. Platforms generally pick one of three ways to do it: pull, push, or a hybrid of both. Think of them like delivery options for content.

1) Pull Model (a.k.a. Fan-out-on-load)

Vibe: “Make it fresh when I ask.”

How it works: Nothing is prepped ahead of time. When you open your feed, the system gathers the newest posts from all the people you follow, ranks them, and serves them on the spot.

Why teams love it

  • Fewer writes: No need to pre-store everyone’s feed. You only do work when someone actually looks.
  • Always fresh: You’re pulling the latest posts at read time.

Watch-outs

  • Heavy reads: If lots of users open feeds at once, your read/compute layer gets slammed.
  • Slower first bite: Users wait while you assemble the feed.

Great for: Smaller networks, early-stage products, or apps where users don’t scroll constantly.
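
Here's a rough sketch of the pull model. The stores `FOLLOWING` and `POSTS_BY_AUTHOR` are made-up stand-ins, and each author's posts are assumed to be kept newest first, which lets the standard library's `heapq.merge` do the merging at read time.

```python
import heapq
from itertools import islice

# Hypothetical stores: who each user follows, and each author's posts (newest first).
FOLLOWING = {"alice": ["bob", "carol"]}
POSTS_BY_AUTHOR = {
    "bob":   [(30, "bob: lunch"), (10, "bob: hi")],
    "carol": [(20, "carol: trip")],
}


def pull_feed(user, limit=10):
    """Fan-out-on-load: gather and merge at read time, store nothing per follower."""
    timelines = [POSTS_BY_AUTHOR.get(a, []) for a in FOLLOWING.get(user, [])]
    # Each timeline is already sorted newest first, so a k-way merge is enough.
    merged = heapq.merge(*timelines, key=lambda post: post[0], reverse=True)
    return [text for _, text in islice(merged, limit)]


print(pull_feed("alice"))  # ['bob: lunch', 'carol: trip', 'bob: hi']
```

Notice there is no write path at all beyond storing the post once; all the cost shows up when `pull_feed` runs.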


2) Push Model (a.k.a. Fan-out-on-write)

Vibe: “Prep everyone’s plate the moment the chef finishes the dish.”

How it works: When someone posts, the system immediately “pushes” that post into each follower’s prebuilt feed. Opening the app later is super fast because the feed is already assembled.

Why teams love it

  • Lightning reads: Feeds load fast—most of the work happened earlier.
  • Predictable reads: Reading is mostly fetching a precomputed list.

Watch-outs

  • Write storms: One celebrity posts and you’re writing to millions of feeds in a burst.
  • Storage cost: You’re storing lots of nearly identical timelines.

Great for: Apps with heavy scrolling and fewer mega-accounts—or systems with solid write scaling and good background workers.
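
Here's a rough sketch of the push model. `FOLLOWERS`, `FEEDS`, and the cap of 100 items per feed are assumptions for the example; the point is that `publish` does one write per follower, while `read_feed` is just a slice.

```python
from collections import defaultdict, deque

FOLLOWERS = {"bob": ["alice", "dave"], "carol": ["alice"]}  # author -> followers
MAX_FEED_LEN = 100                                          # assumed cap per user

# One prebuilt feed per user; deque(maxlen=...) drops the oldest entry when full.
FEEDS = defaultdict(lambda: deque(maxlen=MAX_FEED_LEN))


def publish(author, text):
    """Fan-out-on-write: copy the new post into every follower's feed."""
    writes = 0
    for follower in FOLLOWERS.get(author, []):
        FEEDS[follower].appendleft(f"{author}: {text}")  # newest at the front
        writes += 1
    return writes  # one write per follower


def read_feed(user, limit=10):
    """Reading is just a slice of the precomputed list."""
    return list(FEEDS[user])[:limit]


publish("bob", "hi")
publish("carol", "trip")
print(read_feed("alice"))  # ['carol: trip', 'bob: hi']
```

The return value of `publish` makes the write-storm problem visible: it grows linearly with the author's follower count.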


3) Hybrid Model

Vibe: “Work smarter, not harder.”

How it works: Mix and match. For regular users, push posts into followers’ feeds. For mega-influencers, pull their content on demand to avoid write explosions. Add caching, backfilling, and rankers on top.

Why teams love it

  • Balanced: You get fast reads where it matters and avoid catastrophic writes.
  • Cost-aware: Adjust behavior based on follower counts, activity, or SLAs.

Watch-outs

  • Complexity: You’re basically running two strategies and all the rules that decide between them.

Great for: Real social networks at scale. (This is what most mature feeds look like.)
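
Here's a rough sketch of the hybrid rule. The follower cutoff `CELEBRITY_THRESHOLD` and all store names are assumptions; real systems tune the cutoff and usually consider more signals than follower count.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 3  # assumed cutoff; real systems tune this
FOLLOWERS = {
    "star": ["alice", "bob", "carol", "dave"],  # above the cutoff -> pulled
    "bob":  ["alice"],                          # below the cutoff -> pushed
}
FOLLOWING = {"alice": ["star", "bob"]}
PUSHED = defaultdict(list)       # user -> [(ts, text)] written at publish time
CELEB_POSTS = defaultdict(list)  # celebrity -> [(ts, text)] read at load time


def is_celebrity(author):
    return len(FOLLOWERS.get(author, [])) > CELEBRITY_THRESHOLD


def publish(author, ts, text):
    """Return the number of writes this post caused."""
    if is_celebrity(author):
        CELEB_POSTS[author].append((ts, text))  # one write, no fan-out
        return 1
    for follower in FOLLOWERS.get(author, []):
        PUSHED[follower].append((ts, text))     # fan-out-on-write
    return len(FOLLOWERS.get(author, []))


def read_feed(user, limit=10):
    items = list(PUSHED[user])
    for author in FOLLOWING.get(user, []):
        if is_celebrity(author):
            items.extend(CELEB_POSTS[author])   # fan-out-on-load
    items.sort(reverse=True)                    # newest first
    return [text for _, text in items[:limit]]


publish("bob", 1, "bob: hi")
publish("star", 2, "star: new album")
print(read_feed("alice"))  # ['star: new album', 'bob: hi']
```

The complexity mentioned above shows up in `read_feed`, which has to merge two sources that were produced by different strategies.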

Quick Cheat Sheet

  • If writes are your bottleneck: Pull.
  • If reads are your bottleneck: Push.
  • If you have whales (accounts with huge reach): Hybrid—push for the many, pull for the whales.
  • If you care about snappy UX on open: Push or Hybrid with aggressive caching.
  • If you’re early and want simplicity: Pull (then evolve).

Real-World Extras You’ll Almost Always Add

  • Caching: Keep recent feed pages hot to avoid rebuilding them every time.
  • Backfill & recovery jobs: If a push fails, a nightly job re-syncs feeds.
  • Rankers & dedupe: Merge posts from many sources, remove repeats, apply quality/ranking.
  • Fan-out throttling: Slow down or batch writes from giant accounts.
  • Pagination windows: Only precompute the first N items; pull the rest as users scroll.
  • Feature flags: Flip users or cohorts between strategies while you tune.
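
As one example from the list above, pagination windows can be sketched like this. `WINDOW`, `PRECOMPUTED`, and `pull_older` are hypothetical: the first few items come from a prebuilt list, and anything past that window is pulled on demand.

```python
WINDOW = 3  # assumed size of the precomputed window

# Only the first WINDOW items of alice's feed are stored ahead of time.
PRECOMPUTED = {"alice": ["post-10", "post-9", "post-8"]}


def pull_older(user, offset, count):
    """Stand-in for an on-demand pull; returns a slice of a made-up history."""
    history = [f"post-{n}" for n in range(10, 0, -1)]  # post-10 ... post-1
    return history[offset:offset + count]


def get_page(user, page, page_size=2):
    start, end = page * page_size, (page + 1) * page_size
    window = PRECOMPUTED.get(user, [])
    items = window[start:end]             # served from the prebuilt window
    if len(items) < page_size:            # ran past the window: pull the rest
        pulled_from = max(start, len(window))
        items += pull_older(user, pulled_from, end - pulled_from)
    return items


print(get_page("alice", 0))  # ['post-10', 'post-9']
print(get_page("alice", 1))  # ['post-8', 'post-7']
```

Page 1 straddles the boundary: one item comes from the window and one is pulled, so users who never scroll cost nothing beyond the small window.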

There’s no “best” model—only the best fit for your traffic shape, cost targets, and UX goals. Start simple, measure where the pain is (reads vs writes), and evolve toward a hybrid with smart rules when scale arrives.

The Art of Ranking the Newsfeed: Understanding Facebook's Edge Rank Algorithm

In the dynamic world of social media, ensuring that users see the content most relevant to them is paramount. Facebook's Edge Rank Algorithm plays a crucial role in this by ranking feed items according to their relevance to each specific user. Let's delve into how this algorithm works and the factors that determine the rank of feed items.

1. What is the Edge Rank Algorithm?

The term "edge" refers to every small activity on Facebook, such as posts, likes, shares, etc. The Edge Rank Algorithm ranks each edge connected to a user according to its relevance. Edges with higher ranks are displayed prominently in the user's feed.

2. Components of the Ranking Formula

The rank for each feed item is determined by the following formula:

Rank = Affinity × Weight × Decay

Let's break down these components:

a. Affinity: The Closeness Factor

Affinity measures the "closeness" between the user and the creator of the edge. It reflects how often the user interacts with the creator through likes, comments, or messages.

  • Higher Affinity: More frequent interaction results in higher rank.
  • Example: If you frequently like and comment on a friend's posts, their content will likely appear higher in your feed.

b. Weight: The Importance Factor

Weight assigns a value to each edge, reflecting its importance or impact.

  • Heavier Weight: Interactions that take more effort carry more weight, so a comment counts for more than a like and leads to a higher rank.
  • Example: A post that draws many comments will likely rank higher than a post that draws the same number of likes.

c. Decay: The Freshness Factor

Decay considers the age of the edge, with newer content having higher value.

  • Time Decay: As content ages, its decay factor shrinks, leading to a lower rank.
  • Example: A fresh post from today will likely rank higher than a similar post from last week.
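
The formula can be tried out with a few numbers. The lesson does not give concrete values, so everything numeric here is an assumption: the weights in `EDGE_WEIGHTS`, the 24-hour half-life, and the choice of exponential decay are placeholders that only respect the stated ordering (comments outweigh likes, newer beats older).

```python
# Assumed weights per edge type; the lesson only states that comments outweigh likes.
EDGE_WEIGHTS = {"like": 1.0, "comment": 2.0, "share": 3.0}
HALF_LIFE_HOURS = 24.0  # assumed: an edge loses half its value every 24 hours


def decay(age_hours):
    """Exponential time decay in (0, 1]; 1.0 for a brand-new edge."""
    return 0.5 ** (age_hours / HALF_LIFE_HOURS)


def edge_rank(affinity, edge_type, age_hours):
    """Rank = Affinity x Weight x Decay for a single edge."""
    return affinity * EDGE_WEIGHTS[edge_type] * decay(age_hours)


fresh_comment = edge_rank(affinity=0.8, edge_type="comment", age_hours=0)
week_old_comment = edge_rank(affinity=0.8, edge_type="comment", age_hours=168)
fresh_like = edge_rank(affinity=0.8, edge_type="like", age_hours=0)

print(fresh_comment, fresh_like, week_old_comment)
```

With these assumed values, a fresh comment from a close friend outranks a fresh like from the same friend, and both outrank the week-old comment.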

3. The Result: A Personalized Newsfeed

By combining Affinity, Weight, and Decay, the Edge Rank Algorithm creates a highly personalized ranking of feed items. Once the items are ranked, the feed is either stored in cache memory for later retrieval or served directly from the servers to the user's newsfeed.

Facebook's Edge Rank Algorithm is a sophisticated tool that tailors the user's newsfeed to their interests and interactions. By understanding the closeness, importance, and freshness of content, it ensures that users see what matters most to them.

Let's test your knowledge. Is this statement true or false?

Fan-out-on-load increases write operations when publishing the newsfeed.

Press true if you believe the statement is correct, or false otherwise.

Summary

The Facebook architecture is quite complex, but it is easier to understand once you break it down into smaller steps. The Facebook newsfeed system is constantly updated to optimize for better performance. However, the core technologies behind the newsfeed system remain the same.

One Pager Cheat Sheet

  • We will learn about the architecture and algorithms behind Facebook's newsfeed system in this lesson.
  • Facebook's Newsfeed provides users with personalized information and updates relevant to them based on connections they have made with other users on the platform.
  • The user's post is passed from the web server to the application servers, which work with the back-end data to rank content and generate a newsfeed. The feed is then cached and served when the user requests it.
  • At the database level, the core entities in the design of a newsfeed system are the user (with a unique ID and account information), the feed, the feed item, and the media item. They are linked by four relationships: user-user, feed-user, feed-feed item, and feed item-media.
  • Newsfeed systems are typically represented using relational databases, where tables are linked through foreign keys to form relationships between entities.
  • The ranking algorithm sorts and ranks the feed items queried from the feed database according to recency and relevance for each user, and the generated feed can be stored in a cache memory for faster retrieval of content and better user experience.
  • Feed publishing is the process of displaying data to each specific user, and it can be processed by push, pull, or a hybrid model to reduce resource usage while maintaining performance.
  • Facebook's edge rank algorithm assigns a rank to each feed item based on Affinity, Weight, and Decay, and users are then presented their feed items in order of relevance.
  • Fan-out-on-load (the pull model) assembles a user's feed at the moment it is requested, so publishing a post does not add write operations; the cost moves to read time instead.
  • The core technologies of the Facebook newsfeed system remain unchanged despite its constant updates and changes for optimization.