In this lesson, we will learn about the Facebook newsfeed and its architecture. We'll explore these key points:

  1. How does the Facebook newsfeed work?
  2. What does the system design of Facebook's newsfeed look like?
  3. Which algorithms does the system use?

Facebook is the most popular social networking platform today. Its popularity has grown along with its worldwide influence and the engaging features of the application.

One such feature of Facebook is its newsfeed system, which displays updates from friends and followers for each specific user. The users are exposed to all kinds of content from their connections on this newsfeed, and are also able to interact with each piece of content. In this lesson, we will discuss the structure behind this Facebook newsfeed in detail.

What is Facebook Newsfeed?

The newsfeed updates Facebook users about activities and stories from the friends, family, businesses, and news sources they are connected to on Facebook. A key feature of the newsfeed is its ability to show each user content according to its relevance to that user.

This kind of newsfeed is not unique to Facebook: YouTube, Instagram, and Twitter are all built around a similar newsfeed dashboard. Social media sites aim to engage users by displaying content and updates from other users, and the newsfeed system works very well for this purpose.

Understanding the Newsfeed Architecture

The architecture of a newsfeed, such as the one Facebook uses, is a series of interconnected components. Let's explore this system step by step and understand how each part contributes to the final result.

1. The User's Action: Adding or Updating a Post

When a user adds or updates a post on Facebook, it triggers the entire process. This is the starting point that leads to a sequence of actions within the system.

2. Web Server: Receiving the Post

Once the post is created or updated, it's received by the web server. This server acts as a gateway, passing the post information to the next component in the chain.

3. Facebook Application Servers: Coordination and Processing

The Facebook application servers play a vital role in processing the post. They coordinate with the back-end data of users to perform the following functions:

  • Ranking Algorithm: Determines what content to show based on various factors like relevance, user preferences, and more.
  • Generate a Newsfeed: Combines various posts and content to create a personalized newsfeed for each user.

4. Cache Feed: Storing the Generated Newsfeed

After the newsfeed is generated, it's stored in a cache feed. This temporary storage allows for quicker retrieval and more efficient delivery to users.

5. Feed Notification and Retrieval

When a user requests a more recent feed, a feed notification is sent to the servers. The servers then retrieve the feed from the cache and send it to the users who requested it.
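
The five steps above can be sketched in a few lines of Python. This is a minimal illustration and not Facebook's implementation: the names `FeedService`, `publish_post`, and `get_feed`, the in-memory dictionaries, and the newest-first ordering that stands in for the ranking algorithm are all assumptions made for this example.

```python
from collections import defaultdict


class FeedService:
    """Toy stand-in for the application servers (hypothetical, for illustration)."""

    def __init__(self):
        self.posts = []                    # back-end store of all posts
        self.friends = defaultdict(set)    # user -> set of friends
        self.feed_cache = {}               # user -> cached, generated feed

    def add_friendship(self, a, b):
        self.friends[a].add(b)
        self.friends[b].add(a)

    def publish_post(self, author, text, timestamp):
        # Steps 1-2: the web server hands the new post to the application server.
        self.posts.append({"author": author, "text": text, "ts": timestamp})
        # Steps 3-4: regenerate and cache the feed of every friend of the author.
        for friend in self.friends[author]:
            self.feed_cache[friend] = self._generate_feed(friend)

    def _generate_feed(self, user):
        # Placeholder ranking: newest first. A real ranker would score relevance.
        visible = [p for p in self.posts if p["author"] in self.friends[user]]
        return sorted(visible, key=lambda p: p["ts"], reverse=True)

    def get_feed(self, user):
        # Step 5: serve the feed from the cache when it is available.
        if user not in self.feed_cache:
            self.feed_cache[user] = self._generate_feed(user)
        return self.feed_cache[user]


service = FeedService()
service.add_friendship("alice", "bob")
service.publish_post("alice", "Hello!", timestamp=1)
service.publish_post("alice", "Second post", timestamp=2)
feed_texts = [p["text"] for p in service.get_feed("bob")]
```

Here `feed_texts` holds Bob's cached feed with Alice's newer post first. Regenerating a friend's feed on every post is the simplest possible choice; the publishing models later in this lesson discuss the trade-offs.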

Understanding the architecture of a newsfeed system helps in grasping the complexity and coordination involved in delivering personalized content. From the initial user action to the delivery of the feed, each step is essential, and every component has a specific role.

Exploring the System Design of a Newsfeed

The system design of a newsfeed may seem simple at first glance, but it's a well-orchestrated combination of various entities and relationships. Let's break down the key components and understand how they interconnect to create a functional newsfeed system.

1. Primary Entities

a. User

The user is the core entity in the design of the newsfeed system. Each user is assigned:

  • Unique ID: Identifies the user within the system.
  • Account Information: Includes details like birthday, email, etc., required to create an account.

b. The Feed Entity

The Feed entity represents the collection of feed items that make up a user's newsfeed. It acts as a container for various posts, images, videos, and other content that a user sees when they browse through their newsfeed.

  • Unique Feed ID: Identifies the specific feed within the system.
  • User Association: Links the feed to a specific user or group of users.
  • Content Collection: Aggregates the various feed items, including posts, images, and videos.

c. Feed Item

The feed item is another crucial entity, representing an individual piece of content within the newsfeed. Each feed item is associated with:

  • Unique Feed Item ID and Feed ID: Identify the item itself and the feed it belongs to.
  • Content and Metadata Attributes: Describe the item and support different types of content, such as images and videos.

d. Media Item

Each media item, such as an image or video, is considered an entity on its own. This separation allows for more flexibility in handling different types of media.

2. Key Relationships

a. User-User Relationship

The relationship between users represents friendships or followings. This modeling enables users to connect with each other and shapes the content displayed in their newsfeeds.

b. Feed-User Relationship

The Feed entity is closely associated with the User entity. Each user may have one or more feeds, and the feed content is tailored based on user preferences, connections, and activities.

c. Feed-Feed Item Relationship

The Feed entity has a direct relationship with individual Feed Items. It combines different feed items to create a cohesive and engaging newsfeed.

d. Feed Item-Media Relationship

Each feed item corresponds to different media sources. This relationship ensures that the content can be presented in various forms, including images and videos.
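
To make the entities and relationships concrete, here is a minimal sketch using Python dataclasses. The class and field names (`friend_ids`, `owner_id`, `media`, and so on) are assumptions chosen for this example; the lesson only specifies that each entity has an ID and that the four relationships exist.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class MediaItem:
    media_id: int
    kind: str    # e.g. "image" or "video"
    url: str


@dataclass
class FeedItem:
    item_id: int
    author_id: int
    text: str
    media: List[MediaItem] = field(default_factory=list)    # feed item-media


@dataclass
class User:
    user_id: int
    email: str
    birthday: str
    friend_ids: Set[int] = field(default_factory=set)       # user-user


@dataclass
class Feed:
    feed_id: int
    owner_id: int                                            # feed-user
    items: List[FeedItem] = field(default_factory=list)     # feed-feed item


alice = User(1, "alice@example.com", "1990-01-01")
bob = User(2, "bob@example.com", "1992-05-17", friend_ids={1})
alice.friend_ids.add(bob.user_id)

photo = MediaItem(10, "image", "https://example.com/cat.jpg")
post = FeedItem(100, author_id=bob.user_id, text="My cat", media=[photo])
alice_feed = Feed(1000, owner_id=alice.user_id, items=[post])
```

Keeping `MediaItem` separate from `FeedItem` mirrors the point made above: a post can carry zero, one, or several media items without changing the shape of the post itself.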

Are you sure you're getting this? Fill in the missing part by typing it in.

A newsfeed system can be represented as a ____ database.

Write the missing line below.

Breaking Down the Newsfeed Process

Understanding how a newsfeed is generated, published, and viewed by users on a platform like Facebook requires a careful examination of several key steps. The process is meticulously optimized to ensure speed, relevance, and efficiency. Let's explore each phase in detail.

1. Feed Generation

a. Querying the Feed Database

To create a personalized feed for each user, the system begins by querying the feed database. This involves:

  • Fetching feed items from friends and followers.
  • Collecting content that may be of interest to the user.

b. Sorting and Ranking

The fetched feed items are then sorted and ranked to ensure that the feed is tailored to the user's preferences:

  • Recency: Newer items are prioritized to keep the feed fresh.
  • Relevance: Content is ranked based on how relevant it is to the user, using a specialized ranking algorithm.
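
The query-then-rank steps above can be sketched as follows. The function name `generate_feed`, the dictionary-based "database", and the precomputed `relevance` scores are assumptions for illustration; how relevance is actually scored is covered in the ranking section.

```python
def generate_feed(user, friends, posts, relevance, limit=10):
    """Return the user's feed: friends' posts, most relevant and newest first."""
    # Step a: "query" the feed items written by the user's friends.
    candidates = [p for p in posts if p["author"] in friends.get(user, set())]
    # Step b: sort by relevance score, breaking ties by recency (larger ts = newer).
    ranked = sorted(
        candidates,
        key=lambda p: (relevance.get((user, p["author"]), 0.0), p["ts"]),
        reverse=True,
    )
    return ranked[:limit]


friends = {"alice": {"bob", "carol"}}
posts = [
    {"id": 1, "author": "bob", "ts": 100},
    {"id": 2, "author": "carol", "ts": 200},
    {"id": 3, "author": "bob", "ts": 300},
    {"id": 4, "author": "dave", "ts": 400},   # not a friend, so excluded
]
# Hypothetical relevance scores: alice interacts more with carol than with bob.
relevance = {("alice", "carol"): 0.9, ("alice", "bob"): 0.5}

feed_ids = [p["id"] for p in generate_feed("alice", friends, posts, relevance)]
```

Carol's post comes first because of its higher relevance score, and Bob's two posts follow from newest to oldest.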

c. Challenges and Limitations

The feed generation process is complex and can be time-consuming, especially for users with many followers or friends. Doing this in real-time, at the moment a user accesses Facebook, could lead to slow performance.

2. Optimization: Pre-Generation and Caching

a. Pre-Generating the Feed

To overcome the time constraints, the feed is often pre-generated. This means:

  • Creating the feed before the user requests it.
  • Storing the feed in a way that allows for quick retrieval.

b. Utilizing Cache Memory

The pre-generated feed is stored in cache memory, a high-speed storage area that allows for faster access to data:

  • Faster Retrieval: By keeping the feed in cache memory, the system can quickly serve it to the user when requested.
  • Offline and Slow Connection Support: Because the feed is already built, it can be delivered quickly to users with poor internet connections, and a previously cached copy can be shown to users who are offline.
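
A minimal sketch of pre-generation with a cache might look like this. The `FeedCache` class, the time-to-live policy, and the `slow_build` function are assumptions for this example; the lesson does not say how cached feeds are expired or refreshed.

```python
import time


class FeedCache:
    """In-memory cache of pre-generated feeds with a time-to-live (illustrative)."""

    def __init__(self, build_feed, ttl_seconds=60.0, clock=time.monotonic):
        self._build_feed = build_feed    # function: user -> list of feed items
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}               # user -> (expiry_time, feed)

    def pregenerate(self, user):
        # Build the feed ahead of time, before the user asks for it.
        feed = self._build_feed(user)
        self._entries[user] = (self._clock() + self._ttl, feed)
        return feed

    def get(self, user):
        entry = self._entries.get(user)
        if entry is not None and entry[0] > self._clock():
            return entry[1]              # cache hit: no feed generation needed
        return self.pregenerate(user)    # miss or expired: rebuild and store


build_calls = []


def slow_build(user):
    build_calls.append(user)             # record each expensive generation
    return [f"post for {user}"]


cache = FeedCache(slow_build, ttl_seconds=60.0)
cache.pregenerate("alice")               # would run in the background in practice
first = cache.get("alice")               # served from the cache
second = cache.get("alice")              # still served from the cache
```

The expensive `slow_build` call runs once, during pre-generation, and both later requests are answered from the cache.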

The process of generating, publishing, and viewing a newsfeed is a masterful blend of database management, algorithms, and optimization techniques. By carefully querying, ranking, and caching the feed, platforms like Facebook ensure that users receive a personalized and timely experience.

Understanding Feed Publishing

Feed publishing is the vital process that enables social media platforms to display the content to individual users. Given the potentially vast number of friends and followers a user may have, this can be a complex and resource-intensive task. Let's delve into the three main approaches to feed publishing: the pull model, the push model, and the hybrid model, to understand how they work, their advantages, and their challenges.

1. Pull Model (Fan-out-on-load)

a. How it Works

In the pull model, the feed is assembled only when the user loads or refreshes their own timeline. Nothing is written to a follower's feed when a post is published; the most recent feed is built and loaded only when the user specifically requests it.

b. Advantages

  • Reduces Write Operations: By generating the feed only when requested, this approach minimizes write operations on the system database.

c. Challenges

  • Increased Read or Load Operations: Every feed request must read from all of the user's connections, which can overload servers when many users load their newsfeeds at once.
  • Delayed Feed Viewing: Users must issue a request to view recent feeds.
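
The pull model can be sketched as follows. The names `publish`, `load_feed`, and the dictionaries are assumptions for illustration; the `stats` counter is only there to show that publishing costs a single write.

```python
from collections import defaultdict

posts_by_author = defaultdict(list)   # author -> list of (timestamp, text)
following = {"alice": ["bob", "carol"]}
stats = {"writes": 0}


def publish(author, timestamp, text):
    """Pull model: publishing is a single write to the author's own list."""
    posts_by_author[author].append((timestamp, text))
    stats["writes"] += 1


def load_feed(user):
    """The fan-out happens here, at read time: visit every followed author."""
    merged = []
    for author in following.get(user, []):
        merged.extend(posts_by_author[author])
    return [text for _, text in sorted(merged, reverse=True)]


publish("bob", 1, "bob's post")
publish("carol", 2, "carol's post")
pull_feed = load_feed("alice")
```

Two posts cost two writes no matter how many followers the authors have, but every call to `load_feed` has to visit each followed author.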

2. Push Model (Fan-out-on-write)

a. How it Works

The push model immediately sends a new post to all followers as soon as it's created. The distribution work is done at the time of writing the post.

b. Advantages

  • Reduces Read Operations: Minimizes the need to search through a user's entire friends and follower list for updates.

c. Challenges

  • Increased Write Operations: Can be problematic for users with many friends or followers, leading to a surge in database write operations.
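
The push model can be sketched in the same style. As before, the names and the in-memory `inbox` are assumptions for illustration, and the `stats` counter only makes the write cost visible.

```python
from collections import defaultdict

followers = {"bob": ["alice", "dave"], "carol": ["alice"]}
inbox = defaultdict(list)     # user -> precomputed feed, newest first
stats = {"writes": 0}


def publish(author, text):
    """Push model: copy the new post into every follower's feed at write time."""
    for follower in followers.get(author, []):
        inbox[follower].insert(0, text)
        stats["writes"] += 1


def load_feed(user):
    """Reading is a single lookup; no scan over followed authors is needed."""
    return list(inbox[user])


publish("bob", "bob's post")       # 2 followers -> 2 writes
publish("carol", "carol's post")   # 1 follower  -> 1 write
push_feed = load_feed("alice")
```

Reading the feed is now a single lookup, while the number of writes grows with the author's follower count, which is exactly the challenge described above.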

3. Hybrid Model

a. How it Works

The hybrid model combines features of both the pull and push models. It may, for example, allow users with fewer followers to use the push model, while employing the pull model for those with more followers.

b. Advantages

  • Balanced Approach: By selectively applying the pull and push models, the hybrid approach can optimize resource usage and performance.

c. Challenges

  • Complexity: Striking the right balance between the pull and push models requires careful design and may introduce additional complexity.
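
One possible way to combine the two models is a follower-count threshold, sketched below. The cutoff `PUSH_LIMIT`, the helper `uses_push`, and the data structures are assumptions for this example; the lesson does not specify how a real system decides which model to apply.

```python
from collections import defaultdict

PUSH_LIMIT = 2   # hypothetical cutoff; real systems would tune this value

followers = {"bob": ["alice"], "celebrity": ["alice", "dave", "erin"]}
following = {"alice": ["bob", "celebrity"]}
inbox = defaultdict(list)             # pushed posts: user -> [(ts, text)]
posts_by_author = defaultdict(list)   # all posts:    author -> [(ts, text)]


def uses_push(author):
    return len(followers.get(author, [])) <= PUSH_LIMIT


def publish(author, ts, text):
    posts_by_author[author].append((ts, text))
    if uses_push(author):
        # Few followers: fan out on write.
        for follower in followers.get(author, []):
            inbox[follower].append((ts, text))


def load_feed(user):
    merged = list(inbox[user])
    # Many followers: their posts are pulled at read time instead.
    for author in following.get(user, []):
        if not uses_push(author):
            merged.extend(posts_by_author[author])
    return [text for _, text in sorted(merged, reverse=True)]


publish("bob", 1, "bob's post")
publish("celebrity", 2, "celebrity's post")
hybrid_feed = load_feed("alice")
```

Bob's post is pushed into Alice's inbox at write time, while the celebrity's post is never copied and is merged in only when Alice loads her feed.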

Feed publishing is an art and science that balances user experience, system performance, and resource utilization. Understanding the pull, push, and hybrid models provides insights into the mechanisms that enable timely and efficient content delivery on platforms like Facebook.

Each approach has its unique characteristics, benefits, and challenges. The choice of which to use depends on factors such as the user base, system architecture, and desired user experience.

The Art of Ranking the Newsfeed: Understanding Facebook's Edge Rank Algorithm

In the dynamic world of social media, ensuring that users see the content most relevant to them is paramount. Facebook's Edge Rank Algorithm plays a crucial role in this by ranking feed items according to their relevance to each specific user. Let's delve into how this algorithm works and the factors that determine the rank of feed items.

1. What is the Edge Rank Algorithm?

The term "edge" refers to every small activity on Facebook, such as posts, likes, shares, etc. The Edge Rank Algorithm ranks each edge connected to a user according to its relevance. Edges with higher ranks are displayed prominently in the user's feed.

2. Components of the Ranking Formula

The rank for each feed item is determined by the following formula:

Rank = Affinity × Weight × Decay

Let's break down these components:

a. Affinity: The Closeness Factor

Affinity measures the "closeness" between the user and the creator of the edge. It reflects how often the user interacts with the creator through likes, comments, or messages.

  • Higher Affinity: More frequent interaction results in higher rank.
  • Example: If you frequently like and comment on a friend's posts, their content will likely appear higher in your feed.

b. Weight: The Importance Factor

Weight assigns a value to each edge, reflecting its importance or impact.

  • Heavier Weight: Actions that take more effort carry more weight (a comment counts for more than a like), which leads to a higher rank.
  • Example: A post that attracts many comments will likely rank higher than a post that attracts only likes.

c. Decay: The Freshness Factor

Decay considers the age of the edge, with newer content having higher value.

  • Lower Decay Value: The decay factor shrinks as an edge ages, so older content has less value and receives a lower rank.
  • Example: A fresh post from today will likely rank higher than a similar post from last week.
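
The formula above can be turned into a small worked example. The numeric weights in `EDGE_WEIGHTS` and the exponential decay with a 24-hour half-life are assumptions chosen for illustration; the lesson gives only the product form of the formula, not the values Facebook uses.

```python
# Hypothetical weights per edge type; the lesson only says comments outweigh likes.
EDGE_WEIGHTS = {"like": 1.0, "comment": 2.0, "share": 3.0}
HALF_LIFE_HOURS = 24.0   # assumed: an edge loses half its value every 24 hours


def decay(age_hours):
    """Freshness factor in (0, 1]; equals 1.0 for a brand-new edge."""
    return 0.5 ** (age_hours / HALF_LIFE_HOURS)


def edge_rank(affinity, edge_type, age_hours):
    """Rank = Affinity x Weight x Decay for a single edge."""
    return affinity * EDGE_WEIGHTS[edge_type] * decay(age_hours)


fresh_comment = edge_rank(affinity=0.8, edge_type="comment", age_hours=0)
old_comment = edge_rank(affinity=0.8, edge_type="comment", age_hours=24)
fresh_like = edge_rank(affinity=0.8, edge_type="like", age_hours=0)
```

With these assumed values, a fresh comment from a close friend scores higher than both a day-old comment and a fresh like from the same friend, which matches the three factors described above.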

3. The Result: A Personalized Newsfeed

By combining Affinity, Weight, and Decay, the Edge Rank Algorithm creates a highly personalized ranking of feed items. Once the items are ranked, they are either stored in the cache for later retrieval or served directly to display on the user's newsfeed.

Facebook's Edge Rank Algorithm is a sophisticated tool that tailors the user's newsfeed to their interests and interactions. By understanding the closeness, importance, and freshness of content, it ensures that users see what matters most to them.

Let's test your knowledge. Is this statement true or false?

Fan-out-on-load increases write operations when publishing a newsfeed.

Press true if you believe the statement is correct, or false otherwise.

Summary

The Facebook architecture is quite complex, but it is easier to understand once you break it down into smaller steps. The Facebook newsfeed system is constantly updated and changed to optimize performance. However, the core technologies needed to understand the newsfeed system remain the same.

One Pager Cheat Sheet

  • We will learn about the architecture and algorithms behind Facebook's newsfeed system in this lesson.
  • Facebook's Newsfeed provides users with personalized information and updates relevant to them based on connections they have made with other users on the platform.
  • The user's post is processed and sent through the web server to the application servers and the back-end to generate a newsfeed, which is then stored and sent to the user through feed notifications.
  • At the database level, the most important entity in the system design of a newsfeed is the user, who is assigned a unique ID. The feed, feed item, and media item are separate entities, each with its own ID, and they are linked by the user-user, feed-user, feed-feed item, and feed item-media relationships.
  • Newsfeed systems are typically represented using relational databases, where tables are linked through foreign keys to form relationships between entities.
  • The ranking algorithm sorts and ranks the feed items queried from the feed database according to recency and relevance for each user, and the generated feed can be stored in a cache memory for faster retrieval of content and better user experience.
  • Feed publishing is the process of displaying data to each specific user, and it can be processed by push, pull, or a hybrid model to reduce resource usage while maintaining performance.
  • Facebook's edge rank algorithm assigns a rank to each feed item based on Affinity, Weight, and Decay, and users are then presented their feed items in order of relevance.
  • Fan-out-on-load (the pull model) builds a user's feed only when the user requests it, so publishing a post does not increase write operations; the cost shifts to read operations at load time.
  • The core technologies of the Facebook newsfeed system remain unchanged despite its constant updates and changes for optimization.