Design Twitter
Translator: youyun
Author: labuladong
Design Twitter is question 355 on LeetCode. This question is both interesting and practical. It combines both algorithms about ordered linked lists and Object Oriented (OO) design principles. We'll be able to link Twitter functions with algorithms when we look at the requirements.
1. The Question and Use Cases
Twitter is similar to Weibo. We'll focus on the APIs below:
Let's look at an user story to understand how to use these APIs:
This is a common case in our daily life. Take Facebook as an example, when I just added my dream girl as friend on Facebook, I'll see her recent posts in my refreshed feeds, sorted in descending order. The difference is Twitter is uni-directional, while Facebook friends are bi-directional.
Most of these APIs are easy to implement. The most functionally difficult part could be getNewsFeed
, as we have to sort by time in descending. However, the list of followees are dynamic, which makes these hard to keep track of.
Algorithm helps here: Imagine we store each user's own tweets in a linked list sorted by timestamp, with each node representing the tweet's ID and timestamp (datetime of creation). If a user follows k followees, we can combine these k ordered linked lists, and apply an algorithm to get the correct getNewsFeed
.
Let's put the algorithm aside first and discuss in details later. There is another question: how should we use code to represent users and tweets to apply the algorithm? This involves OO design. Let's break into parts and tackle them one step at a time.
2. OO Design
Based on the analysis just now, we need a User
class to store information about users, and a Tweet
class to store information of tweets. The Tweet class will also be nodes in linked lists. Let's put up the frameworks:
Because Tweet
class needs to store timestamp, and User
class needs to use Tweet
class to store the tweets posted by a user, we put Tweet
class and User
class in Twitter
class as inner class. For clarity and simplicity, we'll define them one by one.
1、Implementation of Tweet Class
Based on the previous analysis, it is easy to implement Tweet
class. Each Tweet
instance just needs to store its own tweetId
and posted timestamp time
. As node in linked list, it also needs to have a point next
pointing to the next node.
2、Implementation of User Class
Let's think about the real use cases. A user needs to store his/her userId
, list of followees, and list of posted tweets. The list of followees can use Hash Set to store data, to avoid duplication and search fast. The list of posted tweets should be stored in a linked list to merge with order. Refer to the diagram below:
Besides, based on OO design principles, since the list of followees and the list of tweets are stored in User
, actions such as "follow", "unfollow", and "post" should be User
's actions. Let's define these as User
's APIs:
3、 Implementation of Several APIs
3. Design of The Algorithm
The algorithm which combines k ordered linked list is implemented using Priority Queue. This data structure is an important application of Binary Heap. All inserted elements are auto sorted. When some random elements are inserted, we can easily take them out in ascending or descending order.
Based on this cool data structure, we can easily implement the core function. Note that we use Priority Queue to sort time
in descending order, because the larger the value of time
, the more recent it is, and hence, the close to the head it should be placed:
Here is a GIF I created to describe the process of combining linked lists. Assume there are 3 linked lists of tweets, sorted by time
property in descending order, we'll combine them in res
in descending order. Note that the numbers in the nodes are time
property, not id
:
As of now, the design of a simple Twitter timeline function is completed.
4. Summary
In this article, we designed a simple timeline function using OO design principles and an algorithm which combines k sorted linked lists. This functionality is widely used in many social applications.
Firstly, we design the two classes, User
and Tweet
. On top of these, we used an algorithm to resolve the most important function. From this example, we can see that algorithms are not used alone in real applications. Algorithms need to be integrated with other knowledge to show their value.
However, our simple design may not cope with large throughput. In fact, the amount of data in real social applications is tremendous. There are a lot more aspects to take into consideration, including read and write performance to Database, the limit of memory cache, etc. Real applications are big and complicated engineering projects. For instance, the diagram below is a high-level system architecture diagram of a social network such as Twitter:
The problem we resolved is only a small part of the Timeline Service component. As the number of functions increases, the degree of complexity grows exponentially. Having one algorithm is not enough. It is more important to have a proper high-level design.
Last updated