How Tinder streams video to 70 million users

Hey Everyone!

Today we’ll be talking about

  • How Tinder uses HTTP Live Streaming (HLS)

    • Tinder’s design goals and the choice to use HLS

    • Transcoding MP4 files to HLS streams with FFmpeg and doing validation with Apple's Media Stream Validator tool

    • Content Delivery with AWS CloudFront and KPIs to measure performance

  • How Distributed Databases handle Replication - A “brief” summary of Chapter 5 from Designing Data Intensive Applications by Martin Kleppmann

    • Why replicate your data across multiple nodes

    • 3 popular strategies for writing changes to your replicas

    • Common problems that arise from Replication Lag and how to solve them

  • Plus, some tech snippets on

    • Comparing the web development experience with 10 different programming languages

    • Self-publishing a technical book and selling 600+ copies

    • An in-depth dive into HDFS

We also have a solution to our last Apple interview question and a new question from Bloomberg.

Quastor is a free Software Engineering newsletter that sends out summaries of technical blog posts, deep dives on interesting tech and FAANG interview questions and solutions.

Video Delivery at Tinder with HTTP Live Streaming

Tinder is one of the largest dating apps in the world with more than 75 million monthly active users.

One of their product features is Swipe Night, which is a choose-your-own-adventure game built into the app. You watch short video clips (themed around a solve-the-mystery game) and make choices on what the main character should do. Tinder will match you with possible dates based on your choices.

Everyone on the app is watching the same video clips and it’s done live - at 6 p.m. local time.

Shreyas Hirday is a senior software engineer at Tinder and he wrote a great blog post on the technology Tinder used to stream the video clips to millions of users simultaneously.

Here’s a summary

There are many ways to deliver video content. The best approach depends on the tradeoffs you’re making.

In Tinder’s case, they cared about

  • Dynamic - Tinder should have the ability to change the video content at any time.

  • Efficient - Memory usage should be minimized since Tinder runs on mobile devices.

  • Seamless - There should be little to no interruption during playback.

Based on these goals, Tinder decided to use HTTP Live Streaming (HLS), an adaptive bitrate protocol developed by Apple.

Adaptive bitrate streaming just means that the server will have different versions of the video and each version differs in display size (resolution) and file size (bitrate).

The video player will dynamically choose the best video version based on the user’s screen size and bandwidth. It will choose the version that minimizes buffering and gives the best user experience.

An HLS stream will provide a manifest file to the video player, which includes the URL to each copy of the video as well as the level of bandwidth the user should have in order to view that level of quality without issue.
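To make the idea concrete, here's a minimal sketch (in Python) of how a player could pick a variant from a master manifest based on the bandwidth it has measured. The manifest contents and the selection heuristic below are illustrative assumptions, not Tinder's actual player logic (the iOS and Android players handle this selection for you).

```python
import re

# A toy HLS master manifest (illustrative values only). Each EXT-X-STREAM-INF tag
# advertises the BANDWIDTH (bits/sec) and RESOLUTION a variant needs, followed by
# the URL of that variant's playlist.
MASTER_MANIFEST = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
low/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2000000,RESOLUTION=1280x720
mid/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
high/index.m3u8
"""

def pick_variant(manifest: str, measured_bandwidth: int) -> str:
    """Return the URL of the highest-bitrate variant the connection can sustain."""
    variants = []
    lines = manifest.strip().splitlines()
    for i, line in enumerate(lines):
        if line.startswith("#EXT-X-STREAM-INF"):
            bandwidth = int(re.search(r"BANDWIDTH=(\d+)", line).group(1))
            url = lines[i + 1]
            variants.append((bandwidth, url))
    # Keep the variants the user can play without buffering; fall back to the lowest.
    playable = [v for v in variants if v[0] <= measured_bandwidth]
    return max(playable)[1] if playable else min(variants)[1]

print(pick_variant(MASTER_MANIFEST, measured_bandwidth=2_500_000))  # -> mid/index.m3u8
```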

Transcoding

Tinder engineers used FFmpeg to transcode MP4 files to HLS streams. They developed a workflow that took an MP4 file and a set of configurations (resolution, bitrate, frame rate, etc.) as input and produced a directory containing the HLS stream as output - a rough sketch of one such transcoding step is shown after the list below.

They had multiple configurations for all the different video versions they wanted and they stored all these video versions in an AWS S3 bucket.

Some of the different configuration options they had for the various versions of the videos were

  • Frame rate

  • Video Resolution

  • Desired Video & Audio Bitrate

  • Video Bitrate Variability

  • Segment Length

  • Optimizations

You can read the article for a discussion on how they configured each of these parameters.
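As a rough illustration of that workflow, here's a hedged Python sketch that shells out to ffmpeg to produce one HLS rendition from an MP4. The bitrates, resolution, frame rate, and segment length are placeholder values rather than Tinder's real configuration, and a full pipeline would run a step like this once per rendition and then write the master manifest that ties them together.

```python
import subprocess
from pathlib import Path

def transcode_to_hls(src_mp4: str, out_dir: str, *, width: int, height: int,
                     video_bitrate: str, audio_bitrate: str,
                     frame_rate: int, segment_seconds: int) -> None:
    """Transcode one MP4 into a single HLS rendition (playlist + .ts segments)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    cmd = [
        "ffmpeg", "-y", "-i", src_mp4,
        "-vf", f"scale={width}:{height}",          # video resolution
        "-r", str(frame_rate),                     # frame rate
        "-c:v", "libx264", "-b:v", video_bitrate,  # video codec + target bitrate
        "-c:a", "aac", "-b:a", audio_bitrate,      # audio codec + bitrate
        "-hls_time", str(segment_seconds),         # segment length in seconds
        "-hls_playlist_type", "vod",               # pre-recorded (not live) content
        "-hls_segment_filename", str(out / "segment_%03d.ts"),
        str(out / "index.m3u8"),
    ]
    subprocess.run(cmd, check=True)

# One 720p rendition with placeholder settings.
transcode_to_hls("swipe_night_ep1.mp4", "hls/720p",
                 width=1280, height=720, video_bitrate="2000k",
                 audio_bitrate="128k", frame_rate=30, segment_seconds=6)
```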

Validation

The output directory will have a Master Manifest file with information about all the different video versions in the HLS stream.

The video player will then decide which version to play and whether it should switch to a lower file-size version based on the information in the manifest file.

Therefore, having an accurate manifest file is very important for the user experience.

Apple provides a Media Stream Validator tool that tests the manifest by simulating a streaming experience. Tinder uses the results from that test to update the manifest and ensure accuracy.
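Apple's validator is a command-line tool (mediastreamvalidator), so a pipeline step wrapping it might look roughly like the sketch below. The output file name and the way the report is consumed are assumptions about how such a step could be wired up, not a description of Tinder's pipeline.

```python
import json
import subprocess

def validate_stream(master_manifest_url: str) -> dict:
    """Run Apple's mediastreamvalidator against a master manifest URL.

    Assumption: the tool writes its findings to validation_data.json in the
    working directory (check the tool's documentation for the exact default).
    """
    subprocess.run(["mediastreamvalidator", master_manifest_url], check=True)
    with open("validation_data.json") as f:
        return json.load(f)

report = validate_stream("https://cdn.example.com/swipe-night/master.m3u8")
# Inspect the reported issues (and fix the manifest) before promoting to production.
```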

Tinder then places the finalized manifest and videos in their production AWS S3 bucket.

Content Access & Delivery

Tinder uses AWS CloudFront, a content delivery network (CDN), to ensure low-latency streaming for all their users.

As users from different areas of the US start playing Swipe Night, the CDN will copy the HLS stream directory from AWS S3 into regional caches so users in that region get low-latency access.

Measuring Video Performance

Tinder uses 5 key performance indicators (KPIs) to measure how well the streaming works

  1. Perceived start up time for a user

  2. Number of stalls during playback

  3. Time spent in the stalled state

  4. % of video sessions with errors

  5. Average quality of the video measured by average bitrate

The Tinder app measures these KPIs along with metadata about the device and its network connection.
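To give a sense of what that instrumentation might capture, here's a minimal sketch of a per-session metrics record. Every field name here is hypothetical; Tinder's actual telemetry schema isn't described in the post.

```python
from dataclasses import dataclass, asdict

@dataclass
class PlaybackSessionMetrics:
    # The five KPIs from the post
    startup_time_ms: int         # 1. perceived start up time
    stall_count: int             # 2. number of stalls during playback
    stall_duration_ms: int       # 3. total time spent in the stalled state
    had_error: bool              # 4. rolls up into % of sessions with errors
    average_bitrate_kbps: int    # 5. average quality, measured by average bitrate
    # Device and network metadata sent alongside the KPIs (hypothetical fields)
    device_model: str
    network_type: str            # e.g. "wifi" or "lte"

session = PlaybackSessionMetrics(412, 1, 350, False, 1850, "iPhone 12", "lte")
print(asdict(session))  # the payload a client might report to an analytics backend
```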

Tinder then works to find the right balance between these 5 KPIs for their use case. For a traditional streaming app like Netflix, spending 5 seconds in a buffering state might not be that bad. But a 5 second buffer on a mobile-centric app like Tinder can feel like an eternity.

For more details, you can read the full blog post here.

Tech Snippets

  • I built 10 web apps… with 10 different languages - Jeff Delaney provides a great overview of 10 different web frameworks from 10 different languages. He talks about Ruby on Rails (Ruby), Django (Python), Laravel (PHP), NextJS (JavaScript), Spring (Java), Phoenix (Elixir), Rocket (Rust), Gin (Go), Vapor (Swift), and Ktor (Kotlin). His opinion was that Ruby on Rails gave the best development experience.

  • Selling 600+ copies of a self-published technical book - Josef Strzibny is a software engineer who wrote the book Deployment from Scratch, an introduction to deploying web applications. He published the book a year ago and has made more than $24,000 so far. He goes through the entire process and breaks down all the sales in his blog post.

  • An in-depth dive into Hadoop Distributed File System - This is a great article describing the architecture of HDFS. It also goes into the experience of using HDFS to manage 40 petabytes of data at Yahoo!

How Distributed Databases Work - Replication

Designing Data Intensive Applications (DDIA) is a must-read if you're interested in backend engineering. The data engineering world is full of buzzwords and hype, but Martin Kleppmann does an amazing job of breaking down all the core technologies.

Here’s a summary of Chapter 5 from DDIA on Replication

Replication is where you keep a copy of your data on multiple different machines. These machines are connected via a network, so they’re all accessible to your backend server(s).

Instead of having a single machine serve as your database, you are now using a distributed database consisting of multiple machines.

There are several reasons why you’d want to replicate your data across several computers

  1. Reduce latency - Users in India can send requests to the nodes located in Delhi while users in America can send requests to the nodes located in New York.

  2. Increase Availability - If one of the nodes goes down for some reason you’ll have another node that can take over and respond to requests for data.

  3. Increase Read Throughput - Several nodes have the ability to respond to read queries instead of just having 1 machine doing all the work for read requests. Many workloads are read-scaling (consist of mostly reads and a small percentage of writes) so increasing read throughput is extremely helpful.

The difficult part about replication lies in handling changes to replicated data.

When you get a write request that modifies your database, how do you make sure that all the replicas reflect this write request?

How do you stop replicas that haven’t updated from responding with stale data to read requests?

There are 3 popular strategies for writing changes to all your replicas

  1. Single Leader Replication - One replica node is designated as the leader. The other nodes are followers. Write requests go to the leader node, which then propagates the changes to the followers. This is the replication strategy used by many databases like PostgreSQL, MongoDB, MySQL, and more.

  2. Multi Leader Replication - This is similar to Single Leader, but now multiple nodes can act as the leader and process write requests. Multi-Leader Replication is usually implemented with external tools, such as Tungsten Replicator for MySQL, BDR for PostgreSQL, and GoldenGate for Oracle.

  3. Leaderless Replication - All replica nodes can accept write requests from clients, so there is no leader node. Riak and Cassandra are examples of databases that use leaderless replication strategies. Amazon used leaderless replication for their in-house Dynamo system, so Riak and Cassandra are also known as Dynamo-style databases.

Note - Amazon’s Dynamo system is different from Amazon’s DynamoDB. DynamoDB is based on many principles of Dynamo but has a different implementation. DynamoDB uses single-leader replication.

Almost all distributed databases use one of these three approaches and they all have their pros and cons.

However, Single Leader Replication is the most popular replication strategy for distributed databases. Therefore, we’ll dive further into single leader. If you’re interested in learning more about multi-leader and leaderless strategies, check out the book.

Single Leader Replication

Single Leader Replication works as follows

  1. One of the replicas is designated as the leader. Write requests from clients will be sent to the leader, which writes the new data to its local storage.

  2. The other replicas are known as followers. Whenever the leader writes new data to its local storage, it also sends the data changes to all of the followers.

  3. Each follower takes the data change log from the leader and updates its local copy of the database by applying all the new writes.

  4. When a client wants to read from the database, the read requests can be sent to any of the nodes in the database - leader or follower.
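Here's a small, purely illustrative Python sketch of those four steps. Real databases ship a change log over the network and handle far more edge cases; the Leader/Follower classes below exist only for this example.

```python
class Follower:
    def __init__(self):
        self.data = {}    # the follower's local copy of the database
        self.log = []     # ordered data change log received from the leader

    def apply(self, change):
        key, value = change
        self.log.append(change)
        self.data[key] = value

class Leader:
    def __init__(self, followers):
        self.data = {}
        self.followers = followers

    def write(self, key, value):
        self.data[key] = value        # 1. write to the leader's local storage
        for f in self.followers:      # 2. send the data change to every follower
            f.apply((key, value))     # 3. each follower applies it to its local copy

followers = [Follower(), Follower()]
leader = Leader(followers)
leader.write("user:1", {"name": "Ada"})

# 4. reads can be served by the leader or by any follower
assert followers[0].data["user:1"]["name"] == "Ada"
```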

Writes to the database can be asynchronous, synchronous, or semi-synchronous.

For an asynchronous write, the leader will get the client's write request and update its own local storage. Then, it will respond saying that the write was successful. After it responds, the leader will send a message to all the follower nodes with the data change from the client's write request.

With a synchronous write, the leader will first make sure every follower node has written the data change to their local database. Once the leader node has received confirmation from all the followers, it will respond with a message that the write was successful.

For a semi-synchronous write, the leader will wait for write confirmation from a specific number of follower nodes (this parameter can be configured) before it responds with a message that the write was successful.

In practice, synchronous writes are rarely used. With a synchronous write strategy, write requests will take an extremely long time (since you have to wait for every follower to respond) and will frequently fail (any time one or more follower nodes are not responsive).

Therefore, engineers typically use a semi-synchronous strategy or an asynchronous strategy.

The tradeoff between semi-synchronous and asynchronous write strategies comes down to how fast you want your write requests processed (asynchronous writes are faster) and how durable you want your write requests to be (asynchronous write strategies have a greater chance of losing write data if the leader node crashes before sending write changes to the followers).
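A hedged sketch of that tradeoff: the function below only acknowledges a write once `min_acks` followers have confirmed it, which is the semi-synchronous case. Setting `min_acks=0` gives asynchronous behavior and `min_acks=len(followers)` gives fully synchronous behavior. It reuses the toy `Follower` objects from the sketch above and is illustrative only, not any particular database's implementation.

```python
def replicated_write(leader_storage, followers, key, value, min_acks):
    """Write to the leader, then wait for min_acks follower confirmations."""
    leader_storage[key] = value            # the leader always writes locally first
    acks = 0
    for follower in followers:
        if acks >= min_acks:
            # Enough confirmations; in a real system the remaining followers would
            # receive the change asynchronously (omitted in this sketch).
            break
        try:
            follower.apply((key, value))   # a network call in a real system
            acks += 1
        except Exception:
            pass                           # an unreachable follower simply never acks
    if acks < min_acks:
        raise RuntimeError("write failed: too few follower acknowledgements")
    return "write successful"
```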

Two issues that come up frequently with Single Leader replication are

  • Handling Node Outages

  • Replication Lag

Handling Node Outages

Node outages are inevitable, especially if you’re using a large distributed database with many follower nodes.

There are two types of node outages: follower outages and leader outages.

Follower Failure: Catch-up recovery

If a follower node fails, then it can recover quite easily. Followers keep a log of all the data changes received from the leader in local, nonvolatile storage. Therefore, the follower knows the last transaction it processed.

The follower will query the leader for all the changes that have happened since that last transaction, and then update its local state to match the present.
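In code, catch-up recovery is essentially "send me everything after my last log position". Here's a sketch, continuing the toy classes from earlier (real systems track a log sequence number rather than a list index):

```python
def catch_up(follower, leader_log):
    """Replay every change the follower missed while it was down.

    leader_log is the leader's ordered change log; the follower knows how far it
    got because its own log length is the count of changes it already applied.
    """
    last_applied = len(follower.log)
    for change in leader_log[last_applied:]:
        follower.apply(change)
```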

Leader Failure: Failover

Handling a failure of the leader is trickier. One of the follower nodes needs to be promoted to the new leader and clients have to be reconfigured to send their writes to the new leader. The other followers also have to start consuming data changes from the new leader.

This process is called failover.

Failover processes have tons of things that can go wrong

  • If asynchronous replication is used, then the new leader may not have received all the writes from the old leader before it failed. This means weaker durability guarantees.

  • When the original leader comes back online, it may be misconfigured to think it is still the leader. Having two nodes that both believe they are the leader is a common failure mode and is often called split brain.

  • Load issues can arise since your database can’t accept new writes while the failover process is happening. If the leader node fails often, then this can clog up the database.

Replication Lag

When you’re using a single-leader replication strategy with semi-synchronous or asynchronous writes, you’ll frequently run into consistency issues where a client will read stale data from a follower node that hasn’t been fully updated.

This inconsistency is a temporary state, and if you wait for a bit then all the followers will eventually catch up. Therefore, this effect is known as eventual consistency.

However, eventual consistency is a vague term, and doesn’t specify how long the replication lag is. It could be a few seconds or even a few minutes.

Therefore, even with “eventual consistency”, the replication lag can be a big problem for your users.

To ease these problems, there are several approaches you can use to reduce the most common issues that users face.

We’ll go through some of these approaches and the issues that they solve.

Read Your Own Writes

Let’s say you’re building a twitter clone. The user can post a tweet from their computer, which will send a write request to your distributed database.

This write request is asynchronously replicated, so the leader will respond that the write was successful after changing its local state. Then, it will send the change to all the follower nodes.

If the user refreshes their page right after tweeting and tries to reload, the new read request for the user’s previous tweets might go to a follower node that hasn’t been informed of the new tweet.

Therefore, the user's twitter profile will not show their new tweet after they refresh.

Obviously, this can be super frustrating to users.

Read Your Own Writes consistency is a solution that guarantees that if the user reloads the page, they will always see any updates they submitted themselves. It makes no promises about other users.

Read Your Own Writes consistency can be implemented in many ways. One possible way is to track the last time a user submitted an update. If they submitted an update within the past minute, then their read requests should be handled by the leader node.
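A minimal sketch of that routing rule, assuming we track the timestamp of each user's last write (the one-minute window and the function names are illustrative):

```python
import random
import time

RECENT_WRITE_WINDOW_SECONDS = 60   # assumption: "recent" means within the last minute
last_write_at = {}                 # user_id -> timestamp of that user's last write

def record_write(user_id):
    """Call this whenever a user's write request is accepted."""
    last_write_at[user_id] = time.time()

def pick_read_node(user_id, leader, followers):
    """Route reads to the leader if the user wrote recently, else to any follower."""
    if time.time() - last_write_at.get(user_id, 0) < RECENT_WRITE_WINDOW_SECONDS:
        return leader                   # guaranteed to contain the user's own writes
    return random.choice(followers)     # slightly stale reads are acceptable here
```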

Monotonic Reads

Going back to the twitter clone example, asynchronous replication will result in some follower nodes lagging other follower nodes in terms of updates.

Therefore, a user might visit the website and get tweets from a follower node that is up to date. Afterwards, they might reload the page and get tweets from a follower node that is lagging behind.

This will result in their twitter feed “moving back in time” as the data they're getting is stale. This obviously makes for a bad user experience.

Monotonic reads is a guarantee that this kind of anomaly does not happen.

One way of achieving this guarantee is by making sure that each user always reads from the same follower node (different users can read from different replicas).

The replica can be chosen based on a hash of the user ID, rather than randomly.
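A minimal sketch of that replica-pinning idea (the hashing scheme is illustrative; a real system also has to re-route when the pinned replica fails):

```python
import hashlib

def replica_for_user(user_id: str, replicas: list) -> str:
    """Pin each user to one replica so their reads never move backwards in time."""
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    return replicas[int(digest, 16) % len(replicas)]

replicas = ["replica-1", "replica-2", "replica-3"]
print(replica_for_user("user-42", replicas))   # always the same replica for this user
```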

Consistent Prefix Reads

Let’s say you have user A and user B on your twitter clone app. User A tweets out a picture of his dog. User B reply tweets to that picture with a compliment for the dog.

There is a causal dependency between the two tweets where user B’s reply tweet doesn’t make any sense if you can’t see user A’s tweet.

User C follows both user A and user B. If user A’s tweet goes through more replication lag than user B’s tweet, then user C might see user B’s reply tweet without getting user A’s tweet.

Consistent Prefix Reads is a guarantee that solves this anomaly. This guarantee says that if a sequence of writes happens in a certain order, then anyone reading those writes will see them appear in the same order.

This can be solved if the database always applies the writes in the same order, but there are complications that arise when your database is sharded.

Check out DDIA for more details on this.

Interview Question

How would you build a spelling correction system?

Possible Follow On Questions

  • How would you check if a word is misspelled?

  • How would you find possible suggestions?

  • How would you rank the suggestions for the user?

Previous Question

As a refresher, here’s the previous question

Implement a BSTIterator class that represents an iterator over the in-order traversal of a Binary Search Tree.

You should implement three functions

  • BSTIterator - the constructor of the class

  • hasNext - returns True if there is a number in the traversal to the right of the pointer.

  • next - moves the pointer to the right and then returns the number at the pointer.

Solution

The way we can solve this is by simulating an inorder traversal. When you do an inorder traversal, you typically code it recursively and take advantage of the call stack as a way to keep track of which node you’re on in the BST.

Since we’re building an iterator that does this, we can’t use recursion (and can’t take advantage of the call stack).

Instead, we have to simulate an iterative inorder traversal.

For the iterative inorder traversal, we maintain a stack that simulates the call stack. As we traverse the BST, we follow the inorder pattern of exploring to the left, visiting the node, and then exploring to the right.

Once you understand how to write an iterative inorder traversal, creating the inorder iterator is simple.

We create instance variables for our stack and current node and then implement our iterative inorder traversal logic in the next function. Our hasNext function can just check the length of our stack and the current node to see if we’ve ended the traversal yet.
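Here's what that solution looks like in Python (using the usual `val`/`left`/`right` node convention):

```python
class TreeNode:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

class BSTIterator:
    def __init__(self, root):
        self.stack = []       # simulates the recursion call stack
        self.current = root   # the next subtree to explore

    def hasNext(self) -> bool:
        # There's another value if we still have a subtree to explore
        # or unfinished ancestors waiting on the stack.
        return self.current is not None or len(self.stack) > 0

    def next(self) -> int:
        # Walk as far left as possible, stashing ancestors on the stack.
        while self.current is not None:
            self.stack.append(self.current)
            self.current = self.current.left
        node = self.stack.pop()       # the smallest unvisited node
        self.current = node.right     # its right subtree gets explored next
        return node.val

#        4
#       / \
#      2   6
root = TreeNode(4, TreeNode(2), TreeNode(6))
it = BSTIterator(root)
while it.hasNext():
    print(it.next())   # prints 2, 4, 6 (the in-order traversal)
```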