How Facebook Encodes Videos

Hey Everyone,

Today we’ll be talking about

  • How Facebook Encodes Videos

    • Adaptive Bitrate Streaming vs. Progressive Streaming and a brief intro to video encoders

    • Facebook’s process for encoding videos.

    • Facebook’s Benefit-Cost Model for determining the priority of an encoding job in their priority queue

  • Is Rust the future of JavaScript Infrastructure?

    • A brief overview of Rust and what makes it special

    • JavaScript Ecosystem tools that are being built with Rust (SWC, Deno, esbuild)

    • Downsides of Rust for JavaScript tooling

  • Plus, a couple awesome tech snippets on

    • Unity for Software Engineers

    • An introduction to profiling python code

    • Spending $5k to learn how database indexes work

We have a solution to our last Apple interview question and a new question from Facebook.

Quastor Daily is a free Software Engineering newsletter that sends out FAANG Interview questions (with detailed solutions), Technical Deep Dives and summaries of Engineering Blog Posts.

Hundreds of millions of videos are uploaded to Facebook every day.

In order to deliver these videos with high quality and little buffering, Facebook uses a variety of video codecs to compress and decompress videos. They also use Adaptive Bitrate Streaming (ABR).

We’ll first give a bit of background information on what ABR and video codecs are. Then, we’ll talk about the process at Facebook.

Progressive Streaming vs. Adaptive Bitrate Streaming

Progressive Streaming is where a single video file is streamed over the internet to the client.

The video will automatically expand or contract to fit the screen you are playing it on, but regardless of the device, the video file size will always be the same.

There are numerous issues with progressive streaming.

  • Quality Issue - Your users will have different screen sizes, so the video will be stretched/pixelated if their screen resolution is different from the video’s resolution.

  • Buffering - Users who have a poor internet connection will be downloading the same file as users who have a fast internet connection, so they (slow-download users) will experience much more buffering.

Adaptive Bitrate Streaming is where the video provider creates a different video file for each of the screen sizes they want to target.

They can encode the video into multiple resolutions (480p, 720p, 1080p) so that users with slow internet connections can stream a smaller video file than users with fast internet connections.

The player client can detect the user’s bandwidth and CPU capacity in real time and switch between streaming the different encodings depending on available resources.
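To make the switching logic concrete, here's a minimal sketch of how an ABR client might pick a rendition from measured bandwidth. The rendition names, bandwidth thresholds, and `pick_rendition` function are all illustrative, not any real player's logic:

```python
# Available encodings, sorted best-first: (resolution, bandwidth needed in Mbps).
# These thresholds are made-up numbers for illustration.
RENDITIONS = [
    ("1080p", 5.0),
    ("720p", 2.5),
    ("480p", 1.0),
]

def pick_rendition(measured_bandwidth_mbps: float) -> str:
    """Pick the highest-quality rendition the connection can sustain."""
    for name, required in RENDITIONS:
        if measured_bandwidth_mbps >= required:
            return name
    return RENDITIONS[-1][0]  # fall back to the smallest encoding

print(pick_rendition(3.2))  # a ~3 Mbps connection gets the 720p stream
```

A real player re-runs this decision continuously as bandwidth estimates change, which is what lets the stream adapt mid-playback.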

You can read more about Adaptive Bitrate Streaming here.

Video Codec

A video codec compresses and decompresses digital video files.

Transmitting uncompressed video data over a network is impractical due to the size (tens to hundreds of gigabytes).

Video codecs solve this problem by compressing video data and encoding it in a format that can later be decoded and played back.

Examples of common codecs include H264 (AVC), MPEG-1, VP9, etc.

The various codecs have different trade-offs between compression efficiency, visual quality, and how much computing power is needed.

More advanced codecs like VP9 provide better compression performance over older codecs like H264, but they also consume more computing power.

You can read more about video codecs here.

Facebook’s Process for Encoding Videos

So, you upload a video of your dog to Facebook. What happens next?

Once the video is uploaded, the first step is to encode the video into multiple resolutions (360p, 480p, 720p, 1080p, etc.)

Next, Facebook’s video encoding system will try to further improve the viewing experience by using advanced codecs such as H264 and VP9.

The encoding job requests are each assigned a priority value, and then put into a priority queue.

A specialized encoding compute pool then handles the job.

Now, the Facebook web app (or mobile app) and Facebook backend can coordinate to stream the highest-quality video file with the least buffering to people who watch your video.

A key question Facebook has to deal with here is how to assign priority values to jobs.

The goal is to maximize everyone’s video experience by quickly applying more compute-intensive codecs to the videos that are watched the most.

Let’s say Cristiano Ronaldo uploaded a video of his dog at the same time that you uploaded your video.

There will probably be far more viewers for Ronaldo’s video than for yours, so Facebook will want to prioritize encoding Ronaldo’s video (and give those viewers a better experience).

They’ll also want to use more computationally-expensive codecs (that result in better compression ratios and quality) for Ronaldo.

The Benefit-Cost Model

Facebook’s solution for assigning priorities is the Benefit-Cost model.

It relies on two metrics: Benefit and Cost.

The encoding job’s priority is then calculated by taking Benefit and dividing it by Cost.

Benefit

The benefit metric attempts to quantify how much benefit Facebook users will get from advanced encodings.

It’s calculated by multiplying relative compression efficiency by effective predicted watch time.

The effective predicted watch time is an estimate of how long the video will be watched in the near future across its entire audience.

Facebook uses a sophisticated ML model to predict the watch time. They talk about how they created the model (and the parameters involved) in the article.

The relative compression efficiency is a measure of how much a user benefits from the codec’s efficiency.

It’s based on a metric called Minutes of Video at High Quality per GB (MVHQ), which is a measure of how many minutes of high-quality video you can stream per gigabyte of data.

Facebook compares the MVHQ of different encodings to find the relative compression efficiency.
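Here's a tiny sketch of that comparison with made-up numbers (the real values come from Facebook's own measurements of each encoding):

```python
def mvhq(minutes_watched: float, gigabytes_delivered: float) -> float:
    """Minutes of Video at High Quality per GB of data delivered."""
    return minutes_watched / gigabytes_delivered

# Hypothetical: the same 120 minutes of viewing costs 2.0 GB with a
# baseline encoding but only 1.2 GB with a more advanced one.
baseline = mvhq(minutes_watched=120, gigabytes_delivered=2.0)
advanced = mvhq(minutes_watched=120, gigabytes_delivered=1.2)

relative_efficiency = advanced / baseline
print(relative_efficiency)  # ~1.67: viewers get ~67% more HQ minutes per GB
```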

Cost

This is a measure of the amount of logical computing cycles needed to make the encoding family (consisting of all the different resolutions) deliverable.

Some jobs may require more resolutions than others before they’re considered deliverable.

As stated before, Facebook divides Benefit / Cost to get the priority for a video encoding job.
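Putting the pieces together, the prioritization can be sketched as a priority queue keyed on Benefit / Cost. All the numbers and names below are invented for illustration; in the real system, predicted watch time comes from an ML model:

```python
import heapq

def priority(relative_efficiency: float,
             predicted_watch_minutes: float,
             compute_cost: float) -> float:
    """Benefit-Cost priority: (efficiency * watch time) / compute cost."""
    benefit = relative_efficiency * predicted_watch_minutes
    return benefit / compute_cost

# heapq is a min-heap, so push negated priorities to pop the largest first.
queue = []
jobs = [
    ("ronaldos_dog_vp9", priority(1.7, 5_000_000, 800)),  # huge audience
    ("your_dog_vp9",     priority(1.7, 40, 800)),         # small audience
]
for name, p in jobs:
    heapq.heappush(queue, (-p, name))

# The popular video's encoding job is processed first.
print(heapq.heappop(queue)[1])  # -> ronaldos_dog_vp9
```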

After encoding, Facebook’s backend will store all the various video files and communicate with the frontend to stream the optimal video file for each user.

For more details, read the full article here.


Tech Snippets

  • Unity For Software Engineers Archives - This is an awesome series of tutorials on Unity designed for software engineers. It covers all the basic features of Unity like Scenes, Game Objects, Assets, etc. but it also dives into topics like Raycasting, Animations, Pathfinding and more! Facebook is betting really big on VR, so now might be a fantastic time to learn Unity if you’ve been putting it off.

  • How I Tried To Reduce Pylint Memory Usage - Raphael is a Python developer and he works with Django. The codebase he deals with is extremely large, and he found Pylint to be running a bit slow and consuming a lot of memory. The blog post is on his journey profiling Pylint to figure out what the issue was. By the end, he was able to reduce Pylint’s memory usage by 80%. It’s a fantastic introduction on how to profile Python code.

  • Spending $5k to learn how database indexes work - This is a short, interesting blog post on how a mistake with PlanetScale’s serverless database offering led to a pretty large bill. The PlanetScale platform is built on Vitess, an open source solution for horizontally scaling MySQL. Vitess makes certain trade-offs that the author wasn’t aware of that led to the unexpectedly large bill.

Lee Robinson is the head of developer relations for Vercel (the creators of NextJS).

He wrote a great blog post on how he’s seeing the JavaScript web ecosystem embrace the Rust programming language.

Here’s a summary:

Rust is a fast, reliable, and memory-efficient systems language that’s been voted the “most loved” programming language for the past six years (according to the Stack Overflow Developer Survey).

In the past, the vast majority of tools in the JavaScript ecosystem were written in JavaScript or TypeScript.

But, there’s been a trend away from JavaScript towards using systems programming languages.

We’re now seeing the next generation of JavaScript tooling being built using Rust.

Lee goes through several tools across the JavaScript ecosystem.

SWC

SWC is an extensible Rust-based platform that can be used for compilation, bundling, minification, and more.

It’s used by tools like NextJS, Parcel and Deno.

Deno

Deno is a modern and secure runtime for JavaScript and TypeScript. It uses V8 and is built with Rust.

Deno is attempting to replace NodeJS, and it is written by Ryan Dahl, the original creator of NodeJS.

Its linter, code formatter, and docs generator are built using SWC.

esbuild

esbuild is an extremely fast JavaScript bundler and minifier, written in Go.

esbuild is what triggered the trend of building JavaScript tooling with systems programming languages like Go and Rust.

Why not Rust?

Lee also goes through some of the cons of Rust.

One is that Rust has a pretty steep learning curve. It’s a lower level of abstraction than what most web developers are used to.

Therefore, developers will have to think more about algorithms, data structures and memory management.

Additionally, Rust’s usage in the web community is still niche, but this is changing rapidly.

Read the full article for more of Lee’s thoughts.


Interview Question

You are given an array of integers where every integer occurs three times except for one integer, which only occurs once. Find and return the non-duplicated integer.

Input - [6, 1, 3, 3, 3, 6, 6]

Output - 1

Input - [13, 19, 13, 13]

Output - 19

Do this in O(N) time and O(1) space.

We’ll send a detailed solution in our next email, so make sure you move our emails to your primary inbox so you don’t miss them!

Gmail users—move us to your primary inbox

  • On your phone? Hit the 3 dots at the top right corner, click "Move to" then "Primary"

  • On desktop? Back out of this email then drag and drop this email into the "Primary" tab near the top left of your screen

Apple mail users—tap on our email address at the top of this email (next to "From:" on mobile) and click “Add to VIPs”

Previous Solution

As a reminder, here’s our last question

You are given an n x n 2D matrix.

Rotate the matrix by 90 degrees clockwise.

Solution

We can solve this question by breaking it down into two steps.

  1. Find the transpose of the matrix

  2. Reverse the rows of the transpose

The transpose of a matrix is an operation that flips the matrix over its diagonal.

The row and column indices of the matrix are switched.

After finding the transpose of the matrix, you need to iterate through each of the rows and reverse the order of the elements in that row.

These two transformations are equivalent to a 90 degree clockwise rotation.

Here’s the Python 3 code.
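A straightforward in-place implementation of the two steps above:

```python
def rotate(matrix):
    """Rotate an n x n matrix 90 degrees clockwise, in place."""
    n = len(matrix)
    # Step 1: transpose - swap matrix[i][j] with matrix[j][i]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j], matrix[j][i] = matrix[j][i], matrix[i][j]
    # Step 2: reverse each row of the transpose
    for row in matrix:
        row.reverse()
    return matrix

print(rotate([[1, 2], [3, 4]]))  # [[3, 1], [4, 2]]
```

Both passes touch each element a constant number of times, so the rotation runs in O(n²) time with O(1) extra space.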