How Pinterest Optimized Video Playback
An introduction to Adaptive Bitrate Streaming and how Pinterest was able to reduce startup latency for videos. Plus, the architecture of open source applications and how Anthropic was able to improve RAG.
Hey Everyone!
Today we’ll be talking about
How Pinterest Optimized Video Playback
Introduction to Adaptive Bitrate Streaming, HLS and DASH
Why Pinterest was experiencing high startup latency for videos
Embedding the video manifest files in their metadata API and improving performance with caching
Tech Snippets
Digital Signatures and how to avoid them
The Architecture of Open Source Applications
Anthropic’s new blog post on Contextual Retrieval for RAG
How Pinterest Optimized Video Playback
Pinterest is a social media platform that helps you discover ideas and inspiration related to whatever you’re interested in (cooking recipes, home decor, clothing, etc.).
The platform was launched in 2010 and it’s grown to over 500 million monthly active users. Pinterest is now publicly traded and valued at more than $20 billion.
Like every other social platform, video is one of the most popular content mediums on Pinterest. When you’re serving videos to your users, one of your highest priorities should be minimizing buffering and startup delay. With modern-day attention spans, even a couple of seconds of buffering can cause a huge number of users to leave your app.
Pinterest engineering published a great blog post on how they optimized video playback and reduced startup latency by 36%.
We’ll give some context on how videos are streamed, what protocols are involved and what Pinterest did to optimize playback.
Introduction to Adaptive Bitrate Streaming
When you’re delivering video to users, one technique that’s used universally nowadays is Adaptive Bitrate Streaming.
This is where you take the video and encode it at multiple bitrates and resolutions and store them all on your server. When a user wants to play the video, their phone will select the optimal rendition based on factors like network bandwidth and device characteristics to minimize any buffering.
With Adaptive Bitrate Streaming, the player can also switch dynamically between different bitrates. If the internet connection weakens while they’re watching a video on their phone, ABR allows the player to automatically switch to a lower bitrate stream so playback can be smooth without any buffering interruptions.
When the network improves, the player will automatically switch back to the higher bitrate stream to provide better video quality.
Basics of Adaptive Bitrate Streaming
There are different protocols you can use for Adaptive Bitrate Streaming, but they share some common fundamentals.
Chunking - the video file is broken up into small chunks. Each chunk ranges from 2-10 seconds in length.
Multiple Renditions - Each chunk is encoded at multiple bitrates and resolutions.
Manifest File - a manifest file contains metadata about the available renditions for every chunk, including their bitrates and resolutions.
Dynamic Selection - the user’s video player will use the manifest file to determine which chunk to download based on the current network conditions and device capabilities.
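To make the dynamic selection step concrete, here’s a minimal sketch of how a player might pick a rendition from a manifest. The rendition list and the 0.8 safety factor are assumptions for illustration, not any real player’s logic.

```python
# Available renditions, sorted highest bitrate first: (bitrate in kbps, label).
RENDITIONS = [
    (4500, "1080p"),
    (2500, "720p"),
    (1200, "480p"),
    (600, "360p"),
]

def select_rendition(measured_bandwidth_kbps: float) -> str:
    """Pick the highest bitrate that fits within a safety margin of the
    measured bandwidth, falling back to the lowest rendition."""
    budget = measured_bandwidth_kbps * 0.8  # leave headroom to avoid buffering
    for bitrate, label in RENDITIONS:
        if bitrate <= budget:
            return label
    return RENDITIONS[-1][1]  # connection too weak: serve the lowest rendition
```

A real player re-runs this kind of decision for every chunk, which is what lets it switch streams mid-playback as network conditions change.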
The most widely adopted Adaptive Bitrate protocols are HTTP Live Streaming (HLS) and Dynamic Adaptive Streaming over HTTP (DASH).
As you’ve probably guessed from the names, HLS and DASH are both built on top of HTTP.
HTTP Live Streaming (HLS)
HLS was developed by Apple in 2009 and it’s one of the earliest and most widely adopted ABR protocols. The video stream is broken into small, HTTP-based downloads. It supports both live and on-demand streaming.
It’s developed and maintained by Apple so it’s natively supported on iOS, macOS and Safari.
HLS uses .m3u8 manifest files to guide the player in selecting the most appropriate video chunks based on real-time network conditions.
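For a sense of what that looks like, here’s a simplified master `.m3u8` playlist (the filenames and bitrates are made up for illustration). Each entry points to a media playlist for one rendition, which in turn lists that rendition’s chunks:

```
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=600000,RESOLUTION=640x360
video_360p.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1200000,RESOLUTION=842x480
video_480p.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1280x720
video_720p.m3u8
```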
Dynamic Adaptive Streaming over HTTP (DASH)
DASH was created by a consortium of companies led by MPEG (Moving Picture Experts Group). The protocol was first published in 2012 and it currently powers platforms like YouTube and Netflix.
DASH uses .mpd manifest files to provide metadata about the available renditions and chunk URLs.
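A `.mpd` manifest is XML rather than a playlist. Here’s a heavily simplified example (IDs, durations and filenames are illustrative) showing two renditions of the same video, each with a template the player uses to build chunk URLs:

```xml
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="static"
     mediaPresentationDuration="PT30S">
  <Period>
    <AdaptationSet mimeType="video/mp4">
      <Representation id="360p" bandwidth="600000" width="640" height="360">
        <SegmentTemplate media="video_360p_$Number$.m4s"
                         initialization="video_360p_init.m4s" duration="4"/>
      </Representation>
      <Representation id="720p" bandwidth="2500000" width="1280" height="720">
        <SegmentTemplate media="video_720p_$Number$.m4s"
                         initialization="video_720p_init.m4s" duration="4"/>
      </Representation>
    </AdaptationSet>
  </Period>
</MPD>
```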
Video Streaming at Pinterest
At Pinterest, HLS is used for delivering videos on iOS while DASH is used on Android.
HLS: Utilized for video streaming on iOS devices through Apple’s AVPlayer, accounting for approximately 70% of video playback sessions on iOS apps.
DASH: Employed for video streaming on Android devices using ExoPlayer, representing around 55% of video playback sessions on Android.
One of the key metrics Pinterest measures for video performance is startup latency - the time it takes for a video to begin playing after a user initiates playback.
As we stated above, both HLS and DASH require downloading a manifest file before video playback can begin. With HLS, the player may also have to download additional manifest files (for the specific rendition) after downloading the master playlist.
Only after these manifest files are downloaded can the video player start fetching the first few chunks of the video. These extra round trips are the primary contributor to users’ perceived startup latency.
The Pinterest team decided to eliminate the latency from the round trips by embedding all the relevant manifest files in the original API response. When a user first requests metadata for a video (thumbnail, title, etc.), the API response to that request will also contain the manifest files of the video.
During playback, the player can swiftly access the manifest information locally and immediately start downloading video chunks.
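A minimal sketch of what embedding a manifest in the metadata response could look like. The field names (`manifest_b64`, etc.) and the JSON/base64 encoding are assumptions for illustration, not Pinterest’s actual API schema:

```python
import base64
import json

def build_pin_response(title: str, thumbnail_url: str, manifest_bytes: bytes) -> str:
    """Server side: serialize the manifest and embed it alongside the Pin's
    metadata, so the player needs no extra round trip before playback."""
    return json.dumps({
        "title": title,
        "thumbnail": thumbnail_url,
        "manifest_b64": base64.b64encode(manifest_bytes).decode("ascii"),
    })

def extract_manifest(response_json: str) -> bytes:
    """Client side: decode the embedded manifest and hand it to the player."""
    payload = json.loads(response_json)
    return base64.b64decode(payload["manifest_b64"])
```

The key point is that the manifest arrives with the metadata the app was going to fetch anyway, so playback can begin as soon as the user taps the video.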
Reducing API Response Time
When Pinterest started including manifest files in the API responses, the primary issue they faced was increased latency for the API endpoint. The backend now had to retrieve manifest files before it could respond with video metadata.
They were able to solve this issue with caching. They added a MemCache layer into the manifest serving process to cache the most popular video manifest files.
Here’s the new process for retrieving manifest files.
API Request - a client requests a Pin’s metadata
Manifest Embedding - the backend retrieves manifest files from S3, serializes them, and embeds the bytes within the API response
MemCache - Subsequent requests for popular video manifest files are served immediately from the MemCache caching layer.
Response Delivery - the API delivers the payload with the manifest data embedded
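The cache-in-front-of-S3 flow above can be sketched as a simple read-through cache. Here a dict stands in for MemCache, and `fetch_from_s3` is a hypothetical placeholder for the real S3 call:

```python
CACHE: dict[str, bytes] = {}  # stand-in for the MemCache layer

def fetch_from_s3(video_id: str) -> bytes:
    # Placeholder for the real S3 fetch (e.g. via boto3).
    return f"#EXTM3U manifest for {video_id}".encode()

def get_manifest(video_id: str) -> bytes:
    """Serve popular manifests from cache; fall back to S3 on a miss."""
    manifest = CACHE.get(video_id)
    if manifest is None:
        manifest = fetch_from_s3(video_id)  # cache miss: one S3 round trip
        CACHE[video_id] = manifest          # populate cache for later requests
    return manifest
```

Since popular videos are requested far more often than the rest, even a simple cache like this absorbs most of the S3 reads and keeps the metadata endpoint fast.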
Results
With this new setup, Pinterest was able to see a 36.7% reduction in p90 startup latency on iOS. They also saw a 12.3% reduction in the number of users who had to wait longer than 1 second for a video to start.