How Discord Reduced Outgoing Traffic by 40%

Discord reduced WebSocket traffic significantly by switching from zlib to zstd. Plus, how to evaluate dependencies for production code, building an Uber clone from scratch, and more.

Hey Everyone!

Today we’ll be talking about

  • How Discord Reduced WebSocket Traffic by 40%

    • Introduction to Zstandard compression and its advantages

    • How Discord evaluated Zstandard vs. zlib and solved issues with Streaming Compression

    • Tuning Zstandard by optimizing hashlog, chainlog and windowlog

  • Tech Snippets

    • High Quality Apparel and Accessories for Developers

    • How to Evaluate Dependencies for Production Code

    • Building an Uber Clone from Scratch

    • How to Contribute to an Unfamiliar Open Source Project

How Discord Reduced WebSocket Traffic by 40%

Discord is a communication platform that lets users chat via text, voice and video. You can create servers with topic-based channels, share files, and more (it’s like a Slack for gamers). The app was launched in 2015 and has since grown to over 200 million monthly active users.

To send clients real-time updates (when a user has a new message, for example), Discord uses a service called the “gateway.”

In order to reduce bandwidth usage, Discord compresses all messages sent between the gateway and end users. Since 2017, Discord has used zlib compression, which makes messages 2-10x smaller.

Over the past few months, Discord has been replacing zlib with another compression algorithm called Zstandard (zstd).

Making this change (along with a couple of other optimizations) has allowed Discord to reduce their outgoing bandwidth from the gateway service by 40%. They wrote a fantastic article on the steps they took to achieve this.

We’ll be summarizing the article and adding some extra context on zstd.

Introduction to Zstandard

Zstandard (zstd) was first released by Facebook in 2015 with the goal of providing flexibility between high compression ratios and fast compression speed. Since then, it has exploded in popularity and is used by major tech companies like Uber, AWS, Meta, and in software like the Linux kernel.

Some of the key selling points of Zstandard are:

  • Flexibility between Compression Ratios and Speed - Zstandard offers 22 compression levels, allowing developers to select the desired trade-off between speed and compression efficiency. Lower levels prioritize faster compression speeds, while higher levels provide better compression ratios at the cost of more compute and memory usage.

  • Well Supported - Zstandard is well supported, with extensive libraries, great documentation, and a strong community. There are plenty of resources available for troubleshooting, performance tuning, integration, and more.

  • Efficient Compression for Small Payloads - Zstandard uses dictionary compression to improve efficiency. It creates a "lookup table" (dictionary) that maps common data patterns to short codes. This dictionary can be pre-trained on representative data and then reused across multiple compression sessions or files. This is super useful for compressing a series of small files or messages; traditional algorithms might not have enough data to build an effective compression context if they just compress each message independently.
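
To make the dictionary idea concrete, here is a minimal sketch. It uses zlib's preset-dictionary feature from the Python standard library as a stand-in, not Zstandard's own dictionary API (zstd trains its dictionaries from sample data, which this sketch does not do). The dictionary contents and the message shape are made up for illustration.

```python
import json
import zlib

# Hypothetical dictionary: bytes that resemble the messages we expect to send.
# zlib accepts raw bytes like this; zstd would train a dictionary from samples.
PRESET_DICT = (
    b'{"op": 0, "t": "MESSAGE_CREATE", "d": '
    b'{"channel_id": "", "author": "", "content": ""}}'
)


def compress_plain(payload: bytes) -> bytes:
    """Compress one message on its own, with no shared context."""
    return zlib.compress(payload, 6)


def compress_with_dict(payload: bytes) -> bytes:
    """Compress one message with the preset dictionary as prior context."""
    compressor = zlib.compressobj(level=6, zdict=PRESET_DICT)
    return compressor.compress(payload) + compressor.flush()


def decompress_with_dict(blob: bytes) -> bytes:
    """The receiver needs the same dictionary to decode the message."""
    decompressor = zlib.decompressobj(zdict=PRESET_DICT)
    return decompressor.decompress(blob) + decompressor.flush()


# An illustrative message, not a real Discord payload.
message = json.dumps(
    {
        "op": 0,
        "t": "MESSAGE_CREATE",
        "d": {"channel_id": "42", "author": "alice", "content": "hello"},
    }
).encode()

plain = compress_plain(message)
with_dict = compress_with_dict(message)
print(f"raw={len(message)}B plain={len(plain)}B with_dict={len(with_dict)}B")
```

A single small message has very little internal repetition, so compressing it alone barely helps. With the dictionary, most of the JSON scaffolding is encoded as references to bytes both sides already hold.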

Based on these selling points, Discord decided to test out Zstandard in their backend.

Dark Launch Testing

To quickly evaluate Zstandard without going through a lengthy process, Discord opted for a dark launch approach.

Here's how they did it:

  1. Compress a small percentage of production traffic on the backend with both zlib and Zstandard

  2. Collect performance metrics for both compression methods

  3. Discard the Zstandard-compressed data (i.e., don't actually send it to clients)

This allowed them to quickly iterate since they didn’t have to add client libraries on iOS, Android, Windows, etc. They were able to see results in just a few days, rather than waiting weeks or months for a full rollout.
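
The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not Discord's code: `send_to_client`, `Metrics`, and the sampling rule are hypothetical names, and `lzma` from the standard library stands in for the candidate compressor so the example stays dependency-free.

```python
import lzma
import time
import zlib
from dataclasses import dataclass


@dataclass
class Metrics:
    """Running totals for one compressor."""
    messages: int = 0
    raw_bytes: int = 0
    compressed_bytes: int = 0
    seconds: float = 0.0

    def ratio(self) -> float:
        if self.compressed_bytes == 0:
            return 0.0
        return self.raw_bytes / self.compressed_bytes


def measure(compress, payload: bytes, metrics: Metrics) -> bytes:
    """Run one compressor and record its size and timing."""
    start = time.perf_counter()
    out = compress(payload)
    metrics.seconds += time.perf_counter() - start
    metrics.messages += 1
    metrics.raw_bytes += len(payload)
    metrics.compressed_bytes += len(out)
    return out


current_metrics = Metrics()    # the codec clients already understand
candidate_metrics = Metrics()  # the codec under evaluation


def send_to_client(payload: bytes, in_sample: bool) -> bytes:
    """Return the bytes that go on the wire, which is always the current codec."""
    if not in_sample:
        return zlib.compress(payload)
    wire = measure(zlib.compress, payload, current_metrics)
    # Shadow path: compress with the candidate, keep the metrics, drop the output.
    measure(lzma.compress, payload, candidate_metrics)
    return wire


# Sample every 10th message (a hypothetical sampling rule).
payloads = [f'{{"id": {i}, "content": "hello"}}'.encode() for i in range(20)]
for i, payload in enumerate(payloads):
    send_to_client(payload, in_sample=(i % 10 == 0))

print(f"current ratio:   {current_metrics.ratio():.2f}")
print(f"candidate ratio: {candidate_metrics.ratio():.2f}")
```

Because the candidate's output never leaves the server, clients need no changes, and both codecs are measured on exactly the same sampled payloads.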

However, initial results showed Zstandard was performing worse than zlib.

After investigating, the Discord team discovered this was because zlib was using streaming compression, while their Zstandard implementation wasn't.

Streaming Compression

Streaming compression maintains a compression context across multiple messages. The compression stream is initialized when a WebSocket connection opens and persists until the WebSocket closes.

This makes it much easier for the compressor to use knowledge from the previously compressed data in the stream. It results in significantly smaller payload sizes, especially if you’re sending a series of small messages with a similar structure.
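
Here is a minimal sketch of the difference, using zlib from the Python standard library (the same idea applies to Zstandard, though its API differs). The message contents are invented for illustration, and one compressor object plays the role of a single WebSocket connection.

```python
import json
import zlib

# Fifty small, similarly shaped messages (illustrative, not real payloads).
messages = [
    json.dumps(
        {
            "t": "MESSAGE_CREATE",
            "d": {
                "channel_id": "42",
                "author": f"user{i}",
                "content": f"message number {i}",
            },
        }
    ).encode()
    for i in range(50)
]

# Per-message compression: every message starts from an empty context.
independent_total = sum(len(zlib.compress(m)) for m in messages)

# Streaming compression: one compressor and one decompressor live for the
# whole connection, so later messages can reference earlier ones.
compressor = zlib.compressobj()
decompressor = zlib.decompressobj()
streamed_total = 0
received = []
for m in messages:
    # Z_SYNC_FLUSH pushes out everything buffered so far, so the receiver can
    # decode this message right away, without resetting the shared history.
    chunk = compressor.compress(m) + compressor.flush(zlib.Z_SYNC_FLUSH)
    streamed_total += len(chunk)
    received.append(decompressor.decompress(chunk))

print(f"independent: {independent_total} bytes")
print(f"streamed:    {streamed_total} bytes")
```

After the first message, the repeated JSON keys cost almost nothing in the streamed version, which is why the gap grows with the number of similar messages sent on one connection.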

Discord added streaming compression support to the ezstd library they were using and contributed this improvement back to the community.

With this fix, they started to see Zstandard significantly outperform zlib. The compression ratio improved from 6 to 10 and the average payload size dropped from 270 bytes to 166 bytes, roughly 39% smaller.

Tuning Zstandard

After seeing that improvement, the Discord team started optimizing their Zstandard setup.

Zstandard is highly configurable and lets you adjust exactly how much compression efficiency you trade off for performance.

The Discord team focused on optimizing three key parameters:

  • Hashlog: Determines the maximum size of the hash table used for finding matches. Larger hashlog values can improve compression but require more memory and hurt compression speed.

  • Chainlog: Controls the size of the secondary data structure used for collision resolution in the hash table. Similar to Hashlog, higher values can improve compression ratio but at the cost of memory usage and compression speed.

  • Windowlog: This parameter essentially sets the size of the sliding window used for finding matches during compression. This limits how far back the compressor can search for matches. Larger windowlog values improve compression ratio but increase memory usage and compute.

After experimentation, they settled on:

  • Overall compression level: 6

  • Chainlog and hashlog: 16

  • Windowlog: 18
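
To get a feel for what these values mean, note that each "log" parameter is a base-2 exponent. The sketch below converts the chosen settings into sizes. The 4-byte table entry size is an assumption used for a rough estimate; the real compression context carries additional overhead that is not modeled here.

```python
# Settings reported in the article.
HASH_LOG = 16
CHAIN_LOG = 16
WINDOW_LOG = 18

# Assumption for a rough estimate: each table slot holds a 32-bit position.
ENTRY_BYTES = 4


def kib(n_bytes: int) -> float:
    """Convert a byte count to KiB."""
    return n_bytes / 1024


# Each +1 on a "log" parameter doubles the corresponding size.
window_bytes = 1 << WINDOW_LOG   # how far back matches can reach
hash_entries = 1 << HASH_LOG     # slots in the hash table
chain_entries = 1 << CHAIN_LOG   # slots in the chain table
table_bytes = (hash_entries + chain_entries) * ENTRY_BYTES

print(f"window: {kib(window_bytes):.0f} KiB")   # window: 256 KiB
print(f"tables: {kib(table_bytes):.0f} KiB")    # tables: 512 KiB
```

Since a streaming context stays alive for as long as its WebSocket is open, this memory is held per connection. That makes modest values attractive for a service with many concurrent connections.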

Implementation and Rollout

While the original plan was to only implement Zstandard for mobile clients, the bandwidth improvements were so significant that Discord decided to ship it to desktop users as well.

The rollout was gated behind an experiment flag to allow for quick rollback if issues arose. This also let Discord validate their results in production and monitor for any negative impacts on baseline metrics.
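
One common way to build such a gate is deterministic percentage bucketing. The sketch below is an assumption about how this could look, not a description of Discord's experiment system: the function names, the experiment key, the rollout percentage, and the returned labels are all hypothetical.

```python
import hashlib

ROLLOUT_PERCENT = 10  # hypothetical: share of users in the experiment


def bucket(user_id: str, experiment: str) -> int:
    """Map a user to a stable bucket in [0, 100) for a given experiment."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % 100


def choose_compression(
    user_id: str,
    client_supports_zstd: bool,
    rollout_percent: int = ROLLOUT_PERCENT,
) -> str:
    """Pick the codec for a new connection (labels are illustrative)."""
    in_experiment = bucket(user_id, "gateway-zstd") < rollout_percent
    if client_supports_zstd and in_experiment:
        return "zstd-stream"
    # Fallback for users outside the experiment and for older clients.
    return "zlib-stream"


print(choose_compression("user-123", client_supports_zstd=True))
```

Setting the percentage to 0 acts as an instant rollback, and because the bucket is derived from a hash, a given user stays in the same group as the percentage is raised.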

Over several months, Discord successfully rolled out Zstandard compression to all users across all platforms.

Massive Bandwidth Savings

Combined with other optimizations discovered during this project, Discord achieved a 40% reduction in gateway bandwidth usage by clients. This translates to significantly faster performance, especially for users on slower connections or with limited data plans.

Tech Snippets