How Quora scaled MySQL to 100k+ Queries Per Second
Plus, what makes a great manager of software engineers, how shopify uses pair programming for mentorship, actionable tips on improving speed and more.
Hey Everyone!
Today we'll be talking about
Scaling MySQL at Quora
Quora uses a sharded MySQL cluster to store questions, answers, upvotes, comments, etc.
Scaling Database Reads with Caching and Query Optimization
Scaling Database Storage by switching to RocksDB
Scaling Database Writes with Parallelized Replication
Tech Snippets
Actionable Tips on Improving your Coding Speed and Efficiency
How Shopify uses Pair Programming for Mentorship
The Value of In-house Expertise
What Makes a Great Manager of Software Engineers
Career Advice for Junior Developers
How Quora scaled MySQL
Quora is a social media site where you can post questions for the community to respond to. They have over 300 million monthly active users with tens of thousands of questions posted per day.
To serve this traffic, they make heavy use of MySQL. They have a sharded configuration that stores tens of terabytes and can scale to serve hundreds of thousands of QPS (queries per second).
If you’re curious about how they sharded MySQL, you can read about that here.
In addition to adding more machines to the MySQL cluster, Quora had to make sure that the existing set up was running as efficiently as possible.
Vamsi Ponnekanti is a software engineer at Quora and he wrote a fantastic article delving into the different factors of database load and the specific steps the engineering teams took for optimization.
With database load, there’s 3 main factors to scale: read load, write load and data size.
Read Volume - How many read requests can you handle per second. Quora scaled this with improving their caching strategy (changing the structure to minimize cache misses) and also by optimizing inefficient read queries.
Write Volume - How many write requests can you handle per second. Quora isn’t a write-heavy application. The main issue they were facing was with replication between the primary-replica nodes. They fixed it by changing how writes are replayed.
Data Size - How much data is stored across your disks. Quora optimized this by switching the MySQL storage engine from InnoDB (the default) to RocksDB.
We’ll delve into each of these in greater detail.
Read Volume
Quora has a read-heavy workload, so optimizing reads is super important. However, different kinds of reads require different optimizations.
One way to sub-divide the excess load from your read traffic is with
Complex Queries - Queries that are CPU-intensive and hog up the database (joins, aggregations, etc.).
High Queries Per Second requests - If you have lots of traffic then you’ll be dealing with a high QPS regardless of how well you design your database.
Here’s how Quora handled each of these.
Complex Queries
For complex queries, the strategy is just to rewrite these so they take less database load.
For example
Large Scan Queries - If you have a query that’s scanning a large number of rows, change it to use pagination so that query results are retrieved in smaller chunks. This helps ensure the database isn’t doing unnecessary work.
Slow Queries - There’s many reasons a query can be slow: no good index, unnecessary columns being requested, inefficient joins, etc. Here’s a good blog post on how to find slow queries and optimize them.
High QPS Queries
The solution for reducing load from these kinds of queries is an efficient caching strategy. At Quora, they found that inefficient caching (lots of cache misses) was the cause for a lot of unnecessary database load.
Some specific examples Quora saw were
Fetching a user’s language preferences - Quora needed to check what languages a user understands. Previously, they’d query the cache with (user_id, language_id) and receive a yes/no response. A query of (carlos_sainz, spanish) would be checking if the user Carlos Sainz understood spanish. They’d run this query for all 25 languages Quora supported - (carlos_sainz, english), (carlos_sainz, french), etc.
This led to a large key-space for the cache (possible keys were all user_ids multiplied by the number of languages) and it resulted in a huge amount of cache misses. People typically just know 1 or 2 languages, so the majority of these requests resulted in No. This was causing a lot of unnecessary database load.
Quora changed their cache key to just using the user id (carlos sainz) and changed the payload to just send back all the languages the user knew. This increased the size of the payload being sent back (a list of languages instead of just yes/no) but it meant a significantly higher cache hit rate.
With this, Quora reduced the QPS on the database by over 90% from these types of queries.Inefficient Caching for Sparse Data Sets - Another issue with caching that Quora frequently ran into was with sparse data sets in one dimension. For example, they might have to query the database to see if a certain question needs to be redirected to a different question (this might happen if the same question is reposted).
The vast majority of questions don’t need to be redirected so Quora would be getting only a few “redirects” and a large number of “don’t redirect”.
When they just cached by question_id, then the cache would be filled with No’s and only a few Redirects. This took up a ton of space in the cache and also led to a ton of cache misses since the number of “redirects” was so sparse.
Instead, they started caching ranges. If question ids 123 - 127 didn’t have any redirects for any of the questions there, then they’d cache that range has having all No’s instead of caching each individual question id.This led to large reductions in database load for these types of queries, with QPS dropping by 90%.
Reducing Disk Space used by Database
Another part of scaling databases is dealing with the huge amount of data you need to store.
Having large tables has lots of second order effects that make your life harder
As table size grows, a smaller percentage of table data fits in the database buffer pool, which means disk I/O increases and you get worse performance
Backup and restore times increase linearly with table size
Backup storage size will also grow at a linear rate
Therefore, you should have a clear understanding of what data needs to be stored and what data can be removed. Having optimal data retention policies is crucial.
Now, for optimizing the way data is stored on disk, Quora did this by integrating RocksDB.
RocksDB is a key-value store developed at Facebook that is a fork of Google’s LevelDB. It’s commonly swapped in as the storage engine (the storage engine is responsible for how data is stored and retrieved from disk) for NoSQL databases.
Using MyRocks (RocksDB) to reduce Table Size
MySQL allows you to swap out the storage engine that you’re using. InnoDB is the default but a common choice for scaling MySQL is to use RocksDB.
MyRocks was first built at Facebook where they integrated RocksDB as the storage engine for MySQL. You can read an in-depth analysis of the benefits here.
One of the big benefits is increased compression efficiency. Your data is written to disk in a block of data called a page. Databases can read/write data one page at a time. When you request a particular piece of data, then the entire page is fetched into memory.
With InnoDB, page sizes are fixed (default is 16 KB). If the data doesn’t fill up the page size, then the remaining space is wasted. This can lead to extra fragmentation.
With RocksDB, you have variable page sizes. This is one of the biggest reasons RocksDB compresses better. For an in-depth analysis, I’d highly suggest reading the Facebook blog post.
Facebook was able to cut their storage usage in half by migrating to MyRocks.
At Quora, they were able to see an 80% reduction in space for one of their tables. Other tables saw 50-60% reductions in space.
Optimizing Writes
With Quora, their database load is not write heavy. If they had any write-heavy workloads, then they used HBase (a write-optimized distributed database modeled after Google Bigtable) instead of MySQL.
However, one issue they saw was excessive replication lag between MySQL instances. They have a primary instance that processes database writes and then they have replica instances that handle reads. Replica nodes were falling behind the primary in getting these changes.
The core issue was that replication replay writes happen sequentially by default, even if the writes happened concurrently on the primary.
The temporary solution Quora used was to move heavy-write tables off from one MySQL primary node onto another node with less write pressure. This helped distribute the load but it was not scalable.
The permanent solution was to incorporate MySQL’s parallel replication writes feature.
For all the details, you can read the full blog post here.