
How Shopify Ensures Consistent Reads

How Shopify deals with consistency issues due to replication lag with their primary-replica database setup.

Hey Everyone!

Today we’ll be talking about

  • How Shopify Ensures Consistent Reads

    • Using database replicas to deal with a read-heavy workload

    • Dealing with issues around replication lag

    • Causal Consistency and Monotonic Read Consistency

    • Implementing Monotonic Read Consistency with MySQL

  • How Instacart Built Their Autocomplete System

    • Handling misspellings with Levenshtein Edit Distance and checking for semantic duplication with an embeddings-based model

    • Solving the Cold Start problem using query expansion

    • Training a Learning-to-Rank model to rank autocomplete suggestions

  • Tech Snippets

    • A Github Repo With Tons of Resources on Scalability

    • Expectations of Professional Software Engineers

    • Things You Should Do When Joining a New Team

    • Top Announcements of AWS re:Invent 2022

How PayPal Built an End-to-End Monitoring Solution

PayPal enables digital payments for tens of millions of consumers and merchants across 200 markets worldwide. To handle this volume, the company has nine data centers with each having nearly 30,000 instances.

Monitoring their system is extremely important to meet their reliability and uptime targets, so PayPal needed a scalable, fast host monitoring platform to check on instance metrics (CPU, disk usage, memory, etc.), monitor Docker containers, check on databases and more. They also needed a time series database backend to store all this history, and they wanted smart alerting in case anything went wrong.

The company was able to build this entire platform with a single vendor, using InfluxDB’s TICK Stack.

They used Telegraf, an open source server agent, and the InfluxDB platform to monitor all their instances, applications and Docker containers. The InfluxDB time series database gave them a fast, scalable way to store and query all their reporting history. They also configured smart alerts using Kapacitor.

To learn more about how PayPal made this highly scalable and reliable, you can read the full case study.

sponsored

How Shopify Ensures Consistent Reads

Shopify is an e-commerce platform that helps businesses easily build an online store to sell their products. Over 1.75 million businesses use Shopify, and they processed nearly $80 billion in total order value in 2021. In 2022, Shopify merchants are expected to have over 500 million buyers.

For their backend, Shopify relies on a Ruby on Rails monolith with MySQL, Redis and memcached for datastores. If you'd like to read more about their architecture, they gave a good talk about it at InfoQ.

Their MySQL clusters have a read-heavy workload, so they make use of read replicas to scale up. This is where you split your database up into a primary machine and replica machines. The primary database handles write requests (and reads that require strict consistency) while the replicas handle read requests.

An issue with this setup is replication lag. The replica machines can be seconds (or even minutes) behind the primary database and will sometimes send back stale data, leading to unpredictable behavior in your application.

Reading from replicas with varying replication lag produces unpredictable results

Thomas Saunders is a senior software engineer at Shopify, and he wrote a great blog post on how the Shopify team addressed this problem.

Here's a summary

Shopify engineers looked at several possible solutions to solve their consistency issues with their MySQL database replicas.

  • Tight Consistency

  • Causal Consistency

  • Monotonic Read Consistency

Tight Consistency

One method is to enforce tight consistency, where all the replicas are guaranteed to be up to date with the primary server before any other operations are allowed.

In practice, you’ll rarely see this implemented because it largely negates the performance benefits of using replicas. Instead, if you have specific read requests that require strict consistency, you should just have those executed by the primary machine. Other reads that allow more leeway can go to the replicas.

In terms of stronger consistency guarantees for their other reads (that are handled by replicas), Shopify looked at other approaches.

Causal Consistency

Causal consistency is where you can specify that a read request goes to a replica database that is updated to at least a certain point in time.

So, let’s say your application makes a write to the database and later on you have to send a read request that’s dependent on that write. With this causal consistency guarantee, you can make a read request that will always go to a database replica that has at least seen that write.

This can be implemented using global transaction identifiers (GTIDs). Every transaction on the primary database will have a GTID associated with it. When the primary database streams changes to the replica machines, the GTIDs associated with those changes will also be sent.

Then, when you send a read request with causal consistency, you’ll specify a certain GTID for that read request. Your request will only be routed to read replicas that have at least seen changes up to that GTID.
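To make the GTID idea concrete, here’s a minimal sketch in Python of a write followed by a causal read. It assumes a mysql-connector-style client with hypothetical primary_conn and replica_conn connections, and it relies on MySQL’s WAIT_FOR_EXECUTED_GTID_SET function. This is only an illustration of the concept, not Shopify’s implementation (which, as noted below, they ultimately didn’t finish).

```python
# A minimal sketch of GTID-based causal reads. primary_conn and replica_conn
# are hypothetical MySQL connections; this is not Shopify's implementation.

def write_then_causal_read(primary_conn, replica_conn, sql_write, sql_read):
    # 1. Perform the write on the primary and capture its executed GTID set.
    cur = primary_conn.cursor()
    cur.execute(sql_write)
    primary_conn.commit()
    cur.execute("SELECT @@GLOBAL.gtid_executed")
    gtid_set = cur.fetchone()[0]

    # 2. On the replica, block (up to 1 second) until it has applied that GTID set.
    rcur = replica_conn.cursor()
    rcur.execute("SELECT WAIT_FOR_EXECUTED_GTID_SET(%s, 1)", (gtid_set,))
    if rcur.fetchone()[0] != 0:
        raise TimeoutError("replica has not caught up to the required GTIDs")

    # 3. The replica is now at least as current as the write, so the read is causal.
    rcur.execute(sql_read)
    return rcur.fetchall()
```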

Shopify considered (and began to implement) this approach in their MySQL clusters, but they found that it would be too complex. Additionally, they didn’t really need this for their use cases and they could get by with a weaker consistency guarantee.

Monotonic Read Consistency

With monotonic read consistency, you have the guarantee that when you make successive read requests, each subsequent read will go to a database replica that’s at least as up-to-date as the replica that served the previous read.

This prevents the “moving back in time” consistency issue, where you make two database read requests but the second request goes to a replica with more replication lag than the first. The second query would then observe the system state at an earlier point in time than the first, potentially resulting in a bug.

The easiest way to implement this is to look at any place in your app where you’re making multiple sequential reads (that need monotonic read consistency) and route them to the same database replica.

We delve into how Shopify implemented this below.

Implementing Monotonic Read Consistency

Shopify uses MySQL, and application access to the database servers goes through a proxy layer provided by ProxySQL.

In order to provide monotonic read consistency, Shopify forked ProxySQL and modified the server selection algorithm.

An application that requires read consistency across a series of requests can attach an additional UUID when sending those read requests to the proxy layer.

The proxy layer will use that UUID to determine which read replica to send the request to. Read requests with the same UUID will always go to the same read replica.

They hash the UUID to generate an integer and then mod that integer by the sum of all the database replica weights. The result determines which replica the group of read requests will go to.

The consistent_read_id is used to generate a hash that yields an index into a weighted list of servers. In this example, every time we receive this particular consistent_read_id, server 1 will be selected.
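Here’s a minimal sketch of that server-selection scheme, with a made-up list of replicas and weights. Shopify’s actual change lives inside their ProxySQL fork; this just illustrates how hashing the UUID pins a group of reads to one replica.

```python
import hashlib

# Hypothetical (host, weight) pairs; the real weights come from ProxySQL config.
REPLICAS = [("replica-1", 3), ("replica-2", 2), ("replica-3", 1)]

def pick_replica(consistent_read_id: str) -> str:
    # Hash the UUID into an integer, then mod by the sum of the replica weights.
    digest = hashlib.md5(consistent_read_id.encode()).hexdigest()
    slot = int(digest, 16) % sum(weight for _, weight in REPLICAS)

    # Walk the weighted list; the same UUID always maps to the same slot,
    # so every read carrying it lands on the same replica.
    for host, weight in REPLICAS:
        if slot < weight:
            return host
        slot -= weight

# pick_replica("4f3c...-uuid") returns the same host every time it's called.
```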

For more details, you can read the full blog post here.

If you’d like to learn more about consistency models and the different types of consistency guarantees, I’d highly recommend reading Designing Data-Intensive Applications by Martin Kleppmann (check out chapter 5 on data replication for this topic).

How did you like this summary?

Your feedback really helps me improve curation for future emails.


Tech Snippets

  • Awesome Scalability - This is a great GitHub repo with a ton of talks, case studies, blog posts and more on designing large-scale, performant systems. It groups resources by specific areas of design like performance, availability, and scalability. It also lists articles on scaling engineering teams and management.

  • Expectations of Professional Software Engineers - Mike Acton is a veteran of the games industry and he gave a great talk in 2019 on things he expects of the developers he works with. Some examples are "I can articulate precisely what problem I'm trying to solve", "I can articulate how much my problem is worth solving", "I can articulate the most concrete use case of what I'm developing" and more. This is a good checklist to think about when you're working on a project.

  • How To Be Amazing In Any Team As A Dev (From a Staff Engineer at Meta) - Rahul Pandey is a former Staff Engineer at Meta and he now runs Taro, a community for developers who want to discuss career growth. He posted a great video on his YouTube channel about what you should do when you join a new team. Tips include setting clear expectations on milestones you should hit, figuring out who you should meet in the organization, and not wasting the first few weeks just reading documentation.

  • Top Announcements of AWS re:Invent 2022 - AWS re:Invent is AWS's developer conference where they make a ton of announcements and keynotes on products being offered. Here's a link where you can view articles and videos on all the announcements.


How Instacart Built their Autocomplete System

Instacart is a delivery platform with more than 10 million users. You can use the app/website to order items from local stores and have an Instacart driver drop them off at your home. Groceries, medicine, clothing and more are sold on the platform.

With such a wide range of products sold, having a high quality autocomplete feature is extremely important. Autocomplete can not only save customers time, but also recommend products that a customer didn’t even know they were interested in. This can increase how much the customer spends on the app and boost Instacart’s revenue.

Esther Vasiete is a senior Machine Learning Engineer at Instacart and she wrote a great blog post diving into how they built their autocomplete feature.

Here’s a summary

When a user enters an input in the Instacart search box, this input is referred to as the prefix. Given a prefix, Instacart wants to generate potential suggestions for what the user is thinking of.

For the prefix “ice c”, Instacart might suggest “ice cream”, “ice cream sandwich”, “iced coffee”, etc.

In order to generate search suggestions, Instacart relies on their massive dataset of previous customer searches. Their vocabulary consists of 57,000 words extracted from 11.3 million products and brands. From this, they’ve extracted ~800k distinct autocomplete terms across all retailers.

These terms are loaded into Elasticsearch and Instacart queries it when generating autocomplete suggestions.
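As a rough illustration, here’s a minimal sketch of indexing a handful of terms and querying them by prefix with the official Elasticsearch Python client. The index name, field name, and terms are assumptions for illustration; Instacart’s actual schema and query shape aren’t described at this level of detail.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Index a few autocomplete terms (Instacart loads ~800k of these).
terms = ["ice cream", "ice cream sandwich", "iced coffee", "mozzarella"]
for i, term in enumerate(terms):
    es.index(index="autocomplete_terms", id=i, document={"suggestion": term})
es.indices.refresh(index="autocomplete_terms")

# Query for suggestions matching the prefix the user has typed so far.
resp = es.search(
    index="autocomplete_terms",
    query={"match_phrase_prefix": {"suggestion": "ice c"}},
)
print([hit["_source"]["suggestion"] for hit in resp["hits"]["hits"]])
# -> ['ice cream', 'ice cream sandwich'] (order depends on scoring)
```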

Some challenges that Instacart had to deal with around search suggestions were

  • Handling User Misspellings - A user might accidentally think “Mozzarella” is spelled as “Mozarella”. Therefore, when he types “Mozare”, he won’t see any helpful autocomplete suggestions.

  • Semantic Deduplication - “Mac and Cheese” and “Macaroni and Cheese” are two phrases for the same thing. If a user types in “Mac” in the search box, recommending both phrases would be a confusing experience. Instacart needs to make sure each phrase recommended in autocomplete refers to a distinct item.

  • Cold Start Problem - When a customer searches for an item, she searches for it at a specific retailer. She might search for “bananas” at her local grocery store or “basketball” at her local sporting goods store. Therefore, Instacart autocomplete will give suggestions based on the specific retailer. This presents a cold start problem when a new retailer joins the platform, since Instacart doesn’t have any past data to generate autocomplete suggestions off.

We’ll talk about how they solved each of these…

Handling Misspellings

If a customer accidentally thinks "Mozzarella" is spelled as "Mozarella", Instacart’s autocomplete should still suggest the correct term when the user types in "Mozar" into the search box.

To accomplish this, Instacart relies on fuzzy matching with Elasticsearch’s fuzziness parameter. That parameter works by using the Levenshtein Edit Distance algorithm under the hood, where you can specify the maximum edit distance you want to allow matches for.
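Here’s a minimal sketch of what a fuzzy match looks like, reusing the assumed index and field name from the snippet above. Setting fuzziness to "AUTO" lets Elasticsearch pick the maximum Levenshtein edit distance based on the term’s length; this illustrates the parameter, not Instacart’s exact query.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# A misspelled query ("mozarela") can still match the indexed term "mozzarella"
# because fuzziness allows a bounded number of edits (insertions/deletions/substitutions).
resp = es.search(
    index="autocomplete_terms",
    query={
        "match": {
            "suggestion": {
                "query": "mozarela",   # what the user actually typed
                "fuzziness": "AUTO",   # or an explicit max edit distance like 1 or 2
            }
        }
    },
)
print([hit["_source"]["suggestion"] for hit in resp["hits"]["hits"]])  # ['mozzarella']
```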

Semantic Deduplication

Many times, the same item will be called multiple things. Macaroni and cheese can also be called Mac and cheese.

The autocomplete suggestion should not display both "Macaroni and Cheese" and "Mac and Cheese" when the user types in “Mac”. This would be confusing to the user.

Therefore, Instacart trained an embeddings-based model that learned the relationship between queries and products. They can use this model to generate a similarity score between two queries (by taking the dot product of their embeddings).

Terms like Mayo and Mayonnaise will give a very high semantic similarity score. The same applies for cheese slices and sliced cheese.
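As a toy illustration of that scoring step, here’s a sketch that takes the dot product of two query embeddings. The vectors are made up; in practice they would come from the trained query/product embedding model described above.

```python
import numpy as np

def similarity(query_a_embedding: np.ndarray, query_b_embedding: np.ndarray) -> float:
    # Dot product of (typically L2-normalized) embeddings; a score near 1.0
    # suggests the two queries are duplicates (e.g. "mayo" vs "mayonnaise").
    return float(np.dot(query_a_embedding, query_b_embedding))

mayo = np.array([0.71, 0.70, 0.05])          # made-up embedding
mayonnaise = np.array([0.69, 0.72, 0.06])    # made-up embedding
print(similarity(mayo, mayonnaise))  # high score -> show only one suggestion
```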

Instacart published a paper on how they trained this model, which you can read here.

Cold Start Problem

Instacart’s autocomplete system will give completely different autocomplete suggestions based on the retailer. A grocery store will not give the same autocomplete suggestions as a sporting goods store.

Therefore, if a new retailer joins the Instacart platform, the app will have to generate autocomplete suggestions for them. This creates a cold start problem for new or smaller retailers, where Instacart doesn’t have a good dataset of past searches.

Instacart solves this by using a neural generative language model that can look at the retailer’s product catalog (list of all the items the retailer is selling) and extract potential autocomplete suggestions from that.

This is part of the Query Expansion area of Information Retrieval systems. Instacart uses doc2query to handle this.
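Here’s a minimal sketch of doc2query-style query expansion using Hugging Face transformers. The checkpoint name is a publicly available doc2query model used here as an assumption for illustration; Instacart’s actual model, prompts, and filtering will differ.

```python
# A sketch of generating candidate autocomplete terms from a product description.
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_name = "castorini/doc2query-t5-base-msmarco"  # assumed public checkpoint
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

# A product description from a new retailer's catalog.
product = "Wilson indoor/outdoor basketball, official size 7, rubber cover"

inputs = tokenizer(product, return_tensors="pt")
outputs = model.generate(**inputs, max_length=16, num_return_sequences=3, do_sample=True)

# Each generated query is a candidate autocomplete suggestion for this retailer.
for seq in outputs:
    print(tokenizer.decode(seq, skip_special_tokens=True))
```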

Ranking Autocomplete Suggestions

In the paragraphs above, we talked about how Instacart generates possible autocomplete suggestions for a user’s prefix.

However, these also need to be ranked so that the most relevant suggestion is at the top. To do this, they trained a Machine-Learned Ranking model.

This model predicts the probability that an autocomplete suggestion will result in the user adding something to their online shopping cart. The higher the probability of an add-to-cart, the higher that autocomplete suggestion would be ranked.

The model can be broken down into two parts: an autocomplete conversion model and an add to cart model.

The autocomplete conversion model takes in the user’s prefix and then predicts the probability that the user will select a certain autocomplete suggestion. It’s a binary classification task where some examples of features are

  • Whether the suggestion is a fuzzy match (slightly different spelling)

  • How popular the suggestion is

  • The rate at which the suggestion is clicked on

The add to cart model calculates a conditional probability: given that the user has clicked on a certain autocomplete suggestion, what is the probability that he adds one of the resulting products to his cart? This is another binary classification task where some of the features are

  • The add to cart rate for that autocomplete suggestion

  • The rate at which that autocomplete suggestion returns zero or a low number of results

These two models are then combined to generate the probability that a certain autocomplete suggestion will result in the user adding something to her cart.
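As a toy example of how the combination works, here’s a sketch that multiplies the two probabilities and sorts suggestions by the result. The numbers are made up; in production both probabilities come from the trained binary classifiers described above.

```python
# P(add to cart | prefix, suggestion) =
#   P(click suggestion | prefix) * P(add to cart | clicked suggestion)
def rank(suggestions_with_probs):
    scored = [(p_click * p_cart, s) for s, p_click, p_cart in suggestions_with_probs]
    return [s for _, s in sorted(scored, reverse=True)]

candidates = [
    # (suggestion, P(click | prefix="ice c"), P(add to cart | click)) -- made-up values
    ("ice cream", 0.40, 0.55),
    ("ice cream sandwich", 0.15, 0.60),
    ("iced coffee", 0.20, 0.35),
]
print(rank(candidates))  # ['ice cream', 'ice cream sandwich', 'iced coffee']
```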

The combined model is used to rank autocomplete suggestion terms. It resulted in a 2.7% increase in autocomplete engagement and an increase in gross transaction value per user.

For more details, you can read the full blog post here.

How did you like this summary?

Your feedback really helps me improve curation for future emails. Thanks!


Interview Question

You are given an array of k linked lists.

Each list is sorted in ascending order.

Merge all the linked lists into one sorted linked list and return it.