The Architecture of DoorDash's Search Engine

We'll talk about Apache Lucene and how DoorDash built a Search Engine using it. Plus, Hashing explained with interactive graphics, curated resources on debuggers and more.

Arpan KG
March 19, 2024

Hey Everyone!

Today we’ll be talking about

The Architecture of DoorDash’s Search Engine
- Issues DoorDash had with Elasticsearch
- Introduction to Apache Lucene
- DoorDash’s Document Indexer
- Executing Queries and Searching
- Making the Search Engine Multi-tenant
Overcoming the “Senior Engineer Plateau”
- Advancing from a junior to a senior developer can be a relatively straightforward path. However, advancing beyond senior can be much harder
- Benjamin Yolken is an engineering leader and he wrote a fantastic blog post with tips on leveling up to Staff and Principal roles
- You need to focus on demonstrating impact. This can be done through cross-team leadership, mentorship and more.
Tech Snippets
- Hashing Explained with Interactive Graphics
- Curated Resources on Debuggers
- 45 Ways to Break an API server
- Architecture of a Solo Business

The Architecture of DoorDash’s Search Engine

DoorDash is one of the largest food delivery apps in the world with close to 40 million users and hundreds of thousands of restaurants/stores. They’re active in over 30 countries.

One of the most prominent features in the app is the search bar, where you can look for restaurants, stores or specific items.

Previously, DoorDash’s search feature would only return stores. If you searched for “avocado toast”, then the app would’ve just recommended the names of some hipster joints close to you.

Now, when you search for avocado toast, the app will return specific avocado toast options at the different restaurants near you (along with their pricing, customization options, etc.).

This change (along with growth in the number of stores/restaurants) meant that the company needed to quickly scale their search system. Restaurants can each sell tens/hundreds of different dishes and these all need to be indexed.

Previously, DoorDash relied on Elasticsearch for full-text search. However, they were running into issues:

Scaling - In Elasticsearch, the document index is split into shards that are spread across multiple nodes, and each shard has multiple replicas. DoorDash found this replication mechanism too slow for their purposes.
Customization - Elasticsearch didn’t have enough support for modeling complex document relationships. Additionally, it didn’t have enough features for query understanding and ranking.

To solve this, the DoorDash team decided to build their own search engine. They built it using Apache Lucene but customized the indexing and searching processes based on their own specification. They published a fantastic blog post on how they did it.

In this edition of Quastor, we’ll give an introduction to Apache Lucene and then talk about how DoorDash’s search engine works.

If you’d like to remember the concepts we discuss in this article, check out Quastor Pro.

You’ll get detailed Spaced-Repetition Anki Flash Cards on all the concepts covered in past Quastor Articles. The flash cards cover concepts like load balancing, network protocols, databases and more!

Introduction to Apache Lucene

Lucene is a high-performance library for building search engines. It was written in 1999 by Doug Cutting (he later co-created Apache Hadoop) based on work he did at Apple, Excite and Xerox PARC.

Elasticsearch, MongoDB Atlas Search, Apache Solr and many other full-text search engines are all built on top of Lucene.

Some of the functionality Lucene provides includes

Indexing - Lucene provides libraries to split up all the document words into tokens, handle different stems (runs vs. running), store them in an efficient format, etc.
Search - Lucene can parse query strings and search for them against the index of documents. It has different search algorithms and also provides capabilities for ranking results by relevance.
Utilities & Tooling - Lucene provides a GUI application that you can use to browse and maintain indexes/documents and also run searches.

DoorDash used components of Lucene for their own search engine. However, they also designed their own indexer and searcher based on their own specifications.

DoorDash Search Engine Architecture

DoorDash’s search engine is designed to be horizontally scalable, general-purpose and performant.

All search engines have two core tasks

Indexing Documents - taking any new documents (or updates) and processing them into a format that lets their text be searched quickly. A very common structure is the inverted index, which maps each word to the documents that contain it (similar to the index section in the back of a textbook). Lucene uses an inverted index.
Searching the Index - taking a user’s query, interpreting it and then retrieving the most relevant documents from the index. You can also apply algorithms to score and rank the search results based on relevance.
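
To make the two tasks concrete, here’s a minimal sketch of an inverted index in Python. It’s only an illustration under simple assumptions (whitespace tokenization, no ranking); the function names `build_inverted_index` and `search` are hypothetical and are not Lucene’s or DoorDash’s API.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_inverted_index(documents):
    """Indexing: map each token to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Searching: return the IDs of documents that contain every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    postings = [index.get(token, set()) for token in tokens]
    return set.intersection(*postings)


# Made-up example documents
documents = {
    1: "Avocado toast with poached egg",
    2: "Chocolate glazed donuts",
    3: "Avocado salad",
}
index = build_inverted_index(documents)
results = search(index, "avocado toast")  # only document 1 contains both words
```

Looking up a word is a single dictionary access, no matter how many documents there are. That’s the reason this structure is so popular for full-text search.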

DoorDash built the indexing system and the search system into two distinct services. They designed them so that each service could scale independently. This way, the document indexing won’t be affected by a big spike in search traffic and vice-versa.

Indexer

The indexer is responsible for taking in documents (food/restaurant listings for example), and then converting them into an index that can be easily queried. If you query the index for “chocolate donuts” then you should be able to easily find “Dunkin Donuts”, “Krispy Kreme”, etc.

The indexer will convert the text in the document to tokens, apply filters (stemming and converting the text to lowercase) and add it to the inverted index.
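
Here’s a rough sketch of what that tokenize-and-filter step might look like in Python. Lucene’s real analyzers are far more sophisticated; the `stem` and `analyze` functions are hypothetical names, and the stemming rules are a crude assumption made for illustration.

```python
import re


def stem(token):
    """Very crude stemmer (an assumption for illustration, not a real stemming algorithm)."""
    if token.endswith("ing") and len(token) > 5:
        base = token[:-3]
        # Undo consonant doubling, e.g. "runn" -> "run"
        if len(base) >= 2 and base[-1] == base[-2]:
            base = base[:-1]
        return base
    if token.endswith("s") and not token.endswith("ss") and len(token) > 3:
        return token[:-1]
    return token


def analyze(text):
    """Lowercase the text, split it into alphanumeric tokens, then stem each token."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [stem(token) for token in tokens]


tokens = analyze("Chocolate DONUTS!")  # ["chocolate", "donut"]
```

The same `analyze` function has to be applied to both documents and queries. Otherwise a query for “donuts” would never match a document that was indexed under “donut”.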

DoorDash uses Apache Lucene to handle this process and create the inverted index. The index is then split into smaller index segment files so they can easily be replicated.

The index segment files are uploaded to AWS S3 after creation.

In order to scale the number of documents that are ingested, DoorDash splits indexing traffic into high-priority and low-priority.

High-priority updates are indexed immediately whereas low-priority updates are indexed in a batch process (runs every 6 hours).
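
A simple way to picture this split is sketched below. The `PriorityIndexer` class and its methods are hypothetical; DoorDash’s post doesn’t describe the implementation at this level, so treat this as one possible design rather than how their system works.

```python
class PriorityIndexer:
    """Indexes high-priority updates right away and buffers the rest for a batch job."""

    def __init__(self):
        self.indexed = []      # stand-in for the real search index
        self.pending_low = []  # low-priority updates waiting for the next batch

    def submit(self, document, high_priority):
        if high_priority:
            self.indexed.append(document)      # indexed immediately
        else:
            self.pending_low.append(document)  # deferred until run_batch()

    def run_batch(self):
        """Meant to be called by a scheduler (every 6 hours in the setup described above)."""
        count = len(self.pending_low)
        self.indexed.extend(self.pending_low)
        self.pending_low.clear()
        return count


indexer = PriorityIndexer()
indexer.submit("store hours changed", high_priority=True)
indexer.submit("menu description tweak", high_priority=False)
flushed = indexer.run_batch()  # the one buffered update is now indexed
```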

Searcher

The searcher is responsible for searching through the index files and returning results.

It starts by downloading the index segments from AWS S3 and making sure it’s working with the most up-to-date data.
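
As a sketch of that sync step, the snippet below uses an in-memory `ObjectStore` class as a stand-in for an S3 bucket so that it runs without any cloud access. All class names, method names and segment keys here are assumptions made for illustration.

```python
class ObjectStore:
    """In-memory stand-in for a bucket such as S3."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = data

    def get(self, key):
        return self._objects[key]

    def list_keys(self):
        return sorted(self._objects)


class Searcher:
    """Keeps a local copy of every index segment published to the store."""

    def __init__(self, store):
        self.store = store
        self.local_segments = {}

    def sync(self):
        """Download only the segments that aren't held locally yet; return their keys."""
        new_keys = [k for k in self.store.list_keys() if k not in self.local_segments]
        for key in new_keys:
            self.local_segments[key] = self.store.get(key)
        return new_keys


store = ObjectStore()
store.put("segment_0001", b"...segment bytes...")  # uploaded by the indexer
searcher = Searcher(store)
first_sync = searcher.sync()   # downloads segment_0001
second_sync = searcher.sync()  # nothing new to download
```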

When a user query comes in, the searcher will use Lucene’s search functions to search the query against the index segment files and find matching results.
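
The sketch below shows the general idea of searching several segments and merging the results. Scoring by raw term frequency is an assumption that stands in for Lucene’s real relevance scoring, and `build_segment` and `search_segments` are hypothetical helpers.

```python
from collections import Counter


def build_segment(documents):
    """Build one segment as a mapping: token -> {doc_id: term frequency}."""
    segment = {}
    for doc_id, text in documents.items():
        for token, freq in Counter(text.lower().split()).items():
            segment.setdefault(token, {})[doc_id] = freq
    return segment


def search_segments(segments, query, top_k=3):
    """Score documents in every segment, then return the top_k (doc_id, score) pairs."""
    scores = Counter()
    for segment in segments:
        for token in query.lower().split():
            for doc_id, freq in segment.get(token, {}).items():
                scores[doc_id] += freq
    # Highest score first; ties are broken by doc_id so the order is deterministic
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


# Made-up example segments
segment_a = build_segment({"a1": "avocado toast", "a2": "chocolate donuts"})
segment_b = build_segment({"b1": "avocado salad"})
hits = search_segments([segment_a, segment_b], "avocado toast")
```

Here `a1` matches both query words and `b1` matches only one, so `a1` is ranked first.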

The searcher is designed to be replicated across multiple nodes so it can handle increases in search traffic.

Tenant Isolation

DoorDash’s search engine is designed to be a general-purpose service for all the teams at the company. It’s a multi-tenant service.

Some of the considerations with a multi-tenant service include

Noisy Neighbor Problem - if one team is experiencing a spike in queries then that shouldn’t degrade performance for the other teams.
Updating - teams should be able to index new documents without affecting other tenants. Indexing errors should be contained to the team itself.
Customization - teams should be able to create their own custom index schemas and custom query pipelines.
Monitoring - there should be a monitoring system in place for the usage and performance stats of each team/tenant. This helps with accountability, resource planning and more.

In order to address these concerns, the DoorDash team built their search engine with the concept of Search Stacks. These are independent collections of indexing/searching services that are dedicated to one particular use-case (one particular index).

Each tenant can have their own search stack and DoorDash orchestrates all their search stacks through a control plane.
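
Here’s a toy model of that idea in Python. The `SearchStack` and `ControlPlane` classes are hypothetical and heavily simplified (each stack is just a dictionary of documents), but they show how per-tenant stacks keep data, failures and usage stats separate.

```python
class SearchStack:
    """An isolated index plus searcher dedicated to a single tenant."""

    def __init__(self, tenant):
        self.tenant = tenant
        self.documents = {}
        self.query_count = 0  # per-tenant usage stat for monitoring

    def index(self, doc_id, text):
        self.documents[doc_id] = text.lower().split()

    def search(self, query):
        self.query_count += 1
        terms = query.lower().split()
        return sorted(
            doc_id
            for doc_id, tokens in self.documents.items()
            if all(term in tokens for term in terms)
        )


class ControlPlane:
    """Creates search stacks and routes each request to the right tenant's stack."""

    def __init__(self):
        self.stacks = {}

    def create_stack(self, tenant):
        if tenant in self.stacks:
            raise ValueError(f"stack already exists for tenant {tenant!r}")
        self.stacks[tenant] = SearchStack(tenant)
        return self.stacks[tenant]

    def search(self, tenant, query):
        return self.stacks[tenant].search(query)

    def usage(self):
        return {tenant: stack.query_count for tenant, stack in self.stacks.items()}


# Made-up tenants and documents
control_plane = ControlPlane()
control_plane.create_stack("restaurants").index("r1", "Avocado toast")
control_plane.create_stack("grocery").index("g1", "Ripe avocado")
restaurant_hits = control_plane.search("restaurants", "avocado")
```

Because each tenant’s documents live in a separate stack, a query from the restaurants team never touches the grocery team’s index.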

Tech Snippets

Architecture of a Solo Business

Feelback is a small SaaS company that’s run by a solo founder. It allows users to embed a feedback form on their site and collect user responses.

This is a great blog post that dives into the architecture of the project. The platform is hosted on Fly and is designed around low maintenance and cost-effectiveness.

If you’re interested in starting up your own solo SaaS project as a side hustle, then this is a great read.

www.feelback.dev/blog/feelback-saas-launch-architecture

Introduction to Hashing

Sam Rose creates amazing visual, interactive explanations of different concepts in engineering. This is a terrific blog post he made delving into hashing, why it’s useful and how it works. In the blog post you’ll build your own simple hash map.

This is a fantastic read if you’d like to learn more about how hash functions work. It’s also great if you just want to see the interactive explanations and how Sam visualizes complex concepts.

samwho.dev/hashing

Curated Resources on Debuggers

This is an in-depth curation of resources on debuggers. You’ll learn about how they work, the tech under the hood, current areas of research and more.

Some of the links include
- Writing a Linux Debugger
- How Stack unwinding works
- How GDB works

and more.

werat.dev/blog/learning-about-debuggers

45 ways to break an API server

This is a really handy reference list of good tests you should run to check your API. It includes tests for attempted code injection, different character encodings, unsupported HTTP methods/versions and much more.

dev.to/zvone187/45-ways-to-break-an-api-server-negative-tests-with-examples-4ok3

Premium Content

Subscribe to Quastor Pro for long-form articles on concepts in system design and backend engineering.

Past article content includes

System Design Concepts

Measuring Availability
API Gateways
Database Replication
Load Balancing
API Paradigms
Database Sharding
Caching Strategies
Event Driven Systems
Database Consistency
Chaos Engineering
Distributed Consensus

Tech Dives

Redis
Postgres
Kafka
DynamoDB
gRPC
Apache Spark
HTTP
DNS
B Trees & LSM Trees
OLAP Databases
Database Engines

When you subscribe, you’ll also get Spaced Repetition (Anki) Flashcards for reviewing all the main concepts discussed in prior Quastor articles.

The Senior Engineer Plateau

If you’re a developer at a FAANG company, then you’re probably aware of the “senior plateau”. Advancing from a junior developer to a senior can be relatively straightforward but getting promoted beyond “senior” can be much more difficult.

Benjamin Yolken is a software engineer at Rippling and he was previously a Principal Engineer at Segment. He’s also spent time in leadership roles at companies like Twitter, Stripe, Airbnb and more.

He writes a fantastic blog about his experiences working in tech. This is a summary of his terrific advice on getting past the “senior plateau”.

Why Level Matters

The level you’re at determines what your compensation will be. Each level will have specific salary and equity bands. Getting promoted will also come with equity refreshers (depending on the company) so it can be very lucrative.

Many companies will also tie bonuses to your level. Benjamin talked about how Stripe would give a target of 10% of base salary as a bonus for senior engineers. Staff engineers got a target of 20% of base salary for their bonus.

levels.fyi is a fantastic site to check for up-to-date data on compensation per level for the different big tech companies.

Common Frustrations about Leveling Up

Benjamin delves into some of the common frustrations that senior engineers will have when they’re trying to get promoted.

Senior devs have to realize that shipping code is no longer enough to get you promoted above senior engineer. Instead, you need to focus on demonstrating impact. This is done by cross-team leadership, mentorship, shipping complex projects, etc.

Your direct manager’s word is also not enough anymore. You need people outside of your immediate team to support your promotion.

How to Level Up

Benjamin delves into 3 tips for leveling up faster.

Find the right role - Before you sign a job offer (or switch roles within your company), you should understand what the role values/rewards. Is cross-team coordination crucial for the project you’ll be working on? Is it a type of role that will give you a high amount of visibility?
Check the boxes - Consult your company’s career ladder documentation and work with your manager to figure out what boxes you need to check for promotion. Figure out what pieces you’re missing and work on improving them.
Improve your visibility - Do regular “skip level” meetings with your manager’s manager. They should understand what you’re working on and the impact you’re having. It’s also great to send out progress reports and launch announcements to others beyond your immediate team.

For more details, read the full blog post here.