How DoorDash Manages Inventory in Real Time for Hundreds of Thousands of Retailers

Plus, how HRT reduced their code tangling issues, optimizing hash tables, the power of two random choices and more.

Hey Everyone!

Today we’ll be talking about

  • How DoorDash Manages Inventory in Real Time for Hundreds of Thousands of Retailers

    • The Engineering Challenges of Adding Retailers to the DoorDash Platform

    • Using Cadence Workflows and CockroachDB

    • The Architecture of DoorDash’s Inventory Management Platform

  • Tech Snippets

    • How Hudson River Trading (HRT) reduced their Code Tangling Issues

    • Using Functional Programming at Booking.com

    • The Power of Two Random Choices for Load Balancing

    • Optimizing Hash Tables

    • How Cloudflare Traced a Packet Loss Issue to a Bug in the Linux Kernel

    • The Fallacy of Having Developers Split Time on Multiple Projects

How Recommendation Engines Work

Have you ever wanted to dig into the inner workings of tech like Video Compression algorithms, Recommendation Engines or GPS?

Academic papers and textbooks on topics like these can be extremely hard to understand and time-consuming to dig through. But getting a grasp on them can expand your knowledge of real world systems and help you become a better developer.

Luckily, Brilliant makes this extremely easy by providing interactive, visual lessons on topics in applied CS, machine learning, engineering, data science and much more.

They gamify learning to make it super convenient to build a daily learning habit where you’re mastering concepts in computer science, machine learning, mathematics, etc.

That’s why Brilliant is used by over 10 million developers, data scientists, researchers and lifelong learners.

With the link below, you can get a 30-day free trial to check it out. You’ll also get a 20% discount when you subscribe.

sponsored

How DoorDash Manages Inventory in Real Time for Hundreds of Thousands of Retailers

DoorDash is a food delivery service with over 30 million users in 27 countries. They are the largest food delivery app in the United States and have hundreds of thousands of restaurants, retailers, grocery stores, etc. on the platform.

The company started as a food delivery platform that exclusively served meals from restaurants. However, they’ve since expanded to grocery stores, retailers (like Target or DICK’s Sporting Goods), convenience stores and more.

As you might imagine, this has required a ton of engineering to operate smoothly at scale. One big problem is keeping track of inventory.

A grocery store will have hundreds or even thousands of different items. DoorDash needs to track the inventory of these items and display the current stock in real time. As you might’ve experienced, it’s quite frustrating to place a delivery order for frosted strawberry Pop-Tarts only to find out later that they were sold out.

DoorDash built a highly scalable inventory management platform to prevent this from taking place.

The platform ingests inventory data from sources like

  • CSV files from retailers with data on inventory

  • In-app signals from DoorDash drivers who indicate in the app that an item is out of stock

  • Internal tooling that uses machine learning to predict inventory based on historical data

DoorDash wanted a platform that can ingest data from these sources and quickly provide a fresh view of the in-store inventory of all the stores on the app. This platform has to support hundreds of thousands of retailers.

They needed the platform to be

  • Highly Scalable - DoorDash is a “hypergrowth” stage startup, growing by double-digit percentages every year. The platform should support this growth.

  • Highly Reliable - All inventory update requests from merchants should eventually be processed successfully.

  • Low Latency - The item data is time-sensitive, so the DoorDash app needs to be updated as quickly as possible with the latest inventory data.

  • High Observability - DoorDash employees should be able to see the detailed and historical item-level inventory information in the system.

Chuanpin Zhu and Debalin Das are software engineers at DoorDash and they published a fantastic blog post delving into the architecture of the inventory management platform and the design choices, tech stack and challenges.

Tech Stack

Here are some of the interesting tech choices DoorDash made for their inventory management platform.

Cadence

Cadence is an open-source workflow orchestration tool that was developed at Uber to help them manage their microservices. If you have multiple backend services that need to be chained together then you can use Cadence to coordinate the task between all these services, handle any errors, retry failed requests, get observability into execution history and more.

Cadence workflows are stateful, so they keep track of which services have successfully executed, timed-out, failed, been rate-limited, etc.

If you need to implement a billing service where you have customer sign ups, free trial periods, cancellations, monthly charges, etc. then you can create a Cadence workflow to manage all the different services involved. Cadence will help you make sure you don’t double charge a customer, charge someone who’s already cancelled, etc.

You can see Java code that implements this in Cadence here.
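To make the "don't double charge" idea concrete, here's a minimal sketch in plain Python. It does not use the Cadence API; `BillingWorkflow`, `charge` and `cancel` are hypothetical names made up for illustration. The point is only that a workflow which remembers its own state can safely ignore a retried request or a charge after cancellation.

```python
from dataclasses import dataclass, field


@dataclass
class BillingWorkflow:
    """Toy stand-in for a stateful workflow; this is NOT the Cadence API."""
    customer_id: str
    cancelled: bool = False
    charged_periods: set = field(default_factory=set)  # periods already billed

    def cancel(self) -> None:
        self.cancelled = True

    def charge(self, period: str) -> bool:
        """Charge for one billing period; return True only if a charge is made."""
        if self.cancelled:
            return False  # never charge a customer who has cancelled
        if period in self.charged_periods:
            return False  # a retry for the same period must not double charge
        self.charged_periods.add(period)
        return True


wf = BillingWorkflow("customer-1")
first = wf.charge("2023-01")         # the charge goes through
retry = wf.charge("2023-01")         # the retried request is ignored
wf.cancel()
after_cancel = wf.charge("2023-02")  # rejected after cancellation
```

A real workflow engine also persists this state durably so it survives process crashes, which this in-memory sketch does not attempt.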

CockroachDB

CockroachDB is a distributed SQL database that was inspired by Google Spanner. It’s designed to give you the benefits of a traditional relational database (SQL queries, ACID transactions, joins, etc.) but it’s distributed, so it’s extremely scalable.

It’s also wire-protocol compatible with Postgres, so the majority of database drivers and frameworks that work with Postgres also work with CockroachDB.

DoorDash has been using CockroachDB to replace Postgres. With this, they’ve been able to scale their workloads while minimizing the amount of code refactoring.

Some other tech DoorDash uses includes

  • gRPC

  • Apache Kafka

  • Apache Flink

  • Snowflake

and more. (We’ve already discussed these extensively in past Quastor articles.)

Architecture

The system ingests item inventory data from a variety of sources.

These sources include

  • CSV files from retailers with data on item inventory

  • DoorDash drivers marking items as out of stock in the app

  • Integrations with Point-of-Sale devices to see what items are being purchased

And more.

Here are the steps

API Controller - This is the entrypoint of inventory data to the platform. The sources will communicate with the controller with gRPC and send data on inventory many times a day.

Now, the Cadence Workflow begins.

There are several different microservices being called in succession to handle different tasks. Cadence will execute these jobs and track the workers to make sure they’re still running. If one of the services fails, then Cadence will automatically retry it.
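Here's a rough sketch of that "run services in succession and retry on failure" behavior, written in plain Python rather than with Cadence itself. `run_workflow`, the step names and the `max_attempts` parameter are all assumptions for illustration; Cadence additionally handles things like durable state, timeouts and backoff.

```python
from typing import Any, Callable, List, Tuple

Step = Tuple[str, Callable[[Any], Any]]


def run_workflow(steps: List[Step], data: Any, max_attempts: int = 3):
    """Run steps in order, retrying a failing step up to max_attempts times."""
    history = []  # (step name, attempt number, outcome), kept for observability
    for name, step in steps:
        for attempt in range(1, max_attempts + 1):
            try:
                data = step(data)
                history.append((name, attempt, "ok"))
                break
            except Exception as exc:
                history.append((name, attempt, f"failed: {exc}"))
        else:
            # The inner loop never hit `break`, so every attempt failed.
            raise RuntimeError(f"step {name!r} failed after {max_attempts} attempts")
    return data, history


calls = {"hydrate": 0}


def persist(items):
    return items + ["persisted"]


def flaky_hydrate(items):
    # Hypothetical service that times out on its first call only.
    calls["hydrate"] += 1
    if calls["hydrate"] == 1:
        raise TimeoutError("catalog service timed out")
    return items + ["hydrated"]


result, history = run_workflow([("persist", persist), ("hydrate", flaky_hydrate)], [])
```

The `history` list plays the role of the execution history mentioned earlier: it records every attempt, including the failed one.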

Raw Feed Persistence - The inventory data from the various sources is first validated and then persisted to CockroachDB and to Kafka. One of the consumers reading from Kafka is DoorDash’s data warehouse (Snowflake), so data scientists can later use this data to train ML models.
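As an illustration of the validation step, here's a small sketch that splits a CSV feed into valid and rejected rows. The column names (`store_id`, `item_id`, `quantity`) and the validation rules are assumptions; the original post doesn't specify the feed schema.

```python
import csv
import io


def parse_inventory_feed(csv_text: str):
    """Split a CSV feed into valid rows and rejected rows with a reason."""
    valid, rejected = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            quantity = int(row["quantity"])
        except (KeyError, TypeError, ValueError):
            rejected.append((row, "quantity is not an integer"))
            continue
        if not row.get("store_id") or not row.get("item_id"):
            rejected.append((row, "missing store_id or item_id"))
        elif quantity < 0:
            rejected.append((row, "negative quantity"))
        else:
            valid.append(
                {"store_id": row["store_id"], "item_id": row["item_id"], "quantity": quantity}
            )
    return valid, rejected


# Hypothetical feed: one good row and three bad ones.
feed = "store_id,item_id,quantity\ns1,poptarts,12\ns1,milk,-3\n,eggs,5\ns2,bread,lots\n"
valid, rejected = parse_inventory_feed(feed)
```

Only the valid rows would move on to be persisted; keeping the rejected rows with a reason makes it easier to tell a retailer what was wrong with their file.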

Hydration - After being persisted, the inventory data is sent to a Catalog Hydration service. This service enriches the raw inventory data with additional metadata and this gets written to CockroachDB.
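Conceptually, hydration is a lookup-and-merge. The sketch below assumes a hypothetical in-memory `catalog` keyed by `item_id`; the actual metadata fields and how DoorDash's service fetches them aren't described in the post.

```python
def hydrate(raw_items, catalog):
    """Attach catalog metadata to each raw inventory row."""
    enriched = []
    for item in raw_items:
        metadata = catalog.get(item["item_id"], {})  # unknown items get no extras
        # Raw feed fields come last so they are never overwritten by metadata.
        enriched.append({**metadata, **item})
    return enriched


# Hypothetical catalog and raw rows.
catalog = {"poptarts": {"name": "Frosted Strawberry Pop-Tarts", "category": "Breakfast"}}
raw = [{"item_id": "poptarts", "quantity": 12}, {"item_id": "mystery", "quantity": 1}]
enriched = hydrate(raw, catalog)
```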

Out of Stock Predictive Classification - DoorDash trained an ML model to use the enriched inventory data to predict whether an item will be available in store or not. If an item has historically been extremely popular and there’s very little inventory left, then the ML model might predict that the item is going to be out of stock.
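The intuition in that example (popular item, little stock left) can be shown with a toy heuristic. To be clear, this is not DoorDash's ML model; `likely_out_of_stock`, `avg_daily_sales` and the `threshold_days` cutoff are made up purely to illustrate the kind of signal such a model might pick up on.

```python
def likely_out_of_stock(quantity: int, avg_daily_sales: float, threshold_days: float = 0.5) -> bool:
    """Toy rule: flag an item when the remaining stock covers too little demand."""
    if quantity <= 0:
        return True  # nothing on the shelf
    if avg_daily_sales <= 0:
        return False  # no recorded demand, so the stock is unlikely to run out
    days_of_stock_left = quantity / avg_daily_sales
    return days_of_stock_left < threshold_days


popular_low_stock = likely_out_of_stock(quantity=2, avg_daily_sales=40.0)
popular_well_stocked = likely_out_of_stock(quantity=200, avg_daily_sales=40.0)
```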

Guardrails - The next part of the workflow is the guardrails service, where DoorDash has certain checks configured to catch any potential errors. These are based on hard-coded conditions and rules to check for any odd behavior. If the inventory of an item is completely different from past historical norms, then a guardrail could be triggered and the update might get restricted.
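A hard-coded guardrail of this kind might look something like the sketch below. The rule (reject anything more than `max_ratio` times the historical average) and the default of 5 are assumptions for illustration, not DoorDash's actual thresholds, and this version only catches upward spikes.

```python
from statistics import mean


def violates_guardrail(new_quantity: int, past_quantities: list, max_ratio: float = 5.0) -> bool:
    """Return True if the new quantity is far above the historical average."""
    if not past_quantities:
        return False  # no history to compare against, so let the update through
    baseline = max(mean(past_quantities), 1)  # avoid comparing against a zero baseline
    return new_quantity > max_ratio * baseline


suspicious = violates_guardrail(10_000, [20, 25, 30])  # likely a data entry error
normal = violates_guardrail(28, [20, 25, 30])
```

An update that trips the guardrail could be held back for review instead of being shown to customers.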

Generate Final Payload - The final service in the Cadence workflow generates the payload with the updated inventory data stored in CockroachDB. This payload then gets sent to the Menu Service and the updated information is displayed in the DoorDash app/website. The payload is also sent to DoorDash’s data lake (AWS S3) for any future troubleshooting.

This was the initial MVP. DoorDash made some incremental changes to this to improve its scalability.

The changes include

  • Batching item updates so the system could process multiple item inventory updates at a time

  • CockroachDB table optimizations
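Batching itself is a simple idea: group updates into chunks so each downstream call or database write handles many items at once. Here's a minimal sketch; the helper name `batched` and the batch size are assumptions, and the post has the details on how DoorDash actually batches.

```python
from itertools import islice


def batched(updates, batch_size):
    """Yield lists of at most batch_size updates, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(updates)
    # islice pulls the next batch_size items; an empty list ends the loop.
    while batch := list(islice(iterator, batch_size)):
        yield batch


batches = list(batched(range(7), 3))
```

With seven updates and a batch size of three, this produces two full batches and one final batch with the leftover item.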

For more details on these scalability improvements, you can read the full post here.



Tech Snippets