The Architecture of Stripe's Document Database

Stripe built a document database on top of MongoDB. We'll go over it's architecture and why they built it. Plus, AI hype is getting out of control, how to scale a startup's engineering team and more.

Arpan KG
June 12, 2024

Hey Everyone!

Today we’ll be talking about

The Architecture of Stripe’s Document Database - Stripe wrote a great blog post describing DocDB, their internal database as a service. DocDB is built on MongoDB and stores petabytes of data.
- Brief Intro to MongoDB and its Benefits
- Why Stripe built DocDB
- Architecture of DocDB and how it works
- Rebalancing Data Shards on DocDB for Efficiency
Tech Snippets
- Why Netflix uses FreeBSD
- AI Hype is getting out of Control
- How to Scale a Startup’s Engineering Team
- Build GPT-2 in 4 Hours

The Architecture of Stripe’s Document Database

Stripe is one of the largest payment processors in the world. In 2023, they processed over $1 trillion USD of payment volume, and they did this with a 99.999% uptime.

A crucial system that helped the company achieve this is DocDB, Stripe's internal Database as a Service.

Developers at Stripe can use the API for reading/writing data (OLTP reads/writes) and not have to worry about scaling compute, increasing storage, schema changes, etc. They can just focus on the product they're building.

Stripe's Database Infrastructure team published a fantastic blog post delving into the internals of DocDB and how it works.

When Stripe was founded in 2011, the company adopted MongoDB as their online database. They found it to be easier to use than a traditional relational database.

As the company grew to hundreds of terabytes of data, Stripe built DocDB on top of MongoDB to make scaling easier.

DocDB handles dynamic rebalancing between shards, gives fine-grained control over data distribution, ensures data consistency during migrations and more.

In this article, we'll first give a brief overview of MongoDB and then talk about how Stripe designed DocDB.

MongoDB Overview

MongoDB is a document-oriented database. It stores your data in semi-structured documents using BSON ( a binary format that extends JSON).

It was first developed in 2007 and released as an open-source database in 2009 (note - in 2018, they changed their license to be more restrictive).

The database was created by the founders of DoubleClick, the startup that would later get acquired by Google and become Google Ads.

The founders faced scalability and usability issues with traditional relational databases so they built MongoDB with certain design goals:

Developer-Friendly - Relational databases will store data in structured tables with relationships defined by primary and foreign keys. This doesn't naturally map with the traditional data structures you use in an object-oriented programming language (this issue is called Object-Relational Impedance Mismatch). MongoDB stores data as flexible, JSON-like documents so it's much more natural to map to objects in your code.
Scalability - Horizontal scalability is built into the core of MongoDB and it supports range, hash and zone-based sharding. Document-oriented databases encourage denormalization (where all related data is embedded into a single document) to minimize joins. Avoiding cross-shard joins is crucial for scalability.
Schema Flexibility - With a Relational Database, you need to define a schema and ensure that any data you insert follows the schema. Changing the schema means doing a database migration. On the other hand, MongoDB is schemaless. Each document can have different fields and the data types of those fields can vary from document to document.

Why Stripe built DocDB

Stripe originally started with MongoDB. As they grew (at an insanely fast rate), the engineering team desired additional features.

In order to utilize their database infrastructure most efficiently, they needed to transfer data between different shards in their fleet. Stripe has thousands of shards, so managing this can be very complex.

They wanted a solution that gave them complete operational control, so they could move individual data chunks between shards. Additionally, this needs to be done with minimal downtime and strong data consistency (Stripe is dealing with financial data).

To solve this, they built DocDB on top of MongoDB.

Architecture of DocDB

As we mentioned earlier, DocDB is a Database as a Service that Stripe engineers can use through an API.

Developers can send a read/write request to DocDB and it'll first go to a Database Proxy server.

The proxy server will first check for things like access controls, potential bugs, scalability, etc.

Then, it'll figure out which specific data chunks are being read/modified and it'll talk to a central Chunk Metadata Service to get the locations of the specific database shards.

Finally, the proxy server will send the read/write requests to the specified database shards. Each of the database shards has multiple replicas so they use a CDC (Change Data Capture) service to replicate changes between the replicas.

Data Movement Platform

When you're sharding horizontally, it's important to remember that your data won't be static. You'll need to move data between shards for expanding capacity, hardware upgrades, rebalancing hot/cold shards and more.

In Stripe's case, this is especially complicated because they're handling financial data.

Some of their requirements were

Data Consistency - data being migrated needs to be consistent between both the source and target shards.
Zero Downtime - any prolonged downtime is unacceptable as businesses need to process payments 24/7. Downtime should be under a few seconds so the product application can just retry the read/write request and there is minimal impact to the customer.
Granularity - they should be able to migrate an arbitrary number of data chunks between shards without any restrictions on the number of in-flight transfers or the number of migrations a given shard can perform at once.

Here's the steps they follow for zero-downtime migrations across database shards:

Confirm Migration and build Indexes - the system first registers the start of the migration in the Chunk Metadata Service. It also builds indexes on the target shards for the data chunks that are being migrated.
Bulk Data Import - take a snapshot of the data chunk on the original shard and copy it onto one or more target database shards.
Asynchronous Replication - After you copy over the original shard to the target shard, the original shard will still be getting writes. In this step, you asynchronously replicate any writes that happen on the original shard over to the target shards.
Correctness Check - Take point-in-time snapshots of the source and target shards and compare them to ensure data completeness and correctness.
Traffic Switch - Once the data is imported to the target shard and mutations are being properly replicated, DocDB's coordinator will switch traffic over to the target shard. Stripe does this by first blocking any new writes on the source shard. Then, they wait for the replication service to replicate any outstanding writes to the target shard. Finally, the update the route for the data chunk to point to the target shard in the Chunk Metadata service.
This step takes a few seconds, so any database requests that get blocked during this process can just retry after a small timeout and get served by the new shards.
Finalize Migration - The last step is to mark the migration as complete in the chunk metadata service. They can also delete the chunk data off the source shard.

Results

DocDB's ability to migrate data between shards in a consistent, granular and reliable way has made it significantly easier for Stripe to scale.

In 2023, they migrated petabytes of data between shards and it helped them achieve much better utilization of their database infrastructure.

Tech Snippets

AI Hype is getting out of Control

With the acceleration in AI over the past few years, there’s been a ton of hype generated by the press, wall street and big tech companies.

This video provides a more critical look at the hype and delves into benchmarks, LLM performance and some past fakedemos by big tech companies.

Companies like BP are claiming that they “need 70% fewer coders“ thanks to AI but skeptics are saying this is a massive over-estimation.

https://youtu.be/VctsqOo8wsc

Why Netflix uses FreeBSD

Netflix is a significant contributor to FreeBSD and they maintain an internal distro that they use for their servers.

These are slides from an interesting talk where Netflix delved into why they use FreeBSD, how they’ve improved the operating system, their testing suite for reliability and more.

people.freebsd.org/~gallatin/talks/OpenFest2023.pdf

Build GPT-2 in 4 Hours

Andrej Karpathy is a founding member of OpenAI and was previously Director of AI at Tesla. He runs a terrific YouTube channel where he posts longform content on building LLMs.

In his latest video, he builds the GPT-2 model and talks about optimization techniques like mixed precision training and kernel fusion. He runs benchmarks on the created model and shows that it has performance close to GPT-3.

https://www.youtube.com/watch?v=l8pRSuU81PU

How to Scale a Startup’s Engineering Team

Gad Salner was an Engineering Manager at Melio and he wrote a great blog post on how he scaled their engineering team with the first few dozen hires. You have to balance different aspects like onboarding, knowledge sharing, dev experience and more. Gad talks about Melio's strategy for how they expanded their engineering team while balancing all these areas.

medium.com/meliopayments/how-to-scale-a-unicorn-building-engineering-team-and-stay-sane-40af8ac7e3db