The Design of Uber's Push Notification System

November 21, 2022

Hey Everyone!

Today we’ll be talking about

How Uber Schedules Push Notifications
- The Different Components of Uber's Consumer Communication Gateway
- Using Linear Programming To Determine the Optimal Send Time
- Scheduling and Delivering Notifications
Tech Snippets
- Building a Startup's Engineering Team
- Airbnb's Key-Value Storage Engine
- Scaling Kafka at PayPal

Plus, we have a solution to our last coding interview question on string matching.

How Uber Schedules Push Notifications

Push notifications are a crucial channel for Uber Eats to inform customers of new restaurants, promotions, offerings and more. The team first started sending marketing push notifications in March 2020, and they’ve quickly grown the volume to billions of notifications per month.

However, there were a variety of issues with these notifications that caused a poor user experience

Quality Issues - notifications were being sent after hours, with invalid deeplinks, expired promo codes and more.
Timing Issues - notifications were being sent within minutes or hours of each other, overwhelming users
Too Much Work - the marketing team had to manually check messages for quality/timing, which resulted in a lot of work that could potentially be automated.

In order to solve this, Uber built a system called the Consumer Communication Gateway (CCG).

The CCG will receive all the marketing-related notifications that the various teams at Uber want to send and put them all in a buffer. Then, it scores all the notifications based on how useful each one is and uses that to schedule the notifications for a date/time and eventually deliver them.

Earlier this month, Uber Engineering published a great blog post that delves into how the Consumer Communication Gateway works.

Here’s a summary

At Uber, marketing push notifications will be sent by various teams, like Marketing, Product, City Operations and more. They send their notification to the Consumer Communication Gateway (CCG) to be sent out.

The CCG consists of 4 main components

Persistor - accepts the push notifications from the various teams and stores them onto non-volatile storage (sharded MySQL) for access by the other components. This database is called the Push Inbox.
Schedule Generator - fetches all the buffered pushes for a user from the Push Inbox and uses Uber’s Machine Learning platform to determine which notifications should be sent and when. We’ll discuss this in further detail below.
Scheduler - receives the push notification and delivery time. It makes a best attempt to trigger the delivery of that push at the correct time.
Push Delivery - sends the push along to downstreams responsible for message delivery. Apple Push Notification Service for iOS and Firebase Cloud Messaging for Android.

View the image here.

Persistor

The persistor is the entrypoint to the system, and it receives the push notifications intended for delivery via gRPC.

It stores the push content along with the metadata (promotion involved, restaurant hours, etc.) into the Push Inbox.

The Push Inbox is built on top of Docstore, a distributed database built at Uber that uses MySQL as the underlying database engine.

The inbox table is partitioned by the recipient's user-UUID, so it scales horizontally with minimal hotspots.

Schedule Generator

The Schedule Generator is triggered every time the Persistor writes a push notification to the Inbox.

It will then fetch all push notifications that have to be delivered for that user (even if they’re already scheduled) and uses Uber’s ML platform to determine the optimal schedule for the push notifications.

How Uber determines the Optimal Schedule Time For Notifications

Uber wants to deliver useful notifications to the user without overwhelming them (otherwise he/she might disable all marketing push notifications).

Uber engineers formulate this as an Assignment Problem, which is part of Combinatorial Optimization (examples of problems in Combinatorial Optimization are the Traveling Salesman, Minimum Spanning Tree and Knapsack problem)

For a certain user, they can send a maximum of S push notifications per day and they have N possible push notifications to choose for those S slots. Each push notification is assigned a certain importance score, so the problem is to figure out which push notification schedule maximizes the sum of those scores (given some other constraints).

Uber solves this using linear programming. Some of the specific constraints they have are

Push Notification Expiration Time (they have to send the push notification before the promotion code expires)
Push Send Window (what times per day can push notifications be sent? Avoid night time for example)
Daily Frequency Cap (how many notifications can be sent per day?)
Minimum Time Between Push Notifications (how much time does Uber have to put between certain push notifications to avoid annoying the user)

And more.

The optimization framework takes those constraints, the set of push notifications, the importance score of each notification and then determines the optimal schedule.

Determining the Importance Score

Each push notification has an importance score associated with it.

This score is determined with a machine learning model that predicts the probability of a user making an order within 24 hours of receiving that push notification.

ML Engineers trained an XGBoost model on historical data to predict the conversion probability given the following features

Category of the push notification (promo code, product announcement, holiday message, etc.)
Content (heading, body text, deeplink, etc.)
User features (User’s previous order history, engagement with push notifications)

And more.

Scheduler

Once the Schedule Generator has scheduled the notifications, it calls the Scheduler for each push notification.

The Scheduler provides a distributed cron-like system with a throughput of tens of thousands of triggers per second. It receives the notification and the delivery time and it'll trigger the delivery of the notification at the correct time.

For this, Uber uses Cadence, an open source, distributed orchestration engine. Push notification assignments are buffered into Kafka topics for each hour in the time horizon.

The scheduler allows for a push notification to easily be rescheduled to another time.

Push Delivery

When the scheduler determines that a scheduled push is ready for sending, it triggers the push delivery component.

This component does some last checks before sending out the notification (like whether Uber has enough delivery drivers in that area at the current time and if the promotion code is still valid). If the checks pass, the notification gets sent downstream to services like Firebase Cloud Messaging and Apple Push Notification Service.

The component provides for things like retries if delivery fails.

For more details, you can read the full blog post here.

How did you like this summary?

Your feedback really helps me improve curation for future emails.

Choosing The Right Authentication Platform

When you’re implementing authentication, you’ll usually want to go with a third party provider. It makes more sense to have your engineers focus on improving the product while outsourcing things like authentication to specialists.

However, picking an authentication provider can feel daunting.

You need to consider things like

Developer experience - to get up and running as quickly as possible
Security - ensuring your provider is SOC2 Type II compliant
Flexibility - to integrate with your existing infrastructure and scale with you
UX - ensuring you have control over your onboarding and login flow to match your brand

And much more.

Stytch solves all of these problems with a simple API that’s easy to integrate while allowing for maximum flexibility.

You can learn more about different authentication solutions and how to navigate these needs by talking to an Authentication Expert for free.

sponsored

Tech Snippets

How to Scale a Startup’s Engineering Team - Gad Salner was an Engineering Manager at Melio and he wrote a great blog post on how he scaled their engineering team with the first few dozen hires. You have to balance different aspects like onboarding, knowledge sharing, dev experience and more. Gad talks about Melio's strategy for how they expanded their engineering team while balancing all these areas.

Airbnb's Key-Value Store for Derived Data - Airbnb built a persistent key-value store that can scale to petabytes of data, serve low latency reads (< 50 milliseconds for p99) and efficiently load data in bulk. They wrote a great blog post on the design of this storage system and the technologies they used.

Scaling Kafka at PayPal - PayPal is migrating their analytical workloads to Google Cloud Platform. One component of this is building a streaming application which consumes massive amounts of data from Kafka (35 billion events daily) and streams it to BigQuery. They wrote an interesting blog post on how they ran load and performance tests on this application and dealt with issues that arose.

Previous Question

As a reminder, here’s our last question

Write a function that sorts the elements in a stack so that the smallest elements are on top.

You can use an additional temporary stack, but you cannot copy the elements into any other data structure.

The stack supports the following operations

push
pop
peek
isEmpty

Solution

We can solve this question with a modified version of insertion sort.

In insertion sort, you build the final sorted array one item at a time, where you remove an item from the input data and find its location in the sorted list and insert it there.

For us, we’ll have our input stack and then a temporary stack (tempStack).

We’ll continuously pop off the top element from our input stack and put it in a temporary variable (topNum). We’ll then find the correct place in tempStack to insert topNum in. Our loop invariant is that tempStack will always be correctly sorted.

However, tempStack… is a stack (obviously). So, we can’t randomly insert topNum anywhere in the stack. We can only push our element to tempStack.

Therefore, rather than doing random insertions, we’ll pop elements off tempStack and append them to our input stack if the item at the top of tempStack is greater than topNum (the item we’re trying to insert in tempStack).

On the next iteration of our loop, that element becomes our new topNum and will be placed back into tempStack in it’s correct location.

We can terminate the loop when our input stack is empty.

Then, we’ll just pop and append all the elements in tempStack to our input stack.