How Dropbox Accelerated Their A/B Tests

Plus Lessons Learned Implementing Payments in the DoorDash Android App, How to Present to Executives and more.

Hey Everyone!

Today we’ll be talking about

  • How Dropbox Accelerated Their A/B Tests

    • Dropbox needed a metric for judging A/B tests and how they affected revenue

    • They created Expected Revenue, which is calculated by looking at a user’s engagement data and using an ML model to predict the user’s two-year revenue.

    • We’ll give an overview of how the model works, the engineering behind it and how it’s trained and backtested.

  • Lessons Learned Implementing Payments in the DoorDash Android App

    • DoorDash published an interesting blog post on lessons the engineering team learned while building their payments system in the Android app

    • Make sure it’s possible to add new payment methods in the future

    • Plan for consumers who travel and use the app in multiple countries

    • Beware of restrictions and implementation guidelines specific to payment methods

  • Tech Snippets

    • How to Present to Executives (Engineering Leadership)

    • We Invested 10% To Pay Back Tech Debt (Engineering Leadership)

    • 42 Things I Learned From Building a Production Database (Backend)

    • Redis Explained (Backend)

    • Algorithms Implemented in Rust (General)

    • JavaScript APIs You Don’t Know About (Frontend)

    • The World of CSS Transforms (Frontend)

A/B Testing at Dropbox

Dropbox is a file hosting service with over 700 million users and more than $2 billion in annual revenue.

In order to maximize revenue, the Dropbox team regularly runs A/B tests around landing pages, trial offerings, pricing plans and more. An issue the team faced was determining which metric to optimize for in their experiments.

Dropbox could run an A/B test where they change the user onboarding experience, but how should they measure the results? They could wait 90 days and see if the change resulted in paid conversions, but that would take too long.

They needed a metric that would be available immediately and was highly correlated with user revenue.

To solve this, the team created a new metric called Expected Revenue (XR). This value is generated from a machine learning model trained on historical data to predict a user’s two year revenue.

Michael Wilson is a Senior Machine Learning Engineer at Dropbox, and he wrote a great blog post on why they came up with XR, how they calibrate it, and the engineering behind how it’s calculated.

Possible Options for Metrics

Dropbox considered several frequently-used metrics for analyzing their A/B experiments.

Possible metrics include

  • 7 day activity rates

  • 30 day trial conversion rate

  • 90 day retention rate

  • 90 day annual contract value

And more

However, 7 day and 30 day measures didn’t factor in long term retention/churn. The 90 day measures provided a more accurate measure of churn but waiting for those results took way too long. Having to wait 90 days to figure out the result of an A/B test would hamper the ability to iterate quickly.

To solve this, the Dropbox team came up with Expected Revenue.

What is Expected Revenue (XR)

Dropbox wanted a metric that

  • Is highly correlated with a user’s lifetime value

  • Can be calculated within a few days

XR is meant to measure this.

To calculate it, Dropbox looks at a variety of factors

  • User activity such as uploading/sharing files

  • Upgrading/downgrading their plan to a higher/lower tier

  • Inviting friends to join the app

  • User location

And more.

They built a machine learning model to use these factors to predict how much the user will spend over the next two years on Dropbox.

To generate the XR prediction, Dropbox uses a combination of Gradient Boosted Decision Trees and Regression models.

Dropbox trains the models on historical data they have around user activity, conversion and revenue. They also have historical data on past A/B experiments they ran and how much these experiments lifted users’ two-year revenue.

They used this data to train and back-test their XR prediction models.

Now, they can run A/B experiments and check how the XR prediction changed between the cohorts. Evaluating experiments takes a few days instead of months.
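To make the cohort comparison concrete, here’s a minimal Python sketch. It assumes you already have a per-user XR prediction for each cohort; the function name `xr_lift` and the numbers are made up for illustration and are not from Dropbox’s post. A real analysis would also need a significance test before declaring a winner.

```python
from statistics import mean


def xr_lift(control_xr, treatment_xr):
    """Return the (absolute, relative) difference in mean predicted XR."""
    control_mean = mean(control_xr)
    treatment_mean = mean(treatment_xr)
    absolute = treatment_mean - control_mean
    relative = absolute / control_mean
    return absolute, relative


# Hypothetical per-user XR predictions (dollars over two years).
control = [10.0, 0.0, 20.0, 10.0]
treatment = [12.0, 0.0, 24.0, 12.0]

absolute, relative = xr_lift(control, treatment)
print(f"absolute lift: {absolute:.2f}, relative lift: {relative:.1%}")
```

Because XR is a prediction available within days, a comparison like this can be rerun daily as the predictions update, rather than waiting for real revenue to arrive.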

Calculating Expected Revenue

Dropbox calculates Expected Revenue values on a daily basis.

They use Apache Airflow as their orchestration tool to load the feature data and run the XR calculations.

The feature data is loaded through Hive, which extracts the data from Dropbox’s data lake.

The machine learning models are stored on S3 and accessed through an internal Dropbox Model store API. The model evaluation itself is executed with Spark.

For more details, you can read the full blog post here.

Today’s summary was quite short, so apologies if you’re looking for more in-depth content. I’ll have longer summaries next week!


Tech Snippets

Engineering Leadership

Will Larson is the CTO of Calm and he writes a great engineering blog called Irrational Exuberance. In this post, he talks about presenting to executives at your company and how to do it well.

You should have a clear understanding of why you’re communicating with them and what you’re trying to convey. Give the relevant context, explain the problem and present your solution.

Things to avoid are

  • Presenting a question without an answer

  • Arguing against feedback

  • Fixating on your preferred outcome

Read the full blog post for more.

Alex Ewerlöf is a Senior Staff Engineer at Volvo. He wrote an interesting post on an initiative he had at a previous company where the engineering team would spend every other Friday dealing with any tech debt. He talks about how they implemented this and the tremendous effect it had on their codebase, team morale and development velocity.


Backend

Mahesh is a distributed systems researcher who was previously an Associate Professor at Yale teaching CS.

He spent a few years at Facebook, where he started and productionized Delos, a storage system for control plane services. This is replacing all uses of ZooKeeper at the company.

He wrote a great blog post where he delved into things he learned on the journey.

Here’s a great deep dive into Redis. Mahdi Yusuf talks about the data you can store in Redis, how it compares to Memcached and different deployments. He also dives into Redis’ persistence models and guarantees around durability.

A cool way to pick up a new programming language is to use it to solve LeetCode problems. You’ll get some practice solving quick, non-trivial problems with the language plus you can also get some interview prep (just in case).

A great way to get started with this is by looking up how common data structures & algorithms are implemented in the language you’re trying to use.

Here’s a GitHub repo with Rust implementations of a bunch of different algorithms (BFS, Topological Sort, Bellman Ford, etc.) and data structures (Heaps, Linked Lists, BSTs, etc.).

Frontend

This is an awesome article on useful Browser APIs that you may be unaware of.

Juan Diego Rodriguez goes through the

  • Page Visibility API - lets you detect when a page becomes hidden or visible, for example when the user switches tabs or minimizes the window

  • Web Share API - share text, links, files and other content to another app like a messaging app, Bluetooth/Wi-Fi channel, etc.

  • Broadcast Channel API - allows basic communication between different windows/tabs that are on the same origin

  • Internationalization API - makes it easy to format dates, numbers, units, lists, etc. according to the conventions of different languages and locales on your website

Josh Comeau wrote a detailed article delving into the transform property in CSS. He talks about the different transform functions and gives awesome graphics that let you easily visualize what each function does.

Afterwards, he goes into combining transform functions to produce extremely cool effects and provides graphics for those as well.

It’s a great read if you want to improve your CSS Animation skills.

Lessons Learned from Implementing Payments in the DoorDash Android App

Harsh Alkutkar is a software engineer on DoorDash’s ordering experience team.

He wrote a great blog post for their engineering blog on integrating payments in their Android app (which has processed payments for hundreds of millions of orders).

How mobile payments are typically implemented

When a user is making an online order, they will submit their credit card information to a payment gateway such as Stripe or PayPal. The gateway encrypts this information and facilitates the transaction with payment processors.

The payment processor will talk to the issuing bank (for the user’s credit card) and request approval.

The result then bubbles back up to the backend, which lets the client know whether the payment was accepted or declined.

Mobile payments can become very complex for the following reasons

  • Multiple Payment Methods - In order to boost conversion rates, you want to offer as many payment methods as possible to the user. Each method requires its own integration into the app and requires its own custom testing strategy.

  • User experience - The UI needs to work with all payment methods and also for new and existing users. This creates quite a few scenarios that have to be implemented and tested.

  • Testing - Testing cannot be an afterthought. The backend has to be designed in a way that allows every payment method and flow to be tested before a release.

  • Fraud - Anti-fraud measures need to be implemented in any app that includes a mobile payment component.

  • Location - You need to account for the user’s location before processing their payment to comply with each country’s laws and regulations.

Here are a few of the lessons DoorDash engineers learned while implementing payments in the DoorDash Android app.

Plan and design for future payment methods

In an earlier version of the DoorDash app, the developers didn’t properly account for the addition of new payment methods to the app. Instead, it was designed around credit cards and Google Pay. This led to challenges when adding new methods like PayPal or DoorDash credits.

In the new design, engineers introduced the notion of payment methods into the codebase where a payment method can be Google Pay, PayPal, a credit card, etc.

These are categorized into local payment methods that are part of the device (like Google Pay) and external payment methods that require interactions with a backend service (like Stripe).
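Here’s a rough sketch of what that kind of abstraction could look like. It’s written in Python for brevity (the real app is Android, so DoorDash’s code would look different), and every class and function name here is hypothetical. Treating a credit card as an external method is also an assumption made for the example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


class PaymentMethod(ABC):
    """Common interface so checkout code never branches on a concrete type."""

    @abstractmethod
    def label(self) -> str:
        """Text shown on the checkout screen."""


class LocalPaymentMethod(PaymentMethod, ABC):
    """Lives on the device (the post's example is Google Pay)."""


class ExternalPaymentMethod(PaymentMethod, ABC):
    """Needs a backend service (the post's example is Stripe)."""


@dataclass(frozen=True)
class GooglePay(LocalPaymentMethod):
    def label(self) -> str:
        return "Google Pay"


@dataclass(frozen=True)
class CreditCard(ExternalPaymentMethod):
    brand: str
    last4: str

    def label(self) -> str:
        return f"{self.brand} ending in {self.last4}"


def checkout_labels(methods: list[PaymentMethod]) -> list[str]:
    """Build the checkout list without knowing which methods exist."""
    return [method.label() for method in methods]


labels = checkout_labels([GooglePay(), CreditCard("Visa", "4242")])
print(labels)
```

With a structure like this, adding PayPal or DoorDash credits means adding one more subclass; `checkout_labels` and anything else written against `PaymentMethod` stays untouched.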

Beware of restrictions and implementation guidelines specific to payment methods

Payment vendors may have specific ways they want to be portrayed in an app.

For example, Google Pay requires that it be the primary payment option wherever possible.

Additionally, they have strict UX guidelines that explain how to display their logos and buttons with specifications around width, margin, etc.

The payments team had to ensure that all of DoorDash’s assets complied with the guidelines set by the various payment vendors.

Plan for consumers in different countries or traveling consumers

Payments usually can’t be implemented in a generic way that scales worldwide. Each country has its own technical, legal and accounting implications.

Some payment methods may also need extra verification or information in other countries.

Therefore, they made sure their publishable API keys are country-specific and ensured that they fetched them based on the consumer’s current location.
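As an illustration, a lookup keyed on the consumer’s current country might look like the Python sketch below. The key values are placeholders, and the names `PUBLISHABLE_KEYS`, `publishable_key_for` and `UnsupportedCountryError` are invented for this example; the post doesn’t describe how DoorDash actually stores or fetches its keys.

```python
# Placeholder values; real publishable keys would come from config or the backend.
PUBLISHABLE_KEYS = {
    "US": "pk_placeholder_us",
    "CA": "pk_placeholder_ca",
    "AU": "pk_placeholder_au",
}


class UnsupportedCountryError(Exception):
    """Raised when there is no payment configuration for a country."""


def publishable_key_for(country_code: str) -> str:
    """Return the key for the country the consumer is in right now."""
    code = country_code.strip().upper()
    try:
        return PUBLISHABLE_KEYS[code]
    except KeyError:
        raise UnsupportedCountryError(f"no payment configuration for {code!r}") from None


home_country = "US"      # where the account was created
current_country = "ca"   # the consumer is traveling
key = publishable_key_for(current_country)
print(key)
```

The important detail is that the lookup uses `current_country` and not `home_country`, and that an unsupported country fails loudly instead of silently falling back to another country’s key.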

Plan for performance

It’s absolutely critical to keep an app performant while it processes payments. Caching the payment methods and cards makes the app faster because there’s less waiting for payment information on the cart and checkout screens.

However, this has to be done with care and must account for error cases where the backend and device are out of sync.

When a user starts the DoorDash app, the app sends backend requests to make sure it has the most up-to-date payment information on file.
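A minimal sketch of that caching pattern is below, assuming a hypothetical `fetch` callable that stands in for the backend request. The class name `PaymentMethodCache` and the choice to keep the old copy when a refresh fails are assumptions for the example, not details from DoorDash’s post.

```python
from typing import Callable, Optional


class PaymentMethodCache:
    """Serve payment methods from memory and re-sync with the backend on demand."""

    def __init__(self, fetch: Callable[[], list[str]]):
        self._fetch = fetch  # hypothetical stand-in for the backend call
        self._cached: Optional[list[str]] = None

    def refresh(self) -> bool:
        """Replace the cache with the backend's list; keep the old copy on failure."""
        try:
            self._cached = list(self._fetch())
            return True
        except ConnectionError:
            return False

    def get(self) -> list[str]:
        """Return the cached list, fetching once if nothing is cached yet."""
        if self._cached is None:
            self.refresh()
        return list(self._cached or [])


backend_state = ["visa-4242"]
cache = PaymentMethodCache(lambda: backend_state)

cache.refresh()                   # app start: sync with the backend
backend_state.append("paypal")    # a method is added from another device
stale = cache.get()               # checkout screen reads the cached copy
cache.refresh()                   # next app start picks up the change
fresh = cache.get()
print(stale, fresh)
```

The gap between `stale` and `fresh` is the out-of-sync case mentioned above, so screens that depend on the cache need a way to recover when the backend rejects a method the device still thinks is valid.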

Add lots of telemetry

Payment flows can be tricky to debug if they don’t work properly. There are many different failure scenarios that can cause a payment to fail.

Therefore, instead of just sending generic information like “payment failed”, the system should send as much information as possible, including things like error codes from providers, device/location information or any diagnostics that can help to identify the state of the app when it failed.

However, be careful not to include any personally identifiable information or any payment information that could be compromised by an attacker.
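One way to get both properties is to build failure events from an allow-list of fields, so anything sensitive is dropped by default. The Python sketch below assumes hypothetical field names, a hypothetical `build_failure_event` helper and made-up provider and error code values; none of them come from DoorDash’s post.

```python
import json

# Fields assumed safe to log in this sketch; everything else is dropped.
ALLOWED_CONTEXT_KEYS = {"country", "app_version", "device_model", "os_version", "screen"}


def build_failure_event(provider: str, error_code: str, context: dict) -> dict:
    """Build a detailed failure event that only carries allow-listed context."""
    safe_context = {
        key: value for key, value in context.items() if key in ALLOWED_CONTEXT_KEYS
    }
    return {
        "event": "payment_failed",
        "provider": provider,
        "error_code": error_code,
        "context": safe_context,
    }


event = build_failure_event(
    provider="example_provider",   # hypothetical provider name
    error_code="card_declined",    # hypothetical provider error code
    context={
        "country": "US",
        "app_version": "1.2.3",
        "screen": "checkout",
        "card_number": "4242424242424242",  # never logged: not on the allow-list
        "email": "user@example.com",        # never logged: not on the allow-list
    },
)
print(json.dumps(event, sort_keys=True))
```

An allow-list is used here instead of a deny-list because a newly added field stays out of the logs until someone decides it is safe to include.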