How Pinterest Replaced their Analytics Database

Why Pinterest switched from Apache Druid to StarRocks. Plus, how Slack runs cron jobs, DoorDash's standards for engineering managers and more.

Hey Everyone!

Today we’ll be talking about

  • Why Pinterest Replaced their Analytics Database

    • Previously, Pinterest used Apache Druid as their data store for their Analytics service. However, they started facing scaling pains.

    • After searching for replacements, the engineering team decided on StarRocks.

    • In this article, we’ll give an introduction to StarRocks and talk about Pinterest’s experience migrating from Druid to StarRocks.

  • Tech Snippets

    • Optimize for Iteration in Coding Interviews

    • How Slack Executes Cron Jobs at Scale

    • Where do Computers get Time From?

    • How DoorDash defines Great Engineering Leadership

Why Pinterest Replaced their Analytics Database

Pinterest is one of the largest social media platforms in the world with over 500 million monthly active users.

They primarily make money through advertising, where businesses pay to promote their posts to Pinterest’s users.

In order to help these businesses understand how their ads are performing, Pinterest engineers spent a ton of time building analytics tooling to track how many views/clicks/saves ads get.

To power these analytics tools, Pinterest previously relied on Apache Druid, an open-source, distributed OLAP (online analytical processing) database written in Java and built for low-latency analytics queries.

However, they ran into trouble scaling Druid and started looking for replacements. One of the databases Pinterest is migrating to is StarRocks.

In this article, we’ll talk about what StarRocks is, Pinterest’s process of switching from Druid to StarRocks and what benefits they’ve seen.

Pinterest’s Objectives

As Pinterest’s scale and requirements changed, they started looking for replacements for Apache Druid.

Their targets were

  • Cost Efficient - as Pinterest grows, the costs should stay low

  • SQL - support standard SQL types and schemas

  • Query Capabilities - support joins, sub-queries and materialized views

  • Fast Data Ingestion - simplify the ingestion pipeline by removing external dependencies (Pinterest had to use Apache Hadoop MapReduce for transformations)

Pinterest assessed multiple storage options and ran extensive tests. Eventually, they decided on StarRocks.

Brief Overview of StarRocks

StarRocks is an open-source OLAP (analytics-focused) database that’s designed for running low-latency queries on real-time data (it ingests new data and makes it queryable within seconds).

It was originally a fork of Apache Doris but the codebase has been significantly rewritten.

The database is made up of two components: frontend and backend nodes. The frontend nodes are responsible for compiling the user’s SQL queries into execution plans. The backend nodes execute these plans.

Pinterest engineers went with StarRocks because

  • SQL support - StarRocks supports standard SQL syntax and is also compatible with the MySQL protocol. You can connect StarRocks with MySQL clients and BI tools.

  • Fast Ingestion - StarRocks is designed for real-time queries so the ingestion pipeline is optimized to ingest data and make it ready for queries in seconds.

  • Community - StarRocks is open source and has a community of thousands of developers.

  • Performance Tests - Pinterest ran extensive performance tests and they saw lower latencies and improved cost efficiency. 

Pinterest Analytics Rewrite

Pinterest started the switch by transitioning one of their services from Druid over to StarRocks.

They decided to start with the Partner Insights service. This service gives businesses that advertise on Pinterest live insights into how their ads are performing. It calculates how many impressions, clicks, saves, etc. a business’s ads are getting in real time.
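To give a feel for the kind of query a service like this runs, here's a minimal sketch of a per-ad metrics rollup. Everything in it is an assumption for illustration: the `ad_events` table, its columns and the sample rows are hypothetical, and it uses Python's built-in SQLite as a stand-in SQL engine so it can run anywhere. It is not StarRocks and not Pinterest's actual schema.

```python
import sqlite3

# Hypothetical event rows: (advertiser_id, ad_id, event_type).
EVENTS = [
    ("acme", "ad-1", "impression"),
    ("acme", "ad-1", "impression"),
    ("acme", "ad-1", "click"),
    ("acme", "ad-2", "impression"),
    ("acme", "ad-2", "save"),
    ("globex", "ad-9", "impression"),
]

# Conditional aggregation: one row per ad, one column per metric.
QUERY = """
SELECT ad_id,
       SUM(CASE WHEN event_type = 'impression' THEN 1 ELSE 0 END) AS impressions,
       SUM(CASE WHEN event_type = 'click' THEN 1 ELSE 0 END) AS clicks,
       SUM(CASE WHEN event_type = 'save' THEN 1 ELSE 0 END) AS saves
FROM ad_events
WHERE advertiser_id = ?
GROUP BY ad_id
ORDER BY ad_id
"""


def ad_metrics(advertiser_id):
    """Return (ad_id, impressions, clicks, saves) rows for one advertiser."""
    # In-memory SQLite database used purely as a stand-in SQL engine.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE ad_events (advertiser_id TEXT, ad_id TEXT, event_type TEXT)"
        )
        conn.executemany("INSERT INTO ad_events VALUES (?, ?, ?)", EVENTS)
        return conn.execute(QUERY, (advertiser_id,)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in ad_metrics("acme"):
        print(row)
```

The query is plain standard SQL (GROUP BY with CASE expressions), which is the style of query Pinterest wanted their analytics database to support.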

Here’s the architecture of Pinterest’s setup with StarRocks for the Partner Insights service.

The components are

  • Frontend Nodes - responsible for managing metadata and creating the query execution plan from the user’s SQL.

  • Backend Nodes - handle query execution. They persist data locally and perform data scanning and retrieval.

  • Archmage - a Pinterest service that shields engineers from the complexities of deployment, version upgrades and other operations. This service creates a uniform interface over all the different analytical storage systems that Pinterest uses.

  • Load Balancer - distributes user queries among the StarRocks frontend nodes using a round-robin load balancing strategy.
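Round-robin is the simplest of these pieces to picture in code. Here's a minimal sketch of the idea, assuming a fixed list of frontend nodes. The class name and the node names (`fe-1`, `fe-2`, `fe-3`) are hypothetical, and a production load balancer would also handle health checks and nodes joining or leaving, which this sketch ignores.

```python
import itertools


class RoundRobinBalancer:
    """Hands out nodes in a fixed rotating order, one per request."""

    def __init__(self, nodes):
        nodes = list(nodes)
        if not nodes:
            raise ValueError("at least one frontend node is required")
        # itertools.cycle repeats the node list endlessly in order.
        self._cycle = itertools.cycle(nodes)

    def next_node(self):
        """Return the node that should receive the next query."""
        return next(self._cycle)


# Hypothetical frontend node names, for illustration only.
balancer = RoundRobinBalancer(["fe-1", "fe-2", "fe-3"])
assignments = [balancer.next_node() for _ in range(5)]

if __name__ == "__main__":
    print(assignments)
```

Each incoming query goes to the next node in the rotation, so over time every frontend node receives roughly the same number of queries regardless of how expensive each query is.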

Results

After the migration from Druid to StarRocks, Pinterest saw a huge increase in their cost-efficiency.

They reduced the p90 latency (the latency that 90% of requests complete within) by 50%, and they needed only 32% of the instances they had previously provisioned with Druid.
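If you haven't worked with percentile latencies before, here's a minimal sketch of how a p90 can be computed from raw request timings. It uses the nearest-rank method, which is only one of several common percentile definitions, and the sample latencies are made up for illustration. They are not Pinterest's numbers.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Return the smallest sample that at least pct% of samples are <= to."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    # Nearest-rank: 1-based rank of the percentile within the sorted samples.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical request latencies in milliseconds.
latencies_ms = [12, 15, 18, 20, 22, 25, 30, 40, 80, 400]
p90 = percentile_nearest_rank(latencies_ms, 90)

if __name__ == "__main__":
    print(p90)
```

With these ten samples, the p90 is 80 ms: nine of the ten requests finished in 80 ms or less, while the single 400 ms outlier sits above the percentile. Tracking p90 instead of the average keeps a few very slow requests from hiding (or dominating) the picture.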

The data ingestion process was also streamlined and they achieved a data freshness of 10 seconds.

Tech Snippets