How Pinterest Replaced their Analytics Database

Why Pinterest switched from Apache Druid to StarRocks. Plus, how Slack runs cron jobs, DoorDash's standards for engineering managers and more.

Hey Everyone!

Today we’ll be talking about

  • Why Pinterest Replaced their Analytics Database

    • Previously, Pinterest used Apache Druid as their data store for their Analytics service. However, they started facing scaling pains.

    • After searching for replacements, the engineering team decided on StarRocks.

    • In this article, we’ll give an introduction to StarRocks and talk about Pinterest’s experience migrating from Druid to StarRocks.

  • Tech Snippets

    • Optimize for Iteration in Coding Interviews

    • How Slack Executes Cron Jobs at Scale

    • Where do Computers get Time From?

    • How DoorDash defines Great Engineering Leadership

Why Pinterest Replaced their Analytics Database

Pinterest is one of the largest social media platforms in the world with over 500 million monthly active users.

They primarily make money through advertising where businesses can pay to promote their posts to Pinterest’s users.

In order to help these businesses understand how their ads are performing, Pinterest engineers spent a ton of time building analytics tooling to track how many views/clicks/saves ads get.

To power these analytics tools, Pinterest previously relied on Apache Druid, an open source, distributed OLAP (online analytical processing) database written in Java that’s built for low-latency analytics workloads.

However, they’ve been having trouble scaling Druid and have been looking for replacements. One of the databases Pinterest is migrating to is StarRocks.

In this article, we’ll talk about what StarRocks is, Pinterest’s process of switching from Druid to StarRocks and what benefits they’ve seen.

Pinterest’s Objectives

As Pinterest’s scale and requirements changed, they started looking for replacements for Apache Druid.

Their targets were

  • Cost Efficient - as Pinterest grows, the costs should stay low

  • SQL - support standard SQL types and schemas

  • Query Capabilities - support joins, sub-queries and materialized views

  • Fast Data Ingestion - simplify the ingestion pipeline by removing external dependencies (with Druid, Pinterest had to use Apache Hadoop MapReduce for transformations)

Pinterest assessed multiple storage options and ran extensive tests. Eventually, they decided on StarRocks.

Brief Overview of StarRocks

StarRocks is an open-source OLAP (analytics-focused) database that’s designed for running low-latency queries on real-time data (it ingests new data and makes it queryable within seconds).

It was originally a fork of Apache Doris but the codebase has been significantly rewritten.

The database is made up of two components: frontend and backend nodes. The frontend nodes are responsible for compiling the user’s SQL queries into execution plans. The backend nodes execute these plans.

Pinterest engineers went with StarRocks because

  • SQL support - StarRocks supports standard SQL syntax and is also compatible with the MySQL protocol. You can connect StarRocks with MySQL clients and BI tools.

  • Fast Ingestion - StarRocks is designed for real-time queries so the ingestion pipeline is optimized to ingest data and make it ready for queries in seconds.

  • Community - StarRocks is open source and has a community of thousands of developers.

  • Performance Tests - Pinterest ran extensive performance tests and they saw lower latencies and improved cost efficiency. 
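To make the SQL support point concrete, here's a rough sketch of the kind of join + sub-query an ads analytics service might run. The table names, columns and data are all made up for illustration, not Pinterest's schema. So the example runs without a cluster, it uses Python's built-in sqlite3 module as a local stand-in; with StarRocks you would send SQL like this through a MySQL-compatible client instead.

```python
import sqlite3

# sqlite3 is only a local stand-in so this sketch runs anywhere; it is NOT StarRocks.
# The schema and rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ads (ad_id INTEGER PRIMARY KEY, advertiser TEXT);
    CREATE TABLE ad_events (ad_id INTEGER, event_type TEXT);
    INSERT INTO ads VALUES (1, 'acme'), (2, 'acme'), (3, 'globex');
    INSERT INTO ad_events VALUES
        (1, 'impression'), (1, 'impression'), (1, 'click'),
        (2, 'impression'), (2, 'save'),
        (3, 'impression'), (3, 'click'), (3, 'click');
""")

# Join events to ads, and use a sub-query to keep only ads that got at least one click.
QUERY = """
    SELECT a.advertiser,
           SUM(CASE WHEN e.event_type = 'impression' THEN 1 ELSE 0 END) AS impressions,
           SUM(CASE WHEN e.event_type = 'click' THEN 1 ELSE 0 END) AS clicks
    FROM ad_events AS e
    JOIN ads AS a ON a.ad_id = e.ad_id
    WHERE e.ad_id IN (SELECT ad_id FROM ad_events WHERE event_type = 'click')
    GROUP BY a.advertiser
    ORDER BY a.advertiser
"""

rows = conn.execute(QUERY).fetchall()
print(rows)  # [('acme', 2, 1), ('globex', 1, 2)]
```

The query text itself sticks to plain standard SQL (JOIN, IN sub-query, GROUP BY), which is the kind of syntax Pinterest was looking for in their objectives.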

Pinterest Analytics Rewrite

Pinterest started the switch by transitioning one of their services from Druid over to StarRocks.

They decided to start with the Partner Insights service, which gives businesses that advertise on Pinterest live insights into how their ads are performing. It calculates how many impressions, clicks, saves, etc. a business’s ads are getting in real time.

Here’s the architecture of Pinterest’s setup with StarRocks for the Partner Insights service.

The components are

  • Frontend Nodes - responsible for managing metadata and creating the query execution plan from the user’s SQL.

  • Backend Nodes - handle query execution. They persist data locally and perform data scanning and retrieval.

  • Archmage - a Pinterest service that shields engineers from the complexities of deployment, version upgrades and other operations. This service creates a uniform interface over all the different analytical storage systems that Pinterest uses.

  • Load Balancer - distributes user queries among the StarRocks frontend nodes using a round-robin load balancing strategy.
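If you haven't seen round-robin load balancing before, here's a minimal sketch of the idea: each incoming query goes to the next frontend node in a fixed rotation. The class and the node names are hypothetical and just for illustration; this is not Pinterest's load balancer.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out frontend nodes in a fixed, repeating order."""

    def __init__(self, frontends):
        if not frontends:
            raise ValueError("at least one frontend node is required")
        self._frontends = list(frontends)
        self._rotation = cycle(self._frontends)

    def next_frontend(self):
        # Each call returns the next node; after the last node it wraps around.
        return next(self._rotation)


# Hypothetical node names, for illustration only.
balancer = RoundRobinBalancer(["fe-1", "fe-2", "fe-3"])
picks = [balancer.next_frontend() for _ in range(5)]
print(picks)  # ['fe-1', 'fe-2', 'fe-3', 'fe-1', 'fe-2']
```

Round-robin ignores how busy each node is, which keeps it simple. It works well when queries are roughly similar in cost and the frontend nodes are the same size.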

Results

After the migration from Druid to StarRocks, Pinterest saw a huge increase in their cost-efficiency.

They were able to reduce the p90 latency (the latency that 90% of requests complete within) by 50% and they only needed 32% of the instances they had previously provisioned with Druid.
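As a quick refresher on what a p90 number means, here's a small sketch that computes it with the nearest-rank method. The latency values are made up for illustration and are not Pinterest's data. Monitoring tools may use slightly different interpolation rules, so treat this as one reasonable definition rather than the only one.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank into the sorted list
    return ordered[rank - 1]


# Made-up request latencies in milliseconds.
latencies_ms = [12, 15, 18, 20, 22, 25, 30, 45, 80, 400]
p90 = percentile(latencies_ms, 90)
print(p90)  # 80
```

Notice that the single 400 ms outlier doesn't affect the p90 here. That's why teams often track p90 or p99 alongside the average: percentiles describe what most users experience without being dominated by a few very slow requests.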

The data ingestion process was also streamlined and they achieved a data freshness of 10 seconds.

Tech Snippets