How Canva Collects 25 Billion Events Per Day

An overview of AWS Kinesis and how Canva uses it to collect and process 25 billion events per day. Plus, the art of good code review, how to find coachable employees and more.

Hey Everyone!

Today we’ll be talking about

  • How Canva Collects 25 Billion Events Per Day

    • Brief Overview of AWS Kinesis

    • Architecture of Canva’s Data Pipeline

  • Why Canva picked Kinesis over AWS SQS and the techniques Canva uses to minimize costs

  • Tech Snippets

    • Go is my hammer, and everything is a nail

    • Coachability: The Prerequisite To Growth

    • The art of good code review

How Canva Collects 25 Billion Events Per Day

Canva is an online graphic design platform that lets you create presentations, social media banners, infographics, logos and more. They have over 175 million monthly users and are valued at $26 billion.

In order to understand how people are using the platform, Canva’s mobile, web and desktop apps collect a wide range of events on user clicks, views, scrolls, etc.

Every day, Canva needs to collect and process over 25 billion events (800 billion events per month). This needs to be done with 99.999% uptime.

Last month, they published a fantastic blog post on how they built a data pipeline to handle this.

They talk about why they built the pipeline on AWS Kinesis and the specific techniques they use to minimize costs and latency.

Brief Overview of AWS Kinesis

AWS Kinesis is a family of services for processing and analyzing streaming data in real-time. It was launched in late 2013 and is composed of four main services: Data Streams, Data Firehose, Data Analytics and Video Streams.

Here’s a brief overview of the four services:

  • Data Streams - this service is responsible for ingesting and storing streaming data in real-time with sub-second latency (see the sketch after this list). Kinesis Data Streams does not handle data processing, so you’ll need another tool (Apache Flink, Kinesis Data Analytics, Spark, etc.) for transformations and analytics. Kinesis Data Firehose is used for sending the processed data to destinations like AWS S3, MongoDB, etc.

  • Data Firehose - Firehose is primarily used for loading streaming data into data lakes, databases and analytics services. You can deliver your data to AWS S3, Redshift, Elasticsearch, Splunk and other data stores.

    However, Firehose can also handle data ingestion and basic transformations. A few months ago, the service was rebranded from Amazon Kinesis Data Firehose to Amazon Data Firehose (but Firehose’s API and other functionality weren’t changed).

  • Data Analytics - If you’d like to run complex transformations on the streaming data that’s been ingested through Data Streams, then you can do that with Kinesis Data Analytics.

    Under the hood, Data Analytics uses Apache Flink, so Amazon has also rebranded Kinesis Data Analytics to “Amazon Managed Service for Apache Flink” (but the core capabilities and purpose haven’t changed).

  • Video Streams - In addition to data, Kinesis can also be used for ingesting and storing live video. Kinesis Video Streams gives you the infrastructure to ingest and store video data. You can integrate it with other services to process and distribute the stored video.
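To make the Data Streams role concrete, here’s a minimal sketch of writing to and reading from a stream with boto3. The stream name, partition key, and event payload are placeholders for illustration, not anything from Canva’s setup.

```python
import boto3

kinesis = boto3.client("kinesis")

# Producer: write one record (stream name and partition key are placeholders).
kinesis.put_record(
    StreamName="example-events",
    Data=b'{"event": "button_click"}',
    PartitionKey="user-123",  # records with the same key map to the same shard
)

# Consumer: read from the stream's first shard, starting at the oldest record.
stream = kinesis.describe_stream(StreamName="example-events")
shard_id = stream["StreamDescription"]["Shards"][0]["ShardId"]
iterator = kinesis.get_shard_iterator(
    StreamName="example-events",
    ShardId=shard_id,
    ShardIteratorType="TRIM_HORIZON",
)["ShardIterator"]

for record in kinesis.get_records(ShardIterator=iterator, Limit=100)["Records"]:
    print(record["Data"])
```

In production you’d typically use a consumer library (like the Kinesis Client Library) rather than managing shard iterators by hand, but the read/write model is the same.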

Canva uses Kinesis Data Streams to ingest 25 billion events per day. From Kinesis, Canva sends the event data to Snowflake for processing.

Here’s how the data pipeline works…

Canva’s Data Pipeline for Collecting Events

Canva has iOS, Android, web and desktop applications. Each of these apps is instrumented to collect events and send them to Canva’s backend.

Canva’s servers will first validate the events and make sure that they conform to a predefined schema.
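As an illustration, here’s a minimal sketch of that validation step using Python’s `jsonschema` package. The schema itself is hypothetical; Canva’s post doesn’t publish their actual event schemas.

```python
from jsonschema import validate, ValidationError

# Hypothetical event schema for illustration; not Canva's actual schema.
EVENT_SCHEMA = {
    "type": "object",
    "properties": {
        "event_type": {"type": "string"},
        "user_id": {"type": "string"},
        "timestamp": {"type": "integer"},
    },
    "required": ["event_type", "user_id", "timestamp"],
}

def is_valid_event(event: dict) -> bool:
    """Reject any event that doesn't conform to the predefined schema."""
    try:
        validate(instance=event, schema=EVENT_SCHEMA)
        return True
    except ValidationError:
        return False
```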

They will then batch the events together (with a few hundred events per batch) and apply zstd compression. Then, Canva’s servers will send the compressed batches to a Kinesis Data Stream.
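Here’s a rough sketch of that batch-compress-send step, using the `zstandard` package and boto3. The stream name and partitioning strategy are illustrative placeholders.

```python
import json
import uuid

import boto3
import zstandard

kinesis = boto3.client("kinesis")
compressor = zstandard.ZstdCompressor()

def send_batch(events: list[dict], stream_name: str = "example-events") -> None:
    """Serialize a batch of events, compress with zstd, send as one Kinesis record."""
    payload = json.dumps(events).encode("utf-8")
    compressed = compressor.compress(payload)  # hundreds of events become one small record
    kinesis.put_record(
        StreamName=stream_name,
        Data=compressed,
        PartitionKey=uuid.uuid4().hex,  # random key to spread batches across shards (placeholder strategy)
    )
```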

From Kinesis, Canva has an ingestion worker that reads the events and enriches them with additional data (a sketch follows this list). This worker will do things like

  • Add country-level geolocation data

  • Add user device details

  • Correct any timestamp issues
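A simplified sketch of such a worker’s per-record logic is below. The helper functions are hypothetical stand-ins for the enrichment steps Canva describes, and the decompression assumes the zstd-compressed batches from the previous step.

```python
import json

import zstandard

decompressor = zstandard.ZstdDecompressor()

# Hypothetical enrichment helpers; trivial stand-ins for Canva's actual logic.
def lookup_country(ip: str | None) -> str:
    return "US" if ip else "unknown"  # a real worker would query a geo-IP database

def parse_device(user_agent: str | None) -> str:
    return "mobile" if user_agent and "Mobile" in user_agent else "desktop"

def fix_timestamp(ts: int) -> int:
    return max(ts, 0)  # a real worker would correct clock skew, future dates, etc.

def enrich_record(record_data: bytes) -> list[dict]:
    """Decompress one Kinesis record and enrich every event in the batch."""
    events = json.loads(decompressor.decompress(record_data))
    for event in events:
        event["country"] = lookup_country(event.get("ip"))       # country-level geolocation
        event["device"] = parse_device(event.get("user_agent"))  # user device details
        event["timestamp"] = fix_timestamp(event["timestamp"])   # timestamp fixes
    return events
```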

Canva uses a separate ingestion worker for this processing because they wanted to minimize the latency of the event collection endpoint on the server. Decoupling event collection from event enrichment helps them scale to 25 billion events per day.

After enrichment, the events are sent back to Kinesis. Canva’s router then routes them to Snowflake, which serves as the data store for Canva’s ML models, dashboards and data analytics.

Some event types are also sent to AWS SQS so they can be consumed by other backend services at Canva that need to process the event data in real time.
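A minimal sketch of that fan-out step is below, using boto3’s SQS client. The queue URL and the set of real-time event types are placeholders, not Canva’s actual configuration.

```python
import json

import boto3

sqs = boto3.client("sqs")

# Placeholder values for illustration.
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/realtime-events"
REALTIME_EVENT_TYPES = {"purchase", "subscription_change"}

def route_event(event: dict) -> None:
    """Forward real-time event types to SQS for downstream backend services."""
    if event["event_type"] in REALTIME_EVENT_TYPES:
        sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=json.dumps(event))
```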

Minimizing AWS Costs

  • AWS Kinesis over SQS - In the first version of the data pipeline, Canva used AWS SQS and SNS instead of Kinesis. These were easier to set up; however, the pricing was significantly higher. By switching to Kinesis Data Streams, Canva cut these costs by 85%.

  • Event Compression - Canva’s servers will first batch the events (in groups of a few hundred events per batch) and apply zstd compression. These compressed batches are then sent to Kinesis. Using this strategy (instead of sending each event as a separate record) saves Canva $600k every year in AWS costs. A back-of-the-envelope illustration follows this list.
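To see why batching cuts the bill, here’s a back-of-the-envelope sketch. In provisioned capacity mode, Kinesis Data Streams bills PUT payloads in 25 KB units, rounded up per record, so many tiny records cost far more than one compressed batch. The event size, batch size, and compression ratio below are illustrative assumptions, not Canva’s figures.

```python
import math

# Illustrative assumptions (not Canva's actual numbers).
EVENT_SIZE_KB = 1      # size of one uncompressed event
BATCH_SIZE = 300       # events per batch
COMPRESSION_RATIO = 4  # rough zstd ratio on JSON-like payloads
PAYLOAD_UNIT_KB = 25   # Kinesis bills PUT payloads in 25 KB units, rounded up per record

# One record per event: each 1 KB record still bills a full 25 KB unit.
units_individual = math.ceil(EVENT_SIZE_KB / PAYLOAD_UNIT_KB) * BATCH_SIZE  # 300 units

# One compressed record per batch: 300 KB / 4 = 75 KB -> 3 units.
compressed_kb = BATCH_SIZE * EVENT_SIZE_KB / COMPRESSION_RATIO
units_batched = math.ceil(compressed_kb / PAYLOAD_UNIT_KB)

print(units_individual, units_batched)  # 300 vs. 3 -> ~100x fewer billable units
```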

Tech Snippets