How Booking.com Searches Through Millions of Locations in Milliseconds
An introduction to Quadtrees and using them for storing map data. Plus, how Pinterest uses Apache Druid for storing ad reporting metrics and how to learn Rust if you're a JavaScript dev.
Hey Everyone!
Today we’ll be talking about
How Booking Searches Through Millions of Locations in Milliseconds
One of Booking.com’s features is a map, where they show points of interest within a certain location
The data structure powering this map is a Quadtree, a tree data structure that’s used extensively for spatial indexing, image processing and more
We’ll talk about how Booking.com builds this Quadtree and how they search through it to find location markers
For Quadtrees storing more than 300,000 markers, Booking achieves a p99 lookup time of less than 5.5 milliseconds
How Pinterest Load Tests Their Database
Pinterest uses Apache Druid as a database to store and serve their ad reporting metrics. Druid is an open source, column-oriented, distributed database that's written in Java.
Architecture of Druid and the structure of its components and services.
How Pinterest tests Druid's ability to handle increased queries per second, data ingestion load and storage requirements.
Tech Snippets
Pair Programming Anti-Patterns
Rust for JavaScript Developers
A Curated List of Resources for CTOs
A Dive into RLHF (why ChatGPT works so well)
How Booking.com’s Map Feature Works
Booking.com is an online travel agency that connects travelers with hotels, hostels, vacation rentals and more. Hundreds of millions of users visit their website every month to search for accommodations for their next vacation.
A key feature Booking provides on their website is their map. There are tens of millions of listed properties on their marketplace, and users can search the map to see
A rental property’s location
Nearby interesting places (museums, beaches, historical landmarks, etc.)
Distance between the rental property and the interesting places
This map needs to load quickly and their backend needs to search millions of different points across the world.
Igor Dotsenko is a software engineer at Booking and he wrote a great blog post delving into how they built this.
Searching on the Map
When the user opens the map for a certain property, they see a bounding box around that property’s location. Booking needs to quickly find the most important points of interest within that bounding box so they can be displayed on the map.
Quadtrees
The underlying data structure that powers this is a Quadtree.
A Quadtree is a tree data structure that’s used extensively when working with 2-D spatial data like maps, images, video games, etc. You can perform efficient insertion/deletion of points, fast range queries, nearest-neighbor searches (find closest object to a given point) and more.
Like any other tree data structure, you have nodes which are parents/children of other nodes. For a Quadtree, internal nodes will always have four children (internal nodes are nodes that aren’t leaf nodes. Leaf nodes are nodes with 0 children). The parent node represents a specific region of the 2-D space, and each child node represents a quadrant of that region.
When you’re dealing with mapping data, the parent node will represent some region in your map. The four children of that parent will represent the northwest, northeast, southwest and southeast quadrants of the parent’s region.
For Booking, each node represents a certain bounding box on their map. Users can change the visible bounding box by zooming in or panning on the map. Each child of a node holds the northwest, northeast, southwest or southeast quadrant within the larger bounding box of its parent.
Each node will also hold a small number of markers (representing points of interest). Each marker is assigned an importance score. Markers that are considered more important are assigned to nodes higher up in the tree (so markers in the root node are considered the most important). The marker for the Louvre Museum in Paris will go in a higher node than a marker for a Starbucks in the same area.
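To make this more concrete, here’s a minimal Python sketch of such a node. The class and field names are illustrative assumptions, not Booking’s actual implementation.

```python
from dataclasses import dataclass, field

@dataclass
class BoundingBox:
    # Corners of the box: south-west (min) and north-east (max).
    min_lat: float
    min_lng: float
    max_lat: float
    max_lng: float

    def intersects(self, other: "BoundingBox") -> bool:
        # Two boxes overlap unless one lies entirely to one side of the other.
        return not (other.min_lat > self.max_lat or other.max_lat < self.min_lat
                    or other.min_lng > self.max_lng or other.max_lng < self.min_lng)

    def contains(self, lat: float, lng: float) -> bool:
        return self.min_lat <= lat <= self.max_lat and self.min_lng <= lng <= self.max_lng

@dataclass
class Marker:
    lat: float
    lng: float
    importance: float  # higher = more important (the Louvre ranks above a nearby Starbucks)
    name: str = ""

@dataclass
class QuadtreeNode:
    box: BoundingBox
    markers: list = field(default_factory=list)   # the most important markers for this region
    children: list = field(default_factory=list)  # 0 or 4 children (NW, NE, SW, SE quadrants)
```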
Now, we’ll go through how Booking searches through the Quadtree and also how they build/update it.
Quadtree Searching
When the user selects a certain bounding box, Booking wants to retrieve the most important markers for that bounding box from the Quadtree. To do this, they use a Breadth-First Search (BFS).
They’ll start with the root node and search through its markers for any that intersect with the selected bounding box.
If they need more markers, then they’ll check which child nodes intersect with the bounding box and add them to the queue.
Nodes will be removed from the queue (in first-in-first-out order) and the function will search through their markers for any that intersect with the bounding box.
Once the BFS has found the requested number of markers, it will terminate and send them to the user, so they can be displayed on the map.
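Putting those steps together, a hedged sketch of the lookup (building on the illustrative node structure above, not Booking’s actual code) might look like this:

```python
from collections import deque

def find_markers(root: QuadtreeNode, viewport: BoundingBox, limit: int) -> list:
    """Breadth-first search: return up to `limit` markers inside the viewport,
    visiting the shallowest (most important) nodes first."""
    results = []
    queue = deque([root])
    while queue and len(results) < limit:
        node = queue.popleft()  # first-in-first-out
        # Collect this node's markers that fall inside the selected bounding box.
        for marker in node.markers:
            if viewport.contains(marker.lat, marker.lng):
                results.append(marker)
                if len(results) == limit:
                    break
        # Only descend into children whose region overlaps the viewport.
        for child in node.children:
            if child.box.intersects(viewport):
                queue.append(child)
    return results
```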
Building the Quadtree
The Quadtree is kept in memory and rebuilt from time to time to account for new markers being added to the map (or for changes in marker importance).
When building the data structure, the algorithm first starts with the root node. This node represents the entire geographic area (Booking scales the Quadtrees horizontally, so they have different Quadtree data structures for different geographic regions).
If each node can only hold 10 markers, then Booking will find the 10 most important markers from the entire list and insert them into the root node.
Afterwards, they create four children for the northwest, northeast, southwest and southeast quadrants of the root node’s area and repeat the process for each of those children.
This is repeated until all the markers are inserted into the Quadtree.
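A sketch of that build process, assuming a per-node capacity of 10 markers and the same illustrative classes as above:

```python
CAPACITY = 10  # markers stored per node (an assumption for the sketch)

def build(box: BoundingBox, markers: list) -> QuadtreeNode:
    node = QuadtreeNode(box=box)
    # The most important markers for this region stay in this node...
    markers = sorted(markers, key=lambda m: m.importance, reverse=True)
    node.markers = markers[:CAPACITY]
    remaining = markers[CAPACITY:]
    if not remaining:
        return node
    # ...and the rest are pushed down into the four quadrants.
    mid_lat = (box.min_lat + box.max_lat) / 2
    mid_lng = (box.min_lng + box.max_lng) / 2
    quadrants = [
        BoundingBox(mid_lat, box.min_lng, box.max_lat, mid_lng),  # NW
        BoundingBox(mid_lat, mid_lng, box.max_lat, box.max_lng),  # NE
        BoundingBox(box.min_lat, box.min_lng, mid_lat, mid_lng),  # SW
        BoundingBox(box.min_lat, mid_lng, mid_lat, box.max_lng),  # SE
    ]
    node.children = [
        build(q, [m for m in remaining if q.contains(m.lat, m.lng)])
        for q in quadrants
    ]
    return node
```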
Results
Quadtrees are quite simple but extremely powerful, making them very popular across the industry. If you’re preparing for system design interviews then they’re a good data structure to be aware of.
Booking scales this system horizontally, by creating more Quadtrees and having each one handle a certain geographic area.
For Quadtrees storing more than 300,000 markers, the p99 lookup time is less than 5.5 milliseconds (99% of requests take under 5.5 milliseconds).
For more details, you can read the full blog post here.
How Pinterest Load Tests Their Database
Pinterest is a social media service with over 430 million monthly active users. The company relies on advertising for revenue, showing users promoted posts that are served based on a real-time ad auction.
In order to store and serve all of their reporting metrics, Pinterest relies on Apache Druid - an open source, column-oriented, distributed data store that’s written in Java.
Druid is commonly used for OLAP (analytics workloads) and it’s designed to ingest massive amounts of event data and then provide low-latency analytics queries on the ingested data. Druid is also used at Netflix, Twitter, Walmart, Airbnb and many other companies.
To get an idea of how Druid works, you can split its architecture into three components: the query nodes, data nodes and deep storage.
Deep storage is where the company stores all their data permanently, like AWS S3 or HDFS.
Druid connects with the company’s deep storage and indexes the company’s data into Druid data nodes for fast analytical queries. New data can also be ingested through the data nodes and Druid will then write it to deep storage.
In order to make analytical queries performant, the data stored in Druid's data nodes is stored in a columnar format. You can read more about this here.
Clients send their queries (in SQL or JSON) to Druid through the query nodes.
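For illustration, a client hitting the query nodes over Druid’s SQL HTTP endpoint might look roughly like this. The host, port, datasource and column names are placeholders, not Pinterest’s actual setup.

```python
import json
import urllib.request

# Placeholder broker/router address.
DRUID_SQL_URL = "http://druid-broker.example.com:8082/druid/v2/sql/"

def query_druid(sql: str) -> list:
    """Send a SQL query to Druid's SQL endpoint and return the result rows."""
    body = json.dumps({"query": sql}).encode("utf-8")
    request = urllib.request.Request(
        DRUID_SQL_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())

# Example: daily ad impressions over the last week (hypothetical schema).
rows = query_druid("""
    SELECT TIME_FLOOR(__time, 'P1D') AS "day", SUM(impressions) AS impressions
    FROM ads_reporting
    WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '7' DAY
    GROUP BY 1
    ORDER BY 1
""")
```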
These components are broken down into different services that can be configured and scaled independently.
During the holiday months, Pinterest typically gets a big spike in traffic. In order to deal with this, teams at the company perform extensive load testing in the months prior.
Engineers on the Real-Time Analytics team at Pinterest wrote a great blog post on the process they go through for load testing Druid.
Here’s a summary
When load testing Druid, engineers are looking to verify several areas
Queries - The service should be able to handle the expected increase in queries per second and do so within the latency requirements specified in the service level agreement (SLA).
Ingestion - The real-time ingestion capabilities should be able to handle the increase in data. Druid should be able to take in all the data and write it to the data nodes and deep storage with low ingestion lag and a low number of failed writes.
Data Size - The storage system should have sufficient capacity to handle the increased data volume.
We’ll go through each of these and talk about how Pinterest tests them.
Setting up the Testing Environment
When load testing their Druid system, Pinterest can either do so with generated queries or with real production queries.
With generated queries, queries are created based on the current data set in Druid. This is fairly simple to run and does not require any preparation. However, it may not accurately show how the system will behave in production scenarios since the generated queries might not be representative of a real-world workload (in terms of which data is accessed, query types, edge cases).
Another option is to capture real production queries and re-run these queries during testing. This is more involved as queries need to be captured and then updated for the changes in the dataset/timeframe. However, this is more reflective of what Druid will experience.
Pinterest moved ahead with using real production queries and implemented query capture using Druid’s logging feature that automatically logs any query that is being sent to a Druid broker host (you send your query to a Query Server which contains a broker host).
Engineers don’t conduct testing on the production environment, as that could adversely affect users. Instead, they create a test environment that’s as close to production as possible.
They replicate the Druid setup of brokers, coordinators, and more and also make sure to use the same host machine types, configurations, pool size, etc.
Druid relies on an external database for metadata storage (data on configuration, audit, usage information, etc.) and it supports Derby, MySQL and Postgres. Pinterest uses MySQL.
Therefore, they use a MySQL dump to create a copy of all the metadata stored in the production environment and add that to a MySQL instance in the test environment.
They spin up data nodes in the test environment that read from deep storage and index data from the past few weeks/months.
The testing service loads historical production queries from the log files in the production environment and sends them to the brokers in the test environment for execution. They ramp up the queries per second to test what they expect for holiday traffic.
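Pinterest’s internal tooling isn’t described in that level of detail, but a simplified replay harness along these lines gives the idea; the log format, endpoint and ramp-up schedule are assumptions.

```python
import json
import time
import concurrent.futures
import urllib.request

# Placeholder address for a broker in the test environment.
TEST_BROKER_URL = "http://test-druid-broker.example.com:8082/druid/v2/sql/"

def load_captured_queries(path: str) -> list:
    """Read previously captured production queries (assumed one JSON object per line)."""
    with open(path) as f:
        return [json.loads(line)["query"] for line in f]

def send_query(sql: str) -> float:
    """Run one query against the test brokers and return its latency in seconds."""
    body = json.dumps({"query": sql}).encode("utf-8")
    req = urllib.request.Request(TEST_BROKER_URL, data=body,
                                 headers={"Content-Type": "application/json"})
    start = time.monotonic()
    with urllib.request.urlopen(req) as resp:
        resp.read()
    return time.monotonic() - start

def replay(queries: list, qps_schedule: list) -> list:
    """Replay captured queries, ramping up through the target QPS levels."""
    latencies = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=64) as pool:
        i = 0
        for qps in qps_schedule:          # e.g. [100, 200, 400] queries per second
            futures = []
            for _ in range(qps * 60):     # hold each QPS level for roughly a minute
                futures.append(pool.submit(send_query, queries[i % len(queries)]))
                i += 1
                time.sleep(1 / qps)       # spread submissions across each second
            latencies.extend(f.result() for f in futures)
    return latencies

def p99(latencies: list) -> float:
    """99th percentile latency, one of the metrics tracked during the test."""
    return sorted(latencies)[int(len(latencies) * 0.99)]
```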
Evaluating Query Load
Pinterest runs the real production queries on the test environment and looks at several metrics like
99th percentile latency of the queries
CPU usage of the brokers
Peak queries per second
Using these tests, they can find and address bottlenecks in Druid around the query and data services and adjust how much compute is dedicated to these components.
Some changes can be done quickly while others can take hours. Increasing the number of machines in the query services can be done quickly, whereas increasing the number of data replicas takes time since data needs to be indexed and loaded from deep storage.
Handling Increase in Data Ingestion
Testing Data Ingestion is quite similar to testing queries per second. Pinterest sets up a test environment with the same capacity, configuration, etc. as the production environment.
The main difference is that the Real-Time Analytics team now needs help from the client teams that generate the ingested data: those teams send additional events that mimic production traffic.
When reviewing ingestion traffic, the Pinterest team looks at
Ingestion lag
Number of successful/rejected events
General system health
And more.
They also validate the ingested data to make sure it’s being written correctly.
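Ingestion lag, for example, can be polled from Druid’s supervisor status API. The exact payload fields vary by Druid version and ingestion type, so treat the names below (and the Overlord address and supervisor ID) as assumptions rather than Pinterest’s actual checks.

```python
import json
import urllib.request

OVERLORD_URL = "http://druid-overlord.example.com:8090"  # placeholder
SUPERVISOR_ID = "ads_reporting"                          # placeholder

def ingestion_lag() -> dict:
    """Fetch the supervisor status and pull out lag-related information."""
    url = f"{OVERLORD_URL}/druid/indexer/v1/supervisor/{SUPERVISOR_ID}/status"
    with urllib.request.urlopen(url) as resp:
        status = json.loads(resp.read())
    payload = status.get("payload", {})
    return {
        "state": payload.get("state"),                 # e.g. RUNNING
        "aggregate_lag": payload.get("aggregateLag"),  # assumed field for total consumer lag
    }
```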
Handling Increase in Data Volume
Evaluating if the system can handle the increase in data volume is the simplest and quickest check.
For this, they look at the Druid Web Console, where they can see all the data nodes and current capacity. They estimate the amount of additional data that will be stored over the holiday period and adjust for that.
Results
From the testing, Pinterest found that they were able to handle the additional traffic expected during the holiday period. They did see that the broker pool may need additional hosts if traffic crosses a certain threshold, so they noted that the pool size may need to be increased.
For more details, you can read the full blog post here.