How Airbnb Built Their Feature Recommendation System
Plus, how to minimize correlated failures in a distributed system, building a basic RDBMS from scratch and more.
Hey Everyone!
Today we’ll be talking about
How Airbnb Built Their Feature Recommendation System
Airbnb scans through customer reviews, conversations between hosts and travelers, support requests and other unstructured data.
One way they use this data is to generate recommendations on how hosts can improve their property listings.
They use TextCNN for named entity recognition and use word embeddings to map the text to key phrases.
Career Advice Nobody Gave Me: Never Ignore a Recruiter
Developers on Reddit/Hacker News like to joke about the large amount of recruiter spam you get on LinkedIn/email.
While most of the inbound messages you get are a waste of time, some of them can be meaningful opportunities.
You can make the most of the messages by creating a system to quickly send the recruiter a templated reply to get information on tech stack, total compensation, etc. while minimizing the amount of time you spend.
Tech Snippets
How to Minimize Correlated Failures in a Distributed System ~ AWS Builder’s Library
Building a Basic RDBMS From Scratch
How Asana Onboards Engineering Managers
How Dropbox Manages Data Quality and Coverage
How Airbnb Built Their Feature Recommendation System
Airbnb is an online marketplace where people can rent out their homes or rooms to travelers who need a place to stay. The company has hundreds of millions of users worldwide and over 4 million hosts on the platform.
In order to increase revenue, hosts on Airbnb need to create the most attractive listing possible (which travelers will see when they’re searching for an apartment). They should provide clear information on the specific things travelers are looking for (fast internet, kitchen size, access to shopping, etc.) and also advertise the best features of the home/apartment.
Airbnb makes this easier by providing highly personalized recommendations to hosts on details that should be added to the listing.
They generate these recommendations by analyzing a huge amount of data, including
In-app conversations between the host and travelers
Customer reviews for the property
Customer support requests that travelers made while they were staying on the property
Joy Jing is a senior software engineer at Airbnb and she wrote a great blog post on the machine learning Airbnb uses to generate these recommendations.
Here’s a summary
Airbnb has a huge amount of text data on each property. Things like conversations between the host and travelers, customer reviews, customer support requests, and more.
They use this unstructured data to generate home attributes around things like wifi speed, free parking, access to the beach, etc.
To do this, they built LATEX (Listing ATtribute EXtraction), a machine learning system to extract these attributes from the unstructured text.
It works in two steps
Named Entity Recognition (NER) - they extract key phrases from the unstructured text data
Entity Mapping Module - they use word embeddings to map these phrases to home attributes.
For NER, Airbnb wants to scan through the unstructured text and extract any phrases that are related to home attributes. To do this, they use TextCNN (a convolutional neural network for text).
They fine-tuned the model on human-labeled text data from various sources within Airbnb. It extracts key phrases around things like amenities (“hot tub”), specific POI (“Empire State Building”), generic POI (“post office”) and more.
However, users might use different terms to refer to the same thing. Someone might refer to the hot tub as the jacuzzi or as the whirlpool. Airbnb needs to take all these different phrases and map them all to hot tub.
To do this, Airbnb uses word embeddings, where the key phrase is converted to a vector using an algorithm like Word2Vec (where the vector is chosen based on the meaning of the phrase). Then, Airbnb looks for the closest attribute label word vector using cosine distance.
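As a rough illustration of the mapping step, here's a minimal sketch of nearest-label lookup with cosine similarity. The 3-dimensional vectors, the label set and the function names are all made up for this example; real embeddings would come from a trained model like Word2Vec and have far more dimensions. This is not Airbnb's actual code.

```python
import math

# Toy vectors invented for illustration only. In practice these would be
# produced by a trained embedding model, not written by hand.
ATTRIBUTE_VECTORS = {
    "hot tub": [0.9, 0.1, 0.0],
    "free parking": [0.0, 0.9, 0.2],
    "wifi": [0.1, 0.0, 0.95],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def map_to_attribute(phrase_vector, attribute_vectors=ATTRIBUTE_VECTORS):
    """Return the attribute label whose vector is closest to the phrase vector."""
    return max(
        attribute_vectors,
        key=lambda label: cosine_similarity(phrase_vector, attribute_vectors[label]),
    )


# Hypothetical embedding for the phrase "jacuzzi".
jacuzzi_vector = [0.85, 0.15, 0.05]
print(map_to_attribute(jacuzzi_vector))
```

Picking the label with the highest cosine similarity is the same as picking the one with the smallest cosine distance, since distance is just 1 minus similarity.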
To provide recommendations to the host, they calculate how frequently each attribute label is referenced across the different text sources (past reviews, customer support channels, etc.) and then aggregate them.
They use this as a factor to rank each attribute in terms of importance. They also use other factors like the characteristics of the property (property type, square footage, luxury level, etc.).
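The frequency part of this ranking can be sketched in a few lines. This is a simplified illustration, assuming each text source has already been reduced to a list of attribute labels; the source names, the sample data and the `rank_attributes` function are hypothetical. The blog summary doesn't say how frequency is combined with the other factors (property type, square footage, etc.), so that part is left out.

```python
from collections import Counter


def rank_attributes(mentions_by_source):
    """Rank attribute labels by how often they are mentioned across all sources.

    mentions_by_source maps a source name to the list of attribute labels
    extracted from that source. Ties are broken alphabetically.
    """
    totals = Counter()
    for labels in mentions_by_source.values():
        totals.update(labels)  # add this source's per-label counts
    return sorted(totals, key=lambda label: (-totals[label], label))


# Made-up example data for a single listing.
mentions = {
    "reviews": ["hot tub", "wifi", "hot tub"],
    "messages": ["wifi", "free parking"],
    "support": ["wifi"],
}
print(rank_attributes(mentions))
```

Here wifi is mentioned three times, hot tub twice and free parking once, so the host would be prompted about wifi first.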
Airbnb then prompts the owner to include more details about certain attribute labels that are highly ranked to improve their listing.
For more details, you can read the full blog post here.
Tech Snippets
Career Advice Nobody Gave Me: Never Ignore a Recruiter
Many developers on Reddit/Hacker News like to joke about the “recruiter spam” you can get as a software engineer. Many of the inbound messages can be completely irrelevant and it’s usually a waste of your time to engage.
Most engineers just ignore the messages, but Alex Chesser wrote a great blog post on a better way to engage with the inbound.
He uses a templated script that he copies and pastes to quickly respond to recruiter messages.
In the script he politely tells the recruiter he doesn’t have time for a call, but would like more information about the position. He enquires about the company name, job description and total compensation for the role.
When the recruiter responds, there are three possible scenarios
The salary is at or below your current level - You’ve just collected a salary data point. Looks like you’re getting paid the right amount. You can reply to the recruiter with a message about how you’re only open to positions with a compensation of 1.5x your current salary.
The salary is above your current level but less than 1.5x your current salary - Ask for more information about the technology stack and position type. Maybe there are growth opportunities in switching.
The salary is more than 1.5x your current - It probably makes sense to arrange a call.
Write templated replies for each of these scenarios, so it’s much faster to auto-respond (Alex gives examples of templates he uses in the full blog post).
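If it helps to see the decision rule spelled out, here's a small sketch of the three scenarios as a function. The function name and the returned labels are made up for this example, and treating an offer of exactly 1.5x as worth a call is an assumption (the post only says "more than 1.5x").

```python
def triage_offer(offered, current, multiplier=1.5):
    """Pick which templated reply to send, based on the quoted compensation."""
    if offered <= current:
        # Scenario 1: at or below your current pay.
        return "decline_with_target"
    if offered < multiplier * current:
        # Scenario 2: a raise, but below the 1.5x bar.
        return "ask_for_details"
    # Scenario 3: at or above the 1.5x bar (exactly 1.5x is an assumption).
    return "schedule_call"


print(triage_offer(offered=120_000, current=100_000))
```

Each returned label would map to one of your saved reply templates.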
The vast majority of your templated responses will probably result in scenario 1 (assuming you’re being paid a fair rate), but scenarios 2 and 3 are where you’ll find the biggest career growth.
Put a system in place so you don’t miss those opportunities while minimizing the amount of time/energy you need to invest.
You can read the full blog post here.