Relational Databases on AWS

A tech dive on AWS RDS. Plus, how you can avoid getting down-leveled when you're interviewing for a new company, how Pinterest scaled to 11 million users with only 6 engineers and more.

October 02, 2023

Hey Everyone!

Today we'll be talking about

Relational Databases on AWS
- Pros/Cons of Self-Hosting vs. using AWS RDS
- Database Engine Choices with RDS
- What Factors Determine Pricing
- Scaling Vertically with Upgrading Hardware
- Scaling Horizontally with Read Replicas and Sharding
- AWS Aurora
How to Avoid Getting Down-Leveled in Behavioral Interviews
- Use the job description and level you’re applying for to generate stories.
- Make sure you come up with stories that are something an L5/L6 (or whatever you’re applying for) would do.
- In your story, make sure you adequately explain your level and responsibilities so the interviewer doesn’t down-level you.
Tech Snippets
- How Pinterest scaled to 11 million users with only 6 engineers
- Building an SQLite Clone in C
- Hire Developers by Having Them Read Code
- Using the Language’s Type System Effectively
- Hints for Distributed Systems Design

Relational Databases on AWS

In past Quastor articles, we’ve frequently talked about Postgres/MySQL and using cloud databases.

AWS RDS is the most popular cloud relational database service, so in this article we’ll delve into things like self hosting vs. using RDS, choices for database engine, pricing, scaling vertically/horizontally, AWS Aurora and more.

AWS RDS

If you need a relational database on AWS, then one way to do it would be to spin up an EC2 instance (AWS’s cloud virtual machines) and install Postgres/MySQL/whatever from the AWS AMI marketplace (a listing of pre-configured solutions that you can immediately deploy on EC2 instances).

Then, you could use AWS EBS (AWS’s block storage solution) as the storage layer and hook it up with your EC2 instance.

When you need to scale vertically (upgrade the hardware), you can swap the EC2 virtual machine for a more powerful instance.

If you need to create read replicas, you manually create additional EC2 instances and configure the DBMS on each to operate as read replicas. You’ll have to synchronize them with the primary using the database system’s built-in replication mechanisms.

The alternative way of getting a cloud relational database on AWS is to use Relational Database Service (RDS), Amazon’s offering for a managed cloud relational database.

RDS has been around since 2009 and it lets you quickly spin up a database and have Amazon take care of the upgrades/patches/backups/etc. They also provide abstractions to make it easier to create replicas, upgrade to a beefier machine, automate backups and more.

Self Hosted vs. Using RDS

As discussed above, there are two main ways of spinning up a relational database on AWS: self-hosting it with EC2 and EBS, or using AWS RDS.

First, we’ll talk about the pros of self-hosting.

Pros of Self Hosting

Flexibility

The biggest pro of self-hosting is flexibility.

If you’d like to customize the underlying OS or specific details about the hardware then that isn’t available with RDS.

Additionally, RDS is built on top of popular databases like Postgres, MySQL, MariaDB and more (we’ll delve into this in the database engines section). If you’re already familiar with/using Postgres, then you’ll want to pick the Postgres variant of RDS.

Although you’ll get the vast majority of functionality of Postgres (or whatever variant you pick), there are certain features that are missing.

For example, RDS does not give you true superuser access with Postgres. Certain admin functions that require superuser privileges might not be available.

Additionally, you won’t have access to the entire ecosystem of Postgres extensions.

Similar limitations may apply to the RDS variants of MariaDB, MySQL, Microsoft SQL Server and the other database engines supported.

Pricing

Another potential benefit of self-hosting is pricing. Using RDS as opposed to self-hosting most likely means a larger AWS bill.

RDS instances cost roughly double the price of the equivalent EC2 instance (the exact premium depends on instance type, database engine, etc.). Depending on your scale, this could mean a bill that is a few hundred or a few thousand dollars higher per month.

However, the trade-off here is that you’re spending more developer-time on managing Postgres/MySQL/whatever.

Developer time is typically a lot more expensive, so you might be making a penny-wise, pound-foolish decision by trying to save with self-hosting. But this obviously depends on your scale, expertise with self-hosting, etc.
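
The trade-off can be made concrete with some back-of-the-envelope arithmetic. Here is a minimal sketch where every number is an illustrative assumption rather than real AWS pricing: the EC2 cost, the developer hourly rate, the hours of upkeep and the 2x multiplier (taken from the rough estimate above). It also assumes RDS needs no developer time at all, which is optimistic.

```python
def monthly_cost_self_hosted(ec2_cost, dev_hours, dev_hourly_rate):
    """Infrastructure bill plus the developer time spent on database upkeep."""
    return ec2_cost + dev_hours * dev_hourly_rate


def monthly_cost_rds(ec2_cost, rds_multiplier=2.0):
    """RDS bill, modeled as a multiple of the equivalent EC2 price."""
    return ec2_cost * rds_multiplier


def break_even_dev_hours(ec2_cost, dev_hourly_rate, rds_multiplier=2.0):
    """Hours of monthly upkeep at which self-hosting costs the same as RDS."""
    return ec2_cost * (rds_multiplier - 1.0) / dev_hourly_rate


# Hypothetical inputs: $500/month of EC2, developers costing $100/hour.
ec2 = 500.0
rate = 100.0
print(monthly_cost_rds(ec2))                   # 1000.0
print(monthly_cost_self_hosted(ec2, 8, rate))  # 1300.0
print(break_even_dev_hours(ec2, rate))         # 5.0
```

Under these assumptions, self-hosting only wins if the database takes fewer than 5 hours of developer time per month.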

Nhost switching off RDS

Nhost offers a Backend-as-a-Service platform and they published an interesting blog post last year on why they decided to self-host instead of using RDS.

The main reason why was specific to Nhost. They give every user a Postgres database and they were trying a setup where each RDS instance would host multiple databases from different users.

So, if Jack, Tom and I all create separate Nhost accounts, our Postgres databases might be grouped together on the same RDS-Postgres instance.

This quickly led to issues with the following:

Limited Extension Set - The Nhost team found that the set of extensions available for RDS Postgres was too limited for their use case.
Noisy Neighbor Problem - I decide I’m going to run full table scans for some analytics queries at 9 am Eastern Time. This hogs up CPU/Disk I/O and hurts performance for Tom and Jack.
Giving Users Direct Access - Maybe Tom wants super-user privileges over the Postgres database; Jack and I would obviously not be cool with that.

Instead, they moved to Postgres running on Kubernetes.

Pros of RDS

Management

As mentioned, the biggest pro of using RDS is that you offload the responsibility of managing the database to Amazon. You don’t have to worry about OS updates, database patches, backups, etc.

Instead, AWS handles all of that for you. For backups, they make it pretty easy to configure how often you want database snapshots to be taken/retained.

RDS also makes it easy to create read replicas, improve availability (deploy in different availability zones) and more.
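
As a rough illustration of how little configuration this takes, the sketch below builds the keyword arguments one might pass to boto3’s `create_db_instance` and `create_db_instance_read_replica` calls. The identifiers, username, instance class, storage size and backup window are all hypothetical, and the snippet deliberately stops short of calling AWS so it runs without credentials.

```python
def primary_instance_params(identifier, retention_days=7, multi_az=True):
    """Keyword arguments for boto3's create_db_instance (values are examples)."""
    if not 0 <= retention_days <= 35:
        raise ValueError("RDS automated backups are retained for 0 to 35 days")
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": "postgres",
        "DBInstanceClass": "db.m5.large",         # hypothetical choice
        "AllocatedStorage": 100,                  # GiB, hypothetical
        "StorageType": "gp3",                     # General Purpose SSD
        "MasterUsername": "app_admin",            # hypothetical
        "ManageMasterUserPassword": True,         # keep the password out of code
        "BackupRetentionPeriod": retention_days,  # days of automated backups
        "PreferredBackupWindow": "03:00-04:00",   # UTC, hypothetical
        "MultiAZ": multi_az,                      # standby in another AZ
    }


def read_replica_params(source_identifier, replica_number):
    """Keyword arguments for boto3's create_db_instance_read_replica."""
    return {
        "DBInstanceIdentifier": f"{source_identifier}-replica-{replica_number}",
        "SourceDBInstanceIdentifier": source_identifier,
    }


params = primary_instance_params("orders-db", retention_days=14)
replica = read_replica_params("orders-db", 1)
# With credentials configured, these would be sent as:
#   rds = boto3.client("rds")
#   rds.create_db_instance(**params)
#   rds.create_db_instance_read_replica(**replica)
print(params["BackupRetentionPeriod"], params["MultiAZ"])
print(replica["DBInstanceIdentifier"])
```

Compare this with the self-hosted route, where backups, standby instances and replication each have to be scripted and monitored by hand.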

Pricing

Despite paying more to AWS, you’ll most likely save money on developer time and headaches around managing your database.

The price difference probably makes sense to pay considering the hours of developer time you save (unless you leave all the database maintenance tasks to the unpaid intern and are fine with the occasional losses of customer data).

Database Engines

RDS is built on top of AWS EC2 instances and EBS volumes, where the EC2 instance runs some type of DBMS software.

The specific DBMS being run on the EC2 instance is the database engine for the RDS instance.

Amazon offers a choice between

MySQL
Postgres
MariaDB
Oracle Database
Microsoft SQL Server
Amazon Aurora

Due to licensing fees, Microsoft SQL Server and Oracle Database tend to be the most expensive options (whereas Postgres, MySQL and MariaDB are all open source).

Aurora is a database engine built at Amazon that is fully compatible with Postgres and MySQL. It operates differently from the other database engines in that it does not store data on a single EBS volume. Instead, Aurora divides your data into segments and stores them in a distributed manner.

This results in improvements in availability, performance and scalability. We’ll elaborate on this in the Aurora section.

Pricing

Here are some of the most important factors that determine how much you’re paying to AWS for your RDS setup.

Instance Type - how much compute/memory does your VM have? What region is it deployed in?
Licensing - Microsoft SQL Server and Oracle require additional licensing costs
Storage - How much storage do you have allocated for your database?
For Storage types, RDS lets you pick from
- General Purpose SSD storage - This is the most suitable for the vast majority of workloads.
- Provisioned IOPS SSD storage - If you have a high number of disk input/output operations, then this lets you scale to a very high I/O workload with low latency.
- Magnetic (HDD) - You should be picking SSD storage for new applications (due to the performance benefits), but HDD support is available for backward compatibility.
Data Transfer - AWS charges for data transferred out of their network to the internet or to other AWS regions/services.
Snapshot Costs - You can configure RDS to routinely take backups. You’ll be paying for storage of these database backups, so you should set your retention policy accordingly.
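
To see how these factors combine, here is a small estimator sketch. The rates are made up to keep the arithmetic easy to follow and are not real AWS prices. The field names are hypothetical too, and licensing is assumed to be folded into the hourly instance rate.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # common approximation: 8760 hours per year / 12


@dataclass
class RdsUsage:
    instance_rate_per_hour: float  # varies with instance type, engine license, region
    storage_gb: float
    storage_rate_per_gb: float     # varies with storage type
    snapshot_gb: float
    snapshot_rate_per_gb: float
    transfer_out_gb: float
    transfer_rate_per_gb: float


def estimate_monthly_bill(usage):
    """Return a per-factor cost breakdown plus the total, in dollars."""
    breakdown = {
        "instance": usage.instance_rate_per_hour * HOURS_PER_MONTH,
        "storage": usage.storage_gb * usage.storage_rate_per_gb,
        "snapshots": usage.snapshot_gb * usage.snapshot_rate_per_gb,
        "data_transfer": usage.transfer_out_gb * usage.transfer_rate_per_gb,
    }
    breakdown["total"] = sum(breakdown.values())
    return breakdown


# Made-up usage and rates, for illustration only.
usage = RdsUsage(
    instance_rate_per_hour=0.25,
    storage_gb=200, storage_rate_per_gb=0.125,
    snapshot_gb=100, snapshot_rate_per_gb=0.10,
    transfer_out_gb=50, transfer_rate_per_gb=0.09,
)
bill = estimate_monthly_bill(usage)
for factor, dollars in bill.items():
    print(f"{factor:>13}: ${dollars:8.2f}")
```

With these inputs the instance itself dominates the bill, which is why instance type (and reserved-instance discounts on it) tends to be the first thing to look at.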

Similar to other AWS services, you can reduce your bill by pre-paying for usage (reserved instances).

Scaling

With system design, the two ways of scaling are vertically and horizontally. Vertically means upgrading your hardware while horizontally means adding additional machines to handle the load.

Scaling Vertically

With RDS, scaling vertically is pretty simple.

One way is to increase the size of the virtual machine for the RDS instance in the AWS management console (or with the CLI or Infrastructure as Code).

This will cause temporary downtime as AWS spins up a new instance.

The alternative way is to create a replica of your RDS instance. This replica should be in the same availability zone and have the specs you’re looking for.

Once you spin it up, you switch-over from the old RDS instance to the new one.
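
The in-place resize can be sketched as follows. The code only builds the keyword arguments for boto3’s `modify_db_instance` call and does not contact AWS. The instance identifier and the ladder of instance classes are hypothetical examples.

```python
# Hypothetical ladder of instance classes, smallest to largest.
SIZE_LADDER = ["db.m5.large", "db.m5.xlarge", "db.m5.2xlarge", "db.m5.4xlarge"]


def next_size_up(current_class, ladder=SIZE_LADDER):
    """Return the next larger class, or None when already at the top."""
    index = ladder.index(current_class)
    if index + 1 < len(ladder):
        return ladder[index + 1]
    return None


def resize_params(identifier, new_class, apply_immediately=False):
    """Keyword arguments for boto3's modify_db_instance.

    With ApplyImmediately=False the change waits for the next maintenance
    window, so the downtime happens at a predictable time.
    """
    return {
        "DBInstanceIdentifier": identifier,
        "DBInstanceClass": new_class,
        "ApplyImmediately": apply_immediately,
    }


target = next_size_up("db.m5.xlarge")
if target is None:
    print("Already on the largest class in the ladder; scale horizontally instead")
else:
    # With credentials configured, this would be sent as:
    #   boto3.client("rds").modify_db_instance(**params)
    params = resize_params("orders-db", target)
    print(params)
```

The `None` branch corresponds to running out of vertical headroom, at which point read replicas or sharding are the remaining options.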

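Scaling vertically obviously has limits. The often-quoted ceiling of 32 cores and 244 gigabytes of memory corresponds to older instance classes such as db.r3.8xlarge; newer classes go higher (db.r5.24xlarge, for example, has 96 vCPUs and 768 GiB of memory), but there is always a largest instance.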

Once you reach this point, your only option is to scale horizontally.

This is the first part of the tech dive on AWS RDS.

In the full tech dive, we’ll cover

Scaling Horizontally with Read Replicas
Scaling Horizontally with Sharding
AWS Aurora

Tech Snippets

How Pinterest scaled to 11 million users with only 6 engineers

When Pinterest hit 11.7 million monthly users, they only had 6 engineers. At the time, it was the fastest company to hit 10 million users.

They accomplished this by
- Using simple, proven tech (MySQL, Django, Redis, etc.)
- Limiting their options
- Preferring database sharding over clustering

This is a great read that delves into Pinterest’s tech stack, how they grew, the mistakes the team made when scaling and more.

engineercodex.substack.com/p/how-pinterest-scaled-to-11-million

Hints for Distributed Systems Design

Murat Demirbas is a Principal Applied Scientist at AWS and also teaches Distributed Systems at the University at Buffalo, SUNY.

He published a fantastic blog post delving into 12 tips for building distributed systems. They’re organized into functionality, performance and fault-tolerance.

muratbuffalo.blogspot.com/2023/10/hints-for-distributed-systems-design.html

Building an SQLite Clone in C

This is a 14 part series that delves into building a toy database. Skimming these posts is a terrific way to get an understanding of the components in a database and how they’re built (B-Tree indexes, storage engine, front-end).

cstack.github.io/db_tutorial

Hire Developers by Having Them Read Code

Almost all developers agree that the current coding interview process (leetcode) is broken.

This is an interesting blog post that suggests a different strategy. Interview developers by having them read code and predict the output. The code consists of things like basic function calls, recursion, side-effects, etc.

The key here is to not just focus on whether the candidate got the question right/wrong. Instead, delve deeper into the candidate’s answers and try to understand their thinking process.
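
An exercise in this style might look like the sketch below. It is a made-up example, not one taken from the blog post, and it mixes recursion with a side effect on a shared list. The candidate would be shown the code without the output comments and asked what it prints and why.

```python
calls = []


def countdown(n):
    calls.append(n)  # side effect: records every call, including the base case
    if n == 0:
        return 0
    return n + countdown(n - 1)


result = countdown(3)
print(result)  # 6
print(calls)   # [3, 2, 1, 0]
```

A useful follow-up question is what changes if `countdown(3)` is called a second time, since `calls` keeps growing across calls.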

freakingrectangle.wordpress.com/2022/04/15/how-to-freaking-hire-great-developers

The type system is a programmer's best friend

This is an interesting blog post on using type systems more effectively. If you're storing a user's email address then don't just rely on a string. Use a dedicated type and add in useful functionality, such as a .Domain() method which will return gmail or outlook. Having good types can prevent many future bugs and make your team much more efficient.
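
A minimal Python sketch of the idea is shown below. The blog post uses its own language and API, so the class name, the `domain()` method and the deliberately simplistic validation here are illustrative assumptions only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmailAddress:
    """A dedicated type, so a bare string can't be passed where an email is expected."""
    value: str

    def __post_init__(self):
        # Simplistic check for illustration; real email validation is more involved.
        local, sep, domain = self.value.rpartition("@")
        if not sep or not local or "." not in domain:
            raise ValueError(f"not a valid email address: {self.value!r}")

    def domain(self) -> str:
        """Return the first label of the domain, e.g. 'gmail' for 'ada@gmail.com'."""
        return self.value.rpartition("@")[2].split(".")[0].lower()


email = EmailAddress("ada@gmail.com")
print(email.domain())  # gmail
```

Because validation happens once in the constructor, any function that accepts an `EmailAddress` can rely on the value being well-formed instead of re-checking a string.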

dusted.codes/the-type-system-is-a-programmers-best-friend

How to Avoid Getting Down-Leveled in Interviews

When you’re interviewing for a new company, one potential risk is that you get down-leveled from your current position. Maybe you’re an E5 at Facebook, but Google gives you an offer as an L4.

Coming in at the wrong level can set your career back a few years and also cause a big hit to your total comp.

Meta is a Principal Engineer at Amazon, where he’s conducted over 800 interviews and worked as a bar-raiser. He posted a great video explaining how you can avoid getting down-leveled.

Companies are trying to fill a specific need when they interview you. Hiring someone at the wrong level can be disastrous. If someone is mistakenly leveled too high, then it can be extremely expensive for the company.

The main reasons people get down-leveled are

Levels are different at various companies - Startups tend to promote more quickly than Big Tech companies. Check out levels.fyi for more info on the leveling schemes at various companies.
You got down-leveled - The usual cause for down-leveling is subpar performance in your behavioral interviews. (If you perform poorly in the technical interview, you usually don’t get an offer at all.)

How to Avoid Getting Down-Leveled In Behavioral Interviews

Behavioral interviews are all about story-telling and the key here is to show your leadership qualities through your stories.

You’ll usually see STAR (Situation, Task, Action, Results) recommended as the traditional story shape you should be following.

The issue with STAR is that it doesn’t work well for generating meaningful stories. It’s great for describing them, but STAR doesn’t help you figure out which qualities you’re looking to demonstrate and how.

Instead, you should think about the man-in-a-hole story shape.

Start your story by anchoring on your status and responsibilities in your past team (use terms from the job description here so you portray yourself as being in a similar level)
Layer in conflict, challenges and obstacles that you need to deal with. These conflicts should be significant enough to signify your level.
Overcome the conflict by describing how you took actions that an L5/L6 (or whatever level you’re targeting) would take to solve the problem.

Once you have a story that fits this shape, then you can use STAR to format your story and make sure it has a point. Implement STAR after you’ve constructed your story so you can make sure it has substance.

Meta then gives an example of a behavioral interview story that fits this pattern.

Interviewer: Tell me about a time you made your team’s processes, software or system simpler.

Meta: Sure, I was the team lead on a team of eight software developers that was responsible for several mission-critical data plane services for our flagship product. I was responsible for the team’s architecture.

Meta starts by establishing his status at his previous company. He managed 8 developers as a tech lead and was working on a mission-critical service.

Meta: I spoke with an adjacent team and we decided that it made more sense if our team took ownership of some software they owned because it aligned with the charter of our team much more than it did theirs

Meta: Turns out it was pretty buggy but that wasn’t a big deal after I squashed the bugs and added unit tests and integration tests

Meta: The big issues cropped up later when it started paging our on-call engineers late at night. People would just restart the process and go back to sleep.

Meta: However, when it was my turn, I dug into the issue and found that we were dependent on sus4j. Its usage is not allowed in the company, so we were on the hook for migrating off of it immediately.

Meta: I estimated that it would take a month of dev-effort, but we didn’t have this because of an immovable deadline two months away.

Meta explained the challenge he was facing. This is at an appropriate difficulty for the roles he’s targeting.

Meta: I decided that the only reasonable course of action was to decommission the service. However, we didn’t know who was using it.

Meta: To figure this out, we turned on pass-through mode so clients would identify themselves. I found that 27 teams were using this service so I set up meetings with some of them to talk about deprecating the service.

Meta: It turns out that most of them could get what they needed from other services that my team owned. So, I created a campaign to deprecate the service. We split the clients up amongst my team members and let the client teams know that the service would be shut off in 3 months.

Meta: This was disruptive to our schedule but it didn’t blow up our project. It came out to a positive outcome and I was happy with how the architecture looked afterwards.

Meta ended by explaining the steps he took to overcome the challenge. He concluded by describing the final results.

To see the full video by Meta, check it out here.