Tech Dive - Containers

A dive into virtual machines, containers and docker. Plus, how Slack manages infrastructure projects, a guide to onboarding software engineers and more.

Hey Everyone!

Today we’ll be talking about

  • A Tech Dive on Virtual Machines, Containers and Docker

    • Why should you care?

    • Deploying Code before VMs and the issues involved

    • Intro to VMs and their benefits

    • Intro to containers and their pros/cons compared to VMs

    • Intro to Docker and key concepts

    • Alternatives to Docker and the future of containers

  • Tech Snippets

    • 5 properties of a healthy software project

    • Build a CDN from Scratch

    • The Ultimate Guide to Onboarding Software Engineers

    • How Slack Manages Infrastructure Projects

The fastest way to get promoted is to work on projects that have a big impact on your company. Big impact => better performance review => promotions and bigger bonuses.

But, how do you know what work is useful?

The key is in combining your abilities as a developer with product skills.

If you have a good sense of product, then you can understand what users want and which features will help the company get more engagement, revenue and profit.

Product for Engineers is a fantastic newsletter that’s dedicated to helping you learn these exact skills.

It’s totally free and they send out curated lessons for developers on areas like

  • How to run successful A/B tests

  • Using Feature Flags to ship faster

  • Startup marketing for engineers

and much more.

sponsored

Tech Dive - Containers and Docker

One of the biggest trends in tech over the last decade has been the rise of containers and the meteoric growth they’ve seen in adoption. They’ve completely changed the way companies ship software and have helped catalyze new technologies/paradigms like microservices-based architectures or serverless computing.

Nearly all of the backend dev buzzwords you’ll see thrown around nowadays rely (in some form) on containers.

In this tech dive, we’ll first talk about the rise of containers and what problems they solve. Afterwards, we’ll delve into Docker, how it works and its associated ecosystem.

This is the first part of our pro article on Containers and Docker.

We’ll be sending out the full article tomorrow, so you can get the full content by subscribing to Quastor Pro here. Thanks for the support, I really appreciate it!

Why should you care?

There’s a constant influx of new technology being created every day… so why should you care about containers?

A container is a technology that allows you to package your application code, dependencies, environment variables, configuration settings, etc. into a single bundle.

You can then share this bundle (called a container image) with other developers who need to run your application. It makes deploying your code significantly easier.
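To make this concrete, here's a minimal sketch of how that bundle is typically described with Docker. The app name, port and file names are hypothetical placeholders, not from this article:

```dockerfile
# Start from a pinned base image so every build uses the same environment
FROM node:20-slim

WORKDIR /app

# Install dependencies first so this layer is cached between builds
COPY package.json package-lock.json ./
RUN npm ci

# Copy in the application code and bake in configuration
COPY . .
ENV PORT=3000

EXPOSE 3000
CMD ["node", "server.js"]
```

Anyone with Docker installed could then build and run the same bundle with `docker build -t myapp .` and `docker run -p 3000:3000 myapp`, without installing Node or any of the dependencies themselves.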

Some benefits of using containers are

  • Consistent Development Environments

  • Simplified Dependency Management

  • Faster Onboarding for New Devs

  • Reproducible Builds

  • Easier Scalability

  • Portability

  • Easier Versioning & Rollbacks

  • Configuration Management

These are all benefits of using container technologies like Docker or LXC. However, these benefits didn’t start with containers.

Many of them came in the early 2000s with the precursor to containers - virtual machines (VMs).

In this article, we’ll first talk about life before virtual machines, why they came about, how they changed the game and their pros/cons.

Then, we’ll delve into the pros/cons of containers and why they’re so widely adopted. After, we’ll talk about Docker (an open-source platform for creating, deploying and running apps with containers) and the ecosystem of tooling with dockerfiles, docker engine, registries and more.

We’ll also explore alternatives to Docker and why you’d consider those. We’ll end with a discussion about the future of containers and where the industry is heading.

Deploying your Code Prior to VMs

Let’s say you’re building a small web application where you’re using Postgres for storage and Redis for caching. You’d like to deploy this application so that other people on the internet can use it.

There are several issues you’ll have to deal with

  • Server Costs

  • Portability

Server Costs

Prior to VMs/containers, you’d probably have to buy your own dedicated machine to run your application.

You can’t run it on your own computer, since using the machine for your day-to-day personal tasks could interfere with the application.

You could consider running your application on your buddy’s powerful server that has some spare capacity. However, this comes with its own challenges.

  • Dependency Conflicts - Your friend’s server might already be running Redis/Postgres, but he could be using older versions that are incompatible with your application. 

  • Resource Contention - Maybe your friend’s application periodically has a spike in CPU usage. You can’t specifically allocate CPU/memory to each application so his CPU-usage spike will become your problem.

  • Security Issues - Your friend might have security vulnerabilities in his code. His application getting hacked could mean your data gets leaked too since they’re on the same OS.

Unfortunately, your only option is to buy your own machine. Even worse, you’ll also have to plan for future user growth. If your user base doubles, it would be quite wasteful to throw out all the old hardware and buy a new, beefier server just to keep up.

In addition to the excessive spending, another issue you’ll deal with is portability.

Portability

Let’s say you want to send your web app to a QA tester for debugging. That QA tester will also need Postgres and Redis installed on their machine.

Before they can even run your application, they’ll have to set up all the dependencies, utilities and scripts that you’re using. They’ll also have to make sure the versioning is correct, environment variables are set, configure security policies, etc.

This process can be extremely tedious and also very error-prone. It’ll lead to the “it works on my machine!” argument that can often happen between developers and DevOps.

With our example so far, we just have a single QA tester. What happens when you have to deploy this app across hundreds of web servers?

Manually installing the dependencies/config settings/application is incredibly tedious and error prone.

Whenever there are app updates, dependency upgrades or configuration changes, you’ll have to make sure the updates are correctly rolled out across every server in your fleet.

VMs were created to address both of these problems.

Virtual Machines

Virtual Machines allow you to run multiple “virtual computers” on a single physical computer. Each of these virtual computers (virtual machines) will have its own operating system with virtualized hardware. In other words, each VM behaves as if it were a standalone physical computer with its own dedicated amount of CPU, RAM and disk storage.

Virtual machines work by relying on a software layer called a hypervisor. Hypervisors can be split into Type 1 hypervisors (which run directly on a bare metal server) and Type 2 hypervisors (which run on top of a host operating system). All your virtual machines run on top of the hypervisor, which allocates the physical hardware resources across the VMs.

When you want to use a hypervisor to run your application, you take your application code, dependencies, libraries, etc. and create a VM image.

After you’ve created your VM image, you can back it up on AWS S3, distribute it across all the servers in your fleet, store it in an online repository and more (you can treat it just as any other file). When someone needs to run your application, they can just download the VM image and run it using a hypervisor.
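As a rough sketch of that workflow, assuming QEMU as the hypervisor (the disk size, memory and file names here are made-up placeholders):

```shell
# Create a virtual disk for the VM (qcow2 is QEMU's copy-on-write format)
qemu-img create -f qcow2 my-app-vm.qcow2 20G

# Boot an installer ISO to set up the guest OS, your app and its dependencies
qemu-system-x86_64 -m 2048 -smp 2 \
  -drive file=my-app-vm.qcow2,format=qcow2 \
  -cdrom ubuntu-server.iso -boot d

# The resulting .qcow2 file is your VM image: copy it to S3, another
# server or a teammate, and they can boot it with the same command
# (minus the installer ISO).
```

Type 1 hypervisors like VMware ESXi or Xen follow the same idea, just with their own image formats and tooling.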

Prior to the 1990s, VMs were mostly used with mainframes or in academic research. The first virtual machines were created at IBM to help them with time sharing for their massive mainframe servers.

In 1998, VMware was founded, and the company released VMware Workstation in 1999. This allowed you to run virtual machines on the x86 architecture, which was the most common computing platform for personal and enterprise machines. Running virtual machines became much easier and more commonplace.

With Virtual Machines, you now have the following benefits

  • Cost Efficiency - different users can run their applications on a single physical server by using different virtual machines. This allows you to use all the resources of the machine much more effectively.

  • Isolation & Security - VMs provide a high degree of isolation from each other. If an application in one VM gets hacked, the applications running in other VMs are still safe.

  • Flexibility & Portability - You can create VM images with everything you need to run your application. Another developer can download your VM image, run it and immediately get set up.

Virtualization exploded in popularity in the 2000s and helped spur the growth of cloud computing. If you wanted to rent a server in the cloud, you no longer had to rent the entire machine. Instead, you could rent just a small portion of its computing resources. Amazon was the first cloud provider to take advantage of this with the launch of EC2 in 2006 (and it’s done rather well for them).

However, as VMs proliferated, some issues began to come up.

The main problem was around how heavy and compute-intensive VMs were.

Each VM required its own dedicated OS. That meant each VM would be

  • Resource Intensive - each VM occupies a large footprint on the underlying host machine

  • Maintenance-Heavy - you need to dedicate time to OS-level patching and management for every guest OS

  • Slow to Boot - booting up a VM can take dozens of seconds

However, for many uses of VMs, you don’t actually need the entire OS. You could use something far more lightweight.

This is where containers come into play.

This is the first part of our tech dive on Containers and Docker. We’ll be sending out the full article (2500+ words) to Quastor Pro subscribers tomorrow.

We’ll talk about containers and their pros/cons compared to VMs, plus Docker and key concepts you should know (registries, dockerfiles, docker engine, the Open Container Initiative, Podman, containerd and more).

Many engineering roles today need developers to get involved in product decisions, talk to users and analyze usage data. Understanding how to do this well is hard.

Product for Engineers wrote a fantastic blog post delving into some of the mistakes devs make when they’re trying to make decisions based on analytics data.

Some of the mistakes include

  • Making it too Complicated - It’s easy to get overwhelmed by the huge array of data tools. Instead, start small. Pick a specific feature and track its usage with trends and retention. Use that to iterate.

  • Not Using Session Replays - Session replays are a fantastic tool for uncovering bugs, unexpected behavior and UX issues. They have a very high information density and aren’t just for PMs or marketers.

  • Only focusing on the Numbers - relying on data alone is like tying one arm behind your back. You also need qualitative data like surveys and user interviews. Combining the two will help you build better products.

For the rest of the mistakes, check out Product for Engineers. It’s a fantastic newsletter by PostHog that helps developers learn how to build apps that users love.

To hone your product skills and read more articles like this, check out Product for Engineers below.

sponsored

Tech Snippets