Tech Dive - Containers
A dive into virtual machines, containers and Docker. Plus, how Slack manages infrastructure projects, a guide to onboarding software engineers and more.
Hey Everyone!
Today we’ll be talking about
A Tech Dive on Virtual Machines, Containers and Docker
Why should you care?
Deploying Code before VMs and the issues involved
Intro to VMs and their benefits
Intro to containers and their pros/cons compared to VMs
Intro to Docker and key concepts
Alternatives to Docker and the future of containers
Tech Snippets
5 properties of a healthy software project
Build a CDN from Scratch
The Ultimate Guide to Onboarding Software Engineers
How Slack Manages Infrastructure Projects
Tech Dive - Containers and Docker
One of the biggest trends in tech over the last decade has been the rise of containers and the meteoric growth they’ve seen in adoption. They’ve completely changed the way companies ship software and have helped catalyze new technologies/paradigms like microservices-based architectures and serverless computing.
Nearly all of the backend dev buzzwords you’ll see thrown around nowadays rely (in some form) on containers.
In this tech dive, we’ll first talk about the rise of containers and what problems they solve. Afterwards, we’ll delve into Docker, how it works and its associated ecosystem.
This is the first part of our pro article on Containers and Docker.
We’ll be sending out the full article tomorrow, so you can get the full content by subscribing to Quastor Pro here. Thanks for the support, I really appreciate it!
Why should you care?
There’s a constant influx of new technology being created every day… so why should you care about containers?
A container is a technology that allows you to package your application code, dependencies, environment variables, configuration settings, etc. into a single bundle.
You can then share this bundle (called a container image) with other developers who need to run your application. It makes deploying your code significantly easier.
Some benefits of using containers are
Consistent Development Environments
Simplified Dependency Management
Faster Onboarding for New Devs
Reproducible Builds
Easier Scalability
Portability
Easier Versioning & Rollbacks
Configuration Management
These are all benefits of using container technologies like Docker or LXC.
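To make “packaging everything into a single bundle” concrete, here’s a minimal sketch of what that workflow looks like with Docker. The base image, filenames and port below are placeholder assumptions for a small Python web app; the key idea is that the recipe, not the host machine, defines the environment.

```bash
# Write a minimal Dockerfile: the base OS layer, dependencies, app code
# and start command are all described in one recipe.
# (The base image and filenames here are hypothetical.)
cat > Dockerfile <<'EOF'
FROM python:3.11-slim
WORKDIR /app

# install dependencies first so this layer gets cached between builds
COPY requirements.txt .
RUN pip install -r requirements.txt

# copy in the rest of the application code
COPY . .
CMD ["python", "app.py"]
EOF

# build the bundle into a named, versioned container image
docker build -t myapp:1.0 .

# anyone with Docker can now run the exact same bundle
docker run -p 8000:8000 myapp:1.0
```

Pushing an image like `myapp:1.0` to a registry is what makes the versioning, rollback and onboarding benefits above fall out almost for free.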
However, these benefits didn’t start with containers. Many of them came in the early 2000s with the precursor to containers: virtual machines (VMs).
In this article, we’ll first talk about life before virtual machines, why they came about, how they changed the game and their pros/cons.
Then, we’ll delve into the pros/cons of containers and why they’re so widely adopted. After that, we’ll talk about Docker (an open-source platform for creating, deploying and running apps with containers) and its ecosystem of tooling: Dockerfiles, Docker Engine, registries and more.
We’ll also explore alternatives to Docker and why you’d consider those. We’ll end with a discussion about the future of containers and where the industry is heading.
Deploying your Code Prior to VMs
Let’s say you’re building a small web application where you’re using Postgres for storage and Redis for caching. You’d like to deploy this application so that other people on the internet can use it.
There are several issues you’ll have to deal with
Server Costs
Portability
Server Costs
Prior to VMs/containers, you’d probably have to buy your own dedicated machine to run your application.
You can’t run it on your personal machine, since your day-to-day tasks would interfere with the application (and vice versa).
You could consider running your application on your buddy’s powerful server that has some spare capacity. However, this comes with its own challenges.
Dependency Conflicts - Your friend’s server might already be running Redis/Postgres, but on older versions that are incompatible with your application.
Resource Contention - Maybe your friend’s application periodically spikes in CPU usage. You can’t allocate dedicated CPU/memory to each application, so their spike becomes your problem.
Security Issues - Your friend might have security vulnerabilities in their code. If their application gets hacked, your data could be leaked too, since both apps run on the same OS.
Unfortunately, your only option is to buy your own machine. Even worse, you’ll also have to plan for future user growth. If your user base doubles, it would be quite wasteful to have to throw out all the old hardware and buy a new, beefier server just to keep up.
In addition to the excessive spending, another issue you’ll deal with is portability.
Portability
Let’s say you want to send your web app to a QA tester for debugging. That tester will also need Postgres and Redis installed on their machine.
Before they can even run your application, they’ll have to set up all the dependencies, utilities and scripts that you’re using. They’ll also have to make sure the versioning is correct, environment variables are set, configure security policies, etc.
This process can be extremely tedious and error-prone. It leads to the classic “it works on my machine!” argument between developers and DevOps.
And in our example so far, we just have a single QA tester. What happens when you have to deploy this app across hundreds of web servers?
Manually installing the dependencies, config settings and application on each server is incredibly tedious and error-prone.
And whenever there are app updates, dependency upgrades or configuration changes, you’ll have to make sure they’re correctly rolled out across every server in your fleet.
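To make the pain concrete, here’s a rough sketch of what that manual rollout might look like as a script. Everything in it (hosts.txt, the package names, paths and version number) is hypothetical; the fragility is the point.

```bash
#!/usr/bin/env bash
# Hypothetical manual rollout: push the same dependencies and app build
# to every server in the fleet over SSH, one machine at a time.
set -euo pipefail

APP_VERSION="1.4.2"   # made-up version for illustration

while read -r host; do
  echo "deploying to $host"

  # install dependencies; versions can silently drift between servers
  ssh "$host" "sudo apt-get update -qq && sudo apt-get install -y postgresql redis-server"

  # copy the new build over and restart; a failure halfway through the
  # loop leaves the fleet half-upgraded
  scp "builds/app-$APP_VERSION.tar.gz" "$host:/opt/app/"
  ssh "$host" "cd /opt/app && tar xzf app-$APP_VERSION.tar.gz && sudo systemctl restart app"
done < hosts.txt
```

One flaky SSH connection, or one server that was set up slightly differently from the others, and you’re debugging snowflake machines by hand.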
VMs were created to address both of these problems.
Virtual Machines
Virtual Machines allow you to run multiple “virtual computers” on a single physical computer. Each of these virtual computers (virtual machines) will have its own operating system with virtualized hardware. In other words, each VM behaves as if it were a standalone physical computer with its own dedicated amount of CPU, RAM and disk storage.
Virtual machines work by relying on a software layer called a hypervisor. Hypervisors can be split into Type 1 hypervisors (which run directly on a bare-metal server) and Type 2 hypervisors (which run on top of a host operating system). All your virtual machines run on top of the hypervisor, which allocates the physical hardware resources across the VMs.
When you want to run your application with a hypervisor, you bundle a guest operating system along with your application code, dependencies, libraries, etc. into a VM image.
After you’ve created your VM image, you can back it up on AWS S3, distribute it across all the servers in your fleet, store it in an online repository and more (you can treat it just as any other file). When someone needs to run your application, they can just download the VM image and run it using a hypervisor.
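As a rough sketch of that workflow using open-source tooling (QEMU/KVM on Linux here; the disk size, ISO filename and bucket name are placeholders):

```bash
# create an empty 20 GB virtual disk in QEMU's copy-on-write qcow2 format
qemu-img create -f qcow2 app-server.qcow2 20G

# boot an installer ISO to set up the guest OS, your app and its
# dependencies inside the image; -m and -smp give the VM 2 GB of RAM
# and 2 virtual CPU cores
qemu-system-x86_64 -enable-kvm -m 2048 -smp 2 \
  -drive file=app-server.qcow2,format=qcow2 \
  -cdrom ubuntu-22.04-live-server-amd64.iso

# the finished image is just a file: back it up to S3, copy it to other
# hosts, and boot it anywhere a hypervisor is available
aws s3 cp app-server.qcow2 s3://my-vm-images/
```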
Prior to the 1990s, VMs were mostly used with mainframes or in academic research. The first virtual machines were created at IBM to help them with time sharing for their massive mainframe servers.
In 1998, VMware was founded, and they released VMware Workstation in 1999. This let you run virtual machines on the x86 architecture, the most common computing platform for personal and enterprise machines, which made running VMs far easier and more commonplace.
With Virtual Machines, you now have the following benefits
Cost Efficiency - multiple users can run their applications on a single physical server, each inside their own virtual machine. This lets you use the machine’s resources much more effectively.
Isolation & Security - VMs provide a high degree of isolation from each other. If an application in one VM gets hacked, the applications in the other VMs on the same host are still safe.
Flexibility & Portability - You can create VM images with everything you need to run your application. Another developer can download your VM image, run it and immediately get set up.
Virtualization exploded in popularity in the 2000s and helped spur the growth of cloud computing. If you wanted to rent a server in the cloud, you no longer had to rent the entire machine. Instead, you could rent just a small portion of the server’s computing resources. Amazon was the first cloud provider to take advantage of this with the launch of EC2 in 2006 (and it’s done rather well for them).
However, as VMs proliferated, some issues began to come up.
The main problem was how heavyweight and compute-intensive VMs were.
Each VM requires its own dedicated OS, which makes every VM
Resource Intensive - each VM occupies a large footprint on the underlying host machine
Maintenance-Heavy - you need to dedicate time to OS-level patching and management for every VM
Slow to Boot - booting up a VM can take tens of seconds or more
However, for many uses of VMs, you don’t actually need the entire OS. You could use something far more lightweight.
This is where containers come into play.
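You can feel the weight difference yourself. Assuming you have Docker installed, the one-liner below starts a fresh container from a tiny Linux image, runs a command and tears everything down. The first run also pulls the small alpine image; runs after that typically finish in well under a second, versus the tens of seconds a full VM takes to boot.

```bash
# start a throwaway container, run one command, then remove the
# container automatically when it exits
time docker run --rm alpine echo "hello from a container"
```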
This is the first part of our tech dive on Containers and Docker. We’ll be sending out the full article (2500+ words) to Quastor Pro subscribers tomorrow.
We’ll talk about containers and their pros/cons compared to VMs, plus Docker and the key concepts you should know (registries, Dockerfiles, Docker Engine, the Open Container Initiative, Podman, containerd and more).