Tech Dive on DNS

We'll talk about what DNS is, complexities it needs to handle, steps in a DNS lookup and more.

Hey Everyone!

Today we'll be talking about

  • Tech Dive - DNS

    • What is DNS and the complexities it needs to handle

    • Why should you care?

    • Steps in a DNS look up

    • DNS Propagation

    • Tips for debugging DNS issues

  • How to Make Faster Decisions

    • Mirela is an Engineering Manager at Reddit and she wrote a great blog post delving into how she makes decisions quickly

    • Visualize decisions as a tree where they break down into smaller sub-choices

    • Prune the tree to minimize choices and delegate branches to sub-teams

    • Watch out for any information silos

  • Tech Snippets

    • Skills the Best Engineers have in Common

    • A Ruby on Rails documentary by Honeypot

    • Advice on Properly Framing Messages

    • Switching to Elixir

Ever wonder how search engines like Google work? How do they analyze trillions of documents and quickly search for exact matches in a few hundred milliseconds?

If you’re curious, then Brilliant put out a fantastic course delving into building a Search Engine. 

Like all their other courses, it’s fully interactive with animations, hands-on graphics and detailed explanations on things like

  • Web crawlers and extracting relevant information

  • Building an index and querying it

  • Algorithms for dealing with phrases and wildcards

And more.

Brilliant is a learning platform that has a huge amount of CS, math, science and statistics content.

They structure their lessons in bite-sized pieces, making it ideal for those short 10 minute breaks that you might otherwise waste on Instagram or Twitter.

With the link below, you can get a 30-day free trial to check it out. You’ll also get a 20% discount when you subscribe.

sponsored

Tech Dive - DNS

When you navigate to twitter.com, your web browser will send an HTTP request to Twitter’s servers to download the latest content. It will then render that content and you can get your cortisol spike as you read about the most recent horrible thing that’s happened in the world.

In order to send this HTTP request, your browser will need the IP address of Twitter’s web server (or Twitter’s load balancer).

DNS (Domain Name System) is the technology that makes this possible. It serves as the phonebook of the Internet, mapping the human-readable domain name (www.twitter.com) to a machine-readable, up-to-date IP address (like 104.244.42.193).
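
To make the phonebook idea concrete, here’s a minimal Python sketch that asks your operating system’s resolver for the IPv4 addresses behind a name. The function name `resolve_ipv4` is our own invention for illustration; `socket.getaddrinfo` is the standard library call doing the actual work.

```python
import socket


def resolve_ipv4(hostname: str) -> list[str]:
    """Return the distinct IPv4 addresses the OS resolver reports for hostname."""
    # getaddrinfo hands the lookup to the operating system, which consults
    # its own cache and the DNS resolver it has been configured with.
    results = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in results:
        ip = sockaddr[0]  # sockaddr is an (ip, port) pair for IPv4
        if ip not in addresses:
            addresses.append(ip)
    return addresses
```

Calling `resolve_ipv4("www.twitter.com")` on a machine with network access should give you whatever addresses your resolver is currently handing out. The exact values depend on where you are and when you ask.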

Unfortunately, DNS is also significantly more intricate than your typical phonebook and has to deal with many more complexities.

For example

  • GeoDNS - dynamically route users to the nearest server. If you’re a user in India, then GeoDNS will make sure that you get the IP address of the server based in India.

  • Dynamic IP addresses - hosts will have IP addresses that change periodically. DNS needs to stay updated on these changes and propagate the new IP address to all the DNS servers around the world.

  • Web Servers Crash - servers will go down so DNS needs to have a failover mechanism and reroute traffic to healthy servers/load-balancers.

  • Domain Forwarding - websites will commonly have certain URLs forward to other parts of the website. They might implement this in their DNS records, so DNS has to account for this.

  • Security - hackers will try to take advantage of DNS for nefarious reasons. This can include DDoS attacks, spoofing (so users are directed to a malicious site instead of the real one), hijacking domains/subdomains and more.
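
As a rough illustration of the GeoDNS and failover points above, here’s a small Python sketch of how a DNS service might pick which IP address to hand back. Everything in it is assumed for the example: the `Server` class, the region codes, the fallback order and the IP addresses (taken from ranges reserved for documentation). Real providers use far more elaborate routing and health-checking.

```python
from dataclasses import dataclass


@dataclass
class Server:
    ip: str
    region: str
    healthy: bool = True  # in practice this would be set by periodic health checks


# Hypothetical server pool for one domain.
SERVERS = [
    Server("192.0.2.10", "in"),
    Server("198.51.100.20", "us"),
    Server("203.0.113.30", "uk"),
]


def pick_server(servers, client_region, fallback_order):
    """Prefer a healthy server in the client's region, otherwise fall back."""
    for region in [client_region, *fallback_order]:
        for server in servers:
            if server.region == region and server.healthy:
                return server.ip
    return None  # nothing healthy anywhere: the lookup has no good answer
```

A user in India would get `192.0.2.10` from `pick_server(SERVERS, "in", ["uk", "us"])`. If that server is marked unhealthy, the same call falls through to the UK server instead.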

In this article, we’ll delve into how DNS works, key components, propagation and potential failures. It will give you an idea of how DNS handles all these requirements and will hopefully come in handy if you’re unlucky enough to have to debug issues around DNS.

This is from our Tech Dive on DNS for Quastor Pro readers. For detailed tech dives on a huge host of topics in backend and data engineering, check out Quastor Pro.

Readers can typically expense Quastor Pro with their job’s Learning & Development budget. Here’s an email you can send to your manager.

Why should you care?

Here are a few reasons why having a high-level understanding of DNS can be useful to you.

  • Managing Internal DNS - You’ll need DNS to route users to your load balancers/servers. DNS systems like AWS Route 53 offer a ton of features like running health-checks (to determine which IP address to send traffic to), geoDNS (routing the user to the nearest server), private DNS (for internal servers) and more. You might also want to run your own authoritative nameservers for your domain (we’ll talk about what this means).

  • Source of Hairy Outages - If you have a bug in your application logic, then this might result in a server crashing or some small UI errors. That’s definitely not a great user-experience. However, a bug in your DNS configuration usually means the entire site will go down. So… quite a bit worse.


    DNS errors can be tricky to diagnose and debug; so much so that it’s become quite a meme.

    a beautiful Haiku on DNS issues


    Companies that have had prolonged outages from DNS errors include Notion, Square, Google, Slack, AWS, Microsoft, Akamai and many others.

    Debugging DNS issues can be hard; but it’s really hard if you have no idea how DNS works.


    DNS itself also has flaws. It had a really major security flaw that was patched in 2008. Dan Kaminsky was the white hat hacker who discovered and helped fix the issue. The flaw would’ve allowed attackers to impersonate websites… which leads us to the next point.

  • Security Vulnerabilities - As you might’ve guessed, DNS is a popular target for hackers. If a hacker is able to DNS spoof Twitter then they might replace it with a Twitter-clone that has a paywall in front of every post. Users might unwittingly think that this is just Elon’s latest ploy to get Twitter to stop hemorrhaging money… so they put in their credit card details and get scammed.

  • Block Ads - One of the key components of DNS that we’ll discuss is the resolver. If you run your own resolver, then you can actually configure it to block any domains for ad-networks. This can be extremely useful for devices that don’t let you install an ad-block extension. You might have a smart TV that shows you ads for some insane, shareholder-profit-maximizing reason; so blocking the domains that send ads to your TV can be done in your DNS resolver.
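
Here’s a hedged sketch of what the ad-blocking idea could look like inside a resolver. The blocklist entries, the `SINKHOLE_IP` value and the `upstream_lookup` callback are all assumptions for the example, not how any particular resolver implements it.

```python
# Hypothetical blocklist; real ones contain many thousands of ad/tracker domains.
BLOCKLIST = {"ads.example.com", "tracker.example.net"}

# Unroutable address returned for blocked names so the ad request goes nowhere.
SINKHOLE_IP = "0.0.0.0"


def is_blocked(domain, blocklist=BLOCKLIST):
    """True if the domain or any of its parent domains is on the blocklist."""
    labels = domain.lower().rstrip(".").split(".")
    # For "cdn.ads.example.com" this checks "cdn.ads.example.com",
    # "ads.example.com", "example.com" and "com".
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))


def answer(domain, upstream_lookup):
    """Return the sinkhole for blocked names, otherwise do the normal lookup."""
    if is_blocked(domain):
        return SINKHOLE_IP
    return upstream_lookup(domain)
```

Matching on parent domains means a single blocklist entry also covers its subdomains, which matters because ad networks tend to spread traffic across many of them.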

Steps in a DNS Lookup

We’ll start by going through the steps that happen when your browser does a DNS lookup.

  1. Input - You type www.netflix.com into your web browser.

  2. Browser Cache - The browser first checks its own cache to see if it already has the IP address. This value will have an associated TTL, so it expires after a set amount of time.

  3. Operating System - Common operating systems provide utilities for doing DNS lookups, so the browser will use that. Here’s a good article on the tooling that Linux provides for DNS lookups.

  4. DNS Resolver - The DNS Resolver is responsible for executing the lookup and finding the IP address. If you go to your WiFi settings and check the DNS section then you can see the IP address of the resolver.

    If the browser and OS weren’t able to get the IP address from their caches, then the query goes to the DNS resolver. The resolver checks its own cache first and, on a miss, handles the lookup by querying the DNS servers described in the next steps.

  5. DNS Servers - DNS servers are laid out in a hierarchy where you have root nameservers, TLD nameservers and authoritative nameservers.
    The resolver will first check the Root nameserver, then the TLD nameserver and finally the authoritative nameserver.

  6. Root Nameserver - These are the first step in finding the IP address and they maintain information on the Root Zone. The Root Zone lists which nameservers are responsible for each top level domain like .com, .net, .org, etc. There are 13 logical root servers and each of these is replicated across many physical servers around the world.
    The Root nameserver doesn’t know the IP address of the website you’re requesting. Instead, it returns information on which TLD nameserver to contact.

  7. TLD Nameserver - Top-Level Domain nameservers contain information on specific top-level domains like .com or .uk. TLDs can be generic (.com), sponsored (.edu, .gov, .mil), country code (.uk, .de, .jp), infrastructure-related (.arpa) and more.

    The TLD nameserver doesn’t store the IP address for your specific request either. Instead, it will direct the resolver to the Authoritative Nameserver for the website you’re requesting.

  8. Authoritative Nameserver - These are the definitive source for the IP addresses of a domain name, so they will always provide the correct location. The DNS provider you pick will determine which Authoritative nameservers you’re using. If you use Namecheap to host your domain, then you’ll be using Namecheap’s authoritative nameservers. You can also choose to run your own authoritative nameservers.


    GeoDNS plays an important role in selecting the right IP address based on a user’s location. You might want to watch Rick and Morty, so you use a UK-based VPN to access the Netflix website. GeoDNS will hand you the IP address of a Netflix server near the UK and you can enjoy breaking Netflix’s terms and conditions.


    GeoDNS operates on the level of the authoritative nameserver. The answer it returns then gets cached by the resolver (not by the Root or TLD nameservers) until its TTL expires.

  9. HTTP Request - Now, you have the IP address for Netflix’s server. Your browser and OS will cache that (with a TTL) and your browser will send an HTTP request asking for the homepage’s content. You can now proceed to waste the next 30 minutes flipping through thumbnails trying to figure out what you’re going to watch.
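
The resolver’s part of the steps above can be sketched as a toy simulation in Python. This is not a real DNS client: the three dictionaries stand in for the Root, TLD and Authoritative nameservers, and the server names, the IP address (from a range reserved for documentation) and the 60 second TTL are made up for the example.

```python
import time

# Toy stand-ins for the three layers of nameservers (all values hypothetical).
ROOT = {"com": "tld-com"}                                # TLD -> TLD nameserver
TLD_SERVERS = {"tld-com": {"netflix.com": "ns-netflix"}}  # domain -> authoritative NS
AUTHORITATIVE = {"ns-netflix": {"www.netflix.com": ("192.0.2.55", 60)}}  # name -> (ip, ttl)


class Resolver:
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._cache = {}           # name -> (ip, expires_at)
        self.upstream_queries = 0  # how many nameserver queries we had to make

    def resolve(self, name):
        # Serve from cache while the TTL has not expired.
        cached = self._cache.get(name)
        if cached and cached[1] > self._clock():
            return cached[0]

        labels = name.split(".")
        tld = labels[-1]                # e.g. "com"
        domain = ".".join(labels[-2:])  # e.g. "netflix.com"

        tld_server = ROOT[tld]                         # Root refers us to the TLD nameserver
        auth_server = TLD_SERVERS[tld_server][domain]  # TLD refers us to the authoritative one
        ip, ttl = AUTHORITATIVE[auth_server][name]     # authoritative gives the actual answer
        self.upstream_queries += 3

        self._cache[name] = (ip, self._clock() + ttl)
        return ip
```

The first `Resolver().resolve("www.netflix.com")` call walks all three layers. Repeat calls within the TTL are answered from the cache and `upstream_queries` stays the same, which is why most lookups never need to touch the Root or TLD nameservers.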

This is an excerpt from our Tech Dive on DNS for Quastor Pro readers. We also talk about DNS Propagation and debugging DNS issues.

If you were impressed by AlphaGo (DeepMind’s bot that mastered Go) or OpenAI’s Dota 2 bots, then you should take a look at Reinforcement Learning.

This is a branch of Machine Learning that delves into teaching agents how to behave in new environments using rewards.

If you’d like to learn more, Brilliant has an amazing course delving into Reinforcement Learning.

Like all the other Brilliant courses, it’s fully interactive with animations, hands-on graphics and detailed explanations on things like

  • Estimating Value Functions

  • Monte Carlo Methods applied to RL

  • Policy Gradient Methods

And more.

Brilliant is a learning platform that has a huge amount of math, data science, computer science and ML content. Their content is structured as bite-sized lessons with tons of interactive animations, graphics and more.

This makes it really easy to build a daily learning habit with the Brilliant app, instead of wasting time on Instagram or Twitter.

With the link below, you can get a 30-day free trial to check it out. You’ll also get a 20% discount when you subscribe.

sponsored

Tech Snippets

How to Make Faster Decisions from an Engineering Manager at Reddit

Many top CEOs/founders frequently tout the importance of making decisions quickly.

Making decisions too slowly can have several effects

  1. Slows Down Everyone Else - Planning involves multiple decisions, which can be interdependent. You can’t make a decision on which servers to buy before deciding on how much budget to allocate. One delayed decision can block many other teams/engineers. Non-blocking decisions will still waste resources like meeting time, team attention, etc.

  2. Unexpected Challenges - Progress is rarely linear. Unexpected blockers/challenges will frequently come up and reacting too slowly to these is a great way to massively miss deadlines. Having a quicker cadence for making decisions can help you stay on track when facing adversity.

Mirela Spasova is an Engineering Manager at Reddit and she wrote a fantastic blog post delving into decision making and the principles her team at Reddit employs.

Sequence

The first step she employs is to break up the decision into a tree of choices where each layer splits into progressively more detailed decisions.

Deciding whether to build a certain feature will have sub-decisions on strategy, design, product and more.

Prune

Constraints will rule out some of your options, which makes it easy to eliminate those paths. Removing them from the tree makes it clearer which options you actually have.

Delegation

To maximize efficiency, it’s crucial to delegate sets of decisions to sub-teams within your group. Ensure that each sub-team can make their decision relatively independently from other sub-teams. Requiring too much communication between the sub-teams can lead to a mess.

Information Silos Risk

One big risk with delegation is information silos, where one sub-team overlooks important considerations from the rest of the group because they don’t all have the same information.

To minimize this, Mirela makes sure there’s the opportunity for discussion in meetings. Sub-teams have the opportunity to give asynchronous feedback on how their task is going and this is brought up in meetings.

For the rest of Mirela’s strategy, please read the full blog post here.