How Cloudflare Mitigates Thousands of DDoS Attacks Every Hour

We'll talk about the different types of DDoS attacks and how Cloudflare prevents them. Plus, resources on public speaking for software engineers, things we learned about LLMs in 2024 and more.

January 06, 2025

Hey Everyone!

Today we’ll be talking about

How Cloudflare Defines, Measures and Stops DDoS Attacks
- DDoS Attacks Explained (Volumetric, Application layer and Protocol layer attacks)
- How to Measure DDoS Attacks
- Steps for Protecting from DDoS Attacks
Tech Snippets
- Resources on Public Speaking for Software Engineers
- Things we Learned about LLMs in 2024
- How to Track Engineering Time
- Static Search Trees and how they can be 40x Faster than Binary Search

The Complete Guide to OAuth 2.0

OAuth 2.0 is the industry standard for authorization. It lets you grant a third-party application access to data on your Google/Meta/Dropbox account without sharing your account’s password.

If you’re building an app and want to add a “Sign in with Google” button, then you’ll need to understand how OAuth works.

Recently, WorkOS published a fantastic guide on OAuth, covering everything you need to know to implement it.

The guide covers

Roles and Terminology: Explains the OAuth jargon like Resource Owner, Authorization Server, Refresh Token, etc.
Tokens and Credentials: Understand the different types of tokens, like Access Tokens, Refresh Tokens, Authorization Codes and more
OAuth Flows: Dive into the various OAuth flows, including Authorization Code Grant, PKCE, and Client Credentials, and learn when to use each one based on your application needs

Read the full guide to learn more about OAuth 2.0 and how to implement it with WorkOS.

How Cloudflare Defines, Measures and Stops DDoS Attacks

Cloudflare is one of the largest internet infrastructure companies in the world, providing CDN services, DDoS protection, DNS management and more to millions of websites. They operate a massive global network with data centers in over 330 cities worldwide, handling trillions of requests per day.

One of Cloudflare's core offerings is protection against Distributed Denial of Service (DDoS) attacks. Unfortunately, these malicious attacks are a constant threat to websites. In the first three quarters of 2024 alone, Cloudflare mitigated nearly 14.5 million DDoS attacks (an average of about 2,200 DDoS attacks per hour).

Recently, the Cloudflare engineering team published an awesome blog post delving into the evolution of DDoS attacks over the past decade. We’ll be summarizing the post and adding some extra context on DDoS attacks.

DDoS Attacks Explained

With a Distributed Denial of Service Attack, a hacker will use many geographically distributed machines to send traffic to a website. These machines usually belong to unsuspecting users and have been infected with malware to make them part of the attacker’s botnet.

The goal is to overwhelm the target’s backend with traffic, so the site can no longer serve legitimate users. The attacker might then request a ransom from the company, promising to stop the DDoS attack if the company pays up.

DDoS Attacks can roughly be split into 3 main types: Volumetric, Application layer and Protocol attacks.

Volumetric Attacks

These attacks are based on brute force techniques where the target server is flooded with data packets to consume bandwidth and server resources.

Volumetric attacks will frequently rely on amplification and reflection.

Amplification is where a small request in a certain protocol results in a much larger response (in terms of the number of bytes). The ratio of the response size to the request size is called the Amplification Factor.

Reflection is where the attacker will spoof the source of request packets to be the target victim’s IP address. Servers will be unable to distinguish legitimate requests from spoofed ones so they’ll send the (much larger) response payload to the targeted victim’s servers and unintentionally flood them.

Network Time Protocol (NTP) DDoS attacks are an example of volumetric attacks where you can send a 234-byte spoofed request to an NTP server, which will then send a 48,000-byte response to the target victim (an amplification factor of roughly 205x). Attackers will repeat this on many different open NTP servers simultaneously to DDoS the victim with all the NTP responses.
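To make the arithmetic concrete, here's a minimal Python sketch that computes the amplification factor for the NTP numbers above. The function name and the 10 Mbps attacker bandwidth are assumptions chosen purely for illustration.

```python
def amplification_factor(request_bytes: int, response_bytes: int) -> float:
    """Return how many response bytes are generated per request byte."""
    if request_bytes <= 0:
        raise ValueError("request_bytes must be positive")
    return response_bytes / request_bytes


# Request and response sizes from the NTP example above.
ntp_factor = amplification_factor(request_bytes=234, response_bytes=48_000)
print(f"NTP amplification factor: {ntp_factor:.0f}x")

# Hypothetical attacker sending 10 Mbps of spoofed requests.
attacker_mbps = 10
reflected_mbps = attacker_mbps * ntp_factor
print(f"Traffic arriving at the victim: {reflected_mbps:.0f} Mbps")
```

In other words, a modest amount of attacker bandwidth turns into roughly 2 Gbps at the victim, which is why open servers for amplifying protocols are so attractive to attackers.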

Application Layer Attacks

These DDoS attacks target the “top” layer in the OSI model - the application layer. Attackers might flood the backend with HTTP requests, exploit expensive API endpoints, create many SSL/TLS handshakes, etc.

Database DDoS attacks are quite common, where a hacker will look for requests that are particularly database-intensive and then spam those in an attempt to exhaust the database resources. Scaling your database through read replicas takes time, so this attack can be pretty successful.

HTTP Floods are some of the most widely seen layer 7 DDoS attacks, where hackers will spam a web server with HTTP GET/POST requests. Sophisticated hackers will specifically design these to request resources with low usage in order to maximize the number of cache misses the web server has.
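Here's a toy simulation of why requesting rarely used resources hurts so much. It models the cache as a small LRU and compares the hit rate for normal traffic against flood traffic. The cache size, URL paths and traffic mix are all made up for illustration; real caches and real traffic are far messier.

```python
from collections import OrderedDict


class LruCache:
    """Tiny LRU cache that only tracks which keys are currently cached."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items: OrderedDict = OrderedDict()

    def get(self, key: str) -> bool:
        """Return True on a cache hit. On a miss, 'fetch' the key and cache it."""
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            return True
        self._items[key] = True  # simulate a trip to the origin server
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used key
        return False


def hit_rate(cache: LruCache, paths: list) -> float:
    hits = sum(cache.get(path) for path in paths)
    return hits / len(paths)


# Normal traffic: users keep asking for the same 10 popular pages.
normal_paths = [f"/product/{i % 10}" for i in range(1000)]
# Flood traffic: every request targets a different, rarely used resource.
attack_paths = [f"/search?q=term-{i}" for i in range(1000)]

normal_rate = hit_rate(LruCache(capacity=100), normal_paths)
attack_rate = hit_rate(LruCache(capacity=100), attack_paths)
print(f"normal hit rate: {normal_rate:.2f}")
print(f"attack hit rate: {attack_rate:.2f}")
```

With the normal traffic, only the first request for each page misses. With the flood, every single request falls through to the origin server.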

Protocol Layer Attacks

Protocol attacks will rely on weaknesses in how particular protocols are designed. Examples of these kinds of exploits are SYN floods, BGP hijacking, Smurf attacks and more.

A SYN flood attack exploits how TCP is designed, specifically the handshake process. The three-way handshake consists of SYN -> SYN-ACK -> ACK, where the client sends a synchronize (SYN) message to initiate, the server responds with a synchronize-acknowledge (SYN-ACK) message and the client then responds back with an acknowledgement (ACK) message.

In a SYN flood attack, a malicious client will send large volumes of SYN messages to the server, which will respond with SYN-ACK. The client ignores these and never sends the final ACK. The server wastes resources (entries in its queue of half-open connections) waiting for ACKs that never arrive. If repeated at a large scale, this can bring the web server down, since the queue fills up and the server has no way of knowing which pending connections are legitimate.
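The effect is easy to see in a toy model. The sketch below is not real TCP. It's a simplified listener with a fixed-size queue of half-open connections, and the class name, backlog size and IP addresses are assumptions for illustration.

```python
from __future__ import annotations


class ToyTcpListener:
    """Toy model of a server's half-open connection queue (not real TCP)."""

    def __init__(self, backlog: int) -> None:
        self.backlog = backlog
        self.half_open: set[tuple[str, int]] = set()
        self.established: set[tuple[str, int]] = set()

    def on_syn(self, client: tuple[str, int]) -> bool:
        """Handle a SYN. Return True if a SYN-ACK is sent, False if dropped."""
        if len(self.half_open) >= self.backlog:
            return False  # queue is full, so the connection attempt is dropped
        self.half_open.add(client)
        return True

    def on_ack(self, client: tuple[str, int]) -> bool:
        """Handle the final ACK. Return True if the handshake completed."""
        if client not in self.half_open:
            return False
        self.half_open.remove(client)
        self.established.add(client)
        return True


server = ToyTcpListener(backlog=128)

# A legitimate client completes the full SYN -> SYN-ACK -> ACK handshake.
early_client = ("198.51.100.7", 50000)
server.on_syn(early_client)
server.on_ack(early_client)

# The attacker sends 1,000 SYNs and never follows up with an ACK.
for port in range(1000):
    server.on_syn(("203.0.113.1", port))

# The queue is now full of half-open entries, so a new client is turned away.
late_client = ("198.51.100.8", 50001)
late_client_accepted = server.on_syn(late_client)
print(late_client_accepted)  # prints False
```

Real TCP stacks time out half-open entries and often use defenses like SYN cookies, which this sketch leaves out to keep the core problem visible.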

How to Measure DDoS Attacks

First of all, defining an individual DDoS attack can be surprisingly difficult. It is not just a one-time spike in requests. A DDoS attack can last for several hours/days and consist of many smaller incidents (also known as pulses).

Cloudflare analyzes a combination of factors to create a “fingerprint” that helps identify different incidents as part of the same DDoS attack.

Some factors Cloudflare looks at are:

Attack Vectors: Are the same attack vectors being used across the set of events?
Targets: Are all the attacks focused on the same target website/entity?
Payload Signatures: Do the payloads share anything that could tie them to a particular botnet?
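Here's a rough sketch of what grouping pulses by fingerprint could look like. To be clear, this is not Cloudflare's actual algorithm. The Pulse fields, the one-hour gap threshold and the sample data are all assumptions made up to illustrate the idea using the three factors listed above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Pulse:
    start: int  # seconds since an arbitrary reference point
    end: int
    vector: str
    target: str
    payload_signature: str


def group_pulses(pulses, max_gap_seconds=3600):
    """Group pulses that share a fingerprint and occur close together in time."""
    attacks_by_fingerprint = defaultdict(list)
    for pulse in sorted(pulses, key=lambda p: p.start):
        fingerprint = (pulse.vector, pulse.target, pulse.payload_signature)
        attacks = attacks_by_fingerprint[fingerprint]
        if attacks and pulse.start - attacks[-1][-1].end <= max_gap_seconds:
            attacks[-1].append(pulse)  # continue the most recent attack
        else:
            attacks.append([pulse])  # start a new attack
    return [a for attacks in attacks_by_fingerprint.values() for a in attacks]


pulses = [
    Pulse(0, 60, "syn_flood", "example.com", "sig-a"),
    Pulse(100, 200, "http_flood", "example.com", "sig-b"),
    Pulse(600, 660, "syn_flood", "example.com", "sig-a"),
    Pulse(40_000, 40_060, "syn_flood", "example.com", "sig-a"),
]
attacks = group_pulses(pulses)
print(len(attacks))  # prints 3
```

The first and third pulses share a fingerprint and are only nine minutes apart, so they count as one attack. The last pulse has the same fingerprint but arrives about 11 hours later, so this sketch treats it as a separate attack.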

Once they have gotten a rough sense of which pulses are part of the attack, Cloudflare can measure how large the DDoS was.

The main metrics they use are:

Bits per Second (BPS): Measures the total data transferred per second. This is useful for evaluating network-layer attacks that aim to saturate bandwidth like UDP floods.
Requests per Second (RPS): Measures the number of protocol requests made each second. This is useful for application-layer attacks (Layer 7).
Packets per Second (PPS): Represents the number of individual packets sent to the target per second, regardless of size. This is critical for network-layer attacks (Layers 3 and 4) like SYN floods.
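To see why you need more than one metric, here's a small sketch that computes all three for two made-up one-second traffic samples. The Packet fields and the packet sizes and counts are illustrative assumptions, not measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    size_bytes: int
    starts_request: bool = False  # True if this packet begins an application-layer request


def attack_rates(packets, window_seconds: float) -> dict:
    """Compute bits, packets and requests per second over a capture window."""
    total_bits = sum(p.size_bytes * 8 for p in packets)
    total_requests = sum(p.starts_request for p in packets)
    return {
        "bps": total_bits / window_seconds,
        "pps": len(packets) / window_seconds,
        "rps": total_requests / window_seconds,
    }


# Hypothetical UDP flood: fewer packets, but each one is large.
udp_flood = [Packet(size_bytes=1400) for _ in range(1000)]
# Hypothetical SYN flood: many tiny packets.
syn_flood = [Packet(size_bytes=60) for _ in range(20_000)]

udp_rates = attack_rates(udp_flood, window_seconds=1.0)
syn_rates = attack_rates(syn_flood, window_seconds=1.0)
print(udp_rates)
print(syn_rates)
```

In this example the SYN flood moves fewer bits per second than the UDP flood but sends 20x as many packets, so it would look small if you only measured BPS.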

How to Protect from DDoS Attacks

Unfortunately, there is no single solution to protecting your service from a DDoS attack. Large companies use many different approaches. Some of them include:

Rate Limiting: This is the first line of defense. You set thresholds on the number of requests your server will accept from a single IP address within a given time frame.
Caching and CDNs: You can significantly reduce the load on your web server by caching content and using a Content Delivery Network (CDN). CDNs will distribute your files across multiple servers globally, so that the impact of a DDoS attack is spread out.
Reducing the Attack Surface: Minimize the number of services exposed to the public internet. If you have an endpoint with expensive operations, then protect it through authentication (by requiring an API key).
Monitoring: You should continuously monitor network traffic to detect anomalies and potential attacks. First, establish a baseline for normal traffic patterns. That way, you can quickly identify unusual spikes or patterns that indicate an attack.
Machine Learning: Services like Cloudflare use machine learning algorithms to identify suspicious traffic patterns in real time. They wrote an interesting blog post on the ML models they use here.
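As an example of the first item on the list, here's a minimal sketch of a per-IP sliding window rate limiter. The class name, the limits and the IP addresses are assumptions for illustration. A production setup would typically enforce this at the load balancer or CDN and keep the counters in shared storage rather than in one process's memory.

```python
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Allow at most max_requests per IP within any window_seconds span."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip: str, now: float) -> bool:
        hits = self._hits[ip]
        # Forget requests that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the threshold, so reject this request
        hits.append(now)
        return True


limiter = SlidingWindowRateLimiter(max_requests=5, window_seconds=1.0)

# One IP sends 8 requests within 0.7 seconds: the first 5 pass, the rest are rejected.
burst = [limiter.allow("203.0.113.9", now=0.1 * i) for i in range(8)]
print(burst)

# A different IP is unaffected, and the noisy IP recovers once the window slides.
other_ip_allowed = limiter.allow("198.51.100.7", now=0.7)
recovered = limiter.allow("203.0.113.9", now=1.0)
print(other_ip_allowed, recovered)
```

The timestamp is passed in explicitly to keep the example deterministic. In a real service you'd pass something like time.monotonic(). Keep in mind that per-IP limits help less against a large botnet, where each machine can stay under the threshold, which is why this is only one item on the list.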

Tech Snippets

Things we learned about LLMs in 2024

2024 brought a ton of rapid advancements in LLMs. Multiple models were released that surpassed GPT-4’s capabilities, and the competition drove LLM prices down dramatically. Multimodal models also became prevalent, with capabilities extending to text, image, audio and video.

However, there are still a ton of challenges around creating reliable evals, the environmental impact, agents and more.

This is a great post that summarizes the year for LLMs.

simonwillison.net/2024/Dec/31/llms-in-2024

How to Track Engineering Time

This article presents a useful playbook for tracking engineering time. It splits time spent into features, bugs/debt and toil.

Analyzing ratios like the time spent on features vs. bugs/debt can be very helpful for making informed decisions about resource allocation and project prioritization.

jacobian.org/2024/feb/7/tracking-engineering-time

Resources on Public Speaking for Software Engineers

The “awesome-speaking” GitHub repository is a fantastic resource for anyone looking to improve their public speaking skills, especially in the software engineering domain.

It contains links to blog posts, books and public speaking organizations. The content ranges from storytelling techniques to tips on how to handle your nerves. It’s a great repo if you want to become a more confident speaker.

github.com/matteofigus/awesome-speaking#readme

Static Search Trees and how they can be 40x faster than Binary Search

Binary Search can be surprisingly inefficient for large datasets due to poor cache utilization. Ragnar Groot Koerkamp tackled this problem by optimizing S+ trees for high-throughput searching.

He wrote a terrific article talking about his optimizations and how he was able to achieve a 40x speedup over binary search.

curiouscoding.nl/posts/static-search-tree