Tom Strickx

Author Archives: Tom Strickx

Why BGP communities are better than AS-path prepends

The Internet, in its purest form, is a loosely connected graph of independent networks (also called Autonomous Systems, or AS for short). These networks use a signaling protocol called BGP (Border Gateway Protocol) to inform their neighbors (also known as peers) about the reachability of IP prefixes (groups of IP addresses) in and through their network. Part of this exchange is metadata about the IP prefix that is used to inform network routing decisions. One example of this metadata is the full AS-path, which lists the autonomous systems an IP packet needs to pass through to reach its destination.

As we all want our packets to get to their destination as fast as possible, selecting the shortest AS-path for a given prefix is a good idea. This is where something called prepending comes into play.
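To make the idea concrete, here is a minimal sketch of shortest-AS-path selection and of how prepending changes the outcome. It is a simplification: real BGP best-path selection weighs other attributes (such as local preference) before AS-path length. The `Route` class, the helper names, the ASNs (taken from the range reserved for documentation) and the prefix are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Route:
    """A hypothetical, simplified view of one BGP route for a prefix."""
    prefix: str
    as_path: Tuple[int, ...]  # leftmost = nearest neighbor, rightmost = origin


def prepend_origin(route: Route, times: int) -> Route:
    """Return a copy of the route whose origin AS repeats itself `times` extra times."""
    origin = route.as_path[-1]
    return Route(route.prefix, route.as_path + (origin,) * times)


def best_route(routes: List[Route]) -> Route:
    """Pick the route with the shortest AS-path (the only criterion modeled here)."""
    return min(routes, key=lambda r: len(r.as_path))


# Two ways to reach the same prefix, originated by AS64496.
via_a = Route("192.0.2.0/24", (64500, 64496))         # 2 hops
via_b = Route("192.0.2.0/24", (64501, 64502, 64496))  # 3 hops

print(best_route([via_a, via_b]).as_path)

# The origin prepends itself twice on the announcement sent through AS64500,
# which makes that path look longer and shifts the preference to the other one.
via_a_prepended = prepend_origin(via_a, times=2)
print(best_route([via_a_prepended, via_b]).as_path)
```

Prepending does not change where the packets can go; it only makes one announcement look less attractive to networks that compare AS-path lengths.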

Routing on the Internet, a primer

Let's briefly talk about how the Internet works at its most fundamental level, before we dive into some nitty-gritty details.

The Internet is, at its core, a massively interconnected network of thousands of networks. Each network owns two things that are critical:

1. An Autonomous System Number (ASN): a 32-bit integer that uniquely identifies a network. Continue reading

Cloudflare outage on June 21, 2022

Introduction

Today, June 21, 2022, Cloudflare suffered an outage that affected traffic in 19 of our data centers. Unfortunately, these 19 locations handle a significant proportion of our global traffic. This outage was caused by a change that was part of a long-running project to increase resilience in our busiest locations. A change to the network configuration in those locations caused an outage which started at 06:27 UTC. At 06:58 UTC the first data center was brought back online and by 07:42 UTC all data centers were online and working correctly.

Depending on your location in the world you may have been unable to access websites and services that rely on Cloudflare. In other locations, Cloudflare continued to operate normally.

We are very sorry for this outage. This was our error and not the result of an attack or malicious activity.

Background

Over the last 18 months, Cloudflare has been working to convert all of our busiest locations to a more flexible and resilient architecture. In this time, we’ve converted 19 of our data centers to this architecture, internally called Multi-Colo PoP (MCP): Amsterdam, Atlanta, Ashburn, Chicago, Frankfurt, London, Los Angeles, Madrid, Manchester, Miami, Milan, Mumbai, Newark, Osaka, São Paulo, Continue reading

ASICs at the Edge

At Cloudflare we pride ourselves on our global network, which spans more than 200 cities in over 100 countries. To handle all the traffic passing through our network, there are multiple technologies at play. So let’s have a look at one of the cornerstones that makes all of this work… ASICs. No, not the running shoes.

What's an ASIC?

ASIC stands for Application Specific Integrated Circuit. As the name suggests, it's a chip with a very narrow use case, geared towards a single application. This is in stark contrast to a CPU (Central Processing Unit), or even a GPU (Graphics Processing Unit). A CPU is designed and built for general purpose computation, and does a lot of things reasonably well. A GPU is more geared towards graphics (it's in the name), but in the last 15 years, there's been a drastic shift towards GPGPU (General Purpose GPU), in which technologies such as CUDA or OpenCL allow you to use the highly parallel nature of the GPU to do general purpose computing. Good examples of GPU use are video encoding and, more recently, computer vision, used in applications such as self-driving cars.

Unlike CPUs or GPUs, ASICs are built Continue reading

How Verizon and a BGP Optimizer Knocked Large Parts of the Internet Offline Today

Massive route leak impacts major parts of the Internet, including Cloudflare

What happened?

Today at 10:30 UTC, the Internet had a small heart attack. A small company in Northern Pennsylvania became a preferred path for many Internet routes through Verizon (AS701), a major Internet transit provider. This was the equivalent of Waze routing an entire freeway down a neighborhood street, leaving many websites on Cloudflare, and many other providers, unavailable from large parts of the Internet. This should never have happened because Verizon should never have forwarded those routes to the rest of the Internet. To understand why, read on.

We have blogged about these unfortunate events in the past, as they are not uncommon. This time, the damage was seen worldwide. What exacerbated the problem today was the involvement of a “BGP Optimizer” product from Noction. This product has a feature that splits up received IP prefixes into smaller, contributing parts (called more-specifics). For example, our own IPv4 route 104.20.0.0/20 was turned into 104.20.0.0/21 and 104.20.8.0/21. It’s as if the road sign directing traffic to “Pennsylvania” was replaced by two road signs, one for “Pittsburgh, PA” and Continue reading
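The prefix split described above can be reproduced with Python's standard `ipaddress` module. This is only a sketch of the arithmetic, not of how the Noction product works internally; the `more_specifics` helper and its `extra_bits` parameter are names invented for this example.

```python
import ipaddress
from typing import List


def more_specifics(prefix: str, extra_bits: int = 1) -> List[str]:
    """Split a prefix into equally sized more-specifics.

    Each extra bit of prefix length doubles the number of resulting prefixes.
    """
    network = ipaddress.ip_network(prefix)
    return [str(subnet) for subnet in network.subnets(prefixlen_diff=extra_bits)]


# The /20 from the example above splits into its two /21 halves.
print(more_specifics("104.20.0.0/20"))
# ['104.20.0.0/21', '104.20.8.0/21']
```

Together the two /21 prefixes cover exactly the same addresses as the original /20.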

How Verizon and a BGP Optimizer Affected Many Parts of the Internet Today

A massive route leak impacted many parts of the Internet, including Cloudflare

What happened?

Today at 10:30 UTC, the Internet had something of a mini heart attack. A small company in northern Pennsylvania became the preferred path for many Internet routes because of Verizon (AS701), a major Internet transit provider. It was a bit like Waze directing the traffic of an entire freeway down a small neighborhood street: many websites on Cloudflare, and many other providers, were unavailable from a large part of the network. This incident should never have happened, because Verizon should never have forwarded those routes to the rest of the Internet. To understand why, read on.

We have written a number of articles in the past about these unfortunate events, which are more frequent than one might think. This time, the effects could be observed worldwide. Today, the problem was made worse by the involvement of a “BGP Optimizer” product from Noction. This product has a feature that splits received IP prefixes into smaller contributing parts (called “ Continue reading
