Archive

Category Archives for "Networking"

Shutdown season: the Q2 2025 Internet disruption summary

Cloudflare’s network currently spans more than 330 cities in over 125 countries, and we interconnect with over 13,000 network providers in order to provide a broad range of services to millions of customers. The breadth of both our network and our customer base provides us with a unique perspective on Internet resilience, enabling us to observe the impact of Internet disruptions at both a local and national level, as well as at a network level.

As we have noted in the past, this post is intended as a summary overview of observed and confirmed disruptions, and is not an exhaustive or complete list of issues that have occurred during the quarter. A larger list of detected traffic anomalies is available in the Cloudflare Radar Outage Center. Note that both bytes-based and request-based traffic graphs are used within the post to illustrate the impact of the observed disruptions — the choice of metric was generally made based on which better illustrated the impact of the disruption.

In our Q1 2025 summary post, we noted that we had not observed any government-directed Internet shutdowns during the quarter. Unfortunately, that forward progress was short-lived — in the second quarter of 2025, we Continue reading

NB535: Tomahawk Ultra Chops Congestion; Denmark Invests for Quantum Advantages

Take a Network Break! We begin with a listener question about a paper critiquing Shor’s Algorithm and quantum computing, and touch on a remote code execution vulnerability in Riverbed SteelCentral NetProfiler / NetExpress 10.8.7. We discuss a Cloudflare BGP misconfiguration that caused the Internet to hiccup, Broadcom’s new Tomahawk Ultra ASIC aimed at (you guessed it) AI... Read more »

Always Check Your Tests Against Faulty Inputs

A while ago, I published a blog post proudly describing the netlab integration test that should check for incorrect OSPF network types in netlab-generated device configurations. Almost immediately, Erik Auerswald pointed out that my test wouldn’t detect that error (it might detect other errors, though) as the OSPF network adjacency is always established even when the adjacent routers have mismatching OSPF network types.

I made one of the oldest testing mistakes: I checked whether my test would work under the correct conditions but not whether it would detect an incorrect condition.
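
The idea can be sketched in a few lines of Python. This is a minimal illustration, not netlab's actual test code: the dictionary layout, the function names, and the `adjacency` / `network_type` keys are all hypothetical. It shows a check being validated against both a correct input and a deliberately faulty one.

```python
def check_ospf_network_type(interface: dict, expected: str) -> bool:
    """Hypothetical check: pass only if the reported network type matches."""
    return interface.get("network_type") == expected


def adjacency_only_check(interface: dict, expected: str) -> bool:
    """A flawed check: it looks only at adjacency state and ignores `expected`."""
    return interface.get("adjacency") == "up"


def validate_check(check, good_input: dict, bad_input: dict, expected: str) -> bool:
    """A check is trustworthy only if it passes good input AND fails bad input."""
    passes_good = check(good_input, expected)
    catches_bad = not check(bad_input, expected)
    return passes_good and catches_bad


# Made-up device state: the adjacency is up in both cases,
# but the second interface has the wrong network type.
good = {"adjacency": "up", "network_type": "point-to-point"}
bad = {"adjacency": "up", "network_type": "broadcast"}

sound = validate_check(check_ospf_network_type, good, bad, "point-to-point")
flawed = validate_check(adjacency_only_check, good, bad, "point-to-point")
print(sound, flawed)
```

Both checks pass on the good input; only the faulty input reveals that the adjacency-based check cannot detect the mismatch.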

How the Free Software Foundation Battles the LLM Bots

Ian Kelling points out that the infrastructure for the Free Software Foundation “has been under attack since August 2024.” “Nothing has changed since the article,” an FSF sysadmin said. A report from LibreNews notes similar issues at high-profile FOSS sites including the Fedora project, KDE GitLab infrastructure, the GNOME GitLab instance, Diaspora, and even the FOSS news site Linux Weekly News. (And “GNOME has been experiencing issues since last November…”) Articles like the FSF’s are a way of sharing “techniques and tools”, McMahon said Tuesday. Though he adds that some system administrators also have a private mailing list “where we can coordinate and share effective strategies. The specific mitigations often cannot be published because that would give our attackers an advantage.” There’s a lot to learn from the FSF’s battle against the bots: about the tactics of sysadmins, but also about Continue reading

Immich Setup with Docker & External Library (NFS)

Recently, I started self-hosting most of the apps I use, like Memos for note-taking and Paperless-NGX for document management. The next one on the list was Immich. Immich is a self-hosted photo and video backup solution that supports features like facial recognition and automatic uploads.

In this post, we’ll look at how to set up Immich as a Docker container and also how to add an NFS share as an external library.

But, Why?

I have a lot of pictures on my NAS that I’ve collected over the years. This includes photos of friends, family, and ones from my older phones. I wanted a way to manage and organise them from one place. I also didn’t want to upload all of them to Google or Apple, which would cost quite a bit. Continue reading

HN788: Behind Megaport’s Network Automation Platform (Sponsored)

We have a network automation discussion for you today from sponsor Megaport. At the AutoCon3 conference earlier this year, Luke Gollan presented on a complex network automation project migrating Megaport’s API-driven software defined network from a legacy VXC overlay to an EVPN framework. This helped improve scalability, but was fraught with practical challenges, not the... Read more »

Cisco IOS/XE Hates Redistributed Static IPv6 Routes

Writing tests that check the correctness of network device configurations is hard (overview, more details). It’s also an interesting exercise in getting the timing just right:

  • Routing protocols are an eventually-consistent distributed system, and things eventually appear in the right place (if you got the configurations right), but you never know when exactly that will happen.
  • You can therefore set some reasonable upper bounds on when things should happen, and declare failure if the timeouts are exceeded. Even then, you’ll get false positives (as in: the test is telling you the configurations are incorrect, when it’s just a device having a bad hair day).
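
The "poll until a deadline" approach described above can be sketched as follows. This is an illustrative helper under stated assumptions, not code from netlab: `wait_until`, `route_present`, and the timeout values are hypothetical, and the device query is simulated with a counter.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout: float, interval: float = 0.1) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True   # the expected state eventually appeared
        if time.monotonic() >= deadline:
            return False  # upper bound exceeded: declare failure
        time.sleep(interval)


# Simulated device: the route only shows up on the third poll.
polls = {"count": 0}


def route_present() -> bool:
    polls["count"] += 1
    return polls["count"] >= 3


converged = wait_until(route_present, timeout=2.0, interval=0.01)
timed_out = wait_until(lambda: False, timeout=0.05, interval=0.01)
print(converged, timed_out)
```

Note that a `False` result only says the condition was not seen in time. It cannot tell a wrong configuration from a slow device, which is exactly the false-positive problem mentioned above.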

And just when you think you nailed it, you encounter a device that blows your assumptions out of the water.

IPB179: IPv6 DNS Gotchas

Let’s talk about common misconceptions regarding DNS and IPv6. We’ve heard these often enough that we felt we should talk through each one. We cover issues including what kind of DNS record types can be returned via IPv6 (and IPv4, too), more details on what really goes on with Happy Eyeballs, and combining A/AAAA records... Read more »

Quicksilver v2: evolution of a globally distributed key-value store (Part 2)

What is Quicksilver?

Cloudflare has servers in 330 cities spread across 125+ countries. All of these servers run Quicksilver, which is a key-value database that contains important configuration information for many of our services, and is queried for all requests that hit the Cloudflare network.

Because it is used while handling requests, Quicksilver is designed to be very fast; it currently responds to 90% of requests in less than 1 ms and 99.9% of requests in less than 7 ms. Most requests are only for a few keys, but some are for hundreds or even more keys.
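
To make the percentile figures concrete, here is a small sketch of how such numbers can be derived from raw latency samples using the nearest-rank method. The sample data is synthetic and chosen purely for illustration; it is not Cloudflare's measurement data, and the function name and per-thousand parameter are assumptions made for this example.

```python
def percentile_nearest_rank(samples: list[float], per_thousand: int) -> float:
    """Nearest-rank percentile; `per_thousand` is 900 for p90, 999 for p99.9.

    Integer arithmetic avoids floating-point rounding when computing the rank.
    """
    ordered = sorted(samples)
    rank = -(-per_thousand * len(ordered) // 1000)  # ceiling division
    return ordered[max(rank, 1) - 1]


# Synthetic latencies in milliseconds: mostly fast, with a small slow tail.
latencies_ms = [0.5] * 900 + [3.0] * 99 + [12.0]

p90 = percentile_nearest_rank(latencies_ms, 900)
p99_9 = percentile_nearest_rank(latencies_ms, 999)
print(p90, p99_9)
```

With this made-up distribution, the p90 value stays under 1 ms and the p99.9 value stays under 7 ms even though the single slowest sample is far above both.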

Quicksilver currently contains over five billion key-value pairs with a combined size of 1.6 TB, and it serves over three billion keys per second, worldwide. Keeping Quicksilver fast provides some unique challenges, given that our dataset is always growing, and new use cases are added regularly.

Quicksilver used to store all key-values on all servers everywhere, but there is obviously a limit to how much disk space can be used on every single server. For instance, the more disk space used by Quicksilver, the less disk space is left for content caching. Also, with each added server that contains a particular Continue reading

Explore your Cloudflare data with Python notebooks, powered by marimo

Many developers, data scientists, and researchers do much of their work in Python notebooks: they’ve been the de facto standard for data science and sharing for well over a decade. Notebooks are popular because they make it easy to code, explore data, prototype ideas, and share results. We use them heavily at Cloudflare, and we’re seeing more and more developers use notebooks to work with data, from analyzing trends in HTTP traffic and querying Workers Analytics Engine through to querying their own Iceberg tables stored in R2.

Traditional notebooks are incredibly powerful — but they were not built with collaboration, reproducibility, or deployment as data apps in mind. As usage grows across teams and workflows, these limitations face the reality of work at scale.

marimo reimagines the notebook experience with these challenges in mind. It’s an open-source reactive Python notebook that’s built to be reproducible, easy to track in Git, executable as a standalone script, and deployable. We have partnered with the marimo team to bring this streamlined, production-friendly experience to Cloudflare developers. Spend less time wrestling with tools and more time exploring your data.

Today, we’re excited to announce three things:

Demystifying Ultra Ethernet

The Ultra Ethernet Consortium (UEC), of which Arista is a founding member, is a standards organisation established to enhance Ethernet for the demanding requirements of Artificial Intelligence (AI) and High-Performance Computing (HPC). Over 100 member companies and 1000 participants have collaborated to evolve Ethernet, leading to the recent publication of its 1.0 specification, which will drive hardware implementations that significantly boost cluster performance.

Cloudflare 1.1.1.1 incident on July 14, 2025

On 14 July 2025, Cloudflare made a change to our service topologies that caused an outage for 1.1.1.1 on the edge, resulting in downtime for 62 minutes for customers using the 1.1.1.1 public DNS Resolver as well as intermittent degradation of service for Gateway DNS.

Cloudflare's 1.1.1.1 Resolver service became unavailable to the Internet starting at 21:52 UTC and ending at 22:54 UTC. The majority of 1.1.1.1 users globally were affected. For many users, not being able to resolve names using the 1.1.1.1 Resolver meant that basically all Internet services were unavailable. This outage can be observed on Cloudflare Radar.
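
As a quick sanity check, the stated duration follows directly from the two timestamps. The snippet below simply redoes that arithmetic; the variable names are ours.

```python
from datetime import datetime, timezone

# Start and end of the outage as given above (14 July 2025, UTC).
start = datetime(2025, 7, 14, 21, 52, tzinfo=timezone.utc)
end = datetime(2025, 7, 14, 22, 54, tzinfo=timezone.utc)

outage_minutes = (end - start).total_seconds() / 60
print(outage_minutes)
```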

The outage occurred because of a misconfiguration of legacy systems used to maintain the infrastructure that advertises Cloudflare’s IP addresses to the Internet.

This was a global outage. During the outage, Cloudflare's 1.1.1.1 Resolver was unavailable worldwide.

We’re very sorry for this outage. The root cause was an internal configuration error and not the result of an attack or a BGP hijack. In this blog, we’re going to talk about what the failure was, why it occurred, and what we’re doing to Continue reading

Cloudflare recognized as a Visionary in 2025 Gartner® Magic Quadrant™ for SASE Platforms

We are thrilled to announce that Cloudflare has been named a Visionary in the 2025 Gartner® Magic Quadrant™ for Secure Access Service Edge (SASE) Platforms report. We view this evaluation as a significant recognition of our strategy to help connect and secure workspace security and coffee shop networking through our unique connectivity cloud approach. You can read more about our position in the report here.

Since launching Cloudflare One, our SASE platform, we have delivered hundreds of features and capabilities, from our lightweight branch connector and intuitive native Data Loss Prevention (DLP) service to our new secure infrastructure access tools. By operating the world’s most powerful, programmable network, we’ve built an incredible foundation to deliver a comprehensive SASE platform.

Today, we operate the world's most expansive SASE network in order to deliver connectivity and security close to where users and applications are, anywhere in the world. We’ve developed our services from the ground up to be fully integrated and run on every server across our network, delivering a unified experience to our customers. And we enable these services with a unified control plane, enabling end-to-end visibility and control anywhere in the world. Tens of thousands of customers Continue reading