Writing tests that check the correctness of network device configurations is hard (overview, more details). It’s also an interesting exercise in getting the timing just right:
And just when you think you nailed it, you encounter a device that blows your assumptions out of the water.
What is DNS Delegation and what is it used for? What is new in the Delegation world, and what impact does it have on DNS security and operations? George Michaelson joins Tom Ammon and Russ White for a discussion about DNS DELEG in this episode of the Hedge.
Cloudflare has servers in 330 cities spread across 125+ countries. All of these servers run Quicksilver, which is a key-value database that contains important configuration information for many of our services, and is queried for all requests that hit the Cloudflare network.
Because it is used while handling requests, Quicksilver is designed to be very fast; it currently responds to 90% of requests in less than 1 ms and 99.9% of requests in less than 7 ms. Most requests are only for a few keys, but some are for hundreds or even more keys.
Quicksilver currently contains over five billion key-value pairs with a combined size of 1.6 TB, and it serves over three billion keys per second, worldwide. Keeping Quicksilver fast provides some unique challenges, given that our dataset is always growing, and new use cases are added regularly.
Quicksilver used to store all key-values on all servers everywhere, but there is obviously a limit to how much disk space can be used on every single server. For instance, the more disk space used by Quicksilver, the less disk space is left for content caching. Also, with each added server that contains a particular Continue reading
Many developers, data scientists, and researchers do much of their work in Python notebooks: they’ve been the de facto standard for data science and sharing for well over a decade. Notebooks are popular because they make it easy to code, explore data, prototype ideas, and share results. We use them heavily at Cloudflare, and we’re seeing more and more developers use notebooks to work with data – from analyzing trends in HTTP traffic and querying Workers Analytics Engine through to querying their own Iceberg tables stored in R2.
Traditional notebooks are incredibly powerful, but they were not built with collaboration, reproducibility, or deployment as data apps in mind. As usage grows across teams and workflows, these limitations collide with the reality of work at scale.
marimo reimagines the notebook experience with these challenges in mind. It’s an open-source reactive Python notebook that’s built to be reproducible, easy to track in Git, executable as a standalone script, and deployable. We have partnered with the marimo team to bring this streamlined, production-friendly experience to Cloudflare developers. Spend less time wrestling with tools and more time exploring your data.
Today, we’re excited to announce three things:
The Ultra Ethernet Consortium (UEC), of which Arista is a founding member, is a standards organisation established to enhance Ethernet for the demanding requirements of Artificial Intelligence (AI) and High-Performance Computing (HPC). Over 100 member companies and 1000 participants have collaborated to evolve Ethernet, leading to the recent publication of its 1.0 specification, which will drive hardware implementations that significantly boost cluster performance.
On 14 July 2025, Cloudflare made a change to our service topologies that caused an outage for 1.1.1.1 on the edge, resulting in downtime for 62 minutes for customers using the 1.1.1.1 public DNS Resolver as well as intermittent degradation of service for Gateway DNS.
Cloudflare's 1.1.1.1 Resolver service became unavailable to the Internet starting at 21:52 UTC and ending at 22:54 UTC. The majority of 1.1.1.1 users globally were affected. For many users, not being able to resolve names using the 1.1.1.1 Resolver meant that basically all Internet services were unavailable. This outage can be observed on Cloudflare Radar.
The outage occurred because of a misconfiguration of legacy systems used to maintain the infrastructure that advertises Cloudflare’s IP addresses to the Internet.
This was a global outage. During the outage, Cloudflare's 1.1.1.1 Resolver was unavailable worldwide.
We’re very sorry for this outage. The root cause was an internal configuration error and not the result of an attack or a BGP hijack. In this blog, we’re going to talk about what the failure was, why it occurred, and what we’re doing to Continue reading
We are thrilled to announce that Cloudflare has been named a Visionary in the 2025 Gartner® Magic Quadrant™ for Secure Access Service Edge (SASE) Platforms report. We view this evaluation as a significant recognition of our strategy to help connect and secure workspace security and coffee shop networking through our unique connectivity cloud approach. You can read more about our position in the report here.
Since launching Cloudflare One, our SASE platform, we have delivered hundreds of features and capabilities, from our lightweight branch connector and intuitive native Data Loss Prevention (DLP) service to our new secure infrastructure access tools. By operating the world’s most powerful, programmable network, we’ve built an incredible foundation to deliver a comprehensive SASE platform.
Today, we operate the world's most expansive SASE network in order to deliver connectivity and security close to where users and applications are, anywhere in the world. We’ve developed our services from the ground up to be fully integrated and run on every server across our network, delivering a unified experience to our customers. And we enable these services with a unified control plane, enabling end-to-end visibility and control anywhere in the world. Tens of thousands of customers Continue reading
Kubernetes Network Policies (KNP) are powerful resources that help secure and isolate workloads in a cluster. By defining what traffic is allowed to and from specific pods, KNPs provide the foundation for zero-trust networking and least-privilege access in cloud-native environments.
But there’s a problem: KNPs are risky, and applying them without a clear game plan can be potentially disruptive.
Without deep insight into existing traffic flows, applying a restrictive policy can instantly break connectivity, killing live workloads, user sessions, or critical app dependencies. An even scarier scenario is when we implement policies that we think cover everything and workloads appear to work, but a restart or scaling operation later exposes new problems. Kubernetes, for all of its features, has no built-in “dry run” mode for policies and no first-class observability to show what would be blocked or allowed. That is the right design decision, since Kubernetes is an orchestrator, not an implementer.
This forces platform teams into a difficult choice: deploy permissive or no policies and weaken security, or risk service disruption while debugging restrictive ones. As a result, many teams delay implementing network policies entirely, only to regret it after a zero-day exploit like Log4Shell, the XZ backdoor, or other vulnerabilities Continue reading
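To make the "dry run" idea concrete, here is a minimal sketch of evaluating a proposed ingress policy against previously observed flows before applying it. It is not a Kubernetes API or any vendor's tool: the `Flow`, `IngressRule`, and `Policy` classes and the `dry_run` function are hypothetical names invented for illustration. The model is deliberately simplified to a single policy, ingress only, and exact label matching, with no namespace selectors, IP blocks, or match expressions.

```python
from dataclasses import dataclass, field


@dataclass
class Flow:
    """One observed connection (hypothetical record, e.g. from flow logs)."""
    src: dict   # labels of the source pod
    dst: dict   # labels of the destination pod
    port: int   # destination port


@dataclass
class IngressRule:
    """Allow traffic from pods matching from_labels on the listed ports."""
    from_labels: dict
    ports: tuple


@dataclass
class Policy:
    """Simplified stand-in for a NetworkPolicy with only ingress rules."""
    pod_selector: dict
    ingress: list = field(default_factory=list)


def matches(selector: dict, labels: dict) -> bool:
    """A selector matches when every key/value pair is present in labels.

    An empty selector matches every pod, mirroring an empty podSelector.
    """
    return all(labels.get(key) == value for key, value in selector.items())


def dry_run(policy: Policy, flows: list) -> list:
    """Return the observed flows that this policy would block."""
    blocked = []
    for flow in flows:
        if not matches(policy.pod_selector, flow.dst):
            continue  # destination pod is not selected, so the flow is unaffected
        allowed = any(
            matches(rule.from_labels, flow.src) and flow.port in rule.ports
            for rule in policy.ingress
        )
        if not allowed:
            blocked.append(flow)
    return blocked


# Illustrative data only: a database that should accept traffic from the API.
policy = Policy(
    pod_selector={"app": "db"},
    ingress=[IngressRule(from_labels={"app": "api"}, ports=(5432,))],
)
observed = [
    Flow(src={"app": "api"}, dst={"app": "db"}, port=5432),    # allowed
    Flow(src={"app": "batch"}, dst={"app": "db"}, port=5432),  # forgotten client
    Flow(src={"app": "api"}, dst={"app": "db"}, port=9187),    # forgotten port
    Flow(src={"app": "web"}, dst={"app": "api"}, port=8080),   # not selected
]
blocked = dry_run(policy, observed)
for flow in blocked:
    print(f"would block {flow.src} -> {flow.dst} on port {flow.port}")
```

The useful property is that the report is only as good as the observation window: a flow that appears only after a restart or a scaling event will not be in `observed`, which is exactly the failure mode described above.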
Welcome to the 22nd edition of the Cloudflare DDoS Threat Report. Published quarterly, this report offers a comprehensive analysis of the evolving threat landscape of Distributed Denial of Service (DDoS) attacks based on data from the Cloudflare network. In this edition, we focus on the second quarter of 2025. To view previous reports, visit www.ddosreport.com.
June was the busiest month for DDoS attacks in 2025 Q2, accounting for nearly 38% of all observed activity. One notable target was an independent Eastern European news outlet protected by Cloudflare, which reported being attacked following its coverage of a local Pride parade during LGBTQ Pride Month.
DDoS attacks continue to break records. During 2025 Q2, Cloudflare automatically blocked the largest ever reported DDoS attacks, peaking at 7.3 terabits per second (Tbps) and 4.8 billion packets per second (Bpps).
Overall, in 2025 Q2, hyper-volumetric DDoS attacks skyrocketed. Cloudflare blocked over 6,500 hyper-volumetric DDoS attacks, an average of 71 per day.
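The daily average follows directly from the quarterly total. As a quick check, assuming the quarter runs from 1 April to 30 June 2025 and treating 6,500 as a lower bound (the report says "over 6,500"):

```python
from datetime import date

# Lower bound taken from the report; the exact count is not given.
hyper_volumetric_attacks = 6_500

# Assumption: 2025 Q2 spans 1 April through 30 June, inclusive.
quarter_start = date(2025, 4, 1)
quarter_end = date(2025, 6, 30)
days_in_quarter = (quarter_end - quarter_start).days + 1

average_per_day = hyper_volumetric_attacks / days_in_quarter
print(days_in_quarter, round(average_per_day))  # prints "91 71"
```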
Although the overall number of DDoS attacks dropped compared to the previous quarter — which saw an unprecedented surge driven by a large-scale campaign targeting Cloudflare’s network and critical Internet infrastructure protected by Cloudflare — the Continue reading
When I first launched this site, many years ago, it served as a humble lab notebook and a place to share short personal stories from my working life. I shared diagrams, Junos configs, and field notes written after late-night maintenance windows or proofs of concept. Those stories took on a life of their own. They brought […]
The post Blog Reboot first appeared on Rick Mur.
netlab release 25.07 was published yesterday. The major new features include:
But wait, there’s much more:
Testing individual components is a good start, but what happens when you need to validate how everything works together? In this post, we’ll show you how to run integration tests in Infrahub that verify your schema, data, and Git workflows in a real, running environment.
You’ll learn how to spin up isolated Infrahub instances on the fly using Docker and Testcontainers, automate schema and data loading, and catch issues before they reach production.
OpsMill has partnered with me for this post, and they also support my blog as a sponsor. The post was originally published at https://opsmill.com/blog/integration-testing-infrahub/
You don’t need to be a Python expert to follow along. We’ll walk through everything step by step, with example code and tooling recommendations. You can also follow this guide in video form on the Cisco DevNet YouTube channel:
All the sample data and code used here are available on the OpsMill GitHub repo, so you can set up your own test environment and try it yourself.
Previously, we covered how to write smoke and unit tests using the Continue reading