Category Archives for "Networking"

HN788: Behind Megaport’s Network Automation Platform (Sponsored)

We have a network automation discussion for you today from sponsor Megaport. At the AutoCon3 conference earlier this year, Luke Gollan presented on a complex network automation project migrating Megaport’s API-driven software defined network from a legacy VXC overlay to an EVPN framework. This helped improve scalability, but was fraught with practical challenges, not the... Read more »

Cisco IOS/XE Hates Redistributed Static IPv6 Routes

Writing tests that check the correctness of network device configurations is hard (overview, more details). It’s also an interesting exercise in getting the timing just right:

  • Routing protocols are an eventually-consistent distributed system, and things eventually appear in the right place (if you got the configurations right), but you never know when exactly that will happen.
  • You can therefore set some reasonable upper bounds on when things should happen, and declare failure if the timeouts are exceeded. Even then, you’ll get false positives (as in: the test is telling you the configurations are incorrect, when it’s just a device having a bad hair day).

And just when you think you nailed it, you encounter a device that blows your assumptions out of the water.
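The timeout-bounded check described in the list above can be sketched as a small polling helper. This is only an illustration, not the author's test framework: `wait_until` and `make_route_check` are hypothetical names, and the "route" check is a stand-in that turns true after a delay instead of querying a real device.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool], timeout: float, interval: float = 1.0) -> bool:
    """Poll check() until it returns True or `timeout` seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True  # the expected state showed up in time
        if time.monotonic() >= deadline:
            return False  # upper bound exceeded: declare failure
        # Sleep for the polling interval, but never past the deadline.
        time.sleep(min(interval, max(0.0, deadline - time.monotonic())))


def make_route_check(appears_after: float) -> Callable[[], bool]:
    """Hypothetical stand-in for 'is the route in the table yet?'."""
    start = time.monotonic()
    return lambda: time.monotonic() - start >= appears_after


if __name__ == "__main__":
    # A route that converges well within the bound passes the check...
    print(wait_until(make_route_check(0.05), timeout=1.0, interval=0.01))
    # ...while one that never appears fails once the timeout is exceeded.
    print(wait_until(lambda: False, timeout=0.05, interval=0.01))
```

A failed check here only says the state did not appear within the bound. It cannot tell a wrong configuration apart from a slow device, which is the false-positive problem described above.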

IPB179: IPv6 DNS Gotchas

Let’s talk about common misconceptions regarding DNS and IPv6. We’ve heard these often enough that we felt we should talk through each one. We cover issues including what kind of DNS record types can be returned via IPv6 (and IPv4, too), more details on what really goes on with Happy Eyeballs, and combining A/AAAA records... Read more »

Quicksilver v2: evolution of a globally distributed key-value store (Part 2)

What is Quicksilver?

Cloudflare has servers in 330 cities spread across 125+ countries. All of these servers run Quicksilver, which is a key-value database that contains important configuration information for many of our services, and is queried for all requests that hit the Cloudflare network.

Because it is used while handling requests, Quicksilver is designed to be very fast; it currently responds to 90% of requests in less than 1 ms and 99.9% of requests in less than 7 ms. Most requests are only for a few keys, but some are for hundreds or even more keys.

Quicksilver currently contains over five billion key-value pairs with a combined size of 1.6 TB, and it serves over three billion keys per second, worldwide. Keeping Quicksilver fast provides some unique challenges, given that our dataset is always growing, and new use cases are added regularly.

Quicksilver used to store all key-values on all servers everywhere, but there is obviously a limit to how much disk space can be used on every single server. For instance, the more disk space used by Quicksilver, the less disk space is left for content caching. Also, with each added server that contains a particular Continue reading

Explore your Cloudflare data with Python notebooks, powered by marimo

Many developers, data scientists, and researchers do much of their work in Python notebooks: they’ve been the de facto standard for data science and sharing for well over a decade. Notebooks are popular because they make it easy to code, explore data, prototype ideas, and share results. We use them heavily at Cloudflare, and we’re seeing more and more developers use notebooks to work with data, from analyzing trends in HTTP traffic and querying Workers Analytics Engine to querying their own Iceberg tables stored in R2.

Traditional notebooks are incredibly powerful, but they were not built with collaboration, reproducibility, or deployment as data apps in mind. As usage grows across teams and workflows, these limitations become harder to work around at scale.

marimo reimagines the notebook experience with these challenges in mind. It’s an open-source reactive Python notebook that’s built to be reproducible, easy to track in Git, executable as a standalone script, and deployable. We have partnered with the marimo team to bring this streamlined, production-friendly experience to Cloudflare developers. Spend less time wrestling with tools and more time exploring your data.

Today, we’re excited to announce three things:

Demystifying Ultra Ethernet

The Ultra Ethernet Consortium (UEC), of which Arista is a founding member, is a standards organisation established to enhance Ethernet for the demanding requirements of Artificial Intelligence (AI) and High-Performance Computing (HPC). Over 100 member companies and 1000 participants have collaborated to evolve Ethernet, leading to the recent publication of its 1.0 specification, which will drive hardware implementations that significantly boost cluster performance.

Cloudflare 1.1.1.1 incident on July 14, 2025

On 14 July 2025, Cloudflare made a change to our service topologies that caused an outage for 1.1.1.1 on the edge, resulting in downtime for 62 minutes for customers using the 1.1.1.1 public DNS Resolver as well as intermittent degradation of service for Gateway DNS.

Cloudflare's 1.1.1.1 Resolver service became unavailable to the Internet starting at 21:52 UTC and ending at 22:54 UTC. The majority of 1.1.1.1 users globally were affected. For many users, not being able to resolve names using the 1.1.1.1 Resolver meant that basically all Internet services were unavailable. This outage can be observed on Cloudflare Radar.
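As a quick consistency check, the 62-minute figure follows directly from the start and end times quoted above. The helper name `outage_minutes` is introduced here for illustration only.

```python
from datetime import datetime, timezone


def outage_minutes(start: datetime, end: datetime) -> int:
    """Whole minutes elapsed between two timestamps."""
    return int((end - start).total_seconds() // 60)


# Times as stated in the incident summary (UTC, 14 July 2025).
start = datetime(2025, 7, 14, 21, 52, tzinfo=timezone.utc)
end = datetime(2025, 7, 14, 22, 54, tzinfo=timezone.utc)

print(outage_minutes(start, end))
```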

The outage occurred because of a misconfiguration of legacy systems used to maintain the infrastructure that advertises Cloudflare’s IP addresses to the Internet.

This was a global outage. During the outage, Cloudflare's 1.1.1.1 Resolver was unavailable worldwide.

We’re very sorry for this outage. The root cause was an internal configuration error and not the result of an attack or a BGP hijack. In this blog, we’re going to talk about what the failure was, why it occurred, and what we’re doing to Continue reading

Cloudflare recognized as a Visionary in 2025 Gartner® Magic Quadrant™ for SASE Platforms

We are thrilled to announce that Cloudflare has been named a Visionary in the 2025 Gartner® Magic Quadrant™ for Secure Access Service Edge (SASE) Platforms report. We view this evaluation as a significant recognition of our strategy to help connect and secure workspace security and coffee shop networking through our unique connectivity cloud approach. You can read more about our position in the report here.

Since launching Cloudflare One, our SASE platform, we have delivered hundreds of features and capabilities, from our lightweight branch connector and intuitive native Data Loss Prevention (DLP) service to our new secure infrastructure access tools. By operating the world’s most powerful, programmable network, we’ve built an incredible foundation to deliver a comprehensive SASE platform.

Today, we operate the world's most expansive SASE network in order to deliver connectivity and security close to where users and applications are, anywhere in the world. We’ve developed our services from the ground up to be fully integrated and run on every server across our network, delivering a unified experience to our customers. And we enable these services with a unified control plane, enabling end-to-end visibility and control anywhere in the world. Tens of thousands of customers Continue reading

Dry Run: Your Kubernetes network policies with Calico staged network policies

Kubernetes Network Policies (KNP) are powerful resources that help secure and isolate workloads in a cluster. By defining what traffic is allowed to and from specific pods, KNPs provide the foundation for zero-trust networking and least-privilege access in cloud-native environments.

But there’s a problem: KNPs are risky, and applying them without a clear game plan can be disruptive.

Without deep insight into existing traffic flows, applying a restrictive policy can instantly break connectivity, killing live workloads, user sessions, or critical app dependencies. An even scarier scenario is when we implement policies that we think cover everything and workloads actually work, but after a restart or scaling operation we hit new problems. Kubernetes, with all of its features, has no built-in “dry run” mode for policies and no first-class observability to show what would be blocked or allowed, which is the right decision, since Kubernetes is an orchestrator, not an implementer.

This forces platform teams into a difficult choice: deploy permissive or no policies and weaken security, or risk service disruption while debugging restrictive ones. As a result, many teams delay implementing network policies entirely, only to regret it after a zero-day exploit like Log4Shell, the XZ backdoor, or other vulnerabilities Continue reading
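To make the "dry run" idea concrete, here is a toy sketch in Python. It is not Calico's staged policy API and it does not model real Kubernetes NetworkPolicy semantics: `Flow`, `IngressPolicy`, and `dry_run` are hypothetical names, and the policy is reduced to "which (source app, port) pairs may reach the selected app". The point is only that a candidate policy can be evaluated against observed traffic and report what it would deny, without enforcing anything.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Flow:
    """One observed connection (simplified to app labels and a port)."""
    src_app: str
    dst_app: str
    port: int


@dataclass(frozen=True)
class IngressPolicy:
    """Toy ingress policy: selects one app and lists allowed (source, port) pairs."""
    selects_app: str
    allowed: FrozenSet[Tuple[str, int]]


def dry_run(policy: IngressPolicy, flows: List[Flow]) -> List[Flow]:
    """Return the observed flows this policy WOULD deny; nothing is enforced."""
    would_deny = []
    for flow in flows:
        if flow.dst_app != policy.selects_app:
            continue  # the policy does not select this destination
        if (flow.src_app, flow.port) not in policy.allowed:
            would_deny.append(flow)
    return would_deny


# Hypothetical observed traffic and a candidate policy for the "db" app.
observed = [
    Flow("api", "db", 5432),
    Flow("batch", "db", 5432),  # a dependency nobody remembered
    Flow("web", "api", 443),
]
candidate = IngressPolicy("db", frozenset({("api", 5432)}))

for flow in dry_run(candidate, observed):
    print("would deny:", flow)
```

Reviewing the would-deny list before enforcement is what surfaces forgotten dependencies, such as the batch job in this made-up example.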

PP070: News Roundup – Scattered Spider Bites MSPs, Microsoft Rethinks Kernel Access, North Koreans Seem Good at Their Illicit Jobs

There are lots of juicy stories in our monthly security news roundup. The Scattered Spider hacking group makes effective use of social engineering to target MSPs, Microsoft pushes for better Windows resiliency by rethinking kernel access policies for third-party endpoint security software, and the US Justice Department files indictments against alleged operators of laptop farms that... Read more »

Hyper-volumetric DDoS attacks skyrocket: Cloudflare’s 2025 Q2 DDoS threat report

Welcome to the 22nd edition of the Cloudflare DDoS Threat Report. Published quarterly, this report offers a comprehensive analysis of the evolving threat landscape of Distributed Denial of Service (DDoS) attacks based on data from the Cloudflare network. In this edition, we focus on the second quarter of 2025. To view previous reports, visit www.ddosreport.com.

June was the busiest month for DDoS attacks in 2025 Q2, accounting for nearly 38% of all observed activity. One notable target was an independent Eastern European news outlet protected by Cloudflare, which reported being attacked following its coverage of a local Pride parade during LGBTQ Pride Month.

Key DDoS insights

  • DDoS attacks continue to break records. During 2025 Q2, Cloudflare automatically blocked the largest ever reported DDoS attacks, peaking at 7.3 terabits per second (Tbps) and 4.8 billion packets per second (Bpps).

  • Overall, in 2025 Q2, hyper-volumetric DDoS attacks skyrocketed. Cloudflare blocked over 6,500 hyper-volumetric DDoS attacks, an average of 71 per day. 

  • Although the overall number of DDoS attacks dropped compared to the previous quarter — which saw an unprecedented surge driven by a large-scale campaign targeting Cloudflare’s network and critical Internet infrastructure protected by Cloudflare — the Continue reading
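The per-day average in the list above can be sanity-checked with a little arithmetic. This sketch assumes "2025 Q2" means the calendar quarter from 1 April through 30 June 2025 and uses 6,500 as the attack count, although the report says "over 6,500".

```python
from datetime import date

# Assumption: Q2 is the calendar quarter, 1 April to 30 June 2025 inclusive.
days_in_q2 = (date(2025, 7, 1) - date(2025, 4, 1)).days

attacks = 6500  # lower bound taken from "over 6,500"
average_per_day = attacks / days_in_q2

print(days_in_q2, round(average_per_day))
```

Under these assumptions the quarter has 91 days, which is consistent with the stated average of about 71 attacks per day.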

Blog Reboot

When I first launched this site, many years ago, it served as a humble lab notebook and a place to share short personal stories from my working life. I shared diagrams, Junos configs, and field notes written after late-night maintenance windows or proofs of concept. Those stories took on a life of their own. They brought […]

The post Blog Reboot first appeared on Rick Mur.

Integration Testing in Infrahub – Validate Your Automation in Real Environments

Testing individual components is a good start, but what happens when you need to validate how everything works together? In this post, we’ll show you how to run integration tests in Infrahub that verify your schema, data, and Git workflows in a real, running environment.

You’ll learn how to spin up isolated Infrahub instances on the fly using Docker and Testcontainers, automate schema and data loading, and catch issues before they reach production.

SPONSORED

OpsMill has partnered with me for this post, and they also support my blog as a sponsor. The post was originally published at https://opsmill.com/blog/integration-testing-infrahub/

You don’t need to be a Python expert to follow along. We’ll walk through everything step by step, with example code and tooling recommendations. You can also follow this guide in video form on the Cisco DevNet YouTube channel.

All the sample data and code used here are available on the OpsMill GitHub repo, so you can set up your own test environment and try it yourself.

Quick recap

Previously, we covered how to write smoke and unit tests using the Continue reading