As Kubernetes adoption scales across enterprise architectures, platform architects face mounting pressure to implement consistent security guardrails across distributed, multi-cluster environments while maintaining operational velocity. Modern infrastructure demands a security architecture that can adapt without introducing complexity or performance penalties. Traditional approaches force architects to cobble together separate solutions for ingress protection, network policies, and application-layer security, creating operational friction and increasing attack surface.
Today, we’re announcing significant enhancements to Calico that eliminate this architectural complexity. This release introduces native Web Application Firewall (WAF) capabilities integrated directly into Calico’s Ingress Gateway, enabling platform architects to deploy a single technology stack for both ingress management and HTTP-layer threat protection. Combined with enhanced Role-Based Access Control (RBAC) and centralized observability across heterogeneous workloads, these capabilities let platform architects design and implement comprehensive security within a unified platform.
The new features in this release can be grouped under two main categories:
The digital landscape of corporate environments has always been a battleground between efficiency and security. For years, this played out in the form of "Shadow IT" — employees using unsanctioned laptops or cloud services to get their jobs done faster. Security teams became masters at hunting these rogue systems, setting up firewalls and policies to bring order to the chaos.
But the new frontier is different, and arguably far more subtle and dangerous.
Imagine a team of engineers, deep into the development of a groundbreaking new product. They're on a tight deadline, and a junior engineer, trying to optimize their workflow, pastes a snippet of a proprietary algorithm into a popular public AI chatbot, asking it to refactor the code for better performance. The tool quickly returns the revised code, and the engineer, pleased with the result, checks it in. What the engineer doesn't realize is that the query and the snippet of code are now part of the AI service’s training data, or perhaps logged and stored by the provider. Without anyone noticing, a critical piece of the company's intellectual property has just been sent outside the organization's control, a silent and unmonitored data leak.
This isn't a Continue reading
The revolution is already inside your organization, and it's happening at the speed of a keystroke. Every day, employees turn to generative artificial intelligence (GenAI) for help with everything from drafting emails to debugging code. And while using GenAI boosts productivity—a win for the organization—this also creates a significant data security risk: employees may potentially share sensitive information with a third party.
Regardless of this risk, the data is clear: employees already treat these AI tools like a trusted colleague. In fact, one study found that nearly half of all employees surveyed admitted to entering confidential company information into publicly available GenAI tools. Unfortunately, the risk of human error doesn’t stop there. Earlier this year, a new feature in a leading LLM meant to make conversations shareable had a serious unintended consequence: it led to thousands of private chats, including work-related ones, being indexed by Google and other search engines. Neither case involved malice. Both were miscalculations about how these tools would be used, and it certainly did not help that organizations lacked the right tools to protect their data.
While the instinct for many may be to deploy Continue reading
If you’re here on the Cloudflare blog, chances are you already understand AI pretty well. But step outside our circle, and you’ll find a surprising number of people who still don’t know what it really is — or why it matters.
We wanted to come up with a way to make AI intuitive, something you can actually see and touch to get what’s going on. Hands on, not just hand-wavy.
The idea we landed on is simple: nothing comes into the world fully formed. Like us, and like the Internet, AI didn’t show up fully formed. So we asked ourselves: what if we told the story of AI as it learns and grows?
Episode by episode, we’d give it new capabilities, explain how those capabilities work, and explore how they change the way AI interacts with the world. Giving it a voice. Letting it see. Helping it learn. And maybe even letting it imagine the future.
So we made AI Avenue, a show where I (Craig) explore the fun, human, and sometimes surprising sides of AI… with a little help from my co-host Yorick, a robot hand with a knack for comic timing and the occasional eye-roll. Together, we travel, Continue reading
A networking engineer (let’s call him Joe) sent me an interesting challenge: they built a data center network with Cisco switches, and the switches flood LLDP packets between servers.
That would be interesting by itself (the whole network would appear as a single hub), but they’re also using DCBX (which is riding in LLDP TLVs), and the DCBX parameters are negotiated between servers (not between servers and adjacent switches), sometimes resulting in NIC resets.
We are watching in real time as AI fundamentally changes how people work across every industry. Customer support agents can respond to ten times the tickets. Software engineers review AI-generated code instead of spending hours pounding out boilerplate. Salespeople can get back to focusing on building relationships instead of tedious follow-up and administration.
This technology feels magical, and Cloudflare is committed to helping companies build world class AI-driven experiences for their employees and customers.
There is a catch, however. Any time a brand-new technology with such widespread appeal emerges, it often outpaces the tools in place to govern, secure, and control it. We're already starting to see stories of vibe-coded apps leaking all their users' details. LLM chats that were intended to be shared only between colleagues are actually out on the web, being indexed by search engines for all the world to see. AI agents are being given the keys to the application kingdom, enabling them to work autonomously across an organization, but without proper tracking and control. And then there’s the risk of a well-meaning employee uploading confidential company or customer data into an LLM, which Continue reading
Recently, I was doing some reading on MPLS and wanted to build a lab for it. For my use case, I needed five routers connected and running OSPF between them before I could even start configuring MPLS. So before doing any MPLS work, I would have to spend a lot of time setting up the lab and prerequisites, such as configuring IP addresses on interfaces and setting up OSPF. This is tedious, and it is exactly where Netlab can help you get up to speed.
Netlab is an open source tool that makes it easy to build and share network labs. Instead of manually dragging devices in a GUI or typing the same base configs over and over, you describe your lab in a simple YAML file. Netlab then takes care of creating the topology, assigning IP addresses, configuring routing protocols, and even pushing custom configs. Netlab works with containerlab (or vagrant) so you can spin up realistic network topologies in minutes and reproduce them anywhere automagically.
As Network Engineers, we often set up labs to help us learn and practice. Most of us use tools like EVE-NG, GNS3, or Cisco CML, where you go into Continue reading
For over two decades, we've built real-time communication on the Internet using a patchwork of specialized tools. RTMP gave us ingest. HLS and DASH gave us scale. WebRTC gave us interactivity. Each solved a specific problem for its time, and together they power the global streaming ecosystem we rely on today.
But using them together in 2025 feels like building a modern application with tools from different eras. The seams are starting to show—in complexity, in latency, and in the flexibility needed for the next generation of applications, from sub-second live auctions to massive interactive events. We're often forced to make painful trade-offs between latency, scale, and operational complexity.
Today Cloudflare is launching the first Media over QUIC (MoQ) relay network, running on every Cloudflare server in datacenters in 330+ cities. MoQ is an open protocol being developed at the IETF by engineers from across the industry—not a proprietary Cloudflare technology. MoQ combines the low-latency interactivity of WebRTC, the scalability of HLS/DASH, and the simplicity of a single architecture, all built on a modern transport layer. We're joining Meta, Google, Cisco, and others in building implementations that work seamlessly together, creating a shared foundation for the next generation of real-time Continue reading
From time to time, I like to dive into the archive and find a show that’s worth repeating. Forthwith, Derick Winkworth and automation.
Network automation efforts tend to focus on building and maintaining configurations, but is this the right place to be putting our automation efforts? Derick Winkworth joins Tom Ammon and Russ White at the Hedge for a conversation about what engineers really do, and what this means for automation.
When I was cleaning the “set BGP MED” integration test, I decided that once a BGP prefix is in the BGP table of the BGP peer, there’s no need for a further wait before checking its MED value. After all:
That approach failed miserably with ArubaCX; it was time to investigate the details.
On August 21, 2025, an influx of traffic directed toward clients hosted in the Amazon Web Services (AWS) us-east-1 facility caused severe congestion on links between Cloudflare and AWS us-east-1. This impacted many users who were connecting to or receiving connections from Cloudflare via servers in AWS us-east-1 in the form of high latency, packet loss, and failures to origins.
Customers with origins in AWS us-east-1 began experiencing impact at 16:27 UTC. The impact was substantially reduced by 19:38 UTC, with intermittent latency increases continuing until 20:18 UTC.
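The durations implied by those timestamps can be worked out with a few lines of Python. This is only an arithmetic sketch over the three times quoted above; the variable and function names are illustrative and do not come from the incident report.

```python
from datetime import datetime, timezone

# Timestamps quoted in the incident summary (all UTC, August 21, 2025).
# The variable names are illustrative labels, not terms from the report.
impact_start = datetime(2025, 8, 21, 16, 27, tzinfo=timezone.utc)
impact_reduced = datetime(2025, 8, 21, 19, 38, tzinfo=timezone.utc)
impact_end = datetime(2025, 8, 21, 20, 18, tzinfo=timezone.utc)


def minutes_between(start, end):
    """Return the whole minutes elapsed between two timestamps."""
    return int((end - start).total_seconds() // 60)


major_impact_minutes = minutes_between(impact_start, impact_reduced)
total_impact_minutes = minutes_between(impact_start, impact_end)

print(major_impact_minutes)  # 191, i.e. 3 h 11 min until impact was substantially reduced
print(total_impact_minutes)  # 231, i.e. 3 h 51 min until intermittent latency ended
```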
This was a regional problem between Cloudflare and AWS us-east-1, and global Cloudflare services were not affected. The degradation in performance was limited to traffic between Cloudflare and AWS us-east-1. The incident was a result of a surge of traffic from a single customer that overloaded Cloudflare's links with AWS us-east-1. It was a network congestion event, not an attack or a BGP hijack.
We’re very sorry for this incident. In this post, we explain what the failure was, why it occurred, and what we’re doing to make sure this doesn’t happen again.
Cloudflare helps anyone to build, connect, protect, and accelerate their websites on the Internet. Most customers host their Continue reading
On July 31, 2025, just as Portugal entered the peak of another intense wildfire season, João Pina, also known as Tomahock, received an automated alert from Cloudflare. His volunteer-run project, fogos.pt, now a trusted source of real-time wildfire information for millions across Portugal, was under attack.
One of the several alerts fogos.pt received related to the DDoS attack
What started in 2015 as a late-night side project with friends around a dinner table in Aveiro has grown into a critical public resource. During wildfires, the site is where firefighters, journalists, citizens, and even government agencies go to understand what’s happening on the ground. Over the years, fogos.pt has evolved from parsing PDFs into visual maps to a full-featured app and website with historical data, weather overlays, and more. It’s also part of Project Galileo, Cloudflare’s initiative to protect vulnerable but important public interest sites at no cost.
Wildfires are not just a Portuguese challenge. They are frequent across southern Europe (Spain, Greece, currently also under alert), California, Australia, and in Canada, which in 2023 faced record-setting fires. In all these cases, reliable information can be crucial, sometimes life-saving. Other organizations offering similar public services can Continue reading
A large number of vendors claim to use industry-standard CLI, which means “something that looks like Cisco IOS, but we can’t say that in public.” The implementations of that “standard” are full of quirks; as I was making fun of Cisco IOS last week, it’s only fair to look at how others deal with BGP community propagation.
netlab has BGP configuration templates for 14 different platforms, including these implementations that look like Cisco IOS from a distance if you squint just right: Arista EOS, Aruba CX, and FRRouting. You can check the configuration templates if you wish; here’s the TC&DB overview:
During Developer Week 2024, we introduced AI face cropping in private beta. This feature automatically crops images around detected faces, and marks the first release in our upcoming suite of AI image manipulation capabilities.
AI face cropping is now available in Images for everyone. To bring this feature to general availability, we moved our CPU-based prototype to a GPU-based implementation in Workers AI, enabling us to address a number of technical challenges, including memory leaks that could hamper large-scale use.
Photograph by Suad Kamardeen (@suadkamardeen) on Unsplash
We developed face cropping with two particular use cases in mind:
Social media platforms and AI chatbots. We observed a lot of traffic from customers who use Images to turn unedited images of people into smaller profile pictures in neat, fixed shapes.
E-commerce platforms. The same product photo might appear in a grid of thumbnails on a gallery page, then again on an individual product page with a larger view. The following example illustrates how cropping can change the emphasis from the model’s shirt to their sunglasses.
Photograph by Media Modifier (@mediamodifier) on Unsplash
When handling high volumes of media content, preparing images for production can be Continue reading
The SwiNOG 40 event started with an interesting presentation on Building Trustworthy Network Automation (video) by Damien Garros (now CEO @ OpsMill) who discussed the principles one can use to build a trustworthy network automation solution, including idempotency, dry runs, and transactional changes. He also covered the crucial roles of the declarative approach, version control, and testing.
If you have ever watched any of my network automation materials, you won’t be surprised by anything he said, but if you’re just starting your network automation journey, you MUST watch this presentation to get your bearings straight.
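To make three of those principles concrete, here is a minimal Python sketch of idempotency, dry runs, and transactional changes. It is not taken from the presentation: the "device" is just an in-memory dictionary, and the `plan` and `apply` helpers are hypothetical names invented for illustration.

```python
import copy


def plan(current, desired):
    """Return only the settings that differ from the desired state."""
    return {k: v for k, v in desired.items() if current.get(k) != v}


def apply(device, desired, dry_run=False):
    """Apply desired settings to a dict standing in for a device (an assumption).

    Idempotent: running it again with the same input changes nothing.
    Dry run: returns the planned changes without writing them.
    Transactional: a failure midway restores the previous state.
    """
    changes = plan(device, desired)
    if dry_run or not changes:
        return changes  # nothing is written in either case
    snapshot = copy.deepcopy(device)
    try:
        for key, value in changes.items():
            if value is None:  # stand-in for a change the device rejects
                raise ValueError(f"invalid value for {key}")
            device[key] = value
    except Exception:
        device.clear()
        device.update(snapshot)  # roll back: all changes or none
        raise
    return changes


device = {"hostname": "r1", "mtu": 1500}
desired = {"hostname": "r1", "mtu": 9000}

preview = apply(device, desired, dry_run=True)  # reports the change, writes nothing
first_run = apply(device, desired)              # writes mtu=9000
second_run = apply(device, desired)             # idempotent: no changes left

try:
    apply(device, {"mtu": 1400, "hostname": None})
except ValueError:
    pass  # the failed change set was rolled back as a whole
```

After the failed call, `device` still holds the state from the last successful run, which is the property a transactional change is meant to guarantee.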
Today, we are announcing Cloudflare’s Browser Developer Program, a collaborative initiative to strengthen partnership between Cloudflare and browser development teams.
Browser developers can apply to join here.
At Cloudflare, we aim to help build a better Internet. One way we achieve this is by providing website owners with the tools to detect and block unwanted traffic from bots through Cloudflare Challenges or Turnstile. As both bots and our detection systems become more sophisticated, the security checks required to validate human traffic become more complicated. While we aim to strike the right balance, we recognize these security measures can sometimes cause issues for legitimate browsers and their users.
A core objective of the program is to provide a space for intentional collaboration where we can work directly with browser developers to ensure that both accessibility and security can co-exist. We aim to support the evolving browser landscape, while upholding our responsibility to our customers to deliver the best security products. This program provides a dedicated channel for browser teams to share feedback, report issues, and help ensure that Cloudflare’s Challenges and Turnstile work seamlessly with all browsers.
Browser developers in Continue reading
From a network engineer’s perspective, it is not mandatory to understand the full functionality of every application running in a datacenter. However, understanding the communication patterns of the most critical applications—such as their packet and flow sizes, entropy, transport frequency, and link utilization—is essential. Additionally, knowing the required transport services, including reliability, in-order packet delivery, and lossless transmission, is important.
In AI fabrics, a neural network, including both its training and inference phases, can be considered an application. For this reason, this section first briefly explains the basic operation of the simplest neural network: the Feed Forward Neural Network (FNN). It then discusses the operation of a single neuron. Although a deep understanding of the application itself is not required, this section equips the reader with knowledge of what pieces of information are exchanged between GPUs during each phase and why these data exchanges are important.
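As a rough illustration of what a single neuron computes, the following Python sketch takes a weighted sum of its inputs, adds a bias, and passes the result through an activation function. The sigmoid activation and all of the numbers are assumptions chosen for the example, not values from this section.

```python
import math


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a sigmoid.

    The sigmoid activation is an assumption made for this sketch.
    """
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weight_rows, biases):
    """Forward pass of one layer: every neuron sees the same input vector."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]


# Illustrative values only: 1.0*0.5 + 2.0*(-0.25) + 0.0 = 0, and sigmoid(0) = 0.5
single_output = neuron([1.0, 2.0], [0.5, -0.25], 0.0)
print(single_output)  # 0.5

# The list returned by one layer is the input to the next layer.
hidden = layer([1.0, 2.0], [[0.5, -0.25], [0.1, 0.2]], [0.0, 0.0])
output = layer(hidden, [[1.0, -1.0]], [0.0])
```

When consecutive layers are placed on different GPUs, a list like `hidden` is the kind of data that has to cross the interconnect during the forward pass.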
Figure 1-7 illustrates a simple four-layer Feed Forward Neural Network (FNN) distributed across four GPUs. The two leftmost GPUs reside in Node-1, and the other two GPUs reside in Node-2. The training data is fed into the first layer. In real neural networks, this first layer is the input Continue reading
The Ultra Ethernet Specification v1.0 (UES), created by the Ultra Ethernet Consortium (UEC), defines end-to-end communication practices for Remote Direct Memory Access (RDMA) services in AI and HPC workloads over Ethernet network infrastructure. UES not only specifies a new RDMA-optimized transport layer protocol, Ultra Ethernet Transport (UET), but also defines how the full application stack—from Software through Transport, Network, Link, and Physical—can be adjusted to provide improved RDMA services while continuing to leverage well-established standards. UES includes, but is not limited to, a software API, mechanisms for low-latency and lossless packet delivery, and an end-to-end secure software communication path.
Before diving into the details of Ultra Ethernet, let’s briefly look at what we are dealing with when we talk about an AI cluster. From this point onward, we focus on Ultra Ethernet from the AI cluster perspective. This chapter first introduces AI cluster networking. Then, it briefly explains how a neural network operates during the training process, including a short introduction to the backpropagation algorithm and its forward and backward pass functionality.
Note: This book doesn’t include any complex mathematical algorithms related to the backpropagation algorithm, or detailed explanations of different neural networks. I have written a book Continue reading