Artificial Intelligence (AI), powered by accelerated processing units (XPUs) like GPUs and TPUs, is transforming industries. The network interconnecting these processors is crucial for efficient and successful AI deployments. AI workloads, involving intensive training and rapid inferencing, require very high bandwidth interconnects with low and consistent latency, and the highest reliability to maximize XPU utilization and reduce AI job completion time (JCT). A best-of-breed network with AI-specific optimizations is critical for delivering AI applications, with any JCT slowdown leading to revenue loss. Typical workloads have fewer, very high-bandwidth, low-entropy flows that run for extended periods, exchanging large messages synchronously, necessitating advanced lossless forwarding and specialized operational tools. They differ from cloud networking traffic as summarized below:
June 2025 marks the 11th anniversary of Project Galileo, Cloudflare’s initiative to provide free cybersecurity protection to vulnerable organizations working in the public interest around the world. From independent media and human rights groups to community activists, Project Galileo supports those often targeted for their essential work in human rights, civil society, and democracy building.
A lot has changed since we marked the 10th anniversary of Project Galileo. Yet, our commitment remains the same: help ensure that organizations doing critical work in human rights have access to the tools they need to stay online. We believe that organizations, no matter where they are in the world, deserve reliable, accessible protection to continue their important work without disruption.
For our 11th anniversary, we're excited to share several updates including:
An interactive Cloudflare Radar report providing insights into the cyber threats faced by at-risk public interest organizations protected under the project.
An expanded commitment to digital rights in the Asia-Pacific region with two new Project Galileo partners.
New stories from organizations protected by Project Galileo working on the frontlines of civil society, human rights, and journalism from around the world.
This blog post describes yet another bizarre example of how reliable digital twins are, but don’t worry; they all work great in PowerPoint.
After “fixing” the integration tests to deal with ArubaCX’s notion of VXLAN VNI having 16 bits, the bridging test worked, but the IRB tests kept failing.
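To make the 16-bit limitation concrete, here is a minimal sketch of the arithmetic. The 24-bit VNI width comes from RFC 7348; the helper names and the sample VNI value are illustrative assumptions, not part of the integration tests mentioned above.

```python
VXLAN_VNI_BITS = 24  # VNI field width defined in RFC 7348


def max_vni(bits: int) -> int:
    """Largest value that fits in an unsigned field of the given width."""
    return (1 << bits) - 1


def vni_fits(vni: int, bits: int) -> bool:
    """Return True if the VNI can be represented in `bits` bits."""
    return 0 <= vni <= max_vni(bits)


# Hypothetical VNI chosen for illustration only.
sample_vni = 100_000
print(max_vni(16), max_vni(VXLAN_VNI_BITS))
print(vni_fits(sample_vni, 16), vni_fits(sample_vni, VXLAN_VNI_BITS))
```

A test suite that hands out VNIs above 65535 would therefore need smaller values on a platform that accepts only 16 bits.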
In the IRB test, the lab has two layer-3 switches. Each of them should be able to bridge within a VLAN/VXLAN segment and route across the segments.
We’ve recently added support for the FinalizationRegistry API in Cloudflare Workers. This API allows developers to request a callback when a JavaScript object is garbage-collected, a feature that can be particularly relevant for managing external resources, such as memory allocated by WebAssembly (Wasm). However, despite its availability, our general advice is: avoid using it directly in most scenarios.
Our decision to add FinalizationRegistry, while still cautioning against using it, opens up a bigger conversation: how memory management works when JavaScript and WebAssembly share the same runtime. This is becoming more common in high-performance web apps, and getting it wrong can lead to memory leaks, out-of-memory errors, and performance issues, especially in resource-constrained environments like Cloudflare Workers.
In this post, we’ll look at how JavaScript and Wasm handle memory differently, why that difference matters, and what FinalizationRegistry is actually useful for. We’ll also explain its limitations, particularly around timing and predictability, walk through why we decided to support it, and how we’ve made it safer to use. Finally, we’ll talk about how newer JavaScript language features offer a more reliable and structured approach to solving these problems.
JavaScript relies on automatic memory management through a […]
In a previous blog post, I explained how you can use bridges in a netlab topology to create custom LAN segments. Netlab supports two other node roles (host and router), and we’ll eventually add gateways.
netlab assumes that most network devices are routers (it considers a firewall to be a router in disguise), apart from Linux hosts, but you can always change what a node is with the role node attribute:
This story is becoming more and more common in the Kubernetes world. What starts as a manageable cluster or two can quickly balloon into a sprawling, multi-cluster architecture spanning public clouds, private data centers, or a bit of both. And with that growth comes a whole new set of headaches. How do you keep tabs on compliance across wildly different configurations? When a service goes down across multiple clusters, how do you pinpoint the cause amidst the chaos? And what about those hard-to-diagnose latency issues that seem to crop up between regions?
The truth is, achieving secure and scalable multi-cluster Kubernetes isn’t about throwing more tools at the problem. It’s about having the right tools and adopting the right best practices. This is where a solution like Calico Cluster Mesh shines, offering those essential capabilities for a seamless multi-cluster experience without the complexity or overhead that you expect with traditional service meshes.
So, why are so many organizations finding themselves in this multi-cluster maze? Often, it’s driven by solid business reasons:
Did you know that there’s an Ethernet link between the Packet Forwarding Engine (PFE – data plane) and Routing Engine (RE – control plane) in every Juniper MX? That’s why you have to run two VMs to emulate it (sometimes conveniently packed into one larger VM, proving RFC 1925 rule 6a).
That Ethernet link happens to have the MTU fixed at 1500 bytes. Guess what happens in the world where everyone uses jumbo frames? Did you say fragmentation? Bingo! And what do you think happens when one of those fragments gets dropped due to control-plane policing, and the rest of them are stuck in the reassembly queue? You’ll find the gory details in a lengthy blog post by Nitzan Tzelniker.
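The fragmentation arithmetic can be sketched as follows. This is a rough illustration under stated assumptions: plain IPv4 with a 20-byte header and no options, a hypothetical 9000-byte jumbo packet, and independent fragment drops. It ignores any internal encapsulation the platform may add on that link, and the function names are made up for this example.

```python
import math


def ipv4_fragment_count(packet_len: int, mtu: int, header_len: int = 20) -> int:
    """Number of IPv4 fragments needed to carry packet_len bytes over mtu."""
    if packet_len <= mtu:
        return 1
    payload = packet_len - header_len
    # Every fragment except the last must carry a multiple of 8 payload bytes.
    per_fragment = (mtu - header_len) // 8 * 8
    return math.ceil(payload / per_fragment)


def packet_loss_probability(fragment_loss: float, fragments: int) -> float:
    """Chance the whole packet is lost when any single fragment is dropped.

    Assumes each fragment is dropped independently with the same probability.
    """
    return 1 - (1 - fragment_loss) ** fragments


# Hypothetical 9000-byte packet crossing the 1500-byte link.
fragments = ipv4_fragment_count(9000, 1500)
print(fragments)
print(packet_loss_probability(0.01, fragments))
```

Because losing any one fragment loses the whole packet, even a small per-fragment drop rate is amplified, and the surviving fragments sit in the reassembly queue until they time out.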
The metrics include:
Note: Grafana Cloud has a free service tier that can be used to test this example.
We all write code, but how do we know the changes we make in the future won’t break something that used to work? That’s where testing becomes important.
The idea is to catch problems early, ideally before they reach production. In the Python world, one of the most common ways to do this is with a tool called pytest. It lets you write tests to check that your code behaves the way you expect and helps you catch issues before they become a bigger problem.
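As a minimal sketch of what such a test looks like, the example below checks a small helper function. The function `normalize_interface` and its behavior are hypothetical, invented only to give the tests something to verify.

```python
# Hypothetical function under test (not from the original post).
def normalize_interface(name: str) -> str:
    """Expand a short interface name such as 'Gi0/1' to its long form."""
    prefixes = {"Gi": "GigabitEthernet", "Te": "TenGigabitEthernet"}
    for short, long in prefixes.items():
        # Skip names that are already in long form.
        if name.startswith(short) and not name.startswith(long):
            return long + name[len(short):]
    return name


# pytest collects functions named test_* and reports any failing assert.
def test_expands_short_name():
    assert normalize_interface("Gi0/1") == "GigabitEthernet0/1"


def test_leaves_long_name_untouched():
    assert normalize_interface("GigabitEthernet0/1") == "GigabitEthernet0/1"
```

Saved in a file such as `test_interfaces.py`, these tests run with the `pytest` command, which discovers files named `test_*.py` by default. If a later change breaks the expansion logic, the failing assert shows up before the code reaches production.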
Originally published at https://www.opsmill.com/pytest-plugin-infrahub/
When working with Infrahub, testing is just as important. You might want to make sure your GraphQL queries are valid, your Jinja2 templates render correctly, or your transformations behave as expected.
Infrahub simplifies this by offering a pytest plugin that doesn’t require Python code; you define tests using plain YAML. This makes testing more accessible to teams across roles and speeds up the feedback loop during development.
These kinds of unit tests aren’t just about convenience; they help establish a production-ready automation system. With automated checks built into your process, every change is validated consistently, reducing the chance of something breaking unexpectedly. That consistency builds trust when your […]
I got an interesting question from a reader. He listened to my podcast with Eric Chou and decided to try to learn in public:
Currently, I’m studying for the CCNP ENARSI exam, and would like to start posting my labs to LinkedIn, and perhaps even upload my lab topologies and configs to Git.
That’s a great idea. I would minimize the LinkedIn part and focus on Git:
This guide walks you through installing and configuring the ELK Stack (Elasticsearch, Logstash, Kibana, and Filebeat) using Docker Compose. It is fully updated for Elasticsearch 9.0.2 and explains the changes required for versions 8 and above, including the required security setup and user permissions.
Prerequisites
Ensure you have the following installed on your system: The […]
The post Deploying ELK Stack with Docker Compose (2025 Edition) first appeared on IPNET.
TL;DR - For anyone who doesn’t want to go through the full post, here’s the short version. I bought the UGreen NASync DXP2800 (2 bay) from Amazon for £249 and paired it with two Seagate Ironwolf 8TB HDDs, around £180 each.
The unit comes with an Intel N100 CPU, 8GB of RAM (upgradeable to 16GB, but there’s only one RAM slot), and a 2.5Gb/s LAN port. It has a solid build, was easy to set up, and I actually like the UI. Sure, it lacks a lot of features compared to Synology or QNAP, but since I’m mainly using it for file storage, I’m happy with the purchase.
The short answer is, this is the best bang for the buck. For £249, I’m getting a 2-bay NAS with an N100 CPU, 8GB of RAM, a 2.5Gb/s LAN port, and two NVMe slots.
I’ve been wanting to buy a NAS for over […]
Password hygiene drives IT professionals crazy: people forget their passwords, will not change them often enough, and choose weak ones. But are IT folks immune to these problems? What is the psychology behind passwords, and how do we do better? Karl Buhl joins Tom and Russ to talk about passwords.