AMD Says “Helios” Racks And MI400 Series GPUs On Track For 2H 2026

While releasing an update to its InferenceX AI inference benchmark, formerly known as InferenceMax and thus far used only to test Nvidia and AMD systems, the analysts at SemiAnalysis correctly declared that only Nvidia, AWS, and Google have built rackscale systems that are deployed today, and that AMD is working on it.

AMD Says “Helios” Racks And MI400 Series GPUs On Track For 2H 2026 was written by Timothy Prickett Morgan at The Next Platform.

Real-time visualization of AI / ML traffic matrix

Heatmap is available on GitHub. The application provides a real-time traffic matrix visualization of end-to-end traffic flowing across an Ethernet fabric. Each axis represents an ordered list of network addresses. The x-axis is a flow source and the y-axis is a flow destination.

For example, the Heatmap above comes from a large high performance compute cluster running a mixture of tasks. Traffic is concentrated along the diagonal, indicating that the job scheduler is packing related tasks in racks so that most traffic is confined to the rack.
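
The aggregation behind such a matrix is straightforward. A minimal sketch (the flow-record shape and host list are illustrative, not the Heatmap application's actual data model): bucket each flow's byte count into a cell indexed by source on one axis and destination on the other.

```python
from collections import defaultdict

def traffic_matrix(flows, hosts):
    """Aggregate flow records into a source x destination byte matrix.

    `flows` is an iterable of (src, dst, bytes) tuples; `hosts` is the
    ordered list of network addresses used for both axes.
    """
    index = {h: i for i, h in enumerate(hosts)}
    matrix = [[0] * len(hosts) for _ in hosts]
    for src, dst, nbytes in flows:
        matrix[index[src]][index[dst]] += nbytes
    return matrix

hosts = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
flows = [("10.0.0.1", "10.0.0.2", 1500),
         ("10.0.0.2", "10.0.0.1", 900),
         ("10.0.0.1", "10.0.0.2", 500)]
m = traffic_matrix(flows, hosts)
```

When hosts in the same rack are adjacent in the ordered list, rack-local traffic lands in cells near the diagonal, which is exactly the concentration pattern described above.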

Note: Live Dashboards links to a number of dashboards showing live traffic, including the Heatmap above.

The next Heatmap shows a very different traffic pattern: RoCEv2 traffic generated by GPUs performing a NCCL AllReduce/AllGather collective operation using a ring algorithm. During the collective operation, each GPU sends data to its immediate neighbor (modulo the number of GPUs) in a logical ring, resulting in two nearly continuous lines on either side of the diagonal: one for forward traffic, and the other for return traffic associated with each flow.
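
The two off-diagonal lines follow directly from the ring's neighbor arithmetic. A small sketch of the idea (a toy model, not NCCL's implementation): enumerate the matrix cells a ring collective lights up.

```python
def ring_traffic_cells(num_gpus):
    """(src, dst) cells lit up by a ring collective: forward data to
    the next rank, plus return traffic to the previous rank (mod N)."""
    forward = {(i, (i + 1) % num_gpus) for i in range(num_gpus)}
    reverse = {(i, (i - 1) % num_gpus) for i in range(num_gpus)}
    return forward | reverse

cells = ring_traffic_cells(8)
# every cell sits exactly one step off the diagonal, modulo the ring
```

Plotted on a source/destination matrix, the `forward` set draws the line above the diagonal and the `reverse` set the line below it, with the wrap-around pairs accounting for the corner cells.
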
The final example comes from a large data center hosting a mix of front end workloads. Unlike the backend networks, this network combines internal (East/West) Continue reading

Cloudflare One is the first SASE offering modern post-quantum encryption across the full platform

During Security Week 2025, we launched the industry’s first cloud-native post-quantum Secure Web Gateway (SWG) and Zero Trust solution, a major step towards securing enterprise network traffic sent from end user devices to public and private networks.

But this is only part of the equation. To truly secure the future of enterprise networking, you need a complete Secure Access Service Edge (SASE) platform.

Today, we complete the equation: Cloudflare One is the first SASE platform to support modern standards-compliant post-quantum (PQ) encryption in our Secure Web Gateway, and across Zero Trust and Wide Area Network (WAN) use cases. More specifically, Cloudflare One now offers post-quantum hybrid ML-KEM (Module-Lattice-based Key-Encapsulation Mechanism) across all major on-ramps and off-ramps.
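
The "hybrid" in hybrid ML-KEM means the session secret is derived from both a classical key exchange and the post-quantum KEM, so breaking either one alone is not enough. A conceptual sketch of that combiner idea (not Cloudflare's implementation; real protocols such as TLS 1.3 hybrid groups feed both secrets into the key schedule):

```python
import hashlib

def hybrid_shared_secret(classical_ss: bytes, mlkem_ss: bytes) -> bytes:
    """Conceptual hybrid KEM combiner: the derived key depends on both
    shared secrets, so an attacker must break BOTH the classical
    exchange and ML-KEM to recover it."""
    return hashlib.sha256(classical_ss + mlkem_ss).digest()
```

The design point is the concatenation: as long as at least one input secret stays unknown to the attacker, the hash output remains unpredictable.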

To complete the equation, we added support for post-quantum encryption to our Cloudflare IPsec (our cloud-native WAN-as-a-Service) and Cloudflare One Appliance (our physical or virtual WAN appliance that establishes Cloudflare IPsec connections). Cloudflare IPsec uses the IPsec protocol to establish encrypted tunnels from a customer’s network to Cloudflare’s global network, while IP Anycast is used to automatically route that tunnel to the nearest Cloudflare data center. Cloudflare IPsec simplifies configuration and provides high availability; if a specific data center becomes unavailable, traffic Continue reading

Cloudflare outage on February 20, 2026

On February 20, 2026, at 17:48 UTC, Cloudflare experienced a service outage when a subset of customers who use Cloudflare’s Bring Your Own IP (BYOIP) service saw their routes to the Internet withdrawn via Border Gateway Protocol (BGP).

The issue was not caused, directly or indirectly, by a cyberattack or malicious activity of any kind. This issue was caused by a change that Cloudflare made to how our network manages IP addresses onboarded through the BYOIP pipeline. This change caused Cloudflare to unintentionally withdraw customer prefixes.

For some BYOIP customers, this resulted in their services and applications being unreachable from the Internet, causing timeouts and connection failures across their Cloudflare deployments that used BYOIP. The website for Cloudflare’s recursive DNS resolver (1.1.1.1) saw 403 errors as well. The total duration of the incident was 6 hours and 7 minutes, with most of that time spent restoring prefix configurations to their state prior to the change.

When we began to observe failures, Cloudflare engineers reverted the change and prefixes stopped being withdrawn. However, before engineers were able to revert the change, ~1,100 BYOIP prefixes had been withdrawn from the Cloudflare network. Some customers were able to restore their Continue reading

Code Mode: give agents an entire API in 1,000 tokens

Model Context Protocol (MCP) has become the standard way for AI agents to use external tools. But there is a tension at its core: agents need many tools to do useful work, yet every tool added fills the model's context window, leaving less room for the actual task.

Code Mode is a technique we first introduced for reducing context window usage during agent tool use. Instead of describing every operation as a separate tool, let the model write code against a typed SDK and execute the code safely in a Dynamic Worker Loader. The code acts as a compact plan. The model can explore tool operations, compose multiple calls, and return just the data it needs. Anthropic independently explored the same pattern in their Code Execution with MCP post.

Today we are introducing a new MCP server for the entire Cloudflare API — from DNS and Zero Trust to Workers and R2 — that uses Code Mode. With just two tools, search() and execute(), the server is able to provide access to the entire Cloudflare API over MCP, while consuming only around 1,000 tokens. The footprint stays fixed, no matter how many API endpoints exist.
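
The shape of the two-tool pattern is easy to sketch. The catalog entries and function signatures below are illustrative stand-ins, not the actual server's API: `search()` pulls endpoint details into context on demand, and `execute()` runs model-written code in a restricted scope (a stand-in for the sandboxed Dynamic Worker Loader).

```python
# Hypothetical sketch of the two-tool Code Mode pattern: instead of one
# MCP tool per API endpoint, expose search() to discover endpoints and
# execute() to run model-written code against the discovered operations.
API_CATALOG = {
    "dns.records.list": "GET /zones/{zone_id}/dns_records",
    "workers.scripts.upload": "PUT /accounts/{account_id}/workers/scripts/{name}",
    "r2.buckets.create": "POST /accounts/{account_id}/r2/buckets",
}

def search(query: str) -> dict:
    """Return only the catalog entries matching the query, so the
    model's context holds two tool schemas, not thousands."""
    return {op: sig for op, sig in API_CATALOG.items() if query.lower() in op}

def execute(code: str, env: dict):
    """Run model-written code in a restricted namespace and return
    whatever it assigns to `result`."""
    scope = {"__builtins__": {}, **env}
    exec(code, scope)
    return scope.get("result")

hits = search("dns")
out = execute("result = catalog", {"catalog": hits})
```

The token footprint stays fixed because only `search()` and `execute()` are described up front; everything else is fetched or composed at run time by the code the model writes.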

For a large API like Continue reading

Packet Trimming Deep Dive – Part I

Introduction

The previous chapter introduced the Ultra Ethernet (UE) Transport Layer and its endpoint-centric congestion control mechanisms: Network Signaled Congestion Control (NSCC) and Receiver Credit-based Congestion Control (RCCC). This chapter moves down to the UE Network Layer and introduces Packet Trimming (PT).

While node-based approaches rely on NIC-to-NIC feedback loops, Packet Trimming allows network switches to actively intervene during periods of high utilization. Instead of silently dropping packets under congestion, the network provides an explicit and fast signal that enables immediate recovery.

The primary goal of Packet Trimming is to mitigate incast congestion, a situation in which multiple ingress ports simultaneously overwhelm a single egress port. In AI and HPC workloads, many-to-one traffic patterns are common—for example, when multiple workers send data to a single parameter server. Under these conditions, egress buffers can be exhausted very quickly. In a best-effort network, this typically results in tail drops. The receiver then waits for a retransmission timeout, which introduces long tail latency and disrupts synchronization across distributed workloads. Packet Trimming replaces this silent packet loss with an explicit congestion signal that travels faster than the data itself.
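
The behavioral difference from tail drop can be captured in a toy model (a simplification for intuition, not the UE specification's mechanism): when the egress buffer is full, the switch keeps only the packet header and forwards it as a high-priority signal instead of dropping the packet silently.

```python
from collections import deque

def enqueue(queue: deque, packet: dict, capacity: int):
    """Toy model of a trimming egress queue: when the buffer is full,
    instead of silently dropping the packet, keep only its header and
    return it as a trimmed, high-priority congestion signal so the
    sender can retransmit immediately instead of waiting for a timeout."""
    if len(queue) < capacity:
        queue.append(packet)
        return None                                   # accepted as-is
    return {"hdr": packet["hdr"], "trimmed": True}    # payload removed

q = deque()
signals = [enqueue(q, {"hdr": i, "payload": b"x" * 4096}, capacity=2)
           for i in range(4)]
# first two packets fit the buffer; the last two are trimmed to headers
```

Because the trimmed header is tiny and travels on a priority queue, it reaches the receiver ahead of the queued data, which is what makes the loss signal "faster than the data itself."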

The process begins at the source UE node. The NIC marks outgoing data packets with Continue reading

Public Videos: EVPN in MPLS-Based Environments

While we’re mostly discussing EVPN in conjunction with VXLAN encapsulation, its initial use case was as an alternate control plane for MPLS networks.

Krzysztof Szarkowicz had a great presentation describing the specifics of EVPN in MPLS-Based Environments a few years ago. Those videos (part of the EVPN Technical Deep Dive webinar) are now public; you can watch them without an ipSpace.net account.

Looking for more binge-watching materials? You’ll find them here.

Taalas Etches AI Models Onto Transistors To Rocket Boost Inference

Adding big blocks of SRAM to collections of AI tensor engines, or better still, a waferscale collection of such engines, turbocharges AI inference, as has been shown time and again by AI upstarts Cerebras Systems, SambaNova Systems (which Intel is rumored to have taken a run at late last year), Groq (just eaten by Nvidia for $20 billion), and Graphcore (eaten by SoftBank for $600 million a year and a half ago) as they compare against GPUs from Nvidia and AMD.

Taalas Etches AI Models Onto Transistors To Rocket Boost Inference was written by Timothy Prickett Morgan at The Next Platform.

Cisco IOS/XR OSPFv2 Not-So-Passive Interfaces

What’s wrong with me? Why do I have to uncover another weirdness every single time I run netlab integration tests on a new platform? Today, it’s Cisco IOS/XR (release 25.2.1) and its understanding of what “passive” means. According to the corresponding documentation, the passive interface configuration command is exactly what I understood it to be:

Use the passive command in appropriate mode to suppress the sending of OSPF protocol operation on an interface.

However, when I ran the OSPFv2 passive interface integration test with an IOS/XR container, it kept failing with neighbor is in Init state (the first and only time I ever encountered such an error after testing over two dozen platforms).
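
For reference, IOS XR configures passive behavior per interface under the OSPF process rather than with a `passive-interface` router subcommand; a minimal fragment based on the documented syntax (process ID and interface name illustrative):

```
router ospf 1
 area 0
  interface GigabitEthernet0/0/0/0
   passive enable
  !
 !
!
```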

Project Calico 3.30+ Hackathon: Show Us What You Can Build!


Build the Future of Cloud-Native Networking! 🚀

The Calico community moves fast. The releases of Calico 3.30 and 3.31 bring improvements in scalability, network security, and visibility. Now, we want to see what YOU can do with them!

We’re excited to officially invite you to the Project Calico 3.30+ Community Hackathon.

Whether you’re a seasoned eBPF expert or a newcomer to the Gateway API, we welcome your innovation and your ideas!

🔥 What’s in the Toolkit?

We’ve packed Calico 3.30+ with powerful features ready for you to hack on:

  • 🔹 Goldmane & Whisker: High-performance flow insights meet a sleek, operator-friendly UI.
  • 🔹 Staged Policies: The “Safety First” way to test Zero Trust before enforcing it.
  • 🔹 Calico Ingress Gateway: Modern, Envoy-powered traffic management via the Gateway API.
  • 🔹 Calico Cloud Ready: Connect open-source clusters to a free-forever, read-only tier for instant visualization and troubleshooting.
  • 🔹 IPAM for Load Balancers: Consistent IP strategies for MetalLB and beyond.
  • 🔹 Advanced QoS: Fine-grained bandwidth and packet rate controls.

💡 Inspiration: What Can You Build?

Whether you’re a networking guru or an automation Continue reading
