Securing Kubernetes Traffic with Calico Ingress Gateway

If you’ve managed traffic in Kubernetes, you’ve likely navigated the world of Ingress controllers. For years, Ingress has been the standard way of getting our HTTP/S services exposed. But let’s be honest, it often felt like a compromise. We wrestled with controller-specific annotations to unlock critical features, blurred the lines between infrastructure and application concerns, and sometimes wished for richer protocol support or a more standardized approach. This “pile of vendor annotations,” while functional, highlighted the limitations of a standard that struggled to keep pace with the complex demands of modern, multi-team environments and even led to security vulnerabilities.

Wait a second, is this the ‘Ingress vs. Gateway API’ debate?

Yes, and it’s a crucial one. The Kubernetes Gateway API isn’t just an Ingress v2; it’s a fundamental redesign, the “future” of Kubernetes ingress, built by the community to address these very challenges head-on.

What makes Gateway API different?

There are three main points that I came across while evaluating GatewayAPI and Ingress controllers:

  1. Standardization & Portability: It aims to provide a core, standard way to manage ingress, reducing reliance on vendor-specific hacks and making it easier to switch implementations – change the class, and it should “just work.”
  2. Continue reading

HS106: Planning for the Epochalypse

IT teams deal with technology lifecycle issues all the time–including Y2K, which enterprises across the world grappled with for years. The Epochalypse, or Year 2038 Problem, is similar. Specifically, some Linux systems’ date-time counters will go from positive to negative at a specific date in 2038, potentially wreaking havoc on embedded systems and any other... Read more »

A QUIC Progress Report

There has been a major change in the landscape of the internet over the past few years with the progressive introduction of the QUIC transport protocol. Here I’d like to look at where we are up to with the deployment of QUIC on the public Internet. In so doing we also need to consider whether the DNS is ossifying in front of our eyes!

QO100 early success

I have heard and been heard via QO-100! As a licensed radio amateur have sent signals via satellite as far away as Brazil.

What it is

QO-100 is the first geostationary satellite with an amateur radio payload. A “repeater”, if you will. Geostationary means that you just aim your antenna (dish) once, and you can use it forever.

This is amazing for tweaking and experimenting. Other amateur radio satellites are only visible in the sky for minutes at a time, and you have to chase them across the sky to make a contact before it’s gone.

They also fly lower, meaning they can only see a small part of the world at a time. QO-100 can at all times see and be seen by all of Africa, Europe, India, and parts of Brazil.

Needs a bit more equipment, though

Other “birds” (satellites) can be accessed using a normal handheld FM radio and something like an arrow antenna. Well, you should actually have two radios, so that you can hear yourself on the downlink while transmitting.

There are also linear amateur radio satellites. For them you need SSB radios, which narrows down which radios you can use. And you still need Continue reading

AI network performance monitoring using containerlab

AI Metrics is available on GitHub. The application provides performance metrics for AI/ML RoCEv2 network traffic, for example, large scale CUDA compute tasks using NVIDIA Collective Communication Library (NCCL) operations for inter-GPU communications: AllReduce, Broadcast, Reduce, AllGather, and ReduceScatter.

The screen capture is from a containerlab topology that emulates a AI compute cluster connected by a leaf and spine network. The metrics include:

  • Total Traffic Total traffic entering fabric
  • Operations Total RoCEv2 operations broken out by type
  • Core Link Traffic Histogram of load on fabric links
  • Edge Link Traffic Histogram of load on access ports
  • RDMA Operations Total RDMA operations
  • RDMA Bytes Average RDMA operation size
  • Credits Average number of credits in RoCEv2 acknowledgements
  • Period Detected period of compute / exchange activity on fabric (in this case just over 0.5 seconds)
  • Congestion Total ECN / CNP congestion messages
  • Errors Total ingress / egress errors
  • Discards Total ingress / egress discards
  • Drop Reasons Packet drop reasons

Note: Clicking on peaks in the charts shows values at that time.

This article gives step-by-step instructions to run the demonstration.

git clone https://github.com/sflow-rt/containerlab.git
Download the sflow-rt/containerlab project from GitHub.
git clone https://github.com/sflow-rt/containerlab.git
cd containerlab
./run-clab
Run the above commands Continue reading

Finding Source Routing Paths

In the previous blog post, we discussed the generic steps that network devices (or a centralized controller) must take to discover paths across a network. Today, we’ll see how these principles are applied in source routing, one of the three main ways to move packets across a network.

Brief recap: In source routing, the sender has to specify the (loose or strict) path a packet should take across the network. The sender thus needs a mechanism to determine that path, and as always, there are numerous solutions to this challenge. We’ll explore a few of them, using the sample topology shown in the following diagram.

Cloudflare service outage June 12, 2025

On June 12, 2025, Cloudflare suffered a significant service outage that affected a large set of our critical services, including Workers KV, WARP, Access, Gateway, Images, Stream, Workers AI, Turnstile and Challenges, AutoRAG, Zaraz, and parts of the Cloudflare Dashboard.

This outage lasted 2 hours and 28 minutes, and globally impacted all Cloudflare customers using the affected services. The cause of this outage was due to a failure in the underlying storage infrastructure used by our Workers KV service, which is a critical dependency for many Cloudflare products and relied upon for configuration, authentication and asset delivery across the affected services. Part of this infrastructure is backed by a third-party cloud provider, which experienced an outage today and directly impacted availability of our KV service.

We’re deeply sorry for this outage: this was a failure on our part, and while the proximate cause (or trigger) for this outage was a third-party vendor failure, we are ultimately responsible for our chosen dependencies and how we choose to architect around them.

This was not the result of an attack or other security event. No data was lost as a result of this incident. Cloudflare Magic Transit and Magic WAN, DNS, cache, proxy, Continue reading

AMD Plots Interception Course With Nvidia GPU And System Roadmaps

To a certain extent, Nvidia and AMD are not really selling GPU compute capacity as much as they are reselling just enough HBM memory capacity and bandwidth to barely balance out the HBM memory they can get their hands on, thereby justifying the ever-embiggening amount of compute their GPU complexes get overstuffed with.

AMD Plots Interception Course With Nvidia GPU And System Roadmaps was written by Timothy Prickett Morgan at The Next Platform.

It’s Been Three Years, So It’s Time For Another PCI-Express Speed Bump

PCI-SIG, the organization that oversees the roadmap for the critical PCI-Express peripheral attachment specification, is continuing to keep to its three-year drumbeat for releasing the next iteration of the interconnect spec and already has its sights on the one after that, expected to be released in 2028 and appear in devices in 2030 or so.

It’s Been Three Years, So It’s Time For Another PCI-Express Speed Bump was written by Jeffrey Burt at The Next Platform.

N4N030: Network Shapes and Sizes

What shape is your network? In other words, what is its topology? On today’s episode, we discover the different types of network topologies and designs used in the enterprise, data center, and service provider networks. We cover leaf/spine, hub and spoke, point to point, mesh, and others. We also talk about how topologies affect traffic... Read more »