IS-IS does not run over IPv4 or IPv6, so it should be a no-brainer to run it over IPv4 unnumbered or IPv6 LLA-only interfaces. The latter is true; the former is smack in the middle of It Depends™ territory.
Want to know more or test the devices you’re usually working with? The Running IS-IS Over Unnumbered/LLA-only Interfaces lab exercise is just what you need.
Click here to start the lab in your browser using GitHub Codespaces (or set up your own lab infrastructure). After starting the lab environment, change the directory to basic/7-unnumbered and execute netlab up.
The AI revolution is here, and it’s running on Kubernetes. From fraud detection systems to generative AI platforms, AI-powered applications are no longer experimental projects; they’re mission-critical infrastructure. But with great power comes great responsibility, and for Kubernetes platform teams, that means rethinking security.
But this rapid adoption comes with a challenge: 13% of organizations have already reported breaches of AI models or applications, while another 8% don’t even know if they’ve been compromised. Even more concerning, 97% of breached organizations reported that they lacked proper AI access controls. To address this, we must recognize that AI architectures introduce entirely new attack vectors that traditional security models aren’t equipped to handle.
AI workloads running in Kubernetes environments introduce a new set of security challenges. Traditional security models often fall short in addressing the unique complexities of AI pipelines, specifically related to The Multi-Cluster Problem, The East-West Traffic Dilemma, and Egress Control Complexity. Let’s explore each of these critical attack vectors in detail.
Most enterprise AI deployments don’t run in a single cluster. Instead, they typically follow this pattern:
Training Infrastructure (GPU-Heavy)
Since time immemorial, I have used the ip host router configuration command to get host-to-IP mappings in networking labs without going through the hassle of setting up a DNS server. Some devices even accepted multiple IP addresses in the ip host command, allowing you to list all router interfaces in a single command and get reverse (IP-to-host) mapping working like a charm. Or so I thought 🤦♂️
It turns out I’m too old, and what I know is sometimes no longer true. It seems that the last implementation working as I expected is Cisco IOS Classic ☹️
The latest version of the AI Metrics dashboard uses industry-standard sFlow telemetry from network switches to monitor the number of trimmed packets, using it as a congestion metric.
Ultra Ethernet Specification Update describes how the Ultra Ethernet Transport (UET) protocol can leverage optional “packet trimming” in network switches, which allows packets to be truncated rather than dropped in the fabric during congestion events. Because packet spraying causes reordering, loss becomes harder to detect. Packet trimming gives the receiver and the sender an early, explicit indication of congestion, allowing immediate loss recovery in spite of reordering, and is a critical feature in the low-RTT environments where UET is designed to operate.
cumulus@switch:~$ nv set system forwarding packet-trim profile packet-trim-default
cumulus@switch:~$ nv config apply
NVIDIA Cumulus Linux release 5.14 for NVIDIA Spectrum Ethernet Switches includes support for Packet Trimming. The above command enables packet trimming, sets the DSCP remark value to 11, sets the truncation size to 256 bytes, sets the switch priority to 4, and sets the eligibility to all ports on the switch with traffic class 1, 2, and 3. NVIDIA BlueField host adapters respond to trimmed packets to ensure fast congestion recovery.
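The recovery behavior that trimming enables can be illustrated with a toy receiver loop (plain Python, not the UET wire protocol; all names and the decision labels are illustrative):

```python
from collections import namedtuple

# Toy model of a received packet: a sequence number plus a flag telling
# us whether a congested switch truncated (trimmed) it in the fabric.
Packet = namedtuple("Packet", ["seq", "trimmed"])

def on_packet(pkt, expected_seq):
    """Decide what a UET-style receiver does with an arriving packet."""
    if pkt.trimmed:
        # Explicit congestion signal: the payload is gone, so request an
        # immediate retransmission of this sequence number.
        return ("NACK", pkt.seq)
    if pkt.seq != expected_seq:
        # Out-of-order arrival is normal under packet spraying, so it is
        # NOT treated as loss; buffer the packet and wait.
        return ("HOLD", pkt.seq)
    return ("DELIVER", pkt.seq)

print(on_packet(Packet(seq=7, trimmed=True), expected_seq=5))   # ('NACK', 7)
print(on_packet(Packet(seq=6, trimmed=False), expected_seq=5))  # ('HOLD', 6)
print(on_packet(Packet(seq=5, trimmed=False), expected_seq=5))  # ('DELIVER', 5)
```

The point of the sketch: without trimming, the receiver could not distinguish the first case from harmless reordering and would have to wait for a timeout.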
It’s been over four years since I published the last Software Gone Wild episode. In the meantime, I spent most of my time developing an open-source labbing tool, so it should be no surprise that the first post-hiatus episode focused on a netlab use case: how Ethan Banks (of PacketPushers fame) is using the tool to quickly check the technology details for his N is for Networking podcast.
As expected, our discussion took us all over the place, including (according to Riverside AI):
In my earlier blog post, Troubleshooting OT Security: Why IT-Site1 Can’t Ping OT_Site1R, we discovered the reason for this issue. Our “whodunit” is simple. For security reasons, we are using Cisco TrustSec to keep them from communicating. Which... Read More ›
The post Why IT-Site1 Can’t Ping OT_Site1R – Show and Tell Time #1 appeared first on Networking with FISH.
A single cyberattack or system outage can threaten not just one financial institution, but the stability of a vast portion of the entire financial sector. For today’s financial enterprises, securing dynamic infrastructure like Kubernetes is a core operational and regulatory challenge. The solution lies in achieving DORA compliance for Kubernetes, which transforms your cloud-native infrastructure into a resilient, compliant, and secure backbone for critical financial services.
Before DORA (Digital Operational Resilience Act), rules for financial companies primarily focused on making sure they had enough financial capital to cover losses. But what if a cyberattack or tech failure brought a large part of the financial system down? Even with plenty of financial capital, a major outage could stop most operations and cause big problems for the whole financial market. DORA steps in to fix this. It’s all about making sure financial firms can withstand, respond to, and recover quickly from cyberattacks and other digital disruptions.
The Digital Operational Resilience Act (DORA) is a European Union (EU) regulation that came into effect on January 17, 2025, and is designed to strengthen the security of financial entities. It establishes uniform requirements across the financial Continue reading
Whenever you see Gerhard Stein and Thomas Weible from Flexoptix in a list of presenters, three things immediately become obvious:
Their SwiNOG 40 presentation (video) met all three expectations. We learned how well transceivers cope with high temperatures and what happens when you try to melt them with a heat gun.
We’re making it easier to run your Node.js applications on Cloudflare Workers by adding support for the node:http client and server APIs. This significant addition brings familiar Node.js HTTP interfaces to the edge, enabling you to deploy existing Express.js, Koa, and other Node.js applications globally with zero cold starts, automatic scaling, and significantly lower latency for your users — all without rewriting your codebase. Whether you're looking to migrate legacy applications to a modern serverless platform or build new ones using the APIs you already know, you can now leverage Workers' global network while maintaining your existing development patterns and frameworks.
Cloudflare Workers operate in a unique serverless environment where direct TCP connections aren't available. Instead, all networking operations are fully managed by specialized services outside the Workers runtime itself — systems like our Open Egress Router (OER) and Pingora that handle connection pooling, keeping connections warm, managing egress IPs, and all the complex networking details. This means as a developer, you don't need to worry about TLS negotiation, connection management, or network optimization — it's all handled for you automatically.
This fully-managed approach is actually why Continue reading
netlab release 25.09 includes:
But wait, there’s more (as always):
Distributed AI training requires careful setup of both hardware and software resources. In a UET-based system, the environment initialization proceeds through several key phases, each ensuring that GPUs, network interfaces, and processes are correctly configured before training begins:
1. Fabric Endpoint (FEP) Creation
Each GPU process is associated with a logical Fabric Endpoint (FEP) that abstracts the connection to its NIC port. FEPs, together with the connected switch ports, form a Fabric Plane (FP)—an isolated, high-performance data path. The NICs advertise their capabilities via LLDP messages to ensure compatibility and readiness.
2. Vendor UET Provider Publication
Once FEPs are created, they are published to the Vendor UET Provider, which exposes them as Libfabric domains. This step makes the Fabric Addresses (FAs) discoverable, but actual communication objects (endpoints, address vectors) are created later by the application processes. This abstraction ensures consistent interaction with the hardware regardless of vendor-specific implementations.
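The first two phases can be sketched in plain Python (a toy model, not the Libfabric API; the class names mirror the terms above, but the Fabric Address format is invented):

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class FabricEndpoint:
    """FEP: logical handle tying one GPU process to one NIC port."""
    gpu_rank: int
    nic_port: str

@dataclass
class FabricPlane:
    """FP: the FEPs plus their connected switch ports form one
    isolated, high-performance data path."""
    name: str
    feps: list = field(default_factory=list)

class VendorUETProvider:
    """Toy stand-in for the Vendor UET Provider: it publishes each FEP
    as a discoverable Fabric Address (FA). Real communication objects
    (endpoints, address vectors) are created later by the application."""
    def __init__(self):
        self.fabric_addresses = {}

    def publish(self, fep):
        fa = f"fa://{fep.nic_port}/rank{fep.gpu_rank}"  # invented FA format
        self.fabric_addresses[fa] = fep
        return fa

provider = VendorUETProvider()
plane = FabricPlane("fp0")
for rank, port in enumerate(["swp1", "swp2"]):
    fep = FabricEndpoint(rank, port)
    plane.feps.append(fep)
    print(provider.publish(fep))
```

The separation mirrors the text: FEP creation and FA publication happen during initialization, while the application only consumes the published addresses later.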
3. Job Launcher and Environment Variables
When a distributed training job is launched, the job launcher (e.g., Torchrun) sets up environment variables for each process. These include the master rank IP and port, local and global ranks, and the total number of processes.
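As a concrete sketch, here is how a worker process might collect those launcher-provided settings (the variable names are the standard ones torchrun exports; the example values are invented):

```python
import os

# Example of what torchrun exports into each worker's environment;
# the values here are made up for illustration.
example_env = {
    "MASTER_ADDR": "10.0.0.1",  # IP of the rank-0 (master) node
    "MASTER_PORT": "29500",     # TCP port used for the rendezvous
    "RANK": "3",                # global rank of this process
    "LOCAL_RANK": "1",          # rank of this process on its own node
    "WORLD_SIZE": "16",         # total number of processes in the job
}

def read_dist_env(env=os.environ):
    """Collect the settings a worker needs before joining the process group."""
    return {
        "master": (env["MASTER_ADDR"], int(env["MASTER_PORT"])),
        "rank": int(env["RANK"]),
        "local_rank": int(env["LOCAL_RANK"]),
        "world_size": int(env["WORLD_SIZE"]),
    }

print(read_dist_env(example_env))
# {'master': ('10.0.0.1', 29500), 'rank': 3, 'local_rank': 1, 'world_size': 16}
```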
4. Environment Variable Continue reading
My last post about Go got some attention.
In fact, two of my posts got attention that day, which broke my nginx since I was running livecount behind nginx, making me run out of file descriptors when thousands of people had the page open.
It’s a shame that I had to turn off livecount, since it’d be cool to see the stats. But I was out of the country, with unreliable access to both Internet and even electricity in hotels, so I couldn’t implement the real fix until I got back, when it had already mostly died down.
I knew this was a problem with livecount, of course, and I even allude to it in its blog post.
Anyway, back to programming languages.
The reactions to my post can be summarized as:
I respect the first two. The last one has to be from people who are too emotionally invested in their tools, and take articles like this Continue reading
After the release of the Ubuntu 24.04 edition of Linux For Network Engineers (LFNE), I got some questions from the community. Here are two new flavors of LFNE based on your requests. LFNE AlmaLinux 10 OS: for Red Hat fans who prefer a RHEL-style environment. Since CentOS is no longer maintained, AlmaLinux is the closest […]
The post Linux For Network Engineers (LFNE) – AlmaLinux & Alpine Editions first appeared on IPNET.