Packet Trimming Deep Dive – Part I

 Introduction


The previous chapter introduced the Ultra Ethernet (UE) Transport Layer and its endpoint-centric congestion control mechanisms: Network Signaled Congestion Control (NSCC) and Receiver Credit-based Congestion Control (RCCC). This chapter moves down to the UE Network Layer and introduces Packet Trimming (PT).

While node-based approaches rely on NIC-to-NIC feedback loops, Packet Trimming allows network switches to actively intervene during periods of high utilization. Instead of silently dropping packets under congestion, the network provides an explicit and fast signal that enables immediate recovery.

The primary goal of Packet Trimming is to prevent incast congestion, a situation in which multiple ingress ports simultaneously overwhelm a single egress port. In AI and HPC workloads, many-to-one traffic patterns are common—for example, when multiple workers send data to a single parameter server. Under these conditions, egress buffers can be exhausted very quickly. In a best-effort network, this typically results in tail drops. The receiver then waits for a retransmission timeout, which introduces long tail latency and disrupts synchronization across distributed workloads. Packet Trimming replaces this silent packet loss with an explicit congestion signal that travels faster than the data itself.

The process begins at the source UE node. The NIC marks outgoing data packets with Continue reading

Public Videos: EVPN in MPLS-Based Environments

While we’re mostly discussing EVPN in conjunction with VXLAN encapsulation, its initial use case was as an alternate control plane for MPLS networks.

Krzysztof Szarkowicz had a great presentation describing the specifics of EVPN in MPLS-Based Environments a few years ago. Those videos (part of the EVPN Technical Deep Dive webinar) are now public; you can watch them without an ipSpace.net account.

Looking for more binge-watching materials? You’ll find them here.

Taalas Etches AI Models Onto Transistors To Rocket Boost Inference

Adding big blocks of SRAM to collections of AI tensor engines, or better still, a waferscale collection of such engines, turbocharges AI inference, as has been shown time and again by AI upstarts Cerebras Systems, SambaNova Systems (which Intel is rumored to have taken a run at late last year), Groq (just eaten by Nvidia for $20 billion), and Graphcore (eaten by SoftBank for $600 million a year and a half ago) as they compare against GPUs from Nvidia and AMD.

Taalas Etches AI Models Onto Transistors To Rocket Boost Inference was written by Timothy Prickett Morgan at The Next Platform.

N4N049: Understanding Firewalls

Today, Ethan and Holly provide an overview of firewalls. While cybersecurity is a separate discipline from network engineering, much of what happens in cybersecurity is interesting at the packet level, so there’s a good deal of overlap. It’s likely that as a network engineer, you’ll be managing, or at least dealing with, firewalls in your... Read more »

IPB194: Navel Gazing at NAT in IPv6

Ed, Nick, and Tom discuss the need for Network Address Translation v6 to v6 (NAT66). While Network Prefix Translation (NPTv6) exists, its limitations make it insufficient for real-world business needs. They also highlight that without a standardized NAT66, the market is forcing vendors to implement their own, hindering widespread IPv6 adoption. Episode Links: IPv6-to-IPv6 Network... Read more »

Cisco IOS/XR OSPFv2 Not-So-Passive Interfaces

What’s wrong with me? Why do I have to uncover another weirdness every single time I run netlab integration tests on a new platform? Today, it’s Cisco IOS/XR (release 25.2.1) and its understanding of what “passive” means. According to the corresponding documentation, the passive interface configuration command is exactly what I understood it to be:

Use the passive command in appropriate mode to suppress the sending of OSPF protocol operation on an interface.

However, when I ran the OSPFv2 passive interface integration test with an IOS/XR container, it kept failing with neighbor is in Init state (the first and only time I ever encountered such an error after testing over two dozen platforms).

D2DO294: AI in My Vuln Research Workflow

Kat Traxler, Principal Security Researcher at Vectra AI, returns to the podcast to discuss her AI-powered vulnerability research workflow. She explains how she uses two different AI models to act as the “blackboard” while she applies her expertise to triage AI-generated ideas to increase her productivity. She also asks a concerning question: As AI automates... Read more »

Project Calico 3.30+ Hackathon: Show Us What You Can Build!

Calico Hackathon Logo

Build the Future of Cloud-Native Networking! 🚀

The Calico community moves fast. With the releases of Calico 3.30 and 3.31, brings improvements in scalability, network security, and visibility. Now, we want to see what YOU can do with them!

We’re excited to officially invite you to the Project Calico 3.30+ Community Hackathon.

Whether you’re a seasoned eBPF expert or a newcomer to the Gateway API, we welcome your innovation and  your ideas!

🔥 What’s in the Toolkit?

We’ve packed Calico 3.30+ with powerful features ready for you to hack on:

  • 🔹 Goldmane & Whisker: High-performance flow insights meets a sleek, operator-friendly UI.
  • 🔹 Staged Policies: The “Safety First” way to test Zero Trust before enforcing it.
  • 🔹 Calico Ingress Gateway: Modern, Envoy-powered traffic management via the Gateway API.
  • 🔹 Calico Cloud Ready: Connect open-source clusters to a free-forever, read-only tier for instant visualization and troubleshooting.
  • 🔹 IPAM for Load Balancers: Consistent IP strategies for MetalLB and beyond.
  • 🔹 Advanced QoS: Fine-grained bandwidth and packet rate controls.

💡 Inspiration: What Can You Build?

Whether you’re a networking guru or an automation Continue reading

1 4 5 6 7 8 3,852