Networking Archives - Page 7 of 3462

netlab 25.09: IPv6 RA, Link Impairments, and Performance Gains

Link impairment (implemented with Linux netem queuing discipline) defined in lab topology or configured/controlled with the netlab tc command
Configurable IPv6 Router Advertisement parameters
The files plugin to store the content of short files (including custom configuration templates) directly in the lab topology
Support for Nokia SR-OS container (SR-SIM)
Support for very large topologies (tested so far: approximately 3000 lab devices)

But wait, there’s more (as always):

Ultra Ethernet: Fabric Setup

Introduction: Job Environment Initialization

Distributed AI training requires careful setup of both hardware and software resources. In a UET-based system, the environment initialization proceeds through several key phases, each ensuring that GPUs, network interfaces, and processes are correctly configured before training begins:

1. Fabric Endpoint (FEP) Creation

Each GPU process is associated with a logical Fabric Endpoint (FEP) that abstracts the connection to its NIC port. FEPs, together with the connected switch ports, form a Fabric Plane (FP)—an isolated, high-performance data path. The NICs advertise their capabilities via LLDP messages to ensure compatibility and readiness.

2. Vendor UET Provider Publication

Once FEPs are created, they are published to the Vendor UET Provider, which exposes them as Libfabric domains. This step makes the Fabric Addresses (FAs) discoverable, but actual communication objects (endpoints, address vectors) are created later by the application processes. This abstraction ensures consistent interaction with the hardware regardless of vendor-specific implementations.

3. Job Launcher and Environment Variables

When a distributed training job is launched, the job launcher (e.g., Torchrun) sets up environment variables for each process. These include the master rank IP and port, local and global ranks, and the total number of processes.
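As a minimal sketch of what a worker process might do with these variables, the snippet below reads the names that torchrun exports (MASTER_ADDR, MASTER_PORT, RANK, LOCAL_RANK, WORLD_SIZE). The JobEnv class, the read_job_env helper, and the single-process fallback values are illustrative assumptions, not part of any UET or PyTorch API.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class JobEnv:
    """Hypothetical container for the launcher-provided settings."""
    master_addr: str
    master_port: int
    rank: int         # global rank of this process
    local_rank: int   # rank of this process on its own node
    world_size: int   # total number of processes in the job


def read_job_env(env: Optional[Mapping[str, str]] = None) -> JobEnv:
    """Read the launcher variables, falling back to single-process defaults.

    The fallback values are assumptions that make the sketch runnable
    outside a launcher; a real job would normally fail fast instead.
    """
    env = os.environ if env is None else env
    return JobEnv(
        master_addr=env.get("MASTER_ADDR", "127.0.0.1"),
        master_port=int(env.get("MASTER_PORT", "29500")),
        rank=int(env.get("RANK", "0")),
        local_rank=int(env.get("LOCAL_RANK", "0")),
        world_size=int(env.get("WORLD_SIZE", "1")),
    )


if __name__ == "__main__":
    print(read_job_env())
```

Run under a launcher, each process would see its own RANK and LOCAL_RANK while sharing the same master address, port, and world size.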

4. Environment Variable Continue reading

Ideal programming language

My last post about Go got some attention.

In fact, two of my posts got attention that day, which broke my nginx since I was running livecount behind nginx, making me run out of file descriptors when thousands of people had the page opened.

It’s a shame that I had to turn off livecount, since it’d be cool to see the stats. But I was out of the country, with unreliable access to both Internet and even electricity in hotels, so I couldn’t implement the real fix until I got back, when it had already mostly died down.

I knew this was a problem with livecount, of course, and I even allude to it in its blog post.

Anyway, back to programming languages.

The reactions to my post can be summarized as:

Oh yes, these things are definite flaws in the language.
What you’re saying is true, but it’s not a problem. Your post is pointless.
You’re dumb. You don’t understand Go. Here let me explain your own blog post to you […]

I respect the first two. The last one has to be from people who are too emotionally invested in their tools, and take articles like this Continue reading

Linux For Network Engineers (LFNE) – AlmaLinux & Alpine Editions

After the release of the Ubuntu 24.04 edition of Linux For Network Engineers (LFNE) I’ve got some questions from the community. Here are two new flavors of LFNE based on your requests.

LFNE AlmaLinux 10 OS

For Red Hat fans who prefer a RHEL-style environment. Since CentOS is no longer maintained, AlmaLinux is the closest […]

The post Linux For Network Engineers (LFNE) – AlmaLinux & Alpine Editions first appeared on IPNET.

Broadcom Lands Shepherding Deal For OpenAI “Titan” XPU

Broadcom turned in its financial results for its third quarter last night, and all of the tongues in the IT sector are wagging about how the chip maker and enterprise software giant has landed a fourth customer for its burgeoning custom XPU design and shepherding business. …

Broadcom Lands Shepherding Deal For OpenAI “Titan” XPU was written by Timothy Prickett Morgan at The Next Platform.

HN795: Adventures In Latency

Monitoring and troubleshooting latency can be tricky. If it’s in the network, was it the IP stack? A NIC? A switch buffer? A middlebox somewhere on the WAN? If it’s the application, can you, the network engineer, bring receipts to the app team? And what if you need to build and operate a network that’s... Read more »

Hedge 279: Learning Theory

Returning to a thread here at the Hedge, Rick Graziani joins Tom and Russ to discuss a college professor’s perspective on why network engineers should learn the theory, and not just the configuration.


Calico Egress Gateway: A Cost-Effective NAT for Kubernetes

The Need for a Kubernetes NAT Gateway

When Kubernetes workloads need to connect to the outside world, whether to access external APIs, integrate with external systems, or connect to partner networks, they often face a unique challenge. The problem? Pod IP addresses inside Kubernetes clusters are dynamic and non-routable. For external systems to recognize and trust this traffic, workloads need a consistent, dependable identity. This means outbound connections require fixed, routable IP addresses that external services can rely on. This is where Network Address Translation (NAT) becomes essential. It gives Kubernetes pods a static, consistent IP address for all outbound traffic, ensuring those connections work properly.

If you’re running Kubernetes in the cloud, a common solution is to use your cloud provider’s managed NAT gateway service. These are easy to use, but they can come at a cost. In AWS, Azure, and Google Cloud, cloud-managed NAT gateways charge both an hourly fee and a per-gigabyte data processing fee. For high-traffic deployments, those charges can quickly add up, sometimes even exceeding your compute costs.
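To make the two-part pricing concrete, here is a rough sketch of the arithmetic: a fixed hourly fee plus a fee per gigabyte processed. The function name, the 730-hour month, and the rates used in the example are placeholder assumptions, not any provider's actual prices.

```python
HOURS_PER_MONTH = 730  # assumed average number of hours in a month


def managed_nat_monthly_cost(hourly_fee: float,
                             per_gb_fee: float,
                             gb_processed: float,
                             hours: float = HOURS_PER_MONTH) -> float:
    """Monthly cost = hourly fee * hours + per-GB fee * traffic volume."""
    return hourly_fee * hours + per_gb_fee * gb_processed


if __name__ == "__main__":
    # Placeholder rates: 0.05 currency units per hour and per GB.
    for gb in (1_000, 10_000, 100_000):
        total = managed_nat_monthly_cost(0.05, 0.05, gb)
        print(f"{gb:>7} GB/month -> {total:,.2f}")
```

With these placeholder rates the hourly part stays at 36.50 per month regardless of load, so at high traffic volumes the bill is driven almost entirely by the per-gigabyte term.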

The good news: with Calico, you can handle NAT from inside your Kubernetes cluster, avoiding cloud NAT gateway fees and giving you more control over how egress Continue reading

Addressing the unauthorized issuance of multiple TLS certificates for 1.1.1.1

Over the past few days Cloudflare has been notified through our vulnerability disclosure program and the certificate transparency mailing list that unauthorized certificates were issued by Fina CA for 1.1.1.1, one of the IP addresses used by our public DNS resolver service. From February 2024 to August 2025, Fina CA issued twelve certificates for 1.1.1.1 without our permission. We did not observe unauthorized issuance for any properties managed by Cloudflare other than 1.1.1.1.

We have no evidence that bad actors took advantage of this error. To impersonate Cloudflare's public DNS resolver 1.1.1.1, an attacker would not only require an unauthorized certificate and its corresponding private key, but attacked users would also need to trust the Fina CA. Furthermore, traffic between the client and 1.1.1.1 would have to be intercepted.

While this unauthorized issuance is an unacceptable lapse in security by Fina CA, we should have caught and responded to it earlier. After speaking with Fina CA, it appears that they issued these certificates for the purposes of internal testing. However, no CA should be issuing certificates for domains and IP addresses without checking control. At Continue reading

Linux For Network Engineers (LFNE) – Now on Ubuntu 24.04

The Linux For Network Engineers (LFNE) Docker container has been refreshed and is now built on Ubuntu 24.04 LTS. For those new to it, LFNE is a ready-to-use Linux environment preloaded with the most popular tools used by network engineers—from packet capture and traffic analysis utilities to configuration helpers and scripting support. Instead of spending […]

The post Linux For Network Engineers (LFNE) – Now on Ubuntu 24.04 first appeared on IPNET.

How Many Lab Devices Can netlab Handle?

TL&DR: Over 3000

A few weeks ago, Christian opened an issue describing how netlab breaks when the lab topology has more than 250 devices. We fixed that, only to get into another morass: some code has complexity higher than O(n) (meaning that going from 100 to 200 devices makes things more than twice as slow). Christian is working on one of those problems at the moment (it’s not that his ginormous labs won’t start, it just takes a long time), and I decided it’s time to polish a few other bits of the code.
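The "more than twice as slow" effect can be illustrated with a generic sketch that is not taken from the netlab code base: two hypothetical ways of finding duplicate device names, one comparing every pair of names and one using a set. The function names and the step counters are assumptions made purely for illustration.

```python
def find_duplicates_quadratic(names):
    """Compare every name with every later name: n*(n-1)/2 comparisons."""
    comparisons = 0
    duplicates = set()
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            comparisons += 1
            if first == second:
                duplicates.add(first)
    return duplicates, comparisons


def find_duplicates_linear(names):
    """Track seen names in a set: one step per name."""
    steps = 0
    seen = set()
    duplicates = set()
    for name in names:
        steps += 1
        if name in seen:
            duplicates.add(name)
        seen.add(name)
    return duplicates, steps


if __name__ == "__main__":
    for n in (100, 200):
        names = [f"r{i}" for i in range(n)]
        _, quadratic = find_duplicates_quadratic(names)
        _, linear = find_duplicates_linear(names)
        print(f"n={n}: pairwise={quadratic} comparisons, set-based={linear} steps")
```

Doubling the input from 100 to 200 names doubles the work of the set-based version (100 to 200 steps) but roughly quadruples the pairwise version (4950 to 19900 comparisons).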

TCG057: Following the Progress of the Model Context Protocol (MCP) With John Capobianco

John Capobianco is back! Just months after our first Model Context Protocol (MCP) discussion, John returns to showcase how this “USB-C of software” has transformed from an experimental technology to an enterprise-ready solution. We explore the game-changing OAuth 2.1 security updates, witness live demonstrations of packet analysis through natural language with Gemini CLI, and discover how... Read more »

Worth Reading 090325

Among the plethora of advanced attacker tools that exemplify how threat actors continuously evolve their tactics, techniques, and procedures (TTPs) to evade detection and maximize impact, PipeMagic, a highly modular backdoor used by Storm-2460 masquerading as a legitimate open-source ChatGPT Desktop Application, stands out as particularly advanced.

In this episode of PING, APNIC’s Chief Scientist, Geoff Huston, explores the economic inevitability of centrality in the modern Internet.

Philipp delivers a sober message for innovators: invention is only half the battle; defending your invention can define your company’s fate.

Google now estimates that the specs for a Cryptographically Relevant Quantum Computer (CRQC), which can break conventional public key encryption in a useful amount of time, are lower than they had previously estimated, by 95%.

In this report, I’ll focus on the material presented at the DELEG and DNSOP Working Groups.

NAN099: Bridging the Gap Between Innovative Tech and Everyday Users

New technologies, tools, and innovations help move IT forward, but it can be hard for users to keep up. Network Automation Nerds welcomes guest William Collins, a dynamic force in the world of technology. As a passionate tech evangelist, he helps to bridge the gap between emerging technologies such as AI and everyday users with... Read more »

D2DO281: Faddom: Providing a Unified Source of Truth for Security and IT Operations (Sponsored)

Faddom is re-envisioning what application dependency mapping and infrastructure inventory can be in the era of cloud and hybrid IT. Join us today on this sponsored episode as we speak with Faddom’s Itamar Rotem, CPO and Ofer Regev, CTO, about how Faddom’s discovery process can help to improve migrations for any size organization and help... Read more »

AI Week 2025: Recap

How do we embrace the power of AI without losing control?

That was one of our big themes for AI Week 2025, which has now come to a close. We announced products, partnerships, and features to help companies successfully navigate this new era.

Everything we built was based on feedback from customers like you that want to get the most out of AI without sacrificing control and safety. Over the next year, we will double down on our efforts to deliver world-class features that augment and secure AI. Please keep an eye on our Blog, AI Avenue, Product Change Log and CloudflareTV for more announcements.

This week we focused on four core areas to help companies deliver AI experiences safely and securely:

Securing AI environments and workflows
Protecting original content from misuse by AI
Helping developers build world-class, secure, AI experiences
Making Cloudflare better for you with AI

Thank you for following along with our first ever AI week at Cloudflare. This recap blog will summarize each announcement across these four core areas. For more information, check out our “This Week in NET” recap episode also featured at the end of this blog.

Securing AI Continue reading

SwiNOG 40: Submarine Cables

If you know as much about submarine cables (the thingies that carry 90% of international Internet traffic) as I do (= nothing), you SHOULD watch the Technical Update on Submarine Cables (video) presentation Liam Taylor gave at the SwiNOG 40 event. Have fun ;)

PP076: RF Risks and How to See Unseen Threats

Our airwaves are alive with radio frequencies (RF). Right now billions of devices around the world are chattering invisibly over Wi-Fi, Bluetooth, Zigbee, and other protocols you might not have heard of. On today’s show we peer into the invisible world to better understand the RF threat environment. Our guest is Brett Walkenhorst, CTO of... Read more »

The impact of the Salesloft Drift breach on Cloudflare and our customers

Last week, Cloudflare was notified that we (and our customers) are affected by the Salesloft Drift breach. Because of this breach, someone outside Cloudflare got access to our Salesforce instance, which we use for customer support and internal customer case management, and some of the data it contains. Most of this information is customer contact information and basic support case data, but some customer support interactions may reveal information about a customer's configuration and could contain sensitive information like access tokens. Given that Salesforce support case data contains the contents of support tickets with Cloudflare, any information that a customer may have shared with Cloudflare in our support system—including logs, tokens or passwords—should be considered compromised, and we strongly urge you to rotate any credentials that you may have shared with us through this channel.

As part of our response to this incident, we did our own search through the compromised data to look for tokens or passwords and found 104 Cloudflare API tokens. We have identified no suspicious activity associated with those tokens, but all of these have been rotated in an abundance of caution. All customers whose data was compromised in this breach have been informed directly by Continue reading

HS111: When Someone Makes Your Cloud Service Go Poof!

The modern enterprise is built on cloud, with most organizations using SaaS for their “horizontal” workhorse layers, such as communications, conferencing, HR, and payroll. That makes the enterprise entirely dependent on the good-faith execution and good-will delivery of the cloud providers. Those providers have a huge economic incentive to reliably deliver software – but... Read more »
