Archive

Category Archives for "Networking"

Akvorado release 2.0

Akvorado 2.0 was released today! Akvorado collects network flows with IPFIX and sFlow. It enriches flows and stores them in a ClickHouse database. Users can browse the data through a web console. This release introduces an important architectural change and other smaller improvements. Let’s dive in! 🤿

$ git diff --shortstat v1.11.5
 493 files changed, 25015 insertions(+), 21135 deletions(-)

New “outlet” service

The major change in Akvorado 2.0 is splitting the inlet service into two parts: the inlet and the outlet. Previously, the inlet handled all flow processing: receiving, decoding, and enrichment. Flows were then sent to Kafka for storage in ClickHouse:

Figure: Akvorado flow processing before the introduction of the outlet service (flows are received and processed by the inlet, sent to Kafka, and stored in ClickHouse)

Network flows reach the inlet service using UDP, an unreliable protocol. The inlet must process them fast enough to avoid losing packets. To handle a high number of flows, the inlet spawns several sets of workers to receive flows, fetch metadata, and assemble enriched flows for Kafka. Many configuration options existed for scaling, which increased complexity for users. The code needed to avoid blocking at any cost, making the processing pipeline complex and sometimes unreliable, particularly the BMP receiver. Adding new features became difficult… Continue reading
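
This constraint is easier to see in code. Below is a minimal Python sketch of the receive-then-enrich worker pattern described above; it is not Akvorado's actual code (which is written in Go), and the port number, queue size, and worker count are arbitrary. The key point is that the UDP receiver never blocks on the slower enrichment stage:

import queue
import socket
import threading

raw_flows: queue.Queue = queue.Queue(maxsize=10_000)  # bounded buffer between stages

def receiver(port: int = 2055) -> None:
    """Read UDP datagrams as fast as possible and never block on downstream work."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", port))
    while True:
        datagram, peer = sock.recvfrom(65535)
        try:
            raw_flows.put_nowait((peer[0], datagram))
        except queue.Full:
            pass  # dropping a flow here beats blocking the UDP socket

def worker() -> None:
    """Decode and enrich flows; this stage may be slow or call external services."""
    while True:
        exporter, datagram = raw_flows.get()
        enriched = {"exporter": exporter, "bytes": len(datagram)}  # placeholder enrichment
        print(enriched)  # a real pipeline would publish this record to Kafka

threading.Thread(target=receiver, daemon=True).start()
for _ in range(4):
    threading.Thread(target=worker, daemon=True).start()
threading.Event().wait()  # keep the sketch running

Akvorado 2.0 effectively moves that boundary between two services rather than two threads, so the enrichment side can scale and evolve without endangering the part that must keep up with the UDP socket.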

Changing Colors and Line Styles in netlab Graphs

Last week, I explained how to generate network topology graphs (using GraphViz or D2 graphing engines) from a netlab lab topology. Let’s see how we can make them look nicer (or at least more informative). We’ll work with a simple leaf-and-spine topology with four nodes:

Baseline leaf-and-spine topology
defaults.device: frr
provider: clab

nodes: [ s1, s2, l1, l2 ]
links: [ s1-l1, s1-l2, s2-l1, s2-l2 ]

The baseline graph is generated by running netlab create followed by dot graph.dot -T png -o graph.png.
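
The generated graph.dot file is plain GraphViz source, so standard GraphViz node and edge attributes are what ultimately control colors and line styles. As a rough illustration only (this is not the netlab styling mechanism the post describes), here is the same four-node topology built with the graphviz Python package and a few of those attributes; all color and style values are arbitrary:

import graphviz

g = graphviz.Graph("leaf_spine")
for spine in ("s1", "s2"):
    g.node(spine, shape="box", style="filled", fillcolor="moccasin", color="darkorange")
for leaf in ("l1", "l2"):
    g.node(leaf, shape="box", style="filled", fillcolor="lightblue", color="steelblue")
for spine in ("s1", "s2"):
    for leaf in ("l1", "l2"):
        g.edge(spine, leaf, color="gray40", style="dashed", penwidth="1.5")
g.render("styled", format="png")  # writes the DOT source to "styled" and the image to "styled.png"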

Cloudflare’s 2025 Annual Founders’ Letter

Cloudflare launched 15 years ago this week. We like to celebrate our birthday by announcing new products and features that give back to the Internet, which we’ll do a lot of this week. But, on this occasion, we've also been thinking about what's changed on the Internet over the last 15 years and what has not.

With some things there's been clear progress: when we launched in 2010, less than 10 percent of the Internet was encrypted; today, well over 95 percent is encrypted. We're proud of the role we played in making that happen.

Some other areas have seen limited progress: IPv6 adoption has grown steadily but painfully slowly over the last 15 years, in spite of our efforts. That's a problem: as IPv4 addresses have become scarce and expensive, the shortage has held back new entrants and driven up the costs of things like networking and cloud computing.

The Internet’s Business Model

Still other things have remained remarkably consistent: the basic business model of the Internet has for the last 15 years been the same — create compelling content, find a way to be discovered, and then generate value from the resulting traffic. Whether that was through ads or… Continue reading

🖥️ Running Local LLMs: Experiments and Insights

✨ Summary: Large Language Models (LLMs) have powered the AI wave of the last 3–4 years. While most are closed-source, a vibrant ecosystem of open-weight and open-source models has emerged. As a long-time AI user, I wanted to peek under the hood: how do GenAI models work, and what happens when you actually run them … Continue reading 🖥️ Running Local LLMs: Experiments and Insights
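
For readers who want the shortest possible taste of actually running a model locally, here is a hedged sketch that is not from the original post: it assumes an Ollama server listening on its default port (11434) with a model such as llama3.2 already pulled.

import json
import urllib.request

def ask_local_llm(prompt: str, model: str = "llama3.2") -> str:
    """Send a prompt to a locally running Ollama server and return its reply."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]

print(ask_local_llm("Explain what a network flow is in one sentence."))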

Getting Started With Infrahub MCP Server

In this post, we’ll be looking at how to use the Infrahub MCP server. But, before we get there, we’ll go through some background on the Model Context Protocol (MCP) itself, show a simple example to explain how it works, and then connect it back to Infrahub. This will give us the basics before moving on to the Infrahub-specific setup. Here’s what we’ll cover:

  • A quick background on what MCP is and why it’s needed
  • A simple example of an MCP server to show how it works
  • How to connect an MCP server to host applications like Claude Desktop and Cursor
  • Setting up and using the Infrahub MCP server
  • Example use cases where the Infrahub MCP server can help in real workflows

Disclaimer – OpsMill has partnered with me for this post, and they also support my blog as a sponsor. The post was originally published at https://opsmill.com/blog/getting-started-infrahub-mcp-server/

What is Model Context Protocol (MCP)?

If you're doing anything with AI (and honestly, who isn’t these days), you’ve probably heard of Model Context Protocol, or MCP. Anthropic introduced MCP in November 2024, which means it hasn’t been around for long and is still evolving quickly.

MCP is a communication… Continue reading
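
To make that concrete before getting to the Infrahub-specific setup, here is a minimal MCP server sketch using the official Python SDK (the mcp package); the server name and the toy inventory tool are invented for illustration and are not the example from the post:

from mcp.server.fastmcp import FastMCP

# Hypothetical server exposing a single tool over stdio.
mcp = FastMCP("demo-inventory")

@mcp.tool()
def get_device(name: str) -> dict:
    """Return basic details for a device in a toy inventory."""
    inventory = {
        "leaf1": {"platform": "eos", "role": "leaf", "mgmt_ip": "10.0.0.11"},
        "spine1": {"platform": "eos", "role": "spine", "mgmt_ip": "10.0.0.1"},
    }
    return inventory.get(name, {"error": f"unknown device {name}"})

if __name__ == "__main__":
    mcp.run()  # stdio transport, so hosts like Claude Desktop or Cursor can launch it

A host application lists the server in its configuration, launches it, discovers the get_device tool, and can call it on the model's behalf.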

Pleasant Surprise: Google AI Overview

When I was writing a blog post, I needed a link to the netlab lab topology documentation, so I searched for “netlab lab topology” (I know I’m lazy, but it felt quicker than navigating the sidebar menu).

The AI overview I got was way too verbose, but it nailed the Key Concepts and How It Works well enough that I could just use them in the netlab README.md file. Maybe this AI thing is becoming useful after all ;)

Ultra Ethernet: Libfabric Resource Initialization

Introduction

Ultra Ethernet uses the libfabric communication framework to let endpoints interact with AI frameworks and, ultimately, with each other across GPUs. Libfabric provides a high-performance, low-latency API that hides the details of the underlying transport, so AI frameworks do not need to manage endpoints, buffers, or the address tables that map communication paths. This makes applications more portable across different fabrics while still providing access to advanced features such as zero-copy transfers and RDMA, which are essential for large-scale AI workloads.

During system initialization, libfabric coordinates with the appropriate provider—such as the UET provider—to query the network hardware and organize communication around three main objects: the Fabric, the Domain, and the Endpoint. Each object manages specific sub-objects and resources. For example, a Domain handles memory registration and hardware resources, while an Endpoint is associated with completion queues, transmit/receive buffers, and transport metadata. Ultra Ethernet maps these objects directly to the network hardware, ensuring that when GPUs begin exchanging training data, the communication paths are already aligned for low-latency, high-bandwidth transfers.

Once initialization is complete, AI frameworks issue standard libfabric calls to send and receive data. Ultra Ethernet ensures that this data flows efficiently across… Continue reading
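
The object hierarchy is easier to picture as data. The following Python sketch mirrors only the relationships described above; it is not the libfabric C API, and every field name and value is illustrative:

from dataclasses import dataclass, field

@dataclass
class CompletionQueue:
    depth: int

@dataclass
class Endpoint:
    address: str
    tx_cq: CompletionQueue   # completion queue for transmits
    rx_cq: CompletionQueue   # completion queue for receives

@dataclass
class Domain:
    name: str
    memory_regions: list = field(default_factory=list)  # registered buffers
    endpoints: list = field(default_factory=list)

@dataclass
class Fabric:
    provider: str                                        # e.g. the UET provider
    domains: list = field(default_factory=list)

# Initialization order mirrors the text: Fabric, then Domain, then Endpoint (+ CQs).
fabric = Fabric(provider="uet")
domain = Domain(name="gpu0-nic0")
fabric.domains.append(domain)
endpoint = Endpoint(address="fe80::1", tx_cq=CompletionQueue(1024), rx_cq=CompletionQueue(1024))
domain.endpoints.append(endpoint)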

You don’t need quantum hardware for post-quantum security

Organizations have finite resources available to combat threats, both by the adversaries of today and those in the not-so-distant future that are armed with quantum computers. In this post, we provide guidance on what to prioritize to best prepare for the future, when quantum computers become powerful enough to break the conventional cryptography that underpins the security of modern computing systems.  We describe how post-quantum cryptography (PQC) can be deployed on your existing hardware to protect from threats posed by quantum computing, and explain why quantum key distribution (QKD) and quantum random number generation (QRNG) are neither necessary nor sufficient for security in the quantum age.

Are you quantum ready?

“Quantum” is becoming one of the most heavily used buzzwords in the tech industry. What does it actually mean, and why should you care?

At its core, “quantum” refers to technologies that harness principles of quantum mechanics to perform tasks that are not feasible with classical computers. Quantum computers have exciting potential to unlock advancements in materials science and medicine, but also pose a threat to computer security systems. The term Q-day refers to the day that adversaries possess quantum computers that are large and stable enough to… Continue reading

Use Additional BGP Paths for IBGP Load Balancing

I wrote about the optimal BGP path selection with BGP additional paths in 2021, and I probably mentioned (in one of the 360 BGP-related blog posts) that you need it to implement IBGP load balancing in networks using BGP route reflectors. If you want to try that out, check out the IBGP Load Balancing with BGP Additional Paths lab exercise.

Click here to start the lab in your browser using GitHub Codespaces (or set up your own lab infrastructure). After starting the lab environment, change the directory to lb/4-ibgp-add-path and execute netlab up.

Connect and secure any private or public app by hostname, not IP — free for everyone in Cloudflare One

Connecting to an application should be as simple as knowing its name. Yet, many security models still force us to rely on brittle, ever-changing IP addresses. And we heard from many of you that managing those IP lists was a constant struggle.

Today, we’re taking a major step toward making that a relic of the past.

We're excited to announce that you can now route traffic to Cloudflare Tunnel based on a hostname or a domain. This allows you to use Cloudflare Tunnel to build simple zero-trust and egress policies for your private and public web applications without ever needing to know their underlying IP. This is one more step on our mission to strengthen platform-wide support for hostname- and domain-based policies in the Cloudflare One SASE platform, simplifying complexity and improving security for our customers and end users. 

Grant access to applications, not networks

In August 2020, the National Institute of Standards and Technology (NIST) published Special Publication 800-207, encouraging organizations to abandon the "castle-and-moat" model of security (where trust is established on the basis of network location) and move to a Zero Trust model (where we "verify anything and everything attempting to establish access").

Continue reading

Arista EOS Hates a Routing Instance with No Interfaces

I always ask engineers reporting a netlab bug to provide a minimal lab topology that would reproduce the error, sometimes resulting in “interesting” side effects. For example, I was trying to debug a BGP-related Arista EOS issue using a netlab topology similar to this one:

defaults.device: eos
module: [ bgp ]
nodes:
  a: { bgp.as: 65000 }
  b: { bgp.as: 65001 }

Imagine my astonishment when the two switches failed to configure BGP. Here’s the error message I got when running netlab’s deploy device configurations Ansible playbook:

The RUM Diaries: enabling Web Analytics by default

Measuring and improving performance on the Internet can be a daunting task because it spans multiple layers: from the user’s device and browser, to DNS lookups and the network routes, to edge configurations and origin server location. Each layer introduces its own sources of variability, such as last-mile bandwidth constraints, third-party scripts, or limited CPU resources, that are often invisible unless you have robust observability tooling in place. Even if you gather data from most of these Internet hops, performance engineers still need to correlate different metrics like front-end events, network processing times, and server-side logs in order to pinpoint where and why elusive “latency” occurs, and then work out how to fix it.

We want to solve this problem by providing a powerful, in-depth monitoring solution that helps you debug and optimize applications, so you can understand and trace performance issues across the Internet, end to end.

That’s why we’re excited to announce the start of a major upgrade to Cloudflare’s performance analytics suite: Web Analytics as part of our real user monitoring (RUM) tools will soon be combined with network-level insights to help you pinpoint performance issues anywhere on a packet’s journey — from a visitor’s browser, through Cloudflare’s network, to your… Continue reading

TCG058: Creating the Internet Layer That Should Have Been With Avery Pennarun

In this deep dive episode, we explore the evolution of networking with Avery Pennarun, Co-Founder and CEO of Tailscale. Avery shares his extensive journey through VPN technologies, from writing his first mesh VPN protocol, “Tunnel Vision,” in 1997, to building Tailscale, a zero-trust networking solution. We discuss how Tailscale reimagines the OSI stack by... Read more »

NAN100: A Retrospective On 100 Episodes of Network Automation Nerds

Network Automation Nerds has reached a special milestone: episode 100! Eric Chou looks back on 5 years of conversations with network automation pioneers, practitioners, and visionaries. Drew Conry-Murray from the Packet Pushers joins Eric, along with online guest Ioannis Theodoridis, to find out why Eric started the podcast, his goals for all these conversations, a... Read more »