⚠️ WARNING ⚠️ This blog post contains graphic depictions of probability. Reader discretion is advised.
Measuring performance is tricky. You have to think about accuracy and precision. Are your sampling rates high enough? Could they be too high?? How much metadata does each recording need??? Even after all that, all you have is raw data. Eventually for all this raw performance information to be useful, it has to be aggregated and communicated. Whether it's in the form of a dashboard, customer report, or a paged alert, performance measurements are only useful if someone can see and understand them.
This post is a collection of things I've learned working on customer performance escalations within Cloudflare and analyzing existing tools (both internal and commercial) that we use when evaluating our own performance. A lot of this information also comes from Gil Tene's talk, How NOT to Measure Latency. You should definitely watch that too (but maybe after reading this, so you don't spoil the ending). I was surprised by my own blind spots and which assumptions turned out to be wrong, even though they seemed "obviously true" at the start. I expect I am not alone in these regards. For that Continue reading
netlab uses point-to-point links provided by the underlying virtualization software to implement links with two nodes and Linux bridges to implement links with more than two nodes connected to them. That’s usually OK if you don’t care about the bridge implementation details, but what if you’d like to use a bridge (or a layer-2 switch if you happen to be of marketing persuasion) you’re familiar with?
You could always implement a bridged segment with a set of links connecting edge nodes to a VLAN-capable device. For example, you could use the following topology to connect two Linux hosts through a bridge running Arista EOS:
IPv4 addresses have become a costly commodity, driven by their growing scarcity. With the original pool of 4.3 billion addresses long exhausted, organizations must now rely on the secondary market to acquire them. Over the years, prices have surged, often exceeding $30–$50 USD per address, with costs varying based on block size and demand. Given the scarcity, these prices are only going to rise, particularly for businesses that haven’t transitioned to IPv6. This rising cost and limited availability have made efficient IP address management more critical than ever. In response, we’ve evolved how we handle BYOIP (Bring Your Own IP) prefixes to give customers greater flexibility.
Historically, when customers onboarded a BYOIP prefix, they were required to assign it to a single service, binding all IP addresses within that prefix to one service before it was advertised. Once set, the prefix's destination was fixed — to direct traffic exclusively to that service. If a customer wanted to use a different service, they had to onboard a new prefix or go through the cumbersome process of offboarding and re-onboarding the existing one.
As a step towards addressing this limitation, we’ve introduced a new level of flexibility: customers can Continue reading
I was considering an AI add-on that would have access to the netlab documentation and help you figure out how to use it for a few years, but never got around to implementing it (and surprisingly, with all the AI hype out there, there were no volunteers submitting pull requests). A few weeks ago, someone suggested adding an MCP server as an interface to ipSpace.net content, but the discussion quickly devolved into vague ideas.
However, as ChatGPT now has access to the live Internet, I decided to try out whether it can get the job done with a bit of prompting.
TL&DR: After a hiccup, it worked surprisingly well.
I’ve worked with Cisco, Arista, and Juniper switches most of my life, but when I first started using UniFi switches in my homelab, I found myself a bit confused. The way VLANs are configured on UniFi switches is slightly different from what I was used to. In this post, I’ll go through how to configure VLANs on UniFi switches, specifically focusing on the USW-Pro-Max-16 and USW-Lite-8 models.
VLAN stands for Virtual LAN, and it's a way to logically segment a network, even if all devices are connected to the same physical switch. Different vendors use slightly different terms when it comes to VLAN port types. For example, Cisco calls them access and trunk ports, while others might refer to them as untagged and tagged ports.
An untagged (or access) port is typically used to connect end devices like PCs or printers. These devices have no awareness of VLANs, they just send regular Ethernet frames. When the switch receives a frame on an access port, it tags it with the VLAN configured for that port before forwarding it internally or out via a trunk port.
Tagged (or trunk) ports are used between switches or to other Continue reading
In an era where digital threats evolve faster than ever, cybersecurity isn't just a back-office concern — it's a critical business priority. At Cloudflare, we understand the responsibility that comes with operating in a connected world. As part of our ongoing commitment to security and transparency, Cloudflare is proud to have joined the United States Cybersecurity and Infrastructure Security Agency’s (CISA) “Secure by Design” pledge in May 2024.
By signing this pledge, Cloudflare joins a growing coalition of companies committed to strengthening the resilience of the digital ecosystem. This isn’t just symbolic — it's a concrete step in aligning with cybersecurity best practices and our commitment to protect our customers, partners, and data.
A central goal in CISA’s Secure by Design pledge is promoting transparency in vulnerability reporting. This initiative underscores the importance of proactive security practices and emphasizes transparency in vulnerability management — values that are deeply embedded in Cloudflare’s Product Security program. We believe that openness around vulnerabilities is foundational to earning and maintaining the trust of our customers, partners, and the broader security community.
Transparency in vulnerability reporting is essential for building trust between companies and customers. In 2008, Continue reading
Several excellent books have been published over the past decade on Deep Learning (DL) and Datacenter Networking. However, I have not found a book that covers these topics together—as an integrated deep learning training system—while also highlighting the architecture of the datacenter network, especially the backend network, and the demands it must meet.
This book aims to bridge that gap by offering insights into how Deep Learning workloads interact with and influence datacenter network design.
Deep Learning is a subfield of Machine Learning (ML), which itself is a part of the broader concept of Artificial Intelligence (AI). Unlike traditional software systems where machines follow explicitly programmed instructions, Deep Learning enables machines to learn from data without manual rule-setting.
At its core, Deep Learning is about training artificial neural networks. These networks are mathematical models composed of layers of artificial neurons. Different types of networks suit different tasks—Convolutional Neural Networks (CNNs) for image recognition, and Large Language Models (LLMs) for natural language processing, to name a few.
Training a neural network involves feeding it labeled data and adjusting its internal parameters through a process called backpropagation. During the forward pass, the model Continue reading
With the rise of traffic from AI agents, what’s considered a bot is no longer clear-cut. There are some clearly malicious bots, like ones that DoS your site or do credential stuffing, and ones that most site owners do want to interact with their site, like the bot that indexes your site for a search engine, or ones that fetch RSS feeds.
Historically, Cloudflare has relied on two main signals to verify legitimate web crawlers from other types of automated traffic: user agent headers and IP addresses. The User-Agent
header allows bot developers to identify themselves, i.e. MyBotCrawler/1.1
. However, user agent headers alone are easily spoofed and are therefore insufficient for reliable identification. To address this, user agent checks are often supplemented with IP address validation, the inspection of published IP address ranges to confirm a crawler's authenticity. However, the logic around IP address ranges representing a product or group of users is brittle – connections from the crawling service might be shared by multiple users, such as in the case of privacy proxies and VPNs, and these ranges, often maintained by cloud providers, change over time.
Cloudflare will always try to block malicious bots, but Continue reading
Andrew Yourtchenko and Dr. Tony Przygienda left wonderful comments to my Screen Scraping in 2025 blog post, but unfortunately they prefer commenting on a closed platform with ephemeral content; the only way to make their thoughts available to a wider audience is by reposting them. Andrew first:
I keep saying CLI is an API. However, it is much simpler and an easier way to adapt to the changes, if these three conditions are met:
The Infrahub Python SDK allows you to interact with Infrahub programmatically and can be used to query, create, modify, and delete data. In a previous blog post, we looked at how to query data using the Python SDK and explored various examples, including filters, relationships, and how to retrieve related data.
Originally published under - https://www.opsmill.com/infrahub-python-sdk-create-modify-delete/
In this post, we’ll focus on how to create, modify, delete and upsert data using the SDK. We’ll walk through practical examples that show how to add new resources, update existing ones, and delete data from Infrahub.
Throughout this post, we’ll be using the Infrahub sandbox, which is freely available. The sandbox already has some data in it, so if you’d like to follow along or try this yourself, you can use it without needing to set up anything.
In the previous post, we covered the basics of using the Python SDK, including how to install it and set up the client object. If you’re new to the SDK, I recommend going back to that first article to start from the install.
To get started today, I’ve generated an API token on the Infrahub demo instance Continue reading