For much of the two-plus years since ChatGPT hit the market and kicked off the generative AI frenzy, the market tilted toward well-resourced hyperscalers like Google, Amazon Web Services, and Microsoft as well as Tier 2 cloud service providers, with powerful – and expensive – accelerators and massive large language models like Meta’s Llama with 405 billion parameters. …
⚠️ WARNING ⚠️ This blog post contains graphic depictions of probability. Reader discretion is advised.
Measuring performance is tricky. You have to think about accuracy and precision. Are your sampling rates high enough? Could they be too high?? How much metadata does each recording need??? Even after all that, all you have is raw data. Eventually for all this raw performance information to be useful, it has to be aggregated and communicated. Whether it's in the form of a dashboard, customer report, or a paged alert, performance measurements are only useful if someone can see and understand them.
This post is a collection of things I've learned working on customer performance escalations within Cloudflare and analyzing existing tools (both internal and commercial) that we use when evaluating our own performance. A lot of this information also comes from Gil Tene's talk, How NOT to Measure Latency. You should definitely watch that too (but maybe after reading this, so you don't spoil the ending). I was surprised by my own blind spots and which assumptions turned out to be wrong, even though they seemed "obviously true" at the start. I expect I am not alone in these regards. For that Continue reading
This will probably be the last visualization example for a while because I stopped working with network visualizations some time ago. But I wanted to finish publishing some last examples … Read More
How far ahead should you plan, and what things belong in your strategic plan? Conventional wisdom holds that a 3-year planning horizon is “about right”–but in a period of rapid technical and geopolitical change (such as we’re arguably in right now) does that go too far out, particularly when agile methodologies recommend shorter action plans... Read more »
You could always implement a bridged segment with a set of links connecting edge nodes to a VLAN-capable device. For example, you could use the following topology to connect two Linux hosts through a bridge running Arista EOS:
There are many reasons why Nvidia is the hardware juggernaut of the AI revolution, and one of them, without question, is the NVLink memory sharing port that started out on its “Pascal” P100 GPU accelerators way back in 2016. …
Take a Network Break! Guest co-host Tom Hollingsworth steps in for Johna Johnson. We start with Google patching a significant Chrome vulnerability and de-elevating Chrome running with admin rights when it launches on Windows. On the news front, we discuss a report, unconfirmed as of recording time, that Arista is acquiring VeloCloud, then discuss Broadcom... Read more »
IPv4 addresses have become a costly commodity, driven by their growing scarcity. With the original pool of 4.3 billion addresses long exhausted, organizations must now rely on the secondary market to acquire them. Over the years, prices have surged, often exceeding $30–$50 USD per address, with costs varying based on block size and demand. Given the scarcity, these prices are only going to rise, particularly for businesses that haven’t transitioned to IPv6. This rising cost and limited availability have made efficient IP address management more critical than ever. In response, we’ve evolved how we handle BYOIP (Bring Your Own IP) prefixes to give customers greater flexibility.
Historically, when customers onboarded a BYOIP prefix, they were required to assign it to a single service, binding all IP addresses within that prefix to one service before it was advertised. Once set, the prefix's destination was fixed — to direct traffic exclusively to that service. If a customer wanted to use a different service, they had to onboard a new prefix or go through the cumbersome process of offboarding and re-onboarding the existing one.
As a step towards addressing this limitation, we’ve introduced a new level of flexibility: customers can Continue reading
AI is no longer on the horizon. It’s part of how people and products work today. And as AI finds its way into more business applications and processes, it can create new risks. On today’s Tech Bytes, sponsored by Palo Alto Networks, we talk about how Palo Alto Networks is addressing those risks so that... Read more »
For a few years, I was considering an AI add-on that would have access to the netlab documentation and help you figure out how to use netlab, but I never got around to implementing it (and surprisingly, with all the AI hype out there, there were no volunteers submitting pull requests). A few weeks ago, someone suggested adding an MCP server as an interface to ipSpace.net content, but the discussion quickly devolved into vague ideas.
However, as ChatGPT now has access to the live Internet, I decided to try out whether it can get the job done with a bit of prompting.
TL&DR: After a hiccup, it worked surprisingly well.
I’ve worked with Cisco, Arista, and Juniper switches most of my life, but when I first started using UniFi switches in my homelab, I found myself a bit confused. The way VLANs are configured on UniFi switches is slightly different from what I was used to. In this post, I’ll go through how to configure VLANs on UniFi switches, specifically focusing on the USW-Pro-Max-16 and USW-Lite-8 models.
Quick Recap on VLANs
VLAN stands for Virtual LAN, and it's a way to logically segment a network, even if all devices are connected to the same physical switch. Different vendors use slightly different terms when it comes to VLAN port types. For example, Cisco calls them access and trunk ports, while others might refer to them as untagged and tagged ports.
An untagged (or access) port is typically used to connect end devices like PCs or printers. These devices have no awareness of VLANs; they just send regular Ethernet frames. When the switch receives a frame on an access port, it tags the frame with the VLAN configured for that port before forwarding it internally or out via a trunk port.
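To make the tagging step concrete, here is a minimal Python sketch of what "tagging a frame" means at the byte level, assuming standard IEEE 802.1Q encapsulation: a 4-byte tag (TPID 0x8100 followed by the priority, DEI, and VLAN ID bits) is inserted right after the source MAC address. The function names, MAC addresses, and sample frame are made up for illustration; this is not how any particular switch implements it, and real switches do this in hardware.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def add_vlan_tag(frame: bytes, vlan_id: int, priority: int = 0) -> bytes:
    """Insert an 802.1Q tag after the destination and source MAC addresses."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be between 1 and 4094")
    if not 0 <= priority <= 7:
        raise ValueError("priority must be between 0 and 7")
    if len(frame) < 14:
        raise ValueError("frame is shorter than an Ethernet header")
    # TCI layout: 3 bits priority, 1 bit DEI (left at 0), 12 bits VLAN ID
    tci = (priority << 13) | vlan_id
    tag = struct.pack("!HH", TPID_8021Q, tci)
    # Bytes 0-11 are the two MAC addresses; the tag goes right after them
    return frame[:12] + tag + frame[12:]


def strip_vlan_tag(frame: bytes) -> tuple[int, bytes]:
    """Return (vlan_id, untagged_frame) for an 802.1Q-tagged frame."""
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid != TPID_8021Q:
        raise ValueError("frame does not carry an 802.1Q tag")
    return tci & 0x0FFF, frame[:12] + frame[16:]


# Illustrative frame: broadcast destination, made-up source MAC, IPv4 EtherType
dst_mac = bytes.fromhex("ffffffffffff")
src_mac = bytes.fromhex("020000000001")
untagged = dst_mac + src_mac + struct.pack("!H", 0x0800) + b"hello"

# What an access port in VLAN 10 would conceptually do on ingress
tagged = add_vlan_tag(untagged, vlan_id=10)
```

Stripping the tag again with `strip_vlan_tag(tagged)` gives back the VLAN ID and the original frame, which is conceptually what happens when the frame leaves through another access port in the same VLAN.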
Tagged (or trunk) ports are used between switches or to other Continue reading
If you are a neocloud – and there seem to be more of these popping up like mushrooms in a moist North Carolina spring in the mountains – then you are going to need a pricing edge and a niche offering to compete with the big clouds and rival neoclouds. …
While studying for the CCIE Service Provider certification, Andrew Ohanian assembled a workbook to help him prepare. It’s packed with lab exercises, and Andrew has turned it into a free Web resource that anyone can access. On today’s Heavy Networking, we talk with Andrew about what’s in the guide, the state of the CCIE SP,... Read more »
In an era where digital threats evolve faster than ever, cybersecurity isn't just a back-office concern; it's a critical business priority. At Cloudflare, we understand the responsibility that comes with operating in a connected world. As part of our ongoing commitment to security and transparency, Cloudflare is proud to have joined the United States Cybersecurity and Infrastructure Security Agency’s (CISA) “Secure by Design” pledge in May 2024.
By signing this pledge, Cloudflare joins a growing coalition of companies committed to strengthening the resilience of the digital ecosystem. This isn’t just symbolic — it's a concrete step in aligning with cybersecurity best practices and our commitment to protect our customers, partners, and data.
A central goal in CISA’s Secure by Design pledge is promoting transparency in vulnerability reporting. This initiative underscores the importance of proactive security practices and emphasizes transparency in vulnerability management — values that are deeply embedded in Cloudflare’s Product Security program. We believe that openness around vulnerabilities is foundational to earning and maintaining the trust of our customers, partners, and the broader security community.
Why transparency in vulnerability reporting matters
Transparency in vulnerability reporting is essential for building trust between companies and customers. In 2008, Continue reading
Michael Costello shares his career journey on today’s Total Network Operations. Currently on the Board of Directors at NANOG and a Distinguished Engineer at Saviynt, Michael talks about his early days learning the ropes as a junior network engineer, trying to start an ISP, his stint in graduate school, and a very interesting role at... Read more »
Our IPv6 Basics series continues with link-local addresses. Link-local addresses are unicast addresses used for addressing on a single link. The intent of link-local addresses is to let devices that may not have a router or global unicast address allocation mechanism still be able to communicate on a network segment. On today’s show we dig... Read more »
Several excellent books have been published over the past decade on Deep Learning (DL) and Datacenter Networking. However, I have not found a book that covers these topics together—as an integrated deep learning training system—while also highlighting the architecture of the datacenter network, especially the backend network, and the demands it must meet.
This book aims to bridge that gap by offering insights into how Deep Learning workloads interact with and influence datacenter network design.
So, what is Deep Learning?
Deep Learning is a subfield of Machine Learning (ML), which itself is a part of the broader concept of Artificial Intelligence (AI). Unlike traditional software systems where machines follow explicitly programmed instructions, Deep Learning enables machines to learn from data without manual rule-setting.
At its core, Deep Learning is about training artificial neural networks. These networks are mathematical models composed of layers of artificial neurons. Different types of networks suit different tasks—Convolutional Neural Networks (CNNs) for image recognition, and Large Language Models (LLMs) for natural language processing, to name a few.
Training a neural network involves feeding it labeled data and adjusting its internal parameters through a process called backpropagation. During the forward pass, the model Continue reading
With the rise of traffic from AI agents, what’s considered a bot is no longer clear-cut. There are some clearly malicious bots, like ones that DoS your site or do credential stuffing, and ones that most site owners do want to interact with their site, like the bot that indexes your site for a search engine, or ones that fetch RSS feeds.
Historically, Cloudflare has relied on two main signals to distinguish legitimate web crawlers from other types of automated traffic: user agent headers and IP addresses. The User-Agent header allows bot developers to identify themselves, e.g. MyBotCrawler/1.1. However, user agent headers alone are easily spoofed and are therefore insufficient for reliable identification. To address this, user agent checks are often supplemented with IP address validation, the inspection of published IP address ranges to confirm a crawler's authenticity. However, the logic around IP address ranges representing a product or group of users is brittle – connections from the crawling service might be shared by multiple users, such as in the case of privacy proxies and VPNs, and these ranges, often maintained by cloud providers, change over time.
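As a rough illustration of the two signals described above, the following Python sketch combines a User-Agent check with IP range validation. It is not Cloudflare's implementation: the `PUBLISHED_RANGES` table, the function names, and the address ranges (reserved documentation prefixes) are all hypothetical. It also inherits the brittleness noted above, because the range list has to be kept up to date by hand.

```python
import ipaddress
from typing import Optional

# Hypothetical published ranges for a hypothetical crawler.
# 192.0.2.0/24 and 2001:db8::/32 are reserved documentation prefixes.
PUBLISHED_RANGES = {
    "MyBotCrawler": [
        ipaddress.ip_network("192.0.2.0/24"),
        ipaddress.ip_network("2001:db8::/32"),
    ],
}


def claimed_bot(user_agent: str) -> Optional[str]:
    """Return the known bot name the User-Agent claims to be, if any."""
    lowered = user_agent.lower()
    for name in PUBLISHED_RANGES:
        if name.lower() in lowered:
            return name
    return None


def is_verified_crawler(user_agent: str, source_ip: str) -> bool:
    """Accept the claim only if the source IP is inside a published range."""
    name = claimed_bot(user_agent)
    if name is None:
        return False  # no recognizable claim to verify
    addr = ipaddress.ip_address(source_ip)
    # Membership is False when address and network IP versions differ
    return any(addr in network for network in PUBLISHED_RANGES[name])
```

With these assumed ranges, a request claiming `MyBotCrawler/1.1` from `192.0.2.10` passes, while the same User-Agent from `198.51.100.7` is rejected as a likely spoof.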
Cloudflare will always try to block malicious bots, but Continue reading
Here is a question for you. What is harder to get right now: 1,665 of Nvidia’s “Blackwell” B200 GPU compute engines or 10 megawatts of power for a four year contract in the Northeast region of the United States? …
IT environments today have a passing resemblance to those from 15 or 20 years ago, when enterprise workloads mostly ran on industry standard servers connected through networks and into storage systems that were all contained within the four walls of a datacenter, where performance was the name of the game and everything was protected by a perimeter of security designed to keep the bad guys out. …