Archive

Category Archives for "Networking"

Ultra Ethernet: Event Queue

Event Queue Creation (fi_eq_open)


Phase 1: Application – Request & Definition


The purpose of this phase is to specify the type, size, and capabilities of the Event Queue (EQ) your application needs. Event queues are used to report events associated with control operations. They can be linked to memory registration, address vectors, connection management, and fabric- or domain-level events. Reported events are either associated with a requested operation or affiliated with a call that registers for specific types of events, such as listening for connection requests. By preparing a struct fi_eq_attr, the application describes exactly what it needs so the provider can allocate the EQ properly.

In addition to basic properties like .size (number of events the queue can hold) and .wait_obj (how the application waits for events), the .flags field can request specific EQ capabilities. Common flags include:


  • FI_WRITE: Requests support for user-inserted events via fi_eq_write(). If this flag is set, the provider must allow the application to invoke fi_eq_write().
  • FI_REMOTE_WRITE: Requests support for remote write completions being reported to this EQ.
  • FI_RMA: Requests support for Remote Memory Access events (e.g., RMA completions) to be delivered to this EQ.

Flags are encoded as a bitmask, so multiple Continue reading

Changing the Layout of netlab Topology Graphs

After I published the updated netlab topology graphs article, Samuel K. Lam quickly made a comment along the lines of now we know how the graph representing the following topology was made, adding a nice ASCII art that illustrated the point I was trying to make much better than my graphs:

ASCII art representing the BGP leak lab

ASCII art representing the BGP leak lab

Let’s see how close we can get to that ideal topology diagram with GraphViz and D2 graphs.

Geolocation and Starlink

"Where are you?" is not an easy question to answer on the Internet. The Internet did not adopt a geographic address plan which means that you are going to need a lot of additional information if you want to map an IP address into a location at the level of a country or a city. But what can you do about a satellite service that Provides Internet access for ships at sea aircraft flying international routes?

NB545: CISA Orders Immediate Patch of Cisco Vulnerabilities; Firewall Upgrade Blocks Emergency Calls

There’s an abundance of vulnerabilities in this week’s Network Break. We start with a red alert on a cluster of Cisco vulnerabilities in its firewall and threat defense products. On the news front, the vulnerability spotlight stays on Cisco as the US Cybersecurity and Infrastructure Security Agency (CISA) issues an emergency directive to all federal... Read more »

15 years of helping build a better Internet: a look back at Birthday Week 2025

Cloudflare launched fifteen years ago with a mission to help build a better Internet. Over that time the Internet has changed and so has what it needs from teams like ours.  In this year’s Founder’s Letter, Matthew and Michelle discussed the role we have played in the evolution of the Internet, from helping encryption grow from 10% to 95% of Internet traffic to more recent challenges like how people consume content. 

We spend Birthday Week every year releasing the products and capabilities we believe the Internet needs at this moment and around the corner. Previous Birthday Weeks saw the launch of IPv6 gateway in 2011,  Universal SSL in 2014, Cloudflare Workers and unmetered DDoS protection in 2017, Cloudflare Radar in 2020, R2 Object Storage with zero egress fees in 2021,  post-quantum upgrades for Cloudflare Tunnel in 2022, Workers AI and Encrypted Client Hello in 2023. And those are just a sample of the launches.

This year’s themes focused on helping prepare the Internet for a new model of monetization that encourages great content to be published, fostering more opportunities to build community both inside and outside of Cloudflare, and evergreen missions like making more features available to Continue reading

Android Phones Might Ask for /64 Delegated Prefix

I’m too old to be fighting with windmills, but sometimes I have to get a rant off my chest. This one was triggered by the latest episode of the hilarious1 “DHCPv6 on Android” soap opera


In a 720-degree turnaround, Android 11 supports DHCPv6, but only for prefix delegation purposes. Yes, you got it right, in a year or two, every phone might want to have a dedicated /64 prefix assigned to it on WiFi segments2.

Want more details? Well, there’s a high-level overview published on the Android Developers blog and a corresponding message sent to the v6ops mailing list. Let’s see how much sense that makes.

Ultra Ethernet: Domain Creation Process in Libfabric

Creating a domain object is the step where the application establishes a logical context for a NIC within a fabric, enabling endpoints, completion queues, and memory regions to be created and managed consistently.

Phase 1: Application (Discovery & choice — selecting a domain snapshot)

During discovery, the provider had populated one or more fi_info entries — each entry was a snapshot describing one possible NIC/port/transport combination. Each fi_info contained nested attribute structures for fabric, domain, and endpoint: fi_fabric_attr, fi_domain_attr, and fi_ep_attr. The fi_domain_attr substructure captured the domain-level template the provider had reported during discovery (memory registration modes, MR key sizes, counts and limits, capability and mode bitmasks, CQ/CTX limits, authentication key sizes, etc.).

When the application had decided which NIC/port it wanted to use, it selected a single fi_info entry whose fi_domain_attr matched its needs. That chosen fi_info became the authoritative configuration for domain creation, containing both the application’s requested settings and the provider-reported capabilities. At this phase, the application moved forward from fabric initialization to domain creation.

To create the domain, the application called the fi_domain function:


API Call → Create Domain object

    Within Fabric ID: 0xF1DFA01

    Using fi_info structure: 0xCAFE43E

    On Continue reading

TNO043: Under the Manhole Cover: The Architecture of an Internet Exchange

In an IT world full of abstraction, overlays, and virtualization, it’s important to remember the physical infrastructure that supports all those things. So let’s get inside Mass IX, the Massachusetts Internet Exchange, to get a holistic view of the logical architecture and protocol mechanics of peering and Internet exchanges, as well as the iron, steel,... Read more »

An AI Index for all our customers

Today, we’re announcing the private beta of AI Index for domains on Cloudflare, a new type of web index that gives content creators the tools to make their data discoverable by AI, and gives AI builders access to better data for fair compensation.

With AI Index enabled on your domain, we will automatically create an AI-optimized search index for your website, and expose a set of ready-to-use standard APIs and tools including an MCP server, LLMs.txt, and a search API. Our customers will own and control that index and how it’s used, and you will have the ability to monetize access through Pay per crawl and the new x402 integrations. You will be able to use it to build modern search experiences on your own site, and more importantly, interact with external AI and Agentic providers to make your content more discoverable while being fairly compensated.

For AI builders—whether developers creating agentic applications, or AI platform companies providing foundational LLM models—Cloudflare will offer a new way to discover and retrieve web content: direct pub/sub connections to individual websites with AI Index. Instead of indiscriminate crawling, builders will be able to subscribe to specific sites that have opted in for Continue reading

Introducing Observatory and Smart Shield — see how the world sees your website, and make it faster in one click

Modern users expect instant, reliable web experiences. When your application is slow, they don’t just complain — they leave. Even delays as small as 100 ms have been shown to have a measurable impact on revenue, conversions, bounce rate, engagement and more

If you’re responsible for delivering on these expectations to the users of your product, you know there are many monitoring tools that show you how visitors experience your website, and can let you know when things are slow or causing issues. This is essential, but we believe understanding the condition is only half the story. The real value comes from integrating monitoring and remedies in the same view, giving customers the ability to quickly identify and resolve issues.

That's why today, we're excited to launch the new and improved Observatory, now in open beta. This monitoring and observability tool goes beyond charts and graphs, by also telling you exactly how to improve your application's performance and resilience, and immediately showing you the impact of those changes. And we’re releasing it to all subscription tiers (including Free!), available today.

But wait, there’s more! To make your users’ experience in Cloudflare even faster, we’re launching Smart Shield, Continue reading

Monitoring AS-SETs and why they matter

Introduction to AS-SETs

An AS-SET, not to be confused with the recently deprecated BGP AS_SET, is an Internet Routing Registry (IRR) object that allows network operators to group related networks together. AS-SETs have been used historically for multiple purposes such as grouping together a list of downstream customers of a particular network provider. For example, Cloudflare uses the AS13335:AS-CLOUDFLARE AS-SET to group together our list of our own Autonomous System Numbers (ASNs) and our downstream Bring-Your-Own-IP (BYOIP) customer networks, so we can ultimately communicate to other networks whose prefixes they should accept from us. 

In other words, an AS-SET is currently the way on the Internet that allows someone to attest the networks for which they are the provider. This system of provider authorization is completely trust-based, meaning it's not reliable at all, and is best-effort. The future of an RPKI-based provider authorization system is coming in the form of ASPA (Autonomous System Provider Authorization), but it will take time for standardization and adoption. Until then, we are left with AS-SETs.

Because AS-SETs are so critical for BGP routing on the Internet, network operators need to be able to monitor valid and invalid AS-SET memberships for Continue reading

Cloudflare just got faster and more secure, powered by Rust

Cloudflare is relentless about building and running the world’s fastest network. We have been tracking and reporting on our network performance since 2021: you can see the latest update here.

Building the fastest network requires work in many areas. We invest a lot of time in our hardware, to have efficient and fast machines. We invest in peering arrangements, to make sure we can talk to every part of the Internet with minimal delay. On top of this, we also have to invest in the software we run our network on, especially as each new product can otherwise add more processing delay.

No matter how fast messages arrive, we introduce a bottleneck if that software takes too long to think about how to process and respond to requests. Today we are excited to share a significant upgrade to our software that cuts the median time we take to respond by 10ms and delivers a 25% performance boost, as measured by third-party CDN performance tests.

We've spent the last year rebuilding major components of our system, and we've just slashed the latency of traffic passing through our network for millions of our customers. At the same time, we've made our Continue reading

Network performance update: Birthday Week 2025

We are committed to being the fastest network in the world because improvements in our performance translate to improvements for the own end users of your application. We are excited to share that Cloudflare continues to be the fastest network for the most peered networks in the world.

We relentlessly measure our own performance and our performance against peers. We publish those results routinely, starting with our first update in June 2021 and most recently with our last post in September 2024.

Today’s update breaks down where we have improved since our update last year and what our priorities are going into the next year. While we are excited to be the fastest in the greatest number of last-mile ISPs, we are never done improving and have more work to do.

How do we measure this metric, and what are the results?

We measure network performance by attempting to capture what the experience is like for Internet users across the globe. To do that we need to simulate what their connection is like from their last-mile ISP to our networks.

We start by taking the 1,000 largest networks in the world based on estimated population. We use that to give Continue reading

Eliminating Cold Starts 2: shard and conquer

Five years ago, we announced that we were Eliminating Cold Starts with Cloudflare Workers. In that episode, we introduced a technique to pre-warm Workers during the TLS handshake of their first request. That technique takes advantage of the fact that the TLS Server Name Indication (SNI) is sent in the very first message of the TLS handshake. Armed with that SNI, we often have enough information to pre-warm the request’s target Worker.

Eliminating cold starts by pre-warming Workers during TLS handshakes was a huge step forward for us, but “eliminate” is a strong word. Back then, Workers were still relatively small, and had cold starts constrained by limits explained later in this post. We’ve relaxed those limits, and users routinely deploy complex applications on Workers, often replacing origin servers. Simultaneously, TLS handshakes haven’t gotten any slower. In fact, TLS 1.3 only requires a single round trip for a handshake – compared to three round trips for TLS 1.2 – and is more widely used than it was in 2021.

Earlier this month, we finished deploying a new technique intended to keep pushing the boundary on cold start reduction. The new technique (or old, depending on your perspective) uses Continue reading

Code Mode: the better way to use MCP

It turns out we've all been using MCP wrong.

Most agents today use MCP by directly exposing the "tools" to the LLM.

We tried something different: Convert the MCP tools into a TypeScript API, and then ask an LLM to write code that calls that API.

The results are striking:

  1. We found agents are able to handle many more tools, and more complex tools, when those tools are presented as a TypeScript API rather than directly. Perhaps this is because LLMs have an enormous amount of real-world TypeScript in their training set, but only a small set of contrived examples of tool calls.

  2. The approach really shines when an agent needs to string together multiple calls. With the traditional approach, the output of each tool call must feed into the LLM's neural network, just to be copied over to the inputs of the next call, wasting time, energy, and tokens. When the LLM can write code, it can skip all that, and only read back the final results it needs.

In short, LLMs are better at writing code to call MCP, than at calling MCP directly.

What's MCP?

For those that aren't familiar: Model Context Protocol is a standard protocol Continue reading

Introducing new regional Internet traffic and Certificate Transparency insights on Cloudflare Radar

Since launching during Birthday Week in 2020, Radar has announced significant new capabilities and data sets during subsequent Birthday Weeks. We continue that tradition this year with a two-part launch, adding more dimensions to Radar’s ability to slice and dice the Internet.

First, we’re adding regional traffic insights. Regional traffic insights bring a more localized perspective to the traffic trends shown on Radar.

Second, we’re adding detailed Certificate Transparency (CT) data, too. The new CT data builds on the work that Cloudflare has been doing around CT since 2018, including Merkle Town, our initial CT dashboard.

Both features extend Radar's mission of providing deeper, more granular visibility into the health and security of the Internet. Below, we dig into these new capabilities and data sets.

Introducing regional Internet traffic insights on Radar

Cloudflare Radar initially launched with visibility into Internet traffic trends at a national level: want to see how that Internet shutdown impacted traffic in Iraq, or what IPv6 adoption looks like in India? It’s visible on Radar. Just a year and a half later, in March 2022, we launched Autonomous System (ASN) pages on Radar. This has enabled us to bring more granular visibility Continue reading

Ultra Ethernet: Fabric Creation Process in Libfabric

[edit Oct-8, 2025]
New version of this subject can be found here: Ultra Ethernet: Fabric Object - What it is and How it is created

Phase 1: Application (Discovery & choice)

After the UET provider populated fi_info structures for each NIC/port combination during discovery, the application can begin the object creation process. It first consults the in-memory fi_info list to identify the entry that best matches its requirements. Each fi_info contains nested attribute structures describing fabric, domain, and endpoint capabilities, including fi_fabric_attr (fabric name, provider identifier, version information), fi_domain_attr (memory registration mode, key details, domain capabilities), and fi_ep_attr (endpoint type, reliable versus unreliable semantics, size limits, and supported capabilities). The application examines the returned entries and selects the fi_info that satisfies its needs (for example: provider == "uet", fabric name == "UET", required capabilities, reliable transport, or a specific memory registration mode). The chosen fi_info then provides the attributes — effectively serving as hints — that the application passes into subsequent creation calls such as fi_fabric(), fi_domain(), and fi_endpoint(). Each fi_info acts as a self-contained “capability snapshot,” describing one possible combination of NIC, port, and transport mode.


Phase 2: Libfabric Core (dispatch & wiring)

When the application calls fi_fabric(), the Continue reading

How Cloudflare uses the world’s greatest collection of performance data to make the world’s fastest global network even faster

Cloudflare operates the fastest network on the planet. We’ve shared an update today about how we are overhauling the software technology that accelerates every server in our fleet, improving speed globally.

That is not where the work stops, though. To improve speed even further, we have to also make sure that our network swiftly handles the Internet-scale congestion that hits it every day, routing traffic to our now-faster servers.

We have invested in congestion control for years. Today, we are excited to share how we are applying a superpower of our network, our massive Free Plan user base, to optimize performance and find the best way to route traffic across our network for all our customers globally.

Early results have seen performance increases that average 10% faster than the prior baseline. We achieved this by applying different algorithmic methods to improve performance based on the data we observe about the Internet each day. We are excited to begin rolling out these improvements to all customers.

How does traffic arrive in our network?

The Internet is a massive collection of interconnected networks, each composed of many machines (“nodes”). Data is transmitted by breaking it up into small packets, and passing them Continue reading