Archive

Category Archives for "Networking"

netlab 2.0.0: Hosts, Bridges, and SRv6

netlab release 2.0.0 is out. I spent the whole week fixing bugs and running integration tests, so I’m too brain-dead to go into the details. These are the major features we added (more about them in a few days; the details are in the release notes):

Other changes include:

TNO028: Move From Monitoring to Full Internet Stack Observability: New Strategies for NetOps (Sponsored)

Network monitoring, Internet monitoring, and observability are all key components of NetOps. We speak with sponsor Catchpoint to understand how Catchpoint can help network operators proactively identify and resolve issues before they impact customers. We discuss past and current network monitoring strategies and the challenges that operators face with both on-prem and cloud monitoring, along... Read more »

TL013: The Process Communication Model: An Algorithm for Effective Communication

On this episode of Technically Leadership, we’re joined by Aleksandra Lemańska to learn about the Process Communication Model (PCM), a framework for enhancing communication. Alex calls PCM an algorithm for people, and it can be useful for improving interactions with engineers and technical folks operating in high-stress environments. We talk about how PCM works, understanding... Read more »

First-party tags in seconds: Cloudflare integrates Google tag gateway for advertisers

If you’re a marketer, advertiser, or a business owner that runs your own website, there’s a good chance you’ve used Google tags in order to collect analytics or measure conversions. A Google tag is a single piece of code you can use across your entire website to send events to multiple destinations like Google Analytics and Google Ads. 

Historically, the common way to deploy a Google tag meant serving the JavaScript payload directly from Google’s domain. This can work quite well, but can sometimes impact performance and accurate data measurement. That’s why Google developed a way to deploy a Google tag using your own first-party infrastructure using server-side tagging. However, this server-side tagging required deploying and maintaining a separate server, which comes with a cost and requires maintenance.

That’s why we’re excited to be Google’s launch partner and announce our direct integration of Google tag gateway for advertisers, providing many of the same performance and accuracy benefits of server-side tagging without the overhead of maintaining a separate server.  

Any domain proxied through Cloudflare can now serve your Google tags directly from that domain. This allows you to get better measurement signals for your website and can enhance your Continue reading

Forwarding Packets Across a Network

After inspecting the confusing bridging/routing/switching terminology and a brief detour into the control/data plane details, let’s talk about how packets actually move across a network.

As always, things were simpler when networks were implemented with a single cable. In that setup, all nodes were directly reachable, and the only challenge was figuring out the destination node’s address; it didn’t matter whether it was a MAC address, an IP address, or a Fiber Channel address. On a single cable, you could just broadcast, like, “Who has this service?” and someone would reply, “I’m the printer you’re looking for.” That’s how many early non-IP protocols operated.

Resilience in the RPKI

I would like to look at the ways in which the operators of the number Resource Public Key Infrastructure (RPKI) have deployed this infrastructure in a way that maximises its available and performance and hardens it against potential service interruptions, or in other words, an examination of the resilience of the RPKI infrastructure.

QUIC restarts, slow problems: udpgrm to the rescue

At Cloudflare, we do everything we can to avoid interruption to our services. We frequently deploy new versions of the code that delivers the services, so we need to be able to restart the server processes to upgrade them without missing a beat. In particular, performing graceful restarts (also known as "zero downtime") for UDP servers has proven to be surprisingly difficult.

We've previously written about graceful restarts in the context of TCP, which is much easier to handle. We didn't have a strong reason to deal with UDP until recently — when protocols like HTTP3/QUIC became critical. This blog post introduces udpgrm, a lightweight daemon that helps us to upgrade UDP servers without dropping a single packet.

Here's the udpgrm GitHub repo.

Historical context

In the early days of the Internet, UDP was used for stateless request/response communication with protocols like DNS or NTP. Restarts of a server process are not a problem in that context, because it does not have to retain state across multiple requests. However, modern protocols like QUIC, WireGuard, and SIP, as well as online games, use stateful flows. So what happens to the state associated with a flow when a server process is Continue reading

Screen Scraping in 2025

Dr. Tony Przygienda left a very valid (off-topic) comment to my Breaking APIs or Data Models Is a Cardinal Sin blog post:

If, on the other hand, the customers would not camp for literally tens of years on regex scripts scraping screens, lots of stuff could progress much faster.

He’s right, particularly from Juniper’s perspective; they were the first vendor to use a data-driven approach to show commands. Unfortunately, we’re still not living in a perfect world:

Multi-vendor support for dropped packet notifications


The sFlow Dropped Packet Notification Structures extension was published in October 2020. Extending sFlow to provide visibility into dropped packets offers significant benefits for network troubleshooting, providing real-time network wide visibility into the specific packets that were dropped as well the reason the packet was dropped. This visibility instantly reveals the root cause of drops and the impacted connections. Packet discard records complement sFlow's existing counter polling and packet sampling mechanisms and share a common data model so that all three sources of data can be correlated, for example, packet sampling reveals the top consumers of bandwidth on a link, helping to get to the root cause of congestion related packet drops reported for the link.

Today the following network operating systems include support for the drop notification extension in their sFlow agent implementations:

Two additional sFlow dropped packet notification implementations are in the pipeline and should be available later this year:

CNCF and Synadia Reach an Agreement on NATS

Last month, Synadia, the primary maintainer of the NATS messaging system, tried to withdraw NATS from the open source governance of Cloud Native Computing Foundation (CNCF). Its motive was to try to profit from NATS by Synadia had previously donated NATS to the Cloud Native Computing Foundation (CNCF) in 2018. Now, the Cloud Native Computing Foundation (CNCF) and NATS project will continue in the CNCF’s cloud native open source ecosystem with Synadia’s continued support and involvement. A spokesperson for Synadia did not immediately respond to a TNS request for comment. Not So Fast Synadia had planned to regain control of the