As far as we can tell, the export controls on crippled GPU compute engines announced by the US Department of Commerce back in April have had a disproportionately hard impact on AMD compared to Nvidia, as far as we can tell. …
China Export Controls Whack AMD Datacenter GPU Business was written by Timothy Prickett Morgan at The Next Platform.
If you’re a marketer, advertiser, or a business owner that runs your own website, there’s a good chance you’ve used Google tags in order to collect analytics or measure conversions. A Google tag is a single piece of code you can use across your entire website to send events to multiple destinations like Google Analytics and Google Ads.
Historically, the common way to deploy a Google tag meant serving the JavaScript payload directly from Google’s domain. This can work quite well, but can sometimes impact performance and accurate data measurement. That’s why Google developed a way to deploy a Google tag using your own first-party infrastructure using server-side tagging. However, this server-side tagging required deploying and maintaining a separate server, which comes with a cost and requires maintenance.
That’s why we’re excited to be Google’s launch partner and announce our direct integration of Google tag gateway for advertisers, providing many of the same performance and accuracy benefits of server-side tagging without the overhead of maintaining a separate server.
Any domain proxied through Cloudflare can now serve your Google tags directly from that domain. This allows you to get better measurement signals for your website and can enhance your Continue reading
Nvidia co-founder and chief executive officer Jensen Huang did not do his OEM and ODM partners, who are the company’s main route to bring the infrastructure underpinning GPU systems to market, any favors when he suggested its “Hopper” GPU platforms would be blown away by their “Blackwell” kickers. …
Supermicro Hiccups On Hopper, Pulls $40 Billion Guidance For Fiscal 2026 was written by Timothy Prickett Morgan at The Next Platform.
After inspecting the confusing bridging/routing/switching terminology and a brief detour into the control/data plane details, let’s talk about how packets actually move across a network.
As always, things were simpler when networks were implemented with a single cable. In that setup, all nodes were directly reachable, and the only challenge was figuring out the destination node’s address; it didn’t matter whether it was a MAC address, an IP address, or a Fiber Channel address. On a single cable, you could just broadcast, like, “Who has this service?” and someone would reply, “I’m the printer you’re looking for.” That’s how many early non-IP protocols operated.
At Cloudflare, we do everything we can to avoid interruption to our services. We frequently deploy new versions of the code that delivers the services, so we need to be able to restart the server processes to upgrade them without missing a beat. In particular, performing graceful restarts (also known as "zero downtime") for UDP servers has proven to be surprisingly difficult.
We've previously written about graceful restarts in the context of TCP, which is much easier to handle. We didn't have a strong reason to deal with UDP until recently — when protocols like HTTP3/QUIC became critical. This blog post introduces udpgrm, a lightweight daemon that helps us to upgrade UDP servers without dropping a single packet.
Here's the udpgrm GitHub repo.
In the early days of the Internet, UDP was used for stateless request/response communication with protocols like DNS or NTP. Restarts of a server process are not a problem in that context, because it does not have to retain state across multiple requests. However, modern protocols like QUIC, WireGuard, and SIP, as well as online games, use stateful flows. So what happens to the state associated with a flow when a server process is Continue reading
Dr. Tony Przygienda left a very valid (off-topic) comment to my Breaking APIs or Data Models Is a Cardinal Sin blog post:
If, on the other hand, the customers would not camp for literally tens of years on regex scripts scraping screens, lots of stuff could progress much faster.
He’s right, particularly from Juniper’s perspective; they were the first vendor to use a data-driven approach to show commands. Unfortunately, we’re still not living in a perfect world:
For decades, discussions around quantum computing has felt similar to family driving vacations, with someone in the back seat constantly asking “are we there yet?” …
Cisco Pulls Together A Quantum Network Architecture was written by Jeffrey Burt at The Next Platform.
In a previous blog post, I described how OSPF route selection rules prevent a summary LSA from being inserted back into an area from which it was generated. That works nicely for area prefixes turned directly into summary LSAs, but how does the loop prevention logic work for summarized prefixes (what OSPF calls area ranges)?
TL&DR: It doesn’t, unless… ;)
Today the following network operating systems include support for the drop notification extension in their sFlow agent implementations:
Two additional sFlow dropped packet notification implementations are in the pipeline and should be available later this year:
Has your browsing experience ever been disrupted by this error page? Sometimes Cloudflare returns "Error 500" when our servers cannot respond to your web request. This inability to respond could have several potential causes, including problems caused by a bug in one of the services that make up Cloudflare's software stack.
We know that our testing platform will inevitably miss some software bugs, so we built guardrails to gradually and safely release new code before a feature reaches all users. Health Mediated Deployments (HMD) is Cloudflare’s data-driven solution to automating software updates across our global network. HMD works by querying Thanos, a system for storing and scaling Prometheus metrics. Prometheus collects detailed data about the performance of our services, and Thanos makes that data accessible across our distributed network. HMD uses these metrics to determine whether new code should continue to roll out, pause for further evaluation, or be automatically reverted to prevent widespread issues.
Cloudflare engineers configure signals from their service, such as alerting rules or Service Level Objectives (SLOs). For example, the following Service Level Indicator (SLI) checks the rate of HTTP 500 errors over 10 minutes returned from a service in our software stack.
sum(rate(http_request_count{code="500"}[10m])) Continue reading