Amazon Web Services may not be the first of the hyperscalers and cloud builders to create its own custom compute engines, but it has been hot on the heels of Google, which started using its homegrown TPU accelerators for AI workloads in 2015. …
Broadcom’s latest Trident ASIC will include a neural net inference engine on the chip that can analyze traffic and take action in the packet pipeline, but it’s up to customers to build rules and signatures based on their own training data. Broadcom has also announced it will lay off approximately 1,300 VMware employees. Identity provider...
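To make the general idea concrete, here is a toy sketch (my illustration, not Broadcom's implementation or API): training on a customer's own traffic data reduces to weights over per-flow features, and the inference step scores each flow and maps the score to a pipeline action. All names, weights, and thresholds below are hypothetical.

```python
# Toy sketch of inference-driven packet-pipeline actions. Everything here
# (feature names, weights, threshold) is hypothetical, for illustration only.

def score_flow(features: dict[str, float], weights: dict[str, float]) -> float:
    """Tiny linear 'inference engine': weighted sum of flow features."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())

# Weights a customer might derive from their own training data.
weights = {"pkts_per_sec": 0.002, "syn_ratio": 3.0, "avg_pkt_size": -0.001}

flow = {"pkts_per_sec": 50_000.0, "syn_ratio": 0.9, "avg_pkt_size": 64.0}
action = "drop" if score_flow(flow, weights) > 50.0 else "forward"
print(action)  # -> drop (score = 100 + 2.7 - 0.064, roughly 102.6)
```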
Today's Network Break discusses a new Trident ASIC with an on-chip neural net inference engine, Broadcom staff cuts at VMware, more bad news from an Okta breach, financial results, and more.
Some friends shared a Reddit post the other day that made me both shake my head and ponder the state of the networking industry. Here is the locked post for your viewing pleasure. It was locked because the comments were going to devolve into a mess eventually. The person making the comment seems to be honest and sincere in their approach to “layer 3 going away.” The post generated a lot of amusement from the networking side of IT about how this person doesn’t understand the basics, but I think there’s a deeper issue going on.
Trails To Nowhere
Our visibility into the state of the network below the application interface is very general in today’s world. That’s because things “just work,” to borrow an overused phrase. Aside from the occasional troubleshooting exercise to find out why packets destined for Azure or AWS are failing along the way, when is the last time you had to get really creative in finding a routing issue in someone else’s equipment? We spend more time now trying to figure out how to make our own networks operate efficiently and less time worrying about what happens to the packets when they leave our organization.
On the 26th of January, I’ll be teaching a webinar over at Safari Books Online (subscription service) called Modern Network Troubleshooting. From the blurb:
The first section of this class considers the nature of resilience, and how design tradeoffs result in different levels of resilience. The class then moves into a theoretical understanding of failures, how network resilience is measured, and how the Mean Time to Repair (MTTR) relates to human and machine-driven factors. Among these factors are the unintended consequences arising from abstractions, covered in the next section of the class.
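For a concrete sense of how MTTR feeds into a resilience measure (a minimal sketch under my own assumptions, not necessarily the metric the class uses), steady-state availability is commonly computed as MTBF / (MTBF + MTTR):

```python
# Steady-state availability from mean time between failures (MTBF)
# and mean time to repair (MTTR). The numbers are assumed for illustration.
mtbf_hours = 8760.0  # assumed: one failure per year, on average
mttr_hours = 4.0     # assumed: four hours to detect and repair

availability = mtbf_hours / (mtbf_hours + mttr_hours)
print(f"Availability: {availability:.5%}")  # ~99.954%
```

The same formula makes clear why shrinking either the human or machine-driven portion of MTTR directly improves the measured resilience.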
The class then moves into troubleshooting proper, examining the half-split formal troubleshooting method and how it can be combined with more intuitive methods. This section also examines how network models can be used to guide the troubleshooting process. The class then covers two examples of troubleshooting reachability problems in a small network, and considers using ChatGPT and other LLMs in the troubleshooting process. A third, more complex example, set in a data center fabric, is then covered.
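As a rough illustration of the half-split method mentioned in the blurb (a minimal sketch under assumed hop names, not the class's material): test the midpoint of an ordered path, discard the half that checks out, and repeat until the first failing element is isolated.

```python
# Half-split troubleshooting over an ordered path of hops. probe() is a
# stand-in for a real reachability test (ping, traceroute, API call); the
# hop names and the fault location are hypothetical. Assumes the classic
# half-split premise: every hop before the fault tests clean.

def probe(hop: str) -> bool:
    """Pretend reachability test; only hops before the fault answer."""
    reachable = {"edge", "core1", "core2"}  # assumed: fault lies after core2
    return hop in reachable

def half_split(path: list[str]) -> str:
    """Binary-search the path for the first unreachable hop."""
    lo, hi = 0, len(path) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if probe(path[mid]):
            lo = mid + 1  # midpoint is fine: fault is further along
        else:
            hi = mid      # midpoint fails: fault is here or earlier
    return path[lo]

path = ["edge", "core1", "core2", "border", "upstream"]
print(f"First failing hop: {half_split(path)}")  # -> border
```

Each probe halves the remaining search space, which is why the method scales to long paths far better than checking hop by hop.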
What happens when you let a bunch of people work on different aspects of a solution without them ever talking to each other? You get DNS over IPv6. As nicely explained by Geoff Huston, this is just one of the bad things that could happen:
Around 30 years after we got the first website, the powers that be realized it might make sense to put “this is how you access a web server” information (including its IPv4 and IPv6 address, and HTTP(S) support information) directly into DNS, using HTTPS Resource Records. It took us long enough 🤷‍♂️
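For the curious, here is a minimal sketch of fetching a site's HTTPS resource record, assuming the dnspython library (version 2.1 or later, which added HTTPS RR support); the domain is just an example:

```python
import dns.resolver  # pip install "dnspython>=2.1"

# Query the HTTPS resource record for a domain. Each answer carries a
# priority, a target name, and SvcParams such as alpn (supported HTTP
# versions) and ipv4hint/ipv6hint addresses.
answers = dns.resolver.resolve("cloudflare.com", "HTTPS")
for rr in answers:
    print(rr.to_text())
```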
Server makers Dell, Hewlett Packard Enterprise, and Lenovo, which are the three largest original manufacturers of systems in the world, ranked in that order, are adding to the spectrum of interconnects they offer to their enterprise customers. …
Public clouds abstract away much of the nitty-gritty work that goes into provisioning infrastructure, including networking. Application teams can quickly connect resources and deploy applications without having to know much about the plumbing that links everything together. When they compare the public cloud experience to standing up applications in an on-prem data center, the on-prem...
Today on Heavy Networking, sponsored by Juniper, we’ll talk about how Juniper’s Apstra software can help you operate your on-prem data center more like a public cloud, meaning service provisioning that’s repeatable, standardized, and straightforward to consume. We’ll also talk about how Apstra now works with Terraform to help streamline network self-service.
Many have waited years to hear someone like Prahlad Venkatapuram, Senior Director of Engineering at Meta, say what came out this week at the RISC-V Summit:
“We’ve identified that RISC-V is the way to go for us moving forward for all the products we have in the roadmap. …
Two years ago, Cloudflare undertook a significant upgrade to our compute server hardware as we deployed our cutting-edge 11th Generation server fleet, based on AMD EPYC Milan x86 processors. It's nearly time for another refresh to our x86 infrastructure, with deployment planned for 2024. This involves upgrading not only the processor itself, but many of the server's components. It must be able to accommodate the GPUs that drive inference on Workers AI, and leverage the latest advances in memory, storage, and security. Every aspect of the server is rigorously evaluated — including the server form factor itself.
One crucial variable always in consideration is temperature. The latest generations of x86 processors have yielded significant leaps forward in performance, with the tradeoff of higher power draw and heat output. In this post we will explore this trend, and how it informed our decision to adopt a new physical footprint for our next-generation fleet of servers.
In preparation for the upcoming refresh, we conducted an extensive survey of the x86 CPU landscape. AMD recently introduced its latest offerings: Genoa, Bergamo, and Genoa-X, featuring the power of their innovative Zen 4 architecture. At the same time, Intel unveiled Sapphire Rapids as …