Earlier this year, Pete Lumbis returned as an ipSpace.net webinar guest speaker with a great presentation describing data center switching ASICs from the perspective of networking engineers. After a brief intro, he started with ASIC Basics… a topic which generated a 25-minute Q&A session.
Several engineers formerly working for a large virtualization vendor were pretty upset with me when I claimed that the virtualization consultants promote “disaster recovery using stretched VLANs” designs instead of alternatives that would implement proper separation of failure domains.
Guess what… it’s even worse than I thought.
Here’s a sequence of comments I received after reposting one of my “disaster recovery doesn’t need stretched VLANs” blog posts on LinkedIn sometime in late 2019:
We did a number of Software Gone Wild podcasts trying to figure out whether smart NICs address a real need or whether they’re just another vendor attempt to explore all potential markets. As expected, we got opposing views, ranging from Luke Gorrie claiming a NIC should be as simple as possible to Silvano Gai explaining how dedicated hardware performs the same operations at lower cost, lower power consumption, and way higher speeds.
In theory, there’s no doubt that Silvano is right. Just look at how expensive some router line cards are, and try to figure out what it would cost to deliver in software the 25.6 Tbps of forwarding performance we’ll get in a single ASIC (Tomahawk-4), assuming ~10 Gbps per CPU core. High-speed core packet forwarding has to be done in dedicated hardware.
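Here’s the back-of-the-envelope arithmetic behind that claim as a quick Python sketch; the ~10 Gbps-per-core figure is the rough assumption from the paragraph above, not a measured number:

```python
# Back-of-the-envelope: CPU cores needed to match a single Tomahawk-4 ASIC,
# assuming ~10 Gbps of software forwarding per core (rough assumption, not a benchmark).
asic_capacity_gbps = 25_600   # 25.6 Tbps
per_core_gbps = 10

cores_needed = asic_capacity_gbps / per_core_gbps
print(f"~{cores_needed:,.0f} CPU cores just to match the raw forwarding rate")
# ~2,560 cores -- before you account for cost, power consumption, or latency
```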
When planning to move your workloads to a public cloud, you might want to consider the minor detail of public IPv4 connectivity (I know of at least one public cloud venture that couldn’t get its business off the ground because it couldn’t get enough public IPv4 addresses).
Here’s a question along these lines that one of the attendees of our public cloud networking course sent me:
You might remember my occasional rants about lack of engineering in networking. A long while ago David Barroso nicely summarized the situation in a tweet responding to my BGP and Car Safety blog post:
If we were in a proper engineering we’d be discussing how to regulate and add safeties to an important tech that is unsafe and hard to operate. Instead, we blog about how to do crazy shit to it or how it’s a hot mess. Let’s be honest, if BGP was a car it’d be one pulled by horses.
Justin Pietsch published another must-read article, this time dealing with operational complexity of load balancers and IP multicast. Here are just a few choice quotes to get you started:
You might find what he learned useful the next time you’re facing a unicorn-colored slide deck from your favorite software-defined or intent-based vendor ;))
In December 2019 I finally turned my focus on business challenges first presentation into a short webinar session (part of the Business Aspects of Networking Technologies webinar). It starts with defining the problem before searching for a solution, including three simple questions:
This is a guest blog post by Matthias Luft, Principal Platform Security Engineer @ Salesforce, and a regular ipSpace.net guest speaker.
I’ve spent my career in various roles in IT security, and Ivan and I have always bounced around thoughts on the overlap between networking and security (and, more recently, Cloud/Container). One of the hot challenges on that boundary that regularly comes up in network/security discussions is the topic of this blog post: microsegmentation and host-based firewalls (HBFs).
Nadeem Lughmani created an excellent solution for the securing your cloud deployment hands-on exercise in our public cloud online course. His Terraform-based solution includes:
Last week I published an overview of how complex (networking-wise) Docker Swarm services can get. This time let’s focus on something that should have been way simpler: running container-based services on a single Linux host.
In the first part of this article I’m focusing on the basics, including exposed and published ports. The behind-the-scenes details are coming in a week or so; in the meantime, you can enjoy most of them in the Docker Networking Deep Dive webinar.
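If you’d like to see the difference between exposed and published ports for yourself, here’s a minimal sketch using the Docker Python SDK (assuming a local Docker daemon and the nginx:alpine image, whose Dockerfile EXPOSEs port 80): an exposed port is nothing more than image metadata, while a published port gets an actual host binding.

```python
# pip install docker
import docker

client = docker.from_env()

# Publishing a port maps host port 8080 to container port 80,
# making the web server reachable from outside the host.
web = client.containers.run("nginx:alpine", name="web", detach=True,
                            ports={"80/tcp": 8080})
web.reload()  # refresh container attributes so the port bindings are visible

print(web.attrs["Config"]["ExposedPorts"])    # exposed ports: image metadata only
print(web.attrs["NetworkSettings"]["Ports"])  # published ports: actual host bindings

web.remove(force=True)  # clean up
```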
In June 2020 I published the first part of the Redundant Server Connectivity in Layer-3-Only Fabrics article, describing the target design and application-layer requirements.
During the summer I added the details of multi-subnet server and client connectivity and a few conclusions.
The last Fallacy of Distributed Computing I addressed in the introductory part of the How Networks Really Work webinar was The Network Is Homogeneous. No, it’s not, and it never was… for more details, watch this video.
One of the most annoying parts of every training content development project was the ubiquitous question somewhere near the end of the process: “and now we’d need a few review questions”. I’m positive anyone who has ever been involved in a similar project can feel the pain that question causes…
Writing good review questions requires a particularly devious state of mind, sometimes combined with “I would really like to get the answer to this one” (obviously you’d mark such questions as “needs further research”, and if you’re Donald Knuth, the question would be “prove that P != NP”).
Remember the claim that networking is becoming obsolete and that everyone else will simply bypass the networking teams (source)?
Good news for you – there are many fast growing overlay solutions that are adopted by apps and security teams and bypass the networking teams altogether.
That sounds awesome in a VC pitch deck. Let’s see how well that concept works out in reality using Docker Swarm as an example (Kubernetes is probably even worse).
Greg Ferro is back with some great technical content, this time explaining why hardware-based packet capture might return unexpected results.
Justin Pietsch published a fantastic recap of his experience running OSPF in AWS infrastructure. You MUST read what he wrote, here’s the TL&DR summary:
Dinesh Dutt made similar claims on one of our podcasts, and I wrote numerous blog posts on the same topic. Not that anyone would care or listen; it’s so much better to watch vendor slide decks full of the latest unicorn dust… but in the end, it’s usually not the protocol that’s broken, but the network design.
The can we trust routing protocols series of blog posts I wrote in April 2020 (part 1, part 2, response from Jeff Tantsura) culminated in an interesting discussion with Russ White and Nick Russo now published as The Hedge Episode 43.
I got a question along these lines from a friend of mine:
Google recently announced a huge in-country data center build to open new GCP regions. Does that mean I should invest in mastering GCP, or should I focus on some other public cloud platform?
As always, the right answer is “it depends”, for example:
Brett Lykins published an excellent description of what an automation Minimum Viable Product could be.
Not surprisingly, he’s almost perfectly in sync with what we’ve been telling networking engineers in the ipSpace.net Network Automation online course:
Here’s an interesting factoid: when using the default IS-IS configuration (running L1 + L2 on all routers in your network), every router inserts every IP prefix from anywhere in your network into the L2 topology… at least on Junos.
For more details read this article by Chris Parker.
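Since every prefix ends up in the L2 topology anyway, a common workaround in single-area networks is to run L2-only and skip the duplicate L1 flooding. Here’s a minimal Junos sketch of that idea (interface names are placeholders, it assumes family iso and an ISO NET address are already configured, and whether L2-only fits your design is a separate question):

```
# Hedged sketch: make the router L2-only (placeholder interface names)
set protocols isis level 1 disable
set protocols isis interface ge-0/0/0.0 point-to-point
set protocols isis interface lo0.0
```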