A few years ago, we were fortunate enough to have Pete Lumbis talking about ASICs for Networking Engineers as part of the Data Center Fabric Architectures webinar.
One of the topics he couldn’t possibly skip was the question of how many packet buffers one needs in a data center switch.
One of my readers sent me a question along these lines:
How do we determine the number of spines needed in a leaf-and-spine fabric? It’s easy to calculate the number of leaf nodes from the required number of server ports, and two spines give you redundancy. Does it make sense to have more spines if two are good enough from the capacity perspective?
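The capacity part of the question really is back-of-the-envelope arithmetic. Here’s a quick Python sketch of it; every number in it (port counts, port speeds, oversubscription ratio) is a made-up assumption, not data from the reader’s fabric:

```python
import math

# All numbers below are made-up assumptions used only to illustrate the arithmetic.
server_ports   = 960      # server-facing ports the fabric has to provide
ports_per_leaf = 48       # server-facing ports per leaf switch
access_speed   = 25       # Gbps per server-facing port
uplink_speed   = 100      # Gbps per leaf uplink
oversub        = 3        # acceptable oversubscription (access : uplink bandwidth)

# Leaf count follows directly from the required number of server ports
leaves = math.ceil(server_ports / ports_per_leaf)                    # 20 leaves

# Uplink bandwidth (and thus uplink count) needed per leaf to hit the target ratio
uplink_bw        = ports_per_leaf * access_speed / oversub           # 400 Gbps
uplinks_per_leaf = math.ceil(uplink_bw / uplink_speed)               # 4 uplinks

# With every leaf connected to every spine through a single uplink, the number of
# uplinks per leaf equals the number of spines...
print(f"{leaves} leaves, {uplinks_per_leaf} uplinks per leaf => {uplinks_per_leaf} spines")

# ...and every spine needs at least one port per leaf
print(f"every spine needs at least {leaves} leaf-facing ports")
```

With these (invented) numbers the capacity math alone already lands you at four spines, so two is not always the obvious answer even before redundancy enters the picture.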
There are at least two factors to consider:
A senior engineer at Juniper Networks wasn’t happy with me mentioning resource hogs and Junos platforms in the same statement. Instead of engaging in never-ending angels-dancing-on-pins deliberations comparing the virtues of Junos with those of other network operating systems, I decided to throw a bit of real-life data into the mix – I created a simple script that measures:
Ish wrote an interesting comment on my Network Automation Expert Beginners blog post. He started with:
[Our network has] about 40 sites, but we don’t do total refresh cycles in bulk, just as needed. Everything we do is sporadic, and I’m trying to see the ROI on learning automation for things that are done once in a while that don’t take much time to do manually anyway.
There are two aspects to this part of his comment:
A heavy netlab user sent me an email along these lines:
We’re running multiple labs in parallel on the same server, and we’re experiencing all sorts of clashes like overlapping management IP addresses. We “solved” that by using static device identifiers in our labs, but I’m wondering if there’s a better way of doing it?
That’s exactly the sort of real-life challenge I love working on, so it wasn’t hard to get me excited, and the results are bundled in netlab release 1.5.
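To illustrate where those clashes come from: management IP addresses are typically derived from a small per-node identifier within a management prefix shared by everything running on the server, so two labs reusing the same identifiers end up with the same addresses. The toy sketch below (emphatically not the actual netlab implementation – the prefix, function name, and offset are all assumptions) shows how a per-lab-instance offset keeps labs apart:

```python
import ipaddress

# Toy illustration only (NOT the actual netlab code): derive management addresses
# from a per-node identifier within a shared management prefix.
MGMT_PREFIX = ipaddress.ip_network("192.168.121.0/24")   # assumed shared mgmt subnet

def mgmt_ip(node_id: int, lab_offset: int = 0) -> ipaddress.IPv4Address:
    """Management address = shared prefix + per-lab offset + node identifier"""
    return MGMT_PREFIX[lab_offset + node_id]

# Two labs reusing the same node identifiers on one server clash...
lab_a = mgmt_ip(node_id=2)
lab_b = mgmt_ip(node_id=2)
print(lab_a, lab_b, lab_a == lab_b)          # 192.168.121.2 192.168.121.2 True

# ...while a per-lab-instance offset keeps their management addresses apart.
lab_b = mgmt_ip(node_id=2, lab_offset=100)
print(lab_a, lab_b, lab_a == lab_b)          # 192.168.121.2 192.168.121.102 False
```

The real solution shipped in netlab release 1.5 is, of course, more involved than that.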
One of the best descriptions of what ChatGPT does and what it cannot do that I’ve found so far comes from an ancient and military historian. The what is ChatGPT and what is an essay parts are a must-read, and the preparing to be disrupted conclusion is pure gold:
I do think there are classrooms that will be disrupted by ChatGPT, but those are classrooms where something is already broken.
I can’t help but think of the never-ending brouhaha about exam brain dumps.
The Routing Protocols Overview part of How Networks Really Work webinar introduced the concepts of distance-vector and link-state routing protocols. Next step: the basics of link-state routing protocols.
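In a nutshell: every router floods information about its links, all routers end up with an identical link-state database, and each of them independently runs a shortest path first (SPF, Dijkstra) computation over that database. Here’s a minimal Python sketch of the SPF part; the topology and link costs are made up for illustration:

```python
import heapq

# Made-up link-state database: router -> { neighbor: link cost }
lsdb = {
    "R1": {"R2": 10, "R3": 5},
    "R2": {"R1": 10, "R3": 3, "R4": 1},
    "R3": {"R1": 5, "R2": 3, "R4": 8},
    "R4": {"R2": 1, "R3": 8},
}

def spf(lsdb, root):
    """Dijkstra's shortest-path-first computation, run independently by every router."""
    dist = {root: 0}          # best known cost to every router
    prev = {}                 # predecessor on the best path (to derive next hops)
    queue = [(0, root)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > dist.get(node, float("inf")):
            continue          # stale queue entry, a better path was already found
        for neighbor, link_cost in lsdb[node].items():
            new_cost = cost + link_cost
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                prev[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    return dist, prev

distances, _ = spf(lsdb, "R1")
print(distances)              # {'R1': 0, 'R2': 8, 'R3': 5, 'R4': 9}
```

Every router runs the same computation with itself as the root, which is why all of them converge on consistent, loop-free forwarding decisions once their databases are synchronized.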
In the Designing Active-Active and Disaster Recovery Data Centers webinar, I tried to give networking engineers a high-level overview of the challenges one might face when designing a highly-available application stack, and used that information to show why common “solutions” like stretched VLANs make little sense if one cares about application availability (as opposed to an auditor’s report). Some (customer) engineers like that approach; here’s the feedback I received not long ago:
As ever, Ivan cuts to the quick and provides not just the logical basis for a given design, but a wealth of advice, pointers, gotchas stemming from his extensive real-world experience. What is most valuable to me are those “gotchas” and what NOT to do, again, logically explained. You won’t find better material IMHO.
Please note that I’m talking about generic multi-site scenarios. From the high-level connectivity and application architecture perspective, there’s not much difference between a multi-site on-premises (or colocation) deployment, a hybrid cloud, or a multicloud deployment.
One of my readers sent me a question along these lines:
Do I have to have an IBGP session between Customer Edge (CE) routers in a multihomed site if they run EBGP with the upstream provider(s)?
Let’s start with a simple diagram and a refactoring of the question:
Javier Antich, the author of the fantastic AI/ML in Networking webinar, spent years writing the Machine Learning for Network and Cloud Engineers book, which is now available in paperback and Kindle formats.
I’ve seen a final draft of the book and it’s definitely worth reading. You should also invest some time into testing the scenarios Javier created. Here’s what I wrote in the foreword:
Artificial Intelligence (AI) has been around for decades. It was one of the exciting emerging (and overhyped) topics when I attended university in the late 1980s. Like today, the hype exceeded what the technology could deliver, resulting in a long, long AI winter.
It’s incredible how little CPU some network devices consume in a steady state – a netlab user managed to run almost 100 Mikrotik routers on a 24-core server. Starting them simultaneously (as vagrant up tries to do when used with the vagrant-libvirt plugin) is a different story: the router virtual machines are configured with two CPU cores for a good reason, and if they don’t get enough CPU cycles during boot, they become sluggish, Vagrant gives up, and the lab start procedure fails.
One could use a nasty workaround: