Author Archives: Ivan Pepelnjak
Every complex enough network automation solution has to introduce a high-level (user-manageable) data model that is eventually transformed into a low-level (device) data model.
The transformation code (business logic) is one of the most complex pieces of a network automation solution, and there’s only one way to ensure it works properly: you test the heck out of it ;) Let me show you how we solved that challenge in netlab.
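To make the idea concrete, here is a minimal sketch of a transformation test. It is not netlab code: the `to_device_model` function, the topology layout, and the addressing scheme are all hypothetical, chosen only to show the pattern of feeding a high-level model into the transformation and comparing the result with a known-good device-level model.

```python
def to_device_model(topology: dict) -> dict:
    """Hypothetical transformation: expand a list of links into per-node interfaces."""
    nodes = {name: {"interfaces": []} for name in topology["nodes"]}
    for link_id, link in enumerate(topology["links"], start=1):
        for host_id, node in enumerate(link, start=1):
            ifindex = len(nodes[node]["interfaces"]) + 1
            nodes[node]["interfaces"].append({
                "ifname": f"eth{ifindex}",
                # Assumed addressing scheme: one /24 per link, host ID = position on link
                "ipv4": f"10.0.{link_id}.{host_id}/24",
                "neighbors": [n for n in link if n != node],
            })
    return nodes


def check_transformation(topology: dict, expected: dict) -> dict:
    """Run the transformation and compare it with the expected device data."""
    result = to_device_model(topology)
    assert result == expected, f"transformation mismatch: {result}"
    return result


# High-level (user-manageable) data model
topology = {"nodes": ["r1", "r2"], "links": [["r1", "r2"]]}

# Expected low-level (device) data model
expected = {
    "r1": {"interfaces": [
        {"ifname": "eth1", "ipv4": "10.0.1.1/24", "neighbors": ["r2"]}]},
    "r2": {"interfaces": [
        {"ifname": "eth1", "ipv4": "10.0.1.2/24", "neighbors": ["r1"]}]},
}

check_transformation(topology, expected)
```

With enough input/expected-output pairs like this one, any change in the business logic that alters the device data shows up as a failed comparison.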
All the Kubernetes Service Mesh videos from the Kubernetes Networking Deep Dive webinar with Stuart Charlton are now public. Enjoy!
Daniel Dib found the ancient OSPF Protocol Analysis (RFC 1245) that includes the Router CPU section. Please keep in mind the RFC was published in 1991 (35 years ago):
Steve Deering presented results for the Dijkstra calculation in the “MOSPF meeting report” in [3]. Steve’s calculation was done on a DEC 5000 (10 mips processor), using the Stanford internet as a model. His graphs are based on numbers of networks, not number of routers. However, if we extrapolate that the ratio of routers to networks remains the same, the time to run Dijkstra for 200 routers in Steve’s implementation was around 15 milliseconds.
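If you want a rough feeling for the same calculation on today's hardware, the following sketch times a single Dijkstra run over 200 routers. The topology is an assumption (a ring with randomly added extra links and random costs), not the Stanford internet model used in the RFC, so the result is not directly comparable to the 15-millisecond figure.

```python
import heapq
import random
import time


def build_graph(n: int, extra_links: int, seed: int = 1) -> dict:
    """Build a connected undirected graph: a ring plus random extra links (assumed topology)."""
    rng = random.Random(seed)
    graph = {node: {} for node in range(n)}

    def add_link(a: int, b: int, cost: int) -> None:
        graph[a][b] = cost
        graph[b][a] = cost

    for node in range(n):
        add_link(node, (node + 1) % n, rng.randint(1, 10))
    for _ in range(extra_links):
        a, b = rng.sample(range(n), 2)
        add_link(a, b, rng.randint(1, 10))
    return graph


def dijkstra(graph: dict, source: int) -> dict:
    """Return the lowest cost from source to every reachable node."""
    dist = {source: 0}
    queue = [(0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > dist.get(node, float("inf")):
            continue  # stale queue entry, a better path was already found
        for neighbor, link_cost in graph[node].items():
            new_cost = cost + link_cost
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return dist


graph = build_graph(200, extra_links=400)
start = time.perf_counter()
dist = dijkstra(graph, 0)
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"SPF for {len(dist)} routers took {elapsed_ms:.3f} ms")
```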
In the Dealing with LAG Member Failures blog post, we figured out how easy it is to deal with a LAG member failure in a traditional MLAG cluster. The failover could happen in hardware, and even if it’s software-driven, it does not depend on the control plane.
Let’s add a bit of complexity and replace a traditional layer-2 fabric with a VXLAN fabric. The MLAG cluster members still use an MLAG peer link and an anycast VTEP IP address (more details).
netlab release 1.8.2 contains dozens of bug fixes and minor tweaks to device configuration templates. We also added a few safeguards including:
In the previous blog post on this topic, I described how node and global VRFs work in netlab.
TL&DR: If you use the same VRF on multiple devices, it’s better to define it globally.
However, you might not need every VRF on every lab device in a more complex lab topology. Considering that, netlab tries to minimize the number of VRFs configured on lab devices using a simple rule: a VRF is configured on a lab device only if the device has at least one interface in that VRF.
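The rule is easy to express in code. The following sketch only illustrates the rule as stated above and is not taken from the netlab source; the `vrfs_for_node` function and the dictionary layout of nodes and VRFs are assumptions.

```python
def vrfs_for_node(node: dict, global_vrfs: dict) -> dict:
    """Return only those global VRFs that are used by at least one node interface."""
    used = {intf["vrf"] for intf in node.get("interfaces", []) if "vrf" in intf}
    return {name: global_vrfs[name] for name in sorted(used) if name in global_vrfs}


# Hypothetical global VRF definitions
global_vrfs = {
    "red": {"rd": "65000:1"},
    "blue": {"rd": "65000:2"},
}

# A PE-router with one VRF interface and a P-router without VRF interfaces
pe1 = {"interfaces": [{"ifname": "eth1", "vrf": "red"}, {"ifname": "eth2"}]}
p1 = {"interfaces": [{"ifname": "eth1"}]}

print(vrfs_for_node(pe1, global_vrfs))  # only the "red" VRF
print(vrfs_for_node(p1, global_vrfs))   # no VRFs at all
```

In this sketch, both VRFs are defined globally, but `pe1` gets only the `red` VRF and `p1` gets none.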
Here’s another BGP lab challenge to start your weekend: use RIB-to-FIB filters to reduce the forwarding table size on access routers in a large Service Provider network.
Craig Weinhold pointed me to a complex topic I managed to ignore in my MLAG Deep Dive series: how does an MLAG cluster reroute around a failure of a LAG member link?
In this blog post, we’ll focus on traditional MLAG cluster implementations using a peer link; another blog post will explore the implications of using VXLAN and EVPN to implement MLAG clusters.
We’ll also ignore the interesting question of “how is the LAG member link failure detected?” and focus on “what happens next?” using the sample MLAG topology:
Erik Auerswald pointed me to an interesting open-source project. LibreQoS implements decent QoS using software switching on many-core x86 platforms. It’s implemented as a bump-in-the-wire software solution, so you should be able to plug it into your network just before a major congestion point and let it handle the packet dropping and prioritization.
Obviously, the concept is nothing new. I wrote about a similar problem in xDSL networks in 2009.
You might remember Béla Várkonyi’s use of LISP to build resilient ground-to-airplane networks from last week’s repost. It seems he’s not exactly happy with the current level of LISP support, at least based on what he wrote as a response to Jeff McLaughlin’s claim that “I can tell you that our support for EVPN does not, in any way, indicate the retirement of LISP for SD-Access.”:
Nice to hear the Cisco intends to support LISP. However, it is removed from IOS XR already. So it is not that clear…
If Cisco will stop supporting LISP, then we will be forced to create our own LISP routers, since we need it for extreme mobility environments.
Some networking vendors realized that one way to gain mindshare is to make their network operating systems available as free-to-download containers or virtual machines. That’s the right way to go; I love their efforts and point out who went down that path whenever possible (as well as others like Cisco who try to make our lives miserable).
However, those virtual machines better work out of the box, or you’ll get frustrated engineers who will give up and never touch your warez again, or as someone said in a LinkedIn comment to my blog post describing how Junos vPTX consistently rejects its DHCP-assigned IP address: “If I had encountered an issue like this before seeing Ivan’s post, I would have definitely concluded that I am doing it wrong.”