Ivan Pepelnjak

Author Archives: Ivan Pepelnjak

Weird: Wrong Subnet Mask Causing Unicast Flooding

When I still cared about CCIE certification, I was always tripped up by the weird scenario with (A) mismatched ARP and MAC timeouts and (B) default gateway outside of the forwarding path. When done just right you could get persistent unicast flooding, and I’ve met someone who reported average unicast flooding reaching ~1 Gbps in his data center fabric.

One would hope that we wouldn’t experience similar problems in modern leaf-and-spine fabrics, but one of my readers managed to reproduce the problem within a single subnet in FabricPath with anycast gateway on spine switches when someone misconfigured a subnet mask in one of the servers.

Weird: Wrong Subnet Mask Causing Unicast Flooding

When I still cared about CCIE certification, I was always tripped up by the weird scenario with (A) mismatched ARP and MAC timeouts and (B) default gateway outside of the forwarding path. When done just right you could get persistent unicast flooding, and I’ve met someone who reported average unicast flooding reaching ~1 Gbps in his data center fabric.

One would hope that we wouldn’t experience similar problems in modern leaf-and-spine fabrics, but one of my readers managed to reproduce the problem within a single subnet in FabricPath with anycast gateway on spine switches when someone misconfigured a subnet mask in one of the servers.

Validate Ansible YAML Data with JSON Schema

When I published the Optimize Network Data Models series a long while ago, someone made an interesting comment along the lines of “You should use JSON Schema to validate the data model.

It took me ages to gather the willpower to tame that particular beast, but I finally got there. In the next installment of the Data Models saga I described how you can use JSON Schema to validate Ansible inventory data and your own YAML- or JSON-based data structures.

To learn more about data validation, error handling, unit- and system testing, and CI/CD pipelines in network automation, join our automation course.

Validate Ansible YAML Data with JSON Schema

When I published the Optimize Network Data Models series a long while ago, someone made an interesting comment along the lines of “You should use JSON Schema to validate the data model.

It took me ages to gather the willpower to tame that particular beast, but I finally got there. In the next installment of the Data Models saga I described how you can use JSON Schema to validate Ansible inventory data and your own YAML- or JSON-based data structures.

To learn more about data validation, error handling, unit- and system testing, and CI/CD pipelines in network automation, join our automation course.

New on ipSpace.net: Virtualizing Network Devices Q&A

A few weeks ago we published an interesting discussion on network operating system details based on an excellent set of questions by James Miles.

Unfortunately we got so far into the weeds at that time that we answered only half of James' questions. In the second Q&A session Dinesh Dutt and myself addressed the rest of them including:

  • How hard is it to virtualize network devices?
  • What is the expected performance degradation?
  • Does it make sense to use containers to do that?
  • What are the operational implications of running virtual network devices?
  • What will be the impact on hardware vendors and networking engineers?

And of course we couldn’t avoid the famous last question: “Should network engineers program network devices?

You’ll need Standard or Expert ipSpace.net subscription to watch the videos.

New on ipSpace.net: Virtualizing Network Devices Q&A

A few weeks ago we published an interesting discussion on network operating system details based on an excellent set of questions by James Miles.

Unfortunately we got so far into the weeds at that time that we answered only half of James’ questions. In the second Q&A session Dinesh Dutt and myself addressed the rest of them including:

  • How hard is it to virtualize network devices?
  • What is the expected performance degradation?
  • Does it make sense to use containers to do that?
  • What are the operational implications of running virtual network devices?
  • What will be the impact on hardware vendors and networking engineers?

And of course we couldn’t avoid the famous last question: “Should network engineers program network devices?

You’ll need Standard or Expert ipSpace.net subscription to watch the videos.

Video: Simplify Device Configurations with Cumulus Linux

The designers of Cumulus Linux CLI were always focused on simplifying network device configurations. One of the first features along these lines was BGP across unnumbered interfaces, then they introduced simplified EVPN configurations, and recently auto-MLAG and auto-BGP.

You can watch a short description of these features by Dinesh Dutt and Pete Lumbis in Simplify Network Configuration with Cumulus Linux and Smart Datacenter Defaults videos (part of Cumulus Linux section of Data Center Fabrics webinar).

You need Free ipSpace.net Subscription to watch the video.

Video: Simplify Device Configurations with Cumulus Linux

The designers of Cumulus Linux CLI were always focused on simplifying network device configurations. One of the first features along these lines was BGP across unnumbered interfaces, then they introduced simplified EVPN configurations, and recently auto-MLAG and auto-BGP.

You can watch a short description of these features by Dinesh Dutt and Pete Lumbis in Simplify Network Configuration with Cumulus Linux and Smart Datacenter Defaults videos (part of Cumulus Linux section of Data Center Fabrics webinar).

You need Free ipSpace.net Subscription to watch the video.

Automation Win: Recreating Cisco ACI Tenants in Public Cloud

This blog post was initially sent to the subscribers of our SDN and Network Automation mailing list. Subscribe here.

Most automation projects are gradual improvements of existing manual processes, but every now and then the stars align and you get a perfect storm, like what Adrian Giacommetti encountered during one of his automation projects.

The customer had well-defined security policies implemented in Cisco ACI environment with tenants, endpoint groups, and contracts. They wanted to recreate those tenants in a public cloud, but it took way too long as the only migration tool they had was an engineer chasing GUI screens on both platforms.

Automation Win: Recreating Cisco ACI Tenants in Public Cloud

This blog post was initially sent to the subscribers of our SDN and Network Automation mailing list. Subscribe here.

Most automation projects are gradual improvements of existing manual processes, but every now and then the stars align and you get a perfect storm, like what Adrian Giacommetti encountered during one of his automation projects.

The customer had well-defined security policies implemented in Cisco ACI environment with tenants, endpoint groups, and contracts. They wanted to recreate those tenants in a public cloud, but it took way too long as the only migration tool they had was an engineer chasing GUI screens on both platforms.

Must Read: Redistributing Full BGP Feed into OSPF

The idea of redistributing the full Internet routing table (840.000 routes at this moment) into OSPF sound as ridiculous as it is, but when fat fingers strike it should be relatively easy to recover, right? Just disable redistribution (assuming you can still log into the offending device) and move on.

Wrong. As Dmytro Shypovalov explained in an extensive blog post, you might have to restart all routers in your OSPF domain to recover.

And that, my friends, is why OSPF is a single failure domain, and why you should never run OSPF between your data center fabric and servers or VM appliances.

Must Read: Redistributing Full BGP Feed into OSPF

The idea of redistributing the full Internet routing table (840.000 routes at this moment) into OSPF sounds as ridiculous as it is, but when fat fingers strike, it should be relatively easy to recover, right? Just turn off redistribution (assuming you can still log into the offending device) and move on.

Wrong. As Dmytro Shypovalov explained in an extensive blog post, you might have to restart all routers in your OSPF domain to recover.

And that, my friends, is why OSPF is a single failure domain, and why you should never run OSPF between your data center fabric and servers or VM appliances.

Validating Data in GitOps-Based Automation

Anyone using text files as a poor man’s database eventually stumbles upon the challenge left as a comment in Automating Cisco ACI Environments blog post:

The biggest challenge we face is variable preparation and peer review process before committing variables to Git. I’d be particularly interested on how you overcome this challenge?

We spent hours describing potential solutions in Validation, Error Handling and Unit Tests part of Building Network Automation Solutions online course, but if you never built a network automation solution using Ansible YAML files as source-of-truth the above sentence might sound a lot like Latin, so let’s make it today’s task to define the problem.

Validating Data in GitOps-Based Automation

Anyone using text files as a poor man’s database eventually stumbles upon the challenge left as a comment in Automating Cisco ACI Environments blog post:

The biggest challenge we face is variable preparation and peer review process before committing variables to Git. I’d be particularly interested on how you overcome this challenge?

We spent hours describing potential solutions in Validation, Error Handling and Unit Tests part of Building Network Automation Solutions online course, but if you never built a network automation solution using Ansible YAML files as source-of-truth the above sentence might sound a lot like Latin, so let’s make it today’s task to define the problem.