Author Archives: Ivan Pepelnjak
Author Archives: Ivan Pepelnjak
TL&DR: If you happen to like working with containers, you could use netsim-tools release 0.5 to provision your container-based Arista EOS labs.
Why does it matter? Lab setup is blindingly fast, and it’s easier to integrate your network devices with other containers, not to mention the crazy idea of running your network automation CI pipeline on Gitlab CPU cycles. Also, you could use the same netsim-tools topology file and provisioning scripts to set up container-based or VM-based lab.
Some networking engineers breeze through our Network Automation online course, others disappear after a while… and a few of those come back years later with a spectacular production-grade solution.
Stephen Harding is one of those. He attended the automation course in spring 2019 and I haven’t heard from him in almost two years… until he submitted one of the most mature data center fabric automation solutions I’ve seen.
Not only that, he documented the solution in a long series of must-read blog posts. Hope you’ll find them useful; I liked them so much I immediately saved them to Internet Archive (just in case).
One of my readers sent me a series of “how do I get started with…” questions including:
I’ve been doing networking and security for 5 years, and now I am responsible for our cloud infrastructure. Anything to do with networking and security in the cloud is my responsibility along with another team member. It is all good experience but I am starting to get concerned about not knowing automation, IaC, or any programming language.
No need to worry about that, what you need (to start with) is extremely simple and easy-to-master. Infrastructure-as-Code is a simple concept: infrastructure configuration is defined in machine-readable format (mostly text files these days) and used by a remediation tool like Terraform that compares the actual state of the deployed infrastructure with the desired state as defined in the configuration files, and makes changes to the actual state to bring it in line with how it should look like.
Here’s an interesting fact: cloud-based stuff often refuses to die; it might become insufferably slow, but would still respond to the health checks. The usual fast failover approach used in traditional high-availability clusters is thus of little use.
For more details, read the Fail-Fast is Failing… Fast ACM Queue article.
Ansible and Jinja2 are not an ideal platform for data manipulation, but sometimes it’s easier to hack together something in Jinja2 than writing a Python filter. In those cases, you might find the Data Model Transformation with Jinja2 by Philippe Jounin extremely useful.
As I started Software Gone Wild podcast in June 2014, I wanted to help networking engineers grow beyond the traditional networking technologies. It’s only fitting to conclude this project almost seven years and 116 episodes later with a similar theme Avi Freedman proposed when we started discussing podcast topics in late 2020: how do we make networking attractive to young engineers.
I was listening to an excellent container networking podcast and enjoyed it thoroughly until the guest said something along the lines of:
With Kubernetes networking policy, you no longer have to be a networking expert to do container network security.
That’s not even wrong. You didn’t have to be a networking expert to write traffic filtering rules for ages.
A junior networking engineer asked me for a list of recommended entry-level networking blogs. I have no idea (I haven’t been in that position for ages); the best I can do is to share my list of networking-related RSS feeds and the process I’m using to collect interesting blogs:
A while ago, someone made a remark on my suggestions that networking engineers should focus on getting fluent with cloud networking and automation:
The running thing is, we can all learn this stuff, but not without having an opportunity.
I tend to forcefully disagree with that assertion. What opportunity do you need to test open-source tools or create a free cloud account? My response was thus correspondingly gruff:
Last week I described the new features added to netsim-tools release 0.4, including support for unnumbered interfaces and OSPF routing. Now let’s see how I used them to build a multi-vendor lab to test which platforms could be made to interoperate when running OSPF over unnumbered Ethernet interfaces.
I needed to define an unnumbered addressing pool first:
addressing: core: unnumbered: true
I wanted to run OSPF on all devices in the lab:
module: [ ospf ]
Have you ever wondered what the Kubernetes fuss is all about? Why would you ever want to use it? Stuart Charlton tried to answer that question in the introduction part of his fantastic Kubernetes Networking Deep Dive webinar.
It’s almost exactly three months since I announced ipSpace.net going on an extended coffee break. We had some ideas of what we plan to do at that time, but there were still many gray areas, and thanks to tons of discussions I had with many of my friends, subscribers, and readers, they mostly crystallized into this:
You’re trusting me to deliver. We added a “you might want to read this first” warning to the checkout process, and there was no noticeable drop in revenue. Thanks a million for your vote of confidence!
TL&DR: Client clock skew could result in AWS authentication failure when running terraform apply
When I wanted to compare AWS and Azure orchestration speeds I encountered a crazy Terraform error message when running terraform apply:
module.network.aws_vpc.My_VPC: Creating... Error: Error creating VPC: AuthFailure: AWS was not able to validate the provided access credentials status code: 401, request id: ...
Obviously I did all the usual stuff before googling for a solution:
Here’s a message I got from one of my subscribers (probably based on one of my recent public cloud rants):
I often think the cloud stuff has been sent to try us in IT – the struggle could be tough enough when we were dealing with waterfall development and monolithic projects. When products took years to develop, and years to understand.
And now we’re being asked to be agile and learn new stuff all the time about moving targets that barely have documentation at all, never mind accurate doco! We had obviously got into our comfort zone and needed shaking out of it!
Always interested to hear your experiences with the cloud networking though – it’s what I subscribed to ipspace.net for TBH as I think it’s the most complete reference source for that purpose and a vital part of enterprise networking these days!
TL&DR: The new release of netsim-tools includes unnumbered interfaces, configuration modules, and OSPF configuration.
In mid-March, we enjoyed another excellent presentation by Dinesh Dutt, this time focused on running OSPF in leaf-and-spine fabrics. He astonished me when he mentioned unnumbered Ethernet interfaces being available on all major network operating systems. It was time to test things out, and I wanted to use my networking simulation builder to build the test lab.
It’s amazing how easy it is to create a chatbot that will send messages to a Discord channel… just follow John Capobianco’s step by step tutorial.
In the second half of my chat with David Bombal we focused on automation and AI in networking. Even though we discussed many things, including the dangers of doing a repeatable job, and how to make yourself unique, David chose a nice click-bait headline Will AI Replace the Networking Engineers?. According to Betteridge’s law of headlines the answer is still NO, but it’s obvious AI will replace the low-level easy-to-automate jobs (as textile workers found out almost 200 years ago).
While pondering that statement, keep in mind that AI is more than just machine learning (the overhyped stuff). According to one loose definition, “Artificial intelligence (AI) refers to the simulation of human intelligence in machines that are programmed to think like humans and mimic their actions”
When I was complaining about the speed (or lack thereof) of Azure orchestration system, someone replied “I tried to do $somethingComplicated on AWS and it also took forever”
Following the “opinions are great, data is better” mantra (as opposed to “never let facts get in the way of a good story” supposedly practiced by some podcasters), I decided to do a short experiment: create a very similar environment with Azure and AWS.
I took simple Terraform deployment configuration for AWS and Azure. Both included a virtual network, two subnets, a route table, a packet filter, and a VM with public IP address. Here are the observed times:
TL&DR: Azure Route Server works as advertised. Setting it up is excruciatingly slow. You might want to start the process just before taking a long lunch break.
I decided to take Azure Route Server for a ride. Simple setup, two Networking Virtual Appliance (NVA) instances running Quagga to advertise a single prefix (just to see how multipathing works).
Here’s the diagram of what I set up: