Matt O

Author Archives: Matt O

5 Dev Tools for Network Engineers

This entry is part 1 of 1 in the series DevOps for Networking

I’d like to write about five things that you as a hardcore, operations-focused network engineer can do to evolve your skillset and take advantage of some of the methodologies that have for so long given huge benefits to the software development community. I won’t be showing you how to write code – this is less about programming, and more about the tools that software developers use every day to work more efficiently. I believe these tools offer a lot of potential benefit to network engineering and operations.

I’m of the opinion that “once you know what you don’t know, you’re halfway there”. After all, if you don’t know what you don’t know, then you can’t very well learn what you don’t know, can you? In that spirit, this article will introduce a few concepts briefly, and every single one will require a lot of hands-on practice and research to really understand thoroughly. However, it’s a good starting point, and I think if you can add even a few of these skills, your marketability as a network engineer will increase dramatically.


Proper Version Control

As a developer, version Continue reading

Why Network Automation Won’t Kill Your Job

I’ve been focusing lately on shortening the gap between traditional automation teams and network engineering. This week I was fortunate enough to attend the DevOps 4 Networks event, and though I’d like to save most of my thoughts for a post dedicated to the event, I will say I was super pleased to spend the time with the legends of this industry. There are a lot of bright people looking at this space right now, and I am really enjoying the community that is emerging.

I’ve heard plenty of excuses for NOT automating network tasks. These range from “the network is too crucial, automation too risky” to “automating the network means I, as a network engineer, will be put out of a job”.

To address the former, check out Ivan Pepelnjak’s podcast with Jeremy Schulman of Schprokits, where they discuss blast radius (regarding network automation).

I’d like to talk about that second excuse for a little bit, because I think there’s an important point to consider.


A Recent Example

A few years back, I was working for a small reseller helping small companies consolidate their old physical servers into a cheap cluster of virtual hosts. For every sizing discussion that Continue reading

Cisco NX-API 1.0 Update

If you weren’t paying attention, it was easy to miss. NX-API, Cisco’s new JSON/XML switch API, is now shipping as version 1.0. NX-API originated on the Nexus 9000 platform created by the Insieme group, and I’ve explored this in detail before.

In review, NX-API is a new, programmatic method of interacting with a Cisco Nexus switch. In many ways, Cisco is playing catch-up here, since this interface is really just a wrapper for the CLI (admittedly with some convenient output parsing), and most of their competitors have had similar interfaces for a while. Nevertheless, it is better than scraping an SSH session, so it’s worth looking into.

I’d like to go over a few new things you should know about if you are or will be working with this interface.


NX-API 1.0 Updates

From a strictly API perspective, not a lot seems to have changed. I would be more specific, but as of yet I’ve been unable to find release notes from Cisco on what’s changed from 0.1 to 1.0. If I ever find something like this, I’ll get my hands on it – part of publishing a good API means publishing good documentation, and Continue reading

Network Troubleshooting with ThousandEyes

My first experience with ThousandEyes was a year ago at Network Field Day 6, where they were kind enough to give us a tour of their office, and introduce us to their products. I’ve been fairly distracted since then, but kept an eye on what other delegates like Bob McCouch were doing with the product since that demo.

A year later, at Network Field Day 8, they presented again. If you’ve never heard of ThousandEyes, and/or would like an overview, watch Mohit’s (CEO) NFD8 introduction:


Debugging the Internet

One of the things that really stuck out a year ago, and was reinforced tenfold this year, was that ThousandEyes was not introducing any new protocols to the industry – at a time when all of the headlines were talking about new protocols (e.g. OpenFlow). Numerous tech startups – especially those in networking – exist purely to tackle the big “software-defined opportunity” gold rush.

Instead, ThousandEyes is focused on network monitoring. If you’re like me, you hear those words and immediately conjure up images of all of the... well, terrible software that exists today to monitor networks. In addition, network monitoring is inherently very fragmented. You can really only Continue reading

[SDN Protocols] Part 4 – OpFlex and Declarative Networking

This entry is part 5 of 5 in the series SDN Protocols

In this post, we will be discussing a relatively new protocol to the SDN scene – OpFlex. This protocol was largely championed by Cisco, but there are a few other vendors that have announced planned support for this protocol. I write this post because – like OVSDB – there tends to be a lot of confusion and false information about this protocol, so my goal in this post is to provide some illustrations that (hopefully) set the record straight, with respect to both OpFlex’s operation and its intended role.

Before I get started, I would be remiss to not point you towards a brilliant article by Kyle Mestery titled “OpFlex is not an OpenFlow Killer”. At the time the article was written, Kyle was working for Noiro, a team within the INSBU at Cisco focused (at least primarily) on open source efforts in SDN, and the creators of OpFlex.


The Declarative Model of Network Programmability

Before we get into the weeds of the OpFlex protocol, it’s important to understand the model that OpFlex intends to address. OpFlex is the protocol du jour within a Cisco ACI based Continue reading

[SDN Protocols] Part 3 – OVSDB

This entry is part 4 of 4 in the series SDN Protocols

Today, we will be discussing the Open vSwitch Database Management Protocol, commonly (and herein) referred to as OVSDB. This is a network configuration protocol that has been the subject of a lot of conversations pertaining to SDN. My goal in this post is to present the facts about OVSDB as they stand. If you want to know what OVSDB does, as well as does NOT do, read on.

I would like to call out a very important section, titled “OVSDB Myths”. I have encountered a lot of false information about OVSDB in the last year or so, and would like to address this specifically. Find this section at the end of this post.

If you’re new to OVSDB, it’s probably best to think of it in the same way you might think of any other configuration API like NETCONF, or maybe even proprietary vendor configuration APIs like NX-API; its goal is to provide programmatic access to the management plane of a network device or software. However, in addition to being a published open standard, it is quite different in its operation from other network APIs.


Control vs Continue reading

Dealing with Schema Changes

It’s not often I get to write about concepts rooted in database technology, but I’d like to illuminate a situation that software developers deal with quite often, and one that those entering this space from the network infrastructure side may want to consider.

Software will often communicate with other software using APIs – an interface built so that otherwise independent software processes can send and receive data between each other, or with other systems. We’re finding that this is a pretty hyped-up buzzword in the networking industry right now, since network infrastructure historically has had only one effective method of access, and that is the CLI; not exactly ideal for anything but human beings.

These APIs will typically use some kind of transport protocol like TCP (many also ride on top of HTTP), in order to get from point A to point B. The data contained within will likely be some kind of JSON or XML structure. As an example, here’s the output from a Nexus 9000 routing table:

<?xml version="1.0"?>
                                                <ipnexthop>172. Continue reading
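As a rough sketch of why the shape of that data matters to a client: looking values up by element name, rather than by position, is one way a script might keep working when fields are added or moved. This is only an illustration under stated assumptions. The `next_hops` helper and the sample document are hypothetical; only the `ipnexthop` element name comes from the output above, and the addresses are documentation-range placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; only the ipnexthop element name comes from the
# output above, and the addresses are documentation-range placeholders.
DOC = """<?xml version="1.0"?>
<route>
  <ipprefix>192.0.2.0/24</ipprefix>
  <ipnexthop>192.0.2.1</ipnexthop>
</route>"""


def next_hops(xml_text):
    """Collect every ipnexthop value, wherever it sits in the tree."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter("ipnexthop")]


print(next_hops(DOC))
```

Because `iter()` walks the whole tree, the lookup does not depend on how deeply the element is nested.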

[SDN Protocols] Part 2 – OpenFlow Deep-Dive

This entry is part 3 of 4 in the series SDN Protocols

In the last post, I introduced you to the concept of control plane abstraction, specifically the OpenFlow implementation. I talked about how OpenFlow allows us to specify the flows that we want to be programmed into the forwarding plane, from outside the forwarding device itself. We can also match on fields we typically don’t have access to in traditional networking, since current hardware is optimized for destination-based forwarding.
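To make the idea of matching on more than the destination concrete, here is a toy lookup in Python. It is a simplified model, not the protocol’s wire format or any controller’s API: the `lookup` function, the field names, and the flow entries are all made up for illustration. The only behavior it leans on is that the highest-priority matching entry wins.

```python
def lookup(flow_table, packet):
    """Return the action of the highest-priority entry matching the packet.

    An entry matches when every field it names equals the packet's value;
    fields the entry omits act as wildcards. None means nothing matched.
    """
    for entry in sorted(flow_table, key=lambda e: e["priority"], reverse=True):
        if all(packet.get(k) == v for k, v in entry["match"].items()):
            return entry["action"]
    return None


# Hypothetical flow entries; the field names are illustrative, not the
# specification's exact identifiers.
flow_table = [
    {"priority": 10, "match": {"ip_dst": "10.0.0.5"}, "action": "output:1"},
    {"priority": 20, "match": {"ip_src": "10.0.0.9", "tcp_dst": 80}, "action": "drop"},
]

packet = {"ip_src": "10.0.0.9", "ip_dst": "10.0.0.5", "tcp_dst": 80}
print(lookup(flow_table, packet))
```

Here the second entry matches on source address and TCP port, which a purely destination-based forwarding table could not express.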

In this post, I plan to cover quite a few bases. The goal of this post is to address the main concepts of OpenFlow’s operation, with links to find out more. With this post, you’ll be armed with the knowledge of what OpenFlow does and doesn’t do, as well as resources to dive even deeper.

NOTICE: This blog post was written referencing the specification and implementations of OpenFlow 1.3 – since this version, some aspects of the protocol may have changed (though it is likely the fundamentals discussed here will be mostly the same)


OpenFlow Tables

The OpenFlow specification describes a wide variety of topics. For instance, the protocol format that’s used to communicate with an OpenFlow switch Continue reading

Handling “Multiples” in Cisco NX-API with Python

A few weeks ago, I was working with the NX-API currently found on Cisco’s Nexus 9000 series switches, and ran into some peculiar behavior.

NX-API returns all information in terms of Tables and Rows. For a specific example, let’s look at what NX-API returns when I ask the switch for running OSPF processes:

There’s actually a lot more information in this snippet that pertains to the OSPF process itself, but I have omitted it for brevity. This specific example focuses on the section that describes the areas in this OSPF process.

  "ins_api": {
    "sid": "eoc",
    "type": "cli_show",
    "version": "0.1",
    "outputs": {
      "output": {
        "code": "200",
        "msg": "Success",
        "input": "show ip ospf",
        "body": {
          "TABLE_ctx": {
            "ROW_ctx": {
              ### OSPF process information omitted for brevity ###
              "TABLE_area": {
                "ROW_area": {
                  "age": "P15DT15H27M6S",
                  "loopback_intf": "1",
                  "passive_intf": "0",
                  "last_spf_run_time": "PT0S",
                  "spf_runs": "9",
                  "lsa_cnt": "5",
                  "no_summary": "false",
                  "backbone_active": "true",
                  "stub": "false",
                  "aname": "",
                  "total_intf": "2",
                  "auth_type": "none",
                  "act_intf": "2",
                  "nssa": "false",
                  "lsa_crc": "0x18d91"

NX-API uses a special tag that starts with TABLE, and within that, tag(s) that start with ROW, whenever it needs to describe something that would normally be Continue reading
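A minimal sketch of how a script might cope with this structure. The excerpt is cut off before it says so, so this assumes that a ROW key can hold either a single object or a list of objects, depending on how many entries the switch returns. The `rows` helper and the sample data are hypothetical.

```python
def rows(table, name):
    """Return the ROW_<name> entries inside a TABLE_<name> dict as a list.

    Assumption: a single entry arrives as a dict and several entries as a
    list, so both cases are normalized to a list here.
    """
    entry = table.get("ROW_" + name, [])
    if isinstance(entry, dict):
        return [entry]
    return list(entry)


# Hypothetical, trimmed-down response bodies for illustration only.
one_area = {"ROW_area": {"aname": "0.0.0.0", "total_intf": "2"}}
two_areas = {"ROW_area": [{"aname": "0.0.0.0"}, {"aname": "0.0.0.1"}]}

for body in (one_area, two_areas):
    for area in rows(body, "area"):
        print(area["aname"])
```

Normalizing to a list up front means the rest of the script can always iterate, whether the switch reports one area or many.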

[SDN Protocols] Part 1 – OpenFlow Basics

This entry is part 2 of 4 in the series SDN Protocols

Let’s get into our first topic. And what better place to start than with the protocol that arguably started the SDN madness that we’re experiencing today – OpenFlow! I got fairly carried away with writing about this protocol, and understandably so – this is a complicated topic.

That’s why I’ve split this post (which is already part of a series – very meta, much deep) into two parts. This post – Part 1 – will address OpenFlow’s mid- to high-level concepts, exploring what it does, why/how the idea of control plane abstraction may be useful, and some details on how hardware interaction works. The second post – Part 2 – will dive a little deeper into the operation of OpenFlow on supporting physical and virtual switches, and the differences in some popular implementations of OpenFlow.


The State of Modern Control Planes

Before we get into the specifics of OpenFlow, it’s important we address the relationship between the control plane and the data plane, and how OpenFlow changes this relationship. You’ve undoubtedly heard by now that one of SDN’s key traits is the “separation” or “abstraction” of the control plane from the Continue reading

[SDN Protocols] – New Series

This entry is part 1 of 4 in the series SDN Protocols

The networking industry in the last few years has seen an explosion in buzzwords, slide decks, new technologies, and SDN product announcements. The honest truth is that the networking industry is still in a great state of flux, as we collectively discover what SDN means to us.

There are a lot of new terms floating around, and to make things even harder to keep up with, the marketing engines are alive and well – muddying the waters, and making it nearly impossible to get technical facts straight. I’m fortunate enough to know a few people that remind me that what matters most is when the rubber meets the road (which usually manifests itself in “shut up and code”).

To that end, I am kicking off a series that will be completely dedicated to explaining the various protocols and technologies you might encounter in researching SDN.


Who Can Use This Series?

If you’re into open source implementations, all of this will be immediately relevant. Much of what I’ll be exploring pertains to the nitty-gritty under-the-covers operation of these protocols, and will often use real-world examples rooted deeply in open source, Continue reading

What is Unidirectional Automation?

I was pleased as punch to wake up the other day and read Marten Terpstra’s blog post on getting over the fear of using automation to make changes on our network infrastructure. He illuminated a popular excuse that I’ve heard myself on multiple occasions – that automation is great for things like threshold alarms, or pointing out the perceived root cause of a problem, but not actually fixing the problem. The idea is that handling the problems that occur on a regular basis, or even performing configuration changes in the first place, is a specialized task that a warm-blooded human being absolutely, no-doubt must take total control of in order to be successful.

With the right implementation, this idea is, of course, rubbish. I asked a question on Twitter not too long ago in preparation for a presentation I was about to give. I have a decent amount of experience working with VMware vSphere, and knew there were some experienced server virtualization folks following me, so I asked about a feature that was thought of in a similar light not too long ago:

Spine/Leaf Topology Explorer with Ansible

I’ve mentioned before the need for networks to be addressed in a very programmatic way. Very often, I’ve found the discussion is actually a lot less about “programming language” details and more about getting rid of the methodology of addressing the network as a mere “collection of boxes” (see “Box Mentality”).

Instead, we have the ability to address the network as any developer would address the distributed components of an application. We acknowledge that networks are a distributed system – it’s what makes them as scalable as they have been. However, it’s important to understand we can address configuration and troubleshooting needs in a unified, automated way as well.

My goal in this post is to explore one particular application of such a methodology. I will use Ansible to first create a dataset that represents a spine/leaf network topology – also demonstrating how it might scale beyond my small lab implementation – then I will move into some kind of network task based on this information.
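The post builds this dataset with Ansible. Purely to illustrate the shape such a dataset might take, here is a plain Python sketch rather than the approach used in the post. It relies only on the fact that in a spine/leaf design every leaf connects to every spine. The `build_topology` function and the device names are hypothetical.

```python
def build_topology(spines, leaves):
    """Return one link per (leaf, spine) pair.

    In a spine/leaf design every leaf connects to every spine, so the
    dataset grows as len(spines) * len(leaves).
    """
    return [
        {"leaf": leaf, "spine": spine}
        for leaf in leaves
        for spine in spines
    ]


# Hypothetical device names for a small lab.
spines = ["spine1", "spine2"]
leaves = ["leaf1", "leaf2", "leaf3"]

links = build_topology(spines, leaves)
print(len(links))
```

Adding a leaf or a spine to the input lists is enough to regenerate the full set of links, which is the scaling property the topology model is after.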

I have access to a few Cisco Nexus 9000 switches in the lab, and I wanted to be able to model a spine/leaf topology in a very elegant way that would (theoretically) scale as Continue reading

Network Automation or SDN?

With all of the activity going on in the networking industry right now, and all of the new terminology (as well as old re-invented terminology), it’s quite easy to get messages mixed up. After all, there’s no centralized dictionary for all of this stuff. I’d like to address something that has bugged me for a while.

I’ve now heard from quite a few folks that SDN to them means the ability to automate network tasks. This almost totally misses the point, in my opinion. Network automation should literally be thought of as a prerequisite for what we’ll likely be doing on our networks in 10 years; call it SDN if you want. My reasoning for this conclusion is almost 100% about the people involved. Allow me to elaborate.


What’s Missing?

In my experience the main thing that’s missing from 90% of enterprise networks today is that networking teams have not properly defined their workflows, and/or have not formalized a service catalog to other parts of the business. As a result, everything is fire-fighting, or one-off requests.

Tracking changes historically, and pinning them to business processes is totally impossible (if it’s even attempted), and garbage collection does not occur. Continue reading

Glue Networks at ONUG 2014

Glue Networks had a presence at the last ONUG, where Tom Hollingsworth was able to get an overview from Glue’s founder, Jeff Gray:

As you can see, Glue’s product targets the WAN, and specifically addresses the difficult provisioning tasks that most shops do manually. These include but are not limited to:

  • Provisioning (and deprovisioning) of QoS resources for various applications like SAP and Lync based off of need and time of day.
  • Bringing up remote sites in a standardized, cookie-cutter manner
  • Creating and changing PfR (performance routing) configurations on the WAN.

Jeff visited our Tech Field Day round table at ONUG 2014 and gave us a more detailed introduction to the product:

First, some things I think this product does (or will do) well. The configuration of PfR or QoS en masse is a low-hanging use case I’ve mentioned before and even if I can do it using scripts today, having a single tool that does it in a simple way will provide value. These specific configurations are difficult and error-prone, so anything that tackles this is going to be useful.

I also did enjoy hearing about the options for getting the config onto the device. Jeff listed three options for Continue reading

The Evolution of Network Programmability

This post is the “text” version of a talk I gave at Cisco Live US 2014 titled “SDN: People, Process, and Evolution”. While there are certainly some technical details involved here, this topic is really more of a philosophical one, and it is very near and dear to my heart as I talk with more folks about how networking is going to evolve in the years to come.

The Problem with Networking

Most of my readers would consider themselves network engineers – folks that live and breathe networking and everything that’s required to build networks. Folks like you and me don’t really need to hear what’s wrong with networking, as we live it every day. However, for the sake of others that may be reading, let me provide a little context here.

Nearly everyone in the industry is hearing about how “networking is slow” with respect to provisioning time. We hear about how virtual machines can be instantiated in a few seconds (hell, application containers can be spun up in less than a second!) yet the really important network stuff like firewall or load balancer policies take forever. They’re not wrong – networking has never really been tightly Continue reading

Pylint Errors – Final Newline Missing

I recently ran into a slew of errors when using Pylint – a sort of “quality checker” for your Python code. If you haven’t used it yourself, I highly recommend you check it out – it WILL make you a better Python coder. (Thanks to Matt Stone for introducing me!)

This particular error is common if you forget to append a newline character to the end of your Python script, but I was getting one for every single line of code in my program.

khalis:library Mierdin$ pylint 
No config file found, using default configuration
C:  1, 0: Final newline missing (missing-final-newline)
C:  2, 0: Final newline missing (missing-final-newline)
C:  3, 0: Final newline missing (missing-final-newline)
C:  4, 0: Final newline missing (missing-final-newline)
C:  5, 0: Final newline missing (missing-final-newline)
C:  6, 0: Final newline missing (missing-final-newline)
C:  7, 0: Final newline missing (missing-final-newline)

You get the idea.

My code clearly has a newline character of some kind at the end, but perhaps it’s just not the right one. We need to see what newline character our editor is actually appending to the end of our lines.

For this, we’ll use the (*nix) “od” command, which dumps files Continue reading
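The excerpt stops before the “od” output is shown. As a hedged alternative, the same question (which newline bytes does the file actually contain?) can be asked from Python. The `count_line_endings` helper and the sample bytes below are hypothetical and are not taken from the program discussed above.

```python
from collections import Counter


def count_line_endings(data):
    """Count CRLF, bare LF, and bare CR line endings in a bytes object."""
    counts = Counter()
    i = 0
    while i < len(data):
        if data[i:i + 2] == b"\r\n":
            counts["CRLF"] += 1
            i += 2
        elif data[i:i + 1] == b"\n":
            counts["LF"] += 1
            i += 1
        elif data[i:i + 1] == b"\r":
            counts["CR"] += 1
            i += 1
        else:
            i += 1
    return counts


# Made-up sample content; in practice, read the script with open(path, "rb").
sample = b"import os\r\nprint(os.name)\r\n"
print(count_line_endings(sample))
```

Reading the file in binary mode matters here, since text mode would translate the line endings before the script ever sees them.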

Recap of ONUG Conference 2014

Last week I attended the Open Networking User Group conference. My main reason for attending was to participate in three roundtable discussions put on by Tech Field Day. These sessions were recorded, and I’ll be following up with specific thoughts on each session in later blog posts.

These round-tables only occupied a portion of the two-day conference, so I spent the remainder of the time speaking with some of the vendors and sitting in a few of the sessions.


I wasn’t permitted to attend a large chunk of ONUG sessions, and I’ll get to that in the next paragraph. I did manage to see a good friend Kyle Mestery present on two of my favorite topics – OpenDaylight and OpenStack. The sessions at ONUG were not recorded, but I’ll again direct you to this video for a reasonably close approximation:

Kyle is the embodiment of the passion and energy found in great communities like OpenStack and OpenDaylight, and if you ever have the opportunity to hear him present, I encourage you to take it.

I also finally got to meet Brad Hedlund in meatspace:

Open Networking User Group Conference 2014

Today I’m off to NYC for Open Networking User Group 2014. Tech Field Day was at the last ONUG back in October, 2013 and they were kind enough to invite me out to this one. Here’s a quick intro video of ONUG for those that aren’t aware of it – Tom Hollingsworth interviews ONUG creator Nick Lippis:

We have a good group of vendors lined up for similar round-table discussions. I happen to LOVE this format of conversation, especially with the smart folks we’ve seen from vendors like Nuage and Cumulus. I am really looking forward to sitting down and talking tech.

My original outsider’s perspective was that ONUG in general (not counting nerdy Tech Field Day round table discussions) wasn’t really aimed towards the technical folks, but rather at executives, and at other IT decision makers looking for additional choices in networking infrastructure. While there’s certainly a lot of that, I’d like to call out a few sessions/events that really interest the nerd in me (as if I’m not 100% nerd).

Back in February, I had the pleasure of sitting in on Kyle Mestery’s presentation on integration with OpenDaylight and OpenStack at the OpenDaylight Summit:

Aside from a few Continue reading

Networking and the Consumption Model

I’ve talked with all kinds of IT professionals in the past year or so about building an organization of various IT disciplines that are truly service-oriented towards each other and to the other parts of the business. While I will never claim to be an expert in business development and will always claim allegiance to the nerdy technical bits, it’s easy to see the value in such an organizational model, and very interesting to explore the changes that technical people can make to push for such an approach. Let’s bring this down to earth a bit.



Server virtualization is old news now, so let’s go back about 15 years, before it was even really on the scene. You’ve heard the arguments for server virtualization, and the description of this “ancient age” – servers were provisioned on a 1:1 basis with applications, they took weeks to provision or replace, and the capex/opex costs were way too high because on the one hand, the sheer amount of hardware necessary to run your apps was outrageously expensive, and on the other hand, the power and cooling required to constantly run them was no better.

Let’s think about the kind of resources Continue reading