I’m pleased to be invited back for the 7th installment of Networking Field Day in San Jose, CA from February 19th - 21st. This event is part of a series of independent IT community-powered events that give the vendors an opportunity to talk about the products and ideas they’ve been working on, and receive honest and direct feedback from the delegates.
The results of this dynamic vary quite a bit - sometimes a vendor doesn’t quite bring their A-game, and we let them know.
I will be at the OpenDaylight Summit in Santa Clara on February 4th and 5th.
To me, OpenDaylight represents a platform on which we can test SDN ideas today, right here, and right now. In my day job I may never be called upon to help deploy OpenDaylight (here’s hoping!), but nonetheless, it’s projects like this where the best and brightest come together to share ideas and help shape the technology the industry will be using for the next few decades.
This will be a short follow-up to my last post about the idea of Flow Control in storage protocols. As a review, the three main options in common use today are:
IP Storage - uses TCP windowing to provide feedback to client on how much data to send
Native Fibre Channel - uses buffer credits (typically on a hop-by-hop basis when using N_Port to F_Port)
FCoE - uses Priority Flow Control (PFC) to define a class of service on which to send Ethernet PAUSE frames to manage congestion

The last item is really the only one that warrants any kind of configuration, as the first two are more or less baked into the protocol stacks; a rough sketch of what that PFC configuration looks like is below.
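To make that concrete, here is a minimal sketch of what enabling PFC for FCoE might look like on a Nexus 5000-series switch. The class-fcoe class (CoS 3) is a platform built-in, but the policy name here is made up, and the platform defaults usually cover most of this already - treat it as illustrative rather than a recipe.

! make the FCoE class (CoS 3 on this platform) lossless via PFC
policy-map type network-qos FCOE-NQ
  class type network-qos class-fcoe
    pause no-drop
    mtu 2158
  class type network-qos class-default
system qos
  service-policy type network-qos FCOE-NQ
! on the FCoE-facing interface, force PFC on (the default is auto-negotiation via DCBX)
interface Ethernet1/1
  priority-flow-control mode on

The point is simply that PFC is the one knob in the list above you may actually have to touch - TCP windowing and buffer credits take care of themselves.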
Sadly, this will be another post regarding issues I’ve had with UCSM firmware release 2.2(1b). During the upgrade process, I experienced a lot of issues with data plane connectivity - after I activated (and subsequently rebooted) a Fabric Interconnect and it came up with the new NX-OS version, a slew of blades showed persistent errors regarding virtual interfaces (VIFs) that wouldn’t come back online.
Here is the error report for a single blade where I was seeing these errors:
When upgrading UCS firmware, it’s important to periodically check the state of the HA clustering service running between the two Fabric Interconnects. The infrastructure portions of UCS are generally redundant thanks to these two FIs, but only if the clustering service has converged - so it’s important to use the “show cluster state” command to verify that this is the case. During a firmware upgrade to 2.2(1b), I checked this:
6296FAB-A# connect local-mgmt
6296FAB-A(local-mgmt)# show cluster state
Cluster Id: 8048cd6e-5d54-11e3-b36c-002a6a499d04
Unable to communicate with UCSM controller

The error message - “unable to communicate with UCSM controller” - worried me, and it was given when I ran the “show cluster state” command as well as the “cluster lead” command - the latter of which is necessary to switch an FI’s role in the cluster from subordinate to primary.
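For comparison, a healthy check looks roughly like this - the hostname and cluster ID are carried over from above for illustration, and the exact wording can vary a bit between UCSM releases, but the things to look for are both FIs reporting UP and the trailing “HA READY” line:

6296FAB-A(local-mgmt)# show cluster state
Cluster Id: 8048cd6e-5d54-11e3-b36c-002a6a499d04
A: UP, PRIMARY
B: UP, SUBORDINATE
HA READY

Only when the cluster reports HA READY is it reasonable to flip the primary role with “cluster lead a” or “cluster lead b” and move on to the other Fabric Interconnect.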
One of the (sadly numerous) issues I’ve run into while upgrading to Cisco UCSM version 2.2(1b) is this little error message indicating that a service failed to start:
This gives us an error code of F0867 and it’s letting us know that the UCSM process httpd_cimc.sh failed on one of our Fabric Interconnects.
For those that don’t know, you can get a list of processes within UCSM by connecting to local management and running “show pmon state”.
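If you haven’t done that before, the sequence looks something like this (the hostname is illustrative); the output lists each UCSM service - httpd_cimc.sh among them - along with its current state and restart counts:

UCS-A# connect local-mgmt
UCS-A(local-mgmt)# show pmon state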
When making the leap to adopting FCoE as a storage medium, there are a few things to consider in order to be successful. Many of these concepts are foreign to the storage administrator who has been operating a native Fibre Channel SAN for the better part of the last decade or more - this is because while Fibre Channel networks are costly, they are purpose-built. There is no concept of a loop in Fibre Channel, whereas with Ethernet we deal with loops all the time.
Over the past few months I’ve heard a lot about vendor lock-in, specifically with respect to new SDN/Network Virtualization products that have come out last year. It appears that no matter what product you look at, there’s some major factor that will cause you to be severely locked in to that vendor until the end of time. Unless, of course, you’re a proponent of that vendor, in which case, that vendor is NOT locking you in, but that other guy totally is.
I’ve been hearing a lot about libvirt, so I figured I’d check it out, and see if I could play around with it in my own home lab.
According to the wiki, libvirt is a “collection of software that provides a convenient way to manage virtual machines and other virtualization functionality, such as storage and network interface management.” Okay, that’s pretty cool - basically if I have a multi-hypervisor environment, I can build my operational policies around libvirt, so that no matter what hypervisor a workload is being instantiated on, the process is the same.
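As a quick taste of that, the virsh command-line tool that ships with libvirt behaves the same way against any supported hypervisor - only the connection URI changes. A minimal sketch, assuming a local QEMU/KVM host and a made-up domain name of “web01”:

# list all domains (VMs) known to the local QEMU/KVM host
virsh -c qemu:///system list --all
# start the example domain and show its basic details
virsh -c qemu:///system start web01
virsh -c qemu:///system dominfo web01

Point the -c URI at a different driver and the commands themselves stay the same - which is exactly the operational consistency I’m after.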
Wow - this one snuck up on me. Seriously, when I think of how 2013 went, I’m amazed at how much happened this year but also how fast it flew by.
As per the tradition I started last year, I thought it prudent to write a post summarizing how terribly I was able to forecast 2013 in terms of personal goals, and make another feeble attempt at planning out 2014.
I see a lot of articles and even vendor whitepapers that like to throw the term “best practice” around like it’s pocket change. Truth be told, while there are plenty of general best practices that are recommended in any case, much of what a vendor calls a “best practice” is usually just the most common response to an If/Then statement that represents the surrounding environment.
Here’s a good example. I’ve heard on multiple occasions, regarding the standard vSwitch in VMware vSphere, that it is a “best practice” to set the load balancing policy to “route based on the originating virtual port ID”.
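For what it’s worth, that setting is easy enough to inspect or change from the ESXi shell. A minimal sketch is below - the vSwitch name is just an example, and the exact option spellings may differ between ESXi releases:

# show the current teaming/failover policy on a standard vSwitch
esxcli network vswitch standard policy failover get --vswitch-name=vSwitch0
# set load balancing to “route based on originating virtual port ID”
esxcli network vswitch standard policy failover set --vswitch-name=vSwitch0 --load-balancing=portid

Whether that really is the right choice, though, depends entirely on the If/Then factors in your environment - for example, whether the upstream switches are configured for a port channel - which is the whole point.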
I was troubleshooting an MTU-related issue with NFS connectivity in a FlexPod (Cisco UCS, Cisco Nexus, and NetApp storage with VMware vSphere, running the Nexus 1000v). Regular-sized frames were making it through, but not jumbo frames. I ensured the endpoints were set up correctly, then moved inward - in my experience, that’s usually where the problem is.
The original design called for tagging all NFS traffic with CoS 2, so that it could be honored throughout the network and given jumbo frame treatment; a rough sketch of what that looks like on the Nexus side is below.
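As a rough illustration only - the class and policy names here are made up, and the exact syntax varies by platform - matching CoS 2 and granting it jumbo MTU on a Nexus 5000-series switch looks something like this (on the Nexus 1000v side, the CoS 2 marking itself would come from a qos policy applied to the NFS port-profile):

! classify traffic arriving with CoS 2 into qos-group 2
class-map type qos match-all NFS-TRAFFIC
  match cos 2
policy-map type qos NFS-MARKING
  class NFS-TRAFFIC
    set qos-group 2
! give that qos-group jumbo MTU through the switch
class-map type network-qos NFS-NQ
  match qos-group 2
policy-map type network-qos JUMBO-NQ
  class type network-qos NFS-NQ
    mtu 9216
  class type network-qos class-default
system qos
  service-policy type qos input NFS-MARKING
  service-policy type network-qos JUMBO-NQ

Missing the network-qos MTU on any hop in the path is exactly the kind of thing that lets standard frames through while quietly dropping jumbos.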
I saw this Engineers Unplugged video today and was reminded of a viewpoint I’ve been slowly developing over the last two years or so:
Essentially the discussion is about convergence technologies like FCoE, where we rid ourselves of a completely separate network, and converge FC storage traffic onto our standard Ethernet network. With this technology shift, how does this impact the administration of the technology? Do the teams have to converge as well?