Category Archives for "The Data Center Overlords"

Fibre Channel in the Cloud: FCaaS

Public cloud providers such as Amazon Web Services, Microsoft Azure, and Rackspace, as well as private cloud systems such as OpenStack, have dominated the computing landscape for the past several years. And once a joke of a marketing term (remember Larry Ellison’s super villain-monologue on the topic?), the cloud is now A Thing, with a definition and everything.

One technology that seemed like it was getting left behind in all these cloud games, however, was Fibre Channel. Ephemeral compute nodes, object storage, extreme scale, elastic provisioning — all of these were characteristics that were initially thought to be bad fits for Fibre Channel.


Sad Fibre Channel is Sad

As it turns out, Fibre Channel is right at home in the cloud.


Amazon Web Services has recently rolled out Fibre Channel as a Service (FCaaS), as have Rackspace, Digital Ocean, and Microsoft Azure.

All of those public cloud providers have some sort of block storage offerings, but they’re typically based on something like iSCSI or another back-end block protocol. Customers have been demanding the kind of block storage in the public cloud, where they can control zoning and zonesets, just like they do in their traditional data centers worlds.

The Continue reading

LACP is not Link Aggregation

So there’s a mistake I’ve been making, for years. I’ve referred to what is link aggregation as “LACP”.  As in “I’m setting up an LACP between two switches”. While you can certainly set up LACP between to switches, the more correct term for the technology is link aggregation (as defined by the IEEE), and an instance of that is generically called a LAG (Link Aggregation Group). LACP is an optional part of this technology.

Here I am explaining this and more in an 18 minute Youtube video.

Fibre Channel: What Is It Good For?

In my last article, I talked about how Fibre Channel, as a technology, has probably peaked. It’s not dead, but I think we’re seeing the beginning of a slow decline. Fibre Channel’s long goodbye is caused by a number of factors (that mostly aren’t related to Fibre Channel itself), including explosive growth in non-block storage, scale-out storage, and interopability issues.

But rather than diss Fibre Channel, in this article I’m going to talk about the advantages of Fibre Channel has over IP/Ethernet storage (and talk about why the often-talked about advantages aren’t really advantages).

Fibre Channel’s benefits have nothing to do with buffer to buffer credits, the larger MTU (2048 bytes), its speed, or even its lossless nature. Instead, Fibre Channel’s (very legitimate) advantages are mostly non-technical in nature.

It’s Optimized Out of the Box

When you build a Fibre Channel-based SAN, there’s no optimization that needs to be done: Fibre Channel comes out of the box optimized for storage (SCSI) traffic. There are settings you can tweak, but most of the time there’s nothing that needs to be done other than set port modes and setup zoning. The same is true for the host HBAs. While there are some Continue reading

Peak Fibre Channel

There have been several articles talking about the death of Fibre Channel. This isn’t one of them. However, it is an article about “peak Fibre Channel”. I think, as a technology, Fibre Channel is in the process of (if it hasn’t already) peaking.

There’s a lot of technology in IT that doesn’t simply die. Instead, it grows, peaks, then slowly (or perhaps very slowly) fades. Consider Unix/RISC. The Unix/RISC market right now is a caretaker platform. Very few new projects are built on Unix/RISC. Typically a new Unix server is purchased to replace an existing but no-longer-supported Unix server to run an older application that we can’t or won’t move onto a more modern platform. The Unix market has been shrinking for over a decade (2004 was probably the year of Peak Unix), yet the market is still a multi-billion dollar revenue market. It’s just a (slowly) shrinking one.

I think that is what is happening to Fibre Channel, and it may have already started. It will become (or already is) a caretaker platform. It will run the workloads of yesterday (or rather the workloads that were designed yesterday), while the workloads of today and tomorrow have a vastly different set of Continue reading

The Cloud Is Now A Thing

In the networking world, we’re starting to see the term “cloud” more and more. When I teach classes, if I so much as mention the word cloud, I start to see some eyes roll. That’s completely understandable, as the term cloud was such an overused buzzword, only having been recently supplanted only by “software defined”.

Here’s real-life supervillain (dude owns an MiG 29 and an island with a volcano on it… seriously) Larry Ellison freaking out about the term cloud.

“It’s not water vapor! All it is, is a computer attached to a network!”

But here’s the thing, it’s actually a thing now. Rather than a catch-all buzzword, it’s being used more and more to define a particular type of operational model. And it’s defined by NIST, the National Institute of Standards and Technology, part of the US Department of Commerce. With the term cloud, we now get a higher degree of specificity.

The NIST definition of cloud is as follows:

  • On-demand self service
  • Broad network access
  • Resource pooling (multi-tenant)
  • Rapid Elasticity
  • Measured service

That first item on the list, the on-demand self service, is a huge change in how we will be doing networking. Right now network Continue reading

Ethernet over Fibre Channel

Since the 80’s, Ethernet has dominated the networking world. The LAN, the WAN, and the MAN are all now dominated by Ethernet links. FIDDI, HIPPI, ATM, Frame Relay, they’ve all gone by the wayside. But there is one protocol that has stuck around to run alongside Ethernet, and that’s Fibre Channel. While Fibre Channel has mostly sat in the shadow of Ethernet, relegated to only storage traffic, it’s now poised to overtake Ethernet in the battle for the LAN. And the way that Fibre Channel is taking on Ethernet is with Ethernet over Fibre Channel.


Suck it, Metcalfe

While Ethernet has enjoyed tremendous popularity, it has several (debilitating) limitations. For one, forwarding is haunted the possibility of a loop, and Spanning Tree Protocol is required to keep a watchful eye. Unfortunately, STP is almost as bad as a loop, with the ample opportunity for misconfigurations (rouge root bridges) and other shenanigans.  TRILL, a Layer 2 overlay for Ethernet that allows multi-pathing, hasn’t found its way into a commercial product yet, and its derivatives (FabricPath from Cisco and VCS from Brocade) haven’t seen much in the way of adoption.

Rathern than pile fix upon fix on Ethernet, SAN administrators (known for Continue reading

Fibre Channel and FCoE: Some Basics

There’s been some misconceptions and misinformation lately about FCoE. Like any technology, there are times when it makes sense and times when it doesn’t, but much of the anti-FCoE talk lately has been primarily ignorance and/or wilful misrepresentation.

In an effort to fight that ignorance, I put together a quick introduction to how FC and FCoE works. They both operate on the basic premise that you can’t drop any frames. Fibre Channel was built as a lossless protocol, and with a bit of work, Ethernet can also be lossless.

Check it out:

Learn what Russ Fellows Doesn’t Know

So how’s this for a condescending tweet?

It’s from Russ Fellows, author of the infamous FCoE “study” (which has been widely debunked for its many hilarious errors):

Interesting article (check it out). But the sad/amusing irony is that he’s wrong. How is he wrong? Here’s what Russ Fellows doesn’t know about storage:

1, 2, 4, and 8 Gbit Fibre Channel (as he points out) uses 8/10 bit encoding. That means about a 20% of the bandwidth available was lost due to encoding overhead (as Russ pointed out). That’s why 8 Gbit Fibre Channel only provides 800 MB/s of connectivity, even though 8,000 Megabits per second equates to 1,000 Megabytes per second (8000 Megabits / (8 bits per byte) = 1,000 Megabytes).

With this overhead in mind, Fibre Channel was designed to give 100 MB/s for every Gigabit of speed. It never increased the baud rate to make up for the overhead.

Ethernet, on the other hand, did increase the baud rate to make up for Continue reading

OTV AEDs Are Like Highlanders

While prepping for CCIE Data Center and playing around with a lab environment, I ran into a problem I’d like to share.

I was setting up a basic OTV setup with three VDCs running OTV, connecting to a core VDC running the multicast core (which is a lot easier than it sounds). I’m running it in a lab environment we have at Firefly, but I’m not going by our normal lab guide, instead making it up as I go along in order to save some time, and make sure I can stand up OTV without a lab guide.

Each VDC will set up an adjacency with the other two, with the core VDC providing unicast and multicast connectivity.  That part was pretty easy to setup (even the multicast part, which had previously freaked me the shit out). Each VDC would be its own site, so no redundant AEDs.

On each OTV VDC, I setup the following as per my pre-OTV checklist:

  • Bi-directional IPv4 unicast connectivity to each join interface (I used a single OSPF area)
  • MTU of 9216 end-to-end (easy since OTV requires M line cards, and it’s just an MTU command on the interface)
  • An OTV site VLAN which requires:

Top 5 Reasons The Evaluator Group Screwed Up

It’s been a while since the trainwreck of a “study” commissioned by Brocade and performed by The Evaluator Group,  but it’s still being discussed in various storage circles (and that’s not good news for Brocade). Some pretty much parroted the results, seemingly without reading the actual test. Then got all pissy when confronted about it.  I did a piece on my interpretations of the results, as did Dave Alexander of WWT and J Metz of Cisco. Our mutual conclusion can be best summed up with a single animated GIF.



But since a bit of time has passed, I’ve had time to absorb Dave and J’s opinions, as well as others, I’ve come up with a list of the Top 5 Reasons by The Evaluator Group Screwed Up. This isn’t the complete list, of course, but some of the more glaring problems. Let’s start with #1:

Reason #1: I Have No Idea What I’m Doing

Their hilariously bad conclusion to the higher variance in response times and higher CPU usage was that it was the cause of the software initiators. Except, they didn’t use software initiators. The had actually configured hardware initiators, and didn’t know it. Let that sink Continue reading

Fibre Channel: The Heart of New SDN Solutions

From Juniper to Cisco to VMware, companies are spouting up new SDN solutions. Juniper’s Contrail, Cisco’s ACI, VMware’s NSX, and more are all vying to be the next generation of data center networking. What is surprising, however, is what’s at the heart of these new technologies.

Is it VXLAN, NVGRE, Openflow? Nope. It’s Fibre Channel.


If you think about it, it makes sense. Fibre Channel has been doing fabrics since before we ever called Ethernet fabrics, well, fabrics. And this isn’t the first time that Fibre Channel has shown up in unusual places. There’s a version of Fibre Channel that runs inside certain airplanes, including jet fighters like the F-22.


Keep the skies safe from FCoE (sponsored by the Evaluator Group)

New generation of switches have been capable of Data Center Bridging (DCB), which enables Fibre Channel over Ethernet. These chips are also capable of doing native Fibre Channel So rather than build complicated VPLS fabrics or routed networks, various data center switching companies are leveraging the inherent Fibre Channel capabilities of the merchant silicon and building Fibre Channel-based underlay networks to support an IP-based overlay.

Buffer-to-buffer (B2B) credit system and losslessness of Fibre Channel, plus the new 32/128 Continue reading

Hey, Remember vTax?

Hey, remember vTax/vRAM? It’s dead and gone, but with 6 Terabyte of RAM servers now available, imagine what could have been (your insanely high licensing costs).

Set the wayback machine to 2011, when VMware introduced vSphere version 5. It had some really great enhancements over version 4, but no one was talking about the new features. Instead, they talked about the new licensing scheme and how much it sucked.


While some defended VMware’s position, most were critical, and my own opinion… let’s just say I’ve likely ensured I’ll never be employed by VMware. Fortunately, VMware came to their senses and realized what a bone-headed, dumbass move that vRAM/vTax was, and repealed the vRAM licensing one year later in 2012. So while I don’t want to beat a dead horse (which, seriously, disturbing idiom), I do think it’s worth looking back for just a moment to see how monumentally stupid that licensing scheme was for customers, and serve as a lesson in the economies of scaling for the x86 platform, and as a reminder about the ramifications of CapEx versus OpEx-oriented licensing.

Why am I thinking about this almost 2 years after they got rid of vRAM/vTax? I’ve been Continue reading

Changing Data Center Workloads

Networking-wise, I’ve spent my career in the data center. I’m pursuing the CCIE Data Center. I study virtualization, storage, and DC networking. Right now, the landscape in the network is constantly changing, as it has been for the past 15 years. However, with SDN, merchant silicon, overlay networks, and more, the rate of change in a data center network seems to be accelerating.


Things are changing fast in data center networking. You get the picture

Whenever you have a high rate of change, you’ll end up with a lot of questions such as:

  • Where does this leave the current equipment I’ve got now?
  • Would SDN solve any of the issues I’m having?
  • What the hell is SDN, anyway?
  • I’m buying vendor X, should I look into vendor Y?
  • What features should I be looking for in a data center networking device?

I’m not actually going to answer any of these questions in this article. I am, however, going to profile some of the common workloads that you find in data centers currently. Your data center may have one, a few, or all of these workloads. It may not have any of them. Your data center may have one of the Continue reading

FCoE versus FC Farce (I’m Tellin’ All Y’All It’s Sabotage!)

Updates 2/6/2014:

  • @JohnKohler noticed that the UCS Manager screenshot used (see below) is from a UCS Emulator, not any system they used for testing.
  • Evaluator Group promises answers to questions that both I and Dave Alexander (@ucs_dave) have brought up.

On my way back from South America/Antarctica, I was pointed to a bake-off/performance test commissioned by Brocade and performed by a company called Evaluator Group. It compared the performance of edge FCoE (non-multi-hop FCoE) to native 16 Gbit FC. The FCoE test was done on a Cisco UCS blade system connecting to a Brocade switch, and the FC was done on an HP C7000 chassis system connecting to the same switch. At first glance, it would seem to show that FC is superior to FCoE for a number of reasons.

I’m not a Cisco fanboy, but I am a Cisco UCS fanboy, so I took great interest in the report. (I also work for a Cisco Learning Partner as Continue reading

2013 Was A Good Year

Happy new year everyone. I think 2014 will be quite an interesting year for the industry. 2013 certainly was for me, at least professionally and personally. I tried twice to get my CCIE DC, didn’t pass. I did, however, obtain my CCNP Data Center. I also learn a whole bunch of new skills. Here’s a quick clip show (and yes, there are shots of me skydiving in a Star Trek TNG Uniform).

OpenFlow/SDN Won’t Scale?

I got in a conversation today on Twitter, talking about SDN/SDF (software defined forwarding), which is a new term I totally made up which I use to describe the programmatic and centralized control of forwarding tables on switches and multi-layer switches. The comment was made that OpenFlow in particular won’t scale, which reminded me of an article by Doug Gourlay of Arista talking about scalability issues with OpenFlow.

The argument that Doug Gourlay of Arista had is essentially that OpenFlow can’t keep up with the number of new flows in a network (check out points 2 and 3). In a given data center, there would be tens of thousands (or millions or tens of millions) of individual flows running through a network at any given moment. And by flows, I mean keeping track of stateful TCP connection or UDP pseudo-flows. The connection rate would also be pretty high if you’re talking dozens or hundreds of VMs, all taking in new connections. 

My answer is that yeah, if you’re going to try to put the state of every TCP connection and UDP flow into the network operating system and into the forwarding tables of the devices, that’s Continue reading