Category Archives for "The Data Center Overlords"

PSA: Virtual Interfaces (in ESXi) Aren’t Limited To Reported Interface Speeds

There's an incorrect assumption that comes up from time to time, one that I shared for a while: that VMware ESXi virtual NIC (vNIC) interfaces are limited to their reported "speed".

In my stand-alone ESXi 7.0 installation, I have two options for NICs: vmxnet3 and e1000. The vmxnet3 interface shows up as 10 Gigabit on the VM, and the e1000 shows up as a 1 Gigabit interface. Let's test them both.

One test system is a Rocky Linux installation, the other is CentOS 8 (RIP CentOS). They're both on the same ESXi host on the same virtual switch. The test program is iperf3, installed from the default package repositories. If you want to test this on your own, it really doesn't matter which OS you use, as long as it's decently recent and they're on the same vSwitch. I'm not optimizing for throughput, just throwing enough horsepower at it to try to exceed the reported link speed.

The ESXi host is running ESXi 7.0 on an older Intel Xeon E3 with 4 cores (no hyperthreading).
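
If you want to reproduce the test, here's a minimal sketch of the harness (the hostname, stream count, and duration are my assumptions, not the exact invocation from the post). Start iperf3 in server mode ("iperf3 -s") on one VM, then run this on the other:

import json
import subprocess

SERVER = "rocky-vm.lab.local"  # hypothetical hostname/IP of the VM running "iperf3 -s"

# 4 parallel streams for 30 seconds, JSON output so we can parse the result
result = subprocess.run(
    ["iperf3", "-c", SERVER, "-P", "4", "-t", "30", "-J"],
    capture_output=True, text=True, check=True,
)

report = json.loads(result.stdout)
gbps = report["end"]["sum_received"]["bits_per_second"] / 1e9
print(f"Achieved throughput: {gbps:.2f} Gbit/s")

Anything meaningfully above 10 Gbit/s on the "10 Gigabit" vmxnet3 vNIC (or above 1 Gbit/s on the e1000) shows the reported link speed isn't an actual cap.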

Running iperf3 on the vmxnet3 interfaces, which show up as 10 Gigabit on the Rocky VM:

[ 1.323917] vmxnet3 0000:0b:00.0 ens192: renamed  Continue reading

Cut-Through Switching Isn’t A Thing Anymore

So, cut-through switching isn't a thing anymore. It hasn't been for a while, really, and in the age of VXLAN it's especially not a thing. Of course, as with all things IT, there are exceptions. But by and large, cut-through switching just isn't a thing.

And it doesn’t matter.

Cut-through versus store-and-forward was a genuine preference years ago. The idea was that cut-through switching has less latency than store-and-forward (it does, to a certain extent), so it became the preferred method, and purchasing decisions may have been made (and sometimes still are, mostly erroneously) based on whether a switch is cut-through or store-and-forward.

In this article I’m going to cover two things:

  • Why you can’t really do cut-through switching
  • Why it doesn’t matter that you can’t do cut-through switching

Why You Can’t Do Cut-Through Switching (Mostly)

You can't do cut-through switching when you change speeds. If the bits in a frame arrive at 10 Gigabit, they need to go into a buffer before they're sent out a 100 Gigabit uplink; otherwise the faster interface would run out of bits to send. The reverse is also true: a frame pouring into a 100 Gigabit interface arrives 10 times faster than a 10 Gigabit interface can send it, so it has to be buffered as well (the frame itself isn't slowed down, it just has to be stored first).
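
To put rough numbers on why (my own back-of-the-envelope arithmetic, not figures from the post), compare how long a standard 1500-byte frame spends on the wire at each speed:

# Serialization delay of a 1500-byte frame at different link speeds.
FRAME_BITS = 1500 * 8

def serialization_delay_us(link_gbps: float) -> float:
    """Time to clock one frame onto (or off of) the wire, in microseconds."""
    return FRAME_BITS / (link_gbps * 1e9) * 1e6

print(f"10G:  {serialization_delay_us(10):.2f} us per frame")   # ~1.20 us
print(f"100G: {serialization_delay_us(100):.2f} us per frame")  # ~0.12 us

If the 100 Gigabit egress started transmitting as soon as the first bits arrived from the 10 Gigabit ingress, it would run out of bits to send almost immediately, which is exactly why the frame has to sit in a buffer first.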

So any switch Continue reading

Requiem for FCoE

FCoE is dead. We're beyond the point of even asking if FCoE is dead; we all know it just is. It was never widely adopted and it's likely never going to be widely adopted. It enjoyed a few awkward deployments here and there, and a few isolated islands in the overall data center market, but it never caught on the way it was intended to.

So What Killed FCoE?

So what killed FCoE? Here I’m going to share a few thoughts on why FCoE is dead, and really never was A Thing(tm).

It Was Never Cheaper

Ethernet is the champion of connectivity. It's as ubiquitous as water in the ocean and air in the... well, air. All the other mediums (ATM, Frame Relay, FDDI, Token Ring) fell by the wayside long ago. Even mighty InfiniBand has fallen. Only Fibre Channel still stands as the alternative, for a very narrow use case.

The thought was that the sheer volume of Ethernet ports would make them cheaper (and that still might happen), but right now there's no real price benefit to using FCoE versus FC.

In the beginning, especially, FCoE was quite a bit more expensive than running separate Continue reading

The Three Levels of Data Protection for Data Hoarders

The following post is aimed at photographers and other digital hoarders: those of us who want to keep various digital assets not just for a few years, but for a lifetime, and even multiple lifetimes (passed down, etc.).

There are three levels of data protection: Data resiliency, data backup, and data archive.

Data Resiliency (Redundant Disks, RAID, NAS/DAS)

Data resiliency is when you have multiple disks in some sort of redundant configuration. Typically this is some type of RAID array, though there are other technologies now that operate similarly to RAID (such as ZFS, Storage Spaces, etc.). This will protect you from a drive failure. It will not, however, protect you from accidental file deletion, theft, flood/natural disaster, etc. The drives have the same file system on them, and thus have a lot of "shared fate": if something happens to one, it can happen to the others.

To put it simply, while there are some scenarios where your data is protected by data resiliency (drive failure), there are scenarios where it won't be (flood, theft).

RAID is not backup.

Data Backup

One of the maxims we have in the IT industry in which I’ve worked for the past Continue reading

Wow: NVMe and PCIe Gen 4

Recently it’d come to my attention that my old PC rig wasn’t cutting it.

Considering it was 10 years old, it was doing really well. I mean, I went from HDD to a 500 GB SSD to a 1 TB SSD, upped the RAM, and replaced the GPU at least once. But still, it was a 4-core system (8 threads) and it had performed admirably.

The Intel NIC was needed because the built-in ASUS Realtek NIC was a piece of crap, only able to push about 90 MB/s. The Intel NIC was able to push 120 MB/s (close to the theoretical max for 1 Gigabit, which is 125 MB/s).
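
Just to sanity-check those numbers (my arithmetic, not the post's):

# 1 Gigabit Ethernet tops out at 1000 Mbit/s / 8 bits per byte = 125 MB/s.
line_rate_MBps = 1000 / 8   # 125 MB/s theoretical ceiling

for nic, mbps in {"Realtek (onboard)": 90, "Intel (add-in)": 120}.items():
    print(f"{nic}: {mbps} MB/s = {mbps / line_rate_MBps:.0%} of line rate")
# Realtek (onboard): 90 MB/s = 72% of line rate
# Intel (add-in): 120 MB/s = 96% of line rate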

The thing that broke the camel's back, however, was video. Specifically 4K video. I've been doing video edits and so forth in 1080p, but moving to 4K and the power of Premiere Pro (as opposed to iMovie) was just killing my system. 1080p was a Continue reading

For ESXi: Realtek NICs Are Awful And Don’t Use Them

OK, this isn't really a controversial opinion. This is more of a guide for those who run into these problems when trying to set up their first whitebox/homelab systems for ESXi.

So it goes something like this: You’ve got an old desktop, gaming rig, or workstation. You decide you’ll retire it to your home data center (or basement, or laundry room) as a hypervisor. ESXi by itself (no vSphere controller) is free, and here’s how to download and get the license key.

For most desktop/workstation type hardware, you can install ESXi from the general ESXi installer except for one aspect: many of these types of systems use Realtek, Marvell, or other desktop/consumer-grade NICs, and there's no ESXi driver for these. And for good reason: they suck.

So you have a choice: try to use a special custom ISO installer with the Realtek/Marvell/etc. driver loaded, or buy a different NIC. In most of IT, there's usually more than one right answer, and a heaping dose of "it depends". However, for this particular question (Realtek or buy another NIC) there's only one right answer: buy another NIC.

Realtek NICs suck. They don’t perform well, they’re a pain to Continue reading

Certification Exam Questions That I Hate

In my 11-year career as an IT instructor, I've had to pass a lot of certification exams. In many cases not on the first try. Sometimes for fair reasons, and sometimes, it feels, for unfair reasons. Recently I had to take the venerable Cisco CCNA R&S exam again. For various reasons I'd allowed it to expire, and hadn't taken many exams for a while. But I needed to re-certify, which reminded me of the whole process.

Having taken so many exams (50+ in the past 11 years) I’ve developed some opinions on the style and content of exams.

In particular, I've identified some types of questions I utterly loathe for their lack of aptitude measurement, uselessness, and overall jackassery. Plus, a couple of styles that I like.

This criticism applies to all certification exams, from various vendors, and isn't even limited to IT.

To Certify, Or Not To Certify

The question of the usefulness of certification is not new.

On one hand, you have a need to weed out the know-its from the know-it-nots, a way to effectively measure a person's aptitude in a given subject. A certification exam, in its purest form, is meant to Continue reading

A Discussion On Storage Overhead

Let’s talk about transmission overhead.

For various types of communications protocols, ranging from Ethernet to Fibre Channel to SATA to PCIe, there are typically additional bits transmitted to help with error correction, error detection, and/or clock sync. These additional bits eat up some of the bandwidth, and are generally referred to as just "the overhead".

1 Gigabit Ethernet, 8 Gigabit Fibre Channel, and SATA I, II, and III all use 8/10 (8b/10b) encoding overhead, which means for every eight bits of data, an additional two bits are sent.

The difference is who pays for those extra bits. With Ethernet, Ethernet pays. With Fibre Channel and SATA, the user pays.

1 Gigabit Ethernet has a raw transmit rate of 1 gigabit per second. However, the actual transmission rate (baud, the rate at which raw 1s and 0s are transmitted) for Gigabit Ethernet is 1.25 gigabaud. This is to make up for the 8/10 overhead.
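
Here's the Gigabit Ethernet math as a quick sketch (my arithmetic, just restating the paragraph above):

# Gigabit Ethernet: the wire is clocked faster than the nominal rate so the
# 8/10 encoding overhead doesn't come out of the user's bandwidth.
line_rate_gbaud = 1.25                  # actual signaling rate on the wire
usable_gbps = line_rate_gbaud * 8 / 10  # 2 of every 10 transmitted bits are encoding overhead
usable_MBps = usable_gbps * 1000 / 8    # bits -> bytes

print(f"Usable: {usable_gbps:.2f} Gbit/s = {usable_MBps:.0f} MB/s")
# Usable: 1.00 Gbit/s = 125 MB/s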

SATA and Fibre Channel, however, do not up the baud rate to accommodate the 8/10 overhead. As such, even though 1,000 megabits / 8 bits per byte = 125 MB/s, Gigabit Fibre Channel only provides 100 MB/s. 25 MB/s is eaten up Continue reading

A Primer for Home NAS Storage Speed Units and Abbreviations

One of the most common points of confusion I see with regard to storage is how speed is measured.

In tech, there are cultural conventions about which units speeds are discussed in:

  • In the networking world, we measure speed in bits per second
  • In the storage and server world, we measure speed in bytes per second

Of course they both say the same thing, just in different units. You could measure bytes per second in the networking world and bits per second in the server/storage world, but it’s not the “native” method and could add to confusion.

For NAS, we have a bit of a conundrum in that we're talking about both worlds. So it's important to communicate effectively which method you're using to measure speed: bits or bytes.

Generally speaking, if you want to talk about Bytes, you capitalize the B. If you want to talk about bits, the b is lowercase. E.g., 100 MB/s (100 megabytes per second) versus 100 Mbit/s or 100 Mb/s (100 megabits per second).
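
A minimal conversion helper, just to make the difference concrete (my own sketch):

def mbit_to_mbyte(mbit_per_s: float) -> float:
    """Convert megabits per second to megabytes per second (8 bits per byte)."""
    return mbit_per_s / 8

print(mbit_to_mbyte(1000))  # a 1 Gbit/s network link moves at most 125.0 MB/s
print(mbit_to_mbyte(100))   # a "100 Mb/s" figure is only 12.5 MB/s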

This is important: because there are 8 bits in a byte, the difference in speed is pretty stark depending on whether you're talking about bits per second or bytes per Continue reading

Microsoft Storage Spaces Is Hot Garbage For Parity Storage

I love parity storage. Whether it's traditional RAID 5/6, erasure coding, raidz/raidz2, whatever. It gives you redundancy on your data without requiring double the drives that mirroring or mirroring+striping would require.

The drawback is that write performance is not as good as mirroring+striping, but for my purposes (lots of video files, cold storage, etc.) parity is perfect.

In my primary storage array, I use double redundancy on my parity, so effectively N+2. I can lose any 2 drives without losing any data.
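
As a rough illustration of the capacity trade-off (hypothetical drive counts, not the actual array from the post):

# Usable capacity: double parity (RAID 6 / raidz2-style) vs. mirroring,
# across the same hypothetical set of disks.
drives, size_tb = 6, 8

parity_usable = (drives - 2) * size_tb   # two drives' worth of space goes to parity
mirror_usable = (drives // 2) * size_tb  # half the drives hold copies

print(f"Double parity: {parity_usable} TB usable, survives any 2 drive failures")
print(f"Mirroring:     {mirror_usable} TB usable, survives 1 failure per mirror pair")
# Double parity: 32 TB usable, survives any 2 drive failures
# Mirroring:     24 TB usable, survives 1 failure per mirror pair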

I had a simple Storage Spaces mirror on my Windows 10 Pro desktop which consisted of (2) 5 TB drives using ReFS. This had four problems:

  • It was getting close to full
  • The drives were getting old
  • ReFS isn't supported anymore on Windows 10 Pro (you need Windows 10 Pro for Workstations)
  • Dropbox (which I use extensively) is dropping support for ReFS-based file systems.

ReFS had some nice features such as checksumming (though for data checksumming, you had to turn it on), but given the type of data I store on it, the checksumming isn't that important (longer-lived data is stored on Dropbox and/or my ZFS array). I do require Dropbox, so back to NTFS it is.

I Continue reading

ZFS on Linux with Encryption Part 2: The Compiling

First off: a warning. I don't know what the stability of this feature is. It's only been in the code for a couple of months, and it hasn't been widely used. I've been testing it, and so far it's worked as expected.

In exploring native encryption, I attempted to get it working on ZFS on Linux using the instructions on this site: https://blog.heckel.xyz/2017/01/08/zfs-encryption-openzfs-zfs-on-linux/. While I'm sure they worked at the time, the code in the referenced non-standard repos has changed and I couldn't get anything to compile correctly.

After trying for about a day, I realized (later than I care to admit) that I should have just tried the standard repos. They worked like a charm. The instructions below compiled and successfully installed ZFS on Linux with dataset encryption on both Ubuntu 17.10 and CentOS 7.4 in the November/December 2017 time frame.

Compiling ZFS with Native Encryption

The first step is to make sure a development environment is installed on your Linux system: compiler packages and so on. Here are a few packages you'll need for CentOS (you'll need similar packages/libraries for whatever platform you run); a quick pre-flight check sketch follows the list.

  • openssl-devel
  • attr, libattr-devel
  • libblkid-devel
  • zlib-devel
  • libuuid-devel
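
Here's a minimal pre-flight check (my own sketch, not from the post) that verifies those CentOS build dependencies are present before you start compiling:

# Check that the ZFS build dependencies listed above are installed (RPM-based systems).
import subprocess

PACKAGES = ["openssl-devel", "attr", "libattr-devel",
            "libblkid-devel", "zlib-devel", "libuuid-devel"]

missing = [pkg for pkg in PACKAGES
           if subprocess.run(["rpm", "-q", pkg],
                             stdout=subprocess.DEVNULL,
                             stderr=subprocess.DEVNULL).returncode != 0]

if missing:
    print("Missing packages:", " ".join(missing))
    print("Install with: sudo yum install", " ".join(missing))
else:
    print("All build dependencies present.")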

The builds were pretty good at telling Continue reading

ZFS and Linux and Encryption Part 1: Raining Hard Drives

(Skip to Part II to learn how to install ZFS with encryption on Linux)

Best Buy has been having a constant series of sales on WD Easy Store 8 TB drives. And it turns out, inside many of them (though not all) are WD Red NAS 5400 RPM drives. At $130-$180 apiece, that's significantly less than the regular price on Amazon/Newegg for these drives bare, which is around $250-$275.

(For updates on the sales, check out the subreddit DataHoarder.)

Over the course of several months, I ended up with 6 WD Red NAS 8 TB drives. Which is good, because my current RAID array is starting to show its age, and is also really, really full.

If you're not familiar with the WD Red NAS drives, they're drives specifically built to run 24/7. The regular WD Reds are 5400 RPM, so they're a bit slower than a regular desktop drive (the Red Pros are 7200 RPM), but for my workload I don't really care. For speed I use SSDs, and these drives are for bulk storage. Plus, the slower speeds mean less heat and less power.

My current array is made of (5) 3 TB drives operating at Continue reading

Do We Need Chassis Switches Anymore in the DC?

While Cisco Live this year was far more about the campus than the DC, Cisco did announce the Cisco Nexus 9364C, a spine-oriented switch which can run in both ACI mode and NX-OS mode. And it is a monster.

It’s (64) ports of 100 Gigabit. It’s from a single SoC (the Cisco S6400 SoC).

It provides 6.4 Tbps in 2RU, likely running below 700 watts (probably a lot less). I mean, holy shit.

Cisco Nexus 9364C: (64) ports of 100 Gigabit Ethernet.

And Cisco isn’t the only vendor with an upcoming 64 port 100 gigabit switch in a 2RU form factor. Broadcom’s Tomahawk II, successor to their 25/100 Gigabit datacenter SoC, also sports the ability to have (64) 100 Gigabit interfaces. I would expect the usual suspects to announce switches based on these soon (Arista, Cisco Nexus 3K, Juniper, etc.)

And another vendor, Innovium, while far less established, is claiming to have a chip in the works that can do (128) 100 Gigabit interfaces. On a single SoC.

For modern data center fabrics, which rely on leaf/spine Clos-style topologies, do we even need chassis switches anymore?
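
As a rough illustration of why fixed switches cover a lot of ground (my own back-of-the-envelope sizing, not figures from the post), here's how far a two-tier leaf/spine fabric built only from 64-port switches scales if you keep the leaves non-blocking:

# Rough two-tier leaf/spine sizing with 64-port fixed switches.
ports_per_switch = 64

# Non-blocking leaf: half the ports face servers, half face spines.
uplinks_per_leaf = ports_per_switch // 2          # 32 uplinks -> at most 32 spines
server_ports_per_leaf = ports_per_switch // 2     # 32 server-facing ports

max_leaves = ports_per_switch                     # each spine port connects one leaf
total_server_ports = max_leaves * server_ports_per_leaf

print(f"{total_server_ports} non-blocking 100G server ports "
      f"from {max_leaves} leaves + {uplinks_per_leaf} spines")
# 2048 non-blocking 100G server ports from 64 leaves + 32 spines

Oversubscribe at the leaf and the fabric gets even bigger, all without a single chassis.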

For a while we’ve been reliant upon the Sith-rule on Continue reading

Fibre Channel of Things (FCoT)

The "Internet of Things" is well underway. There are of course the hilariously bad examples of the technology (follow @internetofshit for some choice picks), but there are many valid ways that IoT infrastructure can be extremely useful. With the networked compute we can crank out for literally pennies and the data they can relay for processing, IoT is here to stay.

Hacking a dishwasher is the new hacking a Gibson

But there's one thing that these dishwashers, cars, refrigerators, Alexas, etc., all lack: access to decent storage.

The storage on many IoT devices is either terrible or nonexistent: unreliable flash storage or no storage at all. That's why the Fibre Channel T19 working group created a standard for FCoT (Fibre Channel of Things). This gives small devices access to real storage, powered by arrays rather than cheap and unreliable local flash storage.

The FCoT suite is a combination of VXSAN and FCIP. VXSAN provides the multi-tenancy and scale to fibre channel networks, and FCIP gives access to the VXSANs from a variety of FCaaS providers over the inferior IP networks (why Continue reading

Why We Wear Seat Belts On Airplanes

This post is inspired by Matt Simmons‘ fantastic post on why we still have ashtrays on airplanes, despite smoking being banned over a decade ago. This time, I’m going to cover seat belts on airplanes. I’ve often heard people balking at the practice for being somewhat arbitrary and useless, much like balking at turning off electronic devices before takeoff. But while some rules in commercial aviation are a bit arbitrary, there is a very good reason for seat belts.

In addition to being a very, very frequent flier (I just hit 1 million miles on United), I’m also a licensed fixed wing pilot and skydiving instructor. Part of the training of any new skydiver is what we call the “pilot briefing”. And as part of that briefing we talk about the FAA rules for seat belts: They should be on for taxi, take-off, and landing. That’s true for commercial flights as well.

Some people balk at the idea of seat belts on commercial airliners. After all, if you fly into the side of a mountain, a seat belt isn’t going to help much. But they’re still important.

Your Seat Belt Is For Me, My Seat Belt Is For You

Continue reading

Did VMware vSphere 6.0 Remove the Layer 2 Adjacency Requirement For vMotion? No.

I’ve seen this misconception a few times on message boards, reddit, and even comments on this blog: That Layer 2 adjacency is no longer required with vSphere 6.0, as VMware now supports Layer 3 vMotion. The (mis)perception is that you no longer need to stretch a Layer 2 domain between ESXi hosts.

That is incorrect. VMware did remove a Layer 2 adjacency requirement for the vMotion Network, but not for the VMs. Lemme explain.

It used to be (before vSphere 6.0) that you were required to have the VMkernel interfaces that performed vMotion on the same subnet. You weren’t supposed to go through a default gateway (though I think you could, it just wasn’t supported). So not only did your VM networks need to be stretched between hosts, but so did your VMkernel interfaces that performed the vMotion sending/receiving.

What vSphere 6.0 added was a separate TCP/IP stack for vMotion, so you could have a specific default gateway for vMotion, allowing your vMotion VMkernel interfaces to be on different subnets.

This does not remove the requirement that the same Layer 2 network exist on the sending and receiving ESXi host. The IP of the VM needs to Continue reading

Fibre Channel in the Cloud: FCaaS

Public cloud providers such as Amazon Web Services, Microsoft Azure, and Rackspace, as well as private cloud systems such as OpenStack, have dominated the computing landscape for the past several years. And once a joke of a marketing term (remember Larry Ellison's supervillain monologue on the topic?), the cloud is now A Thing, with a definition and everything.

One technology that seemed like it was getting left behind in all these cloud games, however, was Fibre Channel. Ephemeral compute nodes, object storage, extreme scale, elastic provisioning — all of these were characteristics that were initially thought to be bad fits for Fibre Channel.

Sad Fibre Channel is Sad

As it turns out, Fibre Channel is right at home in the cloud.

Amazon Web Services has recently rolled out Fibre Channel as a Service (FCaaS), as have Rackspace, Digital Ocean, and Microsoft Azure.

All of those public cloud providers have some sort of block storage offering, but they're typically based on something like iSCSI or another back-end block protocol. Customers have been demanding the kind of block storage in the public cloud where they can control zoning and zonesets, just like they do in their traditional data center worlds.

The Continue reading

LACP is not Link Aggregation

So there's a mistake I've been making for years: I've referred to what is actually link aggregation as "LACP", as in "I'm setting up an LACP between two switches". While you can certainly set up LACP between two switches, the more correct term for the technology is link aggregation (as defined by the IEEE), and an instance of that is generically called a LAG (Link Aggregation Group). LACP is an optional part of this technology.

Here I am explaining this and more in an 18-minute YouTube video.