Netvisor and iTOR Unveiled

After a long wait, we finally unveiled stage 1 of the big solution – the Netvisor and our intelligent Top of the Rack (iTOR) switch. If you haven’t had a chance to see it, you can read about it here. At this point, we have enough boxes on the way that we can open the beta to a slightly larger audience. Some more details about the hardware – it has 48 10 Gigabit Ethernet ports, each of which can take an SFP+ optical module, SFP+ direct attach or a 1 Gigabit RJ45 module, along with 4x40 Gigabit QSFP ports. The Network Hypervisor controlling one or more iTORs is a full-fledged operating system and, amongst other things, is capable of running your hardest applications. It comes with tools like gcc/gdb/perl already installed, and you can load anything else that is missing. Why, you may ask – if you’ve always had an application that needed to be in the network, now it truly can be on the network. Imagine your physical or virtual load balancers, proxy servers, monitoring engines, IDS systems, spam filters running on our network hypervisor, where they are truly in the network without needing anything to plug in. Create your virtual networks along with Continue reading

A BGP leak made in Canada

Today many network operators saw their BGP sessions flap, RTTs increase and CPU usage on routers spike. While looking at our BGP data we determined the root cause to be a large BGP leak in Canada that quickly affected networks worldwide.

Dery Telecom
Based on our analysis it seems that Canadian ISP Dery Telecom Inc (AS46618) is the cause of what we observed today. AS46618 is dual-homed to both VIDEOTRON and Bell. What seems to have happened is that AS46618 leaked routes learned from VIDEOTRON to Bell. This in itself is not unique and happens relatively often. However, transit ISPs like Bell normally have strict filters applied on these BGP sessions, limiting the number of prefixes they accept from their customers. In this case the filter either failed to work or simply wasn’t (correctly) applied by either Bell or Dery Telecom.
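For illustration only (the neighbour address and prefix list below are made up, not Bell’s actual configuration): on Cisco IOS, a transit provider would typically protect a customer session like this with an inbound prefix filter plus a maximum-prefix limit, roughly along these lines:

ip prefix-list CUST-AS46618-IN permit 192.0.2.0/24
ip prefix-list CUST-AS46618-IN permit 198.51.100.0/23 le 24
!
router bgp 577
 neighbor 203.0.113.1 remote-as 46618
 ! only accept the customer's registered prefixes ...
 neighbor 203.0.113.1 prefix-list CUST-AS46618-IN in
 ! ... and drop the session if it suddenly carries far too many routes
 neighbor 203.0.113.1 maximum-prefix 100 restart 30

Either measure on its own would have kept a leak of this size from propagating beyond the customer session.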

Sequence of events
At 17:27 UTC AS46618 (Dery Telecom Inc) started to leak a ‘full table’, or at least a significant chunk of it, to its provider Bell (AS577). Bell selected 107,409 of these routes as best routes. Even though many of the AS paths were much longer than other alternatives it Continue reading

The First Bufferbloat Battle Won

Bufferbloat was covered in a number of sessions at the Vancouver IETF last week.

The most important of these sessions is a great explanation of Kathie Nichols and Van Jacobson’s CoDel (“coddle”) algorithm, given during Tuesday’s transport area meeting by Van. It is not to be missed by serious network engineers. It also touches on why we like fq_codel so much, though I plan to write much more extensively on this topic very soon. CoDel by itself is great, but in combination with SFQ-like algorithms that segregate flows, the results are stunning; CoDel is the first AQM algorithm that can work across an arbitrary number of queues/flows.

The Saturday before the IETF, the IAB/IRTF Workshop on Congestion Control for Interactive Real-Time Communication took place. My position paper was my blog entry of several weeks back. In short, there is no single bullet, though with CoDel we finally have the final missing bullet for a complete solution. The other, equally important but non-technical bullet is market pressure to fix broken software/firmware/hardware all over the Internet: exposing the bloat problem is vital. You cannot successfully engineer around bufferbloat, but you can detect it, and let users know when they Continue reading

How do OpenFlow and SDN help Virtualization/Cloud (Part 3 of 3) – Why Nicira had to do a deal

The challenges faced by OpenFlow and SDN

This is the 3rd and final article in this series. As promised, let’s look at some of the challenges facing this space and how we are addressing them.

Challenge 1 – Why Nicira had to get a big partner

I have seen a lot of articles about Nicira being acquired. The question no one has asked is – if the space is so hot, why did Nicira sell so early? The deal size ($1.26B) was hardly chump change, but if I were them and my stock was rising exponentially, I would have held out for the lure of changing the world. So what was the rush? I believe the answer lies in some of the issues I discussed in article 2 of this series a few months back – the difference between server (controller-based) and switch (fabric-based) approaches. The Nicira solution was very dependent on the server and the server hypervisor. The world of server operating systems and hypervisors is so fragmented that staying independent would have been a very uphill battle. Tying up with one of the biggest hypervisor vendors made sense to ensure that their technology keeps moving forward. And Continue reading

The IPv6 DFZ just passed 10,000 prefixes


I noticed the IPv6 Default Free Zone just tipped over 10,000 prefixes. I think this is quite a milestone! Go IPv6!

The data was taken from the RING Looking Glass, which is part of the NLNOG RING project. The RING Looking Glass takes full BGP feeds from ± 20 organisations.

BTW, as an apples-to-oranges comparison: the full IPv4 routing table is currently roughly 415,000 prefixes. Live IPv4 and IPv6 graphs can be viewed here (RING) and here (Potaroo).

Native VLAN – Some Surprising Results


I did some fiddling around with router-on-a-stick configurations recently and found some native VLAN behavior that took me by surprise.

The topology for these experiments is quite simple, just one router, one switch, and a single 802.1Q link interconnecting them:
Dead Simple Routers-On-A-Stick Configuration


The initial configuration of the switch looks like:

vlan 10,20,30
!
interface FastEthernet0/34
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 10,20,30
 switchport mode trunk
 spanning-tree portfast trunk
!
interface Vlan10
 ip address 192.0.2.1 255.255.255.0
!
interface Vlan20
 ip address 198.51.100.1 255.255.255.0
!
interface Vlan30
 ip address 203.0.113.1 255.255.255.0


And the initial configuration of the router looks like:

interface FastEthernet0/0
 no ip address
 duplex auto
 speed auto
!
interface FastEthernet0/0.10
 encapsulation dot1Q 10
 ip address 192.0.2.2 255.255.255.0
!
interface FastEthernet0/0.20
 encapsulation dot1Q 20
 ip address 198.51.100.2 255.255.255.0
!
interface FastEthernet0/0.30
 encapsulation dot1Q 30
 ip address 203.0.113.2 255.255.255.0



So, nothing too interesting going on here. The devices can ping each other on each of their three IP interfaces.

We can switch VLAN tagging off Continue reading
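A sketch of what switching tagging off for one VLAN would look like in standard IOS (the choice of VLAN 10 is mine, not necessarily what the original post goes on to do): make it the native VLAN on the switch trunk and mark the matching router subinterface as native:

! switch side – frames for VLAN 10 now leave the trunk untagged
interface FastEthernet0/34
 switchport trunk native vlan 10
!
! router side – tell the subinterface to expect untagged frames for VLAN 10
interface FastEthernet0/0.10
 encapsulation dot1Q 10 native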

KICLet: Solarwinds’ Dirty Google Tricks

This is a (justifiable) rant. You’ve been warned. Solarwinds Orion NPM is an okay tool, but when it comes to managing anything other than Cisco switches and routers, it’s… meh. It takes very little effort to get devices like that monitored to the fullest extent, but when it comes to something like a storage array, it seems like you really have to tweak things until your fingers bleed to get even minimal monitoring functionality out of it.

New 10Gb/s Interconnect Options

Followers of this blog, or folks who've heard me on the Packet Pushers Podcast, may have noticed that I obsessively look for less expensive ways to interconnect data center devices.

That's because the modules are so expensive! A loaded Nexus 7010 is intimidating enough with its $473,000 list price, but that's without any optic modules...

If we want to link those interfaces to something with 10GBASE-SR modules, then triple the budget, because the optics cost more than twice as much as the switch:

$1495 / module * 8 blades * 48 links / blade * 2 modules / link = $1,148,160

By comparison, purchasing the same number of links in 5m Twinax form comes in under $100,000.
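(Sanity-checking that figure with the ~$200 TwinAx price mentioned below, and remembering that a Twinax link is a single cable rather than a pair of optics: $200 / cable * 8 blades * 48 cables / blade = $76,800.)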

So, there tend to be a lot of TwinAx and FET modules in my designs, and the equipment is located carefully to ensure that ~$200 TwinAx links can be used rather than ~$3,000 fiber links.

The tradeoff comes when you put a lot of 5m Twinax cables into one place and quickly find they're not much fun to work with because they're not bendy and because they're thick. That's why I'm so interested in something I noticed yesterday on Cisco's 10GBASE SFP+ Modules Data Continue reading

4 Types of Port Channels and When They’re Used

The other day I was catching up on recorded content from Cisco Live! and saw mention of yet another implementation of port channels (this time called Enhanced Virtual Port Channels). I thought it would make a good blog entry to describe the differences between them, where each is used, and which platforms each is supported on.
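For reference, the classic single-chassis flavour – an LACP port channel on IOS – looks roughly like this (the interface and channel-group numbers are arbitrary examples, not from the post):

! bundle two physical ports into one logical link using LACP
interface range GigabitEthernet0/1 - 2
 channel-group 1 mode active
!
interface Port-channel1
 switchport mode trunk

Multi-chassis variants such as vPC and Enhanced vPC build on the same idea, but terminate the member links on more than one physical switch.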

Marathon Networks’ PVTD

I have seen many broken Private VLAN networks, and decided to find a way to fix them. So in the last year I have developed a network appliance called PVTD, which solves many of the Private VLAN problems. You can read all about it at www.marathon-networks.com

KIClet: Sub-Optimal Fibre Channel Path Selection

The SAN I’m currently working with connects a pair of NetApp FAS3270 filers running ONTAP 8.0.2 7-Mode. If you’re running VMware ESXi in your environment in front of a Fibre Channel SAN, paths are selected more or less on a first-come, first-served basis. I got this message on my NetApp filer: “FCP Partner Path Misconfigured: Host I/O access through a non-primary and non-optimal path was detected.” Since the LUNs mounted by ESXi were residing on the A-side filer, I/O on the paths going through the B-side filer would just be sent over the partner link to the A-side, which is less efficient than going directly through A.

What is LISP DDT?

Some background on LISP

LISP (Locator/Identifier Separation Protocol) is a smart and novel method to create overlay networks with features such as multi-homing, mobility and VPN-segregation. These feats are possible because LISP makes a distinction between the 'who' and the 'where'.
"The separation of location and identity is a step which has recently been identified by the IRTF as a critically necessary evolutionary architectural step for the Internet."
- N. Chiappa in draft-chiappa-lisp-introduction-00 
An example would be that my IPv6 prefix 2001:67c:208c:10::/64 (the 'who') is currently located behind the following WAN IP addresses: 62.194.155.106, 217.8.107.2 and 2001:67C:21B4:1::2 (the 'where'). In this example my prefix is multi-homed behind three connections, and I'm doing IPv6-over-IPv4 alongside IPv6-over-IPv6. This is possible because this single IPv6 prefix can have multiple Routing Locators (the 'where') and LISP is address-family agnostic.
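As a rough sketch of how a mapping like this could be expressed on a Cisco xTR (syntax varies per IOS release, and the priorities/weights here are arbitrary, so treat this as an illustration rather than a verified config):

router lisp
 ! one EID prefix (the 'who') mapped to three routing locators (the 'where')
 database-mapping 2001:67c:208c:10::/64 62.194.155.106 priority 1 weight 40
 database-mapping 2001:67c:208c:10::/64 217.8.107.2 priority 1 weight 30
 database-mapping 2001:67c:208c:10::/64 2001:67C:21B4:1::2 priority 1 weight 30
 ipv6 itr
 ipv6 etr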

Mapping systems for location information

As you can imagine, the key to protocols like LISP is locating who is where in a fast and efficient way.

To give some more context: with the Border Gateway Protocol (BGP), all participating nodes (routers) hold all information about everybody in memory. When an organisation Continue reading

KIClet: Microphone troubles with Lenovo W520

I came across this the other day and wanted to share. For some reason, Windows by default decided to enable the “audio enhancements” feature on my new Lenovo ThinkPad W520. This made my microphone essentially unusable - I was in several WebEx meetings, and each time everyone said I was completely garbled and they weren’t even close to being able to understand me. After a little poking around, I found this:

Get-Console Review on the iPad

I have used my iPad to console onto Cisco routers and switches for about 2 years now. I started using the Flex-Serial cable on my jailbroken iPad and iPhone, with the iSSH app and a ported version of Minicom (earlier blog post). Amidst some minor bugs and irritations, this worked well and was considerably more […]

The Internet is Broken, and How to Fix It

Many real-time applications, such as VoIP, gaming, teleconferencing, and performing music together, require low latency. These are increasingly unusable in today’s Internet, not because there is insufficient bandwidth, but because we’ve failed to look at the Internet as an end-to-end system. The edge of the Internet now often runs congested. When it does, bufferbloat causes performance to fall off a cliff.

Where once a home user’s Internet connection consisted of a single computer, it now consists of a dozen or more devices – smartphones, TVs, Apple TV/Roku devices, tablets, home security equipment, and one or more computers per household member. More Internet-connected devices are arriving every year, and they often perform background activities without the user’s intervention, inducing transients on the network. These devices need to share the edge connection effectively in order to keep each user happy. All of them can induce congestion and bufferbloat that baffle most Internet users.

The CoDel (“coddle”) AQM algorithm provides the “missing link” necessary for good TCP behavior and for solving bufferbloat. But CoDel by itself is insufficient to provide reliable, predictable low-latency performance in today’s Internet.

Bottlenecks are most common at the “edge” of the Internet and there you must Continue reading