Author Archives: networkheresy

Elephant Detection in the vSwitch With Performance Handling in the Underlay

As we’ve discussed previously, the vSwitch is a great position to detect elephant, or heavy-hitter flows because it has proximity to the guest OS and can use that position to gather additional context. This context may include the TSO send buffer, or even the guest TCP send buffer. Once an elephant is detected, it can be signaled to the underlay using standard interfaces such as DSCP. The following slide deck provides and overview of a working version of this, showing how such a setup can be used to both dynamically detect elephants and isolate mice from queuing delays they cause. We’ll write about this in more detail in a later post, but for now check out the slides (and in particular the graphs showing the latency of mice with and without detection and handling).

On Policy in the Data Center: The policy problem

(This post was written by Tim Hinrichs and Scott Lowe with contributions from Martin Casado, Mike Dvorkin, Peter Balland, Pierre Ettori, and Dennis Moreau.)

Fully automated IT provisioning and management is considered by many to be the ultimate nirvana— people log into a self-service portal, ask for resources (compute, networking, storage, and others), and within minutes those resources are up and running. No longer are the people who use resources waiting on the people who are responsible for allocating and maintaining them. And, according to the accepted definitions of cloud computing (for example, the NIST definition in SP800-145), self-service provisioning is a key tenet of cloud computing.

However, fully automated IT management is a double-edged sword. While having people on the critical path for IT management was time-consuming, it provided an opportunity to ensure that those resources were managed sensibly and in a way that was consistent with how the business said they ought to be managed. In other words, having people on the critical path enabled IT resources to be managed according to business policy. We cannot simply remove those people without also adding a way of ensuring that IT resources obey business policy—without introducing a way Continue reading

Network Virtualization and the End-to-End Principle

[This post was written by Dinesh Dutt with help from Martin Casado.  Dinesh is Chief Scientist at Cumulus Networks. Before that, he was a Cisco Fellow, working on various data center technologies from ASICs to protocols to RFCs. He’s a primary co-author on the TRILL RFC and the VxLAN draft at the IETF.  Sudeep Goswami, Shrijeet Mukherjee, Teemu Koponen, Dmitri Kalintsev, and T. Sridhar provided useful feedback along the way.]

In light of the seismic shifts introduced by server and network virtualization, many questions pertaining to the role of end hosts and the networking subsystem have come to the fore. Of the many questions raised by network virtualization, a prominent one is this: what function does the physical network provide in network virtualization? This post considers this question through the lens of the end-to-end argument.

Networking and Modern Data Center Applications

There are a few primary lessons learnt from the large scale data centers run by companies such as Amazon, Google, Facebook and Microsoft. The first such lesson is that a physical network built on L3 with equal-cost multipathing (ECMP) is a good fit for the modern data center. These networks provide predictable latency, scale well, converge quickly when nodes or links change, and provide Continue reading

Of Mice and Elephants

[This post has been written by Martin Casado and Justin Pettit with hugely useful input from Bruce Davie, Teemu Koponen, Brad Hedlund, Scott Lowe, and T. Sridhar]


This post introduces the topic of network optimization via large flow (elephant) detection and handling.  We decompose the problem into three parts, (i) why large (elephant) flows are an important consideration, (ii) smart things we can do with them in the network, and (iii) detecting elephant flows and signaling their presence.  For (i), we explain the basis of elephant and mice and why this matters for traffic optimization. For (ii) we present a number of approaches for handling the elephant flows in the physical fabric, several of which we’re working on with hardware partners.  These include using separate queues for elephants and mice (small flows), using a dedicated network for elephants such as an optical fast path, doing intelligent routing for elephants within the physical network, and turning elephants into mice at the edge. For (iii), we show that elephant detection can be done relatively easily in the vSwitch.  In fact, Open vSwitch has supported per-flow tracking for years. We describe how it’s easy to identify elephant flows at the vSwitch and Continue reading

Visibility, Debugging and Network Virtualization (Part 1)

[This post was written by Martin Casado and Amar Padmanahban, with helpful input from Scott Lowe, Bruce Davie, and T. Sridhar]

This is the first in a multi-part discussion on visibility and debugging in networks that provide network virtualization, and specifically in the case where virtualization is implemented using edge overlays.

In this post, we’re primarily going to cover some background, including current challenges to visibility and debugging in virtual data centers, and how the abstractions provided by virtual networking provide a foundation for addressing them.

The macro point is that much of the difficulty in visibility and troubleshooting in today’s environments is due to the lack of consistent abstractions that both provide an aggregate view of distributed state and hide unnecessary complexity. And that network virtualization not only provides virtual abstractions that can be used to directly address many of the most pressing issues, but also provides a global view that can greatly aid in troubleshooting and debugging the physical network as well.

A Messy State of Affairs

While it’s common to blame server virtualization for complicating network visibility and troubleshooting, this isn’t entirely accurate. It is quite possible to build a static virtual datacenter and, assuming the vSwitch Continue reading

Scale, SDN, and Network Virtualization

[This post was put together by Teemu Koponen, Andrew Lambeth, Rajiv Ramanathan, and Martin Casado]

Scale has been an active (and often contentious) topic in the discourse around SDN (and by SDN we refer to the traditional definition) long before the term was coined. Criticism of the work that lead to SDN argued that changing the model of the control plane from anything but full distribution would lead to scalability challenges. Later arguments reasoned that SDN results in *more* scalable network designs because there is no longer the need to flood the entire network state in order to create a global view at each switch. Finally, there is the common concern that calls into question the scalability of using traditional SDN (a la OpenFlow) to control physical switches due to forwarding table limits.

However, while there has been a lot of talk, there have been relatively few real-world examples to back up the rhetoric. Most arguments appeal to reason, argue (sometimes convincingly) from first principles, or point to related but ultimately different systems.

The goal of this post is to add to the discourse by presenting some scaling data, taken over a two-year period, from a production network virtualization Continue reading