
Category Archives for "sFlow"

sFlow Test

sFlow Test has been released on GitHub. The suite of checks is intended to validate the implementation of sFlow on a data center switch. In particular, the tests are designed to verify that the sFlow agent implementation provides measurements under load with the accuracy needed to drive SDN control applications, including:
Many of the tests can be run while the switches are in production and are a useful way of verifying that a switch is configured and operating correctly.

The stress tests can be scaled to run without specialized equipment. For example, the recommended sampling rate for 10G links in production is 1-in-10,000. Driving a switch with 48x10G ports to 30% of total capacity would require a load generator capable of generating 288Gbit/s. However, dropping the sampling rate to 1-in-100 and generating a load of 2.88Gbit/s is an equivalent test of the sFlow agent's performance and can be achieved by two moderately powerful servers with 10G network adapters.
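
The equivalence can be checked with some quick arithmetic. The sketch below is illustrative only: it assumes the 288Gbit/s figure counts traffic in both directions on every port, and the 1,000 byte average packet size is an arbitrary assumption rather than a value taken from the test suite. Under those assumptions both configurations ask the agent to generate the same number of samples per second.

```python
def total_load_gbps(ports, port_speed_gbps, utilization, directions=2):
    """Aggregate load on the switch, counting traffic in each direction."""
    return ports * port_speed_gbps * directions * utilization


def samples_per_second(load_gbps, sampling_n, avg_packet_bytes=1000):
    """Expected sFlow samples per second for 1-in-N sampling at a given load.

    avg_packet_bytes is an assumed value used only to convert bits to packets.
    """
    packets_per_second = load_gbps * 1e9 / (avg_packet_bytes * 8)
    return packets_per_second / sampling_n


# Production case: 48x10G ports at 30% load, sampling 1-in-10,000.
production_load = total_load_gbps(48, 10, 0.30)
production_rate = samples_per_second(production_load, 10_000)

# Scaled test: sampling rate and load both reduced by a factor of 100.
test_load = production_load * 100 / 10_000
test_rate = samples_per_second(test_load, 100)

print(f"production: {production_load:.2f} Gbit/s, {production_rate:.0f} samples/s")
print(f"test:       {test_load:.2f} Gbit/s, {test_rate:.0f} samples/s")
```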

For example, using the test setup above, run an iperf server on Server2:
iperf -su
Then run the following sequence of tests on Server1:
RT="10.0.0. Continue reading

Active Route Manager

SDN Active Route Manager has been released on GitHub. The software is based on the article White box Internet router PoC. Active Route Manager peers with a BGP route reflector to track prefixes and combines routing data with sFlow measurements to identify the most active prefixes. Active prefixes can be advertised via BGP to a commodity switch, which acts as a hardware route cache, accelerating the performance of a software router.
There is an interesting parallel with the Open vSwitch architecture, see Open vSwitch performance monitoring, which maintains a cache of active flows in the Linux kernel to accelerate forwarding. In the SDN routing case, active prefixes are pushed to the switch ASIC in order to bypass the slower software router.
In this example, the software is being used in passive mode, estimating the cache hit/miss rates without offloading routes. The software has been configured to manage a cache of 10,000 prefixes. The first screenshot shows the cache warming up.

The first panel shows routes being learned from the route reflector: the upper chart shows the approximately 600,000 routes being learned from the BGP route reflector, and the lower chart shows the rate at which Continue reading

Fabric View

The Fabric View application has been released on GitHub. Fabric View provides real-time visibility into the performance of leaf and spine ECMP fabrics.
A leaf and spine fabric is challenging to monitor. The fabric spreads traffic across all the switches and links in order to maximize bandwidth. Unlike traditional hierarchical network designs, where a small number of links can be monitored to provide visibility, a leaf and spine network has no special links or switches where running CLI commands or attaching a probe would provide visibility. Even if it were possible to attach probes, the effective bandwidth of a leaf and spine network can be as high as a Petabit/second, well beyond the capabilities of current generation monitoring tools.

Fabric View solves the visibility challenge by using the industry standard sFlow instrumentation built into most data center switches. Fabric View represents the fabric as if it were a single large chassis switch, treating each leaf switch as a line card and the spine switches as the backplane. The result is an intuitive tool that is easily understood by anyone familiar with traditional networks.

Fabric View provides real-time, second-by-second visibility to traffic, identifying top talkers, protocols, tenants, tunneled traffic, Continue reading

Real-time analytics and control applications

sFlow-RT 2.0 released - adds application support describes a new application framework for sharing solutions built on top of the real-time analytics platform. Application examples are provided on the sFlow-RT Download page.

The flow-graph application, shown above, generates a real-time graph of communication between hosts. The application uses a simple sFlow-RT script to track associations between hosts based on their communication patterns and plots the results using the vis.js dynamic, browser-based visualization library. This example can be modified to track different types of relationships and extended to incorporate other popular data visualization libraries such as D3.js.
The dashboard-example includes representative real-time metric and top flows trend charts. The example uses the jQuery-UI library to build a simple tabbed interface. This example can be extended to build groups of custom charts.
The top-flows application supports the definition of custom flows and tracks the largest flows in a continuously updating table.

Each of the examples has a server-side component that uses sFlow-RT's script API to collect, analyze, and export measurements. An HTML5 client side user interface connects to the server and presents the data.

The sFlow-RT analytics engine is a highly scaleable platform for processing sFlow measurements Continue reading

Open Virtual Network (OVN)

Open Virtual Network (OVN) is an open source network virtualization solution built as part of the Open vSwitch (OVS) project. OVN provides layer 2/3 virtual networking and firewall services for connecting virtual machines and Linux containers.

OVN is built on the same architectural principles as VMware's commercial NSX and offers the same core network virtualization capability, providing a free alternative that is likely to see rapid adoption in open source orchestration systems (see Mirantis: Why the Open Virtual Network (OVN) matters to OpenStack).

This article uses OVN as an example, describing a testbed which demonstrates how the standard sFlow instrumentation built into the physical and virtual switches provides the end-to-end visibility required to manage large scale network virtualization and deliver reliable services.

Open Virtual Network

The Northbound DB provides a way to describe the logical networks that are required. The database abstracts away implementation details which are handled by the ovn-northd and ovn-controllers and presents an easily consumable network virtualization service to orchestration tools like OpenStack.

The purple tables on the left describe a simple logical switch LS1 that has two logical ports LP1 and LP2 with MAC addresses AA and BB respectively. The green tables on the right show Continue reading

Cisco adds sFlow support to Nexus 9K series

Cisco adds support for the sFlow standard in the Cisco Nexus 9000 Series 7.0(3)I2(1) NX-OS Release. Combined with the Nexus 3000/3100 series, which have included sFlow support since NX-OS 5.0(3)U4(1), Cisco now offers cost-effective, built-in visibility across the full spectrum of data center switches.
Cisco network engineers might not be familiar with the multi-vendor sFlow technology since it is a relatively new addition to Cisco products. The article, Cisco adds sFlow support, describes some of the key features of sFlow and contrasts them to Cisco NetFlow.
Nexus 9000 switches can be operated in NX-OS mode or ACI mode:
  • NX-OS mode includes a number of open features such as sFlow, Python, NX-API, and Bash that integrate with an open ecosystem of orchestration tools such as Puppet, Chef, CFEngine, and Ansible. "By embracing the open culture of development and operations (DevOps) and creating a more Linux-like environment in the Cisco Nexus 9000 Series, Cisco enables IT departments with strong Linux skill sets to meet business needs efficiently," Cisco Nexus 9000 Series Switches: Integrate Programmability into Your Data Center. Open APIs are becoming increasingly popular, preventing vendor lock-in, and allowing organizations to benefit from the rapidly increasing range of open hardware Continue reading

White box Internet router PoC

SDN router using merchant silicon top of rack switch describes how the performance of a software Internet router could be accelerated using the hardware routing capabilities of a commodity switch. This article describes a proof of concept demonstration using Linux virtual machines and a bare metal switch running Cumulus Linux.
The diagram shows the demo setup, providing inter-domain routing between Peer 1 and Peer 2. The Peers are directly connected to the Hardware Switch and ingress packets are routed by the default route to the Software Router. The Software Router learns the full set of routes from the Peers using BGP and forwards the packet to the correct next hop router. The packet is then switched to the selected peer router via bridge br_xen.

The following traceroute run on Peer 1 shows the set of router hops from to
[root@peer1 ~]# traceroute -s
traceroute to (, 30 hops max, 40 byte packets
1 ( 3.090 ms 3.014 ms 2.927 ms
2 192.168. Continue reading

SDN router using merchant silicon top of rack switch

The talk from David Barroso describes how Spotify optimizes hardware routing on a commodity switch by using sFlow analytics to identify the routes carrying the most traffic.  The full Internet routing table contains nearly 600,000 entries, too many for commodity switch hardware to handle. However, not all entries are active all the time. The Spotify solution uses traffic analytics to track the 30,000 most active routes (representing 6% of the full routing table) and push them into hardware. Based on Spotify's experience, offloading the active 30,000 routes to the switch provides hardware routing for 99% of their traffic.
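
The selection step can be sketched in a few lines of Python. This is not the SIR code; the function names and the per-prefix byte counters are hypothetical stand-ins for whatever the sFlow analytics would report.

```python
from collections import Counter


def select_active_prefixes(bytes_by_prefix, cache_size):
    """Return the cache_size prefixes that carried the most traffic."""
    return [prefix for prefix, _ in Counter(bytes_by_prefix).most_common(cache_size)]


def hardware_coverage(bytes_by_prefix, cached):
    """Fraction of observed bytes that would be routed in hardware."""
    total = sum(bytes_by_prefix.values())
    if total == 0:
        return 0.0
    hit = sum(bytes_by_prefix.get(prefix, 0) for prefix in set(cached))
    return hit / total


# Illustrative counters only; real values would come from traffic analytics.
traffic = {
    "192.0.2.0/24": 900,
    "198.51.100.0/24": 80,
    "203.0.113.0/24": 15,
    "10.0.0.0/8": 5,
}
cached = select_active_prefixes(traffic, cache_size=2)
print(cached, hardware_coverage(traffic, cached))
```

With real data the same calculation shows how quickly coverage grows as the cache size increases, which is the trade-off behind choosing a figure such as 30,000 routes.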

David is interviewed by Ivan Pepelnjak in SDN Router @ Spotify on Software Gone Wild. The SDN Internet Router (SIR) source code and documentation are available on GitHub.
The diagram from David's talk shows the overall architecture of the solution. Initially the Internet Router (commodity switch hardware) uses a default route to direct outbound traffic to a Transit Provider (capable of handling all the outbound traffic). The BGP Controller learns routes via BGP and observes traffic using the standard sFlow measurement technology embedded in most commodity switch silicon.
After a period (1 hour) the BGP Controller identifies the most active 30,000 prefixes and Continue reading

WAN optimization using real-time traffic analytics

The TATA Consultancy Services white paper, Actionable Intelligence in the SDN Ecosystem: Optimizing Network Traffic through FRSA, demonstrates how real-time traffic analytics and SDN can be combined to perform real-time traffic engineering of large flows across a WAN infrastructure.
The architecture being demonstrated is shown in the diagram (this diagram has been corrected - the diagram in the white paper incorrectly states that sFlow-RT analytics software uses a REST API to poll the nodes in the topology. In fact, the nodes stream telemetry using the widely supported, industry standard, sFlow protocol, providing real-time visibility and scaleability that would be difficult to achieve using polling - see Push vs Pull).

The load balancing application receives real-time notifications of large flows from the sFlow-RT analytics software and programs the SDN Controller (in this case OpenDaylight) to push forwarding rules to the switches to direct the large flows across a specific path. Flow Aware Real-time SDN Analytics (FRSA) provides an overview of the basic ideas behind large flow traffic engineering that inspired this use case.

While OpenDaylight is used in this example, an interesting alternative for this use case would be the ONOS SDN controller running the Segment Routing application. ONOS Continue reading

Optimizing software defined data center

The recent Fortune magazine article, Software-defined data center market to hit $77.18 billion by 2020, starts with the quote "Data centers are no longer just about all the hardware gear you can stitch together for better operations. There’s a lot of software involved to squeeze more performance out of your hardware, and all that software is expected to contribute to a burgeoning new market dubbed the software-defined data center."

The recent ONS2015 Keynote from Google's Amin Vahdat describes how Google builds large scale software defined data centers. The presentation is well worth watching in its entirety since Google has a long history of advancing distributed computing with technologies that have later become mainstream.
There are a number of points in the presentation that relate to the role of networking in the performance of cloud applications. Amin states, "Networking is at this inflection point and what computing means is going to be largely determined by our ability to build great networks over the coming years. In this world data center networking in particular is a key differentiator."

This slide shows the large pools of storage and compute connected by the data center network that are used Continue reading

Leaf and spine traffic engineering using segment routing and SDN

The short 3 minute video is a live demonstration showing how software defined networking (SDN) can be used to orchestrate the measurement and control capabilities of commodity data center switches to automatically load balance traffic on a 4 leaf, 4 spine, 10 Gigabit leaf and spine network.
The diagram shows the physical layout of the demonstration rack. The four logical racks with their servers and leaf switches are combined in a single physical rack, along with the spine switches and SDN controllers. All the links in the data plane are 10G and sFlow has been enabled on every switch and link with the following settings: packet sampling rate 1-in-8192 and counter polling interval 20 seconds. The switches have been configured to send the sFlow data to sFlow-RT analytics software running on Controller 1.
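
As a rough illustration of how sampled data at these settings translates into traffic estimates, the sketch below scales the bytes seen in sampled packets by the sampling rate. The function names are hypothetical, and the error bound is the commonly quoted packet sampling rule of thumb (at most 196 × sqrt(1/c) percent error at 95% confidence for c samples), not something measured in this demonstration.

```python
import math

SAMPLING_N = 8192  # 1-in-8192, the packet sampling rate used in the demonstration


def estimate_bits_per_second(sampled_frame_bytes, interval_seconds, sampling_n=SAMPLING_N):
    """Scale the bytes seen in sampled frames up to an estimated link rate."""
    return sum(sampled_frame_bytes) * sampling_n * 8 / interval_seconds


def percent_error_bound(sample_count):
    """Approximate 95% confidence bound on the relative error for c samples."""
    return 196 * math.sqrt(1 / sample_count)


# Illustrative input: 100 samples of 1500 byte frames seen in one second.
samples = [1500] * 100
rate = estimate_bits_per_second(samples, interval_seconds=1.0)
print(f"estimated rate: {rate / 1e9:.2f} Gbit/s "
      f"(error bound about {percent_error_bound(len(samples)):.1f}%)")
```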

The switches are also configured to enable OpenFlow 1.3 and connect to multiple controllers in the redundant ONOS SDN controller cluster running on Controller 1 and Controller 2.
The charts from The Nature of Datacenter Traffic: Measurements & Analysis show data center traffic measurements published by Microsoft. Most traffic flows are short duration. However, combined they consume less bandwidth than a much smaller number of Continue reading

Analytics and SDN

Recent presentations from AT&T and Google describe SDN/NFV architectures that incorporate measurement based feedback in order to improve performance and reliability.

The first slide is from a presentation by AT&T's Margaret Chiosi, SDN+NFV Next Steps in the Journey, NFV World Congress 2015. The future architecture envisions generic (white box) hardware providing a stream of analytics which are compared to policies and used to drive actions to assure service levels.

The second slide is from the presentation by Google's Bikash Koley at the Silicon Valley Software Defined Networking Group Meetup. In this architecture, "network state changes observed by analyzing comprehensive time-series data stream." Telemetry is used to verify that the network is behaving as intended, identifying policy violations so that the management and control planes can apply corrective actions. Again, the software defined network is built from commodity white box switches.

Support for standard sFlow measurements is almost universally available in commodity switch hardware. sFlow agents embedded within network devices continuously stream measurements to the SDN controller, supplying the analytics component with the comprehensive, scaleable, real-time visibility needed for effective control.

SDN fabric controller for commodity data center switches describes the measurement and control capabilities available in commodity switch hardware. Continue reading

Big Tap sFlow: Enabling Pervasive Flow-level Visibility

Today's Big Switch Networks webinar, Big Tap sFlow: Enabling Pervasive Flow-level Visibility, describes how Big Switch uses software defined networking (SDN) to control commodity switches and deliver network visibility. The webinar presents a live demonstration showing how real-time sFlow analytics is used to automatically drive SDN actions to provide a "smarter way to find a needle in a haystack."

The video presentation covers the following topics:

  • 0:00 Introduction to Big Tap
  • 7:00 sFlow generation and use cases
  • 12:30 Demonstration of real-time tap triggering based on sFlow

The webinar describes how the network wide monitoring provided by industry standard sFlow instrumentation complements the Big Tap SDN controller's ability to capture and direct selected packet streams to visibility tools.

The above slide from the webinar draws an analogy for the role that sFlow plays in targeting the capture network to that of a finderscope, the small, wide-angle telescope used to provide an overview of the sky and guide the telescope to its target. Support for the sFlow measurement standard is built into commodity switch hardware and is enabled on all ports in the capture network to provide a wide angle view of all traffic in the data center. Once Continue reading

interview

The interview includes a wide ranging discussion of current trends in software defined networking (SDN), including: merchant silicon, analytics, probes, scaleability, Open vSwitch, network virtualization, VxLAN, network function virtualization (NFV), Open Compute Project, white box / bare metal switches, leaf and spine topologies, large "Elephant" flow marking and steering, Cumulus Linux, Big Switch, orchestration, Puppet and Chef.

The interview and full transcript are available on SDxCentral: sFlow Creator Peter Phaal On Taming The Wilds Of SDN & Virtual Networking

Related articles on this blog include:

ECMP visibility with Cumulus Linux

Demo: Implementing the Big Data Design Guide in the Cumulus Workbench is a great demonstration of the power of zero touch provisioning and automation. When the switches and servers boot they automatically pick up their operating systems and configurations for the complex Equal Cost Multi-Path (ECMP) routed network shown in the diagram.

Topology discovery with Cumulus Linux looked at an alternative Multi-Chassis Link Aggregation (MLAG) configuration and showed how to extract the configuration and monitor traffic on the network using sFlow and Fabric View.

The paper Hedera: Dynamic Flow Scheduling for Data Center Networks describes the impact of colliding flows on effective ECMP cross sectional bandwidth. The paper gives an example which demonstrates that effective cross sectional bandwidth can be reduced by between 20% and 60%, depending on the number of simultaneous flows per host.
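
The effect of random collisions is easy to reproduce in a toy model. The sketch below is not the Hedera simulation: it simply assumes equal-cost paths of unit capacity, flows that would each fill one path, and uniformly random hashing, then reports the fraction of ideal throughput achieved.

```python
import random


def ecmp_efficiency(paths, flows, trials=2000, seed=1):
    """Mean fraction of ideal throughput when flows are hashed at random.

    Each path has unit capacity and each flow would fill one path, so the
    ideal throughput is min(paths, flows). Flows that hash to the same
    path have to share its capacity.
    """
    rng = random.Random(seed)
    ideal = min(paths, flows)
    total = 0.0
    for _ in range(trials):
        # Paths that received at least one flow carry one unit of traffic.
        used = {rng.randrange(paths) for _ in range(flows)}
        total += len(used) / ideal
    return total / trials


efficiency = ecmp_efficiency(paths=16, flows=16)
print(f"mean throughput: {efficiency:.0%} of ideal")
```

Varying the number of flows relative to the number of paths shows how the loss depends on the degree of multiplexing, which is the dependency the paper highlights.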

This article uses the workbench to demonstrate the effect of large "Elephant" flow collisions on network throughput. The following script running on each of the servers uses the iperf tool to generate pairs of overlapping Elephant flows:
cumulus@server1:~$ while true; do iperf -c -t 20; sleep 20; done
Client connecting to, TCP port Continue reading

Topology discovery with Cumulus Linux

Demo: Implementing the OpenStack Design Guide in the Cumulus Workbench is a great demonstration of the power of zero touch provisioning and automation. When the switches and servers boot they automatically pick up their operating systems and configurations for the complex network shown in the diagram.
REST API for Cumulus Linux ACLs describes a REST server for remotely controlling ACLs on Cumulus Linux. This article will discuss recently added topology discovery methods that allow an SDN controller to learn topology and apply targeted controls (e.g., large "Elephant" flow marking, large flow steering, and DDoS mitigation).

Prescriptive Topology Manager

Complex Topology and Wiring Validation in Data Centers describes how Cumulus Networks' prescriptive topology manager (PTM) provides a simple method of verifying and enforcing correct wiring topologies.

The following REST call converts the topology from PTM's dot notation and returns a JSON representation:
cumulus@wbench:~$ curl http://leaf1:8080/ptm
Returns the result:
{
  "links": {
    "L1": {
      "node1": "leaf1",
      "node2": "spine1",
      "port1": "swp1s0",
      "port2": "swp49"
    }
  }
}


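A controller can turn this representation into a per-node neighbor table. A minimal sketch, assuming the response has the shape shown above (a links object keyed by link name, each entry holding node1, node2, port1 and port2); the adjacency function is a hypothetical helper, not part of PTM or the REST server.

```python
import json
from collections import defaultdict


def adjacency(topology):
    """Map each node to a sorted list of (local port, neighbor, neighbor port)."""
    neighbors = defaultdict(list)
    for link in topology.get("links", {}).values():
        # Record the link from both ends so either node can be looked up.
        neighbors[link["node1"]].append((link["port1"], link["node2"], link["port2"]))
        neighbors[link["node2"]].append((link["port2"], link["node1"], link["port1"]))
    return {node: sorted(entries) for node, entries in neighbors.items()}


# Sample document containing only the single link shown above.
doc = json.loads(
    '{"links": {"L1": {"node1": "leaf1", "node2": "spine1",'
    ' "port1": "swp1s0", "port2": "swp49"}}}'
)
print(adjacency(doc))
```
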
Prescriptive Topology Manager is preferred since it ensures that the discovered topology is correct. However, PTM builds on basic Link Layer Discovery Protocol (LLDP), which provides an alternative method of topology Continue reading

Guest Blog: REST API for Cumulus Linux ACLs


RESTful control of Cumulus Linux ACLs included a proof of concept script that demonstrated how to remotely control iptables entries in Cumulus Linux.  Cumulus Linux in turn converts the standard Linux iptables rules into the hardware ACLs implemented by merchant silicon switch ASICs to deliver line rate filtering.

Previous blog posts demonstrated how remote control of Cumulus Linux ACLs can be used for DDoS mitigation and Large “Elephant” flow marking.

A more advanced version of the script is now available on GitHub.

The new script adds the following features:

  1. It now runs as a daemon.
  2. Exceptions generated by cl-acltool are caught and handled.
  3. Rules are compiled asynchronously, reducing response time of REST calls.
  4. Updates are batched, supporting hundreds of operations per second.
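
The asynchronous, batched design can be illustrated with a small worker thread. This is a sketch of the general pattern, not the script published on GitHub: RuleBatcher and its apply_batch callback are hypothetical names, and the callback stands in for whatever actually compiles and installs the rules.

```python
import queue
import threading


class RuleBatcher:
    """Queue rule updates and apply them in batches on a worker thread."""

    def __init__(self, apply_batch, max_batch=100):
        self._apply_batch = apply_batch  # hypothetical callback that installs rules
        self._max_batch = max_batch
        self._queue = queue.Queue()
        self.errors = []
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, update):
        """Return immediately; the update is applied asynchronously."""
        self._queue.put(update)

    def close(self):
        """Flush pending updates and stop the worker."""
        self._queue.put(None)
        self._worker.join()

    def _run(self):
        while True:
            item = self._queue.get()  # block until work arrives
            stop = item is None
            batch = [] if stop else [item]
            # Drain whatever else is already queued, up to the batch limit.
            while not stop and len(batch) < self._max_batch:
                try:
                    item = self._queue.get_nowait()
                except queue.Empty:
                    break
                if item is None:
                    stop = True
                else:
                    batch.append(item)
            if batch:
                try:
                    self._apply_batch(batch)
                except Exception as exc:  # keep the worker alive on failure
                    self.errors.append(exc)
            if stop:
                return


applied = []
batcher = RuleBatcher(applied.append)
for i in range(5):
    batcher.submit(f"rule-{i}")
batcher.close()
print(applied)
```

Because callers only enqueue work, a REST handler built this way can respond before the rules are compiled, and a burst of requests is applied in a few batches rather than one compile per request.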

The script doesn’t provide any security, which may be acceptable if access to the REST API is limited to the management port, but is generally unacceptable for production deployments.

Fortunately, Cumulus Linux is an open Linux distribution that allows additional software components to be installed. Rather than being forced to add authentication and encryption to the script, it is possible to install additional software and leverage the capabilities of a mature web server such as Apache. The Continue reading

Broadcom ASIC table utilization metrics, DevOps, and SDN

Figure 1: Two-Level Folded CLOS Network Topology Example
Figure 1 from the Broadcom white paper, Engineered Elephant Flows for Boosting Application Performance in Large-Scale CLOS Networks, shows a data center leaf and spine topology. Leaf and spine networks are seeing rapid adoption since they provide the scaleability needed to cost effectively deliver the low latency, high bandwidth interconnect for cloud, big data, and high performance computing workloads.

Broadcom Trident ASICs are popular in white box, brite-box and branded data center switches from a wide range of vendors, including: Accton, Agema, Alcatel-Lucent, Arista, Cisco, Dell, Edge-Core, Extreme, Hewlett-Packard, IBM, Juniper, Penguin Computing, and Quanta.
Figure 2: OF-DPA Programming Pipeline for ECMP
Figure 2 shows the packet processing pipeline of a Broadcom ASIC. The pipeline consists of a number of linked hardware tables providing bridging, routing, access control list (ACL), and ECMP forwarding group functions. Operations teams need to be able to proactively monitor table utilizations in order to avoid performance problems associated with table exhaustion.

Broadcom's recently released sFlow specification, sFlow Broadcom Switch ASIC Table Utilization Structures, leverages the industry standard sFlow protocol to offer scaleable, multi-vendor, network wide visibility into the utilization of these hardware tables.
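
A monitoring script might reduce these measurements to a simple exhaustion check. The table names, counts and the 80% threshold below are made up for illustration and are not the field names defined in the Broadcom sFlow structures.

```python
def utilization(entries, capacity):
    """Fraction of a hardware table currently in use."""
    return entries / capacity if capacity else 0.0


def tables_near_exhaustion(tables, threshold=0.8):
    """Return the names of tables whose utilization meets or exceeds threshold."""
    return sorted(
        name
        for name, (entries, capacity) in tables.items()
        if utilization(entries, capacity) >= threshold
    )


# Hypothetical (entries, capacity) readings for a single switch.
readings = {
    "ipv4_routes": (14000, 16384),
    "acl": (100, 4096),
    "ecmp_groups": (1000, 1024),
}
print(tables_near_exhaustion(readings))
```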

Support for Continue reading

Cloud analytics

Librato is an example of a cloud based analytics service (now part of SolarWinds). Librato provides an easy to use REST API for pushing metrics into their cloud service. The web portal makes it simple to combine and trend data and build and share dashboards.

This article describes a proof of concept demonstrating how Librato's cloud service can be used to cost effectively monitor large scale cloud infrastructure by leveraging standard sFlow instrumentation. Librato offers a free 30 day trial, making it easy to evaluate solutions based on this demonstration.
The diagram shows the measurement pipeline. Standard sFlow measurements from hosts, hypervisors, virtual machines, containers, load balancers, web servers and network switches stream to the sFlow-RT real-time analytics engine. Metrics are pushed from sFlow-RT to Librato using the REST API.

Over 40 vendors implement the sFlow standard and compatible products are listed on The open source Host sFlow agent exports standard sFlow metrics from hosts. For additional background, the Velocity conference talk provides an introduction to sFlow and case study from a large social networking site.

Librato's service is priced based on the number of data points that they need to store. For example, a Host sFlow agent Continue reading