sFlow Archives - Page 11 of 15 - NetworkingNexus.net

Mininet flow analytics

Mininet is free software that creates a realistic virtual network, running real kernel, switch and application code, on a single machine (VM, cloud or native), in seconds. Mininet is useful for development, teaching, and research. Mininet is also a great way to develop, share, and experiment with OpenFlow and Software-Defined Networking systems.

This article shows how standard sFlow instrumentation built into Mininet can be combined with sFlow-RT analytics software to provide real-time traffic visibility for Mininet networks. Augmenting Mininet with sFlow telemetry realistically emulates the instrumentation built into most vendor's switch hardware, provides visibility into Mininet experiments, and opens up new areas of research (e.g. SDN and large flows).

The following papers are a small selection of projects using sFlow-RT:

In order Continue reading

Identifying bad ECMP paths

In the talk Move Fast, Unbreak Things! at the recent DevOps Networking Forum, Petr Lapukhov described how Facebook has tackled the problem of detecting packet loss in Equal Cost Multi-Path (ECMP) networks. At Facebook's scale, there are many parallel paths and actively probing all the paths generates a lot of data. The active tests generate over 1Terabits/second of measurement data per Facebook data center and a Hadoop cluster with hundreds of compute nodes is required per data center to process the data.

Processing active test data can detect that packets are being lost within approximately 20 seconds, but doesn't provide the precise location where packets are dropped. A custom multi-path traceroute tool (fbtracert) is used to follow up and narrow down the location of the packet loss.

While described as measuring packet loss, the test system is really measuring path loss. For example, if there are 64 ECMP paths in a pod, then the loss of one path would result in a packet loss of approximately 1 in 64 packets in traffic flows that cross the ECMP group.

Black hole detection describes an alternative approach. Industry standard sFlow instrumentation embedded within most vendor's switch hardware provides visibility into the Continue reading

Black hole detection

The Broadcom white paper, Black Hole Detection by BroadView™ Instrumentation Software, describes the challenge of detecting and isolating packet loss caused by inconsistent routing in leaf-spine fabrics. The diagram from the paper provides an example, packets from host H11 to H22 are being forwarded by ToR1 via Spine1 to ToR2 even though the route to H22 has been withdrawn from ToR2. Since ToR2 doesn't have a route to the host, it sends the packet back up to Spine 2, which will send the packet back to ToR2, causing the packet to bounce back and forth until the IP time to live (TTL) expires.

The white paper discusses how Broadcom ASICs can be programmed to detect blackholes based on packet paths, i.e. packets arriving at a ToR switch from a Spine switch should never be forwarded to another Spine switch.

This article will discuss how the industry standard sFlow instrumentation (also included in Broadcom based switches) can be used to provide fabric wide detection of black holes.

The diagram shows a simple test network built using Cumulus VX virtual machines to emulate a four switch leaf-spine fabric like the one described in the Broadcom white paper (this network is Continue reading

sFlow to IPFIX/NetFlow

RESTflow explains how the sFlow architecture shifts the flow cache from devices to external software and describes how the sFlow-RT REST API can be used to program and query flow caches. Exporting events using syslog describes how flow records can be exported using the syslog protocol to Security Information and Event Management (SIEM) tools such as Logstash and and Splunk. This article demonstrates how sFlow-RT can be used to define and export the flows using the IP Flow Information eXport (IPFIX) protocol (the IETF standard based on NetFlow version 9).

For example, the following command defines a cache that will maintain flow records for TCP flows on the network, capturing IP source and destination addresses, source and destination port numbers and the bytes transferred and sending flow records to address 10.0.0.162:

curl -H "Content-Type:application/json" -X PUT --data  '{"keys":"ipsource,ipdestination,tcpsourceport,tcpdestinationport", 
"value":"bytes", "ipfixCollectors":["10.0.0.162"]}' 
http://localhost:8008/flow/tcp/json

Running Wireshark's tshark command line utility on 10.0.0.162 verifies that flows are being received:

# tshark -i eth0 -V udp port 4739
Running as user "root" and group "root". This could be dangerous.
Capturing on lo
Frame 1 (134 bytes on wire, 134 bytes captured)
    Arrival Time:  Continue reading

Berkeley Packet Filter (BPF)

Linux bridge, macvlan, ipvlan, adapters discusses how industry standard sFlow technology, widely supported by data center switch vendors, has been extended to provide network visibility into the Linux data plane. This article explores how sFlow's lightweight packet sampling mechanism has been implemented on Linux network adapters.

Linux Socket Filtering aka Berkeley Packet Filter (BPF) describes the recently added prandom_u32() function that allows packets to be randomly sampled in the Linux kernel for efficient monitoring of production traffic.

Background: Enhancing Network Intrusion Detection With Integrated Sampling and Filtering, Jose M. Gonzalez and Vern Paxson, International Computer Science Institute Berkeley, discusses the motivation for adding random sampling BPF and the email thread [PATCH] filter: added BPF random opcode describes the Linux implementation and includes an interesting discussion of the motivation for the patch.

The following code shows how the open source Host sFlow agent implements random 1-in-256 packet sampling as a BPF program:

ld rand
mod #256
jneq #1, drop
ret #-1
drop: ret #0

A JIT for packet filters discusses the Linux Just In Time (JIT) compiler for BFP programs, delivering native machine code performance for compiled filters.

Minimizing cost of visibility describes why low overhead monitoring is an Continue reading

Cisco SF250, SG250, SF350, SG350, SG350XG, and SG550XG series switches

Software version 2.1.0 adds sFlow support to Cisco 250 Series Smart Switches, 350 Series Smart Switches and 550X Series Stackable Managed Switches.

Cisco network engineers might not be familiar with the multi-vendor sFlow technology since it is a relatively new addition to Cisco products. The article, Cisco adds sFlow support, describes some of the key features of sFlow and contrasts them to Cisco NetFlow.

Configuring sFlow on the switches is straightforward. For example, The following commands configure a switch to sample packets at 1-in-1024, poll counters every 30 seconds and send sFlow to an analyzer (10.0.0.50) over UDP using the default sFlow port (6343):

sflow receiver 1 10.0.0.50

For each interface:

sflow flow-sampling 1024 1
sflow counter-sampling 30 1

A previous posting discussed the selection of sampling rates. Additional information can be found on the Cisco web site.

Trying out sFlow offers suggestions for getting started with sFlow monitoring and reporting. The article recommends the sFlowTrend analyzer as a way to get started since it is a free, purpose built sFlow analyzer that delivers the full capabilities of sFlow instrumentation in the Cisco switches.

Multi-tenant sFlow

This article discusses how real-time sFlow telemetry can be shared with network tenants to provide each tenant with a real-time view of their slice of the shared resources. The diagram shows a simple network with two tenants, Tenant A and Tenant B, each assigned their own subnet, 10.0.0.0/24 and 10.0.1.0/24 respectively.

One option would be to simply replicate the sFlow datagrams and send copies to both tenants. Forwarding using sflowtool describes how sflowtool can be used to replicate and forward sFlow and sFlow-RT can be configured to forward sFlow using its REST API:

curl -H "Content-Type:application/json" \
-X PUT --data '{"address":"10.0.0.1","port":6343}' \
http://127.0.0.1:8008/forwarding/TenantA/json

However, there are serious problems with this approach:

Private information about Tenant B's traffic is leaked to Tenant A.
Information from internal links within the network (i.e. links between s1, s2, s3 and s4) is leaked to Tenant A.
Duplicate data from each network hop is likely to cause Tenant A to over-estimate their traffic.

The sFlow-RT multi-tenant forwarding function addresses these challenges. The first task is to provide sFlow-RT with an accurate network topology specifying the internal links connecting the Continue reading

Network visibility with Docker

Microservices describes the critical role that network visibility provides as a common point of reference for monitoring, managing and securing the interactions between the numerous and diverse distributed service instances in a microservices deployment.

Industry standard sFlow is well placed to give network visibility into the Docker infrastructure used to support microservices. The sFlow standard is widely supported by data center switch vendors (Cisco, Arista, Juniper, Dell, HPE, Brocade, Cumulus, etc.) providing a cost effective and scaleable method of monitoring the physical network infrastructure. In addition, Linux bridge, macvlan, ipvlan, adapters described how sFlow is also an efficient means of leveraging instrumentation built into the Linux kernel to extend visibility into Docker host networking.

The following commands build the Host sFlow binary package from sources on an Ubuntu 14.04 system:

sudo apt-get update
sudo apt-get install build-essential
sudo apt-get install libpcap-dev
sudo apt-get install wget
wget https://github.com/sflow/host-sflow/archive/v1.29.1.tar.gz
tar -xvzf v1.29.1.tar.gz
cd host-sflow-1.29.1
make DOCKER=yes PCAP=yes deb

This resulting hsflowd_1.29.1-1_amd64.deb package can be copied and installed on all the hosts in the Docker cluster using configuration management tools such as Puppet, Chef, Ansible, etc.

This Continue reading

Lasers!

Cool! You mean that I actually have frickin' switches with frickin' laser beams attached to their frickin' ports?

Dr. Evil is right, lasers are cool! The draft sFlow Optical Interface Structures specification exports metrics gathered from instrumentation built into Small Form-factor Pluggable (SFP) and Quad Small Form-factor Pluggable (QSFP) optics modules. This article provides some background on optical modules and discusses the value of including optical metrics in the sFlow telemetry stream exported from switches and hosts.

Pluggable optical modules are intelligent devices that do more than simply convert between optical and electrical signals. The functional diagram below shows the elements within a pluggable optical module.

The transmit and receive functions are shown in the upper half of the diagram. Incoming optical signals are received on a fiber and amplified as they are converted to electrical signals that can be handled by the switch. Transmit data drives the modulation of a laser diode which transmits the optical signal down a fiber.

The bottom half of the diagram shows the management components. Power, voltage and temperature sensors are monitored and the results are written into registers in an EEPROM that are accessible via a management interface.

The proposed sFlow extension standardizes Continue reading

Minimizing cost of visibility

Visibility allows orchestration systems (OpenDaylight, ONOS, OpenStack Heat, Kubernetes, Docker Storm, Apache Mesos, etc.) to adapt to changing demand by targeting resources where they are needed to increase efficiency, improve performance, and reduce costs. However, the overhead of monitoring must be low in order to realize the benefits.

An analogous observation that readers may be familiar with is the importance of minimizing costs when investing in order to maximize returns - see Vanguard Principle 3: Minimize cost

Suppose that a 100 server pool is being monitored and visibility will allow the orchestration system to realize a 10% improvement by better workload scheduling and placement - increasing the pool's capacity by 10% without the need to add an additional 10 servers and saving the associated CAPEX/OPEX costs.

The chart shows the impact that measurement overhead has in realizing the potential gains in this example. If the measurement overhead is 0%, then the 10% performance gain is fully realized. However, even a relatively modest 2% measurement overhead reduces the potential improvement to just under 8% (over a 20% drop in the potential gains). A 9% measurement overhead wipes out the potential efficiency gain and measurement overheads greater than 9% result in Continue reading

Docker network visibility demonstration

The 2 minute live demonstration shows how the open source Host sFlow agent can be used to efficiently monitor Docker networking in production environments. The demonstration shows real-time tracking of 30Gbit/s traffic flows using less than 1% of a single processor core.

Internet Exchange (IX) Metrics

IX Metrics has been released on GitHub, https://github.com/sflow-rt/ix-metrics. The application provides real-time monitoring of traffic between members in an Internet Exchange (IX).

Close monitoring of exchange traffic is critical to operations:

Ensure that there is sufficient capacity to accommodate new and existing members.
Ensure that all traffic sources are accounted for and that there are no unauthorized connections.
Ensure that only allowed traffic types are present.
Ensure that non-unicast traffic is strictly controlled.
Ensure that packet size policies are controlled to avoid loss due to MTU mismatches.

IX Metrics imports information about exchange members using the IX Member List JSON Schema. The member information is used to create traffic analytics and traffic is checked against the schema to identify errors, for example, if a member is using a MAC address that isn't listed.

The measurements from the exchange infrastructure are useful to members since it allows them to easily see how much traffic they are exchanging with other members through their peering relationships. This information is easy to collect using the exchange infrastructure, but much harder for members to determine independently.

The sFlow standard has long been a popular method of monitoring exchanges for a number of reasons:

sFlow Continue reading

Network and system analytics as a Docker microservice

Microservices describes why the industry standard sFlow instrumentation embedded within cloud infrastructure is uniquely able to provide visibility into microservice deployments.

The sFlow-RT analytics engine is well suited to deployment as a Docker microservice since the application is stateless and presents network and system analytics as a RESTful service.

The following steps demonstrate how to create a containerized deployment of sFlow-RT.

First, create a directory for the project and edit the Dockerfile:

mkdir sflow-rt
cd sflow-rt
vi Dockerfile

Add the following contents to Dockerfile:

FROM   centos:centos6
RUN    yum install -y java-1.7.0-openjdk
RUN    rpm -i http://www.inmon.com/products/sFlow-RT/sflow-rt-2.0-1072.noarch.rpm
EXPOSE 8008 6343/udp
CMD    /etc/init.d/sflow-rt start && tail -f /dev/null

Build the project:

docker build -t sflow-rt .

Run the service:

docker run -p 8008:8008 -p 6343:6343/udp -d sflow-rt

Access the API at http://docker_host:8008/ to verify that the service is running.

Now configure sFlow agents to send data to the docker_host on port 6343:

Switch configurations
Install Host sFlow agents to monitor hosts, hypervisors, Docker, etc.

The following articles provide examples of using the sFlow-RT REST API:

The diagram shows how new and existing cloud based or locally hosted orchestration, operations, and security tools can leverage sFlow-RT's analytics service to gain real-time visibility. The solution is extremely scaleable, a single sFlow-RT instance can monitor thousands of servers and the network devices connecting them.

Microservices

Figure 1: Visibility and the software defined data center

In the land of microservices, the network is the king(maker) by Sudip Chakrabarti, Lightspeed Venture Partners, makes the case that visibility into network traffic is the key to monitoring, managing and securing applications that are composed of large numbers of communicating services running in virtual machines or containers.

While I genuinely believe that the network will play an immensely strategic role in the microservices world, inspecting and storing billions of API calls on a daily basis will require significant computing and storage resources. In addition, deep packet inspection could be challenging at line rates; so, sampling, at the expense of full visibility, might be an alternative. Finally, network traffic analysis must be combined with service-level telemetry data (that we already collect today) in order to get a comprehensive and in-depth picture of the distributed application.

Sampling isn't just an alternative, sampling is the key to making large scale microservice visibility a reality. Shrink ray describes how sampling acts as a scaling function, reducing the task of monitoring large scale microservice infrastructure from an intractable measurement and big data problem to a lightweight real-time data center wide visibility solution for monitoring, managing, Continue reading

Open vSwitch version 2.5 released

The recent Open vSwitch version 2.5 release includes significant network virtualization enhancements:

   - sFlow agent now reports tunnel and MPLS structures.
   ...
   - Add experimental version of OVN.  OVN, the Open Virtual Network, is a
     system to support virtual network abstraction.  OVN complements the
     existing capabilities of OVS to add native support for virtual network
     abstractions, such as virtual L2 and L3 overlays and security groups.

The sFlow Tunnel Structures specification enhances visibility into network virtualization by capturing encapsulation / decapsulation actions performed by tunnel end points. In many network virtualization implementations VXLAN, GRE, Geneve tunnels are terminate in Open vSwitch and so the new feature has broad application.

The second related feature is the inclusion of the Open Virtual Network (OVN), providing a simple method of building virtual networks for OpenStack and Docker.

The following articles provide additional background:

Linux bridge, macvlan, ipvlan, adapters

The open source Host sFlow project added a feature to efficiently monitor traffic on Linux host network interfaces: network adapters, Linux bridge, macvlan, ipvlan, etc. Implementation of high performance sFlow traffic monitoring is made possible by the inclusion of random packet sampling support in the Berkeley Packet Filter (BPF) implementation in recent Linux kernels (3.19 or later).

In addition to the new BPF capability, hsflowd has a couple of other ways to monitor traffic:

iptables, add a statistic rule to the iptables firewall to add traffic monitoring
Open vSwitch, has built-in sFlow instrumentation that can be configured by hsflowd.

The BPF sampling mechanism is less complex to configure than iptables and can be used to monitor any Linux network device, including: network adapters (e.g. eth0) and the Linux bridge (e.g. docker0). Monitoring a network adapter also provides visibility into lightweight macvlan and ipvlan network virtualization technologies that are likely to become more prevalent in the Linux container ecosystem, see Using Docker with macvlan Interfaces.

The following commands build and install hsflowd on an Ubuntu 14.03 host:

sudo apt-get update
sudo apt-get install build-essential
sudo apt-get install libpcap-dev
sudo apt-get install git
git clone https://github. Continue reading

CloudFlare DDoS Mitigation Pipeline

The Usenix Enigma 2016 talk from Marek Majkowski describes CloudFlare's automated DDoS mitigation solution. CloudFlare provides reverse proxy services for millions of web sites and their customers are frequently targets of DDoS attacks. The talk is well worth watching in its entirety to learn about their experiences.

Network switches stream standard sFlow data to CloudFlare's "Gatebot" Reactive Automation component, which analyzes the data to identify attack vectors. Berkeley Packet Filter (BPF) rules are constructed to target specific attacks and apply customer specific mitigation policies. The rules are automatically installed in iptables firewalls on the CloudFlare servers.

The chart shows that over a three month period CloudFlare's mitigation system handled between 30 and 300 attacks per day.

Attack volumes mitigated regularly hit 100 million packers per second and reach peaks of over 150 million packets per second. These large attacks can cause significant damage and automated mitigation is critical to reducing their impact.

Elements of the CloudFlare solution are readily accessible to anyone interested in building DDoS mitigation solutions. Industry standard sFlow instrumentation is widely supported by switch vendors. Download sFlow-RT analytics software and combine real-time DDoS detection with business policies to automate mitigation actions. A number of DDoS mitigation examples are Continue reading

SignalFx

SignalFx is an example of a cloud based analytics service. SignalFx provides a REST API for uploading metrics and a web portal that it simple to combine and trend data and build and share dashboards.

This article describes a proof of concept demonstrating how SignalFx's cloud service can be used to cost effectively monitor large scale cloud infrastructure by leveraging standard sFlow instrumentation. SignalFx offers a free 14 day trial, making it easy to evaluate solutions based on this demonstration.

The diagram shows the measurement pipeline. Standard sFlow measurements from hosts, hypervisors, virtual machines, containers, load balancers, web servers and network switches stream to the sFlow-RT real-time analytics engine. Metrics are pushed from sFlow-RT to SignalFx using the REST API.

Over 40 vendors implement the sFlow standard and compatible products are listed on sFlow.org. The open source Host sFlow agent exports standard sFlow metrics from hosts, virtual machines and containers and local services. For additional background, the Velocity conference talk provides an introduction to sFlow and case study from a large social networking site.

SignalFx's service is priced based on the number of data points that they need to store and they estimate a cost of $15 per host Continue reading

Dell OS10 SDN router demo

In this video from Dell's Network Field Day 11 (#NFD11) presentation, Madhu Santhanam demonstrates an interesting use case for the new OS10 switch operating system that was introduced at the event.

The core of OS10 is an unmodified Linux kernel with an application development environment for Control Plane Services (CPS). These APIs allow software running on the switch: native linux applications, third party applications, and native OS10 applications to run on the core OS10 operating system.

The FIB Optimization application consists of three components: an sFlow agent to provide network visibility, Quagga for BGP routing, and the Selective Route Push agent which provides a REST API for selectively populating the hardware routing tables in the switch ASIC. The FIB Optimization application allows an inexpensive data center switch to replace a much more expensive high capacity Internet router.

In this use case, the data center is connected to a single transit provider and multiple additional peer networks. Initially all traffic is sent via a default route to the transit provider. The full Internet routing table consists of nearly 600,000 prefixes - far too many to fit in the switch hardware forwarding tables which in typical low cost switches can only handle Continue reading

Podcast with Nick Buraglio and Brent Salisbury

"Have you seen sFlow options in your router configuration or flow collector? Are you looking for alternatives to SNMP or NetFlow? Have you been curious about the instrumentation of your new white box or virtual switch? Yes? Then you will probably enjoy learning more about sFlow!"

Non-Blocking #1: SFlow With Peter Phaal Of InMon And SFlow.Org is a discussion between Brent Salisbury (networkstatic.net), Nick Buraglio (forwardingplane.net), and Peter Phaal (blog.sflow.com).

Web sites and tools mentioned in the podcast:

sFlow.org
Devices that support sFlow
Software to analyze sFlow
sFlow.org mailing list
sFlow structures
blog.sflow.com (incorrectly referenced as blog.sflow.org in the podcast)
Host sFlow
sflowtool

The podcast touches on a number of topics that have been explored in greater detail on this blog. The topics are listed in roughly the order they are mentioned in the podcast:

« Previous 1 … 9 10 11 12 13 … 15 Next »