
Category Archives for "sFlow"

Emulating congestion with Containerlab

The Containerlab dashboard above shows variation in throughput in an emulated leaf and spine network caused by collisions between large "Elephant" flows. See Leaf and spine traffic engineering using segment routing and SDN for a demonstration of the issue using physical switches.

This article describes the steps needed to emulate realistic network performance problems using Containerlab. First, using the FRRouting (FRR) open source router to build the topology provides a lightweight, high performance, routing implementation that can be used to efficiently emulate large numbers of routers using the native Linux dataplane for packet forwarding. Second, the containerlab tools netem set command can be used to introduce packet loss, delay, jitter, or restrict bandwidth of ports.
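As a rough illustration of the second step, the Python sketch below assembles a containerlab tools netem set command line without running it. The node name clab-clos5-leaf1, the interface eth1, and the impairment values are hypothetical, and the flag names (-n, -i, --delay, --jitter, --loss, --rate) are assumed from the Containerlab documentation, so check them against the installed version before use.

```python
import shlex

def netem_command(node, interface, delay=None, jitter=None, loss=None, rate=None):
    """Build (but do not run) a 'containerlab tools netem set' command."""
    argv = ["containerlab", "tools", "netem", "set", "-n", node, "-i", interface]
    # Only add flags for the impairments that were requested.
    for flag, value in (("--delay", delay), ("--jitter", jitter),
                        ("--loss", loss), ("--rate", rate)):
        if value is not None:
            argv += [flag, str(value)]
    return argv

# Hypothetical node and interface names; adjust to match the deployed topology.
cmd = netem_command("clab-clos5-leaf1", "eth1", delay="10ms", loss=1)
print(shlex.join(cmd))
```

Building the argument list separately from executing it makes it easy to review the impairments before applying them to a running lab.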

The netem tool makes use of the Linux tc (traffic control) module. Unfortunately, if you are using Docker Desktop, the minimal virtual machine used to run containers does not include the tc module.

multipass launch docker
Instead, use Multipass as a convenient way to create and start an Ubuntu virtual machine with Docker support on your laptop. If you are already on a Linux system with Docker installed, skip forward to the git clone step.
multipass ls
List the multipass virtual machines.
 Continue reading

Dropped packet metrics with Prometheus and Grafana

Dropped packets due to black hole routes, buffer exhaustion, expired TTLs, MTU mismatches, etc. can result in insidious connection failures that are time consuming and difficult to diagnose. Dropped packet notifications with Arista Networks, VyOS dropped packet notifications and Using sFlow to monitor dropped packets describe implementations of the sFlow Dropped Packet Notification Structures extension for Arista Networks switches, VyOS routers, and Linux servers respectively, providing end to end visibility into packet drop events (including switch port, drop reason and packet header for each dropped packet).

Flow metrics with Prometheus and Grafana describes how to define flow metrics and create dashboards to trend the flow metrics over time. This article describes how the same setup can be used to define and trend metrics based on dropped packet notifications.

  - job_name: sflow-rt-drops
    metrics_path: /app/prometheus/scripts/export.js/flows/ALL/txt
    static_configs:
      - targets: ['sflow-rt:8008']
    params:
      metric: ['dropped_packets']
      key:
        - 'node:inputifindex'
        - 'ifname:inputifindex'
        - 'reason'
        - 'stack'
        - 'macsource'
        - 'macdestination'
        - 'null:vlan:untagged'
        - 'null:[or:ipsource:ip6source]:none'
        - 'null:[or:ipdestination:ip6destination]:none'
        - 'null:[or:icmptype:icmp6type:ipprotocol:ip6nexthdr]:none'
      label:
        - 'switch'
        - 'port'
        - 'reason'
        - 'stack'
        - 'macsource'
        - 'macdestination'
        - 'vlan'
        - 'src'
        - 'dst'
        - 'protocol'
      value: ['frames']
      dropped: ['true']
      maxFlows: ['20']
      minValue: ['0.001']

The Prometheus scrape configuration above is used to Continue reading
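Prometheus passes the params entries of a scrape job as URL query parameters, repeating a parameter once for each list element. The Python sketch below reconstructs an equivalent request URL from the values in the configuration above, which can be handy for testing the query by hand. It is only an illustration: the parameter order Prometheus actually sends may differ, and the host name sflow-rt resolves only inside the deployment described in the article.

```python
from urllib.parse import urlencode, parse_qs

# Values copied from the scrape configuration above.
base = "http://sflow-rt:8008/app/prometheus/scripts/export.js/flows/ALL/txt"
params = {
    "metric": ["dropped_packets"],
    "key": [
        "node:inputifindex", "ifname:inputifindex", "reason", "stack",
        "macsource", "macdestination", "null:vlan:untagged",
        "null:[or:ipsource:ip6source]:none",
        "null:[or:ipdestination:ip6destination]:none",
        "null:[or:icmptype:icmp6type:ipprotocol:ip6nexthdr]:none",
    ],
    "label": ["switch", "port", "reason", "stack", "macsource",
              "macdestination", "vlan", "src", "dst", "protocol"],
    "value": ["frames"],
    "dropped": ["true"],
    "maxFlows": ["20"],
    "minValue": ["0.001"],
}

# doseq=True repeats a parameter once per list element (key=...&key=...).
query = urlencode(params, doseq=True)
url = f"{base}?{query}"
print(url)
```

Note that the key and label lists have the same length: each flow key in the configuration is paired with one label name.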

Dropped packet notifications with Arista Networks

Visibility into dropped packets is essential for Artificial Intelligence/Machine Learning (AI/ML) workloads, where a single dropped packet can stall large scale computational tasks, idling millions of dollars worth of GPU/CPU resources, and delaying the completion of business critical workloads. Enabling real-time sFlow telemetry provides the observability into traffic flows and packet drops needed to effectively manage these networks.

The Arista EOS 4.31.4M maintenance release brings sFlow dropped packet monitoring (previously demonstrated using the 4.30.1F feature release, see SC23 Dropped packet visibility demonstration) to production networks. See EOS Life Cycle Policy.
sflow sample 50000
sflow polling-interval 20
sflow vrf mgmt destination 203.0.113.100
sflow vrf mgmt source-interface Management0
sflow run
The above Arista EOS commands enable sFlow counter polling and packet sampling on all ports, sending the sFlow telemetry to the sFlow analyzer at 203.0.113.100.
flow tracking mirror-on-drop
  sample limit 100 pps
  !
  tracker SFLOW
    exporter SFLOW
      format sflow
      collector sflow
      local interface Management0
  no shutdown
The above commands add sFlow Dropped Packet Notification Structures to the sFlow telemetry feed using Broadcom Mirror on Drop (MoD) instrumentation. Broadcom implements mirror-on-drop in Jericho 2, Trident 3, and Tomahawk 3, Continue reading

VyOS 1.4 LTS released

Protectli Vault - 4 Port

The VyOS 1.4.0 (Sagitta) LTS release announcement is exciting news! VyOS is an open source router operating system based on Linux that can be installed on commodity PC hardware. For optimal performance, at least 1GB of RAM and 4GB of storage space are recommended.

The new 1.4 LTS release includes a significantly enhanced implementation of industry standard sFlow telemetry based on the open source Host sFlow agent.

set system sflow interface eth0
set system sflow interface eth1
set system sflow interface eth2
set system sflow interface eth3
set system sflow polling 30
set system sflow sampling-rate 1000
set system sflow drop-monitor-limit 50
set system sflow server 192.0.2.100
Enter the commands above to enable sFlow monitoring on interfaces eth0, eth1, eth2, and eth3. Interface counters will be exported every 30 seconds, packets will be sampled with probability 1/1000, and up to 50 packet headers (and drop reasons) per second will be collected from packets dropped by the router. The sFlow telemetry stream will be sent to an sFlow collector at 192.0.2.100.
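To make the 1/1000 sampling probability concrete, the Python sketch below scales a count of sampled packets up to an estimated total. The error bound uses the approximation commonly quoted for sFlow packet sampling (percentage error at 95% confidence of at most 196 × sqrt(1/samples)); treat it as an assumption rather than a property of this particular agent. The sample count is hypothetical.

```python
import math

def estimate_total(samples, sampling_rate):
    """Scale a count of sampled packets up to an estimated packet total."""
    return samples * sampling_rate

def percent_error(samples):
    """Approximate upper bound on the percentage error at 95% confidence."""
    return 196.0 * math.sqrt(1.0 / samples)

samples = 2500  # hypothetical number of samples seen for one class of traffic
print(estimate_total(samples, 1000))
print(round(percent_error(samples), 2))
```

The accuracy depends on the number of samples collected rather than on the sampling rate itself, so busier traffic classes are estimated more accurately over the same interval.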

Running Docker on the sFlow collector makes it easy to run a variety of Continue reading

Raspberry Pi 5 network emulation with Containerlab

The GitHub sflow-rt/containerlab project contains example network topologies for the Containerlab network emulation tool that demonstrate real-time streaming telemetry in realistic data center topologies and network configurations. The examples use the same FRRouting (FRR) engine that is part of SONiC, NVIDIA Cumulus Linux, and DENT network operating systems. Containerlab can be used to experiment before deploying solutions into production. Examples include: tracing ECMP flows in leaf and spine topologies, EVPN visibility, and automated DDoS mitigation using BGP Flowspec and RTBH controls.
Raspberry Pi 5 real-time network analytics describes how to install Docker on a Raspberry Pi 5.
docker run hello-world
Run the hello-world container to verify that Docker is properly installed and running before proceeding.
git clone https://github.com/sflow-rt/containerlab.git
Download the sflow-rt/containerlab project from GitHub.
cd containerlab
./run-clab
Start Containerlab.
containerlab deploy -t clos5.yml
Start the 5 stage leaf and spine topology shown at the top of this page. The initial launch may take a couple of minutes as the container images are downloaded for the first time. Once the images are downloaded, the topology deploys in around 10 seconds.
./topo.py clab-clos5
Push the topology to the sFlow-RT analytics software.
An instance of the sFlow-RT Continue reading

Raspberry Pi 5 real-time network analytics

CanaKit Raspberry Pi 5 Starter Kit - Aluminum
This article describes how to build an inexpensive Raspberry Pi 5 based server for real-time flow analytics using industry standard sFlow streaming telemetry. Support for sFlow is widely implemented in datacenter equipment from vendors including: A10, Arista, Aruba, Cisco, Edge-Core, Extreme, Huawei, Juniper, NEC, Netgear, Nokia, NVIDIA, Quanta, and ZTE.
In this example, we will use an 8GB Raspberry Pi 5 running Raspberry Pi OS Lite (64-bit). The easiest way to format a memory card and install the operating system is to use the Raspberry Pi Imager (shown above).
Click on the EDIT SETTINGS button to customize the installation.
Set a hostname, username, and password.
Click on the SERVICES tab and select Enable SSH. Click SAVE to save the settings and then YES to apply the settings and create a bootable micro SD card. These initial settings allow the Raspberry Pi to be accessed over the network without having to attach a screen, keyboard, and mouse.
ssh [email protected]
Use ssh to log into the Raspberry Pi (having installed the micro SD card).
sudo apt-get update && sudo apt-get -y upgrade
Update packages and OS to latest version.
curl  Continue reading

SC23 Over 6 Terabits per Second of WAN Traffic

The world’s fastest temporary internet service gets turned on in Denver for one week only describes the SCinet temporary network built to support The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC23) this week in Denver. The SC23 WAN Stress Test chart demonstrates that the provisioned capacity of 6.71 terabits per second was pushed to the limits.
SC23 SCinet traffic describes the architecture of the real-time monitoring system used to comprehensively monitor the SCinet network and generate these charts. This chart shows that over 175 Petabytes of data were transferred during the show.
SC23 Dropped packet visibility demonstration describes a joint demonstration by InMon Corp and Arista Networks of one of the newest developments in sFlow telemetry: identifying every dropped packet, the reason it was dropped, and the location where it was dropped, across all the switches in real-time.
SC23 WiFi Traffic Heatmap shows a real-time view of WiFi usage at the conference displayed on a conference floorplan.
Finally, SC23 Data Transfer Node TCP Metrics demonstrates how standard metrics maintained by the Linux kernel can be used to augment sFlow telemetry and track the performance of large science data transfers.

SC23 Data Transfer Node TCP Metrics

The dashboard shown above is based on the open source sflow-rt/dtn project. The dashboard shows data captured from The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC23) being held this week in Denver.

The dashboard displays data gathered from open source Host sFlow agents installed on Data Transfer Nodes (DTNs) run by the Caltech High Energy Physics Department and used for handling transfer of large scientific data sets (for example, accessing experiment data from the CERN particle accelerator). Network performance monitoring describes how the Host sFlow agents augment standard sFlow telemetry with measurements that the Linux kernel maintains as part of the normal operation of the TCP protocol stack.

The dashboard shows 5 large flows (greater than 50 Gigabits per Second). For each large flow being tracked, additional TCP performance metrics are displayed:

  • RTT: The round trip time observed between DTNs.
  • RTT Wait: The amount of time that data waits on the sender before it can be sent.
  • RTT Sdev: The standard deviation of the observed RTT. This variation is a measure of jitter.
  • Avg. Packet Size: The average packet size used to send data.
  • Packets in Flight: The number of unacknowledged packets.

See Defining Flows for full range of Continue reading

SC23 WiFi Traffic Heatmap

Real-time WiFi-Traffic Heatmap (source code GitHub: cod3monk/showfloor-heatmap) displays real-time WiFi traffic from The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC23) being held this week in Denver.
The conference network used in the demonstration, SCinet, is described as the most powerful and advanced network on Earth, connecting the SC community to the world.
In this example, the sFlow-RT real-time analytics engine receives sFlow telemetry from switches, routers, and servers in the SCinet network and creates metrics to drive the real-time heatmap. Getting Started provides a quick introduction to deploying and using sFlow-RT for real-time network-wide flow analytics.

Additional use cases being demonstrated this week include SC23 Dropped packet visibility demonstration and SC23 SCinet traffic.

SC23 SCinet traffic

The real-time dashboard shows total network traffic at The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC23) being held this week in Denver. The dashboard shows that 31 Petabytes of data have been transferred already and the conference hasn't even started.
The conference network used in the demonstration, SCinet, is described as the most powerful and advanced network on Earth, connecting the SC community to the world.
In this example, the sFlow-RT real-time analytics engine receives sFlow telemetry from switches, routers, and servers in the SCinet network and creates metrics to drive the real-time charts in the dashboard. Getting Started provides a quick introduction to deploying and using sFlow-RT for real-time network-wide flow analytics.
The dashboard above trends SC23 Total Traffic. The dashboard was constructed using the Prometheus time series database to store metrics retrieved from sFlow-RT and Grafana to build the dashboard. Deploy real-time network dashboards using Docker compose demonstrates how to deploy and configure these tools to create custom dashboards like the one shown here.

Finally, check out the SC23 Dropped packet visibility demonstration to learn about one of the newest developments in sFlow monitoring and see a live demonstration.

SC23 Dropped packet visibility demonstration

The real-time dashboard is a joint InMon / Arista Network Research Exhibition, SC23-NRE-026 Standard Packet Drop Monitoring In High Performance Networks, a part of The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC23) being held this week in Denver.
The conference network used in the demonstration, SCinet, is described as the most powerful and advanced network on Earth, connecting the SC community to the world.

The SC23-NRE-026 Standard Packet Drop Monitoring In High Performance Networks dashboard combines telemetry from all the Arista switches in the SCinet network to provide a real-time, network-wide view of performance. Each of the three charts demonstrates a different type of measurement in the sFlow telemetry stream:

  • Counters: Total Traffic shows total traffic calculated from interface counters streamed from all interfaces. Counters provide a useful way of accurately reporting byte, frame, error and discard counters for each network interface. In this case, the chart rolls up data from all interfaces to trend total traffic on the network.
  • Samples: Top Flows shows the top 5 largest traffic flows traversing the network. The chart is based on sFlow's random packet sampling mechanism, providing a scalable method of determining the hosts and services responsible Continue reading

Internet eXchange Provider (IXP) Metrics

IXP Metrics is available on GitHub. The application provides real-time monitoring of traffic between members of an Internet eXchange Provider (IXP) network.

This article uses Arista switches as an example to illustrate the steps needed to deploy the monitoring solution; however, these steps should work for other network equipment vendors (provided you modify the vendor specific elements in this example).

git clone https://github.com/sflow-rt/prometheus-grafana.git
cd prometheus-grafana
env RT_IMAGE=ixp-metrics ./start.sh

The easiest way to get started is to use Docker, see Deploy real-time network dashboards using Docker compose, and deploy the sflow/ixp-metrics image bundling the IXP Metrics application.

scrape_configs:
  - job_name: sflow-rt-ixp-metrics
    metrics_path: /app/ixp-metrics/scripts/metrics.js/prometheus/txt
    static_configs:
    - targets: ['sflow-rt:8008']
Follow the directions in the article to add a Prometheus scrape task to retrieve the metrics.
sflow source-interface management 1
sflow destination 10.0.0.50
sflow polling-interval 20
sflow sample 50000
sflow run

Enable sFlow on all exchange switches, directing sFlow telemetry to the Docker host (in this case 10.0.0.50).

Use the sFlow-RT Status page to confirm that sFlow is being received from the switches. In this case 286 sFlow datagrams per second are being received from 9 switches.
The IX-F Member Export JSON Schema Continue reading

Containerlab dashboard

The GitHub sflow-rt/containerlab project contains example network topologies for the Containerlab network emulation tool that demonstrate real-time streaming telemetry in realistic data center topologies and network configurations. The examples use the same FRRouting (FRR) engine that is part of SONiC, NVIDIA Cumulus Linux, and DENT network operating systems. Containerlab can be used to experiment before deploying solutions into production. Examples include: tracing ECMP flows in leaf and spine topologies, EVPN visibility, and automated DDoS mitigation using BGP Flowspec and RTBH controls.
The screen capture at the top of this article shows a real-time dashboard displaying up-to-the-second traffic analytics gathered from the 5 stage Clos fabric shown above. This article walks through the steps needed to run the example.
git clone https://github.com/sflow-rt/containerlab.git
cd containerlab
./run-clab
Run the above commands to download the project and run Containerlab on a system with Docker installed. Docker Desktop is a convenient way to run the labs on a laptop.
containerlab deploy -t clos5.yml
Start the emulation.
./topo.py clab-clos5
Post the topology to the sFlow-RT REST API. Connect to http://localhost:8008/app/containerlab-dashboard/html/ to access the Dashboard shown at the top of this article.
docker exec -it clab-clos5-h1 iperf3 -c 172.16. Continue reading

Grafana Network Weathermap

The screen capture above shows a simple network weathermap, displaying a network topology with links animated by real-time network analytics.
Hovering over a link in the weathermap pops up a trend chart showing traffic on the link over the last 30 minutes.

Deploy real-time network dashboards using Docker compose describes how to quickly deploy a real-time network analytics stack that includes the sFlow-RT analytics engine, Prometheus time series database, and Grafana to create dashboards. This article describes how to extend the example using the Grafana Network Weathermap Plugin to display network topologies like the ones shown here.

First, add a dashboard panel and select the Network Weathermap visualization. Next, define the three metrics shown above. The ifinoctets and ifoutoctets metrics need to be multiplied by 8 to convert from bytes per second to bits per second. Creating a custom legend entry makes it easier to select the metric instances to associate with weathermap links.
Add a color scale that will be used to color links by link utilization. Defining the scale first ensures that links will be displayed correctly when they are added later.
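The unit conversion and the utilization value that the color scale is applied to can be sketched in a few lines of Python. The function names, the traffic rate, and the link speed below are hypothetical and only illustrate the arithmetic; they are not part of the weathermap plugin.

```python
def bits_per_second(octets_per_second):
    """Convert a byte (octet) rate into a bit rate."""
    return octets_per_second * 8

def utilization_percent(octets_per_second, link_speed_bps):
    """Express a byte rate as a percentage of the link speed."""
    return 100.0 * bits_per_second(octets_per_second) / link_speed_bps

# Hypothetical example: 50 megabytes per second on a 1 Gbit/s link.
rate = 50_000_000
print(bits_per_second(rate))
print(utilization_percent(rate, 1_000_000_000))
```

Here the link carries 400 Mbit/s, or 40 percent of its capacity, which is the kind of value a utilization-based color scale would map to a color.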
Add the nodes to the canvas and drag them to their desired locations. There is a Continue reading

Deploy real-time network dashboards using Docker compose


This article demonstrates how to use docker compose to quickly deploy a real-time network analytics stack that includes the sFlow-RT analytics engine, Prometheus time series database, and Grafana to create dashboards.
git clone https://github.com/sflow-rt/prometheus-grafana.git
cd prometheus-grafana
./start.sh
Download the sflow-rt/prometheus-grafana project from GitHub on a system with Docker installed and start the containers. The start.sh script runs docker compose to bring up the containers specified in the compose.yml file, passing in user information so that the containers have the correct permissions to write data files in the prometheus and grafana directories.
All the Docker images in this example are available for both x86 and ARM processors, so this stack can be deployed on Intel/AMD platforms as well as Apple M1/M2 or Raspberry Pi. Raspberry Pi 4 real-time network analytics describes how to configure a Raspberry Pi 4 to run Docker and perform real-time network analytics and is a simple way to run this stack for smaller networks.

Configure sFlow Agents in network devices to stream sFlow telemetry to the host running the analytics stack. See Getting Started for information on how to verify that sFlow telemetry is being received.

Connect to the Grafana web interface on Continue reading

Raspberry Pi 4 real-time network analytics

CanaKit Raspberry Pi 4 EXTREME Kit - Aluminum
This article describes how to build an inexpensive Raspberry Pi 4 based server for real-time flow analytics of industry standard sFlow streaming telemetry. Support for sFlow is widely implemented in datacenter equipment from vendors including: A10, Arista, Aruba, Cisco, Edge-Core, Extreme, Huawei, Juniper, NEC, Netgear, Nokia, NVIDIA, Quanta, and ZTE.

In this example, we will use an 8GB Raspberry Pi 4 running Raspberry Pi OS Lite (64-bit). The easiest way to format a memory card and install the operating system is to use the Raspberry Pi Imager (shown above).
Click on the gear icon to set a user and password and enable ssh access. These initial settings allow the Raspberry Pi to be accessed over the network without having to attach a screen, keyboard, and mouse.

Next, follow the instructions for installing Docker Engine (Raspberry Pi OS Lite is based on Debian 11).

The diagram shows how the sFlow-RT real-time analytics engine receives a continuous telemetry stream from industry standard sFlow instrumentation built into network, server and application infrastructure and delivers analytics through APIs and can easily be integrated with a wide variety of on-site and cloud, orchestration, DevOps and Software Defined Networking Continue reading

Leaf and spine network emulation on Mac OS M1/M2 systems


The GitHub sflow-rt/containerlab project contains example network topologies for the Containerlab network emulation tool that demonstrate real-time streaming telemetry in realistic data center topologies and network configurations. The examples use the same FRRouting (FRR) engine that is part of SONiC, NVIDIA Cumulus Linux, and DENT network operating systems. Containerlab can be used to experiment before deploying solutions into production. Examples include: tracing ECMP flows in leaf and spine topologies, EVPN visibility, and automated DDoS mitigation using BGP Flowspec and RTBH controls.

The Containerlab project currently has limited support for Mac OS, stating "ARM-based Macs (M1/2) are not supported, and no binaries are generated for this platform. This is mainly due to the lack of network images built for arm64 architecture as of now." However, this argument doesn't apply to the Linux based images used in these examples.

First install Docker Desktop on your Apple silicon based Mac (select the Apple Chip option).

mkdir clab
cd clab
docker run --rm -it --privileged \
  --network host --pid="host" \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /run/netns:/run/netns \
  -v $(pwd):$(pwd) -w $(pwd) \
  sflow/clab bash

Run Containerlab by typing the above commands in a terminal. This command uses a pre-built multi-architecture Continue reading

VyOS DDoS mitigation

Real-time flow analytics on VyOS describes how to install real-time analytics based on sFlow and the sFlow-RT analytics engine. This article extends the example to show how to automatically mitigate DDoS attacks using flow analytics combined with BGP Remotely Triggered Black Hole (RTBH) / Flowspec.
vyos@vyos:~$ add container image sflow/ddos-protect
First, download the sflow/ddos-protect image.
vyos@vyos:~$ mkdir -m 777 /config/sflow-rt
Create a directory to store persistent container state.
set container network sflowrt prefix 192.168.1.0/24
Define an internal network to connect to the container. Currently VyOS BGP does not allow direct connections to local addresses (e.g. 127.0.0.1), so we need to put the controller on its own network so that the router can connect and receive DDoS mitigation BGP RTBH / Flowspec controls.
set container name sflow-rt image sflow/ddos-protect
set container name sflow-rt host-name sflow-rt
set container name sflow-rt arguments '-Dddos_protect.router=192.168.1.1 -Dddos_protect.enable.flowspec=yes'
set container name sflow-rt environment RTMEM value 200M
set container name sflow-rt memory 0
set container name sflow-rt volume store source /config/sflow-rt
set container name sflow-rt volume store destination /sflow-rt/store
set container name sflow-rt network sflowrt address 192.168.1.2

Configure a container to run the image. The Continue reading

Real-time flow analytics on VyOS

VyOS with Host sFlow agent describes support for streaming sFlow telemetry added to the open source VyOS router operating system. This article describes how to install analytics software on a VyOS router by configuring a container.
vyos@vyos:~$ add container image sflow/ddos-protect
First, download the sflow/ddos-protect image.
vyos@vyos:~$ mkdir -m 777 /config/sflow-rt
Create a directory to store persistent container state.
set container name sflow-rt image sflow/ddos-protect
set container name sflow-rt allow-host-networks
set container name sflow-rt arguments '-Dhttp.hostname=10.0.0.240'
set container name sflow-rt environment RTMEM value 200M
set container name sflow-rt memory 0
set container name sflow-rt volume store source /config/sflow-rt
set container name sflow-rt volume store destination /sflow-rt/store
Configure a container to run the image. The RTMEM environment variable setting limits the amount of memory that the container will use to 200M bytes. The -Dhttp.hostname argument sets the internal web server to listen on the management address, 10.0.0.240, assigned to eth0 on this router. The container has no built-in authentication, so access needs to be limited using an ACL or through a reverse proxy - see Download and install.
set system sflow interface eth0
set system sflow interface eth1
set system sflow interface  Continue reading

Dropped packet reason codes in VyOS

The article VyOS with Host sFlow agent describes how to use industry standard sFlow telemetry to monitor network traffic flows and statistics in the latest VyOS rolling releases. VyOS dropped packet notifications describes how sFlow also provides visibility into network packet drops and Dropped packet reason codes in Linux 6+ kernels describes how newer kernels are able to provide specific reasons for dropping packets. 
vyos@vyos:~$ uname -r
6.1.22-amd64-vyos

The latest VyOS rolling release runs on a Linux 6.1 kernel and the latest release of VyOS now provides enhanced visibility into dropped packets using kernel reason codes.

vyos@vyos:~$ show version
Version:          VyOS 1.4-rolling-202303310716
Release train:    current

Built by:         [email protected]
Built on:         Fri 31 Mar 2023 07:16 UTC
Build UUID:       1a7448d9-d53c-48a0-8644-ed1970c1abb8
Build commit ID:  75c9311fba375e

Architecture:     x86_64
Boot via:         installed image
System type:       guest

Hardware vendor:  innotek GmbH
Hardware model:   VirtualBox
Hardware S/N:     0
Hardware UUID:    da75808d-ff60-1d4c-babd-84a7fa341053

Copyright:        VyOS maintainers and contributors
Verify that the version of VyOS is VyOS 1.4-rolling-202303310716 or later.

In the previous article, VyOS dropped packet notifications,  two tests were performed, the first a failed attempt to connect to the VyOS router using telnet (telnet has been disabled in Continue reading
