Archive

Category Archives for "MTU Ninja | Vincent Bernat"

VXLAN: BGP EVPN with Cumulus Quagga

VXLAN is an overlay network to encapsulate Ethernet traffic over an existing (highly available and scalable, possibly the Internet) IP network while accomodating a very large number of tenants. It is defined in RFC 7348. For an uncut introduction on its use with Linux, have a look at my “VXLAN & Linux” post.

VXLAN deployment

In the above example, we have hypervisors hosting a virtual machines from different tenants. Each virtual machine is given access to a tenant-specific virtual Ethernet segment. Users are expecting classic Ethernet segments: no MAC restrictions1, total control over the IP addressing scheme they use and availability of multicast.

In a large VXLAN deployment, two aspects need attention:

  1. discovery of other endpoints (VTEPs) sharing the same VXLAN segments, and
  2. avoidance of BUM frames (broadcast, unknown unicast and multicast) as they have to be forwarded to all VTEPs.

A typical solution for the first point is using multicast. For the second point, this is source-address learning.

Introduction to BGP EVPN

BGP EVPN (RFC 7432 and draft-ietf-bess-evpn-overlay for its application with VXLAN Continue reading

VXLAN & Linux

VXLAN is an overlay network to carry Ethernet traffic over an existing (highly available and scalable) IP network while accommodating a very large number of tenants. It is defined in RFC 7348.

Starting from Linux 3.12, the VXLAN implementation is quite complete as both multicast and unicast are supported as well as IPv6 and IPv4. Let’s explore the various methods to configure it.

VXLAN setup

To illustrate our examples, we use the following setup:

  • an underlay IP network (highly available and scalable, possibly the Internet),
  • three Linux bridges acting as VXLAN tunnel endpoints (VTEP),
  • four servers believing they share a common Ethernet segment.

A VXLAN tunnel extends the individual Ethernet segments accross the three bridges, providing a unique (virtual) Ethernet segment. From one host (e.g. H1), we can reach directly all the other hosts in the virtual segment:

$ ping -c10 -w1 -t1 ff02::1%eth0
PING ff02::1%eth0(ff02::1%eth0) 56 data bytes
64 bytes from fe80::5254:33ff:fe00:8%eth0: icmp_seq=1 ttl=64 time=0.016 ms
64 bytes from fe80::5254:33ff:fe00:b%eth0: icmp_seq=1 ttl=64 time=4.98 ms (DUP!)
64 bytes from fe80::5254:33ff:fe00:9%eth0: icmp_seq=1 ttl=64 time=4.99 ms (DUP!)
64 bytes from fe80::5254:33ff:fe00:a%eth0: icmp_seq=1 ttl=64 time=4.99 ms (DUP!)

--- ff02::1%eth0 ping statistics ---
1 packets transmitted, 1 received, +3 duplicates,  Continue reading

Proper isolation of a Linux bridge

TL;DR: when configuring a Linux bridge, use the following commands to enforce isolation:

# bridge vlan del dev br0 vid 1 self
# echo 1 > /sys/class/net/br0/bridge/vlan_filtering

A network bridge (also commonly called a “switch”) brings several Ethernet segments together. It is a common element in most infrastructures. Linux provides its own implementation.

A typical use of a Linux bridge is shown below. The hypervisor is running three virtual hosts. Each virtual host is attached to the br0 bridge (represented by the horizontal segment). The hypervisor has two physical network interfaces:

  • eth0 is attached to a public network providing various services for the virtual hosts (DHCP, DNS, NTP, routers to Internet, …). It is also part of the br0 bridge.
  • eth1 is attached to an infrastructure network providing various services to the hypervisor (DNS, NTP, configuration management, routers to Internet, …). It is not part of the br0 bridge.

Typical use of Linux bridging with virtual machines

The main expectation of such a setup is that while the virtual hosts should be able to use resources from the public network, they should not be able to access resources from the infrastructure network (including resources hosted on the hypervisor itself, like a Continue reading

Netops with Emacs and Org mode

Org mode is a package for Emacs to “keep notes, maintain todo lists, planning projects and authoring documents”. It can execute embedded snippets of code and capture the output (through Babel). It’s an invaluable tool for documenting your infrastructure and your operations.

Here are three (relatively) short videos exhibiting Org mode use in the context of network operations. In all of them, I am using my own junos-mode which features the following perks:

  • syntax highlighting for configuration files,
  • commit of configuration snippets to remote devices, and
  • execution of remote commands.

Since some Junos devices can be quite slow, commits and remote executions are done asynchronously1 with the help of a Python helper.

In the first video, I take some notes about configuring BGP add-path feature (RFC 7911). It demonstrates all the available features of junos-mode.

In the second video, I execute a planned operation to enable this feature in production. The document is a modus operandi and contains the configuration to apply and the commands to check if it works as expected. At the end, the document becomes a detailed report of the operation.

In the third video, a cookbook has been prepared to execute Continue reading

Integration of a Go service with systemd

Unlike other programming languages, Go’s runtime doesn’t provide a way to reliably daemonize a service. A system daemon has to supply this functionality. Most distributions ship systemd which would fit the bill. A correct integration with systemd is quite straightforward. There are two interesting aspects: readiness & liveness.

As an example, we will daemonize this service whose goal is to answer requests with nifty 404 errors:

package main

import (
    "log"
    "net"
    "net/http"
)

func main() {
    l, err := net.Listen("tcp", ":8081")
    if err != nil {
        log.Panicf("cannot listen: %s", err)
    }
    http.Serve(l, nil)
}

You can build it with go build 404.go.

Here is the service file, 404.service1:

[Unit]
Description=404 micro-service

[Service]
Type=notify
ExecStart=/usr/bin/404
WatchdogSec=30s
Restart=on-failure

[Install]
WantedBy=multi-user.target

Readiness

The classic way for an Unix daemon to signal its readiness is to daemonize. Technically, this is done by calling fork(2) twice (which also serves other intents). This is a very common task and the BSD systems, as well as some other C libraries, supply a daemon(3) Continue reading

Write your own terminal emulator

I was an happy user of rxvt-unicode until I got a laptop with an HiDPI display. Switching from a LoDPI to a HiDPI screen and back was a pain: I had to manually adjust the font size on all terminals or restart them.

VTE is a library to build a terminal emulator using the GTK+ toolkit, which handles DPI changes. It is used by many terminal emulators, like GNOME Terminal, evilvte, sakura, termit and ROXTerm. The library is quite straightforward and writing a terminal doesn’t take much time if you don’t need many features.

Let’s see how to write a simple one.

A simple terminal

Let’s start small with a terminal with the default settings. We’ll write that in C. Another supported option is Vala.

#include <vte/vte.h>

int
main(int argc, char *argv[])
{
    GtkWidget *window, *terminal;

    /* Initialise GTK, the window and the terminal */
    gtk_init(&argc, &argv);
    terminal = vte_terminal_new();
    window = gtk_window_new(GTK_WINDOW_TOPLEVEL);
    gtk_window_set_title(GTK_WINDOW(window), "myterm");

    /* Start a new shell */
    gchar **envp = g_get_environ();
    gchar **command = (gchar * Continue reading