Archive

Category Archives for "Networking"

Building BGP Route Reflector Configuration with Ansible/Jinja2

One of our subscribers sent me this email when trying to use ideas from the Ansible for Networking Engineers webinar to build BGP route reflector configuration:

I’m currently discovering Ansible/Jinja2 and trying to create BGP route reflector configuration from Jinja2 template using Ansible playbook. As part of group_vars YAML file, I wish to list all route reflector clients IP address. When I have 50+ neighbors, the YAML file gets quite unreadable and it’s hard to see data model anymore.

Whenever you hit a roadblock like this one, you should start with the bigger picture and maybe redefine the problem.
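One possible way to redefine the problem, sketched in plain Python rather than Jinja2 so it runs without extra packages: describe the clients as a compact range and expand it when rendering, instead of listing every address. The AS number, the address range, the `first`/`count` keys, and the IOS-style neighbor syntax are illustrative assumptions, not taken from the subscriber's setup or from the webinar.

```python
import ipaddress


def expand_clients(first, count):
    """Return `count` consecutive IPv4 addresses starting at `first`."""
    start = ipaddress.IPv4Address(first)
    return [str(start + i) for i in range(count)]


def render_rr_config(asn, clients):
    """Render IOS-style route reflector neighbor statements (illustrative syntax)."""
    lines = [f"router bgp {asn}"]
    for ip in clients:
        lines.append(f" neighbor {ip} remote-as {asn}")
        lines.append(f" neighbor {ip} route-reflector-client")
    return "\n".join(lines)


# Hypothetical data model: a loopback range instead of 50+ listed addresses
rr_clients = {"first": "10.0.0.1", "count": 3}
config = render_rr_config(65000, expand_clients(**rr_clients))
print(config)
```

The same expansion could live in a Jinja2 loop or an Ansible filter; the point of the sketch is only that the variables file stays short while the generated configuration grows.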

Why use Typha in your Calico Kubernetes Deployments?

Calico is an open source networking and network security solution for containers, virtual machines, and native host-based workloads. Calico supports a broad range of platforms including Kubernetes, OpenShift, Docker EE, OpenStack, and bare metal. In this blog, we will focus on Kubernetes pod networking and network security using Calico.

Calico uses etcd as the back-end datastore. When you run Calico on Kubernetes, you can use the same etcd datastore through the Kubernetes API server. This is called a Kubernetes backed datastore (KDD) in Calico. The following diagram shows a block-level architecture of Calico.

Calico-node runs as a Daemonset, and has a fair amount of interaction with the Kubernetes API server. It’s easy for you to profile that by simply enabling audit logs for calico-node. For example, in my kubeadm cluster, I used the following audit configuration

 

To set the context, this is my cluster configuration.

As we are running Typha already, let us profile the API calls for both Calico and Typha components. I used the following commands to extract the unique API calls for each.
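The original commands are not reproduced in this excerpt. As a rough stand-in, the following Python sketch shows one way to pull the unique API calls per client out of a Kubernetes audit log, assuming the log is written as one JSON event per line with the standard `verb`, `user.username`, and `objectRef.resource` fields. The usernames and sample events are made up for illustration.

```python
import json
from collections import defaultdict


def unique_api_calls(lines):
    """Map each username to the set of (verb, resource) pairs it issued."""
    calls = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        user = event.get("user", {}).get("username", "unknown")
        # objectRef is absent for non-resource requests
        resource = (event.get("objectRef") or {}).get("resource", "")
        calls[user].add((event.get("verb", ""), resource))
    return calls


# Made-up sample events; a real run would read the API server's audit log file.
# The usernames are placeholders, not the accounts used in the author's cluster.
sample = [
    '{"verb": "watch", "user": {"username": "sa-calico-node"}, "objectRef": {"resource": "nodes"}}',
    '{"verb": "list", "user": {"username": "sa-calico-node"}, "objectRef": {"resource": "pods"}}',
    '{"verb": "watch", "user": {"username": "sa-calico-node"}, "objectRef": {"resource": "nodes"}}',
    '{"verb": "watch", "user": {"username": "sa-typha"}, "objectRef": {"resource": "networkpolicies"}}',
]

calls = unique_api_calls(sample)
for user, pairs in sorted(calls.items()):
    print(user, sorted(pairs))
```

Comparing the two sets side by side is what makes the difference between the calico-node and Typha call profiles visible.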

 

If you ignore the license key API calls from calico-node, you will see that the API calls Continue reading

Juniper QFX10K IPFIX Gotchas

IPFIX is problematic on the Juniper QFX10K switches. Documentation is sparse and doesn’t include a complete configuration. Behavior changes between versions in undocumented ways. Here are a couple of things I noticed when upgrading from Junos 17.3 to 17.4. These also apply if you are running 18.4 code. I hit more problems with 18.4, and ended up rolling back to 17.4.

Big Changes in Reported Throughput

Here’s a graph showing total reported throughput for a QFX10K I upgraded:

ipfix traffic report

There are a few things going on there. First, the reported traffic drops to zero after I upgraded. Then it starts coming up after I fixed the first problem. After that, though, the reported traffic is flat and lower than it should be. Then it starts coming up again after I made the second fix.

First Problem: Chassis Sample Instance

The first configuration change I needed to add was this: set chassis fpc 0 sampling-instance sample-border, where sample-border is the name of the sampling instance I have configured under forwarding-options. This was not required with 17.3. If you don’t do it with 17.4, you won’t get any data.
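If the same statement has to be generated for several switches, a small helper can build it from the instance name. This is only a sketch: the post shows the command for FPC 0 alone, so passing any other slot number is an assumption, and the function name is hypothetical.

```python
def sampling_instance_commands(instance, fpc_slots):
    """Build one 'set chassis fpc N sampling-instance NAME' line per FPC slot."""
    return [
        f"set chassis fpc {slot} sampling-instance {instance}"
        for slot in fpc_slots
    ]


# Reproduces the single command quoted in the post
commands = sampling_instance_commands("sample-border", [0])
print("\n".join(commands))
```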

Second Problem: DDoS-Protection

Some Juniper platforms implement Continue reading

April Customer Newsletter

Welcome to the April 2020 edition of the Tigera Calicommunication newsletter! In the March edition, we discussed context-aware flow logs. This edition covers the next component of logging, the audit logs.

Using Calico Enterprise Audit Logs to Improve Visibility, Security, and Compliance

Watch this short video to see how you can benefit from using Calico Enterprise Audit Logs.

What problems are we solving?

Kubernetes is an API-driven platform. Every action happens through an API call into the kube API server. Consequently, recording and monitoring API activity is very important. While most deployments end up sending these logs to a remote destination for compliance purposes, these logs are often not easily accessible when needed. Moreover, different roles (platform, network, security) have different requirements, and many may not even have access to the logs. Some use cases relevant to log analysis are as follows.

  • A policy change resulted in a sudden outage of a service. How do you find out which policies have changed in the last 24 hours? [network, security]
  • You are maintaining a critical namespace and want to monitor every pod that comes up in that namespace. Can you get an alert if a pod is created in that Continue reading
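The first use case above can be approximated against raw Kubernetes audit events. The sketch below is not how Calico Enterprise implements it; it assumes one JSON audit event per line with the standard `verb`, `objectRef`, and `requestReceivedTimestamp` fields, and the sample events are made up.

```python
import json
from datetime import datetime, timedelta, timezone

MUTATING_VERBS = {"create", "update", "patch", "delete"}


def recent_policy_changes(lines, now, window=timedelta(hours=24)):
    """Return (time, verb, namespace, name) for policy changes inside the window."""
    changes = []
    for line in lines:
        event = json.loads(line)
        ref = event.get("objectRef") or {}
        if ref.get("resource") != "networkpolicies":
            continue
        if event.get("verb") not in MUTATING_VERBS:
            continue
        # Audit timestamps are RFC 3339 with a trailing "Z"
        stamp = datetime.fromisoformat(
            event["requestReceivedTimestamp"].replace("Z", "+00:00")
        )
        if now - stamp <= window:
            changes.append((stamp, event["verb"], ref.get("namespace"), ref.get("name")))
    return sorted(changes, key=lambda change: change[0])


# Made-up events for illustration only
sample = [
    '{"verb": "update", "requestReceivedTimestamp": "2020-04-02T10:00:00.000000Z", "objectRef": {"resource": "networkpolicies", "namespace": "default", "name": "allow-dns"}}',
    '{"verb": "delete", "requestReceivedTimestamp": "2020-03-30T10:00:00.000000Z", "objectRef": {"resource": "networkpolicies", "namespace": "default", "name": "old-policy"}}',
    '{"verb": "get", "requestReceivedTimestamp": "2020-04-02T11:00:00.000000Z", "objectRef": {"resource": "networkpolicies", "namespace": "default", "name": "allow-dns"}}',
    '{"verb": "create", "requestReceivedTimestamp": "2020-04-02T11:00:00.000000Z", "objectRef": {"resource": "pods", "namespace": "default", "name": "web-1"}}',
]

now = datetime(2020, 4, 2, 12, 0, tzinfo=timezone.utc)
changes = recent_policy_changes(sample, now)
for stamp, verb, namespace, name in changes:
    print(stamp.isoformat(), verb, namespace, name)
```

Only the first sample event survives the filter: the second is older than 24 hours, the third is a read, and the fourth touches a different resource.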

Lenovo intros an edge platform that runs Azure stack

Lenovo is boosting its ties to Microsoft with an edge-to-cloud platform that runs Microsoft’s Azure Stack in a hyperconverged infrastructure (HCI), putting HCI on the edge of the network rather than in a data center.

The Lenovo ThinkAgile MX1021 server analyzes data at the edge near where it is gathered, a change in direction for the usual edge strategy. In earlier edge schemes, data collected at an edge endpoint is merely sorted, and only the relevant data is sent up to the main data center where it is analyzed.

The ThinkAgile MX1021 platform is a ruggedized, half-width, short-depth, 1U compact server that can be installed almost anywhere: hung on a wall, stacked on a shelf, or mounted in a rack. For connectivity, it supports Wi-Fi, 4G and 5G.

Daily Roundup: Is AT&T Readying More Job Cuts?

The carrier is “sizing our operations to economic activity”; VMware is helping Vodafone cut...

© SDxCentral, LLC. Use of this feed is limited to personal, non-commercial use and is governed by SDxCentral's Terms of Use (https://www.sdxcentral.com/legal/terms-of-service/). Publishing this feed for public or commercial use and/or misrepresentation by a third party is prohibited.

How to Protect Your Virtual Meetings from Zoombombing

Imagine, if you will, you’re participating in a […] Eric Yuan has put a freeze on feature updates in order to address the security issues. Zoom promised to address the problem within the next 90 days; as Yuan said, “Over the next 90 days, we are committed to dedicating the resources needed to better identify, address, and fix issues proactively. We are also committed to being transparent throughout this process. We want to do what it takes to maintain your trust.” Another writer for The New Stack, Jennifer Riggins Continue reading

Cumulus content roundup: March 2020

Spring has sprung! With the change of the seasons, we’ve kept busy pumping out new content and useful resources. If you’re looking for a quick mental vacation, get ready to cozy up with this month’s edition of the Cumulus Content Roundup. We’ve got exciting announcements, fresh podcast episodes for your listening enjoyment, as well as blog posts packed with open networking and data center goodness.

From Cumulus Networks
Cumulus Networks launches the industry’s first open source and fully packaged automation solution — making open networking easier to deploy and manage and enabling infrastructure-as-code models: Cumulus Networks announces the release of its production-ready automation solution for organizations moving towards fully automated networks! Read all about how we are taking the next step in network automation in this blog post.

Production-ready automation, the how and why: So we’ve announced the industry’s first open source and fully packaged automation solution, but how exactly did we get there? This blog post dives into the challenges that customers were facing and the reason we wanted to help.

A new era for Cumulus in the Cloud: Can you believe that Cumulus in the Cloud was launched over two years ago? Yeah, we’re also Continue reading

AT&T Hints at COVID-19 Related Job Cuts

It expects "sizing our operations to economic activity" along with its ongoing business execution...

Versa Targets SMBs, Pens SD-WAN Deal With Nuvias

Versa Titan promises to simplify the deployment and management of branch offices and make it easier...

Vodafone Cut Costs 50% With VMware Telco Cloud

VMware’s network virtualization infrastructure supports voice core, data core, and service...

Project Crossbow: Lessons from Refactoring a Large-Scale Internal Tool

Cloudflare’s global network currently spans 200 cities in more than 90 countries. Engineers working in product, technical support and operations often need to be able to debug network issues from particular locations or individual servers.

Crossbow is the internal tool for doing just this, allowing Cloudflare’s Technical Support Engineers to perform diagnostic activities, from running commands (like traceroutes, cURL requests and DNS queries) to debugging product features and performance using bespoke tools.

In September last year, an Engineering Manager at Cloudflare asked to transition Crossbow from a Product Engineering team to the Support Operations team. The tool had been a secondary focus and had been transitioned through multiple engineering teams without developing subject matter knowledge.

The Support Operations team at Cloudflare is closely aligned with Cloudflare’s Technical Support Engineers, developing diagnostic tooling and Natural Language Processing technology to drive efficiency. Based on this alignment, it was decided that Support Operations was the best team to own this tool.

Learning from Sisyphus

Whilst seeking advice on the transition process, an SRE Engineering Manager in Cloudflare suggested reading: “A Case Study in Community-Driven Software Adoption”. This book proved a truly invaluable read for anyone thinking of doing internal tool development Continue reading

When We Come Together, We Are Richer for It

These are unsettling and unprecedented times.

The speed at which coronavirus has taken hold around the world, and the dramatic changes to our lives that it has brought, would have been difficult for many of us to contemplate just a few short weeks ago.

Social (and physical) distancing measures that were merely a suggestion then have suddenly become a strange reality for millions of people, with entire countries going into complete lockdown, borders and schools closing, planes no longer flying, and normal social activity placed on hold.

The vital role the Internet is playing is clear for all to see. It allows us to work together while we are socially apart, quietly and quickly providing many of us with a way to continue our lives. It has allowed us to maintain at least some sense of the ordinary during an extraordinary time.

We are asking a lot of the Internet, but it is ready for the challenge. It is enabling companies to keep working, schoolchildren to continue learning, and families and friends to stay connected. Even virtual birthday parties and weddings have become a hit!

The Internet means that self-isolation may be a physical reality, but it need not be Continue reading

Can We Trust BGP Next Hops (Part 1)?

Aldrin sent me an interesting question as a comment to one of my EVPN blog posts:

How does the network know that a VTEP is actually alive? (1) from the point of view of the control plane and (2) from the point of view of the data plane? And how do you ensure that control and data plane liveness monitoring has the same view? BFD for BGP is a possible solution for (1) but it’s not meant for 3rd party next hops, i.e. it doesn’t address (2).

Let’s stop right there (or you’ll stop reading in the next 10 milliseconds). I will also try to rephrase the question in more generic terms, hoping Aldrin won’t mind a slight detour… we’ll get back to the original question in another blog post.

UCL team uses supercomputers in worldwide effort to beat COVID-19

A group of researchers at University College London is tapping supercomputers in the U.S. and Europe in an effort to find ways to fight COVID-19, looking for vaccines and anti-viral drugs, among other things.

The team at UCL is part of an international group called the Consortium on Coronavirus and is made up of more than 100 researchers around the world including those at UCL and eight other universities, five U.S. national laboratories, a private research center and a public academy.

Network Break 278: Palo Alto Buys SD-WAN Maker CloudGenix; Zoom Gets Called On Security, Privacy Problems

Our weekly installment of tech news analysis includes Palo Alto's $420 million purchase of CloudGenix, Zoom's very bad week for security and privacy, Cisco Live going virtual, updates on professional development opportunities, and more on Network Break from the Packet Pushers.
