Troubleshooting container connectivity issues and performance hotspots in Kubernetes clusters can be a frustrating exercise in a dynamic environment where hundreds, possibly thousands of pods are continually being created and destroyed. If you are a DevOps or platform engineer and need to troubleshoot microservices and application connectivity issues, or figure out why a service or application is performing slowly, you might use traditional packet capture methods like executing tcpdump against a container in a pod. This might allow you to achieve your task in a siloed single-developer environment, but enterprise-level troubleshooting comes with its own set of mandatory requirements and scale. You don’t want to be slowed down by these requirements, but rather address them in order to shorten the time to resolution.
Dynamic Packet Capture is a Kubernetes-native way to troubleshoot your microservices and applications quickly and efficiently without granting extra permissions. Let’s look at a specific use case to see some challenges and best practices for live troubleshooting with packet capture in a Kubernetes environment.
Let’s talk about this use case in the context of a hypothetical situation.
Your organization’s DevOps and platform teams are trying to figure out Continue reading
If you have access to the internet, it’s likely that you have already heard of the critical vulnerability in the Log4j library. A zero-day vulnerability in the Java library Log4j, tracked as CVE-2021-44228, has been disclosed by Chen Zhaojun, a security researcher in the Alibaba Cloud Security team. It’s got people worried, and with good reason.
This is a serious flaw that needs to be addressed right away, since it can result in remote code execution (RCE) in many cases. By now, I have seen many creative ways in which this can be used to infiltrate or disturb services. The right solution is to identify and patch your vulnerable Log4j installations to the fixed versions as soon as possible. If you are using Log4j, make sure you are following this page where you can find the latest news about the vulnerability.
What else should you be doing, though, for this and similar exploits? In this blog post, I’ll look at the impact of the vulnerability in a Kubernetes cluster, and share a couple of ways that you can prevent such vulnerabilities in the future.
On its own, the Log4j vulnerability Continue reading
Yes, you read that right – in the comfort of your own laptop, as in, the entire environment running inside your laptop! Why? Well, read on. It’s a bit of a long one, but there is a lot of my learning that I would like to share.
I often find that Calico Open Source users ask me about BGP, and whether they need to use it, with a little trepidation. BGP carries an air of mystique for many IT engineers, for two reasons. Firstly, before its renaissance as a data center protocol, BGP was seen to be the domain of ISPs and service provider networks. Secondly, many high-profile and high-impact Internet outages have been due to BGP misuse or misconfiguration.
The short answer to the question is that in public cloud Kubernetes deployments, it is almost never necessary to configure or use BGP to make best use of Calico Open Source. Even in on-premise Kubernetes deployments, it is only needed in certain scenarios; you shouldn’t configure BGP unless you know why you need it. It is even less common to require complex BGP setups involving route reflectors and the like.
Calico is the industry standard for Kubernetes networking and security. It offers a proven platform for your workloads across a huge range of environments, including cloud, hybrid, and on-premises.
Calico has had a high-quality, production-ready, performant, eBPF data plane option for some time!
However, although many users are deploying it in production and benefitting, we still sometimes see users who don’t know that Calico has an eBPF data plane, or who don’t feel confident deploying it.
We created the new CCO-L2-EBPF (Certified Calico Operator: eBPF) course specifically to address these points. The course will help you to understand the strengths of eBPF and when it is, or is not, the right choice. It will also help you see how easy it is to deploy the Calico eBPF data plane if you have made the choice that it is right for you and your cluster.
Last June, Tigera announced a first for Kubernetes: supporting open-source WireGuard for encrypting data in transit within your cluster. We never like to sit still, so we have been working hard on some exciting new features for this technology, the first of which is support for WireGuard on AKS using the Azure CNI.
First a short recap about what WireGuard is, and how we use it in Calico.
WireGuard is a VPN technology that has been available in the Linux kernel since version 5.6 and is positioned as an alternative to IPsec and OpenVPN. It aims to be faster, simpler, leaner, and more useful. To that end, WireGuard takes an opinionated stance on the configurability of supported ciphers and algorithms, which reduces the attack surface and improves the auditability of the technology. It is simple to configure with standard Linux networking commands, and it is only approximately 4,000 lines of code, making it easy to read, understand, and audit.
While WireGuard is a VPN technology and is typically thought of as client/server, it can be configured and used equally effectively in a peer-to-peer mesh architecture, which is how we designed our solution at Tigera to work in Kubernetes. Using Calico, Continue reading
A single Kubernetes cluster expends a small percentage of its total available assigned resources on delivering in-cluster networking. We don’t have to be satisfied with this, though—achieving the lowest possible overhead can provide significant cost savings and performance improvements if you are running network-intensive workloads. This article explores and explains the improvements that can be achieved in Microsoft Azure using Calico eBPF in AKS, including reducing CPU usage, decreasing complexity, enabling easier compliance and troubleshooting, and more.
Before going into details about how exactly Calico takes advantage of eBPF, it is important to note that in the context of this article, Calico is viewed as an additional networking layer on top of Azure CNI, providing functionality that turbocharges its performance. In particular, the standard instructions for installing Calico’s network policy engine with AKS use a version of Calico that pre-dates eBPF mode.
To show how Calico accelerates AKS network performance using eBPF, the Calico team ran a series of network performance benchmarks based on the k8s-bench-suite. These benchmarks compared the latest Calico eBPF data plane with a vanilla AKS cluster (using the iptables data plane).
Tests were run using Standard_D2s_v3 nodes, which are a Continue reading
Cloud-native transformations come with many security and troubleshooting challenges. Real-time detection and prevention of continuously evolving threats are difficult for cloud-native applications in Kubernetes. Because pods are ephemeral, it is hard to determine source or destination endpoints and limit their blast radius.
Traditional perimeter-based firewalls are not an ideal fit for Kubernetes and containers. Firewalls have traditionally been used to block attacks at the perimeter, but if the perimeter is breached, there’s no protection from within the cluster. The dynamic nature of Kubernetes requires a specialized approach to intrusion detection and prevention for containers, Kubernetes, and cloud.
Threat intelligence feeds, which record and track the IP addresses of known bad actors, are a critical part of modern cloud-native security. Calico Cloud now provides threat intelligence feeds, such as AlienVault, as part of its default security policies. This means that traffic to suspicious IPs is blocked from day one without the need for any extra configuration. Additionally, an anomaly detection dashboard in Calico’s UI shows full context, including which pods were involved, so you can analyze and remediate.
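To make the idea of a feed of known-bad addresses concrete, here is a minimal sketch of checking a destination IP against a blocklist. It is a generic illustration, not how Calico Cloud implements its feeds; the feed entries are placeholder addresses from documentation-reserved ranges, and the function names are hypothetical.

```python
import ipaddress

# Hypothetical feed entries (documentation-reserved ranges, not real threat data).
FEED = ["203.0.113.0/24", "198.51.100.7/32"]


def build_blocklist(entries):
    """Parse feed entries (single IPs or CIDR ranges) into network objects."""
    return [ipaddress.ip_network(entry, strict=False) for entry in entries]


def is_suspicious(ip, blocklist):
    """Return True if the address falls inside any network on the blocklist."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in blocklist)


blocklist = build_blocklist(FEED)
print(is_suspicious("203.0.113.42", blocklist))
print(is_suspicious("192.0.2.10", blocklist))
```

A real feed would be refreshed periodically and enforced in the data plane rather than in application code. The sketch only shows the matching step.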
Another advanced method for intrusion detection and prevention introduced in Calico Cloud is deep packet inspection (DPI). DPI inspects, Continue reading
It’s that time again; we’re really happy to announce Calico v3.21! As always, thank you to everyone who contributed to this release! For detailed release notes, please go here. Alongside the usual-but-essential bug fixes and other improvements, there are some big new improvements to be aware of:
Calico supports BGP, which is used within the cluster in some scenarios, and to allow you to integrate cluster routing with your upstream network devices. Now though, you can even view the status of your BGP sessions, including RIB / FIB contents, and agent health via the new CalicoNodeStatus API. See the API documentation for more details.
In addition, you get more granular control; you can control BGP advertisement of certain prefixes using the new disableBGPExport option on each IP pool.
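As a rough sketch of what these two resources might look like, the snippet below builds a CalicoNodeStatus request and an IP pool with disableBGPExport set, then prints them as JSON (which is also valid YAML). The resource names, node name, and CIDR are placeholders, and the field names reflect my reading of the v3 API reference, so check them against the API documentation before applying anything.

```python
import json

# Status request for one node. "my-node-status" and "node-0" are placeholder names.
node_status = {
    "apiVersion": "projectcalico.org/v3",
    "kind": "CalicoNodeStatus",
    "metadata": {"name": "my-node-status"},
    "spec": {
        "node": "node-0",
        # Which kinds of status to report: agent health, BGP sessions, routes.
        "classes": ["Agent", "BGP", "Routes"],
        "updatePeriodSeconds": 10,
    },
}

# IP pool whose prefixes should not be advertised over BGP.
# The pool name and CIDR are placeholders.
ip_pool = {
    "apiVersion": "projectcalico.org/v3",
    "kind": "IPPool",
    "metadata": {"name": "internal-only-pool"},
    "spec": {
        "cidr": "10.48.0.0/16",
        "disableBGPExport": True,
    },
}

for manifest in (node_status, ip_pool):
    print(json.dumps(manifest, indent=2))
```

The printed documents could be saved to a file and applied with your usual tooling once the fields have been verified for your Calico version.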
If you aren’t already familiar with them, the egress policy rules that can match on Kubernetes services, introduced in v3.20, are pretty transformative. However, we have improved upon them even further in two ways:
We are excited to announce the release of our O’Reilly book, Kubernetes security and observability: A holistic approach to securing containers and cloud-native applications. The book, authored by Tigera’s Brendan Creane and Amit Gupta, helps you learn how to adopt a holistic security and observability strategy for building and securing cloud-native applications running on Kubernetes.
Security practitioners are faced with a wide range of considerations when securing, observing, and troubleshooting containerized workloads on Kubernetes. These considerations range from infrastructure choices and cluster configuration to deployment controls and runtime and network security. Although securing cloud-native applications can be a daunting task, our book will give you the knowledge and confidence you’ll need to establish security and observability for your cloud-native applications.
In 11 chapters, the book covers topics relevant to containers and cloud-native applications in detail, including:
After reading the book, you’ll have gained an understanding of key concepts behind security and observability for cloud-native applications, how to determine the best strategy, and which technology choices are available to support Continue reading
With the Calico 3.10 release, Dynamic Packet Capture is available in Dynamic Service Graph.
This means users who require self-service, live troubleshooting for microservices and Kubernetes workloads can capture and evaluate traffic packets on endpoints without writing a single line of code or using any third-party troubleshooting tools. Users don’t need to know kubectl or YAML to troubleshoot their microservices and Kubernetes cluster. Calico helps enforce organizational security policies by allowing users to access only their assigned namespaces and endpoints for troubleshooting.
In most situations when you need to do a packet capture, the problem doesn’t last long and usually happens randomly. But once you narrow down the issue to a particular time or activity, you will need to set the right action plan to tackle the problem. Packet capture is now much easier, simpler, and faster than before.
Dynamic Packet Capture facilitates fast troubleshooting and easy debugging of microservice connectivity issues and performance hotspots in Kubernetes clusters. It is a Kubernetes-native custom resource that runs as part of user code against specific workloads in the cluster, without the need to execute any programs inside the cluster. Dynamic Packet Capture Continue reading
In this blog post, I will be talking about label standards and best practices for Kubernetes security. This is a common area where I see organizations struggle to define the set of labels required to meet their security requirements. My advice is to always start with a hierarchical security design that is capable of achieving your enterprise security and compliance requirements, then define your label standard in alignment with your design. This is not meant to be a comprehensive guide for all your label requirements, but rather a framework that guides you through developing your own label standard to meet your specific security requirements.
Labels are key/value pairs that are attached to Kubernetes objects to identify attributes that are intuitive for users and that are required for specific purposes, such as inventory reporting or the enforcement of an intent.
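To illustrate how key/value labels can be used to pick out a set of objects, here is a minimal sketch of equality-based label matching. The pod names, label keys, and values are hypothetical, and real Kubernetes selectors also support set-based expressions that this sketch leaves out.

```python
def matches(selector, labels):
    """Equality-based match: every selector pair must appear in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


# Hypothetical pods and label keys, for illustration only.
pods = {
    "frontend-1": {"app": "frontend", "zone": "dmz"},
    "db-1": {"app": "db", "zone": "restricted"},
    "db-2": {"app": "db", "zone": "restricted", "pci": "true"},
}

selector = {"zone": "restricted"}
selected = sorted(name for name, labels in pods.items() if matches(selector, labels))
print(selected)
```

A consistent label standard matters because a selector like this only works if every team applies the same keys and values.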
Kubernetes network policies express the intent to enforce security controls on pods, using labels to match the intended endpoints. Label prefixes can be used to identify label classification. The following short list is a high-level classification of endpoints required for developing a Kubernetes network policy design:
Labels Continue reading
October marks the five-year anniversary of Calico Open Source, the most widely adopted solution for container networking and security. Calico Open Source was born out of Project Calico, an open-source project with an active development and user community, and has grown to power 1.5M+ nodes daily across 166 countries.
When Calico was introduced five years ago, the world, and technology, were very different from what they are today. The march toward distributed applications and microservices had just begun. Today, open-source projects like Project Calico are enabling the large-scale adoption of a modern architecture that is ultimately responsible for the wholesale digital transformation that we are witnessing.
As part of our celebration, we’ve compiled a few comments from people who have worked on the project over the years.
“Calico works well out of the box. It scales well, rarely has bugs, and is feature rich. Tigera does a good job supporting its customers also.” – Network engineer
“[Calico is] the industry standard [for] networking for Kubernetes.” – Platform engineer
“The support for a lot of K8s distributions (either on-prem or cloud managed) is great with Calico.” – Platform architect
“[Calico helped us learn] about network segmentation in cloud-native environments.” Continue reading
Containers have changed how applications are developed and deployed, with Kubernetes ascending as the de facto means of orchestrating containers, speeding development, and increasing scalability. Modern application workloads with microservices and containers eventually need to communicate with other applications or services that reside on public or private clouds outside the Kubernetes cluster. However, securely controlling granular access between these environments continues to be a challenge. Without proper security controls, containers and Kubernetes become an ideal target for attackers. At this point in the Kubernetes journey, the security team will insist that workloads meet security and compliance requirements before they are allowed to connect to outside resources.
As shown in the table below, Calico Enterprise and Calico Cloud offer multiple solutions that address different access control scenarios to limit workload access between Kubernetes clusters and APIs, databases, and applications outside the cluster. Although each solution addresses a specific requirement, they are not necessarily mutually exclusive.
Your requirement | Calico’s solution | Advantages |
You want to use existing firewalls and firewall managers to enforce granular egress access control of Kubernetes workloads at the destination (outside the cluster) | Egress Access Gateway | Security teams can leverage existing investments, experience, and training in firewall infrastructure and Continue reading |
Calico Cloud is an industry-first security and observability SaaS platform for Kubernetes, containers, and cloud. Since its launch, we have seen customers use Calico Cloud to address a range of security and observability problems for regulatory and compliance requirements in a matter of days and weeks. In addition, they only paid for the services used, instead of an upfront investment commitment, thus aligning their budgets with their business needs.
We are excited to announce recent Calico Cloud enhancements. Highlights include:
Congratulations to the Kubespray team on the release of 2.17! This release brings support for two of the newer features in Calico: the eBPF data plane and WireGuard encryption.
Let’s dive into configuring Kubespray to enable these new features.
If you’re interested in getting started with Kubespray and Calico, you can refer to Using Calico with Kubespray, which covers some of the settings you might want to use, as well as how to enable Calico in several of the quick start guides.
To configure Calico options when using Kubespray to deploy a cluster, you’ll need to configure some variables. If you’re using the examples in the Kubespray repository, those files are under inventory/…/group_vars/k8s_cluster/, with the Calico options residing in k8s-net-calico.yml.
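As a hedged sketch of the kind of overrides involved, the snippet below renders two boolean settings as YAML lines that could be pasted into the Calico options file. The variable names calico_bpf_enabled and calico_wireguard_enabled are assumptions based on my recollection of the Kubespray Calico documentation, so confirm them for your Kubespray release. The helper function is hypothetical.

```python
# Assumed variable names; check them against the Kubespray Calico docs for your release.
overrides = {
    "calico_bpf_enabled": True,        # switch Calico to the eBPF data plane
    "calico_wireguard_enabled": True,  # turn on WireGuard encryption
}


def to_yaml_lines(settings):
    """Render flat boolean settings as YAML key/value lines."""
    return [f"{key}: {'true' if value else 'false'}" for key, value in settings.items()]


snippet = "\n".join(to_yaml_lines(overrides))
print(snippet)
```

Both features have their own prerequisites, such as kernel version, so treat this as a starting point rather than a complete configuration.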
Calico offers several different data planes, ensuring that end users can choose the technology that’s right for their particular use case. eBPF is a relatively new set of facilities in the Linux kernel that lets developers write code to modify its functionality at runtime in a way that is safe and efficient.
Calico’s eBPF data plane offers increased efficiency, as well as functionality like providing source IP preservation Continue reading
Observability is a staple of high-performing software and DevOps teams. Research shows that a comprehensive observability solution, along with a number of other technical practices, positively contributes to continuous delivery and service uptime.
Observability is sometimes confused with monitoring, but there is a clear difference between the two; it’s important to understand the distinction. Observability refers to a technical solution that enables teams to actively debug a system. It is based on exploring activities, properties, and patterns that are not defined in advance. Monitoring, in contrast, is a technical solution that enables teams to watch and understand the state of their systems and is based on gathering pre-defined sets of metrics or logs.
Conventional observability and monitoring tools were designed for monolithic systems, observing the health and behavior of a single application instance. Complex distributed microservices architectures, like Kubernetes, are constantly changing, with hundreds and even thousands of pods being created and destroyed within minutes. Because this environment is so dynamic, pre-defined metrics and logs aren’t effective for troubleshooting issues. Conventional observability approaches, which work well in traditional, monolithic environments, are inadequate for Kubernetes. So an observability solution that is purpose-built for a distributed microservices Continue reading
Amazon EKS Anywhere is an official Kubernetes distribution from AWS. It’s a new deployment option for Amazon EKS that allows the creation and operation of on-premises Kubernetes clusters on your existing infrastructure.
Since its general availability release, we’ve been working hard to ensure support for Calico on EKS Anywhere, and are happy to announce that users can now choose to use Calico for container networking and security. This gives organizations already using or planning to adopt EKS Anywhere the flexibility to choose the best container networking solution for their needs. Organizations currently using Calico can add EKS Anywhere clusters and use the same Calico solution for networking and security across on-premises and cloud platforms.
Let’s take a look at how you can get started with Calico on EKS Anywhere.
Notes:
Install EKS Anywhere as normal on vSphere, by following this documentation.
Removing Cilium from a cluster requires using the Cilium CLI, so Continue reading
Public cloud infrastructures and microservices are pushing the limits of resources and service delivery beyond what was imaginable until very recently. In order to keep up with the demand, network infrastructures and network technologies had to evolve as well. Software-defined networking (SDN) is the pinnacle of advancement in cloud networking; by using SDN, developers can now deliver an optimized, flexible networking experience that can adapt to the growing demands of their clients.
This article will discuss how Tigera’s new Vector Packet Processing (VPP) data plane fits into this landscape and share some benchmark details about its performance. Then it will demonstrate how to run a VPP-equipped cluster using AWS public cloud and secure it with Internet Protocol Security (IPsec).
Project Calico is an open-source networking and security solution. Although it focuses on securing Kubernetes networking, Calico can also be used with OpenStack and other workloads. Calico uses a modular data plane that allows a flexible approach to networking, providing a solution for both current and future networking needs.
VPP is an easily extensible, kernel-independent, highly optimised, and blazing-fast open-source data plane project that operates between layer 2 and layer 4 of the OSI Continue reading
Internet-facing applications are among the workloads most frequently targeted by threat actors. Securing this type of application is a must in order to protect your network, but the task is more complex in Kubernetes than in traditional environments. Not only are threats magnified in a Kubernetes environment, but internet-facing applications in Kubernetes are also more vulnerable than their counterparts in traditional environments. Let’s take a look at the reasons behind these challenges, and the steps you should take to protect your Kubernetes workloads.
One of the fundamental challenges in a Kubernetes environment is that there is no finite set of methods by which workloads can be attacked. This means there are a multitude of ways an internet-facing application could be compromised, and a multitude of ways that such an attack could propagate within the environment.
Kubernetes is designed so that anything inside a cluster can communicate with anything else inside the cluster by default, essentially giving an attacker who manages to gain a foothold unlimited access and a large attack surface. Because of this design, any time you have Continue reading
This post will highlight and explain the importance of a pluggable data plane. But in order to do so, we first need an analogy. It’s time to talk about a brick garden wall!
Imagine you have been asked to repair a brick garden wall, because one brick has cracked through in the summer sun. You have the equipment you need, so the size of the job will depend to a great extent on how easily the brick can be removed from the wall without interfering with all the ones around it. Good luck.
Now that we have that wonderful imagery in mind, let’s look at how to go about designing walls — and how they can be maintained.
“Coupling” is the term used to describe the interdependence between pieces of software. Closely coupled systems are interdependent and difficult to separate; loosely coupled systems are more like building blocks designed to work together, but they come apart cleanly. So, since the bricks in our garden wall are closely coupled (in this case, by cement), attempting to remove just one creates difficult challenges.
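A small sketch may help make loose coupling concrete. Here a node depends only on an interface, so one implementation can be swapped for another without touching the node itself. All class names are hypothetical and chosen to echo the pluggable data plane theme. This is not how Calico is structured internally.

```python
from typing import Protocol


class DataPlane(Protocol):
    """The interface every pluggable implementation must satisfy."""

    def forward(self, packet: str) -> str: ...


class IptablesPlane:
    def forward(self, packet: str) -> str:
        return f"iptables:{packet}"


class EbpfPlane:
    def forward(self, packet: str) -> str:
        return f"ebpf:{packet}"


class Node:
    """Depends only on the DataPlane interface, not on a concrete class."""

    def __init__(self, plane: DataPlane) -> None:
        self.plane = plane

    def send(self, packet: str) -> str:
        return self.plane.forward(packet)


node = Node(IptablesPlane())
print(node.send("hello"))
node.plane = EbpfPlane()  # swap one "brick" without rebuilding the wall
print(node.send("hello"))
```

In a closely coupled design, Node would call iptables-specific code directly, and replacing it would mean editing every place that call appears.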
We can think of software as being built in “walls,” Continue reading