Reza Ramezanpour

Author Archives: Reza Ramezanpour

Native Kubernetes cluster mesh with Calico

As Kubernetes continues to gain traction in the cloud-native ecosystem, the need for robust, scalable, and highly available cluster deployments has become more noticeable.

While a Kubernetes cluster can easily expand via additional nodes, the downside of such an approach is that you might have to spend a lot of time troubleshooting the underlying networking or managing and updating resources between clusters. On top of that, a multi-regional scenario or hybrid-cloud environment might be off limits, depending on the limitations that a cloud provider or your Kubernetes distro imposes on your environment.

Calico Enterprise cluster mesh is a suite of features native to Kubernetes with a multi-layer design that connects two or more Kubernetes clusters and seamlessly shares resources between them. This post will explore cluster mesh, its benefits, and how it can enhance your Kubernetes environment.

Projects that provide cluster mesh

Multiple projects offer cluster mesh, and while they are all similar in basic principles, each has a different take on implementing this solution in an environment.

The following table is a brief overview of notable projects that offer cluster mesh:

Calico Open Source Calico Enterprise Cilium Calico Enterprise Submariner
Encapsulation IPIP Direct Continue reading

Kubernetes network policies: 4 pain points and how to address them

Kubernetes is used everywhere, from test environments to the most critical production foundations that we use daily, making it undoubtedly a de facto standard in cloud computing. While this is great news for everyone who works with, administers, and expands Kubernetes, the downside is that it makes Kubernetes a favorable target for malicious actors.

Malicious actors typically exploit flaws in the system to gain access to a portion of the environment. They then chain these flaws together to move laterally within the environment, ultimately seeking root access or access to critical information.

While the best way to fix security flaws in any software is to patch it with appropriate fixes that the project maintainers publish, there are certain security practices that you can adopt to fortify your environment, like using network policies. However, most people find network policies complex and overwhelming, which discourages them from implementing policies in their environment.

In this blog post, we will examine four pain points that people face when they want to implement network policies and provide solutions to help you effectively secure your Kubernetes environment.

What is a network policy and why should I use it?

In Kubernetes, a network policy (KNP) resource is the Continue reading

What is new in Calico 3.28

TL;DR

  • A new Grafana dashboard that helps you monitor Calico Typha’s performance and troubleshoot issues.
  • Calico eBPF dataplane IPv6 is now GA. It supports true IPv6-only clusters as well as dual-stack clusters. 🐝
  • Optional Pod startup delay to ensure networking is up in high-churn scenarios.
  • Tigera operator now supports multiple IP pools, IP pool modification, affinity for operator pods, priorityclassname, and more!
  • Improved policy performance in both eBPF and iptables.
  • Calico now ships with a pprof server. Activate the performance server for real-time views of Typha and Felix components and real-time debugging.

🚨 Important changes 🚨

Calico 3.28 now enables VXLAN checksum offload by default in environments running kernel version 5.8 or above. In the past, offloading was disabled due to kernel bugs.

Please keep in mind that if you are upgrading to 3.28, this change will take effect after the nodes restart.

If you encounter unexpected performance issues, you can revert to the previous behavior with the following command:

kubectl patch felixconfiguration default --type="merge" -p='{"spec":{"FeatureDetectOverride":"ChecksumOffloadBroken=true"}}'
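
Hand-typed JSON inside a shell one-liner is easy to unbalance. As a minimal sketch (not part of the Calico release notes), the patch body can be generated and validated before it is handed to kubectl. The helper name `build_felix_patch` is hypothetical; the field name and value are copied from the command above.

```python
import json


def build_felix_patch(field: str, value: str) -> str:
    """Return a compact JSON merge-patch body of the form {"spec": {field: value}}."""
    body = json.dumps({"spec": {field: value}}, separators=(",", ":"))
    # Round-trip through the parser so a malformed body fails here, not in kubectl.
    json.loads(body)
    return body


# Field name and value taken from the command in the post.
patch = build_felix_patch("FeatureDetectOverride", "ChecksumOffloadBroken=true")
print(f"kubectl patch felixconfiguration default --type=merge -p='{patch}'")
```

The script only prints the command; running it against a cluster is left to the reader.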

Please keep in mind that you can report any issues via GitHub tickets or Slack and include a detailed description of the environment (NIC hardware, kernel, distro, Continue reading

Amazon EKS networking options

When setting up a Kubernetes environment with Amazon Elastic Kubernetes Service (EKS), it is crucial to understand your available networking options. EKS offers a range of networking choices that allow you to build a highly available and scalable cloud environment for your workloads.

In this blog post, we will explore the networking and policy enforcement options provided by AWS for Amazon EKS. By the end, you will have a clear understanding of the different networking options, network policy enforcement engines, and other features that can help you create a functional and secure platform for your Kubernetes workloads and services.

Amazon Elastic Kubernetes Service (EKS)

Amazon Elastic Kubernetes Service (EKS) is a managed Kubernetes service that simplifies routine operations, such as cluster deployment and maintenance, by automating tasks such as patching and updating operating systems and their underlying components. EKS enhances scalability through AWS Auto Scaling groups and other AWS service integrations and offers a highly available control plane to manage your cluster.

Amazon EKS in the cloud has two options:

  • Managed
  • Self-managed

Managed clusters rely on the AWS control plane node, which AWS hosts and controls separately from your cluster. This node operates in isolation and cannot be directly Continue reading

Exploring AKS networking options

At KubeCon 2023 in Amsterdam, Azure made several exciting announcements and introduced a range of updates and new options to Azure-CNI (Azure Container Networking Interface). These changes will help Azure Kubernetes Service (AKS) users solve some of the pain points they faced in previous iterations of Azure-CNI, such as IP exhaustion and large cluster deployments with custom IP address management (IPAM). On top of that, with this announcement Microsoft officially added an additional dataplane to the Azure platform.

The big picture

Worker nodes in an AKS (Azure Kubernetes Service) cluster are Azure VMs pre-configured with a version of Kubernetes that has been tested and certified by Azure. These clusters communicate with other Azure resources and external sources (including the internet) via the Azure virtual network (VNet).

Now, let’s delve into the role of the dataplane within this context. Dataplane operations take place within each Kubernetes node. The dataplane is responsible for handling the communication between your workloads and cluster resources. By default, an AKS cluster is configured to utilize the Azure dataplane, which Continue reading

Turbocharging host workloads with Calico eBPF and XDP

In Linux, network-based applications rely on the kernel’s networking stack to establish communication with other systems. While this process is generally efficient and has been optimized over the years, in some cases it can create unnecessary overhead that can impact the overall performance of the system for network-intensive workloads such as web servers and databases.

XDP (eXpress Data Path) is an eBPF-based high-performance datapath inside the Linux kernel that allows you to bypass the kernel’s networking stack and directly handle packets at the network driver level. XDP achieves this by executing a custom program to handle packets as they are received by the kernel. This can greatly reduce overhead, improve overall system performance, and speed up network-based applications by shortcutting the normal networking path of ordinary traffic. However, using raw XDP can be challenging due to its programming complexity and the steep learning curve involved. Solutions like Calico Open Source offer an easier way to tame these technologies.

Calico Open Source is a networking and security solution that seamlessly integrates with Kubernetes and other cloud orchestration platforms. While well known for its policy engine and security capabilities, there are many other features that can be used in an environment by installing Continue reading

What’s new in Calico v3.26

We are excited to announce the release of Calico v3.26! This latest milestone brings a range of enhancements and new features to the Calico ecosystem, delivering an optimized and secure networking solution. This release has a strong emphasis on product performance, with strengthened security measures, expanded compatibility with Windows Server 2022 and OpenStack Yoga, and notable improvements to the Calico eBPF dataplane.

As always, let’s begin by thanking our awesome community members who helped us in this release.

Community shoutout

Big thanks to our GitHub users afshin-deriv, blue-troy, and winstonu for their valuable contributions in enhancing the Kind installation and VXLAN documentation, as well as improving the code comments.

Additionally, we would like to extend our appreciation to laibe and yankay for their efforts in updating the flannel version and improving the iptables detection mechanism. Their contributions have been instrumental in improving the overall functionality and reliability of our project.

Finally, a huge thank-you to dilyevsky, detailyang, mayurjadhavibm, and olljanat for going above and beyond in pushing Calico beyond its original scope and for generously sharing their solutions with the rest of the community.

Community-driven enhancement request: Fine-grained BGP route control

The primary responsibility Continue reading

Monitoring Kubernetes clusters activity with Azure Managed Grafana and Calico

Cloud computing revolutionized how a business can establish its digital presence. Nowadays, by leveraging cloud features such as scalability, elasticity, and convenience, businesses can deploy, grow, or test an environment in every corner of the world without worrying about building the required infrastructure.

Unlike the traditional model, which was based on notifying the service provider to set up the resources for customers in advance, in an on-demand model, cloud providers implement application programming interfaces (APIs) that customers can use to deploy resources on demand. This allows the customer to access an unlimited amount of resources on demand and only pay for the resources they use without worrying about the infrastructure setup and deployment complexities.

For example, a load balancer service resource is usually used to expose an endpoint to your clients. Since a cloud provider’s bandwidth might be higher than what your cluster of choice can handle, a huge spike or unplanned growth might cause some issues for your cluster and render your services unresponsive.

To solve this issue, you can utilize the power of proactive monitoring and metrics to find usage patterns and get insight into your system’s overall health and performance.

In this hands-on tutorial, I will Continue reading

Calico’s 3.26.0 update unlocks high density vertical scaling in Kubernetes

Kubernetes is a highly popular and widely used container orchestration platform designed to deploy and manage containerized applications at scale, with strong horizontal scaling capabilities that can support up to 5,000 nodes; the only limit in adding nodes to your cluster is your budget. However, its vertical scaling is restricted by its default configurations, with a cap of 110 pods per node. To maximize the use of hardware resources and minimize the need for costly horizontal scaling, users can adjust the kubelet maximum pod configuration to increase this limit, allowing more pods to run concurrently on a single node.
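
To get a feel for what raising the pod cap means for addressing, here is a rough sketch. It assumes Calico IPAM's default IPv4 block size of /26 (64 addresses per block) and one address per pod; the function name and the 250- and 500-pod figures are illustrative only, not values taken from the post.

```python
import math


def blocks_needed(max_pods: int, block_prefix: int = 26) -> int:
    """Estimate how many IPv4 IPAM blocks a node needs to address max_pods pods."""
    addresses_per_block = 2 ** (32 - block_prefix)  # /26 -> 64 addresses
    return math.ceil(max_pods / addresses_per_block)


# 110 is the default kubelet cap mentioned above; 250 and 500 are hypothetical.
for pods in (110, 250, 500):
    print(pods, "pods ->", blocks_needed(pods), "block(s)")
```

Under these assumptions, the default cap of 110 pods fits in two blocks, while a node raised to 500 pods would need eight.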

To avoid network performance issues and achieve efficient horizontal scaling in a Kubernetes cluster that is tasked to run a large number of pods, high-speed links and switches are essential. A reliable and flexible Software Defined Networking (SDN) solution, such as Calico, is also important for managing network traffic efficiently. Calico has been tested and proven by numerous companies for horizontal scaling, but in this post, we will discuss recent improvements that help vertical scaling of containerized applications just work.

For example, the following chart illustrates the efficiency achieved with the improvements of vertical scaling in Calico 3. Continue reading

Navigating the security challenges of multi-tenancy in a cloud environment

Multi-tenancy can maximize the number of resources that are utilized in a cluster by sharing these resources between different groups, teams, or customers. However, boundaries must be placed to avoid problems associated with resource-sharing. On top of that, in a multi-tenant cluster, the number of security policies might gradually grow to the point where a slight misconfiguration could cause major security problems, performance issues, and service disruptions.

In this blog post, we will focus on multi-tenancy issues such as bandwidth shortage, security policy scaling, and privacy impacts, and suggest a few solutions that you can deploy to solve them in your environment. We will also look at how an eBPF-based security design can offer better performance and help you navigate the complex multi-tenant environment with ease.

What is multi-tenancy?

Technologies such as virtualization, containerization, or any other technologies that allow a range of different workloads to share the underlying hardware resources, all have a common goal—allocate resources as efficiently as possible and make the most of the available hardware. However, it is common for workloads that are running in such an environment to not fully utilize all the potential power that the hardware can offer, and in many cases, leave a Continue reading

Kubernetes network monitoring: What is it, and why do you need it?

In this article, we will dive into Kubernetes network monitoring and metrics, examining these concepts in detail and exploring how metrics in an application can be transformed into tangible, human-readable reports. The article will also include a step-by-step tutorial on how to enable Calico’s integration with Prometheus, a free and open-source CNCF project created for monitoring the cloud. By the end of the article, you will be able to create customized reports and graphical dashboards from the metrics that Calico publishes to get better insight into the inner workings of your cluster and its various components. In addition, you will have the fundamental knowledge of how these pieces can fit together to establish Kubernetes network monitoring for any environment.
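
As a small illustration of turning raw metrics into something readable, the sketch below parses a few lines in the Prometheus text exposition format. The metric names and values in `SAMPLE` are made up for illustration and are not actual Calico metrics; the parser handles only simple samples without timestamps.

```python
def parse_metrics(text: str) -> dict:
    """Parse simple samples (no timestamps) from Prometheus text exposition format."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        # The value is the last whitespace-separated token on the line.
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


# Hypothetical scrape output; names and values are invented for this example.
SAMPLE = """\
# HELP example_active_endpoints Number of active endpoints on this host.
# TYPE example_active_endpoints gauge
example_active_endpoints 12
# TYPE example_policy_updates_total counter
example_policy_updates_total 345
"""

for metric, value in parse_metrics(SAMPLE).items():
    print(f"{metric}: {value:g}")
```

In practice a Prometheus server does this scraping and parsing for you; the sketch only shows how little structure sits between a metrics endpoint and a report.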

Background

The benefits offered by cloud computing and infrastructure as code, including scalability, easy distribution, and quick and flexible deployment, have caused cloud service adoption to skyrocket. But this rapid adoption requires checks and balances to ensure that cloud services are secure and running in their desired state. Furthermore, any security events and problems should be logged and reported for future examination.

Read our guide on Kubernetes logging: Approaches and best practices

In the past, traditional monitoring solutions such as Nagios Continue reading

Turbocharging Host Workloads with Calico eBPF and XDP

In Linux, network-based applications rely on the kernel’s networking stack to establish communication with other systems. While this process is generally efficient and has been optimized over the years, in some cases it can create unnecessary overhead that can affect the overall performance of the system for network-intensive workloads such as web servers and databases. Solutions like Calico Open Source offer an easier way to tame technologies such as eBPF and XDP. Calico Open Source is a networking and security solution that seamlessly integrates with Kubernetes and other cloud orchestration platforms. While well known for its policy engine and security capabilities, there are many other features that can be used in an environment by installing Continue reading

What’s new in Calico v3.25

We’ve just released Calico v3.25! This milestone release includes a number of eBPF dataplane improvements designed to deliver an even faster upgrade experience, a smaller memory footprint, and shorter eBPF networking object load times.

But before we get into the details of these changes, let’s welcome and thank our new community problem-solvers who got their first contribution requests merged into our beloved project.

Community shoutout

Documentation is the most essential part of any project since that is the go-to place for everyone to get a better idea about the capabilities or deployment of that project. So let’s start by giving a big shout-out to @cavcrosby, @Congrool, @chenbojian, and @gopihc for their attention to detail and fixing issues in the project documentation.

Shoutout to @OrvilleQ and @masap for extending the exclusion list of interfaces to make the automatic interface selection of Calico even faster.

Shoutout to @gregwhorley, @dlipovetsky, @nickperry, and @tamcore for their updates to `tigera-operator` that will make the installation and maintenance experience of Calico even better.

Shoutout to @ramanujadasu for enhancing the logic behind the unicast IP address hashing.

Shoutout to @chrisjohnson00 and @vitaliy-leschenko for enhancing the Calico Windows installer and adding Continue reading

Securing Windows workloads

Containers are a great way to package applications with minimal libraries required. They guarantee that you will have the same deployment experience, regardless of where the containers are deployed. Container orchestration software pushes this further by preparing the necessary foundation to create containers at scale.

Linux and Windows support containerized applications and can participate in a container orchestration solution. There is an incredible number of guides and how-to articles on Linux containers and container orchestration, but these resources get scarce when it comes to Windows, which can discourage companies from running Windows workloads.

This blog post will examine how to set up a Windows-based Kubernetes environment to run Windows workloads and secure them using Calico Open Source. By the end of this post, you will see how simple it is to apply your current Kubernetes skills and knowledge to rule a hybrid environment.

Windows containers

A container is similar to a lightweight packaging technique. Each container packages an application in an isolated environment that shares its kernel with the underlying host, making it bound by the limits of the host operating system. These days, everyone is familiar with Linux containers, a popular way to run Linux-based binary files in an Continue reading

How to build a service mesh with Istio and Calico

Microservices are loosely coupled software that provides flexibility and scalability to a cloud environment. However, securing this open architecture from vulnerabilities and malicious actors can be challenging without a service mesh.

This blog post will demonstrate how you can create an Istio and Calico integration to establish a service mesh that will manipulate HTTP traffic in the application layer. This Istio-Calico integration provides a unified way to write security policies interacting with applications and implement restrictions without disturbing the entire system.

What’s a service mesh?

A service mesh is a software layer that sits between the microservices that form your workload. After deploying and enabling a service mesh system for your workloads, an injector will add a sidecar container to each. These sidecars then collect and manipulate information via the rules you provide, allowing you to secure your cluster on an application level without requiring any change inside your software.

Without a service mesh, to ensure communication integrity and confidentiality between workloads, you must modify each to embed encryption methods. On top of that, gathering insight into the events that are happening in the application layer will require modifying the workload application itself, which all requires a good amount of Continue reading

Getting started with EKS and Calico

Cloud-native applications offer a lot of flexibility and scalability, but to leverage these advantages, we must create and deploy a suitable environment that will enable cloud-native applications to work their magic.

Managed services, self-managed services, and bare metal are three primary categories of Kubernetes deployment in a cloud environment. Our focus in this article will be on Amazon Web Services’ (AWS) managed Kubernetes service, Elastic Kubernetes Service (EKS), and the capabilities that Calico Open Source adds to the EKS platform.

Managed services

A managed cluster is a quick and easy way to deploy an enterprise-grade Kubernetes cluster. In a managed cluster, mundane operations such as provisioning new nodes, upgrading the OS/Kubernetes, and scaling resources are transferred to the cloud provider, which allows you to expand your application with ease.

EKS is a managed service by AWS that offers a fault-tolerant Kubernetes control plane endpoint and automates the worker node maintenance and deployment process.

Comparing popular CNI options in EKS

Most popular managed services, such as EKS, come with an official CNI that offers networking and other features for your cluster. While these CNIs are highly integrated with the underlying system, they can introduce some limitations. To remedy these limitations and unlock the Continue reading

What is new in Calico v3.24

A couple of weeks ago, Tigera engineers released the new version of Calico, as part of a community effort to drive cloud security and networking even further. But before I begin diving into the details of this new release, I want to first spotlight a few of our community members who have merged their contributions to Calico Open Source for the first time.

Shout out to @agaffney for adding configurable labels and annotations to the tigera-operator deployment in Helm charts.

Shout out to @backjo for improving the Calico Windows installation script and adding support for IMDSv2 in AWS EC2 data retrieval.

Shout out to @EugenMayer for pointing out an improvement for the calicoctl binary in a Helm chart installation and @lou-lan for making it happen.

Shout out to @joskuijpers for informing the community about the outdated ipset package in the calico-node ARM64 image and @ScOut3R for updating it.

Shout out to @juanfresia for contributing changes to enable Calico to run without programming the route table, useful when integrating with other routing mechanisms.

Shout out to @muff1nman, who added WireGuard traffic to the Calico failsafe ports, allowing us to confidently apply network security policies without worrying about accidentally cutting off Continue reading

What is eBPF and what are its use cases

With the recent advancements in service delivery through containers, Linux has gained a lot of popularity in cloud computing by enabling digital businesses to expand easily regardless of their size or budget. These advancements have also brought a new wave of attacks, which is challenging to address with the same tools we have been using for non-cloud-native environments. eBPF offers a new way to interact with the Linux kernel, allowing us to reexamine the possibilities that once were difficult to achieve.

In this post, I will go through a brief history of the steps that eBPF had to take to become the Swiss army knife inside the Linux kernel and point out how it can be used to achieve security in a cloud-native environment. I will also share my understanding of what happens inside the kernel that prevents BPF programs from wreaking havoc on your operating system.

BPF history

In the early days of computing, Unix was a popular solution for capturing network traffic, and using the CMU/Stanford Packet Filter (CSPF) to capture packets on a 64KB PDP-11 was gaining popularity by the second. Without a doubt, this was pioneering work and a leap forward for its time, but like Continue reading

A practical guide to container networking

An important part of any Kubernetes cluster is the underlying containers. Containers are the workloads that your business relies on, what your customers engage with, and what shapes your networking infrastructure. Long story short, containers are arguably the soul of any containerized environment.

One of the most popular open-source container orchestration systems, Kubernetes, has a modular architecture. On its own, Kubernetes is a sophisticated orchestrator that helps you manage multiple projects in order to deliver highly available, scalable, and automated deployment solutions. But to do so, it relies on having a suite of underlying container orchestration tools.

This blog post focuses on containers and container networking. Throughout this post, you will find information on what a container is, how you can create one, what a namespace means, and what the mechanisms are that allow Kubernetes to limit resources for a container.

Containers

A container is an isolated environment used to run an application. By utilizing the cgroup, namespace, and filesystem features of the Linux kernel, containers can be allocated a limited amount of resources and their own filesystems inside isolated environments.
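
One way to see namespaces in practice is to look at the ones the current process belongs to. The following is a minimal sketch that assumes a Linux host with `/proc` mounted; the function name `list_namespaces` is hypothetical, and on other systems it simply returns an empty result.

```python
import os


def list_namespaces(pid: str = "self") -> dict:
    """Map each namespace type (net, pid, mnt, ...) to its kernel identifier."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    if not os.path.isdir(ns_dir):  # not Linux, or /proc is not mounted
        return result
    for name in sorted(os.listdir(ns_dir)):
        try:
            # Each entry is a symlink whose target names the namespace instance.
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue  # entry not readable for this process
    return result


for ns_type, ident in list_namespaces().items():
    print(f"{ns_type:20} {ident}")
```

Two processes that print the same identifier for a given type share that namespace; a containerized process will typically show different identifiers from the host for at least some of them.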

Note: Some applications deliver containers that use other technologies. In this post, I will focus on these Continue reading

How to maximize K3s resource efficiency using Calico’s eBPF data plane

Amazon’s custom-built Graviton processor allows users to create ARM instances in the AWS public cloud, and Rancher K3s is an excellent way to run Kubernetes in these instances. By allowing a lightweight implementation of Kubernetes optimized for ARM with a single binary, K3s simplifies the cluster initialization process down to executing a simple command.

In an earlier article, I discussed how ARM architecture is becoming a rival to x86 in cloud computing, and steps that can be taken to leverage this situation and be prepared for this new era. Following the same narrative, in this article I’ll look at an example of the Calico eBPF data plane running on AWS, using Terraform to bootstrap our install to AWS, and Rancher K3s to deploy the cluster.

A few changes to Calico are needed for ARM compatibility, including updating parts, enabling eBPF, and compiling operators for the ARM64 environment:

  • Tigera Operator: Tigera Operator is the recommended way to install Calico.
  • go-build: go-build is a container environment packed with all the utilities that Calico requires in its compilation process.
  • Calico-node: Calico-node is the pod that hosts Felix (i.e. it is the brain that carries control plane decisions to Continue reading