
Author Archives: Reza Ramezanpour

What’s New in Calico v3.31: eBPF, NFTables, and More

We’re excited to announce the release of Calico v3.31 🎉, which brings a wave of new features and improvements.

For a quick look, here are the key updates and improvements in this release:

How to Deploy Whisker and Goldmane in Manifest Only Calico Setups

Your Step-by-Step Guide to Observability Without the Operator

If you’re running Calico using manifests, you may have found that enabling the observability features introduced in version 3.30, including Whisker and Goldmane, requires a more hands-on approach. Earlier documentation focused on the Tigera operator, which automates key tasks such as certificate management and secure service configuration. In a manifest-based setup, these responsibilities shift to the user. While the process involves more manual steps, it provides greater transparency and control over each component. With the right guidance, setting up these observability tools is entirely achievable and offers valuable insight into the internal behavior of your Calico deployment.

We’ve heard from many of you in the Calico Slack community: you’re eager to try out Whisker and Goldmane but aren’t sure how to set them up without Helm or the operator. For anyone who’s up for a challenge, this blog post provides a step-by-step guide on how to get everything wired up the hard way.
However, even if you already use the operator, keep reading! We’re going to pull back the curtain on the magic it performs behind the scenes. Understanding these mechanics will help you troubleshoot, customize, and better appreciate a managed Continue reading
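
To make the certificate piece concrete, here is a rough, hedged sketch of how a TLS key pair can be supplied to a component such as Goldmane in a manifest-based setup. The secret name shown is an illustrative placeholder, not necessarily the name the Calico manifests reference; check the component’s volume mounts for the exact secret it expects.

# Generic kubernetes.io/tls Secret; the name is a hypothetical placeholder
apiVersion: v1
kind: Secret
metadata:
  name: goldmane-tls            # hypothetical; use the name your manifests mount
  namespace: calico-system
type: kubernetes.io/tls
data:
  tls.crt: <base64-encoded certificate>
  tls.key: <base64-encoded private key>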

Kubernetes Observability: Your Q&A Guide to Calico Whisker

Calico Whisker vs. Traditional Observability: Why Context Matters in Kubernetes Networking

Are you tired of digging through cryptic logs to understand your Kubernetes network? In today’s fast-paced cloud environments, clear, real-time visibility isn’t a luxury, it’s a necessity. Traditional logging and metrics often fall short, leaving you without the context needed to troubleshoot effectively.

That’s precisely what Calico Whisker’s recent launch (with Calico v3.30) aims to solve. This tool provides clarity where logs alone fall short. In the sections below, you’ll get a practical overview of how it works and how it fits into modern Kubernetes networking and security workflows.

If you’re relying on logs for network observability, you’re not alone. While this approach can provide some insights, it’s often a manual, resource-intensive process that puts significant load on your distributed systems. It’s simply not a cloud-native solution for real-time insights.

So are we doomed? No. Calico Whisker transforms network observability from a chore into a superpower.

What is Calico Whisker?

Calico Whisker is a free, lightweight, Kubernetes-native observability user interface (UI) created by Tigera and introduced with Calico Open Source v3.30. It’s designed to give you a simple yet powerful window into your cluster’s network traffic, helping you understand network flows and evaluate policy behavior in real-time.

In Continue reading

How 1&1 Mail & Media Scaled Kubernetes Networking with eBPF and Calico

“We started in 2017 with Calico and never regretted it!”
—Stefan Fudeus, Product Owner/Lead Architect, 1&1 Mail & Media

Challenge

1&1 Mail & Media, part of the IONOS group, powers popular European internet brands including GMX and Web.de, serving more than 50% of Germany’s population with critical identity and email infrastructure. With roughly 45 to 50 million users, network reliability is non-negotiable. Any downtime could affect millions.

By 2022, the company had containerized 80% of its workloads on Kubernetes across three self-managed data centers. While the platform, backed by bare metal nodes and custom network layers, was highly scalable, network throughput bottlenecks began to emerge. Pods were limited to 2.5 Gbps of bandwidth due to IP encapsulation overhead, despite 10 Gbps network interfaces.

The team needed a solution that:

  • Improved pod-to-pod network performance
  • Maintained strong network policy isolation across up to 40 tenants per cluster
  • Scaled to millions of network connections and 1.4 million HTTP requests per second

Solution

1&1 Mail & Media had adopted Calico back in 2017, largely for its unique Kubernetes NetworkPolicy standard support. As their Kubernetes platform evolved, with clusters scaling to 300 bare metal nodes, 16,000 pods, and over 4 million Continue reading

Top 5 Kubernetes Network Issues You Can Catch Early with Calico Whisker

Kubernetes networking is deceptively simple on the surface, until it breaks, silently leaks data, or opens the door to a full-cluster compromise. As modern workloads become more distributed and ephemeral, traditional logging and metrics just can’t keep up with the complexity of cloud-native traffic flows.

That’s where Calico Whisker comes in. Whisker is a lightweight, Kubernetes-native observability tool created by Tigera. It offers deep insights into real-time traffic flow patterns without requiring you to deploy heavyweight service meshes or packet sniffers. And here’s something you won’t get anywhere else: Whisker is data plane-agnostic. Whether you run the Calico eBPF data plane, nftables, or iptables, you’ll get the same high-fidelity flow logs with consistent fields, format, and visibility. You don’t have to change your data plane; Whisker fits right in and shows you the truth, everywhere.

Let’s walk through 5 network issues Whisker helps you catch early, before they turn into outages or security incidents.

1. Policy Misconfigurations

Traditional observability tools often show whether a packet was forwarded, accepted or dropped, but not why. They lack visibility into which Kubernetes network policy was responsible or if one was even applied.

With Whisker, each network flow is paired with:

Kubernetes Is Powerful, But Not Secure (at least not by default)

Kubernetes has transformed how we deploy and manage applications. It gives us the ability to spin up a virtual data center in minutes, scaling infrastructure with ease. But with great power comes great complexity, and in the case of Kubernetes, that complexity is security.

By default, Kubernetes permits all traffic between workloads in a cluster. This “allow by default” stance is convenient during development and testing, but it’s dangerous in production. It’s up to DevOps, DevSecOps, and cloud platform teams to lock things down.

To improve the security posture of a Kubernetes cluster, we can use microsegmentation, a practice that limits each workload’s network reach so it can only talk to the specific resources it needs. This is an essential security method in today’s cloud-native environments.
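
As a minimal sketch of where microsegmentation usually starts, the following standard Kubernetes NetworkPolicy flips a namespace from “allow by default” to “deny by default”; every permitted flow then has to be added back explicitly (the namespace name is illustrative):

# Deny all ingress and egress for every pod in the namespace unless
# another policy explicitly allows the traffic.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: my-app             # illustrative namespace
spec:
  podSelector: {}               # empty selector = all pods in the namespace
  policyTypes:
    - Ingress
    - Egress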

Why Is Microsegmentation So Hard?

We all understand that network policies can achieve microsegmentation; in other words, they can divide our Kubernetes network model into isolated pieces. This is important since Kubernetes is usually used to provide multiple teams with their infrastructure needs or to host workloads for different tenants. With that in mind, you would think network policies are first-class citizens of clusters. However, when we dig into implementing them, three operational challenges Continue reading

Dry Run Your Kubernetes network policies with Calico staged network policies

Kubernetes Network Policies (KNP) are powerful resources that help secure and isolate workloads in a cluster. By defining what traffic is allowed to and from specific pods, KNPs provide the foundation for zero-trust networking and least-privilege access in cloud-native environments.

But there’s a problem: KNPs are risky, and applying them without a clear game plan can be disruptive.

Without deep insight into existing traffic flows, applying a restrictive policy can instantly break connectivity, killing live workloads, user sessions, or critical app dependencies. An even scarier scenario is when we implement policies that we think cover everything, and workloads do keep working, but after a restart or scaling operation we hit new problems. Kubernetes, with all of its features, has no built-in “dry run” mode for policies and no first-class observability to show what would be blocked or allowed, which is the right call, since Kubernetes is an orchestrator, not an implementer.

This forces platform teams into a difficult choice: deploy permissive (or no) policies and weaken security, or risk service disruption while debugging restrictive ones. As a result, many teams delay implementing network policies entirely, only to regret it after a zero-day exploit like Log4Shell, the XZ backdoor, or other vulnerabilities Continue reading
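
To show what staging looks like in practice, here is a minimal, hedged sketch of a Calico StagedNetworkPolicy. It uses the same schema as a regular Calico NetworkPolicy, but instead of enforcing the rules it only reports what would have been allowed or denied, so you can validate it against live traffic first (the labels, namespace, and port are illustrative):

# Staged policy: evaluated and logged, but not enforced.
apiVersion: projectcalico.org/v3
kind: StagedNetworkPolicy
metadata:
  name: stage-restrict-backend
  namespace: my-app                    # illustrative namespace
spec:
  selector: app == 'backend'           # pods the policy would apply to
  types:
    - Ingress
  ingress:
    - action: Allow
      protocol: TCP
      source:
        selector: app == 'frontend'    # only the frontend may connect
      destination:
        ports:
          - 8080

Once the flow logs confirm nothing unexpected would be denied, the resource can typically be promoted to an enforced policy by switching the kind while keeping the same spec.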

Securing Kubernetes Traffic with Calico Ingress Gateway

If you’ve managed traffic in Kubernetes, you’ve likely navigated the world of Ingress controllers. For years, Ingress has been the standard way of getting our HTTP/S services exposed. But let’s be honest, it often felt like a compromise. We wrestled with controller-specific annotations to unlock critical features, blurred the lines between infrastructure and application concerns, and sometimes wished for richer protocol support or a more standardized approach. This “pile of vendor annotations,” while functional, highlighted the limitations of a standard that struggled to keep pace with the complex demands of modern, multi-team environments. The Ingress model was stretched well beyond what it was originally designed for, and over time that led to portability issues, inconsistent behaviour, and real security vulnerabilities.

Ingress NGINX Retirement: Why This Matters Now

The Kubernetes Security Response Committee recently announced the retirement of Ingress NGINX, with support ending in March 2026. This decision reinforces the exact challenges the community has been raising for years. The same flexibility that made it popular early on, especially features like snippet-based configuration, became a major source of technical debt, vendor lock-in, and security exposure.

After the retirement date, Ingress NGINX will no longer receive security updates or bug fixes. Running Continue reading
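
To ground this, here is a hedged sketch of what exposing a service through a Gateway API gateway looks like. The gatewayClassName below is a hypothetical placeholder; use whatever class your Calico Ingress Gateway installation registers.

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: web-gateway
  namespace: default
spec:
  gatewayClassName: tigera-gateway-class   # hypothetical class name
  listeners:
    - name: https
      protocol: HTTPS
      port: 443
      tls:
        mode: Terminate
        certificateRefs:
          - name: web-tls                  # Secret holding the serving certificate

HTTPRoute resources can then attach to this Gateway, keeping infrastructure (the Gateway) and application routing (the routes) cleanly separated between teams.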

Is It Time to Migrate? A Practical Look at Kubernetes Ingress vs. Gateway API

If you’ve managed traffic in Kubernetes, you’ve likely worked with Ingress controllers. For years, Ingress has been the standard way to expose HTTP and HTTPS services. But in practice, it often came with trade-offs. Controller-specific annotations were required to unlock critical features, the line between infrastructure and application responsibilities was unclear, and configurations often became tied to the implementation rather than the intent.

Ingress NGINX Retirement Raises the Stakes

Recently, the Kubernetes community announced that Ingress NGINX will be formally retired, with only best-effort maintenance provided until March 2026. After that point, there will be no bug fixes, no security updates, and the project will move to read-only archival status. Any cluster still relying on Ingress NGINX after that date will be running an unsupported controller, which increases maintenance overhead and security risk.

For many organizations, now is the time to treat this as a high-priority project: inventory all clusters using Ingress NGINX, create a migration plan (test, convert, cut over), and avoid ending up in a reactive scramble as the March 2026 deadline approaches.

If the move from Ingress to the Gateway API once felt optional, this new timeline changes the situation. Depending on an aging data-plane component without Continue reading
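
To make the conversion concrete, here is a hedged sketch of a typical host-and-path Ingress rule expressed as a Gateway API HTTPRoute; the hostname, service name, and parent Gateway are illustrative:

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: shop-route
  namespace: default
spec:
  parentRefs:
    - name: web-gateway          # the Gateway this route attaches to
  hostnames:
    - shop.example.com
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /api
      backendRefs:
        - name: shop-api         # backing Service
          port: 8080

The route’s intent (host, path, backend) carries over directly from the Ingress rule, while controller-specific annotations are replaced by typed fields or policy attachments.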

Recap: KubeCon + CloudNativeCon Europe 2025

When I got the assignment to attend KubeCon on the 1st of April, I thought it was an April Fools’ prank, but as the date got closer I realized this was for real: I would be on the ground in London for the tenth anniversary of cloud native computing. I’ve seen a lot of tech events during my years in the industry (while trying not to get replaced by AI), and I have to say this one stands out!

Image source: CNCF YouTube Channel

Here is my recap of KubeCon + CloudNativeCon Europe 2025.

CalicoCon 2025

CalicoCon is an event that happens twice every year, as a co-located event during KubeCon NA and EU. It’s a free event that allows you to learn about Tigera’s vision for the future of networking and security in the cloud. There’s also an after-party to celebrate our community and people like you who are on this journey with us!

This year our main focus was on Calico v3.30, our upcoming release that will add many anticipated features to Calico, unlocking things like observability, staged network policy, and the Gateway API. CalicoCon brought together cloud-native enthusiasts to explore the latest advancements in Calico and Kubernetes networking.

Continue reading

How to get started with Calico Observability features

Kubernetes, by default, adopts a permissive networking model where all pods can freely communicate unless explicitly restricted using network policies. While this simplifies application deployment, it introduces significant security risks. Unrestricted network traffic allows workloads to interact with unauthorized destinations, increasing the potential for cyberattacks such as Remote Code Execution (RCE), DNS spoofing, and privilege escalation.

To better understand these problems, let’s examine a sample Kubernetes application: ANP Demo App.

This application comprises a deployment that spawns pods and a service that exposes them to external users, much like any real-world workload you will encounter in your environment.

If you open the application service before implementing any policies, the application reports the following messages:

  1. Container can reach the Internet – Without network policies, an attacker can use our container as an entry point by exploiting it with a vulnerability. This could allow them to exfiltrate data or establish remote control over the workload by leveraging its Internet access.
  2. Container can reach CoreDNS Pods – Kubernetes relies heavily on DNS, with records served using CoreDNS Pods. While communication between your Pods and CoreDNS is essential and not inherently a vulnerability, pairing it with unrestricted access to Continue reading
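
One hedged way to address both findings is an egress policy that only lets the demo app resolve DNS through kube-dns and blocks all other outbound traffic; the namespace is illustrative, and the kube-dns labels assume a standard CoreDNS deployment:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: egress-dns-only
  namespace: anp-demo                          # illustrative namespace
spec:
  podSelector: {}                              # all pods in the namespace
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53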

Calico Open Source 3.30: Exploring the Goldmane API for custom Kubernetes Network Observability

Kubernetes is built on the foundation of APIs and abstraction, and Calico leverages its extensibility to deliver network security and observability in both its commercial and open source versions. APIs are the special sauce that help automate and operationalize your Kubernetes platforms as part of a CI/CD pipeline and other GitOps workflows.

Calico OSS 3.30 introduces numerous battle-tested observability and security tools from our commercial editions. This includes the following key features:

  • Goldmane – A gRPC-based API for accessing and capturing flow logs and policy evaluation metrics
  • Whisker – A web-based tool for viewing and filtering flow logs to troubleshoot connectivity issues and to author and maintain Calico network security policies
  • GlobalStagedNetworkPolicy and StagedNetworkPolicy – New custom resources that allow you to audit the behavior of a new policy before you actively enforce it
  • Calico Ingress Gateway – Our 100% upstream, enterprise-ready implementation of the Gateway API that is based on Envoy Gateway
  • Calico Cloud ready – Every OSS cluster includes the required components to connect to a stateless, read-only, and free version of Calico Cloud

You may know about the Calico REST API, which allows you to manage Calico resources, such as Calico network policy, Calico IPAM configurations Continue reading

Calico Whisker, Your New Ally in Network Observability

With the upcoming release of Calico v3.30 on the horizon, we are excited to introduce Calico Whisker, a simple yet powerful User Interface (UI) designed to enhance network observability and policy debugging. If you’ve ever struggled to make sense of network flow logs or troubleshoot policies in a complex Kubernetes cluster, Whisker is your friend!

Whisker is a three-part deployment: a UI, a backend, and a gRPC channel that communicates with Felix, the brain of Calico, to gather live flow information and present it in a human-readable, easy-to-understand way. But before we get started, let’s dive into why Whisker is a must-have for your Kubernetes environment, what problems it solves, and how it can streamline your policy management.

Navigating Network Flows is Difficult

In Kubernetes environments, network flows are the backbone of communication between workloads. As clusters scale, so does the complexity of managing these flows and their security. Without clear visibility and effective observability tools, teams often struggle with:

  • Diagnosing unexplained workload behavior and determining why certain applications aren’t working as expected.
  • Identifying the real reason why certain workload communications are permitted or denied, which stems from understanding which policies are affecting specific Continue reading

High-Performance Kubernetes Networking with Calico eBPF

Kubernetes has revolutionized cloud-native applications, but networking remains a crucial aspect of ensuring scalability, security, and performance. Default networking approaches, such as iptables-based packet filtering, often introduce performance bottlenecks due to inefficient packet processing and complex rule evaluations. This is where Calico eBPF comes into play, offering a powerful alternative that enhances networking efficiency and security at scale.

Understanding Kubernetes Networking

Kubernetes networking consists of two primary components:

  1. Physical Network Infrastructure – Connects cloud resources to external networks, ensuring communication between nodes and the broader internet.
  2. Cluster Network Infrastructure – Manages internal workload communication within the Kubernetes cluster, including service-to-service traffic and pod-to-pod interactions.

Choosing the right data plane is critical for optimal performance. Factors such as cluster size, throughput, and security requirements should guide this choice. Poor networking choices can lead to congestion, excessive latency, and resource starvation.

Data Plane Options in Kubernetes Networking

Networking in Kubernetes is an abstraction. While Kubernetes lays the foundation, your Container Network Interface (CNI) plugin is in charge of the actual networking. To better understand networking, we usually divide it into two sections: a control plane and a data plane.
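
For clusters managed by the Tigera operator, switching to the eBPF data plane is a single field on the Installation resource; a minimal sketch is shown below (additional steps, such as telling Calico how to reach the API server directly so kube-proxy can be disabled, are typically required):

apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  calicoNetwork:
    linuxDataplane: BPF      # switch from the iptables data plane to eBPF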

Native Kubernetes cluster mesh with Calico

As Kubernetes continues to gain traction in the cloud-native ecosystem, the need for robust, scalable, and highly available cluster deployments has become more noticeable.

While a Kubernetes cluster can easily expand via additional nodes, the downside of such an approach is that you might have to spend a lot of time troubleshooting the underlying networking or managing and updating resources between clusters. On top of that, a multi-regional or hybrid-cloud scenario might be off-limits, depending on the restrictions that a cloud provider or your Kubernetes distro imposes on your environment.

Calico Enterprise cluster mesh is a suite of features native to Kubernetes with a multi-layer design that connects two or more Kubernetes clusters and seamlessly shares resources between them. This post will explore cluster mesh, its benefits, and how it can enhance your Kubernetes environment.

Projects that provide cluster mesh

Multiple projects offer cluster mesh, and while they are all similar in basic principles, each has a different take on implementing this solution in an environment.

The following table is a brief overview of notable projects that offer cluster mesh:

Calico Open Source | Calico Enterprise | Cilium | Submariner
Encapsulation: IPIP | Direct | Continue reading

Kubernetes network policies: 4 pain points and how to address them

Kubernetes is used everywhere, from test environments to the most critical production foundations that we use daily, making it undoubtedly a de facto standard in cloud computing. While this is great news for everyone who works with, administers, and expands Kubernetes, the downside is that it makes Kubernetes a favorable target for malicious actors.

Malicious actors typically exploit flaws in the system to gain access to a portion of the environment. They then chain these flaws together to move laterally within the environment, ultimately seeking root access or access to critical information.

While the best way to fix security flaws in any software is to patch it with appropriate fixes that the project maintainers publish, there are certain security practices that you can adopt to fortify your environment, like using network policies. However, most people find network policies complex and overwhelming, which discourages them from implementing policies in their environment.

In this blog post, we will examine four pain points that people face when they want to implement network policies and provide solutions to help you effectively secure your Kubernetes environment.

What is a network policy and why should I use it?

In Kubernetes, a network policy (KNP) resource is the Continue reading
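
A small example often makes the resource less intimidating; here is a hedged sketch of a KNP that only allows the frontend pods to reach the backend on its service port (labels, namespace, and port are illustrative):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-allow-frontend
  namespace: my-app                  # illustrative namespace
spec:
  podSelector:
    matchLabels:
      app: backend                   # policy applies to the backend pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend          # only frontend pods may connect
      ports:
        - protocol: TCP
          port: 8080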

What is new in Calico 3.28

TL;DR

  • A new Grafana dashboard that helps you monitor Calico Typha’s performance and troubleshoot issues.
  • Calico eBPF dataplane IPv6 is now GA. It supports true IPv6-only clusters as well as dual-stack clusters. 🐝
  • Optional Pod startup delay to ensure networking is up in high-churn scenarios.
  • Tigera operator now supports multiple IP pools, IP pool modification, affinity for operator pods, priorityClassName, and more!
  • Improved policy performance in both eBPF and iptables.
  • Calico now ships with a pprof server. Activate the profiling server for real-time views into the Typha and Felix components and live debugging.

🚨 Important changes 🚨

Calico 3.28 enables VXLAN checksum offload by default in environments running kernel version 5.8 or above. In the past, offloading was disabled due to kernel bugs.

Please keep in mind that if you are upgrading to 3.28, this change will take effect after node restarts.

If you encounter unexpected performance issues, you can revert to the previous behavior with the following command:

kubectl patch felixconfiguration default --type="merge" -p='{"spec":{"featureDetectOverride":"ChecksumOffloadBroken=true"}}'
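
The same override can also be set declaratively; a hedged sketch of the equivalent FelixConfiguration manifest:

apiVersion: projectcalico.org/v3
kind: FelixConfiguration
metadata:
  name: default
spec:
  featureDetectOverride: "ChecksumOffloadBroken=true"   # treat checksum offload as broken and disable it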

Please keep in mind that you can report any issues via GitHub tickets or Slack and include a detailed description of the environment (NIC hardware, kernel, distro, Continue reading

Amazon EKS networking options

When setting up a Kubernetes environment with Amazon Elastic Kubernetes Service (EKS), it is crucial to understand your available networking options. EKS offers a range of networking choices that allow you to build a highly available and scalable cloud environment for your workloads.

In this blog post, we will explore the networking and policy enforcement options provided by AWS for Amazon EKS. By the end, you will have a clear understanding of the different networking options and network policy enforcement engines, and other features that can help you create a functional and secure platform for your Kubernetes workloads and services.

Amazon Elastic Kubernetes Service (EKS)

Amazon Elastic Kubernetes Service (EKS) is a managed Kubernetes service that simplifies routine operations, such as cluster deployment and maintenance, by automating tasks like patching and updating operating systems and their underlying components. EKS enhances scalability through AWS Auto Scaling groups and other AWS service integrations and offers a highly available control plane to manage your cluster.

Amazon EKS in the cloud has two options:

  • Managed
  • Self-managed

Managed clusters rely on the AWS control plane node, which AWS hosts and controls separately from your cluster. This node operates in isolation and cannot be directly Continue reading

Exploring AKS networking options

At KubeCon 2023 in Amsterdam, Azure made several exciting announcements and introduced a range of updates and new options to Azure CNI (Azure Container Networking Interface). These changes will help Azure Kubernetes Service (AKS) users solve some of the pain points they used to face in previous iterations of Azure CNI, such as IP exhaustion and large cluster deployments with custom IP address management (IPAM). On top of that, with this announcement Microsoft officially added an additional data plane to the Azure platform.

The big picture

Worker nodes in an AKS (Azure Kubernetes Service) cluster are Azure VMs pre-configured with a version of Kubernetes that has been tested and certified by Azure. These clusters communicate with other Azure resources and external sources (including the internet) via the Azure virtual network (VNet).

Now, let’s delve into the role of the data plane within this context. Data plane operations take place within each Kubernetes node; the data plane is responsible for handling the communication between your workloads and cluster resources. By default, an AKS cluster is configured to utilize the Azure data plane, which Continue reading
