Bikram Gupta

Author Archives: Bikram Gupta

Solving Microservices Connectivity Issues with Network Logs

The network is foundational to distributed application environments. A distributed application is composed of multiple microservices, each running in a set of pods often located on different nodes. Problems in a distributed application can stem from network-layer connectivity (think network flow logs), unavailable application resources (think metrics), or unavailable components (think tracing). Network-layer connectivity can be affected by various factors such as routing configuration, IP pool configuration, and network policies. When service A cannot talk to service B over the network, or an external application cannot connect to service A, network logs become an essential source of the historical data needed to troubleshoot connectivity issues. Just as in a traditional network, network logs enable cluster administrators to monitor the Kubernetes microservices network.

Network Logs Can Address Multiple Use Cases

Network logs can be used to serve the unique requirements of different teams (DevOps, SecOps, Platform, Network). The value of Kubernetes network logs lies in the information collected, such as detailed context about endpoints (e.g., pods, labels, namespaces) and the network policies that were applied to the connection. Within the IT estate, DevOps, SecOps, Network and Platform teams can use network logs to address use cases that Continue reading

Security Policy Self-Service for Developers and DevOps Teams

In today’s economy, digital assets (applications, data, and processes) determine business success. Cloud-native applications are designed to iterate rapidly, shortening time-to-value for businesses. Organizations that can rapidly build and deploy their applications have a significant competitive advantage. To this end, more and more developers are creating and leading DevOps teams that not only drive application development, but also take on operational responsibilities formerly owned by platform and security teams.

What’s the Value of a Self-Service Approach?

Cloud-native applications are often designed and deployed as microservices. The development team that owns a microservice understands its behavior, and is in the best position to define and manage the network security of that microservice. A self-service model enables developers to follow a simple workflow and generate network policies with minimal effort. When problems occur with application connectivity, developers should be able to diagnose and resolve them quickly without having to depend on resources outside the team.
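
As a minimal sketch of what that self-service workflow might produce, here is a standard Kubernetes NetworkPolicy that a team owning a hypothetical "orders" microservice could write and manage themselves. The namespace, labels, and port are assumptions for illustration, not values from the post.

  # Sketch: a policy the owning development team manages for its own service.
  # Namespace, labels, and port are illustrative assumptions.
  apiVersion: networking.k8s.io/v1
  kind: NetworkPolicy
  metadata:
    name: orders-allow-frontend
    namespace: shop
  spec:
    podSelector:
      matchLabels:
        app: orders            # applies to the pods of the 'orders' microservice
    policyTypes:
      - Ingress
    ingress:
      - from:
          - podSelector:
              matchLabels:
                app: frontend  # only the 'frontend' service may connect
        ports:
          - protocol: TCP
            port: 8080         # the port the 'orders' service listens on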

Developers and DevOps teams can also take a leading role in managing security, which is an integral part of cloud-native applications. There are two aspects to security in the context of Kubernetes.

Enforcing Enterprise Security Controls in Kubernetes using Calico Enterprise

Hybrid cloud infrastructures run critical business resources and are subject to some of the strictest network security controls. Irrespective of the industry and resource types, these controls broadly fall into three categories.

  1. Segmenting environments (Dev, Staging, Prod)
  2. Enforcing zones (DMZ, Trusted, etc.)
  3. Compliance requirements (GDPR, PCI DSS)

Workloads (pods) running on Kubernetes are ephemeral in nature, and IP-based controls are no longer effective. The challenge is to enforce the organizational security controls on the workloads and Kubernetes nodes themselves. Customers need the following capabilities:

  • Ability to implement security controls both globally and on a per-app basis: Global controls help enforce segmentation across the cluster, and work well when the workloads are classified into different environments and/or zones using labels. As long as the labels are in place, these controls will work for any new workloads (see the sketch after this list).
  • Generate alerts if security controls are tampered with: Anyone with valid permissions can make changes to the controls, so they could be modified without proper authorization, or even with malicious intent to bypass security. Hence, it is important to monitor changes to the policies.
  • Produce an audit log showing changes to security controls over time: This is Continue reading
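
To make the first capability concrete, below is a hedged sketch of a global, label-driven control expressed as a Calico GlobalNetworkPolicy. The label key and environment names are illustrative assumptions, not the controls described in the post.

  # Sketch: a global control that only allows 'prod' workloads to be reached
  # by other 'prod' workloads. Label keys and values are illustrative.
  apiVersion: projectcalico.org/v3
  kind: GlobalNetworkPolicy
  metadata:
    name: isolate-prod-environment
  spec:
    order: 200
    selector: environment == 'prod'        # applies to all workloads labeled prod
    types:
      - Ingress
    ingress:
      - action: Allow
        source:
          selector: environment == 'prod'  # only other prod workloads may connect
      - action: Deny

Because the policy matches on labels rather than IPs, any new workload labeled environment: prod is covered automatically.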

Enabling Microsegmentation with Calico Enterprise

Microsegmentation is a security technique used to isolate workloads from one another. It limits the blast radius of a data breach by making network security more granular: should a breach occur, the damage is confined to the affected segment. Application workloads have evolved over time, from bare metal to a mix of on-prem and cloud virtual machines and containers. Similarly, the pace of change has dramatically increased, both in terms of release updates and auto-scaling.

Enforcement of network security has also evolved over time, with organizations using a mix of physical and virtual firewalls and platform-specific security groups. This creates the following challenges:

  1. Management Overhead – Organizations have to maintain different products, teams and workflows to manage and operate segmentation across containers, VMs and bare metal. The diagram above shows how different platforms may require different approaches to segmentation, thereby creating a burden on the operations team.
  2. Lack of Cloud-Native Performance – With hybrid cloud becoming the norm, products built for traditional workloads can neither scale to cloud-native deployments nor enforce security with minimal latency.

Calico Enterprise provides a common policy language for segmentation that works across all of your hybrid cloud and Continue reading
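
As an illustration of that common policy language (not a configuration from the post), Calico selects endpoints by label regardless of whether they are pods or hosts. Registering a VM or bare-metal interface as a HostEndpoint puts it under the same label-based policies; the node name, interface, IP, and label below are assumptions.

  # Sketch: bring a non-Kubernetes host under the same label-based policy model
  # that already governs pods. Node name, interface, and IP are illustrative.
  apiVersion: projectcalico.org/v3
  kind: HostEndpoint
  metadata:
    name: db-vm-eth0
    labels:
      role: database           # the same label scheme used on pods
  spec:
    node: db-vm-01
    interfaceName: eth0
    expectedIPs:
      - 10.20.30.40

A GlobalNetworkPolicy whose selector matches role == 'database' would then apply both to this VM and to any pods carrying the same label.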

Designing On-Prem Kubernetes Networks for High Availability

Designing and maintaining networks is hard. When deploying Kubernetes in your on-prem data center, you will need to answer a basic question: should the pod network be an overlay on top of an existing network, or should it be part of the existing network? The Networking options table provides guidelines for choosing the right type of networking based on various factors. If you decide to use native peering (the predominant option for on-prem), you will have to configure the network to ensure availability in the event of a network outage (e.g., a disconnected cable, a ToR switch failure, etc.). We cover a typical L3 highly available network design in this post.

A cluster typically spans multiple racks. In an L3 deployment, these racks have different CIDR ranges, so the nodes in different racks must be able to route to each other. Referring to the diagram below, that inter-rack traffic goes through the network fabric. If you want to build out such a lab for your own learning, here is an example.
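
In such a design, each node typically peers over BGP with its rack's ToR switch. Below is a hedged sketch of how that per-rack peering can be expressed with Calico's BGP resources; the AS numbers, peer IP, and rack label are assumptions for illustration.

  # Sketch: disable the full node-to-node mesh and peer the nodes in rack-1
  # with that rack's ToR switch. AS numbers, IPs, and labels are illustrative.
  apiVersion: projectcalico.org/v3
  kind: BGPConfiguration
  metadata:
    name: default
  spec:
    nodeToNodeMeshEnabled: false
    asNumber: 64512
  ---
  apiVersion: projectcalico.org/v3
  kind: BGPPeer
  metadata:
    name: rack-1-tor
  spec:
    nodeSelector: rack == 'rack-1'   # only nodes labeled rack-1 use this peer
    peerIP: 10.0.1.1                 # the ToR switch for rack-1
    asNumber: 64513

A similar BGPPeer would be defined per rack, with nodes labeled to indicate which rack they sit in.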

If you have a leaf-spine fabric with a single ToR (top-of-rack, or leaf) switch, then that ToR becomes a single point of failure for the entire rack. If all the master nodes are on the same Continue reading

Why use Typha in your Calico Kubernetes Deployments?

Calico is an open source networking and network security solution for containers, virtual machines, and native host-based workloads. Calico supports a broad range of platforms including Kubernetes, OpenShift, Docker EE, OpenStack, and bare metal. In this blog, we will focus on Kubernetes pod networking and network security using Calico.

Calico uses etcd as its back-end datastore. When you run Calico on Kubernetes, you can instead store that data through the Kubernetes API server, which keeps it in Kubernetes’ own etcd. This is referred to as the Kubernetes datastore (KDD) in Calico. The following diagram shows a block-level architecture of Calico.

Calico-node runs as a DaemonSet and has a fair amount of interaction with the Kubernetes API server. You can easily profile that interaction by enabling audit logging for calico-node. For example, in my kubeadm cluster, I used the following audit configuration.
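
The exact configuration from the post is not reproduced in this excerpt. As a minimal sketch, assuming calico-node runs under the calico-node service account in kube-system (as in manifest-based installs), an audit policy along these lines captures its API activity:

  # Sketch: log (at Metadata level) every API request made by the calico-node
  # service account and drop everything else. The namespace/account name is an
  # assumption; adjust it for your installation method.
  apiVersion: audit.k8s.io/v1
  kind: Policy
  rules:
    - level: Metadata
      users: ["system:serviceaccount:kube-system:calico-node"]
    - level: None

In a kubeadm cluster, such a file is referenced from the API server's --audit-policy-file flag, with --audit-log-path pointing at the output log.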

 

To set the context, this is my cluster configuration.
Since we are already running Typha, let us profile the API calls for both the calico-node and Typha components. I used the following commands to extract the unique API calls made by each.

 

If you ignore the license key API calls from calico-node, you will see that the API calls Continue reading

Decentralized Calico Network Security Policy Deployment for GitOps – Part 2

In part 1 of the GitOps blog series, we discussed the value of using GitOps for Calico policies, and how to roll out such a framework. In this second part of the series, we will expand the scope to include decentralized deployment and GitOps.

We see different personas among our customers deploying three types of controls:

  1. Cluster hardening policies are enforced and controlled by the platform admin
  2. Organizational security controls are enforced and controlled by the security admin
  3. Each application team may have its own unique requirements, which are controlled by the DevOps admin for the specific application

This is different from the traditional firewall world, where the security admin is responsible for managing security policies and the change-management window can be several weeks long. Adopting that model in Kubernetes runs counter to the very principle of enabling developers. So how can we make policy creation and enforcement simple, yet adhere to organizational processes? The answer lies in simple tooling, GitOps and governance.
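
One governance mechanism that fits this split of responsibilities is Calico Enterprise's tiered policy model: each persona gets its own tier, tiers are evaluated in a fixed order, and RBAC limits who can edit which tier. The tier names and order values below are assumptions for illustration, not a layout from the post.

  # Sketch: one tier per persona. Lower 'order' values are evaluated first,
  # so platform and security controls take precedence over application policies.
  apiVersion: projectcalico.org/v3
  kind: Tier
  metadata:
    name: platform-hardening    # owned by the platform admin
  spec:
    order: 100
  ---
  apiVersion: projectcalico.org/v3
  kind: Tier
  metadata:
    name: security-controls     # owned by the security admin
  spec:
    order: 200
  # Application teams write their policies in the default tier, which is
  # evaluated after the tiers above.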

Policies have business logic that must be implemented in YAML. The business logic (allow access for service A to service B, open port 443 inbound on service B, permit access to slack webhook Continue reading
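
As a hedged sketch of what that translation can look like, the "allow access for service A to service B, open port 443 inbound on service B" portion of the business logic might become a Calico policy like the one below; the namespace, names, and labels are illustrative assumptions.

  # Sketch: encode "allow service A to reach service B on port 443" as YAML.
  # Namespace, names, and labels are illustrative assumptions.
  apiVersion: projectcalico.org/v3
  kind: NetworkPolicy
  metadata:
    name: allow-service-a-to-service-b
    namespace: apps
  spec:
    selector: app == 'service-b'       # the policy protects service B's pods
    types:
      - Ingress
    ingress:
      - action: Allow
        protocol: TCP
        source:
          selector: app == 'service-a' # only service A may connect
        destination:
          ports:
            - 443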

Enforcing Network Security Policies with GitOps – Part 1

“How do I enable GitOps for my network security policies?” This is a common question we hear from security teams. Getting started with Kubernetes is relatively simple, but moving production workloads to Kubernetes requires alignment from all stakeholders – developers, platform engineering, network engineering, and security.

Most security teams already have a high-level security blueprint for their data centers. The challenge is implementing that blueprint in the context of Kubernetes cluster and workload security. Network policy is a key element of Kubernetes security. Network policy is expressed as a YAML configuration and works very well with GitOps.

We will do a three-part blog series covering GitOps for network security policies. In part one (this part), we cover the overview and getting started with a working example tutorial. In part two, we will extend the tutorial to cover an enterprise-wide decentralized security architecture. In the final part, we will delve into policy assurance with examples.

Note that all policies in Calico Enterprise (network security policy, RBAC, threat detection, logging configuration, etc.) are expressed as YAML configuration files, and can be enforced via a GitOps practice.
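
As a small sketch of what that practice can look like (the repository layout and file names are assumptions, not prescriptions from the post), the policy YAML files live in Git and a kustomization file ties them together so a GitOps tool or CI job can apply the whole set:

  # Sketch: kustomization.yaml at the root of a 'network-policies' folder in Git.
  # A GitOps tool (Flux, Argo CD, or a CI job) applies whatever is committed here.
  # File names are illustrative.
  apiVersion: kustomize.config.k8s.io/v1beta1
  kind: Kustomization
  resources:
    - cluster-hardening.yaml     # platform admin's policies
    - security-controls.yaml     # security admin's policies
    - app-team-orders.yaml       # application team's policies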

By adopting GitOps, security teams benefit in the following ways:

Enable GitOps for Kubernetes Security – Part 1

“How do I enable GitOps for my network policies?”

That is a common question we hear from security teams. Getting started with Kubernetes is relatively simple, but moving production workloads to Kubernetes requires alignment from all stakeholders – developers, platform engineering, network engineering, and security.

Most security teams already have a high-level security blueprint for their data centers. The challenge is implementing that blueprint in the context of Kubernetes cluster and workload security. Network policy is a key element of Kubernetes security. Network policy is expressed as a YAML configuration and works very well with GitOps.

We will do a three-part blog series covering GitOps for network policies. In part 1 (this part), we cover the overview and getting started with a working example tutorial. In part 2, we will extend the tutorial to cover an enterprise-wide decentralized security architecture. In the final part, we will delve into policy assurance with examples. Note that all policies in Tigera Secure (network policy, RBAC, threat detection, logging configuration, etc.) are expressed as YAML configuration files and can be enforced via a GitOps practice.

By adopting GitOps, security teams benefit as follows.