EKS, Bottlerocket, and Cilium with Pulumi

In late 2023, I added some Go code for use with Pulumi to stand up an Amazon Elastic Kubernetes Service (EKS) cluster “from scratch,” meaning without using any prebuilt Pulumi components (like the AWSX VPC component or the EKS component). The code is largely illustrative for newer users, written to show how to stitch together all the components needed for an EKS cluster. In this post, I’ll show you how to modify that code to use Bottlerocket OS as the node OS for your EKS cluster—and share some information on installing Cilium into (onto?) the cluster.

The example code can be found in the pulumi/eks-from-scratch folder in my “learning-tools” GitHub repository. As I mentioned, it’s written in Go, and the associated README file has full instructions for how to use that code in your own environment. Since the code was intended to be illustrative, I have tried to provide enough comments in the code for readers to be able to decode what’s happening without too much difficulty.

To use Bottlerocket OS on the EKS nodes in your cluster, you’ll have to modify the main.go file. Specifically, changes are needed in the section of code that creates a Continue reading

What’s new in Cloudflare One: Digital Experience (DEX) monitoring notifications and seamless access to Cloudflare Gateway with China Express

At Cloudflare, we are constantly innovating and launching new features and capabilities across our product portfolio. We are introducing roundup blog posts to ensure that you never miss the latest updates across our platform. In this post, we are excited to share two new ways that our customers can continue to keep their web properties performant and secure with Cloudflare One: new Digital Experience Monitoring (DEX) notifications help proactively identify issues that can affect the end-user digital experience, and integration with China Express enables secure access to China-hosted sites for Cloudflare Gateway customers.   

Using DEX Notifications for proactive monitoring with Cloudflare Zero Trust

As with other notification types, DEX notifications can be configured and reviewed from Cloudflare dashboard notifications.

What problem does it solve?

DEX notifications address the challenge of proactively identifying issues affecting the digital experience of your end users. By monitoring device health and conducting synthetic tests from WARP clients deployed on your fleet's end-user devices, DEX provides valuable insights. These notifications empower IT administrators to quickly identify and address connectivity and application performance problems before they impact a wide range of users.

By proactively notifying administrators when problems arise, DEX helps minimize user disruption and provides Continue reading

CJ Desai: Why I joined Cloudflare as President of Product and Engineering

I am thrilled to embark on this journey to run Product and Engineering at Cloudflare, driving forward the mission of helping build a better Internet. 

A little about me

While I was a graduate student at the University of Illinois, the university introduced the Mosaic web browser to students. In addition to being super easy to install and use, it displayed pictures next to text for the first time. This may not seem impressive today, but back then it felt like a magical step forward.

This simple but powerful upgrade opened up the once niche user base from academics to the masses, transforming the World Wide Web into an Internet phenomenon. Since then, I’ve always sought to be part of teams that worked on transformational technologies, including Software-as-a-Service, cloud computing, and AI. Innovation is the lifeblood of every technology company. To this day, I’m inspired by building products and technology that get adopted at mass scale.

Why Cloudflare

The world is in a very interesting moment for technological innovation: the AI landscape is uncharted and developing at an exponential rate; the urgency for enterprises to reduce tech debt and reliance on legacy applications is at an all Continue reading

Setting up Active Directory for ISE Lab

A key component of an ISE home lab is having Active Directory installed. In this post I’ll go through setting up basic AD for use with ISE. This post is not going to cover licensing. I’m assuming you are running the eval version, which is good for 180 days, or that you already have a valid license.

My server is running in an ESX environment based on the following specs:

  • OS – Windows Server 2022
  • CPU – 4 vCPU
  • RAM – 16 GB
  • Disk – 90 GB

I’m using more than the minimum requirements. Spec it as you like based on what capacity you have available.

The first step is installing the OS. This part is easy and pretty much only requires you to set an Administrator password.

When the server has booted, start by changing the name of the server. It’s better to do this before changing any roles. Go to System Settings -> Computer Name and click Change… Set the desired name. I’m using the name dc01 in my lab. Click OK.

Changing the name is going to trigger a restart. Choose Restart Now.

From Server Manager, click Add roles and features. Click Next until you get to Continue reading

What’s new in Cloudflare One: Digital Experience (DEX) monitoring notifications and seamless access to Cloudflare Gateway with China Express

At Cloudflare, we are constantly innovating and launching new features and capabilities across our product portfolio. We are introducing roundup blog posts to ensure that you never miss the latest updates across our platform. In this post, we are excited to share two new ways that our customers can continue to keep their web properties performant and secure with Cloudflare One: new Digital Experience Monitoring (DEX) notifications help proactively identify issues that can affect the end-user digital experience, and integration with China Express enables secure access to China-hosted sites for Cloudflare Gateway customers.   

Using DEX Notifications for proactive monitoring with Cloudflare Zero Trust

Digital Experience Monitoring (DEX) offers device, application, and network performance monitoring, providing IT administrators with insights to quickly identify and resolve issues. With DEX notifications, account administrators can create configurable alert rules based on available algorithms (z-score, SLO) and existing DEX filters. When notification criteria are satisfied, customers are notified via email, PagerDuty, or webhooks.
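To make the z-score option more concrete, here is a minimal Python sketch of the textbook z-score test applied to a series of measurements. It is not Cloudflare's implementation: the function names, the threshold of 3.0, and the sample latencies are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def z_score(history, latest):
    """Return how many sample standard deviations `latest` lies from the mean of `history`."""
    if len(history) < 2:
        raise ValueError("need at least two historical samples")
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat baseline: any deviation at all is treated as extreme.
        return 0.0 if latest == mu else float("inf")
    return (latest - mu) / sigma


def should_notify(history, latest, threshold=3.0):
    """Hypothetical alert rule: fire when the absolute z-score reaches the threshold."""
    return abs(z_score(history, latest)) >= threshold


# Hypothetical round-trip latencies (milliseconds) from earlier synthetic tests.
baseline_ms = [100, 102, 98, 101, 99]
```

With this baseline, a new reading of 110 ms sits far outside the usual spread and would trigger the hypothetical rule, while a reading of 101 ms would not.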

As with other notification types, DEX notifications can be configured and reviewed from Cloudflare dashboard notifications.

What problem does it solve?

DEX notifications address the challenge of proactively identifying issues affecting the digital experience of your end users. Continue reading

NAN075: Mastering Networking in the Age of AI: Advice for Aspiring Engineers

Ivan Pepelnjak joins host Eric Chou to reflect on his extensive career, his decision to reduce his content creation, and offer advice for young engineers. They discuss the evolution of networking technologies, emphasizing the importance of mastering Linux and obtaining relevant certifications. Ivan highlights the significance of creating professional visibility and owning one’s content. The... Read more »

Improving platform resilience at Cloudflare through automation

Failure is an expected state in production systems, and no predictable failure of either software or hardware components should result in a negative experience for users. The exact failure mode may vary, but certain remediation steps must be taken after detection. A common example is when an error occurs on a server, rendering it unfit for production workloads, and requiring action to recover.

When operating at Cloudflare’s scale, it is important to ensure that our platform is able to recover from faults seamlessly. It can be tempting to rely on the expertise of world-class engineers to remediate these faults, but this would be manual, repetitive, and unlikely to produce enduring value, and it would not scale. In one word: toil, which is not a viable solution at our scale and rate of growth.

In this post we discuss how we built the foundations to enable a more scalable future, and what problems it has immediately allowed us to solve.

Growing pains

The Cloudflare Site Reliability Engineering (SRE) team builds and manages the platform that helps product teams deliver our extensive suite of offerings to customers. One important component of this platform is the collection of servers that power critical products such as Durable Objects, Workers, Continue reading

Building an ISE Homelab

One of the best ways of learning something is building a lab for it, especially when it comes to complex topics like network authentication. When I started learning about network authentication and Cisco Identity Services Engine (ISE), I found that there wasn’t a lot of clear information on how to build a lab, either in Cisco documentation or on blogs. In this post I’ll explain how I built my lab using CML and ESX.

Having a lab with ISE only is not going to get you very far. At a minimum, I think the following devices are needed in a network authentication lab:

  • Cisco ISE.
  • Active Directory Domain Services.
  • Public Key Infrastructure (PKI) such as Active Directory Certificate Services (ADCS).
  • Network Authentication Device (NAD) such as Catalyst 9000.

For my lab, I’m using only virtual devices. The focus is on learning network authentication and ISE, which is why I’ve set up a very simple PKI, ignoring best practices such as offline root, intermediate CA, and so on. I might lab that at a later stage, but that’s not the current focus.

The topology of my lab is shown below:

Note that some VMs such as the virtual Catalyst Continue reading

Reclaiming Disk Space from Old Windows Install

This is a quick post to describe how to reclaim disk space being used by an old Windows install. Recently, I upgraded to Windows 11 from Windows 10. I noticed that I was starting to run a bit low on disk space on my SSD. I have a 512 GB SSD and had less than 100 GB available:

I noticed that there is a folder named Windows.old that is 40 GB in size:

The instructions to reclaim the space seemed clear. Go to Settings -> System -> Storage and reclaim the space labeled as Previous Windows installation. However, there was no such category when I tried:

After some searching and a little bit of thinking, I realized that this is probably a privileges problem. I became local admin by using the PAM tool. Then I ran the disk cleanup util as administrator:

I can now see that there are previous Windows installations:

I select Previous Windows installations for deletion:

You have to confirm that it’s OK to delete:

The deletion process starts:

This will take some time…

There is now more space available:

If you’re running low on disk, check if you have previous Windows installations that you can Continue reading
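If you want to check how much space is at stake before opening Disk Cleanup, a short script can measure the folder and the free space on the drive. This is a minimal Python sketch, not part of the original walkthrough. The function names are my own, and the path C:\Windows.old in the usage note is an assumption, since the post only gives the folder name.

```python
import os
import shutil


def directory_size_bytes(path):
    """Sum the sizes of all regular files under `path`, skipping unreadable entries."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if os.path.islink(file_path):
                continue  # do not follow links out of the folder
            try:
                total += os.path.getsize(file_path)
            except OSError:
                # Files we are not allowed to read are ignored,
                # so the result can undercount without admin rights.
                pass
    return total


def free_space_gib(path):
    """Return the free space, in GiB, on the volume that contains `path`."""
    return shutil.disk_usage(path).free / 1024**3
```

For example, `directory_size_bytes(r"C:\Windows.old") / 1024**3` gives the folder size in GiB, and `free_space_gib("C:\\")` gives the space currently available. As with the cleanup itself, running without administrator privileges may hide part of the folder.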

The Size of Packets

We’ve now been running packet-switched networks for many decades, and these days it’s packets, not virtual circuits, that lie behind most of the world’s digital communications services. But some very fundamental questions remain unanswered in this packet-switched world. Perhaps the most basic question is: “How big should a packet be?” And, surprisingly enough, there is no clear answer!
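One side of that trade-off is easy to put numbers on: every packet carries fixed header overhead, so larger packets spend a smaller share of their bytes on headers. The Python sketch below is my own illustration, not taken from the article. It assumes a 20-byte IPv4 header and a 20-byte TCP header with no options, and it ignores link-layer framing entirely.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
TCP_HEADER_BYTES = 20   # TCP header without options
DEFAULT_HEADERS = IPV4_HEADER_BYTES + TCP_HEADER_BYTES


def payload_efficiency(packet_bytes, header_bytes=DEFAULT_HEADERS):
    """Fraction of each packet that is payload rather than header."""
    if packet_bytes <= header_bytes:
        raise ValueError("packet must be larger than its headers")
    return (packet_bytes - header_bytes) / packet_bytes


def packets_needed(data_bytes, packet_bytes, header_bytes=DEFAULT_HEADERS):
    """Number of packets required to carry `data_bytes` of payload."""
    payload = packet_bytes - header_bytes
    if payload <= 0:
        raise ValueError("packet must be larger than its headers")
    return -(-data_bytes // payload)  # ceiling division
```

Under these assumptions, `payload_efficiency(1500)` is about 97.3% while `payload_efficiency(576)` is about 93.1%, and moving 1,000,000 bytes takes 685 packets of 1500 bytes but only 112 packets of 9000 bytes. Header overhead is only one input to the question; it says nothing about loss, latency, or fragmentation.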

PP034: Driving Security and Network Assurance with Juniper Networks (Sponsored)

Today on the Packet Protector podcast we talk with sponsor Juniper Networks about how to simplify the complexity that affects network and cybersecurity teams alike. From tool sprawl to floods of data, complexity bedevils operations and troubleshooting. We talk about what Juniper brings to the table for networking and security professionals to help them do... Read more »

Cloudflare acquires Kivera to add simple, preventive cloud security to Cloudflare One

We’re excited to announce that Kivera, a cloud security, data protection, and compliance company, has joined Cloudflare. This acquisition extends our SASE portfolio to incorporate inline cloud app controls, empowering Cloudflare One customers with preventative security controls for all their cloud services.

In today’s digital landscape, cloud services and SaaS (software as a service) apps have become indispensable for the daily operation of organizations. At the same time, the amount of data flowing between organizations and their cloud providers has ballooned, increasing the chances of data leakage, compliance issues, and worse, opportunities for attackers. Additionally, many companies — especially at enterprise scale — are working directly with multiple cloud providers for flexibility based on the strengths, resiliency against outages or errors, and cost efficiencies of different clouds. 

Security teams that rely on Cloud Security Posture Management (CSPM) or similar tools for monitoring cloud configurations and permissions, and on Infrastructure as Code (IaC) scanning, are falling short: these tools detect issues only after misconfigurations occur, and they generate an overwhelming volume of alerts. The combination of Kivera and Cloudflare One puts preventive controls directly into the deployment process, or ‘inline’, blocking errors before they happen. This offers a proactive approach essential to Continue reading

Leveraging Kubernetes virtual machines at Cloudflare with KubeVirt

Cloudflare runs several multi-tenant Kubernetes clusters across our core data centers. These general-purpose clusters run on bare metal and power our control plane, analytics, and various engineering tools such as build infrastructure and continuous integration.

Kubernetes is a container orchestration platform. It enables software engineers to deploy containerized applications to a cluster of machines. This enables teams to build highly-available software on a scalable and resilient platform.

In this blog post we discuss our Kubernetes architecture, why we needed virtualization, and how we’re using it today.

Multi-tenant clusters

Multi-tenancy is a concept where one system can share its resources among a wide range of customers. This model allows us to build and manage a small number of general purpose Kubernetes clusters for our internal application teams. Keeping the number of clusters small reduces our operational toil. This model shrinks costs and increases computational efficiency by sharing hardware. Multi-tenancy also allows us to scale more efficiently. Scaling is done at either a cluster or application level. Cluster operators scale the platform by adding more hardware. Teams scale their applications by updating their Kubernetes manifests. They can scale vertically by increasing their resource requests or horizontally by increasing the number of Continue reading
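The two application-level scaling directions map onto two different fields of a Kubernetes Deployment manifest. The Python sketch below models a manifest as a plain dictionary to show which field each kind of scaling touches. The application name, image, and resource values are hypothetical, and treating horizontal scaling as a change to `spec.replicas` is my assumption about where the truncated sentence above is heading.

```python
import copy

# Hypothetical Deployment manifest, expressed as a Python dict instead of YAML.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "example-app"},
    "spec": {
        "replicas": 2,
        "selector": {"matchLabels": {"app": "example-app"}},
        "template": {
            "metadata": {"labels": {"app": "example-app"}},
            "spec": {
                "containers": [
                    {
                        "name": "app",
                        "image": "example.invalid/app:1.0",
                        "resources": {
                            "requests": {"cpu": "500m", "memory": "256Mi"}
                        },
                    }
                ]
            },
        },
    },
}


def scale_horizontally(manifest, replicas):
    """Return a copy of the manifest that runs more (or fewer) identical pods."""
    scaled = copy.deepcopy(manifest)
    scaled["spec"]["replicas"] = replicas
    return scaled


def scale_vertically(manifest, container_name, cpu, memory):
    """Return a copy of the manifest with new resource requests for one container."""
    scaled = copy.deepcopy(manifest)
    for container in scaled["spec"]["template"]["spec"]["containers"]:
        if container["name"] == container_name:
            container.setdefault("resources", {})["requests"] = {
                "cpu": cpu,
                "memory": memory,
            }
            return scaled
    raise KeyError(f"no container named {container_name!r}")
```

Both helpers return a modified copy and leave the original untouched, which mirrors the workflow the post describes: a team edits its manifest and applies it, and the cluster reconciles toward the new state.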