Lost in transit: debugging dropped packets from negative header lengths

Previously, I wrote about building network load balancers with the maglev scheduler, which we use for ingress into our Kubernetes clusters. At the time of that post we were using Foo-over-UDP encapsulation with virtual interfaces, one for each Internet Protocol version for each worker node.

To reduce the operational toil of managing the traffic director nodes, we've recently switched to using IP Virtual Server's (IPVS) native support for encapsulation. Much to our surprise, instead of a smooth change, we observed significant drops in bandwidth and failing API requests. In this post I'll discuss the impact observed, the multi-week search for the root cause, and the ultimate fix.

Recap and the change

To support our requirements we've been creating virtual interfaces on our traffic directors configured to encapsulate traffic with Foo-Over-UDP (FOU). In this encapsulation, new UDP and IP headers are added to the original packet. When the worker node receives this packet, the kernel removes the outer headers and injects the inner packet back into the network stack. Each virtual interface would be assigned a private IP, and IPVS would be configured to send traffic to these private IPs in "direct" mode.
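
As a rough illustration of the encapsulation described above, here is a minimal Python sketch that wraps an inner packet in new outer IPv4 and UDP headers and then strips them again. It is not the kernel's FOU implementation: the port number `FOU_PORT`, the addresses, and the function names are assumptions made up for this example.

```python
import socket
import struct

FOU_PORT = 5555  # hypothetical UDP port on which the receiver expects FOU traffic


def ipv4_checksum(header: bytes) -> int:
    """Internet checksum: ones' complement of the ones' complement sum of 16-bit words."""
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def encapsulate(inner: bytes, src_ip: str, dst_ip: str, src_port: int = 40000) -> bytes:
    """Prepend an outer IPv4 header (20 bytes) and UDP header (8 bytes) to the inner packet."""
    udp_len = 8 + len(inner)
    # A UDP checksum of 0 means "not computed", which is permitted over IPv4.
    udp = struct.pack("!HHHH", src_port, FOU_PORT, udp_len, 0)
    header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45,                 # version 4, header length 5 x 32-bit words
        0,                    # type of service
        20 + udp_len,         # total length of the outer packet
        0,                    # identification
        0,                    # flags and fragment offset
        64,                   # time to live
        socket.IPPROTO_UDP,   # the outer payload is UDP
        0,                    # checksum placeholder
        socket.inet_aton(src_ip),
        socket.inet_aton(dst_ip),
    )
    checksum = struct.pack("!H", ipv4_checksum(header))
    return header[:10] + checksum + header[12:] + udp + inner


def decapsulate(outer: bytes) -> bytes:
    """Strip the outer IPv4 and UDP headers and return the inner packet."""
    ip_header_len = (outer[0] & 0x0F) * 4
    return outer[ip_header_len + 8:]


inner_packet = b"\x45" + b"\x00" * 39  # stand-in bytes for an inner IP packet
outer_packet = encapsulate(inner_packet, "10.0.0.1", "10.0.0.2")
```

The point of the sketch is the size arithmetic: every encapsulated packet grows by 28 bytes over IPv4, and the receiver has to work out how many bytes of header to skip before it can hand the inner packet back to the network stack.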

This configuration presents several problems for our operations teams.

Continue reading

Recapping Speed Week 2023

This post is also available in Deutsch.

Speed Week 2023 is officially a wrap.

In our Welcome to Speed Week 2023 blog post, we set a clear goal:

“This week we will help you measure what matters. We’ll help you gain insight into your performance, from Zero Trust and API’s to websites and applications. And finally we’ll help you get faster. Quickly.”

This week we published five posts on how to measure performance, explaining which metrics and approaches make sense and why. We had a deep dive on the latest Core Web Vital, “Interaction to Next Paint”, what it means and how we can help. There was a post on Time To First Byte (TTFB) and why it isn't a good way to measure good web performance. We also wrote about how to measure Zero Trust performance, and announced the Internet Quality page of Cloudflare Radar - giving everyone the ability to compare Internet connection quality across Internet Service Providers, countries, and more.

We launched new products such as Observatory, Digital Experience Monitoring and Timing Insights. These products give an incredible window into how your applications and websites are performing through the eyes of website visitors Continue reading

How to deploy Red Hat Ansible Automation Platform on AWS to AWS GovCloud in the United States

This blog is co-authored by Zack Kayyali and Hicham (he-sham) Mourad

Deploying Red Hat Ansible Automation Platform Foundation

The steps below detail how to install Ansible Automation Platform on AWS United States GovCloud from the AWS Marketplace. The steps to deploy into AWS GovCloud and AWS Commercial cloud are nearly identical. Before starting your deployment process, please ensure the AWS account you are using to deploy has the following IAM roles. These IAM roles are required to deploy the AWS foundation stack offering. The foundation stack offering here refers to the base Ansible Automation Platform 2 deployment.

This blog details how to deploy Ansible Automation Platform on AWS and access the application. This deployment process will be configured to set up Ansible Automation Platform in its own Virtual Private Cloud (VPC) that it creates and manages. We also support deploying into an existing VPC.

To begin, log into your Commercial AWS account. If you have private offers, ensure that they are accepted for both the foundation and extension node offerings.

Note: 

  • The foundation offer refers to the “Red Hat Ansible Automation Platform 2 - Up to 100 Managed Nodes” marketplace item. 
  • The extension node offer refers to Continue reading

Welcome to the Ansible Lightspeed with IBM Watson Code Assistant Technical Preview

By Craig Brandt

At Red Hat Summit and AnsibleFest 2023, we announced Ansible Lightspeed with IBM Watson Code Assistant, a new generative AI service for Ansible automation. Today, we are thrilled to announce the Ansible Lightspeed technical preview launch.

In this blog, we’ll walk through the steps to access the Ansible Lightspeed with IBM Watson Code Assistant technical preview service and get it up and running in your Visual Studio Code environment. Then we’ll share more about what you can expect from the experience and how to generate your first Ansible tasks with generative AI.

This is exciting stuff, so let’s dive right in.

Technical Preview: Empowering Ansible Users with AI

Ansible Lightspeed with IBM Watson Code Assistant is a purpose-built generative AI tool that aims to streamline the creation of Ansible content. This capability is natively integrated into your VS Code editor via the Ansible VS Code extension. The AI capabilities are powered by Watson Code Assistant, a foundation model trained on Ansible Galaxy, GitHub, and other open sources of data.

The technical preview is open and available, free of charge, to all Ansible users. As more users engage with Continue reading

How IT pros can benefit from generative AI safely

The enterprise IT landscape is littered with supposedly paradigm-shifting technologies that failed to live up to the hype, and until now, one could argue that AI fell into that category. But generative AI, which has taken the world by storm in the form of OpenAI’s ChatGPT chatbot, just might be the real deal.

Chris Bedi, chief digital information officer at ServiceNow, says the release of ChatGPT last November was “an iPhone moment,” an event that captured the public’s attention in a way that “changed everything forever.” He predicts that generative AI will become embedded into the fabric of every enterprise, and he recommends that CIOs and other IT leaders should begin now to develop their generative AI strategies.

To read this article in full, please click here

Worth Reading: Always the Same Warning Signs

Found an interesting article describing the shenanigans of a biotech startup. Admittedly, it has nothing to do with networking apart from the closing paragraph…

But people will find all sorts of ways to believe what they want to believe, to avoid hearing things that they don’t want to hear, and to avoid thinking about things that are too worrisome to contemplate.

… which is a perfect description of why people believe in centralized control planes, flow-based forwarding, or long-distance vMotion.

Hedge 183: Mike Bushong on Operational Excellence

What’s next for network engineering? While we normally think of answers to this question in terms of technology, Mike Bushong joins this episode of the Hedge to argue the future is in operations—and operational excellence. Join Mike, Tom, and Russ as we discuss how the importance of operating a network is impacting the design of hardware, software, and networks.

How we scaled and protected Eurovision 2023 voting with Pages and Turnstile

2023 was the first year that non-participating countries could vote for their favorites during the Eurovision Song Contest, adding millions of additional viewers and voters to an already impressive 162 million tuning in from the participating countries. It became a truly global event with a potential for disruption from multiple sources. To prepare for anything, Cloudflare helped scale and protect the voting application, used by millions of dedicated fans around the world to choose the winner.

In this blog we will cover how once.net built their platform based.io to monitor, manage and scale the Eurovision voting application to handle all traffic using many Cloudflare services. The speed with which DNS changes made through the Cloudflare API propagate globally allowed them to scale their backend within seconds. At the same time, Cloudflare Pages was ready to serve any amount of traffic to the voting landing page so fans didn’t miss a beat. And to cap it off, by combining Cloudflare CDN, DDoS protection, WAF, and Turnstile, they made sure that attackers didn’t steal any of the limelight.

The unsung heroes

Based.io is a resilient live data platform built by the once.net team, with the capability to scale Continue reading

All the way up to 11: Serve Brotli from origin and Introducing Compression Rules

This post is also available in 简体中文, 日本語, Español and Deutsch.

Throughout Speed Week, we have talked about the importance of optimizing performance. Compression plays a crucial role by reducing file sizes transmitted over the Internet. Smaller file sizes lead to faster downloads, quicker website loading, and an improved user experience.

Take household cleaning products as a real world example. It is estimated “a typical bottle of cleaner is 90% water and less than 10% actual valuable ingredients”. Removing 90% of a typical 500ml bottle of household cleaner reduces the weight from 600g to 60g. This reduction means only a 60g parcel, with instructions to rehydrate on receipt, needs to be sent. Extrapolated to gallons, this weight reduction soon becomes a huge shipping saving for businesses, not to mention the reduced environmental impact.

This is how compression works. The sender compresses the file to its smallest possible size, and then sends the smaller file with instructions on how to handle it when received. By reducing the size of the files sent, compression ensures the amount of bandwidth needed to send files over the Internet is a lot less. Where files are stored in expensive cloud providers like AWS Continue reading
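
The compress, send, and rehydrate flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses gzip from the standard library as a stand-in for Brotli, and the `send` and `receive` helpers are hypothetical names, not part of any Cloudflare API. The `Content-Encoding` header plays the role of the "instructions on how to handle it".

```python
import gzip


def send(body: bytes) -> tuple[dict[str, str], bytes]:
    """Compress the body and label it so the receiver knows how to restore it."""
    compressed = gzip.compress(body)
    return {"Content-Encoding": "gzip"}, compressed


def receive(headers: dict[str, str], payload: bytes) -> bytes:
    """Restore the original body based on the instructions in the headers."""
    if headers.get("Content-Encoding") == "gzip":
        return gzip.decompress(payload)
    return payload  # no encoding declared, so the payload is used as-is


# Highly repetitive content, like the water in the cleaner bottle, compresses well.
original = b"water " * 1000
headers, payload = send(original)
restored = receive(headers, payload)
```

How much smaller the payload gets depends on the content and the algorithm; repetitive text shrinks dramatically, while already-compressed formats such as JPEG barely change.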

Making Cloudflare Pages the fastest way to serve your sites

In an era where visitors expect instant gratification and content on-demand, every millisecond counts. If you’re a web application developer, it’s an excellent time to be in this line of business, but with great power comes great responsibility. You’re tasked with creating an experience that is not only intuitive and delightful but also quick, reactive and responsive – sometimes with the two sides being at odds with each other. To add to this, if your business completely runs on the internet (say ecommerce), then your site’s Core Web Vitals could make or break your bottom line.

You don’t just need fast – you need magic fast. For the past two years, Cloudflare Pages has been serving up performant applications for users across the globe, but this week, we’re showing off our brand new, lightning fast architecture, decreasing the TTFB by up to 10X when serving assets.

And while a magician never reveals their secrets, this trick is too good to keep to ourselves. For all our application builders, we’re thrilled to share the juicy technical details on how we adopted Workers for Platforms — our extension of Workers to build SaaS businesses on top of — to make Pages one Continue reading

Speeding up APIs with Ricochet for API Gateway

APIs form the backbone of communication between apps and services on the Internet. They are a quick way for an application to ask for data or ask that a task be performed by a service. For example, anyone can write a weather app without being a meteorologist: simply ask a weather API for the forecast and display it in your app.

Speed is inherent to the API use case. Rather than transferring bulky files like images and HTML, APIs only share the essential data needed to render a webpage or an app. However, despite their efficiency, Internet latency can still impede API data transfers. If the server processing a user’s API request is located far from that user, the network round trip time can degrade that user’s experience.

Cloudflare's global network is specifically designed to optimize and accelerate internet traffic, including APIs. Our users enjoy features like 11ms DNS responses, load balancing, and Argo Smart Routing, which significantly improve API traffic speed. For web content, Cloudflare customers have always been able to cache their web traffic, serving requests from the closest data center and thereby reducing network round trip time and server processing time to a Continue reading