What Can You Learn From Facebook’s Meltdown?

I wanted to wait to put out a hot take on the Facebook issues from earlier this week because failures of this magnitude always have details that come out well after the actual excitement is done. A company like Facebook isn’t going to do the kind of in-depth post-mortem that we might like to see but the amount of information coming out from other areas does point to some interesting circumstances causing this situation.

Let me start off the whole thing by reiterating something important: Your network looks absolutely nothing like Facebook. The scale of what goes on there is unimaginable to the normal person. The average person has no conception of what one billion looks like. Likewise, the scale of the networking that goes on at Facebook is beyond the ken of most networking professionals. I’m not saying this to make your network feel inferior. More that I’m trying to help you understand that your network operations resemble those at Facebook in the same way that a model airplane resembles a space shuttle. They’re alike on the surface only.

Facebook has unique challenges that they have to face in their own way. Network automation there isn’t a bonus. It’s Continue reading

What happened on the Internet during the Facebook outage

What happened on the Internet during the Facebook outage

It's been a few days now since Facebook, Instagram, and WhatsApp went AWOL and experienced one of the most extended and rough downtime periods in their existence.

When that happened, we reported our bird's-eye view of the event and posted the blog Understanding How Facebook Disappeared from the Internet where we tried to explain what we saw and how DNS and BGP, two of the technologies at the center of the outage, played a role in the event.

In the meantime, more information has surfaced, and Facebook has published a blog post giving more details of what happened internally.

As we said before, these events are a gentle reminder that the Internet is a vast network of networks, and we, as industry players and end-users, are part of it and should work together.

In the aftermath of an event of this size, we don't waste much time debating how peers handled the situation. We do, however, ask ourselves the more important questions: "How did this affect us?" and "What if this had happened to us?" Asking and answering these questions whenever something like this happens is a great and healthy exercise that helps us improve our own resilience.

Continue reading

Technology Short Take 146

Welcome to Technology Short Take #146! Over the last couple of weeks, I’ve gathered a few technology-related links for you all. There’s some networking stuff, a few security links, and even a hardware-related article. But enough with the introduction—let’s get into the content!

Networking

Servers/Hardware

  • Chris Mellor speculates that Cisco UCS may be on the way out; Kevin Houston responds with a “I don’t think so.” Who will be correct? I guess we will just have to wait and see.

Security

Cloud Computing/Cloud Management

VMware resource MOID lookup filter

Are you trying to manage private clouds easily and efficiently using Ansible Automation Platform? When it comes to VMware infrastructure automation, the latest release of the vmware.vmware_rest Collection and new lookup plugins bring a set of fresh features to build, manage and govern various VMware use cases and accelerate the process from development to production.

The modules in the vmware.vmware_rest Collection rely on the resource MOID a lot. This is a design decision that we covered in an earlier blog. Consequently, when the users want to modify a VMware resource, they need to first write Ansible tasks to identify its MOID.

The new 2.1.0 release of vmware.vmware_rest Collection comes with a series of filter plugins dedicated to gathering the resource MOID. In this blog post, we will help you to keep your VMware automation playbooks concise.

 

But first, What is a MOID?

Internally VMware vSphere manages resources in the form of objects. Every object has a type and an ID. What we are calling MOID stands for Managed Object ID. Using the vSphere UI obfuscates the MOID logic from users and presents the objects in a visible hierarchy, potentially at several different locations.

 

Continue reading

Helping Apache Servers stay safe from zero-day path traversal attacks (CVE-2021-41773)

Helping Apache Servers stay safe from zero-day path traversal attacks (CVE-2021-41773)
Helping Apache Servers stay safe from zero-day path traversal attacks (CVE-2021-41773)

On September 29, 2021, the Apache Security team was alerted to a path traversal vulnerability being actively exploited (zero-day) against Apache HTTP Server version 2.4.49. The vulnerability, in some instances, can allow an attacker to fully compromise the web server via remote code execution (RCE) or at the very least access sensitive files. CVE number 2021-41773 has been assigned to this issue. Both Linux and Windows based servers are vulnerable.

An initial patch was made available on October 4 with an update to 2.4.50, however, this was found to be insufficient resulting in an additional patch bumping the version number to 2.4.51 on October 7th (CVE-2021-42013).

Customers using Apache HTTP Server versions 2.4.49 and 2.4.50 should immediately update to version 2.4.51 to mitigate the vulnerability. Details on how to update can be found on the official Apache HTTP Server project site.

Any Cloudflare customer with the setting normalize URLs to origin turned on have always been protected against this vulnerability.

Additionally, customers who have access to the Cloudflare Web Application Firewall (WAF), receive additional protection by turning on the rule with the following IDs:

Video: Theoretical View of Network Addressing

After explaining the basics of (network) names, addresses and routes, I wasted a few minutes of everyone’s time discussing the theoretical aspects of layered addressing, and then got back to practical issues like address scopes, namespaces, and address provisioning.

The video ends with a simple (and unappreciated) truth: if you have a point-to-point link between two nodes you don’t need data-link-layer addresses. The consequences of that fact are left as an exercise for the viewer (or you can wait till the next video ;)

You need Free ipSpace.net Subscription to watch the video, and the Standard ipSpace.net Subscription to register for upcoming live sessions.

Video: Theoretical View of Network Addressing

After explaining the basics of (network) names, addresses and routes, I wasted a few minutes of everyone’s time discussing the theoretical aspects of layered addressing, and then got back to practical issues like address scopes, namespaces, and address provisioning.

The video ends with a simple (and unappreciated) truth: if you have a point-to-point link between two nodes you don’t need data-link-layer addresses. The consequences of that fact are left as an exercise for the viewer (or you can wait till the next video ;)

You need Free ipSpace.net Subscription to watch the video, and the Standard ipSpace.net Subscription to register for upcoming live sessions.

Cisco: Networking, security, collaboration at heart of hybrid workforce concerns

When it comes to supporting the emerging hybrid workforce, getting the network and security right is top of mind among enterprise IT leaders.That's one finding detailed in Cisco’s new Hybrid Work Index, which the company says will be updated quarterly to gauge how worker and technology habits are evolving as the COVID-19 pandemic continues.The 10 most powerful companies in enterprise networking 2021 Cisco says the index gleans information from anonymized customer data points culled from a number of its products, including Meraki networking, ThousandEyes internet visibility, Webex collaboration, and security platforms Talos, Duo and Umbrella. The index also incorporates third-party survey data from more than 39,000 respondents across 34 countries.To read this article in full, please click here

The Slow But Tectonic Shifts In Datacenter Infrastructure

One of the more interesting trends in infrastructure that we try to get a handle on every once in a while is how much of the server and storage capacity is being deployed in bare metal, standalone fashion and how much is being sold to run utility style, cloud environments.

The Slow But Tectonic Shifts In Datacenter Infrastructure was written by Timothy Prickett Morgan at The Next Platform.

In a win for the Internet, federal court rejects copyright infringement claim against Cloudflare

In a win for the Internet, federal court rejects copyright infringement claim against Cloudflare
In a win for the Internet, federal court rejects copyright infringement claim against Cloudflare

Since the founding of the Internet, online copyright infringement has been a real concern for policy makers, copyright holders, and service providers, and there have been considerable efforts to find effective ways to combat it. Many of the most significant legal questions around what is called “intermediary liability” — the extent to which different links in the chain of an Internet transmission can be held liable for problematic online content — have been pressed on lawmakers and regulators, and played out in courts around issues of copyright.

Although section 230 of the Communications Decency Act in the United States provides important protections from liability for intermediaries, copyright and other intellectual property claims are one of the very few areas carved out of that immunity.

A Novel Theory of Liability

Over the years, copyright holders have sometimes sought to hold Cloudflare liable for infringing content on websites using our services. This never made much sense to us. We don’t host the content of the websites at issue, we don’t aggregate or promote the content or in any way help end users find it, and our services are not even necessary for the content’s availability online. Infrastructure service providers like Cloudflare are Continue reading

DDoS protection quickstart guide

DDoS Protect is an open source denial of service mitigation tool that uses industry standard sFlow telemetry from routers to detect attacks and automatically deploy BGP remotely triggered blackhole (RTBH) and BGP Flowspec filters to block attacks within seconds.

This document pulls together links to a number of articles that describe how you can quickly try out DDoS Protect and get it running in your environment:

Installing Cilium via a ClusterResourceSet

In this post, I’m going to walk you through how to install Cilium onto a Cluster API-managed workload cluster using a ClusterResourceSet. It’s reasonable to consider this post a follow-up to my earlier post that walked you through using a ClusterResourceSet to install Calico. There’s no need to read the earlier post, though, as this post includes all the information (or links to the information) you need. Ready? Let’s jump in!

Prerequisites

If you aren’t already familiar with Cluster API—hereafter just referred to as CAPI—then I would encourage you to read my introduction to Cluster API post. Although it is a bit dated (it was written in the very early days of the project, which recently released version 1.0). Some of the commands referenced in that post have changed, but the underlying concepts remain valid. If you’re not familiar with Cilium, check out their introduction to the project for more information. Finally, if you’re not familiar at all with the idea of ClusterResourceSets, you can read my earlier post or check out the ClusterResourceSet CAEP document.

Installing Cilium via a ClusterResourceSet

If you want to install Cilium via a ClusterResourceSet, the process looks something like this:

  1. Create Continue reading

The Advantages and Challenges of Going ‘Edge Native’

As the internet fills every nook and cranny of our lives, it runs into greater complexity for developers, operations engineers, and the organizations that employ them. How do you reduce latency? How do you comply with the regulations of each region or country where you have a virtual presence? How do you keep data near where it’s actually used? For a growing number of organizations, the answer is to use the edge. In this episode of the New Stack Makers podcast, Sheraline Barthelmy, head of product,  marketing and customer success for Cox Edge, were joined by The Advantages and Challenges of Going ‘Edge Native’ Also available on Google Podcasts, PlayerFM, Spotify, TuneIn The edge is composed of servers that are physically located close to the customers who will use them — the “last Continue reading

Waiting Room: Random Queueing and Custom Web/Mobile Apps

Waiting Room: Random Queueing and Custom Web/Mobile Apps
Waiting Room: Random Queueing and Custom Web/Mobile Apps

Today, we are announcing the general availability of Cloudflare Waiting Room to customers on our Enterprise plans, making it easier than ever to protect your website against traffic spikes. We are also excited to present several new features that have user experience in mind — an alternative queueing method and support for custom web/mobile applications.

First-In-First-Out (FIFO) Queueing

Waiting Room: Random Queueing and Custom Web/Mobile Apps

Whether you’ve waited to check out at a supermarket or stood in line at a bank, you’ve undoubtedly experienced FIFO queueing. FIFO stands for First-In-First-Out, which simply means that people are seen in the order they arrive — i.e., those who arrive first are processed before those who arrive later.

When Waiting Room was introduced earlier this year, it was first deployed to protect COVID-19 vaccine distributors from overwhelming demand — a service we offer free of charge under Project Fair Shot. At the time, FIFO queueing was the natural option due to its wide acceptance in day-to-day life and accurate estimated wait times. One problem with FIFO is that users who arrive later could see long estimated wait times and decide to abandon the website.

We take customer feedback seriously and improve products based on it. A frequent request Continue reading