It’s never been more crucial to help remote workforces stay fully operational — for the sake of countless individuals, businesses, and the economy at large. In light of this, Cloudflare recently launched a program that offers our Cloudflare for Teams suite for free to any company, of any size, through September 1. Some of these firms have been curious about how Cloudflare itself uses these tools.
Here’s how Cloudflare’s next-generation VPN alternative, Cloudflare Access, came to be.
Rewind to 2015. Back then, as with many other companies, all of Cloudflare’s internally-hosted applications were reached via a hardware-based VPN. When one of our on-call engineers received a notification (usually on their phone), they would fire up a clunky client on their laptop, connect to the VPN, and log on to Grafana.
It felt a bit like solving a combination lock with a fire alarm blaring overhead.
But for three of our engineers, enough was enough. Why was a cloud network security company relying on clunky on-premises hardware?
And thus, Cloudflare Access was born.
Many of the products Cloudflare builds are a direct result of the challenges our own team is looking to address, and Access is a Continue reading
With so many people at Cloudflare now working remotely, it's worth stepping back and looking at the systems we use to get work done and how we protect them. Over the years we've migrated from a traditional "put it behind the VPN!" company to a modern zero-trust architecture. Cloudflare hasn’t completed its journey yet, but we're pretty darn close. Our general strategy: protect every internal app we can with Access (our zero-trust access proxy), and simultaneously beef up our VPN’s security with Spectrum (a product allowing the proxying of arbitrary TCP and UDP traffic, protecting it from DDoS).
Before Access, we had many services behind VPN (Cisco ASA running AnyConnect) to enforce strict authentication and authorization. But VPN always felt clunky: it's difficult to set up, maintain (securely), and scale on the server side. Each new employee we onboarded needed to learn how to configure their client. But migration takes time and involves many different teams. While we migrated services one by one, we focused on the high priority services first and worked our way down. Until the last service is moved to Access, we still maintain our VPN, keeping it protected with Spectrum.
Some of our services didn't Continue reading
As part of our ongoing compliance efforts, Cloudflare’s PCI scope is reviewed quarterly and after any significant changes to ensure all in-scope systems are operating in accordance with the PCI DSS. This review also allows us to periodically review each product we offer as a PCI validated service provider and identify where there might be opportunities to provide greater value to our customers.
With our customers in mind, we completed our latest assessment and have increased our PCI certified product offering!
Building trust in our products is one critical component that allows Cloudflare’s mission of “Building a Better Internet” to succeed. We reaffirm our dedication to building trust in our products by obtaining industry standard security compliance certifications and complying with regulations.
Cloudflare is a Level 1 Merchant, the highest level, and also provides services to organizations to help secure their cardholder data environment. Maintaining PCI DSS compliance is important for Cloudflare because we must ensure (1) that our transmission and processing of cardholder data is secure for our own customers, (2) that our customers know they can trust Cloudflare’s products to transmit cardholder data securely, and (3) that anyone who interacts with Cloudflare’s services knows that their information is Continue reading
I am an engineer who loves docs. Well, OK, I don’t love all docs, but I believe docs are a crucial, yet often neglected, element of a great developer experience. I work on the developer experience team for Cloudflare Workers, focusing on several components of Workers, particularly the docs that we recently migrated to Gatsby.
Through porting our documentation site to Gatsby I learned a lot. In this post, I share some of the learnings that could’ve saved my former self from several headaches. This will hopefully help others considering a move to Gatsby or another static site generator.
Prior to our migration to Gatsby, we used Hugo for our developer documentation. There are a lot of positives about working with Hugo - fast build times, fast load times - that made building a simple static site a great use case for Hugo. Things started to Continue reading
Data encryption at rest is a must-have for any modern Internet company. Many companies, however, don't encrypt their disks, because they fear the potential performance penalty caused by encryption overhead.
Encrypting data at rest is vital for Cloudflare with more than 200 data centres across the world. In this post, we will investigate the performance of disk encryption on Linux and explain how we made it at least two times faster for ourselves and our customers!
When it comes to encrypting data at rest, there are several ways it can be implemented on a modern operating system (OS). Available techniques are tightly coupled with a typical OS storage stack. A simplified version of the storage stack and encryption solutions can be found in the diagram below:
On the top of the stack are applications, which read and write data in files (or streams). The file system in the OS kernel keeps track of which blocks of the underlying block device belong to which files and translates these file reads and writes into block reads and writes; however, the hardware specifics of the underlying storage device are abstracted away from the file system. Finally, the block subsystem actually Continue reading
We started the Bandwidth Alliance in 2018 with a group of like-minded cloud and networking partners. Our common goal was to help our mutual customers reduce or eliminate data transfer charges, sometimes known as “bandwidth” or “egress” fees, between the cloud and the consumer. By reducing or eliminating these costs, our customers can more easily choose a best-of-breed set of solutions because they don’t have to worry about data charges from moving workloads between vendors, and thereby becoming locked in to a single provider for all their needs. Today we’re announcing an important milestone: the addition of Alibaba, Zenlayer, and Cherry Servers to the Bandwidth Alliance, expanding it to a total of 20 partners. These partners offer our customers a wide choice of cloud services and products, each suited to different needs.
In addition, we are working with our existing partners including Microsoft Azure, Digital Ocean and several others to onboard customers and provide them the benefits of the Bandwidth Alliance. Contact us at [email protected] if you are interested.
Over the past year we have seen several customers take advantage of the Bandwidth Alliance and wanted to highlight two examples.
Part of Cloudflare's service is a CDN that makes millions of Internet properties faster and more reliable by caching web assets closer to browsers and end users.
We make improvements to our infrastructure to make end-user experiences faster, more secure, and more reliable all the time. Here’s a case study of one such engineering effort where something counterintuitive turned out to be the right approach.
Our storage layer, which serves millions of cache hits per second globally, is powered by high IOPS NVMe SSDs.
Although SSDs are fast and reliable, cache hit tail latency within our system is dominated by the IO capacity of our SSDs. Moreover, because flash memory chips wear out, a non-negligible portion of our operational cost, including the cost of new devices, shipment, labor and downtime, is spent on replacing dead SSDs.
Recently, we developed a technology that reduces our hit tail latency and reduces the wear out of SSDs. This technology is a memory-SSD hybrid storage system that puts unpopular assets in memory.
The end result: cache hits from our infrastructure are now faster for all customers.
You may have thought that was a Continue reading
When the security team at Cloudflare takes on new projects, we approach them with the goal of achieving the “builder first mindset” whereby we design, develop, and deploy solutions just as any standard engineering team would. Additionally, we aim to dogfood our products wherever possible. Cloudflare as a security platform offers a lot of functionality that is vitally important to us, including, but not limited to, our WAF, Workers platform, and Cloudflare Access. We get a lot of value out of using Cloudflare to secure Cloudflare. Not only does this allow us to test the security of our products; it provides us an avenue of direct feedback to help improve the roadmaps for engineering projects.
One specific product that we get a lot of use out of is our serverless platform, Cloudflare Workers. With it, we can have incredible flexibility in the types of applications that we are able to build and deploy to our edge. An added bonus here is that our team does not have to manage a single server that our code runs on.
The Cloudflare Load Balancer was introduced over three years ago to provide our customers with a powerful, easy to use tool to intelligently route traffic to their origins across the world. During the initial design process, one of the questions we had to answer was ‘where do we send traffic if all pools are down?’ We did not think it made sense just to drop the traffic, so we used the concept of a ‘fallback pool’ to send traffic to a ‘pool of last resort’ in the case that no pools were detected as available. While this may still result in an error, it gave an eyeball request a chance at being served successfully in case the pool was still up.
As a brief reminder, a load balancer helps route traffic across your origin servers to ensure your overall infrastructure stays healthy and available. Load Balancers are made up of pools, which can be thought of as collections of servers in a particular location.
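The fallback behaviour described above can be illustrated with a minimal Python sketch. This is not Cloudflare's implementation: the `Pool` class, the `choose_pool` function, and the pool names are hypothetical and exist only to show the idea of a "pool of last resort" that receives traffic when no pool is detected as healthy.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Pool:
    """A hypothetical pool: a named collection of origins with a health flag."""
    name: str
    healthy: bool


def choose_pool(pools: List[Pool], fallback: Pool) -> Pool:
    """Return the first healthy pool, or the fallback pool if none is healthy."""
    for pool in pools:
        if pool.healthy:
            return pool
    # No pool was detected as available. Rather than dropping the traffic,
    # send it to the pool of last resort; the request may still fail, but it
    # gets a chance of being served if the pool is in fact still up.
    return fallback


# All pools are marked down, so traffic goes to the fallback pool.
all_down = [Pool("us-east", False), Pool("eu-west", False)]
last_resort = Pool("us-east", False)
selected_when_down = choose_pool(all_down, last_resort)

# One pool is healthy, so the fallback is not used.
one_up = [Pool("us-east", False), Pool("eu-west", True)]
selected_when_up = choose_pool(one_up, last_resort)
```

Note that the fallback pool is returned regardless of its own health flag: the point of the design is that a possibly failing attempt is preferable to dropping the request outright.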
Over the past three years, we’ve made many updates to the dashboard. The new designs now support the fallback pool addition to the dashboard UI. The use of a fallback pool is incredibly helpful in Continue reading
Check out our thirteenth edition of The Serverlist below. Get the latest scoop on the serverless space, get your hands dirty with new developer tutorials, engage in conversations with other serverless developers, and find upcoming meetups and conferences to attend.
Sign up below to have The Serverlist sent directly to your mailbox.
At Cloudflare, we produce all types of video content, ranging from recordings of our Weekly All-Hands to product demos. Being able to stream video on demand has two major advantages when compared to live video:
Historically, we haven’t had a central, secure repository of all video content that could be easily accessed from the browser. Various teams choose their own platform to share the content. If I wanted to find a recording of a product demo, for example, I’d need to search Google Drive, Gmail and Google Chat with creative keywords. Very often, I would need to reach out to individual teams to finally locate the content.
So we decided we wanted to build CloudflareTV, an internal Netflix-like application that can only be accessed by Cloudflare employees and has all of our videos neatly organized and immediately watchable from the browser.
We wanted to achieve the following when building CloudflareTV:
This week, like many of you reading this article, I am working from home. I don’t know about you, but I’ve found it hard to stay focused when the Internet is full of news related to the coronavirus.
CNN. Twitter. Fox News. It doesn’t matter where you look, everyone is vying for your attention. It’s totally riveting…
… and it’s really hard not to get distracted.
It got me annoyed enough that I decided to do something about it. Using Cloudflare’s new product, Cloudflare Gateway, I removed all the online distractions I normally get snared by — at least during working hours.
This blog post isn’t very long, but that’s a function of how easy it is to get Gateway up and running!
To get started, you’ll want to set up Gateway under your Cloudflare account. Head to the Cloudflare for Teams dashboard to set it up for free (if you don’t already have a Cloudflare account, hit the ‘Sign up’ button beneath the login form).
If you are using Gateway for the first time, the dashboard will take you through an onboarding experience:
The onboarding flow will help you set up your first location. A location is Continue reading
This may sound like a weird title, but hear me out. You’d think keepalives would always be helpful, but it turns out reality isn’t always what you expect it to be. It really helps if you read Why does one NGINX worker take all the load? first. This post is an adaptation of a rather old post on Cloudflare’s internal blog, so not all details are exactly as they are in production today, but the lessons are still valid.
This is a story about how we were seeing some complaints about sporadic latency spikes, made some unconventional changes, and were able to slash the 99.9th percentile latency by 4x!
I'm going to focus only on two parts of our edge stack: FL and SSL.
Here’s a diagram:
These days we route all traffic through SSL for simplicity, but in the grand scheme of things it’s not going to matter much.
Each of these processes is not itself a single process, but rather a master process and a collection of Continue reading
Back when Cloudflare was created, over 10 years ago now, the dominant HTTP server used to power websites was Apache httpd. However, we decided to build our infrastructure using the then relatively new NGINX server.
There are many differences between the two, but crucially for us, the event loop architecture of NGINX was the key differentiator. In a nutshell, event loops work around the need to have one thread or process per connection by coalescing many of them in a single process. This reduces the need for expensive context switching from the operating system and also keeps the memory usage predictable. This is done by processing each connection until it wants to do some I/O; at that point, the connection is queued until the I/O task is complete. During that time the event loop is available to process other in-flight connections, accept new clients, and the like. The loop uses a multiplexing system call like epoll (or kqueue) to be notified whenever an I/O task is complete among all the running connections.
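To make the event loop idea concrete, here is a small Python sketch. It is an illustration only, not NGINX's code: the function name `run_event_loop` and the "uppercase echo" handler are assumptions made for the example. It uses the standard library `selectors` module, which picks epoll on Linux and kqueue on BSD and macOS where available, and local socket pairs stand in for client connections.

```python
import selectors
import socket


def run_event_loop(connections, expected):
    """Serve all connections from a single thread; return the number of requests handled."""
    sel = selectors.DefaultSelector()  # epoll or kqueue where the platform offers them
    for conn in connections:
        conn.setblocking(False)
        sel.register(conn, selectors.EVENT_READ)

    handled = 0
    while handled < expected:
        # Block until at least one connection has data ready (the multiplexing call).
        events = sel.select(timeout=1.0)
        if not events:
            break  # nothing became ready in time; stop instead of spinning forever
        for key, _mask in events:
            conn = key.fileobj
            data = conn.recv(1024)
            if data:
                conn.sendall(data.upper())  # the "work" done for this connection
                handled += 1
            else:
                # The peer closed the connection; stop watching it.
                sel.unregister(conn)
                conn.close()
    sel.close()
    return handled


# Three simulated clients, each with its own connection to the loop.
pairs = [socket.socketpair() for _ in range(3)]
servers = [server for server, _client in pairs]
clients = [client for _server, client in pairs]
for i, client in enumerate(clients):
    client.sendall(f"hello {i}".encode())

handled = run_event_loop(servers, expected=len(clients))
replies = [client.recv(1024) for client in clients]

for sock in servers + clients:
    sock.close()
```

One thread services all three connections, and it only touches a connection when the selector reports that it is readable. The flip side, relevant to the rest of this article, is that any handler that blocks or computes for a long time stalls every other connection in the same loop.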
In this article we will see that despite its advantages, event loop models also have their limits and falling back to good old threaded architecture is sometimes Continue reading
As the COVID-19 emergency continues and an increasing number of cities and countries are establishing quarantines or cordons sanitaires, the Internet has become, for many, the primary method to keep in touch with their friends and families. And it's a vital motor of the global economy as many companies have employees who are now working from home.
Traffic towards video conferencing, streaming services, news and e-commerce websites has surged. We've seen growth in traffic from residential broadband networks, and a slowing of traffic from businesses and universities.
The Cloudflare team is fully operational and the Network Operations Center (NOC) is watching the changing traffic patterns in the more than 200 cities in which we operate hardware.
Big changes in Internet traffic aren't unusual. They often occur around large sporting events like the Olympics or World Cup, cultural events like the Eurovision Song Contest and even during Ramadan at the breaking of the fast each day.
The Internet was built to cope with an ever changing environment. In fact, it was literally created, tested, debugged and designed to deal with changing load patterns.
Over the last few weeks, the Cloudflare Network team has noticed some new patterns and we wanted to Continue reading
Back in March 2019, we released Firewall Analytics, which provides insights into HTTP security events across all of Cloudflare's protection suite: Firewall rule matches, HTTP DDoS Attacks, Site Security Level (which harnesses Cloudflare's threat intelligence), and more. It helps customers tailor their security configurations more effectively. The initial release was for Enterprise customers. However, we believe that everyone should have access to powerful tools, not just large enterprises, and so in December 2019 we extended those same enterprise-level analytics to our Business and Pro customers.
Since then, we’ve built on top of our analytics platform: we improved the usability, added more functionality and extended it to additional Cloudflare services in the form of Account Analytics, DNS Analytics, Load Balancing Analytics, Monitoring Analytics and more.
Until recently, all of our dashboards were mostly HTTP-oriented and provided visibility into HTTP attributes such as the user agent, hosts, cached resources, etc. This is valuable to customers that use Cloudflare to protect and accelerate HTTP Continue reading
The last few weeks have seen unprecedented changes in how people live and work around the world. Over time more and more companies have given their employees the right to work from home, restricted business travel and, in some cases, outright sent their entire workforce home. In some countries, quarantines are in place keeping people restricted to their homes.
These changes in daily life are showing up as changes in patterns of Internet use around the world. In this blog post I take a look at changing patterns in northern Italy, South Korea and the Seattle area of Washington state.
To understand how Internet use is changing, it’s first helpful to start with what a normal pattern looks like. Here’s a chart of traffic from our Dallas point of presence in the middle of January 2020.
This is a pretty typical pattern. If you look carefully you can see that Internet use is down a little at the weekend and that Internet usage is diurnal: Internet use drops down during the night and then picks up again in the morning. The peaks occur at around 2100 local time and the troughs in the dead of night at around 0300. Continue reading
As the status of COVID-19 continues to impact people and businesses around the world, Cloudflare is committed to providing awareness and transparency to our customers, employees, and partners about how we are responding. We do not anticipate any significant disruptions in Cloudflare services.
Our Business Continuity Team is monitoring the situation closely and all company personnel are kept up to date via multiple internal communication channels including a live chat room. Customers and the public are encouraged to visit this blog post for the latest information.
Yes, Cloudflare’s Business Continuity Team is a cross-functional, geographically diverse group dedicated to navigating through a health crisis like COVID-19 as well as a variety of other scenarios that may impact employee safety and business continuity.
In addition to Cloudflare’s existing Disaster Recovery Plan we have implemented the following strategies:
This email was sent to all Cloudflare customers a short while ago.
From: Matthew Prince
Date: Thu, Mar 12, 2020 at 4:20 PM
Subject: Cloudflare During the Coronavirus Emergency
We know that organizations and individuals around the world depend on Cloudflare and our network. I wanted to send you a personal note to let you know how Cloudflare is dealing with the Coronavirus emergency.
First, the health and safety of our employees and customers is our top priority. We have implemented a number of sensible policies to this end, including encouraging many employees to work from home. This, however, hasn't slowed our operations. Our network operations center (NOC), security operations center (SOC), and customer support teams will remain fully operational and can do their jobs entirely remotely as needed.
Second, we are tracking Internet usage patterns globally. As more people work from home, peak traffic in impacted regions has increased, on average, approximately 10%. In Italy, which has imposed a nationwide quarantine, peak Internet traffic is up 30%. Traffic patterns have also shifted so peak traffic is occurring earlier in the day in impacted regions. None of these traffic changes raise any concern for us. Cloudflare's network is well provisioned Continue reading
On January 7th, we announced Cloudflare for Teams, a new way to protect organizations and their employees globally, without sacrificing performance. Cloudflare for Teams centers around two core products - Cloudflare Access and Cloudflare Gateway. Cloudflare Access is already available and used by thousands of teams around the world to secure internal applications. Cloudflare Gateway solves the other end of the problem by protecting those teams from security threats without sacrificing performance.
Today, we’re excited to announce new secure DNS filtering capabilities in Cloudflare Gateway. Cloudflare Gateway protects teams from threats like malware, phishing, ransomware, crypto-mining and other security threats. You can start using Cloudflare Gateway at dash.teams.cloudflare.com. Getting started takes less than five minutes.
We built Cloudflare Gateway to address key challenges our customers experience with managing and securing global networks. The root cause of these challenges is architecture and inability to scale. Legacy network security models solved problems in the 1990s, but teams have continued to attempt to force the Internet of the 2020s through them.
Historically, branch offices sent all of their Internet-bound traffic to one centralized data center at or near corporate headquarters. Administrators configured that to make sure all Continue reading