Maelstrom: mitigating datacenter-level disasters by draining interdependent traffic safely and efficiently
Veeraraghavan et al., OSDI’18
Here’s a really valuable paper detailing four-plus years of experience dealing with datacenter outages at Facebook. Maelstrom is the system Facebook use in production to mitigate and recover from datacenter-level disasters. The high-level idea is simple: drain traffic away from the failed datacenter and move it to other datacenters. Doing that safely, reliably, and repeatedly is not so simple!
Modern Internet services are composed of hundreds of interdependent systems spanning dozens of geo-distributed datacenters. At this scale, seemingly rare natural disasters, such as hurricanes blowing down power lines and flooding, occur regularly.
How regularly? Well, we’re told that Maelstrom has been in production at Facebook for over four years, and in that time has helped Facebook to recover from over 100 disasters. I make that about one disaster every two weeks!! Not all of these are total loss of a datacenter, but they’re all serious datacenter-wide incidents. One example was fibre cuts leading to a loss of 85% of the backbone network capability connecting a datacenter to the FB infrastructure. Maelstrom had most user-facing traffic drained in about 17 minutes, and all traffic …

The telecom giants will invest a like amount in DT's MobiledgeX edge computing subsidiary and SK's partner ID Quantique, which uses quantum physics to secure communication transmissions.
The company’s software enables organizations to see where business-critical data is going and if people are doing things with it that they shouldn’t be – either accidentally or maliciously.
Oracle co-CEO Mark Hurd predicted that by 2025 all cloud apps will include artificial intelligence. And likely because of this AI strategy, the company reached a deal to acquire DataFox and its cloud-based AI data engine.
The widespread deployment of cloud infrastructure has led IT teams to demand freedom of choice, but that might not always be what’s best for the organization as a whole.

This new research increases predictions made by IHS in May as the cloud services market has seen steady growth in existing regions and big growth in developing regions.
CFO Matt Ellis said there have been no surprises so far in the company’s four-market pre-standard 5G fixed wireless launch.
The vendor is working with Red Hat on a plug-in to link CloudFabric to Red Hat’s OpenShift Container Platform.