Exploiting commutativity for practical fast replication

Exploiting commutativity for practical fast replication Park & Ousterhout, NSDI’19

I’m really impressed with this work. The authors give us a practical-to-implement enhancement to replication schemes (e.g., as used in primary-backup systems) that offers a significant performance boost. I’m expecting to see this picked up and rolled out in real-world systems as word spreads. At a high level, CURP works by dividing execution into periods of commutative operation where ordering does not matter, punctuated by full syncs whenever commutativity would break.

The Consistent Unordered Replication Protocol (CURP) allows clients to replicate requests that have not yet been ordered, as long as they are commutative. This strategy allows most operations to complete in 1 RTT (the same as an unreplicated system).
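To make the fast path and the sync path concrete, here is a minimal in-process sketch. It is a toy model under stated assumptions, not the authors' implementation: the names `Witness`, `Master`, and `client_write` are hypothetical, there is no networking or failure handling, and two writes are assumed to commute exactly when they touch different keys.

```python
from dataclasses import dataclass, field


@dataclass
class Witness:
    """Hypothetical witness: holds unordered, not-yet-synced writes."""
    pending: dict = field(default_factory=dict)  # key -> value

    def record(self, key, value):
        # Assumption: writes commute only if they touch different keys,
        # so a second unsynced write to the same key is rejected.
        if key in self.pending:
            return False
        self.pending[key] = value
        return True


@dataclass
class Master:
    """Hypothetical primary that executes writes before they are synced."""
    store: dict = field(default_factory=dict)
    unsynced: set = field(default_factory=set)   # keys written since last sync
    backup: dict = field(default_factory=dict)   # stands in for the backups
    syncs: int = 0

    def execute(self, key, value):
        # Report whether this write commutes with all unsynced writes.
        commutes = key not in self.unsynced
        self.store[key] = value
        self.unsynced.add(key)
        return commutes

    def sync(self, witnesses):
        # Full sync: copy ordered state to the backup, then clear witnesses.
        self.backup = dict(self.store)
        self.unsynced.clear()
        for w in witnesses:
            w.pending.clear()
        self.syncs += 1


def client_write(master, witnesses, key, value):
    commutes = master.execute(key, value)
    # Build a list so every witness is asked, even after one rejects.
    accepted = [w.record(key, value) for w in witnesses]
    if commutes and all(accepted):
        return "fast"   # done without waiting for a sync
    master.sync(witnesses)
    return "slow"       # commutativity broke, so a full sync was needed


master, witnesses = Master(), [Witness(), Witness()]
results = [
    client_write(master, witnesses, "x", 1),
    client_write(master, witnesses, "y", 2),
    client_write(master, witnesses, "x", 3),  # conflicts with unsynced x=1
    client_write(master, witnesses, "x", 4),  # commutes again after the sync
]
print(results)  # ['fast', 'fast', 'slow', 'fast']
```

In this sketch only the third write pays for a sync. The writes before it touch distinct keys, and the write after it starts a fresh commutative period.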

When integrated with RAMCloud, write latency improved by ~2x and write throughput by 4x, which is impressive given that RAMCloud isn’t exactly hanging around in the first place! When integrated with Redis, CURP was able to add durability and consistency while keeping similar performance to non-durable Redis.

CURP can be easily applied to most existing systems using primary-backup replication. Changes required by CURP are not intrusive, and it works with any kind of backup mechanism (e.g., Continue reading

rbenv Install CentOS 7

rbenv is a utility for installing multiple Ruby versions on a host machine. Using rbenv allows you to install Ruby in a path you own, so you can install gems without sudo or root privileges. rbenv also allows you to target the exact ruby version in development...

Cumulus Networks is Excited to Announce being the First to Power Facebook’s Next Generation, Open Modular Platform, Minipack

Cumulus Networks, the leader in building open, modern and scalable networks, announced at OCP Summit that Cumulus Linux is the first network operating system to fully support the Minipack next-generation modular switch platform. Developed by Edgecore and contributed by Facebook to the Open Compute Project, Minipack empowers organizations of all sizes to architect, design and scale their infrastructure with unprecedented flexibility, capacity and interoperability.

Figure 1: Minipack Modular Chassis

Minipack is a modular switch platform, which means together, Cumulus Networks and Edgecore are bringing the benefits of web-scale networking to the mainstream. Minipack follows the open networking principles of disaggregation that allow customers to maintain consistent automated provisioning across all their switches of different form-factors (fixed or chassis).

Minipack leverages the latest ASIC technology from Broadcom including the Tomahawk III, the industry’s highest performance switch silicon. Compared to its predecessor, Backpack, Minipack is ½ the height, uses ½ the power and offers equivalent capacity making it one of the most operationally efficient open networking data center spine switches available today.

Additionally, Minipack offers either 100GE or 400GE options with field-replaceable Port Interface Modules (PIMs) in the following form factors:

Open Cloud Networking-Redefined

Networking vendors have long touted distinct routers and switches with different LAN/WAN interfaces for different customer use cases. After three decades of evolution, Ethernet now truly addresses all aspects of the present state and the next generation of networking, making it possible to support these previously separate use cases from a single common platform, which flexibly incorporates new capabilities in an open, standards-based approach. Arista, together with an ecosystem of partners including Broadcom and Cloud Titan customers, has a history of collaborating in many industry forums to define these new networking capabilities, including OCP, 25/50G and COBO, while driving next generation optics such as OSFP and QSFP-DD.

How did Facebook go down despite multiple data centers?

The Mercury retrograde kicked in big time on Wednesday as Facebook suffered an eight-hour outage that also affected Instagram and Facebook Messenger. No one was believed to be harmed; a few might have even had offline interactions with other human beings. Facebook said it wasn’t an attack, such as a denial-of-service attack, and has since issued a statement attributing the outage to a configuration error.

3 Stumbling Blocks for Network Engineers Adopting Ansible

Ansible, ansible, ansible seems to be all we hear these days. There are lots of resources out there all trying to convince us this is the new way to get stuff done. The reality is quite different – adoption of tools like this is slow in the networking world, and making the move is hard for command-line devotees.

Here are the three main problems I encountered in my adoption of Ansible as a modern way to manage devices:

1. Most network devices don’t support Python

Ansible is derived from the systems world, and is only latterly coming to be used for managing network devices. It is often said that Ansible is agentless, but when managing a Linux host (for example) the control machine pushes the Ansible playbook to that host and executes it there. In effect, *Python* is the agent.

Most network devices don’t have on-box Python, so when using Ansible against a router or a switch you have to have ‘connection: local’ in your playbook:





---
- name: Get info
  hosts: all
  roles:
    - Juniper.junos    # Loads the Juniper.junos role and its Junos modules
  connection: local    # Tells Ansible to run the play on the control machine
  gather_facts: no     # Skip fact gathering

What this does is run the playbook using the local Continue reading