Author Archives: adriancolyer

MadMax: surviving out-of-gas conditions in Ethereum smart contracts

MadMax: surviving out-of-gas conditions in Ethereum smart contracts Grech et al., OOPSLA’18

This week we’re turning our attention to a selection of papers from the recent OOPSLA conference. MadMax won a distinguished paper award, and makes a nice bridge from the CCS blockchain papers we were looking at last week.

Analysis and verification of smart contracts is a high-value task, possibly more so than in any other programming setting. The combination of monetary value and public availability makes the early detection of vulnerabilities a task of paramount importance. (Detection may occur after contract deployment. Despite the code immutability, which prevents bug fixes, discovering a vulnerability before an attacker may exploit it could enable a trusted third party to move vulnerable funds to safety).

MadMax is in the same vein as Securify, performing EVM bytecode analysis using Datalog (also with Soufflé) to infer security issues in contracts. In this instance, MadMax focuses on detecting vulnerabilities caused by out-of-gas conditions. The paper touches on some nice reusable building blocks (e.g. Vandal). I could easily see Vandal + Soufflé becoming a standard foundation for powerful EVM-based smart contract analysis.

MadMax is available on GitHub at https://github.com/nevillegreech/MadMax.

MadMax Continue reading

RapidChain: scaling blockchain via full sharding

RapidChain: scaling blockchain via full sharding Zamani et al., CCS’18

RapidChain is a sharding-based public blockchain protocol along the lines of OmniLedger that we looked at earlier in the year. RapidChain is resilient to Byzantine faults from up to 1/3 of its participants, requires no trusted setup, and can achieve more than 7,300 tx/sec with an expected confirmation latency of roughly 8.7 seconds in a network of 4,000 nodes with a time-to-failure of more than 4,500 years. Those are pretty interesting numbers!



RapidChain partitions the set of nodes into multiple smaller groups of nodes called committees that operate in parallel on disjoint blocks of transactions and maintain disjoint ledgers.

With n nodes, each committee is of size m = c log n, where c is a security parameter typically set around 20. E.g., with 1,000 nodes we’ll have around 17 committees of 60 nodes each. To make all this work we’ll need a number of different parts:

  • A way to bootstrap the initial system
  • A way of forming (and re-forming) committees, and allowing nodes to join and leave
  • A way of reaching consensus within a committee (shard)
  • A way of verifying transactions that cross shards
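
As a quick sanity check on the committee sizing above, here’s a minimal sketch in Python. The base of the logarithm is an assumption on my part: base 10 reproduces the quoted example numbers.

import math

def committee_size(n: int, c: int = 20) -> int:
    # m = c * log n; with a base-10 log, n = 1,000 and c = 20
    # give m = 60, i.e. around 17 committees.
    return round(c * math.log10(n))

n = 1_000
m = committee_size(n)
print(f"{n} nodes -> committees of ~{m} nodes, ~{round(n / m)} committees")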

Bootstrapping

The initial set of participants start Continue reading

FairSwap: how to fairly exchange digital goods

FairSwap: how to fairly exchange digital goods Dziembowski et al., CCS’18

(Preprint)

This is a transactions paper with a twist. The transactions we’re talking about are purchases of digital assets. More specifically, the purchase of a file (document, movie, archive of a dataset, …). The property we strongly care about is atomicity: either the seller receives payment and the buyer receives a valid file or neither of these things happen. The buyer and seller don’t trust each other (so e.g., “you send me the payment and then I’ll send you the file” is not an acceptable solution, nor is “you send me the file and then I’ll send you the payment”). This is known as the fair exchange problem.

Fair exchange is a well studied research problem. It has been shown that without further assumptions fair exchange cannot be achieved without a Trusted Third Party (TTP). To circumvent this impossibility, research has studied weaker security models— most notably, the optimistic model in which a TTP is consulted only in case one party deviates from the expected behavior.
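
To make the atomicity requirement concrete, here’s a minimal hash-locked exchange skeleton in Python. This is the generic pattern many fair-exchange protocols build on, not FairSwap’s actual construction (whose key contribution is an efficient proof of misbehaviour when the seller delivers the wrong file), and the XOR stream cipher is a toy for illustration only.

import hashlib, secrets

def xor_crypt(data: bytes, key: bytes) -> bytes:
    # Toy symmetric cipher (encrypt == decrypt); not secure.
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))

# Seller: encrypt the file, publish ciphertext and a commitment to the key.
file_bytes = b"the digital good"
key = secrets.token_bytes(32)
ciphertext = xor_crypt(file_bytes, key)
commitment = hashlib.sha256(key).hexdigest()

# Buyer: lock payment so it can only be claimed by revealing a preimage of
# the commitment (on a blockchain, a hash-locked contract does this).
def claim_payment(revealed_key: bytes) -> bool:
    return hashlib.sha256(revealed_key).hexdigest() == commitment

# Seller reveals the key to claim payment; the reveal lets the buyer decrypt.
assert claim_payment(key)
assert xor_crypt(ciphertext, key) == file_bytes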

In many real-world scenarios, escrow services play the role of the trusted third party. Unfortunately this means you Continue reading

Securify: practical security analysis of smart contracts

Securify: practical security analysis of smart contracts Tsankov et al., CCS’18

Sometimes the perfect is the enemy of the good. When we’re talking about securing smart contracts, we need all the help we can get! Bugs can cost millions of dollars. Securify uses a set of expert heuristics (patterns) to help identify issues in smart contracts. It’s available at https://securify.ch, has analysed over 18K uploaded contracts, and is used by security auditors as part of their arsenal.

The increased adoption of smart contracts demands strong security guarantees. Unfortunately, it is challenging to create smart contracts that are free of security bugs. As a consequence, critical vulnerabilities in smart contracts are discovered and exploited every few months. In turn, these exploits have led to losses reaching millions worth of USD in the past few years…. Despite their potential, repeated security concerns have shaken the trust in handling billions of USD by smart contracts.

Too right! We’ve examined some of the challenges involved in creating correct smart contracts in previous editions of The Morning Paper, as well as tools such as Zeus that help with verification.

It’s not a solvable problem in the general case (i.e., ‘perfect’ Continue reading

LEMNA: explaining deep learning based security applications

LEMNA: explaining deep learning based security applications Guo et al., CCS’18

Understanding why a deep learning model produces the outputs it does is an important part of gaining trust in the model, and in some situations being able to explain decisions is a strong requirement. Today’s paper shows that by carefully considering the architectural features of a given model, it’s possible to co-design an explanatory model. The idea is applied to deep learning models in the security domain (to detect the start of functions within binaries, and to detect malware) where for reasons we’ll look at next, the assumptions made by black-box explainers such as LIME don’t apply.

Like LIME, LEMNA approximates a local area of a complex deep learning decision boundary using a simple interpretable model. Unlike LIME, LEMNA can handle non-linear local boundaries, and feature dependencies (e.g., for sequences fed into RNNs, which explicitly model dependencies in sequential data).
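
For intuition, here’s a sketch of the local-surrogate idea in its simpler LIME-style linear form. LEMNA itself fits a mixture of linear regressions with a fused-lasso penalty to capture non-linear boundaries and feature dependencies; that machinery is not reproduced here.

import numpy as np

def local_linear_explanation(f, x, n_samples=500, sigma=0.3, seed=0):
    # Perturb around x, query the black-box model f, and fit a
    # distance-weighted linear surrogate; its coefficients act as
    # local feature importances.
    rng = np.random.default_rng(seed)
    X = x + sigma * rng.standard_normal((n_samples, x.shape[0]))
    y = np.array([f(xi) for xi in X])
    w = np.sqrt(np.exp(-np.sum((X - x) ** 2, axis=1) / sigma))
    A = np.hstack([X, np.ones((n_samples, 1))])  # intercept column
    coef, *_ = np.linalg.lstsq(A * w[:, None], y * w, rcond=None)
    return coef[:-1]

# Toy black box with an interaction term, so it is locally non-linear.
f = lambda x: 1.0 / (1.0 + np.exp(-(2 * x[0] - x[1] + 0.5 * x[0] * x[1])))
print(local_linear_explanation(f, np.array([1.0, -1.0])))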

Why explainability matters

While intrigued by the high accuracy, security practitioners are concerned about the lack of transparency of deep learning models, and thus hesitate to widely adopt deep learning classifiers in security and safety-critical areas.

Explanations that are understandable by security analysts can help Continue reading

Towards usable checksums: automating the integrity verification of web downloads for the masses

Towards usable checksums: automating the integrity verification of web downloads for the masses Cherubini et al., CCS’18

If you tackled Monday’s paper on BEAT you deserve something a little easier to digest today, and ‘Towards usable checksums’ fits the bill nicely! There’s some great data-driven product management going on here as the authors set out to quantify current attitudes and behaviours regarding downloading files from the Internet, design a solution to improve security and ease-of-use, and then test their solution to gather feedback and prepare for a more widely deployed beta version.

When I was growing up we were all taught “Don’t talk to strangers”, and “Never get in a stranger’s car”. As has been well noted by others, so much for that advice! Perhaps the modern equivalent is “Don’t download unknown files from the Internet!” This paper specifically looks at applications made directly available from developer websites (vs downloads made through app stores).
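
The manual process the paper sets out to automate amounts to something like the following sketch (the paper’s own solution lives in the browser; this is just the underlying check):

import hashlib

def verify_download(path: str, expected_sha256: str) -> bool:
    # Hash the downloaded file in chunks and compare against the
    # checksum published on the developer's website.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == expected_sha256.lower()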

A popular and convenient way to download programs is to use official app stores such as Apple’s Mac App Store and Microsoft’s Windows Store. Such platforms, however, have several drawbacks for developers, including long review and validation times, technical restrictions (e.g., sandboxing), Continue reading

BEAT: asynchronous BFT made practical

BEAT: asynchronous BFT made practical Duan et al., CCS’18

Reaching agreement (consensus) is hard enough; doing it in the presence of active adversaries who can tamper with or destroy your communications is much harder still. That’s the world of Byzantine fault tolerance (BFT). We’ve looked at Practical BFT (PBFT) and HoneyBadger in previous editions of The Morning Paper. Today’s paper, BEAT, builds on top of HoneyBadger to offer BFT with even better latency and throughput.

Asynchronous BFT protocols are arguably the most appropriate solutions for building high-assurance and intrusion-tolerant permissioned blockchains in wide-area network (WAN) environments, as these asynchronous protocols are inherently more robust against timing and denial-of-service (DoS) attacks that can be mounted over an unprotected network such as the Internet.

The best performing asynchronous BFT protocol, HoneyBadger, still lags behind the partially synchronous PBFT protocol in terms of throughput and latency. BEAT is actually a family of five different asynchronous BFT protocols that start from the HoneyBadger baseline and make improvements targeted at different application scenarios.

Unlike HoneyBadgerBFT, which was designed to optimize throughput only, BEAT aims to be flexible and versatile, providing protocol instances optimized for latency, throughput, bandwidth, or scalability (in terms of the number Continue reading

Uncertainty propagation in data processing systems

Uncertainty propagation in data processing systems Manousakis et al., SoCC’18

When I’m writing an edition of The Morning Paper, I often imagine a conversation with a hypothetical reader sat in a coffee shop somewhere at the start of their day. There are three levels of takeaway from today’s paper choice:

  • If you’re downing a quick espresso, then it’s good to know that uncertainty can creep into our data in lots of different ways, and if you compute with those uncertain values as if they were precise, errors can compound quickly leading to incorrect results or false confidence.
  • If you’re savouring a cortado, then you might also want to dip into the techniques we can use to propagate uncertainty through a computation.
  • If you’re lingering over a latte, then the UP (Uncertainty Propagation) framework additionally shows how to integrate these techniques into a dataflow framework.

We implement this framework in a system called UP-MapReduce, and use it to modify ten applications, including AI/ML, image processing, and trend analysis applications to process uncertain data. Our evaluation shows that UP-MapReduce propagates uncertainties with high accuracy and, in many cases, low performance overheads.
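
As a taste of the ‘cortado’ level, here is a minimal Monte Carlo propagation sketch. Monte Carlo is one standard propagation technique; the excerpt doesn’t pin down exactly which methods UP-MapReduce uses, so treat the choice as illustrative only.

import numpy as np

def propagate(f, means, stds, n=100_000, seed=0):
    # Sample the uncertain inputs, push the samples through the
    # computation, and summarise the output distribution.
    rng = np.random.default_rng(seed)
    samples = rng.normal(means, stds, size=(n, len(means)))
    out = f(samples)
    return out.mean(), out.std()

# Example: y = x0 * x1 with uncertain inputs 10 ± 1 and 2 ± 0.2.
mean, std = propagate(lambda s: s[:, 0] * s[:, 1], [10.0, 2.0], [1.0, 0.2])
print(f"y ~ {mean:.2f} ± {std:.2f}")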

Are you sure?

Uncertainty can arise from a number of Continue reading

Continuum: a platform for cost-aware low-latency continual learning

Continuum: a platform for cost-aware low-latency continual learning Tian et al., SoCC’18

Let’s start with some broad approximations. Batching leads to higher throughput at the cost of higher latency. Processing items one at a time leads to lower latency and often reduced throughput. We can recover throughput to a degree by throwing horizontally scalable resources at the problem, but it’s hard to recover latency. In many business scenarios latency matters, so we’ve been seeing a movement over time from batching through micro-batching to online streaming.

Continuum looks at the same issues from the perspective of machine learning models. Offline (batch) trained models can suffer from concept drift (loss of accuracy over time) as a result of not incorporating the latest data. I.e., there’s a business cost incurred for higher latency of update incorporation. Online models support incremental updates. Continuum determines the optimum time to retrain models in the presence of incoming data, based on user policy (best effort, cost-aware, or user-defined). There’s some great data here about the need for and benefit of continual learning, and a surprising twist in the tale where it turns out that even if you can afford it, updating the model on Continue reading
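
In the spirit of the cost-aware policy, here is a hedged sketch of the decision rule. Continuum derives its actual policies analytically; the parameter names here are invented.

def should_retrain(pending_samples: int,
                   staleness_cost_per_sample: float,
                   retrain_cost: float) -> bool:
    # Retrain once the accumulated cost of serving a stale model
    # outweighs the cost of a retraining run.
    return pending_samples * staleness_cost_per_sample >= retrain_cost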

ScootR: scaling R dataframes on dataflow systems

ScootR: scaling R dataframes on dataflow systems Kunft et al., SoCC’18

The language of big data is Java (/ Scala). The languages of data science are Python and R. So what do you do when you want to run your data science analysis over large amounts of data?

…programming languages with rich support for data manipulation and statistics, such as R and Python, have become increasingly popular… [but]… they are typically designed for single machine and in-memory usage…. In contrast, parallel dataflow systems, such as Apache Flink and Apache Spark, are able to handle large amounts of data. However, data scientists are often unfamiliar with the systems’ native language and programming abstraction, which is crucial to achieve good performance.

A tempting solution is to embed Python / R support within the dataflow engine. There are two basic approaches to this today:

  1. Keep the guest language components in a separate process and use IPC (inter-process communication) to exchange input and output data between the dataflow engine and the guest language process. This approach can support the full power of the guest language, but pays a heavy price in IPC and serialisation costs.
  2. Use source-to-source (STS) translation to translate guest Continue reading

Overload control for scaling WeChat microservices

Overload control for scaling WeChat microservices Zhou et al., SoCC’18

There are two reasons to love this paper. First off, we get some insights into the backend that powers WeChat; and secondly the authors share the design of the battle-hardened overload control system DAGOR that has been in production at WeChat for five years. This system has been specifically designed to take into account the peculiarities of microservice architectures. If you’re looking to put a strategy in place for your own microservices, you could do a lot worse than start here.

WeChat

The WeChat backend at this point consists of over 3000 mobile services, including instant messaging, social networking, mobile payment, and third-party authorization. The platform sees between 10^10 and 10^11 external requests per day. Each such request can trigger many more internal microservice requests, such that the WeChat backend as a whole needs to handle hundreds of millions of requests per second.

WeChat’s microservice system accommodates more than 3000 services running on over 20,000 machines in the WeChat business system, and these numbers keep increasing as WeChat is becoming immensely popular… As WeChat is ever actively evolving, its microservice system has been undergoing fast iteration of service updates. For instance, Continue reading

Unikernels as processes

Unikernels as processes Williams et al., SoCC’18

Ah, unikernels. Small size, fast booting, tiny attack surface, resource efficient, hard to deploy on existing cloud platforms, and undebuggable in production. There’s no shortage of strong claims on both sides of the fence.

In today’s paper choice, Williams et al. give us an intriguing new option in the design space: running unikernels as processes. Yes, that’s initially confusing to get your head around! That means you still have a full-fat OS underneath the process, and you don’t get to take advantage of the strong isolation afforded by VMs. But through a clever use of seccomp, unikernels as processes still have strong isolation as well as increased throughput, reduced startup time, and increased memory density. Most importantly though, with unikernels as processes we can reuse standard infrastructure and tools:

We believe that running unikernels as processes is an important step towards running them in production, because, as processes, they can Continue reading

Debugging distributed systems with why-across-time provenance

Debugging distributed systems with why-across-time provenance Whittaker et al., SoCC’18

This value is 17 here, and it shouldn’t be. Why did the get request return 17?

Sometimes the simplest questions can be the hardest to answer. As the opening sentence of this paper states:

Debugging distributed systems is hard.

The kind of why questions we’re interested in for this paper are questions of provenance. What are the causes of this output? Provenance has been studied in the context of relational databases and dataflow systems, but here we’re interested in general distributed systems. (Strictly, those where the behaviour of each node can be modelled by a deterministic state machine: non-deterministic behaviour is left to future work).

Why why-provenance doesn’t work

Relational databases have why-provenance, which sounds on the surface exactly like what we’re looking for.

Given a relational database, a query issued against the database, and a tuple in the output of the query, why-provenance explains why the output tuple was produced. That is, why-provenance produces the input tuples that, if passed through the relational operators of the query, would produce the output tuple in question.
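
A toy illustration of that definition for a single join, assuming set semantics (the recorded input-tuple sets are the ‘witnesses’ for each output tuple):

employees = [("alice", "eng"), ("bob", "ops")]
depts = [("eng", "bldg-7"), ("ops", "bldg-9")]

def join_with_why_provenance():
    # For each output tuple, record the input tuples that produce it.
    out = {}
    for e in employees:
        for d in depts:
            if e[1] == d[0]:  # join on department
                out[(e[0], d[1])] = {("emp", e), ("dept", d)}
    return out

for tup, witness in join_with_why_provenance().items():
    print(tup, "<-", witness)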

One reason that won’t work in our distributed systems setting is that Continue reading

ApproxJoin: approximate distributed joins

ApproxJoin: approximate distributed joins Le Quoc et al., SoCC’18

GitHub: https://ApproxJoin.github.io

The join is a fundamental data processing operation and has been heavily optimised in relational databases. When you’re working with large volumes of unstructured data though, say with a data processing framework such as Flink or Spark, joins become distributed and much more expensive. One of the reasons for this is the amount of data that needs to be moved over the network. In many use cases, approximate results would be acceptable, and as we’ve seen before, likely much faster and cheaper to compute. Approximate computing with joins is tricky though: if you sample datasets before the join you reduce data movement, but also sacrifice up to an order of magnitude in accuracy; if you sample results after the join you don’t save on any data movement and the process is slow.

This paper introduces an approximate distributed join technique, ApproxJoin, which is able to sample before data shuffling without loss of end result accuracy. Compared to unmodified Spark joins with the same sampling ratio it achieves a speedup of 9x while reducing the shuffled data volume by 82x.
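
Here is a single-machine sketch of the pre-shuffle idea as described above. In the distributed setting the key intersection is computed with Bloom filters before any data moves; a plain set stands in for that below.

import random

def approx_join(left, right, ratio=0.1, seed=0):
    # Step 1: find keys present on both sides, so tuples that can
    # never join are neither sampled nor shuffled.
    joinable = {k for k, _ in left} & {k for k, _ in right}
    # Step 2: sample among the joinable tuples on one side.
    rnd = random.Random(seed)
    lsample = [(k, v) for k, v in left
               if k in joinable and rnd.random() < ratio]
    # Step 3: join the sample against the filtered other side.
    rindex = {}
    for k, v in right:
        if k in joinable:
            rindex.setdefault(k, []).append(v)
    return [(k, lv, rv) for k, lv in lsample for rv in rindex.get(k, [])]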

The following charts show ApproxJoin’s latency Continue reading

ASAP: fast, approximate graph pattern mining at scale

ASAP: fast, approximate graph pattern mining at scale Iyer et al., OSDI’18

I have a real soft spot for approximate computations. In general, we waste a lot of resources on overly accurate analyses when understanding the trends and / or the neighbourhood is quite good enough (do you really need to know it’s 78.763895% vs 78 ± 1%?). You can always drill in with more accuracy if the approximate results hint at something interesting or unexpected.

Approximate analytics is an area that has gathered attention in big data analytics, where the goal is to let the user trade-off accuracy for much faster results.

(See e.g. ApproxHadoop which we covered on The Morning Paper a while back).

In the realm of graph processing, graph pattern mining algorithms, which discover structural patterns in a graph, can reveal very interesting things in our data but struggle to scale to larger graphs. This is in contrast to graph analysis algorithms such as PageRank which typically compute properties of a graph using neighbourhood information.
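
To make ‘approximate pattern mining’ concrete, here is the textbook edge-sampling estimator for triangle counting. This is a standard technique rather than ASAP’s own estimator (ASAP uses neighbourhood sampling with tighter error guarantees):

import random

def approx_triangles(edges, p=0.2, seed=0):
    # Keep each edge with probability p, count triangles in the sampled
    # graph, then scale up: a triangle survives with probability p^3.
    rnd = random.Random(seed)
    sampled = [e for e in edges if rnd.random() < p]
    adj = {}
    for u, v in sampled:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Each triangle is counted once per participating edge.
    found = sum(len(adj[u] & adj[v]) for u, v in sampled) / 3
    return found / p**3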

Today, a deluge of graph processing frameworks exist, both in academia and open-source… a vast majority of the existing graph processing frameworks however have focused on graph Continue reading

Sharding the shards: managing datastore locality at scale with Akkio

Sharding the shards: managing datastore locality at scale with Akkio Annamalai et al., OSDI’18

In Harry Potter, the Accio Summoning Charm summons an object to the caster of the spell, sometimes transporting it over a significant distance. In Facebook, Akkio summons data to a datacenter with the goal of improving data access locality for clients. Central to Akkio is the notion of microshards (μ-shards), units of data much smaller than a typical shard. μ-shards are defined by the client application, and should exhibit strong access locality (i.e., the application tends to read/write the data in a μ-shard together in a small window of time). Sitting as a layer between client applications and underlying datastores, Akkio has been in production at Facebook since 2014, where it manages around 100PB of data.

Measurements from our production environment show that Akkio reduces latencies by up to 50%, cross-datacenter traffic by up to 50%, and storage footprint by up to 40% compared to reasonable alternatives.

Akkio can support trillions of μ-shards and many tens of millions of data access requests per second.
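
Here is a toy sketch of the locality mechanism as I read the excerpt. All names and thresholds are invented; Akkio’s real access counters, placement policy, and migration protocol are far richer.

class MicroShardDirectory:
    def __init__(self, window=100, threshold=0.8):
        self.home = {}      # μ-shard id -> current home datacenter
        self.history = {}   # μ-shard id -> recent accessing datacenters
        self.window = window
        self.threshold = threshold

    def access(self, shard, from_dc):
        # Route the access; migrate the μ-shard if most recent accesses
        # now come from a different datacenter.
        self.home.setdefault(shard, from_dc)
        hist = self.history.setdefault(shard, [])
        hist.append(from_dc)
        del hist[:-self.window]
        majority = max(set(hist), key=hist.count)
        if (majority != self.home[shard]
                and hist.count(majority) >= self.threshold * len(hist)):
            self.home[shard] = majority
        return self.home[shard]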

Motivation

Our work in this area was initially motivated by our aim to reduce service response times and resource Continue reading

The FuzzyLog: a partially ordered shared log

The FuzzyLog: a partially ordered shared log Lockerman et al., OSDI’18

If you want to build a distributed system then having a distributed shared log as an abstraction to build upon — one that gives you an agreed upon total order for all events — is such a big help that it’s practically cheating! (See the “Can’t we all just agree” mini-series of posts for some of the background on consensus).

Services built over a shared log are simple, compact layers that map a high-level API to append/read operations on the shared log, which acts as the source of strong consistency, durability, failure atomicity, and transactional isolation. For example, a shared log version of ZooKeeper uses 1K lines of code, an order of magnitude lower than the original system.
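
That ‘simple, compact layer’ pattern is easy to sketch: here’s a toy map built over a totally ordered shared log, with an in-memory list standing in for the log (FuzzyLog’s whole point is then to relax the total order this relies on).

class SharedLogMap:
    def __init__(self, log):
        self.log = log      # shared, append-only, totally ordered
        self.state = {}
        self.applied = 0

    def put(self, key, value):
        self.log.append(("put", key, value))  # writes are appends

    def get(self, key):
        # Reads replay any log entries not yet applied locally.
        for op, k, v in self.log[self.applied:]:
            self.state[k] = v
        self.applied = len(self.log)
        return self.state.get(key)

log = []
a, b = SharedLogMap(log), SharedLogMap(log)
a.put("x", 1); b.put("x", 2)
assert a.get("x") == b.get("x") == 2  # the log's order decides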

There’s a catch of course. System-wide total orders are expensive to maintain. Sometimes it may be impossible (e.g. in the event of a network partition). But perhaps we don’t always need a total ordering. Oftentimes for example causal consistency is strong enough. FuzzyLog aims to provide the simplicity of a shared log without imposing a total order: it provides partial ordering instead. It’s designed for a world Continue reading

Moment-based quantile sketches for efficient high cardinality aggregation queries

Moment-based quantile sketches for efficient high cardinality aggregation queries Gan et al., VLDB’18

Today we’re temporarily pausing our tour through some of the OSDI’18 papers in order to look at a great sketch-based data structure for quantile queries over high-cardinality aggregates.

That’s a bit of a mouthful so let’s jump straight into an example of the problem at hand. Say you have telemetry data from millions of heterogeneous mobile devices running your app. Each device tracks multiple metrics such as request latency and memory usage, and is associated with dimensional metadata (categorical variables) such as application version and hardware model.

In applications such as A/B testing, exploratory data analysis, and operations monitoring, analysts perform aggregation queries to understand how specific user cohorts, device types, and feature flags are behaving.

We want to be able to ask questions like “what’s the 99%-ile latency over the last two weeks for v8.2 of the app?”

SELECT percentile(latency, 99) FROM requests
WHERE time > date_sub(curdate(), 2 WEEK)
AND app_version = "v8.2"

As well as threshold queries such as “what combinations of app version and hardware platform have a 99th percentile latency exceeding 100ms?”

SELECT app_version, hw_model, PERCENTILE(latency,  Continue reading
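
The core data structure is tiny. Here’s a hedged sketch: the paper’s version also tracks min, max, and log-moments, and recovers quantile estimates from the moments via maximum-entropy estimation, which is omitted here.

class MomentsSketch:
    def __init__(self, k=5):
        self.n = 0
        self.power_sums = [0.0] * k  # sum(x), sum(x^2), ..., sum(x^k)

    def add(self, x):
        self.n += 1
        for i in range(len(self.power_sums)):
            self.power_sums[i] += x ** (i + 1)

    def merge(self, other):
        # Constant size and mergeable: ideal for high-cardinality
        # GROUP BY aggregation.
        self.n += other.n
        self.power_sums = [a + b for a, b in
                           zip(self.power_sums, other.power_sums)]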

Noria: dynamic, partially-stateful data-flow for high-performance web applications

Noria: dynamic, partially-stateful data-flow for high-performance web applications Gjengset, Schwarzkopf et al., OSDI’18

I have way more margin notes for this paper than I typically do, and that’s a reflection of my struggle to figure out what kind of thing we’re dealing with here. Noria doesn’t want to fit neatly into any existing box!

We’ve seen streaming data-flow engines that maintain state and offer SQL interfaces and even transactions (e.g. Apache Flink, and data Artisans’ Streaming Ledger for Flink). The primary model here is data-flow, and SQL is bolted on as an interface to the state. The title of this paper sets me off thinking along those lines, but from the end user perspective, Noria looks and feels more like a database. The SQL interface is primary, not ancillary, and it maintains relational data in base tables (using RocksDB as the storage engine). Noria makes intelligent use of data-flow beneath the SQL interface (i.e., dataflow is not exposed as an end-user programming model) in order to maintain a set of (semi-)materialized views. Noria itself figures out the most efficient data-flows to maintain those views, and how to update the data-flow graphs in the face of Continue reading
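
For a minimal flavour of view maintenance beneath a SQL interface, consider a toy COUNT(*) GROUP BY view updated by base-table deltas. Noria’s partially-stateful dataflow adds partial materialization, upqueries, and live graph changes on top of this basic idea.

class CountView:
    def __init__(self):
        self.counts = {}  # materialized: group key -> count

    def on_delta(self, key, delta):
        # Called by the dataflow with +1 per base-table insert,
        # -1 per delete.
        new = self.counts.get(key, 0) + delta
        if new:
            self.counts[key] = new
        else:
            self.counts.pop(key, None)

    def read(self, key):
        return self.counts.get(key, 0)  # reads hit materialized state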

RobinHood: tail latency aware caching – dynamic reallocation from cache-rich to cache-poor

RobinHood: tail latency aware caching – dynamic reallocation from cache-rich to cache-poor Berger et al., OSDI’18

It’s time to rethink everything you thought you knew about caching! My mental model goes something like this: we have a set of items that probably follow a power-law of popularity.

We have a certain finite cache capacity, and we use it to cache the most frequently requested items, speeding up request processing.

Now, there’s a long tail of less frequently requested items, and if we request one of these that’s not in the cache the request is going to take longer (higher latency). But it makes no sense whatsoever to try and improve the latency for these requests by ‘shifting our cache to the right.’

Hence the received wisdom that unless the full working set fits entirely in the cache, then a caching layer doesn’t address tail latency.

So far we’ve been talking about one uniform cache. But in a typical web application one incoming request might fan out to many back-end service requests processed in parallel. The OneRF page rendering framework at Microsoft (which serves msn.com, microsoft.com and xbox.com among others) relies on more than 20 backend Continue reading
