Archive

Category Archives for "High Scalability"

How Uber Scales Their Real-time Market Platform

Reportedly Uber has grown an astonishing 38 times bigger in just four years. Now, for what I think is the first time, Matt Ranney, Chief Systems Architect at Uber, in a very interesting and detailed talk--Scaling Uber's Real-time Market Platform---tells us a lot about how Uber’s software works.

If you are interested in Surge pricing, that’s not covered in the talk. We do learn about Uber’s dispatch system, how they implement geospatial indexing, how they scale their system, how they implement high availability, and how they handle failure, including the surprising way they handle datacenter failures using driver phones as an external distributed storage system for recovery.

The overall impression of the talk is one of very rapid growth. Many of the architectural choices they’ve made are a consequence of growing so fast and trying to empower recently assembled teams to move as quickly as possible. A lot of technology has been used on the backend because their major goal has been for teams to get the engineering velocity as high as possible.

After a understandably chaotic (and very successful) start it seems Uber has learned a lot about their business and what they really need to Continue reading

Stuff The Internet Says On Scalability For September 11th, 2015

Hey, it's HighScalability time:


Need a challenge? Solve the code on this 17.5 feet tall 11,000 year old wooden statue!

  • $100 million: amount Popcorn could have made from criminal business offers; 3.2-gigapixel: World’s Most Powerful Digital Camera; $17.3 trillion: US GDP in 2014;  700 million: Facebook time series database data points added per minute; 300PB: Facebook data stored in Hive; 5,000: Airbnb EC2 instances.

  • Quotable Quotes:
    • @jimmydivvy: NASA: Decade long flight across the solar system. Arrives within 72 seconds of predicted. No errors. Me: undefined is not a function
    • Packet Pushers~ Everyone has IOPS now. We are heading towards invisible consumption being the big deal going forward. 
    • Randy Medlin: Gonna drop $1000+ on a giant iPad, $100 on a stylus, then whine endlessly about $4.99 drawing apps.
    • Anonymous: Circuit Breaker + Real-time Monitoring + Recovery = Resiliency
    • Astrid Atkinson: I used to get paged awake at two in the morning. You go from zero to Google is down. That’s a lot to wake up to.
    • Todd Waters~ In 1979, 200MB weighed 30 lbs and took up the space of a washing machine
    • Todd Waters~ CERN spends more compute Continue reading

Trade Stimulators and the Very Old Idea of Increasing User Engagement

Very early in my web career I was introduced to the almost mystical holy grail of web (and now app) properties: increasing user engagement.

The reason is simple. The more time people spend with your property the more stuff you can sell them. The more stuff you can sell the more value you have. Your time is money. So we design for addiction.

Famously Facebook, through the ties that bind, is the engagement leader with U.S. adults spending a stunning average of 42.1 minutes per day on Facebook. Cha-ching.

Immense resources are spent trying to make websites and apps sticky. Psychological tricks and gamification strategies are deployed with abandon to get you not to leave a website or to keep playing an app.

It turns out this is a very old idea. Casinos are designed to keep you gambling, for example. And though I’d never really thought about it before, I shouldn’t have been surprised to learn retail stores of yore used devices called trade stimulators to keep customers hanging around and spending money.

Never heard of trade stimulators? I hadn’t either until, while watching American Pickers, one of my favorite shows, they talked about this whole Continue reading

Want IoT? Here’s How a Major US Utility Collects Power Data from Over 5.5 Million Meters

I serendipitously found this fascinating reply by Richard Farley, your friendly neighborhood meter reader, in a local email list giving a rare first-hand account of how the Advanced Metering Infrastructure works in California. This is real Internet of Things territory. So if it doesn't have a typical post structure that is why. He generously allowed it to be reposted with a few redactions. When you see “A Major US Utility”, please replace it with the most likely California power company.

Old mechanical meters had bearings that over time wore out and caused friction that threw off readings. That friction would cause the analog gauge to spin slower than it should, resulting in lower readings than actual usage -- hence "free power". It's like a clock falling behind over time as the gears wear down.

For A Major US Utility "estimated billing" happens when your meter, for whatever reason, was not able to be read. The algorithms approved by the CPUC and are almost always favorable to the consumer. A Major US Utility hates to have to do estimated billing because they almost always have to underestimate based on the algorithms and CPUC rules. Not 100% sure about this, but if they Continue reading

Stuff The Internet Says On Scalability For September 4th, 2015

Hey, it's HighScalability time:


An astonishing 300 billion stars in our galaxy have planets. Take a look in the Eyes on Exoplanets app.
  • 1 billion: people who used Facebook in a single day; 2.8 million: sq. ft. in new Apple campus (with drone pics);  1.1 trillion: Apache Kafka messages per day; 2,000 years: age of termite mounds in Central Africa; 30: # of times better the human brain is better than the best supercomputers; 4 billion: requests it took to trigger an underflow bug.

  • Quotable Quotes:
    • Sara Seager: If an Earth 2.0 exists, we have the capability to find and identify it by the 2020s.
    • Android Dick: But you’re my friend, and I’ll remember my friends, and I’ll be good to you. So don’t worry, even if I evolve into Terminator, I’ll still be nice to you. I’ll keep you warm and safe in my people zoo, where I can watch you for ol’ times sake.
    • @viktorklang: "If the conversation is typically “scale out” versus “scale up” if we’re coordination-free, we get to choose “scale out” while “scaling up.”
    • Amir Najmi: At Google, data scientists are just Continue reading

How Agari Uses Airbnb’s Airflow as a Smarter Cron

This is a guest repost by Siddharth Anand, Data Architect at Agari, on Airbnb's open source project Airflow, a workflow scheduler for data pipelines. Some think Airflow has a superior approach.

Workflow schedulers are systems that are responsbile for the periodic execution of workflows in a reliable and scalable manner. Workflow schedulers are pervasive - for instance, any company that has a data warehouse, a specialized database typically used for reporting, uses a workflow scheduler to coordinate nightly data loads into the data warehouse. Of more interest to companies like Agari is the use of workflow schedulers to reliably execute complex and business-critical "big" data science workloads! Agari, an email security company that tackles the problem of phishing, is increasingly leveraging data science, machine learning, and big data practices typically seen in data-driven companies like LinkedIn, Google, and Facebook in order to meet the demands of burgeoning data and dynamicism around modeling.

In a previous post, I described how we leverage AWS to build a scalable data pipeline at Agari. In this post, I discuss our need for a workflow scheduler in order to improve the reliablity of our data pipelines, providing the previous post's pipeline Continue reading

Building Globally Distributed, Mission Critical Applications: Lessons From the Trenches Part 2

This is Part 2 of a guest post by Kris Beevers, founder and CEO, NSONE, a purveyor of a next-gen intelligent DNS and traffic management platform. Here's Part 1.

Integration and functional testing is crucial

Unit testing is hammered home in every modern software development class.  It’s good practice. Whether you’re doing test-driven development or just banging out code, without unit tests you can’t be sure a piece of code will do what it’s supposed to unless you test it carefully, and ensure those tests keep passing as your code evolves.

In a distributed application, your systems will break even if you have the world’s best unit testing coverage. Unit testing is not enough.

You need to test the interactions between your subsystems. What if a particular piece of configuration data changes – how does that impact Subsystem A’s communication with Subsystem B? What if you changed a message format – do all the subsystems generating and handling those messages continue to talk with each other? Does a particular kind of request that depends on results from four different backend subsystems still result in a correct response after your latest code changes?

Unit tests don’t answer these questions, Continue reading

Sponsored Post: Microsoft , Librato, Surge, Redis Labs, Jut.io, VoltDB, Datadog, MongoDB, SignalFx, InMemory.Net, Couchbase, VividCortex, MemSQL, Scalyr, AiScaler, AppDynamics, ManageEngine, Site24x7

Who's Hiring?

  • Microsoft’s Visual Studio Online team is building the next generation of software development tools in the cloud out in Durham, North Carolina. Come help us build innovative workflows around Git and continuous deployment, help solve the Git scale problem or help us build a best-in-class web experience. Learn more and apply.

  • VoltDB's in-memory SQL database combines streaming analytics with transaction processing in a single, horizontal scale-out platform. Customers use VoltDB to build applications that process streaming data the instant it arrives to make immediate, per-event, context-aware decisions. If you want to join our ground-breaking engineering team and make a real impact, apply here.  

  • At Scalyr, we're analyzing multi-gigabyte server logs in a fraction of a second. That requires serious innovation in every part of the technology stack, from frontend to backend. Help us push the envelope on low-latency browser applications, high-speed data processing, and reliable distributed systems. Help extract meaningful data from live servers and present it to users in meaningful ways. At Scalyr, you’ll learn new things, and invent a few of your own. Learn more and apply.

  • UI EngineerAppDynamics, founded in 2008 and lead by proven innovators, Continue reading

Building Globally Distributed, Mission Critical Applications: Lessons From the Trenches Part 1

This is Part 1 of a guest post by Kris Beevers, founder and CEO, NSONE, a purveyor of a next-gen intelligent DNS and traffic management platform.

Every tech company thinks about it: the unavoidable – in fact, enviable – challenge of scaling its applications and systems as the business grows. How can you think about scaling from the beginning, and put your company on good footing, without optimizing prematurely? What are some of the key challenges worth thinking about now, before they bite you later on? When you’re building mission critical technology, these are fundamental questions. And when you’re building a distributed infrastructure, whether for reliability or performance or both, they’re hard questions to answer.

Putting the right architecture and processes in place will enable your systems and company to withstand the common hiccups distributed, high traffic applications face. This enables you to stay ahead of scaling constraints, manage inevitable network and system failures, stay calm and debug production issues in real-time, and grow your company and product successfully.

Who is this guy?

I’ve been building globally distributed, large scale applications for a long time.  Way back in the first dot-com boom, I bailed on college classes for Continue reading

Stuff The Internet Says On Scalability For August 28th, 2015x

Hey, it's HighScalability time:


The oldest known fossil of a flowering plant. 130 million years old. What digital will last so long?
  • 32.6: Ashley Madison password cracks per hour; 1 million: cores in the Human Brain Project's silicon brain; 54,000: tennis balls used at Wimbledon; 4 kB: size of first web page; 1.2 million: million messages per second Apache Samza performance on a single node; 27%: higher conversion for sites loading one second faster; 

  • Quotable Quotes:
    • @adrianco: Apple first read about Mesos on http://highscalability.com  and for a year have run Siri on the worlds biggest cluster 
    • @Besvinick: Interesting recurring sentiment from recent grads: We lived most of our college lives on Snapchat—now we don't have any "tangible" memories.
    • Robin Hobb: For most moments of our lives, we have forgotten almost all of the world around us, except for what currently claims our interest.
    • @Carnage4Life: I'd like to thank all the Amazon employees who cried at their desks to make this possible ?? ????? 
    • Jim Handy: The single most interesting thing I learned at the 2015 Flash Memory Summit was that 3D NAND doesn’t have a natural limit, after Continue reading

7 Strategies for 10x Transformative Change

Peter Thiel, VC, PayPal co-founder, early Facebook investor, and most importantly, the supposed inspiration for Silicon Valley's intriguing Peter Gregory character, argues in his book Zero to One that a successful business needs to make a product that is 10 times better than its closest competitor

The title Zero to One refers to the idea of progress as either horizontal/extensive or vertical/intensive. For a more detailed explanation take a look at Peter Thiel's CS183: Startup - Class 1 Notes Essay.

Horizontal/extensive progress refers to copying things that work. Observe, imitate, and repeat.  The one word summary for the concept is  "globalization.” For more on this PAYPAL MAFIA: Reid Hoffman & Peter Thiel's Master Class in China is an interesting watch.

Vertical/intensive progress means doing something genuinely new, that is going from zero to one, as apposed to going from one to N, which is merely globalization. This is the creative spark. The hero's journey of over coming obstacles on the way to becoming the Master of the Universe you were always meant to be.

We see this pattern with Google a lot. Google often hits scaling challenges long before anyone else and because they have a systematizing culture they Continue reading

Ask HighScalability: Choose an Async App Server or Multiple Blocking Servers?

Jonathan Willis, software developer by day and superhero by night, asked an interesting question via Twitter on StackOverflow

tl;dr Many Rails apps or one Vertx/Play! app?


I've been having discussions with other members of my team on the pros and cons of using an async app server such as the Play! Framework (built on Netty) versus spinning up multiple instances of a Rails app server. I know that Netty is asynchronous/non-blocking, meaning during a database query, network request, or something similar an async call will allow the event loop thread to switch from the blocked request to another request ready to be processed/served. This will keep the CPUs busy instead of blocking and waiting.

I'm arguing in favor or using something such as the Play! Framework or Vertx.io, something that is non-blocking... Scalable. My team members, on the other hand, are saying that you can get the same benefit by using multiple instances of a Rails app, which out of the box only comes with one thread and doesn't have true concurrency as do apps on the JVM. They are saying just use enough app instances to match the performance of one Play! application (or however many Play! apps Continue reading

Stuff The Internet Says On Scalability For August 21st, 2015

Hey, it's HighScalability time:


Hunter-Seeker? Nope. This is the beauty of what a Google driverless car sees. Great TED talk.
  • $2.8 billion: projected Instagram ad revenue in 2017; 1 trillion: Azure event hub events per month; 10 million: Stack Overflow questions asked; 1 billion: max volts generated by a lightening strike; 850: apps downloaded every second from the AppStore; 2000: years data can be stored in DNA; 60: # of robots needed to replace 600 humans; 1 million: queries per second with Nginx, Ubuntu, EC2

  • Quotable Quotes:
    • Tales from the Lunar Module Guidance Computer: we landed on the moon with 152 Kbytes of onboard computer memory.
    • @ijuma: Included in JDK 8 update 60 "changes GHASH internals from using byte[] to long, improving performance about 10x
    • @ErrataRob: I love the whining over the Bitcoin XT fork. It's as if anarchists/libertarians don't understand what anarchy/libertarianism means.
    • Network World: the LHC Computing Grid has 132,992 physical CPUs, 553,611 logical CPUs, 300PB of online disk storage and 230PB of nearline (magnetic tape) storage. It's a staggering amount of processing capacity and data storage that relies on having no single point of failure.
    • @petereisentraut: Chef is Continue reading

The Microsoft Take on Containers and Docker

This is a guest repost by Mark Russinovich, CTO of Microsoft Azure (and novelist!). We all benefit from a vibrant competitive cloud market and Microsoft is part of that mix. Here's a good container overview along with Microsoft's plan of attack. Do you like their story? Is it interesting? Is it compelling?

You can’t have a discussion on cloud computing lately without talking about containers. Organizations across all business segments, from banks and major financial service firms to e-commerce sites, want to understand what containers are, what they mean for applications in the cloud, and how to best use them for their specific development and IT operations scenarios.

From the basics of what containers are and how they work, to the scenarios they’re being most widely used for today, to emerging trends supporting “containerization”, I thought I’d share my perspectives to better help you understand how to best embrace this important cloud computing development to more seamlessly build, test, deploy and manage your cloud applications.

Containers Overview

In abstract terms, all of computing is based upon running some “function” on a set of “physical” resources, like processor, memory, disk, network, etc., to accomplish a task, whether a Continue reading

Sponsored Post: Surge, Redis Labs, Jut.io, VoltDB, Datadog, MongoDB, SignalFx, InMemory.Net, Couchbase, VividCortex, MemSQL, Scalyr, AiScaler, AppDynamics, ManageEngine, Site24x7

Who's Hiring?

  • VoltDB's in-memory SQL database combines streaming analytics with transaction processing in a single, horizontal scale-out platform. Customers use VoltDB to build applications that process streaming data the instant it arrives to make immediate, per-event, context-aware decisions. If you want to join our ground-breaking engineering team and make a real impact, apply here.  

  • At Scalyr, we're analyzing multi-gigabyte server logs in a fraction of a second. That requires serious innovation in every part of the technology stack, from frontend to backend. Help us push the envelope on low-latency browser applications, high-speed data processing, and reliable distributed systems. Help extract meaningful data from live servers and present it to users in meaningful ways. At Scalyr, you’ll learn new things, and invent a few of your own. Learn more and apply.

  • UI EngineerAppDynamics, founded in 2008 and lead by proven innovators, is looking for a passionate UI Engineer to design, architect, and develop our their user interface using the latest web and mobile technologies. Make the impossible possible and the hard easy. Apply here.

  • Software Engineer - Infrastructure & Big DataAppDynamics, leader in next generation solutions for managing modern, distributed, and Continue reading

How Autodesk Implemented Scalable Eventing over Mesos

This is a guest post by Olivier Paugam, SW Architect for the Autodesk Cloud. I really like this post because it shows how bits of infrastructure--Mesos, Kafka, RabbitMQ, Akka, Splunk, Librato, EC2--can be combined together to solve real problems. It's truly amazing how much can get done these days by a small team.

I was tasked a few months ago to come up with a central eventing system, something that would allow our various backends to communicate with each other. We are talking about activity streaming backends, rendering, data translation, BIM, identity, log reporting, analytics, etc.  So something really generic with varying load, usage patterns and scaling profile.  And oh, also something that our engineering teams could interface with easily.  Of course every piece of the system should be able to scale on its own.

I obviously didn't have time to write too much code and picked up Kafka as our storage core as it's stable, widely used and works okay (please note I'm not bound to using it and could switch over to something else).  Now I of course could not expose it directly and had to front-end it with some API. Without thinking much I also Continue reading

Stuff The Internet Says On Scalability For August 14th, 2015

Hey, it's HighScalability time:


Being Google CEO: Nice. Becoming Tony Stark: Priceless (Alphabet)

 

  • $7: WeChat's revenue per user and there are 549 million of them; 60%: Etsy users using mobile; 10: times per second a self-driving car makes a decision; 900: calories in a litre of blood, vampires have very efficient metabolisms; 5 billion: the largest feature in the universe in light years

  • Quotable Quotes:
    • @sbeam: they finally had the Enigma machine. They opened the case. A card fell out. Turing picked it up. "Damn. They included a EULA." #oraclefanfic
    • kordless: compute and storage continue to track with Moore's Law but bandwidth doesn't. I keep wondering if this isn't some sort of universal limitation on this reality that will force high decentralization.
    • @SciencePorn: If you were to remove all of the empty space from the atoms that make up every human on earth, all humans would fit into an apple.
    • @adrianco: Commodity server with 1.4TB of RAM running a mix of 16GB regular DRAM and 128GB Memory1 modules.
    • @JudithNursalim: "One of the most scalable structure in history was the Roman army. Its unit: eight guys; the number of guys that Continue reading

Why My Water Droplet Is Better Than Your Hadoop Cluster

We’ve had computation using slime mold and soap film, now we have computation using water droplets. Stanford bioengineers have built a “fully functioning computer that runs like clockwork - but instead of electrons, it operates using the movement of tiny magnetised water droplets.”

 

By changing the layout of the bars on the chip it's possible to make all the universal logic gates. And any Boolean logic circuit can be built by moving the little magnetic droplets around. Currently the chips are about half the size of a postage stamp and the droplets are smaller than poppy seeds.

What all this means I'm not sure, but pavo6503 has a comment that helps understand what's going on:

Logic gates pass high and low states. Since they plan to use drops of water as carriers and the substances in those drops to determine what the high/low state is they could hypothetically make a filter that sorts drops of water containing 1 to many chemicals. Pure water passes through unchanged. water with say, oil in it, passes to another container, water with alcohol to another. A "chip" with this setup could be used to purify water where there are many contaminants you want separated.

1 27 28 29