Hey, it's HighScalability time:
This is guest post by Sergei Sheinin, creator of the 2DX Web UI Database Cluster Framework, a low latency big data cluster with in-memory noSQL DBMS Web Browser client.
When I began working in the field of data management the disconnect between rigid structure of relational database tables and free form of documents managed by end users and their businesses stood out as a technical and managerial hurdle. On the one hand there were strict definitions of normalized relational database models and unstructured document formats on the other. Often the users in charge of changing document structures held organizational responsibilities far removed from database modeling or programming. On one occasion I was involved in a project where call center operators made on the fly decisions to update a document structure based on phone conversations with customers. Such updates had to be streamed into a relational back-end creating havoc in database structure and build of table columns.
In seeking a permanent solution I researched merits of Entity-Attribute-Value database schema and its applications. This technique proved successful in enabling front end users to modify relational-bound documents through performing updates to structure described in their metadata. However application of EAV raised Continue reading
This is a guest repost by Ken Fromm, a 3x tech co-founder — Vivid Studios, Loomia, and Iron.io. Here's Part 1.
Job processing at scale at high concurrency across a distributed infrastructure is a complicated feat. There are many components involvement — servers and controllers to process and monitor jobs, controllers to autoscale and manage servers, controllers to distribute jobs across the set of servers, queues to buffer jobs, and whole host of other components to ensure jobs complete and/or are retried, and other critical tasks that help maintain high service levels. This section peels back the layers a bit to provide insight into important aspects within the workings of a serverless platform.
Throughput has always been the coin of the realm in computer processing — how quickly can events, requests, and workloads be processed. In the context of a serverless architecture, I’ll break throughput down further when discussing both latency and concurrency. At the base level, however, a serverless architecture does provide a more beneficial architecture than legacy applications and large web apps when it comes to throughput because it provide for far better resource utilization.
In a post by Travis Reeder on What is Serverless Computing and Why is Continue reading
Hey, it's HighScalability time:
This is a guest post by Tony Branson.
Performance, scalability, and HA are often used interchangeably, and any confusion about them can result in unrealistic metrics and deployment delays. It is important to invest your time and understand the differences among these three approaches before you invest your money in resilient systems.
Performance
This is a guest repost by Ken Fromm, a 3x tech co-founder — Vivid Studios, Loomia, and Iron.io.
First I should mention that of course there are servers involved. I’m just using the term that popularly describes an approach and a set of technologies that abstracts job processing and scheduling from having to manage servers. In a post written for ReadWrite back in 2012 on the future of software and applications, I described “serverless” as the following.
The phrase “serverless” doesn’t mean servers are no longer involved. It simply means that developers no longer have to think that much about them. Computing resources get used as services without having to manage around physical capacities or limits. Service providers increasingly take on the responsibility of managing servers, data stores and other infrastructure resources…Going serverless lets developers shift their focus from the server level to the task level. Serverless solutions let developers focus on what their application or system needs to do by taking away the complexity of the backend infrastructure.
At the time of that post, the term “serverless” was not all that well received, as evidenced by the comments on Hacker News. With the introduction of a number Continue reading
Hey, it's HighScalability time:
In this article, I want to share with you how I solved a very interesting problem of synchronizing data between IoT devices and a cloud application.
I’ll start by outlining the general idea and the goals of my project. Then I’ll describe my implementation in greater detail. This is going to be a more technically advanced part, where I’ll be talking about the Contiki OS, databases, protocols and the like. In the end, I’ll summarize the technologies I used to implement the whole system.
So, let’s talk about the general idea first.
Here’s a scheme illustrating the final state of the whole system:
I have a user who can connect to IoT devices via a cloud service or directly (that is over Wi-Fi).
Also, I have an application server somewhere in the cloud and the cloud itself somewhere on the Internet. This cloud can be anything — for example, an AWS or Azure instance or it could be a dedicated server, it could be anything :)
The application server is connected to IoT devices over some protocol. I need this connection to exchange data between the application server and the IoT devices.
The IoT devices are connected to each other in Continue reading
Hey, it's HighScalability time:
After many days of rain one lane of this two lane road collapsed into the canyon. It's been out for a month and it will be many more months before it will be fixed. Thanks to Google maps way too many drivers take this once sleepy local road.
How do you think drivers go through this chokepoint?
One hundred experience points to you if you answered one at a time.
One at a time! Through a half-duplex pipe following a first in first out discipline takes forever!
Yes, there is a stop sign. And people default to this mode because it appeals to our innate sense of fairness. What could be fairer than alternating one at a time?
The problem is it's stupid.
While waiting, stewing, growing angrier, I often think if people just knew a little queueing theory we could all be on our way a lot faster.
We can't make the pipe full duplex, so that's out. Let's assume there's no priority involved, vehicles are roughly the same size and take roughly the same time to transit the network. Then what do you do?
Why can't people figure out its faster to drive through Continue reading
Hey, it's HighScalability time:
Hey, it's HighScalability time:
Have you ever read a book and wondered how any human could have written something so brilliant? For me it was Lord of the Rings. I despaired that in a hundred lifetimes I could never write a book so rich, so deep, so beautiful. Since then I've learned a few things about about how LoTr was created that has made me reconsider. The kick-in-the-head is that it's the same lesson I learned long ago about writing software.
I've always been amazed how a program can start as a single source file and after years of continued effort turn into a working system that is so large no human can come close to understanding it. If you had tried from the start to build the system you ended up with you would have never ever got there. That's just not how it works. Software is path dependent.
I've experienced this growth from a single cell to a Cambrian explosion many times so I know it's a thing. What I hadn't considered is how it's also a thing for writing books too.
Creating good software is a process of evolution through the mechanism of constant iteration for the purpose of survival. Continue reading
As the Russian rouble exchange rate slumped two years ago, it drove us to think of cutting hardware and hosting costs for the Mail.Ru email service. To find ways of saving money, let’s first take a look at what emails consist of.
Indexes and bodies account for only 15% of the storage size, whereas 85% is taken up by files. So, files (that is attachments) are worth exploring in more detail in terms of optimization. At that time, we didn’t have file deduplication in place, but we estimated that it could shrink the total storage size by 36% since many users receive the same messages, such as price lists from online stores or newsletters from social networks containing images and so on. In this article, I’m going to describe how we implemented a deduplication system under the supervision of Albert Galimov.
Hey, it's HighScalability time:
Steve Jobs is notorious for hating buttons. Here's Jobs explaining the foulness of buttons during his famous iPhone introduction:
What's wrong with their [other phones] user interface? The problem with them is really sort of in the bottom 40. They all have these keyboard that are there whether you need them or not to be there. And they all have these control buttons that are fixed in plastic and are the same for every application. Well every application wants a slightly different user interface, a slightly optimized set of buttons just for it. And what happens if you think of a great idea six months from now? You can't run around and add a button to these things. They're already shipped. So what do you do? It doesn't work because the buttons and the controls can't change.
The iPhone solved the button problem with a new multi-touch screen and by using your finger as the pointing device (not a nasty nasty stylus). We all know how this works now, but it was novel back in the olden days.
The iPhone was one of three new products based on revolutionary user interface development: the mouse and the Macintosh; the click-wheel Continue reading