Todd Hoff

Author Archives: Todd Hoff

Sponsored Post: zanox Group, Varnish, LaunchDarkly, Swrve, Netflix, Aerospike, TrueSight Pulse, Redis Labs, InMemory.Net, VividCortex, MemSQL, Scalyr, AiScaler, AppDynamics, ManageEngine, Site24x7

Who's Hiring?

  • The zanox Group are looking for a Senior Architect. We're looking for someone smart and pragmatic to help our engineering teams build fast, scalable and reliable solutions for our industry leading affiliate marketing platform. The role will involve a healthy mixture of strategic thinking and hands-on work - there are no ivory towers here! Our stack is diverse and interesting. You can apply for the role in either London or Berlin.

  • Swrve -- In November we closed a $30m funding round, and we’re now expanding our engineering team based in Dublin (Ireland). Our mobile marketing platform is powered by 8bn+ events a day, processed in real time. We’re hiring intermediate and senior backend software developers to join the existing team of thirty engineers. Sound like fun? Come join us.

  • Senior Service Reliability Engineer (SRE): Drive improvements to help reduce both time-to-detect and time-to-resolve while concurrently improving availability through service team engagement.  Ability to analyze and triage production issues on a web-scale system a plus. Find details on the position here: https://jobs.netflix.com/jobs/434

  • Manager - Performance Engineering: Lead the world-class performance team in charge of both optimizing the Netflix cloud stack and developing the performance observability capabilities Continue reading

A Journey Through How Zapier Automates Billions of Workflow Automation Tasks

This is a guest repost by Bryan Helmig, ‎Co-founder & CTO at Zapier, who makes it easy to automate tasks between web apps.

 

Zapier is a web service that automates data flow between over 500 web apps, including MailChimp, Salesforce, GitHub, Trello and many more.

Imagine building a workflow (or a "Zap" as we call it) that triggers when a user fills out your Typeform form, then automatically creates an event on your Google Calendar, sends a Slack notification and finishes up by adding a row to a Google Sheets spreadsheet. That's Zapier. Building Zaps like this is very easy, even for non-technical users, and is infinitely customizable.

As CTO and co-founder, I built much of the original core system, and today lead the engineering team. I'd like to take you on a journey through our stack, how we built it and how we're still improving it today!

The Teams Behind the Curtains

It takes a lot to make Zapier tick, so we have four distinct teams in engineering:

  • The frontend team, which works on the very powerful workflow editor.
  • The full stack team, which is cross-functional but focuses on the workflow engine.
  • The Continue reading

When Should Approximate Query Processing Be Used?

This is a guest repost by Barzan Mozafari, who is part of a new startup, www.snappydata.io, that recently launched an open source OLTP + OLAP Database built on Spark.

The growing market for Big Data has created a lot of interest around approximate query processing (AQP) as a means of achieving interactive response times (e.g., sub-second latencies) when faced with terabytes and petabytes of data. At the same time, there is a lot of misinformation about this technology and what it can or cannot do.

Having been involved in building a few academic prototypes and industrial engines for approximate query processing, I have heard many interesting statements about AQP and/or sampling techniques (from both DB vendors and end-users):

Myth #1. Sampling is only useful when you know your queries in advance
Myth #2. Sampling misses out on rare events or outliers in the data
Myth #3. AQP systems cannot handle join queries
Myth #4. It is hard for end-users to use approximate answers
Myth #5. Sampling is just like indexing
Myth #6. Sampling will break the BI tools
Myth #7. There is no point approximating if your data fits in memory

Although there is a Continue reading

Google’s Transition from Single Datacenter, to Failover, to a Native Multihomed Architecture

 

Making a system work in one datacenter is hard. Now imagine you move to two datacenters. Now imagine you need to support multiple geographically distributed datacenters. That’s the journey described in another excellent and thought provoking paper from Google: High-Availability at Massive Scale: Building Google’s Data Infrastructure for Ads.

The main idea of the paper is that the typical failover architecture used when moving from a single datacenter to multiple datacenters doesn’t work well in practice. What does work, where work means using fewer resources while providing high availability and consistency, is a natively multihomed architecture:

Our current approach is to build natively multihomed systems. Such systems run hot in multiple datacenters all the time, and adaptively move load between datacenters, with the ability to handle outages of any scale completely transparently. Additionally, planned datacenter outages and maintenance events are completely transparent, causing minimal disruption to the operational systems. In the past, such events required labor-intensive efforts to move operational systems from one datacenter to another

The use of “multihoming” in this context may be confusing because multihoming usually refers to a computer connected to more than one network. At Google scale perhaps it’s just as natural Continue reading

Stuff The Internet Says On Scalability For February 26th, 2016


Wonderful diagram of @adrianco Microservices talk at #OOP2016 by @remarker_eu  

 

If you like this sort of Stuff then please consider offering your support on Patreon.

  • 350,000: new Telegram users per day; 15 billion: messages delivered by Telegram per day; 50 billion suns: max size of a black hole; 10,000x: lower power for Wi-Fi; 400 hours: video uploaded to YouTube every minute;

  • Quotable Quotes:
    • sharemywin: I don't think consensus scales. So, I think they'll be an ecosystem of block chains.
    • @aneel: "There is no failover process other than the continuous dynamic load balancing." 
    • Jono MacDougall: If you are happy hosting your own solution, use Cassandra. If you want the ease of scaling and operations, Use DynamoDB.
    • @plamere: Google’s BigQuery is *da bomb* - I can start with 2.2Billion ‘things’ and compute/summarize down to 20K in < 1 min.
    • Haifa Moses: We’re evaluating a totally new software model that allows us to automatically diagnose if a failure occurs during a mission and for messages to be displayed for flight controllers on the ground
    • @fmbutt: IBM abstracted analog calculation. MS abstracted HW. Goog abstracted SW. Powerful Mobile AI could abstract clouds. 
    • Jon Grall Continue reading

Sponsored Post: Swrve, Netflix, Macmillan Learning, Aerospike, TrueSight Pulse, LaunchDarkly, Robinhood, Redis Labs, InMemory.Net, VividCortex, MemSQL, Scalyr, AiScaler, AppDynamics, ManageEngine, Site24x7

Who's Hiring?

  • Swrve -- In November we closed a $30m funding round, and we’re now expanding our engineering team based in Dublin (Ireland). Our mobile marketing platform is powered by 8bn+ events a day, processed in real time. We’re hiring intermediate and senior backend software developers to join the existing team of thirty engineers. Sound like fun? Come join us.

  • Macmillan Learning, a premier e-learning institute, is looking for VP of DevOps to manage the DevOps teams based in New York and Austin. This is a very exciting team as the company is committed to fully transitioning to the Cloud, using a DevOps approach, with focus on CI/CD, and using technologies like Chef/Puppet/Docker, etc. Please apply here.

  • DevOps Engineer at Robinhood. We are looking for an Operations Engineer to take responsibility for our development and production environments deployed across multiple AWS regions. Top candidates will have several years experience as a Systems Administrator, Ops Engineer, or SRE at a massive scale. Please apply here.

  • Senior Service Reliability Engineer (SRE): Drive improvements to help reduce both time-to-detect and time-to-resolve while concurrently improving availability through service team engagement.  Ability to analyze and triage production issues on a web-scale system a plus. Continue reading

Egnyte Architecture: Lessons Learned in Building and Scaling a Multi Petabyte Distributed System

This is a guest post by Kalpesh Patel, an Architect, who works from home. He and his colleagues spends their productive hours scaling one of the largest distributed file-system out there. He works at Egnyte, an Enterprise File Synchronization Sharing and Analytics startup and you can reach him at @kpatelwork.

Your Laptop has a filesystem used by hundreds of processes, it is limited by the disk space, it can’t expand storage elastically, it chokes if you run few I/O intensive processes or try sharing it with 100 other users. Now take this problem and magnify it to a file-system used by millions of paid users spread across world and you get a roller coaster ride scaling the system to meet monthly growth needs and meeting SLA requirements.

Egnyte is an Enterprise File Synchronization and Sharing startup founded in 2007, when Google drive wasn't born and AWS S3 was cost prohibitive. Our only option was to roll our sleeves and build an object store ourselves, overtime costs for S3 and GCS became reasonable and because our storage layer was based on a plugin architecture, we can now plug-in any storage backend that is cheaper. We have re-architected many of Continue reading

Stuff The Internet Says On Scalability For February 12th, 2016


Maybe this year's Mavericks can ride some gravitational waves?  

 

If you like this sort of Stuff then please consider offering your support on Patreon.

  • 3.96 Million: viewers streaming the Super Bowl; 1000 kilometers: roads made of solar panels in France; 500mg: amount of chlorophyll absorbing photons in a tree;

  • Quotable Quotes:
    • cmyr: soundcloud actually represents a very important cultural document; there is tons of music that has been created in the past 5+ years that exists exclusively there, and it would be a tremendous cultural loss if it were to disappear. 
    • aback: we will continue to endure substantial cultural losses for so long as people continue to believe that content can & should be distributed and consumed for free.
    • @dotemacs: - We’re moving to Java + Spring. - Why? - Cos of threads… and scaling… - OK … what exactly…? - I’m just going by what I was told.
    • @asolove: Chaos Monkey People: every week, one randomly-selected person must take the whole week off regardless of their current work. Org must adapt.
    • @waynejwerner: we almost started using them, but couldn't find examples of not Google / Netflix using containers Continue reading

How to build your Property Management System integration using Microservices

This is a guest post by Rafael Neves, Head of Enterprise Architecture at ALICE, a NY-based hospitality technology startup. While the domain is Property Management, it's also a good microservices intro.

In a fragmented world of hospitality systems, integration is a necessity. Your system will need to interact with different systems from different providers, each providing its own Application Program Interface (API). Not only that, but as you integrate with more hotel customers, the more instances you will need to connect and manage this connection. A Property Management System (PMS) is the core system of any hotel and integration is paramount as the industry moves to become more connected.

 

To provide software solutions in the hospitality industry, you will certainly need to establish a 2-way integration with the PMS providers. The challenge is building and managing these connections at scale, with multiple PMS instances across multiple hotels. There are several approaches you can leverage to implement these integrations. Here, I present one simple architectural design to building an integration foundation that will increase ROI as you grow. This approach is the use of microservices.

What are microservices? 

A Smallish List of Parse Migration Guides

Since Parse's big announcement it looks like the release of migration guides from various alternative services has died down. 

The biggest surprise is the rise of Parse's own open source Parse Server. Check out its commit velocity on GitHub. It seems to be on its way to becoming a vibrant and viable platform.

The immediate release of Parse Server with the announcement of the closing of Parse was surprising. How could it be out so soon? That's a lot of work. Some options came to mind. Maybe it's a version of an on-premise system they already had in the works? Maybe it's a version of the simulation software they use for internal testing? Or maybe they had enough advanced notice they could make an open source version of Parse? 

The winner is...

Charity Majors, formerly of Parse/Facebook, says in How to Survive an Acquisition, tells all:

Massive props to Kevin Lacker and those who saw the writing on the wall and did an amazing job preparing to open up the ecosystem.

That's impressive. It seems clear the folks at Parse weren't on board with Facebook's decision, but they certainly did everything possible to make the best Continue reading

What’s Next? The NFL’s Magic Yellow Line Shows the Way to Augmented Reality

 

What’s next? Mobile is entering its comforting middle age period of development. Conversational commerce is a thing, a good thing, but is it really a great thing?

What’s next may be what has been next for decades: Augmented reality (AR) (and VR). AR systems will be here sooner than you might think. A matter of years, not decades. Robert Scoble, for example, thinks Meta, an early startup in AR industry, will be bigger than the Macintosh. More on that in a later post. Magic Leap has no product and $1.3 billion in funding. Facebook has Oculus. Microsoft has HoloLens. Google may be releasing a VR system later this year. Apple is working on VR. Becoming the next iPhone is up for grabs.

AR is a Huge Opportunity for Programmers and Startups 

This is a technological revolution that will be bigger than mobile. Opportunities in mobile for developers have largely played out. Experience shows the earlier you get in on a revolution the better the opportunity will be. Do you want to be writing free iOS apps forever?

It’s so early we don’t really have an idea what AR is or what the market will be Continue reading

Stuff The Internet Says On Scalability For February 5th, 2016


We have an early entry for the best vacation photo of the century. 

 

If you like this sort of Stuff then please consider offering your support on Patreon.
  • 1 billion: WhatsApp users; 3.5 billion: Facebook users in 2030; $3.5 billion: art sold online; $150 billion: China's budget for making chips; 37.5MB: DNA information in a single sperm; 

  • Quotable Quotes:
    • @jeffiel: "But seriously developers, trust us next time your needs temporarily overlap our strategic interests. And here's a t-shirt."
    • @feross: Modern websites are the epitome of inefficiency. Using giant multi-MB javascript files to do what static HTML could do in 1999.
    • Rob Joyce (NSA): We put the time in …to know [that network] better than the people who designed it and the people who are securing it,' he said. 'You know the technologies you intended to use in that network. We know the technologies that are actually in use in that network. Subtle difference. You'd be surprised about the things that are running on a network vs. the things that you think are supposed to be there.
    • @MikeIsaac: i just realized how awkward Facebook's f8 conference is Continue reading

Sponsored Post: Netflix, Macmillan Learning, Aerospike, TrueSight Pulse, LaunchDarkly, Robinhood, StatusPage.io, Redis Labs, InMemory.Net, VividCortex, MemSQL, Scalyr, AiScaler, AppDynamics, ManageEngine, Site24x7

Who's Hiring?

  • Manager - Site Reliability Engineering: Lead and grow the the front door SRE team in charge of keeping Netflix up and running. You are an expert of operational best practices and can work with stakeholders to positively move the needle on availability. Find details on the position here: https://jobs.netflix.com/jobs/398

  • Macmillan Learning, a premier e-learning institute, is looking for VP of DevOps to manage the DevOps teams based in New York and Austin. This is a very exciting team as the company is committed to fully transitioning to the Cloud, using a DevOps approach, with focus on CI/CD, and using technologies like Chef/Puppet/Docker, etc. Please apply here.

  • DevOps Engineer at Robinhood. We are looking for an Operations Engineer to take responsibility for our development and production environments deployed across multiple AWS regions. Top candidates will have several years experience as a Systems Administrator, Ops Engineer, or SRE at a massive scale. Please apply here.

  • Senior Service Reliability Engineer (SRE): Drive improvements to help reduce both time-to-detect and time-to-resolve while concurrently improving availability through service team engagement.  Ability to analyze and triage production issues on a web-scale system a plus. Find details on the position here: https://jobs. Continue reading

The Big List of Alternatives to Parse

Parse is not going away. It’s going to get better.
Ilya Sukhar — April 25th, 2013 on the Future of Parse

 

Parse is dead. The great diaspora has begun. The gold rush is on. There’s a huge opportunity for some to feed and grow on Parse’s 600,000 fleeing customers.

Where should you go? What should you do? By now you’ve transitioned through all five stages of grief and ready for stage six: doing something about it. Fortunately there are a lot of options and I’ve gathered as many resources as I can here in one place.

There is a Lot Pain Out There

Parse closing is a bigger deal than most shutterings. There’s even a petition: Don't Shut down Parse.com. That doesn’t happen unless you’ve managed to touch people. What could account for such an outpouring of emotion.

Parse and the massive switch to mobile computing grew up at the same time. Mobile is by definition personal. Many programmers capable of handling UI programming challenge were not as experienced with backend programming and Parse filled that void. When a childhood friend you grew to depend on dies, it hurts. That hurt is deep. It goes into the very Continue reading

A Patreon Architecture Short

Patreon recently snagged $30 Million in funding. It seems the model of pledging $1 for individual feature releases or code changes won't support fast enough growth. CEO Jack Conte says: We need to bring in so many people so fast. We need to keep up with hiring and keep up with making all of the things.

Since HighScalability is giving Patreon a try I've naturally wondered how it's built. Modulo some serious security issues Patreon has always worked well. So I was interested to dig up this nugget in a thread on the funding round where the Director of Engineering at Patreon shares a little about how Patreon works:

  • Server is in Python using Flask and SQLAlchemy, 
  • Runs on AWS (EC2, RDS (MySQL), and some Redis, Celery, SQS, etc. to boot). 
  • A few microservices here and there in other languages too (e.g. real time chat server with Node & Firebase)
  • Web code is written in React (with some legacy code in Angular). We tend to use Redux for the non-component pieces, but are still trying out new React-compatible libraries here and there.
  • iOS and Android code are written in Objective-C and Java, respectively. 
  • We use Realm on both platforms for Continue reading

Stuff The Internet Says On Scalability For January 29th, 2016

Hey, it's HighScalability time:


This is a trace of a Google search query. A single query might touch a couple thousand machines.

 

If you like this Stuff then please consider supporting me on Patreon.
  • 88: the too short life of Marvin Minsky; $18.4 billion: profit made by Apple in 3 months; 100M: hours of video watched on Facebook each day; 1.59 billion: Facebook users; $115B: size of game market by 2020; 12 years: Mars rover still going strong; 96.3m: barrels of oil produced per day; 570 Billion: object brighter than the Sun; 134 pounds: carried by drones;  $2.4 billion: AWS Q4 sales; 2.5 million: advertisers on the Facebook;

  • Quotable Quotes:
    • @ptaoussanis: Real-world scaling 101: be in the habit of routinely, objectively asking what parts of your system could stand to be simplified or removed
    • @Carnage4Life: Azure revenue up 140%. Search revenue from #BingAds up 21%. Microsoft is killing it in the cloud
    • @gabriel_boya: Scaling up a Cloud Service on @azure takes so many hours that your customers may be gone by the time your instances are allocated...
    • AJ007: Facebook is the Continue reading

Design of a Modern Cache

This is a guest post by Benjamin Manes, who did engineery things for Google and is now doing engineery things for a new load documentation startup, LoadDocs.

Caching is a common approach for improving performance, yet most implementations use strictly classical techniques. In this article we will explore the modern methods used by Caffeine, an open-source Java caching library, that yield high hit rates and excellent concurrency. These ideas can be translated to your favorite language and hopefully some readers will be inspired to do just that.

Eviction Policy

A cache’s eviction policy tries to predict which entries are most likely to be used again in the near future, thereby maximizing the hit ratio. The Least Recently Used (LRU) policy is perhaps the most popular due to its simplicity, good runtime performance, and a decent hit rate in common workloads. Its ability to predict the future is limited to the history of the entries residing in the cache, preferring to give the last access the highest priority by guessing that it is the most likely to be reused again soon...

Stuff The Internet Says On Scalability For January 22nd, 2016

Hey, it's HighScalability time:


The Imaginary Kingdom of Aurullia. A completely computer generated fractal. Stunning and unnerving.

 

If you like this Stuff then please consider supporting me on Patreon.
  • 42,000: drones from China securing the South China Sea; 1 billion: WhatsApp active users; 2⁻¹²²: odds of a two GUIDs with 122 random bits colliding; 25,000 to 70,000: memory chip errors per billion hours per megabit; 81,500: calories in a human body; 62: people as wealthy as half of world's population; 1.66 million: App Economy jobs in the US; 521 years: half-life of DNA; 0.000012%: air passenger fatalities; $1B: Microsoft free cloud resources for nonprofits; 4000-7000+: BBC stats collected per second; $1 billion: Google's cost to taste Apple's pie;

  • Quotable Quotes:
    • @mcclure111: 1995: Every object in your home has a clock & it is blinking 12:00 / 2025: Every object in your home has a IP address & the password is Admin
    • @notch: Coming soon to npm: tirefire.js, an asynchronous framework for implementing helper classes for reinventing the wheel. Based on promises.
    • @ayetempleton: Fun fact: You are MORE likely to win a million or Continue reading

Why does Unikernel Systems Joining Docker Make A Lot of Sense?

Unikernel Systems Joins Docker. Now this is an interesting match. The themes are security and low overhead, though they do seem to solve the same sort of problem.

So, what's going on?

In FLOSS WEEKLY 302 Open Mirage, starting at about 10 minutes in, there are a series of possible clues. Dr. Anil Madhavapeddy, former CTO of Unikernel Systems, explains their motivation behind the creation of unikernels. And it's a huge and exciting vision...

1 21 22 23 24 25 27