Archive

Category Archives for "High Scalability"

Give Meaning to 100 Billion Events a Day — The Shift to Redshift

In part one, we described our Analytics data ingestion pipeline, with BigQuery sitting as our data warehouse. However, having our analytics events in BigQuery is not enough. Most importantly, data needs to be served to our end-users.

TL;DR — Teads Analytics big picture

In this article, we will detail:

  • Why we chose Redshift to store our data marts,
  • How it fits into our serving layer,
  • Key learnings and optimization tips to make the most out of it,
  • Orchestration workflows,
  • How our data visualization apps (Chartio, web apps) benefit from this data.

Data is in BigQuery, now what?

Design Of A Modern Cache—Part Deux

This is a guest post by Benjamin Manes, who did engineery things for Google and is now doing engineery things as CTO of Vector.

The previous article described the caching algorithms used by Caffeine, in particular the eviction and concurrency models. Since then we’ve made improvements to the eviction algorithm and explored a new approach towards expiration.

Eviction Policy

Window TinyLFU (W-TinyLFU) splits the policy into three parts: an admission window, a frequency filter, and the main region. By using a compact popularity sketch, the historic frequencies are cheap to retain and look up. This allows for quickly discarding new arrivals that are unlikely to be used again, guarding the main region from cache pollution. The admission window provides a small region for recency bursts to avoid consecutive misses when an item is building up its popularity.

This structure works surprisingly well for many important workloads like database, search, and analytics. These cases are frequency-biased, so a small admission window is desirable to filter aggressively...
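To make the three parts concrete, here is a minimal Python sketch of the idea, not Caffeine's actual implementation. The class names, the sizes, and the use of plain LRU for both regions are assumptions made for illustration, and the periodic aging of counters is left out. A small count-min style sketch stands in for the compact popularity sketch.

```python
import hashlib
from collections import OrderedDict


class FrequencySketch:
    """Count-min style sketch: approximate access counts in fixed memory."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, key, row):
        # One independent hash per row, derived by salting with the row number.
        digest = hashlib.sha256(f"{row}:{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def increment(self, key):
        for row, counters in enumerate(self.rows):
            counters[self._index(key, row)] += 1

    def estimate(self, key):
        # Taking the minimum over rows limits the effect of hash collisions.
        return min(c[self._index(key, r)] for r, c in enumerate(self.rows))


class WindowTinyLfuCache:
    """Hypothetical toy cache: admission window + frequency filter + main region."""

    def __init__(self, window_size, main_size):
        self.window_size = window_size
        self.main_size = main_size
        self.window = OrderedDict()  # small LRU region for new arrivals
        self.main = OrderedDict()    # main region (plain LRU in this sketch)
        self.sketch = FrequencySketch()

    def get(self, key):
        self.sketch.increment(key)  # every access is recorded, hit or miss
        for region in (self.window, self.main):
            if key in region:
                region.move_to_end(key)
                return region[key]
        return None

    def put(self, key, value):
        self.sketch.increment(key)
        if key in self.main:
            self.main[key] = value
            self.main.move_to_end(key)
            return
        self.window[key] = value
        self.window.move_to_end(key)
        if len(self.window) > self.window_size:
            candidate, cand_value = self.window.popitem(last=False)
            self._admit(candidate, cand_value)

    def _admit(self, candidate, value):
        if len(self.main) < self.main_size:
            self.main[candidate] = value
            return
        victim = next(iter(self.main))  # least recently used entry in main
        # Frequency filter: the candidate must be more popular than the victim.
        if self.sketch.estimate(candidate) > self.sketch.estimate(victim):
            del self.main[victim]
            self.main[candidate] = value
        # Otherwise the candidate is discarded and the main region is untouched.
```

In this toy version, a key that is written once and never read again falls out of the window without displacing a frequently read entry in the main region, while a key that builds up reads during its time in the window can take the victim's place.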

Stuff The Internet Says On Scalability For February 22nd, 2019

Wake up! It's HighScalability time:

Isn't inetd a better comp? (link)

Do you like this sort of Stuff? I'd greatly appreciate your support on Patreon. Know anyone who needs cloud? I wrote Explain the Cloud Like I'm 10 just for them. It has 39 mostly 5 star reviews. They'll learn a lot and love you forever.

  • 2%: of sales spent by consumer packaged goods companies on R&D (14% for tech); 272 million: metric tons of plastic are produced each year around the globe; 100+ fps: Google's Edge TPU; 6,000: bugs per million lines of code; 2.2 GB/sec: SIMD JSON parser; 20-30%: fall in DRAM prices; 8x: Russian hackers faster than North Korean hackers; 50%: EV car sales in China by 2025;

  • Quotable Quotes:
    • @davygreenberg: If I do a job in 30 minutes it’s because I spent 10 years learning how to do that in 30 minutes. You owe me for the years, not the minutes.
    • @PaulDJohnston: Lambda done badly is still better than Kubernetes done well
    • Ross Mcilroy: we now believe that speculative vulnerabilities on today's hardware defeat all language-enforced confidentiality with no known Continue reading

Sponsored Post: Software Buyers Council, InMemory.Net, Triplebyte, Etleap, Stream, Scalyr

Who's Hiring? 


  • Triplebyte lets exceptional software engineers skip screening steps at hundreds of top tech companies like Apple, Dropbox, Mixpanel, and Instacart. Make your job search O(1), not O(n). Apply here.

  • Need excellent people? Advertise your job here! 

Fun and Informative Events

  • Join Etleap, an Amazon Redshift ETL tool, to learn the latest trends in designing a modern analytics infrastructure. Learn what has changed in the analytics landscape and how to avoid the major pitfalls that can hinder your organization's growth. Watch a demo and learn how Etleap can save you engineering hours and decrease your time to value for your Amazon Redshift analytics projects. Register for the webinar today.

  • Advertise your event here!

Cool Products and Services

  • Shape the future of software in your industry. The Software Buyers Council is a panel of engineers and managers who want to share expert knowledge, contribute to improvement of software, and help startups in their industry. Receive occasional invitations to chat for 30 minutes about your area of expertise and software usage. No obligations, no marketing emails or sales calls. Upcoming topics include infrastructure and application monitoring, AI/ML platforms, and more. Learn Continue reading

Intro to Redis Cluster Sharding – Advantages, Limitations, Deploying & Client Connections

Redis Cluster is the native sharding implementation available within Redis that allows you to automatically distribute your data across multiple nodes without having to rely on external tools and utilities. At ScaleGrid, we recently added support for Redis Clusters on our platform through our fully managed Redis hosting plans. In this post, we introduce Redis Cluster sharding, discuss its advantages and limitations, explain when you should deploy it, and show how to connect to your Redis Cluster.
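As a rough illustration of how keys end up on different nodes, the sketch below computes the hash slot of a key the way the Redis Cluster specification describes it: CRC16 of the key modulo 16384, with only the text inside the first non-empty {...} hashed when a hash tag is present. The node names and slot ranges are hypothetical and stand in for whatever layout a real cluster reports.

```python
import binascii

HASH_SLOTS = 16384

# Hypothetical layout: three masters, each owning a contiguous slot range.
EXAMPLE_RANGES = [
    (0, 5460, "node-a"),
    (5461, 10922, "node-b"),
    (10923, 16383, "node-c"),
]


def key_hash_slot(key: bytes) -> int:
    """Return the cluster hash slot for a key."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty tag between the first '{' and the next '}' counts.
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    # binascii.crc_hqx with an initial value of 0 is the CRC16 (XMODEM) variant.
    return binascii.crc_hqx(key, 0) % HASH_SLOTS


def owner_of(key: bytes, ranges) -> str:
    """Look up which node in the (assumed) layout owns the key's slot."""
    slot = key_hash_slot(key)
    for first, last, node in ranges:
        if first <= slot <= last:
            return node
    raise ValueError(f"slot {slot} is not covered by any node")
```

Keys that share a hash tag, such as {user1000}.following and {user1000}.followers, map to the same slot, which is what makes multi-key operations on them possible in a sharded setup.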

Sharding with Redis Cluster

Stuff The Internet Says On Scalability For February 15th, 2019

Wake up! It's HighScalability time:

Opportunity crossed over the rainbow bridge after 15 years of loyal service. "Our beloved Opportunity remains silent."

  • 200 million: per day YouTube videos recommended on home page; $9.3 billion: 27% increase in AI funding; 70%: Microsoft security bugs are memory safety issues; 11: new version of Perl; 24%: serverless users are new to cloud computing; 1 million: SpaceX satellite uplinks; $500K: ticket to mars; $13 billion: Google's new datacenter construction; 59%: increase in Tesla Autosteer accidents; $.30: reddit per user revenue; 38%: Airbnb bugs preventable by using types; 60K: data breaches reported since GDPR; 350: theoretical max rock stone skips;

  • Quotable Quotes:
    • @gchaslot: Brian's hyper-engagement slowly biases YouTube: 1/ People who spend their lives on YT affect recommendations more 2/ So the content they watch gets more views 3/ Continue reading

Stuff The Internet Says On Scalability For February 8th, 2019

Wake up! It's HighScalability time:

Change is always changing. What will the next 5 years look like?

Do you like this sort of Stuff? I'd greatly appreciate your support on Patreon. Know anyone who needs cloud? I wrote Explain the Cloud Like I'm 10 just for them. It has 35 mostly 5 star reviews. They'll learn a lot and love you forever.

  • 16,000: Chrome bugs found with ClusterFuzz; $2,000,000: for Apple iOS remote jailbreak; $1 million: think twice when profiting from a bug; 0: clicks to over the air exploitation of Marvell Avastar Wi-Fi; $300: cost for a bounty hunter to track your phone's location; 321M: Twitter MAUs; 3: years of falling smartphone shipments; 50%: new development uses microservices; 8 inches: big difference in cell phone radiation; ...
  • Quoteable Quotes:
    • @pczarkowski: As I keep telling people, if you have a kubernetes strategy you've already failed. Kubernetes should be an implementation detail at the tactical level to deal with the strategic imperative of solving the problems that are halting the flow of money.
    • EFF: EU countries that do not have zero rating practices enjoyed a double digit drop Continue reading

Sponsored Post: Software Buyers Council, InMemory.Net, Triplebyte, Etleap, Stream, Scalyr

Who's Hiring? 


  • Triplebyte lets exceptional software engineers skip screening steps at hundreds of top tech companies like Apple, Dropbox, Mixpanel, and Instacart. Make your job search O(1), not O(n). Apply here.

  • Need excellent people? Advertise your job here! 

Fun and Informative Events

  • Advertise your event here!

Cool Products and Services


  • Shape the future of software in your industry. The Software Buyers Council is a panel of engineers and managers who want to share expert knowledge, contribute to improvement of software, and help startups in their industry. Receive occasional invitations to chat for 30 minutes about your area of expertise and software usage. No obligations, no marketing emails or sales calls. Upcoming topics include infrastructure and application monitoring, AI/ML platforms, and more. Learn more and join today.

  • InMemory.Net provides a .Net native in-memory database for analysing large amounts of data. It runs natively on .Net, and provides native .Net, COM & ODBC APIs for integration. It also has an easy-to-use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

Stuff The Internet Says On Scalability For February 1st, 2019

Wake up! It's HighScalability time:

Memory module for the Apollo Guidance Computer (Mike Stewart). The AGC weighed 70 pounds and had 2048 words of RAM in erasable core memory and 36,864 words of ROM in core rope memory. It flew to the moon.

Do you like this sort of Stuff? Please go to Patreon and do what comes natural. Need cloud? Stand under Explain the Cloud Like I'm 10 (35 nearly 5 star reviews).

  • $10.9B: Apple's Q1 services revenue;  5 million: homes on Airbnb; 2.5-5%: base64 gzipped files close to original; 60MB/s: Dropbox per Kafka broker throughput limit; 12%: Microsoft's increased revenues; 9: new datasets; 900 million: installed iPhones; $5.7B: 2018 game investment; way down: chip growth; 

  • Quotable Quotes:
    • Daniel Lemire: Most importantly, I claim that most people do not care whether they work on important problems or not. My experience is that more than half of researchers are not even trying to produce something useful. They are trying to publish, to get jobs and promotions, to secure grants and so forth, but advancing science is a secondary concern.
    • @da_667: The moral of Continue reading

This is a guest post from Ryan Averill at FraudGuard.io.

At FraudGuard.io we are a team of just a few developers, all working with our customers to try to make their applications as safe as possible. We have been working on FraudGuard for about 3 years and we’ve had paying customers for more than 2 years now. The main idea behind FraudGuard is for us to get attacked so you don’t have to. In other words, we reduce the overall number of attacks your application receives each day by leveraging our threat data. We do this by taking attack data from our network of honeypots and sharing it directly with you via API. Where some businesses just run services like Maxmind that update occasionally, we run the entire process in house so we can immediately share real-time attack data from around the world....

Stuff The Internet Says On Scalability For January 25th, 2019

Wake up! It's HighScalability time:

My god, it's full of synapses! (3D map of a fly's brain)

  • 10%: Netflix captured screen time in US; 8.3 million: concurrent Fortnite players;  773 Million: Record "Collection #1" Data Breach; 284M+: Reddit monthly views; 1 billion: people impacted by data breaches; 1st: seed germinated on the moon; 4x: k8s api growth from v1 to v1.4; 7x: faster PyPy python; 9B: gallons of water/day for lawns; 2.6 terabytes: largest data leak in history; $14B: serverless market by 2024; 100 million: Alexas sold; 51%: mobile games share of global market; 160 TB: total data transfer during re:Invent 2018; 100+ million: stackoverflow users; 40%: increase in median data usage; 3%: drop in Comcast's network spending; 1 billion: tweets about gaming in 2018; 53%: investment of Baidu, Alibaba, and Tencent in China's 190 major AI companies; 104,954: hard drives Continue reading

Stuff The Internet Says On Scalability For January 18th, 2019

Sorry, Stuff The Internet Says On Scalability has been called on account of wind, rain, power outages and general mayhem. We're all safe, but it's hard to write a post using stone knives and bear skins. See you next week.

Stuff The Internet Says On Scalability For January 11th, 2019

Wake up! It's HighScalability time:

The modern day inner sanctum revealed for all to experience. Nausea no extra charge.

Do you like this sort of Stuff? Please support me on Patreon. Need cloud? Consume Explain the Cloud Like I'm 10 (35 nearly 5 star reviews).

  • 8x: V8 Promise.all parallel performance improvement; 1.3%: print sales increase; 11%: over 65 shared a hoax; 40%: add jobs after deploying AI; 3%: Eventbot's revenue pledged to open source; 51%: successful Ethereum attack; $308,620: cost of a Bitcoin 51% attack; $30: Apple services revenue per device per year; .3 cents: earnings from selling private data; 2,000: baguettes a day produced on French aircraft carrier; 5.6 nm: future smallest grains on a magnetic disk; 11,000: free books from 1923;

  • Quotable Quotes:
    • @mekkaokereke: He joined SpaceX as a "founding employee." He designed the Merlin engine. He's CTO of Propulsion. His name is Tom Mueller. Everyone knows Elon Musk. No one knows Tom Mueller, even though Tom is the one currently designing a rocket that will put humans on Mars.
    • Dr. Rachael Tatman: My universal advice Continue reading

MySQL High Availability Framework Explained – Part II

In Part I, we introduced a High Availability (HA) framework for MySQL hosting and discussed various components and their functionality. Now in Part II, we will discuss the details of MySQL semisynchronous replication and the related configuration settings that help us ensure redundancy and consistency of the data in our HA setup. Make sure to check back in for Part III where we will review various failure scenarios that could arise and the way the framework responds and recovers from these conditions.

What is MySQL Semisynchronous Replication?

Simply put, in a MySQL semisynchronous replication configuration, the master commits transactions to the storage engine only after receiving acknowledgement from at least one of the slaves. The slaves would provide acknowledgement only after the events are received and copied to the relay logs and also flushed to the disk. This guarantees that for all transactions committed and returned to the client, the data exists on at least 2 nodes. The term ‘semi’ in semisynchronous (replication) is due to the fact that the master commits the transactions once the events are received and flushed to relay log, but not necessarily committed to the data files on the slave. This is in contrast to Continue reading
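The ordering described above can be modelled with a few lines of Python. This is a toy model rather than MySQL code: the Master and Replica classes and their fields are hypothetical, the relay log "flush" is just a list append, and raising an error when no acknowledgement arrives is a simplification that does not model how a real server handles the timeout.

```python
class Replica:
    """Toy replica: acknowledges once an event is in its relay log."""

    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable
        self.relay_log = []  # events received and (notionally) flushed to disk
        self.applied = []    # events committed to the replica's data files

    def receive(self, event):
        if not self.reachable:
            return False
        self.relay_log.append(event)  # stands in for write + flush
        return True                   # ACK is sent before the event is applied

    def apply_pending(self):
        # Applying happens later and is not part of the acknowledgement.
        self.applied = list(self.relay_log)


class Master:
    """Toy master: commits only after at least one replica acknowledges."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.binlog = []
        self.committed = []

    def commit(self, event):
        self.binlog.append(event)
        acks = sum(1 for replica in self.replicas if replica.receive(event))
        if acks < 1:
            # Simplification: a real deployment decides what happens on timeout.
            raise TimeoutError("no replica acknowledged the event")
        self.committed.append(event)  # storage-engine commit after the ACK
        return acks
```

After a successful commit in this model, the event exists in the master's committed list and in at least one replica's relay log, even though that replica may not have applied it yet, which is the "semi" part of the guarantee.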

Slow MySQL Start Time in GTID mode? Binary Log File Size May Be The Issue

Have you been experiencing slow MySQL startup times in GTID mode? We recently ran into this issue on one of our MySQL hosting deployments and set out to solve the problem. In this blog, we break down the issue that could be slowing down your MySQL restart times, how to debug for your deployment, and what you can do to decrease your start time and improve your understanding of GTID-based replication.

How We Found The Problem

How DevOps Should Use DBaaS (Database-as-a-Service) To Optimize Their Application Development

This post was written by Wendy Dessler of The Blog Frog.

Database-as-a-Service (DBaaS) is quickly gaining in popularity across the tech world. These software platform solutions help users easily manage their database operations without having to really understand any of the abstractions. This allows developers, DBAs and DevOps engineers to quickly automate their backups, create new SQL and NoSQL clusters, and monitor the performance of their databases for their application without requiring any internal database expertise.

DBaaS falls under the umbrella of Platform-as-a-Service (PaaS) where the platform itself is actually a database or several databases. This is a great choice for DevOps in particular because it allows for more developer agility, productivity, and also security.

Flexibility and scalability are becoming more important in the world of DevOps and technology in general, and we all know how fast this world moves. Businesses need new ways to keep up with the competition, and developers are looking for an easy, self-service model for managing their databases in order to optimize their app development. Let’s break down the individual benefits so you can decide if DBaaS is right for your DevOps team.

1. Outsourced Security and Administration

Stuff The Internet Says On Scalability For January 4th, 2019

Wake up! It's HighScalability time:

Solar system? Nope, the beauty is in your head—neural art.

Do you like this sort of Stuff? Please support me on Patreon. Need cloud? Explain the Cloud Like I'm 10 (34 almost 5 star reviews).

  • 45%: learned scheduler improves average job completion time; 61%: apps share data with Facebook; 45,037,125: people who watched Bird Box on Netflix in first week; 32,368: color images collected by the Curiosity rover on Mars between August 2012 and November 2018; $36.1B: AI  healthcare market by 2025; 20%: object recognition failure in light rain; 350: pages in Donald Knuth's new book;  $1,000: price needed to deactivate Facebook account; $3B: Epic games profit; 1: bitcoin mined from body heat of 44,000 people; 

  • Quotable Quotes:
    • @Hannah_Chutzpah: What are the technical terms, in your field, for 'dunno'? In medicine there's 'idiopathic' In archeology/anthropology there's 'ritual purposes' How do you professionally term 'we haven't got a clue'?
      • @peterseibel: In programming we don't have a special term for it, just like fish don't have a special term for 'water'.
    • @robn: everyone seems to want stateless services and stateless protocols but Continue reading

Stuff The Internet Says On Scalability For December 21st, 2018

Wake up! It's HighScalability time:

Have a very scalable Xmas everyone! See you in the New Year.

Do you like this sort of Stuff? Please support me on Patreon. I'd really appreciate it. Still looking for that perfect xmas gift? What could be better than a book on the cloud? Explain the Cloud Like I'm 10. And if you know someone with hearing problems they might find Live CC useful.

  • 33.5 billion: Pornhub visits; 122 million: miles traveled by Santa; 32,342: government requests to Apple for user data; 10x: faster helicopter design using VR instead of physical models and mockups; 4403: petabytes transferred by Pornhub; 59%: dropped leads on Google AMP; 160: streaming shows now outnumber their traditional-TV counterparts; 80%: machine learning engineers work at Google or Facebook; 25%: adults check phone immediately on waking; 164: iPhone apps made $1 million through in-app subscriptions; 750 petabytes: Backblaze storage; 

  • Quotable Quotes:
    • @flightradar24: Yesterday was the busiest day of the year in the skies so far and our busiest day ever. 202,157 flights tracked! The first time we've tracked more than 200,000 flights in a single day Continue reading