Five things we’ve learned about monitoring containers and their orchestrators
This is a guest post by Apurva Davé, who is part of the product team at Sysdig.
Having worked with hundreds of customers on building a monitoring stack for their containerized environments, we’ve learned a thing or two about what works and what doesn’t. The outcomes might surprise you - including the observation that instrumentation is just as important as the application when it comes to monitoring.
In this post, I wanted to cover some details around what it takes to build a scale-out, highly reliable monitoring system to work across tens of thousands of containers. I’ll share a bit about what our infrastructure looks like, the design choices we made, and tradeoffs. The five areas I’ll cover:
- Instrumenting the system
- Relating your data to your applications, hosts, and containers
- Leveraging orchestrators
- Deciding what data to store
- How to enable troubleshooting in containerized environments
For context, Sysdig is the container monitoring company. We’re based on the open source Linux troubleshooting project by the same name. The open source project allows you to see every single system call down to process, arguments, payload, and connection on a single host. The commercial offering turns all this data into thousands of