This article describes how use the instrumentation built into ConnectX SmartNICs for data center wide network visibility. Real-time network telemetry for automation provides some background, giving an overview of the sFlow industry standard with an example of troubleshooting a high performance GPU compute cluster.
Linux as a network operating system describes how standard Linux APIs are used in NVIDIA Spectrum switches to monitor data center network performance. Linux Kernel Upstream Release Notes v5.19 describes recent driver enhancements for ConnectX SmartNICs that extend visibility to servers for end-to-end visibility into the performance of high performance distributed compute infrastructure.
The open source Host sFlow agent uses standard Linux APIs to configure instrumentation in switches and hosts, streaming the resulting measurements to analytics software in real-time for comprehensive data center wide visibility.
Packet sampling provides detailed visibility into traffic flowing across the network. Hardware packet sampling makes it possible to monitor 400 gigabits per second interfaces on the server at line rate with minimal CPU/memory overhead.psample { Continue reading
We discuss how Kolide tools engage the user to improve end-point security. Monitoring devices and then contacting the user to gather more information and provide contextual questions is a novel approach.
The post HS031 Kolide and Honest Security appeared first on Packet Pushers.
It is a common design to have an internet Edge router connected to two different internet service providers to protect against the failure of an ISP bringing the office down. The topology may look something like this:
The two ISPs are used in an active/standby fashion using static routes. This is normally implemented by using two default routes where one of the routes is a floating static route. It will look something like this:
ip route 0.0.0.0 0.0.0.0 203.0.113.1 name PRIMARY ip route 0.0.0.0 0.0.0.0 203.0.113.9 200 name SECONDARY
With this configuration, if the interface to ISP1 goes down, the floating static route which has an administrative distance (AD) of 200 will be installed and traffic will flow via ISP2. The drawback to this configuration is that it only works if the physical interface goes down. What happens if ISP1’s CPE has the interface towards the customer up but the interface towards the ISP Core goes down? What happens if there is a failure in another part of the ISP’s network? What if all interfaces are up but Continue reading
Hello my friend,
Recently we’ve been working on an interesting (at least for me) project, which is an MVP of the highly available infrastructure for web services. There are multiple approaches existing to create such a solution including “simply” putting everything in Kubernetes. However, in our case we are building a solution for a telco cloud, which is traditionally not the best candidate for a cloud native world. Moreover, putting it to Kubernetes will require to build a Kubernetes cluster first, which is completely separate magnitude of the problem. Originally we were planning to write this blogpost the last weekend, but it took us a little bit longer to put everything together properly. Let’s see, what we are to share with you.
1
2
3
4
5 No part of this blogpost could be reproduced, stored in a
retrieval system, or transmitted in any form or by any
means, electronic, mechanical or photocopying, recording,
or otherwise, for commercial purposes without the
prior permission of the author.
Yes, today’s blogpost is dedicated to the network technologies (to a huge mix of different network and infrastructure technologies, to be honest). That’s why there Continue reading