Considerations for using IaC with Cluster API

In other posts on this site, I’ve talked about both infrastructure-as-code (see my posts on Terraform or my posts on Pulumi) and somewhat separately I’ve talked about Cluster API (see my posts on Cluster API). And while I’ve discussed the idea of using existing AWS infrastructure with Cluster API, in this post I wanted to try to think about how these two technologies play together, and provide some considerations for using them together.

I’ll focus here on AWS as the cloud provider/platform, but many of these considerations would also apply—in concept, at least—to other providers/platforms.

In no particular order, here are some considerations for using infrastructure-as-code and Cluster API (CAPI)—specifically, the Cluster API Provider for AWS (CAPA)—together:

  • If you’re going to need the CAPA workload clusters to have access to other AWS resources, like applications running on EC2 instances or managed services like RDS, you’ll need to use the additionalSecurityGroups functionality, as I described in this blog post.
  • The AWS cloud provider requires certain tags to be assigned to resources (see this post for more details), and CAPI automatically provisions new workload clusters with the AWS cloud provider when running on AWS. Thus, you’ll want to make Continue reading
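As a minimal sketch of the tagging consideration above: the AWS cloud provider recognizes a cluster's resources by a tag key of the form `kubernetes.io/cluster/<cluster-name>`, with the value `shared` for infrastructure managed outside the cluster (for example, a VPC built by Terraform or Pulumi) or `owned` for resources whose lifecycle belongs to the cluster. The helper name `cluster_tags` and the cluster name `workload-1` are hypothetical, chosen only for illustration.

```python
def cluster_tags(cluster_name: str, owned: bool = False) -> dict[str, str]:
    """Build the AWS cloud provider tag for one cluster.

    `owned=False` yields the "shared" value, which suits infrastructure
    that an IaC tool created and that should outlive the cluster.
    """
    if not cluster_name:
        raise ValueError("cluster_name must not be empty")
    value = "owned" if owned else "shared"
    return {f"kubernetes.io/cluster/{cluster_name}": value}


if __name__ == "__main__":
    # "workload-1" is a hypothetical cluster name.
    print(cluster_tags("workload-1"))
```

The resulting dictionary could be merged into the tags that your IaC tool applies to the subnets and security groups the workload cluster will use.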

Culture at AnsibleFest 2020

At Red Hat, we’ve long recognized that the power of collaboration enables communities to achieve more together than individuals can accomplish on their own. Developing an organizational culture that empowers communities to flourish and collaborate -- whether in an open source community or for an internal community of practice -- isn’t always straightforward. This year at AnsibleFest, the Culture topic aims to demystify some of these areas by sharing the stories, practices, and examples that can get you on your path to better collaboration. 

 

Culture at AnsibleFest: “Open” for participation

Because we recognize that culture is not a “one size fits all” topic, we’ve made sure to sprinkle nearly every track at AnsibleFest with relevant content to help every type of Ansible user (or manager of Ansible users!) participate in developing healthy cultures and communities of automation inside their organizations. 

Whether you’re interested in contributing to open source communities, learning how others have grown the use of Ansible inside their departments or organizations, or simply building healthy, diverse, inclusive communities inside or outside the workplace -- the Culture (cross) Channel at AnsibleFest has you covered.

 

Be a Cultural Catalyst for Continue reading

Network Automation Products for Brownfield Deployments

Got this question from one of my long-time readers:

I am looking for commercial SDN solutions that can be deployed on top of brownfield networks built with traditional technologies (vPC/MLAG, STP, HSRP) on lower-cost networking gear, where a single API call could create a network-wide VLAN, or apply that VLAN to a set of ports. Gluware is one product aimed at this market. Are there others?

The two other solutions that come to mind are Apstra AOS and Cisco NSO. However, you probably won’t find a simple solution that does what you want without heavy customization, as every network tends to be a unique snowflake.
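To make the "single API call" idea concrete, here is a rough sketch of what such a call has to do underneath: fan one VLAN request out into per-device configuration. Everything here is hypothetical (the `render_vlan` function, the platform labels, the device names), and the two templates are only illustrative of IOS-style and Junos-style syntax. The need for one template per platform hints at where the customization effort goes.

```python
from string import Template

# Hypothetical per-platform templates; a real network would need one
# entry for every device family and software version it contains.
VLAN_TEMPLATES = {
    "ios-like": Template("vlan $vlan_id\n name $name"),
    "junos-like": Template("set vlans $name vlan-id $vlan_id"),
}


def render_vlan(devices: dict[str, str], vlan_id: int, name: str) -> dict[str, str]:
    """Return the config snippet each device needs for one VLAN.

    `devices` maps a hostname to a platform label from VLAN_TEMPLATES.
    """
    if not 1 <= vlan_id <= 4094:
        raise ValueError("vlan_id must be between 1 and 4094")
    return {
        host: VLAN_TEMPLATES[platform].substitute(vlan_id=vlan_id, name=name)
        for host, platform in devices.items()
    }


if __name__ == "__main__":
    inventory = {"sw1": "ios-like", "sw2": "junos-like"}  # hypothetical inventory
    for host, config in render_vlan(inventory, 100, "web").items():
        print(f"--- {host} ---\n{config}")
```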

Extend Your Fortinet FortiManager to Kubernetes

Companies are leveraging the power of Kubernetes to accelerate the delivery of resilient and scalable applications to meet the pace of business. These applications are highly dynamic, making it operationally challenging to securely connect to databases or other resources protected behind firewalls.

Visibility into Kubernetes Infrastructure is Essential

Lack of visibility has compliance implications. Like any on-premises or cloud-based networked services, Kubernetes production containers must address both organizational and regulatory security requirements. If compliance teams can’t trace the history of incidents across the entire infrastructure, they can’t adequately satisfy their audit requirements. To enable the successful transition of Kubernetes pilot projects to enterprise-wide application rollouts, companies must be able to extend their existing enterprise security architecture into the Kubernetes environment.

In response, Fortinet and Tigera jointly developed a suite of Calico Enterprise solutions for the Fortinet Security Fabric that deliver both north-south and east-west visibility and help ensure consistent control, security, and compliance. Key among these integrations is the FortiManager Calico Kubernetes Controller, which enables Kubernetes cluster management from the FortiManager centralized management platform in the Fortinet Fabric Management Center.

View and Control the Kubernetes Environment with FortiManager

The FortiManager Calico Kubernetes Controller translates FortiManager policies into granular Kubernetes network Continue reading

Pluribus goes big to support larger, multi-vendor data center networks

Pluribus has fine-tuned its switch fabric software to support larger, distributed multi-vendor data centers. Specifically, the company has enabled its Adaptive Cloud Fabric to scale from its current level of support for 64 nodes to up to 1,024 switches in a unified fabric. The scale-up is part of the company's recently upgraded core network operating system, Netvisor One, which is a virtualized Linux-based NOS that provides Layer 2 and Layer 3 networking and distributed fabric intelligence. The NOS virtualizes switch hardware and implements the company's Adaptive Cloud Fabric. Adaptive Cloud Fabric operates without a controller and can be deployed across a single data center, or targeted to specific racks, pods, server farms or hyperconverged infrastructures, the company said.

Intel, Nvidia launch new networking processor initiatives

In recent days Intel and Nvidia have introduced or announced new networking products with a common goal of offloading networking traffic to the network processor, thus freeing up the CPU for computational work. Intel announced a new networking initiative to capitalize on what it calls “a perfect storm of 5G, edge buildout and pervasive artificial intelligence” with an expanded lineup of hardware, software and solutions for network infrastructure. This includes enhancements to Intel’s software reference architecture, FlexRAN; Intel virtualized radio access network (vRAN) dedicated accelerator; network-optimized next-generation Intel Xeon Scalable and D processors (codenamed “Ice Lake”); and upgraded Intel Select Solutions for Network Function Virtualization Infrastructure (NFVI).

Broadcom Mirror on Drop (MoD)

Networking Field Day 23 included a presentation by Bhaskar Chinni describing Broadcom's Mirror-on-Drop (MOD) capability. MOD-capable hardware can generate a notification whenever a packet is dropped by the ASIC, reporting the packet header and the reason that the packet was dropped. MOD is supported by Trident 3, Tomahawk 3, and Jericho 2 or later ASICs that are included in popular data center switches and widely deployed in data centers.

The recently published sFlow Dropped Packet Notification Structures specification adds drop notifications to industry-standard sFlow telemetry export, complementing the existing push-based counter and packet sampling measurements. The inclusion of drop monitoring in sFlow will allow the benefits of MOD to be fully realized, ensuring consistent end-to-end visibility into dropped packets across multiple vendors and network operating systems.
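As a rough illustration of why a per-drop notification carrying the packet header and a drop reason is useful, the sketch below tallies drops by reason across switches. The `DropNotification` record, its field names, and the reason strings are hypothetical stand-ins rather than the structures defined in the sFlow specification.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class DropNotification:
    """Hypothetical, simplified view of one dropped-packet report."""
    switch: str
    src_ip: str
    dst_ip: str
    reason: str


def top_drop_reasons(events: list[DropNotification], limit: int = 3) -> list[tuple[str, int]]:
    """Count notifications per drop reason, most frequent first."""
    return Counter(event.reason for event in events).most_common(limit)


if __name__ == "__main__":
    # Made-up sample events for illustration only.
    sample = [
        DropNotification("leaf1", "10.0.0.1", "10.0.1.1", "acl"),
        DropNotification("leaf2", "10.0.0.2", "10.0.1.1", "acl"),
        DropNotification("spine1", "10.0.0.3", "10.0.2.9", "ttl_exceeded"),
    ]
    for reason, count in top_drop_reasons(sample):
        print(f"{reason}: {count}")
```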

Using Advanced Telemetry to Correlate GPU and Network Performance Issues demonstrates how packet drop notifications from NVIDIA Mellanox switches form part of an integrated sFlow telemetry stream that provides the system-wide observability needed to drive automation.

MOD instrumentation on Broadcom-based switches provides the foundation needed for network vendors to integrate the Continue reading

Using Advanced Telemetry to Correlate GPU and Network Performance Issues


The image above was captured from the recent talk Using Advanced Telemetry to Correlate GPU and Network Performance Issues [A21870] presented at the NVIDIA GTC conference. The talk includes a demonstration of monitoring a high-performance GPU compute cluster in real time. The real-time dashboard provides an up-to-the-second view of key performance metrics for the cluster.

This diagram shows the elements of the GPU compute cluster that was demonstrated. Cumulus Linux running on the switches reduces operational complexity by allowing you to run the same Linux operating system on the network devices as is run on the compute servers. sFlow telemetry is generated by the open source Host sFlow agent that runs on the servers and the switches, using standard Linux APIs to enable instrumentation and gather measurements. On switches, the measurements are offloaded to the ASIC to provide line-rate monitoring.

Telemetry from all the switches and servers in the cluster is streamed to an sFlow-RT analyzer, which builds a real-time view of performance that can be used to drive operational dashboards and automation.
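A dashboard or automation script would typically pull that real-time view over sFlow-RT's REST API. The sketch below only builds the requests and sends nothing. It assumes sFlow-RT is listening on its default port 8008 on localhost; the flow name `pair` and the helper names are made up for the example, and the endpoint paths should be checked against the sFlow-RT documentation before use.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8008"  # assumed sFlow-RT address


def define_flow_request(base_url: str, name: str, keys: list[str], value: str) -> Request:
    """Build (but do not send) a PUT request that defines a flow."""
    body = json.dumps({"keys": ",".join(keys), "value": value}).encode("utf-8")
    return Request(
        f"{base_url}/flow/{name}/json",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def active_flows_url(base_url: str, name: str, max_flows: int = 10) -> str:
    """Build the URL that lists the largest active flows across all agents."""
    return f"{base_url}/activeflows/ALL/{name}/json?" + urlencode({"maxFlows": max_flows})


if __name__ == "__main__":
    request = define_flow_request(BASE_URL, "pair", ["ipsource", "ipdestination"], "bytes")
    print(request.get_method(), request.full_url)
    print(active_flows_url(BASE_URL, "pair"))
```

Passing the request to `urllib.request.urlopen` would send it to a running analyzer; that step is left out so the sketch runs without one.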

The Real-time GPU and network telemetry dashboard combines measurements from all the devices to provide a view of cluster performance. Each of the three Continue reading

AnsibleFest 2020 Live Q&A

We are less than a week away from AnsibleFest 2020! We can’t wait to connect with you and help you connect with other automation lovers. We have some great content lined up for this year’s virtual experience, including some amazing Live Q&A Sessions. This year, you will be able to get your questions answered by Ansible experts, Red Hatters and Ansible customers. Let’s dive into what you can expect.

 

Tuesday, October 13

11am

Live Q&A: Get all your network automation questions answered with Brad Thornton, Iftikhar Khan and Sean Cavanaugh

In this session, a panel of experts discusses a wide range of use cases around network automation. They will talk about the Red Hat Ansible Automation Platform and the product direction including Ansible Network Collections, resource modules and managing network devices in a GitOps model. Bring your questions for the architects and learn more about how Red Hat is helping organizations operationalize automation in their network while bridging gaps between different IT infrastructure teams.

 

Live Q&A: Bridging traditional, container, and edge platforms through automation with Joe Fitzgerald, Ashesh Badani, and Stefanie Chiras

Join this panel discussion, moderated by Kelly Fitzpatrick (Redmonk), to hear from Continue reading

Day Two Cloud 069: The Life Of A Site Reliability Engineer (SRE)

On today's Day Two Cloud podcast we talk with a real, live SRE, or Site Reliability Engineer, who works in an IT group that delivers applications using DevOps principles as part of their day-to-day work. Our guest is James Quigley, SRE at Bloomberg. He and his team build infrastructure and tooling for application and infrastructure teams to develop for the public cloud.

The post Day Two Cloud 069: The Life Of A Site Reliability Engineer (SRE) appeared first on Packet Pushers.

New Collab, Support and Vulnerability Scanning Enhance Docker Pro and Team Subscriptions

Last March, we laid out our commitment to focus on developer experiences to help build, share, and run applications with confidence and efficiency. In the past few months we have delivered new features for the entire Docker platform that have built on the tooling and collaboration experiences to improve the development and app delivery process.

During this time, we have also learned a lot from our users about ways Docker can help improve developer confidence in delivering apps for more complicated use cases and how we can help larger teams improve their ability to deliver apps in a secure and repeatable manner. Over the next few weeks, you will see a number of new features delivered to Docker subscribers at the free, Pro and Team level that deliver on that vision for our customers.

Today, I’m excited to announce the first set of features: vulnerability scanning in Docker Hub for Pro and Team subscribers. This new release enables individual and team users to automatically monitor, identify and ultimately resolve security issues in their applications. We will also preview Desktop features that will roll out over the next several months.

We’ve heard in numerous interviews with team managers that Continue reading