Category Archives for "Life As A Network Engineer – Rakesh"

A simple BPFTrace to see TCP SendBytes as a Histogram


A significant difference between BCC and bpftrace is that BCC is used for complex analysis tools, while bpftrace programs are mostly ad-hoc one-liners. bpftrace is an open-source tracer; the references below include an excellent introduction to eBPF and another excellent resource.

Let me keep this short, we will try to use BPFTrace and capture TCP

We will need

  1. Netcat
  2. DD for generating a dummy 1GB File
  3. bpftrace installed

To understand the efficiency of this, let’s attach a tracepoint (a kernel static probe) that captures every new process as it is launched. Imagine an equivalent of the top utility that can also react to each event at run time if required. The reference lists the types of probes and their utility.

We invoked bpftrace on the syscalls tracepoint for execve, which needs root privileges. I then executed ping and various other commands, and you can see that an inbound SSH login also triggered execve calls, including the system banner.

sudo bpftrace -e 'tracepoint:syscalls:sys_enter_execve { join(args->argv); }'

Attaching 1 probe...

ping -c 1
/usr/bin/clear_console -q
/usr/sbin/sshd -D -o AuthorizedKeysCommand /usr/share/ec2-instance-connect/eic_run_authorized_keys %u Continue reading

FlameGraph Htop — Benchmarking CPU— Linux


I have written a small post on what happens at the process level; now let’s throw some flame into it with flame graphs.

I am a fan of Brendan Gregg’s work and writing; the flame graph tool is his contribution to the open-source community.

Before moving into Flamegraph, let’s understand some Benchmarking concepts.

Benchmarking in general is a methodology to test resource limits and regressions in a controlled environment. There are two types of benchmarking:

  • Micro-Benchmarking — Uses small and artificial workloads
  • Macro-Benchmarking — Simulates client in part or total client workloads
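As a rough illustration of micro-benchmarking, the sketch below times a small, artificial workload with Python's standard `timeit` module. The workload (`sum_of_squares`) and the run counts are made up for illustration and are not from the original post.

```python
import timeit


def sum_of_squares(n: int) -> int:
    """Small, artificial CPU-bound workload (hypothetical example)."""
    return sum(i * i for i in range(n))


def micro_benchmark(n: int = 10_000, runs: int = 5, loops: int = 100) -> float:
    """Return the best per-call time in seconds across several runs."""
    timings = timeit.repeat(lambda: sum_of_squares(n), repeat=runs, number=loops)
    # The minimum is usually the least noisy estimate; divide by the
    # loop count to get the time of a single call.
    return min(timings) / loops


if __name__ == "__main__":
    print(f"best per-call time: {micro_benchmark():.6f} s")
```

A macro-benchmark would instead replay a realistic client workload against the whole system, which is beyond a snippet like this.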

Most benchmarking results boil down to a price/performance ratio. A benchmark can start as proof-of-concept testing, move on to application or system load tests that identify bottlenecks for troubleshooting or tuning, or simply establish the maximum stress the system is capable of taking.

Enterprise / On-premises Benchmarking: take a simple scenario of building out a data centre with huge racks of networking and computing equipment. As data-centre builds are mostly identical and mirrored, benchmarking before raising a purchase order is critical.

Cloud-based Benchmarking: This is a really inexpensive setup. While Continue reading

AWS Direct Connect Site-Link — A very excellent service


Site-link is really a nice extension to the DX Gateway’s offering. Let me simplify it.

Reference: I can’t recommend this more, it is a very nice read.

Few Important Points

  1. AWS Direct Connect Site Link is a private connection between your on-premises network and your AWS Direct Connect location.
  2. Site Link provides high bandwidth and low latency connection between your on-premises network and AWS.
  3. Site Link uses industry standard 802.1q VLANs to provide a secure connection between your on-premises network and AWS.
  4. Site Link is available in 1 Gbps and 10 Gbps speeds.
  5. You can use Site Link to connect to multiple AWS Direct Connect locations.
  6. Site Link is available in all AWS Regions.

Problem — I want to connect my two Data-Centres to Direct Connect Gateway through AWS Backbone.

Let’s see a reference Architecture

Image Credits — AWS

Replicating the above scenario

Few important aspects

  • Connect DC1-DC2 via AWS Global Backbone Network
  • If both DCs use the same BGP ASN 65001 in this case, use allowas-in to allow looping in AS-PATH
  • When you enable site-link BGP session won’t flap but it Continue reading

Transit Gateway — a one-stop shop!


I like Transit Gateway on so many levels. It is truly a next-generation service that integrates many different points of ingress with VPCs.

Few important points to start with

  1. AWS Transit Gateway is a service that enables customers to connect their Amazon Virtual Private Clouds (VPCs) and on-premises networks to a single gateway.
  2. Transit Gateway is a hub that controls traffic routed among all the connected networks.
  3. Transit Gateway supports both IPv4 and IPv6 traffic.
  4. Transit Gateway is highly scalable and can support thousands of VPCs and on-premises networks.
  5. Transit Gateway uses route tables to determine how traffic is routed.
  6. Transit Gateway supports VPC peering and VPN connections.
  7. Transit Gateway can be used with AWS Direct Connect to create a private connection between an on-premises network and a VPC.

Scenario 1 — Connect your VPCs

Interconnecting VPCs is typically done through VPC peering. While that is still valid, you can easily interconnect VPCs through the transit gateway attachments feature. VPC peering only connects VPCs, whereas a transit gateway can connect VPCs and DX Gateways, and you can terminate IPSEC VPNs directly onto it.

  • Routing tables are not auto-propagated, meaning you have to add static routes individually in Continue reading

Direct Connect — Part 2 — Public VIF


First Post ( Direct Connect – Part 1 )-

Although the Direct Connect offering always connects to AWS, it operates differently depending on the type of VIF we connect.

Public VIF

→ This setup is in no way related to a VPC. All it does is advertise Amazon-owned public prefixes for services like S3/EC2 (Elastic IP only, not your private IP), and that’s all there is to it.

→ The customer has the flexibility to scope outbound advertisement propagation within AWS to LOCAL, CONTINENT, or GLOBAL levels, and to filter the inbound updates that are advertised towards them.

Here is how the community scope looks by default; you also have the flexibility to filter routes inbound to the customer.

Note: Outbound communities restrict the advertisement of prefixes to region/continent/global scope for any sort of Any-cast implementations.

If the customer sends a route with a community:

7224:9100 → This will be local to the region

7224:9200 → This will be local to the continent; for example, the scope extends across the EU

7224:9300 → Global; by default it is global even if you don’t export Continue reading
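
The community values listed above can be summarised as a simple lookup. This is only a sketch: the function name `advertisement_scope` is hypothetical, and it assumes at most one scope community is attached to a route.

```python
# Scope communities as listed in the post (values taken from the text above).
SCOPE_BY_COMMUNITY = {
    "7224:9100": "region",
    "7224:9200": "continent",
    "7224:9300": "global",
}


def advertisement_scope(communities):
    """Return the scope implied by the first recognised community.

    Falls back to "global", which the post describes as the default
    when no scope community is attached.
    """
    for community in communities:
        if community in SCOPE_BY_COMMUNITY:
            return SCOPE_BY_COMMUNITY[community]
    return "global"


if __name__ == "__main__":
    print(advertisement_scope(["7224:9100"]))  # route kept local to the region
    print(advertisement_scope([]))             # no community, default scope
```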

Direct Connect — Part 1


AWS Advanced Networking Prep and General focus

Notion —

What is the Direct Connect product trying to solve?

We have seen IPSEC Site-to-Site VPN; a nice extension to that is the Direct Connect offering. With IPSEC VPN, we connected to an AWS VPC securely over the internet. With Direct Connect, a cable terminates in our data-centre premises and connects directly to AWS infrastructure, so no internet service provider is needed.

AWS Direct Connect — Image Credits: :


  • Bypasses Internet and thereby secure
  • Low Latency to AWS services
  • Consistent performance with speeds of up to 1/10/100 Gbps and support for jumbo frames > 9k

What are my building blocks?

  • We basically start with a Connection, pretty much self-explanatory
  • A Connection has the below requirements


Functional Building Block?


So, once we have a connection setup, everything revolves around VIF — Virtual Interface.

Direct Connect can be divided into two parts

a. Public VIF — we are speaking about public IP addresses routable on the internet.

AWS Advanced Networking — IPSEC Vpn with BGP (FRR and Docker)


The previous post covered IPSEC VPN implementation with static routing, along with some general points about IPSEC VPN. This post aims at building an IPSEC VPN with the dynamic routing offered by the VGW, which is BGP.

Article on FRR, Docker —

We will re-use the same concept and will start a BGP route exchange over IPSEC VPN. — Notes and Topology

Lab Video —

Few points to note:

  • BGP ASN support is both for 2-byte and 4-byte
  • The private 2-byte ASN range is 64512–65534
  • BGP peering happens over the tunnel endpoints with address 169.254.x.y/z, which Amazon usually initiates by default
  • If you are extending the strong-swan use case, you need to have a configuration reference for the static tunnel as there is no dynamic configuration generated for Strong-swan/Open-swan use case
  • In static and dynamic routings, VGW Route propagation needs to be done.
  • I have observed that left-subnet and right-subnet should be 0/0 in AWS for communicating BGP-TCP messages for session establishment.
  • This needs to be tested further and there is no BGP authentication that the user can define, as the user won’t have any control Continue reading
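
To make the ASN point concrete, here is a minimal sketch of a private-ASN check. The 2-byte range comes from the list above; the 4-byte private range (4200000000–4294967294) is taken from RFC 6996 and is not stated in the original post. The function name is hypothetical.

```python
# 2-byte private range as listed in the post: 64512-65534 inclusive.
PRIVATE_2_BYTE = range(64512, 65535)
# 4-byte private range per RFC 6996 (an addition, not from the post).
PRIVATE_4_BYTE = range(4200000000, 4294967295)


def is_private_asn(asn: int) -> bool:
    """Return True if the ASN falls in a private-use range."""
    return asn in PRIVATE_2_BYTE or asn in PRIVATE_4_BYTE


if __name__ == "__main__":
    for candidate in (65001, 65535, 4200000001):
        print(candidate, is_private_asn(candidate))
```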

IPSEC VPN Site-to-Site — How to and notes for Advanced Networking Certification

< MEDIUM: > — This will be updated frequently and has the entire notes on the topics

Lab / Part 1—

Part 1 —

Lab / Part 2 —

Part 2 —

Lab / Part 3. —

Part 3 —


  • VPN: Virtual Private Network, often used to communicate securely over untrusted networks like the internet.
  • IPSEC is the protocol used for securing the data. Some other tunnelling protocols and frameworks are GRE, DMVPN, WireGuard etc.
  • There are two types of VPN: Site-to-Site and Client-to-Site (Remote Access). This lab will be a Site-to-Site VPN.
  • Site-to-Site, as the name suggests, usually connects two sites, where a site typically refers to a group of devices in a data centre. It enables two sites separated by the internet to communicate privately and securely over it.


  • Think along the lines of two boundary devices which encrypt and decrypt LAN traffic
  • Design Redundancy and Scalability along these lines for these two end-points
  • It is important to note that you can have VPN to access any services within Continue reading

AWS IPSEC Site-to-Site VPN

Notes — This will be updated frequently and has the entire notes on the topics


  • VPN: Virtual Private Network, often used to communicate securely over untrusted networks like the internet.
  • IPSEC is the protocol used for securing the data. Some other tunnelling protocols and frameworks are GRE, DMVPN, WireGuard etc.
  • There are two types of VPN: Site-to-Site and Client-to-Site (Remote Access). This lab will be a Site-to-Site VPN.
  • Site-to-Site, as the name suggests, usually connects two sites, where a site typically refers to a group of devices in a data centre. It enables two sites separated by the internet to communicate privately and securely over it.


  • Think along the lines of two boundary devices which encrypt and decrypt LAN traffic
  • Design Redundancy and Scalability along these lines for these two end-points
  • It is important to note that you can use a VPN to access any service within your VPC, as a VPC can be visualised as a virtual data centre. You cannot, however, use the VPN for a service like S3, which is a public offering reached via the internet

Let’s imagine you have built your Continue reading

Tshark Packet Analysis


The commands used in the post below, if you want a quick reference instead of going through the whole post:

sudo tshark -f "tcp port 80" -F pcap -w /var/tmp/port_80_cap.pcap -c 10

sudo tshark -r /var/tmp/port_80_cap.pcap

sudo tshark -r /var/tmp/port_80_cap.pcap -Tfields -e ip.src -e tcp.port -e ip.ttl -e ip.dst

sudo tshark -f "tcp port 80" -F pcap -w /var/tmp/port_80_cap.pcap -c 10

sudo tshark -r /var/tmp/port_80_cap.pcap -Tfields -Y ip.dst== -e ip.dst -e tcp.dstport

sudo tshark -r capture_ospf.cap

sudo tshark -r capture_ospf.cap -Y "frame.number == 4"

sudo tshark -r capture_ospf.cap -Y "frame.number == 4" -V
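
If you run these field extractions often, the argument list can be assembled programmatically. The helper below is a hypothetical sketch (the name `tshark_fields_cmd` is not from the post); it only builds the command using the same `-r`, `-T fields`, `-Y` and `-e` options shown above and does not execute tshark.

```python
import shlex


def tshark_fields_cmd(pcap_path, fields, display_filter=None):
    """Build a tshark argument list that prints selected fields from a capture."""
    cmd = ["tshark", "-r", pcap_path, "-T", "fields"]
    if display_filter:
        # -Y applies a display filter when reading the capture file.
        cmd += ["-Y", display_filter]
    for field in fields:
        # One -e option per field to print.
        cmd += ["-e", field]
    return cmd


if __name__ == "__main__":
    cmd = tshark_fields_cmd(
        "/var/tmp/port_80_cap.pcap",
        ["ip.src", "tcp.port", "ip.ttl", "ip.dst"],
    )
    # Print a copy-pastable command line; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```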

Wireshark is famous for capturing packets and analysing various packet-capture files. If you have never used Wireshark before, it’s a sophisticated and popular GUI tool for packet capture and analysis.

However, you don’t always need a GUI tool, and more importantly you don’t always have access to a GUI environment. For example, if you are running an Ubuntu EC2 cloud instance, you would typically not install a GUI on it, as it is meant to run server workloads.

This is where Tshark Continue reading

Cleanup/Delete Transit-Gateway and Transit-Gateway-attachments


I am pasting my notes on cleaning up Transit Gateway and Transit Gateway attachments. This is readily available in the AWS documentation, but I thought I would paste it here in case anyone wants to quickly copy and paste the steps instead of going through the documentation. We could be more sophisticated and use Python / Ansible / Terraform to parse the outputs, but for now this is what I did to clean up after some practice. Do not forget this step: it incurred a good amount of cost for me, although I got saved by AWS credits!

1. List out the available transit-gateway attachments, as they have to be deleted before the transit gateway itself

aws ec2 describe-transit-gateway-attachments --region us-east-1 | egrep -i TransitGatewayAttachmentId -> This will list out TGW attachments in us-east-1

➜ ~ aws ec2 describe-transit-gateway-attachments --region us-east-1 | egrep -i TransitGatewayAttachmentId
"TransitGatewayAttachmentId": "tgw-attach-01b7c8d7d3bd4e2ca",
"TransitGatewayAttachmentId": "tgw-attach-050c87ef9fb703c98",
"TransitGatewayAttachmentId": "tgw-attach-079921a8810f490ab",

2. Delete the available attachments

aws ec2 delete-transit-gateway-vpc-attachment \
--transit-gateway-attachment-id tgw-attach-01b7c8d7d3bd4e2ca --region us-east-1

aws ec2 delete-transit-gateway-vpc-attachment \
--transit-gateway-attachment-id tgw-attach-050c87ef9fb703c98 --region us-east-1

aws ec2 delete-transit-gateway-vpc-attachment \
--transit-gateway-attachment-id tgw-attach-079921a8810f490ab --region us-east-1
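
As a sketch of the "parse the outputs with Python" idea, the snippet below reads the JSON printed by `describe-transit-gateway-attachments` and generates the delete commands from step 2. The sample JSON is a trimmed, hypothetical stand-in for the real output (only the attachment IDs come from the post; the `ResourceType` values are assumed), and the helper names are made up for illustration.

```python
import json

# Trimmed, hypothetical sample of the CLI's JSON output.
SAMPLE_OUTPUT = """
{
  "TransitGatewayAttachments": [
    {"TransitGatewayAttachmentId": "tgw-attach-01b7c8d7d3bd4e2ca", "ResourceType": "vpc"},
    {"TransitGatewayAttachmentId": "tgw-attach-050c87ef9fb703c98", "ResourceType": "vpc"},
    {"TransitGatewayAttachmentId": "tgw-attach-079921a8810f490ab", "ResourceType": "vpc"}
  ]
}
"""


def vpc_attachment_ids(describe_json: str) -> list:
    """Extract the IDs of VPC attachments from the describe output."""
    data = json.loads(describe_json)
    return [
        attachment["TransitGatewayAttachmentId"]
        for attachment in data.get("TransitGatewayAttachments", [])
        if attachment.get("ResourceType") == "vpc"
    ]


def delete_commands(attachment_ids, region: str) -> list:
    """Render one delete command per attachment, mirroring step 2."""
    return [
        "aws ec2 delete-transit-gateway-vpc-attachment "
        f"--transit-gateway-attachment-id {attachment_id} --region {region}"
        for attachment_id in attachment_ids
    ]


if __name__ == "__main__":
    for command in delete_commands(vpc_attachment_ids(SAMPLE_OUTPUT), "us-east-1"):
        print(command)
```

Printing the commands rather than running them keeps the destructive step under manual review.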

3. List available Transit gateways

➜ ~ aws ec2 describe-transit-gateways --region us-east-1 | egrep -i "Transitgatewayid"
"TransitGatewayId": "tgw-08dfd0c519456953d"

4. Delete transit-gateway

aws ec2 delete-transit-gateway \
--transit-gateway-id tgw-08dfd0c519456953d --region us-east-1

"TransitGateway": Continue reading

AWS — Setting up Kinesis Video Stream is extremely easy with AWS Deeplens


Note: I wanted to quickly demo Kinesis video streaming. I initially thought a local Mac was a good candidate, but installation was extremely painful. I then created an Ubuntu 22.04 VM, which had errors, and moved to Ubuntu 20.04, where everything was fine except that the VMware abstraction was very poor for some reason and the camera was extremely slow. Finally I decided to compile everything on AWS DeepLens (which inherently has Ubuntu as its base OS), but it looks like DeepLens already covers installation of the KVS module, which I will write up below.

I was reading about Kinesis and its power for huge pipelines of inbound data from various sources. What impressed me was that Kinesis can be integrated with AWS Rekognition and AWS SageMaker to analyze data points in real-time streamed video or saved video. Apart from that, the AWS IoT YouTube channel is also planning a new series on integrating Kinesis with S3 for image collection directly from Kinesis streaming video.

Related URLs

This is the URL that I used to follow the process, and it was pretty straightforward

AWS Continue reading

Linux — Debugging a Python Process with gdb


Have you ever used py-bt inside gdb, or used gdb to trace a Linux process? If so, you can skip this post.

GDB is awesome! I am not an everyday programmer, but I have been studying the internals of Linux and operating systems and it’s fascinating. I wrote about processes last time and am now continuing towards threads. It is now making sense what happens when we import modules in Python; let’s examine the snippet below, which is typically used in Python.

>>> from concurrent.futures import ThreadPoolExecutor


>>> from concurrent.futures import ProcessPoolExecutor

One of the distinctions that I learnt when studying these two classes for concurrency is that ThreadPoolExecutor is typically used for I/O-bound activity while ProcessPoolExecutor is for CPU-bound compute activity, I left that

What I now understand is that a thread is an integral part of a process: there can be multiple threads within a process, and there can be multiple processes. Each process needs to be scheduled for CPU cycles, and there are different locking mechanisms within a single process. As Linux is a time-sharing operating system, the scheduler algorithm schedules these processes based on many factors, Continue reading
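
A minimal sketch of the I/O-bound case: several threads wait on simulated I/O at the same time, so the total wall time stays close to one task's delay rather than the sum. `fake_io_task` and the delays are invented for illustration; a real workload would be a network or disk call. `ProcessPoolExecutor` exposes the same `map`/`submit` interface and would be the choice for CPU-bound work.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_task(task_id: int, delay: float) -> int:
    """Simulate blocking I/O (hypothetical stand-in for a network call)."""
    time.sleep(delay)  # while sleeping, other threads are free to run
    return task_id


def run_io_tasks(count: int = 4, delay: float = 0.2):
    """Run `count` simulated I/O tasks in threads; return (results, seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=count) as pool:
        # map() preserves input order in its results.
        results = list(pool.map(lambda i: fake_io_task(i, delay), range(count)))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = run_io_tasks()
    print(results, f"{elapsed:.2f} s")
```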

Linux — Process, what happens under the hood?


I have taken operating systems for granted but looking internally it’s an excellent experience to understand how everything gets glued together, this post will be an overview of a process and what happens to a process.

A program is nothing but a set of instructions that the user writes to get a desired behavior on execution. When the program gets into a running state, a process is created and associated with the program; in other words, this is how the program is abstracted by the process.

Let’s consider the Python program below. We will create some child processes from a parent process and see what the underlying operating system sees when we execute the program.

You don’t have to be a python expert to know what is going on in this program, basically, we can see the python program has an imported OS module and then it calls on fork() operation to create various child processes.

import os
def child():
    print('\nA new child', os.getpid())

def parent():
    while True:
        newpid = os.fork()
        if newpid == 0:
            pids = (os.getpid(), newpid)
            print("Parent: %d,  Continue reading
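
Since the excerpt above is cut off, here is a separate, self-contained sketch of the same fork pattern rather than a reconstruction of the author's program. The function name `spawn_children` is hypothetical; it relies on `os.fork()` returning 0 in the child and the child's PID in the parent, and it only runs on Unix-like systems.

```python
import os


def spawn_children(count: int) -> dict:
    """Fork `count` children and return {child_pid: exit_code} from the parent."""
    child_pids = []
    for index in range(count):
        pid = os.fork()
        if pid == 0:
            # Child process: fork() returned 0. Exit straight away with a
            # recognisable code so the parent can tell the children apart.
            os._exit(index)
        # Parent process: fork() returned the new child's PID.
        child_pids.append(pid)

    exit_codes = {}
    for pid in child_pids:
        # Reap each child so it does not linger as a zombie.
        _, status = os.waitpid(pid, 0)
        exit_codes[pid] = os.WEXITSTATUS(status)
    return exit_codes


if __name__ == "__main__":
    print("Parent:", os.getpid())
    for child_pid, code in spawn_children(3).items():
        print("Child:", child_pid, "exit code:", code)
```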

Configuring BGP and open-source FRR docker on AWS — Advanced Networking


What is FRR?
a. License based AWS internal routing platform 
b. Only supports static routing and IPSEC vpn 
c. Open-source internet routing protocol suite for *nix platforms
d. Support BGP along with ISIS,OSPF networking protocols

Answer is at the end of the post, feel free to skip it, I just did not want to make a spoiler residing just below the question

Before I write anything on implementation, I can vouch for FRR’s stability. It’s an open-source internet routing protocol suite used by many organisations on bare-metal and cloud instances alike; it’s very stable and

Simply put, FRR can make your bare metal or a cloud instance a routing platform to connect various networks together. The reason why we explore this is that this setup builds onto other posts on how AWS interacts with various routing platforms hosted from on-premises and to show the possibility if someone is considering FRR as an alternative.

Setup is extremely simple but there is one caveat which consumed almost a day for me to figure out and at last it was an answer to a known problem. FRR builds on Quagga which provided Continue reading

Traffic Mirroring- Interesting one — AWS Advanced Networking


What is Traffic Mirroring ? 
a. Used for Content Inspection,Threat Monitoring,Troubleshooting 
b. Can only be implemented with a Load Balancer 
c. Needs Elastic Fabric Adapter 
d. Flow logs capture mirrored traffic   

Answer is at the end of the post, feel free to skip it, I just did not want to  make a spoiler residing just below the question

Traffic Mirroring is an awesome concept which can now be implemented with an AWS VPC. You can mirror the traffic and send packets to an EC2 instance or specific appliances for further processing.

  • Used for Content Inspection, Threat Monitoring and Troubleshooting.
  • An interesting aspect is the packet format

* When a packet gets mirrored it gets VXLAN encapsulated, so the end host/appliance should be able to decapsulate the VXLAN header (we will see a PCAP).

[ packet-formats.html]

* Two encapsulations: outer GENEVE (from LB if used) and inner VXLAN

* Source (which should be monitored: Network Interface)

* Target (Destination of mirrored traffic)

* Filter (What traffic types should be Continue reading

Transit VPC — AWS — Advanced Networking

What is Transit Gateway in AWS used for ?
a. Interconnect One or more VPC's eliminating need for full mesh 
b. customer gateway in only one region
c. Enhanced NAT gateway 
d. Can be used to Connect SD-Wan with VPC's

Answer is at the end of the post, feel free to skip it, I just did not want to make a spoiler residing just below the question

The post on transitive routing in AWS had a few different solutions at the end; the most efficient and future-proof one is a transit-gateway implementation for inter-VPC communication without needing a full mesh.

We will first explore an example and then come back to some of the concepts

Consider the VPCs below. By default there is no VPC peering, and if we want to achieve full connectivity we need n*(n-1)/2 peerings, which quickly gets out of hand as the number of VPCs increases.
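
To see how fast the full mesh grows, here is a small sketch comparing the n*(n-1)/2 peerings with the one attachment per VPC that a transit gateway needs. The function names are made up for illustration.

```python
def full_mesh_peerings(vpc_count: int) -> int:
    """Number of VPC peerings needed for a full mesh: n*(n-1)/2."""
    return vpc_count * (vpc_count - 1) // 2


def transit_gateway_attachments(vpc_count: int) -> int:
    """With a transit gateway, each VPC needs a single attachment."""
    return vpc_count


if __name__ == "__main__":
    for n in (3, 10, 50):
        print(n, full_mesh_peerings(n), transit_gateway_attachments(n))
```

For 10 VPCs that is 45 peerings against 10 attachments; for 50 VPCs it is 1225 against 50.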

The easiest way to achieve connectivity will be in 3 steps

  1. Create transit gateway
  2. Attach all the VPCs as attachments in the Transit gateway
  3. Most importantly, create a route in each subnet’s route table for the destination subnet via the transit gateway, otherwise connectivity will never work.

Continue reading

Transitive Routing — AWS — Advanced Networking

Before understanding the way AWS does transitive routing, let us try to wrap our heads around the transitive property in mathematics.

What is the Transitive Property? A property is called transitive if, for three quantities x, y and z, whenever x is related to y by some rule and y is related to z by the same rule, we can say x is related to z by the same rule.

Alright, now let’s look at the following scenario

So Connectivity from VPC3-VPC1 would work just fine, VPC2-VPC1 will also work just fine while VPC2-VPC3/VPC3-VPC2 via VPC1 will never work in AWS, this is the first thing that we should remember.
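
The behaviour described above can be modelled in a few lines: reachability over VPC peering requires a direct peering and is never derived through an intermediate VPC. The peering set below mirrors the scenario in the text (VPC1 peered with VPC2 and with VPC3); the names are illustrative only.

```python
# Direct peerings from the scenario: VPC1-VPC2 and VPC1-VPC3.
PEERINGS = {
    frozenset({"VPC1", "VPC2"}),
    frozenset({"VPC1", "VPC3"}),
}


def can_reach(src: str, dst: str, peerings=PEERINGS) -> bool:
    """Peering is non-transitive: only a direct peering gives connectivity."""
    return frozenset({src, dst}) in peerings


if __name__ == "__main__":
    print(can_reach("VPC3", "VPC1"))  # direct peering exists
    print(can_reach("VPC2", "VPC3"))  # would need to transit VPC1
```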

I see only downsides! Well, not everything is lost in this case. There are security benefits as well, a large part of which relate to preventing IP address spoofing. Imagine someone is trying to send a packet through your VPC: the check makes sure that an instance won’t accept a packet whose destination is not locally configured, and that the instance cannot send packets with an arbitrary source IP either. That is one of the primary reasons why Source and Destination checks are enabled by default.

Continue reading

What is eBPF? How is it used?

This will be a Series of Posts on eBPF extensively covering XDP and its usage.

eBPF is a new technology, implemented in Linux, that extends kernel functionality without having to modify the kernel. It is safe to execute thanks to a verification engine, and with its JIT compiler and LLVM-based toolchain it is basically a safe and secure tiny VM.

Medium –

Some Background

As my career is mainly in Network Engineering, when someone talks about network performance my initial thoughts jump to increasing network throughput, port density, and high-speed, secure interconnects. I recently came across Systems Performance by Brendan Gregg. I have to say I had never imagined that such a role was sought after. I went through the book and was indeed mind-blown by the granularity at which one can look into an individual system.

I would definitely recommend anyone in Networking/Cloud/Systems Engineering to go through this book if you haven’t already. It exposes a whole new level of the Linux kernel, eBPF and performance methodologies (Chapter 2), which I instantly fell in love with.

What Inspired me?

When I first saw the book I was under the initial impression that this was meant for Linux system Continue reading

VPC Endpoints – AWS

Who Should Read: If you are interested in VPC Endpoints or if you want to know more about AWS VPC services please continue.

I have been trying to understand endpoint services and thought I will write up a few posts on it, here are some posts I have written on medium(if you have access), I will port them to the blog by the weekend.

Again, these will be ported here as well along with an audio version.

