Author Archives: Rakesh M
Image sent to Telegram
I have a small greenhouse that was in the pipeline for over two years, and I finally decided to build it. Anyone into gardening will agree that everything grows better in a greenhouse, or at least it appears to.
Now, the initial impression is all good, but I have plans to learn and explore both the plant side of things and some image analysis for predictive action. For all of that to happen, I need a camera and a picture to start with.
The reason I chose to go with an EventBridge Pipe is to put it into practice, and from there to connect more Lambdas and Step Functions as the project expands.
Architecture diagram for sending images. Continue reading
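To make the flow concrete, here is a minimal sketch of the Lambda at the end of the pipe, assuming an S3-triggered flow (for example, S3 notifications into SQS, with the pipe reading the queue); how you unwrap the record depends on the pipe source. TELEGRAM_BOT_TOKEN and TELEGRAM_CHAT_ID are hypothetical environment variables, and Telegram's sendPhoto accepts a photo URL, so a presigned S3 URL is enough.

import json
import os
import urllib.request

import boto3

TELEGRAM_TOKEN = os.environ["TELEGRAM_BOT_TOKEN"]   # hypothetical env vars
CHAT_ID = os.environ["TELEGRAM_CHAT_ID"]

s3 = boto3.client("s3")


def send_photo(bucket, key):
    # Presign the object so Telegram can fetch the image straight from S3.
    url = s3.generate_presigned_url("get_object",
                                    Params={"Bucket": bucket, "Key": key},
                                    ExpiresIn=300)
    body = json.dumps({"chat_id": CHAT_ID, "photo": url,
                       "caption": key}).encode()
    req = urllib.request.Request(
        f"https://api.telegram.org/bot{TELEGRAM_TOKEN}/sendPhoto",
        data=body, headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req)


def lambda_handler(event, context):
    # An EventBridge Pipe delivers a batch (list) of records; here each
    # SQS message body carries an S3 event notification.
    for record in event:
        for s3_event in json.loads(record["body"]).get("Records", []):
            send_photo(s3_event["s3"]["bucket"]["name"],
                       s3_event["s3"]["object"]["key"])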
< MEDIUM: https://raaki-88.medium.com/home-automation-finally-roller-curtains-and-nightmares-b8ef1fc473d9 >
I am a fan and enthusiast of home automation. I have tried various things in the past and have now settled on a few, which I would like to share.
Let's get to the curtain rollers. Here is the catch: I have a remote for them and that's about it, nothing more, nothing less. My ideas were mostly around giving them network connectivity and manipulating them.
< MEDIUM: https://raaki-88.medium.com/enabling-nested-virtualisation-on-google-cloud-platform-instance-7f80f3120834 >
Important excerpts from the page below:
https://cloud.google.com/compute/docs/instances/nested-virtualization/overview
You must run Linux-based OSes that can run QEMU; you can’t use Windows Server images.
You can’t use E2, N2D, or N1 with attached GPUs, and A2 machine types.
You must use Intel Haswell or later processors; AMD processors are not supported. If the default processor for a zone is Sandy Bridge or Ivy Bridge, change the minimum CPU selection for the VMs in that zone to Intel Haswell or later. For information about the processors supported in each zone, see Available regions and zones.
Though there are many use cases, I will speak from a networking standpoint. Let's say you need to build a lab based on the popular Juniper vMX, or on Cisco or any other vendor. If you have a bare-metal instance, you have the ability to access the virtualised CPU cores and allocate them to QEMU, which is the underlying emulator.
Almost by default, most cloud providers disable access to VT-x for various reasons, and some instances are simply not capable of supporting it. So either choose a custom instance with Continue reading
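Once you have an instance with nested virtualisation enabled, a quick sanity check from inside the guest confirms whether VT-x is exposed. A minimal sketch (the usual grep on /proc/cpuinfo does the same):

#!/usr/bin/python3
# Count the cores whose CPU flags include 'vmx' (Intel VT-x); zero means
# nested virtualisation is not available to this instance.
with open("/proc/cpuinfo") as f:
    vmx_cores = sum(1 for line in f if line.startswith("flags") and " vmx" in line)
print(f"{vmx_cores} core(s) expose VT-x" if vmx_cores else "VT-x not exposed")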
< MEDIUM: https://raaki-88.medium.com/buffer-overflow-linux-process-stack-creation-and-i-d6f28b0239dc >
Processes and what happens during process creation were discussed previously in this post: https://medium.com/@raaki-88/linux-process-what-happens-under-the-hood-49e8bcf6173c
Now, let's understand what a buffer overflow is:
A buffer overflow is a type of software vulnerability that occurs when a program tries to store more data in a buffer (a temporary storage area) than it can hold. This can cause the program to overwrite adjacent memory locations, potentially leading to the execution of malicious code or the crashing of the program. Buffer overflow attacks are a common method used by hackers to gain unauthorized access to a system.
Generally, C and C++ are more vulnerable to buffer overflows, while languages like Python and Go have implementations that protect the stack.
I have written the program in Python but had to use the underlying C functionality to achieve something similar.
#!/usr/bin/python3
import ctypes

# Allocate a buffer that can hold only 8 bytes.
buffer = ctypes.create_string_buffer(8)

# Copy 1000 bytes into the 8-byte buffer. memmove performs no bounds
# checking, so this deliberately overwrites adjacent memory and will
# typically crash the interpreter with a segmentation fault.
ctypes.memmove(buffer, b"A" * 1000, 1000)

print('end of the program')
This is a very simple implementation: we created a buffer which can hold 8 bytes of memory, and next we move a block of memory of a much larger size into it, which will Continue reading
The biggest problem with my Google Drive is that it is flooded with documents, images and everything else that seemed really important in the moment, under names that are almost impossible to search for later.
I tried various Google APIs and Python programs with OAuth 2.0, and the integration is not easy; it needs tinkering with the OAuth consent page.
I wanted something easier: a workflow where I scan documents with the Scanner Pro app on my iPad/iPhone and upload them to storage, after which they get organised by rules that make them easily searchable and also listable. By listable I mean some sort of Google Sheets integration that simply records the filename and date once a file is uploaded to S3.
When there is a spreadsheet, even if search is already available, it gives me so much pleasure to fire up pandas and analyse or search it. It just makes me happy.
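For illustration, a minimal pandas sketch of that kind of search, assuming the sheet is exported as a CSV named uploads.csv with hypothetical filename and uploaded_at columns:

import pandas as pd

# Load the tracking sheet and parse the upload dates.
df = pd.read_csv("uploads.csv", parse_dates=["uploaded_at"])

# Case-insensitive filename search, newest uploads first.
hits = df[df["filename"].str.contains("invoice", case=False)]
print(hits.sort_values("uploaded_at", ascending=False))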
Note: I am a paid user of Zapier, and S3 is a premium app there. I am not an advertiser for Zapier in any way; I simply found the service useful.
Moving on, here is the workflow
With so much focus on serverless at re:Invent 2022 and on the advantages of Step Functions, I have started to move some of my code from Lambda to Step Functions.
Step Functions was hard until I figured out how data values can be mapped for input and how data can be passed and transformed between Lambda functions. I have made a small attempt to help someone who is starting out with Step Functions understand the various steps involved.
Basically, Step Functions can be used to construct the business logic, and Lambda can be used to transform the data, instead of transporting it with Lambda invokes from other Lambda functions.
Let’s take the following example
I have step_function_1, which needs to invoke another Lambda if my_var is 1 and otherwise do nothing.
This is simple if-else logic followed by the Lambda invoke function.
Now the power of Step Functions comes into play: you can write these conditionals and pass data from one Lambda to another, which makes future edits super scalable, and all of the logic reads both logically and pictorially. The best part is that this can be designed visually instead of learning Amazon States Language.
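For someone curious what the designer produces, here is a minimal sketch of that if-else Choice logic in Amazon States Language, built as a Python dict; the state names and the function ARN are placeholders.

import json

# If-else logic: invoke function_2 only when my_var equals 1.
definition = {
    "StartAt": "CheckMyVar",
    "States": {
        "CheckMyVar": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.my_var", "NumericEquals": 1,
                 "Next": "InvokeSecondLambda"}
            ],
            "Default": "DoNothing",
        },
        "InvokeSecondLambda": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:111122223333:function:function_2",
            "End": True,
        },
        "DoNothing": {"Type": "Succeed"},
    },
}

print(json.dumps(definition, indent=2))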
Let's try to do Continue reading
< MEDIUM: https://medium.com/@raaki-88/lambda-sync-async-invocations-29e12a47ce85 >
A short note on Lambda sync and async invocations. After re:Invent 2022, most of us started thinking about event-driven architectures, especially with EventBridge and Step Functions at the core of state changes and passing data between functions.
I like these ideas very much. Before Step Functions and EventBridge, Lambda's beautiful Event/Request-Response knobs served this purpose for me. With Step Functions in place, you remove the complexity of maintaining state and time-delay logic, and you get connectivity to different AWS services without relying on boto3 API calls. As one of the re:Invent 2022 talks reiterated, Lambda should be used to transform the data, not to transfer the data.
https://www.youtube.com/watch?v=SbL3a9YOW7s
This is by far the best video I have seen on the topic; the speaker has nailed it to perfection! Please watch it if you are interested in these architectures.
For those looking to use Lambda Request-Response/Event invocations, here are a few nitty-gritty details that I have not seen anyone else write about.
Let’s say
import json

import boto3


def call_other_lambda(lambda_payload):
    lambda_client = boto3.client('lambda')
    # InvocationType='Event' is asynchronous fire-and-forget: Lambda queues
    # the request and returns immediately without waiting for the result.
    lambda_client.invoke(FunctionName='function_2',
                         InvocationType='Event',
                         Payload=lambda_payload)


def lambda_handler(event, context):
    print(event.keys())
    get_euiid = event['end_device_ids']['device_id']
    lambda_payload = json.dumps( {json. Continue reading
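For comparison, a synchronous call is a sketch along these lines: with InvocationType='RequestResponse', the caller blocks until function_2 finishes and can read its returned payload.

# Synchronous invocation: wait for function_2 and read its result.
response = lambda_client.invoke(FunctionName='function_2',
                                InvocationType='RequestResponse',
                                Payload=lambda_payload)
result = json.loads(response['Payload'].read())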
< MEDIUM: https://raaki-88.medium.com/a-simple-bpftrace-to-see-tcp-sendbytes-as-a-histogram-f6e12355b86c >
A significant difference between BCC and bpftrace is that BCC is used for complex analysis, while bpftrace programs are mostly ad-hoc one-liners. bpftrace is an open-source tracer; references below.
https://ebpf.io/ — Excellent introduction to EBPF
https://github.com/iovisor/bpftrace — Excellent Resource.
Let me keep this short: we will try to use bpftrace to capture TCP send bytes as a histogram.
We will need
To understand how efficient this is, let's attach a tracepoint, a kernel static probe, to capture all of the new processes that get triggered. Imagine an equivalent of the top utility, with a means of reacting to the event at run time if required.
https://github.com/iovisor/bpftrace/blob/master/docs/reference_guide.md#probes — Lists out type of probes and their utility
You can clearly see that we invoked bpftrace for the execve system-call tracepoint. I executed the ping command and various other commands, and you can see that an inbound SSH connection also triggered execve-related commands and the system banner.
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_execve { join(args->argv); }'
Attaching 1 probe...
clear
ping 1.1.1.1 -c 1
/usr/bin/clear_console -q
/usr/sbin/sshd -D -o AuthorizedKeysCommand /usr/share/ec2-instance-connect/eic_run_authorized_keys %u Continue reading
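For the TCP send-bytes histogram that the title promises, the classic approach is a one-liner along these lines (a sketch; the argument position assumes the common tcp_sendmsg(sk, msg, size) kernel signature):

sudo bpftrace -e 'kprobe:tcp_sendmsg { @send_bytes = hist(arg2); }'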
< MEDIUM: https://raaki-88.medium.com/flamegraph-htop-benchmarking-cpu-linux-e0b8a8bb6a94 >
I have written a small post on what happens at the process level; now let's throw some flame into it with flame graphs.
I am a fan of Brendan Gregg's work; his writings and the flame graph tool are his contributions to the open-source community:
https://www.brendangregg.com/flamegraphs.html
Before moving into Flamegraph, let’s understand some Benchmarking concepts.
Benchmarking in general is a methodology for testing resource limits and regressions in a controlled environment. There are two types of benchmarking, described below.
Most benchmarking results boil down to the price/performance ratio. Benchmarking can start as proof-of-concept testing, move on to testing application/system load to identify bottlenecks for troubleshooting or enhancement, or simply establish the maximum stress the system is capable of taking.
Enterprise / on-premises benchmarking: take the simple scenario of building out a data centre with huge racks of networking and computing equipment. As data-centre builds are mostly identical and mirrored, benchmarking before raising the purchase order is critical.
Cloud-based benchmarking: this is a really inexpensive setup. While Continue reading
< MEDIUM: https://raaki-88.medium.com/aws-direct-connect-site-link-a-very-excellent-service-10c13a389c8d >
SiteLink is a really nice extension to the DX Gateway offering. Let me simplify it.
Reference: https://aws.amazon.com/blogs/networking-and-content-delivery/introducing-aws-direct-connect-sitelink/ — I can't recommend this enough; it is a very, very nice read.
A few important points
Problem: I want to connect my two data centres to each other through the Direct Connect Gateway, over the AWS backbone.
Let's look at a reference architecture
Replicating the above scenario
A few important aspects
< MEDIUM: https://towardsaws.com/transit-gateway-a-one-stop-shop-e520d2f0afe3 >
I like Transit Gateway on so many levels; it is truly a next-generation service, integrating many different points of ingress with your VPCs.
Scenario 1 — Connect your VPCs
Interconnecting VPCs is typically done through VPC peering. While that is still valid, you can now easily interconnect VPCs through the Transit Gateway attachments feature; where VPC peering handles only VPC-to-VPC, a Transit Gateway can connect VPCs and DX gateways, and you can terminate IPsec VPNs directly on the Transit Gateway, as the sketch below shows.
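A rough boto3 sketch of scenario 1 (all IDs are placeholders): create the Transit Gateway, then attach each VPC to it.

import boto3

ec2 = ec2 = boto3.client("ec2", region_name="us-east-1")

# Create the Transit Gateway; in real use, wait for it to become
# 'available' before creating attachments.
tgw = ec2.create_transit_gateway(Description="hub for VPC-to-VPC traffic")
tgw_id = tgw["TransitGateway"]["TransitGatewayId"]

# Attach each VPC through one of its subnets per Availability Zone.
for vpc_id, subnet_id in [("vpc-11111111", "subnet-11111111"),
                          ("vpc-22222222", "subnet-22222222")]:
    ec2.create_transit_gateway_vpc_attachment(TransitGatewayId=tgw_id,
                                              VpcId=vpc_id,
                                              SubnetIds=[subnet_id])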
< MEDIUM: https://towardsaws.com/direct-connect-part-2-public-vif-5bc0a2d2c478 >
First post (Direct Connect – Part 1): https://raaki-88.medium.com/direct-connect-part-1-dc3e9369933
Though the Direct Connect offering always connects you to AWS, its operation differs depending on the VIF we configure.
Public VIF
→ With this setup, nothing is related to a VPC at all. All it does is advertise Amazon-owned public prefixes for services like S3 and EC2 (Elastic IPs only, not your private IPs), and that is all there is to it.
→ The customer has the flexibility to scope the advertisement propagation to the LOCAL, CONTINENT, or GLOBAL level within AWS in the outbound direction, and to filter the inbound updates advertised towards them.
Here is how the community scope looks by default; you also have the flexibility to filter routes inbound to the customer.
Note: Outbound communities restrict the advertisement of prefixes to the region/continent/global scope for any sort of anycast implementation.
If the customer sends a route with a community:
7224:9100 → This will be local to the region
7224:9200 → This will be local to the continent; for example, the scope extends only across the EU.
7224:9300 → Global; it's global by default even if you don't export Continue reading
< MEDIUM: https://raaki-88.medium.com/direct-connect-part-1-dc3e9369933 >
AWS Advanced Networking Prep and General focus
Notion — https://meteor-honeycup-16b.notion.site/Direct-Connect-a61557d18e784e778b4500197168454c
What is the Direct Connect product trying to solve?
We have seen IPsec site-to-site VPN; Direct Connect is a nice extension to that. With IPsec VPN we connected to an AWS VPC securely over the internet; with Direct Connect a cable terminates at our data-centre premises and connects directly into AWS infrastructure, so no internet service provider is needed.
Advantages:
What are my building blocks?
Functional Building Block?
Ref:https://docs.aws.amazon.com/directconnect/latest/UserGuide/WorkingWithVirtualInterfaces.html
So, once we have a connection set up, everything revolves around the VIF — the Virtual Interface.
Direct Connect can be divided into two parts
a. Public VIF — we are speaking about public IP addresses routable on the internet.
b. Private VIF — we are speaking about private IP addresses that reach into your VPC.
< MEDIUM: https://raaki-88.medium.com/aws-advanced-networking-ipsec-vpn-with-bgp-frr-and-docker-ae29a3ec6d85 >
The previous post covered IPsec VPN implementation with static routing and made some general points about IPsec VPN implementation; this post aims to build an IPsec VPN with the dynamic routing offered by the VGW, which is BGP.
Article on FRR, Docker — https://towardsaws.com/configuring-bgp-and-open-source-frr-docker-on-aws-advanced-networking-d21fd0d76b33
We will re-use the same concept and will start a BGP route exchange over IPSEC VPN.
https://meteor-honeycup-16b.notion.site/Site-2-Site-VPN-BGP-FRR-Docker-d818267a1041401481554e6f30764dfb — Notes and Topology
Lab Video — https://youtu.be/PmLkHRAMfMU
Few points to note:
https://meteor-honeycup-16b.notion.site/Site-to-Site-VPN-144441a6ac0b4e39a514adc67a8348d5 — This will be updated frequently and has the entire notes on the topics
Lab / Part 1— https://meteor-honeycup-16b.notion.site/Part-1-Building-Customer-VPN-Server-and-a-Client-688eed381f2849dfbe02f5eed740a573
Part 1 — https://youtu.be/h8zFEkVXV24
Lab / Part 2 — https://meteor-honeycup-16b.notion.site/Part-2-Setting-up-VGW-on-AWS-9055cd53a0174f51bd064bb2e3c1f3ac
Part 2 — https://youtu.be/PxJ04myIGJs
Lab / Part 3. — https://meteor-honeycup-16b.notion.site/Part-3-Configuring-Routing-and-verifying-Connectivity-0f2d03eae3474bb897a0f897c927786a
Part 3 — https://youtu.be/mf-Qymz-_Hg
Notes
https://meteor-honeycup-16b.notion.site/Site-to-Site-VPN-144441a6ac0b4e39a514adc67a8348d5 — This will be updated frequently and has the entire notes on the topics
Let’s imagine you have built your Continue reading
< MEDIUM: https://raaki-88.medium.com/tshark-packet-analysis-5d0dcc96e56a >
Commands used in the post below, as a quick reference instead of going through the whole post:

sudo tshark -f "tcp port 80" -F pcap -w /var/tmp/port_80_cap.pcap -c 10
sudo tshark -r /var/tmp/port_80_cap.pcap
sudo tshark -r /var/tmp/port_80_cap.pcap -Tfields -e ip.src -e tcp.port -e ip.ttl -e ip.dst
sudo tshark -f "tcp port 80" -F pcap -w /var/tmp/port_80_cap.pcap -c 10
sudo tshark -r /var/tmp/port_80_cap.pcap -Tfields -Y ip.dst==172.31.33.25 -e ip.dst -e tcp.dstport
sudo tshark -r capture_ospf.cap
sudo tshark -r capture_ospf.cap -Y "frame.number == 4"
sudo tshark -r capture_ospf.cap -Y "frame.number == 4" -V
Wireshark is famous for packet capture and for the analysis of various packet-capture files. If you have never used Wireshark before, it is a sophisticated and popular GUI tool for doing packet captures and analysis.
But you don't always need a GUI tool, and more importantly you may not have access to a GUI environment. For example, on an Ubuntu EC2 cloud instance you typically would not install a GUI extension; it is meant to run server workloads.
This is where Tshark Continue reading
I am pasting my notes on cleaning up a Transit Gateway and its attachments. This is readily available in the AWS documentation, but I thought I would paste it here for anyone who wants to quickly copy the steps instead of going through the docs. We could be more sophisticated with Python / Ansible / Terraform and parse the outputs, but for now this is what I did to clean up after some practice. Do not forget this step: it incurred a good cost for me, though I was saved by AWS credits!
1. List the available transit-gateway attachments, as they must be deleted before the transit gateway itself.
aws ec2 describe-transit-gateway-attachments --region us-east-1 | egrep -i TransitGatewayAttachmentId -> This will list the TGW attachments in us-east-1
➜ ~ aws ec2 describe-transit-gateway-attachments --region us-east-1 | egrep -i TransitGatewayAttachmentId
"TransitGatewayAttachmentId": "tgw-attach-01b7c8d7d3bd4e2ca",
"TransitGatewayAttachmentId": "tgw-attach-050c87ef9fb703c98",
"TransitGatewayAttachmentId": "tgw-attach-079921a8810f490ab",
2. Delete the available attachments
aws ec2 delete-transit-gateway-vpc-attachment \
--transit-gateway-attachment-id tgw-attach-01b7c8d7d3bd4e2ca --region us-east-1
aws ec2 delete-transit-gateway-vpc-attachment \
--transit-gateway-attachment-id tgw-attach-050c87ef9fb703c98 --region us-east-1
aws ec2 delete-transit-gateway-vpc-attachment \
--transit-gateway-attachment-id tgw-attach-079921a8810f490ab --region us-east-1
3. List the available transit gateways
aws ec2 describe-transit-gateways --region us-east-1 | egrep -i "TransitGatewayId"
"TransitGatewayId": "tgw-08dfd0c519456953d"
4. Delete transit-gateway
aws ec2 delete-transit-gateway \
--transit-gateway-id tgw-08dfd0c519456953d --region us-east-1
{
"TransitGateway": Continue reading
Note: I wanted to quickly demo Kinesis Video Streams. I initially thought my local Mac was a good candidate, but installation was extremely painful. I then created an Ubuntu 22.04 VM and hit errors, and moved to Ubuntu 20.04, where everything was fine, but the VMware camera abstraction was very poor for some reason and the camera was extremely slow. Finally I decided to compile everything on an AWS DeepLens (which has Ubuntu as its base OS), only to find that DeepLens already covers the installation of the KVS module, which I write up below.
I was reading about Kinesis and its power for huge pipelines of inbound data from various sources. What impressed me is that Kinesis can be integrated with AWS Rekognition and AWS SageMaker to analyse various data points in real-time or saved video streams. Apart from that, the AWS IoT YouTube channel is also planning a new series on integrating Kinesis with S3 for image collection, driven entirely from the Kinesis video stream itself.
Related URLs
This is the URL I used to follow the process, and it was pretty straightforward:
https://docs.aws.amazon.com/deeplens/latest/dg/deeplens-kinesis-video-streams-api.html
AWS Continue reading
< MEDIUM: https://raaki-88.medium.com/linux-debugging-a-python-process-with-gdb-b9ea67173aff >
Have you ever used py-bt inside GDB, or used GDB to trace a Linux process? If so, you can skip this post.
GDB is awesome! I am not an everyday programmer, but I have been studying the internals of Linux and operating systems, and it is fascinating. I wrote about processes last time <https://raaki-88.medium.com/linux-process-what-happens-under-the-hood-49e8bcf6173c> and am continuing towards threads. It now makes sense when we import modules in Python; let's examine the snippet below, which is typically used in Python.
>>> from concurrent.futures import ThreadPoolExecutor
vs
>>> from concurrent.futures import ProcessPoolExecutor
One of the distinctions I learnt when studying these two for multiprocessing is that ThreadPoolExecutor is typically used when you have I/O-bound activity, while ProcessPoolExecutor is for CPU-bound compute activity.
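A minimal sketch of that distinction with hypothetical toy tasks: threads overlap blocking waits, while processes spread pure computation across cores.

import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def io_task(n):
    time.sleep(0.1)  # stand-in for a network or disk wait
    return n

def cpu_task(n):
    return sum(i * i for i in range(n))  # pure computation

if __name__ == "__main__":
    # Threads: the GIL is released while sleeping, so the waits overlap.
    with ThreadPoolExecutor(max_workers=4) as pool:
        print(list(pool.map(io_task, range(8))))
    # Processes: separate interpreters sidestep the GIL for CPU work.
    with ProcessPoolExecutor(max_workers=4) as pool:
        print(list(pool.map(cpu_task, [100_000] * 4)))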
What I now understand is that a thread is an integral part of a process: there can be multiple threads within a process, and there can be multiple processes. Each process needs to be scheduled for CPU cycles, and there are different locking mechanisms within a single process. The scheduler schedules these processes, as Linux is a time-sharing operating system, based on many factors, Continue reading