[{"content":"A complete microservices platform running on Amazon EKS — built from scratch to demonstrate what production infrastructure actually looks like, not what tutorials pretend it looks like.\nThe Problem Every DevOps portfolio project looks the same. Terraform creates a VPC, spins up an EKS cluster, deploys nginx, and calls it \u0026ldquo;production-ready.\u0026rdquo; The README has 30 bullet points of AWS services. The repo has one commit that adds 200 files. Nobody learned anything building it, and interviewers can tell.\nI wanted to build something different. Something that answers the questions that actually come up when you run infrastructure for a team:\nHow do pods get database passwords without anyone putting credentials in Git? What happens when your Spot instances get a 2-minute termination notice at 3 AM? How do you know if a service is slow because of a bad query or because a node is running out of memory? How do you make sure a junior dev can\u0026rsquo;t accidentally kubectl delete namespace production? These aren\u0026rsquo;t theoretical questions. They\u0026rsquo;re the difference between a portfolio project and a real platform.\nWhat I Built Five microservices — UI, Catalog, Cart, Orders, Checkout — each with its own database, deployed on EKS across three availability zones. The kind of setup you\u0026rsquo;d find at a mid-size company running retail services on AWS, with the same constraints around cost, security, and operational readiness.\nInfrastructure That Actually Makes Decisions The VPC has three subnet tiers, not two. Public subnets for ALBs. Private app subnets for EKS nodes. Private data subnets for RDS and ElastiCache.\nWhy? Because when an EKS node gets compromised, the attacker lands in the app tier. 
They can reach the database port, but they\u0026rsquo;re in a different subnet with a different security group — they still need credentials from Secrets Manager, which requires a Pod Identity role they don\u0026rsquo;t have.\nEvery security group uses standalone rules instead of inline blocks. This matters because Terraform destroys and recreates a security group when you modify an inline rule — cascading through every resource that references it. Standalone rules are additive. No downtime, no cascading destroys.\nKMS has three separate keys — one for S3, one for EKS etcd, one for RDS. Revoking one doesn\u0026rsquo;t break the others. All toggled off in dev to save $3/month, on in prod. Same code, different variable.\nSecrets Nobody Ever Sees\nTerraform generates database passwords with random_password. Stores them in AWS Secrets Manager. The Secrets Store CSI Driver fetches them at pod startup using the service account\u0026rsquo;s Pod Identity role and mounts them as Kubernetes Secrets.\nNo passwords in Git. No passwords in Helm charts. No passwords typed into kubectl. No human ever sees the password.\nIf a pod tries to access a secret it shouldn\u0026rsquo;t have, it gets AccessDeniedException — not a silent fallback to an overpermissioned node role. That\u0026rsquo;s why I chose Pod Identity over IRSA. IRSA fails silently. Pod Identity fails loudly.\nAutoscaling That Thinks\nHPA watches CPU utilisation and scales pods. But more pods need more nodes. Karpenter watches for unschedulable pods and provisions a right-sized node in about 60 seconds — not 3-5 minutes like Cluster Autoscaler. If a pod needs 256 MB, Karpenter picks a t3.micro, not a t3.large.\nTwo NodePools:\nOn-demand for baseline stability\nSpot for burst capacity at 60-70% savings\nAn SQS queue catches spot interruption warnings via EventBridge. Karpenter drains the node before AWS reclaims it. The pod moves to another node. 
Users don\u0026rsquo;t notice.\nGitOps, Not \u0026ldquo;Push and Pray\u0026rdquo;\nArgoCD watches the Helm values in Git. Push a change, ArgoCD syncs it to the cluster.\nselfHeal: true — if someone kubectl edits a deployment in production, ArgoCD reverts it within seconds\nprune: true — if you delete a resource from Git, ArgoCD deletes it from the cluster\nRollback is git revert, not \u0026ldquo;find the last working image tag\u0026rdquo;\nGitHub Actions builds the image, pushes to ECR, updates the Helm values file. ArgoCD picks it up. No kubectl apply in CI pipelines. No imperative commands. Git is the single source of truth.\nObservability From Day One\nADOT collectors run as DaemonSets. Traces go to X-Ray — you can follow a single request from the ALB through the UI service, into the Catalog API, down to the MySQL query.\nSignal | Destination | Dev Retention | Prod Retention\nTraces | AWS X-Ray | 14 days | 90 days\nLogs | CloudWatch | 14 days | 90 days\nMetrics | Amazon Managed Prometheus → Grafana | Real-time | Real-time\nWhen latency spikes on checkout, you don\u0026rsquo;t guess. You look at the trace, find the slow span, check if it\u0026rsquo;s the Redis connection or the SQS publish, and fix it.\nThe Numbers\n| Dev | Prod\nMonthly cost | ~$293 | ~$465\nKMS encryption | Off | On\nWAF | Off | On\nMulti-AZ RDS | Off | On\nNAT Gateways | 1 | 3\nVPC endpoints | Off | On\nSame code. Same modules. 
Different terraform.tfvars.\nStack\nLayer | Technology\nCloud | AWS (EKS, RDS, ElastiCache, SQS, Secrets Manager, KMS)\nIaC | Terraform (reusable modules, per-environment configs)\nContainers | Docker (multi-stage, non-root, multi-arch)\nOrchestration | Kubernetes on EKS, Helm charts, ArgoCD\nAutoscaling | HPA + Karpenter (on-demand + spot)\nSecurity | KMS, WAF, NetworkPolicies, Pod Identity, IMDSv2\nObservability | OpenTelemetry, X-Ray, CloudWatch, Prometheus, Grafana\nCI/CD | GitHub Actions → ECR → ArgoCD\nKey Architectural Decisions\nPod Identity over IRSA — fails loudly instead of silently falling back to the node role\n3-tier subnets — isolates app and data layers so compromised nodes can\u0026rsquo;t reach databases directly\nKarpenter over Cluster Autoscaler — right-sizes nodes in 60 seconds, not 3-5 minutes\nStandalone security group rules — prevents Terraform cascading destroys on rule changes\nSeparate KMS keys per service — revoking one doesn\u0026rsquo;t break the others\nFull decision log: DECISIONS.md\nLinks\nGitHub: devops-eks-ausmart\nBased on: AWS Retail Store Sample App\nArchitecture Decisions: DECISIONS.md ","permalink":"https://manjubhandari.com/projects/ausmart-eks/","summary":"\u003cp\u003eA complete microservices platform running on Amazon EKS — built from scratch to demonstrate what production infrastructure actually looks like, not what tutorials pretend it looks like.\u003c/p\u003e","title":"AusMart — Production-Grade Retail Platform on AWS EKS"},{"content":"A bare-metal Kubernetes cluster on a Raspberry Pi 5 that shows what managed Kubernetes hides from you.\nWhy I Built This\nWhen you only use managed Kubernetes, you miss what\u0026rsquo;s actually happening underneath. On a Pi with 4GB RAM, every resource request matters. You learn about memory limits, CPU scheduling, and container efficiency in a way that EKS abstracts away.\nThe constraint IS the lesson. 
When I go back to EKS after debugging something on the Pi, I understand every abstraction layer better.\nWhat It Is\nA self-hosted Kubernetes learning environment on physical hardware:\nHardware: Raspberry Pi 5, ARM64, 4GB RAM, 64GB USB 3.0 SSD\nK8s Distribution: k3s (lightweight, perfect for constrained hardware)\nContainer Runtime: containerd\nOS: Raspberry Pi OS (Debian-based)\nGitOps: FluxCD for automated deployments\nSecrets: HashiCorp Vault\nObservability: Prometheus + Grafana\nAccess: Cloudflare Tunnels for secure remote access\nArchitecture\nRaspberry Pi 5 (ARM64, 4GB RAM, 64GB SSD)\n│\n└── k3s (lightweight Kubernetes)\n    ├── apps/\n    │   └── deployed applications\n    ├── monitoring/\n    │   ├── Prometheus\n    │   └── Grafana\n    └── clusters/staging/\n        └── cluster configs\nGitOps: FluxCD watches Git → auto-deploys changes\nSecrets: HashiCorp Vault (not K8s secrets)\nAccess: Cloudflare Tunnels (no port forwarding)\nWhat I Learned\nRunning Kubernetes on constrained hardware teaches you things no certification can:\nResource limits matter. When you have 4GB total, you learn exactly how much memory Prometheus needs versus what the docs say.\nARM64 is different. Not every container image supports ARM. You learn to build multi-arch images and check manifests.\nStorage on bare metal is real. No EBS to auto-provision. You understand what CSI drivers actually do.\nNetworking without a cloud VPC. You configure MetalLB, understand ARP, and deal with DHCP leases. The cloud hides all of this.\nGitOps on real hardware. When FluxCD reconciles on a Pi, you see the actual resource pressure of the reconciliation loop. 
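The resource-limits lesson above can be sketched as a pod spec. This is a minimal illustration, not a manifest from the repo; the image name and the request/limit numbers are assumptions you would tune against what Prometheus actually uses on a 4GB node:

```yaml
# Illustrative only: values are assumptions for a 4GB Pi, not measured figures.
apiVersion: v1
kind: Pod
metadata:
  name: prometheus
  namespace: monitoring
spec:
  containers:
    - name: prometheus
      image: prom/prometheus        # official image is multi-arch, includes linux/arm64
      resources:
        requests:
          memory: "512Mi"           # what the k3s scheduler reserves on the node
          cpu: "250m"
        limits:
          memory: "768Mi"           # OOM-kill ceiling; leaves headroom for k3s itself
```

On a 4GB machine, the gap between request and limit is the whole game: set the request too low and the node overcommits; set the limit too low and Prometheus gets OOM-killed mid-scrape.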
Planned\nMulti-node cluster expansion\nArgoCD alongside FluxCD comparison\nLonghorn persistent storage\nNetwork policies\nCI/CD pipeline integration\nBackup and recovery\nLinks\nGitHub Repository ","permalink":"https://manjubhandari.com/projects/pi-kubernetes/","summary":"\u003cp\u003eA bare-metal Kubernetes cluster on a Raspberry Pi 5 that shows what managed Kubernetes hides from you.\u003c/p\u003e","title":"Production Kubernetes Homelab"},{"content":"Proving dual-cloud capability. Built specifically because Perth\u0026rsquo;s mining and resources sector runs heavily on Azure.\nWhy Azure\nI\u0026rsquo;ve spent 6+ years deep in AWS. Perth\u0026rsquo;s mining sector — Fortescue, BHP, Woodside, Rio Tinto — runs heavily on Azure. This project is my way of proving I can think in both clouds. Not just translate services one-to-one, but understand Azure-native patterns like Managed Identity, Key Vault integration, and AKS-specific networking.\nThe interesting part is seeing where the clouds diverge. Azure\u0026rsquo;s identity model is fundamentally different from AWS IAM, and that changes how you architect everything downstream.\nWhat I\u0026rsquo;m Building\nAzure Enterprise Infrastructure Platform using Terraform IaC:\nNetworking: Azure Virtual Network (hub-and-spoke)\nCompute: Azure Kubernetes Service (AKS)\nRegistry: Azure Container Registry (ACR)\nSecrets: Azure Key Vault with Managed Identity (zero-credential auth)\nObservability: Azure Monitor + Log Analytics\nDatabase: Azure SQL\nIngress: Azure Application Gateway\nCI/CD: GitHub Actions with Terraform plan/apply\nState: Azure Storage backend\nEnvironments: Dev/prod separation with same Terraform code\nArchitecture\nGitHub Actions CI → Terraform Plan/Apply\n    ↓\nAzure VNet (Hub-and-Spoke)\n└── AKS Cluster\n    ├── App Services → Azure SQL\n    ├── Key Vault (Managed Identity)\n    └── ACR (container images)\nObservability: Azure Monitor → Log Analytics → Dashboards\nAuth: Managed Identity (zero credentials)\nState: Azure Storage backend\nCross-Cloud 
Thinking\nConcept | AWS | Azure\nIdentity | IAM Roles / Pod Identity | Managed Identity / Workload Identity\nSecrets | Secrets Manager | Key Vault\nK8s | EKS | AKS\nRegistry | ECR | ACR\nIaC State | S3 + DynamoDB | Azure Storage\nObservability | CloudWatch + X-Ray | Azure Monitor + Log Analytics\nSame architectural patterns. Different native implementations.\nStatus\nIn progress. Architecture designed, Terraform modules being built.\n","permalink":"https://manjubhandari.com/projects/azure-platform/","summary":"\u003cp\u003eProving dual-cloud capability. Built specifically because Perth\u0026rsquo;s mining and resources sector runs heavily on Azure.\u003c/p\u003e","title":"Azure Enterprise Platform"},{"content":"When I was building the AusMart EKS platform, I had to decide how pods would authenticate to AWS services. The two options: IRSA (IAM Roles for Service Accounts) and the newer EKS Pod Identity.\nThe Problem\nEvery microservice needs AWS credentials. The catalog service reads from DynamoDB. The orders service writes to SQS. The checkout service fetches secrets from Secrets Manager. Each service should only have access to what it needs — nothing more.\nWhy Not IRSA?\nIRSA has been the standard for years. It works by annotating Kubernetes service accounts with an IAM role ARN, and the OIDC provider handles the token exchange.\nThe problem is what happens when it breaks.\nIf the OIDC provider is misconfigured, or the service account annotation is wrong, or the trust policy doesn\u0026rsquo;t match — the pod doesn\u0026rsquo;t get an error. It falls back to the node\u0026rsquo;s instance profile. Suddenly your catalog pod has the same permissions as the EC2 node it\u0026rsquo;s running on. You won\u0026rsquo;t notice until an audit, or worse, an incident.\nIRSA fails silently.\nWhy Pod Identity\nEKS Pod Identity takes a different approach. You create an association between the service account and the IAM role at the EKS API level. 
The EKS Pod Identity Agent (a DaemonSet) intercepts credential requests and provides the right role.\nIf the association is wrong or missing — the pod gets AccessDeniedException. Immediately. Loudly. Your monitoring picks it up. Your logs show it. The developer knows something is wrong before the code even runs.\nPod Identity fails loudly.\nThe Decision\nFor a production platform where security matters — and especially in environments with compliance requirements like PCI-DSS — I want failures to be obvious. Silent fallbacks are how privilege escalation bugs hide in plain sight for months.\nPod Identity also has simpler configuration. No OIDC provider to manage. No trust policy JSON to get right. The association is a single API call:\naws eks create-pod-identity-association \\\n  --cluster-name ausmart \\\n  --namespace catalog \\\n  --service-account catalog-sa \\\n  --role-arn arn:aws:iam::ACCOUNT_ID:role/catalog-role\nLess configuration means fewer places to make mistakes.\nWhat I Learned\nThe best security decisions aren\u0026rsquo;t always about adding more controls. Sometimes they\u0026rsquo;re about choosing the tool that tells you loudest when something is wrong.\nEvery security group, every IAM role, every encryption key in the AusMart platform follows this principle: fail loud, fail fast, fail safe.\n","permalink":"https://manjubhandari.com/blogs/why-i-chose-pod-identity-over-irsa/","summary":"\u003cp\u003eWhen I was building the AusMart EKS platform, I had to decide how pods would authenticate to AWS services. The two options: IRSA (IAM Roles for Service Accounts) and the newer EKS Pod Identity.\u003c/p\u003e","title":"Why I Chose EKS Pod Identity Over IRSA"},{"content":"Cloud and DevOps Engineer with 6+ years building DevOps culture from the ground up across FinTech, healthcare, and education. 
I architect scalable cloud platforms, build automation that gets code from commit to production safely, and own what I build — including on-call, incident response, and reliability.\nMy Journey\nMy DevOps journey started with a university capstone project that won the best project award at Edith Cowan University. We built an ultrasound reporting system for radiologists. It got commercialised into MedReport360, and I joined as the first engineer — no infrastructure existed. I built everything from the ground up across 4 environments.\nWhen Prime Radiology acquired the company, I stayed on and owned the CI/CD, Docker containerisation, infrastructure reliability, and production incident response for a 24/7 healthcare system.\nThen I moved to ShopSe, a FinTech in India, where I took ownership of a Kubernetes platform handling 5 million+ API transactions annually for bank payment integrations. Over 3 years, I migrated their monolith to 15+ microservices, built the CI/CD pipelines, set up centralised logging and monitoring from scratch, and ensured PCI-DSS compliance across every deployment.\nNow I\u0026rsquo;m back in Perth at UWA, where I designed and delivered the entire ML cloud platform from scratch — 21 Terraform modules, ~200 resources, multi-region ready with data sovereignty controls.\nIn every role, I\u0026rsquo;ve walked into a problem and left it automated, monitored, and secure.\nSkills\nTerraform \u0026 Ansible\nAWS (EKS, EC2, Lambda, RDS, S3)\nDocker \u0026 Kubernetes (EKS)\nJenkins \u0026 GitHub Actions\nArgoCD \u0026 GitOps\nPrometheus \u0026 Grafana\nELK Stack \u0026 CloudWatch\nPCI-DSS \u0026 ISO 27001\nSonarQube, Trivy, OWASP ZAP\nPython \u0026 Bash\nBCP/DR \u0026 Incident Response\nGit, Jira, Confluence\nEducation\nMaster of Management Information Systems — Edith Cowan University (2018-2020), Perth\nBachelor of Engineering, Computer Science — Global Academy of Technology (2010-2014), India\nCertifications\nAWS Certified Cloud Practitioner — 2022-2027\nHashiCorp Terraform 
Associate (in progress) ","permalink":"https://manjubhandari.com/about/","summary":"\u003cp\u003eCloud and DevOps Engineer with 6+ years building DevOps culture from the ground up across FinTech, healthcare, and education. I architect scalable cloud platforms, build automation that gets code from commit to production safely, and own what I build — including on-call, incident response, and reliability.\u003c/p\u003e\n\u003ch2 id=\"my-journey\"\u003eMy Journey\u003c/h2\u003e\n\u003cp\u003eMy DevOps journey started with a university capstone project that won the best project award at Edith Cowan University. We built an ultrasound reporting system for radiologists. It got commercialised into MedReport360, and I joined as the first engineer — no infrastructure existed. I built everything from the ground up across 4 environments.\u003c/p\u003e","title":"About Me"},{"content":" What's Next?\nGet In Touch\nLooking for my next role in Perth. DevOps, Cloud, Platform, or Infrastructure Engineer. Contract or permanent. My inbox is always open.\nSay Hello\nEmail: mbcloud15@gmail.com\nLinkedIn: linkedin.com/in/manjunathbhandari\nGitHub: github.com/iammanjubhandari\nLocation: Perth, Western Australia\n","permalink":"https://manjubhandari.com/contact/","summary":"\u003cdiv class=\"contact-section\" style=\"margin: 2rem auto;\"\u003e\n  \u003cspan class=\"number\"\u003eWhat's Next?\u003c/span\u003e\n  \u003ch2\u003eGet In Touch\u003c/h2\u003e\n  \u003cp\u003eLooking for my next role in Perth. DevOps, Cloud, Platform, or Infrastructure Engineer. Contract or permanent. 
My inbox is always open.\u003c/p\u003e\n  \u003ca href=\"mailto:mbcloud15@gmail.com\" class=\"btn-contact\"\u003eSay Hello\u003c/a\u003e\n\u003c/div\u003e\n\u003chr\u003e\n\u003cp\u003e\u003cstrong\u003eEmail:\u003c/strong\u003e \u003ca href=\"mailto:mbcloud15@gmail.com\"\u003embcloud15@gmail.com\u003c/a\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eLinkedIn:\u003c/strong\u003e \u003ca href=\"https://linkedin.com/in/manjunathbhandari\"\u003elinkedin.com/in/manjunathbhandari\u003c/a\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eGitHub:\u003c/strong\u003e \u003ca href=\"https://github.com/iammanjubhandari\"\u003egithub.com/iammanjubhandari\u003c/a\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eLocation:\u003c/strong\u003e Perth, Western Australia\u003c/p\u003e","title":"Contact"}]