Deployment & Operations
Streamlining deployments and ensuring robust, efficient operations.
Flawless Production Launch: Zero-Incident HazDV Deployment
Ensured 100% reliability and business continuity by deploying the critical HazDV application into production with zero incidents. Java, Spring Boot, Reactive programming, Mutation and Unit Tests, MongoDB, Temporal, Azure, Docker, GitHub Actions
Increased Data Observability by 20% Across 35 Terminals
Designed, developed, and deployed the CDH Telemetry tool, rolling it out to 35 terminals to proactively identify and resolve data lag issues, significantly improving data pipeline integrity. Proactive, Python, Kafka, Docker, Java, Spring Boot, Azure, GitHub Actions, Prometheus, Grafana, Pensieve, Loki
High-Availability Global Observability Service
Ensured 99.9% uptime by developing and onboarding an Observability Service into Stargate with cross-region dynamic routing for EMEA and NAM. Python, Kafka, Docker, Java, Spring Boot, Azure, GitHub Actions, Prometheus, Grafana, Pensieve, Loki
Reduced Deployment Time with Automated CI/CD
Engineered a streamlined CI/CD pipeline for the CDH Telemetry consumer using Gradle, enabling single-touch deployments and eliminating manual errors. GitHub Actions pipeline
Proactive Vulnerability Management with Harbor
Led the first Harbor onboarding for CDH Telemetry to ensure secure images with periodic vulnerability scanning, addressing critical framework vulnerabilities before they reached production. Spring Boot upgrade, GitHub Actions
Reduced L1 Support Dependency via Self-Service Dashboards
Empowered terminal users by creating self-service Grafana dashboards for HazDV and CDH, providing real-time insights and reducing support-team workload. Grafana, Loki, Pensieve
Cost Savings: Optimized Repository & Image Size
Created a JLink image that dramatically reduced the image size from 900MB to 130MB, and cut Python images to 30MB, which can be deployed in the Terminal Private Cloud. This optimization led to significant storage cost savings and faster deployment times. Docker, JLink
Reduced Risk by 10% with Proactive DR & DDoS Testing
Actively participated in multiple Disaster Recovery (DR) exercises and DDoS testing, enhancing platform health and ensuring business continuity. Grafana, Loki, Pensieve, Documentation, Confluence, Team Discussions