Staff Site Reliability Engineer
I design, build, and operate large-scale distributed systems. 15+ years in infrastructure engineering, from C++ R&D to multi-region Kubernetes platforms.
- Multi-region SaaS platform reliability (AWS, GCP)
- Infrastructure as Code with Terraform and Pulumi (Go and Python)
- Kubernetes operators and platform tooling
- CI/CD pipeline architecture and GitOps workflows
- Observability, SLI/SLO frameworks, and incident response
- Chaos engineering and resilience testing
- np4ns - A Kubernetes operator for namespace-scoped network policy enforcement
- test-infra-go - Demo infrastructure written with Pulumi in Go
- IaC-Copilot - AI-assisted infrastructure code review and generation
- SRE Agent - An agentic AI assistant for SRE operations
- crashloopback - SRE diagnostic tools
Go, Python, Terraform, Pulumi, Kubernetes, AWS, GCP, Docker, Istio, Consul, Datadog, Grafana, Jenkins, GitHub Actions
- Certified Kubernetes Administrator (CKA)
- AWS Certified Solutions Architect - Associate
- AWS Certified Developer - Associate



