Senior Director, Cloud Infrastructure

Building Resilient
Cloud Systems

I lead high-performing SRE teams to architect scalable, secure, and automated infrastructure for the AI era.

calvin@cloud:~/portfolio
whoami
Calvin Pang
cat current_role.json
{ "role": "Senior Director", "focus": ["SRE", "Cloud Architecture", "AI/ML Ops"], "experience": "25+ Years", "mission": "Automate everything" }

About Me

With over 25 years in technology infrastructure, I have moved from building foundational on-premises systems to leading SRE for global cloud operations.

I currently lead SRE teams optimizing cloud infrastructure for modern workloads, including microservices, containerized applications, and AI/ML services. My expertise spans AWS, Microsoft Azure, and multi-cloud architectures.

I specialize in designing resilient systems using Infrastructure as Code, Python-driven automation, and modern DevOps workflows. From managing 24x7 global infrastructure to implementing disaster recovery strategies, I combine established SRE practices with emerging technologies.

25+ Years Exp
100% Uptime Record
Global Team Leadership
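class EngineeringLeader:
    """What I focus on and how I work."""

    def __init__(self):
        self.focus = "Reliability"
        self.passion = "Automation"

    def lead_teams(self):
        """Describe the kind of teams I build."""
        return "High Performance"

    def solve_problems(self):
        """Report whether I take on problems; scalable solutions only."""
        return True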

Technical Arsenal

Leadership

  • Team Leadership & Mentoring
  • Stakeholder Management
  • Cross-Functional Teams
  • Global Team Leadership
  • Agile & Scrum Methodologies

Cloud & Serverless

  • AWS (EC2, ECS, EKS, Lambda)
  • Database & Storage (RDS, S3)
  • Container Orchestration
  • Microsoft Azure & 365
  • Multi-Cloud Architecture

SRE & Operations

  • Site Reliability Engineering
  • Incident Response & DR
  • CloudWatch & Monitoring
  • Performance Optimization
  • Technical Documentation

Infrastructure

  • Terraform (Expert)
  • Infrastructure as Code
  • CI/CD Pipeline Design
  • DevOps & Automation
  • Configuration Management

Security

  • Cloud Security Best Practices
  • AWS Security & Compliance
  • Zero Trust Architecture
  • VPC Design & Networking
  • Load Balancing & CDN

Development

  • Python (Boto3, Automation)
  • PowerShell & Bash
  • Linux Administration
  • Git & Version Control
  • Docker & Containerization

Key Achievements

SRE Team Leadership

Lead global Site Reliability Engineering teams across multi-cloud environments. Established core SRE best practices, implemented Agile workflows, and fostered a culture of reliability. Mentored senior engineers in cloud-native technologies to build high-performing, autonomous squads.

Global Scale
High Impact
SRE • Leadership • Mentorship

Enterprise Cloud Migration

Spearheaded a comprehensive migration of global legacy infrastructure to AWS for a major financial services firm. Architected a secure, multi-account Landing Zone using Terraform and CI/CD, achieving zero-downtime cutovers and modernizing the entire technology stack.

99.99% Uptime
Zero Downtime
AWS • Terraform • Migration

Disaster Recovery Leadership

Orchestrated mission-critical disaster recovery strategies that ensured 100% uptime during major crises, including Hurricane Sandy. Designed multi-region AWS architectures with automated failover, proving the value of proactive resilience engineering and robust business continuity planning.

100% Uptime
Multi Region
AWS • DR • Resilience
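
The automated-failover idea behind this work can be sketched in a few lines of Python. This is an illustration only, not the production design: the function name `pick_active_region`, the priority list, and the health map are hypothetical, and the region names are used purely as examples.

```python
def pick_active_region(priority, healthy):
    """Return the highest-priority region that is currently healthy.

    priority: region names ordered from most to least preferred (assumed input).
    healthy:  mapping of region name -> bool, e.g. fed by health checks (assumed input).
    """
    for region in priority:
        # Regions missing from the health map are treated as unhealthy.
        if healthy.get(region, False):
            return region
    raise RuntimeError("no healthy region available")


# Example: the primary region is down, so traffic moves to the secondary.
priority = ["us-east-1", "us-west-2"]
health = {"us-east-1": False, "us-west-2": True}
active = pick_active_region(priority, health)
```

In a real deployment the health map would come from monitoring, and the switch itself would typically happen at the DNS or load-balancer layer rather than in application code.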

IaC Platform Engineering

Engineered a self-service Infrastructure as Code platform using Terraform and GitOps. Enabled development teams to provision standardized, compliant AWS resources in minutes rather than weeks, enforcing policy-as-code and drastically increasing deployment velocity.

100x Faster
1000+ Resources
Terraform • GitOps • Automation
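
To make "policy-as-code" concrete, here is a minimal Python sketch of the kind of check such a platform might run before provisioning. Everything in it is an assumption for illustration: the `Resource` shape, the required tag names, and the encryption rule are hypothetical, and the actual platform's policies and tooling are not shown here.

```python
from dataclasses import dataclass, field

# Hypothetical policy: every resource must carry these tags.
REQUIRED_TAGS = {"owner", "cost_center", "environment"}
# Hypothetical policy: these resource kinds must be encrypted at rest.
MUST_ENCRYPT = {"s3_bucket", "rds_instance"}


@dataclass
class Resource:
    kind: str
    name: str
    tags: dict = field(default_factory=dict)
    encrypted: bool = True


def violations(resource):
    """Return human-readable policy violations for one resource."""
    problems = []
    missing = REQUIRED_TAGS - set(resource.tags)
    if missing:
        problems.append(f"{resource.name}: missing tags {sorted(missing)}")
    if resource.kind in MUST_ENCRYPT and not resource.encrypted:
        problems.append(f"{resource.name}: encryption at rest is required")
    return problems


def evaluate(resources):
    """Collect violations across a whole plan; an empty list means compliant."""
    return [p for r in resources for p in violations(r)]


plan = [
    Resource("s3_bucket", "logs", {"owner": "sre", "cost_center": "42",
                                   "environment": "prod"}),
    Resource("rds_instance", "orders", {"owner": "sre"}, encrypted=False),
]
report = evaluate(plan)
```

A pipeline could fail the run whenever `evaluate` returns a non-empty list, so non-compliant requests are rejected before anything is provisioned.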

Security & Compliance

Architected an enterprise-grade security framework implementing Zero Trust principles and IAM best practices. Established automated threat detection and incident response pipelines, ensuring continuous compliance with rigorous financial industry standards across global infrastructure.

100% Compliant
Zero Trust
Security • Compliance • IAM

Observability Ecosystem

Designed a holistic observability ecosystem using Datadog and CloudWatch across multi-region environments. Implemented distributed tracing and synthetic monitoring to reduce MTTR by 60%, shifting the operational focus from reactive troubleshooting to proactive performance optimization.

-60% MTTR
Proactive Ops
CloudWatch • Datadog • APM
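
For readers unfamiliar with the metric, MTTR (mean time to recovery) is the average time from detecting an incident to resolving it. The sketch below shows one way to compute it and its relative change in Python. The function names and the incident timestamps are made up for illustration and are not real operational data.

```python
from datetime import datetime, timedelta


def mttr(incidents):
    """Mean time to recovery: average of (resolved - detected) per incident."""
    if not incidents:
        raise ValueError("need at least one incident")
    total = sum((resolved - detected for detected, resolved in incidents),
                timedelta())
    return total / len(incidents)


def percent_change(before, after):
    """Relative change from before to after, in percent (negative = faster)."""
    return (after - before) / before * 100


# Illustrative (detected, resolved) pairs only.
t0 = datetime(2024, 1, 1, 9, 0)
before = [(t0, t0 + timedelta(minutes=40)), (t0, t0 + timedelta(minutes=60))]
after = [(t0, t0 + timedelta(minutes=15)), (t0, t0 + timedelta(minutes=25))]

change = percent_change(mttr(before), mttr(after))
```

With these sample values the mean drops from 50 minutes to 20 minutes, which is a reduction of about 60%.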

Let's Connect

I'm always eager to connect with fellow engineers to discuss SRE best practices, explore cloud architecture challenges, and share insights on building reliable systems.

Open to: Advisory roles • Mentorship opportunities