Kubernetes has become the de facto standard for container orchestration, but running it in production at scale is a different challenge from running it in dev. This guide shares our battle-tested patterns for deploying, scaling, securing, and observing production workloads on Azure Kubernetes Service (AKS).
Production Cluster Architecture
Figure: Enterprise AKS cluster topology. Diagram labels: NGINX / App GW; API / Web / Worker; DB / Cache / Queue; AAD-integrated; Prometheus; Fluent Bit; Key Vault CSI.
Node Pool Strategy
Production clusters should use multiple node pools to isolate workloads by resource requirements and criticality:
- Standard_D4s_v5 - 3 nodes (fixed)
- Standard_D8s_v5 - 3-20 nodes (autoscaled)
- Standard_NC6s_v3 - 0-5 nodes (spot)
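As a minimal sketch of how a workload can opt in to the spot pool, the Deployment below tolerates the taint that AKS places on spot nodes and requires the matching node label. The workload name `batch-worker`, the image reference, and the resource values are hypothetical placeholders, not values from a real cluster.

```yaml
# Sketch only: workload name, image, and resource values are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-worker            # hypothetical workload name
  namespace: production
spec:
  replicas: 2
  selector:
    matchLabels:
      app: batch-worker
  template:
    metadata:
      labels:
        app: batch-worker
    spec:
      # AKS taints spot nodes; without this toleration the pods are kept off them.
      tolerations:
        - key: kubernetes.azure.com/scalesetpriority
          operator: Equal
          value: spot
          effect: NoSchedule
      # Require spot nodes so these pods never land on the fixed or autoscaled pools.
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.azure.com/scalesetpriority
                    operator: In
                    values:
                      - spot
      containers:
        - name: worker
          image: myregistry.azurecr.io/batch-worker:1.0.0   # placeholder image
          resources:
            requests:
              cpu: 500m
              memory: 512Mi
```

Spot nodes can be evicted at short notice, so this pattern only suits workloads that tolerate interruption and can be rescheduled.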
Autoscaling: Three Layers Deep
Horizontal Pod Autoscaler (HPA)
Scale pods on CPU, memory, or custom metrics. A target utilisation of 70% is a reasonable starting point for CPU-bound workloads. For queue-based workloads, use KEDA (Kubernetes Event-Driven Autoscaling): it scales on message count and can scale to zero when the queue is idle, which the HPA on its own does not do by default.
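The queue-based pattern could look roughly like the KEDA ScaledObject below, assuming an Azure Service Bus queue. The Deployment name `order-worker`, the queue and namespace names, the threshold of 20 messages, and the `servicebus-auth` TriggerAuthentication are all hypothetical; check the trigger fields against the KEDA version you run.

```yaml
# Sketch only: all names and thresholds below are illustrative assumptions.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: order-worker-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: order-worker              # hypothetical Deployment to scale
  minReplicaCount: 0                # scale to zero when the queue is idle
  maxReplicaCount: 30
  pollingInterval: 30               # seconds between queue length checks
  cooldownPeriod: 300               # seconds to wait before scaling back to zero
  triggers:
    - type: azure-servicebus
      metadata:
        queueName: orders                       # hypothetical queue
        namespace: my-servicebus-namespace      # hypothetical Service Bus namespace
        messageCount: "20"                      # target messages per replica
      authenticationRef:
        name: servicebus-auth       # hypothetical TriggerAuthentication, assumed to exist
```

KEDA creates and manages an HPA for the target behind the scenes, so a workload scaled this way should not also have a hand-written HPA.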
Vertical Pod Autoscaler (VPA)
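Right-sizes pod resource requests based on observed usage. Run VPA in recommendation mode first (updateMode "Off") to understand resource patterns, then enable "Auto" mode. Avoid letting VPA and HPA act on the same CPU or memory metric for one workload, because the two controllers will work against each other. Right-sizing prevents over-provisioning and can reduce cluster costs by 20-35%.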
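A recommendation-only VPA might be declared as in the sketch below, assuming the VPA components are installed in the cluster. The target Deployment `report-worker` and the min/max bounds are hypothetical values chosen for illustration.

```yaml
# Sketch only: target name and bounds are illustrative assumptions.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: report-worker-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: report-worker           # hypothetical Deployment
  updatePolicy:
    updateMode: "Off"             # recommend only; never evict or resize pods
  resourcePolicy:
    containerPolicies:
      - containerName: "*"        # apply the bounds to every container
        minAllowed:
          cpu: 100m
          memory: 128Mi
        maxAllowed:
          cpu: "2"
          memory: 4Gi
```

The recommendations appear in the object's status (for example via `kubectl describe vpa report-worker-vpa -n production`). Switching `updateMode` to "Auto" later lets VPA apply them.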
Cluster Autoscaler
Adds nodes when pods cannot be scheduled for lack of capacity, and removes nodes that stay underutilised. Set the scale-down delay to 10 minutes to avoid thrashing. Use spot instances for non-critical, interruption-tolerant workloads to cut compute costs by 60-80%.
Security Hardening Checklist
- Azure AD integration - Use AAD for cluster authentication, no static credentials
- Pod Security Standards - Enforce "restricted" profile, block privileged containers
- Network Policies - Default-deny all traffic, explicitly allow required flows (enforced by Calico or Azure Network Policy Manager)
- Image scanning - Scan all images in ACR with Defender for Containers before deployment
- Secrets management - Mount secrets from Azure Key Vault using the CSI driver, never store in YAML
- Private cluster - Disable public API server endpoint; access only via VNet or Private Link
- Workload Identity - Use Azure Workload Identity (federated credentials) instead of service principal secrets
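Two of the checklist items translate directly into manifests. The sketch below enforces the "restricted" Pod Security Standard through namespace labels, applies a default-deny NetworkPolicy, and then allows one explicit flow. The `ingress-nginx` source namespace, the `app: api-gateway` label, and port 8080 are assumptions for illustration.

```yaml
# Sketch only: namespace, label, and port choices are illustrative assumptions.
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    # Pod Security Admission: reject pods that violate the restricted profile.
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}               # selects every pod in the namespace
  policyTypes:
    - Ingress
    - Egress                    # no rules listed, so all traffic is denied
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-to-api-gateway
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api-gateway          # assumed pod label
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx   # assumed ingress namespace
      ports:
        - protocol: TCP
          port: 8080            # assumed container port
```

A default-deny egress rule also blocks DNS lookups, so in practice an additional policy allowing egress to the cluster DNS service is usually needed.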
GitOps Deployment Pipeline
Figure: Flux CD-based GitOps deployment flow. Stage labels: Helm values; Test + Image; Container Registry; Auto-deploy.
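On the cluster side, a flow like this is typically driven by a pair of Flux objects: a source that polls Git, and a release that reconciles a Helm chart from it. The sketch below assumes a recent Flux v2 release (older installations use beta API versions), and the repository URL, chart path, and value keys are hypothetical.

```yaml
# Sketch only: repository URL, chart path, and values are illustrative assumptions.
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: platform-config
  namespace: flux-system
spec:
  interval: 1m                  # how often Flux polls the repository
  url: https://example.com/org/platform-config.git   # placeholder URL
  ref:
    branch: main
---
apiVersion: helm.toolkit.fluxcd.io/v2   # assumes a Flux version that serves the v2 API
kind: HelmRelease
metadata:
  name: api-gateway
  namespace: production
spec:
  interval: 5m                  # how often Flux reconciles the release
  chart:
    spec:
      chart: ./charts/api-gateway       # hypothetical chart path inside the repository
      sourceRef:
        kind: GitRepository
        name: platform-config
        namespace: flux-system
  values:
    image:
      tag: "1.4.2"              # hypothetical value key; depends on the chart
```

With this layout, a deployment is a Git commit that changes the image tag in the values, and Flux applies it on the next reconciliation.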
Sample HPA Configuration
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-gateway-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-gateway
  minReplicas: 3
  maxReplicas: 50
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
        - type: Percent
          value: 100
          periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300
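Utilization targets are calculated as a percentage of each container's resource requests, so the HPA above only works if the target Deployment sets them. The fragment below is a sketch of what that might look like. The image reference and the request and limit values are hypothetical.

```yaml
# Sketch only: image and resource values are illustrative assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-gateway
  namespace: production
spec:
  # replicas is intentionally omitted so the HPA owns the replica count.
  selector:
    matchLabels:
      app: api-gateway
  template:
    metadata:
      labels:
        app: api-gateway
    spec:
      containers:
        - name: api-gateway
          image: myregistry.azurecr.io/api-gateway:1.4.2   # placeholder image
          resources:
            requests:
              cpu: 500m         # the 70% CPU target is measured against this
              memory: 512Mi     # the 80% memory target is measured against this
            limits:
              memory: 1Gi
```

With a 500m CPU request and a 70% target, the HPA tries to keep average CPU usage near 350m per pod, adding replicas when the average rises above that.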
"Kubernetes gives you superpowers, but with great power comes great YAML. Invest in your platform engineering team and they'll make every application team faster."