Kubernetes Deployment Best Practices: Production-Ready Guide
These articles are AI-generated summaries. Please check the original sources for full details.
Kubernetes Deployment Best Practices: Production-Ready Guide
Deploying applications to Kubernetes is more than just writing a Deployment YAML. This comprehensive guide covers everything you need to know to run production-grade applications on Kubernetes successfully.
Table of Contents
- Deployment Fundamentals
- Resource Management
- Health Checks and Probes
- Deployment Strategies
- Configuration Management
- Scaling and Autoscaling
- Security Best Practices
- Monitoring and Logging
- Production Checklist
Deployment Fundamentals {#fundamentals}
Basic Deployment Structure
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp
namespace: production
labels:
app: myapp
version: v1.0.0
environment: production
spec:
replicas: 3
revisionHistoryLimit: 10
selector:
matchLabels:
app: myapp
template:
metadata:
labels:
app: myapp
version: v1.0.0
spec:
containers:
- name: myapp
image: myregistry.com/myapp:1.0.0
imagePullPolicy: IfNotPresent
ports:
- containerPort: 8080
name: http
protocol: TCP
Key Components Explained
Deployment Spec:
replicas: Number of pod instancesrevisionHistoryLimit: Number of old ReplicaSets to retainselector: Identifies pods managed by this deploymenttemplate: Pod template specification
Resource Management {#resources}
1. Resource Requests and Limits
Critical for production stability:
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp
spec:
template:
spec:
containers:
- name: myapp
image: myapp:1.0.0
resources:
requests:
memory: "256Mi"
cpu: "250m"
limits:
memory: "512Mi"
cpu: "500m"
Understanding Resources:
- Requests: Guaranteed resources (used for scheduling)
- Limits: Maximum resources (hard caps)
- CPU: Measured in millicores (m)
- Memory: Measured in bytes (Mi, Gi)
Best Practices:
# Good: Requests = Limits for critical workloads (Guaranteed QoS)
resources:
requests:
memory: "1Gi"
cpu: "1000m"
limits:
memory: "1Gi"
cpu: "1000m"
# Good: Slightly higher limits for burstable workloads
resources:
requests:
memory: "512Mi"
cpu: "500m"
limits:
memory: "1Gi"
cpu: "1000m"
# Bad: No limits (risk of resource exhaustion)
resources:
requests:
memory: "256Mi"
cpu: "250m"
# No limits defined!
2. Quality of Service (QoS) Classes
Guaranteed:
# Highest priority, won't be evicted
resources:
requests:
memory: "1Gi"
cpu: "1"
limits:
memory: "1Gi"
cpu: "1"
Burstable:
# Medium priority, can use extra resources
resources:
requests:
memory: "512Mi"
cpu: "500m"
limits:
memory: "2Gi"
cpu: "2"
BestEffort:
# Lowest priority, first to be evicted
# No resources specified
3. ResourceQuota and LimitRange
Namespace ResourceQuota:
apiVersion: v1
kind: ResourceQuota
metadata:
name: production-quota
namespace: production
spec:
hard:
requests.cpu: "100"
requests.memory: "200Gi"
limits.cpu: "200"
limits.memory: "400Gi"
pods: "50"
services: "20"
LimitRange for defaults:
apiVersion: v1
kind: LimitRange
metadata:
name: default-limits
namespace: production
spec:
limits:
- default:
memory: "512Mi"
cpu: "500m"
defaultRequest:
memory: "256Mi"
cpu: "250m"
type: Container
Health Checks and Probes {#health-checks}
1. Liveness Probe
Determines if container is running. Restarts on failure.
spec:
containers:
- name: myapp
image: myapp:1.0.0
livenessProbe:
httpGet:
path: /actuator/health/liveness
port: 8080
scheme: HTTP
initialDelaySeconds: 60
periodSeconds: 10
timeoutSeconds: 5
successThreshold: 1
failureThreshold: 3
Alternative Probe Types:
Exec Probe:
livenessProbe:
exec:
command:
- /bin/sh
- -c
- "pgrep java || exit 1"
initialDelaySeconds: 30
periodSeconds: 10
TCP Socket Probe:
livenessProbe:
tcpSocket:
port: 8080
initialDelaySeconds: 15
periodSeconds: 10
gRPC Probe (Kubernetes 1.24+):
livenessProbe:
grpc:
port: 9090
initialDelaySeconds: 30
2. Readiness Probe
Determines if container can accept traffic.
spec:
containers:
- name: myapp
readinessProbe:
httpGet:
path: /actuator/health/readiness
port: 8080
initialDelaySeconds: 30
periodSeconds: 5
timeoutSeconds: 3
successThreshold: 1
failureThreshold: 3
3. Startup Probe (for slow-starting apps)
spec:
containers:
- name: myapp
startupProbe:
httpGet:
path: /actuator/health/startup
port: 8080
initialDelaySeconds: 0
periodSeconds: 10
failureThreshold: 30 # Allow up to 5 minutes for startup
timeoutSeconds: 5
Complete Health Check Example:
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp
spec:
template:
spec:
containers:
- name: myapp
image: myapp:1.0.0
ports:
- containerPort: 8080
# Startup probe - gives app time to start
startupProbe:
httpGet:
path: /health/startup
port: 8080
failureThreshold: 30
periodSeconds: 10
# Readiness probe - determines traffic eligibility
readinessProbe:
httpGet:
path: /health/ready
port: 8080
initialDelaySeconds: 10
periodSeconds: 5
failureThreshold: 3
# Liveness probe - restarts if unhealthy
livenessProbe:
httpGet:
path: /health/live
port: 8080
initialDelaySeconds: 60
periodSeconds: 10
failureThreshold: 3
Deployment Strategies {#deployment-strategies}
1. Rolling Update (Default)
Gradually replaces old pods with new ones.
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp
spec:
replicas: 10
strategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1 # Max pods unavailable during update
maxSurge: 2 # Max extra pods during update
template:
spec:
containers:
- name: myapp
image: myapp:2.0.0
Characteristics:
- ✅ Zero downtime
- ✅ Gradual rollout
- ⚠️ Both versions run simultaneously
- ⚠️ Slower deployment
2. Blue-Green Deployment
Run new version alongside old, switch traffic when ready.
# Blue deployment (current)
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp-blue
labels:
app: myapp
version: blue
spec:
replicas: 3
selector:
matchLabels:
app: myapp
version: blue
template:
metadata:
labels:
app: myapp
version: blue
spec:
containers:
- name: myapp
image: myapp:1.0.0
---
# Green deployment (new)
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp-green
labels:
app: myapp
version: green
spec:
replicas: 3
selector:
matchLabels:
app: myapp
version: green
template:
metadata:
labels:
app: myapp
version: green
spec:
containers:
- name: myapp
image: myapp:2.0.0
---
# Service (switch selector to green when ready)
apiVersion: v1
kind: Service
metadata:
name: myapp
spec:
selector:
app: myapp
version: blue # Change to 'green' when ready to switch
ports:
- port: 80
targetPort: 8080
Characteristics:
- ✅ Instant rollback
- ✅ Full testing before switch
- ⚠️ Requires 2x resources
- ⚠️ Database migrations complex
3. Canary Deployment
Gradually shift traffic to new version.
Using Istio:
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
name: myapp
spec:
hosts:
- myapp
http:
- match:
- headers:
user-type:
exact: beta-tester
route:
- destination:
host: myapp
subset: v2
- route:
- destination:
host: myapp
subset: v1
weight: 90
- destination:
host: myapp
subset: v2
weight: 10
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
name: myapp
spec:
host: myapp
subsets:
- name: v1
labels:
version: v1
- name: v2
labels:
version: v2
Using Argo Rollouts:
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
name: myapp
spec:
replicas: 10
strategy:
canary:
steps:
- setWeight: 10
- pause: {duration: 5m}
- setWeight: 25
- pause: {duration: 5m}
- setWeight: 50
- pause: {duration: 5m}
- setWeight: 75
- pause: {duration: 5m}
template:
spec:
containers:
- name: myapp
image: myapp:2.0.0
Configuration Management {#configuration}
1. ConfigMaps
apiVersion: v1
kind: ConfigMap
metadata:
name: myapp-config
namespace: production
data:
application.yaml: |
server:
port: 8080
spring:
datasource:
url: jdbc:postgresql://db:5432/mydb
log-level: "INFO"
cache-size: "1000"
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp
spec:
template:
spec:
containers:
- name: myapp
image: myapp:1.0.0
env:
- name: LOG_LEVEL
valueFrom:
configMapKeyRef:
name: myapp-config
key: log-level
volumeMounts:
- name: config
mountPath: /config
volumes:
- name: config
configMap:
name: myapp-config
2. Secrets
apiVersion: v1
kind: Secret
metadata:
name: myapp-secrets
namespace: production
type: Opaque
stringData:
database-password: "supersecret"
api-key: "abc123def456"
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp
spec:
template:
spec:
containers:
- name: myapp
env:
- name: DB_PASSWORD
valueFrom:
secretKeyRef:
name: myapp-secrets
key: database-password
- name: API_KEY
valueFrom:
secretKeyRef:
name: myapp-secrets
key: api-key
Best Practice: Use External Secrets Operator
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
name: myapp-secrets
spec:
refreshInterval: 1h
secretStoreRef:
name: aws-secrets-manager
kind: SecretStore
target:
name: myapp-secrets
data:
- secretKey: database-password
remoteRef:
key: prod/myapp/db
property: password
Scaling and Autoscaling {#scaling}
1. Horizontal Pod Autoscaler (HPA)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: myapp-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: myapp
minReplicas: 3
maxReplicas: 50
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
behavior:
scaleDown:
stabilizationWindowSeconds: 300
policies:
- type: Percent
value: 50
periodSeconds: 60
scaleUp:
stabilizationWindowSeconds: 0
policies:
- type: Percent
value: 100
periodSeconds: 15
- type: Pods
value: 4
periodSeconds: 15
selectPolicy: Max
2. Vertical Pod Autoscaler (VPA)
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: myapp-vpa
spec:
targetRef:
apiVersion: apps/v1
kind: Deployment
name: myapp
updatePolicy:
updateMode: "Auto"
resourcePolicy:
containerPolicies:
- containerName: myapp
minAllowed:
cpu: "250m"
memory: "256Mi"
maxAllowed:
cpu: "2"
memory: "4Gi"
3. Custom Metrics (KEDA)
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: myapp-scaler
spec:
scaleTargetRef:
name: myapp
minReplicaCount: 2
maxReplicaCount: 100
triggers:
- type: rabbitmq
metadata:
queueName: tasks
queueLength: "10"
- type: prometheus
metadata:
serverAddress: http://prometheus:9090
metricName: http_requests_total
threshold: '100'
query: sum(rate(http_requests_total[1m]))
Security Best Practices {#security}
1. Pod Security Standards
apiVersion: v1
kind: Namespace
metadata:
name: production
labels:
pod-security.kubernetes.io/enforce: restricted
pod-security.kubernetes.io/audit: restricted
pod-security.kubernetes.io/warn: restricted
2. Security Context
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp
spec:
template:
spec:
securityContext:
runAsNonRoot: true
runAsUser: 1000
fsGroup: 2000
seccompProfile:
type: RuntimeDefault
containers:
- name: myapp
image: myapp:1.0.0
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 1000
capabilities:
drop:
- ALL
3. Network Policies
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: myapp-network-policy
spec:
podSelector:
matchLabels:
app: myapp
policyTypes:
- Ingress
- Egress
ingress:
- from:
- podSelector:
matchLabels:
role: frontend
ports:
- protocol: TCP
port: 8080
egress:
- to:
- podSelector:
matchLabels:
role: database
ports:
- protocol: TCP
port: 5432
- to:
- namespaceSelector: {}
ports:
- protocol: TCP
port: 53
- protocol: UDP
port: 53
Monitoring and Logging {#monitoring}
1. Prometheus Monitoring
apiVersion: v1
kind: Service
metadata:
name: myapp
labels:
app: myapp
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "8080"
prometheus.io/path: "/actuator/prometheus"
spec:
selector:
app: myapp
ports:
- port: 80
targetPort: 8080
2. ServiceMonitor (Prometheus Operator)
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: myapp
labels:
app: myapp
spec:
selector:
matchLabels:
app: myapp
endpoints:
- port: http
path: /actuator/prometheus
interval: 30s
3. Logging with Fluentd
apiVersion: v1
kind: ConfigMap
metadata:
name: fluentd-config
data:
fluent.conf: |
<source>
@type tail
path /var/log/containers/*.log
pos_file /var/log/fluentd-containers.log.pos
tag kubernetes.*
<parse>
@type json
time_format %Y-%m-%dT%H:%M:%S.%NZ
</parse>
</source>
<filter kubernetes.**>
@type kubernetes_metadata
</filter>
<match kubernetes.**>
@type elasticsearch
host elasticsearch
port 9200
logstash_format true
</match>
Production Checklist {#checklist}
Pre-Deployment:
- Resource requests and limits defined
- Health probes configured (liveness, readiness, startup)
- Security context applied
- Network policies in place
- Secrets externalized
- Monitoring and logging configured
- HPA configured for scaling
- PodDisruptionBudget defined
- Multi-zone deployment
- Backup strategy defined
During Deployment:
- Use rolling update strategy
- Monitor error rates
- Watch resource usage
- Check logs for errors
- Verify health probes passing
Post-Deployment:
- Run smoke tests
- Verify metrics collection
- Check log aggregation
- Test rollback procedure
- Document deployment
- Update runbooks
Conclusion
Successful Kubernetes deployments require attention to:
- Resource Management - Right-size your applications
- Health Checks - Ensure reliability
- Security - Follow least privilege principle
- Observability - Monitor everything
- Scalability - Plan for growth
Remember: Start simple, iterate based on actual needs and metrics.
Resources
What Kubernetes practices have helped your team the most? Share your experiences!
Continue reading
Next article
LangChain Complete Guide: Building Production-Ready LLM Applications
Related Content
Kubernetes Deployment: Managing Containerized Applications with Deployments
A comprehensive guide to Kubernetes Deployments, covering their purpose, capabilities, use cases, and implementation details for managing containerized applications.
Coiled: Simplifying Python Scaling Beyond Kubernetes
Coiled enables effortless scaling of Python applications from local machines to thousands of nodes without infrastructure management, offering compatibility with major data science libraries and cost-effective resource usage.
Mastering Kubernetes via Homelab: A Cost-Effective Setup Guide
Learn to run Kubernetes locally using Minikube or K3s to eliminate expensive cloud billing while mastering scaling, networking, and configuration management.