Kubernetes Best Practices for Production

David Kumar
DevOps Engineer
Dec 1, 2024 · 12 min read
#Kubernetes #DevOps #Cloud

Running Kubernetes in production requires careful planning and adherence to best practices. This guide covers cluster architecture, resource management, security, health checks, deployment strategies, monitoring, backups, and cost optimization for reliable, secure, and scalable clusters.

Cluster Architecture

High Availability Setup

Run at least three control plane nodes behind a load balancer, and keep the count odd so etcd can maintain quorum:

    apiVersion: kubeadm.k8s.io/v1beta3
    kind: ClusterConfiguration
    controlPlaneEndpoint: "lb.example.com:6443"
    etcd:
      local:
        dataDir: /var/lib/etcd
        extraArgs:
          initial-cluster-state: new

Node Configuration

Separate workloads using node pools:

    apiVersion: v1
    kind: Node
    metadata:
      name: worker-1
      labels:
        node-role: compute
        workload-type: cpu-intensive
    spec:
      taints:
        - key: workload-type
          value: cpu-intensive
          effect: NoSchedule
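
A taint only keeps pods away from a node; a workload that should run there needs a matching toleration, and usually a node selector as well so it does not land elsewhere. The sketch below assumes the label and taint from the Node example above. The pod name and image are hypothetical placeholders.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: batch-worker                 # hypothetical name
spec:
  # Only schedule onto nodes carrying the cpu-intensive label
  nodeSelector:
    workload-type: cpu-intensive
  # Tolerate the NoSchedule taint set on those nodes
  tolerations:
    - key: workload-type
      operator: Equal
      value: cpu-intensive
      effect: NoSchedule
  containers:
    - name: worker
      image: registry.example.com/batch-worker:1.0.0   # placeholder image
```

Without the nodeSelector, the toleration merely permits scheduling onto the tainted pool; it does not require it.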

Resource Management

Resource Requests and Limits

Set resource requests and limits on every container:

    apiVersion: v1
    kind: Pod
    metadata:
      name: myapp
    spec:
      containers:
        - name: app
          image: myapp:1.0.0   # example tag; pin a version rather than :latest
          resources:
            requests:
              memory: "256Mi"
              cpu: "500m"
            limits:
              memory: "512Mi"
              cpu: "1000m"

Quality of Service Classes

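Kubernetes assigns every pod a QoS class based on its requests and limits. The class mainly determines which pods are evicted first when a node runs short of resources:

  • Guaranteed: every container sets CPU and memory requests equal to its limits
  • Burstable: at least one container sets a request or limit, but the pod does not meet the Guaranteed criteria
  • BestEffort: no container sets any requests or limits; these pods are evicted first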
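
As an illustration, a pod along these lines would be classed as Guaranteed, because its only container sets CPU and memory requests equal to the limits. The pod name and image tag are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: myapp-guaranteed   # hypothetical name
spec:
  containers:
    - name: app
      image: myapp:1.0.0   # placeholder tag
      resources:
        requests:
          memory: "512Mi"
          cpu: "500m"
        limits:
          memory: "512Mi"  # equal to the request
          cpu: "500m"      # equal to the request
```

The class Kubernetes assigned can be read back from the pod's status.qosClass field, for example with kubectl get pod myapp-guaranteed -o jsonpath='{.status.qosClass}'.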

Security Best Practices

RBAC Configuration

Implement least-privilege access:

    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: pod-reader
    rules:
      - apiGroups: [""]
        resources: ["pods"]
        verbs: ["get", "list"]
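
A Role grants nothing until it is bound to a subject. A minimal sketch of a RoleBinding for the pod-reader Role follows. The service account name and the production namespace are assumptions for illustration, and the binding must live in the same namespace as the Role it references.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods                # hypothetical name
  namespace: production          # assumed; must match the Role's namespace
subjects:
  - kind: ServiceAccount
    name: monitoring-agent       # hypothetical service account
    namespace: production
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

Binding to a dedicated service account rather than a user or group keeps the grant scoped to a single workload.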

Network Policies

Isolate pods with network policies:

    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: api-allow
    spec:
      podSelector:
        matchLabels:
          app: api
      policyTypes:
        - Ingress
      ingress:
        - from:
            - podSelector:
                matchLabels:
                  app: frontend
          ports:
            - protocol: TCP
              port: 8080
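
Pods that no policy selects still accept all traffic, so allow rules like api-allow are usually paired with a namespace-wide default deny. A sketch, assuming a namespace called production and a CNI plugin that enforces NetworkPolicy:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress     # hypothetical name
  namespace: production          # assumed namespace
spec:
  podSelector: {}                # empty selector matches every pod in the namespace
  policyTypes:
    - Ingress                    # no ingress rules listed, so all ingress is denied
```

Policies are additive, so the api-allow rule still admits frontend traffic to the api pods once this is in place.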

Pod Security Standards

Enforce security standards:

    apiVersion: v1
    kind: Namespace
    metadata:
      name: production
      labels:
        pod-security.kubernetes.io/enforce: restricted
        pod-security.kubernetes.io/audit: restricted
        pod-security.kubernetes.io/warn: restricted
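
With the restricted level enforced, pods that lack the required security settings are rejected at admission. The sketch below shows the settings a pod typically needs to pass. The image tag is a placeholder, and the image itself must be built to run as a non-root user.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: myapp
  namespace: production
spec:
  securityContext:
    runAsNonRoot: true               # refuse to start containers running as root
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: myapp:1.0.0             # placeholder tag
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]
```

Setting enforce to baseline while leaving warn and audit at restricted is a common way to surface violations before enforcing them.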

Health Checks

Liveness and Readiness Probes

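Use a liveness probe to restart containers that hang, and a readiness probe to keep traffic away from pods that cannot serve it yet:

    apiVersion: v1
    kind: Pod
    metadata:
      name: myapp
    spec:
      containers:
        - name: app
          image: myapp:1.0.0   # example tag
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5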
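
For applications with long or variable startup times, a fixed initialDelaySeconds is hard to tune. A startup probe is one alternative: liveness and readiness checks are held back until it succeeds. The sketch below reuses the /healthz endpoint from the example above; the thresholds and image tag are illustrative assumptions.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  containers:
    - name: app
      image: myapp:1.0.0           # placeholder tag
      startupProbe:
        httpGet:
          path: /healthz
          port: 8080
        periodSeconds: 10
        failureThreshold: 30       # allows up to 30 x 10s = 300s to start
      livenessProbe:
        httpGet:
          path: /healthz
          port: 8080
        periodSeconds: 10          # only begins after the startup probe succeeds
```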

Deployment Strategies

Rolling Updates

Configure safe rolling updates:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: myapp
    spec:
      minReadySeconds: 10
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxSurge: 25%
          maxUnavailable: 25%
      # selector and pod template omitted for brevity

Blue-Green Deployments

Use services for blue-green deployments:

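    apiVersion: v1
    kind: Service
    metadata:
      name: myapp
    spec:
      selector:
        app: myapp
        version: blue   # Switch to green when ready
      ports:
        - port: 80            # assumed service port
          targetPort: 8080    # assumed container port, matching the probe examples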
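
The Service selector only has something to switch to if a second Deployment carries the version: green label. A sketch of that Deployment follows. The name, replica count, image tag, and container port are assumptions for illustration.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-green          # hypothetical name; the blue Deployment runs alongside it
spec:
  replicas: 3                # assumed replica count
  selector:
    matchLabels:
      app: myapp
      version: green
  template:
    metadata:
      labels:
        app: myapp
        version: green       # the Service selects on this label
    spec:
      containers:
        - name: app
          image: myapp:1.1.0     # placeholder tag for the new release
          ports:
            - containerPort: 8080
```

Once the green pods report ready, changing the Service selector to version: green moves all traffic at once, and changing it back to blue rolls back.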

Monitoring and Logging

Prometheus Integration

With the Prometheus Operator installed, a ServiceMonitor tells Prometheus which services to scrape:

    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: myapp
    spec:
      selector:
        matchLabels:
          app: myapp
      endpoints:
        - port: metrics
          interval: 30s

Centralized Logging

Use EFK or Loki for logging:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: fluent-bit-config
    data:
      fluent-bit.conf: |
        [OUTPUT]
            Name  es
            Match *
            Host  elasticsearch
            Port  9200
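
The example above targets Elasticsearch. For Loki, a roughly equivalent sketch uses Fluent Bit's loki output plugin. The host name loki, port 3100, and the label value are assumptions about how Loki is deployed in the cluster.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
data:
  fluent-bit.conf: |
    [OUTPUT]
        Name   loki
        Match  *
        Host   loki
        Port   3100
        Labels job=fluent-bit
```

Keeping the label set small matters with Loki, since every distinct label combination creates a separate stream.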

Backup and Disaster Recovery

etcd Backups

Take regular etcd snapshots:

    ETCDCTL_API=3 etcdctl snapshot save backup.db \
      --endpoints=https://127.0.0.1:2379 \
      --cacert=/etc/kubernetes/pki/etcd/ca.crt \
      --cert=/etc/kubernetes/pki/etcd/server.crt \
      --key=/etc/kubernetes/pki/etcd/server.key

Velero for Application Backups

Use Velero for application-level backups:

velero backup create myapp-backup --include-namespaces myapp --storage-location default
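
One-off backups are easy to forget, so recurring backups are usually declared as a Velero Schedule. The sketch below assumes Velero is installed in the velero namespace; the schedule name, cron expression, and retention period are illustrative.

```yaml
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: myapp-daily            # hypothetical name
  namespace: velero            # assumed Velero install namespace
spec:
  schedule: "0 2 * * *"        # every day at 02:00
  template:
    includedNamespaces:
      - myapp
    storageLocation: default
    ttl: 720h0m0s              # keep each backup for 30 days
```

A backup is only proven once it has been restored, so periodic restore tests into a scratch namespace are worth scheduling as well.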

Cost Optimization

Resource Optimization

  • Use Horizontal Pod Autoscaling (HPA)
  • Implement Vertical Pod Autoscaling (VPA)
  • Use cluster autoscaling
  • Implement resource quotas
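
As an example of the last item, a ResourceQuota caps the total resources a namespace can claim. The namespace and all figures below are placeholders to adapt to the workload.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota          # hypothetical name
  namespace: production        # assumed namespace
spec:
  hard:
    requests.cpu: "10"         # total CPU requested across all pods
    requests.memory: 20Gi
    limits.cpu: "20"
    limits.memory: 40Gi
    pods: "50"                 # maximum number of pods in the namespace
```

Once a quota covers CPU or memory, pods that omit the corresponding requests or limits are rejected in that namespace, which also enforces the resource-constraint practice described earlier.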

Horizontal Pod Autoscaler

This HPA scales the myapp Deployment between 2 and 10 replicas to hold average CPU utilization near 70%:

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: myapp
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: myapp
      minReplicas: 2
      maxReplicas: 10
      metrics:
        - type: Resource
          resource:
            name: cpu
            target:
              type: Utilization
              averageUtilization: 70

Conclusion

Running Kubernetes in production is complex but manageable with the right practices. Focus on security, reliability, and observability from day one.

Remember: Start simple, monitor everything, and iterate based on your actual needs.

About the Author

David is a DevOps Engineer with extensive experience in Kubernetes and cloud infrastructure. He has deployed and managed large-scale Kubernetes clusters for enterprise clients.
