Kubernetes Best Practices for Production

David Kumar

DevOps Engineer

Dec 1, 2024 · 12 min read

#Kubernetes #DevOps #Cloud

Running Kubernetes in production requires careful planning and consistent practices. This guide covers cluster architecture, resource management, security, health checks, deployment strategies, monitoring, backups, and cost optimization for reliable, secure, and scalable clusters.

Cluster Architecture

High Availability Setup

Run at least three control plane nodes behind a load balancer. An odd number lets etcd keep quorum when one node fails:

apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
controlPlaneEndpoint: "lb.example.com:6443"
etcd:
  local:
    dataDir: /var/lib/etcd
    extraArgs:
      initial-cluster-state: new

Node Configuration

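Separate workloads using node pools. Labels let pods select a pool, and taints keep other pods off it. Labels and taints are usually set through the node pool configuration or kubectl rather than by applying a Node manifest, but the resulting Node object looks like this: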

apiVersion: v1
kind: Node
metadata:
  name: worker-1
  labels:
    node-role: compute
    workload-type: cpu-intensive
spec:
  taints:
  - key: workload-type
    value: cpu-intensive
    effect: NoSchedule
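
A taint only repels pods; a workload still has to opt in to the pool. The sketch below shows one way a pod could target the node above, using a nodeSelector for the label and a toleration for the taint. The pod name and image are hypothetical.

```yaml
# Sketch: a Pod that is scheduled onto the cpu-intensive pool.
apiVersion: v1
kind: Pod
metadata:
  name: batch-worker              # hypothetical name
spec:
  nodeSelector:
    workload-type: cpu-intensive  # matches the node label
  tolerations:
  - key: workload-type            # matches the node taint
    operator: Equal
    value: cpu-intensive
    effect: NoSchedule
  containers:
  - name: worker
    image: batch-worker:1.0.0     # hypothetical image
```

The toleration alone would only permit scheduling on the tainted node; the nodeSelector is what restricts the pod to that pool.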

Resource Management

Resource Requests and Limits

Set resource requests and limits on every container. Requests drive scheduling decisions, and limits cap usage at runtime:

apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  containers:
  - name: app
    image: myapp:1.0.0  # example tag; pin a version or digest instead of :latest
    resources:
      requests:
        memory: "256Mi"
        cpu: "500m"
      limits:
        memory: "512Mi"
        cpu: "1000m"
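
To catch containers that ship without constraints, a namespace can supply defaults. The following is a minimal sketch using a LimitRange; the object name, the `production` namespace, and the values are assumptions to adapt to your workloads.

```yaml
# Sketch: default requests and limits for containers that set none.
apiVersion: v1
kind: LimitRange
metadata:
  name: container-defaults   # hypothetical name
  namespace: production      # assumed namespace
spec:
  limits:
  - type: Container
    defaultRequest:          # applied when a container sets no requests
      cpu: "250m"
      memory: "128Mi"
    default:                 # applied when a container sets no limits
      cpu: "500m"
      memory: "256Mi"
```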

Quality of Service Classes

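Understand QoS classes. They are derived from requests and limits and determine which pods are evicted first when a node runs low on resources:

Guaranteed: Every container sets CPU and memory requests equal to its limits
Burstable: At least one container sets a request or limit, but the pod does not qualify as Guaranteed
BestEffort: No container sets any requests or limits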
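
As an illustration, the pod below should be classified as Guaranteed because its only container sets CPU and memory requests equal to its limits. The pod name and image tag are hypothetical.

```yaml
# Sketch: a Pod that qualifies for the Guaranteed QoS class.
apiVersion: v1
kind: Pod
metadata:
  name: myapp-guaranteed   # hypothetical name
spec:
  containers:
  - name: app
    image: myapp:1.0.0     # hypothetical tag
    resources:
      requests:
        memory: "512Mi"
        cpu: "500m"
      limits:
        memory: "512Mi"    # equal to the request
        cpu: "500m"        # equal to the request
```

Once the pod is running, `kubectl get pod myapp-guaranteed -o jsonpath='{.status.qosClass}'` shows the class Kubernetes assigned.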

Security Best Practices

RBAC Configuration

Grant least-privilege access with Roles that list only the resources and verbs a workload needs:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: production
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list"]
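
A Role grants nothing until it is bound to a subject. The sketch below binds `pod-reader` to a service account; the binding name, the `myapp` service account, and the `production` namespace are assumptions, and the namespace must match wherever the Role is created.

```yaml
# Sketch: bind the pod-reader Role to a service account.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods            # hypothetical name
  namespace: production      # assumed; must match the Role's namespace
subjects:
- kind: ServiceAccount
  name: myapp                # hypothetical service account
  namespace: production
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pod-reader
```

Binding to a dedicated service account rather than `default` keeps the permission scoped to one workload.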

Network Policies

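Isolate pods with network policies. They only take effect when the cluster's network plugin enforces them. This policy lets only frontend pods reach the API pods on TCP port 8080: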

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-allow
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
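
Allow rules like the one above are most useful on top of a default-deny baseline, since pods that no policy selects accept all traffic. A minimal sketch, assuming the workloads live in a `production` namespace:

```yaml
# Sketch: deny all ingress to every pod in the namespace by default.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: production      # assumed namespace
spec:
  podSelector: {}            # empty selector matches all pods
  policyTypes:
  - Ingress                  # no ingress rules listed, so none is allowed
```

Network policies are additive, so the `api-allow` policy still opens port 8080 for frontend pods.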

Pod Security Standards

Enforce the Pod Security Standards per namespace with Pod Security Admission labels:

apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
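
With `enforce: restricted`, pods that do not declare a hardened security context are rejected at admission. The sketch below shows the settings a pod would typically need to pass the restricted profile; the image tag and user ID are hypothetical, and the image must be able to run as a non-root user.

```yaml
# Sketch: a Pod that declares the settings the restricted profile checks.
apiVersion: v1
kind: Pod
metadata:
  name: myapp
  namespace: production
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 10001               # hypothetical non-root UID
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: app
    image: myapp:1.0.0             # hypothetical tag
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]
```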

Health Checks

Liveness and Readiness Probes

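Use a liveness probe to restart a container that has hung, and a readiness probe to keep a pod out of Service endpoints until it can serve traffic:

apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  containers:
  - name: app
    image: myapp:1.0.0  # example tag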
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 30
      periodSeconds: 10
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 5
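
For applications with long or variable startup times, a fixed `initialDelaySeconds` is a guess. One alternative is a startup probe, which holds off the liveness probe until the application has come up. A minimal sketch, with a hypothetical pod name and image tag and the same `/healthz` endpoint as above:

```yaml
# Sketch: startup probe for a slow-starting container.
apiVersion: v1
kind: Pod
metadata:
  name: myapp-slow-start     # hypothetical name
spec:
  containers:
  - name: app
    image: myapp:1.0.0       # hypothetical tag
    startupProbe:
      httpGet:
        path: /healthz
        port: 8080
      periodSeconds: 10
      failureThreshold: 30   # allows up to 30 x 10s = 300s to start
    livenessProbe:           # begins only after the startup probe succeeds
      httpGet:
        path: /healthz
        port: 8080
      periodSeconds: 10
```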

Deployment Strategies

Rolling Updates

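Configure rolling updates so that capacity is preserved while pods are replaced. This fragment shows only the relevant fields of a Deployment: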

apiVersion: apps/v1
kind: Deployment
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
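      maxSurge: 25%        # extra pods allowed above the desired replica count
      maxUnavailable: 25%  # pods that may be unavailable during the update
  minReadySeconds: 10      # a new pod must stay ready this long to count as available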
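
For reference, here is a complete Deployment sketch with a more conservative strategy that never drops below the desired replica count. The replica count, labels, image tag, and `/ready` endpoint are assumptions.

```yaml
# Sketch: Deployment that replaces one pod at a time without losing capacity.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 4
  selector:
    matchLabels:
      app: myapp
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1            # add one new pod at a time
      maxUnavailable: 0      # never remove a pod before its replacement is ready
  minReadySeconds: 10
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: app
        image: myapp:1.0.0   # hypothetical tag
        readinessProbe:      # rollout progress depends on this probe
          httpGet:
            path: /ready
            port: 8080
          periodSeconds: 5
```

The trade-off is speed: with `maxSurge: 1` the rollout proceeds one pod at a time, which is slower than the percentage-based settings above.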

Blue-Green Deployments

Run the blue and green versions as separate Deployments and switch traffic by changing the Service selector:

apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
    version: blue  # Switch to green when ready
  ports:
  - port: 80          # example port
    targetPort: 8080

Monitoring and Logging

Prometheus Integration

With the Prometheus Operator installed, a ServiceMonitor tells Prometheus which Services to scrape:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: myapp
spec:
  selector:
    matchLabels:
      app: myapp
  endpoints:
  - port: metrics
    interval: 30s
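
A ServiceMonitor selects Services by label and refers to the scrape port by name, so the Service has to provide both. A sketch of a matching Service, where the port number 9090 is an assumption:

```yaml
# Sketch: Service that the ServiceMonitor above would select.
apiVersion: v1
kind: Service
metadata:
  name: myapp
  labels:
    app: myapp          # matched by the ServiceMonitor's selector
spec:
  selector:
    app: myapp
  ports:
  - name: metrics       # referenced by "port: metrics" in the ServiceMonitor
    port: 9090          # hypothetical metrics port
    targetPort: 9090
```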

Centralized Logging

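Ship container logs to a central store such as the EFK stack (Elasticsearch, Fluentd or Fluent Bit, Kibana) or Loki. This Fluent Bit output section forwards all records to Elasticsearch: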

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
data:
  fluent-bit.conf: |
    [OUTPUT]
        Name es
        Match *
        Host elasticsearch
        Port 9200

Backup and Disaster Recovery

etcd Backups

Take regular etcd snapshots and store them off the control plane nodes:

ETCDCTL_API=3 etcdctl snapshot save backup.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

Velero for Application Backups

Use Velero for application-level backups:

velero backup create myapp-backup \
  --include-namespaces myapp \
  --storage-location default
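
One-off backups are easy to forget, so a recurring schedule is usually worth adding. The sketch below expresses the same backup as a Velero Schedule resource; the schedule name, cron expression, and retention period are assumptions, and `velero` is assumed to be the namespace Velero was installed into.

```yaml
# Sketch: daily Velero backup of the myapp namespace.
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: myapp-daily          # hypothetical name
  namespace: velero          # assumes Velero's default install namespace
spec:
  schedule: "0 2 * * *"      # every day at 02:00
  template:
    includedNamespaces:
    - myapp
    storageLocation: default
    ttl: 720h0m0s            # keep each backup for 30 days
```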

Cost Optimization

Resource Optimization

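Use the Horizontal Pod Autoscaler (HPA) to match replica count to load
Use the Vertical Pod Autoscaler (VPA) to right-size requests
Use cluster autoscaling to add and remove nodes with demand
Set resource quotas to cap what each namespace can consume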
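
As an example of the last item, a ResourceQuota might look like the sketch below. The quota name, the `production` namespace, and all of the values are assumptions to size against your own cluster.

```yaml
# Sketch: cap the total compute a namespace can request.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota      # hypothetical name
  namespace: production    # assumed namespace
spec:
  hard:
    requests.cpu: "10"
    requests.memory: 20Gi
    limits.cpu: "20"
    limits.memory: 40Gi
    pods: "50"
```

Once a quota covers CPU or memory, pods in that namespace that omit the corresponding requests or limits are rejected, which also enforces the resource constraints described earlier.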

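Horizontal Pod Autoscaler

Scale a Deployment on CPU utilization. Utilization is measured against the pods' CPU requests, so the target containers must set them: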

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

Conclusion

Running Kubernetes in production is complex but manageable with the right practices. Focus on security, reliability, and observability from day one.

Remember: Start simple, monitor everything, and iterate based on your actual needs.

About the Author

David is a DevOps Engineer with extensive experience in Kubernetes and cloud infrastructure. He has deployed and managed large-scale Kubernetes clusters for enterprise clients.
