Kubernetes Health Probes and Lifecycle Hooks

Health probes and lifecycle hooks ensure your applications run reliably in Kubernetes. Probes detect unhealthy containers, while lifecycle hooks enable graceful startup and shutdown.

Why Health Probes Matter

flowchart LR
    subgraph Without["Without Probes"]
        A1["Container Running"] --> B1["App Crashed"]
        B1 --> C1["Still Receiving Traffic"]
    end

    subgraph With["With Probes"]
        A2["Container Running"] --> B2["Probe Fails"]
        B2 --> C2["Traffic Stopped"]
        C2 --> D2["Container Restarted"]
    end

    style Without fill:#ef4444,color:#fff
    style With fill:#22c55e,color:#fff

Three Types of Probes

Probe	Purpose	Failure Action
Liveness	Is container alive?	Restart container
Readiness	Can container serve traffic?	Remove from Service
Startup	Has container started?	Block other probes

Liveness Probe

Detects if a container is running but unhealthy (deadlocked, hung).

HTTP Liveness Probe

apiVersion: v1
kind: Pod
metadata:
  name: web-app
spec:
  containers:
    - name: app
      image: myapp:1.0
      livenessProbe:
        httpGet:
          path: /healthz
          port: 8080
        initialDelaySeconds: 10
        periodSeconds: 5
        timeoutSeconds: 3
        failureThreshold: 3
        successThreshold: 1

TCP Liveness Probe

livenessProbe:
  tcpSocket:
    port: 3306
  initialDelaySeconds: 15
  periodSeconds: 10

Exec Liveness Probe

livenessProbe:
  exec:
    command:
      - cat
      - /tmp/healthy
  initialDelaySeconds: 5
  periodSeconds: 5

Readiness Probe

Determines if a container can accept traffic. Failed readiness removes pod from Service endpoints.

apiVersion: v1
kind: Pod
metadata:
  name: web-app
spec:
  containers:
    - name: app
      image: myapp:1.0
      readinessProbe:
        httpGet:
          path: /ready
          port: 8080
        initialDelaySeconds: 5
        periodSeconds: 5
        failureThreshold: 3

Readiness vs Liveness

flowchart TB
    subgraph Liveness["Liveness Probe"]
        L1["Check"] -->|Fail| L2["Restart Container"]
    end

    subgraph Readiness["Readiness Probe"]
        R1["Check"] -->|Fail| R2["Remove from Service"]
        R2 --> R3["Keep Container Running"]
    end

    style Liveness fill:#ef4444,color:#fff
    style Readiness fill:#f59e0b,color:#fff

Startup Probe

For slow-starting containers. Disables liveness/readiness probes until startup succeeds.

apiVersion: v1
kind: Pod
metadata:
  name: legacy-app
spec:
  containers:
    - name: app
      image: legacy:1.0
      startupProbe:
        httpGet:
          path: /healthz
          port: 8080
        failureThreshold: 30
        periodSeconds: 10
      livenessProbe:
        httpGet:
          path: /healthz
          port: 8080
        periodSeconds: 10

This allows up to 5 minutes (30 × 10s) for startup before liveness probes begin.

Probe Configuration Parameters

Parameter	Default	Description
initialDelaySeconds	0	Wait before first probe
periodSeconds	10	Time between probes
timeoutSeconds	1	Probe timeout
successThreshold	1	Successes to be healthy
failureThreshold	3	Failures before action

Lifecycle Hooks

Execute commands at container startup and shutdown.

flowchart LR
    A["Container Start"] --> B["postStart Hook"]
    B --> C["Running"]
    C --> D["preStop Hook"]
    D --> E["SIGTERM"]
    E --> F["Container Stop"]

    style B fill:#3b82f6,color:#fff
    style D fill:#8b5cf6,color:#fff

postStart Hook

Runs immediately after container creation (not guaranteed before ENTRYPOINT).

apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: myapp:1.0
      lifecycle:
        postStart:
          exec:
            command:
              - /bin/sh
              - -c
              - echo "Container started" >> /var/log/startup.log

preStop Hook

Runs before container receives SIGTERM. Useful for graceful shutdown.

apiVersion: v1
kind: Pod
metadata:
  name: web-server
spec:
  containers:
    - name: nginx
      image: nginx:1.25
      lifecycle:
        preStop:
          exec:
            command:
              - /bin/sh
              - -c
              - nginx -s quit && sleep 10

HTTP Lifecycle Hook

lifecycle:
  preStop:
    httpGet:
      path: /shutdown
      port: 8080

Graceful Shutdown Pattern

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 3
  template:
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - name: app
          image: myapp:1.0
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            periodSeconds: 5
          lifecycle:
            preStop:
              exec:
                command:
                  - /bin/sh
                  - -c
                  - sleep 5

Shutdown Sequence

Pod marked for termination
Pod removed from Service endpoints
preStop hook executes
SIGTERM sent to container
Wait for terminationGracePeriodSeconds
SIGKILL if still running

Complete Example

apiVersion: apps/v1
kind: Deployment
metadata:
  name: production-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: production-app
  template:
    metadata:
      labels:
        app: production-app
    spec:
      terminationGracePeriodSeconds: 30
      containers:
        - name: app
          image: myapp:1.0
          ports:
            - containerPort: 8080
          startupProbe:
            httpGet:
              path: /healthz
              port: 8080
            failureThreshold: 30
            periodSeconds: 2
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 0
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 0
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 3
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 5"]

Best Practices

Practice	Recommendation
Always use readiness probes	Prevent traffic to unready pods
Use startup probes for slow apps	Avoid false positive restarts
Keep probes lightweight	Don't check downstream dependencies
Set appropriate timeouts	Match your application's behavior
Use preStop for graceful shutdown	Allow in-flight requests to complete

Common Issues

Too Aggressive Liveness Probe

# Bad: Restarts healthy but slow containers
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 1
  periodSeconds: 1
  failureThreshold: 1

# Good: Reasonable thresholds
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 10
  failureThreshold: 3

Checking Dependencies in Probes

# Bad: Fails if database is down
livenessProbe:
  httpGet:
    path: /healthz  # Checks database connection

# Good: Only check container health
livenessProbe:
  httpGet:
    path: /healthz  # Returns 200 if app is running

Key Takeaways

Liveness restarts unhealthy containers - Use for deadlock detection
Readiness controls traffic - Use for startup and temporary unavailability
Startup probes for slow containers - Prevents premature restarts
preStop enables graceful shutdown - Complete in-flight requests
Don't check dependencies - Probes should only check container health

References

The Kubernetes Book, 3rd Edition - Nigel Poulton
Kubernetes: Up and Running, 3rd Edition - Burns, Beda, Hightower
Kubernetes Probe Documentation