Effective monitoring and logging are essential for running containers in production. This article covers tools and practices for gaining visibility into your containerized applications.
Why Container Monitoring Matters
Containers are ephemeral and dynamic, making traditional monitoring approaches insufficient:
| Challenge | Traditional Approach | Container Reality |
|---|---|---|
| Lifecycle | Long-lived servers | Short-lived containers |
| Scale | Fixed infrastructure | Dynamic scaling |
| Discovery | Static IPs | Dynamic IPs and names |
| Logs | Files on disk | Stdout/stderr streams |
Docker Native Logging
Container Logs Basics
# View container logs
docker logs mycontainer
# Follow logs in real-time
docker logs -f mycontainer
# Show timestamps
docker logs -t mycontainer
# Show last N lines
docker logs --tail 100 mycontainer
# Show logs since time
docker logs --since 2025-01-18T10:00:00 mycontainer
docker logs --since 30m mycontainer
Logging Drivers
Docker supports multiple logging drivers:
# Check current logging driver
docker info | grep "Logging Driver"
# Run with specific driver
docker run -d --log-driver json-file --log-opt max-size=10m nginx
| Driver | Description | Best For |
|---|---|---|
| json-file | Default, JSON format | Development, small deployments |
| syslog | Syslog daemon | Traditional Linux logging |
| journald | Systemd journal | Systemd-based systems |
| fluentd | Fluentd collector | Centralized logging |
| awslogs | AWS CloudWatch | AWS deployments |
| gcplogs | Google Cloud Logging | GCP deployments |
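The same options can also be set daemon-wide so every new container inherits them. A minimal sketch of /etc/docker/daemon.json, assuming json-file is kept as the default driver (restart the Docker daemon after editing):
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}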
Configuring Logging in Docker Compose
services:
app:
image: myapp
logging:
driver: json-file
options:
max-size: "10m"
max-file: "3"
labels: "app,environment"
tag: "{{.Name}}/{{.ID}}"
# Fluentd logging
api:
image: myapi
logging:
driver: fluentd
options:
fluentd-address: localhost:24224
tag: "docker.{{.Name}}"
Container Metrics with Docker Stats
# Real-time stats for all containers
docker stats
# Stats for specific containers
docker stats container1 container2
# One-time stats (no streaming)
docker stats --no-stream
# Custom format
docker stats --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.NetIO}}"
Example output (default format):
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O
abc123 web 0.50% 50.5MiB / 512MiB 9.86% 1.2MB / 500kB 0B / 0B
def456 db 2.30% 256MiB / 1GiB 25.00% 500kB / 100kB 10MB / 5MB
cAdvisor for Container Metrics
cAdvisor (Container Advisor) provides detailed resource usage and performance metrics.
Running cAdvisor
# docker-compose.yml
services:
cadvisor:
image: gcr.io/cadvisor/cadvisor:latest
container_name: cadvisor
ports:
- "8080:8080"
volumes:
- /:/rootfs:ro
- /var/run:/var/run:ro
- /sys:/sys:ro
- /var/lib/docker/:/var/lib/docker:ro
- /dev/disk/:/dev/disk:ro
privileged: true
Access the UI at http://localhost:8080 for:
- CPU, memory, network, filesystem metrics
- Per-container resource usage
- Container health information
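cAdvisor also exposes the same data in Prometheus format at /metrics, which is exactly what Prometheus scrapes in the next section. A quick spot check:
# Verify cAdvisor is exporting container metrics
curl -s http://localhost:8080/metrics | grep container_cpu_usage_seconds_total | head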
Prometheus for Metrics Collection
Prometheus Architecture
flowchart LR
subgraph Targets["Metrics Sources"]
App["Application<br/>/metrics"]
cAdvisor["cAdvisor<br/>/metrics"]
Node["Node Exporter<br/>/metrics"]
end
subgraph Prometheus["Prometheus"]
Scraper["Scraper"]
TSDB["Time Series DB"]
PromQL["PromQL Engine"]
end
subgraph Visualization["Visualization"]
Grafana["Grafana"]
AlertManager["Alertmanager"]
end
Targets --> Scraper
Scraper --> TSDB
TSDB --> PromQL
PromQL --> Grafana
PromQL --> AlertManager
style Prometheus fill:#e6522c,color:#fff
style Grafana fill:#f46800,color:#fff
Prometheus Configuration
# prometheus.yml
global:
scrape_interval: 15s
evaluation_interval: 15s
alerting:
alertmanagers:
- static_configs:
- targets: ['alertmanager:9093']
scrape_configs:
# Prometheus itself
- job_name: 'prometheus'
static_configs:
- targets: ['localhost:9090']
# cAdvisor metrics
- job_name: 'cadvisor'
static_configs:
- targets: ['cadvisor:8080']
# Application metrics
- job_name: 'app'
static_configs:
- targets: ['app:3000']
# Docker daemon metrics
- job_name: 'docker'
static_configs:
- targets: ['host.docker.internal:9323']
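The docker job above assumes the Docker daemon has been configured to expose its own metrics endpoint, which is off by default. A sketch of the required /etc/docker/daemon.json entry (older Engine versions may additionally require experimental mode; restart the daemon after editing):
{
  "metrics-addr": "0.0.0.0:9323"
}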
Docker Compose with Prometheus Stack
services:
prometheus:
image: prom/prometheus:latest
ports:
- "9090:9090"
volumes:
- ./prometheus.yml:/etc/prometheus/prometheus.yml
- prometheus_data:/prometheus
command:
- '--config.file=/etc/prometheus/prometheus.yml'
- '--storage.tsdb.path=/prometheus'
grafana:
image: grafana/grafana:latest
ports:
- "3000:3000"
environment:
GF_SECURITY_ADMIN_PASSWORD: admin
volumes:
- grafana_data:/var/lib/grafana
cadvisor:
image: gcr.io/cadvisor/cadvisor:latest
ports:
- "8080:8080"
volumes:
- /:/rootfs:ro
- /var/run:/var/run:ro
- /sys:/sys:ro
- /var/lib/docker/:/var/lib/docker:ro
alertmanager:
image: prom/alertmanager:latest
ports:
- "9093:9093"
volumes:
- ./alertmanager.yml:/etc/alertmanager/alertmanager.yml
volumes:
prometheus_data:
grafana_data:
Grafana Dashboards
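Grafana needs Prometheus registered as a data source before any dashboard can query it. This can be done in the UI or provisioned from a file mounted under /etc/grafana/provisioning/datasources/ in the grafana service above; a minimal sketch (the file name is arbitrary):
# datasource.yml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
With the data source in place, community container dashboards from grafana.com can be imported, or custom panels built from queries like those below.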
Common Container Metrics to Monitor
| Metric | Description | Alert Threshold |
|---|---|---|
| CPU Usage | Container CPU percentage | > 80% |
| Memory Usage | Container memory usage | > 85% |
| Network I/O | Bytes sent/received | Anomaly detection |
| Disk I/O | Read/write operations | Depends on app |
| Container Restarts | Number of restarts | > 3 in 5 minutes |
Sample PromQL Queries
# Container CPU usage percentage
sum(rate(container_cpu_usage_seconds_total{name!=""}[5m])) by (name) * 100
# Container memory usage
container_memory_usage_bytes{name!=""} / container_spec_memory_limit_bytes{name!=""} * 100
# Network receive rate
rate(container_network_receive_bytes_total{name!=""}[5m])
# Container restart count
changes(container_start_time_seconds{name!=""}[1h])
ELK Stack for Log Aggregation
Architecture
flowchart LR
subgraph Apps["Applications"]
App1["Container 1"]
App2["Container 2"]
App3["Container 3"]
end
subgraph Collection["Log Collection"]
Filebeat["Filebeat"]
Logstash["Logstash"]
end
subgraph Storage["Storage & Search"]
ES["Elasticsearch"]
end
subgraph Viz["Visualization"]
Kibana["Kibana"]
end
Apps --> Filebeat
Filebeat --> Logstash
Logstash --> ES
ES --> Kibana
style ES fill:#005571,color:#fff
style Kibana fill:#e8478b,color:#fff
Docker Compose for ELK
services:
elasticsearch:
image: docker.elastic.co/elasticsearch/elasticsearch:8.11.0
environment:
- discovery.type=single-node
- xpack.security.enabled=false
- "ES_JAVA_OPTS=-Xms512m -Xmx512m"
volumes:
- elasticsearch_data:/usr/share/elasticsearch/data
ports:
- "9200:9200"
logstash:
image: docker.elastic.co/logstash/logstash:8.11.0
volumes:
- ./logstash.conf:/usr/share/logstash/pipeline/logstash.conf
ports:
- "5044:5044"
depends_on:
- elasticsearch
kibana:
image: docker.elastic.co/kibana/kibana:8.11.0
ports:
- "5601:5601"
environment:
ELASTICSEARCH_HOSTS: http://elasticsearch:9200
depends_on:
- elasticsearch
filebeat:
image: docker.elastic.co/beats/filebeat:8.11.0
user: root
volumes:
- ./filebeat.yml:/usr/share/filebeat/filebeat.yml:ro
- /var/lib/docker/containers:/var/lib/docker/containers:ro
- /var/run/docker.sock:/var/run/docker.sock:ro
depends_on:
- logstash
volumes:
elasticsearch_data:
Filebeat Configuration
# filebeat.yml
filebeat.inputs:
- type: container
paths:
- '/var/lib/docker/containers/*/*.log'
processors:
- add_docker_metadata:
host: "unix:///var/run/docker.sock"
output.logstash:
hosts: ["logstash:5044"]
logging.level: info
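The Logstash service in the compose file mounts ./logstash.conf, which was not shown above. A minimal pipeline sketch that accepts Beats input and writes to Elasticsearch (the index name is an assumption):
# logstash.conf
input {
  beats {
    port => 5044
  }
}
output {
  elasticsearch {
    hosts => ["http://elasticsearch:9200"]
    index => "docker-logs-%{+YYYY.MM.dd}"
  }
}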
Application-Level Logging
Structured Logging (Node.js)
const winston = require('winston');
const logger = winston.createLogger({
level: 'info',
format: winston.format.combine(
winston.format.timestamp(),
winston.format.json()
),
defaultMeta: { service: 'my-service' },
transports: [
new winston.transports.Console()
]
});
// Usage
logger.info('User logged in', { userId: '123', action: 'login' });
Structured Logging (Python)
import logging
import json
class JSONFormatter(logging.Formatter):
def format(self, record):
log_record = {
'timestamp': self.formatTime(record),
'level': record.levelname,
'message': record.getMessage(),
'service': 'my-service'
}
return json.dumps(log_record)
handler = logging.StreamHandler()
handler.setFormatter(JSONFormatter())
logger = logging.getLogger()
logger.addHandler(handler)
logger.setLevel(logging.INFO)
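# Usage: each call emits one JSON object per line on stdout,
# which the container runtime captures via the configured logging driver
logger.info('User logged in')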
Health Checks and Alerting
Container Health Checks
# Dockerfile
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
  CMD curl -f http://localhost:3000/health || exit 1
# docker-compose.yml
services:
app:
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
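Both checks assume the application actually serves a /health route and that curl is installed in the image. A minimal sketch of such an endpoint, using Express to match the earlier Node.js examples (the route and port simply mirror the healthcheck commands above):
// Health endpoint sketch
const express = require('express');
const app = express();

// Report liveness; extend with real dependency checks (database, cache) as needed
app.get('/health', (req, res) => {
  res.status(200).json({ status: 'ok', uptime: process.uptime() });
});

app.listen(3000);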
Alertmanager Configuration
# alertmanager.yml
global:
smtp_smarthost: 'smtp.gmail.com:587'
smtp_from: 'alerts@example.com'
route:
group_by: ['alertname']
group_wait: 30s
group_interval: 5m
repeat_interval: 1h
receiver: 'email-notifications'
receivers:
- name: 'email-notifications'
email_configs:
- to: 'team@example.com'
- name: 'slack-notifications'
slack_configs:
- api_url: 'https://hooks.slack.com/services/xxx'
channel: '#alerts'
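The slack-notifications receiver above is only used if a route points to it. For example, critical alerts could be split off by merging a child route into the route: block (a sketch):
route:
  routes:
    - match:
        severity: critical
      receiver: 'slack-notifications'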
Alert Rules
# alert_rules.yml
groups:
- name: container_alerts
rules:
- alert: ContainerHighCPU
expr: sum(rate(container_cpu_usage_seconds_total{name!=""}[5m])) by (name) > 0.8
for: 5m
labels:
severity: warning
annotations:
summary: "High CPU usage on {{ $labels.name }}"
- alert: ContainerHighMemory
expr: container_memory_usage_bytes{name!=""} / container_spec_memory_limit_bytes{name!=""} > 0.85
for: 5m
labels:
severity: warning
annotations:
summary: "High memory usage on {{ $labels.name }}"
- alert: ContainerRestarting
expr: changes(container_start_time_seconds{name!=""}[15m]) > 3
labels:
severity: critical
annotations:
summary: "Container {{ $labels.name }} restarting frequently"
Best Practices
1. Centralize Logs
services:
app:
logging:
driver: fluentd
options:
fluentd-address: fluentd:24224
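This assumes a Fluentd collector is reachable at fluentd:24224. A minimal companion service sketch for the same compose file (the image tag and config path follow the official fluent/fluentd image; the Fluentd config itself is not shown):
fluentd:
  image: fluent/fluentd:latest
  volumes:
    - ./fluent.conf:/fluentd/etc/fluent.conf
  ports:
    - "24224:24224"
    - "24224:24224/udp"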
2. Use Structured Logging
{
"timestamp": "2025-01-18T10:00:00Z",
"level": "info",
"service": "api",
"message": "Request processed",
"requestId": "abc123",
"duration": 45
}
3. Set Resource Limits with Monitoring
services:
app:
deploy:
resources:
limits:
memory: 512M
cpus: '0.5'
4. Monitor Key Metrics
| Category | Metrics |
|---|---|
| Resources | CPU, Memory, Disk, Network |
| Application | Response time, Error rate, Throughput |
| Business | Active users, Transactions, Revenue |
Summary
| Tool | Purpose | Best For |
|---|---|---|
| Docker logs | Basic log viewing | Development |
| cAdvisor | Container metrics | Resource monitoring |
| Prometheus | Metrics collection | Time-series data |
| Grafana | Visualization | Dashboards |
| ELK Stack | Log aggregation | Search and analysis |
| Alertmanager | Alerting | Incident response |
Key Takeaways
- Use centralized logging - Don't rely on local container logs
- Monitor container resources - CPU, memory, network, disk
- Implement health checks - Detect problems early
- Set up alerting - Get notified before users do
- Use structured logging - JSON format for easy parsing
- Retain logs appropriately - Balance storage costs and debugging needs
Next Steps
In the next article, we'll explore Docker CI/CD pipelines for automated building, testing, and deployment.