🗄️ Log Aggregation Strategy

📍 Where the Logs Go

╔═════════════╤══════════════════════════╤════════════════╗
║ Environment │ Destination              │ Retention      ║
╠═════════════╪══════════════════════════╪════════════════╣
║ Development │ Local files              │ 7 days         ║
║ Staging     │ ELK stack (centralized)  │ 30 days        ║
║ Production  │ ELK + S3 archive         │ 90d + 1 year   ║
╚═════════════╧══════════════════════════╧════════════════╝

Development

Storage:
- Local filesystem: ./logs/
- Console output for immediate feedback

Rotation:
- Daily rotation
- 7 days retention (see the logrotate sketch below)
- Auto-cleanup on CI/CD builds

Access:
- Direct file access
- tail -f logs/audiolab.log
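
The daily rotation and 7-day retention can be handled by logrotate; a minimal sketch, assuming the project's logs/ directory resolves to /home/dev/audiolab/logs (adjust the path and drop the file into /etc/logrotate.d/):

# /etc/logrotate.d/audiolab -- daily rotation, keep 7 compressed generations
/home/dev/audiolab/logs/*.log {
    daily
    rotate 7
    compress
    delaycompress
    missingok
    notifempty
    copytruncate    # rotate without requiring the app to reopen its log file
}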

Staging

Storage:
- Centralized ELK stack
- Backup to S3 daily (see the snapshot policy sketch below)

Rotation:
- 30 days in Elasticsearch (hot storage)
- Archived to S3 after 30 days
- Compressed archives

Access:
- Kibana dashboards
- S3 for historical analysis
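
The daily S3 backup can be implemented with an Elasticsearch snapshot repository plus a snapshot lifecycle (SLM) policy. A sketch, assuming S3 credentials are already in the Elasticsearch keystore; bucket, base path, and policy names are illustrative, not part of this project:

PUT _snapshot/audiolab-s3-backup
{
  "type": "s3",
  "settings": {
    "bucket": "audiolab-logs-backup",
    "base_path": "staging/elasticsearch"
  }
}

PUT _slm/policy/audiolab-daily
{
  "schedule": "0 30 1 * * ?",
  "name": "<audiolab-snap-{now/d}>",
  "repository": "audiolab-s3-backup",
  "config": { "indices": ["audiolab-logs-*"] },
  "retention": { "expire_after": "30d" }
}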

Production

Storage:
- Primary: ELK stack (hot data)
- Secondary: S3 Glacier (cold archive)

Retention Policy:

0-90 days:    Elasticsearch (searchable, fast)
90-365 days:  S3 Standard-IA (archived, slower)
365+ days:    S3 Glacier (compliance, very slow)

Access:
- Kibana for recent logs
- S3 console for archives
- Automated compliance exports

🏗️ Stack Options

Option 1: ELK Stack (Elasticsearch, Logstash, Kibana)

Components:
- Elasticsearch: Log storage and indexing
- Logstash: Log parsing and transformation
- Kibana: Visualization and search UI
- Filebeat: Log shipping agent

Pros:
✅ Free to self-host (Elastic License 2.0 / SSPL; Apache 2.0 only up to 7.10)
✅ Powerful full-text search
✅ Rich visualization (Kibana dashboards)
✅ Scalable (proven at petabyte scale)
✅ Large ecosystem (integrations, plugins)

Cons:
⚠️ Resource intensive (RAM hungry)
⚠️ Complex setup and tuning
⚠️ Requires dedicated infrastructure
⚠️ Can be overkill for small deployments

Best for:
- Production environments
- High log volume (>100 GB/day)
- Multiple services/components
- Advanced analytics requirements

Resource Requirements:

Small deployment (< 10 GB/day):
- Elasticsearch: 4 GB RAM, 2 CPU
- Logstash: 2 GB RAM, 1 CPU
- Kibana: 1 GB RAM, 1 CPU

Medium deployment (10-100 GB/day):
- Elasticsearch: 16 GB RAM, 4 CPU
- Logstash: 4 GB RAM, 2 CPU
- Kibana: 2 GB RAM, 1 CPU

Docker Compose Example:

version: '3.8'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.11.0
    environment:
      - discovery.type=single-node
      # ES 8.x enables TLS/auth by default; disable it for this local demo only
      - xpack.security.enabled=false
      - "ES_JAVA_OPTS=-Xms2g -Xmx2g"
    ports:
      - "9200:9200"
    volumes:
      - elasticsearch-data:/usr/share/elasticsearch/data

  logstash:
    image: docker.elastic.co/logstash/logstash:8.11.0
    volumes:
      - ./logstash.conf:/usr/share/logstash/pipeline/logstash.conf
    depends_on:
      - elasticsearch

  kibana:
    image: docker.elastic.co/kibana/kibana:8.11.0
    ports:
      - "5601:5601"
    depends_on:
      - elasticsearch

volumes:
  elasticsearch-data:

Option 2: Grafana Loki

Components:
- Loki: Log aggregation engine
- Promtail: Log shipping agent
- Grafana: Visualization UI

Pros:
✅ Lightweight (low resource usage)
✅ Integrates with Grafana (unified metrics + logs)
✅ Label-based indexing (cheaper than full-text)
✅ Faster setup than ELK
✅ LogQL query language (similar to PromQL)

Cons:
⚠️ Less powerful search than Elasticsearch
⚠️ No full-text indexing (searches raw logs at query time)
⚠️ Smaller ecosystem
⚠️ Younger project (less mature)

Best for:
- Staging environments
- Low-to-medium log volume
- Teams already using Grafana for metrics
- Cost-conscious deployments

Resource Requirements:

Typical deployment:
- Loki: 1 GB RAM, 1 CPU
- Promtail: 256 MB RAM, 0.5 CPU
- Grafana: 512 MB RAM, 1 CPU

Docker Compose Example:

version: '3.8'
services:
  loki:
    image: grafana/loki:2.9.0
    ports:
      - "3100:3100"
    volumes:
      - ./loki-config.yaml:/etc/loki/local-config.yaml
      - loki-data:/loki

  promtail:
    image: grafana/promtail:2.9.0
    volumes:
      - ./promtail-config.yaml:/etc/promtail/config.yaml
      - /var/log:/var/log
    depends_on:
      - loki

  grafana:
    image: grafana/grafana:10.2.0
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
    depends_on:
      - loki

volumes:
  loki-data:
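
The compose file mounts ./loki-config.yaml, which is not shown above. A minimal single-node sketch with filesystem storage (schema date and directories are illustrative; log retention would additionally require enabling the compactor, omitted here for brevity):

# loki-config.yaml -- single-node Loki with filesystem storage (illustrative values)
auth_enabled: false

server:
  http_listen_port: 3100

common:
  path_prefix: /loki
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules

schema_config:
  configs:
    - from: 2023-01-01
      store: boltdb-shipper
      object_store: filesystem
      schema: v12
      index:
        prefix: index_
        period: 24h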

Option 3: CloudWatch Logs (AWS)

Components:
- CloudWatch Logs: Managed log storage
- CloudWatch Logs Insights: Query interface
- CloudWatch Agent: Log shipping

Pros:
✅ Fully managed (no infrastructure)
✅ Native AWS integration (Lambda, EC2, ECS)
✅ Pay-as-you-go pricing
✅ Integrated with CloudWatch alarms
✅ Automatic retention policies

Cons:
⚠️ Vendor lock-in (AWS only)
⚠️ Can be expensive at scale
⚠️ Limited query capabilities vs Elasticsearch
⚠️ Data egress costs

Best for:
- AWS-native deployments
- Serverless architectures
- Small teams (no ops overhead)
- Short-term log retention

Pricing (approximate):

Ingestion: $0.50 per GB
Storage:   $0.03 per GB/month
Queries:   $0.005 per GB scanned
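
Log groups and retention can be set up ahead of the agent with the AWS CLI; a sketch assuming configured credentials and the same group names used in the CloudWatch Agent configuration further below:

# Create the log groups and set retention explicitly
aws logs create-log-group --log-group-name /audiolab/application
aws logs put-retention-policy --log-group-name /audiolab/application --retention-in-days 90

aws logs create-log-group --log-group-name /audiolab/errors
aws logs put-retention-policy --log-group-name /audiolab/errors --retention-in-days 365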

Option 4: Self-Hosted File Rotation (Development)

Components:
- Log4cpp / Serilog file sinks
- Logrotate (Linux) or scheduled task (Windows)

Pros:
✅ Zero external dependencies
✅ Simple setup
✅ Fast for local development
✅ No network latency

Cons:
⚠️ No centralization (can't search across machines)
⚠️ Manual access required
⚠️ No visualizations
⚠️ Limited analysis capabilities

Best for:
- Local development
- Debugging single components
- CI/CD ephemeral environments

📦 Shipping Logs

Filebeat Configuration

# filebeat.yml
filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - /var/log/audiolab/*.log
    fields:
      app: audiolab
      env: production
      component: core
    fields_under_root: true

  - type: log
    enabled: true
    paths:
      - /var/log/audiolab/errors/*.log
    fields:
      app: audiolab
      env: production
      severity: error
    fields_under_root: true

# Output to Elasticsearch
output.elasticsearch:
  hosts: ["elasticsearch:9200"]
  index: "audiolab-logs-%{+yyyy.MM.dd}"
  username: "elastic"
  password: "${ELASTIC_PASSWORD}"

# Output to Logstash (alternative)
# output.logstash:
#   hosts: ["logstash:5044"]

# Processors
processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~

# Logging
logging.level: info
logging.to_files: true
logging.files:
  path: /var/log/filebeat
  name: filebeat
  keepfiles: 7
  permissions: 0644
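
The configuration can be sanity-checked before rollout; a quick check, assuming a package install with the config at /etc/filebeat/filebeat.yml:

# Validate the config file and the connection to the configured output
filebeat test config -c /etc/filebeat/filebeat.yml
filebeat test output -c /etc/filebeat/filebeat.yml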

Promtail Configuration

# promtail-config.yaml
server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://loki:3100/loki/api/v1/push

scrape_configs:
  - job_name: audiolab
    static_configs:
      - targets:
          - localhost
        labels:
          job: audiolab
          __path__: /var/log/audiolab/*.log

    pipeline_stages:
      - regex:
          expression: '^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \[(?P<level>\w+)\] (?P<logger>\S+): (?P<message>.*)$'

      - labels:
          level:
          logger:

      - timestamp:
          source: timestamp
          format: '2006-01-02 15:04:05.000'
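
With level and logger promoted to labels by the pipeline above, logs can be filtered in Grafana using LogQL; two illustrative queries (label values depend on what the application actually emits):

# All ERROR-level audiolab logs containing "timeout"
{job="audiolab", level="ERROR"} |= "timeout"

# Error counts over 5-minute windows, grouped by logger
sum by (logger) (count_over_time({job="audiolab", level="ERROR"}[5m]))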

CloudWatch Agent Configuration

{
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/var/log/audiolab/*.log",
            "log_group_name": "/audiolab/application",
            "log_stream_name": "{instance_id}",
            "retention_in_days": 90,
            "timezone": "UTC"
          },
          {
            "file_path": "/var/log/audiolab/errors/*.log",
            "log_group_name": "/audiolab/errors",
            "log_stream_name": "{instance_id}",
            "retention_in_days": 365,
            "timezone": "UTC"
          }
        ]
      }
    }
  }
}
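
On EC2, the agent loads this JSON via its control script; a sketch using the default install path (the config file location is illustrative; adjust for on-premises mode):

# Fetch the JSON config and (re)start the agent on an EC2 instance
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
  -a fetch-config -m ec2 \
  -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json -s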

🔍 Log Parsing

Logstash Pipeline

# logstash.conf
input {
  beats {
    port => 5044
  }
}

filter {
  # Parse JSON logs
  if [message] =~ /^{.*}$/ {
    json {
      source => "message"
    }
  }

  # Parse structured logs
  grok {
    match => {
      "message" => "%{TIMESTAMP_ISO8601:timestamp} \[%{LOGLEVEL:level}\] %{DATA:logger}: %{GREEDYDATA:message}"
    }
    overwrite => ["message"]
  }

  # Convert timestamp
  date {
    match => ["timestamp", "ISO8601"]
    target => "@timestamp"
  }

  # Add computed fields
  mutate {
    add_field => {
      "ingested_at" => "%{@timestamp}"
    }
  }

  # Drop debug logs in production
  if [env] == "production" and [level] == "DEBUG" {
    drop { }
  }
}

output {
  elasticsearch {
    hosts => ["elasticsearch:9200"]
    index => "audiolab-logs-%{+YYYY.MM.dd}"
    user => "elastic"
    password => "${ELASTIC_PASSWORD}"
  }

  # Also output to S3 for archival
  s3 {
    region => "us-east-1"
    bucket => "audiolab-logs-archive"
    size_file => 104857600  # 100 MB
    time_file => 15         # 15 minutes
    codec => "json_lines"
  }
}
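
The pipeline can be syntax-checked before deployment (the s3 output additionally needs AWS credentials, e.g. an instance role); assuming Logstash is run from its install directory:

# Validate the pipeline configuration without starting Logstash
bin/logstash -f logstash.conf --config.test_and_exit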

📊 Retention & Archival

Elasticsearch ILM Policy

{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_age": "1d",
            "max_size": "50gb"
          }
        }
      },
      "warm": {
        "min_age": "7d",
        "actions": {
          "shrink": {
            "number_of_shards": 1
          },
          "forcemerge": {
            "max_num_segments": 1
          }
        }
      },
      "cold": {
        "min_age": "30d",
        "actions": {
          "freeze": {}
        }
      },
      "delete": {
        "min_age": "90d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
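
The policy JSON above is registered with PUT _ilm/policy/audiolab-logs (the name is assumed here); new indices then inherit it through an index template. A sketch; note the rollover action also requires writing through an alias or data stream:

PUT _index_template/audiolab-logs
{
  "index_patterns": ["audiolab-logs-*"],
  "template": {
    "settings": {
      "index.lifecycle.name": "audiolab-logs",
      "index.lifecycle.rollover_alias": "audiolab-logs"
    }
  }
}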

S3 Lifecycle Policy

{
  "Rules": [
    {
      "Id": "ArchiveOldLogs",
      "Status": "Enabled",
      "Transitions": [
        {
          "Days": 90,
          "StorageClass": "STANDARD_IA"
        },
        {
          "Days": 365,
          "StorageClass": "GLACIER"
        }
      ],
      "Expiration": {
        "Days": 2555
      }
    }
  ]
}
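
Applied to the archive bucket from the Logstash output with the AWS CLI (the local file name is illustrative):

# Apply the lifecycle rules to the archive bucket
aws s3api put-bucket-lifecycle-configuration \
  --bucket audiolab-logs-archive \
  --lifecycle-configuration file://lifecycle.json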

🚀 Deployment Checklist

Development

  • Local file logging configured
  • Console output enabled
  • DEBUG level active
  • Auto-rotation (7 days)

Staging

  • Centralized logging (Loki/ELK)
  • Log shipper deployed (Promtail/Filebeat)
  • Dashboards configured
  • INFO level minimum
  • 30-day retention

Production

  • High-availability logging cluster
  • Redundant log shippers
  • Automated archival to S3
  • WARNING level minimum (ERROR for hot paths)
  • 90-day hot retention + 1-year archive
  • Alerts configured for log volume spikes
  • PII sanitization verified