# 🗄️ Log Aggregation Strategy

## 📍 Where Logs Go

| Environment | Destination              | Retention        |
|-------------|--------------------------|------------------|
| Development | Local files              | 7 days           |
| Staging     | ELK stack (centralized)  | 30 days          |
| Production  | ELK + S3 archive         | 90 days + 1 year |
### Development

Storage:

- Local filesystem: `./logs/`
- Console output for immediate feedback

Rotation:

- Daily rotation
- 7-day retention
- Auto-cleanup on CI/CD builds

Access:

- Direct file access
- `tail -f logs/audiolab.log`
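For quick triage on a developer machine, filtering the local file by level is usually enough. A minimal sketch, assuming the `timestamp [LEVEL] logger: message` line format that the shipping configurations below parse:

```bash
# Show only warnings and errors from the current local log file
grep -E '\[(WARN|WARNING|ERROR)\]' logs/audiolab.log

# Follow the log live and keep only error lines
tail -f logs/audiolab.log | grep --line-buffered '\[ERROR\]'
```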
### Staging

Storage:

- Centralized ELK stack
- Daily backup to S3

Rotation:

- 30 days in Elasticsearch (hot storage)
- Archived to S3 after 30 days
- Compressed archives

Access:

- Kibana dashboards
- S3 for historical analysis
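A minimal sketch of the daily S3 backup, assuming a cron-capable host; the bucket name is reused from the archival pipeline later in this document, and the `staging/` prefix is an assumption:

```bash
# /etc/cron.d/audiolab-log-backup: nightly sync of rotated, compressed logs at 02:00
0 2 * * * root /usr/local/bin/aws s3 sync /var/log/audiolab/ s3://audiolab-logs-archive/staging/ --exclude "*" --include "*.gz"
```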
### Production

Storage:

- Primary: ELK stack (hot data)
- Secondary: S3 Glacier (cold archive)

Retention Policy:

- 0-90 days: Elasticsearch (searchable, fast)
- 90-365 days: S3 Standard-IA (archived, slower)
- 365+ days: S3 Glacier (compliance, very slow)

Access:

- Kibana for recent logs
- S3 console for archives
- Automated compliance exports
## 🏗️ Stack Options

### Option 1: ELK Stack (Recommended for Production)

Components:

- Elasticsearch: log storage and indexing
- Logstash: log parsing and transformation
- Kibana: visualization and search UI
- Filebeat: log shipping agent

Pros:

- ✅ Free and source-available (Elastic License 2.0/SSPL; releases before 7.11 were Apache 2.0)
- ✅ Powerful full-text search
- ✅ Rich visualization (Kibana dashboards)
- ✅ Scalable (proven at petabyte scale)
- ✅ Large ecosystem (integrations, plugins)

Cons:

- ⚠️ Resource intensive (RAM hungry)
- ⚠️ Complex setup and tuning
- ⚠️ Requires dedicated infrastructure
- ⚠️ Can be overkill for small deployments

Best for:

- Production environments
- High log volume (>100 GB/day)
- Multiple services/components
- Advanced analytics requirements
Resource Requirements:

Small deployment (<10 GB/day):

- Elasticsearch: 4 GB RAM, 2 CPU
- Logstash: 2 GB RAM, 1 CPU
- Kibana: 1 GB RAM, 1 CPU

Medium deployment (10-100 GB/day):

- Elasticsearch: 16 GB RAM, 4 CPU
- Logstash: 4 GB RAM, 2 CPU
- Kibana: 2 GB RAM, 1 CPU
Docker Compose Example:

```yaml
version: '3.8'

services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.11.0
    environment:
      - discovery.type=single-node
      - "ES_JAVA_OPTS=-Xms2g -Xmx2g"
    ports:
      - "9200:9200"
    volumes:
      - elasticsearch-data:/usr/share/elasticsearch/data

  logstash:
    image: docker.elastic.co/logstash/logstash:8.11.0
    volumes:
      - ./logstash.conf:/usr/share/logstash/pipeline/logstash.conf
    depends_on:
      - elasticsearch

  kibana:
    image: docker.elastic.co/kibana/kibana:8.11.0
    ports:
      - "5601:5601"
    depends_on:
      - elasticsearch

volumes:
  elasticsearch-data:
```
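To bring the stack up and sanity-check it (assuming the file above is saved as `docker-compose.yml`):

```bash
docker compose up -d

# Elasticsearch answers on 9200. Note that 8.x enables security and TLS by
# default, so this may need https://, -k and -u elastic:<password>, or
# xpack.security.enabled=false for a throwaway local stack.
curl -s http://localhost:9200/_cluster/health?pretty

# Kibana is served at http://localhost:5601
```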
### Option 2: Loki + Grafana (Recommended for Staging)

Components:

- Loki: log aggregation engine
- Promtail: log shipping agent
- Grafana: visualization UI

Pros:

- ✅ Lightweight (low resource usage)
- ✅ Integrates with Grafana (unified metrics + logs)
- ✅ Label-based indexing (cheaper than full-text)
- ✅ Faster setup than ELK
- ✅ LogQL query language (similar to PromQL)

Cons:

- ⚠️ Less powerful search than Elasticsearch
- ⚠️ No full-text indexing (scans raw log content at query time)
- ⚠️ Smaller ecosystem
- ⚠️ Younger project (less mature)

Best for:

- Staging environments
- Low-to-medium log volume
- Teams already using Grafana for metrics
- Cost-conscious deployments
Resource Requirements:

Typical deployment:

- Loki: 1 GB RAM, 1 CPU
- Promtail: 256 MB RAM, 0.5 CPU
- Grafana: 512 MB RAM, 1 CPU
Docker Compose Example:

```yaml
version: '3.8'

services:
  loki:
    image: grafana/loki:2.9.0
    ports:
      - "3100:3100"
    volumes:
      - ./loki-config.yaml:/etc/loki/local-config.yaml
      - loki-data:/loki

  promtail:
    image: grafana/promtail:2.9.0
    # The image's default config path is /etc/promtail/config.yml, so point
    # Promtail explicitly at the mounted file.
    command: -config.file=/etc/promtail/config.yaml
    volumes:
      - ./promtail-config.yaml:/etc/promtail/config.yaml
      - /var/log:/var/log
    depends_on:
      - loki

  grafana:
    image: grafana/grafana:10.2.0
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
    depends_on:
      - loki

volumes:
  loki-data:
```
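The Compose file above mounts a `./loki-config.yaml` that is not shown; a minimal single-binary sketch in the spirit of Loki's bundled local config might look like the following (paths, schema date, and store choice are assumptions, not a tuned production setup):

```yaml
# loki-config.yaml: minimal single-node sketch
auth_enabled: false

server:
  http_listen_port: 3100

common:
  path_prefix: /loki            # matches the loki-data volume mount
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

schema_config:
  configs:
    - from: 2023-01-01
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h

# The 30-day staging retention is enforced via the compactor's retention
# settings and limits_config.retention_period; omitted here for brevity.
```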
### Option 3: CloudWatch Logs (AWS)

Components:

- CloudWatch Logs: managed log storage
- CloudWatch Logs Insights: query interface
- CloudWatch Agent: log shipping

Pros:

- ✅ Fully managed (no infrastructure)
- ✅ Native AWS integration (Lambda, EC2, ECS)
- ✅ Pay-as-you-go pricing
- ✅ Integrates with CloudWatch alarms
- ✅ Automatic retention policies

Cons:

- ⚠️ Vendor lock-in (AWS only)
- ⚠️ Can be expensive at scale
- ⚠️ Limited query capabilities compared to Elasticsearch
- ⚠️ Data egress costs

Best for:

- AWS-native deployments
- Serverless architectures
- Small teams (no ops overhead)
- Short-term log retention
Pricing (approximate):
### Option 4: Self-Hosted File Rotation (Development)

Components:

- Log4cpp / Serilog file sinks
- logrotate (Linux) or a scheduled task (Windows)

Pros:

- ✅ Zero external dependencies
- ✅ Simple setup
- ✅ Fast for local development
- ✅ No network latency

Cons:

- ⚠️ No centralization (can't search across machines)
- ⚠️ Manual access required
- ⚠️ No visualizations
- ⚠️ Limited analysis capabilities

Best for:

- Local development
- Debugging single components
- CI/CD ephemeral environments
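For the logrotate route, a minimal policy matching the 7-day development retention could look like this (the log path is an assumption; adjust it to the actual install location):

```
# /etc/logrotate.d/audiolab
/var/log/audiolab/*.log {
    daily
    rotate 7
    compress
    delaycompress
    missingok
    notifempty
    copytruncate     # avoids needing to signal the writing process
}
```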
## 📦 Shipping Logs

### Filebeat Configuration

```yaml
# filebeat.yml
filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - /var/log/audiolab/*.log
    fields:
      app: audiolab
      env: production
      component: core
    fields_under_root: true

  - type: log
    enabled: true
    paths:
      - /var/log/audiolab/errors/*.log
    fields:
      app: audiolab
      env: production
      severity: error
    fields_under_root: true

# Output to Elasticsearch
output.elasticsearch:
  hosts: ["elasticsearch:9200"]
  index: "audiolab-logs-%{+yyyy.MM.dd}"
  username: "elastic"
  password: "${ELASTIC_PASSWORD}"

# A custom index requires a matching template name/pattern, and ILM must be
# disabled or the ILM write alias takes precedence over the index setting.
setup.template.name: "audiolab-logs"
setup.template.pattern: "audiolab-logs-*"
setup.ilm.enabled: false

# Output to Logstash (alternative; only one output may be enabled at a time)
# output.logstash:
#   hosts: ["logstash:5044"]

# Processors
processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~

# Logging
logging.level: info
logging.to_files: true
logging.files:
  path: /var/log/filebeat
  name: filebeat
  keepfiles: 7
  permissions: 0644
```
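Before rolling this out, Filebeat's built-in checks can validate the file and the connection to the configured output:

```bash
# Validate the configuration file syntax
filebeat test config -c /etc/filebeat/filebeat.yml

# Verify connectivity to the configured output (Elasticsearch or Logstash)
filebeat test output -c /etc/filebeat/filebeat.yml
```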
### Promtail Configuration

```yaml
# promtail-config.yaml
server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://loki:3100/loki/api/v1/push

scrape_configs:
  - job_name: audiolab
    static_configs:
      - targets:
          - localhost
        labels:
          job: audiolab
          __path__: /var/log/audiolab/*.log
    pipeline_stages:
      - regex:
          expression: '^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \[(?P<level>\w+)\] (?P<logger>\S+): (?P<message>.*)$'
      - labels:
          level:
          logger:
      - timestamp:
          source: timestamp
          format: '2006-01-02 15:04:05.000'
```
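With the `level` and `logger` labels extracted above, the logs can be queried from Grafana's Explore view with LogQL, for example:

```logql
# All ERROR entries from the audiolab job
{job="audiolab", level="ERROR"}

# Error lines mentioning "timeout", counted in 5-minute buckets
sum(count_over_time({job="audiolab", level="ERROR"} |= "timeout" [5m]))
```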
### CloudWatch Agent Configuration

```json
{
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/var/log/audiolab/*.log",
            "log_group_name": "/audiolab/application",
            "log_stream_name": "{instance_id}",
            "retention_in_days": 90,
            "timezone": "UTC"
          },
          {
            "file_path": "/var/log/audiolab/errors/*.log",
            "log_group_name": "/audiolab/errors",
            "log_stream_name": "{instance_id}",
            "retention_in_days": 365,
            "timezone": "UTC"
          }
        ]
      }
    }
  }
}
```
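Once the agent is shipping, logs can be queried ad hoc with CloudWatch Logs Insights against the `/audiolab/application` log group; a typical error-triage query:

```
fields @timestamp, @message
| filter @message like /ERROR/
| sort @timestamp desc
| limit 50
```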
## 🔍 Log Parsing

### Logstash Pipeline

```
# logstash.conf
input {
  beats {
    port => 5044
  }
}

filter {
  # Parse JSON logs
  if [message] =~ /^{.*}$/ {
    json {
      source => "message"
    }
  }

  # Parse structured plain-text logs
  grok {
    match => {
      "message" => "%{TIMESTAMP_ISO8601:timestamp} \[%{LOGLEVEL:level}\] %{DATA:logger}: %{GREEDYDATA:message}"
    }
    overwrite => ["message"]
  }

  # Convert timestamp (the format above uses a space rather than a "T",
  # so an explicit pattern is listed alongside strict ISO8601)
  date {
    match => ["timestamp", "yyyy-MM-dd HH:mm:ss.SSS", "ISO8601"]
    target => "@timestamp"
  }

  # Add computed fields
  mutate {
    add_field => {
      "ingested_at" => "%{@timestamp}"
    }
  }

  # Drop debug logs in production
  if [env] == "production" and [level] == "DEBUG" {
    drop { }
  }
}

output {
  elasticsearch {
    hosts => ["elasticsearch:9200"]
    index => "audiolab-logs-%{+YYYY.MM.dd}"
    user => "elastic"
    password => "${ELASTIC_PASSWORD}"
  }

  # Also output to S3 for archival
  s3 {
    region => "us-east-1"
    bucket => "audiolab-logs-archive"
    size_file => 104857600   # 100 MB
    time_file => 15          # 15 minutes
    codec => "json_lines"
  }
}
```
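Pipeline changes can be validated before deployment with Logstash's config test flag (run on the Logstash host or inside the container):

```bash
bin/logstash --config.test_and_exit -f logstash.conf
```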
## 📊 Retention & Archival

### Elasticsearch ILM Policy

```json
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_age": "1d",
            "max_size": "50gb"
          }
        }
      },
      "warm": {
        "min_age": "7d",
        "actions": {
          "shrink": {
            "number_of_shards": 1
          },
          "forcemerge": {
            "max_num_segments": 1
          }
        }
      },
      "cold": {
        "min_age": "30d",
        "actions": {
          "freeze": {}
        }
      },
      "delete": {
        "min_age": "90d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
```
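The policy takes effect once it is registered and referenced from an index template; a sketch of the two API calls (the policy and template names are assumptions):

```bash
# Register the ILM policy (body = the JSON document above, saved locally)
curl -u elastic:${ELASTIC_PASSWORD} -X PUT \
  "http://elasticsearch:9200/_ilm/policy/audiolab-logs-policy" \
  -H "Content-Type: application/json" -d @ilm-policy.json

# Attach it to new audiolab-logs-* indices via an index template
curl -u elastic:${ELASTIC_PASSWORD} -X PUT \
  "http://elasticsearch:9200/_index_template/audiolab-logs" \
  -H "Content-Type: application/json" -d '{
    "index_patterns": ["audiolab-logs-*"],
    "template": {
      "settings": {
        "index.lifecycle.name": "audiolab-logs-policy",
        "index.lifecycle.rollover_alias": "audiolab-logs"
      }
    }
  }'
```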
### S3 Lifecycle Policy

```json
{
  "Rules": [
    {
      "Id": "ArchiveOldLogs",
      "Status": "Enabled",
      "Filter": {},
      "Transitions": [
        {
          "Days": 90,
          "StorageClass": "STANDARD_IA"
        },
        {
          "Days": 365,
          "StorageClass": "GLACIER"
        }
      ],
      "Expiration": {
        "Days": 2555
      }
    }
  ]
}
```
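The rule can be applied to the archive bucket used by the Logstash S3 output above (the local file name is an assumption):

```bash
aws s3api put-bucket-lifecycle-configuration \
  --bucket audiolab-logs-archive \
  --lifecycle-configuration file://s3-lifecycle.json
```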
## 🚀 Deployment Checklist

### Development
- Local file logging configured
- Console output enabled
- DEBUG level active
- Auto-rotation (7 days)
### Staging
- Centralized logging (Loki/ELK)
- Log shipper deployed (Promtail/Filebeat)
- Dashboards configured
- INFO level minimum
- 30-day retention
### Production
- High-availability logging cluster
- Redundant log shippers
- Automated archival to S3
- WARNING level minimum (ERROR for hot paths)
- 90-day hot retention + 1-year archive
- Alerts configured for log volume spikes
- PII sanitization verified