November 2023 · 10 min read

Advanced AWS CloudWatch Monitoring Techniques

Monitoring, AWS, CloudWatch

Introduction

Effective monitoring is crucial for maintaining healthy cloud infrastructure. This guide covers advanced CloudWatch techniques for monitoring and alerting in AWS environments: custom metrics, metric filters, dashboards, alarms, Logs Insights queries, Container Insights, and metric math.

Custom Metrics

1. Creating Custom Metrics

# Using AWS CLI
aws cloudwatch put-metric-data \
  --namespace "CustomMetrics" \
  --metric-name "ApplicationErrors" \
  --value 5 \
  --unit Count \
  --timestamp "$(date -u +"%Y-%m-%dT%H:%M:%SZ")" \
  --dimensions Service=API,Environment=Production

# Using AWS SDK (Python)
import boto3

cloudwatch = boto3.client('cloudwatch')

response = cloudwatch.put_metric_data(
    Namespace='CustomMetrics',
    MetricData=[
        {
            'MetricName': 'ApplicationErrors',
            'Value': 5,
            'Unit': 'Count',
            'Dimensions': [
                {
                    'Name': 'Service',
                    'Value': 'API'
                },
                {
                    'Name': 'Environment',
                    'Value': 'Production'
                }
            ]
        }
    ]
)

2. Metric Filters

# CloudWatch Log Metric Filter
{
  "filterPattern": "[timestamp, requestid, field3, field4, field5, field6, statuscode=5*, size, ...]",
  "metricValue": "1",
  "metricName": "5xxErrorCount",
  "metricNamespace": "API/Errors"
}
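
The filter definition above can also be created from code. The following is a minimal sketch, assuming boto3 is installed and credentials are configured; the log group name and filter name are hypothetical placeholders, and the `defaultValue` of 0 is an added assumption so that the metric reports zero when nothing matches.

```python
def build_metric_filter_request(log_group_name, filter_name, pattern):
    """Return keyword arguments for the CloudWatch Logs put_metric_filter call."""
    return {
        "logGroupName": log_group_name,
        "filterName": filter_name,
        "filterPattern": pattern,
        "metricTransformations": [
            {
                "metricName": "5xxErrorCount",
                "metricNamespace": "API/Errors",
                "metricValue": "1",
                # Assumption: emit 0 when no log event matches, to avoid gaps.
                "defaultValue": 0.0,
            }
        ],
    }


def create_metric_filter(request):
    """Send the request to CloudWatch Logs (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    logs = boto3.client("logs")
    return logs.put_metric_filter(**request)


request = build_metric_filter_request(
    log_group_name="/example/api/access",  # hypothetical log group
    filter_name="api-5xx-errors",          # hypothetical filter name
    pattern="[timestamp, requestid, field3, field4, field5, field6, statuscode=5*, size, ...]",
)
print(request["metricTransformations"][0]["metricName"])
# create_metric_filter(request)  # uncomment to apply against a real account
```

Keeping the request builder separate from the API call makes it possible to review or unit test the filter definition without touching AWS.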

CloudWatch Dashboards

1. Dashboard Creation

{
    "widgets": [
        {
            "type": "metric",
            "properties": {
                "metrics": [
                    [ "AWS/EC2", "CPUUtilization", "InstanceId", "i-1234567890" ]
                ],
                "period": 300,
                "stat": "Average",
                "region": "us-west-2",
                "title": "EC2 CPU Utilization"
            }
        },
        {
            "type": "metric",
            "properties": {
                "metrics": [
                    [ "AWS/RDS", "FreeStorageSpace", "DBInstanceIdentifier", "database-1" ]
                ],
                "period": 300,
                "stat": "Average",
                "region": "us-west-2",
                "title": "RDS Free Storage Space"
            }
        }
    ]
}
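
A dashboard body like the one above is usually published with the `put_dashboard` API. The sketch below generates the same two widgets from a helper and serializes them; the dashboard name is a hypothetical placeholder, and the publishing step assumes boto3 and valid credentials.

```python
import json


def metric_widget(title, namespace, metric, dimension, value, region="us-west-2"):
    """Build one metric widget in the dashboard body format."""
    return {
        "type": "metric",
        "properties": {
            "metrics": [[namespace, metric, dimension, value]],
            "period": 300,
            "stat": "Average",
            "region": region,
            "title": title,
        },
    }


def build_dashboard_body(widgets):
    """Serialize widgets to the JSON string expected as DashboardBody."""
    return json.dumps({"widgets": widgets})


def publish_dashboard(name, body):
    """Create or overwrite the dashboard (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builders work without boto3

    cloudwatch = boto3.client("cloudwatch")
    return cloudwatch.put_dashboard(DashboardName=name, DashboardBody=body)


body = build_dashboard_body([
    metric_widget("EC2 CPU Utilization", "AWS/EC2", "CPUUtilization",
                  "InstanceId", "i-1234567890"),
    metric_widget("RDS Free Storage Space", "AWS/RDS", "FreeStorageSpace",
                  "DBInstanceIdentifier", "database-1"),
])
print(body)
# publish_dashboard("example-infrastructure", body)  # hypothetical dashboard name
```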

Alerting Strategies

1. CloudWatch Alarms

# Using AWS CLI
aws cloudwatch put-metric-alarm \
  --alarm-name "HighCPUUtilization" \
  --alarm-description "CPU usage exceeds 80%" \
  --metric-name "CPUUtilization" \
  --namespace "AWS/EC2" \
  --dimensions Name=InstanceId,Value=i-1234567890 \
  --statistic "Average" \
  --period 300 \
  --threshold 80 \
  --comparison-operator "GreaterThanThreshold" \
  --evaluation-periods 2 \
  --alarm-actions "arn:aws:sns:region:account-id:topic-name"
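
To make the effect of `--period 300` and `--evaluation-periods 2` concrete, the following sketch models the alarm decision locally. It is a deliberate simplification, not CloudWatch's actual evaluation logic: it assumes every period has a datapoint and ignores missing-data treatment and "M out of N" settings. The function name `alarm_state` is hypothetical.

```python
def alarm_state(datapoints, threshold, evaluation_periods):
    """Simplified model of a GreaterThanThreshold alarm.

    datapoints: per-period averages, oldest first (one value per 300 s period).
    Returns "ALARM" only when the most recent `evaluation_periods` values
    all exceed the threshold.
    """
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-evaluation_periods:]
    if all(value > threshold for value in recent):
        return "ALARM"
    return "OK"


# Two consecutive 5-minute averages above 80% trigger the alarm.
print(alarm_state([62.0, 85.0, 91.0], threshold=80, evaluation_periods=2))  # ALARM
# A single spike followed by recovery does not.
print(alarm_state([85.0, 91.0, 70.0], threshold=80, evaluation_periods=2))  # OK
```

With these settings the CPU must stay above the threshold for roughly ten minutes before the alarm fires, which filters out brief spikes.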

2. Composite Alarms

# LowFreeMemory is assumed to be an existing alarm
aws cloudwatch put-composite-alarm \
  --alarm-name "SystemHealthAlarm" \
  --alarm-rule "ALARM(HighCPUUtilization) AND ALARM(LowFreeMemory)" \
  --alarm-description "System health composite alarm" \
  --alarm-actions "arn:aws:sns:region:account-id:topic-name"

Logs Insights

1. Query Examples

# Error Analysis Query
fields @timestamp, @message
| filter @message like /Error/
| stats count(*) as errorCount by bin(30m)
| sort errorCount desc

# Latency Analysis
fields @timestamp, @message
| filter @message like /Duration/
| parse @message /Duration: (?<duration>.*?) ms/
| stats avg(duration) as avgLatency by bin(5m)
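
Queries like these can also be run programmatically. The sketch below uses the `start_query` and `get_query_results` calls of the CloudWatch Logs API and polls until the query finishes; it assumes boto3 and valid credentials, and the helper names are hypothetical. The `rows_to_dicts` helper flattens the result format, where each row is a list of field/value pairs.

```python
import time

ERROR_QUERY = """
fields @timestamp, @message
| filter @message like /Error/
| stats count(*) as errorCount by bin(30m)
| sort errorCount desc
"""


def rows_to_dicts(results):
    """Convert Logs Insights rows ([{field, value}, ...]) to plain dicts."""
    return [{cell["field"]: cell["value"] for cell in row} for row in results]


def run_query(log_group_name, query, start_time, end_time, poll_seconds=1.0):
    """Run a query and wait for it to finish (needs boto3 and AWS credentials).

    start_time and end_time are epoch seconds.
    """
    import boto3  # imported lazily so rows_to_dicts works without boto3

    logs = boto3.client("logs")
    query_id = logs.start_query(
        logGroupName=log_group_name,
        startTime=start_time,
        endTime=end_time,
        queryString=query,
    )["queryId"]
    while True:
        response = logs.get_query_results(queryId=query_id)
        if response["status"] not in ("Scheduled", "Running"):
            break
        time.sleep(poll_seconds)
    if response["status"] != "Complete":
        raise RuntimeError("query ended with status " + response["status"])
    return rows_to_dicts(response["results"])


# Illustrative row in the shape returned by get_query_results (not real data).
sample = [[
    {"field": "bin(30m)", "value": "2023-11-01 10:00:00.000"},
    {"field": "errorCount", "value": "12"},
]]
print(rows_to_dicts(sample))
```

Note that values come back as strings, so numeric fields such as `errorCount` need converting before any arithmetic.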

Container Monitoring

1. ECS Container Insights

# Enable Container Insights
aws ecs update-cluster-settings   --cluster your-cluster-name   --settings name=containerInsights,value=enabled

# CloudWatch Agent Configuration
{
  "logs": {
    "metrics_collected": {
      "ecs": {
        "metrics_collection_interval": 60
      }
    }
  }
}

Performance Optimization

1. Metric Math

{
  "metrics": [
    [ "AWS/EC2", "CPUUtilization", "InstanceId", "i-1234567890", { "id": "m1" } ],
    [ "AWS/EC2", "NetworkIn", "InstanceId", "i-1234567890", { "id": "m2" } ],
    [ "AWS/EC2", "NetworkOut", "InstanceId", "i-1234567890", { "id": "m3" } ],
    [ {
      "expression": "m2+m3",
      "label": "Total Network Traffic",
      "id": "e1"
    } ]
  ]
}
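
Metric math is not limited to dashboards; the `get_metric_data` API evaluates expressions as well. The following sketch sums inbound and outbound network traffic for one instance. It assumes boto3 and valid credentials; the query ids (`net_in`, `net_out`, `total`) and the choice of the `Sum` statistic are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone


def build_network_total_queries(instance_id, period=300):
    """Build MetricDataQueries that add NetworkIn and NetworkOut."""

    def metric_query(query_id, metric_name):
        return {
            "Id": query_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/EC2",
                    "MetricName": metric_name,
                    "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
                },
                "Period": period,
                "Stat": "Sum",
            },
            # Inputs are only used by the expression, so do not return them.
            "ReturnData": False,
        }

    return [
        metric_query("net_in", "NetworkIn"),
        metric_query("net_out", "NetworkOut"),
        {
            "Id": "total",
            "Expression": "net_in + net_out",
            "Label": "Total Network Traffic",
        },
    ]


def fetch_network_total(instance_id, hours=1):
    """Fetch the summed series (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder works without boto3

    cloudwatch = boto3.client("cloudwatch")
    end = datetime.now(timezone.utc)
    return cloudwatch.get_metric_data(
        MetricDataQueries=build_network_total_queries(instance_id),
        StartTime=end - timedelta(hours=hours),
        EndTime=end,
    )


queries = build_network_total_queries("i-1234567890")
print([query["Id"] for query in queries])
```

Giving each input an explicit id keeps the expression readable and avoids depending on the order in which metrics are listed.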

Best Practices Checklist

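  • Set metric collection intervals to match how quickly you need to detect a problem
  • Use metric math to derive totals, rates, and ratios from existing metrics
  • Base alarm thresholds on observed baselines, and require more than one evaluation period so brief spikes do not trigger alerts
  • Make alerts actionable: route every alarm to a notification target or an owner
  • Keep dashboards organized, for example by service or environment
  • Set a retention period on every log group; by default, logs are kept indefinitely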

Cost Optimization

Tips for optimizing CloudWatch costs:

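  • Use standard-resolution metrics unless you need sub-minute granularity
  • Set log retention policies so old log data is not stored indefinitely
  • Delete unused dashboards and alarms
  • Use metric filters to derive metrics from logs you already collect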
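
Log retention is one of the easier items to automate. The sketch below finds log groups that have no retention period and applies one; it assumes boto3 and valid credentials, and the 30-day value and helper names are illustrative assumptions rather than a recommendation.

```python
def groups_without_retention(log_groups):
    """Return names of log groups that have no retentionInDays set.

    log_groups: entries in the shape returned by describe_log_groups.
    """
    return [
        group["logGroupName"]
        for group in log_groups
        if "retentionInDays" not in group
    ]


def apply_default_retention(retention_days=30):
    """Set a retention period on every log group that lacks one.

    Needs boto3 and AWS credentials.
    """
    import boto3  # imported lazily so the filter above works without boto3

    logs = boto3.client("logs")
    paginator = logs.get_paginator("describe_log_groups")
    updated = []
    for page in paginator.paginate():
        for name in groups_without_retention(page["logGroups"]):
            logs.put_retention_policy(
                logGroupName=name, retentionInDays=retention_days
            )
            updated.append(name)
    return updated


# Illustrative input (not real data): only the second group lacks a policy.
sample_groups = [
    {"logGroupName": "/example/api", "retentionInDays": 14},
    {"logGroupName": "/example/worker"},
]
print(groups_without_retention(sample_groups))
# apply_default_retention(30)  # uncomment to apply against a real account
```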

Conclusion

Implementing advanced CloudWatch monitoring techniques helps maintain robust and reliable AWS infrastructure. Reviewing and tuning your metrics, alarms, and dashboards regularly keeps oversight effective and incident response timely.

Author

Sandeep Choudhary

DevOps Engineer with expertise in AWS, Kubernetes, and cloud infrastructure.