
AWS Landscape 15: Monitoring

Amazon CloudWatch gives a single view of a multi-account SAP environment by aggregating metrics, logs, and events from EC2 instances, SAP HANA, application servers, Transit Gateway, and Palo Alto firewalls. Centralized dashboards and automated alarms support proactive operations, compliance reporting, and multi-region deployments.

Key Benefits
End-to-end monitoring of infrastructure + SAP applications.
Cross-account dashboards reduce operational complexity.
Automated remediation reduces downtime.
SAP-specific metrics alongside AWS infrastructure data.

Objective
Enable continuous monitoring across all SAP tiers and AWS resources with:
Centralized dashboards for performance/security visibility.
Alarms with automated response workflows.
SAP HANA + NetWeaver integration with CloudWatch Agent.

Monitoring Scope
Resource     Metrics                            Logs
EC2          CPU, Memory, Disk I/O, Network     /var/log/messages, CloudWatch Agent
SAP HANA     DB size, connections, alerts       nameserver_trace, indexserver_trace
SAP App      Dialog response time, users        dev_w*, workprocess logs
Network      TGW bytes/packets, Flow Logs       VPC Flow Logs
Firewalls    Threat logs, sessions              Palo Alto syslog

Technical Implementation Steps

Step 1: Deploy CloudWatch Agent
Install on all EC2 instances (SAP HANA, App Servers, ASCS):
Linux:
# Download and install (the -o file name must match the one passed to rpm)
curl https://s3.amazonaws.com/amazoncloudwatch-agent/amazon_linux/amd64/latest/amazon-cloudwatch-agent.rpm -o amazon-cloudwatch-agent.rpm
sudo rpm -U ./amazon-cloudwatch-agent.rpm

# Configure for SAP
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard
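Config JSON (/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json). The agent's section names are lowercase (mem, disk, procstat), disk resources are mount points (/ is assumed here to be where /dev/xvda1 is mounted), and each procstat entry watches one process pattern:
{
  "metrics": {
    "metrics_collected": {
      "mem": {"measurement": ["mem_used_percent"]},
      "disk": {"measurement": ["free"], "resources": ["/"]},
      "procstat": [
        {"exe": "hdb", "measurement": ["cpu_usage", "memory_rss"]},
        {"exe": "sapstartsrv", "measurement": ["cpu_usage", "memory_rss"]}
      ]
    }
  },
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {"file_path": "/var/log/sap*", "log_group_name": "SAP-Application"},
          {"file_path": "/hana/shared/*/HDB*/trace/*", "log_group_name": "SAP-HANA"}
        ]
      }
    }
  }
}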
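When the same agent configuration has to be rolled out to many SAP hosts, it can help to generate the JSON instead of editing it by hand. The following is a minimal sketch, assuming the agent's lowercase section names (mem, disk, procstat) and a root mount point of /; the function name build_agent_config and the chosen procstat measurements are illustrative, not part of the original setup.

```python
import json


def build_agent_config(processes, log_files):
    """Return a CloudWatch Agent config dict.

    processes: process-name patterns to watch with procstat.
    log_files: mapping of file path glob -> CloudWatch log group name.
    """
    return {
        "metrics": {
            "metrics_collected": {
                "mem": {"measurement": ["mem_used_percent"]},
                # Assumption: "/" is the mount point that should be tracked.
                "disk": {"measurement": ["free"], "resources": ["/"]},
                "procstat": [
                    {"exe": name, "measurement": ["cpu_usage", "memory_rss"]}
                    for name in processes
                ],
            }
        },
        "logs": {
            "logs_collected": {
                "files": {
                    "collect_list": [
                        {"file_path": path, "log_group_name": group}
                        for path, group in log_files.items()
                    ]
                }
            }
        },
    }


config = build_agent_config(
    processes=["hdb", "sapstartsrv"],
    log_files={
        "/var/log/sap*": "SAP-Application",
        "/hana/shared/*/HDB*/trace/*": "SAP-HANA",
    },
)
config_json = json.dumps(config, indent=2)
print(config_json)
```

The printed JSON can be written to the agent's config path and loaded with the agent's usual start procedure.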

Step 2: Enable Detailed Monitoring
EC2 Console → Instance → Actions → Monitor and troubleshoot → Manage detailed monitoring (1-min granularity).

Step 3: Create Cross-Account Dashboard
CloudWatch → Dashboards → Create dashboard:

Widgets:
1. EC2 CPU/Memory (Prod Account) - Line chart
2. SAP HANA Connections - Stacked area
3. TGW Traffic (bytes/packets) - Dual axis
4. Palo Alto Threat Events - Bar chart
5. Recent CloudTrail Events - Table
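The widget list above can also be expressed as a dashboard body and created through the API rather than the console. The sketch below builds the JSON for two of the widgets (EC2 CPU and TGW traffic on a dual axis). The account ID, instance ID, Transit Gateway ID, and region are placeholders, and the accountId rendering option assumes cross-account observability is already enabled.

```python
import json

PROD_ACCOUNT = "111111111111"        # placeholder account ID
INSTANCE_ID = "i-1234567890abcdef0"  # placeholder instance ID
TGW_ID = "tgw-0123456789abcdef0"     # placeholder Transit Gateway ID


def metric_widget(title, metrics, x, y, stacked=False,
                  region="us-east-1", width=12, height=6):
    """Return one metric widget on the dashboard's 24-column grid."""
    return {
        "type": "metric",
        "x": x, "y": y, "width": width, "height": height,
        "properties": {
            "title": title,
            "metrics": metrics,
            "view": "timeSeries",
            "stacked": stacked,
            "region": region,
            "stat": "Average",
            "period": 60,
        },
    }


widgets = [
    metric_widget(
        "EC2 CPU (Prod Account)",
        [["AWS/EC2", "CPUUtilization", "InstanceId", INSTANCE_ID,
          {"accountId": PROD_ACCOUNT}]],
        x=0, y=0,
    ),
    metric_widget(
        "TGW Traffic (bytes/packets)",
        [
            ["AWS/TransitGateway", "BytesIn", "TransitGateway", TGW_ID],
            # Packets go on the right-hand axis to get a dual-axis chart.
            ["AWS/TransitGateway", "PacketsIn", "TransitGateway", TGW_ID,
             {"yAxis": "right"}],
        ],
        x=12, y=0,
    ),
]

dashboard_body = json.dumps({"widgets": widgets}, indent=2)
print(dashboard_body)
# With boto3 this string would be passed as
# cloudwatch.put_dashboard(DashboardName="SAP-Landscape", DashboardBody=dashboard_body)
# where "SAP-Landscape" is a hypothetical dashboard name.
```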

Step 4: Critical Alarms (CloudWatch → Alarms)
Metric                 Threshold          Action
CPUUtilization         >80% (5/5 mins)    SNS → PagerDuty
MemoryUtilization      >85%               Scale ASG + Notify
DiskReadOps            >1000 (spikes)     Lambda → Add EBS volume
HANA_DB_Size           >90% capacity      Auto snapshot + alert
TGW_Packets_Dropped    >100/sec           Network team alert

CLI Example:
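# Alarm when 5 of 5 one-minute datapoints exceed 80% (requires detailed monitoring).
# Replace region and account in the SNS topic ARN with real values.
aws cloudwatch put-metric-alarm \
  --alarm-name SAP-CPU-High \
  --metric-name CPUUtilization \
  --namespace AWS/EC2 \
  --statistic Average \
  --period 60 \
  --evaluation-periods 5 \
  --datapoints-to-alarm 5 \
  --threshold 80 \
  --comparison-operator GreaterThanThreshold \
  --dimensions Name=InstanceId,Value=i-1234567890abcdef0 \
  --alarm-actions arn:aws:sns:region:account:CriticalAlerts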
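The "5/5 mins" notation in the alarm table means that 5 out of the last 5 datapoints must breach the threshold. The sketch below is a simplified model of that "M out of N" evaluation, useful for reasoning about thresholds before creating alarms. It deliberately ignores CloudWatch's missing-data handling, and the function name alarm_state is illustrative.

```python
def alarm_state(datapoints, threshold, evaluation_periods, datapoints_to_alarm):
    """Evaluate the most recent window of datapoints.

    Simplified model: the alarm fires when at least `datapoints_to_alarm`
    of the last `evaluation_periods` values are strictly above `threshold`
    (the GreaterThanThreshold comparison).
    """
    window = datapoints[-evaluation_periods:]
    if len(window) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    breaching = sum(1 for value in window if value > threshold)
    return "ALARM" if breaching >= datapoints_to_alarm else "OK"


cpu = [70, 82, 85, 90, 88, 91]  # one value per minute, newest last
print(alarm_state(cpu, threshold=80, evaluation_periods=5, datapoints_to_alarm=5))
```

Lowering datapoints_to_alarm (for example 4 out of 5) makes the alarm tolerant of a single dip below the threshold, at the cost of firing on shorter bursts.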

Step 5: EventBridge Automation
Event Pattern:
{
  "source": ["aws.ec2"],
  "detail-type": ["EC2 Instance State-change Notification"],
  "detail": {"state": ["stopped"]}
}

Targets:
- Lambda: Auto-restart SAP ASCS
- SNS: Notify on-call engineer
- SSM: Run diagnostic playbook
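A Lambda target for this rule mainly has to read the instance ID and state from the event and decide what to do. The sketch below shows that decision step as a pure function so it can be tested without AWS access. The action names and the set of ASCS instance IDs are hypothetical; the event fields (source, detail.instance-id, detail.state) follow the EC2 state-change notification format.

```python
ASCS_INSTANCE_IDS = {"i-1234567890abcdef0"}  # hypothetical ASCS hosts


def plan_actions(event, ascs_instance_ids=ASCS_INSTANCE_IDS):
    """Return the ordered list of follow-up actions for an EC2 state change."""
    detail = event.get("detail", {})
    if event.get("source") != "aws.ec2" or detail.get("state") != "stopped":
        return []  # not an event this workflow handles
    actions = ["notify-on-call", "run-diagnostics"]
    if detail.get("instance-id") in ascs_instance_ids:
        # Only ASCS hosts are restarted automatically.
        actions.insert(0, "restart-ascs")
    return actions


def lambda_handler(event, context):
    """Lambda entry point: report the planned actions (dispatch omitted)."""
    return {"actions": plan_actions(event)}


sample_event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-1234567890abcdef0", "state": "stopped"},
}
print(lambda_handler(sample_event, None))
```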

Step 6: VPC Flow Logs + Firewall Logs
VPC Flow Logs → CloudWatch Logs → Metric Filters
Filter: REJECT → Alarm on blocked SAP traffic
Palo Alto → rsyslog → CloudWatch Logs
/syslog → Log Group: PaloAlto-Threats
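To check what a REJECT metric filter would count, it can be useful to parse a few flow log records locally first. The sketch below assumes the default (version 2) flow log format and treats ports 3200-3399, 30013, and 30015 as "SAP traffic"; that port set is an assumption for illustration and should be replaced with the ports your SAP instance numbers actually use. FILTER_PATTERN is the space-delimited pattern form that CloudWatch Logs metric filters accept for this format.

```python
# Default (version 2) VPC Flow Log fields, in order.
FIELDS = ("version account_id interface_id srcaddr dstaddr srcport dstport "
          "protocol packets bytes start end action log_status").split()

# Metric filter pattern that matches rejected flows in the default format.
FILTER_PATTERN = ('[version, account, eni, source, destination, srcport, '
                  'destport, protocol, packets, bytes, windowstart, windowend, '
                  'action="REJECT", flowlogstatus]')

# Assumption: dispatcher/gateway ports 32NN/33NN and HANA SQL ports for instance 00.
SAP_PORTS = set(range(3200, 3400)) | {30013, 30015}


def parse_flow_record(line):
    """Split one flow log line into a dict, or return None if malformed."""
    values = line.split()
    if len(values) != len(FIELDS):
        return None
    return dict(zip(FIELDS, values))


def rejected_sap_flows(lines):
    """Return parsed records that were rejected on an SAP destination port."""
    hits = []
    for line in lines:
        record = parse_flow_record(line)
        if record is None or record["action"] != "REJECT":
            continue
        port = record["dstport"]
        if port.isdigit() and int(port) in SAP_PORTS:
            hits.append(record)
    return hits


sample = [
    "2 123456789012 eni-0a1b2c3d 10.0.1.5 10.0.2.10 49152 3200 6 5 300 1700000000 1700000060 REJECT OK",
    "2 123456789012 eni-0a1b2c3d 10.0.1.5 10.0.2.10 49153 443 6 5 300 1700000000 1700000060 REJECT OK",
    "2 123456789012 eni-0a1b2c3d 10.0.1.6 10.0.2.10 49154 30015 6 9 540 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0a1b2c3d - - - - - - - 1700000000 1700000060 - NODATA",
]
blocked = rejected_sap_flows(sample)
print(len(blocked), [r["dstport"] for r in blocked])
```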

Monitoring Architecture

SAP-Specific Metrics (via CloudWatch Agent):
HANA: hdbcons 'systeminformation' → Custom metrics
App: SM50 work process utilization → Log parsing
Solution Manager: EWA reports → CloudWatch Logs Insights
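Values collected from HANA have to be turned into CloudWatch custom metrics before they can be graphed or alarmed on. The sketch below builds a MetricData list in the shape that the PutMetricData API expects. The metric names, the SID dimension, and the Custom/SAPHANA namespace are illustrative assumptions, and how the readings are obtained from HANA is left out.

```python
def hana_metric_data(sid, readings):
    """Build a PutMetricData-style list from HANA readings.

    sid: SAP system ID used as the metric dimension.
    readings: mapping of metric name -> (value, CloudWatch unit).
    """
    return [
        {
            "MetricName": name,
            "Dimensions": [{"Name": "SID", "Value": sid}],
            "Value": float(value),
            "Unit": unit,
        }
        for name, (value, unit) in sorted(readings.items())
    ]


payload = hana_metric_data(
    "HDB",  # hypothetical SID
    {
        "ActiveConnections": (42, "Count"),
        "DataVolumeUsedPercent": (73.5, "Percent"),
    },
)
for entry in payload:
    print(entry)
# With boto3 the list would be sent as
# cloudwatch.put_metric_data(Namespace="Custom/SAPHANA", MetricData=payload)
```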

Cross-Account Setup (Organizations):
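Management Account: Configure it as the CloudWatch monitoring account by creating an Observability Access Manager (OAM) sink.
Member Accounts: Enable cross-account observability by linking each account to the sink.
Organizations: Scope the sink policy to the organization so that accounts in every OU can link.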
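CloudWatch cross-account observability is set up through Observability Access Manager (OAM): the monitoring account owns a sink, and a sink policy decides which accounts may link to it. The sketch below builds such a policy scoped to one AWS Organization. The organization ID is a placeholder, and the exact condition keys should be checked against the current OAM documentation before use.

```python
import json


def sink_policy(org_id, resource_types=("AWS::CloudWatch::Metric",
                                        "AWS::Logs::LogGroup")):
    """Return a sink policy allowing accounts in `org_id` to create links."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": "*",
                "Action": ["oam:CreateLink", "oam:UpdateLink"],
                "Resource": "*",
                "Condition": {
                    # Limit what linked accounts may share with the sink.
                    "ForAllValues:StringEquals": {
                        "oam:ResourceTypes": list(resource_types)
                    },
                    # Limit who may link: only principals in this organization.
                    "ForAnyValue:StringEquals": {"aws:PrincipalOrgID": org_id},
                },
            }
        ],
    }


policy_json = json.dumps(sink_policy("o-exampleorgid"), indent=2)  # placeholder org ID
print(policy_json)
# With boto3 this would be applied in the monitoring account via
# oam.put_sink_policy(SinkIdentifier=<sink ARN>, Policy=policy_json)
```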

Retention & Compliance:
Prod Logs: 400 days → S3 Glacier
Dev Logs: 30 days → Delete
Audit Logs: 7 years → S3 + Athena
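CloudWatch Logs only accepts a fixed set of retention values, so "7 years" cannot be set as an arbitrary day count. The sketch below maps each tier above to the nearest allowed value that is at least as long as requested. The log group names are hypothetical, and the archive step to S3 or Glacier is a separate export that this snippet does not cover.

```python
# Retention values (in days) accepted by the CloudWatch Logs PutRetentionPolicy API.
ALLOWED_RETENTION_DAYS = [
    1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545,
    731, 1096, 1827, 2192, 2557, 2922, 3288, 3653,
]


def retention_days(requested_days):
    """Return the smallest allowed retention that covers `requested_days`."""
    for days in ALLOWED_RETENTION_DAYS:
        if days >= requested_days:
            return days
    raise ValueError(f"{requested_days} days exceeds the maximum retention")


# Hypothetical log group names mapped to the retention tiers above.
REQUESTED = {
    "/sap/prod": 400,
    "/sap/dev": 30,
    "/sap/audit": 7 * 365,
}

retention_plan = {group: retention_days(days) for group, days in REQUESTED.items()}
print(retention_plan)
# With boto3 each entry would be applied as
# logs.put_retention_policy(logGroupName=group, retentionInDays=days)
```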

Validation Dashboard Queries:
# CloudWatch Logs Insights - SAP errors per 5-minute window, busiest first
fields @timestamp, @message
| filter @message like /ERROR|FAIL/
| stats count(*) as error_count by bin(5m)
| sort error_count desc
