Bearse Feature Account Health - Metrics Monitor v0.6.4

October 17, 2025

Introduction

We’re excited to announce the release of Bearse Feature Account Health v0.6.4, which introduces the new Metrics Monitor capability. This release extends our account health monitoring suite by automatically identifying AWS resources that have no CloudWatch alarms configured, so that gaps in monitoring coverage are surfaced before they can hide a problem.

The Metrics Monitor proactively scans your AWS environment to detect resources without alarms configured, helping maintain comprehensive observability and preventing monitoring blind spots that could lead to service disruptions.

Features

Metrics Monitor - Automated Monitoring Gap Detection

The new Metrics Monitor is a monitoring service that automatically identifies AWS resources lacking CloudWatch alarms. It is designed to close monitoring gaps across your infrastructure through the capabilities described below.

Core Functionality

Automated Resource Scanning: Scans AWS resources every hour to identify those without CloudWatch alarms
RDS Instance Monitoring: Currently supports Amazon RDS database instances with plans for expansion to other AWS services
Tag-Based Exclusions: Respects resource-level monitoring preferences through the base2_monitor tag
CloudWatch Integration: Publishes metrics about unmonitored resources directly to CloudWatch
Critical Alerting: Triggers critical alerts when resources without monitoring are detected

Smart Resource Detection

The Metrics Monitor intelligently identifies resources that should be monitored by:

Querying CloudWatch for all available metrics in supported namespaces (currently AWS/RDS)
Cross-referencing metrics with existing alarm configurations
Filtering out resources explicitly tagged for exclusion
Validating resource existence to avoid false positives
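
The four steps above can be sketched as a single filtering pass. This is a minimal illustration rather than the Metrics Monitor's actual code: the function name find_unmonitored and its three callback parameters are hypothetical, standing in for the CloudWatch and RDS lookups.

```python
from typing import Callable, Iterable, List


def find_unmonitored(
    resource_ids: Iterable[str],
    exists: Callable[[str], bool],
    is_excluded: Callable[[str], bool],
    has_alarm: Callable[[str], bool],
) -> List[str]:
    """Return resources that exist, are not excluded, and have no alarm."""
    unmonitored = []
    for resource_id in resource_ids:
        if not exists(resource_id):
            continue  # metric left over from a deleted resource: skip to avoid a false positive
        if is_excluded(resource_id):
            continue  # explicitly tagged for exclusion
        if not has_alarm(resource_id):
            unmonitored.append(resource_id)
    return unmonitored


# In-memory sets stand in for the AWS lookups (hypothetical data).
live = {"db-a", "db-b", "db-c"}
excluded = {"db-b"}
alarmed = {"db-c"}
result = find_unmonitored(
    ["db-a", "db-b", "db-c", "db-deleted"],
    exists=live.__contains__,
    is_excluded=excluded.__contains__,
    has_alarm=alarmed.__contains__,
)
print(result)  # ['db-a']
```

Passing the lookups in as callbacks keeps the filtering logic testable without AWS access; in a real Lambda they would wrap CloudWatch and RDS API calls.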

Configurable Monitoring Scope

Flexible Configuration: Easily enable or disable metrics monitoring through the EnableMetricsMonitoring parameter
Namespace Support: Built with extensible architecture to support additional AWS service namespaces
Dimension Awareness: Monitors specific metric dimensions (e.g., DBInstanceIdentifier for RDS)

Tag-Based Resource Management

Resources can be excluded from monitoring checks using the base2_monitor tag:

Set base2_monitor: false on any RDS instance to exclude it from monitoring requirements
Useful for temporary resources, development environments, or legacy systems with alternative monitoring
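
As a rough sketch of how such an exclusion check could look, the function below inspects a tag list in the Key/Value shape that the RDS ListTagsForResource API returns. The function name is_excluded and the case-insensitive comparison are assumptions made for illustration, not documented behavior of the Metrics Monitor.

```python
from typing import Dict, List

EXCLUSION_TAG = "base2_monitor"


def is_excluded(tag_list: List[Dict[str, str]]) -> bool:
    """Return True when the resource carries base2_monitor set to false."""
    for tag in tag_list:
        if tag.get("Key") == EXCLUSION_TAG:
            # Assumption: ignore case and surrounding whitespace in the value.
            return tag.get("Value", "").strip().lower() == "false"
    return False  # no exclusion tag means the resource must be monitored


print(is_excluded([{"Key": "base2_monitor", "Value": "false"}]))  # True
print(is_excluded([{"Key": "Name", "Value": "prod-db"}]))         # False
```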

Comprehensive Alerting

Critical Severity: Unmonitored resources trigger critical alerts to ensure immediate attention
Actionable Notifications: Alerts include specific resource identifiers and monitoring gaps
CloudWatch Integration: All alerts are delivered through existing CloudWatch alarm infrastructure

Enhanced Infrastructure Configuration

Lambda Function Architecture

Efficient Execution: Lightweight Python-based Lambda function optimized for scanning large environments
Resource Management: Configured with 128 MB of memory and a 900-second timeout (the maximum Lambda allows) so that comprehensive scans can run to completion
Error Handling: Robust error handling with detailed logging for troubleshooting

IAM Security Model

Implements least-privilege access with specific permissions for:

CloudWatch Operations: Read metrics, list alarms, and publish custom metrics
RDS Access: Describe instances and read resource tags
Logging: Standard AWS Lambda execution role permissions

Automated Scheduling

Hourly Execution: Runs every hour using EventBridge (CloudWatch Events) scheduling
Non-Disruptive: Designed to operate without impacting monitored resources
Scalable Architecture: Handles environments with hundreds of resources efficiently

Examples

Enabling Metrics Monitoring

To enable the Metrics Monitor in your Bearse deployment, set the following stack parameter (it defaults to true, so this is only needed if it was previously disabled):

EnableMetricsMonitoring: 'true'

Excluding Resources from Monitoring

To exclude specific RDS instances from monitoring requirements, add the following tag:

Tags:
  - Key: base2_monitor
    Value: 'false'
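
For instances that are not managed through a template, the same tag could be applied with boto3. This is a hedged sketch: the helper name exclude_from_monitoring is hypothetical and the ARN in the usage comment is a placeholder. The client is passed in so the function can be exercised without AWS credentials.

```python
def exclude_from_monitoring(rds_client, db_instance_arn: str) -> None:
    """Tag an RDS instance so the Metrics Monitor skips it."""
    rds_client.add_tags_to_resource(
        ResourceName=db_instance_arn,
        Tags=[{"Key": "base2_monitor", "Value": "false"}],
    )


# Usage (requires boto3 and AWS credentials; the ARN is a placeholder):
#   import boto3
#   exclude_from_monitoring(
#       boto3.client("rds"),
#       "arn:aws:rds:us-east-1:123456789012:db:example-instance",
#   )
```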

CloudWatch Metrics

The Metrics Monitor publishes the following custom metric:

Namespace: MetricsMonitor
MetricName: ResourcesWithoutAlarms
Dimensions:
  - Name: Service
    Value: MetricsMonitor
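
To illustrate what publishing this metric involves, the sketch below builds the arguments for the CloudWatch PutMetricData call from the namespace, metric name, and dimension listed above. The helper names and the Count unit are assumptions; the release notes do not state which unit the Metrics Monitor uses.

```python
from typing import Any, Dict


def build_metric_payload(unmonitored_count: int) -> Dict[str, Any]:
    """Build keyword arguments for cloudwatch.put_metric_data."""
    return {
        "Namespace": "MetricsMonitor",
        "MetricData": [
            {
                "MetricName": "ResourcesWithoutAlarms",
                "Dimensions": [{"Name": "Service", "Value": "MetricsMonitor"}],
                "Value": float(unmonitored_count),
                "Unit": "Count",  # assumption: unit is not specified in the release notes
            }
        ],
    }


def publish(cloudwatch_client, unmonitored_count: int) -> None:
    """Send the count of unmonitored resources to CloudWatch."""
    cloudwatch_client.put_metric_data(**build_metric_payload(unmonitored_count))


payload = build_metric_payload(2)
print(payload["MetricData"][0]["Value"])  # 2.0
```

Publishing a value of 0 when nothing is unmonitored, instead of publishing nothing, would keep the alarm from sitting in an insufficient-data state; whether the Metrics Monitor does this is not stated.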

Example Alert Configuration

The system automatically creates a CloudWatch alarm with the following configuration:

AlarmName: MetricsMonitoringAlarm
AlarmDescription: 'base2 - Resources without alarms detected'
MetricName: ResourcesWithoutAlarms
Threshold: 1
ComparisonOperator: GreaterThanOrEqualToThreshold
EvaluationPeriods: 1
Period: 3600 # 1 hour
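
The alarm is created for you, but for readers who want to see how these settings map onto the CloudWatch API, here is a sketch of equivalent PutMetricAlarm arguments. The Statistic value, the Dimensions, and the helper name build_alarm_kwargs are assumptions not given in the configuration above, and the SNS topic ARN is a caller-supplied placeholder.

```python
from typing import Any, Dict


def build_alarm_kwargs(sns_topic_arn: str) -> Dict[str, Any]:
    """Build keyword arguments for cloudwatch.put_metric_alarm."""
    return {
        "AlarmName": "MetricsMonitoringAlarm",
        "AlarmDescription": "base2 - Resources without alarms detected",
        "Namespace": "MetricsMonitor",
        "MetricName": "ResourcesWithoutAlarms",
        "Dimensions": [{"Name": "Service", "Value": "MetricsMonitor"}],  # assumption
        "Statistic": "Maximum",  # assumption: not stated in the release notes
        "Period": 3600,  # 1 hour, matching the hourly scan
        "EvaluationPeriods": 1,
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [sns_topic_arn],
    }


kwargs = build_alarm_kwargs("arn:aws:sns:us-east-1:123456789012:example-topic")
print(kwargs["ComparisonOperator"])  # GreaterThanOrEqualToThreshold
```

With a threshold of 1 and a single evaluation period, one unmonitored resource in one hourly scan is enough to put the alarm into the ALARM state.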

Sample Lambda Function Output

When the Metrics Monitor detects unmonitored resources, it produces detailed logging:

Starting metrics monitoring check...
Checking metric: CPUUtilization in namespace: AWS/RDS
Found 5 resources to check for CPUUtilization in AWS/RDS with dimension DBInstanceIdentifier
Skipping resource prod-db-backup in AWS/RDS with dimension DBInstanceIdentifier because tag base2_monitor is set to false
Found 2 resources without monitoring alarms:
  - production-primary-db (AWS/RDS/CPUUtilization)
  - staging-replica-db (AWS/RDS/CPUUtilization)

Integration with Existing Monitoring

The Metrics Monitor seamlessly integrates with your existing monitoring infrastructure:

SNS Integration: Critical alerts are sent through your configured SNS topics
Log Aggregation: All monitoring activity is logged to CloudWatch Logs with 365-day retention
Metric Collection: Custom metrics are available for dashboarding and additional alerting

Implementation Details

Supported AWS Services

Amazon RDS: Database instances and clusters
Future Expansion: Architecture designed to easily support additional AWS services

Configuration Parameters

| Parameter | Default | Description |
|---|---|---|
| `EnableMetricsMonitoring` | `true` | Enable/disable the Metrics Monitor functionality |
| `MonitoringNamespace` | `MetricsMonitor` | CloudWatch namespace for custom metrics |

IAM Permissions Required

The Metrics Monitor requires the following AWS permissions:

cloudwatch:
  - PutMetricData
  - GetMetricStatistics
  - ListMetrics
  - DescribeAlarmsForMetric
rds:
  - DescribeDBClusters
  - DescribeDBInstances
  - ListTagsForResource
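
As an illustration, the permissions above can be expanded into an IAM policy document. This is a sketch under assumptions: the single Allow statement and the wildcard Resource are choices made here for brevity, and the policy shipped with the release may be scoped differently.

```python
import json
from typing import Any, Dict, List

REQUIRED_ACTIONS: Dict[str, List[str]] = {
    "cloudwatch": [
        "PutMetricData",
        "GetMetricStatistics",
        "ListMetrics",
        "DescribeAlarmsForMetric",
    ],
    "rds": ["DescribeDBClusters", "DescribeDBInstances", "ListTagsForResource"],
}


def build_policy(actions_by_service: Dict[str, List[str]]) -> Dict[str, Any]:
    """Expand service/action names into a single-statement IAM policy."""
    actions = [
        f"{service}:{action}"
        for service, names in actions_by_service.items()
        for action in names
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            # Assumption: Resource "*" is used here only to keep the sketch short.
            {"Effect": "Allow", "Action": actions, "Resource": "*"}
        ],
    }


policy = build_policy(REQUIRED_ACTIONS)
print(json.dumps(policy, indent=2))
```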

Resource Tagging Strategy

For optimal results, implement a consistent tagging strategy:

Monitoring Exclusions: Use base2_monitor: false for resources that don’t require monitoring
Environment Tags: Consider environment-specific monitoring requirements
Service Categories: Tag resources by service type for easier management

Conclusion

The Metrics Monitor in Bearse Feature Account Health v0.6.4 represents a significant advancement in automated infrastructure monitoring. By proactively identifying monitoring gaps, this feature helps ensure that no critical resources go unmonitored, reducing the risk of undetected service issues.

This release reinforces our commitment to comprehensive account health monitoring and provides the foundation for expanding monitoring coverage to additional AWS services. The intelligent, tag-aware approach ensures that monitoring requirements can be customized to match your specific infrastructure needs while maintaining complete visibility across your environment.

Next Steps for Users

Deploy the Update: Update your Bearse deployment to version 0.6.4
Configure Monitoring: Confirm that EnableMetricsMonitoring is set to 'true' in your stack parameters (this is the default)
Review Resources: Audit existing resources and apply base2_monitor tags as needed
Monitor Alerts: Watch for critical alerts indicating unmonitored resources
Expand Coverage: Consider the monitoring requirements for your specific environment

The Metrics Monitor is designed to grow with your infrastructure, providing ongoing visibility and assurance that your monitoring coverage remains comprehensive as your AWS environment evolves.