Enable alarms
farski committed Jun 18, 2024
1 parent 96d7f37 commit b766b6f
Showing 2 changed files with 46 additions and 46 deletions.
4 changes: 2 additions & 2 deletions components/ftp-connection-check/README.md
@@ -4,9 +4,9 @@ Each stack deployed using the included CloudFormation template tests a single FT

It's expected that, for any given production FTP server, several of these stacks will be deployed, testing connectivity to the server from multiple geographic regions. For example, if an FTP server is running in us-east-1, stacks may be deployed in us-east-2, us-west-2, and ca-central-1, with each one targeting that server running in us-east-1.

-The status of the Route 53 health check matches the state of a CloudWatch alarm that is also created in the stack. The the Lambda function that is actually running the connection tests fails, the CloudWatch alarm will move into an ALARM state, which will cause the health check to move into an UNHEALTHY state.
+The status of the Route 53 health check matches the state of a CloudWatch alarm that is also created in the stack. If the Lambda function that is actually running the connection tests fails, the CloudWatch alarm will move into an ALARM state, which will cause the health check to move into an UNHEALTHY state.

-On their own, these health checks don't have any impact on other Route 5 resources, like DNS records. Other Route 53 health checks should be created that list these health checks as _child health checks_.
+On their own, these health checks don't have any impact on other Route 53 resources, like DNS records. Other Route 53 health checks should be created that list these health checks as _child health checks_.

### Deployment

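As context for the README text above, and not part of this commit: a minimal sketch of how a health check can mirror a CloudWatch alarm in CloudFormation. The parameter `ConnectionTestFunctionName` and both resource names are hypothetical, and the alarm settings are assumptions rather than the values used by the ftp-connection-check template; only the `AWS::Route53::HealthCheck` and `AWS::CloudWatch::Alarm` property names are standard.

```yaml
# Illustrative sketch only; resource and parameter names are hypothetical.
Parameters:
  ConnectionTestFunctionName:
    Type: String

Resources:
  # Moves into ALARM when the connection-test Lambda function reports errors.
  ExampleConnectionTestAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      Namespace: AWS/Lambda
      MetricName: Errors
      Dimensions:
        - Name: FunctionName
          Value: !Ref ConnectionTestFunctionName
      Statistic: Sum
      Period: 60
      EvaluationPeriods: 1
      ComparisonOperator: GreaterThanThreshold
      Threshold: 0
      TreatMissingData: notBreaching

  # Mirrors the alarm: when the alarm is in ALARM, this check is unhealthy.
  ExampleConnectionHealthCheck:
    Type: AWS::Route53::HealthCheck
    Properties:
      HealthCheckConfig:
        Type: CLOUDWATCH_METRIC
        AlarmIdentifier:
          Name: !Ref ExampleConnectionTestAlarm
          Region: !Ref AWS::Region
        InsufficientDataHealthStatus: LastKnownStatus
```

A health check like this has no effect on DNS by itself, which matches the README's point that it needs to be listed as a child of another health check.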
88 changes: 44 additions & 44 deletions components/hosted-zone/template.yml
@@ -297,29 +297,29 @@ Resources:
# This alarm does **not** have any influence over DNS records. It is used
# solely for visibility. When the health check becomes UNHEALTHY, the alarm
# will move into an ALARM state, which sends alerts/etc.
-  # ProdUsEast1HealthCheckAlarm:
-  #   Type: AWS::CloudWatch::Alarm
-  #   Condition: CreateProdHealthCheckUSEAST1
-  #   Properties:
-  #     AlarmName: !Sub FATAL [FTP] us-east-1 Server Connectivity <prod> UNHEALTHY (${AWS::StackName})
-  #     AlarmDescription: >-
-  #       All connection tests for prodoction us-east-1 FTP servers are failing.
-  #       This likely means there is a service issue with Transfer Family,
-  #       Lambda, or RDS in us-east-1.
+  ProdUsEast1HealthCheckAlarm:
+    Type: AWS::CloudWatch::Alarm
+    Condition: CreateProdHealthCheckUSEAST1
+    Properties:
+      AlarmName: !Sub FATAL [FTP] us-east-1 Server Connectivity <prod> UNHEALTHY (${AWS::StackName})
+      AlarmDescription: >-
+        All connection tests for production us-east-1 FTP servers are failing.
+        This likely means there is a service issue with Transfer Family,
+        Lambda, or RDS in us-east-1.

-  #       By the time this alarm has been triggered, the region has already been
-  #       removed from the DNS pool.
-  #     ComparisonOperator: LessThanThreshold
-  #     Dimensions:
-  #       - Name: HealthCheckId
-  #         Value: !GetAtt ProdUsEast1HealthCheck.HealthCheckId
-  #     EvaluationPeriods: 1
-  #     MetricName: HealthCheckStatus
-  #     Namespace: AWS/Route53
-  #     Period: 60
-  #     Statistic: Minimum
-  #     Threshold: 1
-  #     TreatMissingData: breaching
+        By the time this alarm has been triggered, the region has already been
+        removed from the DNS pool.
+      ComparisonOperator: LessThanThreshold
+      Dimensions:
+        - Name: HealthCheckId
+          Value: !GetAtt ProdUsEast1HealthCheck.HealthCheckId
+      EvaluationPeriods: 1
+      MetricName: HealthCheckStatus
+      Namespace: AWS/Route53
+      Period: 60
+      Statistic: Minimum
+      Threshold: 1
+      TreatMissingData: breaching

# This health check determines if the us-west-2 FTP server is included in
# the active DNS pool. If this health check is UNHEALTHY, the us-west-2
@@ -342,26 +342,26 @@ Resources:
# This alarm does **not** have any influence over DNS records. It is used
# solely for visibility. When the health check becomes UNHEALTHY, the alarm
# will move into an ALARM state, which sends alerts/etc.
-  # ProdUsWest2HealthCheckAlarm:
-  #   Type: AWS::CloudWatch::Alarm
-  #   Condition: CreateProdHealthCheckUSWEST2
-  #   Properties:
-  #     AlarmName: !Sub FATAL [FTP] us-west-2 Server Connectivity <prod> UNHEALTHY (${AWS::StackName})
-  #     AlarmDescription: >-
-  #       All connection tests for prodoction us-west-2 FTP servers are failing.
-  #       This likely means there is a service issue with Transfer Family,
-  #       Lambda, or RDS in us-west-2.
+  ProdUsWest2HealthCheckAlarm:
+    Type: AWS::CloudWatch::Alarm
+    Condition: CreateProdHealthCheckUSWEST2
+    Properties:
+      AlarmName: !Sub FATAL [FTP] us-west-2 Server Connectivity <prod> UNHEALTHY (${AWS::StackName})
+      AlarmDescription: >-
+        All connection tests for production us-west-2 FTP servers are failing.
+        This likely means there is a service issue with Transfer Family,
+        Lambda, or RDS in us-west-2.

-  #       By the time this alarm has been triggered, the region has already been
-  #       removed from the DNS pool.
-  #     ComparisonOperator: LessThanThreshold
-  #     Dimensions:
-  #       - Name: HealthCheckId
-  #         Value: !GetAtt ProdUsWest2HealthCheck.HealthCheckId
-  #     EvaluationPeriods: 1
-  #     MetricName: HealthCheckStatus
-  #     Namespace: AWS/Route53
-  #     Period: 60
-  #     Statistic: Minimum
-  #     Threshold: 1
-  #     TreatMissingData: breaching
+        By the time this alarm has been triggered, the region has already been
+        removed from the DNS pool.
+      ComparisonOperator: LessThanThreshold
+      Dimensions:
+        - Name: HealthCheckId
+          Value: !GetAtt ProdUsWest2HealthCheck.HealthCheckId
+      EvaluationPeriods: 1
+      MetricName: HealthCheckStatus
+      Namespace: AWS/Route53
+      Period: 60
+      Statistic: Minimum
+      Threshold: 1
+      TreatMissingData: breaching
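
The definitions of `ProdUsEast1HealthCheck` and `ProdUsWest2HealthCheck` fall outside this diff. For readers unfamiliar with the pattern, here is a rough sketch of how a regional health check of this kind could aggregate connection-test health checks as children. It is an assumption about the general approach, not a copy of the template: the parameter `ChildHealthCheckIds` and the resource name `ExampleRegionalHealthCheck` are hypothetical.

```yaml
# Illustrative sketch only; names are hypothetical and not from this template.
Parameters:
  # IDs of the per-region connection-test health checks for one FTP server.
  ChildHealthCheckIds:
    Type: CommaDelimitedList

Resources:
  ExampleRegionalHealthCheck:
    Type: AWS::Route53::HealthCheck
    Properties:
      HealthCheckConfig:
        Type: CALCULATED
        ChildHealthChecks: !Ref ChildHealthCheckIds
        # Healthy while at least one child is healthy, so the check turns
        # unhealthy only when every connection test is failing.
        HealthThreshold: 1
```

With `HealthThreshold: 1`, the parent reports unhealthy only once all children fail, which lines up with the alarm description's wording that all connection tests are failing. The alarms enabled in this commit then watch the `HealthCheckStatus` metric of such a check for visibility.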
