After you create a VM and deploy an application on it, a sharp increase in disk, memory, or CPU usage can cause the application to stop responding. To handle this kind of situation, it is important to monitor these resources. Cloud providers usually offer a monitoring service that keeps track of the resources created on their cloud. On AWS, that service is CloudWatch.
Here is a list of items that can be tracked for a VM using CloudWatch.
You can define a threshold, for example: if CPU utilization goes above 50%, send out an email. To do this, we create an alarm that fires when an event occurs. In this case, the event is CPU utilization exceeding the 50% threshold. The item that we want to monitor, CPU utilization, is known as a metric.
Here is the specification for the alarm:
Metric = CPU utilization
Event = CPU utilization > 50
Action = Send an email notification
It is possible that a VM's CPU utilization goes up for a minute and then comes back down. If we sent an email every time the event occurred, you could get hundreds of emails within a few days. This is why we should evaluate a metric as an average over a period of time (such as 10 seconds, 1 minute, and so on). Using this, we can refine the event:
Event = average(CPU utilization over 1 minute) > 50
You can also configure the alarm to send a notification only if the above event occurs in 3 out of the last 5 periods.
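To make the specification above concrete, here is a minimal sketch of how it could be expressed as parameters for CloudWatch's PutMetricAlarm API through boto3. The helper names, the alarm naming scheme, and the instance ID and topic ARN values are hypothetical and only for illustration. The email action is expressed as the ARN of a notification topic, which is covered later in this post. The client is passed in as an argument, so the sketch assumes you have already created it with boto3.client("cloudwatch").

```python
def cpu_alarm_params(instance_id, topic_arn):
    """Build keyword arguments for a CPU alarm (illustrative helper, not an AWS API)."""
    return {
        "AlarmName": f"high-cpu-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",                  # namespace of EC2 metrics
        "MetricName": "CPUUtilization",          # Metric = CPU utilization
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",                  # average over each period
        "Period": 60,                            # period length in seconds (1 minute)
        "EvaluationPeriods": 5,                  # look at the last 5 periods
        "DatapointsToAlarm": 3,                  # alarm if 3 of those 5 breach
        "Threshold": 50.0,                       # Event = CPU utilization > 50
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],             # Action = notify this topic
    }


def create_cpu_alarm(cloudwatch_client, instance_id, topic_arn):
    """Create the alarm. cloudwatch_client is assumed to be boto3.client("cloudwatch")."""
    cloudwatch_client.put_metric_alarm(**cpu_alarm_params(instance_id, topic_arn))
```

Keeping the parameters in a separate function makes it easy to review the alarm definition, or reuse it for many VMs, without calling AWS.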
An alarm can be in one of the following states:
- Ok = event has not occurred
- Alarm = event has occurred
- Insufficient data = not enough data is available yet to decide between the other two states
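The three states can be illustrated with a small, simplified model in plain Python. This is a sketch of the idea only, not how CloudWatch is implemented: the function names are made up for this post, and real CloudWatch treats missing data in more nuanced ways. It averages raw samples per period and then applies the "3 out of 5" rule to the most recent periods.

```python
from statistics import mean


def period_averages(samples, samples_per_period):
    """Average raw samples into one datapoint per full period (partial tail is dropped)."""
    last_start = len(samples) - samples_per_period
    return [
        mean(samples[i:i + samples_per_period])
        for i in range(0, last_start + 1, samples_per_period)
    ]


def alarm_state(datapoints, threshold=50.0, evaluation_periods=5, datapoints_to_alarm=3):
    """Return "OK", "ALARM", or "INSUFFICIENT_DATA" for the most recent periods."""
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    window = datapoints[-evaluation_periods:]
    breaches = sum(1 for value in window if value > threshold)
    return "ALARM" if breaches >= datapoints_to_alarm else "OK"


# Ten CPU samples, two per period, giving five period averages.
samples = [20, 30, 90, 80, 70, 60, 40, 50, 95, 85]
print(alarm_state(period_averages(samples, 2)))  # ALARM
```

In this example, three of the five period averages are above 50, so the state is ALARM. With fewer than five periods of data, the function reports INSUFFICIENT_DATA instead.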
We want to send an email notification when the alarm is in the Alarm state. Let's say we had thousands of VMs and hundreds of alarms. If a single program is responsible for sending out the emails, then we need some kind of pipe-like mechanism between the alarms and that program.
- On one end of the pipe, alarms produce the data needed for the email. Here, the alarms can be considered producers.
- On the other end, a program reads the data from the pipe and sends out an email. This program can be considered a consumer.
- The pipe is known as a topic.
AWS has a service called Simple Notification Service (SNS) in which you can create topics. A consumer needs to subscribe to a topic in order to receive its messages, which is why SNS calls this a subscription. Subscriptions can be created using the UI.
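The pipe idea can be sketched in a few lines of Python. The Topic class below is a toy, in-memory stand-in written for this post, not the SNS API. The subscribe_email helper is also hypothetical, but the create_topic and subscribe calls it makes are real boto3 SNS client methods. It assumes the client was created with boto3.client("sns").

```python
class Topic:
    """Toy in-memory topic: producers publish, subscribed consumers receive."""

    def __init__(self, name):
        self.name = name
        self._subscribers = []

    def subscribe(self, callback):
        # A consumer registers a function that is called for every message.
        self._subscribers.append(callback)

    def publish(self, message):
        # A producer (for example, an alarm) sends data into the pipe.
        for callback in self._subscribers:
            callback(message)


def subscribe_email(sns_client, topic_name, email_address):
    """Create a topic and subscribe an email address to it (illustrative helper).

    sns_client is assumed to be boto3.client("sns").
    """
    topic_arn = sns_client.create_topic(Name=topic_name)["TopicArn"]
    sns_client.subscribe(TopicArn=topic_arn, Protocol="email", Endpoint=email_address)
    return topic_arn


# Toy usage: one alarm publishing, one consumer collecting messages.
inbox = []
alerts = Topic("alerts")
alerts.subscribe(inbox.append)
alerts.publish("CPU utilization above 50%")
```

With a real SNS email subscription, the recipient has to confirm the subscription before any notifications are delivered.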
In this post, we looked at
- CloudWatch
- Metric
- Alarm
- Simple Notification Service
- Topic
- Subscription