Environment Alerts

Monitoring your application is always a good idea when deploying it into production. Engine Yard Cloud automatically monitors your environment and adds alerts to the dashboard when abnormal resource usage occurs. You will also be notified by email if you have added a notification email address to your environment.

Important! Email notifications are disabled by default until you add a notification email address to each of your environments.

The resources that Engine Yard Cloud monitors on a per-server basis include CPU usage, swap usage, and disk usage. All the alert types that can be generated for your environment are listed below, each with a description and its current thresholds.

Load Average

Load average represents the average system load an instance experiences over a period of time.

  • Warn: 4 x vCPU
  • Fail: 10 x vCPU

For example, the warning threshold is a load of 4.00 on an instance with 1 vCPU and 20.00 on an instance with 5 vCPUs.

Note: A vCPU is the same as an ECU (an Amazon EC2 Compute Unit). For general information about how the load average is calculated, see Load (computing).
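The thresholds above scale with the number of vCPUs, which can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the code Engine Yard Cloud runs: the function names are hypothetical, and it assumes an alert triggers when the load is at or above the threshold.

```python
WARN_FACTOR = 4   # Warn: 4 x vCPU
FAIL_FACTOR = 10  # Fail: 10 x vCPU


def load_thresholds(vcpus):
    """Return the (warn, fail) load average thresholds for an instance."""
    return WARN_FACTOR * vcpus, FAIL_FACTOR * vcpus


def classify_load(load, vcpus):
    """Classify a load average as "ok", "warn", or "fail".

    Assumption: a threshold counts as crossed at or above its value.
    """
    warn, fail = load_thresholds(vcpus)
    if load >= fail:
        return "fail"
    if load >= warn:
        return "warn"
    return "ok"
```

For instance, `load_thresholds(5)` gives a warning threshold of 20 and a failure threshold of 50, so a load of 22.5 on a 5 vCPU instance would be classified as a warning.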

IO-Wait

The instance CPU is waiting for disk writes to complete before it can move on to other operations.

  • Warn: 40% iowait
  • Fail: 80% iowait

Swap Used

The amount of swap hard disk space used as virtual memory resources. High swap is an indication that an instance needs more memory.

  • Warn: 50% Swap Used
  • Fail: 70% Swap Used

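The IO-Wait and Swap Used alerts both compare a percentage against a fixed pair of thresholds. A minimal Python sketch of that comparison follows. The metric keys and the function name are hypothetical, and it assumes an alert triggers at or above the threshold.

```python
# (warn, fail) thresholds in percent, taken from the list above.
# The dictionary keys are illustrative names, not Engine Yard identifiers.
THRESHOLDS = {
    "iowait": (40.0, 80.0),
    "swap_used": (50.0, 70.0),
}


def classify_percent(metric, percent):
    """Classify a percentage reading as "ok", "warn", or "fail".

    Assumption: a threshold counts as crossed at or above its value.
    """
    warn, fail = THRESHOLDS[metric]
    if percent >= fail:
        return "fail"
    if percent >= warn:
        return "warn"
    return "ok"
```

With these values, 55% swap used would be a warning, while 55% iowait would also be a warning but only reaches failure at 80%.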
Free Space

Free space is monitored on these mount points: /, /data, /db, and /mnt.

You might not realize the instance is almost out of disk space until you get this alert. The thresholds are calculated based on the space allocated to the mount point.

  • Warn: If the disk space for a particular mount point is 10 GB or less, then the warning threshold is 70% full. If the disk space is greater than 10 GB, then the warning threshold is 80% full.
  • Fail: 90% of disk space is full.

The best practice is to review the content of the volume and confirm the usage is appropriate to your use case. If you do need to increase the available space, Engine Yard Support might be able to resize your volume online if your instance is "current generation". If not, you will need to replace the instance with one that includes a larger volume.
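Because the warning threshold depends on the size of the mount point, it can help to see the rule written out. The Python sketch below illustrates the thresholds described above. It is not Engine Yard's implementation: the function names are hypothetical, 1 GB is assumed to be 1024^3 bytes, and an alert is assumed to trigger at or above the threshold. The figures from `shutil.disk_usage` may also differ slightly from what `df` reports.

```python
import shutil

GB = 1024 ** 3  # Assumption: 1 GB is treated as 1024^3 bytes.
FAIL_PERCENT = 90.0


def free_space_thresholds(total_bytes):
    """Return the (warn, fail) percent-full thresholds for a mount point."""
    # Mount points of 10 GB or less warn at 70% full; larger ones at 80%.
    warn = 70.0 if total_bytes <= 10 * GB else 80.0
    return warn, FAIL_PERCENT


def classify_disk(used_bytes, total_bytes):
    """Classify disk usage as "ok", "warn", or "fail"."""
    warn, fail = free_space_thresholds(total_bytes)
    percent_full = 100.0 * used_bytes / total_bytes
    if percent_full >= fail:
        return "fail"
    if percent_full >= warn:
        return "warn"
    return "ok"


def check_mount(path):
    """Classify the file system that contains `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return classify_disk(usage.used, usage.total)
```

Under these assumptions, a 10 GB volume with 7.5 GB used is in the warning state, while a 20 GB volume at the same 75% usage is not.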

Backup Alerts

These alerts indicate that a backup is taking longer to run than the current interval between your backups. This can create situations where backups stack up behind each other, driving up load on the target host. In the case of a replica, replication state is not evaluated during backups, since it is common for replication to lag or stall during a backup.

As a result the backup tools now provide this warning and, by default, will not start a new backup if one is already running. Some possible ways to address this concern include scheduling a larger interval between backup runs using the dashboard or custom cookbooks, or upgrading to an instance that can process the backup job more quickly. This warning does not indicate anything is critically wrong with your database, and is safe to ignore; however, for visibility and awareness purposes it is not possible to disable this warning.

PostgreSQL Alerts

Refer to PostgreSQL Alerts.

Alert thresholds can be modified using the collectd Chef recipe. However, this recipe has a known issue when used with the V7 stack, so it is not recommended for that stack.

Engine Yard also recommends that you monitor the health of your application by specifying a health monitoring URL. For information on configuring a monitoring URL, see Monitor Application Uptime.

Important! If you are a premium support customer, monitoring for your environment allows our support team to proactively respond to abnormal activity.


 
