The application instances of an Engine Yard environment can be scaled horizontally and automatically: the number of instances running in the environment is increased or decreased as required, based on criteria you define and without user intervention. This feature builds on AWS Auto Scaling. AWS lets you define scaling policies that apply to your environment's application instances, and the EY Cloud platform then takes care of the instance configuration, exactly as if the instances had been provisioned manually through our UI or API.
Things to take into account:
- As it stands, this feature is for scaling application instances only. If you need to scale other types of instances, open a ticket with our Support Team to discuss other options.
- Auto Scaling is a per-environment configuration. On EY Cloud, a single auto scaling policy can't provision/terminate instances across environments.
- Auto Scaling is only possible on instances that live in a VPC. If your app instances aren't VPC native (that is, they run on EC2-Classic, regardless of ClassicLink status), you'll need to migrate them to a VPC. Contact our Support Team for assistance with this migration.
- Time-based scaling is not available yet through the UI.
- New instances will be created from fresh EBS volumes. Given the nature of how AWS auto scaling provisions new instances and how EY Cloud handles snapshots, having auto scaled instances use the latest snapshot is essentially not possible. If your environments rely on snapshots to bring up new app instances, contact our Support Team to evaluate which alternatives are possible.
- When you terminate an environment, its Auto Scaling configuration is deleted.
This document covers the following topics:
- Creating an Auto Scaling Group for an Environment
- Defining Auto Scaling Policies
- Application Health Check
Creating an Auto Scaling Group for an Environment
To use Auto Scaling on a given environment, an Auto Scaling group has to be created at the environment level. Go to the 'More Options' section, find 'Auto Scaling' and select it. If the option is greyed out, a list of reasons will be displayed. If you require any assistance resolving them, please contact Support.
A new window will appear:
(Ignore the dynamic scaling policies for now. They are covered below.)
The default settings for an auto scaling group will have one app instance as the minimum and 10 as the maximum. Alter those as desired, as well as the 'health check grace' and 'default cooldown' values.
Please note that the app_master is not in the auto scaling group, in order to avoid any scaling process terminating it.
Click on 'Create Auto Scaling Group' and wait whilst the auto scaling group is provisioned and configured. The following message will show up:
Reload the page and if the provisioning is complete you will see the in-place configuration:
At this point the auto scaling group is provisioned and active. A new setting appears, 'Desired capacity', which by default matches 'Minimum number of servers'. This value is the number of instances that the auto scaler currently considers necessary for the environment, and it is set dynamically in response to the scaling policies. If you have left the default values, a new app instance will be added to the environment as soon as AWS triggers it.
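As a rough illustration of how 'Desired capacity' relates to the minimum and maximum, the Python sketch below keeps a requested capacity within the group's limits. The function name is hypothetical and the defaults simply mirror the default group settings described above; this is not EY Cloud or AWS code, only a model of the bounds a scaling policy has to respect.

```python
def clamp_desired_capacity(requested: int, minimum: int = 1, maximum: int = 10) -> int:
    """Limit a requested capacity to the [minimum, maximum] range of the group.

    Hypothetical helper: the defaults mirror the default group settings
    (1 minimum, 10 maximum) mentioned in this document.
    """
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    return max(minimum, min(requested, maximum))


# A policy asking for 15 instances is held at the maximum of 10,
# and one asking for 0 is held at the minimum of 1.
print(clamp_desired_capacity(15), clamp_desired_capacity(0), clamp_desired_capacity(4))
```

Under this model, widening the minimum and maximum is what gives the scaling policies room to act.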
Defining Auto Scaling Policies
The auto scaling policies in EYCloud are very similar to the ones in AWS. While you don't need to know them in detail, consider reading the AWS docs on the matter.
Click on 'Add New' to start working with policies.
Every policy has three sections:
- The policy type and its name. Select here which type of on-demand scaling you'll be using, and give a name to the policy. The linked AWS doc provides a good explanation of the different on-demand types of scaling.
- What will trigger the policy (its 'alarm'). Keep in mind that the 'metric' is 'aggregated' over the 'number of periods' across all the instances that are already part of the auto scaling group, and the policy triggers if the result matches the 'trigger value' according to the 'operand'. In this context, 'period length' is how long each period lasts; in EYCloud it is related to the time it takes for an instance to be provisioned or terminated.
- The action to take when the policy is triggered. Pay special attention to the value for 'cooldown', which is how long the auto scaling engine waits before triggering another action. This is related to how long it takes to provision a fully working instance in your environment, or to terminate one.
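The trigger logic described in the list above can be sketched in Python as follows. This is a simplified model, not the AWS implementation: the function and dictionary names are hypothetical, and it assumes the alarm fires only when every one of the most recent periods breaches the trigger value, which is how such alarms conventionally behave.

```python
import operator
from statistics import mean

# Hypothetical lookup tables for the 'operand' and 'aggregated' settings.
OPERANDS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}
AGGREGATIONS = {"average": mean, "maximum": max, "minimum": min}


def alarm_triggers(samples_per_period, aggregation, operand, trigger_value, number_of_periods):
    """Decide whether a policy's alarm would fire.

    samples_per_period: one list per period (oldest first), each holding one
    metric sample per instance in the auto scaling group.
    """
    if len(samples_per_period) < number_of_periods:
        return False  # not enough data collected yet
    recent = samples_per_period[-number_of_periods:]
    aggregate = AGGREGATIONS[aggregation]
    compare = OPERANDS[operand]
    # Assumption: every recent period must breach the trigger value.
    return all(compare(aggregate(period), trigger_value) for period in recent)


# CPU usage (%) for two instances over three periods.
cpu = [[40, 50], [80, 90], [85, 95]]
print(alarm_triggers(cpu, "average", ">", 70, 2))  # last two periods breach
print(alarm_triggers(cpu, "average", ">", 70, 3))  # the oldest period does not
```

Requiring several consecutive periods makes a policy less sensitive to short spikes, at the cost of reacting later.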
Click on 'Create' to create the policy, and then on 'Update Auto Scaling Group' to apply the changes in AWS. A common mistake here is to forget to click 'Update', in which case you'll need to redo the changes to the policy.
As with any change that involves major parts of the infrastructure, we advise testing policies on a non-production environment first. Pay special attention to the provisioning and termination times for your application and adjust the 'period length' and 'cooldown' accordingly. Initially, testing can be conducted by adjusting the 'Desired capacity' for the group to ensure instances boot and terminate cleanly.
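To picture why 'cooldown' should reflect how long an instance takes to provision, the following Python sketch models a cooldown gate. The class and method names are hypothetical and the 300-second value is only an example; the real behaviour is implemented by AWS.

```python
class CooldownGate:
    """Hypothetical model of 'cooldown': block new scaling actions for a
    fixed number of seconds after the previous one."""

    def __init__(self, cooldown_seconds: float):
        self.cooldown_seconds = cooldown_seconds
        self.last_action_at = None  # no scaling action has happened yet

    def try_scale(self, now: float) -> bool:
        """Return True and record the action if the cooldown has elapsed."""
        if (self.last_action_at is not None
                and now - self.last_action_at < self.cooldown_seconds):
            return False  # still cooling down; the trigger is ignored
        self.last_action_at = now
        return True


# Example: an instance is assumed to take about 300 seconds to provision.
gate = CooldownGate(cooldown_seconds=300)
print(gate.try_scale(now=0))    # first action is allowed
print(gate.try_scale(now=120))  # instance still provisioning, blocked
print(gate.try_scale(now=300))  # cooldown elapsed, allowed again
```

If the cooldown were shorter than the provisioning time, the second trigger would add another instance before the first one had started to relieve the load.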
Application Health Check
Defining a load balancer HTTP health check for the application ensures that new instances provisioned by the auto scaling process do not receive any traffic until they are fully configured and the application is running on them.
By default, both HAProxy and Amazon's xLB load balancers use a TCP health check. Such a check only requires Nginx on the instance to accept a TCP connection. Nginx can therefore be running while the configuration and deployment of the application are still incomplete, in which case unconfigured instances are added to the load balancer, receive requests, and return application errors.
We therefore recommend configuring an HTTP health check with a fully-formed health check path: one that queries a dynamic page of your application and requires a database call to complete.
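As a minimal sketch of such a health check path, the Python example below answers 200 only when a database query succeeds. Everything in it is an assumption made for illustration: the '/health' path, the use of SQLite, and the function names. Python is used only to keep the example self-contained; your application would implement the equivalent in its own framework and against its own database.

```python
import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer

DB_PATH = ":memory:"  # hypothetical; a real app would use its own database


def health_response(conn):
    """Return (status, body): 200 only if a database query succeeds."""
    try:
        conn.execute("SELECT 1").fetchone()
    except sqlite3.Error:
        return 500, b"database unavailable"
    return 200, b"OK"


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health":  # hypothetical health check path
            self.send_error(404)
            return
        conn = sqlite3.connect(DB_PATH)
        try:
            status, body = health_response(conn)
        finally:
            conn.close()
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080):
    """Start the example server; call serve() to try it locally."""
    HTTPServer(("", port), HealthHandler).serve_forever()
```

Because the response depends on a real query, an instance whose application or database connection is not ready yet fails the check instead of being treated as healthy.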
For HAProxy (the default load balancer on EY) such a path can be added via the 'Edit Application' link, shown when viewing an Application in the EY dashboard. Configuring a path sets the health check protocol to HTTP, and HAProxy will see 20x, 30x and 503 responses as healthy, with any other response being seen as unhealthy. Unhealthy instances are pulled from the load-balanced pool until they return a healthy response.
For Amazon Classic Load Balancers the health check path, port and protocol can be set by expanding the relevant Load Balancer on the 'Tools -> Classic Load Balancers' page. The response code must be a 200 for the instance to be seen as healthy, so the options and/or application should be configured to ensure the health check path returns this response and not a redirect. To confirm the instances are passing the health check, follow the 'View this load balancer in the AWS console' link, then select the health check tab and ensure all instances are 'healthy'. Unhealthy instances will not receive any traffic from the Classic Load Balancer.
For Amazon Application Load Balancers the health check path, port and protocol are configured for the Target Groups, which can be viewed from the link on the Environment page of the EY dashboard. By default the expected response code is a 200. The acceptable response codes can be configured to allow others, but other response codes do not guarantee the functionality of your application, so we recommend instead ensuring that the health check options and/or your application are configured to return a 200 response for the health check path. To confirm the instances are passing the health check, follow the 'View this load balancer in the AWS console' link for the relevant ALB on the 'Tools -> Application Load Balancers' page, then select 'Target Groups' in the left-hand menu, then the 'Targets' tab, and ensure all instances are 'healthy'. Important: Individual unhealthy instances will not receive any traffic from the Application Load Balancer. However, if all instances fail the health check, traffic will be routed to all of them, bringing the application online but rendering the health check void, so this situation should be avoided.
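The behaviour described for Application Load Balancers, where traffic goes to every instance once all of them fail the health check, can be modelled with a short Python sketch. The function name and instance names are hypothetical; the code only mirrors the rule stated above and is not how AWS implements it.

```python
def routable_targets(health_by_instance: dict) -> list:
    """Return the instances that would receive traffic.

    health_by_instance maps an instance name to True (healthy) or False.
    """
    healthy = [name for name, ok in health_by_instance.items() if ok]
    if healthy:
        return healthy  # normal case: unhealthy instances get no traffic
    # Every instance failed the check: traffic is sent to all of them.
    return list(health_by_instance)


print(routable_targets({"app_1": True, "app_2": False}))   # only app_1
print(routable_targets({"app_1": False, "app_2": False}))  # both instances
```

This is why a health check path that fails on every instance at once, for example because of a misconfigured path, stops protecting the application.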
If you require any assistance configuring or verifying health checks please contact Engine Yard Support.