Before you get started, you might like to view this short video on High Availability.
Takeover is the Engine Yard failover process for recovering from failure of an application master instance.
- Takeover requires that you have at least one application slave in your environment.
- The takeover preferences in your environment determine how a takeover occurs. For example, you can disable automated takeovers, in which case you must initiate application takeovers manually.
If your application master has failed and takeover is happening, you:
- Receive an automated email notification from Engine Yard.
- See a takeover message for the environment on your dashboard.
Takeover occurs when Engine Yard detects that your application master is unable to reliably respond to requests. For example, this can happen because of an Amazon EC2 issue or because the instance froze. If the instance does not recover within a short time, Engine Yard does the following:
- Terminates the problem instance.
- Promotes an application slave to master, using the method configured by the environment's takeover preference.
- If your environment does not have elastic IP (EIP) addresses on your application slaves, Engine Yard assigns the old application master’s IP address to the new master. If you have EIP addresses on your application slaves, see EIP Addressing.
- Replaces the application slave instance that was promoted. The new application slave uses the same version of the stack as the other instances in that environment.
- Deletes the old application master. However, if you have configured your environment to detach and create a utility instance, then Engine Yard will act accordingly. See the failed app master behavior setting for more information.
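The sequence above can be sketched as a small state model. This is an illustrative Python sketch only, not Engine Yard's implementation: the `Instance` and `Environment` classes, the `take_over` function, and the addresses (taken from the RFC 5737 documentation range) are all hypothetical. It simplifies by always promoting the first slave and by ignoring the option to detach the failed master as a utility instance.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Instance:
    name: str
    role: str              # "app_master" or "app_slave"
    ip: str
    has_eip: bool = False  # True if the instance has its own elastic IP


@dataclass
class Environment:
    instances: List[Instance] = field(default_factory=list)
    unused_eips: List[str] = field(default_factory=list)

    def master(self) -> Instance:
        return next(i for i in self.instances if i.role == "app_master")

    def slaves(self) -> List[Instance]:
        return [i for i in self.instances if i.role == "app_slave"]


def take_over(env: Environment, replacement_ip: str) -> Instance:
    """Model the takeover steps and return the new application master."""
    slaves = env.slaves()
    if not slaves:
        raise RuntimeError("takeover requires at least one application slave")

    # Terminate (and here, delete) the problem instance.
    old_master = env.master()
    env.instances.remove(old_master)

    # Promote an application slave to master.
    new_master = slaves[0]
    new_master.role = "app_master"

    if new_master.has_eip:
        # The promoted slave keeps its own EIP; the old master's
        # address is left unattached.
        env.unused_eips.append(old_master.ip)
    else:
        # No EIP on the slave: the old master's address moves over.
        new_master.ip = old_master.ip

    # Replace the slave that was promoted.
    env.instances.append(Instance("app_slave_new", "app_slave", replacement_ip))
    return new_master


env = Environment([
    Instance("app_1", "app_master", "192.0.2.10"),
    Instance("app_2", "app_slave", "192.0.2.20"),
])
promoted = take_over(env, replacement_ip="192.0.2.30")
print(promoted.name, promoted.ip)  # app_2 192.0.2.10
```

In the run shown, the slaves have no EIP, so the promoted instance ends up with the old master's address and the environment is back to one master and one slave.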
Important: If you or Engine Yard has initiated an application master takeover, do not initiate a database replica takeover until the application takeover has completed. Making two major environment configuration changes at the same time may leave your environment in an erroneous state. For example, the dashboard may show that an application slave has been promoted successfully when it has not.
EIP Addressing
If you have EIP addresses on your application slaves and an application slave is promoted to a master, the new application master will retain the IP address of the promoted application slave. Below is an example of this EIP assignment.
Important: If your domain points to the old application master's EIP address, this address assignment makes your application unreachable, because that EIP no longer points to any instance.
Before takeover:
Application master IP address: 220.127.116.11
Application slave IP address: 18.104.22.168
After takeover:
Application master IP address: 18.104.22.168
The Engine Yard dashboard shows 220.127.116.11 as available and unused.
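To make the consequence of this assignment concrete, the following Python sketch checks whether the address a domain points to is still attached to an instance after a takeover. It is only an illustration under stated assumptions: `instance_for_address` is a hypothetical helper, the addresses come from the RFC 5737 documentation range rather than from a real environment, and the domain's address is passed in directly (in practice you might resolve it with `socket.gethostbyname`).

```python
from typing import Dict, Optional


def instance_for_address(address: str, instance_ips: Dict[str, str]) -> Optional[str]:
    """Return the name of the instance holding `address`, or None if unattached."""
    for name, ip in instance_ips.items():
        if ip == address:
            return name
    return None


# Hypothetical addresses before the takeover.
before = {"app_master": "203.0.113.10", "app_slave": "203.0.113.20"}

# After the takeover the promoted slave keeps its own EIP. The replacement
# slave is omitted here because its address is not known in advance.
after = {"app_master": "203.0.113.20"}

# The domain's DNS record still points at the old master's EIP.
domain_address = "203.0.113.10"

print(instance_for_address(domain_address, before))  # app_master
print(instance_for_address(domain_address, after))   # None
```

A result of `None` after the takeover corresponds to the situation described above: the EIP exists but serves no instance, so either the DNS record or the EIP assignment has to be updated.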
Apply Cron Jobs and Custom Chef Recipes
The new application master is the same as the old one with two important exceptions:
- Cron jobs are not set up.
- Custom Chef recipes are not applied.
To apply cron jobs and custom Chef recipes to the new application master:
- Log in to Engine Yard.
- Click the environment name on your dashboard.
- Click Apply.
This action applies/re-applies configuration, including cron jobs and custom Chef recipes as appropriate, to all instances in the environment.
Do the following to prepare your environment in case of application master failover.
To prepare your environment
Do keep a spare application server.
Make sure that your cluster has one spare application server. For example, if you need three application servers to serve the everyday traffic, put four application servers in your cluster. Without a spare, your site might fail or slow down under load during the takeover.
Don’t keep important data on the application server alone.
For example, if your application stores user-generated content, consider storing it in an online storage web service (such as Amazon S3).
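The spare-server guideline above is simple arithmetic, sketched here in Python. The function names, the traffic figures, and the per-server capacity are hypothetical values chosen for illustration; measure your own application's capacity before sizing a cluster.

```python
import math


def servers_to_provision(peak_requests_per_sec: float,
                         capacity_per_server: float,
                         spares: int = 1) -> int:
    """Servers needed for everyday peak traffic, plus spare servers."""
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    needed = math.ceil(peak_requests_per_sec / capacity_per_server)
    return needed + spares


def capacity_during_takeover(total_servers: int, capacity_per_server: float) -> float:
    """Capacity left while one application server is out of service."""
    return max(total_servers - 1, 0) * capacity_per_server


# Hypothetical load: 270 requests/s at peak, 100 requests/s per server.
total = servers_to_provision(270, 100)
print(total)                                 # 4
print(capacity_during_takeover(total, 100))  # 300
```

With three servers needed for everyday traffic, the fourth server means the cluster can still carry the peak load of 270 requests per second while the failed master is being replaced.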
In some situations you might want to manually initiate a takeover. For example, you can simulate a takeover to test your recovery processes.
To manually trigger a takeover
Use the Promote Application Slave feature.
When the takeover is complete, your environment is in the post-takeover state, and you need to perform the steps in Apply Cron Jobs and Custom Chef Recipes to apply cron jobs and custom Chef recipes to the new application master.
| For more information about... | See... |
|---|---|
| Using online storage web services (such as Amazon S3) with applications | Use CarrierWave (and optionally fog) to upload and store files |
| Frozen instances | Deal with frozen or crashed instances |