Restarting an instance

This document is new but should be mostly complete. If you encounter something that is not covered, please contact Support so that we are aware of it and can assist you further.

Engine Yard receives notifications of degraded instances from Amazon and forwards them to you so that you can take action.

When you get one of these notifications, it might state the time that the underlying hardware will be taken offline. If there is no time specified, replace the degraded instance as soon as you can.

One way to deal with this type of maintenance is to replace the instance. Another is to restart it, which prompts AWS to stop and start the instance. This moves it to different underlying hardware while preserving all EBS volumes. Read on if you want to take that path.

  • Things to take into account
  • Known Concerns
    • MySQL DB Masters with Replicas
    • Postgres DBs
    • Postgres DB Masters with Replicas
    • Mixed Legacy/Non-Legacy Restarts
  • How to restart an instance

Things to take into account

When an instance is restarted, a few things happen. They are listed below, along with the remedy where one is available:

  • The public hostname and IP change. The environment needs an 'Apply' so that all instances are aware of the change, and a 'Deploy' if the instance is providing services to others.
  • If an Elastic IP (EIP) is attached to the instance *and* the instance is in EC2-Classic, the EIP gets detached. Use our CoreAPI CLI to redo the assignment, or open a ticket with the Support team for assistance.

Known Concerns

MySQL DB Masters with Replicas

The replica references its master's hostname as part of its slave status information. When the IP of the master changes, the replica cannot reconnect on its own. The easiest fix is to replace the replica database.

If you'd rather keep the existing replica, you can do so by stopping replication and then issuing a change master statement that references the appropriate coordinates. To do this:

  • Stop replication: `mysql -u root -e 'stop slave'`
  • Grab the current status information: `mysql -u root -e 'show slave status\G' | egrep 'Exec_Master_Log_Pos|Relay_Master_Log_File'`
  • Issue a change master statement with those values and the new db_master hostname: `mysql -u root -e "change master to master_host='#{new_master_hostname}', master_log_file='#{Relay_Master_Log_File}', master_log_pos=#{Exec_Master_Log_Pos}; start slave;"`
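
The manual steps above can be sketched as a small shell helper that builds the change master statement from saved status output. This is an illustrative sketch and not an Engine Yard tool: the function name build_change_master and its arguments are assumptions. It only prints the statement, so you can review it before running it.

```shell
# Sketch only: build the "change master" statement from slave status output.
# build_change_master is a hypothetical helper name, not an Engine Yard tool.
build_change_master() {
  new_master_hostname=$1   # new hostname of the db_master
  status_text=$2           # output of: mysql -u root -e 'show slave status\G'

  # Pull the two replication coordinates out of the status output.
  log_file=$(printf '%s\n' "$status_text" | awk -F': ' '/Relay_Master_Log_File:/ {print $2}')
  log_pos=$(printf '%s\n' "$status_text" | awk -F': ' '/Exec_Master_Log_Pos:/ {print $2}')

  # Refuse to build a statement from incomplete coordinates.
  if [ -z "$log_file" ] || [ -z "$log_pos" ]; then
    echo "could not find replication coordinates in status output" >&2
    return 1
  fi

  printf "change master to master_host='%s', master_log_file='%s', master_log_pos=%s; start slave;\n" \
    "$new_master_hostname" "$log_file" "$log_pos"
}

# Example use on the replica (NEW_MASTER_HOSTNAME is a placeholder):
#   mysql -u root -e 'stop slave'
#   status=$(mysql -u root -e 'show slave status\G')
#   mysql -u root -e "$(build_change_master NEW_MASTER_HOSTNAME "$status")"
```

Printing the statement instead of executing it keeps the risky step explicit: if either coordinate is missing, the helper returns an error rather than producing a partial statement.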

Postgres DBs

If the instance is a PostgreSQL database on an older stack that doesn't mount /tmp as tmpfs, PostgreSQL may not start on its own because of socket file conflicts. If you encounter this, restarting the Postgres server process with:

sudo -i /etc/init.d/postgresql-$(postgres -V | egrep -o '[0-9]{1,}\.[0-9]{1,}') restart

will result in an error message about a socket file. Removing that socket file will allow the server to start.
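
To see which socket files are left over before removing anything, a sketch like the following may help. PostgreSQL names its Unix socket `.s.PGSQL.<port>` and keeps a companion `.s.PGSQL.<port>.lock` file next to it. The helper name list_pg_socket_files is hypothetical, and the /tmp default is an assumption; if the error message names a different path, use that path instead.

```shell
# Hypothetical helper: list leftover PostgreSQL socket and lock files.
# The directory defaults to /tmp (an assumption); prefer the path shown
# in the error message from the restart command.
list_pg_socket_files() {
  socket_dir=${1:-/tmp}
  for f in "$socket_dir"/.s.PGSQL.*; do
    # Skip the unexpanded pattern when nothing matches.
    [ -e "$f" ] || continue
    printf '%s\n' "$f"
  done
  return 0
}

# Example use:
#   list_pg_socket_files /tmp
```

Before removing a file from this list, it is worth checking that no postgres process is still running, since a running server depends on its socket.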

Postgres DB Masters with Replicas

Postgres replicas reference their master's hostname in recovery.conf. When the IP of the master changes, the configuration of the running server is no longer correct. Chef automatically updates the configuration file that Postgres references, so after the Chef run completes, a Postgres restart on the replica will re-link replication:

sudo -i /etc/init.d/postgresql-$(postgres -V | egrep -o '[0-9]{1,}\.[0-9]{1,}') restart
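
The restart command above picks the init script by extracting the major.minor version from `postgres -V`. The sketch below shows that expansion on its own, so you can check which script will be called before running it with sudo. The helper name pg_init_script is hypothetical, and the sample version string in the usage comment is only an illustration.

```shell
# Hypothetical helper: show which init script the restart command resolves to.
pg_init_script() {
  version_line=$1   # e.g. the output of: postgres -V
  # Same pattern as the restart command; grep -E is equivalent to egrep.
  # head -n 1 keeps only the first match if the line contains several.
  major_minor=$(printf '%s\n' "$version_line" | grep -E -o '[0-9]{1,}\.[0-9]{1,}' | head -n 1)
  printf '/etc/init.d/postgresql-%s\n' "$major_minor"
}

# Example use:
#   pg_init_script "$(postgres -V)"
#   pg_init_script 'postgres (PostgreSQL) 9.4.5'
```

For a two-part version string such as 10.4, the pattern returns the whole string, so it is worth comparing the result against the scripts actually present in /etc/init.d.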

Mixed Legacy/Non-Legacy Restarts

We have seen at least one case where, after a restart of an environment that mixed custom-VPC and non-VPC hosts, the non-VPC hosts came back without ClassicLink enabled. You can correct this in the AWS console by re-linking the non-VPC host(s) to the correct VPC. If you need assistance with this, please open a ticket with our support team and escalate to our VNOC engineer in IRC.

How to restart an instance

You will find a Restart link next to each instance when viewing the environment.

 

[Screenshot: Engine Yard Cloud environment view]