Set Up Database Replication

This document describes:

Note: If the master database fails, you can promote a replica (slave) to be the new master using the Engine Yard UI.


Database Replica Use Cases

The common use cases for a database replica are:

  • Providing a standby database server that can quickly be promoted to the master role in the event of a hard failure on the master database. This is highly recommended for Production environments where minimizing downtime is a priority.
  • Modifying the database instance size, or increasing the size or type (PIOPS) of the attached volumes.
  • Logical database backups (i.e. mysqldump) can be moved to a replica to remove this load from the master database; we publish the eybackup_slave cookbook to assist in managing the backup scheduled task. Postgres backups will automatically be moved to the first replica, when it exists; this is possible due to the more consistent nature of Postgres replication.
  • Read-only analytics/ad-hoc queries by development teams and business units.

Distributing Reads to Replicas

Replication is asynchronous, and while a replica typically will be within a few seconds of the master at any given time, there are statements that can delay replication more significantly. Having an application read from a delayed source could create conflicts within the application so this is not a safe default.

Typically it is best to start distributing reads by directing cached-based indexing tools (e.g. Thinking-Sphinx) to read from the replica first. The next step would be to add a replica read library (e.g. Octopus/Seamless_Database_Pool) to your application configuration and then selectively enable this type of read in targeted areas were the risk of a stale read would be a minor concern.


Selecting an Environment Type

Engine Yard offers four environment configurations: Single Instance, Staging Configuration, Production Configuration, and Custom Configuration.

For database replication, you need a Production Configuration or a Custom Configuration. 

  • Single Instance. For a small testing environment, a single server that contains both the application and the database is ideal.
  • Staging Configuration. The staging configuration does not include a database replica (slave).
    In non-production environments or environments with small databases, this configuration is a good choice. However, because the database is not replicated, make sure that you backup frequently.
    You can add a database replica (slave) to an existing staging configuration (see below).
  • Production Configuration and Custom Configuration. For typical applications in production environments where you want a database replica (slave), use a production or custom configuration environment containing the following:
    • One application master
    • One or two application slaves
    • One master database
    • A database replica (slave)

Note PostgreSQL only: If you expect your database server requirements to grow so that you will need a larger database instance, consider starting with a custom environment using 64-bit instances. PostgreSQL instances cannot be upgraded easily from 32-bit to 64-bit instances.


Create a New Environment with a Database Replica

This procedure describes how to set up a new environment that includes a database replica (db_slave) so that your database is replicated. This is best practice for a production environment.

(If you already have an environment and want to add a database replica (db_slave) to it, see Add a Database Replica to an Existing Environment below.)

To set up a custom environment with database replication

  1. Create the environment as described in Create an environment.

  2. Select Custom.

  3. Set the size of the master database instance (Server size) and the replica, and enter the size for the /db volume. By default, the /db volume for the replica (slave) will be the same size as the master. Note: consider if you want to setup Provisioned IOPs or Consider your /mnt Volume Size at this time.

  4. For information about choosing replica (slave) instance sizes, see About Database replica (slave) instance sizes.

  5. Click Boot This Configuration.


Add a Database Replica to an Existing Environment

Cluster Configuration

If you have a cluster and want to add a database replica, follow this procedure.

  1. On the Environment page, click Add.

  2. Click Add Database Slave Instance.

  3. Set the instance size and the volume size.
    Note: The volume size must be the same or larger than the master database.
    Note: You may want to Consider your /mnt Volume Size as part of adding a new replica.
    Note: You cannot enlarge a PostgreSQL database to a 64-bit instance using a 32-bit snapshot.

  4. For information about choosing instance sizes, see About Database replica (slave) instance sizes.

  5. MySQL only: If you have an existing snapshot of the database, you can use it instead of creating a new snapshot. To do this, select the checkbox indicating to use a recent snapshot.
    Note: The checkbox appears only if a recent snapshot is available. See About using a new or existing snapshot for the database replica (slave) for information.

  6. Click Add to Cluster.

  7. After the database slave boots, click Apply.
    This generates new database.yml files for the database replica (slave) on the application master.

  8. If your environment hasn’t been upgraded since December 2nd, 2011 (i.e. if you see “Enable creation of db slaves with a larger volume than the db master” in the change log), then the /db volume size is not automatically increased and you need to do one of the following:
  • Upgrade your environment, or
  • Submit a ticket with Engine Yard Support and ask to have /db volume size reset (to the size entered in Step 3 above).

Solo Configuration (Shared app_master/db_master)

The Solo configuration does not support application or db slave instances; in order to add one you'll need to Rebuild the Environment in a Cluster configuration.


About Database Replica Instance Sizes

You can create database replicas with smaller or larger instance sizes.

Database replicas are generated from either existing or new master database snapshots; this snapshot can impact your choice of instance size. For example, PostgreSQL 32-bit snapshots cannot be used to upsize to 64-bit instances.

Smaller database replicas (slaves)

You can save money by using a less expensive instance for your database replica (slave) than for your master database. In general, the minimum recommendation for the database replica (slave) is one size smaller than for the master database. For example, if the master database is running on a standard extra large instance, you might use a standard large instance for your database replica (slave).

Remember that in the case of master database failure, the database replica has to act as the master at least until a new instance as big as the original master can be created. 

Because replication is single-threaded, CPU is not a key factor in sizing. The key factor is the memory on the instance that can be allocated to buffers (the buffer pool innodb for MySQL and shared_buffers for PostgreSQL). For example, if the master database is on a High CPU XL instance, the database replica (slave) could be a High Memory XL instance, which costs less.

If you have concerns about the correct instance size when provisioning a database replica (slave), submit a ticket with Engine Yard Support.

Larger database replicas (slaves)

The sky and your budget are the limits when it comes to provisioning larger database replicas (slaves).
If you are running your database master on a High CPU Medium and need better database performance, we recommend upgrading to the High CPU XL. (Engine Yard does not recommend upgrading from High CPU Medium to a Standard Large because the Standard Large is often less performant than a High CPU Medium.)


Consider your /mnt Volume Size

The /mnt volume is a special EBS device used primarily for storing system logs in a location that will be consistent across restarts of an individual instance; this volume defaults to 25GB on all instance types. On Database instances this volume has some additional uses:

  • For Postgres and MySQL this volume is used for the local staging of logical backups (eybackup) prior to upload to S3. This backup usually consumes space equivalent to about 25% of the total database size on disk; though this can vary widely depending data density and indexing.
  • For MySQL the /mnt volume is also used for disk based temporary tables used by the database during query execution. The amount of space required for this will also vary from application to application. For performance reasons it is typically best to investigate and try to optimize away queries that do use temporary disk tables, but there are valid use cases where these cannot be entirely eliminated.

The /mnt volume can be sized individually to the needs of each database instance. For example, if you have a specific host that always performs your logical backups (eybackup), you may want it's /mnt volume to be larger than the other databases in the environment. The /mnt volume will always default to 25GB so it is important to remain aware of any custom sizing needs when replacing or upgrading instances.


MySQL only: About using a new or existing snapshot for the database replica (slave)

If you have an existing snapshot you can use it instead of taking a new snapshot immediately before creating a new database replica (slave).

Choosing between creating a new snapshot or using an older snapshot is a trade-off:

  • Creating a new snapshot means that the new database replica (slave) has less data from the master binary logs that it has to replay to make the replication. However, creating a snapshot can be very IO intensive on the master and, thus, might be a bad choice for high traffic databases during peak traffic hours. Also, for large databases, the shorter time required for replication to catch up after the database replica (slave) is up might be off-set by the time needed to create the new snapshot.

  • Using an existing snapshot skips creating a new snapshot and instead uses an existing one. This option is only available if the master database has binary log data going back to when the snapshot was created. The new database replica (slave) then has to read and replay more binary log data than if using a new snapshot. The amount of binary log data is entirely dependent on your database write traffic volume, but this method is usually much less IO intensive to the master database than creating a new snapshot.

More information

For more information about... See...
Creating an environment                                          Create an environment.
Increasing the size of a database instance or /db volume                    Monitor database usage and increase the database volume or instance size.               
Make a high availability environment Make a high availability environment.
Upgrade an environment Upgrade an environment.
SSHing into an instance Connect to your instance via SSH.
32-bit and 64-bit instance sizes About instance sizes.
Promoting a database replica Promote a Database Replica.

If you have feedback or questions about this page, add a comment below. If you need help, submit a ticket with Engine Yard Support.


Article is closed for comments.