Worker Counts - current instances and stacks

Note: This article provides details for 4th and 5th generation AWS instances. Stack release stable-v5-3.0.58 introduced a small worker count tweak for 3rd and 4th generation instances. For worker counts for 3rd and 4th generation instances on older v5 and v4 stacks, as well as older instance types, please see this article.

Update, April 23rd 2020: As of the stable-v6-1.0.33 release, the worker configuration can be set via environment variables. Customers with environments running on the stable-v6 stack should upgrade to this version in order to set the configuration themselves, whilst customers on older major stack versions should continue to contact Support to request that the worker configuration be set in their metadata.

Worker definition

Workers are the processes that allow your application to respond to incoming web requests. Whichever application server stack you select provides one or more workers per instance to run your application.

Standard worker allocations based on instance ECUs and memory configuration

For Passenger or Unicorn, the number of worker processes is determined by an algorithm that takes into account the ECUs and memory available to the instance, as shown in the table below (a sketch of the calculation follows the table).

Instance Type                          Memory (GiB)   Swap (MB)   ECU      Pool Size
General Purpose (T2) Micro             1              8192        3.25*    6
General Purpose (T2) Small             2              8192        3.25*
General Purpose (T2) Medium            4              8192        6.5*     14
General Purpose (T2) Large             8              8192        6.5*     14
General Purpose (T2) Extra Large       16             8192        13*      28
General Purpose (T2) 2x Extra Large    32             8192        26*      56
General Purpose (T3) Micro             1              8192        8*
General Purpose (T3) Small             2              8192        8*       10
General Purpose (T3) Medium            4              8192        8*       16
General Purpose (T3) Large             8              8192        8*       16
General Purpose (T3) Extra Large       16             8192        16*      32
General Purpose (T3) 2x Extra Large    32             8192        32*      64
General Purpose (M4) Large             8              8192        6.5      13
General Purpose (M4) Extra Large       16             8192        13       26
General Purpose (M4) 2x Extra Large    32             8192        26       52
General Purpose (M4) 4x Extra Large    64             8192        53.5     100
General Purpose (M4) 10x Extra Large   160            8192        124.5    100
General Purpose (M4) 16x Extra Large   256            8192        188      100
General Purpose (M5) Large             8              8192        8        16
General Purpose (M5) Extra Large       16             8192        16       32
General Purpose (M5) 2x Extra Large    32             8192        31       62
General Purpose (M5) 4x Extra Large    64             8192        60       100
General Purpose (M5) 8x Extra Large    128            8192        128      100
General Purpose (M5) 12x Extra Large   192            8192        173      100
General Purpose (M5) 24x Extra Large   384            8192        345      100
Compute Optimized (C4) Large           3.75           8192        8        15
Compute Optimized (C4) Extra Large     7.5            8192        16       29
Compute Optimized (C4) 2x Extra Large  15             8192        31       59
Compute Optimized (C4) 4x Extra Large  30             8192        62       100
Compute Optimized (C4) 8x Extra Large  60             8192        132      100
Compute Optimized (C5) Large           4              8192        9        15
Compute Optimized (C5) Extra Large     8              8192        17       30
Compute Optimized (C5) 2x Extra Large  16             8192        34       61
Compute Optimized (C5) 4x Extra Large  32             8192        68       100
Compute Optimized (C5) 9x Extra Large  72             8192        141      100
Compute Optimized (C5) 18x Extra Large 144            8192        281      100
Memory Optimized (R4) Large            15.25          8192        7        14
Memory Optimized (R4) Extra Large      30.5           8192        13.5     27
Memory Optimized (R4) 2x Extra Large   61             8192        27       54
Memory Optimized (R4) 4x Extra Large   122            8192        53       100
Memory Optimized (R4) 8x Extra Large   244            8192        99       100
Memory Optimized (R4) 16x Extra Large  488            8192        195      100
Memory Optimized (R5) Large            16             8192        10       20
Memory Optimized (R5) Extra Large      32             8192        19       38
Memory Optimized (R5) 2x Extra Large   64             8192        38       76
Memory Optimized (R5) 4x Extra Large   128            8192        71       100
Memory Optimized (R5) 8x Extra Large   256            8192        128      100
Memory Optimized (R5) 12x Extra Large  384            8192        173      100
Memory Optimized (R5) 16x Extra Large  512            8192        256      100
Memory Optimized (R5) 24x Extra Large  768            8192        347      100
High I/O (I3) Large                    15.25          8192        7        14
High I/O (I3) Extra Large              30.5           8192        13       26
High I/O (I3) 2x Extra Large           61             8192        27       54
High I/O (I3) 4x Extra Large           122            8192        53       100
High I/O (I3) 8x Extra Large           244            8192        99       100
High I/O (I3) 16x Extra Large          488            8192        200      100
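
To make the calculation concrete, here is a minimal sketch of the allocation heuristic as we understand it from the tunable parameters described later in this article (workers_per_ecu, worker_memory_size, reserved_memory, swap_usage_percent, min_pool_size and max_pool_size). It is an approximation rather than the production recipe: the table above was generated by the stack cookbooks themselves, and a few rows differ slightly from this calculation in rounding and in how instance memory is measured.

    # Approximate worker-count heuristic, reconstructed from the parameter
    # descriptions in this article. Defaults mirror the documented values.
    def pool_size(ecu, memory_gib, swap_mb=8192,
                  workers_per_ecu=2, worker_memory_size=250,
                  reserved_memory=1500, swap_usage_percent=20,
                  min_pool_size=3, max_pool_size=100):
        # CPU-based cap: a fixed number of workers per ECU.
        cpu_cap = ecu * workers_per_ecu
        # Memory-based cap: RAM minus a reserve, plus a slice of swap,
        # divided by the expected per-worker footprint (MB).
        usable_mb = (memory_gib * 1024 - reserved_memory
                     + swap_mb * swap_usage_percent / 100)
        memory_cap = usable_mb / worker_memory_size
        # The lower cap wins, clamped to the configured bounds.
        return int(max(min_pool_size, min(cpu_cap, memory_cap, max_pool_size)))

    print(pool_size(ecu=6.5, memory_gib=8))   # m4.large -> 13, matching the table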
About ECUs and vCPUs

We use "vCPUs" as a generic label for the virtual CPU cores (and, in some cases, hyperthreads) available to your instance. Amazon (the IaaS provider for Engine Yard Cloud) uses the term ECU, or Elastic Compute Unit, to convey the relative compute power available on each instance type. This value differs from the virtual CPU count reported by the operating system, for example in top or /proc/cpuinfo, but is a better guide to the effective compute resources you need. According to Amazon, the definition of an ECU is:

"One EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor."

* Note: Due to their burstable nature, there are no official ECU figures for T2 and T3 instances. For the purposes of the worker count calculation we therefore assume a value based on the vCPU count for the instance size. If your workers utilise high levels of CPU, you may need to lower the worker count to keep CPU usage under the baseline and avoid bursting during standard usage.

Tune for your application's needs

The values in the table above are based on generic parameters of an idealized application. If your app falls outside these parameters, you should consider tuning your environment to either:

  • Reduce the worker counts to improve responsiveness because your app requires more resources.
  • Increase available workers for a lightweight app with high requests per minute.

Tune your app using parameters

You can tune your app by setting parameters on your environment. The sections below explain the parameters that can be tuned. Once you know which values you want changed, open a support ticket to have them put into effect (or, if you are on stable-v6-1.0.33 or later, set them via environment variables as noted above).

pool_size

This is the most direct way of modifying your pool size. By specifying pool_size, you set the worker count outright, regardless of resource availability. This is recommended only if you fully understand your resource needs and don't plan to change your instance size in the future.

min_pool_size

Slightly less forceful, the min_pool_size parameter ensures your application always has a minimum number of workers available. This is useful on the smaller instance types if you don't mind overloading the resources. Be careful, though: overloaded instances may underperform or even hang, potentially triggering a takeover.

If you find yourself using this and having poor responsiveness or takeovers, please consider upgrading to a larger instance size.

Default: 3

max_pool_size

Likewise, the max_pool_size parameter allows you to ensure you don't have more workers than you need, regardless of resource availability. You may want to do this if your workers connect to external entities that cannot handle too many simultaneous connections.

Default: 100

worker_memory_size

This is the best and most accurate way of tuning your app based on memory usage. The standardized values above assume an app in which each worker uses 250MB of memory, which is close to the upper limit of usage we've seen across thousands of apps. If your app differs significantly from this, it is best to tune this value to get accurate resource usage calculations.

Default: 250 (MB)
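
As a rough illustration using the memory-cap arithmetic from the sketch above: on a hypothetical 8 GiB instance with the default 8192 MB swap file, doubling worker_memory_size roughly halves the memory-based cap.

    # Memory-based worker cap for an 8 GiB instance under default settings.
    usable_mb = 8 * 1024 - 1500 + 8192 * 0.20   # RAM minus reserve, plus usable swap
    print(int(usable_mb / 250))                 # ~33 workers at 250 MB each
    print(int(usable_mb / 500))                 # ~16 workers at 500 MB each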

workers_per_ecu

Some apps are compute heavy, whereas others barely use any of the processor capacity. The standard configuration provides 2 workers per ECU; tuning this to your app's needs can improve its responsiveness.

Default: 2
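
For example, on an m4.xlarge (13 ECU) the default of 2 workers per ECU yields the 26 workers shown in the table, while a compute-heavy app might be better served by 1 worker per ECU (an illustrative choice, not a recommendation):

    # CPU-based worker cap on an m4.xlarge (13 ECU).
    print(13 * 2)   # default workers_per_ecu=2 -> 26 workers, as in the table
    print(13 * 1)   # a compute-heavy app at 1 worker per ECU -> 13 workers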

swap_usage_percent

While going into swap is not a good idea in most cases, providing the optimal number of workers is a balancing act. We allow the workers a modest slice of swap in order to provide a sufficient worker count without excessive paging. If you find that your app is spending a great deal of time in swap, you may want to lower this value.

Note: It is usually CPU constraints rather than memory constraints that determine the number of workers available, but depending on how many workers you allow per ECU, this may not be the case. Under default settings, this value most significantly affects the Compute Optimized instances.

Default: 20 (%)
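
On memory-constrained instances, the swap allowance is what lifts the worker count. For a c4.large (3.75 GiB), the default 20% of the 8192 MB swap file adds roughly six workers over a no-swap configuration (a sketch using the arithmetic above):

    # Effect of swap_usage_percent on a c4.large (3.75 GiB RAM, 8192 MB swap).
    ram_mb, swap_mb = 3.75 * 1024, 8192
    for pct in (20, 0):
        usable_mb = ram_mb - 1500 + swap_mb * pct / 100
        print(pct, int(usable_mb / 250))   # 20% -> 15 workers; 0% -> 9 workers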

reserved_memory

If you are running background workers or other processes that eat into your memory resources, you may want to specify a chunk of memory to reserve. By default, this is set to a generous value that accommodates the OS and support processes.

Default: 1500 (MB)

reserved_memory_solo

Similar to reserved_memory, but also takes into consideration the database that shares the instance in a single-instance (solo) environment.

Default: 2000 (MB)

db_vcpu_max and db_workers_per_ecu

These settings apply only to single-instance environments. By default, no ECU resources are reserved for the database, but if you have a database-heavy application it can be beneficial to allocate some of the worker resources to the database instead. The number of workers given by the db_workers_per_ecu parameter is removed for each ECU, up to the db_vcpu_max value (see the sketch after the defaults below).

Default: db_vcpu_max = 0; db_workers_per_ecu = 0.5
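
Our reading of that description, as a sketch (the production behaviour may differ in detail): db_workers_per_ecu workers are subtracted from the CPU-based cap for each ECU, up to db_vcpu_max ECUs. The values below are illustrative, not recommendations.

    # Hypothetical solo c5.xlarge (17 ECU) reserving CPU for the database.
    ecu, workers_per_ecu = 17, 2
    db_vcpu_max, db_workers_per_ecu = 4, 0.5
    cpu_cap = ecu * workers_per_ecu - min(ecu, db_vcpu_max) * db_workers_per_ecu
    print(cpu_cap)   # 34 - 2.0 = 32.0 workers, before the memory cap applies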

Planning Tool

The original page hosts an interactive form that determines how many workers your instances will have, and how much free memory and swap usage to expect. It takes the parameters described above (pool_size, min_pool_size, max_pool_size, worker_memory_size, workers_per_ecu, reserved_memory, reserved_memory_solo, swap_usage_percent, db_vcpu_max and db_workers_per_ecu), falling back to the defaults for any field left blank, and shows the calculated values per instance type for both cluster instances and single instances.
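
In place of the form, the sketch below approximates what it computes by combining the parameters above for both cluster and solo instances. It is our reconstruction built on the documented defaults, not the exact production calculator, so expect occasional small differences from the official table.

    # Approximate stand-in for the planning tool (our reconstruction).
    DEFAULTS = dict(min_pool_size=3, max_pool_size=100, worker_memory_size=250,
                    workers_per_ecu=2, reserved_memory=1500, reserved_memory_solo=2000,
                    swap_usage_percent=20, db_vcpu_max=0, db_workers_per_ecu=0.5)

    def plan(ecu, memory_gib, swap_mb=8192, solo=False, pool_size=None, **overrides):
        p = {**DEFAULTS, **overrides}
        if pool_size is not None:                 # an explicit pool_size wins outright
            return pool_size
        cpu_cap = ecu * p["workers_per_ecu"]
        if solo:                                  # reserve CPU for the co-located database
            cpu_cap -= min(ecu, p["db_vcpu_max"]) * p["db_workers_per_ecu"]
        reserve = p["reserved_memory_solo"] if solo else p["reserved_memory"]
        mem_cap = (memory_gib * 1024 - reserve
                   + swap_mb * p["swap_usage_percent"] / 100) / p["worker_memory_size"]
        return int(max(p["min_pool_size"], min(cpu_cap, mem_cap, p["max_pool_size"])))

    print(plan(ecu=13, memory_gib=16))              # m4.xlarge, cluster -> 26
    print(plan(ecu=13, memory_gib=16, solo=True))   # solo: larger reserve, still 26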

If you have feedback or questions about this page, add a comment below. If you need help, submit a ticket with Engine Yard Support.
