Worker Allocation on Engine Yard Cloud

Workers are the processes that allow your application to respond to incoming web requests. Whichever application server stack you select, it provides one or more worker processes per instance to run your application.

Standard worker allocations based on instance ECUs and memory configuration

For Passenger, Unicorn, Thin, or Mongrel, worker processes are allocated by an algorithm that considers the ECUs and memory available to the instance, as shown in the table below.

As of the May 21, 2013 stack release, we modified the algorithm that generates worker counts to make better use of the available resources; the result is an increased number of worker processes for most instance sizes.

| Instance Type | Memory (MB) | Swap (MB) | ECUs | Pool Size (Original) | Pool Size (New) |
|---|---|---|---|---|---|
| General Purpose (M1) Small | 1700 | 895 | 1 | 3 | 3 |
| General Purpose (M1) Medium | 3840 | 1343 | 2 | 7 | 7 |
| Compute Optimized (C1) Medium | 1740 | 1343 | 5 | 6 | 6 |
| General Purpose (M1) Large | 7680 | 8192 | 4 | 6 | 8 |
| General Purpose (M1) Extra Large | 15360 | 8192 | 8 | 12 | 16 |
| Compute Optimized (C1) Extra Large | 6941 | 8192 | 20 | 24 | 29 |
| Memory Optimized (M2) Extra Large | 17500 | 8192 | 6.5 | 8 | 13 |
| Memory Optimized (M2) Double Extra Large | 35000 | 8192 | 13 | 8 | 26 |
| Memory Optimized (M2) Quadruple Extra Large | 70000 | 8192 | 26 | 24 | 52 |
| Storage Optimized (Hi1) Quadruple Extra Large | 62000 | 8192 | 35 | 70 | 70 |
| General Purpose (T2) Micro | 998 | 8192 | 3.25 | - | 6 |
| General Purpose (T2) Small | 2004 | 8192 | 3.25 | - | 6 |
| General Purpose (T2) Medium | 3697 | 8192 | 6.5 | - | 13 |
| General Purpose (T2) Large | 8192 | 8192 | 6.5 | - | 13 |
| General Purpose (M3) Medium | 3750 | 8192 | 3 | - | 6 |
| General Purpose (M3) Large | 7444 | 8192 | 6.5 | - | 13 |
| General Purpose (M3) Extra Large | 14960 | 8192 | 13 | - | 26 |
| General Purpose (M3) Double Extra Large | 30000 | 8192 | 26 | - | 52 |
| General Purpose (M4) Large | 7723 | 8192 | 6.5 | - | 13 |
| General Purpose (M4) Extra Large | 15775 | 8192 | 13 | - | 26 |
| General Purpose (M4) Double Extra Large | 31879 | 8192 | 26 | - | 52 |
| General Purpose (M4) Quadruple Extra Large | 64082 | 8192 | 53.5 | - | 100 |
| General Purpose (M4) Decuple (10x) Extra Large | 160561 | 8192 | 124.5 | - | 100 |
| Compute Optimized (C3) Large | 3750 | 8192 | 7 | - | 14 |
| Compute Optimized (C3) Extra Large | 7441 | 8192 | 14 | - | 28 |
| Compute Optimized (C3) Double Extra Large | 14961 | 8192 | 28 | - | 56 |
| Compute Optimized (C3) Quadruple Extra Large | 30058 | 8192 | 55 | - | 100 |
| Compute Optimized (C3) Octuple Extra Large | 60066 | 8192 | 108 | - | 100 |
| Compute Optimized (C4) Large | 3765 | 8192 | 8 | - | 13 |
| Compute Optimized (C4) Extra Large | 7219 | 8192 | 16 | - | 26 |
| Compute Optimized (C4) Double Extra Large | 14767 | 8192 | 31 | - | 52 |
| Compute Optimized (C4) Quadruple Extra Large | 29858 | 8192 | 62 | - | 100 |
| Compute Optimized (C4) Octuple Extra Large | 60050 | 8192 | 132 | - | 100 |
| Memory Optimized (R3) Large | 15026 | 8192 | 6.5 | - | 13 |
| Memory Optimized (R3) Extra Large | 30383 | 8192 | 13 | - | 26 |
| Memory Optimized (R3) Double Extra Large | 61096 | 8192 | 26 | - | 52 |
| Memory Optimized (R3) Quadruple Extra Large | 122519 | 8192 | 52 | - | 100 |
| Memory Optimized (R3) Octuple Extra Large | 245373 | 8192 | 104 | - | 100 |
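
Most of the published counts can be approximated by taking the smaller of a CPU-derived limit and a memory-derived limit, then clamping the result to the pool bounds. The sketch below is a reconstruction based on the tunable parameters described later in this article (workers_per_ecu, worker_memory_size, reserved_memory, swap_usage_percent, min_pool_size, max_pool_size); the actual algorithm runs inside the stack and may differ in details.

```ruby
# Hypothetical reconstruction of the worker-allocation algorithm, using the
# parameter names and defaults documented in this article. This is a sketch
# for reasoning about the table above, not the actual stack code.
DEFAULTS = {
  workers_per_ecu:    2,
  worker_memory_size: 250,   # MB assumed per worker
  reserved_memory:    1500,  # MB held back for the OS and support processes
  swap_usage_percent: 25,
  min_pool_size:      3,
  max_pool_size:      100,
}.freeze

def pool_size(memory_mb:, swap_mb:, ecus:, **overrides)
  p = DEFAULTS.merge(overrides)

  # CPU-derived limit: a fixed number of workers per ECU.
  cpu_limit = (ecus * p[:workers_per_ecu]).floor

  # Memory-derived limit: RAM minus the reservation, plus the slice of
  # swap the workers are allowed to page into.
  usable_mb = (memory_mb - p[:reserved_memory]) +
              swap_mb * p[:swap_usage_percent] / 100.0
  mem_limit = (usable_mb / p[:worker_memory_size]).floor

  [cpu_limit, mem_limit].min.clamp(p[:min_pool_size], p[:max_pool_size])
end

pool_size(memory_mb: 7680, swap_mb: 8192, ecus: 4)   # M1 Large  => 8
pool_size(memory_mb: 6941, swap_mb: 8192, ecus: 20)  # C1 XL     => 29
pool_size(memory_mb: 1700, swap_mb: 895,  ecus: 1)   # M1 Small  => 3 (min_pool_size floor)
```

This formula reproduces most rows of the table, but not all (for example, C4 Large lists 8 ECUs yet 13 workers), so treat it as an approximation for planning rather than a definitive specification.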

About ECUs and vCPUs

We use "vCPUs" as a generic label for the virtual CPU cores (and in some cases hyperthreads) available to your instance. Amazon (the IaaS provider for Engine Yard Cloud) uses the term ECU, or Elastic Compute Unit, to convey the relative compute power available on each instance type. This value differs from the virtual CPU count reported by the operating system (for example, in top or /proc/cpuinfo), but it is a better gauge of the effective compute resources you need. According to Amazon, the definition of an ECU is:

"One EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor."

For the purpose of this feature, you can consider vCPUs to be synonymous with ECUs.

Tune for your application's needs

The values in the table above are based on generic parameters of an idealized application. If your app falls outside these parameters, consider tuning your environment to either:

  • Reduce the worker counts to improve responsiveness because your app requires more resources.
  • Increase available workers for a lightweight app with high requests per minute.

Tune your app using parameters

You can tune your app by setting parameters on your environment. The sections below explain the various parameters that can be tuned. Once you know what values you want tuned, you need to open a support ticket to have them put in effect.

Important: For General Purpose (M1) Medium and Compute Optimized (C1) Medium instances, an override is in place to ensure that the pool sizes produced by the original algorithm are not reduced. If you tune one or more parameters, these overrides are removed and the generated values (based on your updated parameters) are used instead.

pool_size

This is the most direct way of modifying your pool size. By specifying pool_size, you can set it regardless of resource availability. This is recommended only if you are fully aware of your resource needs and don't plan on changing your instance size in the future.

min_pool_size

Slightly less forceful, the min_pool_size parameter ensures your application always has at least this many workers available. This is useful on the smaller instance types if you don't mind overloading the resources; be careful, though: overloaded instances may underperform or even hang, potentially triggering a takeover.

If you find yourself using this and having poor responsiveness or takeovers, please consider upgrading to a larger instance size.

Default: 3

max_pool_size

Likewise, the max_pool_size parameter allows you to ensure you don't have more workers than you need, regardless of resource availability. You may want to do this if your workers connect to external entities that cannot handle too many simultaneous connections.

Default: 100

worker_memory_size

This is the best and most accurate way of tuning your app based on memory usage. The standardized values above are based on an app in which each worker uses 250 MB of memory, which is close to the upper limit of usage we've seen across thousands of apps. If your app differs significantly from this, tune this value to get accurate resource usage calculations.

Default: 250 (MB)
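
To see why this parameter matters, the memory-derived worker limit can be worked through for a single instance type. The arithmetic below assumes the memory budget is RAM minus reserved_memory plus the permitted slice of swap (an assumption based on the parameters in this article, not a documented formula), using an M1 Large (7680 MB RAM, 8192 MB swap):

```ruby
# Memory-derived worker limit on an M1 Large under the assumed budget:
# usable = (RAM - reserved_memory) + swap * swap_usage_percent.
usable_mb = (7680 - 1500) + 8192 * 25 / 100.0   # => 8228.0

(usable_mb / 250).floor  # default 250 MB workers => 32
(usable_mb / 400).floor  # heavier 400 MB workers => 20
```

A worker footprint of 400 MB instead of 250 MB cuts the memory-derived limit by more than a third, which is why mis-stating this value leads to over- or under-allocation.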

workers_per_ecu

Some apps are compute-heavy, while others barely use any of the processor capacity. The standard configuration provides 2 workers per ECU; tuning this to your app's needs can improve its responsiveness.

Default: 2

swap_usage_percent

While going into swap is not a good idea in most cases, providing the optimal number of workers is a balancing act. We have chosen a modest 25% usage of swap by the workers in order to provide a sufficient worker count without excessive paging. If you find that your app is spending a great deal of time in swap, you may want to change this value.

Note: The number of workers available is usually determined by CPU constraints rather than memory constraints, but depending on how many workers you allow per ECU, this may not be the case. Under default settings, this value most significantly impacts the Compute Optimized (C1) Small, Medium, and Extra Large instances.

Default: 25 (%)

reserved_memory

If you are running background workers or other processes that eat into your memory resources, you may want to specify a chunk of memory to reserve. By default, this is set to a generous value that accommodates the OS and support processes.

Default: 1500 (MB)

reserved_memory_solo

Similar to reserved_memory, but it also takes into consideration the database that shares the instance in a single-instance (solo) environment.

Default: 2000 (MB)

db_vcpu_max and db_workers_per_ecu

These settings apply only to single-instance environments. By default, no ECU resources are reserved for the database, but if you have a database-heavy application, it can be beneficial to allocate some of the worker resources to the database instead. db_workers_per_ecu workers are removed from the pool for each ECU, up to db_vcpu_max ECUs.

Default: db_vcpu_max = 0; db_workers_per_ecu = 0.5
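
Putting the solo-specific parameters together, a single-instance allocation might be sketched as follows. This is a hypothetical formula assembled from the parameters in this article (reserved_memory_solo replacing reserved_memory, and db workers subtracted from the CPU-derived limit), not the stack's actual code:

```ruby
# Hypothetical solo-environment variant: the database shares the instance,
# so a larger memory reservation applies and (when db_vcpu_max > 0) some
# worker slots are handed to the database.
def solo_pool_size(memory_mb:, swap_mb:, ecus:,
                   workers_per_ecu: 2, worker_memory_size: 250,
                   reserved_memory_solo: 2000, swap_usage_percent: 25,
                   min_pool_size: 3, max_pool_size: 100,
                   db_vcpu_max: 0, db_workers_per_ecu: 0.5)
  # Workers given up to the database: db_workers_per_ecu for each ECU,
  # capped at db_vcpu_max ECUs.
  db_workers = [ecus, db_vcpu_max].min * db_workers_per_ecu
  cpu_limit  = (ecus * workers_per_ecu - db_workers).floor

  usable_mb = (memory_mb - reserved_memory_solo) +
              swap_mb * swap_usage_percent / 100.0
  mem_limit = (usable_mb / worker_memory_size).floor

  [cpu_limit, mem_limit].min.clamp(min_pool_size, max_pool_size)
end

solo_pool_size(memory_mb: 7680, swap_mb: 8192, ecus: 4)   # M1 Large solo => 8
solo_pool_size(memory_mb: 6941, swap_mb: 8192, ecus: 20)  # C1 XL solo    => 27
```

With the default db_vcpu_max of 0, the only solo difference is the larger memory reservation; raising db_vcpu_max trades web workers for database headroom.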

Planning Tool

The table below shows, for each instance type, the expected number of workers, free memory, and swap usage in clustered and single-instance (solo) environments, based on the default parameter values:

  • pool_size: (unset)
  • min_pool_size: 3
  • max_pool_size: 100
  • worker_memory_size: 250 MB
  • workers_per_ecu: 2
  • reserved_memory: 1500 MB
  • reserved_memory_solo: 2000 MB
  • swap_usage_percent: 25%
  • db_vcpu_max: 0
  • db_workers_per_ecu: 0.5
| Instance Type | Cluster: Workers | Cluster: Free Memory (MB) | Cluster: Swap Use (MB) | Solo: Workers | Solo: Free Memory (MB) | Solo: Swap Use (MB) |
|---|---|---|---|---|---|---|
| M1 Small | 3 | 0 | 550 (61.5%) | 3 | 0 | 1050 (117.3%) |
| M1 Medium | 7 | 590 | 0 (0.0%) | 7 | 90 | 0 (0.0%) |
| C1 Medium | 6 | 0 | 1260 (93.8%) | 6 | 0 | 1760 (131.0%) |
| M1 Large | 8 | 4180 | 0 (0.0%) | 8 | 3680 | 0 (0.0%) |
| M1 Extra Large | 16 | 9860 | 0 (0.0%) | 16 | 9360 | 0 (0.0%) |
| C1 Extra Large | 29 | 0 | 1809 (22.1%) | 27 | 0 | 1809 (22.1%) |
| M2 Extra Large | 13 | 12750 | 0 (0.0%) | 13 | 12250 | 0 (0.0%) |
| M2 Double Extra Large | 26 | 27000 | 0 (0.0%) | 26 | 26500 | 0 (0.0%) |
| M2 Quad Extra Large | 52 | 55500 | 0 (0.0%) | 52 | 55000 | 0 (0.0%) |
| Hi1 Quad Extra Large | 70 | 43000 | 0 (0.0%) | 70 | 42500 | 0 (0.0%) |
| T2 Micro | 6 | 0 | 2002 (24.4%) | 4 | 0 | 2002 (24.4%) |
| T2 Small | 6 | 0 | 996 (12.2%) | 6 | 0 | 1496 (18.3%) |
| T2 Medium | 13 | 0 | 1053 (12.9%) | 13 | 0 | 1553 (19.0%) |
| T2 Large | 13 | 3442 | 0 (0.0%) | 13 | 2942 | 0 (0.0%) |
| M3 Medium | 6 | 750 | 0 (0.0%) | 6 | 250 | 0 (0.0%) |
| M3 Large | 13 | 2694 | 0 (0.0%) | 13 | 2194 | 0 (0.0%) |
| M3 Extra Large | 26 | 6960 | 0 (0.0%) | 26 | 6460 | 0 (0.0%) |
| M3 Double Extra Large | 52 | 15500 | 0 (0.0%) | 52 | 15000 | 0 (0.0%) |
| M4 Large | 13 | 2973 | 0 (0.0%) | 13 | 2473 | 0 (0.0%) |
| M4 Extra Large | 26 | 7775 | 0 (0.0%) | 26 | 7275 | 0 (0.0%) |
| M4 Double Extra Large | 52 | 17379 | 0 (0.0%) | 52 | 16879 | 0 (0.0%) |
| M4 Quadruple Extra Large | 100 | 37582 | 0 (0.0%) | 100 | 37082 | 0 (0.0%) |
| M4 Decuple Extra Large | 100 | 134061 | 0 (0.0%) | 100 | 133561 | 0 (0.0%) |
| C3 Large | 14 | 0 | 1250 (15.3%) | 14 | 0 | 1750 (21.4%) |
| C3 Extra Large | 28 | 0 | 1059 (12.9%) | 28 | 0 | 1559 (19.0%) |
| C3 Double Extra Large | 56 | 0 | 539 (6.6%) | 56 | 0 | 1039 (12.7%) |
| C3 Quadruple Extra Large | 100 | 3558 | 0 (0.0%) | 100 | 3058 | 0 (0.0%) |
| C3 Octuple Extra Large | 100 | 33566 | 0 (0.0%) | 100 | 33066 | 0 (0.0%) |
| C4 Large | 13 | 0 | 985 (12.0%) | 13 | 0 | 1485 (18.1%) |
| C4 Extra Large | 26 | 0 | 781 (9.5%) | 26 | 0 | 1281 (15.6%) |
| C4 Double Extra Large | 52 | 267 | 0 (0.0%) | 52 | 0 | 233 (2.8%) |
| C4 Quadruple Extra Large | 100 | 3358 | 0 (0.0%) | 100 | 2858 | 0 (0.0%) |
| C4 Octuple Extra Large | 100 | 33550 | 0 (0.0%) | 100 | 33050 | 0 (0.0%) |
| R3 Large | 13 | 10276 | 0 (0.0%) | 13 | 9776 | 0 (0.0%) |
| R3 Extra Large | 26 | 22383 | 0 (0.0%) | 26 | 21883 | 0 (0.0%) |
| R3 Double Extra Large | 52 | 46596 | 0 (0.0%) | 52 | 46096 | 0 (0.0%) |
| R3 Quadruple Extra Large | 100 | 96019 | 0 (0.0%) | 100 | 95519 | 0 (0.0%) |
| R3 Octuple Extra Large | 100 | 218873 | 0 (0.0%) | 100 | 218373 | 0 (0.0%) |

Why are some of the swap allocations over 100 percent without tuning?

In the default configuration (no tuned parameters), you might have noticed that some instances appear to be allocated beyond the swap limits of the instance (for example, 131.0%). This is because we wanted to ensure that the minimal worker counts from the previous allocation scheme were not reduced. In practice, this does not cause a problem for the vast majority of our users, because worker_memory_size and reserved_memory rarely reach the upper-limit default values above. However, if these values do cause you issues, this tool and the tuned values it produces allow you to protect your app from being over-allocated.
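
The over-100% figures fall out of the arithmetic once a small instance is pinned at the 3-worker minimum. Assuming swap usage is simply the worker memory footprint that does not fit in RAM after the reservation (an assumption consistent with the figures in this article, not a documented formula), the M1 Small numbers can be reproduced:

```ruby
# Assumed swap accounting: whatever portion of the workers' memory
# footprint exceeds (RAM - reservation) spills into swap.
def swap_use_mb(workers:, memory_mb:, reserved:, worker_memory_size: 250)
  [workers * worker_memory_size - (memory_mb - reserved), 0].max
end

# M1 Small (1700 MB RAM, 895 MB swap), pinned at the 3-worker minimum:
cluster = swap_use_mb(workers: 3, memory_mb: 1700, reserved: 1500)  # => 550
solo    = swap_use_mb(workers: 3, memory_mb: 1700, reserved: 2000)  # => 1050

(cluster / 895.0 * 100).round(1)  # => 61.5
(solo    / 895.0 * 100).round(1)  # => 117.3, i.e. more swap than the instance has
```

The solo case reserves more memory than a 3-worker footprint can spare, so the implied swap demand (1050 MB) exceeds the 895 MB of swap actually available, which is exactly the over-100% figure shown above.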
