The PostgreSQL Autovacuum Process

Overview

If you use PostgreSQL databases, you should be aware that they require routine vacuuming

A PostgreSQL database within Engine Yard has autovacuum processes that run in the background, performing some of this work for you, working with the default Postgres vacuum configuration parameters. However, these parameters are conservative, which means that dead tuples might not be cleared depending on your database load and the number of deletes and updates.

If that is the case, this will start triggering the alerts, which notify you that your database space is filling up with too many dead tuples, and you may observe some performance issues. The warnings will generally look like this:

Process postgresql-autovac_freeze: POSTGRES_AUTOVAC_FREEZE WARNING: DB 'postgres' management=91%;90;95

 

If you take no action, when the limit is reached your database will continue to accept requests and function, but the database will force an immediate vacuum of the impacted resources which would come at an unpredictable time and could impact performance.

Solution

To address the immediate issue with the autovacuum process of PostgreSQL not keeping up with the database's maintenance requirements, you should consider an ad-hoc manual vacuum during a quiet period.

/usr/bin/vacuumdb -U postgres --all --freeze

Additionally, you should consider the postgresql_maintenance cookbook to schedule a recurring vacuum task for your main databases. Note: v6 repo - v5 repo - for v4 and lower, use ey_cloud_recipes repo. The default schedule, once activated, is to run this process once a week, every Sunday at midnight. You can edit this to any low load time by following the instructions included in the postgresql_maintenance cookbook.

Alternatively (and potentially concurrently), you can tune the autovacuum process to be more aggressive, while simultaneously increasing the threshold at which the alerts are sent. This example configuration could be added to your /db/postgresql/<version>/custom.conf file, although you would need to schedule a time to restart PostgreSQL before these changes would take effect:

# Changes autovacuum to be more aggressive and increases threshold before forced vacuum
vacuum_cost_limit = 800
vacuum_freeze_min_age = 10000000
vacuum_freeze_table_age   =  800000000
autovacuum_freeze_max_age = 1000000000
autovacuum_vacuum_cost_delay = 10ms
autovacuum_max_workers = 6
autovacuum_naptime = 3min
autovacuum_vacuum_scale_factor = 0.02
autovacuum_analyze_scale_factor = 0.02
autovacuum_analyze_threshold = 100
autovacuum_vacuum_threshold = 100

Take the parameters in the example above with care, and consult your DBA for fine-tuning.

<supportagent>

We do not offer fine-tuning within the Support Team (and even Platinum is limited to Database "Maintenance" - i.e., no fine-tuning even for Platinum), but the customer is welcome to request a Professional Services request, as long as they understand this might require a paid engagement. Essentially, Support should tell the customer to run the ad-hoc request and guide them to the Chef docs, potentially doing the ad-hoc ourselves in a time frame agreed upon with the customer, but stop here. Tuning the autovacuum process or a hands-on approach to set up the Chef recipe would require a Professional Services engagement since that is custom application work.

Also, if the customer asks, we can say that the parameters of the alert are calculated through the below database query:

SELECT freez, txns, ROUND(100*(txns/freez::float)) AS perc, datname
FROM
(SELECT foo.freez::int, age(datfrozenxid) AS txns, datname
FROM pg_database d JOIN (SELECT setting AS freez FROM pg_settings
WHERE name = 'autovacuum_freeze_max_age') AS foo
ON (true) WHERE d.datallowconn) AS foo2 ORDER BY 3 DESC, 4 ASC;

</supportagent>

Summary

By scheduling a recurring vacuum task or tuning the autovacuum process to be more aggressive, you can prevent the PostgreSQL database from forcing an immediate vacuum at an unpredictable time, which could impact performance. Additionally, understanding the meaning of the alert percentage can help you better manage your database maintenance.

FAQ

  1. What does the 91% in the alert mean?
    The 91% describes how close the database is to issuing a vacuum operation. We alert this ratio because this vacuum would come at an unpredictable time once you reach 100%, and could impact performance.
  2. What happens if I don't take action when I receive the alert?
    If you take no action, your database will continue to accept requests and function, but the database will force an immediate vacuum of the impacted resources when the limit is reached, which comes at an unpredictable time and will impact performance.
  3. How can I prevent the database from forcing an immediate vacuum?
    You can prevent this by running an ad-hoc vacuum at your preferred time frame, scheduling a recurring vacuum task, or tuning the autovacuum process to be more aggressive.

Comments

Article is closed for comments.