Rows get deleted from pemhistory.* tables when a database cleanup job is executed. All the rows with recorded time greater than the data retention time (for that probe) will be deleted. Likewise, Rows will be updated in pemdata.* each time a probe gets executed, in accordance with the probe’s execution frequency.
Because of the above-described write activity, we may face situations where the PEM purge job is taking more time or increasing the CPU utilization. Here are a few of the reasons why this might be happening, as well as solutions for them:
There are too many rows/data for the pem purge activity to clean up. To resolve this, we need set the proper retention policy for the historical records and may need to run the purge job more frequently.
Incorrect retention periods for the pem probes; for example pemhistory.table_statistics and pemhistory.index_statistics store statistics of all the tables in the database, and are checked every 5 minutes by PEM probes. A simple resolution for this issue is to reduce the frequency at which the probes are fired, thus reducing the amount of gathered data and in turn reduce the rate of increase in size of pemhistory.table_statistics.
Tables are not getting analysed properly and hence statistics are not getting updated. This can be resolved by checking the stats table for the autovaccum/analyse status and if needed, perform manual VACUUM or schedule a VACUUM ANALYZE.
Heavy bloating on a table can cause performance issues and load the server for processing cleanup jobs. A common scenario is continuous delete and update operations on tables; this will increase the bloat of the table. We can check the bloat of the table by using the query specified at the link below from the PostgreSQL wiki and depending on the results, determine the maintenace tasks to execute. PostgreSQL wiki: https://wiki.postgresql.org/wiki/Show_database_bloat
Database is not tuned/configured properly. This will require checking the database settings and updating postgresql.conf accordingly.
Concurrent sessions are blocking to the purge cleanup job. We can check pg_stat_activity if any query is blocking the pem purge job. We can also enable the log_lock_waits parameter so that it will start logging everything about the locking and blocking session.