
Autoprewarm: a new functionality in pg_prewarm

EDB Team Member

Autoprewarm

 

PostgreSQL 11 adds a new feature, autoprewarm, to the contrib module pg_prewarm. It automatically warms the shared buffers with the same pages they held before the last server restart. To accomplish this, Postgres now has a background worker that periodically records the contents of the shared buffers in a file, "autoprewarm.blocks", and reloads those pages after the server restarts.

 

Now that we know the basic functionality of autoprewarm, let's dive into the details of how to use this feature.

 

How to enable Autoprewarm?

 

To enable autoprewarm, add pg_prewarm to "shared_preload_libraries". This parameter requires a server restart to take effect.

postgres=# alter system set shared_preload_libraries = 'pg_prewarm';
ALTER SYSTEM

After the restart, there is a new background worker process, "autoprewarm master", which handles automatic prewarming of shared buffers.

$ ps -aef | grep autoprewarm
mithuncy   5453 5446  8 02:56 ?    00:01:15 postgres: autoprewarm master

To be precise, "autoprewarm master" periodically records information about the pages in shared buffers in the file "$PGDATA/autoprewarm.blocks". How often "autoprewarm.blocks" is updated is controlled by the configuration parameter pg_prewarm.autoprewarm_interval. When the server restarts, the master reads "autoprewarm.blocks" and sorts the list of pages to be prewarmed. It then launches one worker per database, one at a time. Each per-database worker (the "autoprewarm worker") loads the pages that belong to its database. Once prewarming is complete, the master goes back to updating "autoprewarm.blocks" periodically.

$ ps -aef | grep autoprewarm
mithuncy   6377 6370  5 03:50 ?    00:00:00 postgres: autoprewarm master
mithuncy   6393 6370 15 03:50 ?    00:00:00 postgres: autoprewarm worker
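
The sort-then-dispatch step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual C code in pg_prewarm. The function name plan_prewarm and the extra database OID 16384 are made up for the example. The article says only that the list is sorted, so the sort key used here (database first, then the remaining fields) is an assumption.

```python
from itertools import groupby
from typing import Iterable, Iterator, List, Tuple

# One entry per page: (database, tablespace, relfilenode, forknum, blocknum)
Block = Tuple[int, int, int, int, int]


def plan_prewarm(blocks: Iterable[Block]) -> Iterator[Tuple[int, List[Block]]]:
    """Yield (database_oid, pages) batches, one batch per database.

    Assumption: pages are sorted by database OID first and then by the
    remaining fields, so all pages of one database end up contiguous.
    """
    ordered = sorted(blocks)  # plain tuple comparison: database is compared first
    for database, group in groupby(ordered, key=lambda block: block[0]):
        # In the real feature, the master would launch one worker here and
        # wait for it to finish before moving on to the next database.
        yield database, list(group)


if __name__ == "__main__":
    dumped = [
        (16384, 1663, 16400, 0, 2),  # 16384 is a hypothetical second database
        (13307, 1663, 16391, 0, 1),
        (16384, 1663, 16400, 0, 1),
        (13307, 1663, 16391, 0, 0),
    ]
    for database, pages in plan_prewarm(dumped):
        print(database, len(pages))
```

Sorting first means each per-database worker gets one contiguous run of pages, which is why a single worker at a time is enough.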

To see the effects of prewarming, I re-ran the following query after a server restart.

postgres=# explain (analyze, buffers) select count(*) from apw_tests;
                          QUERY PLAN
------------------------------------------------------------------------------
Aggregate  (cost=188.44..188.45 rows=1 width=8) (actual time=1.804..1.804 rows=1 loops=1)
   Buffers: shared hit=45
    -> Seq Scan on apw_tests  (cost=0.00..159.75 rows=11475 width=0) (actual time=0.013..1.071 rows=10000 loops=1)
       Buffers: shared hit=45

Here all the buffers are hits: every page was found in the shared buffer cache and none had to be loaded from disk. Without autoprewarm, those pages are not in shared buffers after a restart and have to be read in again, which shows up as "shared read":

postgres=# explain (analyze, buffers) select count(*) from apw_tests;
                             QUERY PLAN
------------------------------------------------------------------------------
Aggregate  (cost=170.00..170.01 rows=1 width=8) (actual time=1.798..1.798 rows=1 loops=1)
   Buffers: shared read=45
   -> Seq Scan on apw_tests  (cost=0.00..145.00 rows=10000 width=0) (actual time=0.021..1.118 rows=10000 loops=1)
      Buffers: shared read=45
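
To compare the two plans above without reading them by eye, a small helper can pull the shared hit and read counts out of the EXPLAIN (ANALYZE, BUFFERS) text and turn them into a hit ratio. This is a minimal sketch. The function names are made up for illustration, and it assumes that the first "Buffers: shared" line belongs to the top plan node and that this node's counts include its children.

```python
import re
from typing import Dict


def shared_buffer_counts(plan_text: str) -> Dict[str, int]:
    """Return the shared 'hit' and 'read' counts from the first Buffers line."""
    for line in plan_text.splitlines():
        # Capture only the "shared ..." part, stopping at a comma that would
        # introduce a "local" or "temp" section.
        match = re.search(r"Buffers:\s*shared\s+([^,]*)", line)
        if match:
            counts = {"hit": 0, "read": 0}
            for key, value in re.findall(r"(\w+)=(\d+)", match.group(1)):
                if key in counts:
                    counts[key] = int(value)
            return counts
    raise ValueError("no 'Buffers: shared' line found")


def hit_ratio(counts: Dict[str, int]) -> float:
    """Fraction of pages that were served from shared buffers."""
    total = counts["hit"] + counts["read"]
    return counts["hit"] / total if total else 0.0


if __name__ == "__main__":
    warm = "Aggregate (...)\n   Buffers: shared hit=45\n"
    cold = "Aggregate (...)\n   Buffers: shared read=45\n"
    print(hit_ratio(shared_buffer_counts(warm)))
    print(hit_ratio(shared_buffer_counts(cold)))
```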

 

Performance

 

Test setup: pgbench prepared read-only tests with 1 client.

Machine: x86_64 8-core Intel machine with 16GB RAM.

Server setup: shared_buffers = 8GB, pgbench scale_factor=300 (entire data fits into shared buffers)

 

TPS was measured every 5 seconds during the run. With autoprewarm, the system reached peak performance immediately after the restart. With autoprewarm disabled, it took almost 300 seconds to reach the same peak TPS.

[Figure: autoprewwarm_perf.png]

 

 

Inside "autoprewarm.blocks"

 

The contents of the file are in a human-readable text format.

<<524288>>
13307,1663,16391,0,524065
13307,1663,16391,0,524066
13307,1663,16391,0,524067
13307,1663,16391,0,524068
13307,1663,16391,0,524069
………………….

The first line gives the total number of pages, and each line after it describes one page. Each page is uniquely identified by the database OID, tablespace OID, relfilenode of the relation, fork number, and block number.
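
Because the format is this simple, the file is easy to inspect with a short script. The following is a minimal sketch of a parser for the layout described above. The names BlockInfo and parse_autoprewarm_blocks are made up for illustration, and the sample uses a shortened header count of 3 rather than the real 524288.

```python
from typing import List, NamedTuple


class BlockInfo(NamedTuple):
    database: int     # database OID
    tablespace: int   # tablespace OID
    relfilenode: int  # relfilenode of the relation
    forknum: int      # fork number (0 is the main fork)
    blocknum: int     # block number within the fork


def parse_autoprewarm_blocks(text: str) -> List[BlockInfo]:
    """Parse the text of an autoprewarm.blocks dump into BlockInfo records."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty dump")

    header = lines[0]
    if not (header.startswith("<<") and header.endswith(">>")):
        raise ValueError(f"unexpected header line: {header!r}")
    expected = int(header[2:-2])  # total number of pages announced by the header

    blocks = []
    for line in lines[1:]:
        fields = line.split(",")
        if len(fields) != 5:
            raise ValueError(f"expected 5 comma-separated fields: {line!r}")
        blocks.append(BlockInfo(*(int(field) for field in fields)))

    if len(blocks) != expected:
        raise ValueError(f"header says {expected} pages, found {len(blocks)}")
    return blocks


if __name__ == "__main__":
    sample = (
        "<<3>>\n"
        "13307,1663,16391,0,524065\n"
        "13307,1663,16391,0,524066\n"
        "13307,1663,16391,0,524067\n"
    )
    for block in parse_autoprewarm_blocks(sample):
        print(block)
```

Checking the header count against the number of parsed lines is a cheap way to notice a truncated copy of the file.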

 

 

Utility Functions

 

  • autoprewarm_start_worker() RETURNS void
    Use this to launch the autoprewarm worker if autoprewarm was not configured at server startup.
  • autoprewarm_dump_now() RETURNS int8
    This updates autoprewarm.blocks immediately. It may be useful if the autoprewarm worker is not currently running but you expect it to run after the next server restart.