
How much of shared_buffers is used by each table, and what happens when we execute a SELECT on a huge table

We have encountered a few scenarios where customers complained that shared_buffers is not being used even though free buffers are available.

Let's see how shared_buffers is used by a select query running on a huge table, and how to get table-wise shared_buffers stats.

 

Assume that you have 4GB of shared_buffers and an idle instance where all the buffers are available/free.

If a user executes select * from <table> on a table that is 2GB in size, only 32 buffers (32 x 8kB = 256kB) are used, rather than loading all of the data into shared_buffers.

 

How can we see this?

 

The pg_buffercache extension provides insight into what shared_buffers currently holds.
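If the extension is not yet installed, one way to set it up and check the free buffers is shown below (column names as in PostgreSQL 10's pg_buffercache view; buffers with a NULL relfilenode are currently unused):

```sql
CREATE EXTENSION IF NOT EXISTS pg_buffercache;

-- Count buffers that are not holding any relation's data, i.e. free
SELECT count(*) AS free_buffers
  FROM pg_buffercache
 WHERE relfilenode IS NULL;
```

On an idle instance that has just started, this count should be close to the total number of buffers (shared_buffers / 8kB).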

 

Execute select * from the table for the first time and note that all the data is read from disk ("shared read"):

edb=# explain (analyze, verbose, buffers) select * from table_load_sb ;
                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on enterprisedb.table_load_sb  (cost=0.00..389457.06 rows=10999902 width=14) (actual time=0.507..836.343 rows=11000000 loops=1)
   Output: id, name
   Buffers: shared read=59460
   I/O Timings: read=302.248
 Planning time: 1.177 ms
 Execution time: 1156.976 ms
(6 rows)

Run the query again:

edb=# explain (analyze, verbose, buffers) select * from table_load_sb ;
                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on enterprisedb.table_load_sb  (cost=0.00..389457.06 rows=10999902 width=14) (actual time=0.336..801.014 rows=11000000 loops=1)
   Output: id, name
   Buffers: shared hit=32 read=59428
   I/O Timings: read=276.691
 Planning time: 0.037 ms
 Execution time: 1113.679 ms
(6 rows)

We can see that 32 buffers are now hit in the cache, i.e. (32 x 8) kB = 256kB.

 

It seems PostgreSQL acts smartly and decides not to fill all of shared_buffers with a single table's data, which could impact other queries running at the same time. But when exactly does PostgreSQL decide to use the 32-buffer approach rather than loading the table into shared_buffers? Generally, a small table (say, around 300MB with this configuration) can be loaded entirely into shared_buffers.

 

After a few test runs comparing shared_buffers against table size, I found that if the table size is less than 25% of shared_buffers, PostgreSQL tries to load the data into shared_buffers; but once the table size exceeds 25% of shared_buffers, the select query uses 32 buffers on each run.
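Assuming the 25% figure above holds, you can check in advance whether a given table will be read through the small 32-buffer set rather than cached wholesale. This query uses the example table table_load_sb from above (pg_size_bytes is available from PostgreSQL 9.6 onward):

```sql
SELECT pg_size_pretty(pg_table_size('table_load_sb'))                  AS table_size,
       pg_size_pretty(pg_size_bytes(current_setting('shared_buffers'))) AS shared_buffers,
       -- true when the table is over 25% of shared_buffers,
       -- i.e. when we expect the 32-buffer behaviour
       pg_table_size('table_load_sb') >
       pg_size_bytes(current_setting('shared_buffers')) / 4             AS expect_32_buffers;
```

With 4GB of shared_buffers and the 2GB table from the example, expect_32_buffers comes out true.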

 

If the user wants to load a table regardless of its size, pg_prewarm can be used to tell Postgres to load the table data into shared_buffers:

https://www.postgresql.org/docs/10/pgprewarm.html
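For example, using the same table as above (pg_prewarm returns the number of blocks loaded; note that prewarming a table larger than shared_buffers will simply evict other data):

```sql
CREATE EXTENSION IF NOT EXISTS pg_prewarm;

-- Load table_load_sb into shared_buffers; returns the number of 8kB blocks read
SELECT pg_prewarm('table_load_sb');
```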

 

To find the table-wise shared_buffers stats, use the below query:
edb=# SELECT c.relname, count(*) AS buffers
             FROM pg_buffercache b INNER JOIN pg_class c
             ON b.relfilenode = pg_relation_filenode(c.oid) AND
                b.reldatabase IN (0, (SELECT oid FROM pg_database
                                      WHERE datname = current_database()))
             GROUP BY c.relname
             ORDER BY 2 DESC ;
                 relname                  | buffers 
------------------------------------------+---------
 pg_proc                                  |     140
 pg_attribute                             |      33
 table_load_sb                            |      32
 pg_class                                 |      17

Each buffer is 8kB in size.
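The 8kB buffer/block size can be confirmed on any server; it is fixed when PostgreSQL is built (8kB is the default), so on most installations you will see:

```sql
SHOW block_size;   -- reported in bytes, typically 8192
```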
Version history
Revision #: 3 of 3
Last update: 05-10-2019 10:15 AM