We are trying to run a 1,000-user load test via JMeter; the ramp-up is gradual and will hit the 1,000-user mark over a period of time. I have run the postgresqltuner script, which lists issues with our Postgres configuration, and made one change to the Linux OS: set
/proc/sys/vm/overcommit_memory to 2.
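(A minimal sketch of how such a change is typically applied on Linux; run as root, and note the persistence file path may differ per distro:)

```shell
# Switch to strict overcommit accounting: the kernel refuses allocations
# once the commit limit is reached instead of overcommitting memory.
sysctl -w vm.overcommit_memory=2

# Persist across reboots (path is the conventional location; some distros
# use a drop-in under /etc/sysctl.d/ instead)
echo "vm.overcommit_memory = 2" >> /etc/sysctl.conf

# Verify the current value
cat /proc/sys/vm/overcommit_memory
```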
The Postgres configuration is below (apologies for including the complete configuration):
listen_addresses = '*'
max_connections = 1000
superuser_reserved_connections = 1
shared_buffers = 12GB
huge_pages = try
work_mem = 10485kB
maintenance_work_mem = 1536MB
autovacuum_work_mem = -1
dynamic_shared_memory_type = posix
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.max = 1000
pg_stat_statements.track = 'top'
pg_stat_statements.save = off
effective_io_concurrency = 2
max_worker_processes = 8
wal_level = hot_standby
fsync = on
wal_buffers = 16MB
commit_delay = 0
commit_siblings = 5
max_wal_size = 2GB
min_wal_size = 1GB
checkpoint_completion_target = 0.8
max_wal_senders = 3
max_replication_slots = 5
random_page_cost = 4.0
effective_cache_size = 16GB
default_statistics_target = 100
log_destination = 'stderr'
log_directory = 'pg_log'
log_filename = 'postgresql-%a.log'
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 0
log_min_duration_statement = 1000
log_hostname = off
log_line_prefix = '< %m >'
log_timezone = 'America/New_York'
autovacuum = on
log_autovacuum_min_duration = 1s
datestyle = 'iso, mdy'
timezone = 'America/New_York'
lc_messages = 'en_US.UTF-8'
lc_monetary = 'en_US.UTF-8'
lc_numeric = 'en_US.UTF-8'
lc_time = 'en_US.UTF-8'
default_text_search_config = 'pg_catalog.english'
We are running in hot standby with a secondary and fsync on.
The servers are in AWS:
DB - 32GB RAM / 8 vCPUs / 50GB SSD disk.
After changing the Linux overcommit_memory setting from 0 to 2, we get out-of-memory exception errors.
We are wondering whether the issue lies in the way our Postgres environment is configured or in the underlying Linux OS.
The queries hitting the database are mostly long SELECT queries with lots of nested joins.
Any help is much appreciated.
What is the value of vm.overcommit_ratio when you set overcommit_memory to 2?
Also, is swap space allocated on this machine?
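(Both are quick to check with standard Linux interfaces:)

```shell
# How much swap, if any, is configured on this machine
awk '/^SwapTotal:/ {print $2, "kB of swap"}' /proc/meminfo

# No output from swapon means no active swap devices
swapon --show
```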
overcommit_ratio is 50 (the default) and hasn't been changed.
Swap space is also not configured. In terms of the Postgres configuration, we are not sure whether to lower work_mem or keep it as is.
When you set overcommit_memory to 2, the Linux kernel will not overcommit memory: it refuses allocations once the commit limit (derived from swap plus overcommit_ratio percent of RAM) is reached.
Could you please test with overcommit_ratio set to 80, which will allow the system to commit up to 80% of RAM?
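To see why the default bites here: with no swap and the default ratio of 50, the commit limit on a 32GB box is roughly 16GB, which shared_buffers = 12GB plus per-backend allocations can exhaust quickly. A rough check (the kernel's own CommitLimit also subtracts huge pages, so the figures may differ slightly):

```shell
#!/bin/sh
# Approximate commit limit under vm.overcommit_memory=2:
#   CommitLimit ~ SwapTotal + MemTotal * overcommit_ratio / 100
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
swap_kb=$(awk '/^SwapTotal:/ {print $2}' /proc/meminfo)
ratio=$(cat /proc/sys/vm/overcommit_ratio)
limit_kb=$((swap_kb + mem_kb * ratio / 100))
echo "Approx commit limit: $((limit_kb / 1024)) MB"

# Compare with the kernel's own figure
awk '/^CommitLimit:/ {print "Kernel CommitLimit:", $2/1024, "MB"}' /proc/meminfo
```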
Sure, let me give it a try.
With regard to seq_page_cost and random_page_cost, is it wise to set them to the same value? I have seen examples where random_page_cost is set to 1.1 and seq_page_cost is commented out.
Do you also see any issues with my Postgres configuration?
You can set seq_page_cost and random_page_cost to the same value if your DB is entirely cached in RAM. For SSD-backed storage, setting random_page_cost to 1.1 is reasonable.
The other DB configuration parameters look good; however, adjust them according to your benchmark tests.
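For example (assuming superuser access via psql; connection flags omitted):

```shell
# Lower random_page_cost for SSD storage; ALTER SYSTEM writes the value
# to postgresql.auto.conf, and pg_reload_conf() applies it without a restart.
psql -c "ALTER SYSTEM SET random_page_cost = 1.1;"
psql -c "SELECT pg_reload_conf();"
psql -c "SHOW random_page_cost;"
```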
Our DB is approx. 30GB in size and RAM is 32GB, so we are not entirely sure whether it would be cached in RAM.
We also have wal_level set to hot_standby; do we need to make changes on the secondary database server, or can that remain as is?
A lot of our transactions are large SQL statements with nested joins and hundreds of parameters. What is the best practice, in terms of Postgres parameters, for managing queries that large?
It's better to have the same settings on the standby server as well, since it will become the master in case of a failover/switchover.
For large queries with joins and sorts, it's better to set a higher work_mem at the session level.
However, DB tuning is largely trial and error: test, change the parameters, test again.
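A session-level override looks like this (hypothetical psql invocation; the SELECT is a placeholder for one of your large nested-join queries):

```shell
# Raise work_mem only for this heavy session; the global setting stays
# small so that 1,000 concurrent backends cannot exhaust memory.
psql <<'SQL'
SET work_mem = '256MB';        -- session scope only, reverts on disconnect
EXPLAIN (ANALYZE, BUFFERS)
SELECT ...;                    -- your large nested-join query here
SQL
```

Remember that work_mem is per sort/hash operation, not per connection, so a single complex plan may use several multiples of it.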