tl;dr:This article is about usingpg_stop_backup()when setting up Streaming Replication. It is not an article about backup/restore methodology or policy
In a recent support case, I came across a customer who used a clever way to create streaming replication base backups–by taking a Google Cloud instance and cloning it. With the proliferation of cloud computing, it’s very convenient to be able to create a block-level clone of a VM within minutes or even seconds, and it would be much faster than any program like scp or rsync. They had found it to be faster thanpg_basebackupfor sure, on the order of several minutes for a ~50GB database. Basically, they would start a base backup, clone the VM, and then stand up the clone as a streaming replication standby. Unfortunately, for some reason, they could not usepsqlto log in to the standby–they would simply see the following error:
It was really strange. If you go to the primary and do aSELECT * FROM pg_stat_replication, you’ll see that while WAL is advancing on the primary, it’s getting replayed on the standby–the data stream is flowing, and replication is working, but yet we’re not able to log in to the standby to run read-only queries.
What’s going on?
A clue into this is that in a typical streaming replication instance, you’ll see the following in your log on startup:
On this customer’s instance, we weren’t seeing the last two lines (consistent recovery state reached...anddatabase system is ready to accept read only connections). Apparently, the standby wasn’t in a consistent state with the primary.
But, it LOOKS consistent…
One may argue that if you look inpg_stat_replication, all the evidence points to the idea that the standby IS in a consistent state with the primary. It’s replaying all the primary’s WAL. The LSN is advancing on both the standby and the primary–how could it NOT be consistent? To the human eye and the human intuition, things are consistent, as evidenced by the advancing LSN, but to a machine, it may not know that. Recall that if not usingpg_basebackup, the proper steps to setting up a Streaming Replication standby involves the following steps:
Executepg_start_backup('any_label')on the primary
Copy all the files in the primary’s$PGDATAdirectory, including WAL files
Executepg_stop_backup()on the primary
Set uprecovery.confon the standby (and delete postmaster.pid, set hot_standby=on, etc.)
Start up Postgres on the standby
Apparently, the customer had neglected to execute thepg_stop_backup()step, which left the standby in a state of technically perpetual inconsistency. This is because thepg_stop_backup()step writes aBACKUP_ENDentry into the WAL stream, which lets the standby know that it is done replaying all the copied WAL from step 2, and has now technically reached a consistent state, and can allow read-only connections. Without thisBACKUP_ENDentry, it will never know whether it has replayed all the WAL during the copy (what if the copy took a whole year to process?). ThisBACKUP_ENDentry is the foolproof way for Postgres to ensure a consistent state between the primary and standby.
The moral of the story:Be sure to stop your backups!When setting up a streaming replication standby, it is imperative to executeSELECT pg_stop_backup()after copying all of$PGDATA; without it, you’ll never be able to log in to your standby and run your read-only queries.