cancel
Showing results for 
Search instead for 
Did you mean: 

My 3 Wishes for RDS, Fulfilled by CDS

Moderator

 

Most of us have worked on a traditional database environment (bare metal or virtualized machines) where a DBA has to work on setting up the database, maintaining every operational detail from trivial to most critical, provisioning the database for developers, setting up backup and recovery, HA, and working on performance tuning, just to name a few. This requires a good amount of time and could ultimately be a factor in distracting one’s focus from more critical tasks.

 

Fortunately, with the recent DevOps motion and cloud use emphasis on the rise, doing these operations is now just a few quick clicks away, thus allowing a DBA to spend more time on important tasks such as capacity planning, performance tuning and supporting future projects.  However, this newfound freedom comes with a tradeoff; on one hand, using traditional systems provides the DBA with more access to the host operating system, whereas, on the other hand, that access becomes more restricted in one way or another with most cloud services.

 

Thus, the question remains...is it actually advantageous  to run critical workloads using Postgres as a Service?

 

EDB Support has seen repeated customer situations where having a healthy database system becomes an especially crucial element to avoid putting the business at risk. One important factor in ensuring a healthy database system involves having the tools and access necessary to perform any task required, whether it is in the database itself or in the host operating system it is running on.  Without this access DBAs and sysadmins are left with their hands bound and at the mercy of the service running the database itself.

 

My intent behind this blog post is more about giving perspective and thoughts to potential problems a real system can have and not about addressing how to fix it. I am going to share a couple of production scenarios which we have seen with customers in the past, in an effort to help you understand when access to the host (directly or indirectly) is needed:

 

  1. Database crash/recovery issue or potential bug
  2. Database corruption troubleshooting
  3. Transaction wraparound or vacuum problem

  

Now let's look at each one to see how this impact us.

 

1. Database crash/recovery-mode issue or potential bug

 

A database crash/recovery issue can't be troubleshot directly with the Postgres database logs. You need to have strace or GDB output to understand more about what's happening at the system level and what the actual problem is behind a database crash/recovery-mode issue, and to reproduce/identify bugs in the code.

 

2. Database corruption troubleshooting

 

Have you ever been in a situation where you were unable to read data from a table or where the Postgres catalog has corruption?  I have.  

 

Basic troubleshooting involves verifying the catalog corruption, dumping (reading) the table and, in advanced troubleshooting, reading the block itself from the disk to understand if the header is corrupted. One can do all this only with the appropriate level of access to the backend to troubleshoot the problem and identify what went wrong. 

 

3. Transaction wraparound or vacuum problem

 

Transaction wraparound is a known issue with Postgres. You can certainly mitigate its effects by tuning your autovacuum settings and make it more aggressive, or do manual vacuum as a maintenance step. However, if your database isn’t able to keep up and take care of the transaction wraparound, you may also find yourself in a situation where the database is not accepting any new operations.

 

ERROR: database is not accepting commands to avoid wraparound data loss in database "ptp"

HINT: Stop the postmaster and vacuum that database in single-user mode.

You might also need to commit or rollback old prepared transactions.

 

How do you fix a problem like this?

 

The easiest way to handle this is to run Postgres in single-user mode and then perform a vacuum analysis on the database. But not having access to the backend will also prevent you from performing this operation.

 

Conclusion

 

As you can see, it's really important to understand all aspects before making a decision to choose any of the container based services currently offered by different cloud providers.

 

Having said that, you can still leverage all the managed service features like autoscaling, load balancing, automatic backup, and replication without having to forfeit handling serious database issues when using the Cloud Database Service (CDS) provided by EnterpriseDB.