Replication Server redundancy

Level 3 Adventurer

Replication Server redundancy

Hello,

 

We are planning an MMR (multi-master replication) setup across 2 regions; each region would have 1 Master and 1 Slave.

Q1: Does 1 instance of Replication Server do the replication in both directions?

 

To achieve Replication Server HA, we plan to install Replication server on all 4 servers, so that if the Replication server on Region1-Master goes down then it can run on Region1-Slave OR if the entire Region1 goes down then it can run on Region2-Master or Slave.

Q2: Can failover of Replication server be automated in any way? OR will we need to manually start the Replication server on Region1-Slave or Region2 nodes? Basically, does EDB provide any native redundancy or high availability for the Replication Server?

 

Thanks

EDB Team Member

Re: Replication Server redundancy

Welcome to PostgresRocks Community
@pcpg wrote:

Hello,

 

We are planning to have MMR setup across 2 regions, each region would have 1 Master and 1 Slave.

Q1: Does 1 instance of Replication Server do the replication in both directions?

 

Yes, one instance of xDB Replication Server handles data shipping across MMR nodes. 

 

To achieve Replication Server HA, we plan to install Replication server on all 4 servers, so that if the Replication server on Region1-Master goes down then it can run on Region1-Slave OR if the entire Region1 goes down then it can run on Region2-Master or Slave.

 

 

It works, but the downside is that you need to reconfigure the MMR setup on whichever server the next Replication Server takes charge from.

 

This is because Replication Server is based on logical replication slots, and slot information is never replicated to slaves/standbys. You can verify this by querying the "pg_replication_slots" view on the slave/standby.
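A minimal sketch of that check, in Python. It assumes you have already run the query against both servers (for example through psql or any PostgreSQL driver) and collected the rows. The helper name and the slot name are made up for illustration; only the "pg_replication_slots" view and its slot_name, slot_type and active columns come from PostgreSQL itself.

```python
# Query to run on each server; the columns exist in pg_replication_slots.
SLOT_QUERY = "SELECT slot_name, slot_type, active FROM pg_replication_slots"


def missing_logical_slots(master_rows, standby_rows):
    """Return names of logical slots present on the master but not on the standby.

    Each row is a (slot_name, slot_type, active) tuple, matching SLOT_QUERY.
    """
    master = {name for name, slot_type, _ in master_rows if slot_type == "logical"}
    standby = {name for name, slot_type, _ in standby_rows if slot_type == "logical"}
    return sorted(master - standby)


# Hypothetical query results; "xdb_example_slot" is an invented slot name.
master_rows = [("xdb_example_slot", "logical", True)]
standby_rows = []  # the standby reports no slots

print(missing_logical_slots(master_rows, standby_rows))
```

Any name this returns is a slot the Replication Server would expect to find but that would not be there after the standby is promoted.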

 

So, in your case, assume Replication Server on Region1-Master was used to configure MMR between Region1-Master and Region2-Master, and for some reason Region1-Master goes offline. You now have to promote Region1-Slave to Master and start Replication Server on that server. Replication Server then fails to replicate from Region1-Slave (the new master) to Region2-Master, because the slot information stored on Region1-Master is not available on Region1-Slave (which is now the master). MMR replication must therefore be reconfigured.
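The scenario above can be walked through with a toy model. This is only an illustration of the behaviour described in this reply, not how xDB or PostgreSQL is implemented; the Node class, the function names and the slot name are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    role: str                                   # "master" or "slave"
    tables: dict = field(default_factory=dict)  # user data
    slots: set = field(default_factory=set)     # logical replication slots


def stream_to_standby(master, standby):
    """Toy streaming replication: table data is copied, slots are not."""
    standby.tables = dict(master.tables)


def promote(standby):
    """Toy promotion: the standby simply becomes a master."""
    standby.role = "master"


def can_resume_mmr(node, required_slot):
    """Replication can only resume if the node is a master and has the slot."""
    return node.role == "master" and required_slot in node.slots


region1_master = Node("Region1-Master", "master", {"orders": 3}, {"xdb_example_slot"})
region1_slave = Node("Region1-Slave", "slave")

stream_to_standby(region1_master, region1_slave)
promote(region1_slave)

# The data arrived on the new master, but the slot did not.
print(region1_slave.tables == region1_master.tables)
print(can_resume_mmr(region1_slave, "xdb_example_slot"))
```

In this model the promoted node has all the table data but no slot, which is why the MMR setup has to be reconfigured before replication can resume.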

 

Q2: Can failover of Replication server be automated in any way? OR will we need to manually start the Replication server on Region1-Slave or Region2 nodes? Basically, does EDB provide any native redundancy or high availability for the Replication Server?

 

Currently, automated failover/HA is not supported in Replication Server. In every case, the Replication Server has to be started manually and the replication setup reconfigured.

 

Since your architecture aims for no service outage, this may not be a complete solution, but you can try OS-level Active/Passive HA software, because one instance of Replication Server (the edb-xdbpublication service) handles the whole MMR replication system.
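As a rough sketch of the decision an Active/Passive arrangement has to make, assuming a fixed priority order over the four servers and some external health check: the function names and the health map below are hypothetical and are not part of any EDB tool. The sketch only picks where the single service should run and flags that a move implies the manual reconfiguration mentioned above.

```python
def pick_active_node(priority, healthy):
    """Return the first healthy node in priority order, or None if none is healthy."""
    for node in priority:
        if healthy.get(node, False):
            return node
    return None


def plan_failover(priority, healthy, current):
    """Return (target_node, needs_reconfigure).

    needs_reconfigure is True whenever the service has to move to another node,
    since moving it means the MMR setup must be reconfigured by hand.
    """
    target = pick_active_node(priority, healthy)
    return target, target is not None and target != current


priority = ["Region1-Master", "Region1-Slave", "Region2-Master", "Region2-Slave"]

# Hypothetical health state: all of Region1 is down.
healthy = {
    "Region1-Master": False,
    "Region1-Slave": False,
    "Region2-Master": True,
    "Region2-Slave": True,
}

print(plan_failover(priority, healthy, current="Region1-Master"))
```

Only one node is ever selected, which matches the point that a single Replication Server instance handles the whole MMR system.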

 

--Raghav

 

Thanks


 

Level 3 Adventurer

Re: Replication Server redundancy

Thanks Raghav.

1. I guess I can't have the slots pre-configured either on the slave or on Region2-Master, correct?

2. Can we configure 2 SMRs i.e.

SMR1 for replication from Region1 to Region2

SMR2 for replication from Region2 to Region1

(pardon me - probably this may sound a bit absurd/dumb but I am trying to draw a parallel to GoldenGate in the Oracle land wherein we have GG instance on Region1 that replicates from Region1 to Region2  and GG instance on Region2 that replicates in the opp direction)

 

Thanks again.

EDB Team Member

Re: Replication Server redundancy


@pcpg wrote:

Thanks Raghav.

1. I guess I can't have the slots pre-configured either on the slave or on Region2-Master, correct?

 

Correct, the slots cannot be pre-configured. Also remember that the Slave is READ-ONLY, so no such operation can be performed on that server.

 

2. Can we configure 2 SMRs i.e.

SMR1 for replication from Region1 to Region2

SMR2 for replication from Region2 to Region1

 

You can, but the databases should be different.

 

(pardon me - probably this may sound a bit absurd/dumb but I am trying to draw a parallel to GoldenGate in the Oracle land wherein we have GG instance on Region1 that replicates from Region1 to Region2  and GG instance on Region2 that replicates in the opp direction)

 

No problem. Your questions are absolutely valid, and thank you for posting them here.

EDB Replication Server has its own unique features, and GG likewise has its own set of features. Some features may look similar across the two products, but they address different kinds of requirements.

 

I would recommend engaging our EDB team to discuss your architecture and the requirements you want to meet. Our folks will be happy to reach out to you and propose architectural solutions.

 

Thank you. 

Raghav

Blog: raghavt.blogspot.com


 

Level 3 Adventurer

Re: Replication Server redundancy

Thanks. Please see inline.

 


@Raghav wrote:

@pcpg wrote:

Thanks Raghav.

1. I guess I can't have the slots pre-configured either on the slave or on Region2-Master, correct?

 

Yes, no slot pre-configuration, and also remember Slave is READ-ONLY, so no operation on that server. 

 

But let's say we have only Masters (no Slaves). Then, I believe, the slots would already exist, and starting the redundant xDB Replication Server would not require creating (or recreating) them. Is that right?

 



 

Also, in our case, there is going to be an EFM cluster in each region, and logical replication will use the VIP, i.e. irrespective of whether the source is Node1 or Node2 in Region1, the replication destination will always be Region2-VIP, and the same the other way round. Does this offer any advantage for achieving xDB redundancy? Or does it make no difference, because the slots still need to be created when the promotion happens?

 

Regards

Level 3 Adventurer

Re: Replication Server redundancy

Hi Raghav,

You mentioned:

Replication Server fails to replicate from Region1-Slave(new master) to Region2-Master because the slot information stored in Region1-Master is not available on Region1-Slave(which is now master).

If the slots need to be re-configured on the old slave (i.e. the new master, after promotion), then EFM would not work, would it? (I am not confusing or mixing up EFM and xDB, but) EFM helps achieve auto-failover, correct?

We tested by doing an "EFM promotion" and observed that MMR worked in both directions between the new Master in Region1 and the original Master in Region2. Did that work because we did an EFM promotion (which is like a graceful switchover), or because xDB continued to run on the original master in Region1?

And if we instead simulate one of the 2 scenarios below, then we will need to reconfigure the slots:

- (master) node failure followed by EFM failover with xDB also failing over to new master

- xDB Replication server failure and it failing over to slave node on Region1 or one of the nodes on Region2

 

Thanks

EDB Team Member

Re: Replication Server redundancy


@pcpg wrote:

Hi Raghav,

You mentioned:

Replication Server fails to replicate from Region1-Slave(new master) to Region2-Master because the slot information stored in Region1-Master is not available on Region1-Slave(which is now master).

If the slots need to be re-configured on the old slave (i.e. the new master, after promotion), then EFM would not work, isnt it? (I am not confused between EFM and xDB neither am I mixing up but) EFM helps achieve auto-failover, correct?

 

Good to know that you are not mixing up EFM and xDB. They address different requirements: EFM handles auto-failover, and xDB handles SMR/MMR replication.

 

We tested by doing "EFM promotion" and observed that MMR worked in both directions between the new Master on Region1 and original Master on Region2. Did that work because we did a EFM promotion (which is like a graceful switchover) OR because xDB continued to run on the original master on Region1?

And if we instead simulate one of the below 2, then we will need to reconfigure the slots:

- (master) node failure followed by EFM failover with xDB also failing over to new master

- xDB Replication server failure and it failing over to slave node on Region1 or one of the nodes on Region2

 

At this point the two seem a little mixed up. They are two different things. EFM provides auto-failover for a two-node Streaming Replication architecture, with one node as Master and the other as Slave. The EFM witness node monitors both the Master and the Slave, and if the Master becomes unavailable, EFM promotes the Slave to Master.

xDB Replication Server only sees the nodes that are part of its own configuration: all nodes in write mode for MMR, and one write node and one read node in an SMR architecture.

 

--Raghav

Level 3 Adventurer

Re: Replication Server redundancy

Hi. The understanding is not mixed up, but yes, the architecture is such that at some point (or in some of the scenarios we are testing) the two have to be mixed, or rather, integrated.

I understand what you have mentioned.

Are the slots for Streaming Replication different from the slots for MMR? I guess not!

Let me re-phrase:
Let's say the environment is:
Region1= Node1 (master) + Node2 (Slave) + Node3 (EFM-Witness)
Region2= Node4 (master) + Node5 (Slave) + Node6 (EFM-Witness)
xDB Repl server is running on Node1.

MMR is setup between Node1 & Node4 which is in-line with what you have mentioned i.e. MMR is between 2 master nodes (or DBs).

After promotion (by EFM) in Region1, MMR should not have worked between Node2 (the new master) and Node4. However, our testing shows that it works correctly without any slot recreation (on Node2).

 

Regards

Community Manager

Re: Replication Server redundancy

Hi pcpg,
 
Thank you so much for your great questions and comments on this topic!  The information provided is greatly appreciated and very beneficial to all community members.
 
However, at this point, the content of this thread has become quite detailed, and it is best to move this conversation under your EDB support subscription. That is the best way to assist you with the specifics of your situation, and in a more timely manner than a general forum can offer.
 
The EDB Support Team will be opening a case for you under your active EDB support subscription. You will be hearing from them shortly. 
 
Thank you again for your comments and contributions. We look forward to hearing from you on future topics!
 
Postgres Rocks Community Management