Listen to Marc Linster and Zahid Iqbal from the Thursday, July 11, 2019 webinar to learn how EDB Replication Server uses Kafka and Zookeeper to create a highly scalable replication infrastructure. During this webinar, Marc and Zahid highlight how EDB Postgres Replication Server (EPRS) can be used to create highly available and geographically distributed Postgres infrastructures.
Link to the webinar recording (https://info.
Q & A Session
Q. Is it possible to use your tool as a data change capture, meaning not consuming the data into a DB but having flexible access to the Avro schema for any use?
A. Yes, the CDC changes are retained in the Kafka queue (topics) for the configured retention period (7 days by default) and can be consumed accordingly. Producers and Consumers are loosely coupled, so data is published and retained in the Kafka queue and can be used independently of whether a given Consumer has caught up or not.
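For reference, retention is governed by standard Kafka settings rather than anything EPRS-specific. A 7-day retention (Kafka's out-of-the-box default) corresponds to the following broker configuration; the topic-level override is shown for comparison:

```properties
# Broker-level default: retain log segments for 168 hours (7 days)
log.retention.hours=168

# Equivalent per-topic override, in milliseconds (7 * 24 * 3600 * 1000)
retention.ms=604800000
```

A Consumer that falls behind by more than the retention window will lose access to the expired changes, so the retention period should comfortably exceed the longest expected Consumer outage.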
Q. How complex is the Avro schema declaration? Is it a kind of on the fly serialization or is it user defined?
A. The Avro schema is based on the Publication table structure and is constructed at runtime when the first CDC change is captured and processed by the EPRS Producer.
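As an illustration only, a runtime-generated Avro record schema for a simple two-column Publication table might look roughly like the following; the record and field names here are hypothetical and do not reflect the exact layout EPRS emits:

```json
{
  "type": "record",
  "name": "public_customers",
  "fields": [
    {"name": "id",   "type": "int"},
    {"name": "name", "type": ["null", "string"]}
  ]
}
```

The union with `"null"` is the standard Avro idiom for a nullable column, which is how a table column without a NOT NULL constraint would typically map.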
Q. Can you replicate both to and from non-PostgreSQL databases (such as MS SQL Server, etc.)?
A. It’s not currently supported, but it is planned for a future release, expected around mid-to-late 2020.
Q. Is there a capability to transform the data?
A. Currently, EPRS does not provide an explicit interface/API to facilitate data transformation. A future enhancement is in the pipeline to let users apply transformations before the data is applied to a target consumer.
Q. How are conflicts handled in a master-master setup?
How are sequences managed: is there one for the cluster, or one per node?
A. In the initial version, there is basic conflict detection/handling support that is limited to INSERT-INSERT conflicts for a given source and target database. By default, CDC application on the affected consumer is stopped; the user can, however, choose to SKIP or RETRY (until the conflicting row is removed from the consumer). UPDATE-UPDATE conflict handling is currently not supported. Sequence replication is not supported; the user is required to ensure that each node defines its sequences with non-conflicting ranges.
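One common way to guarantee non-conflicting sequence ranges is the "interleaved" scheme: with N nodes, node i starts its sequence at i and increments by N, so no two nodes can ever generate the same value. The sketch below generates the corresponding Postgres DDL; the function and sequence names are illustrative, not part of the EPRS API:

```python
def node_sequence_ddl(seq_name: str, node_id: int, num_nodes: int) -> str:
    """Return CREATE SEQUENCE DDL for one node in an N-node cluster.

    Node i (1-based) generates values i, i + N, i + 2N, ..., which
    never overlap with the values generated on any other node.
    """
    if not 1 <= node_id <= num_nodes:
        raise ValueError("node_id must be between 1 and num_nodes")
    return (
        f"CREATE SEQUENCE {seq_name} "
        f"START WITH {node_id} INCREMENT BY {num_nodes};"
    )

# In a 3-node cluster, node 1 generates 1, 4, 7, ... and node 2
# generates 2, 5, 8, ... so INSERTs on different nodes cannot collide.
print(node_sequence_ddl("orders_id_seq", 1, 3))
print(node_sequence_ddl("orders_id_seq", 2, 3))
```

An alternative is to give each node a disjoint block (for example, node 1 starts at 1, node 2 at 1,000,000,000), which keeps per-node values contiguous at the cost of having to size the blocks up front.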
Q. It was mentioned that the product works on Oracle DB and SQL Server, does it work on Progress DB?
A. EPRS 7 currently supports replication to/from PostgreSQL and EDB Postgres Advanced Server. Replication support for Oracle and MS SQL Server will be provided in a future release.
Q. Is there any recommended way to transform or denormalize data on the replicated consumer?
A. Transformation will be supported in the future.
Q. Do I understand correctly that DDL currently cannot be replicated?
A. Correct, the DDL (schema) needs to be manually applied to each of the target databases.
Q. Can you touch a little bit on whether EPRS supports both asynchronous and synchronous replication?
A. EPRS supports only asynchronous replication; synchronous replication is supported via Postgres physical streaming replication.
Q. Can EPRS be on the same server(s) as Postgres?
A. EPRS can either be collocated with the database server on the same host or deployed on a separate host.
Q. Is the client a single point of failure?
A. If you are referring to the Rep CLI client, then it is not necessarily a single point of failure: one can run Rep CLI from any system and connect to any of the EPRS servers. By default, it connects to the EPRS server running on the local system.
Q. How do you repair a table that has been found inconsistent?
A. Re-publish the table by removing it from and re-adding it to the given Publication. Once it is added again, a Snapshot is performed for that table to seed it with the same data as the source Publication database table.
Q. Does this support LOB data types, and how does replication of complex data structures affect overall performance?
A. Yes, LOB (bytea) is supported, with a size restriction currently limited to a 150 MB maximum row size. There might be a slight impact on replication latency.