
EnterpriseDB Failover Manager (EFM) 3.1 - Quick Start Guide


 This quick start guide is meant to get you running EnterpriseDB Failover Manager (EFM) 3.1 quickly in a test environment. For a production deployment, please consult the User’s Guide for complete information.

 

Installation and Prerequisites

 

This guide assumes:

 

  • A database server is already running and streaming replication is set up between a master and one or two standby nodes. For this example, we will use three nodes total. If there is only one standby, the third node will be a Failover Manager “witness” node (described below).
  • Failover Manager has already been installed. See the user’s guide for detailed information about EFM installation.
  • When setting up our cluster, we will use the default cluster name “efm”.

 

EFM uses two files that must be configured by the user: <clustername>.properties and <clustername>.nodes. The properties file contains the runtime settings for the EFM agent. The nodes file is read at startup to tell an agent how to find the rest of the cluster; for the first node started, it can also be used to simplify authorization of subsequent nodes.
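For example, with the default cluster name “efm” used in this guide, the two files are:

/etc/edb/efm-3.1/efm.properties

/etc/edb/efm-3.1/efm.nodes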

 

Step 1: Setup

 

Start the setup on a master or standby node. Then, you can copy the created files to other nodes, including a witness node, to save time.

 

Copy the sample files to create EFM configuration files, and correct the ownership:

cd /etc/edb/efm-3.1

cp efm.properties.in efm.properties

cp efm.nodes.in efm.nodes

chown efm:efm efm.properties

chown efm:efm efm.nodes

 

 

Create the encrypted password (needed for the properties file):

 

/usr/edb/efm-3.1/bin/efm encrypt efm

 

Follow the onscreen instructions to produce the encrypted version of your database password.
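The command prints an encrypted string. Paste that string into the properties file in the next step; the line below is only a placeholder showing where it goes:

db.password.encrypted=<encrypted string produced by the command above>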

 

Step 2: Update the efm.properties File

 

Edit the efm.properties file, adding the encrypted password, and updating the properties noted below. Read the descriptions of each property in the file for more information.

 

Add values for the following required properties to the efm.properties file:

 

Database connection properties (required even on the witness node, so that it can connect to the databases on the other nodes when needed):

db.user

db.password.encrypted

db.port

db.database

 

 

Owner of the data directory, generally postgres or enterprisedb:

db.service.owner

 

 

Only one of these is needed, depending on whether you are running the database as a service:

 

db.service.name

db.bin

  

 

The data directory where EFM will find or create recovery.conf files:

db.recovery.conf.dir

 

 

Set to receive email notifications (the notification text is also included in the agent log):

user.email

 

 

This is the local address of the node and the port to use for EFM. Other nodes will use this address to reach the agent, and the agent will also use this address for connecting to the local database (as opposed to connecting to localhost). An example of the format is included here:

bind.address=1.2.3.4:7800

 

 

Set this to true on a witness node and false otherwise. A witness node is simply a node that does not have a local database running:

is.witness

 

 

If you’re running on a network without access to the Internet, you will have to change this to an address that is available on your network:

pingServerIp=8.8.8.8
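Putting the required settings together, here is a minimal sketch of this part of efm.properties for a master or standby (non-witness) node. All values shown are illustrative placeholders; use the user, port, paths, and addresses from your own installation:

# Database connection (placeholder values)
db.user=efm_user
db.password.encrypted=<output of the efm encrypt command>
db.port=5444
db.database=edb

# Owner of the data directory
db.service.owner=enterprisedb

# Use db.service.name OR db.bin, not both (service name shown here)
db.service.name=edb-as-10

db.recovery.conf.dir=/var/lib/edb/as10/data
user.email=dba@example.com
bind.address=1.2.3.4:7800
is.witness=false
pingServerIp=8.8.8.8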

 

 

Optional properties:

 

These properties can be either true or false in production depending on how you want to set up your cluster. Set them both to true while trying an EFM test cluster to simplify startup:

auto.allow.hosts=true

stable.nodes.file=true

 

 

Explanation: Having an agent join an existing cluster is a two-step process, the first of which is authorizing the node to join (adding it to the “allowed hosts” list in EFM). Setting auto.allow.hosts to true tells the first agent started to authorize the addresses it knows about at startup from the efm.nodes file. If you set this property to false, you will have to run the ‘efm allow-node’ command for each node that you want to add to the cluster, as shown below.
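For example, to authorize one additional node by hand (the address below is a placeholder):

/usr/edb/efm-3.1/bin/efm allow-node efm 1.2.3.5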

 

 

In the default configuration, when an agent starts, it maintains the efm.nodes file for you (explained below), adding and removing information as other nodes join or leave the cluster. This means that the first node started will clear out the file contents, because no other nodes are running yet, and will add addresses back in as other agents join. Setting the stable.nodes.file property to true turns off this feature, so that an agent never rewrites the file. This is especially helpful when learning how to set up Failover Manager: with the default behavior, stopping and restarting a single agent leaves it with an empty efm.nodes file, which can be surprising, because on restart the node has no information with which to pre-authorize the rest of the cluster.

 

Step 3: Update the efm.nodes File

 

This file is read at startup and tells an agent where to find the other nodes in the cluster. For the first agent started, it can serve a second purpose of authorizing other nodes to join (the auto.allow.hosts property above).

 

Add the addresses and ports of every node in the cluster to this file. Your file should look like the content below, where the address/port for each node matches the efm.properties bind.address value on that node:

 

# List of node address and port combinations separated by whitespace.

# The list should include at least the membership coordinator's address.

1.2.3.4:7800

1.2.3.5:7800

1.2.3.6:7800

 

 

Please note that the Failover Manager agent does not verify the content of the efm.nodes file; the agent tolerates addresses in the file that cannot be reached (e.g., a node whose agent has not been started yet).

Step 4: Configure the Other Nodes

 

Copy the efm.properties and efm.nodes files to the /etc/edb/efm-3.1 directory on the other two nodes. Make sure the files are owned by efm:efm.
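For example, from the node where the files were created (the addresses below are placeholders, and any copy method works):

scp efm.properties efm.nodes root@1.2.3.5:/etc/edb/efm-3.1/

scp efm.properties efm.nodes root@1.2.3.6:/etc/edb/efm-3.1/

Then repeat the chown efm:efm step from Step 1 on each node.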

 

The efm.properties file can be the same on every node, except for these changes:

  • Change bind.address to use the node’s local address.
  • Set is.witness to true if this is a witness node. On a witness node, the properties relating to a local database installation (e.g. db.service.owner) are ignored.

 

Step 5: Start the EFM Cluster

 

  • On any node, start the EFM agent. For instance, run systemctl start efm-3.1 on a CentOS/RHEL 7 machine or service efm-3.1 start on CentOS/RHEL 6.
  • After the agent starts, run the following command to see the status of the single-node cluster. You should see the addresses of the other nodes in the “Allowed node host list”: /usr/edb/efm-3.1/bin/efm cluster-status efm
  • Start the agent on the other nodes, then run the /usr/edb/efm-3.1/bin/efm cluster-status efm command on any node to see the cluster status. A concrete sequence is sketched below.
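As a concrete sequence on the three placeholder nodes from earlier (CentOS/RHEL 7 commands shown):

# On 1.2.3.4:
systemctl start efm-3.1
/usr/edb/efm-3.1/bin/efm cluster-status efm

# Then on 1.2.3.5 and 1.2.3.6:
systemctl start efm-3.1

# Finally, from any node:
/usr/edb/efm-3.1/bin/efm cluster-status efm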

 

If any agent fails to start, see the startup log for information about what went wrong:

 

cat /var/log/efm-3.1/startup-efm.log

 

Under unusual circumstances, you may be directed to the agent log for more information.

 

Performing a Switchover

 

If the cluster status output shows that the master and standby(s) are in sync, you can perform a switchover with the following command:

/usr/edb/efm-3.1/bin/efm promote efm -switchover

 

That command will promote a standby and reconfigure the master database as a new standby in the cluster. To switch back, run the command again. If your cluster has more than one standby, you can control which one will be promoted by setting its priority with the following command:

 

/usr/edb/efm-3.1/bin/efm set-priority efm <address> 1
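For example, to make the standby at a placeholder address the next node to be promoted:

/usr/edb/efm-3.1/bin/efm set-priority efm 1.2.3.6 1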

 

 

 

For more information on the efm command line tool, run:

/usr/edb/efm-3.1/bin/efm --help

 

 

To stop the agents across the cluster, run:

/usr/edb/efm-3.1/bin/efm stop-cluster efm