Corosync Cluster with Failover IP

9 December, 2020

by Fabian Rothlauf, Manager MyEngineer

One of the first customer requirements you usually read is: high availability. It has long been standard practice to ensure that a project remains reachable even during partial outages and that single points of failure are avoided. A Corosync / Pacemaker cluster is often used for this; the technology behind it has been tried and tested for well over a decade. The basic idea is to create virtual resources that can be started on any connected node.

The following describes how to create a Corosync / Pacemaker cluster under Ubuntu. If you are already familiar with these steps and just want to know how to store the failover IP in the cloud, you will find the relevant information further down in the tutorial.

Create Corosync / Pacemaker Cluster

Installation

The first step is to install the necessary packages. crmsh provides a shell that can be used to control the cluster.

root@test-node-1:~# apt install corosync pacemaker crmsh
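
Depending on the image, it can also be worth making sure both services start automatically at boot – for example:

root@test-node-1:~# systemctl enable corosync pacemaker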

Configuration

Next, the authkey is created, which is required for communication between the nodes. Without it, the service may fail to start.

root@test-node-1:~# corosync-keygen

This can take some time, depending on how much activity there is on the server, because corosync-keygen needs entropy. On a newly created VM it would take too long, so you can use the following snippet, for example, which writes random data to the hard disk – but always keep in mind that you should not fill up the disk, so adapt the command to your own needs if necessary!

while /bin/true; do dd if=/dev/urandom of=/tmp/entropy bs=1024 count=10000; \
  for i in {1..50}; do cp /tmp/entropy /tmp/tmp_${i}_${RANDOM}; done; \
  rm -f /tmp/tmp_* /tmp/entropy; done

Once this has been done, copy the authkey to both servers at /etc/corosync/authkey.
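
Assuming the second node is reachable as test-node-2 via SSH (the hostname here is only an assumption), the key could be copied over like this:

root@test-node-1:~# scp /etc/corosync/authkey root@test-node-2:/etc/corosync/authkey
root@test-node-1:~# ssh root@test-node-2 chmod 400 /etc/corosync/authkey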

The cluster is then configured in the file /etc/corosync/corosync.conf, in which, among other things, the private IPs of the cluster nodes are defined. This file must be identical on all nodes.

totem {
  version: 2
  cluster_name: test-cluster
  transport: udpu
  interface {
    ringnumber: 0
    bindnetaddr: 172.16.0.0
    broadcast: yes
    mcastport: 5407
  }
}

nodelist {
  node {
    ring0_addr: 172.16.0.10
  }
  node {
    ring0_addr: 172.16.0.20
  }
}

quorum {
  provider: corosync_votequorum
}

logging {
  to_logfile: yes
  logfile: /var/log/corosync/corosync.log
  to_syslog: yes
  timestamp: on
}

service {
  name: pacemaker
  ver: 1
}

Both services then need a restart so that the new configuration is applied. From this point on, the cluster should also report its status and recognize all nodes; initially, this may take a few seconds.

root@test-node-1:~# systemctl restart corosync && systemctl restart pacemaker

root@test-node-1:~# crm status
Stack: corosync
Current DC: test-node-1 (version 1.1.18-2b07d5c5a9) - partition with quorum
Last updated: Mon Dec 7 15:40:20 2020
Last change: Mon Dec 7 15:40:20 2020 by hacluster via crmd on test-node-1

2 nodes configured
0 resources configured

Online: [ test-node-1 test-node-2 ]

The cluster can be controlled and its current configuration can be accessed via crm. The configuration should be very similar to the following:

root@test-node-1:~# crm configure show
node 2886729779: test-node-1
node 2886729826: test-node-2
property cib-bootstrap-options: \
  have-watchdog=false \
  dc-version=1.1.18-2b07d5c5a9 \
  cluster-infrastructure=corosync \
  cluster-name=test-cluster \
  stonith-action=reboot \
  no-quorum-policy=stop \
  stonith-enabled=false \
  last-lrm-refresh=1596896556 \
  maintenance-mode=false
rsc_defaults rsc-options: \
  resource-stickiness=1000

This configuration can now be edited directly and resources defined. Individual changes could also be made with separate “crm configure” commands, but in this example the configuration is edited directly.

root@test-node-1:~# crm configure edit

node 2886729779: test-node-1
node 2886729826: test-node-2
primitive ha-vip IPaddr2 \
  params ip=172.16.0.100 cidr_netmask=32 arp_count=10 arp_count_refresh=5 \
  op monitor interval=10s \
  meta target-role=Started
property cib-bootstrap-options: \
  have-watchdog=false \
  dc-version=1.1.18-2b07d5c5a9 \
  cluster-infrastructure=corosync \
  cluster-name=test-cluster \
  stonith-action=reboot \
  no-quorum-policy=ignore \
  stonith-enabled=false \
  last-lrm-refresh=1596896556 \
  maintenance-mode=false
rsc_defaults rsc-options: \
  resource-stickiness=1000

The attentive reader will notice that no-quorum-policy has also been adjusted. This is important for the operation of a cluster consisting of only two nodes, since no quorum can be formed once one node fails.
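
If you prefer not to go through the editor, the same property can also be set directly with crmsh, for example:

root@test-node-1:~# crm configure property no-quorum-policy=ignore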

root@test-node-1:~# crm status
Stack: corosync
Current DC: test-node-1 (version 1.1.18-2b07d5c5a9) - partition with quorum
Last updated: Mon Dec 7 15:40:20 2020
Last change: Mon Dec 7 15:45:21 2020 by hacluster via crmd on test-node-1

2 nodes configured
1 resource configured

Online: [ test-node-1 test-node-2 ]

Full list of resources:

ha-vip (ocf::heartbeat:IPaddr2): Started test-node-1
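
To verify that the virtual IP has actually been brought up on the node running the resource, you can check the interfaces directly, for example:

root@test-node-1:~# ip addr | grep 172.16.0.100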

Configure failover IP in OpenStack

There are two ways to store the IP in OpenStack. You can either navigate to the VM's port in the web interface via Network -> Networks -> the relevant network -> Ports and add the desired IP under “Allowed address pairs”, or use the OpenStack CLI tool:

openstack port list --server test-node-1
+--------------------------------------+------+-------------------+----------------------------------------------------------------------------+--------+
| ID                                   | Name | MAC Address       | Fixed IP Addresses                                                         | Status |
+--------------------------------------+------+-------------------+----------------------------------------------------------------------------+--------+
| 0a7161f5-c2ff-402c-9bf4-976215a95cf3 |      | fa:16:3e:2a:f3:f2 | ip_address='172.16.0.10', subnet_id='9ede2a39-7f99-48c8-a542-85066e30a6f3' | ACTIVE |
+--------------------------------------+------+-------------------+----------------------------------------------------------------------------+--------+

The additional allowed IP address is added to the port as follows. An entire network can also be specified here if several IP resources are to be created.

openstack port set 0a7161f5-c2ff-402c-9bf4-976215a95cf3 --allowed-address ip-address=172.16.0.100
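
This step must be repeated for the second server's port; its ID can be looked up in the same way (the port ID below is only a placeholder):

openstack port list --server test-node-2
openstack port set <port-id-of-test-node-2> --allowed-address ip-address=172.16.0.100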

The IP is then also reachable within the OpenStack project. If it is not, it may help to move the IP resource to the other node so that the IP is announced from there – although this should not normally be necessary thanks to the ARP settings of the resource defined above.

crm resource migrate ha-vip test-node-2
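
Note that crm resource migrate pins the resource to the target node with a location constraint; once testing is done, this constraint should be removed again so that the cluster can place the resource freely (depending on the crmsh version, the command may also be called unmove or clear):

crm resource unmigrate ha-vip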

Is something not working as planned, or do you have further questions? Our MyEngineers® will be happy to help you!
