Dec 9, 2020 | Cloud, Tutorials

Corosync Cluster with Failover IP


High availability is one of the first requirements you will usually find in a customer specification. For a long time now it has been the norm that a project remains accessible without problems even in the event of partial failures and that single points of failure are avoided. A Corosync / Pacemaker cluster is often used for this purpose; the technology behind it has been tried and tested for more than a decade. The basic idea behind it: you create virtual resources that can be started on any of the connected nodes.

The following describes how to create a Corosync / Pacemaker cluster using Ubuntu. If you are already familiar with these steps and would just like to know how to store the failover IP in OpenStack, you will find the relevant information further down in the article.

Create a Corosync / Pacemaker cluster

Installation

The first step is to install the necessary packages. crmsh offers a shell that can be used to control the cluster.

root@test-node-1:~# apt install corosync pacemaker crmsh

Configuration

Then the authkey is created, which is needed for the communication between the nodes. Without it, the service may refuse to start.

root@test-node-1:~# corosync-keygen

This can take some time, depending on how busy the server is. On newly created VMs there is usually too little entropy and key generation would take far too long; in that case you can use the following snippet, for example, which writes random data to the hard disk to generate activity. Please always keep in mind that you should not fill up the entire hard disk, so adapt the command to your own needs if necessary!

while /bin/true; do dd if=/dev/urandom of=/tmp/entropy bs=1024 count=10000; for i in {1..50};
do cp /tmp/entropy /tmp/tmp_${i}_${RANDOM}; done; rm -f /tmp/tmp_* /tmp/entropy; done
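
If you would rather avoid the disk activity, an entropy daemon such as haveged (not mentioned above, but a common choice) can also speed up key generation:

root@test-node-1:~# apt install haveged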

Once this is done, copy the authkey to /etc/corosync/authkey on both servers.

root@test-node-1:~# cp authkey /etc/corosync/authkey
root@test-node-1:~# chown root:root /etc/corosync/authkey
root@test-node-1:~# chmod 0400 /etc/corosync/authkey
root@test-node-1:~# scp /etc/corosync/authkey root@test-node-2:/etc/corosync/authkey
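
Depending on how the key was copied, it does no harm to double-check ownership and permissions on the second node as well, for example:

root@test-node-2:~# chown root:root /etc/corosync/authkey
root@test-node-2:~# chmod 0400 /etc/corosync/authkey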

The cluster is then configured in the file /etc/corosync/corosync.conf, in which, among other things, the private IPs of the cluster nodes are defined. This file is also identical on all nodes.

totem {
  version: 2
  cluster_name: test-cluster
  transport: udpu
  interface {
    ringnumber: 0
    bindnetaddr: 172.16.0.0
    broadcast: yes
    mcastport: 5407
  }
}

nodelist {
  node {
    ring0_addr: 172.16.0.10
  }
  node {
    ring0_addr: 172.16.0.20
  }
}

quorum {
  provider: corosync_votequorum
}

logging {
  to_logfile: yes
  logfile: /var/log/corosync/corosync.log
  to_syslog: yes
  timestamp: on
}

service {
  name: pacemaker
  ver: 1
}
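
Since the file is identical on all nodes, it can simply be copied to the second node as well, for example again with scp:

root@test-node-1:~# scp /etc/corosync/corosync.conf root@test-node-2:/etc/corosync/corosync.conf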

The cluster services then need to be restarted so that the new configuration is applied. At this point, the cluster should already display its status and recognise all nodes; initially, this may take a few moments.

root@test-node-1:~# systemctl restart corosync && systemctl restart pacemaker
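
The same restart is needed on test-node-2. Assuming root SSH access (as already used for scp above), it can also be triggered remotely:

root@test-node-1:~# ssh root@test-node-2 "systemctl restart corosync && systemctl restart pacemaker"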

root@test-node-1:~# crm status
Stack: corosync
Current DC: test-node-1 (version 1.1.18-2b07d5c5a9) - partition with quorum
Last updated: Mon Dec 7 15:40:20 2020
Last change: Mon Dec 7 15:40:20 2020 by hacluster via crmd on test-node-1

2 nodes configured
0 resources configured

Online: [ test-node-1 test-node-2 ]

Via crm, the cluster can be controlled and its current configuration viewed. The configuration should look very similar to the following:

root@test-node-1:~# crm configure show
node 2886729779: test-node-1
node 2886729826: test-node-2
property cib-bootstrap-options: \
  have-watchdog=false \
  dc-version=1.1.18-2b07d5c5a9 \
  cluster-infrastructure=corosync \
  cluster-name=test-cluster \
  stonith-action=reboot \
  no-quorum-policy=stop \
  stonith-enabled=false \
  last-lrm-refresh=1596896556 \
  maintenance-mode=false
rsc_defaults rsc-options: \
  resource-stickiness=1000

Now resources can be defined by editing this configuration directly. This could also be done with individual “crm configure” commands, but in this example the configuration is edited as a whole.

root@test-node-1:~# crm configure edit

node 2886729779: test-node-1
node 2886729826: test-node-2
primitive ha-vip IPaddr2 \
  params ip=172.16.0.100 cidr_netmask=32 arp_count=10 arp_count_refresh=5 \
  op monitor interval=10s \
  meta target-role=Started
property cib-bootstrap-options: \
  have-watchdog=false \
  dc-version=1.1.18-2b07d5c5a9 \
  cluster-infrastructure=corosync \
  cluster-name=test-cluster \
  stonith-action=reboot \
  no-quorum-policy=ignore \
  stonith-enabled=false \
  last-lrm-refresh=1596896556 \
  maintenance-mode=false
rsc_defaults rsc-options: \
  resource-stickiness=1000

The attentive reader will notice that the “no-quorum-policy” has also been adjusted. This is important for the operation of a cluster consisting of only two nodes, as no quorum can be formed if one node fails.
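
For reference, the same change could also be made without opening the editor, using a single crm call:

root@test-node-1:~# crm configure property no-quorum-policy=ignore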

root@test-node-1:~# crm status
Stack: corosync
Current DC: test-node-1 (version 1.1.18-2b07d5c5a9) - partition with quorum
Last updated: Mon Dec 7 15:40:20 2020
Last change: Mon Dec 7 15:45:21 2020 by hacluster via crmd on test-node-1

2 nodes configured
1 resource configured

Online: [ test-node-1 test-node-2 ]

Full list of resources:

ha-vip (ocf::heartbeat:IPaddr2): Started test-node-1
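
Whether the virtual IP has really been bound can be checked on the active node; the IPaddr2 agent adds it as a secondary address, so a quick look with ip addr should show it:

root@test-node-1:~# ip addr show | grep 172.16.0.100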

Configure Failover IP in OpenStack

There are two ways to store the IP in OpenStack. Firstly, you can navigate to the VM’s port in the web interface via Network -> Networks -> the network in question -> Ports and add the desired IP on the “Allowed address pairs” tab. Alternatively, this is also possible via the OpenStack CLI tool:

openstack port list --server test-node-1
+--------------------------------------+------+-------------------+----------------------------------------------------------------------------+--------+
| ID                                   | Name | MAC Address       | Fixed IP Addresses                                                         | Status |
+--------------------------------------+------+-------------------+----------------------------------------------------------------------------+--------+
| 0a7161f5-c2ff-402c-9bf4-976215a95cf3 |      | fa:16:3e:2a:f3:f2 | ip_address='172.16.0.10', subnet_id='9ede2a39-7f99-48c8-a542-85066e30a6f3' | ACTIVE |
+--------------------------------------+------+-------------------+----------------------------------------------------------------------------+--------+

The additionally permitted IP address is added to the port as follows. A complete network can also be defined here if several IP resources are to be created.

openstack port set 0a7161f5-c2ff-402c-9bf4-976215a95cf3 --allowed-address ip-address=172.16.0.100
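
Whether the address pair has actually been stored can be verified on the port itself, for example like this (using the port ID from above):

openstack port show 0a7161f5-c2ff-402c-9bf4-976215a95cf3 -c allowed_address_pairs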

This step must be repeated for both servers. After that, the IP is also reachable within the OpenStack project. If it is not, it may help to move the IP resource to the other node so that the IP is announced from there; however, this should not be necessary thanks to the ARP settings of the resource defined above.

crm resource migrate ha-vip test-node-2
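
Note that crm resource migrate leaves a location constraint behind; once the test is finished, it can be removed again, for example with:

crm resource unmigrate ha-vip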

Things don’t work as planned, or you still have questions? Our MyEngineers will be happy to help!
