Cluster Suite HA – How to configure a 2 node test

This page documents how to set up a two-node cluster test with Cluster Suite/High Availability, comparing fencing methods between KVM virtual machines and bare-metal servers. You will need to do some manual editing of XML files, but cluster.conf is a lot easier to read than some other clustering file formats. These examples were done with the luci/conga GUI along with hand-editing the XML in cluster.conf.

Hardware requirements:

One bare-metal server for the cluster master, and two servers (virtual or physical) for the cluster nodes.

Software requirements:

RHEL6, and a RHEL subscription to the High Availability package/add-on channel. We will be working with the packages luci, ricci, rgmanager, and cman.

At first, I thought this must be a tribute to a famous 50’s TV show couple, but what’s with ricci instead of ricky? Maybe the developers were fans of a certain actress with that last name.

Package installation

Install these packages on the cluster master:
yum groupinstall "High Availability"
or: yum install luci
Set the luci password and start the service:
passwd luci ; service luci start
Install ricci on the master as well and set its password:
yum install ricci ; passwd ricci
Point your web browser to the URL suggested when you started the luci service, https://servername:8084.
Log in as root and set the cluster permissions for user ricci on the Admin tab in the upper right-hand corner of the GUI. Then log out and log back in as ricci.
Install on cluster nodes:
yum install ricci ; passwd ricci ; service ricci start ; chkconfig ricci on
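If the nodes later refuse to show up in luci, it is worth confirming that ricci is actually running and reachable on each node before going further. A quick sanity check (11111 is ricci's default TCP port; the telnet line assumes telnet is installed and is only a connectivity test):

# On each cluster node: confirm ricci is running and enabled at boot
service ricci status
chkconfig --list ricci
# ricci listens on TCP 11111 by default; verify it is bound
netstat -tlnp | grep 11111
# From the cluster master: confirm the port is reachable
telnet test1.domain.org 11111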
You will probably need to set the fully-qualified domain name of each cluster node as its hostname, and add it to each node's own /etc/hosts file, for example:
/etc/hosts:
192.168.1.2 test1.domain.org
192.168.1.3 test2.domain.org
192.168.1.4 test1-vm.domain.org
192.168.1.5 test2-vm.domain.org
/etc/sysconfig/network:
HOSTNAME=test1.domain.org
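After editing those files, a quick check on each node saves trouble later. A minimal sketch, using the test1 names and addresses above:

hostname test1.domain.org        # set the hostname for the running system
hostname -f                      # should print the fully-qualified name, test1.domain.org
getent hosts test1.domain.org    # should resolve via /etc/hosts to 192.168.1.2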

Add nodes

Add the nodes on the Nodes tab. Select Download Software to automatically install the required packages (cman, rgmanager, and their dependencies).
Test a reboot of one node at a time from the GUI. Again, having the fully-qualified domain name set was required for the nodes to add properly.
cluster.conf excerpt at this stage:
<clusternodes>
                <clusternode name="test1.domain.org" nodeid="1"/>
                <clusternode name="test2.domain.org" nodeid="2"/>
</clusternodes>
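Once both nodes have joined, you can confirm membership from either node with the standard cman/rgmanager status tools:

cman_tool status    # cluster name, quorum, and node count
cman_tool nodes     # both nodes should show status "M" (member)
clustat             # summary of members and, later, services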

Add fence devices

Create fence devices for Dell DRAC5

cluster.conf:
<clusternodes>
 <clusternode name="test1.domain.org" nodeid="1">
  <fence>
   <method name="fence_drac5">
    <device name="test1-drac"/>
   </method>
  </fence>
 </clusternode>
 <clusternode name="test2.domain.org" nodeid="2">
  <fence>
   <method name="fence_drac5">
    <device name="test2-drac"/>
   </method>
  </fence>
 </clusternode>
</clusternodes>
<fencedevices>
 <fencedevice agent="fence_drac5" ipaddr="192.168.0.1" login="root"
  module_name="server-1" name="test1-drac" passwd="addpasshere"
  action="reboot" secure="on"/>
 <fencedevice agent="fence_drac5" ipaddr="192.168.0.2" login="root"
  module_name="server-2" name="test2-drac" passwd="addpasshere"
  action="reboot" secure="on"/>
</fencedevices>

Note: module_name is a required option; it is the DRAC's name for the server to be accessed. The cluster.conf validation wouldn't pass without it.

Other sources mentioned adding command_prompt="admin->" but this didn't work for me.
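Before trusting the cluster to do it, you can drive the fence agent by hand. Fence agents accept their options as name=value pairs on stdin, using the same option names as cluster.conf; a sketch using the test1 values above (action=status only queries power state rather than rebooting):

# Query power status of server-1 through its DRAC; substitute your own password
echo -e "ipaddr=192.168.0.1\nlogin=root\npasswd=addpasshere\nmodule_name=server-1\nsecure=on\naction=status" | fence_drac5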

Add fencing for KVM virtual machines

This configuration applies to two VMs running on the same physical host, using fence_xvm.

<clusternodes>
 <clusternode name="test1-vm.domain.org" nodeid="1">
  <fence>
   <method name="1">
    <device domain="test1-vm" name="fence_xvm"/>
   </method>
  </fence>
 </clusternode>
 <clusternode name="test2-vm.domain.org" nodeid="2">
  <fence>
   <method name="1">
    <device domain="test2-vm" name="fence_xvm"/>
   </method>
  </fence>
 </clusternode>
</clusternodes>
<fencedevices>
 <fencedevice agent="fence_xvm" name="fence_xvm"/>
</fencedevices>
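For fence_xvm to work, the physical host needs the fence_virtd daemon listening for the guests' fencing requests, and host and guests must share a key at /etc/cluster/fence_xvm.key. Roughly, on my RHEL6 setup (package names may differ on yours):

# On the physical host: install and configure the fence_virt daemon
yum install fence-virt fence-virtd fence-virtd-libvirt fence-virtd-multicast
mkdir -p /etc/cluster
dd if=/dev/urandom of=/etc/cluster/fence_xvm.key bs=4096 count=1
fence_virtd -c                      # interactive config: multicast listener, libvirt backend
service fence_virtd start ; chkconfig fence_virtd on
# Copy the key to the same path on each guest
scp /etc/cluster/fence_xvm.key test1-vm.domain.org:/etc/cluster/
scp /etc/cluster/fence_xvm.key test2-vm.domain.org:/etc/cluster/
# From a guest: list the domains the host will allow you to fence
fence_xvm -o list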

You could, of course, configure fence_xvm as the first fencing method and fence_drac5 as the backup. This would work, but since fence_drac5 reboots the physical host, any other virtual machines running on that host would be restarted along with the fenced node.

At the time of this writing, fence_virt/fence_virsh across multiple physical hosts is still not very well documented or supported. This is a bit disappointing since it’s a feature which many virtualization users could use.

Test Your Fencing Methods

On one node, try: cman_tool kill -n nodename.fqdn.org

and watch the logs on the other node for successful fencing and takeover.

Other methods of triggering fence:

1. service network stop

2. echo c > /proc/sysrq-trigger  # forces a kernel crash on that node

3. pull the network cable on one node
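Whichever trigger you use, it also helps to fence a node explicitly and watch the survivor take over. fence_node invokes whatever fencing is configured for that node in cluster.conf:

# From test1, fence test2 using its configured fence device
fence_node test2.domain.org
# On test1, watch the fence action and any service takeover in the logs
tail -f /var/log/messages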

Add failover domain

Create the domain “failover1” and add the two nodes to it. This defines which nodes are members of the failover group that the service will run on.
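After saving the domain in luci, you can check that the change propagated to the nodes and still validates. A quick sanity check, run on either node:

# Confirm the failoverdomain section made it into the local copy of cluster.conf
grep -A4 "<failoverdomains>" /etc/cluster/cluster.conf
# Validate the configuration against the cluster schema
ccs_config_validate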

Add service group

Create a service group "servicegroup1" containing a virtual IP address and Apache, and reference the failover domain "failover1".
Recovery policy:
"restart" means the service is restarted once on the same node before being relocated to another available node.
"restart-disable" means the system attempts to restart the service in place; if that fails, the service is left disabled and is not relocated to another node.
"relocate" means the service is moved to another node without attempting a restart on the same node.
cluster.conf excerpt for the resource and service group, defining a virtual IP address that can fail over between nodes (these elements live inside the <rm> section):
<resources>
 <ip address="192.168.1.10" monitor_link="on" sleeptime="10"/>
</resources>
<service autostart="0" domain="failover1" name="servicegroup1" recovery="restart">
 <ip ref="192.168.1.10"/>
</service>
When a node that is a member of the service group goes down, the virtual IP relocates to the other node within a few seconds.
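You can also move the service around by hand rather than waiting for a failure; clusvcadm is the standard rgmanager tool for this (service and node names are the ones used above):

clusvcadm -e servicegroup1                       # enable/start the service group
clusvcadm -r servicegroup1 -m test2.domain.org   # relocate it to the other node
clustat                                          # confirm which node owns it now
clusvcadm -d servicegroup1                       # disable/stop it when done testing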

Recommended documentation:

http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6-Beta/pdf/Cluster_Administration/Red_Hat_Enterprise_Linux-6-Beta-Cluster_Administration-en-US.pdf
http://linux.dell.com/wiki/index.php/Products/HA/DellRedHatHALinuxCluster/Cluster#Configuring_Fencing_Using_Conga
This blog post covers two-node clustering in excellent detail:
https://alteeve.com/w/2-Node_Red_Hat_KVM_Cluster_Tutorial