For the purpose of this blog, i.e. testing the cluster, I have set up a two-node cluster again on CentOS 6.5. The host name and IP address details are as below:
[root@N2 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.200.131 N1 N1.off.com
192.168.200.132 N2 N2.off.com
10.10.10.1 H1 H1.off.com
10.10.10.2 H2 H2.off.com
192.168.200.135 F1 FIP.off.com
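Before starting the tests, it is worth confirming on both nodes that the cluster daemons are set to start at boot, since test 3 below relies on the cluster coming up automatically after a power-on. A minimal check on CentOS 6 (cman and rgmanager are the standard RHCS services; qdiskd applies here because a quorum disk is in use):

# Verify that the cluster daemons are enabled for the default runlevels.
chkconfig --list | egrep 'cman|rgmanager|qdiskd'
# Enable any that are not already on.
chkconfig cman on
chkconfig rgmanager on
chkconfig qdiskd on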
Checking the status of the cluster.
[root@N1 ~]# clustat
Cluster Status for officetest @ Fri Dec 7 15:16:55 2018
Member Status: Quorate
 Member Name                 ID   Status
 ------ ----                 ---- ------
 N1.off.com                  1    Online, Local, rgmanager
 N2.off.com                  2    Online, rgmanager
 /dev/block/8:16             0    Online, Quorum Disk

 Service Name                Owner (Last)                State
 ------- ----                ----- ------                -----
 service:Test Service        N2.off.com                  started
The cluster service is running on node 2. We will now perform a few sanity tests and verify that cluster operation is not disrupted.
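For the tests that follow, it can be handy to watch the status continuously from the surviving node instead of re-running clustat by hand; the 2-second interval below is just an example.

# Refresh the cluster status every 2 seconds.
clustat -i 2
# rgmanager also logs service transitions to syslog, which helps when following a failover.
tail -f /var/log/messages | grep rgmanager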
1. Reboot Active Node N2
[root@N2 ~]# clustat
Cluster Status for officetest @ Fri Dec 7 15:18:33 2018
Member Status: Quorate
 Member Name                 ID   Status
 ------ ----                 ---- ------
 N1.off.com                  1    Online, rgmanager
 N2.off.com                  2    Online, Local, rgmanager
 /dev/block/8:16             0    Online, Quorum Disk

 Service Name                Owner (Last)                State
 ------- ----                ----- ------                -----
 service:Test Service        N2.off.com                  started
[root@N2 ~]# reboot
While node 2 is rebooting, the cluster stops the service on node 2 and moves it to node 1, as shown below.
[root@N1 ~]# clustat
Cluster Status for officetest @ Fri Dec 7 15:19:31 2018
Member Status: Quorate
 Member Name                 ID   Status
 ------ ----                 ---- ------
 N1.off.com                  1    Online, Local, rgmanager
 N2.off.com                  2    Online, rgmanager
 /dev/block/8:16             0    Online, Quorum Disk

 Service Name                Owner (Last)                State
 ------- ----                ----- ------                -----
 service:Test Service        N2.off.com                  stopping
Now, we can see that the service has started on node 1.
[root@N1 ~]# clustat
Cluster Status for officetest @ Fri Dec 7 15:19:48 2018
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
N1.off.com 1 Online, Local, rgmanager
N2.off.com 2 Online
/dev/block/8:16 0 Online, Quorum Disk
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:Test Service N1.off.com started
After node 2 comes up, we check the cluster status.
[root@N2 ~]# clustat
Cluster Status for officetest @ Fri Dec 7 15:21:21 2018
Member Status: Quorate
 Member Name                 ID   Status
 ------ ----                 ---- ------
 N1.off.com                  1    Online, rgmanager
 N2.off.com                  2    Online, Local, rgmanager
 /dev/block/8:16             0    Online, Quorum Disk

 Service Name                Owner (Last)                State
 ------- ----                ----- ------                -----
 service:Test Service        N1.off.com                  started
As per the cluster configuration, the service should not fail back to node 2. We can see above that the service is still active on node 1 and does not move back to node 2, which was the last owner.
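In rgmanager this no-failback behaviour is normally configured with a nofailback="1" attribute on the <failoverdomain> element in /etc/cluster/cluster.conf, which keeps the service where it is when the previous owner rejoins. A quick way to confirm the setting on either node:

# Show the failover domain definition from the cluster configuration.
grep failoverdomain /etc/cluster/cluster.conf
# Look for nofailback="1" on the <failoverdomain> element that lists the nodes.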
2. Power Off Active Node N1
As the service is running on node 1, we power off that node.
[root@N1 ~]# clustat
Cluster Status for officetest @ Fri Dec 7 15:21:11 2018
Member Status: Quorate
 Member Name                 ID   Status
 ------ ----                 ---- ------
 N1.off.com                  1    Online, Local, rgmanager
 N2.off.com                  2    Online, rgmanager
 /dev/block/8:16             0    Online, Quorum Disk

 Service Name                Owner (Last)                State
 ------- ----                ----- ------                -----
 service:Test Service        N1.off.com                  started
[root@N1 ~]# poweroff
Broadcast message from root@N1
(/dev/pts/1) at 15:21 ...
The system is going down for power off NOW!
We can see that the service is running on node 2 after we power off the first node.
[root@N2 ~]# clustat
Cluster Status for officetest @ Fri Dec 7 15:22:55 2018
Member Status: Quorate
 Member Name                 ID   Status
 ------ ----                 ---- ------
 N1.off.com                  1    Offline
 N2.off.com                  2    Online, Local, rgmanager
 /dev/block/8:16             0    Online, Quorum Disk

 Service Name                Owner (Last)                State
 ------- ----                ----- ------                -----
 service:Test Service        N2.off.com                  started
3. Power off both nodes and power on both nodes at the same time
I have powered off both the nodes. For this test, we power them on at the same time and check whether the cluster service comes up automatically.
After we power on both nodes, the second node comes up first and the cluster service is activated on that node.
[root@N2 ~]# clustat
Cluster Status for officetest @ Fri Dec 7 15:33:14 2018
Member Status: Quorate
 Member Name                 ID   Status
 ------ ----                 ---- ------
 N1.off.com                  1    Offline
 N2.off.com                  2    Online, Local, rgmanager
 /dev/block/8:16             0    Online, Quorum Disk

 Service Name                Owner (Last)                State
 ------- ----                ----- ------                -----
 service:Test Service        N2.off.com                  started
Checking node 1, we find that it is still booting. The cluster delays node 1's startup while the cluster service is activated on node 2.
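Since this is a two-node cluster backed by a quorum disk, it is also worth checking the quorum state from node 2 while node 1 is still booting. A quick check (field values depend on the configuration):

# Show membership and quorum details from the node that is already up.
cman_tool status
# The Nodes, Expected votes and Quorum lines show that node 2 plus the quorum
# disk vote keep the cluster quorate until node 1 finishes booting.
cman_tool nodes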
4. Move services among nodes
Now we manually move the cluster service between the nodes. The second node is currently active, so we run the following command to relocate the service to node 1. The -r option specifies the service name, and -m specifies the node to which the service should be moved.
[root@N2 ~]# clusvcadm -r 'Test Service' -m N1.off.com
Trying to relocate service:Test Service to N1.off.com...Success
service:Test Service is now running on N1.off.com
The relocation is successful, and we then move the service back to the second node as shown below.
[root@N1 ~]# clusvcadm -r 'Test Service' -m N2.off.com
Trying to relocate service:Test Service to N2.off.com...Success
service:Test Service is now running on N2.off.com
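Besides relocation, clusvcadm can also stop, disable, re-enable and restart the service, which is useful during testing. These are standard rgmanager operations, shown here with the same service name used above:

# Disable the service (stop it and keep it stopped until re-enabled).
clusvcadm -d 'Test Service'
# Enable the service again, optionally on a specific member with -m.
clusvcadm -e 'Test Service' -m N1.off.com
# Restart the service in place on its current owner.
clusvcadm -R 'Test Service'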
5. Test fencing of nodes
We can also verify that fencing is configured correctly by fencing the nodes manually with the following command. The fencing action is set to reboot, so on a successful fence the node is rebooted via the fencing user configured in vCenter.
[root@N1 ~]# fence_node N2
fence N2 success
[root@N2 ~]# fence_node N1
fence N1 success
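Both fence operations succeed, which confirms the fencing configuration. fence_node simply drives whatever agent is defined for the node in cluster.conf; since this setup fences through vCenter, the agent can also be called directly to query a VM's power state without rebooting it. A hedged sketch, assuming the fence_vmware_soap agent and using placeholder address, credentials and VM name:

# Query the power status of the N2 virtual machine via vCenter (-o status does not reboot).
# The address, user, password and plug name below are placeholders.
fence_vmware_soap -a vcenter.off.com -l fencinguser -p 'password' -z -n N2 -o status

With all five tests passing, the basic failover and fencing behaviour of the cluster is verified.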