Procedure: Cluster monitoring

Scope

Testing and commissioning procedure for the cluster.

Description

A server cluster consists of 2 strictly identical servers configured for high availability in normal / backup mode. The server running in normal mode is called the “primary”; the backup server is called the “secondary”.

Prerequisites

At a minimum, each server uses 3 network adapters configured as follows:

ETH1 = main network interface = IP_Server
ETH2 = bridged network interface for virtual machines = IP_Br0
ETH3 = “private” server synchronization network interface, direct link between the cluster nodes

On HP servers, the HP_ILO management interface can also be configured so that monitoring benefits from information on the server's physical state (see the ILO monitoring documentation).

The 2 servers are connected to each other by a link that allows them to be installed in 2 separate, distant technical rooms. This protects the physical integrity of the equipment and prevents physical damage in one room from spreading to the other.

Connection Schema

Functioning of the cluster

The Linux services used for the cluster are:

Drbd = data replication between disk spaces
Corosync = communication and membership between the cluster nodes
Pacemaker = management (start, stop, placement) and monitoring of the cluster services

The services configured and monitored by the cluster are:

Apache = web server
MySQL = database
Samba = file sharing
Libvirtd = KVM virtualization engine
Libvirt-guests = virtualization management tools
IP Cluster / Route Cluster = active network node

All of these services are controlled by the cluster stack (Pacemaker on top of Corosync). Do not manage them as standard Linux daemons: do not use the “service” or “systemctl” commands, or init scripts such as “samba”. Starting a service with this type of command bypasses the monitoring performed by Pacemaker and Corosync.
Server's supervision web page

Go to http://ip_server/web/system/ezmonitor

Checking the synchronization of cluster data

In a terminal or over ssh on one of the cluster nodes, run the drbd-overview command.

The primary and secondary servers are perfectly synchronized at the data level when the status UpToDate is reported for both servers. In the output, Primary/Secondary gives the role of each node and UpToDate/UpToDate gives the synchronization status of the 2 nodes of the cluster.

If the DRBD service did not start correctly (cluster out of service), the data synchronization service can be restarted with the following command:

    # service drbdserv --full-restart

Validation of the correct functioning of the cluster (Corosync)

To see the state of the services managed by the cluster, in a terminal or over ssh, use the command crm status. The command returns the cluster configuration and status.

Description of the configuration file

The first block indicates the state of the cluster

    Last updated: Sun Sep 23 08:21:21 2018
    Last change: Tue Aug 28 09:42:27 2018 via crm_attribute on dzacupsvr2
    Stack: corosync
    Current DC: dzacupsvr2 (34212362) - partition with quorum
    Version: 1.1.7-2.mga1-ee0730e13d124c3d58f00016c3376a1de5323cff
    2 Nodes configured, unknown expected votes
    11 Resources configured.
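The first block can also be checked from a script. The following is a minimal sketch, assuming the header format shown above (a "Current DC:" line ending in "partition with quorum"); check_cluster_header is a hypothetical helper name, not a crmsh command, and it only reads text from standard input.

```shell
#!/bin/sh
# Sketch only: inspect the header block of `crm status` read from stdin.
# check_cluster_header is a hypothetical helper, not part of crmsh.
check_cluster_header() {
    header=$(cat)
    # Extract the node name that follows "Current DC: "
    dc=$(printf '%s\n' "$header" | sed -n 's/^Current DC: \([^ ]*\).*/\1/p')
    # The match is case-sensitive, so "partition WITHOUT quorum" is not accepted
    if printf '%s\n' "$header" | grep -q 'partition with quorum'; then
        echo "OK: quorum held, DC=$dc"
        return 0
    fi
    echo "WARNING: no quorum reported"
    return 1
}
```

On a cluster node it could be used as `crm status | check_cluster_header`; a non-zero exit status would indicate that the partition does not report quorum.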
The second block tells you which is the primary node, and where are the services

    Online: [ dzacupsvr dzacupsvr2 ]

     Resource Group: services
         samba (lsb:smb): Started dzacupsvr
         apache (ocf::heartbeat:apache): Started dzacupsvr
         mysql (ocf::heartbeat:mysql): Started dzacupsvr
         libvirtd (lsb:libvirtd): Started dzacupsvr
         libvirt-guests (lsb:libvirt-guests): Started dzacupsvr
     Master/Slave Set: drbdservClone [drbdserv]
         Masters: [ dzacupsvr ]
         Slaves: [ dzacupsvr2 ]
     fsserv (ocf::heartbeat:Filesystem): Started dzacupsvr
     Resource Group: iphd
         clusterip (ocf::heartbeat:IPaddr2): Started dzacupsvr
         clusterroute (ocf::heartbeat:Route): Started dzacupsvr

⇒ the 2 servers are “online”, and each service is operational on the primary.

Verifying the correct functioning of the cluster

To see the cluster configuration, use the following command:

    # crm configure show

Example of a configuration file of the Abidjan cluster:

Commands for verifying the correct functioning of the cluster

Examples for the node « abjairsvr2 »:

DESIRED action                                                                    SYSTEM command
--------------------------------------------------------------------------------  ---------------------------------------------------
Check the cluster status                                                          service corosync status
See cluster nodes                                                                 crm node
See the cluster configuration                                                     crm configure show
Edit the cluster configuration                                                    crm configure edit
Put a cluster node in standby while changing a configuration                      crm node standby abjairsvr2
Put a cluster node back in service (here the Abidjan secondary)                   crm node online abjairsvr2
Change a cluster configuration parameter                                          crm configure rsc_defaults resource-stickiness=100
View the status of a cluster service                                              crm resource status libvirt-guests
Purge a cluster service that does not start                                       crm resource cleanup libvirt-guests
Check whether a split brain exists (service migrated to a non-operational node)   grep "split-brain" /var/log/syslog
Move a service from one node to another (in the case of a split brain)            crm resource move libvirt-guests abjairsvr2
Reattach a service to the cluster                                                 crm resource manage libvirt-guests
Check that the configuration files are identical between the nodes of a server    crm cluster diff /etc/samba/smb.conf

Verification of cluster management tools

DESIRED action                                          SYSTEM command
------------------------------------------------------  ----------------------------------------
Show the status of the Pacemaker service                systemctl status pacemaker
Check the Pacemaker unit file for errors                systemd-analyze verify pacemaker.service
Ask the Pacemaker service to reload its configuration   systemctl reload pacemaker.service
Show overridden settings of the Pacemaker unit file     systemd-delta pacemaker.service
Read the Pacemaker service log page by page             journalctl -u pacemaker | more
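The split-brain check listed among the verification commands can be wrapped in a small function. This is a sketch under assumptions: count_split_brain is a hypothetical helper name, the default log path is the /var/log/syslog used in this procedure, and the match is made case-insensitive because the kernel message may be written with capitals (for example "Split-Brain detected"), depending on the DRBD version.

```shell
#!/bin/sh
# Sketch only: count log lines that mention a split brain.
# count_split_brain is a hypothetical helper; the log path is an assumption.
count_split_brain() {
    logfile=${1:-/var/log/syslog}
    if [ ! -r "$logfile" ]; then
        echo "cannot read $logfile" >&2
        return 2
    fi
    # -i: ignore case, -c: print the number of matching lines.
    # grep exits with 1 when nothing matches; that is not an error here.
    grep -ic 'split-brain' "$logfile" || true
}
```

Calling `count_split_brain` with no argument reads the default log; a result of 0 means no split-brain message was found in that file, which says nothing about older, rotated logs.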