====== Clustering ======

  * [[http://www.redhat.com/cluster_suite/]]
  * take the demo with a grain of salt: I'm not showing a scenario where apache or mysql crash due to an application error; this is a "clean" test with an induced hardware failure.
    * use the echo "c" > /proc/something (see kdump talk)
  * Why? Clustering is for cheap, basic HA.
    * Oracle, Veritas
    * watcher, script to do checking, script for failover, STONITH
  * supports up to 16 nodes
  * does both application/server failover and IP load balancing (load balancing is derived from Piranha)
  * doesn't do resource monitoring, can't distribute computational loads (for things like HPC)
  * service locking control is done by fencing and [[wp>STONITH]]
  * uses [[wp>Distributed lock manager]] for shared resources
  * shared resources can be:
    * NFS
    * SCSI and iSCSI
    * Fibre Channel
    * CIFS
    * [[wp>GFS2]] (allows nodes direct concurrent access to shared resources, no client/server roles)

=== Clustering Parts ===

  * Nodes
    * these are the hosts that will be part of a cluster
    * nodes are configured to use a fence device
  * Fence Devices
    * these provide the STONITH functionality and prevent "split-brain" situations
  * Managed Resources
    * these are the cluster's resources
    * Failover domain
      * this defines the set of nodes for a particular service
    * Resources
      * IP address
      * Storage
      * Init script
    * Services
      * the actual application that the cluster is running (a service uses resources and a failover domain)
  * Application
  * storage
  * shared IP
  * STONITH

=== gotchas ===

  * figuring out which clustering init scripts to start at boot
  * figuring out the order in which to configure the cluster

=== misc ===

  * easy enough to set up initially, but testing is needed before it's ready for production
    * need some tests to see what happens when an application/server/service fails in the middle of an operation
  * With RHEL 6, Red Hat seems to be pushing Clustering as a more serious solution
    * their documentation was recently updated and seems to be better organized and clearer
    * Red Hat is promoting RHEL 6 Clustering with Oracle as an alternative to Oracle RAC

=== some commands ===

  * cluster status: ''clustat''
  * relocate a service: ''clusvcadm -r SERVICE''
  * enable/disable a service: ''clusvcadm -e/-d SERVICE''

=== demo ===

== apache/soft kill ==

  * go here: [[https://cluster-test.cc.columbia.edu/playground:hostname]]
  - edit page, and save
  - kill apache while watching clustat
  - once recovered, edit page again (and show ?do=check to show groups still active), and preview to show new hostname, then save
  - edit page again, but don't save
  - kill apache (still watching clustat)
  - once recovered, just save page
  - show new hostname on page

== mysql & apache server kill ==

  * this is a less thorough test
  * connect to mysql, select last update
  * kill server (let it reboot though, because fencing isn't set up)
  * once recovered, reconnect to mysql and select last update

=== notes ===

  * [[http://www.haifux.org/lectures/168/linux-ha-clusters.html]]
  * [[wp>Microsoft Cluster Server]]
  * [[wp>Red Hat Cluster Suite]]
  * [[wp>High-availability cluster]]
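
=== sketch: failover domain placement ===

A rough way to picture how the pieces listed under "Clustering Parts" fit together: a service has a failover domain (a set of nodes), and when the node running it is fenced, the service moves to another node in that domain. The Python below is only a toy model of that idea, not how the cluster software is implemented. The names ''Node'', ''Service'', ''place'' and ''fence'' are made up for illustration, and treating the failover domain as an ordered preference list is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    online: bool = True  # set to False once the node has failed or been fenced


@dataclass
class Service:
    name: str
    failover_domain: list  # list of Node; ASSUMED to be ordered by preference
    owner: Optional[Node] = None


def place(service: Service) -> Optional[Node]:
    """Give the service to the first online node in its failover domain."""
    for node in service.failover_domain:
        if node.online:
            service.owner = node
            return node
    service.owner = None  # no eligible node left: the service stays stopped
    return None


def fence(node: Node, services: list) -> None:
    """Mark a node as fenced, then relocate every service it was running."""
    node.online = False
    for svc in services:
        if svc.owner is node:
            place(svc)


# Hypothetical two-node cluster running one web service.
node1, node2 = Node("node1"), Node("node2")
web = Service("web", [node1, node2])
place(web)            # web starts on node1
fence(node1, [web])   # node1 is fenced, so web is relocated
print(web.owner.name)
```

In the toy model the service only moves after the failed node has been marked as fenced, which mirrors why nodes are configured with a fence device: the surviving node should not take over shared storage or the shared IP while the old owner might still be using them.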