Clustering

  • take the demo with a grain of salt: it doesn't show a scenario where Apache or MySQL crashes due to an application error; instead it is a “clean” test using an induced hardware failure.
    • use echo “c” > /proc/something (see kdump talk)
  • Why? Clustering is for cheap, basic HA.
  • Oracle, Veritas
    • watcher, script to do checking, script for failover, STONITH
  • supports up to 16 nodes
  • does both application/server failover and IP load balancing (the load balancing is derived from Piranha)
  • doesn't do resource monitoring, and can't distribute computational loads (for things like HPC)
  • service locking control is done by fencing and STONITH
  • uses the Distributed Lock Manager (DLM) for shared resources
    • shared resources can be:
      • NFS
      • SCSI and iSCSI
      • Fibre Channel
      • CIFS
      • GFS2 (allows nodes direct concurrent access to shared resources, no client/server roles)
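
The hand-rolled pieces listed above (a watcher, a checking script, a failover script, and STONITH) could be wired together roughly as follows. This is only a sketch: watch_peer and the three commands passed to it are hypothetical placeholders, not part of any Red Hat tool. The point it illustrates is the ordering: fence the failed peer first, and take over only if fencing succeeded.

```shell
# Hypothetical hand-rolled watcher (illustration only).
# Arguments: CHECK_CMD FENCE_CMD FAILOVER_CMD [MAX_FAILS] [INTERVAL]
# All three commands are placeholders supplied by the caller.
watch_peer() {
    check_cmd=$1
    fence_cmd=$2
    failover_cmd=$3
    max_fails=${4:-3}
    interval=${5:-5}
    fails=0
    while :; do
        if $check_cmd; then
            # Peer answered: reset the consecutive-failure counter.
            fails=0
        else
            fails=$((fails + 1))
            if [ "$fails" -ge "$max_fails" ]; then
                # Fence first (STONITH) so the peer cannot touch shared
                # storage; take over only if fencing succeeded.
                $fence_cmd || return 1
                $failover_cmd
                return $?
            fi
        fi
        sleep "$interval"
    done
}
```

The cluster suite replaces all of this with its own membership, fencing, and resource-manager daemons, which is the main argument for using it over a home-grown script.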

Clustering Parts

  • Nodes
    • these are the hosts that will be part of a cluster
    • nodes are configured to use a fence device
  • Fence Devices
    • These provide the STONITH functionality, and prevent “split-brain” situations
  • Managed Resources
    • These are the cluster's resources
    • Failover domain
      • This defines the set of nodes for a particular service
    • Resources
      • IP Address
      • Storage
      • Init script
    • Services
      • The actual application that the cluster is running (service uses resources and failover domain)
  • Application
    • storage
    • shared IP
  • STONITH

gotchas

  • figuring out which clustering init scripts to start at boot
  • figuring out the order in which to configure the cluster
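
For the init-script question, a minimal sketch, assuming the RHEL 6 service names cman, clvmd, gfs2, and rgmanager, started in that order (clvmd and gfs2 only matter when clustered LVM or GFS2 is in use). The function name and the DRY_RUN variable are made up for illustration.

```shell
# Assumed RHEL 6 cluster init scripts, listed in start order.
# clvmd and gfs2 are only needed with clustered LVM / GFS2.
CLUSTER_SERVICES="cman clvmd gfs2 rgmanager"

# Enable each script at boot. With DRY_RUN=1 the chkconfig commands
# are printed instead of being run.
enable_cluster_services() {
    for svc in $CLUSTER_SERVICES; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "chkconfig $svc on"
        else
            chkconfig "$svc" on
        fi
    done
}
```

Running it once with DRY_RUN=1 on each node is a cheap way to review what will be enabled before changing anything.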

misc

  • easy enough to set up initially, but testing is needed before it's ready for production
    • need some tests to see what happens when an application/server/service fails in the middle of an operation
  • With RHEL 6, Red Hat seems to be pushing Clustering as a more serious solution
    • their documentation was recently updated and seems to be more organized and clearer
    • Red Hat is promoting RHEL 6 Clustering with Oracle as an alternative to Oracle RAC

some commands

  • cluster status: clustat
  • relocate a service: clusvcadm -r SERVICE
  • enable/disable service: clusvcadm -e/-d SERVICE
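
The commands above combine naturally into a small helper. The sketch below assumes the -m option of clusvcadm (target member) and the -s option of clustat (status of one service); relocate_service itself is a hypothetical wrapper, not a shipped tool.

```shell
# Relocate SERVICE, optionally to a named MEMBER, then print its status.
relocate_service() {
    service=${1:-}
    member=${2:-}
    if [ -z "$service" ]; then
        echo "usage: relocate_service SERVICE [MEMBER]" >&2
        return 2
    fi
    if [ -n "$member" ]; then
        # Ask rgmanager to move the service to a specific node.
        clusvcadm -r "$service" -m "$member" || return 1
    else
        # Let rgmanager pick the target node.
        clusvcadm -r "$service" || return 1
    fi
    clustat -s "$service"
}
```

During a demo it can also help to leave clustat -i 1 running in a second terminal so the state changes are visible as they happen.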

demo

apache/soft kill
    1. edit page, and save
    2. kill apache while watching clustat
    3. once recovered, edit the page again (show ?do=check to confirm the groups are still active), preview to show the new hostname, then save
    4. edit page again, but don't save
    5. kill apache (still watching clustat)
    6. once recovered, just save page
    7. show new hostname on page
mysql & apache server kill
  • this is a less thorough test
  • connect to mysql, select last update
  • kill server (let it reboot though, because fencing isn't set up)
  • once recovered, reconnect to mysql and select last update
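
To put a rough number on the outage during either kill test, a polling loop can run alongside the demo. This is a sketch only: count_outage_polls is a hypothetical helper, and the check command is whatever the caller supplies (for example a curl request against the shared IP).

```shell
# Poll CHECK_CMD every INTERVAL seconds. Once it has failed at least
# once and then succeeds again, print how many polls failed in between.
count_outage_polls() {
    check_cmd=$1
    interval=${2:-1}
    failed=0
    while :; do
        if $check_cmd >/dev/null 2>&1; then
            # Stop at the first success that follows a failure.
            if [ "$failed" -gt 0 ]; then
                break
            fi
        else
            failed=$((failed + 1))
        fi
        sleep "$interval"
    done
    echo "$failed"
}
```

For example, count_outage_polls "curl -fsS http://SHARED_IP/" started before killing Apache prints the number of one-second polls that failed before the service came back; SHARED_IP is a placeholder for the cluster's service address.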

notes

playground/clustering_talk.txt · Last modified: 2011/01/28 11:47 by john
 
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Noncommercial-Share Alike 3.0 Unported