The perfect load balanced & high availability web cluster with 2 servers running Xen on Ubuntu 8.04 hardy heron p9
15. Custom scripts for monitoring (lb1, lb2, web1, web2)
I made a few bash scripts to monitor the whole setup (they are a bit ugly but they work). If you improve them, feel free to mail them to me!
15.1 Monitoring from lb1.example.com
First we must install sendmail so lb1.example.com will be able to send mail :
apt-get install sendmail mailx
The first script checks whether the backup load balancer (lb2.example.com) is still available to take over :
vi /root/lb2_check
#!/bin/bash
# Backup load balancer check
# Copyright (c) 2008 blogama.org
# This script is licensed under GNU GPL version 2.0 or above
# ---------------------------------------------------------------------

### This script does 1 verification ###
### 1) Check if backup load balancer failed and send mail notification ###

### To be modified ###
EMAIL="admin@example.com"

###### Do not make modifications below ######

### Binaries ###
MAIL=$(which mail)

### To restore to original when problem fixed ###
if [ "$1" = "fix" ]; then
    rm -f /root/lb2_problem.txt
    # Empty the Heartbeat log so the old message is not matched again
    > /var/log/ha-log
    exit 1
fi

### Check if already notified ###
cd /root
if [ -f lb2_problem.txt ]; then
    exit 1
fi

### Check if Heartbeat is running on hot standby ###
if tail /var/log/ha-log 2>/dev/null | grep -q "Asking other side for ping node count"; then
    echo "Backup load balancer failed" > /root/lb2_problem.txt
    $MAIL -s "Backup load balancer problem" "$EMAIL" < /root/lb2_problem.txt
fi
We make this script executable :
chmod +x /root/lb2_check
If lb2.example.com fails, the script creates the file /root/lb2_problem.txt and sends a mail notification. As long as lb2_problem.txt exists, the script won't check again. The Heartbeat log file must also be emptied once the problem is fixed, otherwise the script would keep matching the old log entry.
Once the problem is fixed on lb2.example.com, please manually run :
/root/lb2_check fix
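The "notify once, then stay quiet until fix" behaviour described above can be isolated into two small functions. This is only a minimal sketch of the pattern, not a replacement for the script: the function names notify_once and clear_flag are made up for illustration, and the message is printed instead of being piped to mail.

```shell
#!/bin/bash
# Sketch of the flag-file pattern. Function names are illustrative
# (hypothetical), they do not appear in the scripts of this tutorial.

# notify_once FLAGFILE MESSAGE
# Prints MESSAGE and creates FLAGFILE the first time it is called.
# While FLAGFILE exists it prints nothing and returns 1.
notify_once() {
    local flag="$1" message="$2"
    if [ -f "$flag" ]; then
        return 1        # already notified, stay quiet
    fi
    echo "$message" > "$flag"
    cat "$flag"         # the real script feeds this file to mail instead
}

# clear_flag FLAGFILE
# What the "fix" argument does for the flag file: forget the notification.
clear_flag() {
    rm -f "$1"
}
```

For example, notify_once /tmp/demo_flag "Backup load balancer failed" prints the message on the first call only; after clear_flag /tmp/demo_flag it would print again.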
The next script checks whether any ports have failed on either web1 or web2 by reading the ldirectord log file. ldirectord already has its own mail notification, but it sends a huge number of mails; this script sends only one until you fix the problem :
vi /root/ports_failed
and make it look like this :
#!/bin/bash
# Ldirectord ports failure check
# Copyright (c) 2008 blogama.org
# This script is licensed under GNU GPL version 2.0 or above
# ---------------------------------------------------------------------

### This script does 1 verification ###
### 1) Check for port failure on load balanced servers ###

### To be modified ###
EMAIL="admin@example.com"

###### Do not make modifications below ######

### Binaries ###
MAIL=$(which mail)

### To restore to original when problem fixed ###
if [ "$1" = "fix" ]; then
    rm -f /root/port_problem.txt
    # Empty the ldirectord log so old "Deleted" entries are not matched again
    > /var/log/ldirectord.log
fi

### Check if already notified ###
cd /root
if [ -f port_problem.txt ]; then
    # Keep the list of failed ports up to date, but do not mail again
    grep Deleted /var/log/ldirectord.log > /var/log/port_problem.log
    exit 1
fi

### Check if port failed ###
if grep -q Deleted /var/log/ldirectord.log 2>/dev/null; then
    grep Deleted /var/log/ldirectord.log > /var/log/port_problem.log
    echo "Ports problem see logfile /var/log/port_problem.log" > /root/port_problem.txt
    $MAIL -s "Some ports failed" "$EMAIL" < /root/port_problem.txt
fi
We make it executable :
chmod +x /root/ports_failed
As with the first script, once the problem is fixed you must run :
/root/ports_failed fix
so that the script starts checking again.
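The mail sent by this script only points to /var/log/port_problem.log. If you would rather see the failed ports directly in the mail, a variation could look like the sketch below. It uses the same "Deleted" match as the script; the function names deleted_lines and build_report are hypothetical, and the exact layout of the ldirectord log lines is not assumed beyond containing the word "Deleted".

```shell
#!/bin/bash
# Sketch: build a mail body that contains the matching log lines.
# Function names are illustrative (hypothetical).

# deleted_lines LOGFILE
# Prints every line of LOGFILE that contains "Deleted".
deleted_lines() {
    grep "Deleted" "$1" 2>/dev/null
}

# build_report LOGFILE
# Prints a report with the matching lines.
# Prints nothing and returns 1 when there is no match.
build_report() {
    local lines
    lines=$(deleted_lines "$1") || return 1
    echo "Ports problem, matching lines from $1:"
    echo "$lines"
}
```

The output of build_report /var/log/ldirectord.log could then be written to /root/port_problem.txt in place of the fixed one-line message.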
Now add both scripts to your crontab :
crontab -e
* * * * * /root/ports_failed >/dev/null 2>&1
* * * * * /root/lb2_check >/dev/null 2>&1
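If you prefer not to edit the crontab by hand, the entries can also be added from the shell. The following is a rough sketch under the assumption that your cron accepts a new crontab on standard input (crontab -); the helper add_cron_line is a made-up name. It only appends a line when an identical one is not already present, so it can be run several times.

```shell
#!/bin/bash
# Sketch: append a cron entry without creating duplicates.
# add_cron_line is an illustrative (hypothetical) helper name.

# add_cron_line LINE
# Reads the current crontab on stdin and prints it back,
# followed by LINE unless an identical line already exists.
add_cron_line() {
    local line="$1" current
    current=$(cat)
    if [ -n "$current" ]; then
        printf '%s\n' "$current"
    fi
    # -x matches whole lines, -F treats the "*" characters literally
    if ! printf '%s\n' "$current" | grep -qxF -- "$line"; then
        printf '%s\n' "$line"
    fi
}
```

A possible use: crontab -l 2>/dev/null | add_cron_line '* * * * * /root/lb2_check >/dev/null 2>&1' | crontab -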
15.2 Monitoring from lb2.example.com
Monitoring from the second load balancer is important because it tells us whether the master load balancer has failed and, if it has, keeps an eye on port failures on web1 and web2.
First we must install sendmail so lb2.example.com will be able to send mail :
apt-get install sendmail mailx
Then create the script :
vi /root/ports_check
And paste this script :
#!/bin/bash
# Ldirectord ports failure check
# Copyright (c) 2008 blogama.org
# This script is licensed under GNU GPL version 2.0 or above
# ---------------------------------------------------------------------

### This script does 2 verifications ###
### 1) check if master load balancer failed and send mail notification ###
### 2) If master load balancer failed, check for port failure on load balanced servers ###

### To be modified ###
EMAIL="admin@example.com"

###### Do not make modifications below ######

### Binaries ###
MAIL=$(which mail)

### Date ###
NOW=$(date)

### To restore to original when problem fixed ###
if [ "$1" = "fix" ]; then
    cd /root/
    rm -f lb1_problem.txt port_problem.txt server_problem_notified.txt
    # Empty both logs so old entries are not matched again
    > /var/log/ldirectord.log
    > /var/log/ha-log
    exit 1
fi

# Check if lb2.example.com has taken over (means that lb1.example.com failed)
#$LDIRECTORD /etc/ha.d/ldirectord.cf status 2>&1 | grep running
if grep -q "takeover complete" /var/log/ha-log 2>/dev/null; then

    ### Check if already notified ###
    cd /root
    if [ -f port_problem.txt ]; then
        grep Deleted /var/log/ldirectord.log > /var/log/port_problem.log
        exit 1
    fi

    ### Check if port failed ###
    if grep -q Deleted /var/log/ldirectord.log 2>/dev/null; then
        grep Deleted /var/log/ldirectord.log > /var/log/port_problem.log
        echo "Ports problem see logfile /var/log/port_problem.log" > /root/port_problem.txt
        $MAIL -s "Some ports failed" "$EMAIL" < /root/port_problem.txt
    fi

    ### Check if already notified that master load balancer failed ###
    cd /root
    if [ -f server_problem_notified.txt ]; then
        exit 1
    fi

    ### Notify that master load balancer failed ###
    MESSAGE="$NOW : Master load balancer failed"
    echo "$MESSAGE" > lb1_problem.txt
    $MAIL -s "Master load balancer failed" "$EMAIL" < /root/lb1_problem.txt
    echo "notified" > server_problem_notified.txt
fi
We make it executable :
chmod +x /root/ports_check
And we add it to our crontab :
crontab -e
* * * * * /root/ports_check >/dev/null 2>&1
When you get a notification from the script, fix the problem and then run :
/root/ports_check fix
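Before running the fix command it can be handy to see which notifications are still pending on lb2. The sketch below simply lists which of the three flag files used by the script exist in a given directory; the function name pending_flags is hypothetical, the file names are the ones from the script.

```shell
#!/bin/bash
# Sketch: show which flag files of the lb2 script are currently set.
# pending_flags is an illustrative (hypothetical) name.

# pending_flags DIR
# Prints the name of each flag file that exists in DIR, one per line.
pending_flags() {
    local dir="$1" f
    for f in lb1_problem.txt port_problem.txt server_problem_notified.txt; do
        if [ -f "$dir/$f" ]; then
            echo "$f"
        fi
    done
}
```

Running pending_flags /root prints nothing when no notification is pending.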
15.3 Monitoring from web1 & web2
Monitoring of the web cluster is already partially done with monit and munin.
The part that is not covered yet is the monitoring of MySQL replication.
Please read the following article :
Repair MySQL master-master replication
MySQL monitoring is optional, but on a production server problems can happen with MySQL replication, so I really recommend using those scripts or something similar to check database consistency.
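As a rough idea of what "something similar" could look like, the sketch below checks the two replication threads reported by SHOW SLAVE STATUS. It is not the script from the article mentioned above, and it does not verify that the data itself is consistent. The function name replication_ok is made up, and it is assumed that the mysql client can log in without a prompt (for example through /root/.my.cnf).

```shell
#!/bin/bash
# Sketch: decide from "SHOW SLAVE STATUS\G" output whether replication runs.
# replication_ok is an illustrative (hypothetical) name.

# replication_ok
# Reads the status output on stdin.
# Returns 0 only if both Slave_IO_Running and Slave_SQL_Running are "Yes".
replication_ok() {
    local status
    status=$(cat)
    printf '%s\n' "$status" | grep -q "Slave_IO_Running: Yes" &&
    printf '%s\n' "$status" | grep -q "Slave_SQL_Running: Yes"
}
```

A possible use on web1 or web2: mysql -e 'SHOW SLAVE STATUS\G' | replication_ok || echo "Replication problem"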
15.4 Monitoring from remote server
This part adds an extra layer of safety by checking the important ports (25, 53, 80, 443) from a remote server. First make sure these packages are installed :
apt-get install dnsutils sendmail mailx telnet
Here is the script :
#!/bin/bash
# Script to check important port on remote webserver
# Copyright (c) 2008 blogama.org
# This script is licensed under GNU GPL version 2.0 or above
# ---------------------------------------------------------------------

### This script does a verification on port 25, 53, 80 and 443 ###
### After 2 failed check it will send a mail notification ###

### To be modified ###
WEBSERVERIP="192.168.1.106"
MAILSERVERIP="192.168.1.106"
EMAIL="admin@example.com"
DNSSERVERIP="192.168.1.106"
DOMAINTOCHECKDNS="example.com"
DOMAINIP="192.168.1.106"

###### Do not make modifications below ######

### Binaries ###
MAIL=$(which mail)
TELNET=$(which telnet)
DIG=$(which dig)

### Check if already notified ###
cd /root
if [ -f server_problem.txt ]; then
    exit 1
fi

### Test SMTP ###
if echo "quit" | $TELNET "$MAILSERVERIP" 25 2>/dev/null | grep -q Connected; then
    echo "PORT CONNECTED"
else
    if [ -f server_problem_first_time_25.txt ]; then
        echo "PORT 25 NOT CONNECTED" >> /root/server_problem.txt
    else
        echo "NOT CONNECTED" > /root/server_problem_first_time_25.txt
    fi
fi

### Test HTTP ###
if echo "quit" | $TELNET "$WEBSERVERIP" 80 2>/dev/null | grep -q Connected; then
    echo "PORT CONNECTED"
else
    if [ -f server_problem_first_time_80.txt ]; then
        echo "PORT 80 NOT CONNECTED" >> /root/server_problem.txt
    else
        echo "NOT CONNECTED" > /root/server_problem_first_time_80.txt
    fi
fi

### Test HTTPS ###
if echo "quit" | $TELNET "$WEBSERVERIP" 443 2>/dev/null | grep -q Connected; then
    echo "PORT CONNECTED"
else
    if [ -f server_problem_first_time_443.txt ]; then
        echo "PORT 443 NOT CONNECTED" >> /root/server_problem.txt
    else
        echo "NOT CONNECTED" > /root/server_problem_first_time_443.txt
    fi
fi

### Test DNS ###
if $DIG "$DOMAINTOCHECKDNS" @"$DNSSERVERIP" | grep -q "$DOMAINIP"; then
    echo "PORT CONNECTED"
else
    if [ -f server_problem_first_time_53.txt ]; then
        echo "PORT 53 NOT CONNECTED" >> /root/server_problem.txt
    else
        echo "NOT CONNECTED" > /root/server_problem_first_time_53.txt
    fi
fi

### Send mail notification after 2 failed check ###
if [ -f server_problem.txt ]; then
    $MAIL -s "Server problem" "$EMAIL" < /root/server_problem.txt
fi
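The "mail only after 2 failed checks" rule is repeated four times in the script. Below is a minimal sketch of that rule as a single function, using the same file names as the script. Two things are assumptions and not part of the original: the function name record_check, and the fact that a successful check removes the first-failure marker, so that two isolated failures far apart do not add up to a notification.

```shell
#!/bin/bash
# Sketch of the two-strike rule. record_check is an illustrative
# (hypothetical) name. Removing the marker on success is an assumption,
# the script above never removes it.

# record_check DIR PORT RESULT     (RESULT is "ok" or "fail")
#   first failure  -> create DIR/server_problem_first_time_PORT.txt
#   second failure -> append a line to DIR/server_problem.txt
#   success        -> remove the first-failure marker
record_check() {
    local dir="$1" port="$2" result="$3"
    local marker="$dir/server_problem_first_time_$port.txt"
    if [ "$result" = "ok" ]; then
        rm -f "$marker"
    elif [ -f "$marker" ]; then
        echo "PORT $port NOT CONNECTED" >> "$dir/server_problem.txt"
    else
        echo "NOT CONNECTED" > "$marker"
    fi
}
```

Each test in the script would then boil down to one call, for example record_check /root 80 fail when the telnet test on port 80 does not connect.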
References :
The Perfect Server - Ubuntu Hardy Heron (Ubuntu 8.04 LTS Server)
How To Set Up A Loadbalanced High-Availability Apache Cluster Based On Ubuntu 8.04 LTS
Ldirectord manpage
Installing Xen On An Ubuntu 8.04 (Hardy Heron) Server From The Ubuntu Repositories
Howto: Setup a DNS server with bind
HOWTO Delegate Reverse Subnet Maps
Guide to reverse zones
Installing An Ubuntu Hardy 8.04 LTS DNS Server With BIND
Digg howto
How To Set Up Database Replication In MySQL
http://www.onlamp.com/pub/a/onlamp/2006/04/20/advanced-mysql-replication.html?page=1
MySQL Master-Master replication table sync
Virtual Users And Domains With Postfix, Courier, MySQL And SquirrelMail (Ubuntu 8.04 LTS)
Mirror Your Web Site With rsync
Server Monitoring With munin And monit On Debian Etch

