[Gluster-users] Shared web hosting with GlusterFS and inotify

Emile Heitor emile.heitor at nbs-system.com
Thu Sep 16 17:07:10 UTC 2010


Hi list,

We've been running various tests with our GlusterFS / inotify setup and 
it's been very stable so far. For anyone who wants to try this 
solution, here's a really quick HOWTO (Debian based, but easily portable):

Rationale: declaring a GlusterFS mountpoint as an Apache DocumentRoot 
is, for now, way too slow for a busy web server. Our approach is to 
point the DocumentRoot to the local storage, and use inotify to 
replicate the changes made by Apache to the GlusterFS mountpoint.

GlusterFS configuration
===============

Used version: glusterfs-server / glusterfs-client 3.0.5 (backport)

#  apt-get install glusterfs-server

Configuration
-------------------

On both machines:

# cd /etc/glusterfs
# glusterfs-volgen --name repstore1 --raid 1 machine1:/var/www \
    machine2:/var/www
# cp machine1-repstore1-export.vol glusterfsd.vol   # on machine2: machine2-repstore1-export.vol
# cp repstore1-tcp.vol glusterfs.vol
# echo '/etc/glusterfs/glusterfs.vol  /var/gluster  glusterfs  defaults,_netdev  0  0' >> /etc/fstab

/var/www is the local storage (where the Apache DocumentRoot must point).
/var/gluster is the GlusterFS (shared) mountpoint (where customer 
home directories may reside).
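
Since the replication script writes into /var/gluster, it may be worth 
checking that the volume is really mounted before the script starts; 
otherwise rsync would quietly fill the empty local directory underneath 
the mountpoint. A minimal sketch of such a check follows. The function 
name is_mounted_as is hypothetical, and the fuse.glusterfs type string is 
an assumption about how the FUSE client shows up in /proc/mounts, so 
compare it with the output of `mount` on your own system.

```shell
#!/bin/sh
# is_mounted_as <mountpoint> <fstype> [mounts_file]
# Succeeds if <mountpoint> is listed in the mounts table with that type.
# mounts_file defaults to /proc/mounts, whose fields are:
#   device  mountpoint  fstype  options  dump  pass
is_mounted_as() {
    awk -v mp="$1" -v fs="$2" \
        '$2 == mp && $3 == fs { found = 1 } END { exit !found }' \
        "${3:-/proc/mounts}"
}
```

For example, `is_mounted_as /var/gluster fuse.glusterfs || exit 1` could 
go near the top of insync.sh, ideally with a short sleep before the exit, 
since init throttles entries that respawn too quickly.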

inotify configuration
=============

# apt-get install inotify-tools

Copy the following script to both servers and make it executable:

$ cat insync.sh
#!/bin/sh

[ $# -lt 2  ] && echo "usage: $0 <source> <destination>" && exit 1

PATH=${PATH}:/bin:/sbin:/usr/bin:/usr/sbin; export PATH

SRC=$1
DST=$2

# all inotify paths are relative to SRC, so bail out if we cannot enter it
cd "${SRC}" || exit 1

LOG='/var/log/insync.log'
# no recursion: -d transfers a directory's entries without descending into it
RSYNC='rsync -dlptgoD --delete "${srcdir}" "${dstdir}/" 2>> "${LOG}"'

do_rsync() {
     eval ${RSYNC}
}

inotifywait -mr \
     --exclude '(\..*\.sw.*|\.landfill.*)' \
     -e close_write -e create -e delete_self -e delete . 2>> "${LOG}" | \
     while read -r line
     do
         dir="${line%/*}/"
         tmp="${line##*/ }"
         action="${tmp%% *}"
         file="${tmp#* }"

         srcdir="${SRC}/${dir}"
         dstdir="${DST}/${dir}"

         srcfile="${srcdir}/${file}"
         dstfile="${dstdir}/${file}"

         # skip anything that lives on a tmpfs
         [ -d "${srcdir}" ] && \
         [ ! -z "`df -T \"${srcdir}\"|grep tmpfs`" ] && \
             continue

         # debug
         echo "${dir}" "${action}" "${file}" >> "${LOG}"

         case "${action}" in
         CLOSE_WRITE,CLOSE)
             # source (parent) dir does not exist, don't rsync
             [ ! -d "${srcdir}" ] && continue
             # it's a directory, let CREATE,ISDIR handle it
             [ -d "${srcfile}" ] && continue

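             # nonexistent dest file, just copy it
             if [ ! -f "${dstfile}" ]; then
                 do_rsync
                 continue
             fi

             # both files exist: rsync only if their checksums differ
             md5src="`md5sum \"${srcfile}\"|cut -d' ' -f1`"
             md5dst="`md5sum \"${dstfile}\"|cut -d' ' -f1`"
             # "!=" is the portable operator; "==" is not accepted by every sh
             [ "${md5src}" != "${md5dst}" ] && do_rsync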
             ;;
         CREATE,ISDIR)
             # source dir must exist
             [ -d "${srcfile}" ] && \
             [ ! -d "${dstfile}" ] && mkdir "${dstfile}" 2>> "${LOG}"
             ;;
         DELETE*)
             rm -rf "${dstfile}" 2>> "${LOG}"
             ;;
         esac
     done
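
To make the parameter expansions in the loop easier to follow, here is 
the same splitting logic pulled out into a standalone function. It is 
only an illustration: parse_event is a hypothetical helper name, not part 
of insync.sh, and it assumes the default `inotifywait -m` output format 
of "<watched dir>/ <EVENTS> <file>".

```shell
#!/bin/sh
# parse_event <line>: split one inotifywait line the way the loop does
# and print the three parts separated by "|".
parse_event() {
    line=$1
    dir="${line%/*}/"      # path up to and including the last "/"
    tmp="${line##*/ }"     # what follows "<dir>/ ": "<EVENTS> <file>"
    action="${tmp%% *}"    # first word: the comma-separated event list
    file="${tmp#* }"       # the rest: the file name (spaces are kept)
    printf '%s|%s|%s\n' "${dir}" "${action}" "${file}"
}
```

Calling `parse_event './sub/ CLOSE_WRITE,CLOSE index.html'` prints 
`./sub/|CLOSE_WRITE,CLOSE|index.html`, which matches the "${dir}", 
"${action}" and "${file}" values the case statement works with.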

Add the following entry to /etc/inittab so the script starts at boot 
time and respawns if anything fails:

ino:2345:respawn:/path/to/insync.sh /var/www /var/gluster
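
init only re-reads /etc/inittab when asked to, so the new entry does 
nothing until then. The sketch below shows one way to check that the 
entry is present and then trigger the reload. Both function names are 
hypothetical, and it assumes a sysvinit system where `telinit q` asks 
init to re-examine inittab (root is required).

```shell
#!/bin/sh
# inittab_has_id <id> [file]: succeed if an active (uncommented) entry
# with this id exists. The file argument defaults to /etc/inittab.
inittab_has_id() {
    awk -F: -v id="$1" '$1 == id { found = 1 } END { exit !found }' \
        "${2:-/etc/inittab}"
}

# reload_inittab: ask init to re-read /etc/inittab (sysvinit, as root).
reload_inittab() {
    telinit q
}
```

Typical use would be `inittab_has_id ino && reload_inittab`, after which 
insync.sh should show up in the process list.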

As the script is still in development, we made it log every operation so 
that we can follow its status very closely. To keep that log under 
control, we created the following file, /etc/logrotate.d/insync:

/var/log/insync.log {
   rotate 6
   monthly
   compress
   missingok
   notifempty
}

Hope you'll find it useful. Please send some feedback if you see 
anything failing with this method.

Regards,

-- 
Emile Heitor, Responsable d'Exploitation
---
www.nbs-system.com, 140 Bd Haussmann, 75008 Paris
Tel: 01.58.56.60.80 / Fax: 01.58.56.60.81



