[Gluster-devel] Client side AFR race conditions?
    Martin Fick 
    mogulguy at yahoo.com
       
    Tue May  6 16:59:48 UTC 2008
    
    
  
--- Anand Babu Periasamy <ab at gnu.org.in> wrote:
> I really want to understand the issue and help you
> out. We always have heated discussions even in our 
> labs. We only take it positively :) Your feedback is
> very valuable to us.
No prob!  I appreciate it.
> Martin Fick wrote:
> In other words, what prevents conflicts when 
> client A & B both write to the same file?  Could 
> A's write to subvolume A succeed before B's write
> to subvolume A, and at the same time B's write to
> subvolume B succeed before A's write to subvolume
> B? 
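To make the interleaving in the question concrete, here is a minimal local sketch. It involves no glusterfs at all: two temporary directories stand in for subvolumes A and B, and the helper name simulate_race is made up for illustration. It applies the same two whole-file writes to each stand-in replica in opposite order.

```shell
#!/bin/bash
# simulate_race: hypothetical helper, local files only (no AFR involved).
# Each "replica" is a temp directory; each client write replaces the file.
simulate_race() {
  local ra rb
  ra=$(mktemp -d) ; rb=$(mktemp -d)
  # Replica "a" receives client A's write first, then client B's.
  echo AAAA > "$ra/f" ; echo BBBB > "$ra/f"
  # Replica "b" receives client B's write first, then client A's.
  echo BBBB > "$rb/f" ; echo AAAA > "$rb/f"
  # The last writer wins on each replica, so the two copies end up different.
  echo "a=$(cat "$ra/f") b=$(cat "$rb/f")"
  rm -rf "$ra" "$rb"
}
simulate_race
```

Each replica keeps whichever write arrived last, so without some ordering between the clients the two copies can disagree.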
OK, just for giggles I created a test script to
attempt to replicate this theoretical problem.  I was
able to do so fairly easily; in fact, more easily than
I might have hoped!
What I wrote was a simple shell script which attempts
to write to a file at the same time from two different
processes.  Since this is actually hard to time with a
shell script, I did not expect very good results.  A C
program would likely create much more contention.  The
script simply writes once from each process to a file,
then increments the filename counter and starts over
with the next file.  This should perform only one
20-character write from each process to each file.  I
have attached the test script (stress).  I run the
script with the following options:
  ./stress /mnt -d /mnt2 -c 100
This tells it to perform the test on 100 files and
specifies the 2 different mount points.
As for my glusterfs setup, I use two client afr mounts
on the same machine, /mnt and /mnt2.  As noted above,
the script is run so that each process points to a
different client mount.  The server runs on the same
machine and maps the client subvolumes to /export/a
and /export/b; the configs are below.
Here are the split-brain counts for a few successive
runs of 100 file writes:
4,8,6,7,6,10,18,7,0,5,4,9,10,12.  Not good!  That
means that it is actually very easy to create this
problem: most runs hit it on a single-digit percentage
of the files.
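As a quick check on those numbers, the following sketch totals the counts listed above and prints the mean per run. The helper name summarize_counts is made up for illustration; the only input is the list of counts from this message.

```shell
#!/bin/bash
# summarize_counts: hypothetical helper; takes a comma-separated list of
# per-run split-brain counts and prints the run count, total, and mean.
summarize_counts() {
  echo "$1" | tr ',' '\n' |
    awk '{ s += $1; n++ } END { printf "runs=%d total=%d mean=%.1f\n", n, s, s / n }'
}
summarize_counts "4,8,6,7,6,10,18,7,0,5,4,9,10,12"
# prints: runs=14 total=106 mean=7.6
```

With roughly 100 files per run, a mean of about 7.6 mismatched files works out to somewhere around 7 to 8 percent.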
I use this simple command to get those results:
  diff -q /export/a /export/b |wc -l
This command compares the two different subvolumes.  I
have looked at the files themselves, and yes, they are
different: either lots of AAAs or lots of BBBs.
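For a closer look than the bare count, something along these lines could list each mismatched file together with what each backend directory holds. This is only a sketch: report_split is a made-up helper name, and it assumes the two export directories contain plain files with matching names, as the test above produces.

```shell
#!/bin/bash
# report_split DIR_A DIR_B: hypothetical helper. For every regular file in
# DIR_A, compare it with the same-named file in DIR_B and print both
# payloads (first 20 bytes) when they differ.
report_split() {
  local a="$1" b="$2" f name n=0
  for f in "$a"/*; do
    [ -f "$f" ] || continue
    name=$(basename "$f")
    if ! cmp -s "$f" "$b/$name"; then
      printf '%s: a=%s b=%s\n' "$name" \
        "$(head -c 20 "$f")" "$(head -c 20 "$b/$name" 2>/dev/null)"
      n=$((n + 1))
    fi
  done
  echo "split-brain files: $n"
}
```

It would be called as `report_split /export/a /export/b`; the final line should agree with the diff count as long as both directories hold the same set of file names.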
I hope this helps you debug this race condition a
little,
-Martin
I am using Debian, with these packages:
  glusterfs client and server 1.3.8-0pre
  fuse-utils and libfuse  2.7.2-glfs
  linux kernel 2.6.17-1-vserver-686 
Server:
volume a
  type storage/posix
  option directory /export/a
end-volume
volume b
  type storage/posix
  option directory /export/b
end-volume
volume afr
  type cluster/afr
  subvolumes a b
end-volume
volume server
  type protocol/server
  option transport-type tcp/server
  option bind-address 192.168.1.75
  subvolumes a b
  option auth.ip.a.allow *
  option auth.ip.b.allow *
end-volume
Client:
volume ca
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.1.75
  option remote-subvolume a
end-volume
volume cb
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.1.75
  option remote-subvolume b
end-volume
volume afr
  type cluster/afr
  subvolumes ca cb
end-volume
stress:
#!/bin/bash
trap clean HUP INT QUIT TERM
clean()
{
  kill $A $B 2>/dev/null
  wait $A $B
  rm "$A2B" "$B2A" "$P" 2>/dev/null
  exit 1
}
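writei()
{  # dir val rec snd
  # dir: directory to write into   val: payload written to each file
  # rec: fifo the next file number is read from   snd: fifo it is passed on through
  typeset  dir="$1" val="$2" rec="$3" snd="$4"
  typeset -i i=0
  while  read i < "$rec" ; do
    [ -n "$SW_P" ]  && read  < "$P"   # -p: block until the main loop relays a keypress
    echo "$val" > "$dir/$i"  # we follow
    i=$(($i +1))
    [ -n "$SW_S" ]  && sleep 1
    [ -n "$C" ] && [ "$i" -gt "$C" ] && return
    echo $i > "$snd"   # hand the next file number to the peer
    [ -n "$C" ] && [ "$i" -eq "$C" ] && return
    echo "$val" > "$dir/$i"  # we initiate
  done
}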
D="$1"
D2="$D"
COM=/tmp/$(basename $0)
A2B="$COM.a2b"
B2A="$COM.b2a"
P="$COM.prompt"
mkfifo "$A2B" "$B2A"
VA=AAAAAAAAAAAAAAAAAAAAA
VB=BBBBBBBBBBBBBBBBBBBBB
while [ $# -gt 0 ] ; do
  case "$1" in
    -p|--prompt) SW_P="-p" ; mkfifo "$P" ;;
    -s|--sleep)  SW_S="-s" ;;
    -m|--manual) shift ; M="$1" ;;
    -d) shift ; D2="$1" ;;
    -c|--count) shift ; C="$1" ;;
    -a) shift ; VA="$1" ;;
    -b) shift ; VB="$1" ;;
  esac
shift ; done
writei "$D" "$VA" "$B2A$M" "$A2B" &
A=$!
writei "$D2" "$VB" "$A2B$M" "$B2A" &
B=$!
echo 0 > "$B2A"
if [ ! -z "$SW_P" ] ; then
  while true ; do  read ; echo > "$P" ; done
fi
wait $A $B
clean