[Gluster-devel] Client side AFR race conditions?

Martin Fick mogulguy at yahoo.com
Tue May 6 16:59:48 UTC 2008


--- Anand Babu Periasamy <ab at gnu.org.in> wrote:

> I really want to understand the issue and help you
> out. We always have heated discussions even in our
> labs. We only take it positively :) Your feedback is
> very valuable to us.

No prob!  I appreciate it.


> Martin Fick wrote:
> In other words, what prevents conflicts when 
> client A & B both write to the same file?  Could 
> A's write to subvolume A succeed before B's write
> to subvolume A, and at the same time B's write to
> subvolume B succeed before A's write to subvolume
> B? 
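
To make the interleaving in question concrete, here is a
minimal sketch. It does not touch glusterfs at all: two
plain temporary files stand in for the two subvolumes, and
the helper name apply_in_order is made up for this
illustration. It only shows that the same two writes,
applied in opposite orders, leave the copies different.

```shell
#!/bin/bash
# Illustration only: plain files play the role of the replicas.

# apply_in_order FILE VAL...
# Each write truncates and replaces FILE, like `echo val > file`,
# so the last write applied is the one that survives.
apply_in_order()
{
  typeset file="$1" val
  shift
  for val in "$@" ; do
    echo "$val" > "$file"
  done
}

tmp=$(mktemp -d)
apply_in_order "$tmp/replica_a" AAAA BBBB  # subvolume A: A's write first, B's last
apply_in_order "$tmp/replica_b" BBBB AAAA  # subvolume B: the opposite order

if cmp -s "$tmp/replica_a" "$tmp/replica_b" ; then
  result="replicas agree"
else
  result="replicas differ"
fi
rm -r "$tmp"
echo "$result"
```

Each copy ends up holding whichever write reached it last,
so the two copies disagree.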


OK, just for giggles I created a test script to
attempt to replicate this theoretical problem.  I was
able to do so fairly easily; more easily, in fact,
than I might have hoped!

What I wrote is a simple shell script that attempts
to write to a file at the same time from two different
processes.  Since this is hard to time precisely with
a shell script, I did not expect very good results; a
C program would likely create much more contention.
The script writes once from each process to a file,
then increments the filename counter and starts over
with the next file.  Each process should therefore
perform exactly one write of roughly 20 characters to
each file.  I have attached the test script (stress)
below.  I run the script with the following options:

  ./stress /mnt -d /mnt2 -c 100

This tells it to perform the test on 100 files and
specifies the 2 different mount points.


As for my glusterfs setup, I use two client afr mounts
on the same machine, /mnt and /mnt2.  As noted above,
the script is run so that each process points to a
different client mount.  The server runs on the same
machine and maps the client subvolumes to /export/a
and /export/b (configs below).


Here are the split-brain counts for a few successive
runs of 100 file writes each:
4, 8, 6, 7, 6, 10, 18, 7, 0, 5, 4, 9, 10, 12.  Not
good!  That means it is actually very easy to create
this problem: most runs leave a single-digit
percentage of the files inconsistent, and a few runs
reach double digits.
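
As a quick sanity check on those numbers, the following
sketch totals the counts listed above and prints the mean
per run.  The variable names are only illustrative, and
since each run covers 100 files, the mean can be read
directly as a percentage.

```shell
#!/bin/bash
# Counts copied from the runs reported above (100 files per run).
counts="4 8 6 7 6 10 18 7 0 5 4 9 10 12"

sum=0 ; n=0 ; max=0
for c in $counts ; do
  sum=$(($sum + $c))
  n=$(($n + 1))
  if [ "$c" -gt "$max" ] ; then max=$c ; fi
done

# Integer arithmetic in tenths of a file, rounded to the nearest tenth.
tenths=$(( (sum * 10 + n / 2) / n ))
summary="runs=$n total=$sum mean=$((tenths / 10)).$((tenths % 10)) max=$max"
echo "$summary"
```

This prints runs=14 total=106 mean=7.6 max=18, i.e. about
7.6% of the files per run on average.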

I use this simple command to get those results:

  diff -q /export/a /export/b |wc -l

This command compares the two different subvolumes.  I
have looked at the files themselves, and yes, they are
different: one copy has lots of AAAs, the other lots
of BBBs.
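
One caveat with counting diff -q output lines is that diff
also prints an "Only in ..." line for any file present on
just one side.  A stricter count, sketched here under the
assumption that only same-named regular files with
differing content should be counted, could look like this
(count_differing is a made-up helper name, not part of the
stress script):

```shell
#!/bin/bash
# count_differing DIR_A DIR_B
# Prints how many regular files in DIR_A have a same-named regular file
# in DIR_B whose content differs.  Files that exist on only one side are
# skipped rather than counted.
count_differing()
{
  typeset a="$1" b="$2" f name
  typeset -i n=0

  for f in "$a"/* ; do
    [ -f "$f" ] || continue
    name=$(basename "$f")
    [ -f "$b/$name" ] || continue
    cmp -s "$f" "$b/$name" || n=$(($n + 1))
  done
  echo $n
}
```

With the setup described here it would be called as
count_differing /export/a /export/b.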

I hope this helps you debug this race condition a
little,


-Martin



I am using Debian with these packages:

  glusterfs client and server 1.3.8-0pre
  fuse-utils and libfuse  2.7.2-glfs
  linux kernel 2.6.17-1-vserver-686 


Server:

volume a
  type storage/posix
  option directory /export/a
end-volume

volume b
  type storage/posix
  option directory /export/b
end-volume

volume afr
  type cluster/afr
  subvolumes a b
end-volume

volume server
  type protocol/server
  option transport-type tcp/server
  option bind-address 192.168.1.75

  subvolumes a b

  option auth.ip.a.allow *
  option auth.ip.b.allow *
end-volume


Client:


volume ca
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.1.75
  option remote-subvolume a
end-volume

volume cb
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.1.75
  option remote-subvolume b
end-volume

volume afr
  type cluster/afr
  subvolumes ca cb
end-volume




stress:


#!/bin/bash

trap clean HUP INT QUIT TERM

clean()
{
  kill $A $B 2>/dev/null
  wait $A $B
  rm "$A2B" "$B2A" "$P" 2>/dev/null
  exit 1
}

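# writei DIR VAL REC SND
# Read a file counter from fifo REC, write VAL to DIR/<counter>, then pass
# the incremented counter to the peer through fifo SND and write the next
# file, so that both processes write each file at about the same time.
writei()
{  # dir val rec snd
  typeset  dir="$1" val="$2" rec="$3" snd="$4"
  typeset -i i=0

  while  read i < "$rec" ; do
    [ -n "$SW_P" ]  && read  < "$P"
    echo "$val" > "$dir/$i"  # we follow
    i=$(($i +1))

    [ -n "$SW_S" ]  && sleep 1

    # Separate tests, so an empty $C (no -c given) never reaches -gt/-eq.
    [ -n "$C" ] && [ $i -gt "$C" ] && return
    echo $i > "$snd"
    [ -n "$C" ] && [ $i -eq "$C" ] && return
    echo "$val" > "$dir/$i"  # we initiate
  done
}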

D="$1"
D2="$D"

COM=/tmp/$(basename "$0")
A2B="$COM.a2b"
B2A="$COM.b2a"
P="$COM.prompt"
mkfifo "$A2B" "$B2A"

VA=AAAAAAAAAAAAAAAAAAAAA
VB=BBBBBBBBBBBBBBBBBBBBB

while [ $# -gt 0 ] ; do
  case "$1" in
    -p|--prompt) SW_P="-p" ; mkfifo "$P" ;;
    -s|--sleep)  SW_S="-s" ;;

    -m|--manual) shift ; M="$1" ;;
    -d) shift ; D2="$1" ;;
    -c|--count) shift ; C="$1" ;;
    -a) shift ; VA="$1" ;;
    -b) shift ; VB="$1" ;;
  esac
shift ; done


writei "$D" "$VA" "$B2A$M" "$A2B" &
A=$!

writei "$D2" "$VB" "$A2B$M" "$B2A" &
B=$!

echo 0 > "$B2A"

if [ ! -z "$SW_P" ] ; then
  while true ; do  read ; echo > "$P" ; done
fi

wait $A $B
clean



