[Gluster-devel] self-heal issues with AFR in 2.5

Sebastien LELIEVRE slelievre at tbs-internet.com
Tue Jun 26 13:31:55 UTC 2007


Hi everyone,

I just wanted to give you a little feedback on mainline--2.5. I am using
my 3 test servers, and will soon add 2 more.

The main problem I am experiencing is: self-heal repairs directory
inconsistencies but not file inconsistencies.

Here is the current state:

tbs-lab1 (client)
tbs-lab2 (brick1)
tbs-lab3 (brick2)

on the client, afr = brick1 + brick2.

For the test, brick1 holds all the data and brick2 is empty (for now).

brick1 and brick2 are both on an ext3 filesystem mounted with extended
attributes:

/dev/sda2	/glusterfs	ext3	defaults,user_xattr	0 2
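
A quick way to confirm that the option really is set for the brick mount
point could look like the sketch below. has_user_xattr is a hypothetical
helper, not something shipped with GlusterFS, and it only inspects an
fstab-style file; it does not prove the running kernel honours the option.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of GlusterFS):
# succeeds if the fstab-style file $2 (default /etc/fstab) lists the
# user_xattr option for mount point $1.
has_user_xattr() {
    mount_point=$1
    fstab=${2:-/etc/fstab}
    # Field 2 is the mount point, field 4 the comma-separated options.
    awk -v mp="$mount_point" '
        $1 !~ /^#/ && $2 == mp { print $4 }
    ' "$fstab" | tr ',' '\n' | grep -qx 'user_xattr'
}
```

On each brick this would be called as "has_user_xattr /glusterfs && echo ok".
A live check with setfattr and getfattr from the attr package on a scratch
file under /glusterfs would be more conclusive.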

I am currently using glusterfs--mainline--2.5--patch-220 on each machine.

client : CFLAGS="-O3" ./configure --prefix=/usr/local --sysconfdir=/etc
--disable-server --disable-ibverbs

servers : CFLAGS="-O3" ./configure --prefix=/usr/local --sysconfdir=/etc
--disable-fuse-client --disable-ibverbs

Here is the server spec file (X is the brick number, so 1 or 2
here):

volume brickX
  type storage/posix
  option directory /glusterfs
end-volume

volume locksX
  type features/posix-locks
  subvolumes brickX
end-volume

volume serverX
  type protocol/server
  option transport-type tcp/server
  subvolumes locksX
  option auth.ip.locksX.allow 192.168.0.5 # client IP
end-volume

volume traceX
  type debug/trace
  subvolumes serverX
  option debug on
end-volume
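
Since the two server spec files differ only in X, they could be produced from
one template. The following is a sketch only: gen_server_spec is a
hypothetical helper name, and the default client IP is simply the one quoted
in the spec above.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of GlusterFS):
# print the server spec shown above with X replaced by $1.
# $2 optionally overrides the client IP allowed to connect.
gen_server_spec() {
    x=$1
    client_ip=${2:-192.168.0.5}
    cat <<EOF
volume brick${x}
  type storage/posix
  option directory /glusterfs
end-volume

volume locks${x}
  type features/posix-locks
  subvolumes brick${x}
end-volume

volume server${x}
  type protocol/server
  option transport-type tcp/server
  subvolumes locks${x}
  option auth.ip.locks${x}.allow ${client_ip} # client IP
end-volume

volume trace${x}
  type debug/trace
  subvolumes server${x}
  option debug on
end-volume
EOF
}
```

For example, "gen_server_spec 2 > /etc/glusterfs/glusterfs-server.vol" on
tbs-lab3 would write the brick2 variant.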

I launch both bricks with this command:

glusterfsd -f /etc/glusterfs/glusterfs-server.vol
--log-file=/var/log/glusterfs/glusterfsd.log --log-level=DEBUG

Now here is the client spec file:

volume brick1c
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.0.6	# brick1 IP
  option remote-subvolume locks1
end-volume

volume brick2c
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.0.7	# brick2 IP
  option remote-subvolume locks2
end-volume

volume afr
  type cluster/afr
  subvolumes brick1c brick2c
  option replicate *:2
end-volume

volume writebehind
 type performance/write-behind
 option aggregate-size 131072
 subvolumes afr
end-volume

And I am mounting the volume like this:

glusterfs -f /etc/glusterfs/glusterfs-client.vol -l
/var/log/glusterfs/glusterfs.log /mnt/glusterfs --log-level=DEBUG

The DEBUG log level is there to capture as much information as possible for
diagnosing the issue.

So, at this point, we have:
brick1:~# ls -l /glusterfs
total 4
drwxr-xr-x 2 www-data www-data 4096 2007-05-11 11:27 apache2-default

and :
brick2:~# ls -l /glusterfs
total 0

If I run 'ls -l' on the client, it immediately creates the
apache2-default directory on brick2 (but not its subdirectories, which
haven't been accessed yet).


Here is the issue: let's access a file from the client, for example:

cp /mnt/glusterfs/apache2-default/apache_pb2_ani.gif /dev/null

brick1:~# ls -l /glusterfs
total 4
drwxr-xr-x 2 www-data www-data 4096 2007-05-11 11:27 apache2-default

brick1:~# ls -l /glusterfs/apache2-default/
total 151
-rw-r--r-- 1 www-data www-data 2160 2007-05-11 11:27 apache_pb2_ani.gif
-rw-r--r-- 1 www-data www-data 2414 2007-05-11 11:27 apache_pb2.gif
.... bla bla bla *snip* ...
-rw-r--r-- 1 www-data www-data   26 2007-05-11 11:27 robots.txt


brick2:~# ls -l /glusterfs
total 4
drwxr-xr-x 2 root root 4096 2007-06-26 10:03 apache2-default

brick2:~# ls -l /glusterfs/apache2-default/
total 0

Can you see where I might have forgotten something or done something
wrong?

The log files are too big to attach (the limit is 40 kB).

I can provide them to anyone who wants them.

Cheers,

Sebastien LELIEVRE
slelievre at tbs-internet.com           Services to ISP
TBS-internet                   http://www.TBS-internet.com



