[Gluster-users] favorite-child and self-heal BUG ?

artur.k a.kaminski at o2.pl
Thu Jan 8 11:03:44 UTC 2009


Self-heal doesn't seem to work correctly.

on client:

Command line : /usr/sbin/glusterfs --log-level=WARNING --volfile=/etc/glusterfs/glusterfs-client.vol /mnt/glusterfs
given volfile
+-----
 1: volume client1
 2:   type protocol/client
 3:   option transport-type tcp/client
 4:   option remote-host trac-xx-1.atm
 5:   option remote-port 6996
 6:   option remote-subvolume brick
 7: end-volume
 8:
 9: volume client2
10:  type protocol/client
11:  option transport-type tcp/client
12:  option remote-host trac-xx-2.atm
13:  option remote-port 6996
14:  option remote-subvolume brick
15: end-volume
16:
17: volume afr
18:   type cluster/afr
19:   subvolumes client1 client2
20:   option entry-self-heal on
21:   option data-self-heal on
22:   option metadata-self-heal off
23:   option favorite-child client1
24: end-volume
25:
26: volume wh
27:   type performance/write-behind
28:   option flush-behind on
29:   subvolumes afr
30: end-volume
31:
32: volume io-cache
33:   type performance/io-cache
34:   option cache-size 64MB
35:   option page-size 1MB
36: #  option cache-timeout 2
37:   subvolumes wh
38: end-volume
39:
40: volume iot
41:   type performance/io-threads
42:   subvolumes io-cache
43:   option thread-count 4
44: end-volume
+-----
2009-01-08 11:29:22 W [afr.c:2007:init] afr: You have specified subvolume 'client1' as the 'favorite child'. This means that if a discrepancy in the content or attributes (ownership, permission, etc.) of a file is detected among the subvolumes, the file on 'client1' will be considered the definitive version and its contents will OVERWRITE the contents of the file on other subvolumes. All versions of the file except that on 'client1' WILL BE LOST.
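To see which replica AFR actually treats as the source, it can help to dump the AFR changelog extended attributes directly on the backend files of each server. This is only a sketch: the exact attribute names (trusted.afr.client1 / trusted.afr.client2, matching the client-side subvolume names) are an assumption based on AFR's usual naming, and reading the trusted.* namespace requires root plus the attr tools (getfattr).

```shell
# Run on each server against the backend directory, not the mount.
# Assumed xattr naming: trusted.afr.<client-subvolume-name>; verify on
# your build, since 1.4.0rc was still changing.
getfattr -d -m 'trusted.afr.' -e hex /var/storage/glusterfs/20
getfattr -d -m 'trusted.afr.' -e hex /var/storage/glusterfs/21
```

Comparing the pending counters on both bricks would show which side self-heal picks as the source, independently of the favorite-child setting (which the warning above describes as a conflict-resolution override).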


on server:


Command line : /usr/sbin/glusterfsd -p /var/run/glusterfsd.pid -f /etc/glusterfs/server.vol -l /var/log/glusterfs/glusterfsd.log -L WARNING -p /var/run/glusterfsd.pid
given volfile
+-----
 1: volume posix
 2:   type storage/posix
 3:   option directory /var/storage/glusterfs
 4: end-volume
 5:
 6: volume p-locks
 7:   type features/posix-locks
 8:   subvolumes posix
 9:   option mandatory-locks on
10: end-volume
11:
12: volume wh
13:   type performance/write-behind
14:   option flush-behind on
15:   subvolumes p-locks
16: end-volume
17:
18: volume brick
19:   type performance/io-threads
20:   subvolumes wh
21:   option thread-count 2
22: end-volume
23:
24: volume server
25:   type protocol/server
26:   subvolumes brick
27:   option transport-type tcp/server
28:   option auth.addr.brick.allow 10.*.*.*
29: end-volume
30:
+-----


scenario:

1.

trac-xx-1:/var/storage/glusterfs# /etc/init.d/glusterfs-server stop
Stopping glusterfs server: glusterfsd.
trac-storage-1:/var/storage/glusterfs# rm -rf *
trac-storage-1:/var/storage/glusterfs# ls
trac-storage-1:/var/storage/glusterfs# touch 10
trac-storage-1:/var/storage/glusterfs# touch 11
trac-storage-1:/var/storage/glusterfs# ls
10  11
trac-storage-1:/var/storage/glusterfs# /etc/init.d/glusterfs-server start
Starting glusterfs server: glusterfsd.


2.


trac-xx-2:/var/storage/glusterfs# /etc/init.d/glusterfs-server stop
Stopping glusterfs server: glusterfsd.

trac-xx-2:/var/storage/glusterfs# rm -rf *
trac-xx-2:/var/storage/glusterfs# ls
trac-xx-2:/var/storage/glusterfs# touch 20
trac-xx-2:/var/storage/glusterfs# touch 21
trac-xx-2:/var/storage/glusterfs# ls
20  21
trac-xx-2:/var/storage/glusterfs# /etc/init.d/glusterfs-server start
Starting glusterfs server: glusterfsd.


3.

noc-xx-2:~# cat /etc/fstab
......
......

/etc/glusterfs/glusterfs-client.vol  /mnt/glusterfs  glusterfs  defaults  0  0

noc-xx-2:~# mount -a
noc-xx-2:~# cd /mnt/glusterfs/
noc-xx-2:/mnt/glusterfs# ls
20  21

?!!!!!!

trac-xx-1:/var/storage/glusterfs# ls
20  21

?!!!!!!

trac-xx-2:/var/storage/glusterfs# ls
20  21


Shouldn't it be the other way around? client1 is set as the "favorite child", not client2.


In the log:

2009-01-08 11:29:30 W [afr-self-heal-entry.c:1100:afr_sh_entry_impunge_mknod] afr: creating file /20 mode=0100644 dev=0x0 on client1
2009-01-08 11:29:30 W [afr-self-heal-entry.c:1100:afr_sh_entry_impunge_mknod] afr: creating file /21 mode=0100644 dev=0x0 on client1
2009-01-08 11:29:30 W [afr-self-heal-entry.c:495:afr_sh_entry_expunge_unlink] afr: unlinking file /10 on client1
2009-01-08 11:29:30 W [afr-self-heal-entry.c:495:afr_sh_entry_expunge_unlink] afr: unlinking file /11 on client1

In the documentation:
"Self-heal is triggered when a file or directory is first accessed, that is, the first time any operation is attempted on it."

The files were changed before running the ls command on the client mount /mnt/glusterfs. According to the GlusterFS documentation, synchronization starts when a file is first accessed on the client, not when the server is started. Maybe I've misunderstood something. Please clarify.
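The first-access rule also explains the timing in the log above: mounting and running ls is itself the first lookup of "/" and its entries, which is when the entry self-heal ran (11:29:30, just after the 11:29:22 mount). To force self-heal of a whole tree after a replica has been down, the commonly suggested approach for GlusterFS 1.x is to look up every entry once from a client. A minimal sketch follows; it uses a scratch directory so it can be tried without a cluster, and on a real client you would set MOUNT=/mnt/glusterfs instead:

```shell
# Stand-in for the client mountpoint; on a real client use
# MOUNT=/mnt/glusterfs instead of the scratch directory below.
MOUNT=$(mktemp -d)
touch "$MOUNT/20" "$MOUNT/21"   # stand-ins for replicated files

# stat() every entry once: on a real AFR mount, the first lookup of
# each entry is what schedules its entry/data self-heal.
count=$(find "$MOUNT" | wc -l)
stat "$MOUNT"/* > /dev/null
echo "looked up $count entries"
```

On a real mount the same effect is often achieved with a plain `ls -lR /mnt/glusterfs`, since any operation on an entry counts as its first access.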


glusterfs 1.4.0rc7 built on Jan  7 2009 15:00:10
Repository revision: glusterfs--mainline--3.0--patch-814

Linux debian etch 4.0 2.6.18-5-xen-amd64




