[Gluster-devel] Bugs in replication's self-heal

G Feng mgfeng at gmail.com
Fri Mar 26 09:42:08 UTC 2010


Hi,

There are bugs in replication's self-heal. They exist in 2.0.x, and I
tried 3.0.3, where they exist as well, so I am reporting them here along
with my solutions.

Setup: 2 servers (server1 and server2), 1 client with replication.

1)
client:   [root@centos dir1]# pwd
          /mnt/gluster/dir1
          [root@centos dir1]# ls
          ddd

Then kill server2.

client:   [root@centos dir1]# rm -rf ddd/

Then start server2.

client:   [root@centos dir1]# ls
          (no output)

Although the directory is empty on the client, ddd still exists on
server2:

server2:  [root@centos dir1]# ls
          ddd



If the client continues to do lookup, mkdir, or other operations on ddd
in dir1, there is sometimes a segmentation fault, or the connection
between client and server is lost. Not every time, but it does happen.



If server1 is killed instead of server2, the client will recreate ddd on
server2!

Trace at the client:
------------------------------------------------------------
[2010-03-26 17:18:35] N [trace.c:1397:trace_rmdir] trace2: 504: (loc
{path=/dir1/ddd, ino=7209308})
[2010-03-26 17:18:35] N [trace.c:833:trace_rmdir_cbk] trace2: 504:
(op_ret=0, *prebuf = {st_ino=3604666, st_mode=40755, st_nlink=2, st_uid=0,
st_gid=0, st_size=4096, st_blocks=16, st_atime=[Mar 26 16:16:39],
st_mtime=[Mar 26 17:19:56], st_ctime=[Mar 26 17:19:56]}, *postbuf = {(null)}
[2010-03-26 17:18:35] N [trace.c:1247:trace_xattrop] trace2: 504:
(path=/dir1, ino=7209322 flags=0)
[2010-03-26 17:18:35] N [trace.c:1128:trace_xattrop_cbk] trace2: 504:
(op_ret=0, op_errno=0)
[2010-03-26 17:18:35] N [trace.c:1176:trace_entrylk] trace2: 504:
volume=afr, (loc= {path=/dir1, ino=7209322} basename=ddd,
cmd=ENTRYLK_UNLOCK, type=ENTRYLK_WRLCK)
[2010-03-26 17:18:35] N [trace.c:1113:trace_entrylk_cbk] trace2: 504:
op_ret=0, op_errno=0
[2010-03-26 17:18:46] D [client-protocol.c:7041:notify] remote1: got
GF_EVENT_CHILD_UP
[2010-03-26 17:18:46] D [client-protocol.c:7041:notify] remote1: got
GF_EVENT_CHILD_UP
[2010-03-26 17:18:46] N [client-protocol.c:6246:client_setvolume_cbk]
remote1: Connected to 192.168.0.182:7001, attached to remote volume 'trace'.
[2010-03-26 17:18:46] N [client-protocol.c:6246:client_setvolume_cbk]
remote1: Connected to 192.168.0.182:7001, attached to remote volume 'trace'.
[2010-03-26 17:19:03] N [trace.c:1769:trace_opendir] trace1: 507:( loc
{path=/dir1, ino=7209322}, fd=0x958b0f8)
[2010-03-26 17:19:03] N [trace.c:1769:trace_opendir] trace2: 507:( loc
{path=/dir1, ino=7209322}, fd=0x958b0f8)
[2010-03-26 17:19:03] N [trace.c:808:trace_opendir_cbk] trace2: 507:
(op_ret=0, op_errno=22, fd=0x958b0f8)
[2010-03-26 17:19:03] N [trace.c:808:trace_opendir_cbk] trace1: 507:
(op_ret=0, op_errno=22, fd=0x958b0f8)

From the trace, we can see that while the user was still inside dir1,
the directory where the "rm" took place, "ls" did not trigger a "lookup"
but went straight to "opendir". As a result, the changelogs stored on
dir1 were never examined.


My solution: every time "opendir" is called, issue an additional
"lookup" on the directory being opened, so the changelog is checked and
self-heal gets a chance to run.
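To make the idea concrete, here is a minimal userland sketch of the
lookup-before-opendir idea (hypothetical code; opendir_with_lookup is a
name I made up, and the real patch winds a lookup fop inside afr's
opendir rather than calling stat from user space):

#include <stdio.h>
#include <dirent.h>
#include <sys/stat.h>

/* Hypothetical sketch of the fix, shown from user space for clarity:
 * force a path lookup (here via stat) before opendir so that the
 * replicate translator sees the directory's changelog and can trigger
 * self-heal.  Note that with dentry caching the kernel will not always
 * turn stat into a fresh FUSE lookup, so this is only an analogy of
 * what the translator-level change does. */
DIR *
opendir_with_lookup (const char *path)
{
        struct stat st;

        if (stat (path, &st) != 0)      /* the extra "lookup" */
                return NULL;

        return opendir (path);          /* the original operation */
}

int
main (void)
{
        DIR *d = opendir_with_lookup ("/mnt/gluster/dir1");

        if (d == NULL) {
                perror ("opendir_with_lookup");
                return 1;
        }
        closedir (d);
        return 0;
}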



2)
client:   [root@centos dir2]# pwd
          /mnt/gluster/dir2
          [root@centos dir2]# ll ww/
          -rw-r--r-- 1 root root 242 2010-03-26 file

Kill server2.

client:   [root@centos dir2]# rm -rf ww

Then start server2.

client:   [root@centos dir2]# ll ww/
          -rw-r--r-- 1 root root 242 2010-03-26 file

The deleted directory and its file have reappeared.



In my opinion, the problem is in the function afr_self_heal():

        if (local->success_count && local->enoent_count) {
                afr_self_heal_missing_entries (frame, this);
        } else {
                gf_log (this->name, GF_LOG_TRACE,
                        "proceeding to metadata check on %s",
                        local->loc.path);
                afr_sh_missing_entries_done (frame, this);
        }
This check is too simplistic.

My solution: when (local->success_count && local->enoent_count) is true,
add calls that recursively look up the parent directories until reaching
one that exists on every server. There must always be such a directory;
the mount point, at the very least, qualifies. Look up that first common
parent directory, and then let self-heal decide whether each entry below
it should be deleted or created. A sketch of the walk follows.
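To make the parent walk concrete, here is a small standalone sketch
(hypothetical code, not taken from the attached patch; walk_ancestors is
a name I invented) that enumerates the lookup candidates from the
failing directory up to the mount point. Self-heal would look up each
candidate on all subvolumes until it finds one that exists everywhere,
then heal downward from there:

#include <stdio.h>
#include <string.h>
#include <libgen.h>

/* Hypothetical sketch, not from the attached patch: print every
 * ancestor of a path, deepest first.  In the proposed fix, self-heal
 * would look up each candidate on every server until one exists on
 * all of them ("/" always does), then decide per entry whether to
 * delete or create while healing back down. */
static void
walk_ancestors (const char *path)
{
        char cur[4096];
        char tmp[4096];

        snprintf (cur, sizeof (cur), "%s", path);

        while (strcmp (cur, "/") != 0) {
                printf ("lookup candidate: %s\n", cur);

                /* dirname may modify its argument, so copy its result
                 * into a separate buffer before reusing cur */
                snprintf (tmp, sizeof (tmp), "%s", dirname (cur));
                snprintf (cur, sizeof (cur), "%s", tmp);
        }

        printf ("lookup candidate: / (the mount point, always common)\n");
}

int
main (void)
{
        walk_ancestors ("/dir2/ww");    /* the directory from case 2) */
        return 0;
}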




Here is my modified replicate translator for glusterfs-2.0.8 in the
attachment. The changes are marked in this format:

#ifndef AFR_ENTRY
    my code
#else
    original code
#endif

In addition, the attachment uses Hexiaobin's recursive-delete code in
afr-self-heal-entry.c, which he may have reported earlier.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: afr.tar.gz2
Type: application/octet-stream
Size: 45174 bytes
Desc: not available
URL: <http://supercolony.gluster.org/pipermail/gluster-devel/attachments/20100326/7b3f78b8/attachment-0003.obj>

