[Gluster-users] geo-replication fails on CentOS 6.5, gluster v 3.5.2

Kingsley gluster at gluster.dogwind.com
Mon Sep 29 10:42:54 UTC 2014


Hi,

replies within.

On Mon, 2014-09-29 at 13:50 +0530, Aravinda wrote:
> On 09/27/2014 04:45 AM, Kingsley wrote:
> > Hi,
> >
> > I'm new to gluster so forgive me if I'm being an idiot. I've searched
> > the list archives back to May but haven't found the exact issue I've
> > come across, so I thought I'd ask on here.
> >
> > Firstly, I'd like to thank the people working on this project. I've
> > found gluster to be pretty simple to get going and it seems to work
> > pretty well so far. It looks like it will be a good fit for the
> > application I have in mind, if we can get geo-replication to work
> > reliably.
> >
> > Now on to my problem ...
> >
> > I've set up an additional gluster volume and configured geo-replication
> > to replicate the master volume to it using the instructions here:
> >
> > https://github.com/gluster/glusterfs/blob/master/doc/admin-guide/en-US/markdown/admin_distributed_geo_rep.md
> >
> > To keep things simple while it was all new to me and I was just testing,
> > I didn't want to add confusion by thinking about using non-privileged
> > accounts and mountbroker and stuff so I just set everything up to use
> > root.
> >
> > Anyway, I mounted the master volume and slave on a client machine (I
> > didn't modify the content of the slave volume, I just mounted it so that
> > I could check things were working).
> >
> > When I manually create or delete a few files and wait 60 seconds for
> > replication to do its thing, it seems to work fine.
> >
> > However, when I hit it with a script to simulate intense user activity,
> > geo-replication breaks. I deleted the geo-replication and removed the
> > slave volume, then re-created and re-enabled geo-replication several
> > times so that I could start again from scratch. Each time, my script
> > (which just creates, renames and deletes files in the master volume via
> > a glusterfs mount) runs for barely a minute before geo-replication
> > breaks.
>
> Do these fops involve renames and deletes of the same files?

Pretty much, yes.

My test script partly simulates a voicemail app we have, which writes
voicemail messages as wav files into a directory, one directory per
mailbox.

In the real app, the message files are numbered sequentially starting at
msg0000.wav, then msg0001.wav and so on. New messages are saved with the
next message number in sequence, but deleting a message involves
renumbering all of the later messages. Therefore, deleting a single
message can result in a lot of files being renamed (we didn't write this
voicemail app ourselves). My test script follows the same basic logic
but uses slightly different filenames.

I just ran it again for a few ops with a newly created geo-replication
volume slaved from the gv2 volume. msg000001 already existed before I
ran the script; the script log is below (bear in mind that a delete
also renames any higher-numbered files):

Mon Sep 29 10:50:04 2014  1/10: created /mnt/gv2/mailtest/mbx0/msg000002, 3217034 bytes in 0.325s
Mon Sep 29 10:50:04 2014  2/10: created /mnt/gv2/mailtest/mbx0/msg000003, 3913942 bytes in 0.239s
Mon Sep 29 10:50:04 2014  3/10: created /mnt/gv2/mailtest/mbx0/msg000004, 250056 bytes in 0.166s
Mon Sep 29 10:50:05 2014  4/10: created /mnt/gv2/mailtest/mbx0/msg000005, 205757 bytes in 0.152s
Mon Sep 29 10:50:05 2014  5/10: created /mnt/gv2/mailtest/mbx0/msg000006, 7555 bytes in 0.112s
Mon Sep 29 10:50:05 2014  6/10: created /mnt/gv2/mailtest/mbx0/msg000007, 2507883 bytes in 0.228s
Mon Sep 29 10:50:05 2014  7/10: created /mnt/gv2/mailtest/mbx0/msg000008, 3256424 bytes in 0.244s
Mon Sep 29 10:50:05 2014  8/10: created /mnt/gv2/mailtest/mbx0/msg000009, 855269 bytes in 0.242s
Mon Sep 29 10:50:06 2014  9/10: deleted /mnt/gv2/mailtest/mbx0/msg000005 in 0.167s
Mon Sep 29 10:50:06 2014  10/10: deleted /mnt/gv2/mailtest/mbx0/msg000003 in 0.214s

This resulted in the following files within the mailtest directory on
gv2 (the master); remember that msg000001 already existed:

drwxr-xr-x 3 root root    4096 Sep 29 10:28 .
drwxr-xr-x 2 root root    4096 Sep 29 10:50 ./mbx0
-rw-r--r-- 1 root root  997207 Sep 29 10:40 ./mbx0/msg000001
-rw-r--r-- 1 root root 3217034 Sep 29 10:50 ./mbx0/msg000002
-rw-r--r-- 1 root root  250056 Sep 29 10:50 ./mbx0/msg000003
-rw-r--r-- 1 root root    7555 Sep 29 10:50 ./mbx0/msg000004
-rw-r--r-- 1 root root 2507883 Sep 29 10:50 ./mbx0/msg000005
-rw-r--r-- 1 root root 3256424 Sep 29 10:50 ./mbx0/msg000006
-rw-r--r-- 1 root root  855269 Sep 29 10:50 ./mbx0/msg000007

but this was what ended up in the gv2-slave volume - not the same list:

drwxr-xr-x 3 root root    4096 Sep 29 10:29 .
drwxr-xr-x 2 root root    4096 Sep 29 10:50 ./mbx0
-rw-r--r-- 1 root root  997207 Sep 29 10:40 ./mbx0/msg000001
-rw-r--r-- 1 root root 3217034 Sep 29 10:50 ./mbx0/msg000002
-rw-r--r-- 0 root root 3256424 Sep 29 10:50 ./mbx0/msg000003
-rw-r--r-- 0 root root 3256424 Sep 29 10:50 ./mbx0/msg000004
-rw-r--r-- 1 root root  855269 Sep 29 10:50 ./mbx0/msg000005
-rw-r--r-- 0 root root 3256424 Sep 29 10:50 ./mbx0/msg000006
-rw-r--r-- 1 root root  855269 Sep 29 10:50 ./mbx0/msg000007
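The divergence is easy to spot mechanically by comparing name-to-size maps of the two trees; here is a small sketch with the sizes from the two listings above hard-coded as test data (note that matching sizes don't prove matching content, so a checksum comparison would be stronger):

```python
def compare_trees(master, slave):
    """Return {path: (master_size, slave_size)} for every file whose
    presence or size differs between the trees (None = missing)."""
    diffs = {}
    for path in sorted(set(master) | set(slave)):
        if master.get(path) != slave.get(path):
            diffs[path] = (master.get(path), slave.get(path))
    return diffs

# Sizes taken from the master and slave listings above.
master = {"msg000001": 997207,  "msg000002": 3217034,
          "msg000003": 250056,  "msg000004": 7555,
          "msg000005": 2507883, "msg000006": 3256424,
          "msg000007": 855269}
slave  = {"msg000001": 997207,  "msg000002": 3217034,
          "msg000003": 3256424, "msg000004": 3256424,
          "msg000005": 855269,  "msg000006": 3256424,
          "msg000007": 855269}

print(sorted(compare_trees(master, slave)))
# ['msg000003', 'msg000004', 'msg000005']
```

So msg000003 through msg000005 diverge by size alone; msg000006 happens to match in size even though its contents are suspect.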

Although it seems the issue I'm hitting may already have been patched,
I've copied log extracts from the master and slave bricks and uploaded
them in case you want to take a look (links below). FWIW, we're
running in timezone GMT+1 (UTC+1); the logs on the master show local
time but the logs on the slave show UTC:

http://gluster.dogwind.com/files/georep-master.log
http://gluster.dogwind.com/files/georep-slave.log

The master log was taken from a file called
/var/log/glusterfs/geo-replication/gv2/ssh%3A%2F%2Froot%4088.151.43.14%3Agluster%3A%2F%2F127.0.0.1%3Agv2-slave.log

The slave log was taken from a file
called /var/log/glusterfs/geo-replication-slaves/50d67d25-c159-4900-aa2e-123669020477:gluster%3A%2F%2F127.0.0.1%3Agv2-slave.gluster.log
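Incidentally, those awkward log filenames are just URL-encoded session identifiers; decoding one with Python's standard library shows which master-to-slave session it belongs to:

```python
from urllib.parse import unquote

# The master-side log filename from above.
name = "ssh%3A%2F%2Froot%4088.151.43.14%3Agluster%3A%2F%2F127.0.0.1%3Agv2-slave.log"
print(unquote(name))
# ssh://root@88.151.43.14:gluster://127.0.0.1:gv2-slave.log
```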

> Geo-rep had an issue with short-lived renamed files (now fixed in
> master: http://review.gluster.org/#/c/8761/).

OK, I'll generate myself an OpenID and go and have a look at that,
thanks.

> > I tried this with the slave volume containing just one brick, and also
> > with it containing 2 bricks replicating each other. Each time, it broke.
> >
> > On the slave, I noticed that the geo-replication logs contained entries
> > like these:
> >
> > [2014-09-26 16:32:23.995539] W [fuse-bridge.c:1214:fuse_err_cbk] 0-glusterfs-fuse: 6384: SETXATTR() /.gfid/5f9b6d20-a062-4168-9333-8d28f2ba2d57 => -1 (File exists)
> > [2014-09-26 16:32:23.995798] W [client-rpc-fops.c:256:client3_3_mknod_cbk] 0-gv2-slave-client-0: remote operation failed: File exists. Path: <gfid:855b5eda-f694-487c-adae-a4723fe6c316>/msg000002
> > [2014-09-26 16:32:23.996042] W [fuse-bridge.c:1214:fuse_err_cbk] 0-glusterfs-fuse: 6385: SETXATTR() /.gfid/855b5eda-f694-487c-adae-a4723fe6c316 => -1 (File exists)
> > [2014-09-26 16:32:24.550009] W [fuse-bridge.c:1911:fuse_create_cbk] 0-glusterfs-fuse: 6469: /.gfid/05a27020-5931-4890-9b74-a77cb1aca918 => -1 (Operation not permitted)
> > [2014-09-26 16:32:24.550533] W [defaults.c:1381:default_release] (-->/usr/lib64/glusterfs/3.5.2/xlator/mount/fuse.so(+0x1e7d0) [0x7fb2ebd1e7d0] (-->/usr/lib64/glusterfs/3.5.2/xlator/mount/fuse.so(free_fuse_state+0x93) [0x7fb2ebd07063] (-->/usr/lib64/libglusterfs.so.0(fd_unref+0x10e) [0x7fb2eef36fbe]))) 0-fuse: xlator does not implement release_cbk
> File exists errors can be ignored as these are soft errors, which are 
> already handled in Geo-replication.

OK.

> > I also noticed that at some point, rsync was returning error code 23.
> The above-mentioned patch also handles error code 23.

Excellent.

> >
> > Now ... I noted from the page I linked above that it requires rsync
> > version 3.0.7 and the version that ships with CentOS 6.5 is, wait for
> > it ... 3.0.6. Is this going to be the issue, or is the problem something
> > else?
> No issue with rsync version.

OK, thanks for confirming that.

-- 
Cheers,
Kingsley.

