[Gluster-devel] Duplicate entries and other weirdness in a 3*4 volume

Benjamin Turner bennyturns at gmail.com
Sat Jul 19 14:02:33 UTC 2014


On Fri, Jul 18, 2014 at 10:43 PM, Pranith Kumar Karampuri <
pkarampu at redhat.com> wrote:

>
> On 07/18/2014 07:57 PM, Anders Blomdell wrote:
>
>> During testing of a 3*4 gluster (from master as of yesterday), I
>> encountered two major weirdnesses:
>>
>>    1. A 'rm -rf <some_dir>' needed several invocations to finish, each
>>       time reporting a number of lines like these:
>>                 rm: cannot remove ‘a/b/c/d/e/f’: Directory not empty
>>
>
This is reproducible for me when running dbench on NFS mounts.  I think I
may have seen it on glusterfs mounts as well, but it seems more reproducible
on NFS.  I should have caught it sooner, but it doesn't error out on the
client side when cleaning up, and the deletes succeed on the next test I run.

The following message spams nfs.log; from what I can tell, it is logged
while dbench is creating the files:
[2014-07-19 13:37:03.271651] I [MSGID: 109036]
[dht-common.c:5694:dht_log_new_layout_for_dir_selfheal] 0-testvol-dht:
Setting layout of /clients/client3/~dmtmp/SEED with [Subvol_name:
testvol-replicate-0, Err: -1 , Start: 2147483647 , Stop: 4294967295 ],
[Subvol_name: testvol-replicate-1, Err: -1 , Start: 0 , Stop: 2147483646 ],
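
In case it helps anyone triaging these messages, here is a rough Python sketch (not from the gluster tree) that pulls the per-subvolume ranges out of a "Setting layout" message and reports gaps or overlaps. It assumes that a complete layout should tile the 32-bit range 0..4294967295 exactly once, and it does not interpret the Err field. The names parse_layout and layout_problems are made up for this example.

```python
import re

# Matches one "[Subvol_name: X, Err: N , Start: A , Stop: B ]" group.
# \s* also tolerates the line wrapping seen in pasted log excerpts.
LAYOUT_RE = re.compile(
    r"\[Subvol_name:\s*(?P<name>[^,\s]+)\s*,\s*Err:\s*(?P<err>-?\d+)\s*,"
    r"\s*Start:\s*(?P<start>\d+)\s*,\s*Stop:\s*(?P<stop>\d+)\s*\]"
)


def parse_layout(message):
    """Return a list of (subvol_name, start, stop) tuples found in message."""
    return [
        (m["name"], int(m["start"]), int(m["stop"]))
        for m in LAYOUT_RE.finditer(message)
    ]


def layout_problems(ranges, top=0xFFFFFFFF):
    """Describe gaps and overlaps, assuming ranges should tile 0..top."""
    problems = []
    expected = 0  # first hash value not yet covered
    for name, start, stop in sorted(ranges, key=lambda r: r[1]):
        if start > expected:
            problems.append(f"gap before {name}: {expected}-{start - 1}")
        elif start < expected:
            problems.append(f"overlap at {name}: {start}-{min(stop, expected - 1)}")
        expected = max(expected, stop + 1)
    if expected <= top:
        problems.append(f"gap at end: {expected}-{top}")
    return problems


sample = (
    "Setting layout of /clients/client3/~dmtmp/SEED with [Subvol_name: "
    "testvol-replicate-0, Err: -1 , Start: 2147483647 , Stop: 4294967295 ], "
    "[Subvol_name: testvol-replicate-1, Err: -1 , Start: 0 , Stop: 2147483646 ],"
)
print(layout_problems(parse_layout(sample)))
```

For the message quoted above, the two ranges meet at 2147483646/2147483647 and the check reports no problems, so under that assumption the layout itself looks complete.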

Then, when the deletes fail, I see the following while the client is
removing the files:
[2014-07-18 23:31:44.272465] W [nfs3.c:3518:nfs3svc_rmdir_cbk] 0-nfs:
74a6541a: /run8063_dbench/clients => -1 (Directory not empty)
.
.
[2014-07-18 23:31:44.452988] W [nfs3.c:3518:nfs3svc_rmdir_cbk] 0-nfs:
7ea9541a: /run8063_dbench/clients => -1 (Directory not empty)
[2014-07-18 23:31:45.262651] W
[client-rpc-fops.c:1354:client3_3_access_cbk] 0-testvol-client-0: remote
operation failed: Stale file handle
[2014-07-18 23:31:45.263151] W [MSGID: 108008]
[afr-read-txn.c:218:afr_read_txn] 0-testvol-replicate-0: Unreadable
subvolume -1 found with event generation 2. (Possible split-brain)
[2014-07-18 23:31:45.264196] W [nfs3.c:1532:nfs3svc_access_cbk] 0-nfs:
32ac541a: <gfid:b073a189-91ea-46b2-b757-5b320591b848> => -1 (Stale file
handle)
[2014-07-18 23:31:45.264217] W [nfs3-helpers.c:3401:nfs3_log_common_res]
0-nfs-nfsv3: XID: 32ac541a, ACCESS: NFS: 70(Invalid file handle),
POSIX: 116(Stale file handle)
[2014-07-18 23:31:45.266818] W [nfs3.c:1532:nfs3svc_access_cbk] 0-nfs:
33ac541a: <gfid:b073a189-91ea-46b2-b757-5b320591b848> => -1 (Stale file
handle)
[2014-07-18 23:31:45.266853] W [nfs3-helpers.c:3401:nfs3_log_common_res]
0-nfs-nfsv3: XID: 33ac541a, ACCESS: NFS: 70(Invalid file handle),
POSIX: 116(Stale file handle)
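
To get a quick count of which callbacks are failing and with what error, something like the Python sketch below could be run over nfs.log. It assumes one log entry per line, as in the on-disk file (the excerpts above are wrapped by the mail client), and it takes the error text from the last parenthesized group on the line, which is only a heuristic based on the entries shown here. The name tally_warnings is made up for this example.

```python
import re
from collections import Counter

# "[timestamp] LEVEL [optional MSGID] [file:line:function] rest-of-message"
ENTRY_RE = re.compile(
    r"^\[(?P<ts>[\d-]+ [\d:.]+)\] (?P<level>[A-Z]) "
    r"(?:\[MSGID: \d+\] )?"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<rest>.*)$"
)
# Heuristic: the error string is the last "(...)" group at the end of the line.
ERR_RE = re.compile(r"\(([^()]+)\)\s*$")


def tally_warnings(lines):
    """Count W/E entries per (function, trailing error text or None)."""
    counts = Counter()
    for line in lines:
        m = ENTRY_RE.match(line.strip())
        if not m or m["level"] not in ("W", "E"):
            continue  # skip info lines and anything that does not parse
        err = ERR_RE.search(m["rest"])
        counts[(m["func"], err.group(1) if err else None)] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (path is whatever your nfs.log location is):
    #   python tally.py /path/to/nfs.log
    if len(sys.argv) > 1:
        with open(sys.argv[1], errors="replace") as fh:
            for (func, err), n in tally_warnings(fh).most_common():
                print(n, func, err)
```

Entries without a trailing parenthesized error, such as the client3_3_access_cbk line, are counted under None rather than dropped.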

Occasionally I see:
[2014-07-19 13:50:46.091429] W [socket.c:529:__socket_rwv] 0-NLM-client:
readv on 192.168.11.102:45823 failed (No data available)
[2014-07-19 13:50:46.091570] E [rpc-transport.c:485:rpc_transport_unref]
(-->/usr/lib64/glusterfs/3.5qa2/xlator/nfs/server.so(nlm_rpcclnt_notify+0x5a)
[0x7f53775128ea]
(-->/usr/lib64/glusterfs/3.5qa2/xlator/nfs/server.so(nlm_unset_rpc_clnt+0x75)
[0x7f537750e3e5] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_unref+0x63)
[0x7f5388914693]))) 0-rpc_transport: invalid argument: this

I'm opening a BZ now.  I'll leave the systems up and put the repro steps and
hostnames in the BZ in case anyone wants to poke around.

-b



>
>>    2. After having successfully deleted all files from the volume,
>>       I have a single directory that is duplicated in gluster-fuse,
>>       like this:
>>         # ls -l /mnt/gluster
>>          total 24
>>          drwxr-xr-x 2 root root 12288 18 jul 16.17 work2/
>>          drwxr-xr-x 2 root root 12288 18 jul 16.17 work2/
>>
>> any idea on how to debug this issue?
>>
> What are the steps to recreate? We first need to find what led to this,
> and then probably which xlator causes it.
>

I have not seen this but I am running on a 6x2 volume.  I wonder if this
may only happen with replica > 2?
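
If anyone wants to check for this on other volume shapes after a test run, a small Python sketch like the one below would flag names that show up more than once in a single directory listing. It assumes os.listdir passes through whatever readdir returns on the mount without de-duplicating, and the helper names are made up for this example. On a healthy filesystem it should always return an empty list.

```python
import os
from collections import Counter


def duplicate_entries(names):
    """Return the sorted names that occur more than once in a listing."""
    return sorted(name for name, n in Counter(names).items() if n > 1)


def duplicates_in_dir(path):
    """List path once and report any name the listing returned twice."""
    return duplicate_entries(os.listdir(path))


# Example with the listing from the original report:
print(duplicate_entries(["work2", "work2"]))
```

Against a live mount, the call would be duplicates_in_dir("/mnt/gluster"), using the mount point from the report above.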


>
> Pranith
>
>>
>> /Anders
>>
>>
>