[Gluster-users] dht_layout_dir_mismatch after simultaneous mkdir
kenji kondo
kkay.jp at gmail.com
Tue Dec 4 08:04:15 UTC 2012
Hello Avati,
I got the xattr dumps as below (a small decoding sketch follows them).
We can see that the gfid on g13 differs from the others.
I have run into the same situation several times before.
Best regards
Kondo
------------------------
/backend/ai/anchor/3:
g10:trusted.gfid=0xd1366c6a09c345e8bab26b7953ac6405
g11:trusted.gfid=0xd1366c6a09c345e8bab26b7953ac6405
g12:trusted.gfid=0xd1366c6a09c345e8bab26b7953ac6405
g13:trusted.gfid=0x93c1e7a2797a4b888e17d9b6582da365
g14:trusted.gfid=0xd1366c6a09c345e8bab26b7953ac6405
g15:trusted.gfid=0xd1366c6a09c345e8bab26b7953ac6405
g16:trusted.gfid=0xd1366c6a09c345e8bab26b7953ac6405
g17:trusted.gfid=0xd1366c6a09c345e8bab26b7953ac6405
g18:trusted.gfid=0xd1366c6a09c345e8bab26b7953ac6405
g10:trusted.glusterfs.dht=0x00000001000000003ffffffe5ffffffc
g11:trusted.glusterfs.dht=0x00000001000000005ffffffd7ffffffb
g12:trusted.glusterfs.dht=0x00000001000000007ffffffc9ffffffa
g13:trusted.glusterfs.dht=0x000000010000000000000000ffffffff
g14:trusted.glusterfs.dht=0x00000001000000009ffffffbbffffff9
g15:trusted.glusterfs.dht=0x0000000100000000bffffffadffffff8
g16:trusted.glusterfs.dht=0x0000000100000000dffffff9ffffffff
g17:trusted.glusterfs.dht=0x0000000100000000000000001ffffffe
g18:trusted.glusterfs.dht=0x00000001000000001fffffff3ffffffd
/backend/ai/anchor:
g10:trusted.gfid=0x89a33935d5da4acd85586e075eca221d
g11:trusted.gfid=0x89a33935d5da4acd85586e075eca221d
g12:trusted.gfid=0x89a33935d5da4acd85586e075eca221d
g13:trusted.gfid=0x6ee644c825dd4806b2680bfeb985a27d
g14:trusted.gfid=0x89a33935d5da4acd85586e075eca221d
g15:trusted.gfid=0x89a33935d5da4acd85586e075eca221d
g16:trusted.gfid=0x89a33935d5da4acd85586e075eca221d
g17:trusted.gfid=0x89a33935d5da4acd85586e075eca221d
g18:trusted.gfid=0x89a33935d5da4acd85586e075eca221d
g10:trusted.glusterfs.dht=0x00000001000000005555555471c71c6f
g11:trusted.glusterfs.dht=0x000000010000000071c71c708e38e38b
g12:trusted.glusterfs.dht=0x00000001000000008e38e38caaaaaaa7
g13:trusted.glusterfs.dht=0x0000000100000000aaaaaaa8c71c71c3
g14:trusted.glusterfs.dht=0x0000000100000000c71c71c4e38e38df
g15:trusted.glusterfs.dht=0x0000000100000000e38e38e0ffffffff
g16:trusted.glusterfs.dht=0x0000000100000000000000001c71c71b
g17:trusted.glusterfs.dht=0x00000001000000001c71c71c38e38e37
g18:trusted.glusterfs.dht=0x000000010000000038e38e3855555553
/backend/ai:
g10.gfid:trusted.gfid=0x258ba47b30ee41a984ceea1a491b1669
g11.gfid:trusted.gfid=0x258ba47b30ee41a984ceea1a491b1669
g12.gfid:trusted.gfid=0x258ba47b30ee41a984ceea1a491b1669
g13.gfid:trusted.gfid=0x258ba47b30ee41a984ceea1a491b1669
g14.gfid:trusted.gfid=0x258ba47b30ee41a984ceea1a491b1669
g15.gfid:trusted.gfid=0x258ba47b30ee41a984ceea1a491b1669
g16.gfid:trusted.gfid=0x258ba47b30ee41a984ceea1a491b1669
g17.gfid:trusted.gfid=0x258ba47b30ee41a984ceea1a491b1669
g18.gfid:trusted.gfid=0x258ba47b30ee41a984ceea1a491b1669
g10.gfid:trusted.glusterfs.dht=0x00000001000000008e38e38caaaaaaa7
g11.gfid:trusted.glusterfs.dht=0x0000000100000000aaaaaaa8c71c71c3
g12.gfid:trusted.glusterfs.dht=0x0000000100000000c71c71c4e38e38df
g13.gfid:trusted.glusterfs.dht=0x0000000100000000e38e38e0ffffffff
g14.gfid:trusted.glusterfs.dht=0x0000000100000000000000001c71c71b
g15.gfid:trusted.glusterfs.dht=0x00000001000000001c71c71c38e38e37
g16.gfid:trusted.glusterfs.dht=0x000000010000000038e38e3855555553
g17.gfid:trusted.glusterfs.dht=0x00000001000000005555555471c71c6f
g18.gfid:trusted.glusterfs.dht=0x000000010000000071c71c708e38e38b
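For reference, here is a minimal sketch for decoding the trusted.glusterfs.dht
values above, assuming they pack four big-endian 32-bit words (count, hash
type, range start, range stop); the helper is only my own illustration, not
part of GlusterFS. With this reading, the g13 entry for anchor/3 claims the
whole 32-bit range while the other bricks split it between them.

def decode_dht(value):
    # split the hex string into four 32-bit words: count, hash type, start, stop
    h = value[2:] if value.startswith("0x") else value
    count, hash_type, start, stop = (int(h[i:i + 8], 16) for i in range(0, 32, 8))
    return count, hash_type, start, stop

# values copied from the anchor/3 dump above
print([hex(w) for w in decode_dht("0x00000001000000003ffffffe5ffffffc")])  # g10: 0x3ffffffe..0x5ffffffc
print([hex(w) for w in decode_dht("0x000000010000000000000000ffffffff")])  # g13: 0x0..0xffffffff (whole range)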
On 2012/12/04, at 9:20, Anand Avati <anand.avati at gmail.com> wrote:
On Mon, Dec 3, 2012 at 5:14 AM, kenji kondo <kkay.jp at gmail.com> wrote:
> Hello Avati, thank you for your reply.
>
> I tried to test your suggestion on 3.2.7, but I could not test on 3.3.x
> because I don't have a 3.3.x installation.
> Unfortunately, similar new problems occurred, as follows:
>
> gluster> volume set vol22 performance.stat-prefetch off
> gluster> volume info vol22
>
> Volume Name: vol22
> Type: Distribute
> Status: Started
> Number of Bricks: 9
> Transport-type: tcp
> Bricks:
> Brick1: gluster10:/export22/brick
> Brick2: gluster11:/export22/brick
> Brick3: gluster12:/export22/brick
> Brick4: gluster13:/export22/brick
> Brick5: gluster14:/export22/brick
> Brick6: gluster15:/export22/brick
> Brick7: gluster16:/export22/brick
> Brick8: gluster17:/export22/brick
> Brick9: gluster18:/export22/brick
> Options Reconfigured:
> performance.stat-prefetch: off
>
>
> After this setting, I ran the same simulation on vol22.
> But strange directories were created, and I could not remove some of
> them, as shown below:
>
> $ ls -a ai/anchor/3
> . ..
>
> $ rmdir ai/anchor/3
> rmdir: ai/anchor/3: No such file or directory
>
>
> Then I found the following error messages:
>
> [2012-12-03 18:08:14.816313] E [client3_1-fops.c:2228:client3_1_lookup_cbk]
> 0-vol22-client-3: remote operation failed: Stale NFS file handle
> [2012-12-03 18:08:14.817196] W [dht-common.c:178:dht_lookup_dir_cbk]
> 0-vol22-dht: /test1130/ai/anchor: gfid different on vol22-client-4
> [2012-12-03 18:08:14.817258] W [dht-common.c:178:dht_lookup_dir_cbk]
> 0-vol22-dht: /test1130/ai/anchor: gfid different on vol22-client-0
> [2012-12-03 18:08:14.817322] W [dht-common.c:178:dht_lookup_dir_cbk]
> 0-vol22-dht: /test1130/ai/anchor: gfid different on vol22-client-2
> [2012-12-03 18:08:14.817367] W [dht-common.c:178:dht_lookup_dir_cbk]
> 0-vol22-dht: /test1130/ai/anchor: gfid different on vol22-client-1
> [2012-12-03 18:08:14.817398] W [dht-common.c:178:dht_lookup_dir_cbk]
> 0-vol22-dht: /test1130/ai/anchor: gfid different on vol22-client-5
> [2012-12-03 18:08:14.817430] W [dht-common.c:178:dht_lookup_dir_cbk]
> 0-vol22-dht: /test1130/ai/anchor: gfid different on vol22-client-8
> [2012-12-03 18:08:14.817460] W [dht-common.c:178:dht_lookup_dir_cbk]
> 0-vol22-dht: /test1130/ai/anchor: gfid different on vol22-client-7
> [2012-12-03 18:08:14.817506] W [dht-common.c:178:dht_lookup_dir_cbk]
> 0-vol22-dht: /test1130/ai/anchor: gfid different on vol22-client-6
> [2012-12-03 18:08:14.818865] E [client3_1-fops.c:2132:client3_1_opendir_cbk]
> 0-vol22-client-3: remote operation failed: No such file or directory
> [2012-12-03 18:08:14.819198] W [fuse-bridge.c:1016:fuse_unlink_cbk]
> 0-glusterfs-fuse: 1684950: RMDIR()
> /test1130/ai/anchor/3 => -1 (No such file or directory)
>
>
> And I found a strange DHT layout with the getfattr command:
>
> $ sudo getfattr -d -m '.*' -n trusted.glusterfs.pathinfo ai/anchor/3
>
> trusted.glusterfs.pathinfo="(vol22-dht-layout (vol22-client-0 1073741822
> 1610612732) (vol22-client-1 1610612733 2147483643) (vol22-client-2 2147483644
> 2684354554)
>
> (vol22-client-3 0 0)
>
> (vol22-client-4 2684354555 3221225465)
> (vol22-client-5 3221225466 3758096376) (vol22-client-6 3758096377 4294967295)
> (vol22-client-7 0 536870910) (vol22-client-8 536870911 1073741821))"
>
>
> vol22-client-3 covers 0 to 0? That looks incorrect.
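>
> As a quick check, the layout ranges from the pathinfo above can be verified
> to tile the 32-bit hash space; here is a minimal sketch (the ranges are
> copied verbatim from the output, the check itself is only my own helper):
>
> ranges = [
>     ("vol22-client-0", 1073741822, 1610612732),
>     ("vol22-client-1", 1610612733, 2147483643),
>     ("vol22-client-2", 2147483644, 2684354554),
>     ("vol22-client-3", 0, 0),
>     ("vol22-client-4", 2684354555, 3221225465),
>     ("vol22-client-5", 3221225466, 3758096376),
>     ("vol22-client-6", 3758096377, 4294967295),
>     ("vol22-client-7", 0, 536870910),
>     ("vol22-client-8", 536870911, 1073741821),
> ]
>
> # (0, 0) entries carry no assigned range; check the rest for holes/overlaps
> assigned = sorted((s, e, n) for n, s, e in ranges if (s, e) != (0, 0))
> prev_end = -1
> for start, end, name in assigned:
>     if start > prev_end + 1:
>         print("hole before", name)
>     elif start <= prev_end:
>         print("overlap at", name)
>     prev_end = max(prev_end, end)
> if prev_end != 0xffffffff:
>     print("hole at the end of the hash space")
> print("empty:", [n for n, s, e in ranges if (s, e) == (0, 0)])
>
> With these numbers the eight non-empty ranges happen to cover the whole
> space, but vol22-client-3 is left with no range at all.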
>
> The above problem is seen on all of our clients.
> I suspect the problems are related to locking (exclusive control) of
> concurrent mkdir, but I don't fully understand this phenomenon.
>
> Do you have any idea?
> I will try whatever you suggest.
>
>
In such a state, can you get the backend xattr dumps of ai/anchor/ and
ai/anchor/3/ with the following commands run on ALL servers -
sh# getfattr -d -e hex -m . /backend/ai/anchor/
sh# getfattr -d -e hex -m . /backend/ai/anchor/3/
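For example, a small script along these lines could gather both dumps from
all nine servers in one go (the host names follow the brick list in your
earlier mail, and the backend path prefix is only a guess, so please adjust
it to the real brick directory):

import subprocess

hosts = ["gluster%d" % n for n in range(10, 19)]       # gluster10 .. gluster18
paths = ["/export22/brick/test1130/ai/anchor/",        # adjust to the real backend path
         "/export22/brick/test1130/ai/anchor/3/"]

for host in hosts:
    for path in paths:
        print("###", host, path)
        result = subprocess.run(["ssh", host, "getfattr", "-d", "-e", "hex", "-m", ".", path],
                                capture_output=True, text=True)
        print(result.stdout, end="")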
Thanks,
Avati