[Gluster-users] dht_layout_dir_mismatch after simultaneously mkdir

kenji kondo kkay.jp at gmail.com
Mon Dec 3 13:14:18 UTC 2012


Hello Avati, thank you for your reply.

I tested your suggestion on 3.2.7, but I could not test on 3.3.x
because I don't have it.
Unfortunately, a new but similar problem occurred, as follows:

gluster> volume set vol22 performance.stat-prefetch off
gluster> volume info vol22

Volume Name: vol22
Type: Distribute
Status: Started
Number of Bricks: 9
Transport-type: tcp
Bricks:
Brick1: gluster10:/export22/brick
Brick2: gluster11:/export22/brick
Brick3: gluster12:/export22/brick
Brick4: gluster13:/export22/brick
Brick5: gluster14:/export22/brick
Brick6: gluster15:/export22/brick
Brick7: gluster16:/export22/brick
Brick8: gluster17:/export22/brick
Brick9: gluster18:/export22/brick
Options Reconfigured:
performance.stat-prefetch: off


After applying this setting, I ran the same simulation on vol22.
However, strange directories were created, and I could not remove some
of them, as shown below:

$ ls -a ai/anchor/3
.  ..

$ rmdir ai/anchor/3
rmdir: ai/anchor/3: No such file or directory


Then I found these error messages:

[2012-12-03 18:08:14.816313] E [client3_1-fops.c:2228:client3_1_lookup_cbk]
0-vol22-client-3: remote operation failed: Stale NFS file handle
[2012-12-03 18:08:14.817196] W [dht-common.c:178:dht_lookup_dir_cbk]
0-vol22-dht: /test1130/ai/anchor: gfid different on vol22-client-4
[2012-12-03 18:08:14.817258] W [dht-common.c:178:dht_lookup_dir_cbk]
0-vol22-dht: /test1130/ai/anchor: gfid different on vol22-client-0
[2012-12-03 18:08:14.817322] W [dht-common.c:178:dht_lookup_dir_cbk]
0-vol22-dht: /test1130/ai/anchor: gfid different on vol22-client-2
[2012-12-03 18:08:14.817367] W [dht-common.c:178:dht_lookup_dir_cbk]
0-vol22-dht: /test1130/ai/anchor: gfid different on vol22-client-1
[2012-12-03 18:08:14.817398] W [dht-common.c:178:dht_lookup_dir_cbk]
0-vol22-dht: /test1130/ai/anchor: gfid different on vol22-client-5
[2012-12-03 18:08:14.817430] W [dht-common.c:178:dht_lookup_dir_cbk]
0-vol22-dht: /test1130/ai/anchor: gfid different on vol22-client-8
[2012-12-03 18:08:14.817460] W [dht-common.c:178:dht_lookup_dir_cbk]
0-vol22-dht: /test1130/ai/anchor: gfid different on vol22-client-7
[2012-12-03 18:08:14.817506] W [dht-common.c:178:dht_lookup_dir_cbk]
0-vol22-dht: /test1130/ai/anchor: gfid different on vol22-client-6
[2012-12-03 18:08:14.818865] E [client3_1-fops.c:2132:client3_1_opendir_cbk]
0-vol22-client-3: remote operation failed: No such file or directory
[2012-12-03 18:08:14.819198] W [fuse-bridge.c:1016:fuse_unlink_cbk]
0-glusterfs-fuse: 1684950: RMDIR()
/test1130/ai/anchor/3 => -1 (No such file or directory)
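
Log entries like these can be grouped by subvolume to see which bricks are involved. The following is only a sketch in Python: the regular expressions are assumptions based on the lines shown above, not a documented log format, and each wrapped entry is joined onto one line first. Only a few of the entries are included as sample input.

```python
import re
from collections import defaultdict

# Sample entries copied from the log above, each joined onto a single line.
LOG = """\
[2012-12-03 18:08:14.816313] E [client3_1-fops.c:2228:client3_1_lookup_cbk] 0-vol22-client-3: remote operation failed: Stale NFS file handle
[2012-12-03 18:08:14.817196] W [dht-common.c:178:dht_lookup_dir_cbk] 0-vol22-dht: /test1130/ai/anchor: gfid different on vol22-client-4
[2012-12-03 18:08:14.817258] W [dht-common.c:178:dht_lookup_dir_cbk] 0-vol22-dht: /test1130/ai/anchor: gfid different on vol22-client-0
[2012-12-03 18:08:14.818865] E [client3_1-fops.c:2132:client3_1_opendir_cbk] 0-vol22-client-3: remote operation failed: No such file or directory
"""

# Assumed patterns, derived only from the entries shown in this mail.
GFID_RE = re.compile(r"gfid different on (\S+)")
FAIL_RE = re.compile(r"\s\d+-([\w-]+): remote operation failed: (.+)$")


def summarize(log_text):
    """Return (subvolumes with a gfid warning, {subvolume: [failure messages]})."""
    mismatched = set()
    failures = defaultdict(list)
    for line in log_text.splitlines():
        match = GFID_RE.search(line)
        if match:
            mismatched.add(match.group(1))
            continue
        match = FAIL_RE.search(line)
        if match:
            failures[match.group(1)].append(match.group(2))
    return mismatched, dict(failures)


mismatched, failures = summarize(LOG)
print("gfid different on:", sorted(mismatched))
print("failed operations:", failures)
```

Applied to the complete log above, this grouping should list eight subvolumes under the gfid warning and only vol22-client-3 under failed operations.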


I also found a strange DHT layout with the getfattr command:

$ sudo getfattr -d -m '.*' -n trusted.glusterfs.pathinfo ai/anchor/3

trusted.glusterfs.pathinfo="(vol22-dht-layout (vol22-client-0 1073741822
1610612732) (vol22-client-1 1610612733 2147483643) (vol22-client-2 2147483644
2684354554)

 (vol22-client-3 0 0)

(vol22-client-4 2684354555 3221225465)
(vol22-client-5 3221225466 3758096376) (vol22-client-6 3758096377 4294967295)
(vol22-client-7 0 536870910) (vol22-client-8 536870911 1073741821))"


vol22-client-3 has the range 0 to 0, which looks incorrect.
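
A layout string like this can be checked mechanically for subvolumes without a range and for holes in the 32-bit hash space. The following is a minimal sketch in Python, not part of gluster. The function names are hypothetical, and it assumes that a (0, 0) entry means "no range assigned" rather than "hash 0 only".

```python
import re

# pathinfo value from the getfattr output above, joined onto one line.
PATHINFO = (
    "(vol22-dht-layout (vol22-client-0 1073741822 1610612732) "
    "(vol22-client-1 1610612733 2147483643) (vol22-client-2 2147483644 2684354554) "
    "(vol22-client-3 0 0) (vol22-client-4 2684354555 3221225465) "
    "(vol22-client-5 3221225466 3758096376) (vol22-client-6 3758096377 4294967295) "
    "(vol22-client-7 0 536870910) (vol22-client-8 536870911 1073741821))"
)

HASH_MAX = 0xFFFFFFFF  # the layouts shown here end at 4294967295


def parse_layout(pathinfo):
    """Return [(subvolume, start, stop), ...] from a pathinfo layout string."""
    triples = re.findall(r"\((\S+) (\d+) (\d+)\)", pathinfo)
    return [(name, int(start), int(stop)) for name, start, stop in triples]


def check_layout(entries):
    """Return (subvolumes without a range, holes in the hash space).

    Assumption: a (0, 0) range means "no range assigned".
    """
    unassigned = [name for name, start, stop in entries if (start, stop) == (0, 0)]
    ranges = sorted((start, stop) for _, start, stop in entries
                    if (start, stop) != (0, 0))
    holes = []
    expected = 0  # next hash value that still needs an owner
    for start, stop in ranges:
        if start > expected:
            holes.append((expected, start - 1))
        expected = max(expected, stop + 1)
    if expected <= HASH_MAX:
        holes.append((expected, HASH_MAX))
    return unassigned, holes


unassigned, holes = check_layout(parse_layout(PATHINFO))
print("unassigned:", unassigned)
print("holes:", holes)
```

For the value above, this reports vol22-client-3 as unassigned and finds no holes: under the stated assumption, the other eight bricks cover the whole hash range between them. A layout in which every subvolume shows 0 0 would instead be reported as one hole spanning the entire range.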

The problem above is seen on all of our clients.
I suspect it is related to exclusive control (locking) for mkdir, but
I don't understand this phenomenon well enough.
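
The simultaneous mkdir pattern can be approximated on a single host with a small script. This is only a sketch in Python under stated assumptions: the function name is hypothetical, threads on one machine stand in for the separate client hosts of the real scenario, and the example runs against a local temporary directory rather than a gluster mount.

```python
import os
import tempfile
import threading


def race_mkdir(path, workers=8):
    """Have several threads call mkdir on the same path at the same moment."""
    barrier = threading.Barrier(workers)
    results = [None] * workers

    def worker(index):
        barrier.wait()  # hold every thread here, then release them together
        try:
            os.mkdir(path)
            results[index] = "created"
        except FileExistsError:
            results[index] = "exists"
        except OSError as exc:
            # Any other errno (for example EINVAL) is recorded as text.
            results[index] = "error: " + os.strerror(exc.errno)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    # A local temporary directory stands in for the gluster mount point.
    with tempfile.TemporaryDirectory() as base:
        print(race_mkdir(os.path.join(base, "x")))
```

On a local filesystem exactly one caller should report "created" and the rest "exists". To exercise the case described here, the path would have to point into the mounted volume and the script would have to be started on several client hosts at about the same time.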

Do you have any ideas?
I will try them.


Best regards,
Kondo



2012/12/2 Anand Avati <anand.avati at gmail.com>

> Can you try disabling stat-prefetch with "gluster volume set <name>
> performance.stat-prefetch off" and try the same? Also, does this problem
> exist in 3.3.x for you?
>
> Avati
>
> On Sat, Dec 1, 2012 at 12:06 AM, kenji kondo <kkay.jp at gmail.com> wrote:
>
>> Dear experts,
>>
>> I'm using gluster 3.2.7. I believe it has good performance. That's good,
>> but trouble sometimes occurs with mkdir.
>> The scenario is as follows:
>> 1: A volume is created from 9 bricks on 9 gluster servers.
>> 2: Many client hosts mount it with fuse.
>> 3: Several clients simultaneously make one directory.
>> 4: All hosts except one fail to make the directory.
>> (This is expected.)
>> 5: However, a problem host then appears: it becomes unable to make a
>> directory or create a file in the directory of step 1.
>>
>> At that time, I found this error message on the problem host:
>>
>> mkdir: cannot create directory `/gluster/test/x': Invalid argument.
>>
>> touch  /gluster/test/x
>> touch: cannot touch `/gluster/test/x': No such file or directory
>>
>> Then I found the following log entries in /var/log/gluster/[logs]:
>>
>> [2012-11-29 19:36:50.52787] I [dht-layout.c:682:dht_layout_dir_mismatch] 0-vol18-dht: subvol: vol18-client-1; inode layout - 0 - 0; disk layout - 477218588 - 954437175
>> [2012-11-29 19:36:50.52824] I [dht-common.c:525:dht_revalidate_cbk] 0-vol18-dht: mismatching layouts for /test/x
>> [2012-11-29 19:36:50.52873] I [dht-layout.c:682:dht_layout_dir_mismatch] 0-vol18-dht: subvol: vol18-client-7; inode layout - 0 - 0; disk layout - 3340530116 - 3817748703
>> [2012-11-29 19:36:50.52886] I [dht-common.c:525:dht_revalidate_cbk] 0-vol18-dht: mismatching layouts for /test/x
>> [2012-11-29 19:36:50.52901] I [dht-layout.c:682:dht_layout_dir_mismatch] 0-vol18-dht: subvol: vol18-client-2; inode layout - 0 - 0; disk layout - 954437176 - 1431655763
>> [2012-11-29 19:36:50.52917] I [dht-common.c:525:dht_revalidate_cbk] 0-vol18-dht: mismatching layouts for /test/x
>> [2012-11-29 19:36:50.52936] I [dht-layout.c:682:dht_layout_dir_mismatch] 0-vol18-dht: subvol: vol18-client-5; inode layout - 0 - 0; disk layout - 2386092940 - 2863311527
>> [2012-11-29 19:36:50.52947] I [dht-common.c:525:dht_revalidate_cbk] 0-vol18-dht: mismatching layouts for /test/x
>> [2012-11-29 19:36:50.52961] I [dht-layout.c:682:dht_layout_dir_mismatch] 0-vol18-dht: subvol: vol18-client-3; inode layout - 0 - 0; disk layout - 1431655764 - 1908874351
>> [2012-11-29 19:36:50.52970] I [dht-common.c:525:dht_revalidate_cbk] 0-vol18-dht: mismatching layouts for /test/x
>> [2012-11-29 19:36:50.52983] I [dht-layout.c:682:dht_layout_dir_mismatch] 0-vol18-dht: subvol: vol18-client-0; inode layout - 0 - 0; disk layout - 0 - 477218587
>> [2012-11-29 19:36:50.52993] I [dht-common.c:525:dht_revalidate_cbk] 0-vol18-dht: mismatching layouts for /test/x
>> [2012-11-29 19:36:50.53007] I [dht-layout.c:682:dht_layout_dir_mismatch] 0-vol18-dht: subvol: vol18-client-6; inode layout - 0 - 0; disk layout - 2863311528 - 3340530115
>> [2012-11-29 19:36:50.53016] I [dht-common.c:525:dht_revalidate_cbk] 0-vol18-dht: mismatching layouts for /test/x
>> [2012-11-29 19:36:50.53029] I [dht-layout.c:682:dht_layout_dir_mismatch] 0-vol18-dht: subvol: vol18-client-8; inode layout - 0 - 0; disk layout - 3817748704 - 4294967295
>> [2012-11-29 19:36:50.53038] I [dht-common.c:525:dht_revalidate_cbk] 0-vol18-dht: mismatching layouts for /test/x
>> [2012-11-29 19:36:50.53052] I [dht-layout.c:682:dht_layout_dir_mismatch] 0-vol18-dht: subvol: vol18-client-4; inode layout - 0 - 0; disk layout - 1908874352 - 2386092939
>> [2012-11-29 19:36:50.53060] I [dht-common.c:525:dht_revalidate_cbk] 0-vol18-dht: mismatching layouts for /test/x
>> [2012-11-29 19:36:50.53923] I [dht-layout.c:192:dht_layout_search] 0-vol18-dht: no subvolume for hash (value) = 3127134579
>> [2012-11-29 19:36:50.54422] I [dht-layout.c:192:dht_layout_search] 0-vol18-dht: no subvolume for hash (value) = 3127134579
>> [2012-11-29 19:36:50.54442] W [fuse-bridge.c:231:fuse_entry_cbk] 0-glusterfs-fuse: 127332: MKDIR() /test/x => -1 (Invalid argument)
>>
>>
>> So I checked the DHT layout with the getfattr command on the problem host.
>>
>> [host1]$ sudo getfattr -m . -n trusted.glusterfs.pathinfo  /gluster/test
>> getfattr: Removing leading '/' from absolute path names
>> # file: gluster/test
>> trusted.glusterfs.pathinfo="(vol18-dht-layout (vol18-client-7 0 0) (vol18-client-8 0 0) (vol18-client-4 0 0) (vol18-client-0 0 0) (vol18-client-6 0 0) (vol18-client-1 0 0) (vol18-client-2 0 0) (vol18-client-3 0 0) (vol18-client-5 0 0))"
>>
>>
>> It seems the layout table is incorrect.
>>
>> When checked on a host without the problem, the table below is displayed.
>>
>> [host2]$ sudo getfattr -m . -n trusted.glusterfs.pathinfo  /gluster/test
>> getfattr: Removing leading '/' from absolute path names
>> # file: gluster/test
>> trusted.glusterfs.pathinfo="(vol18-dht-layout (vol18-client-0 0 477218587) (vol18-client-1 477218588 954437175) (vol18-client-2 954437176 1431655763) (vol18-client-3 1431655764 1908874351) (vol18-client-4 1908874352 2386092939) (vol18-client-5 2386092940 2863311527) (vol18-client-6 2863311528 3340530115) (vol18-client-7 3340530116 3817748703) (vol18-client-8 3817748704 4294967295))"
>>
>>
>> In my experience, remounting on the problem host makes the problem
>> disappear, and the host can make the directory again.
>> Is this a bug?
>>
>> Best regards,
>> Kondo