[Gluster-users] GlusterFS mount does not list directory content until parent directory is listed

elvinas.piliponis at barclays.com elvinas.piliponis at barclays.com
Thu May 2 12:27:28 UTC 2013


Hello, 

I have spotted strange behaviour with a GlusterFS FUSE mount: I am unable to list the files in a directory until its parent directory has been listed. However, if I list a file by its full path, it is listed on some client nodes.

Example: 
localadmin at ldgpsua00000038:~$ ls -al /var/lib/nova/instances/_base/
ls: cannot access /var/lib/nova/instances/_base/: No such file or directory
localadmin at ldgpsua00000038:~$ ls -al /var/lib/nova/instances/
total 32271483
drwxr-xr-x 413 nova nova      622592 May  2 10:48 .
drwxr-xr-x   9 nova nova        4096 Nov 14 13:37 ..
drwxrwxr-x   2 nova nova      102406 Apr 30 12:19 _base
drwxr-xr-x   2 root root         132 Mar 12 13:58 _base.unused
-rw-r--r--   1 root root      224182 Mar 15 11:42 --help
drwxr-xr-x   2 root root         226 Mar 29 03:42 instance-000001eb
drwxr-xr-x   2 root root         208 Mar 29 03:42 instance-000001ec
drwxrwxr-x   2 nova nova         226 Mar 29 03:33 instance-00000296
..... [skipped lots of lines]...
drwxr-xr-x   2 root root         226 Mar 29 03:18 win-src8
drwxr-xr-x   2 root root         226 Mar 29 03:18 win-src9
localadmin at ldgpsua00000038:~$ ls -al /var/lib/nova/instances/_base/
total 8523873965
drwxrwxr-x   2 nova         nova      102406 Apr 30 12:19 .
drwxr-xr-x 413 nova         nova      622592 May  2 10:48 ..
-rw-r--r--   1 nova         nova 75161927680 Apr 24 19:51 049a236d5e3b297288507788369c705d3d46c17a
-rw-r--r--   1 libvirt-qemu kvm  75161927680 Apr 24 20:14 049a236d5e3b297288507788369c705d3d46c17a_70
-rw-r--r--   1 nova         nova 75161927680 Apr  9 12:37 055c0e6190c5d5629c0b1e8c1865a60d14005471
-rw-r--r--   1 libvirt-qemu kvm  75161927680 Apr  9 12:49 055c0e6190c5d5629c0b1e8c1865a60d14005471_70
...[skipped lots of lines]....
-rw-r--r--   1 nova         nova 75161927680 Mar 25 13:13 ff8ad6c675c84df6f70f9bd0ac04fb4189b3c899
-rw-r--r--   1 libvirt-qemu kvm  75161927680 Mar 25 13:46 ff8ad6c675c84df6f70f9bd0ac04fb4189b3c899_70
localadmin at ldgpsua00000038:~$

Before the parent directory was re-listed, the log file for the mount point was full of:
--------------------------
[2013-05-02 12:20:01.376593] I [dht-common.c:596:dht_revalidate_cbk] 3-glustervmstore-dht: mismatching layouts for /
[2013-05-02 12:20:51.975861] I [dht-layout.c:593:dht_layout_normalize] 3-glustervmstore-dht: found anomalies in /_base. holes=0 overlaps=2
[2013-05-02 12:20:52.077131] I [dht-layout.c:593:dht_layout_normalize] 3-glustervmstore-dht: found anomalies in /_base. holes=0 overlaps=2
[2013-05-02 12:20:52.096745] I [dht-layout.c:593:dht_layout_normalize] 3-glustervmstore-dht: found anomalies in <gfid:bec8d3c8-fb57-4e0f-88da-cf28c7e5fadc>. holes=0 overlaps=2
[2013-05-02 12:20:52.096840] W [fuse-resolve.c:152:fuse_resolve_gfid_cbk] 0-fuse: bec8d3c8-fb57-4e0f-88da-cf28c7e5fadc: failed to resolve (Invalid argument)
[2013-05-02 12:20:52.096868] E [fuse-bridge.c:352:fuse_lookup_resume] 0-fuse: failed to resolve path (null)
[2013-05-02 12:20:52.102258] I [dht-layout.c:593:dht_layout_normalize] 3-glustervmstore-dht: found anomalies in /_base. holes=0 overlaps=2
[2013-05-02 12:20:52.118880] I [dht-layout.c:593:dht_layout_normalize] 3-glustervmstore-dht: found anomalies in <gfid:bec8d3c8-fb57-4e0f-88da-cf28c7e5fadc>. holes=0 overlaps=2
[2013-05-02 12:20:52.118936] W [fuse-resolve.c:152:fuse_resolve_gfid_cbk] 0-fuse: bec8d3c8-fb57-4e0f-88da-cf28c7e5fadc: failed to resolve (Invalid argument)
[2013-05-02 12:20:52.118958] E [fuse-bridge.c:352:fuse_lookup_resume] 0-fuse: failed to resolve path (null)
[2013-05-02 12:20:54.550788] I [dht-layout.c:593:dht_layout_normalize] 3-glustervmstore-dht: found anomalies in /_base. holes=0 overlaps=2
[2013-05-02 12:20:54.651945] I [dht-layout.c:593:dht_layout_normalize] 3-glustervmstore-dht: found anomalies in /_base. holes=0 overlaps=2
[2013-05-02 12:20:54.666836] I [dht-layout.c:593:dht_layout_normalize] 3-glustervmstore-dht: found anomalies in <gfid:bec8d3c8-fb57-4e0f-88da-cf28c7e5fadc>. holes=0 overlaps=2
[2013-05-02 12:20:54.666897] W [fuse-resolve.c:152:fuse_resolve_gfid_cbk] 0-fuse: bec8d3c8-fb57-4e0f-88da-cf28c7e5fadc: failed to resolve (Invalid argument)
[2013-05-02 12:20:54.666919] E [fuse-bridge.c:352:fuse_lookup_resume] 0-fuse: failed to resolve path (null)
[2013-05-02 12:20:54.672033] I [dht-layout.c:593:dht_layout_normalize] 3-glustervmstore-dht: found anomalies in /_base. holes=0 overlaps=2
[2013-05-02 12:20:54.692809] I [dht-layout.c:593:dht_layout_normalize] 3-glustervmstore-dht: found anomalies in <gfid:bec8d3c8-fb57-4e0f-88da-cf28c7e5fadc>. holes=0 overlaps=2
[2013-05-02 12:20:54.692869] W [fuse-resolve.c:152:fuse_resolve_gfid_cbk] 0-fuse: bec8d3c8-fb57-4e0f-88da-cf28c7e5fadc: failed to resolve (Invalid argument)
[2013-05-02 12:20:54.692891] E [fuse-bridge.c:352:fuse_lookup_resume] 0-fuse: failed to resolve path (null)
-------------
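
For readers unfamiliar with the "holes=0 overlaps=2" wording, here is a simplified illustration of what a hole and an overlap mean for a set of hash ranges assigned to subvolumes. It is only a sketch and not GlusterFS's actual code: the function name count_anomalies, the tiny 0..99 hash space, and the sample ranges are all hypothetical, and it only compares each range with its immediate predecessor.

```python
def count_anomalies(ranges):
    """Count gaps (holes) and overlaps between inclusive (start, stop) ranges."""
    holes = overlaps = 0
    ordered = sorted(ranges)
    # Compare each range with the one immediately before it.
    for (_, prev_stop), (start, _) in zip(ordered, ordered[1:]):
        if start > prev_stop + 1:
            holes += 1      # some hash values belong to no subvolume
        elif start <= prev_stop:
            overlaps += 1   # some hash values belong to two subvolumes
    return holes, overlaps


# Hypothetical hash space 0..99 split across four subvolumes.
healthy = [(0, 24), (25, 49), (50, 74), (75, 99)]
broken = [(0, 30), (25, 55), (50, 74), (75, 99)]

print(count_anomalies(healthy))
print(count_anomalies(broken))
```

In this toy model the second layout has no holes but two overlapping boundaries, which is the same shape of anomaly the log reports for /_base.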

After the re-listing, the errors changed to info messages:

[2013-05-02 12:21:22.145739] I [dht-common.c:596:dht_revalidate_cbk] 3-glustervmstore-dht: mismatching layouts for /
[2013-05-02 12:21:22.145839] I [dht-layout.c:698:dht_layout_dir_mismatch] 3-glustervmstore-dht: subvol: glustervmstore-replicate-6; inode layout - 1561806288 - 1757032073; disk layout - 0 - 330382098
[2013-05-02 12:21:22.145865] I [dht-common.c:596:dht_revalidate_cbk] 3-glustervmstore-dht: mismatching layouts for /
[2013-05-02 12:21:22.145991] I [dht-layout.c:698:dht_layout_dir_mismatch] 3-glustervmstore-dht: subvol: glustervmstore-replicate-7; inode layout - 1757032074 - 1952257859; disk layout - 1636178016 - 1840700267
[2013-05-02 12:21:22.146021] I [dht-common.c:596:dht_revalidate_cbk] 3-glustervmstore-dht: mismatching layouts for /
[2013-05-02 12:21:22.146177] I [dht-layout.c:698:dht_layout_dir_mismatch] 3-glustervmstore-dht: subvol: glustervmstore-replicate-8; inode layout - 1952257860 - 2147483645; disk layout - 1840700268 - 2045222519
[2013-05-02 12:21:22.146277] I [dht-common.c:596:dht_revalidate_cbk] 3-glustervmstore-dht: mismatching layouts for /
[2013-05-02 12:21:22.146419] I [dht-layout.c:698:dht_layout_dir_mismatch] 3-glustervmstore-dht: subvol: glustervmstore-replicate-10; inode layout - 2733161004 - 2928386789; disk layout - 2658789276 - 2863311527
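
To compare the cached (inode) layout with the on-disk layout per subvolume, the mismatch lines above can be pulled apart with a small script. This is a minimal sketch that assumes the log line format is exactly as quoted here; the names MISMATCH_RE and parse_mismatches are my own, not part of any Gluster tooling.

```python
import re

# Matches the payload of the "dht_layout_dir_mismatch" lines quoted above.
MISMATCH_RE = re.compile(
    r"subvol: (?P<subvol>\S+); "
    r"inode layout - (?P<i_start>\d+) - (?P<i_stop>\d+); "
    r"disk layout - (?P<d_start>\d+) - (?P<d_stop>\d+)"
)


def parse_mismatches(lines):
    """Return (subvol, inode_range, disk_range) for each mismatch line."""
    found = []
    for line in lines:
        m = MISMATCH_RE.search(line)
        if m:
            found.append((
                m.group("subvol"),
                (int(m.group("i_start")), int(m.group("i_stop"))),
                (int(m.group("d_start")), int(m.group("d_stop"))),
            ))
    return found


# Two lines copied from the log excerpt; only the first is a mismatch record.
sample = [
    "[2013-05-02 12:21:22.145839] I [dht-layout.c:698:dht_layout_dir_mismatch] "
    "3-glustervmstore-dht: subvol: glustervmstore-replicate-6; "
    "inode layout - 1561806288 - 1757032073; disk layout - 0 - 330382098",
    "[2013-05-02 12:21:22.145865] I [dht-common.c:596:dht_revalidate_cbk] "
    "3-glustervmstore-dht: mismatching layouts for /",
]

for subvol, inode_rng, disk_rng in parse_mismatches(sample):
    print(subvol, "cached:", inode_rng, "on disk:", disk_rng)
```

Feeding the whole client log through parse_mismatches gives one record per subvolume, which makes it easier to see which ranges differ between the cached and on-disk layouts.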

I am using the Semiosis packages on Ubuntu 12.04:
ii  glusterfs-client                 3.3.1-ubuntu1~precise8                     clustered file-system (client package)
ii  glusterfs-common                 3.3.1-ubuntu1~precise8                     GlusterFS common libraries and translator modules
ii  glusterfs-server                 3.3.1-ubuntu1~precise8                     clustered file-system (server package)

This morning I recovered from a cluster lockup in which one node got stuck with "CPU #X stuck on task for more than XY seconds". For some reason Gluster blindly attempted to continue I/O operations, although the log showed lots of RPC connection errors to the stuck node. The node allowed connections to be initiated, but nothing happened afterwards. This stalled all I/O operations on the shared file system. This might have caused the issue above.

However, I have run the volume heal full command several times; no split-brain files were listed, and only several heal-failed files were listed in the / directory of the GlusterFS volume. Most likely these are stale records, as at least some of those files were deleted some time ago.

Any ideas where to look further?

Thank you
______________________________________________________________________________

Elvinas Piliponis  I  UNIX Engineer  I  GTIS UNIX Engineering 
Tel +370 5 251 1218, 7 2249 1218  I   Mobile +370 656 69249 I  Email  elvinas.piliponis at barclays.com
Barclays, GreenHall 9th floor 09.E4.1, Upės g. 21, Vilnius, Lithuania LT-081218 
barclays.com

