[Gluster-users] Files from one brick missing from readdir

Hans Henrik Happe happe at nbi.dk
Mon Jul 9 11:00:57 UTC 2018


Hi Nithya,

Now, we have the same situation as we did the last time. It fixed itself.

Do you have any insights into what might trigger a fix? It must be
related to use of the dirs, but it's almost 24 hours since we started
poking around in that path.

According to the rebalance log, the dir has not been touched.

Cheers,
Hans Henrik

On 09-07-2018 11:25, Nithya Balachandran wrote:
> Hi Hans,
> 
> Never mind - I found it. It looks like the same problem as reported by
> the other user. In both cases, this is a pure distribute volume.
> 
> See packet 154. The iatt is null for all entries. It looks like a
> .glusterfs gfid link is missing on that brick.
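
For reference, a brick keeps an entry for each gfid under .glusterfs, in a subdirectory named after the first two and the next two hex digits of the gfid, with the full UUID as the entry name. The following is a minimal Python sketch, assuming that layout, for checking whether the entry exists on a brick. The function names are made up for illustration, and any brick path passed in (for example /data/brick1) is hypothetical.

```python
import os
import uuid


def gfid_backend_path(brick_root: str, gfid_hex: str) -> str:
    """Return the expected .glusterfs entry for a gfid on a brick.

    Assumes the layout <brick>/.glusterfs/<aa>/<bb>/<full-uuid>.
    """
    # Accept the "0x..." form printed by `getfattr -e hex`.
    cleaned = gfid_hex[2:] if gfid_hex.lower().startswith("0x") else gfid_hex
    canonical = str(uuid.UUID(cleaned))  # dashed, lowercase form
    return os.path.join(
        brick_root, ".glusterfs", canonical[:2], canonical[2:4], canonical
    )


def gfid_link_present(brick_root: str, gfid_hex: str) -> bool:
    """True if the .glusterfs entry exists (file, or symlink even if dangling)."""
    # lexists() does not follow symlinks, so a dangling symlink still counts.
    return os.path.lexists(gfid_backend_path(brick_root, gfid_hex))
```

With the trusted.gfid value reported for the backup dir in this thread (0x8613f6e0317141918b42d8c8063ffbce), the expected entry would be .glusterfs/86/13/8613f6e0-3171-4191-8b42-d8c8063ffbce below the brick root. That gfid is used only as an example; it is not necessarily the entry that was missing.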
> 
> Would you prefer that I send the steps to recover in a private email?
> 
> Regards,
> Nithya
> 
> 
> On 9 July 2018 at 14:49, Nithya Balachandran <nbalacha at redhat.com
> <mailto:nbalacha at redhat.com>> wrote:
> 
>     Or even better, the brick on which those files exist and the gluster
>     volume status output for the volume.
> 
>     Thanks,
>     Nithya
> 
>     On 9 July 2018 at 14:42, Nithya Balachandran <nbalacha at redhat.com
>     <mailto:nbalacha at redhat.com>> wrote:
> 
>         Thanks Hans. What are the names of the "missing" files?
> 
>         Regards,
>         Nithya
> 
>         On 9 July 2018 at 13:30, Nithya Balachandran
>         <nbalacha at redhat.com <mailto:nbalacha at redhat.com>> wrote:
> 
>             Hi Hans,
> 
>             Another user has reported something similar and we are still
>             debugging this.
> 
>             Would you mind taking a tcpdump of the client while listing
>             the directory from a FUSE client and sending it to me?
>             Please use:
>             tcpdump -i any -s 0 -w /var/tmp/dirls.pcap tcp and not port 22
> 
> 
>             Also, please send the output of gluster volume info and
>             gluster volume get <volname> all.
> 
>             Thanks,
>             Nithya
> 
> 
>             On 9 July 2018 at 12:51, Hans Henrik Happe <happe at nbi.dk
>             <mailto:happe at nbi.dk>> wrote:
> 
>                 Hi,
> 
>                 After an upgrade from 3.7 -> 3.10 -> 3.12.9 that seemed
>                 to go smoothly,
>                 we have experienced missing files and dirs when listing
>                 directories.
> 
>                 We are using a distributed setup with 20 bricks (no
>                 redundancy from
>                 glusterfs).
> 
>                 The dirs and files can be referenced directly, but do
>                 not show up in
>                 listings (readdir, i.e. ls). Renaming them works, but
>                 they still do
>                 not show up.
> 
>                 The first time we discovered this, we noticed that files
>                 slowly
>                 reappeared and finally all were there. After that we
>                 started a
>                 fix-layout, which is still running (5 million dirs). Once
>                 it finishes, we plan to
>                 compare brick files to the mounted fs.
> 
>                 Yesterday we again discovered some missing files in a
>                 dir. After some
>                 poking around we found that all missing files were
>                 located on the same
>                 brick.
> 
>                 Comparing the dir xattrs did not give us a clue:
> 
> 
>                 Brick with missing files:
> 
>                 # getfattr  -m . -d -e hex backup
>                 # file: backup
>                 trusted.gfid=0x8613f6e0317141918b42d8c8063ffbce
>                 trusted.glusterfs.dht=0x0000000100000000b2169fa3bf11b4e7
>                 trusted.glusterfs.quota.6e0ab807-6eed-4af1-92b8-0db3ca7a19e0.contri.1=0x00000000d65e340000000000000000750000000000000001
>                 trusted.glusterfs.quota.dirty=0x3000
>                 trusted.glusterfs.quota.size.1=0x00000000d65e340000000000000000750000000000000001
> 
>                 Other brick:
> 
>                 # getfattr  -m . -d -e hex backup
>                 # file: backup
>                 trusted.gfid=0x8613f6e0317141918b42d8c8063ffbce
>                 trusted.glusterfs.dht=0x0000000100000000bf11b4e8cbe5e488
>                 trusted.glusterfs.quota.6e0ab807-6eed-4af1-92b8-0db3ca7a19e0.contri.1=0x00000000b03aa80000000000000000700000000000000001
>                 trusted.glusterfs.quota.dirty=0x3000
>                 trusted.glusterfs.quota.size.1=0x00000000b03aa80000000000000000700000000000000001
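
To make the two trusted.glusterfs.dht values easier to compare, here is a small Python sketch that splits the 16-byte value into four big-endian 32-bit words. It assumes the last two words are the start and end of the hash range assigned to the directory on that brick; the first two words are treated as opaque header fields, and the names `header` and `hash_type` are labels chosen here, not taken from the gluster source.

```python
import struct


def decode_dht_layout(xattr_hex: str) -> dict:
    """Split a trusted.glusterfs.dht hex value into its four 32-bit words."""
    # Accept the "0x..." form printed by `getfattr -e hex`.
    cleaned = xattr_hex[2:] if xattr_hex.lower().startswith("0x") else xattr_hex
    raw = bytes.fromhex(cleaned)
    if len(raw) != 16:
        raise ValueError("expected a 16-byte layout value, got %d bytes" % len(raw))
    header, hash_type, start, stop = struct.unpack(">IIII", raw)
    return {"header": header, "hash_type": hash_type, "start": start, "stop": stop}


def ranges_adjacent(first: dict, second: dict) -> bool:
    """True if `second` begins exactly one past the end of `first`."""
    return first["stop"] + 1 == second["start"]


if __name__ == "__main__":
    a = decode_dht_layout("0x0000000100000000b2169fa3bf11b4e7")
    b = decode_dht_layout("0x0000000100000000bf11b4e8cbe5e488")
    print("%08x-%08x" % (a["start"], a["stop"]))
    print("%08x-%08x" % (b["start"], b["stop"]))
    print(ranges_adjacent(a, b))
```

Under that reading, the brick with missing files covers b2169fa3 to bf11b4e7 and the other brick covers bf11b4e8 to cbe5e488, so the two ranges are adjacent with no gap or overlap. That is consistent with the xattrs not pointing to a layout problem.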
> 
> 
>                 Has anyone experienced this, or does anyone have a clue
>                 as to what might be wrong?
> 
>                 Cheers,
>                 Hans Henrik
> 
> 
> 
> 
> 

