[Gluster-users] Files Missing on Client Side; Still available on bricks

Thu Jun 8 07:34:51 UTC 2017

+Raghavendra/Nithya

On Tue, Jun 6, 2017 at 7:41 PM, Jarsulic, Michael [CRI] <
mjarsulic at bsd.uchicago.edu> wrote:

> Hello,
>
> I am still working on recovering from a few failed OS hard drives on my
> gluster storage and have been removing and re-adding bricks quite a bit. I
> noticed last night that some of the directories are not visible when I
> access them through the client, but are still on the brick. For example:
>
> Client:
>
> # ls /scratch/dw
> Ethiopian_imputation  HGDP  Rolwaling  Tibetan_Alignment
>
> Brick:
>
> # ls /data/brick1/scratch/dw
> 1000GP_Phase3  Ethiopian_imputation  HGDP  Rolwaling  SGDP
> Siberian_imputation  Tibetan_Alignment  mapata
>
>
> However, the directory is accessible on the client side (just not visible):
>
> # stat /scratch/dw/SGDP
>   File: `/scratch/dw/SGDP'
>   Size: 212992      Blocks: 416        IO Block: 131072 directory
> Device: 21h/33d Inode: 11986142482805280401  Links: 2
> Access: (0775/drwxrwxr-x)  Uid: (339748621/dw)   Gid: (339748621/dw)
> Access: 2017-06-02 16:00:02.398109000 -0500
> Modify: 2017-06-06 06:59:13.004947703 -0500
> Change: 2017-06-06 06:59:13.004947703 -0500
>
>
> The only place I see the directory mentioned in the log files is in the
> rebalance logs. The following excerpt may provide a clue as to what is going
> on:
>
> [2017-06-05 20:46:51.752726] E [MSGID: 109010] [dht-rebalance.c:2259:gf_defrag_get_entry]
> 0-hpcscratch-dht: /dw/SGDP/HGDP00476_chr6.tped gfid not present
> [2017-06-05 20:46:51.752742] E [MSGID: 109010] [dht-rebalance.c:2259:gf_defrag_get_entry]
> 0-hpcscratch-dht: /dw/SGDP/LP6005441-DNA_B08_chr4.tmp gfid not present
> [2017-06-05 20:46:51.752773] E [MSGID: 109010] [dht-rebalance.c:2259:gf_defrag_get_entry]
> 0-hpcscratch-dht: /dw/SGDP/LP6005441-DNA_B08.geno.tmp gfid not present
> [2017-06-05 20:46:51.752789] E [MSGID: 109010] [dht-rebalance.c:2259:gf_defrag_get_entry]
> 0-hpcscratch-dht: /dw/SGDP/LP6005443-DNA_D02_chr4.out gfid not present
>
> This happened yesterday during a rebalance that failed. However, doing a
> rebalance fix-layout allowed me to clean up these errors and successfully
> complete a migration to a re-added brick.
>
>
> Here is the information for my storage cluster:
>
> # gluster volume info
>
> Volume Name: hpcscratch
> Type: Distribute
> Volume ID: 80b8eeed-1e72-45b9-8402-e01ae0130105
> Status: Started
> Number of Bricks: 6
> Transport-type: tcp
> Bricks:
> Brick1: fs001-ib:/data/brick2/scratch
> Brick2: fs003-ib:/data/brick5/scratch
> Brick3: fs003-ib:/data/brick6/scratch
> Brick4: fs004-ib:/data/brick7/scratch
> Brick5: fs001-ib:/data/brick1/scratch
> Brick6: fs004-ib:/data/brick8/scratch
> Options Reconfigured:
> server.event-threads: 8
> performance.client-io-threads: on
> client.event-threads: 8
> performance.cache-size: 32MB
> performance.readdir-ahead: on
> diagnostics.client-log-level: INFO
> diagnostics.brick-log-level: INFO
>
>
> Mount points for the bricks:
>
> /dev/sdb on /data/brick2 type xfs (rw,noatime,nobarrier)
> /dev/sda on /data/brick1 type xfs (rw,noatime,nobarrier)
>
>
> Mount point on the client:
>
> 10.xx.xx.xx:/hpcscratch on /scratch type fuse.glusterfs
> (rw,default_permissions,allow_other,max_read=131072)
>
>
> My question is: what are some possible root causes of this issue, and what
> is the recommended way of recovering from it? Let me know if you need any
> more information.
>
>
> --
> Mike Jarsulic
> Sr. HPC Administrator
> Center for Research Informatics | University of Chicago
> 773.702.2066
>
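
To see the full set of affected entries, the client and brick listings quoted above can be compared mechanically instead of by eye. Below is a minimal sketch in Python. It assumes it runs on a host that can read both the FUSE mount and the brick directory. The function name is hypothetical and not part of any Gluster tooling. It only reads directory listings and changes nothing on the volume.

```python
import os


def entries_missing_on_client(client_dir, brick_dir):
    """Return names listed in brick_dir but absent from client_dir's listing.

    Hypothetical helper for illustration only: it compares plain directory
    listings and does not inspect gfids, layouts, or extended attributes.
    """
    client_names = set(os.listdir(client_dir))
    brick_names = set(os.listdir(brick_dir))
    # .glusterfs is Gluster's internal directory at the brick root and is
    # never expected to show up on the client mount.
    brick_names.discard(".glusterfs")
    return sorted(brick_names - client_names)
```

Calling `entries_missing_on_client("/scratch/dw", "/data/brick1/scratch/dw")` with the listings quoted above should report 1000GP_Phase3, SGDP, Siberian_imputation and mapata. On a Distribute volume each file lives on a single brick, so a file that appears on the client but not on this brick is expected. Running the comparison against each brick gives the complete picture.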

-- 
Pranith