[Gluster-users] Cannot remove-brick/migrate data

Nithya Balachandran nbalacha at redhat.com
Thu Mar 9 03:35:12 UTC 2017


On 8 March 2017 at 23:34, Jarsulic, Michael [CRI] <mjarsulic at bsd.uchicago.edu> wrote:

> I am having issues with one of my systems that houses two bricks and want
> to bring it down for maintenance. I was able to remove the first brick
> successfully and committed the changes. The second brick is giving me a lot
> of problems with the rebalance when I try to remove it. It seems like it is
> stuck somewhere in that process:
>
> # gluster volume remove-brick hpcscratch cri16fs002-ib:/data/brick4/scratch status
>         Node   Rebalanced-files        size     scanned    failures     skipped        status   run time in secs
>    ---------   ----------------   ---------   ---------   ---------   ---------   -----------   ----------------
>    localhost                  0      0Bytes         522           0           0   in progress             915.00
>
>
> The rebalance logs show the following error messages:
>
> [2017-03-08 17:48:19.329934] I [MSGID: 109081] [dht-common.c:3810:dht_setxattr] 0-hpcscratch-dht: fixing the layout of /userx/Ethiopian_imputation
> [2017-03-08 17:48:19.329960] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-hpcscratch-dht: subvolume 0 (hpcscratch-client-0): 45778954 chunks
> [2017-03-08 17:48:19.329968] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-hpcscratch-dht: subvolume 1 (hpcscratch-client-1): 45778954 chunks
> [2017-03-08 17:48:19.329974] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-hpcscratch-dht: subvolume 2 (hpcscratch-client-4): 45778954 chunks
> [2017-03-08 17:48:19.329979] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-hpcscratch-dht: subvolume 3 (hpcscratch-client-5): 45778954 chunks
> [2017-03-08 17:48:19.329983] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-hpcscratch-dht: subvolume 4 (hpcscratch-client-7): 45778954 chunks
> [2017-03-08 17:48:19.400394] I [MSGID: 109036] [dht-common.c:7869:dht_log_new_layout_for_dir_selfheal] 0-hpcscratch-dht: Setting layout of /userx/Ethiopian_imputation with [Subvol_name: hpcscratch-client-0, Err: -1 , Start: 1052915942 , Stop: 2105831883 , Hash: 1 ], [Subvol_name: hpcscratch-client-1, Err: -1 , Start: 3158747826 , Stop: 4294967295 , Hash: 1 ], [Subvol_name: hpcscratch-client-4, Err: -1 , Start: 0 , Stop: 1052915941 , Hash: 1 ], [Subvol_name: hpcscratch-client-5, Err: -1 , Start: 2105831884 , Stop: 3158747825 , Hash: 1 ], [Subvol_name: hpcscratch-client-7, Err: 22 , Start: 0 , Stop: 0 , Hash: 0 ],
> [2017-03-08 17:48:19.480882] I [dht-rebalance.c:2446:gf_defrag_process_dir] 0-hpcscratch-dht: migrate data called on /userx/Ethiopian_imputation
>
>

These are not error messages; they are informational (level I) messages logged whenever the layout of a directory is being set, and they can safely be ignored.
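
If you want to confirm that the rebalance has not hit any real errors, you can filter the rebalance log by log level (error entries are tagged " E ", informational ones " I "). A minimal sketch, assuming the default GlusterFS log location; the path may differ on your system:

    # show only error-level entries (log path assumed; adjust to your installation)
    grep ' E ' /var/log/glusterfs/hpcscratch-rebalance.log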

According to the status output, the remove-brick operation is still in progress. What makes you think it is stuck? Is there no change in the status output, in particular the scanned and rebalanced-files counters, even after a considerable interval?
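
A simple way to check, reusing the volume and brick names from your output, is to take two status samples a few minutes apart and compare the counters:

    # if scanned/rebalanced-files do not change between samples, the rebalance
    # may be genuinely stalled rather than just slow
    gluster volume remove-brick hpcscratch cri16fs002-ib:/data/brick4/scratch status
    sleep 300
    gluster volume remove-brick hpcscratch cri16fs002-ib:/data/brick4/scratch status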

Regards,
Nithya



>
> Any suggestions on how I can get this brick out of play and preserve the
> data?
>
> --
> Mike Jarsulic
> Sr. HPC Administrator
> Center for Research Informatics | University of Chicago
> 773.702.2066