[Gluster-users] Remove Brick Rebalance Hangs With No Activity

Strahil hunter86_bg at yahoo.com
Mon Oct 28 18:15:33 UTC 2019


Start by copying the gluster-devel list and providing links to your logs.
You can also join the Gluster community meeting and ask for help there.

Best Regards,
Strahil Nikolov

On Oct 28, 2019 20:07, Timothy Orme <torme at ancestry.com> wrote:
>
> I had tried increasing the log level, but didn't find anything of note.
>
> However, after trying a number of different things over the weekend, it turned out that simply starting and stopping the volume seemed to have fixed this.
>
> It does then seem like a bug perhaps, or some confused state, given that it doesn't seem to be any issue with communication between nodes.  I'm not really sure how to report it though, given that I don't have steps to reproduce, or much insight into what the cause might be from logging.
> ________________________________
> From: Strahil <hunter86_bg at yahoo.com>
> Sent: Sunday, October 27, 2019 10:19 AM
> To: Timothy Orme <torme at ancestry.com>; gluster-users <gluster-users at gluster.org>
> Subject: [EXTERNAL] Re: Re: [Gluster-users] Remove Brick Rebalance Hangs With No Activity
>  
>
> I guess you can increase the log level (see https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3/html/administration_guide/configuring_the_log_level ).
>
> Also, have you checked whether the new and old servers can communicate properly?
>
> Also consider running tcpdump (for a short time) on the problematic node; that can show whether communication is OK.
>
> I would go with the logs first.
>
> Best Regards,
> Strahil Nikolov
>
> On Oct 26, 2019 20:25, Timothy Orme <torme at ancestry.com> wrote:
>>
>> That's what I thought as well.  All instances seem to be responding and alive according to the volume status.  I was also able to run a `rebalance fix-layout` without any issues, so it seems that communication between the nodes is OK.  I also tried replacing the 10.158.10.1 brick with an entirely new server, since that seemed to be the common one in the logs.  Self heal ran just fine in that replica set.  However, the removal still just hangs when I then try to remove those bricks.
>>
>> I might try a full rebalance as well, just to verify that it works.
>>
>> The only other thing I can think to note is that I'm using SSL for both client and server, and maybe that's obscuring some more important error message, but it would still seem odd given that other communication between the nodes is just fine.
>>
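
For anyone checking the "can the servers communicate" question raised in the thread, here is a minimal sketch of a TCP reachability check in Python. It assumes glusterd is listening on its usual management port, 24007; the function names are made up for illustration, and the peer list has to be filled in by hand (only 10.158.10.1 is taken from the thread). It only shows that a TCP connection can be opened. It does not perform a TLS handshake, so it cannot rule out an SSL-level problem, and it is not a substitute for the logs or a tcpdump capture.

```python
import socket

# Assumption: glusterd's management port. Brick ports differ per brick
# and are listed in the volume status output.
GLUSTERD_PORT = 24007


def can_connect(host, port=GLUSTERD_PORT, timeout=3.0):
    """Return True if a plain TCP connection to host:port opens in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_peers(hosts, port=GLUSTERD_PORT, timeout=3.0):
    """Map each host to the result of can_connect()."""
    return {host: can_connect(host, port, timeout) for host in hosts}


if __name__ == "__main__":
    # Hypothetical peer list: add the other nodes of the cluster here.
    peers = ["10.158.10.1"]
    for host, ok in check_peers(peers).items():
        state = "reachable" if ok else "UNREACHABLE"
        print(f"{host}:{GLUSTERD_PORT} {state}")
```

Running this from each node against every other node gives a quick picture of which pairs can open a connection at all. A pair that fails here is a good candidate for the short tcpdump suggested above.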