[Gluster-users] Brick on just one host constantly going offline

Andrew Lau andrew at andrewklau.com
Tue Jun 3 01:12:44 UTC 2014


Hi Pranith,

On Tue, Jun 3, 2014 at 10:56 AM, Pranith Kumar Karampuri
<pkarampu at redhat.com> wrote:
>
>
> ----- Original Message -----
>> From: "Andrew Lau" <andrew at andrewklau.com>
>> To: "gluster-users at gluster.org List" <gluster-users at gluster.org>
>> Sent: Tuesday, June 3, 2014 4:10:25 AM
>> Subject: [Gluster-users] Brick on just one host constantly going offline
>>
>> Hi,
>>
>> Just a short post as I've since nuked the test environment.
>>
>> I've had this case where, in a 2-node gluster replica, the brick of the
>> first host is constantly going offline.
>>
>> gluster volume status
>>
>> would report that host 1's brick is offline. The quorum would kick in,
>> putting the whole cluster into a read-only state. This has only
>> recently been happening with gluster 3.5, and it normally happens after
>> about 3-4 days of 500GB or so data transfer.
>
> Could you check the mount logs to see if there are ping timer expiry messages for the disconnects?
> If you see them, then it is very likely that you are hitting the throttling problem fixed by http://review.gluster.org/7531
>

Ah, that makes sense, as it was the only volume which had that ping
timeout setting. I did also see the timeout messages in the logs when
I was checking. So is this fix merged in 3.5.1?
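
For anyone else wanting to run the same check, here is a minimal sketch of
searching the client (mount) logs for ping timer expiry messages. It makes
some assumptions: the logs are under /var/log/glusterfs (the usual default,
overridable via LOG_DIR), and the patterns are a loose, case-insensitive
guess at the wording rather than the exact log format. The helper name
find_ping_timeouts is made up for illustration.

```shell
#!/bin/sh
# Sketch: look for ping-timer expiry / disconnect messages in GlusterFS
# client (mount) logs.
# Assumptions: LOG_DIR defaults to /var/log/glusterfs, and the patterns
# below are an approximation of the message wording, not the exact format.

LOG_DIR="${LOG_DIR:-/var/log/glusterfs}"

find_ping_timeouts() {
    # Case-insensitive match on "ping timer"-style messages and on
    # "has not responded" disconnect lines. The extra /dev/null argument
    # makes grep print the file name even when only one log is given.
    grep -inE 'ping[_ ]?timer|has not responded' "$@" /dev/null
}

# Only scan if there are log files to scan.
if ls "$LOG_DIR"/*.log >/dev/null 2>&1; then
    find_ping_timeouts "$LOG_DIR"/*.log \
        || echo "no ping timer messages found in $LOG_DIR"
fi
```

Each match is printed as file:line:text, so the timestamps can be lined up
against the times the brick was reported offline. If a volume has a
non-default network.ping-timeout, it should be listed under the reconfigured
options in the output of gluster volume info.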

> Pranith
>
>>
>> Has anyone noticed this before? The only way to bring it back was to:
>>
>> killall glusterfsd ; killall -9 glusterfsd ; killall glusterd ; glusterd
>>
>>
>> Thanks,
>> Andrew


