[Gluster-users] GlusterFS cluster peer stuck in state: Sent and Received peer request (Connected)

Atin Mukherjee atin.mukherjee83 at gmail.com
Tue Mar 22 14:27:16 UTC 2016


-Atin
Sent from one plus one
On 22-Mar-2016 6:54 pm, "tommy.yardley at baesystems.com" <
tommy.yardley at baesystems.com> wrote:
>
> Hi Atin,
>
> Setting 'state=3' on the instances and restarting the service seems to
> have fixed the problem.

Great!

>
> Is this an 'issue' with glusterfs?

No, it's not. As I said, the handshaking was incomplete, which led to this
issue. I'd also recommend that you upgrade to the latest gluster version,
i.e. 3.7.x.

>
> I will implement an automated solution for my problem, just would be good
> to know if this is something that will be patched in the future?
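
For the automated check, a minimal sketch of how the stuck peer could be
detected, assuming the 'gluster peer status' output format quoted further
down in this thread. The function names parse_peer_status and stuck_peers
are hypothetical, and the script only parses text that you pass to it, so
how you capture the command output is up to your provisioning script.

```python
import re


def parse_peer_status(text):
    """Parse `gluster peer status` output into a list of peer dicts.

    Assumes each peer is reported as Hostname:, Uuid: and State: lines,
    as in the output quoted in this thread.
    """
    peers = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Hostname:"):
            # A Hostname line starts a new peer entry.
            current = {"hostname": line.split(":", 1)[1].strip()}
            peers.append(current)
        elif current is not None and line.startswith("Uuid:"):
            current["uuid"] = line.split(":", 1)[1].strip()
        elif current is not None and line.startswith("State:"):
            # e.g. "State: Peer in Cluster (Connected)"
            match = re.match(r"State:\s*(.*?)\s*\((\w+)\)\s*$", line)
            if match:
                current["state"] = match.group(1)
                current["connection"] = match.group(2)
    return peers


def stuck_peers(peers):
    """Return the peers whose state is anything other than 'Peer in Cluster'."""
    return [p for p in peers if p.get("state") != "Peer in Cluster"]
```

If stuck_peers returns a non-empty list after probing, the script can then
apply the state=3 workaround on the affected nodes and restart glusterd.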
>
> Thanks,
> Tommy
>
> -----Original Message-----
> From: Atin Mukherjee [mailto:amukherj at redhat.com]
> Sent: 22 March 2016 13:10
> To: Yardley, Tommy (UK Guildford); gluster-users at gluster.org
> Subject: Re: [Gluster-users] GlusterFS cluster peer stuck in state: Sent
> and Received peer request (Connected)
>
> Tommy,
>
> It seems that there were frequent disconnect events, which may have
> caused the peer handshaking to remain incomplete, leading to an
> inconsistency in the cluster state.
>
> Further follow-up questions:
>
> 1. Does restarting the glusterd instances not solve the problem?
>
> 2. If restarting doesn't help, can we try setting state=3 in all the
> /var/lib/glusterd/peers/<UUID> files and then restarting glusterd to see
> whether the problem persists?
>
> If the above still doesn't solve the problem, the output of 'cat
> /var/lib/glusterd/peers/*' from all the nodes should help us figure
> out the correct workaround.
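
Step 2 could be scripted roughly as follows. This is only a sketch: it
assumes each peer file is plain key=value text containing a line that
starts with 'state=', and the function name set_peer_state is hypothetical.
It would need to run as root against /var/lib/glusterd/peers, ideally with
glusterd stopped and a copy of the directory kept elsewhere, and glusterd
has to be restarted afterwards.

```python
from pathlib import Path


def set_peer_state(peers_dir, new_state="3"):
    """Rewrite the state= line in every peer file under peers_dir.

    Returns the list of files that were actually changed.
    Assumes each peer file is key=value text with a 'state=' line.
    """
    changed = []
    for path in sorted(Path(peers_dir).iterdir()):
        if not path.is_file():
            continue
        lines = path.read_text().splitlines()
        updated = [
            f"state={new_state}" if line.startswith("state=") else line
            for line in lines
        ]
        if updated != lines:
            # Only touch files whose state actually differs.
            path.write_text("\n".join(updated) + "\n")
            changed.append(path)
    return changed


if __name__ == "__main__":
    for peer_file in set_peer_state("/var/lib/glusterd/peers"):
        print(f"updated {peer_file}")
```

Files already at state=3 are left untouched, so running it a second time
changes nothing.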
>
> ~Atin
>
> On 03/22/2016 02:51 PM, Atin Mukherjee wrote:
> > Gaurav is looking into it and he will get back with his analysis.
> >
> > ~Atin
> >
> > On 03/22/2016 02:42 PM, tommy.yardley at baesystems.com wrote:
> >> Hi,
> >>
> >> Is anyone able to help with this issue?
> >>
> >> Thanks,
> >> Tommy
> >>
> >> -----Original Message-----
> >> From: gluster-users-bounces at gluster.org
> >> [mailto:gluster-users-bounces at gluster.org] On Behalf Of
> >> tommy.yardley at baesystems.com
> >> Sent: 17 March 2016 08:49
> >> To: gluster-users at gluster.org
> >> Subject: Re: [Gluster-users] GlusterFS cluster peer stuck in state:
> >> Sent and Received peer request (Connected)
> >>
> >> Hi,
> >>
> >> Sorry, I had sent them directly to Atin.
> >>
> >> I've trimmed down the larger log files a bit and attached all of them
> >> to this email.
> >>
> >> Many thanks,
> >> Tommy
> >>
> >> -----Original Message-----
> >> From: Gaurav Garg [mailto:ggarg at redhat.com]
> >> Sent: 17 March 2016 07:07
> >> To: Yardley, Tommy (UK Guildford)
> >> Cc: gluster-users at gluster.org
> >> Subject: Re: [Gluster-users] GlusterFS cluster peer stuck in state:
> >> Sent and Received peer request (Connected)
> >>
> >>>>> I’ve sent the logs directly as they push this message over the size
> >>>>> limit.
> >>
> >> Where did you send the logs? I was not able to find them. Could you send
> >> the glusterd logs so that we can start analyzing this issue?
> >>
> >> Thanks,
> >>
> >> Regards,
> >> Gaurav
> >>
> >> ----- Original Message -----
> >> From: "Atin Mukherjee" <amukherj at redhat.com>
> >> To: "tommy yardley" <tommy.yardley at baesystems.com>,
> >> gluster-users at gluster.org
> >> Sent: Wednesday, March 16, 2016 5:49:05 PM
> >> Subject: Re: [Gluster-users] GlusterFS cluster peer stuck in state:
> >> Sent and Received peer request (Connected)
> >>
> >> I couldn't look into this today, sorry about that. I can only look
> >> into this case on Monday. Can anyone else take this up?
> >>
> >> ~Atin
> >>
> >> On 03/15/2016 09:57 PM, tommy.yardley at baesystems.com wrote:
> >>> Hi Atin,
> >>>
> >>>
> >>>
> >>> All nodes are running 3.5.8. The probe sequence is:
> >>>
> >>> 172.31.30.64
> >>> 172.31.27.27 (node having the issue)
> >>> 172.31.26.134 (node the peer probe is run on)
> >>> 172.31.19.46
> >>>
> >>>
> >>>
> >>> I’ve sent the logs directly as they push this message over the size
> >>> limit.
> >>>
> >>>
> >>>
> >>> look forward to your reply,
> >>>
> >>>
> >>>
> >>> Tommy
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> From: Atin Mukherjee [mailto:atin.mukherjee83 at gmail.com]
> >>> Sent: 15 March 2016 15:58
> >>> To: Yardley, Tommy (UK Guildford)
> >>> Cc: gluster-users at gluster.org
> >>> Subject: Re: [Gluster-users] GlusterFS cluster peer stuck in state:
> >>> Sent and Received peer request (Connected)
> >>>
> >>>
> >>>
> >>> This indicates the peer handshaking didn't go through properly and
> >>> your cluster is in an inconsistent state. Are you running version
> >>> 3.5.8 on all the nodes? Could you get me the glusterd log from all
> >>> the nodes and mention the peer probe sequence? I'll only be able to
> >>> look at it tomorrow and get back to you.
> >>>
> >>> -Atin
> >>> Sent from one plus one
> >>>
> >>> On 15-Mar-2016 9:16 pm, "tommy.yardley at baesystems.com"
> >>> <tommy.yardley at baesystems.com> wrote:
> >>>
> >>> Hi All,
> >>>
> >>>
> >>>
> >>> I’m running GlusterFS on a cluster hosted in AWS. I have a script
> >>> which provisions my instances and thus will set up GlusterFS
> >>> (specifically: glusterfs 3.5.8).
> >>>
> >>> My issue is that this only works ~50% of the time and the other 50%
> >>> of the time one of the peers will be ‘stuck’ in the following state:
> >>>
> >>> root at ip-xx-xx-xx-1:/home/ubuntu# gluster peer status
> >>> Number of Peers: 3
> >>>
> >>> Hostname: xx.xx.xx.2
> >>> Uuid: 3b4c1fb9-b325-4204-98fd-2eb739fa867f
> >>> State: Peer in Cluster (Connected)
> >>>
> >>> Hostname: xx.xx.xx.3
> >>> Uuid: acfc1794-9080-4eb0-8f69-3abe78bbee16
> >>> State: Sent and Received peer request (Connected)
> >>>
> >>> Hostname: xx.xx.xx.4
> >>> Uuid: af33463d-1b32-4ffb-a4f0-46ce16151e2f
> >>> State: Peer in Cluster (Connected)
> >>>
> >>>
> >>>
> >>> Running gluster peer status on the instance that is affected yields:
> >>>
> >>>
> >>>
> >>> root at ip-xx-xx-xx-3:/var/log/glusterfs# gluster peer status
> >>> Number of Peers: 1
> >>>
> >>> Hostname: xx.xx.xx.1
> >>> Uuid: c4f17e9a-893b-48f0-a014-1a05cca09d01
> >>> State: Peer is connected and Accepted (Connected)
> >>>
> >>> The connection status in parentheses ('Connected' in this case) will
> >>> fluctuate between ‘Connected’ and ‘Disconnected’.
> >>>
> >>>
> >>>
> >>> I have been unable to locate the cause of this issue. Has this been
> >>> encountered before, and if so is there a general fix? I haven’t been
> >>> able to find anything as of yet.
> >>>
> >>>
> >>>
> >>> Many thanks,
> >>>
> >>>
> >>>
> >>> *Tommy*
> >>>
> >>>
> >>>
> >>> Please consider the environment before printing this email. This
> >>> message should be regarded as confidential. If you have received
> >>> this email in error please notify the sender and destroy it immediately.
> >>> Statements of intent shall only become binding when confirmed in
> >>> hard copy by an authorised signatory. The contents of this email may
> >>> relate to dealings with other companies under the control of BAE
> >>> Systems Applied Intelligence Limited, details of which can be found
> >>> at http://www.baesystems.com/Businesses/index.htm.
> >>>
> >>>
> >>> _______________________________________________
> >>> Gluster-users mailing list
> >>> Gluster-users at gluster.org <mailto:Gluster-users at gluster.org>
> >>> http://www.gluster.org/mailman/listinfo/gluster-users
> >>>