[Bugs] [Bug 1051992] Peer stuck on "accepted peer request"

bugzilla at redhat.com bugzilla at redhat.com
Thu Apr 16 04:42:51 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1051992



--- Comment #4 from krishnan parthasarathi <kparthas at redhat.com> ---
Root cause analysis
--------------------

The following sequence of events leads to the issue observed.

Consider 4 nodes, namely A, B, C and D, that are to form a cluster.
- From A, probe B.
- After A and B are part of the cluster, say B goes offline.
- From A, probe C.
- From A, probe D.
- After C and D are part of the cluster, say B comes online.

At this point, C and D share their view of the cluster with B, as part of
glusterd's handshake algorithm. This is to ensure that the members' views of
the cluster are consistent. If this happens before A informs B of the addition
of C and D to the cluster, B rejects the requests from C and D as 'illegal'
(i.e., from outside the cluster). As a result, C and D see B in the "Accepted
Peer Request" state, due to a bug in the internal state machine transitions,
which did not anticipate this sequence of events.
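
The race described above can be reproduced with a small toy model. This is
only a sketch under stated assumptions, not glusterd's actual code: the Node
class, probe(), handshake() and run() are hypothetical names made up for
illustration, and the rule that a rejected handshake leaves the sender's
recorded state untouched is an assumed stand-in for the missing state machine
transition.

```python
# Toy model of the sequence of events; all names here are hypothetical.
ACCEPTED = "Accepted Peer Request"
IN_CLUSTER = "Peer in Cluster"


class Node:
    def __init__(self, name):
        self.name = name
        self.online = True
        self.members = {name}   # this node's view of the cluster
        self.peer_state = {}    # state this node records for each peer

    def learn(self, view):
        """Merge a cluster view; newly learnt peers start in ACCEPTED."""
        for peer in view - self.members:
            self.peer_state[peer] = ACCEPTED
        self.members |= view


def handshake(sender, receiver):
    """sender contacts receiver; receiver rejects senders it does not know."""
    if not receiver.online or sender.name not in receiver.members:
        # Assumed model of the bug: on rejection, the sender's recorded
        # state for the receiver is left as it was.
        return False
    sender.peer_state[receiver.name] = IN_CLUSTER
    return True


def probe(src, dst, nodes):
    """src adds dst and shares the new view with every online member."""
    view = src.members | {dst.name}
    for name in sorted(view):
        if nodes[name].online:
            nodes[name].learn(view)
    handshake(src, dst)
    handshake(dst, src)


def run(a_informs_b_first):
    """Replay the steps; return the state C and D record for B."""
    nodes = {name: Node(name) for name in "ABCD"}
    a, b, c, d = (nodes[name] for name in "ABCD")
    probe(a, b, nodes)
    b.online = False          # B goes offline
    probe(a, c, nodes)
    probe(a, d, nodes)
    b.online = True           # B comes back online
    if a_informs_b_first:
        b.learn(a.members)    # A tells B about C and D in time
    handshake(c, b)
    handshake(d, b)
    return {"C": c.peer_state["B"], "D": d.peer_state["B"]}


if __name__ == "__main__":
    print("C and D reach B first:", run(a_informs_b_first=False))
    print("A reaches B first:    ", run(a_informs_b_first=True))
```

In this model, the only difference between the two runs is whether B learns
about C and D from A before C and D contact it; in the first run, both C and D
are left holding B in the "Accepted Peer Request" state.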

Analogy
--------

Imagine 4 like-minded people, namely A, B, C and D, who register for a
conference. Only A and B make it and become friends. A meets C and D on a
different occasion, where B isn't present, and becomes friends with them. A
tells C and D about B. C and D, being their enthusiastic selves, introduce
themselves to B when A isn't present. B doesn't entertain C and D since she
doesn't know them. Later, A informs B about C and D, but by then it is too late.

N.B. This analogy is an aid to explain the internal algorithm at a high level.
Like all analogies, this is bound to break soon.
