[Bugs] [Bug 1262964] Cannot access volume when network down

bugzilla at redhat.com bugzilla at redhat.com
Wed Sep 16 16:17:16 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1262964

Ravishankar N <ravishankar at redhat.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|CLOSED                      |ASSIGNED
         Resolution|NOTABUG                     |---



--- Comment #16 from Ravishankar N <ravishankar at redhat.com> ---
(In reply to Huy VU from comment #15)
> (In reply to Ravishankar N from comment #14)
> > Hi VU,
> > 
> > Writing to the same file from the clients on both nodes where the node can
> > only see itself and not the other can result in split-brains. Is that not
> > what you did? If not I may have misunderstood the steps.
> 
> Ravi,
> I am sorry for not making the steps clearer.
> 
> I used vi to add a few lines to the file on one node while the NIC card of
> the other node was forced down. Then I brought the NIC card up. That was
> enough to cause split brain.

Ah, when you edit with vi, it creates a new swap file (with a different gfid)
and renames it over the original file. When node 2 comes back up, the file
should be healed from node 1. Instead, it is attempting a conservative merge,
which means some modification was made from the mount on node 2 while its eth0
was down. But you say that isn't the case. Let me look at the logs and figure
it out.
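
To illustrate the write-and-rename behaviour described above, here is a rough
local sketch. It is not GlusterFS code: the inode number stands in for the
gfid only as an analogy, and the helper name edit_via_rename is made up for
this example. It shows that replacing a file by renaming a freshly written
temporary file over it leaves the same path pointing at a different
underlying file.

```python
import os
import tempfile
from typing import Tuple


def edit_via_rename(path: str, new_text: str) -> Tuple[int, int]:
    """Replace `path` by writing a sibling temp file and renaming it over it.

    Returns (inode_before, inode_after). The inode is only a local analogy
    for a gfid; this is an assumption made for illustration.
    """
    inode_before = os.stat(path).st_ino

    # Create the temp file in the same directory so the rename stays on
    # one filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as tmp:
        tmp.write(new_text)

    # Rename the new file over the original path.
    os.replace(tmp_path, path)

    inode_after = os.stat(path).st_ino
    return inode_before, inode_after
```

Because the old and the new file exist side by side until the rename, the two
inode numbers differ, which is roughly why a replica that missed the rename
still holds a file with the old identity.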

> 
> I am also interested in knowing why there was a 30 second hang on both nodes
> when the NIC card was brought down.

When you brought the interface down, I'm guessing the mount on node 1 was not
notified immediately (unlike when the brick process is killed, in which case
the mount immediately gets a disconnect event for that brick). So it waits
until the network.ping-timeout value (42 seconds by default) expires.
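
The difference between the two cases can be sketched with a small toy model.
This is only a simplified illustration of the guess above, not how the client
is actually implemented; the class BrickConnection and its fields are
hypothetical names, and only the 42 second default comes from this thread.

```python
from dataclasses import dataclass


@dataclass
class BrickConnection:
    """Toy model of a mount's view of one brick (hypothetical, simplified)."""

    ping_timeout: float = 42.0      # seconds, default network.ping-timeout
    last_reply_at: float = 0.0      # time of the last reply from the brick
    disconnect_event: bool = False  # set when the brick process is killed

    def considered_down(self, now: float) -> bool:
        # An explicit disconnect event is acted on immediately.
        if self.disconnect_event:
            return True
        # A silent peer (e.g. interface down) is only given up on once
        # the ping timeout has elapsed since its last reply.
        return (now - self.last_reply_at) >= self.ping_timeout
```

In this model a killed brick is treated as down right away, while a brick
behind a downed interface keeps the mount waiting until the timeout elapses.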
> 
> NOTE: I tested directly on the two nodes. i.e. the vi command was run
> directly on node 1. I don't think this should have any bearing on the
> behaviour.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.

