[Gluster-users] One node goes offline, the other node loses its connection to its local Gluster volume

Chalcogen chalcogen_eg_oxygen@yahoo.com
Sun Feb 23 10:46:19 UTC 2014


I'm not from the glusterfs development team or anything, but I, too, 
started with glusterfs around the time frame you mention, and I also 
work with a twin-replicated setup just like yours.

When I do what you describe here on my setup, the command initially 
hangs on both servers for roughly the peer ping timeout interval 
(network.ping-timeout, which defaults to 42 seconds). After that it 
works.
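
If what you are hitting is this timeout, you can shorten it. A minimal 
sketch, assuming from your df output below that the volume is named 
firewall-scripts (adjust if yours is named differently):

    # Shorten the ping timeout from the 42-second default; the trade-off
    # is that clients drop the connection sooner on brief network blips.
    gluster volume set firewall-scripts network.ping-timeout 10

In 3.4 the changed option should then show up under "Options 
Reconfigured" in gluster volume info.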

If there are new bugs in this setup then I would be interested, in part 
because the stability of my product depends on this, too. Do you think 
you could share your gluster volume info and gluster volume status output?
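
Concretely, output from something like the following would help (again 
assuming the volume name is firewall-scripts):

    gluster volume info firewall-scripts
    gluster volume status firewall-scripts
    gluster peer status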

Also, what did heal info say before you performed this exercise?
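
That is, from either node, something like:

    gluster volume heal firewall-scripts info

and, if your build accepts it, also:

    gluster volume heal firewall-scripts info split-brain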

Thanks,
Anirban

On Sunday 23 February 2014 07:14 AM, Greg Scott wrote:
>
> We first went down this path back in July 2013 and now I'm back again 
> for more.  It's a similar situation but now with new versions of 
> everything.   I'm using glusterfs 3.4.2 with Fedora 20.
>
> I have 2 nodes named fw1 and fw2.  When I ifdown the NIC I'm using for 
> Gluster on either node, that node cannot see its Gluster volume, but 
> the other node can see it after a timeout.  As soon as I ifup that 
> NIC, everyone can see everything again.
>
> Is this expected behavior?  When that interconnect drops, I want both 
> nodes to see their own local copy and then sync everything back up 
> when the interconnect connects again.
>
> Here are details.  Node fw1 has an XFS filesystem named gluster-fw1.  
> Node fw2 has an XFS filesystem named gluster-fw2.   Those are both 
> gluster bricks and both nodes mount the bricks as /firewall-scripts.  
> So anything one node does in /firewall-scripts should also be on the 
> other node within a few milliseconds.   The test is to isolate the 
> nodes from each other and see if they can still access their own local 
> copy of /firewall-scripts.  The easiest way to do this is to ifdown 
> the interconnect NIC.  But this doesn't work.
>
> Here is what happens when I ifdown the NIC on node fw1.  Node fw2 can 
> see /firewall-scripts but fw1 shows an error.  When I ifdown on fw2, 
> the behavior is identical, but swapping fw1 and fw2.
>
> On fw1, after an ifdown, I lose connection with my Gluster filesystem.
>
> [root@stylmark-fw1 firewall-scripts]# ifdown enp5s4
> [root@stylmark-fw1 firewall-scripts]# ls /firewall-scripts
> ls: cannot access /firewall-scripts: Transport endpoint is not connected
> [root@stylmark-fw1 firewall-scripts]# df -h
> df: '/firewall-scripts': Transport endpoint is not connected
> Filesystem                       Size  Used Avail Use% Mounted on
> /dev/mapper/fedora-root           17G  2.2G   14G  14% /
> devtmpfs                         989M     0  989M   0% /dev
> tmpfs                            996M     0  996M   0% /dev/shm
> tmpfs                            996M  564K  996M   1% /run
> tmpfs                            996M     0  996M   0% /sys/fs/cgroup
> tmpfs                            996M     0  996M   0% /tmp
> /dev/sda2                        477M   87M  362M  20% /boot
> /dev/sda1                        200M  9.6M  191M   5% /boot/efi
> /dev/mapper/fedora-gluster--fw1  9.8G   33M  9.8G   1% /gluster-fw1
> 10.10.10.2:/fwmaster             214G   75G  128G  37% /mnt/fwmaster
> [root@stylmark-fw1 firewall-scripts]#
>
> But on fw2, I can still look at it:
>
> [root@stylmark-fw2 ~]# ls /firewall-scripts
> allow-all           failover-monitor.sh  rcfirewall.conf
> allow-all-with-nat  initial_rc.firewall  start-failover-monitor.sh
> etc                 rc.firewall          var
> [root@stylmark-fw2 ~]# df -h
> Filesystem                       Size  Used Avail Use% Mounted on
> /dev/mapper/fedora-root           17G  2.3G   14G  14% /
> devtmpfs                         989M     0  989M   0% /dev
> tmpfs                            996M     0  996M   0% /dev/shm
> tmpfs                            996M  560K  996M   1% /run
> tmpfs                            996M     0  996M   0% /sys/fs/cgroup
> tmpfs                            996M     0  996M   0% /tmp
> /dev/sda2                        477M   87M  362M  20% /boot
> /dev/sda1                        200M  9.6M  191M   5% /boot/efi
> /dev/mapper/fedora-gluster--fw2  9.8G   33M  9.8G   1% /gluster-fw2
> 192.168.253.2:/firewall-scripts  9.8G   33M  9.8G   1% /firewall-scripts
> 10.10.10.2:/fwmaster             214G   75G  128G  37% /mnt/fwmaster
> [root@stylmark-fw2 ~]#
>
> And back to fw1 -- after an ifup, I can see it again:
>
> [root@stylmark-fw1 firewall-scripts]# ifup enp5s4
> [root@stylmark-fw1 firewall-scripts]# ls /firewall-scripts
> allow-all           failover-monitor.sh  rcfirewall.conf
> allow-all-with-nat  initial_rc.firewall  start-failover-monitor.sh
> etc                 rc.firewall          var
> [root@stylmark-fw1 firewall-scripts]# df -h
> Filesystem                       Size  Used Avail Use% Mounted on
> /dev/mapper/fedora-root           17G  2.2G   14G  14% /
> devtmpfs                         989M     0  989M   0% /dev
> tmpfs                            996M     0  996M   0% /dev/shm
> tmpfs                            996M  564K  996M   1% /run
> tmpfs                            996M     0  996M   0% /sys/fs/cgroup
> tmpfs                            996M     0  996M   0% /tmp
> /dev/sda2                        477M   87M  362M  20% /boot
> /dev/sda1                        200M  9.6M  191M   5% /boot/efi
> /dev/mapper/fedora-gluster--fw1  9.8G   33M  9.8G   1% /gluster-fw1
> 192.168.253.1:/firewall-scripts  9.8G   33M  9.8G   1% /firewall-scripts
> 10.10.10.2:/fwmaster             214G   75G  128G  37% /mnt/fwmaster
> [root@stylmark-fw1 firewall-scripts]#
>
> What can I do about this?
>
> Thanks
>
> -Greg Scott
>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
