[Bugs] [Bug 1168080] All the bricks on one of the nodes go offline and don't come back up when one node is shut down and the other node is rebooted in a 2X2 gluster volume.

bugzilla at redhat.com bugzilla at redhat.com
Wed Nov 26 04:19:54 UTC 2014


https://bugzilla.redhat.com/show_bug.cgi?id=1168080



--- Comment #1 from Poornima G <pgurusid at redhat.com> ---
+++ This bug was initially created as a clone of Bug #1164222 +++

Description of problem:
****************************
On a 2 node cluster with a 2X2 volume, when one node is brought down (shutdown)
and the other node is rebooted, the bricks on the rebooted node go offline and
never come back up.

Version-Release number of selected component (if applicable):
[root@rhsauto026 bricks]# rpm -qa | grep glusterfs
glusterfs-api-3.6.0.29-3.el6rhs.x86_64
glusterfs-geo-replication-3.6.0.29-3.el6rhs.x86_64
glusterfs-libs-3.6.0.29-3.el6rhs.x86_64
glusterfs-cli-3.6.0.29-3.el6rhs.x86_64
glusterfs-rdma-3.6.0.29-3.el6rhs.x86_64
glusterfs-3.6.0.29-3.el6rhs.x86_64
glusterfs-fuse-3.6.0.29-3.el6rhs.x86_64
glusterfs-server-3.6.0.29-3.el6rhs.x86_64
samba-glusterfs-3.6.509-169.1.el6rhs.x86_64

How reproducible:
Tried twice

Steps to Reproduce:
1. Create a 2X2 volume on a 2 node cluster (a command sketch follows these steps).
2. Shutdown node 1, reboot node 2.
3. Check the volume status once node 2 comes back up.
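
A rough command sketch of the reproduction steps (the host names "node1"/"node2"
and the volume name "testvol" are illustrative, not taken from the setup below):

# on node 1, with node 2 already probed into the trusted pool
gluster volume create testvol replica 2 \
    node1:/rhs/brick1/testvol/b1 node2:/rhs/brick1/testvol/b2 \
    node1:/rhs/brick1/testvol/b3 node2:/rhs/brick1/testvol/b4
gluster volume start testvol

# power off node 1, reboot node 2, then on node 2 once it is back up:
gluster volume status testvol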

Actual results:
********************
Once the rebooted node comes up, the bricks on this node are offline.


Expected results:
***********************
Once the rebooted node comes up, the bricks on this node should come back online.

[root@rhsauto025 /]# gluster vol status
Status of volume: gluster-vol
Gluster process                                     Port    Online  Pid
------------------------------------------------------------------------------
Brick 10.70.37.0:/rhs/brick1/gluster-vol/b1         49152   Y       3973
Brick 10.70.37.1:/rhs/brick1/gluster-vol/b2         49152   Y       3721
Brick 10.70.37.0:/rhs/brick1/gluster-vol/b3         49153   Y       3984
Brick 10.70.37.1:/rhs/brick1/gluster-vol/b4         49153   Y       3732
NFS Server on localhost                             2049    Y       3999
Self-heal Daemon on localhost                       N/A     Y       4007
NFS Server on 10.70.37.1                            2049    Y       3746
Self-heal Daemon on 10.70.37.1                      N/A     Y       3754

Task Status of Volume gluster-vol
------------------------------------------------------------------------------


Volume Name: gluster-vol
Type: Distributed-Replicate
Volume ID: 5843bd43-10ad-4b10-a210-69d2b015dd60
Status: Started
Snap Volume: no
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.70.37.0:/rhs/brick1/gluster-vol/b1
Brick2: 10.70.37.1:/rhs/brick1/gluster-vol/b2
Brick3: 10.70.37.0:/rhs/brick1/gluster-vol/b3
Brick4: 10.70.37.1:/rhs/brick1/gluster-vol/b4
Options Reconfigured:
performance.readdir-ahead: on
auto-delete: disable
snap-max-soft-limit: 90
snap-max-hard-limit: 256
[root@rhsauto026 bricks]# chkconfig glusterd --list
glusterd           0:off    1:off    2:on    3:on    4:on    5:on    6:off
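
glusterd is enabled for runlevels 2-5, so it should start automatically after the
reboot; one way to confirm the daemon itself came up on the rebooted node is the
standard init script status check (command shown for reference only, its output
was not captured as part of this report):

# service glusterd status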

Once the rebooted node came up:


[root@rhsauto026 ~]# gluster vol status
Status of volume: gluster-vol
Gluster process                                     Port    Online  Pid
------------------------------------------------------------------------------
Brick 10.70.37.1:/rhs/brick1/gluster-vol/b2         N/A     N       N/A
Brick 10.70.37.1:/rhs/brick1/gluster-vol/b4         N/A     N       N/A
NFS Server on localhost                             N/A     N       N/A
Self-heal Daemon on localhost                       N/A     N       N/A

Task Status of Volume gluster-vol
------------------------------------------------------------------------------
There are no active volume tasks
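
A possible manual workaround, not verified as part of this report, would be to
force-start the volume so that glusterd re-spawns the offline brick processes,
and then re-check the status:

# gluster volume start gluster-vol force
# gluster volume status gluster-vol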
