[Bugs] [Bug 1402172] New: Peer unexpectedly disconnected

bugzilla@redhat.com
Wed Dec 7 00:15:15 UTC 2016


https://bugzilla.redhat.com/show_bug.cgi?id=1402172

            Bug ID: 1402172
           Summary: Peer unexpectedly disconnected
           Product: GlusterFS
           Version: 3.8
         Component: glusterd
          Assignee: bugs@gluster.org
          Reporter: dgallowa@redhat.com
                CC: bugs@gluster.org



Description of problem:
I have a 1 x (2 + 1) = 3 volume (replica 2 plus an arbiter brick).
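
For context, a volume with this layout is typically created along these lines
(a sketch, not the exact command I ran; brick paths taken from the status
output below):

[root@store01 ~]# gluster volume create shardvol1 replica 3 arbiter 1 \
    store01:/srv/gluster/shardbrick1 \
    store02:/srv/gluster/shardbrick1 \
    store03:/srv/gluster/shardbrick1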

One host in the cluster (store01) reports that one of the peers (store02) is
down, but the other two hosts show no problems. Note that the store02 brick is
missing entirely from store01's volume status below.

[root@store01 ~]# gluster volume status shardvol1
Status of volume: shardvol1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick store01:/srv/gluster/shardbrick1      49153     0          Y       23039
Brick store03:/srv/gluster/shardbrick1      49153     0          Y       7120 
NFS Server on localhost                     N/A       N/A        N       N/A  
Self-heal Daemon on localhost               N/A       N/A        Y       27426
NFS Server on store03                       N/A       N/A        N       N/A  
Self-heal Daemon on store03                 N/A       N/A        Y       31088

[root@store02 ~]# gluster volume status shardvol1
Status of volume: shardvol1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick store01:/srv/gluster/shardbrick1      49153     0          Y       23039
Brick store02:/srv/gluster/shardbrick1      49153     0          Y       5660 
Brick store03:/srv/gluster/shardbrick1      49153     0          Y       7120 
NFS Server on localhost                     N/A       N/A        N       N/A  
Self-heal Daemon on localhost               N/A       N/A        Y       31843
NFS Server on store03                       N/A       N/A        N       N/A  
Self-heal Daemon on store03                 N/A       N/A        Y       31088
NFS Server on store01                       N/A       N/A        N       N/A  
Self-heal Daemon on store01                 N/A       N/A        Y       27426

[root@store03 ~]# gluster volume status shardvol1
Status of volume: shardvol1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick store01:/srv/gluster/shardbrick1      49153     0          Y       23039
Brick store02:/srv/gluster/shardbrick1      49153     0          Y       5660 
Brick store03:/srv/gluster/shardbrick1      49153     0          Y       7120 
NFS Server on localhost                     N/A       N/A        N       N/A  
Self-heal Daemon on localhost               N/A       N/A        Y       31088
NFS Server on store01                       N/A       N/A        N       N/A  
Self-heal Daemon on store01                 N/A       N/A        Y       27426
NFS Server on store02                       N/A       N/A        N       N/A  
Self-heal Daemon on store02                 N/A       N/A        Y       31843
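
To make the discrepancy easier to compare, something like the following
(hypothetical; assumes passwordless ssh between the hosts) diffs each host's
view of the brick list:

for h in store01 store02 store03; do
    echo "== $h =="
    ssh "$h" "gluster volume status shardvol1 | grep '^Brick'"
done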

[root@store01 ~]# gluster peer status
Number of Peers: 2

Hostname: store02
Uuid: 97d6abee-f2ac-47d8-bb96-738ffb99b38f
State: Peer in Cluster (Disconnected)

Hostname: store03
Uuid: 2cbf1e99-4fb7-410d-b1f1-e385000b20ec
State: Peer in Cluster (Connected)

[root@store02 ~]# gluster peer status
Number of Peers: 2

Hostname: store01
Uuid: 7a931ab9-6075-4fd4-868d-9deeb91295c0
State: Peer in Cluster (Connected)

Hostname: store03
Uuid: 2cbf1e99-4fb7-410d-b1f1-e385000b20ec
State: Peer in Cluster (Connected)

[root@store03 ~]# gluster peer status
Number of Peers: 2

Hostname: store01
Uuid: 7a931ab9-6075-4fd4-868d-9deeb91295c0
State: Peer in Cluster (Connected)

Hostname: store02
Uuid: 97d6abee-f2ac-47d8-bb96-738ffb99b38f
State: Peer in Cluster (Connected)
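
In case it's useful, the persisted peer record and management-port
reachability can be checked like so (a sketch; if I read glusterd right,
state=3 in the peer file corresponds to "Peer in Cluster", and glusterd
listens on TCP 24007):

[root@store01 ~]# cat /var/lib/glusterd/peers/97d6abee-f2ac-47d8-bb96-738ffb99b38f
[root@store01 ~]# nc -zv store02 24007
[root@store02 ~]# nc -zv store01 24007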


Version-Release number of selected component (if applicable):
glusterfs 3.8.6

How reproducible:
Probably not very

Steps to Reproduce:
1. Create a 1 x (2 + 1) = 3 (arbiter) volume
2. ???

Additional info:
I've uploaded gluster sosreports from each of the hosts here:
http://drop.ceph.com/qa/dgalloway/

As far as I can tell, store02 started showing as disconnected from store01
today.  No changes (iptables or otherwise) were made to the gluster hosts.  I
attempted a 'service glusterd restart' on store01 and store02 without
improvement.
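
If it helps narrow things down, grepping the glusterd log on store01 and
store02 for disconnect/handshake messages around that time should be
informative (the file is glusterd.log or etc-glusterfs-glusterd.vol.log under
/var/log/glusterfs/, depending on the build):

grep -iE 'disconnect|handshake' /var/log/glusterfs/glusterd.log | tail -n 20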
