[Bugs] [Bug 1749625] New: [GlusterFS 6.1] GlusterFS client process crash

bugzilla at redhat.com
Fri Sep 6 03:06:05 UTC 2019


https://bugzilla.redhat.com/show_bug.cgi?id=1749625

            Bug ID: 1749625
           Summary: [GlusterFS 6.1] GlusterFS client process crash
           Product: GlusterFS
           Version: 6
          Hardware: x86_64
                OS: Linux
            Status: NEW
         Component: glusterd
          Severity: urgent
          Assignee: bugs at gluster.org
          Reporter: joe.chan at neuralt.com
                CC: bugs at gluster.org
  Target Milestone: ---
    Classification: Community



Created attachment 1612173
  --> https://bugzilla.redhat.com/attachment.cgi?id=1612173&action=edit
bricks log file

Description of problem:
The gluster client somehow crashes after running for a period (1~2 weeks). I
would like to know whether this is a bug in GlusterFS 6.1 or some other
problem.

First, GlusterFS was installed using the RPMs from
http://mirror.centos.org/centos/7/storage/x86_64/gluster-4.0/

client rpm:
glusterfs-6.1-1.el7.x86_64.rpm

server rpm:
glusterfs-server-6.1-1.el7.x86_64.rpm
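
For reference, the installed versions can be double-checked on each node with
something like:

$ rpm -qa | grep glusterfs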

Below is the configuration of the GlusterFS volume on our servers (output of
"gluster volume info"):

Volume Name: k8s-volume
Type: Replicate
Volume ID: d5d673d6-a1bb-4d14-bc91-1ceab7ad761d
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: hk2hp057:/gfs/k8s_pv
Brick2: hk2hp058:/gfs/k8s_pv
Brick3: hk2hp059:/gfs/k8s_pv
Options Reconfigured:
auth.reject: 172.20.117.144
auth.allow: 172.20.117.*
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: on


As shown above, there are three servers in total using the "Replicated Volume"
type. Three replicas were created in pursuit of better reliability and data
redundancy, in line with the official recommendation.
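
For completeness, a volume like this would typically have been created with
commands along the following lines (the exact invocation used originally may
have differed):

$ gluster volume create k8s-volume replica 3 \
      hk2hp057:/gfs/k8s_pv hk2hp058:/gfs/k8s_pv hk2hp059:/gfs/k8s_pv
$ gluster volume set k8s-volume auth.allow '172.20.117.*'
$ gluster volume set k8s-volume auth.reject 172.20.117.144
$ gluster volume start k8s-volume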


Version-Release number of selected component (if applicable):
glusterfs-6.1-1.el7.x86_64 (client), glusterfs-server-6.1-1.el7.x86_64 (server)

However, the gluster brick process goes down after running for a period. This
is easy to see by executing the "gluster" command - $ gluster volume status

Status of volume: k8s-volume
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick hk2hp057:/gfs/k8s_pv                  49152     0          Y       25141
Brick hk2hp058:/gfs/k8s_pv                  N/A       N/A        N       N/A
Brick hk2hp059:/gfs/k8s_pv                  49152     0          Y       6031
Self-heal Daemon on localhost               N/A       N/A        Y       6065
Self-heal Daemon on hk2hp057                N/A       N/A        Y       25150
Self-heal Daemon on hk2hp059                N/A       N/A        Y       6048

Task Status of Volume k8s-volume
------------------------------------------------------------------------------
There are no active volume tasks

The brick on "hk2hp058" is offline.
When I check the log file under "/var/log/glusterfs/bricks/", the messages are
as below:

[2019-09-04 07:33:50.004661] W [MSGID: 113117]
[posix-metadata.c:627:posix_set_ctime] 0-k8s-volume-posix: posix set mdata
failed, No ctime :
/gfs/k8s_pv/.glusterfs/dd/56/dd56eca4-e7e8-4208-b589-763a428408e1
gfid:dd56eca4-e7e8-4208-b589-763a428408e1
[2019-09-04 07:33:50.004776] W [MSGID: 113117]
[posix-metadata.c:627:posix_set_ctime] 0-k8s-volume-posix: posix set mdata
failed, No ctime :
/gfs/k8s_pv/.glusterfs/dd/56/dd56eca4-e7e8-4208-b589-763a428408e1
gfid:dd56eca4-e7e8-4208-b589-763a428408e1
pending frames:
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash:
2019-09-04 10:53:02
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 6.1
/lib64/libglusterfs.so.0(+0x26db0)[0x7fa1fa2d1db0]
/lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7fa1fa2dc7b4]
/lib64/libc.so.6(+0x36340)[0x7fa1f8911340]
/usr/lib64/glusterfs/6.1/rpc-transport/socket.so(+0xa4cc)[0x7fa1ee6964cc]
/lib64/libglusterfs.so.0(+0x8c286)[0x7fa1fa337286]
/lib64/libpthread.so.0(+0x7dd5)[0x7fa1f9111dd5]
/lib64/libc.so.6(clone+0x6d)[0x7fa1f89d902d]
---------
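
For what it's worth, the crashing frames can be resolved to source locations
if the matching glusterfs-debuginfo package is installed, e.g. (offsets taken
from the backtrace above):

$ addr2line -f -e /usr/lib64/glusterfs/6.1/rpc-transport/socket.so 0xa4cc
$ addr2line -f -e /lib64/libglusterfs.so.0 0x8c286

If a core dump was captured, gdb can give a fuller picture (the core path
here is a placeholder):

$ gdb /usr/sbin/glusterfsd /path/to/core -batch -ex 'bt full'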

The glusterfs process resumes after manually restarting the "glusterd"
service:
[root at hk2hp058 /var/log/glusterfs/bricks] systemctl restart glusterd
[root at hk2hp058 /var/log/glusterfs/bricks] gluster volume status
Status of volume: k8s-volume
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick hk2hp057:/gfs/k8s_pv                  49152     0          Y       25141
Brick hk2hp058:/gfs/k8s_pv                  49152     0          Y       18814
Brick hk2hp059:/gfs/k8s_pv                  49152     0          Y       6031
Self-heal Daemon on localhost               N/A       N/A        Y       18846
Self-heal Daemon on hk2hp059                N/A       N/A        Y       6048
Self-heal Daemon on hk2hp057                N/A       N/A        Y       25150

Task Status of Volume k8s-volume
------------------------------------------------------------------------------
There are no active volume tasks
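
Since the brick was offline for some time, it is probably also worth
confirming that self-heal has caught up after the restart, e.g.:

$ gluster volume heal k8s-volume info
$ gluster volume heal k8s-volume info summary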

Thus, is this issue caused by a bug, or is there some other reason behind it?
