[Bugs] [Bug 1739884] New: glusterfsd process crashes with SIGSEGV

bugzilla at redhat.com bugzilla at redhat.com
Sun Aug 11 15:55:35 UTC 2019


https://bugzilla.redhat.com/show_bug.cgi?id=1739884

            Bug ID: 1739884
           Summary: glusterfsd process crashes with SIGSEGV
           Product: GlusterFS
           Version: 6
          Hardware: x86_64
                OS: Linux
            Status: NEW
         Component: transport
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: cfeller at gmail.com
                CC: bugs at gluster.org
  Target Milestone: ---
    Classification: Community



Description of problem:
A glusterfsd process crashed with SIGSEGV during a normal workload.

Version-Release number of selected component (if applicable):
glusterfs-6.4-1.el7.x86_64
glusterfs-server-6.4-1.el7.x86_64

How reproducible:
Seldom, but twice in less than 24 hours.

This Gluster setup had been running reliably for several weeks, then crashed
twice for the same reason in less than 24 hours.
(I captured the data from the first crash; the second crash happened before I
had filed a bug report for the first one.)

Steps to Reproduce:
Unknown.


Additional info:
This is a two-node cluster. Both nodes sit behind the same switch and are
connected via 10GE fiber SFPs.

First crash:

###########################
# journalctl -u glusterd
-- Logs begin at Thu 2019-08-08 05:48:10 PDT, end at Thu 2019-08-08 17:40:01
PDT. --
Aug 08 05:48:31 gluster00 systemd[1]: Starting GlusterFS, a clustered
file-system server...
Aug 08 05:48:31 gluster00 systemd[1]: Started GlusterFS, a clustered
file-system server.
Aug 08 17:05:00 gluster00 export-brick0-srv[6563]: pending frames:
Aug 08 17:05:00 gluster00 export-brick0-srv[6563]: frame : type(1) op(XATTROP)
Aug 08 17:05:00 gluster00 export-brick0-srv[6563]: frame : type(1) op(INODELK)
Aug 08 17:05:00 gluster00 export-brick0-srv[6563]: frame : type(1) op(XATTROP)
Aug 08 17:05:00 gluster00 export-brick0-srv[6563]: frame : type(1) op(FXATTROP)
Aug 08 17:05:00 gluster00 export-brick0-srv[6563]: frame : type(1) op(TRUNCATE)
Aug 08 17:05:00 gluster00 export-brick0-srv[6563]: patchset:
git://git.gluster.org/glusterfs.git
Aug 08 17:05:00 gluster00 export-brick0-srv[6563]: signal received: 11
Aug 08 17:05:00 gluster00 export-brick0-srv[6563]: time of crash:
Aug 08 17:05:00 gluster00 export-brick0-srv[6563]: 2019-08-09 00:05:00
Aug 08 17:05:00 gluster00 export-brick0-srv[6563]: configuration details:
Aug 08 17:05:00 gluster00 export-brick0-srv[6563]: argp 1
Aug 08 17:05:00 gluster00 export-brick0-srv[6563]: backtrace 1
Aug 08 17:05:00 gluster00 export-brick0-srv[6563]: dlfcn 1
Aug 08 17:05:00 gluster00 export-brick0-srv[6563]: libpthread 1
Aug 08 17:05:00 gluster00 export-brick0-srv[6563]: llistxattr 1
Aug 08 17:05:00 gluster00 export-brick0-srv[6563]: setfsid 1
Aug 08 17:05:00 gluster00 export-brick0-srv[6563]: spinlock 1
Aug 08 17:05:00 gluster00 export-brick0-srv[6563]: epoll.h 1
Aug 08 17:05:00 gluster00 export-brick0-srv[6563]: xattr.h 1
Aug 08 17:05:00 gluster00 export-brick0-srv[6563]: st_atim.tv_nsec 1
Aug 08 17:05:00 gluster00 export-brick0-srv[6563]: package-string: glusterfs
6.4
Aug 08 17:05:00 gluster00 export-brick0-srv[6563]: ---------
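
Note on the timestamps: the journal prefix is in local time (PDT), while the
"time of crash" value printed by the process appears to be UTC. A minimal
sketch, assuming a fixed UTC-7 offset for PDT and the year 2019 (the journal
prefix carries no year), shows that the two values refer to the same instant.
The helper name journal_to_utc is hypothetical and only used for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: PDT is a fixed UTC-7 offset for the dates in this report.
PDT = timezone(timedelta(hours=-7), "PDT")


def journal_to_utc(stamp, year=2019):
    """Convert a journalctl-style 'Mon DD HH:MM:SS' local stamp to UTC."""
    local = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
    return local.replace(tzinfo=PDT).astimezone(timezone.utc)


crash_utc = journal_to_utc("Aug 08 17:05:00")
print(crash_utc.strftime("%Y-%m-%d %H:%M:%S"))
```

The printed value can then be compared directly against the "time of crash"
line and against the bracketed timestamps in the brick log.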

###########################
# log from first node
[2019-08-09 00:04:58.959057] I [MSGID: 115036] [server.c:499:server_rpc_notify]
0-gv0-server: disconnecting connection from
CTX_ID:783468e6-b1c9-461d-861a-469c4aba45a6-GRAPH_ID:0-PID:30955-HOST:gluster01-PC_NAME:gv0-client-0-RECON_NO:-0
[2019-08-09 00:04:58.959250] I [MSGID: 101055] [client_t.c:436:gf_client_unref]
0-gv0-server: Shutting down connection
CTX_ID:783468e6-b1c9-461d-861a-469c4aba45a6-GRAPH_ID:0-PID:30955-HOST:gluster01-PC_NAME:gv0-client-0-RECON_NO:-0
[2019-08-09 00:04:58.992241] I [addr.c:54:compare_addr_and_update]
0-/export/brick0/srv: allowed = "*", received addr = "192.168.0.21"             
[2019-08-09 00:04:58.992294] I [login.c:110:gf_auth] 0-auth/login: allowed user
names: ad85e1b1-89f4-44ba-b098-b941f0b0a0bb                                     
[2019-08-09 00:04:58.992304] I [MSGID: 115029]
[server-handshake.c:550:server_setvolume] 0-gv0-server: accepted client from
CTX_ID:f0c57ea3-fd1d-433a-985c-e6e3dfa014f1-GRAPH_ID:0-PID:30974-HOST:gluster01-PC_NAME:gv0-client-0-RECON_NO:-0
(version: 6.4) with subvol /export/brick0/srv
[2019-08-09 00:05:00.953515] E [MSGID: 101064]
[event-epoll.c:618:event_dispatch_epoll_handler] 0-epoll: generation mismatch
on idx=5, gen=9316, slot->gen=9317, slot->fd=19                                 
[2019-08-09 00:05:00.970841] E [socket.c:1303:socket_event_poll_err]
(-->/lib64/libglusterfs.so.0(+0x8b4d6) [0x7fb437d884d6]
-->/usr/lib64/glusterfs/6.4/rpc-transport/socket.so(+0xa48a) [0x7fb42c0e848a]
-->/usr/lib64/glusterfs/6.4/rpc-transport/socket.so(+0x81fc) [0x7fb42c0e61fc] )
0-socket: invalid argument: this->private [Invalid argument]                    
pending frames:
frame : type(1) op(XATTROP)
frame : type(1) op(INODELK)
frame : type(1) op(XATTROP)
frame : type(1) op(FXATTROP)
frame : type(1) op(TRUNCATE)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash:
2019-08-09 00:05:00
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 6.4
/lib64/libglusterfs.so.0(+0x26e00)[0x7fb437d23e00]
/lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7fb437d2e804]
/lib64/libc.so.6(+0x36340)[0x7fb436363340]
/usr/lib64/glusterfs/6.4/rpc-transport/socket.so(+0xa4cc)[0x7fb42c0e84cc]
/lib64/libglusterfs.so.0(+0x8b4d6)[0x7fb437d884d6]
/lib64/libpthread.so.0(+0x7dd5)[0x7fb436b63dd5]
/lib64/libc.so.6(clone+0x6d)[0x7fb43642b02d]
---------

###########################
# log from second node (crashed ~5 minutes later)
[2019-08-09 00:09:34.882722] I [MSGID: 115036] [server.c:499:server_rpc_notify]
0-gv0-server: disconnecting connection from
CTX_ID:3e3f26ce-f682-4631-a058-11fd08414c81-GRAPH_ID:0-PID:31527-HOST:gluster01-PC_NAME:gv0-client-1-RECON_NO:-0
[2019-08-09 00:09:34.882878] I [MSGID: 101055] [client_t.c:436:gf_client_unref]
0-gv0-server: Shutting down connection
CTX_ID:3e3f26ce-f682-4631-a058-11fd08414c81-GRAPH_ID:0-PID:31527-HOST:gluster01-PC_NAME:gv0-client-1-RECON_NO:-0
[2019-08-09 00:09:39.916899] I [addr.c:54:compare_addr_and_update]
0-/export/brick0/srv: allowed = "*", received addr = "192.168.0.21"             
[2019-08-09 00:09:39.916929] I [login.c:110:gf_auth] 0-auth/login: allowed user
names: ad85e1b1-89f4-44ba-b098-b941f0b0a0bb                                     
[2019-08-09 00:09:39.916946] I [MSGID: 115029]
[server-handshake.c:550:server_setvolume] 0-gv0-server: accepted client from
CTX_ID:1401c14a-b2f7-421a-89a1-9acfdaffeda0-GRAPH_ID:0-PID:31660-HOST:gluster01-PC_NAME:gv0-client-1-RECON_NO:-0
(version: 6.4) with subvol /export/brick0/srv
pending frames:
frame : type(1) op(LOOKUP)
frame : type(1) op(OPEN)
frame : type(1) op(READ)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash:
2019-08-09 00:10:34
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 6.4
/lib64/libglusterfs.so.0(+0x26e00)[0x7f50f2b20e00]
/lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7f50f2b2b804]
/lib64/libc.so.6(+0x36340)[0x7f50f1160340]
/usr/lib64/glusterfs/6.4/rpc-transport/socket.so(+0xa4cc)[0x7f50e6ee54cc]
/lib64/libglusterfs.so.0(+0x8b4d6)[0x7f50f2b854d6]
/lib64/libpthread.so.0(+0x7dd5)[0x7f50f1960dd5]
/lib64/libc.so.6(clone+0x6d)[0x7f50f122802d]
---------
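
The raw stack lines in both crash logs have the form
module(symbol+offset)[address]. A minimal sketch, assuming that exact layout,
for pulling out the module path and offset so that unsymbolized frames such as
socket.so(+0xa4cc) could later be resolved with a symbolizer once matching
debuginfo is installed. The names FRAME_RE and parse_frames are hypothetical
and not part of GlusterFS.

```python
import re

# Matches lines like: /path/to/lib.so(symbol+0x1f)[0x7f...]
# The symbol part may be empty, as in socket.so(+0xa4cc)[...].
FRAME_RE = re.compile(
    r"^(?P<module>/\S+?)\((?P<symbol>[^+)]*)\+(?P<offset>0x[0-9a-f]+)\)"
    r"\[(?P<addr>0x[0-9a-f]+)\]$"
)


def parse_frames(text):
    """Return (module, symbol or None, offset) for each stack line in text.

    When symbol is None the offset is relative to the module; otherwise it
    is relative to the named symbol.
    """
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            frames.append((m["module"], m["symbol"] or None, int(m["offset"], 16)))
    return frames


log = """\
/lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7fb437d2e804]
/usr/lib64/glusterfs/6.4/rpc-transport/socket.so(+0xa4cc)[0x7fb42c0e84cc]
"""
for module, symbol, offset in parse_frames(log):
    print(module, symbol, hex(offset))
```

Both nodes report the same module-relative offset (+0xa4cc) in socket.so for
the faulting frame, which is consistent with the two crashes having the same
cause.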

###########################
# backtrace in core dump on first node:
(gdb) bt
#0  0x00007fb42c0e84cc in socket_event_handler () from
/usr/lib64/glusterfs/6.4/rpc-transport/socket.so
#1  0x00007fb437d884d6 in event_dispatch_epoll_worker () from
/lib64/libglusterfs.so.0
#2  0x00007fb436b63dd5 in start_thread (arg=0x7fb413fff700) at
pthread_create.c:307
#3  0x00007fb43642b02d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb) 

###########################
# backtrace in core dump on second node:
(gdb) bt
#0  0x00007f50e6ee54cc in socket_event_handler () from
/usr/lib64/glusterfs/6.4/rpc-transport/socket.so
#1  0x00007f50f2b854d6 in event_dispatch_epoll_worker () from
/lib64/libglusterfs.so.0
#2  0x00007f50f1960dd5 in start_thread (arg=0x7f50dd365700) at
pthread_create.c:307
#3  0x00007f50f122802d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb)
