[Bugs] [Bug 1272436] New: glusterd crashing

bugzilla at redhat.com bugzilla at redhat.com
Fri Oct 16 12:05:28 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1272436

            Bug ID: 1272436
           Summary: glusterd crashing
           Product: GlusterFS
           Version: 3.7.4
         Component: glusterd
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: gliverma at westga.edu
                CC: bugs at gluster.org, gluster-bugs at redhat.com



Description of problem:
glusterd is crashing, as described in the thread at
https://www.gluster.org/pipermail/gluster-users/2015-October/023783.html
Community members looked at the core dump and said it looks like memory
corruption detected by glibc. Vijay Bellur requested that a bug report be
opened.

Version-Release number of selected component (if applicable):
# rpm -qa | grep gluster
glusterfs-geo-replication-3.7.4-2.el6.x86_64
glusterfs-client-xlators-3.7.4-2.el6.x86_64
glusterfs-3.7.4-2.el6.x86_64
glusterfs-libs-3.7.4-2.el6.x86_64
glusterfs-api-3.7.4-2.el6.x86_64
glusterfs-fuse-3.7.4-2.el6.x86_64
glusterfs-server-3.7.4-2.el6.x86_64
glusterfs-cli-3.7.4-2.el6.x86_64

How reproducible:
It has core dumped on multiple nodes multiple times.

Steps to Reproduce:
Not sure how to reproduce.

Actual results:
glusterd crashes (aborts and dumps core)

Expected results:
glusterd keeps running

Additional info:
# gluster volume info
Volume Name: gv0
Type: Replicate
Volume ID: fc50d049-cebe-4a3f-82a6-748847226099
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: eapps-gluster01.uwg.westga.edu:/export/sdb1/gv0
Brick2: eapps-gluster02.uwg.westga.edu:/export/sdb1/gv0
Brick3: eapps-gluster03.uwg.westga.edu:/export/sdb1/gv0
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
nfs.drc: off

# gluster volume status
Status of volume: gv0
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick eapps-gluster01.uwg.westga.edu:/expor
t/sdb1/gv0                                  49152     0          Y       36149
Brick eapps-gluster02.uwg.westga.edu:/expor
t/sdb1/gv0                                  49152     0          Y       24797
Brick eapps-gluster03.uwg.westga.edu:/expor
t/sdb1/gv0                                  N/A       N/A        N       N/A
NFS Server on localhost                     2049      0          Y       26812
Self-heal Daemon on localhost               N/A       N/A        Y       26820
NFS Server on eapps-gluster03.uwg.westga.ed
u                                           2049      0          Y       47314
Self-heal Daemon on eapps-gluster03.uwg.wes
tga.edu                                     N/A       N/A        Y       47322
NFS Server on eapps-gluster02.uwg.westga.ed
u                                           2049      0          Y       52522
Self-heal Daemon on eapps-gluster02.uwg.wes
tga.edu                                     N/A       N/A        Y       52535

Task Status of Volume gv0
------------------------------------------------------------------------------
There are no active volume tasks
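
The status table above wraps long brick and host names onto a second line,
which makes offline entries easy to miss. Below is a minimal Python sketch for
listing the rows whose Online column is N. The function name
offline_processes and the column layout (TCP port, RDMA port, Online, Pid as
the last four fields) are assumptions taken from the output pasted in this
report, not a documented format.

```python
import re

# Trailing columns of a status row: TCP port, RDMA port, Online flag, PID.
# Assumption: these are always the last four whitespace-separated fields.
ROW_TAIL = re.compile(r"\s+(\S+)\s+(\S+)\s+([YN])\s+(\S+)\s*$")

# Lines that are headers, separators, or task-status text rather than rows.
SKIP_PREFIXES = ("Status of", "Gluster process", "---", "Task Status", "There are")


def offline_processes(status_text):
    """Return the names of processes whose Online column is 'N'."""
    offline = []
    pending = ""  # first half of a name that was wrapped onto two lines
    for line in status_text.splitlines():
        if not line.strip() or line.startswith(SKIP_PREFIXES):
            pending = ""
            continue
        match = ROW_TAIL.search(line)
        if match is None:
            # No port/online/pid columns: this is the wrapped start of a name.
            pending += line.strip()
            continue
        name = pending + line[:match.start()].strip()
        pending = ""
        if match.group(3) == "N":
            offline.append(name)
    return offline
```

Applied to the output above, this reports one offline entry, the brick on
eapps-gluster03 (/export/sdb1/gv0).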

# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.7 (Santiago)


Core dump info requested in the thread:

The output of both requested gdb backtrace commands is below:

Core was generated by `/usr/sbin/glusterd --pid-file=/var/run/glusterd.pid'.
Program terminated with signal 6, Aborted.
#0  0x0000003b91432625 in raise (sig=<value optimized out>) at
../nptl/sysdeps/unix/sysv/linux/raise.c:64
64        return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);



(gdb) bt
#0  0x0000003b91432625 in raise (sig=<value optimized out>) at
../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x0000003b91433e05 in abort () at abort.c:92
#2  0x0000003b91470537 in __libc_message (do_abort=2, fmt=0x3b915588c0 "***
glibc detected *** %s: %s: 0x%s ***\n") at
../sysdeps/unix/sysv/linux/libc_fatal.c:198
#3  0x0000003b91475f4e in malloc_printerr (action=3, str=0x3b9155687d
"corrupted double-linked list", ptr=<value optimized out>, ar_ptr=<value
optimized out>) at malloc.c:6350
#4  0x0000003b914763d3 in malloc_consolidate (av=0x7fee90000020) at
malloc.c:5216
#5  0x0000003b91479c28 in _int_malloc (av=0x7fee90000020, bytes=<value
optimized out>) at malloc.c:4415
#6  0x0000003b9147a7ed in __libc_calloc (n=<value optimized out>,
elem_size=<value optimized out>) at malloc.c:4093
#7  0x0000003b9345c81f in __gf_calloc (nmemb=<value optimized out>, size=<value
optimized out>, type=59, typestr=0x7fee9ed2d708 "gf_common_mt_rpc_trans_t") at
mem-pool.c:117
#8  0x00007fee9ed2830b in socket_server_event_handler (fd=<value optimized
out>, idx=<value optimized out>, data=0xf3eca0, poll_in=1, poll_out=<value
optimized out>,
    poll_err=<value optimized out>) at socket.c:2622
#9  0x0000003b9348b0a0 in event_dispatch_epoll_handler (data=0xf408b0) at
event-epoll.c:575
#10 event_dispatch_epoll_worker (data=0xf408b0) at event-epoll.c:678
#11 0x0000003b91807a51 in start_thread (arg=0x7fee9db3b700) at
pthread_create.c:301
#12 0x0000003b914e893d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:115




(gdb) t a a bt

Thread 9 (Thread 0x7fee9e53c700 (LWP 37122)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at
../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:183
#1  0x00007fee9fffcf93 in hooks_worker (args=<value optimized out>) at
glusterd-hooks.c:534
#2  0x0000003b91807a51 in start_thread (arg=0x7fee9e53c700) at
pthread_create.c:301
#3  0x0000003b914e893d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 8 (Thread 0x7feea0c99700 (LWP 36996)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at
../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:239
#1  0x0000003b9346cbdb in syncenv_task (proc=0xefa8c0) at syncop.c:607
#2  0x0000003b93472cb0 in syncenv_processor (thdata=0xefa8c0) at syncop.c:699
#3  0x0000003b91807a51 in start_thread (arg=0x7feea0c99700) at
pthread_create.c:301
#4  0x0000003b914e893d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 7 (Thread 0x7feea209b700 (LWP 36994)):
#0  do_sigwait (set=<value optimized out>, sig=0x7feea209ae5c) at
../sysdeps/unix/sysv/linux/sigwait.c:65
#1  __sigwait (set=<value optimized out>, sig=0x7feea209ae5c) at
../sysdeps/unix/sysv/linux/sigwait.c:100
#2  0x0000000000405dfb in glusterfs_sigwaiter (arg=<value optimized out>) at
glusterfsd.c:1989
#3  0x0000003b91807a51 in start_thread (arg=0x7feea209b700) at
pthread_create.c:301
#4  0x0000003b914e893d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 6 (Thread 0x7feea2a9c700 (LWP 36993)):
#0  0x0000003b9180efbd in nanosleep () at ../sysdeps/unix/syscall-template.S:82
#1  0x0000003b934473ea in gf_timer_proc (ctx=0xecc010) at timer.c:205
#2  0x0000003b91807a51 in start_thread (arg=0x7feea2a9c700) at
pthread_create.c:301
#3  0x0000003b914e893d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 5 (Thread 0x7feea9e04740 (LWP 36992)):
#0  0x0000003b918082ad in pthread_join (threadid=140662814254848,
thread_return=0x0) at pthread_join.c:89
#1  0x0000003b9348ab4d in event_dispatch_epoll (event_pool=0xeeb5b0) at
event-epoll.c:762
#2  0x0000000000407b24 in main (argc=2, argv=0x7fff5294adc8) at
glusterfsd.c:2333

Thread 4 (Thread 0x7feea169a700 (LWP 36995)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at
../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:239
#1  0x0000003b9346cbdb in syncenv_task (proc=0xefa500) at syncop.c:607
#2  0x0000003b93472cb0 in syncenv_processor (thdata=0xefa500) at syncop.c:699
#3  0x0000003b91807a51 in start_thread (arg=0x7feea169a700) at
pthread_create.c:301
#4  0x0000003b914e893d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 3 (Thread 0x7fee9d13a700 (LWP 37124)):
#0  0x0000003b914e8f33 in epoll_wait () at
../sysdeps/unix/syscall-template.S:82
#1  0x0000003b9348aed1 in event_dispatch_epoll_worker (data=0xf405b0) at
event-epoll.c:668
#2  0x0000003b91807a51 in start_thread (arg=0x7fee9d13a700) at
pthread_create.c:301
#3  0x0000003b914e893d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 2 (Thread 0x7fee97fff700 (LWP 37125)):
#0  0x0000003b914e8f33 in epoll_wait () at
../sysdeps/unix/syscall-template.S:82
#1  0x0000003b9348aed1 in event_dispatch_epoll_worker (data=0xf6b4d0) at
event-epoll.c:668
#2  0x0000003b91807a51 in start_thread (arg=0x7fee97fff700) at
pthread_create.c:301
#3  0x0000003b914e893d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 1 (Thread 0x7fee9db3b700 (LWP 37123)):
#0  0x0000003b91432625 in raise (sig=<value optimized out>) at
../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x0000003b91433e05 in abort () at abort.c:92
#2  0x0000003b91470537 in __libc_message (do_abort=2, fmt=0x3b915588c0 "***
glibc detected *** %s: %s: 0x%s ***\n") at
../sysdeps/unix/sysv/linux/libc_fatal.c:198
#3  0x0000003b91475f4e in malloc_printerr (action=3, str=0x3b9155687d
"corrupted double-linked list", ptr=<value optimized out>, ar_ptr=<value
optimized out>) at malloc.c:6350
#4  0x0000003b914763d3 in malloc_consolidate (av=0x7fee90000020) at
malloc.c:5216
#5  0x0000003b91479c28 in _int_malloc (av=0x7fee90000020, bytes=<value
optimized out>) at malloc.c:4415
#6  0x0000003b9147a7ed in __libc_calloc (n=<value optimized out>,
elem_size=<value optimized out>) at malloc.c:4093
#7  0x0000003b9345c81f in __gf_calloc (nmemb=<value optimized out>, size=<value
optimized out>, type=59, typestr=0x7fee9ed2d708 "gf_common_mt_rpc_trans_t") at
mem-pool.c:117
#8  0x00007fee9ed2830b in socket_server_event_handler (fd=<value optimized
out>, idx=<value optimized out>, data=0xf3eca0, poll_in=1, poll_out=<value
optimized out>,
    poll_err=<value optimized out>) at socket.c:2622
#9  0x0000003b9348b0a0 in event_dispatch_epoll_handler (data=0xf408b0) at
event-epoll.c:575
#10 event_dispatch_epoll_worker (data=0xf408b0) at event-epoll.c:678
#11 0x0000003b91807a51 in start_thread (arg=0x7fee9db3b700) at
pthread_create.c:301
#12 0x0000003b914e893d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:115

-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.

