[Gluster-users] Gluster server crashes with signal 11 after probing peers.

Ernie Dunbar maillist at lightspeed.ca
Thu Mar 31 17:48:57 UTC 2016


Oops, I replied only to Mohammed instead of the whole list. Here are the
backtrace (bt) and the full backtrace (bt full):

root@nfs3:/home/ernied# gdb /usr/sbin/glusterd /core
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/sbin/glusterd...(no debugging symbols found)...done.

warning: core file may not match specified executable file.
[New LWP 1519]
[New LWP 1520]
[New LWP 1516]
[New LWP 1780]
[New LWP 1518]
[New LWP 1517]
[New LWP 1781]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/sbin/glusterd -p /var/run/glusterd.pid'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  pthread_spin_lock () at ../nptl/sysdeps/x86_64/pthread_spin_lock.S:24
24	../nptl/sysdeps/x86_64/pthread_spin_lock.S: No such file or directory.

(gdb) bt

#0  pthread_spin_lock () at ../nptl/sysdeps/x86_64/pthread_spin_lock.S:24
#1  0x00007fb81dee520d in __gf_free () from /usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#2  0x00007fb81deaa625 in data_destroy () from /usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#3  0x00007fb81dead1cd in dict_get_str () from /usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#4  0x00007fb8193f52f9 in glusterd_xfer_cli_probe_resp ()
    from /usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so
#5  0x00007fb8193f6017 in __glusterd_handle_cli_probe ()
    from /usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so
#6  0x00007fb8193ee9a0 in glusterd_big_locked_handler ()
    from /usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so
#7  0x00007fb81def38d2 in synctask_wrap () from /usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#8  0x00007fb81d2c38b0 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#9  0x0000000000000000 in ?? ()

(gdb) bt full

#0  pthread_spin_lock () at ../nptl/sysdeps/x86_64/pthread_spin_lock.S:24
No locals.
#1  0x00007fb81dee520d in __gf_free () from /usr/lib/x86_64-linux-gnu/libglusterfs.so.0
No symbol table info available.
#2  0x00007fb81deaa625 in data_destroy () from /usr/lib/x86_64-linux-gnu/libglusterfs.so.0
No symbol table info available.
#3  0x00007fb81dead1cd in dict_get_str () from /usr/lib/x86_64-linux-gnu/libglusterfs.so.0
No symbol table info available.
#4  0x00007fb8193f52f9 in glusterd_xfer_cli_probe_resp ()
    from /usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so
No symbol table info available.
#5  0x00007fb8193f6017 in __glusterd_handle_cli_probe ()
    from /usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so
No symbol table info available.
#6  0x00007fb8193ee9a0 in glusterd_big_locked_handler ()
    from /usr/lib/x86_64-linux-gnu/glusterfs/3.7.9/xlator/mgmt/glusterd.so
No symbol table info available.
#7  0x00007fb81def38d2 in synctask_wrap () from /usr/lib/x86_64-linux-gnu/libglusterfs.so.0
No symbol table info available.
#8  0x00007fb81d2c38b0 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#9  0x0000000000000000 in ?? ()
No symbol table info available.
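
For anyone who wants to compare this trace against other reports, here is a rough Python sketch that re-joins frames wrapped by a mail client and pulls out the frame number, function name and library or source location. The names FRAME_RE, join_wrapped and parse_frames are made up for this illustration and are not part of gdb or GlusterFS; the sample is simply the first four frames of the trace above.

```python
import re

# Hypothetical helper, not part of gdb or GlusterFS.
# Matches one gdb frame once wrapped lines have been joined, e.g.
#   "#1  0x00007fb81dee520d in __gf_free () from /usr/lib/.../libglusterfs.so.0"
#   "#0  pthread_spin_lock () at ../nptl/sysdeps/x86_64/pthread_spin_lock.S:24"
FRAME_RE = re.compile(
    r"^#(?P<num>\d+)\s+"
    r"(?:(?P<addr>0x[0-9a-fA-F]+)\s+in\s+)?"
    r"(?P<func>\S+)\s+\(.*?\)"
    r"(?:\s+(?:from|at)\s+(?P<where>\S+))?"
)


def join_wrapped(text):
    """Glue continuation lines back onto the frame line that precedes them."""
    frames = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("#"):
            frames.append(stripped)
        elif frames:
            frames[-1] += " " + stripped
    return frames


def parse_frames(text):
    """Return a list of (frame number, function, location or None) tuples."""
    parsed = []
    for frame in join_wrapped(text):
        match = FRAME_RE.match(frame)
        if match:
            parsed.append(
                (int(match.group("num")), match.group("func"), match.group("where"))
            )
    return parsed


# First four frames of the backtrace from this message, wrapped as mailed.
SAMPLE = """\
#0  pthread_spin_lock () at
../nptl/sysdeps/x86_64/pthread_spin_lock.S:24
#1  0x00007fb81dee520d in __gf_free () from
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#2  0x00007fb81deaa625 in data_destroy () from
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#3  0x00007fb81dead1cd in dict_get_str () from
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0
"""

if __name__ == "__main__":
    for num, func, where in parse_frames(SAMPLE):
        print(num, func, where)
```

Run on the whole trace, it lists the call path from synctask_wrap down to pthread_spin_lock. Without debugging symbols it cannot say anything about arguments or locals.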


On 2016-03-30 23:15, Mohammed Rafi K C wrote:
> Hi Ernie,
> 
> Can you please paste the back trace from the core file.
> 
> Regards
> Rafi KC
> 
> On 03/31/2016 02:31 AM, Ernie Dunbar wrote:
>> Hi everyone.
>> 
>> I'm trying to add a new Gluster node to our cluster, and when I try
>> to probe the first node in the cluster, the new node crashes with
>> the following report (logs start when the daemon starts):
>> 
>> ---------
>> [2016-03-30 20:32:05.191659] I [MSGID: 100030]
>> [glusterfsd.c:2332:main] 0-/usr/sbin/glusterd: Started running
>> /usr/sbin/glusterd version 3.7.9 (args: /usr/sbin/glusterd -p
>> /var/run/glusterd.pid)
>> [2016-03-30 20:32:05.195695] I [MSGID: 106478] [glusterd.c:1337:init]
>> 0-management: Maximum allowed open file descriptors set to 65536
>> [2016-03-30 20:32:05.195752] I [MSGID: 106479] [glusterd.c:1386:init]
>> 0-management: Using /var/lib/glusterd as working directory
>> [2016-03-30 20:32:05.200609] W [MSGID: 103071]
>> [rdma.c:4594:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event
>> channel creation failed [No such device]
>> [2016-03-30 20:32:05.200648] W [MSGID: 103055] [rdma.c:4901:init]
>> 0-rdma.management: Failed to initialize IB Device
>> [2016-03-30 20:32:05.200662] W
>> [rpc-transport.c:359:rpc_transport_load] 0-rpc-transport: 'rdma'
>> initialization failed
>> [2016-03-30 20:32:05.200723] W [rpcsvc.c:1597:rpcsvc_transport_create]
>> 0-rpc-service: cannot create listener, initing the transport failed
>> [2016-03-30 20:32:05.200743] E [MSGID: 106243] [glusterd.c:1610:init]
>> 0-management: creation of 1 listeners failed, continuing with
>> succeeded transport
>> [2016-03-30 20:32:07.135310] I [MSGID: 106513]
>> [glusterd-store.c:2062:glusterd_restore_op_version] 0-glusterd:
>> retrieved op-version: 30501
>> [2016-03-30 20:32:07.135775] I [MSGID: 106498]
>> [glusterd-handler.c:3640:glusterd_friend_add_from_peerinfo]
>> 0-management: connect returned 0
>> [2016-03-30 20:32:07.135876] I
>> [rpc-clnt.c:984:rpc_clnt_connection_init] 0-management: setting
>> frame-timeout to 600
>> [2016-03-30 20:32:07.136651] W [socket.c:870:__socket_keepalive]
>> 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 13, Invalid
>> argument
>> [2016-03-30 20:32:07.136673] E [socket.c:2966:socket_connect]
>> 0-management: Failed to set keep-alive: Invalid argument
>> [2016-03-30 20:32:07.136908] I [MSGID: 106194]
>> [glusterd-store.c:3523:glusterd_store_retrieve_missed_snaps_list]
>> 0-management: No missed snaps list.
>> Final graph:
>> +------------------------------------------------------------------------------+
>> 
>>   1: volume management
>>   2:     type mgmt/glusterd
>>   3:     option rpc-auth.auth-glusterfs on
>>   4:     option rpc-auth.auth-unix on
>>   5:     option rpc-auth.auth-null on
>>   6:     option rpc-auth-allow-insecure on
>>   7:     option transport.socket.listen-backlog 128
>>   8:     option event-threads 1
>>   9:     option ping-timeout 0
>>  10:     option transport.socket.read-fail-log off
>>  11:     option transport.socket.keepalive-interval 2
>>  12:     option transport.socket.keepalive-time 10
>>  13:     option transport-type rdma
>>  14:     option working-directory /var/lib/glusterd
>>  15: end-volume
>>  16:
>> +------------------------------------------------------------------------------+
>> 
>> [2016-03-30 20:32:07.138287] I [MSGID: 101190]
>> [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started
>> thread with index 1
>> [2016-03-30 20:32:07.138980] I [MSGID: 106544]
>> [glusterd.c:159:glusterd_uuid_init] 0-management: retrieved UUID:
>> ae191e96-9cd6-4e2b-acae-18f2cc45e6ed
>> [2016-03-30 20:32:07.139422] I [MSGID: 106163]
>> [glusterd-handshake.c:1194:__glusterd_mgmt_hndsk_versions_ack]
>> 0-management: using the op-version 30501
>> [2016-03-30 20:32:14.394056] I [MSGID: 106487]
>> [glusterd-handler.c:1239:__glusterd_handle_cli_probe] 0-glusterd:
>> Received CLI probe req nfs1 24007
>> pending frames:
>> frame : type(0) op(0)
>> patchset: git://git.gluster.com/glusterfs.git
>> signal received: 11
>> time of crash:
>> 2016-03-30 20:32:14
>> configuration details:
>> argp 1
>> backtrace 1
>> dlfcn 1
>> libpthread 1
>> llistxattr 1
>> setfsid 1
>> spinlock 1
>> epoll.h 1
>> xattr.h 1
>> st_atim.tv_nsec 1
>> package-string: glusterfs 3.7.9
>> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_msg_backtrace_nomem+0x92)[0x7f0401a78562]
>> 
>> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(gf_print_trace+0x31d)[0x7f0401a9464d]
>> 
>> /lib/x86_64-linux-gnu/libc.so.6(+0x36d40)[0x7f0400e76d40]
>> /lib/x86_64-linux-gnu/libpthread.so.0(pthread_spin_lock+0x0)[0x7f04012120f0]
>> 
>> ---------
>> 
>> 
>> Both nodes are running GlusterFS 3.7.9 on Ubuntu Trusty Tahr (14.04
>> LTS). Node 1 is running Linux kernel 3.13.0-55-generic #94-Ubuntu SMP,
>> and node 3 is running Linux kernel 3.13.0-77-generic #121-Ubuntu SMP.
>> To me, this seems to be the only difference between the systems,
>> although the new node has the very latest version of the Gluster
>> packages from the launchpad.net PPA. I would imagine that Node 1 has
>> the same update, but it's hard to tell.
>> 
>> Any help would be much appreciated.
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users
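
On the question of whether both nodes really carry the same Gluster packages, one way to check is to run `dpkg-query -W -f='${Package} ${Version}\n' 'glusterfs*'` on each node and compare the output. The Python sketch below does that comparison. The function names and the version strings in the samples are invented for illustration; they are not the actual package versions on nfs1 or nfs3.

```python
def parse_versions(dpkg_output):
    """Turn 'package version' lines into a dict; skip lines without a version."""
    versions = {}
    for line in dpkg_output.splitlines():
        parts = line.split()
        if len(parts) == 2:
            versions[parts[0]] = parts[1]
    return versions


def diff_versions(node_a_output, node_b_output):
    """Return {package: (version on A, version on B)} for packages that differ."""
    a = parse_versions(node_a_output)
    b = parse_versions(node_b_output)
    return {
        pkg: (a.get(pkg), b.get(pkg))
        for pkg in sorted(set(a) | set(b))
        if a.get(pkg) != b.get(pkg)
    }


# Hypothetical sample output; the real strings come from dpkg-query on each node.
NODE1_SAMPLE = """\
glusterfs-client 3.7.9-ubuntu1~trusty1
glusterfs-common 3.7.9-ubuntu1~trusty1
glusterfs-server 3.7.9-ubuntu1~trusty1
"""
NODE3_SAMPLE = """\
glusterfs-client 3.7.9-ubuntu1~trusty2
glusterfs-common 3.7.9-ubuntu1~trusty1
glusterfs-server 3.7.9-ubuntu1~trusty2
"""

if __name__ == "__main__":
    for pkg, (v1, v3) in diff_versions(NODE1_SAMPLE, NODE3_SAMPLE).items():
        print(pkg, v1, v3)
```

An empty result would mean the two nodes report identical versions for every glusterfs package, which would rule out a packaging mismatch as the difference between them.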
