[Bugs] [Bug 1447523] New: Glusterd segmentation fault while running peer probe

bugzilla at redhat.com bugzilla at redhat.com
Wed May 3 05:22:15 UTC 2017


https://bugzilla.redhat.com/show_bug.cgi?id=1447523

            Bug ID: 1447523
           Summary: Glusterd segmentation fault while running peer probe
           Product: GlusterFS
           Version: 3.8
         Component: glusterd
          Severity: urgent
          Assignee: bugs at gluster.org
          Reporter: ben at apcera.com
                CC: bugs at gluster.org



Description of problem:

Issuing a peer probe causes glusterd to crash with a segmentation fault. After
the crash, glusterd only starts again once the peer's entry has been removed
from /var/lib/glusterd/peers. Probing the peer again reproduces the crash.

Problematic peer entry:
$ cat /var/lib/glusterd/peers/ip-10-0-50-25.us-west-1.compute.internal
uuid=00000000-0000-0000-0000-000000000000
state=0
hostname1=ip-10-0-50-25.us-west-1.compute.internal
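
Entries like the one above can be spotted with a short script. The following is
only a sketch: it parses the key=value files in a peers directory and reports
those whose uuid is missing, malformed, or all zeros. The function names are
illustrative and not part of GlusterFS; the default path is the one quoted in
this report.

```python
import uuid
from pathlib import Path


def parse_store_file(text):
    """Parse simple key=value lines (as in the peer entry above) into a dict."""
    entries = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep:  # skip blank or malformed lines
            entries[key] = value
    return entries


def has_null_uuid(entries):
    """True if the uuid key is missing, unparsable, or all zeros."""
    try:
        return uuid.UUID(entries.get("uuid", "")).int == 0
    except ValueError:
        return True


def find_suspect_peers(peers_dir="/var/lib/glusterd/peers"):
    """Return (file name, hostname1) pairs for peer files with a null uuid."""
    suspects = []
    root = Path(peers_dir)
    if not root.is_dir():
        return suspects
    for path in sorted(root.iterdir()):
        if path.is_file():
            entries = parse_store_file(path.read_text())
            if has_null_uuid(entries):
                suspects.append((path.name, entries.get("hostname1", "")))
    return suspects


if __name__ == "__main__":
    for name, host in find_suspect_peers():
        print(f"{name}: null uuid (hostname1={host})")
```

The script only reports suspect files. Removing an entry, as in the workaround
described above, is left as a manual step.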


Core was generated by `/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level=TRACE --log-buf-size=0'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  x86_64_fallback_frame_state (context=0x7ffe5d9a3b50, context=0x7ffe5d9a3b50, fs=0x7ffe5d9a3c40) at ./md-unwind-support.h:58
58      ./md-unwind-support.h: No such file or directory.
(gdb) bt
#0  x86_64_fallback_frame_state (context=0x7ffe5d9a3b50, context=0x7ffe5d9a3b50, fs=0x7ffe5d9a3c40) at ./md-unwind-support.h:58
#1  uw_frame_state_for (context=context@entry=0x7ffe5d9a3b50, fs=fs@entry=0x7ffe5d9a3c40) at ../../../src/libgcc/unwind-dw2.c:1253
#2  0x00007f6371b2f6d8 in _Unwind_Backtrace (trace=0x7f6378bc2440 <backtrace_helper>, trace_argument=0x7ffe5d9a3e00) at ../../../src/libgcc/unwind.inc:290
#3  0x00007f6378bc25b6 in __GI___backtrace (array=array@entry=0x7ffe5d9a3e40, size=size@entry=200) at ../sysdeps/x86_64/backtrace.c:109
#4  0x00007f63796f3f42 in _gf_msg_backtrace_nomem (level=level@entry=GF_LOG_ALERT, stacksize=stacksize@entry=200) at logging.c:1094
#5  0x00007f63796fd494 in gf_print_trace (signum=11, ctx=0x7f637a3ac010) at common-utils.c:737
#6  <signal handler called>
#7  0x00000001725cc6c8 in ?? ()
#8  0x0000000000000000 in ?? ()



Version-Release number of selected component (if applicable):

$ glusterd --version 
glusterfs 3.8.11

from package glusterfs-server 3.8.11-ubuntu1~trusty1

How reproducible:

Always (1:1)


Steps to Reproduce:
1. Install gluster on Ubuntu 14.04
2. sudo /usr/sbin/gluster --log-level=TRACE peer probe ip-10-0-50-25.us-west-1.compute.internal
   Connection failed. Please check if gluster daemon is operational.

Actual results:

Glusterd crashes on peer probe.

Expected results:

Glusterd should not crash on peer probe.


Additional info:

There is another issue that may be related: glusterd.info was not being
populated on its own. As a workaround I run 'gluster pool list', which triggers
glusterd to generate and store a UUID:

$ cat /var/lib/glusterd/glusterd.info
UUID=ad7b8337-ec4d-4917-ad6b-ca0e4d0eba42
operating-version=30800
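
To check whether the workaround is needed on a given node, a sketch like the
following could be used. It only tests whether the file contains a UUID= line
with a non-zero UUID. The helper name is hypothetical and the path is the one
shown above.

```python
import uuid


def info_has_uuid(text):
    """Return True if the text has a UUID= line holding a non-zero UUID."""
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key == "UUID":
            try:
                return uuid.UUID(value).int != 0
            except ValueError:
                return False
    return False


if __name__ == "__main__":
    try:
        with open("/var/lib/glusterd/glusterd.info") as fh:
            populated = info_has_uuid(fh.read())
    except OSError:
        # Missing or unreadable file is treated the same as an empty one.
        populated = False
    print("UUID present" if populated else "UUID missing")
```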

This looks a lot like https://bugzilla.redhat.com/show_bug.cgi?id=1293594

Crash loop:
$ sudo gdb glusterd
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from glusterd...Reading symbols from
/usr/lib/debug/.build-id/fa/703514fceaf89ef9f0626ee95d362c941cc158.debug...done.
done.
(gdb) r --debug -p /var/run/glusterd.pid 
Starting program: /usr/sbin/glusterd --debug -p /var/run/glusterd.pid
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
warning: the debug information found in
"/usr/lib/debug//lib/x86_64-linux-gnu/libuuid.so.1.3.0" does not match
"/lib/x86_64-linux-gnu/libuuid.so.1" (CRC mismatch).

warning: the debug information found in
"/usr/lib/debug/lib/x86_64-linux-gnu/libuuid.so.1.3.0" does not match
"/lib/x86_64-linux-gnu/libuuid.so.1" (CRC mismatch).

[2017-05-03 05:12:29.642538] I [MSGID: 100030] [glusterfsd.c:2454:main]
0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.8.11 (args:
/usr/sbin/glusterd --debug -p /var/run/glusterd.pid)
[2017-05-03 05:12:29.642706] D [MSGID: 0]
[glusterfsd.c:2072:glusterfs_pidfile_update] 0-glusterfsd: pidfile
/var/run/glusterd.pid updated with pid 2588
[2017-05-03 05:12:29.642760] D [logging.c:1791:__gf_log_inject_timer_event]
0-logging-infra: Starting timer now. Timeout = 120, current buf size = 5
[New Thread 0x7ffff4e19700 (LWP 2592)]
[New Thread 0x7ffff4618700 (LWP 2593)]
[New Thread 0x7ffff3e17700 (LWP 2594)]
[New Thread 0x7ffff3616700 (LWP 2595)]
[2017-05-03 05:12:29.647173] D [MSGID: 0] [glusterfsd.c:660:get_volfp]
0-glusterfsd: loading volume file /etc/glusterfs/glusterd.vol
[2017-05-03 05:12:29.758748] I [MSGID: 106478] [glusterd.c:1381:init]
0-management: Maximum allowed open file descriptors set to 65536
[2017-05-03 05:12:29.758849] I [MSGID: 106479] [glusterd.c:1430:init]
0-management: Using /var/lib/glusterd as working directory
[2017-05-03 05:12:29.758938] D [MSGID: 0]
[glusterd.c:408:glusterd_rpcsvc_options_build] 0-glusterd: listen-backlog
value: 128
[2017-05-03 05:12:29.759078] D [rpcsvc.c:2345:rpcsvc_init] 0-rpc-service: RPC
service inited.
[2017-05-03 05:12:29.759127] D [rpcsvc.c:1895:rpcsvc_program_register]
0-rpc-service: New program registered: GF-DUMP, Num: 123451501, Ver: 1, Port: 0
[2017-05-03 05:12:29.759184] D [rpc-transport.c:283:rpc_transport_load]
0-rpc-transport: attempt to load file
/usr/lib/x86_64-linux-gnu/glusterfs/3.8.11/rpc-transport/socket.so
[2017-05-03 05:12:29.780522] D [socket.c:3938:socket_init] 0-socket.management:
Configued transport.tcp-user-timeout=60
[2017-05-03 05:12:29.780610] D [socket.c:4021:socket_init] 0-socket.management:
SSL support on the I/O path is NOT enabled
[2017-05-03 05:12:29.780649] D [socket.c:4024:socket_init] 0-socket.management:
SSL support for glusterd is NOT enabled
[2017-05-03 05:12:29.780697] D [socket.c:4041:socket_init] 0-socket.management:
using system polling thread
[2017-05-03 05:12:29.780819] D [rpcsvc.c:1895:rpcsvc_program_register]
0-rpc-service: New program registered: GlusterD svc peer, Num: 1238437, Ver: 2,
Port: 0
[2017-05-03 05:12:29.780871] D [rpcsvc.c:1895:rpcsvc_program_register]
0-rpc-service: New program registered: GlusterD svc cli read-only, Num:
1238463, Ver: 2, Port: 0
[2017-05-03 05:12:29.780915] D [rpcsvc.c:1895:rpcsvc_program_register]
0-rpc-service: New program registered: GlusterD svc mgmt, Num: 1238433, Ver: 2,
Port: 0
[2017-05-03 05:12:29.780960] D [rpcsvc.c:1895:rpcsvc_program_register]
0-rpc-service: New program registered: GlusterD svc mgmt v3, Num: 1238433, Ver:
3, Port: 0
[2017-05-03 05:12:29.781004] D [rpcsvc.c:1895:rpcsvc_program_register]
0-rpc-service: New program registered: Gluster Portmap, Num: 34123456, Ver: 1,
Port: 0
[2017-05-03 05:12:29.781048] D [rpcsvc.c:1895:rpcsvc_program_register]
0-rpc-service: New program registered: Gluster Handshake, Num: 14398633, Ver:
2, Port: 0
[2017-05-03 05:12:29.781086] D [rpcsvc.c:1895:rpcsvc_program_register]
0-rpc-service: New program registered: Gluster MGMT Handshake, Num: 1239873,
Ver: 1, Port: 0
[2017-05-03 05:12:29.781145] D [rpcsvc.c:2345:rpcsvc_init] 0-rpc-service: RPC
service inited.
[2017-05-03 05:12:29.781173] D [rpcsvc.c:1895:rpcsvc_program_register]
0-rpc-service: New program registered: GF-DUMP, Num: 123451501, Ver: 1, Port: 0
[2017-05-03 05:12:29.781209] D [rpc-transport.c:283:rpc_transport_load]
0-rpc-transport: attempt to load file
/usr/lib/x86_64-linux-gnu/glusterfs/3.8.11/rpc-transport/socket.so
[2017-05-03 05:12:29.781254] D [socket.c:3887:socket_init] 0-socket.management:
disabling nodelay
[2017-05-03 05:12:29.781280] D [socket.c:3938:socket_init] 0-socket.management:
Configued transport.tcp-user-timeout=0
[2017-05-03 05:12:29.781304] D [socket.c:4021:socket_init] 0-socket.management:
SSL support on the I/O path is NOT enabled
[2017-05-03 05:12:29.781327] D [socket.c:4024:socket_init] 0-socket.management:
SSL support for glusterd is NOT enabled
[2017-05-03 05:12:29.781350] D [socket.c:4041:socket_init] 0-socket.management:
using system polling thread
[2017-05-03 05:12:29.781410] D [rpcsvc.c:1895:rpcsvc_program_register]
0-rpc-service: New program registered: GlusterD svc cli, Num: 1238463, Ver: 2,
Port: 0
[2017-05-03 05:12:29.781442] D [rpcsvc.c:1895:rpcsvc_program_register]
0-rpc-service: New program registered: Gluster Handshake (CLI Getspec), Num:
14398633, Ver: 2, Port: 0
[2017-05-03 05:12:29.781487] D [MSGID: 0]
[glusterd-utils.c:6384:glusterd_sm_tr_log_init] 0-glusterd: returning 0
[2017-05-03 05:12:29.781519] I [MSGID: 106060] [glusterd.c:1714:init]
0-management: base-port override: 49152
[2017-05-03 05:12:29.781547] D [MSGID: 0] [glusterd.c:1722:init] 0-management:
cannot get run-with-valgrind value
[2017-05-03 05:12:29.836271] D [MSGID: 0]
[glusterd.c:467:glusterd_check_gsync_present] 0-glusterd: Returning 0
[2017-05-03 05:12:29.836406] D [MSGID: 0]
[glusterd.c:603:glusterd_crt_georep_folders] 0-glusterd: Returning 0
[2017-05-03 05:12:31.196124] D [MSGID: 0] [store.c:420:gf_store_handle_new] 0-:
Returning 0
[2017-05-03 05:12:31.196227] D [MSGID: 0]
[store.c:438:gf_store_handle_retrieve] 0-: Returning 0
[2017-05-03 05:12:31.196281] D [MSGID: 0] [store.c:306:gf_store_retrieve_value]
0-: key operating-version found
[2017-05-03 05:12:31.196353] I [MSGID: 106513]
[glusterd-store.c:2098:glusterd_restore_op_version] 0-glusterd: retrieved
op-version: 30800
[2017-05-03 05:12:31.196421] D [MSGID: 0]
[glusterd-store.c:3190:glusterd_store_retrieve_volumes] 0-management: Returning
with 0
[2017-05-03 05:12:31.196489] D [MSGID: 0] [store.c:420:gf_store_handle_new] 0-:
Returning 0
[2017-05-03 05:12:31.196525] D [MSGID: 0]
[store.c:438:gf_store_handle_retrieve] 0-: Returning 0
[2017-05-03 05:12:31.196558] D [MSGID: 0] [store.c:500:gf_store_iter_new] 0-:
Returning with 0
[2017-05-03 05:12:31.196594] D [MSGID: 0] [store.c:613:gf_store_iter_get_next]
0-: Returning with 0
[2017-05-03 05:12:31.196625] D [MSGID: 0]
[glusterd-utils.c:6384:glusterd_sm_tr_log_init] 0-glusterd: returning 0
[2017-05-03 05:12:31.196664] D [MSGID: 0] [store.c:613:gf_store_iter_get_next]
0-: Returning with 0
[2017-05-03 05:12:31.196698] D [logging.c:1953:_gf_msg_internal]
0-logging-infra: Buffer overflow of a buffer whose size limit is 5. About to
flush least recently used log message to disk
[2017-05-03 05:12:31.196694] D [MSGID: 0] [store.c:613:gf_store_iter_get_next]
0-: Returning with 0
[2017-05-03 05:12:31.196697] D [MSGID: 0]
[glusterd-peer-utils.c:487:glusterd_peer_hostname_new] 0-glusterd: Returning 0
[2017-05-03 05:12:31.196778] D [MSGID: 0] [store.c:613:gf_store_iter_get_next]
0-: Returning with -1
[2017-05-03 05:12:31.196813] I [MSGID: 106498]
[glusterd-handler.c:3649:glusterd_friend_add_from_peerinfo] 0-management:
connect returned 0
[2017-05-03 05:12:31.196871] D [MSGID: 0]
[glusterd-handler.c:3461:glusterd_transport_inet_options_build] 0-glusterd:
Returning 0
[2017-05-03 05:12:31.196914] I [rpc-clnt.c:1046:rpc_clnt_connection_init]
0-management: setting frame-timeout to 600
[2017-05-03 05:12:31.196940] D [rpc-clnt.c:1058:rpc_clnt_connection_init]
0-management: setting ping-timeout to 42
[2017-05-03 05:12:31.196967] D [rpc-transport.c:283:rpc_transport_load]
0-rpc-transport: attempt to load file
/usr/lib/x86_64-linux-gnu/glusterfs/3.8.11/rpc-transport/socket.so
[2017-05-03 05:12:31.197020] D [socket.c:3938:socket_init] 0-management:
Configued transport.tcp-user-timeout=60
[2017-05-03 05:12:31.197050] D [socket.c:4021:socket_init] 0-management: SSL
support on the I/O path is NOT enabled
[2017-05-03 05:12:31.197073] D [socket.c:4024:socket_init] 0-management: SSL
support for glusterd is NOT enabled
[2017-05-03 05:12:31.197095] D [socket.c:4041:socket_init] 0-management: using
system polling thread
[2017-05-03 05:12:31.197125] D [name.c:168:client_fill_address_family]
0-management: address-family not specified, marking it as unspec for
getaddrinfo to resolve from (remote-host:
ip-10-0-50-25.us-west-1.compute.internal)
[2017-05-03 05:12:31.209975] D [MSGID: 0] [common-utils.c:321:gf_resolve_ip6]
0-resolver: returning ip-10.0.50.25 (port-24007) for hostname:
ip-10-0-50-25.us-west-1.compute.internal and port: 24007
[2017-05-03 05:12:31.210052] D [socket.c:2899:socket_fix_ssl_opts]
0-management: disabling SSL for portmapper connection
[2017-05-03 05:12:31.210219] D [MSGID: 0]
[common-utils.c:3086:gf_ports_reserved] 0-glusterfs: lower: 24007, higher:
24008
[2017-05-03 05:12:31.210267] D [MSGID: 0]
[common-utils.c:3086:gf_ports_reserved] 0-glusterfs: lower: 32765, higher: -1
[2017-05-03 05:12:31.210340] D [MSGID: 0]
[common-utils.c:3086:gf_ports_reserved] 0-glusterfs: lower: -1, higher: -1

Program received signal SIGSEGV, Segmentation fault.
0x00000001f08176c8 in ?? ()
(gdb) bt
#0  0x00000001f08176c8 in ?? ()
#1  0x0000000000000000 in ?? ()
