[Bugs] [Bug 1654917] cleanup resources in server_init in case of failure

bugzilla at redhat.com bugzilla at redhat.com
Mon Dec 3 17:14:08 UTC 2018


https://bugzilla.redhat.com/show_bug.cgi?id=1654917

Raghavendra Bhat <rabhat at redhat.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rabhat at redhat.com



--- Comment #3 from Raghavendra Bhat <rabhat at redhat.com> ---


Glusterd crashed in init.

During init, glusterd tries to initialize the rdma transport (because it is
mentioned in the volfile). But if the rdma libraries are not present on the
machine, initializing the rdma transport fails in rpc_transport_load. It fails
exactly at this dlopen call:
    handle = dlopen(name, RTLD_NOW);
    if (handle == NULL) {
        gf_log("rpc-transport", GF_LOG_ERROR, "%s", dlerror());
        gf_log("rpc-transport", GF_LOG_WARNING,
               "volume '%s': transport-type '%s' is not valid or "
               "not found on this machine",
               trans_name, type);
        goto fail;
    }


[2018-12-03 16:48:04.557840] I [socket.c:931:__socket_server_bind]
0-socket.management: process started listening on port (24007)
[2018-12-03 16:48:04.558021] E [rpc-transport.c:295:rpc_transport_load]
0-rpc-transport: /usr/local/lib/glusterfs/6dev/rpc-transport/rdma.so: cannot
open shared object file: No such file or directory
[2018-12-03 16:48:04.558046] W [rpc-transport.c:299:rpc_transport_load]
0-rpc-transport: volume 'rdma.management': transport-type 'rdma' is not valid
or not found on this machine


As part of handling this failure, rpc_transport_cleanup is called, and there
transport->fini is called unconditionally.

But between the allocation of the transport's memory and the loading of the
fini symbol from the shared object, there are several other places where things
can fail and rpc_transport_load can enter its failure path (including a failure
to load the fini symbol itself). In those cases trans->fini is still NULL, and
calling it crashes the process.


pending frames:
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash: 
2018-12-03 16:48:04
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 6dev
/usr/local/lib/libglusterfs.so.0(+0x2d9bb)[0x7fe10646d9bb]
/usr/local/lib/libglusterfs.so.0(gf_print_trace+0x259)[0x7fe106476ef3]
glusterd(glusterfsd_print_trace+0x1f)[0x40b15e]
/lib64/libc.so.6(+0x385c0)[0x7fe105d395c0]


This is the backtrace of the core:

Core was generated by `glusterd'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000000000000000 in ?? ()
[Current thread is 1 (Thread 0x7f3bc6182880 (LWP 21751))]
Missing separate debuginfos, use: dnf debuginfo-install
glibc-2.28-20.fc29.x86_64 keyutils-libs-1.5.10-8.fc29.x86_64
krb5-libs-1.16.1-21.fc29.x86_64 libcom_err-1.44.3-1.fc29.x86_64
libselinux-2.8-4.fc29.x86_64 libtirpc-1.1.4-2.rc2.fc29.x86_64
libxml2-2.9.8-4.fc29.x86_64 openssl-libs-1.1.1-3.fc29.x86_64
pcre2-10.32-4.fc29.x86_64 userspace-rcu-0.10.1-4.fc29.x86_64
xz-libs-5.2.4-3.fc29.x86_64 zlib-1.2.11-14.fc29.x86_64
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x00007f3bc6adb4b7 in rpc_transport_cleanup (trans=0x182a680) at
../../../../rpc/rpc-lib/src/rpc-transport.c:168
#2  0x00007f3bc6adbfb5 in rpc_transport_load (ctx=0x1791280, options=0x17de4b8,
trans_name=0x18095b0 "rdma.management") at
../../../../rpc/rpc-lib/src/rpc-transport.c:375
#3  0x00007f3bc6ad8456 in rpcsvc_create_listener (svc=0x17d9300,
options=0x17de4b8, name=0x18095b0 "rdma.management") at
../../../../rpc/rpc-lib/src/rpcsvc.c:1991
#4  0x00007f3bc6ad87b8 in rpcsvc_create_listeners (svc=0x17d9300,
options=0x17de4b8, name=0x17d1b00 "management") at
../../../../rpc/rpc-lib/src/rpcsvc.c:2083
#5  0x00007f3bb5a91338 in init (this=0x17dd480) at
../../../../../xlators/mgmt/glusterd/src/glusterd.c:1774
#6  0x00007f3bc6b3d179 in __xlator_init (xl=0x17dd480) at
../../../libglusterfs/src/xlator.c:718
#7  0x00007f3bc6b3d2c3 in xlator_init (xl=0x17dd480) at
../../../libglusterfs/src/xlator.c:745
#8  0x00007f3bc6b8dce8 in glusterfs_graph_init (graph=0x17d16c0) at
../../../libglusterfs/src/graph.c:359
#9  0x00007f3bc6b8e8f8 in glusterfs_graph_activate (graph=0x17d16c0,
ctx=0x1791280) at ../../../libglusterfs/src/graph.c:722
#10 0x000000000040b8fe in glusterfs_process_volfp (ctx=0x1791280, fp=0x17d09c0)
at ../../../glusterfsd/src/glusterfsd.c:2597
#11 0x000000000040bace in glusterfs_volumes_init (ctx=0x1791280) at
../../../glusterfsd/src/glusterfsd.c:2670
#12 0x000000000040bf9e in main (argc=1, argv=0x7ffc12766108) at
../../../glusterfsd/src/glusterfsd.c:2823
(gdb) frame 1
#1  0x00007f3bc6adb4b7 in rpc_transport_cleanup (trans=0x182a680) at
../../../../rpc/rpc-lib/src/rpc-transport.c:168
168        trans->fini(trans);
(gdb) l
163    rpc_transport_cleanup(rpc_transport_t *trans)
164    {
165        if (!trans)
166            return;
167    
168        trans->fini(trans);
169        GF_FREE(trans->name);
170    
171        if (trans->xl)
172            pthread_mutex_destroy(&trans->lock);
(gdb) p trans->fini
$1 = (void (*)(rpc_transport_t *)) 0x0
(gdb) frame 2
#2  0x00007f3bc6adbfb5 in rpc_transport_load (ctx=0x1791280, options=0x17de4b8,
trans_name=0x18095b0 "rdma.management") at
../../../../rpc/rpc-lib/src/rpc-transport.c:375
375            rpc_transport_cleanup(trans);
(gdb) l
370    
371        success = _gf_true;
372    
373    fail:
374        if (!success) {
375            rpc_transport_cleanup(trans);
376            GF_FREE(name);
377    
378            return_trans = NULL;
379        }
(gdb) p success
$2 = false
(gdb) p trans->fini
$3 = (void (*)(rpc_transport_t *)) 0x0
