[Bugs] [Bug 1354262] New: (Quota on) When glusterfsd init fails and exits, it sometimes leads to a crash

bugzilla at redhat.com bugzilla at redhat.com
Mon Jul 11 05:25:58 UTC 2016


https://bugzilla.redhat.com/show_bug.cgi?id=1354262

            Bug ID: 1354262
           Summary: (Quota on) When glusterfsd init fails and exits,
                    it sometimes leads to a crash
           Product: Red Hat Gluster Storage
           Version: 3.1
         Component: quota
          Keywords: Triaged
          Severity: low
          Priority: medium
          Assignee: mselvaga at redhat.com
          Reporter: mselvaga at redhat.com
        QA Contact: storage-qa-internal at redhat.com
                CC: amukherj at redhat.com, bugs at gluster.org, iesool at 126.com,
                    mselvaga at redhat.com, rhs-bugs at redhat.com
        Depends On: 1346549



+++ This bug was initially created as a clone of Bug #1346549 +++

Description of problem:
Create a volume like this:
Volume Name: test
Type: Distributed-Disperse
Volume ID: 78bd1b85-cfe9-401e-ac1e-dc9e072ed4db
Status: Started
Number of Bricks: 2 x (2 + 1) = 6
Transport-type: tcp
Bricks:
Brick1: node-1:/disk1
Brick2: node-2:/disk1
Brick3: node-3:/disk1
Brick4: node-1:/disk2
Brick5: node-2:/disk2
Brick6: node-3:/disk2
Options Reconfigured:
performance.readdir-ahead: on
features.quota: on
features.inode-quota: on
features.quota-deem-statfs: on

Then I unmounted /disk{1..3} and set /disk{1..3} read-only.

I tried 'gluster vol start test force' several times, and sometimes the
glusterfsd process crashed.

glusterfsd's log:
[2016-06-15 09:44:20.567687] I [MSGID: 100030] [glusterfsd.c:2338:main]
0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.7.12
(args: /usr/sbin/glusterfsd -s node-1 --volfile-id test.node-1.disk2 -p
/var/lib/glusterd/vols/test/run/node-1-disk2.pid -S
/var/run/gluster/bddd1d1330cb529b05a3a9266879baee.socket --brick-name /disk2 -l
/var/log/glusterfs/bricks/disk2.log --xlator-option
*-posix.glusterd-uuid=dee1dcb8-280b-4b4c-b5a6-6ad7dbd0360a --brick-port 49153
--xlator-option test-server.listen-port=49153)
[2016-06-15 09:44:20.575048] I [MSGID: 101190]
[event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with
index 1
[2016-06-15 09:44:20.580116] I [graph.c:269:gf_add_cmdline_options]
0-test-server: adding option 'listen-port' for volume 'test-server' with value
'49153'
[2016-06-15 09:44:20.580187] I [graph.c:269:gf_add_cmdline_options]
0-test-posix: adding option 'glusterd-uuid' for volume 'test-posix' with value
'dee1dcb8-280b-4b4c-b5a6-6ad7dbd0360a'
[2016-06-15 09:44:20.580607] I [MSGID: 115034]
[server.c:403:_check_for_auth_option] 0-/disk2: skip format check for non-addr
auth option auth.login./disk2.allow
[2016-06-15 09:44:20.580765] I [MSGID: 115034]
[server.c:403:_check_for_auth_option] 0-/disk2: skip format check for non-addr
auth option auth.login.8306814a-3bf6-49b0-b75a-95665c2ba483.password
[2016-06-15 09:44:20.582297] I [rpcsvc.c:2196:rpcsvc_set_outstanding_rpc_limit]
0-rpc-service: Configured rpc.outstanding-rpc-limit with value 64
[2016-06-15 09:44:20.582499] W [MSGID: 101002] [options.c:957:xl_opt_validate]
0-test-server: option 'listen-port' is deprecated, preferred is
'transport.socket.listen-port', continuing with correction
[2016-06-15 09:44:20.583012] W [socket.c:3759:reconfigure] 0-test-quota: NBIO
on -1 failed (Bad file descriptor)
[2016-06-15 09:44:20.583141] I [MSGID: 101190]
[event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with
index 2
[2016-06-15 09:44:20.588207] E [index.c:188:index_dir_create] 0-test-index:
/disk2/.glusterfs/indices/xattrop: Failed to create (Permission denied)
[2016-06-15 09:44:20.588401] E [MSGID: 101019] [xlator.c:435:xlator_init]
0-test-index: Initialization of volume 'test-index' failed, review your volfile
again
[2016-06-15 09:44:20.588512] E [graph.c:322:glusterfs_graph_init] 0-test-index:
initializing translator failed
[2016-06-15 09:44:20.588613] E [graph.c:662:glusterfs_graph_activate] 0-graph:
init failed
[2016-06-15 09:44:20.590554] W [glusterfsd.c:1251:cleanup_and_exit]
(-->/usr/sbin/glusterfsd(mgmt_getspec_cbk+0x307) [0x40dbe7]
-->/usr/sbin/glusterfsd(glusterfs_process_volfp+0x13a) [0x408c7a]
-->/usr/sbin/glusterfsd(cleanup_and_exit+0x5f) [0x40831f] ) 0-: received signum
(1), shutting down
pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 
2016-06-15 09:44:20
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1


gdb bt:
Core was generated by `/usr/sbin/glusterfsd -s node-1 --volfile-id
test.node-1.disk2 -p /var/lib/glust'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fd73c6ff688 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
(gdb) bt
#0  0x00007fd73c6ff688 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#1  0x00007fd73c7006f8 in _Unwind_Backtrace () from
/lib/x86_64-linux-gnu/libgcc_s.so.1
#2  0x00007fd7418dae26 in __GI___backtrace (array=array@entry=0x7fd735bbfb80,
size=size@entry=200) at ../sysdeps/x86_64/backtrace.c:109
#3  0x00007fd742411ea2 in _gf_msg_backtrace_nomem
(level=level@entry=GF_LOG_ALERT, stacksize=stacksize@entry=200) at
logging.c:1095
#4  0x00007fd74243713d in gf_print_trace (signum=11, ctx=0x2049010) at
common-utils.c:615
#5  <signal handler called>
#6  0x00007fd73644adb0 in ?? ()
#7  0x00007fd7421df8e4 in rpc_clnt_notify (trans=<optimized out>,
mydata=0x7fd73803ef80, event=<optimized out>, data=0x7fd7380420f0) at
rpc-clnt.c:957
#8  0x00007fd7421db593 in rpc_transport_notify (this=this@entry=0x7fd7380420f0,
event=event@entry=RPC_TRANSPORT_CONNECT, data=data@entry=0x7fd7380420f0) at
rpc-transport.c:546
#9  0x00007fd73d579f8f in socket_connect_finish
(this=this@entry=0x7fd7380420f0) at socket.c:2429
#10 0x00007fd73d57a3af in socket_event_handler (fd=fd@entry=12,
idx=idx@entry=3, data=0x7fd7380420f0, poll_in=0, poll_out=4, poll_err=0) at
socket.c:2459
#11 0x00007fd74247f9fa in event_dispatch_epoll_handler (event=0x7fd735bc0e90,
event_pool=0x2067da0) at event-epoll.c:575
#12 event_dispatch_epoll_worker (data=0x7fd73801e670) at event-epoll.c:678
#13 0x00007fd741b9d182 in start_thread (arg=0x7fd735bc1700) at
pthread_create.c:312
#14 0x00007fd7418ca47d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb) 



Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

--- Additional comment from jiademing.dd on 2016-06-14 22:04:29 EDT ---

After analysis: rpc_clnt_notify() ends up calling quota_enforcer_notify(),
because quota registers that callback with rpc_clnt_register_notify(rpc,
quota_enforcer_notify, this). When glusterfsd exits, it calls
glusterfs_graph_destroy(), and glusterfs_graph_destroy() calls
dlclose(xl->dlhandle).

So if dlclose(xl->dlhandle) runs before rpc_clnt_notify() fires, the
quota.so function quota_enforcer_notify() has already been unloaded, and
calling it leads to the crash.
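To make the failure mode concrete, here is a minimal, self-contained sketch.
It is NOT GlusterFS source: "libplugin.so" and "plugin_notify" are
hypothetical stand-ins for quota.so and quota_enforcer_notify(), and the
library would have to be built separately for the program to run. The point
it illustrates is that once a shared object has been dlclose()d, any
callback pointer obtained from it dangles, and invoking it typically jumps
into an unmapped page and dies with SIGSEGV (signal 11) -- consistent with
frame #6 of the backtrace above, where the faulting address
(0x00007fd73644adb0) no longer resolves to any loaded symbol.

/* use_after_dlclose.c -- hypothetical illustration, not GlusterFS code.
 * "libplugin.so" / "plugin_notify" stand in for quota.so /
 * quota_enforcer_notify; build such a library yourself to try this. */
#include <dlfcn.h>
#include <stdio.h>

typedef int (*notify_fn)(void *data);

int main(void)
{
    void *handle = dlopen("./libplugin.so", RTLD_NOW);
    if (!handle) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return 1;
    }

    /* The callback is looked up while the library is loaded, the way
     * quota registers quota_enforcer_notify() via
     * rpc_clnt_register_notify(). */
    notify_fn cb = (notify_fn) dlsym(handle, "plugin_notify");

    /* Exit path: glusterfs_graph_destroy() unloads the translator. */
    dlclose(handle);

    /* The caller still holds the old pointer; invoking it now jumps into
     * memory that may already be unmapped, so the process usually
     * crashes with SIGSEGV (signal 11). */
    if (cb)
        cb(NULL);

    return 0;
}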


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1346549
[Bug 1346549] (Quota on) When glusterfsd init fails and exits, it sometimes
leads to a crash