[Bugs] [Bug 1206134] New: glusterd: after a volume create command times out, a deadlock is observed in glusterd and all subsequent commands fail with error "Another transaction is in progress"

bugzilla at redhat.com bugzilla at redhat.com
Thu Mar 26 12:11:29 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1206134

            Bug ID: 1206134
           Summary: glusterd: after a volume create command times out, a
                    deadlock is observed in glusterd and all subsequent
                    commands fail with error "Another transaction is in
                    progress"
           Product: GlusterFS
           Version: mainline
         Component: glusterd
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: racpatel at redhat.com
                CC: bugs at gluster.org, gluster-bugs at redhat.com



Description of problem:
=======================
The gluster volume create command fails with a time-out error, and after that
all gluster commands fail with the error "Another transaction is in progress"
because glusterd is deadlocked.

Version-Release number of selected component (if applicable):
=============================================================
3.7dev-0.803.gitf64666f.el6.x86_64

How reproducible:
================
intermittent


Steps to Reproduce:
1. Install 3.7dev-0.803.gitf64666f.el6.x86_64 on the cluster.

2. Create a volume with the command below; it returned a time-out error:
[root@rhs-client38 ~]# gluster v create BitRot1 replica 3
rhs-client44:/pavanbrick6/br1 rhs-client38://pavanbrick6/br1
rhs-client37:/pavanbrick6/br1 rhs-client44:/pavanbrick7/br1
rhs-client38://pavanbrick7/br1 rhs-client37:/pavanbrick7/br1
Error : Request timed out

3. After a while (10-15 min), while checking status, all commands were found
failing as shown below:

[root@rhs-client38 ~]# gluster v create BitRot1 replica 3
rhs-client44:/pavanbrick6/br1 rhs-client38://pavanbrick6/br1
rhs-client37:/pavanbrick6/br1 rhs-client44:/pavanbrick7/br1
rhs-client38://pavanbrick7/br1 rhs-client37:/pavanbrick7/br1
volume create: BitRot1: failed: Volume BitRot1 already exists

[root@rhs-client38 ~]# gluster volume bitrot BitRot1 enable
Bitrot command failed : Another transaction is in progress for BitRot1. Please
try again after sometime.
[root@rhs-client38 ~]# gluster volume bitrot BitRot1 enable
Bitrot command failed : Another transaction is in progress for BitRot1. Please
try again after sometime.
[root@rhs-client38 ~]# less /var/log/glusterfs/etc-glusterfs-glusterd.vol.log 
[root@rhs-client38 ~]# gluster volume bitrot BitRot1 enable
Bitrot command failed : Another transaction is in progress for BitRot1. Please
try again after sometime.
[root@rhs-client38 ~]# gluster volume bitrot BitRot1 enable
Bitrot command failed : Another transaction is in progress for BitRot1. Please
try again after sometime.


Actual results:
===============
Due to the deadlock, all commands keep failing with "Another transaction is in
progress".



Additional info:
===============
(gdb) bt
#0  0x0000003291a0e264 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x0000003291a09508 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x0000003291a093d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007f96281958bb in rpc_clnt_disable (rpc=0x7f9618001860) at
rpc-clnt.c:1712
#4  0x00007f962819587e in rpc_clnt_trigger_destroy (rpc=<value optimized out>)
at rpc-clnt.c:1634
#5  rpc_clnt_unref (rpc=<value optimized out>) at rpc-clnt.c:1670
#6  0x00007f962819a765 in rpc_clnt_start_ping (rpc_ptr=0x7f9618001860) at
rpc-clnt-ping.c:265
#7  0x00007f96283e1d30 in gf_timer_proc (ctx=0x2080010) at timer.c:183
#8  0x0000003291a079d1 in start_thread () from /lib64/libpthread.so.0
#9  0x00000032912e88fd in clone () from /lib64/libc.so.6
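
The backtrace suggests a self-deadlock on the rpc client's connection mutex:
rpc_clnt_start_ping appears to hold the lock while the unref path
(rpc_clnt_unref -> rpc_clnt_trigger_destroy -> rpc_clnt_disable) tries to take
it again, so the timer thread blocks forever and the original transaction never
completes. A minimal sketch of that pattern, assuming a plain non-recursive
pthread mutex (names are illustrative, not the actual glusterd code):

/* Minimal sketch (not the actual glusterd code) of the self-deadlock the
 * backtrace points at: a timer-thread callback takes a non-recursive mutex
 * and then calls a destroy path that tries to take the same mutex again. */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t conn_lock = PTHREAD_MUTEX_INITIALIZER;

static void client_disable(void)
{
    pthread_mutex_lock(&conn_lock);   /* second lock on a non-recursive
                                         mutex: blocks forever */
    /* ... tear down the connection ... */
    pthread_mutex_unlock(&conn_lock);
}

static void start_ping(void)
{
    pthread_mutex_lock(&conn_lock);
    /* The ping callback decides the client must be destroyed while
     * conn_lock is still held, so the destroy path never returns. */
    client_disable();
    pthread_mutex_unlock(&conn_lock);
}

int main(void)
{
    start_ping();                     /* deadlocks here */
    puts("not reached");
    return 0;
}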


Log snippet:
[2015-03-26 07:35:49.746017] E
[glusterd-volume-ops.c:321:__glusterd_handle_create_volume] 0-management:
Volume BitRot1 already exists
[2015-03-26 07:35:57.377967] I
[glusterd-handler.c:1321:__glusterd_handle_cli_get_volume] 0-glusterd: Received
get vol req
[2015-03-26 07:38:53.196387] W [glusterd-locks.c:550:glusterd_mgmt_v3_lock]
(--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1e0)[0x7f96283c4540] (-->
/usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_lock+0x1ca)[0x7f961e158f3a]
(-->
/usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(gd_sync_task_begin+0x4ff)[0x7f961e1549df]
(-->
/usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(glusterd_op_begin_synctask+0x3b)[0x7f961e154d1b]
(-->
/usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(__glusterd_handle_bitrot+0x2c2)[0x7f961e12c652]
))))) 0-management: Lock for BitRot1 held by
d25fd6c1-bc55-4ba8-befb-3f0f7623a504
[2015-03-26 07:38:53.196419] E [glusterd-syncop.c:1694:gd_sync_task_begin]
0-management: Unable to acquire lock for BitRot1
[2015-03-26 07:39:10.912649] W [glusterd-locks.c:550:glusterd_mgmt_v3_lock]
(--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1e0)[0x7f96283c4540] (-->
/usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_lock+0x1ca)[0x7f961e158f3a]
(-->
/usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(gd_sync_task_begin+0x4ff)[0x7f961e1549df]
(-->
/usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(glusterd_op_begin_synctask+0x3b)[0x7f961e154d1b]
(-->
/usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(__glusterd_handle_bitrot+0x2c2)[0x7f961e12c652]
))))) 0-management: Lock for BitRot1 held by
d25fd6c1-bc55-4ba8-befb-3f0f7623a504
[2015-03-26 07:39:10.912682] E [glusterd-syncop.c:1694:gd_sync_task_begin]
0-management: Unable to acquire lock for BitRot1
[2015-03-26 07:40:08.276495] W [glusterd-locks.c:550:glusterd_mgmt_v3_lock]
(--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1e0)[0x7f96283c4540] (-->
/usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_lock+0x1ca)[0x7f961e158f3a]
(-->
/usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(gd_sync_task_begin+0x4ff)[0x7f961e1549df]
(-->
/usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(glusterd_op_begin_synctask+0x3b)[0x7f961e154d1b]
(-->
/usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(__glusterd_handle_bitrot+0x2c2)[0x7f961e12c652]
))))) 0-management: Lock for BitRot1 held by
d25fd6c1-bc55-4ba8-befb-3f0f7623a504
[2015-03-26 07:40:08.276534] E [glusterd-syncop.c:1694:gd_sync_task_begin]
0-management: Unable to acquire lock for BitRot1
[2015-03-26 07:40:57.076025] W [glusterd-locks.c:550:glusterd_mgmt_v3_lock]
(--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1e0)[0x7f96283c4540] (-->
/usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_lock+0x1ca)[0x7f961e158f3a]
(-->
/usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(gd_sync_task_begin+0x4ff)[0x7f961e1549df]
(-->
/usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(glusterd_op_begin_synctask+0x3b)[0x7f961e154d1b]
(-->
/usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(__glusterd_handle_bitrot+0x2c2)[0x7f961e12c652]
))))) 0-management: Lock for BitRot1 held by
d25fd6c1-bc55-4ba8-befb-3f0f7623a504
[2015-03-26 07:40:57.076056] E [glusterd-syncop.c:1694:gd_sync_task_begin]
0-management: Unable to acquire lock for BitRot1
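
The log shows the mgmt_v3 lock for BitRot1 still held by
d25fd6c1-bc55-4ba8-befb-3f0f7623a504, the originator of the hung create
transaction. Since that transaction never finishes, the lock is never released
and every later command on the volume is refused. A rough sketch of that
per-volume lock behaviour, with hypothetical names and a single-volume table
(not glusterd's actual implementation):

/* Illustrative sketch of why every follow-up command fails with
 * "Another transaction is in progress": the per-volume lock records the
 * owner's UUID, and because the hung create transaction never releases it,
 * later acquisitions are refused. */
#include <stdio.h>
#include <string.h>

struct vol_lock {
    char volname[64];
    char owner_uuid[40];   /* empty string means unlocked */
};

static struct vol_lock lock_table[1]; /* one volume is enough for the sketch */

/* Returns 0 on success, -1 if another transaction already holds the lock. */
static int mgmt_v3_lock(struct vol_lock *l, const char *vol, const char *uuid)
{
    if (l->owner_uuid[0] != '\0' && strcmp(l->owner_uuid, uuid) != 0) {
        fprintf(stderr, "Lock for %s held by %s\n", vol, l->owner_uuid);
        return -1;         /* "Another transaction is in progress" */
    }
    snprintf(l->volname, sizeof(l->volname), "%s", vol);
    snprintf(l->owner_uuid, sizeof(l->owner_uuid), "%s", uuid);
    return 0;
}

int main(void)
{
    /* The create transaction takes the lock ... */
    mgmt_v3_lock(&lock_table[0], "BitRot1",
                 "d25fd6c1-bc55-4ba8-befb-3f0f7623a504");
    /* ... then hangs in the deadlock shown above and never unlocks, so every
     * later command (e.g. "gluster volume bitrot BitRot1 enable") fails. */
    if (mgmt_v3_lock(&lock_table[0], "BitRot1",
                     "00000000-0000-0000-0000-000000000001") < 0)
        puts("Bitrot command failed : Another transaction is in progress");
    return 0;
}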
