[Bugs] [Bug 1247917] New: ./tests/basic/volume-snapshot.t spurious fail causing glusterd crash.

bugzilla at redhat.com bugzilla at redhat.com
Wed Jul 29 08:49:33 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1247917

            Bug ID: 1247917
           Summary: ./tests/basic/volume-snapshot.t  spurious fail causing
                    glusterd crash.
           Product: GlusterFS
           Version: 3.7.3
         Component: tests
          Keywords: Triaged
          Severity: medium
          Priority: medium
          Assignee: bugs at gluster.org
          Reporter: anekkunt at redhat.com
                CC: amukherj at redhat.com, bugs at gluster.org,
                    gluster-bugs at redhat.com
        Depends On: 1246432



+++ This bug was initially created as a clone of Bug #1246432 +++

(gdb) bt 
#0  0x00007f4078ca4e2c in vfprintf () from ./lib64/libc.so.6
#1  0x00007f4078ccc752 in vsnprintf () from ./lib64/libc.so.6
#2  0x00007f4078cac223 in snprintf () from ./lib64/libc.so.6
#3  0x00007f406f582e19 in glusterd_volume_stop_glusterfs (volinfo=0x1e93d90,
brickinfo=0x1e9fdc0, del_brick=_gf_false) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-utils.c:1754
#4  0x00007f406f58fda4 in glusterd_brick_stop (volinfo=0x1e93d90,
brickinfo=0x1e9fdc0, del_brick=_gf_false) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-utils.c:5458
#5  0x00007f406f61a84c in glusterd_snap_volume_remove (rsp_dict=0x7f405800100c,
snap_vol=0x1e93d90, remove_lvm=_gf_false, force=_gf_false)
    at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-snapshot.c:2897
#6  0x00007f406f61adf7 in glusterd_snap_remove (rsp_dict=0x7f405800100c,
snap=0x1e8bab0, remove_lvm=_gf_false, force=_gf_false)
    at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-snapshot.c:3005
#7  0x00007f406f646ea0 in glusterd_compare_and_update_snap
(peer_data=0x7f405800176c, snap_count=2, peername=0x7f40580015e0 "127.1.1.3",
peerid=0x7f4058001650
"8\370\365\253\313\vN7\226\067\246\020\212\211'W\340\025")
    at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c:1849
#8  0x00007f406f647167 in glusterd_compare_friend_snapshots
(peer_data=0x7f405800176c, peername=0x7f40580015e0 "127.1.1.3",
peerid=0x7f4058001650
"8\370\365\253\313\vN7\226\067\246\020\212\211'W\340\025")
    at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c:1904
#9  0x00007f406f5689f3 in glusterd_ac_handle_friend_add_req
(event=0x7f4058001640, ctx=0x7f40580016d0) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-sm.c:831
#10 0x00007f406f569290 in glusterd_friend_sm () at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-sm.c:1253
#11 0x00007f406f55ee14 in __glusterd_handle_incoming_friend_req
(req=0x7f405800511c) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-handler.c:2541
#12 0x00007f406f5576ea in glusterd_big_locked_handler (req=0x7f405800511c,
actor_fn=0x7f406f55ec78 <__glusterd_handle_incoming_friend_req>)
    at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-handler.c:79
#13 0x00007f406f55ee4a in glusterd_handle_incoming_friend_req
(req=0x7f405800511c) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-handler.c:2551
#14 0x00007f4079ebb06d in rpcsvc_handle_rpc_call (svc=0x1e1e430,
trans=0x7f4058004570, msg=0x7f4058001140) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpcsvc.c:699
#15 0x00007f4079ebb3e0 in rpcsvc_notify (trans=0x7f4058004570,
mydata=0x1e1e430, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7f4058001140) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpcsvc.c:793
#16 0x00007f4079ec0aeb in rpc_transport_notify (this=0x7f4058004570,
event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7f4058001140) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-transport.c:538
#17 0x00007f406dbe587b in socket_event_poll_in (this=0x7f4058004570) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-transport/socket/src/socket.c:2285
#18 0x00007f406dbe5dd1 in socket_event_handler (fd=16, idx=7,
data=0x7f4058004570, poll_in=1, poll_out=0, poll_err=0) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-transport/socket/src/socket.c:2398
#19 0x00007f407a1749f0 in event_dispatch_epoll_handler (event_pool=0x1e04c90,
event=0x7f4063ffee70) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/event-epoll.c:570
#20 0x00007f407a174dde in event_dispatch_epoll_worker (data=0x1eb4610) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/event-epoll.c:673
#21 0x00007f40793db9d1 in start_thread () from ./lib64/libpthread.so.0
#22 0x00007f4078d458fd in clone () from ./lib64/libc.so.6
(gdb) t a a bt

Thread 9 (LWP 2819):
#0  0x00007f40793e23f5 in __lll_unlock_wake () from ./lib64/libpthread.so.0
#1  0x00007f40793de877 in _L_unlock_657 () from ./lib64/libpthread.so.0
#2  0x00007f40793de7df in pthread_mutex_unlock () from ./lib64/libpthread.so.0
#3  0x00007f407a1541c0 in synclock_unlock (lock=0x7f407a4637d8) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/syncop.c:1069
#4  0x00007f406f5576ff in glusterd_big_locked_handler (req=0x7f405c00093c,
actor_fn=0x7f406f55ec78 <__glusterd_handle_incoming_friend_req>)
    at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-handler.c:80
#5  0x00007f406f55ee4a in glusterd_handle_incoming_friend_req
(req=0x7f405c00093c) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-handler.c:2551
#6  0x00007f4079ebb06d in rpcsvc_handle_rpc_call (svc=0x1e1e430,
trans=0x7f4058006b70, msg=0x7f405c005fb0) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpcsvc.c:699
#7  0x00007f4079ebb3e0 in rpcsvc_notify (trans=0x7f4058006b70,
mydata=0x1e1e430, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7f405c005fb0) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpcsvc.c:793
#8  0x00007f4079ec0aeb in rpc_transport_notify (this=0x7f4058006b70,
event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7f405c005fb0) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-transport.c:538
#9  0x00007f406dbe587b in socket_event_poll_in (this=0x7f4058006b70) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-transport/socket/src/socket.c:2285
#10 0x00007f406dbe5dd1 in socket_event_handler (fd=20, idx=4,
data=0x7f4058006b70, poll_in=1, poll_out=0, poll_err=0) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-transport/socket/src/socket.c:2398
#11 0x00007f407a1749f0 in event_dispatch_epoll_handler (event_pool=0x1e04c90,
event=0x7f406b25ae70) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/event-epoll.c:570
#12 0x00007f407a174dde in event_dispatch_epoll_worker (data=0x1e0f7f0) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/event-epoll.c:673
#13 0x00007f40793db9d1 in start_thread () from ./lib64/libpthread.so.0
#14 0x00007f4078d458fd in clone () from ./lib64/libc.so.6

Thread 8 (LWP 2818):
#0  0x00007f4078d45ef3 in epoll_wait () from ./lib64/libc.so.6
#1  0x00007f407a174dac in event_dispatch_epoll_worker (data=0x1eb3e50) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/event-epoll.c:663
#2  0x00007f40793db9d1 in start_thread () from ./lib64/libpthread.so.0
#3  0x00007f4078d458fd in clone () from ./lib64/libc.so.6

Thread 7 (LWP 2686):
#0  0x00007f40793df98e in pthread_cond_timedwait@@GLIBC_2.3.2 () from
./lib64/libpthread.so.0
#1  0x00007f407a1532af in syncenv_task (proc=0x1e0bb20) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/syncop.c:603
#2  0x00007f407a153556 in syncenv_processor (thdata=0x1e0bb20) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/syncop.c:695
#3  0x00007f40793db9d1 in start_thread () from ./lib64/libpthread.so.0
#4  0x00007f4078d458fd in clone () from ./lib64/libc.so.6

Thread 6 (LWP 2684):
#0  0x00007f40793e34b5 in sigwait () from ./lib64/libpthread.so.0
#1  0x0000000000409705 in glusterfs_sigwaiter (arg=0x7ffedf2effa0) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/glusterfsd/src/glusterfsd.c:1984
#2  0x00007f40793db9d1 in start_thread () from ./lib64/libpthread.so.0
#3  0x00007f4078d458fd in clone () from ./lib64/libc.so.6

Thread 5 (LWP 2683):
#0  0x00007f40793e2f3d in nanosleep () from ./lib64/libpthread.so.0
#1  0x00007f407a122b80 in gf_timer_proc (ctx=0x1de6010) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/timer.c:200
#2  0x00007f40793db9d1 in start_thread () from ./lib64/libpthread.so.0
#3  0x00007f4078d458fd in clone () from ./lib64/libc.so.6

Thread 4 (LWP 2815):
#0  0x00007f40793df5bc in pthread_cond_wait@@GLIBC_2.3.2 () from
./lib64/libpthread.so.0
#1  0x00007f406f60c3ed in hooks_worker (args=0x1e13cb0) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-hooks.c:529
#2  0x00007f40793db9d1 in start_thread () from ./lib64/libpthread.so.0
#3  0x00007f4078d458fd in clone () from ./lib64/libc.so.6

Thread 3 (LWP 2685):
#0  0x00007f40793df98e in pthread_cond_timedwait@@GLIBC_2.3.2 () from
./lib64/libpthread.so.0
#1  0x00007f407a1532af in syncenv_task (proc=0x1e0b760) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/syncop.c:603
#2  0x00007f407a153556 in syncenv_processor (thdata=0x1e0b760) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/syncop.c:695
#3  0x00007f40793db9d1 in start_thread () from ./lib64/libpthread.so.0
#4  0x00007f4078d458fd in clone () from ./lib64/libc.so.6

Thread 2 (LWP 2682):
#0  0x00007f40793dc22d in pthread_join () from ./lib64/libpthread.so.0
#1  0x00007f407a175006 in event_dispatch_epoll (event_pool=0x1e04c90) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/event-epoll.c:757
#2  0x00007f407a13da06 in event_dispatch (event_pool=0x1e04c90) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/event.c:123
#3  0x000000000040a272 in main (argc=9, argv=0x7ffedf2f1208) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/glusterfsd/src/glusterfsd.c:2328

Thread 1 (LWP 2816):
#0  0x00007f4078ca4e2c in vfprintf () from ./lib64/libc.so.6
#1  0x00007f4078ccc752 in vsnprintf () from ./lib64/libc.so.6
#2  0x00007f4078cac223 in snprintf () from ./lib64/libc.so.6
#3  0x00007f406f582e19 in glusterd_volume_stop_glusterfs (volinfo=0x1e93d90,
brickinfo=0x1e9fdc0, del_brick=_gf_false) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-utils.c:1754
#4  0x00007f406f58fda4 in glusterd_brick_stop (volinfo=0x1e93d90,
brickinfo=0x1e9fdc0, del_brick=_gf_false) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-utils.c:5458
#5  0x00007f406f61a84c in glusterd_snap_volume_remove (rsp_dict=0x7f405800100c,
snap_vol=0x1e93d90, remove_lvm=_gf_false, force=_gf_false)
    at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-snapshot.c:2897
#6  0x00007f406f61adf7 in glusterd_snap_remove (rsp_dict=0x7f405800100c,
snap=0x1e8bab0, remove_lvm=_gf_false, force=_gf_false)
    at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-snapshot.c:3005
#7  0x00007f406f646ea0 in glusterd_compare_and_update_snap
(peer_data=0x7f405800176c, snap_count=2, peername=0x7f40580015e0 "127.1.1.3",
peerid=0x7f4058001650
"8\370\365\253\313\vN7\226\067\246\020\212\211'W\340\025")
    at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c:1849
#8  0x00007f406f647167 in glusterd_compare_friend_snapshots
(peer_data=0x7f405800176c, peername=0x7f40580015e0 "127.1.1.3",
peerid=0x7f4058001650
"8\370\365\253\313\vN7\226\067\246\020\212\211'W\340\025")
    at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-snapshot-utils.c:1904
#9  0x00007f406f5689f3 in glusterd_ac_handle_friend_add_req
(event=0x7f4058001640, ctx=0x7f40580016d0) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-sm.c:831
#10 0x00007f406f569290 in glusterd_friend_sm () at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-sm.c:1253
#11 0x00007f406f55ee14 in __glusterd_handle_incoming_friend_req
(req=0x7f405800511c) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-handler.c:2541
#12 0x00007f406f5576ea in glusterd_big_locked_handler (req=0x7f405800511c,
actor_fn=0x7f406f55ec78 <__glusterd_handle_incoming_friend_req>)
    at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-handler.c:79
#13 0x00007f406f55ee4a in glusterd_handle_incoming_friend_req
(req=0x7f405800511c) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/mgmt/glusterd/src/glusterd-handler.c:2551
#14 0x00007f4079ebb06d in rpcsvc_handle_rpc_call (svc=0x1e1e430,
trans=0x7f4058004570, msg=0x7f4058001140) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpcsvc.c:699
#15 0x00007f4079ebb3e0 in rpcsvc_notify (trans=0x7f4058004570,
mydata=0x1e1e430, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7f4058001140) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpcsvc.c:793
#16 0x00007f4079ec0aeb in rpc_transport_notify (this=0x7f4058004570,
event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7f4058001140) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-lib/src/rpc-transport.c:538
#17 0x00007f406dbe587b in socket_event_poll_in (this=0x7f4058004570) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-transport/socket/src/socket.c:2285
#18 0x00007f406dbe5dd1 in socket_event_handler (fd=16, idx=7,
data=0x7f4058004570, poll_in=1, poll_out=0, poll_err=0) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/rpc/rpc-transport/socket/src/socket.c:2398
#19 0x00007f407a1749f0 in event_dispatch_epoll_handler (event_pool=0x1e04c90,
event=0x7f4063ffee70) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/event-epoll.c:570
#20 0x00007f407a174dde in event_dispatch_epoll_worker (data=0x1eb4610) at
/home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/event-epoll.c:673
#21 0x00007f40793db9d1 in start_thread () from ./lib64/libpthread.so.0
#22 0x00007f4078d458fd in clone () from ./lib64/libc.so.6

--- Additional comment from Anand Avati on 2015-07-24 06:55:23 EDT ---

REVIEW: http://review.gluster.org/11757 (glusterd: glusterd crash due to race
between handshake and snapshot remove threads) posted (#2) for review on master
by Anand Nekkunti (anekkunt at redhat.com)

--- Additional comment from Anand Avati on 2015-07-28 09:26:48 EDT ---

COMMIT: http://review.gluster.org/11757 committed in master by Raghavendra
Talur (rtalur at redhat.com) 
------
commit 51f48bc9a41a5e2004d9051ff90517b01626b08f
Author: anand <anekkunt at redhat.com>
Date:   Fri Jul 24 15:48:50 2015 +0530

    glusterd: glusterd crash due to race between handshake and snapshot remove
    threads

    Issue: glusterd was crashing due to a race between the handshake thread
    and the snapshot remove thread.
    RCA: The snapshot thread was referring to volinfo while volinfo was being
    modified during the handshake; glusterd crashed because of the resulting
    inconsistent volinfo data.

    Note: Sending commands without checking cluster status may lead to a crash.

    Fix: Wait for the handshake to complete (cluster ready) before proceeding
    with commands.

    Change-Id: Iefd986664bd9dd225f0abf8f85476d6afd206914
    BUG: 1246432
    Signed-off-by: anand <anekkunt at redhat.com>
    Reviewed-on: http://review.gluster.org/11757
    Tested-by: Gluster Build System <jenkins at build.gluster.com>
    Reviewed-by: Atin Mukherjee <amukherj at redhat.com>
    Tested-by: NetBSD Build System <jenkins at build.gluster.org>
    Reviewed-by: Raghavendra Talur <rtalur at redhat.com>
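
The idea behind the fix in the commit message above (do not proceed with
commands until the handshake has completed) can be illustrated with a minimal
sketch. This is not the code from the patch and it does not use any glusterd
API. Every name in it (cluster_gate_t, handshake_thread,
run_command_when_ready, shared_brick_count, demo) is hypothetical, and the
single integer only stands in for the much larger volinfo structure. It
assumes POSIX threads are available.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical readiness gate; not part of glusterd. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            handshake_done;
} cluster_gate_t;

static cluster_gate_t gate = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false
};

/* Stand-in for shared volume state (an assumption for illustration). */
static int shared_brick_count = 0;

/* Handshake side: modify the shared state, then open the gate. */
static void *handshake_thread(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&gate.lock);
    shared_brick_count = 2;              /* update happens under the lock */
    gate.handshake_done = true;
    pthread_cond_broadcast(&gate.cond);  /* wake any waiting command */
    pthread_mutex_unlock(&gate.lock);
    return NULL;
}

/* Command side: block until the handshake has finished, then read state. */
static int run_command_when_ready(void)
{
    int seen;

    pthread_mutex_lock(&gate.lock);
    while (!gate.handshake_done)         /* loop guards against spurious wakeups */
        pthread_cond_wait(&gate.cond, &gate.lock);
    assert(gate.handshake_done);
    seen = shared_brick_count;           /* state is consistent at this point */
    pthread_mutex_unlock(&gate.lock);
    return seen;
}

/* Runs one handshake and one command; returns the value the command saw,
 * or -1 if the handshake thread could not be started. */
static int demo(void)
{
    pthread_t t;
    int seen;

    if (pthread_create(&t, NULL, handshake_thread, NULL) != 0)
        return -1;
    seen = run_command_when_ready();
    pthread_join(t, NULL);
    return seen;
}
```

Because the command path waits on the condition variable until the handshake
side has finished its update, it never observes the shared state halfway
through a modification. In a test script the same effect would come from
polling for cluster readiness before sending the next command.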


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1246432
[Bug 1246432] ./tests/basic/volume-snapshot.t  spurious fail causing
glusterd crash.

