[Bugs] [Bug 1231617] New: Scrubber crash upon pause

bugzilla at redhat.com bugzilla at redhat.com
Mon Jun 15 05:50:29 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1231617

            Bug ID: 1231617
           Summary: Scrubber crash upon pause
           Product: GlusterFS
           Version: mainline
         Component: bitrot
          Assignee: bugs at gluster.org
          Reporter: vshankar at redhat.com
                CC: anekkunt at redhat.com, bugs at gluster.org,
                    ggarg at redhat.com, nsathyan at redhat.com,
                    rmekala at redhat.com
        Depends On: 1226666, 1226830
      Docs Contact: bugs at gluster.org



+++ This bug was initially created as a clone of Bug #1226830 +++

Description of problem:
Pausing the scrubber intermittently crashes the scrubber process.

Version-Release number of selected component (if applicable):
3.7.0

How reproducible:
Sometimes

Steps to Reproduce:
1. Create & start a Gluster volume
2. Enable bitrot on the volume
3. Pause the scrubber for this volume:

# gluster volume bitrot <vol> scrub pause

Actual results:
Scrubber process crashes at times

Expected results:
The scrubber process should keep running, but it should not scrub the
volume's filesystem while paused.

Backtrace (reported by anekkunt:
http://www.gluster.org/pipermail/gluster-devel/2015-June/045410.html)

(gdb) bt
#0  0x00007f89d6224731 in gf_tw_mod_timer_pending (base=0xf2fbc0, timer=0x0, expires=233889)
    at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/contrib/timer-wheel/timer-wheel.c:239
#1  0x00007f89c82ce7e8 in br_fsscan_reschedule (this=0x7f89c4008980, child=0x7f89c4011238, fsscan=0x7f89c4012290, fsscrub=0x7f89c4010010, pendingcheck=_gf_true)
    at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/features/bit-rot/src/bitd/bit-rot-scrub.c:703
#2  0x00007f89c82cc9d4 in reconfigure (this=0x7f89c4008980, options=0x7f89d3bc9558)
    at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/features/bit-rot/src/bitd/bit-rot.c:1673
#3  0x00007f89d62044cd in xlator_reconfigure_rec (old_xl=0x7f89c4008980, new_xl=0x7f89c409b460)
    at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/options.c:1084
#4  0x00007f89d6204414 in xlator_reconfigure_rec (old_xl=0x7f89c400a6c0, new_xl=0x7f89c409c500)
    at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/options.c:1070
#5  0x00007f89d62045df in xlator_tree_reconfigure (old_xl=0x7f89c400a6c0, new_xl=0x7f89c409c500)
    at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/options.c:1112
#6  0x00007f89d61ec7bd in glusterfs_graph_reconfigure (oldgraph=0x7f89c4001d30, newgraph=0x7f89c4098130)
    at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/graph.c:893
#7  0x00007f89d61ec629 in glusterfs_volfile_reconfigure (oldvollen=932, newvolfile_fp=0x7f89c4097eb0, ctx=0xefe010,

--- Additional comment from Venky Shankar on 2015-06-01 05:41:22 EDT ---

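The crash is caused by a race between CHILD_UP (where ->timer is initialized
for the subvolume) and reconfigure(), which accesses ->timer to reschedule
the scrub time. When reconfigure() runs before ->timer has been initialized,
the pointer is still NULL, which matches timer=0x0 in frame #0 of the
backtrace.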
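
To illustrate the race described above, here is a minimal, self-contained
sketch in C. It is not the GlusterFS code and not the change posted for
review: every name in it (sketch_timer, sketch_child, sketch_child_up,
sketch_reschedule, and so on) is hypothetical. It assumes one common way of
closing this kind of race, namely guarding the timer pointer with a
per-child mutex and making the reschedule path a no-op while the timer has
not been set up yet.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the real GlusterFS types. */
struct sketch_timer {
    unsigned long expires;
};

struct sketch_child {
    pthread_mutex_t lock;        /* protects ->timer */
    struct sketch_timer *timer;  /* NULL until the CHILD_UP path has run */
};

void sketch_child_init(struct sketch_child *child)
{
    assert(child != NULL);
    pthread_mutex_init(&child->lock, NULL);
    child->timer = NULL;
}

/* CHILD_UP path: create the timer (once) and arm it. Returns 0 on success. */
int sketch_child_up(struct sketch_child *child, unsigned long expires)
{
    int ret = 0;

    assert(child != NULL);
    pthread_mutex_lock(&child->lock);
    if (child->timer == NULL) {
        child->timer = calloc(1, sizeof(*child->timer));
        if (child->timer == NULL)
            ret = -1;
    }
    if (ret == 0)
        child->timer->expires = expires;
    pthread_mutex_unlock(&child->lock);
    return ret;
}

/*
 * reconfigure() path: reschedule only if the timer already exists.
 * Returns 0 if rescheduled, -1 if CHILD_UP has not run yet, so the
 * caller never dereferences a NULL timer.
 */
int sketch_reschedule(struct sketch_child *child, unsigned long expires)
{
    int ret = -1;

    assert(child != NULL);
    pthread_mutex_lock(&child->lock);
    if (child->timer != NULL) {
        child->timer->expires = expires;
        ret = 0;
    }
    pthread_mutex_unlock(&child->lock);
    return ret;
}

void sketch_child_fini(struct sketch_child *child)
{
    assert(child != NULL);
    pthread_mutex_lock(&child->lock);
    free(child->timer);
    child->timer = NULL;
    pthread_mutex_unlock(&child->lock);
    pthread_mutex_destroy(&child->lock);
}
```

In this sketch, calling sketch_reschedule() before sketch_child_up() returns
-1 instead of touching a NULL timer; after sketch_child_up() the same call
updates the expiry. Whether the actual fix takes this shape is not stated in
this report.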

--- Additional comment from Anand Avati on 2015-06-11 10:53:00 EDT ---

REVIEW: http://review.gluster.org/11147 (features/bitrot: cleanup, v1) posted
(#3) for review on master by Venky Shankar (vshankar at redhat.com)

--- Additional comment from Anand Avati on 2015-06-14 23:35:07 EDT ---

REVIEW: http://review.gluster.org/11147 (features/bitrot: cleanup, v1) posted
(#5) for review on master by Venky Shankar (vshankar at redhat.com)


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1226666
[Bug 1226666] BitRot :- Handle brick re-connection sanely in bitd/scrub process
https://bugzilla.redhat.com/show_bug.cgi?id=1226830
[Bug 1226830] Scrubber crash upon pause