[Bugs] [Bug 1232307] New: Scrubber crash upon pause
bugzilla at redhat.com
Tue Jun 16 13:08:17 UTC 2015
https://bugzilla.redhat.com/show_bug.cgi?id=1232307
Bug ID: 1232307
Summary: Scrubber crash upon pause
Product: Red Hat Gluster Storage
Version: 3.1
Component: glusterfs
Sub Component: bitrot
Assignee: rhs-bugs at redhat.com
Reporter: vshankar at redhat.com
QA Contact: rmekala at redhat.com
CC: anekkunt at redhat.com, bugs at gluster.org,
ggarg at redhat.com, nsathyan at redhat.com,
rmekala at redhat.com
Depends On: 1226666, 1226830, 1231617, 1231619
Group: redhat
+++ This bug was initially created as a clone of Bug #1231617 +++
+++ This bug was initially created as a clone of Bug #1226830 +++
Description of problem:
Pausing the scrubber occasionally causes the scrubber process to crash.
Version-Release number of selected component (if applicable):
3.7.0
How reproducible:
Sometimes
Steps to Reproduce:
1. Create & start a Gluster volume
2. Enable bitrot on the volume
3. Pause the scrubber for this volume as shown below:
# gluster volume bitrot <vol> scrub pause
Actual results:
Scrubber process crashes at times
Expected results:
The scrubber process should keep running (although it should not scrub the
filesystem for the volume)
Backtrace (reported by anekkunt:
http://www.gluster.org/pipermail/gluster-devel/2015-June/045410.html)
(gdb) bt
#0 0x00007f89d6224731 in gf_tw_mod_timer_pending (base=0xf2fbc0, timer=0x0, expires=233889)
    at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/contrib/timer-wheel/timer-wheel.c:239
#1 0x00007f89c82ce7e8 in br_fsscan_reschedule (this=0x7f89c4008980, child=0x7f89c4011238, fsscan=0x7f89c4012290, fsscrub=0x7f89c4010010, pendingcheck=_gf_true)
    at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/features/bit-rot/src/bitd/bit-rot-scrub.c:703
#2 0x00007f89c82cc9d4 in reconfigure (this=0x7f89c4008980, options=0x7f89d3bc9558)
    at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/xlators/features/bit-rot/src/bitd/bit-rot.c:1673
#3 0x00007f89d62044cd in xlator_reconfigure_rec (old_xl=0x7f89c4008980, new_xl=0x7f89c409b460)
    at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/options.c:1084
#4 0x00007f89d6204414 in xlator_reconfigure_rec (old_xl=0x7f89c400a6c0, new_xl=0x7f89c409c500)
    at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/options.c:1070
#5 0x00007f89d62045df in xlator_tree_reconfigure (old_xl=0x7f89c400a6c0, new_xl=0x7f89c409c500)
    at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/options.c:1112
#6 0x00007f89d61ec7bd in glusterfs_graph_reconfigure (oldgraph=0x7f89c4001d30, newgraph=0x7f89c4098130)
    at /home/jenkins/root/workspace/rackspace-regression-2GB-triggered/libglusterfs/src/graph.c:893
#7 0x00007f89d61ec629 in glusterfs_volfile_reconfigure (oldvollen=932, newvolfile_fp=0x7f89c4097eb0, ctx=0xefe010,
--- Additional comment from Venky Shankar on 2015-06-01 05:41:22 EDT ---
So, the crash is due to a race between CHILD_UP (where ->timer is initialized
for the subvolume) and reconfigure(), which accesses ->timer to reschedule
the scrub. If reconfigure() runs first, ->timer is still NULL (timer=0x0 in
frame #0 of the backtrace).
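For illustration only, the race described above can be modelled in a few lines
of C. This is a minimal sketch, not the change posted in the reviews for this
bug: every name in it (sketch_timer, sketch_child, sketch_child_up,
sketch_reschedule) is hypothetical and does not exist in the GlusterFS tree.
It assumes one plausible way to avoid the crash: take a per-child lock and
skip the reschedule while the timer pointer is still NULL.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a timer-wheel entry; not the real timer type. */
struct sketch_timer {
    unsigned long expires;
};

/* Hypothetical per-subvolume state; the real child structure differs. */
struct sketch_child {
    pthread_mutex_t lock;
    struct sketch_timer *timer;   /* NULL until the CHILD_UP path has run */
    struct sketch_timer storage;  /* backing storage for the timer */
};

void sketch_child_init(struct sketch_child *c)
{
    pthread_mutex_init(&c->lock, NULL);
    c->timer = NULL;
    c->storage.expires = 0;
}

/* Models the CHILD_UP path: this is where the timer becomes valid. */
void sketch_child_up(struct sketch_child *c, unsigned long expires)
{
    pthread_mutex_lock(&c->lock);
    c->storage.expires = expires;
    c->timer = &c->storage;
    pthread_mutex_unlock(&c->lock);
}

/*
 * Models the reconfigure() path. Returns true if the timer was rescheduled,
 * false if CHILD_UP has not run yet and there is nothing to reschedule.
 * Without the NULL check, this would dereference a NULL timer, which is the
 * situation frame #0 of the backtrace shows (timer=0x0).
 */
bool sketch_reschedule(struct sketch_child *c, unsigned long expires)
{
    bool rescheduled = false;

    pthread_mutex_lock(&c->lock);
    if (c->timer != NULL) {
        c->timer->expires = expires;
        rescheduled = true;
    }
    pthread_mutex_unlock(&c->lock);

    return rescheduled;
}
```

In this model, a reschedule that arrives before CHILD_UP is simply reported as
not done, so the caller can retry or let the CHILD_UP path pick up the new
schedule. Whether the real code should skip, defer, or wait is a design
decision this sketch does not settle.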
--- Additional comment from Anand Avati on 2015-06-11 10:53:00 EDT ---
REVIEW: http://review.gluster.org/11147 (features/bitrot: cleanup, v1) posted
(#3) for review on master by Venky Shankar (vshankar at redhat.com)
--- Additional comment from Anand Avati on 2015-06-14 23:35:07 EDT ---
REVIEW: http://review.gluster.org/11147 (features/bitrot: cleanup, v1) posted
(#5) for review on master by Venky Shankar (vshankar at redhat.com)
--- Additional comment from Anand Avati on 2015-06-15 01:53:31 EDT ---
REVIEW: http://review.gluster.org/11147 (features/bitrot: cleanup, v1) posted
(#6) for review on master by Venky Shankar (vshankar at redhat.com)
--- Additional comment from Anand Avati on 2015-06-15 23:53:56 EDT ---
REVIEW: http://review.gluster.org/11147 (features/bitrot: cleanup, v1) posted
(#7) for review on master by Venky Shankar (vshankar at redhat.com)
--- Additional comment from Anand Avati on 2015-06-16 02:35:46 EDT ---
REVIEW: http://review.gluster.org/11147 (features/bitrot: cleanup, v1) posted
(#8) for review on master by Venky Shankar (vshankar at redhat.com)
--- Additional comment from Anand Avati on 2015-06-16 04:38:27 EDT ---
REVIEW: http://review.gluster.org/11147 (features/bitrot: cleanup, v1) posted
(#9) for review on master by Venky Shankar (vshankar at redhat.com)
--- Additional comment from Anand Avati on 2015-06-16 05:42:46 EDT ---
REVIEW: http://review.gluster.org/11147 (features/bitrot: cleanup, v1) posted
(#10) for review on master by Venky Shankar (vshankar at redhat.com)
--- Additional comment from Anand Avati on 2015-06-16 05:42:53 EDT ---
REVIEW: http://review.gluster.org/11248 (tests/bitrot: remove induced delay)
posted (#1) for review on master by Venky Shankar (vshankar at redhat.com)
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1226666
[Bug 1226666] BitRot :- Handle brick re-connection sanely in bitd/scrub
process
https://bugzilla.redhat.com/show_bug.cgi?id=1226830
[Bug 1226830] Scrubber crash upon pause
https://bugzilla.redhat.com/show_bug.cgi?id=1231617
[Bug 1231617] Scrubber crash upon pause
https://bugzilla.redhat.com/show_bug.cgi?id=1231619
[Bug 1231619] BitRot :- Handle brick re-connection sanely in bitd/scrub
process