[Bugs] [Bug 1226792] New: Statfs is hung because of frame loss in quota
bugzilla at redhat.com
Mon Jun 1 06:31:35 UTC 2015
https://bugzilla.redhat.com/show_bug.cgi?id=1226792
Bug ID: 1226792
Summary: Statfs is hung because of frame loss in quota
Product: GlusterFS
Version: 3.7.0
Component: quota
Keywords: Triaged
Severity: medium
Priority: medium
Assignee: bugs at gluster.org
Reporter: vmallika at redhat.com
CC: bugs at gluster.org, gluster-bugs at redhat.com,
jdarcy at redhat.com, nbalacha at redhat.com,
rgowdapp at redhat.com, vmallika at redhat.com
Depends On: 1178619
Blocks: 1186580 (qe_tracker_everglades), 1226162
+++ This bug was initially created as a clone of Bug #1178619 +++
Description of problem:
The rebalance process hangs in the statfs call of quota and fails after the timeout
###################################################################
1. Created a 6x2 dist-rep volume
2. Ran the ACA script, which does deep directory creation and renaming of
directories and files
3. While the script was running, did add-brick and rebalance
Result:
Rebalance hangs for 1800 seconds, which is the call-bail timeout, and then
runs to completion
statedump:
--------------
[global.callpool.stack.1.frame.1]
ref_count=1
translator=test-server
complete=0
[global.callpool.stack.1.frame.2]
ref_count=0
translator=test-quota
complete=0
parent=/brick2/test7
wind_from=io_stats_statfs
wind_to=FIRST_CHILD(this)->fops->statfs
unwind_to=io_stats_statfs_cbk
[global.callpool.stack.1.frame.3]
ref_count=1
translator=/brick2/test7
complete=0
parent=test-server
wind_from=server_statfs_resume
wind_to=bound_xl->fops->statfs
unwind_to=server_statfs_cbk
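
To read a dump like the one above more quickly, the frames can be parsed and
checked for where the call stack stops. The sketch below is only an
illustration: the helper names (parse_statedump, stalled_translators) are
hypothetical, and the rule "a pending frame whose translator is not the parent
of any other frame" is an assumed heuristic, not a GlusterFS tool or API.

```python
SAMPLE = """\
[global.callpool.stack.1.frame.1]
ref_count=1
translator=test-server
complete=0
[global.callpool.stack.1.frame.2]
ref_count=0
translator=test-quota
complete=0
parent=/brick2/test7
[global.callpool.stack.1.frame.3]
ref_count=1
translator=/brick2/test7
complete=0
parent=test-server
"""


def parse_statedump(text):
    """Return {section_name: {key: value}} for a statedump-style excerpt."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            current = sections.setdefault(line[1:-1], {})
        elif "=" in line and current is not None:
            key, _, value = line.partition("=")
            current[key] = value
    return sections


def stalled_translators(sections):
    """Assumed heuristic: pending frames that no other frame names as parent."""
    frames = [f for name, f in sections.items() if ".frame." in name]
    parents = {f["parent"] for f in frames if "parent" in f}
    return [
        f["translator"]
        for f in frames
        if f.get("complete") == "0" and f.get("translator") not in parents
    ]


print(stalled_translators(parse_statedump(SAMPLE)))
```

On this excerpt the only pending frame that has not wound to a child is the
one in test-quota, which matches the analysis in this report.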
From rebalance logs
===========
[2015-01-03 14:49:59.065353] E [rpc-clnt.c:201:call_bail]
0-test-client-1: bailing out frame type(GlusterFS 3.3) op(STATFS(14)) xid =
0x794 sent = 2015-01-03 14:19:58.397959. timeout = 1800 for
10.70.44.70:49152
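
As a quick check that the client waited for the full call-bail timeout, the
two timestamps in the log line above can be subtracted. This is a small
sketch using only the values quoted in that line; the variable names are
illustrative.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Values copied from the call_bail log line above.
logged = datetime.strptime("2015-01-03 14:49:59.065353", FMT)
sent = datetime.strptime("2015-01-03 14:19:58.397959", FMT)
timeout = 1800  # seconds, as reported in the log line

elapsed = (logged - sent).total_seconds()
print(f"elapsed={elapsed:.3f}s timeout={timeout}s bailed={elapsed >= timeout}")
```

The elapsed time is just over 1800 seconds, so the STATFS request was
outstanding for the whole timeout before the frame was bailed out.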
Version-Release number of selected component (if applicable):
How reproducible:
When building ancestry fails, it results in frame loss because the error is
not handled properly. We saw an error log in the brick process which said open
failed on the same gfid (on which statfs was issued). This open most likely
would've been issued as part of the ancestry building code in quota.
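
The failure mode described above can be sketched in a few lines. This is a
simplified model and not GlusterFS code or the patch under review: the names
build_ancestry, statfs_buggy, statfs_fixed and the unwind callback are all
hypothetical, and the ENOENT error is an assumed stand-in for the failed open.

```python
import errno


def build_ancestry(gfid, on_done, fail=False):
    """Hypothetical async step: reports its result via on_done(op_ret, op_errno)."""
    if fail:
        on_done(-1, errno.ENOENT)
    else:
        on_done(0, 0)


def statfs_buggy(gfid, unwind, fail=False):
    def ancestry_done(op_ret, op_errno):
        if op_ret < 0:
            # Error is dropped: unwind is never called, so the caller's
            # frame stays pending until the call-bail timeout.
            return
        unwind(0, 0)

    build_ancestry(gfid, ancestry_done, fail)


def statfs_fixed(gfid, unwind, fail=False):
    def ancestry_done(op_ret, op_errno):
        if op_ret < 0:
            # Propagate the error so the caller always gets a reply.
            unwind(-1, op_errno)
            return
        unwind(0, 0)

    build_ancestry(gfid, ancestry_done, fail)


replies = []
statfs_buggy("gfid-1", lambda r, e: replies.append(("buggy", r, e)), fail=True)
statfs_fixed("gfid-1", lambda r, e: replies.append(("fixed", r, e)), fail=True)
print(replies)
```

Only the fixed variant produces a reply when ancestry building fails; the
buggy variant leaves the caller waiting, which is the frame loss seen here.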
Steps to Reproduce:
1.
2.
3.
Actual results:
Expected results:
Additional info:
--- Additional comment from Anand Avati on 2015-01-05 01:52:17 EST ---
REVIEW: http://review.gluster.org/9380 (features/quota: prevent statfs
frame-loss when an error happens during ancestry building.) posted (#4) for
review on master by Raghavendra G (rgowdapp at redhat.com)
--- Additional comment from Anand Avati on 2015-04-30 03:42:31 EDT ---
REVIEW: http://review.gluster.org/9380 (features/quota: prevent statfs
frame-loss when an error happens during ancestry building.) posted (#5) for
review on master by Vijaikumar Mallikarjuna (vmallika at redhat.com)
--- Additional comment from Niels de Vos on 2015-05-22 06:21:36 EDT ---
I've dropped this bug from the glusterfs-3.7.1 tracker. Please clone this bug
and have the clone depend on 1178619 (this bug) and block "glusterfs-3.7.1".
--- Additional comment from Anand Avati on 2015-05-28 00:23:31 EDT ---
REVIEW: http://review.gluster.org/9380 (features/quota: prevent statfs
frame-loss when an error happens during ancestry building.) posted (#6) for
review on master by Raghavendra G (rgowdapp at redhat.com)
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1178619
[Bug 1178619] Statfs is hung because of frame loss in quota
https://bugzilla.redhat.com/show_bug.cgi?id=1186580
[Bug 1186580] QE tracker bug for Everglades
https://bugzilla.redhat.com/show_bug.cgi?id=1226162
[Bug 1226162] Statfs is hung because of frame loss in quota
--
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.