[Bugs] [Bug 1399015] performance.read-ahead on results in processes on client stuck in IO wait
bugzilla at redhat.com
Tue Dec 13 09:45:54 UTC 2016
https://bugzilla.redhat.com/show_bug.cgi?id=1399015
--- Comment #3 from Worker Ant <bugzilla-bot at gluster.org> ---
COMMIT: http://review.gluster.org/15933 committed in release-3.9 by Pranith
Kumar Karampuri (pkarampu at redhat.com)
------
commit 7e35f4c1a8bc5db145aba54b88daf16611de803b
Author: Poornima G <pgurusid at redhat.com>
Date: Mon Nov 21 19:57:08 2016 +0530
libglusterfs: Fix a read hang
Backport of http://review.gluster.org/15923
Issue:
=====
In certain cases, the read-ahead xlator never unwound a read,
resulting in a hang.
RCA:
====
In certain cases, ioc_readv() issues STACK_WIND_TAIL() instead
of STACK_WIND(). One such case is when inode_ctx for that file
is not present (this can happen if readdirp was called and populated
md-cache, which then serves all the lookups from cache).
Consider the following graph:
...
io-cache (parent)
|
readdir-ahead
|
read-ahead
...
Below is the code snippet of ioc_readv calling STACK_WIND_TAIL:
ioc_readv()
{
    ...
    if (!inode_ctx)
        STACK_WIND_TAIL (frame, FIRST_CHILD (frame->this),
                         FIRST_CHILD (frame->this)->fops->readv, fd,
                         size, offset, flags, xdata);
        /* Ideally, this stack_wind should wind to readdir-ahead:readv()
           but it winds to read-ahead:readv(). See below for
           explanation.
        */
    ...
}
STACK_WIND_TAIL (frame, obj, fn, ...)
{
    frame->this = obj;
    /* for the above mentioned graph, frame->this will be readdir-ahead:
     * frame->this = FIRST_CHILD (frame->this) i.e. readdir-ahead, which
     * is as expected
     */
    ...
    THIS = obj;
    /* THIS will be read-ahead instead of readdir-ahead, as obj expands
     * to "FIRST_CHILD (frame->this)" and frame->this was already set
     * to readdir-ahead by the previous statement.
     */
    ...
    fn (frame, obj, params);
    /* fn will call read-ahead:readv() instead of readdir-ahead:readv(),
     * as fn expands to "FIRST_CHILD (frame->this)->fops->readv" and
     * frame->this was already set to readdir-ahead by the first statement
     */
    ...
}
Thus, readdir-ahead's readv() implementation is skipped, and
ra_readv() is called with frame->this = "readdir-ahead" and
this = "read-ahead". This can lead to corruption, hangs, or other
problems. In this particular case, because the 'frame->this' and 'this'
passed to ra_readv() don't match, ra_readv() ends up calling ra_readv()
again. Thus the logic of read-ahead readv() falls apart and leads to
the hang.
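
The mis-wind described above can be reproduced outside of GlusterFS with
a small, self-contained sketch. Everything here is a hypothetical
stand-in and not the real libglusterfs code: xl_t and frame_t are
stripped-down substitutes for xlator_t and call_frame_t, readv hangs
directly off the translator instead of off fops, and WIND_TAIL_BUGGY
only mimics the shape of the old macro (it expands obj and fn after
frame->this has already been overwritten).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct frame;

/* Hypothetical, stripped-down stand-in for xlator_t. */
typedef struct xl {
    const char *name;
    struct xl  *child;
    void      (*readv)(struct frame *frame, struct xl *this);
} xl_t;

/* Hypothetical stand-in for call_frame_t, plus two fields that record
 * what happened during the wind. */
typedef struct frame {
    xl_t       *this;
    const char *ran;       /* which readv implementation actually ran */
    const char *this_arg;  /* name of the 'this' it was handed */
} frame_t;

#define FIRST_CHILD(xl) ((xl)->child)

static void rda_readv(frame_t *frame, xl_t *this)
{
    frame->ran = "readdir-ahead";
    frame->this_arg = this->name;
}

static void ra_readv(frame_t *frame, xl_t *this)
{
    frame->ran = "read-ahead";
    frame->this_arg = this->name;
}

/* Same shape as the old macro: obj and fn are textually expanded again
 * after frame->this has been reassigned. */
#define WIND_TAIL_BUGGY(frame, obj, fn) \
    do {                                 \
        (frame)->this = obj;             \
        fn((frame), obj);                \
    } while (0)

/* The graph from the commit message: io-cache -> readdir-ahead -> read-ahead. */
static xl_t ra  = { "read-ahead",    NULL, ra_readv  };
static xl_t rda = { "readdir-ahead", &ra,  rda_readv };
static xl_t ioc = { "io-cache",      &rda, NULL      };

/* Wind from io-cache the way ioc_readv() does when inode_ctx is missing. */
static void wind_from_io_cache_buggy(frame_t *frame)
{
    assert(frame != NULL);
    frame->this = &ioc;
    WIND_TAIL_BUGGY(frame, FIRST_CHILD(frame->this),
                    FIRST_CHILD(frame->this)->readv);
}

/* Helper: did the wind end up in the named translator's readv? */
static int wound_to(const frame_t *frame, const char *name)
{
    return strcmp(frame->ran, name) == 0;
}
```

After wind_from_io_cache_buggy() returns, frame->this points at
readdir-ahead, but the readv that ran is read-ahead's and it was handed
read-ahead as 'this', which is the mismatch the RCA describes.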
Solution:
=========
Modify STACK_WIND_TAIL() as:
STACK_WIND_TAIL (frame, obj, fn, ...)
{
    next_xl = obj;    /* resolve obj, as the variables passed in the obj
                         macro can be overwritten in the further
                         instructions */
    next_xl_fn = fn;  /* resolve fn and store it in a tmp variable, before
                         modifying any variables */
    frame->this = next_xl;
    ...
    THIS = next_xl;
    ...
    next_xl_fn (frame, next_xl, params);
    ...
}
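
As a rough illustration of why resolving obj and fn first is enough, the
same toy graph can be wound with a macro that copies both arguments into
locals before touching the frame. As before, xl_t, frame_t and
WIND_TAIL_FIXED are hypothetical stand-ins chosen for this sketch; the
real change lives in the STACK_WIND_TAIL definition in libglusterfs and
may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct frame;

/* Hypothetical, stripped-down stand-in for xlator_t. */
typedef struct xl {
    const char *name;
    struct xl  *child;
    void      (*readv)(struct frame *frame, struct xl *this);
} xl_t;

/* Hypothetical stand-in for call_frame_t, plus two recording fields. */
typedef struct frame {
    xl_t       *this;
    const char *ran;       /* which readv implementation actually ran */
    const char *this_arg;  /* name of the 'this' it was handed */
} frame_t;

#define FIRST_CHILD(xl) ((xl)->child)

static void rda_readv(frame_t *frame, xl_t *this)
{
    frame->ran = "readdir-ahead";
    frame->this_arg = this->name;
}

static void ra_readv(frame_t *frame, xl_t *this)
{
    frame->ran = "read-ahead";
    frame->this_arg = this->name;
}

/* obj and fn are each evaluated exactly once, into locals, before
 * frame->this is modified; later statements use only the locals. */
#define WIND_TAIL_FIXED(frame, obj, fn)                   \
    do {                                                   \
        xl_t *next_xl = obj;                               \
        void (*next_xl_fn)(frame_t *, xl_t *) = fn;        \
        (frame)->this = next_xl;                           \
        next_xl_fn((frame), next_xl);                      \
    } while (0)

/* The graph from the commit message: io-cache -> readdir-ahead -> read-ahead. */
static xl_t ra  = { "read-ahead",    NULL, ra_readv  };
static xl_t rda = { "readdir-ahead", &ra,  rda_readv };
static xl_t ioc = { "io-cache",      &rda, NULL      };

/* Same call site as the io-cache case, now using the fixed macro. */
static void wind_from_io_cache_fixed(frame_t *frame)
{
    assert(frame != NULL);
    frame->this = &ioc;
    WIND_TAIL_FIXED(frame, FIRST_CHILD(frame->this),
                    FIRST_CHILD(frame->this)->readv);
}

/* Helper: did the wind end up in the named translator's readv? */
static int wound_to(const frame_t *frame, const char *name)
{
    return strcmp(frame->ran, name) == 0;
}
```

With the locals in place, frame->this, the readv that runs, and the
'this' it receives all agree on readdir-ahead.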
>Reviewed-on: http://review.gluster.org/15923
>Smoke: Gluster Build System <jenkins at build.gluster.org>
>NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
>Reviewed-by: Rajesh Joseph <rjoseph at redhat.com>
>CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
>Reviewed-by: Raghavendra G <rgowdapp at redhat.com>
(Cherry picked from commit 8943c19a2ef51b6e4fa66cb57211d469fe558579)
BUG: 1399015
Change-Id: Ie662ac8f18fa16909376f1e59387bc5b886bd0f9
Signed-off-by: Poornima G <pgurusid at redhat.com>
Reviewed-on: http://review.gluster.org/15933
NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
Smoke: Gluster Build System <jenkins at build.gluster.org>
CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu at redhat.com>