[Bugs] [Bug 1349097] New: SMB: fix for "while running I/O on cifs mount and doing graph switch causes cifs mount to hang" causes regression
bugzilla at redhat.com
Wed Jun 22 17:00:18 UTC 2016
https://bugzilla.redhat.com/show_bug.cgi?id=1349097
Bug ID: 1349097
Summary: SMB: fix for "while running I/O on cifs mount and doing graph switch causes cifs mount to hang" causes regression
Product: GlusterFS
Version: 3.8.0
Component: gluster-smb
Severity: high
Assignee: bugs at gluster.org
Reporter: joe at julianfamily.org
CC: asrivast at redhat.com, bugs at gluster.org,
nlevinki at redhat.com, pgurusid at redhat.com,
rhinduja at redhat.com, rjoseph at redhat.com,
sbhaloth at redhat.com, vdas at redhat.com
glfs_io_async_cbk is called with a NULL iovec from several places, but this patch tests iovec for validity:

    GF_VALIDATE_OR_GOTO ("gfapi", iovec, inval);

Those call paths legitimately pass a NULL iovec, so the glfs_io_async_cbk calls from glfs_pwritev_async_cbk, glfs_fsync_async_cbk, glfs_ftruncate_async_cbk, glfs_discard_async_cbk, and glfs_zerofill_async_cbk all fail this validation.
Further, the return code from glfs_io_async_cbk is never checked, so the memory that would normally be freed in that function is leaked.
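A minimal, self-contained model of the combined failure mode (illustrative C, not the gfapi source: io_async_cbk and its arguments are simplified stand-ins for glfs_io_async_cbk, and the GF_VALIDATE_OR_GOTO here is a local macro reproducing only the NULL-check-and-goto behaviour of the real one):

    #include <errno.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/uio.h>

    /* Stand-in for the GlusterFS macro: on a NULL arg, set errno to
     * EINVAL and jump to the given label. */
    #define GF_VALIDATE_OR_GOTO(name, arg, label) \
            do { if (!(arg)) { errno = EINVAL; goto label; } } while (0)

    /* Abridged model of glfs_io_async_cbk; "state" models the per-call
     * allocations (gio, frame, ...) the real function must release. */
    static int
    io_async_cbk (struct iovec *iovec, int count, void *state)
    {
            int ret = -1;

            /* The check added by the patch: fsync/ftruncate/discard/
             * zerofill/pwritev completions pass iovec == NULL, so they
             * all jump straight to "inval". */
            GF_VALIDATE_OR_GOTO ("gfapi", iovec, inval);

            /* normal path: deliver the result, then release state */
            free (state);
            ret = 0;
    inval:
            return ret;     /* error path: "state" is never freed, and
                             * no caller checks this return code */
    }

    int
    main (void)
    {
            void *state = malloc (64);

            /* e.g. an fsync completion: no data, hence no iovec */
            int ret = io_async_cbk (NULL, 0, state);

            printf ("cbk returned %d, errno %d; state leaked\n",
                    ret, errno);
            return 0;
    }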
+++ This bug was initially created as a clone of Bug #1333266 +++
REVIEW: http://review.gluster.org/14221 (gfapi: Fix a deadlock caused by graph
switch while aio in progress) posted (#1) for review on release-3.8 by Poornima
G (pgurusid at redhat.com)
--- Additional comment from Vijay Bellur on 2016-05-06 10:23:54 EDT ---
COMMIT: http://review.gluster.org/14221 committed in release-3.8 by Niels de
Vos (ndevos at redhat.com)
------
commit 938dfb9d021db20dc3b511b78ec8c137b8ff3e7c
Author: Poornima G <pgurusid at redhat.com>
Date: Fri Apr 29 12:24:24 2016 -0400
gfapi: Fix a deadlock caused by graph switch while aio in progress
RCA:
Currently the async nature is achieved by submitting a syncop operation to the synctask threads. Consider a scenario where a graph switch is triggered: the next write fop checks for the next available graph, sets fs->migration_in_progress, and triggers the migration of fds and other things, which can involve a syncop_lookup operation. While this fop (on a synctask thread) is waiting for syncop_lookup to return, let's say another 17 async write calls are submitted. All of these writes block waiting for fs->migration_in_progress to be unset, so all 16 synctask threads end up blocked on it. When syncop_lookup returns, there is no synctask thread left to process lookup_cbk; and as long as the lookup does not complete, fs->migration_in_progress cannot be unset by the first fop. Thus, a deadlock.
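The same exhaustion pattern, stripped of gfapi details, can be sketched with plain pthreads (an illustration of the deadlock shape, not GlusterFS code; the pool size and names are made up):

    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>

    #define POOL_SIZE 4     /* stands in for the 16 synctask threads */

    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
    static int migration_in_progress = 1;   /* set by the "first fop" */

    /* A pending "write fop": parks until migration finishes. */
    static void *
    write_fop (void *arg)
    {
            (void) arg;
            pthread_mutex_lock (&lock);
            while (migration_in_progress)
                    pthread_cond_wait (&cond, &lock);
            pthread_mutex_unlock (&lock);
            return NULL;
    }

    /* The "lookup_cbk" that would clear the flag.  It never runs:
     * every pool thread is already stuck inside write_fop, so there
     * is nothing left to schedule it on. */
    static void *
    lookup_cbk (void *arg)
    {
            (void) arg;
            pthread_mutex_lock (&lock);
            migration_in_progress = 0;
            pthread_cond_broadcast (&cond);
            pthread_mutex_unlock (&lock);
            return NULL;
    }

    int
    main (void)
    {
            pthread_t pool[POOL_SIZE];
            int i;

            (void) lookup_cbk;      /* queued, but never scheduled */

            /* The pending writes consume the entire pool ... */
            for (i = 0; i < POOL_SIZE; i++)
                    pthread_create (&pool[i], NULL, write_fop, NULL);

            /* ... so the process is now wedged; report and bail out
             * instead of hanging forever like the real mount did. */
            sleep (2);
            printf ("deadlocked: %d threads blocked, lookup_cbk "
                    "never ran\n", POOL_SIZE);
            return 0;
    }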
To fix this deadlock, change all the async APIs to use STACK_WIND instead of synctask to achieve the async nature. glfs_preadv_async is already implemented using STACK_WIND; now all the other async APIs are changed to do the same.

This patch as such will not reduce the performance of async IO; the only thing that changes is that, in the case of write, the buf passed by the application is copied onto an iobuf in the calling thread, whereas before it was copied in a synctask thread.

Since the syncop + graph switch logic (a lock across fops) is not a good candidate for synctask, the async APIs are changed to use STACK_WIND.
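The shape of that change, reduced to a toy model (plain C, not the actual patch; pwritev_async, transport, and io_cbk below are made-up stand-ins): instead of handing a blocking syncop to a pool thread, the request is dispatched immediately and the reply path invokes the callback, so no pool thread is held while the I/O is in flight.

    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    typedef void (*io_cbk) (int op_ret, void *data);

    struct req {
            io_cbk  fn;
            void   *data;
            char   *buf;    /* private copy of the caller's buffer */
    };

    /* Models the wound fop completing elsewhere and the reply
     * invoking the callback -- no worker pool is involved. */
    static void *
    transport (void *arg)
    {
            struct req *r = arg;

            usleep (1000);          /* the I/O happens here */
            r->fn (0, r->data);     /* reply -> callback    */
            free (r->buf);
            free (r);
            return NULL;
    }

    /* "After" model of glfs_pwritev_async: copy the caller's buffer
     * in the calling thread (as the commit message notes), dispatch
     * the request, and return at once. */
    static int
    pwritev_async (const char *buf, size_t len, io_cbk fn, void *data)
    {
            struct req *r = calloc (1, sizeof (*r));
            pthread_t   t;

            if (!r)
                    return -1;
            r->fn   = fn;
            r->data = data;
            r->buf  = malloc (len);
            if (!r->buf) {
                    free (r);
                    return -1;
            }
            memcpy (r->buf, buf, len);

            pthread_create (&t, NULL, transport, r);
            pthread_detach (t);
            return 0;
    }

    static void
    write_done (int op_ret, void *data)
    {
            printf ("write completed: ret=%d tag=%s\n",
                    op_ret, (char *) data);
    }

    int
    main (void)
    {
            pwritev_async ("payload", 7, write_done, "tag1");
            sleep (1);              /* give the callback time to fire */
            return 0;
    }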
Backport of http://review.gluster.org/#/c/14148/
Change-Id: Idf665cae0a8e27697fbfc5ec8d93a6d6bae3a4f1
BUG: 1333266
Signed-off-by: Poornima G <pgurusid at redhat.com>
Reviewed-on: http://review.gluster.org/14221
Smoke: Gluster Build System <jenkins at build.gluster.com>
NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
CentOS-regression: Gluster Build System <jenkins at build.gluster.com>
Reviewed-by: Raghavendra Talur <rtalur at redhat.com>
Reviewed-by: Rajesh Joseph <rjoseph at redhat.com>
Reviewed-by: Niels de Vos <ndevos at redhat.com>
--- Additional comment from Niels de Vos on 2016-06-16 10:05:31 EDT ---
This bug is getting closed because a release has been made available that
should address the reported issue. In case the problem is still not fixed with
glusterfs-3.8.0, please open a new bug report.
glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.
[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
--
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.