[Bugs] [Bug 1505221] New: glusterfs client crash when removing directories
bugzilla at redhat.com
Mon Oct 23 04:32:22 UTC 2017
https://bugzilla.redhat.com/show_bug.cgi?id=1505221
Bug ID: 1505221
Summary: glusterfs client crash when removing directories
Product: GlusterFS
Version: 3.12
Component: distribute
Assignee: bugs at gluster.org
Reporter: nbalacha at redhat.com
CC: bugs at gluster.org, nbalacha at redhat.com,
zhhuan at gmail.com
Depends On: 1490642
+++ This bug was initially created as a clone of Bug #1490642 +++
Description of problem:
The glusterfs client crashes when directories are removed in parallel.
The issue was found by LTP test case inode02. The volume must be configured
with more than one brick. It is much easier to reproduce with the commit
"event/epoll: Add back socket for polling of events immediately after reading
the entire rpc message from the wire" applied.
Version-Release number of selected component (if applicable):
mainline
How reproducible:
On some test machines it crashes every time; on others it never crashes.
Steps to Reproduce:
1. Create a glusterfs volume with more than one brick
2. FUSE-mount the volume
3. Run LTP test case inode02
Actual results:
The glusterfs client crashes when removing directories, and the test fails.
Expected results:
test finishes without error.
Additional info:
--- Additional comment from Nithya Balachandran on 2017-09-20 10:19:47 EDT ---
Can you please provide the coredump and rpm versions?
--- Additional comment from Zhang Huan on 2017-09-21 22:14:17 EDT ---
I have a fix for this issue. Since I could not log in to review.gluster.org,
the link to it is below, FYI.
https://github.com/zhanghuan/glusterfs-1/commit/cd383bc1f49975fae769bed1cbd67e3b0a309819
--- Additional comment from Nithya Balachandran on 2017-09-22 00:29:55 EDT ---
Thank you Zhang for finding the BZ and the fix.
Are you able to log in to review.gluster.org now? It would be great if you can
submit the patch there.
Regards,
Nithya
--- Additional comment from Zhang Huan on 2017-09-22 00:40:42 EDT ---
No, "signed in with GitHub" still gives me a "forbidden" result. It has been
like this for a while.
I saw your comment on my patch. It is good advice; I will modify the patch
accordingly and resend it after testing.
Thank you for your reply.
--- Additional comment from Nithya Balachandran on 2017-09-22 01:58:17 EDT ---
(In reply to Zhang Huan from comment #4)
> No, "signed in with GitHub" still gives me a "forbidden" result. It has been
> like this for a while.
>
You can ask for help on this by logging into the #gluster channel in IRC. Ask
for nigelb.
--- Additional comment from Nithya Balachandran on 2017-10-11 04:12:21 EDT ---
Hi Zhang,
Please file a bug for the issue where you cannot log into review.gluster.org.
Please use component project-infrastructure.
Thanks,
Nithya
--- Additional comment from Nithya Balachandran on 2017-10-11 04:16:22 EDT ---
(In reply to Nithya Balachandran from comment #6)
> Hi Zhang,
>
> Please file a bug for the issue where you cannot log into
> review.gluster.org. Please use component project-infrastructure.
>
> Thanks,
> Nithya
Please ignore this - I just realised there is a BZ already.
--- Additional comment from Zhang Huan on 2017-10-12 02:03:15 EDT ---
The login issue has been fixed. Related link is
https://bugzilla.redhat.com/show_bug.cgi?id=1494363
I will continue to post the patch to review.gluster.org for review.
--- Additional comment from Worker Ant on 2017-10-13 01:53:31 EDT ---
REVIEW: https://review.gluster.org/18517 (cluster/dht: fix crash when deleting
directories) posted (#1) for review on master by Zhang Huan
(zhanghuan at open-fs.com)
--- Additional comment from Worker Ant on 2017-10-16 06:33:17 EDT ---
COMMIT: https://review.gluster.org/18517 committed in master by Raghavendra G
(rgowdapp at redhat.com)
------
commit 206120126d455417a81a48ae473d49be337e9463
Author: Zhang Huan <zhanghuan at open-fs.com>
Date: Tue Sep 5 11:36:25 2017 +0800
cluster/dht: fix crash when deleting directories
In DHT, after locks on all subvolumes are acquired, rmdir performs the
following steps sequentially:
1. send rmdir to all subvolumes except the hashed one, in a loop;
2. wait for all pending rmdir calls to complete;
3. remove the directory on the hashed subvolume.
The problem is that the loop in step 1 reads shared context data to decide
whether to skip the hashed subvolume. If the last subvolume in the loop is
the hashed one, and step 3 completes before the loop reaches that check,
the check accesses shared context data that step 3 has already destroyed,
which causes a crash.
Fix by saving the shared data in a local variable before the loop and
using that copy inside the loop.
Change-Id: I8db7cf7cb262d74efcb58eb00f02ea37df4be4e2
BUG: 1490642
Signed-off-by: Zhang Huan <zhanghuan at open-fs.com>
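The race described in the commit message can be sketched in a small,
single-threaded C model. This is only an illustration under stated
assumptions, not the actual DHT code or the committed patch: the names
rmdir_ctx, send_rmdir, rmdir_done, rmdir_hashed and rmdir_all are all
hypothetical, subvolumes are modelled as plain indices, and a "fast reply" is
modelled by invoking the completion callback synchronously. The point it shows
is that once the last non-hashed rmdir completes, the shared context may
already be freed, so the loop must only consult a copy taken beforehand.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_SUBVOLS 8

/* Hypothetical stand-in for the shared per-call context. */
struct rmdir_ctx {
    int hashed;   /* index of the hashed subvolume */
    int pending;  /* outstanding rmdir calls on non-hashed subvolumes */
};

static int removed[MAX_SUBVOLS];  /* order in which subvolumes saw rmdir */
static int removed_cnt;
static int ctx_freed;             /* set once the shared context is destroyed */

/* Step 3: rmdir on the hashed subvolume, then destroy the shared context. */
static void rmdir_hashed(struct rmdir_ctx *ctx)
{
    removed[removed_cnt++] = ctx->hashed;
    free(ctx);
    ctx_freed = 1;
}

/* Step 2: completion of one non-hashed rmdir; the last one triggers step 3. */
static void rmdir_done(struct rmdir_ctx *ctx)
{
    if (--ctx->pending == 0)
        rmdir_hashed(ctx);
}

/* Step 1 stub: completes immediately to model a very fast reply. */
static void send_rmdir(struct rmdir_ctx *ctx, int subvol)
{
    removed[removed_cnt++] = subvol;
    rmdir_done(ctx);
}

static void rmdir_all(int subvol_cnt, int hashed)
{
    assert(subvol_cnt >= 1 && subvol_cnt <= MAX_SUBVOLS);

    struct rmdir_ctx *ctx = malloc(sizeof(*ctx));
    if (ctx == NULL)
        abort();
    ctx->hashed = hashed;
    ctx->pending = subvol_cnt - 1;
    removed_cnt = 0;
    ctx_freed = 0;

    /* The fix: copy the shared field while ctx is still guaranteed valid. */
    int saved_hashed = ctx->hashed;

    if (ctx->pending == 0) {      /* single subvolume: nothing to wait for */
        rmdir_hashed(ctx);
        return;
    }

    for (int i = 0; i < subvol_cnt; i++) {
        /*
         * Buggy form: "if (i == ctx->hashed) continue;"
         * When the hashed subvolume is last, ctx has already been freed
         * by the time this check runs, so that read is a use-after-free.
         */
        if (i == saved_hashed)
            continue;
        send_rmdir(ctx, i);
    }
}
```

In this model, calling rmdir_all(3, 2) places the hashed subvolume last, which
is the ordering the commit message identifies as the crashing case. The loop
bound is a plain argument rather than a field of the context for the same
reason: nothing evaluated after the final send_rmdir may touch ctx.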
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1490642
[Bug 1490642] glusterfs client crash when removing directories
--
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.