[Gluster-devel] Wrong behavior on fsync of md-cache ?

Mon Nov 24 19:19:03 UTC 2014

On 24.11.2014 18:53, Raghavendra Gowdappa wrote: 

> ----- Original Message -----
>
>> From: "Xavier Hernandez" <xhernandez at datalab.es [1]>
>> To: "Gluster Devel" <gluster-devel at gluster.org [2]>, "Raghavendra Gowdappa" <rgowdapp at redhat.com [3]>
>> Cc: "Emmanuel Dreyfus" <manu at netbsd.org [4]>
>> Sent: Monday, November 24, 2014 11:05:57 PM
>> Subject: Wrong behavior on fsync of md-cache ?
>>
>> Hi,
>>
>> I have an issue in ec caused by what seems an incorrect behavior in
>> md-cache, at least in NetBSD (on Linux this doesn't seem to happen).
>>
>> The problem happens when multiple writes are sent in parallel and one of
>> them fails with an error. After the error, an fsync is issued, before all
>> pending writes are completed. The problem is that this fsync request is
>> not propagated through the xlator stack: md-cache automatically answers
>> it with the same error code returned by the last write, but it does not
>> wait for all pending writes to finish.
> 
> Are you sure that fsync is short-circuited in md-cache? Looking at
> mdc_fsync I can see that fsync is wound down the xlator stack
> unconditionally.

Well, I didn't look at the code. I assumed that, since it worked after
disabling md-cache (performance.stat-prefetch off), the problem was there.
Sorry.

> write-behind flushes all pending writes before fsync is wound down the
> xlator stack.

I think the problem is here: the first thing wb_fsync() checks is whether
there's an error stored in the fd (wb_fd_err()). If that's the case, the
call is immediately unwound with that error, without waiting for the writes
that are still pending. The error seems to be set in wb_fulfill_cbk(). I
don't know the internals of the write-behind xlator, but this seems to be
the problem.
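
To make the suspected sequence concrete, here is a toy Python model. It is
not GlusterFS code: the class, the method names and the run() helper are all
invented for illustration, and it only assumes the behaviour described
above (an error stored on the fd makes fsync return at once). It contrasts
that with an fsync that drains the pending writes before reporting the
error.

```python
import errno

# EDQUOT is the error seen in this thread; fall back to the Linux value
# on platforms whose errno module does not define it.
EDQUOT = getattr(errno, "EDQUOT", 122)


class ToyWriteBehindFd:
    """Hypothetical model: queued writes plus a sticky error for one fd."""

    def __init__(self):
        self.pending = []    # writes already acknowledged but not yet completed
        self.sticky_err = 0  # errno of the first failed background write, 0 if none

    def queue_write(self, name, result_errno=0):
        # result_errno is the outcome the write will have once it completes.
        self.pending.append((name, result_errno))

    def complete_one(self):
        # Finish the oldest pending write and remember its error if it fails.
        _name, result_errno = self.pending.pop(0)
        if result_errno and not self.sticky_err:
            self.sticky_err = result_errno

    def fsync_short_circuit(self):
        # Behaviour suspected in the thread: fail immediately on a stored error.
        if self.sticky_err:
            return self.sticky_err, len(self.pending)
        return self.fsync_drain_first()

    def fsync_drain_first(self):
        # Alternative: wait for every pending write, then report the stored error.
        while self.pending:
            self.complete_one()
        return self.sticky_err, len(self.pending)


def run(fsync_name):
    """Queue three parallel writes, fail the first, then call the chosen fsync."""
    fd = ToyWriteBehindFd()
    fd.queue_write("w1", EDQUOT)
    fd.queue_write("w2")
    fd.queue_write("w3")
    fd.complete_one()  # w1 fails while w2 and w3 are still in flight
    return getattr(fd, fsync_name)()
```

In this model run("fsync_short_circuit") returns the error while two writes
are still pending, whereas run("fsync_drain_first") returns the same error
only after nothing is left pending, which is what a caller such as ec would
need.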

I'm not sure why the problem disappeared when I disabled md-cache. Maybe I
made a mistake and disabled write-behind instead. I'll check it again
tomorrow.

> Are you sure fsync is sent by kernel to glusterfs? Maybe because of stale
> stat information the kernel never issues fsync? You can load a debug/trace
> xlator just above io-stats and check whether you get the fsync call (you
> can also dump fuse to glust[...]

I've seen these lines in the log file:

[2014-11-24 16:18:29.348552] T [fuse-bridge.c:2457:fuse_fsync_resume] 0-glusterfs-fuse: 395: FSYNC 0xbb242268
[2014-11-24 16:18:29.348663] W [fuse-bridge.c:1261:fuse_err_cbk] 0-glusterfs-fuse: 395: FSYNC() ERR => -1 (Disc quota exceeded)

Th[...] in between. I assume that this means that the kernel has sent the
FSYNC request and someone has returned the EDQUOT error immediately (I log a
message if FSYNC reaches ec).

Xavi

Links:
------
[1] mailto:xhernandez at datalab.es
[2] mailto:gluster-devel at gluster.org
[3] mailto:rgowdapp at redhat.com
[4] mailto:manu at netbsd.org