[Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file
Lian, George (Nokia - CN/Hangzhou)
george.lian at nokia.com
Wed Oct 19 05:21:24 UTC 2016
Hi, Raghavendra,
When we disable md-cache(gluster volume set log performance.md-cache-timeout 0), the issue seems gone.
(we can't disable with " gluster volume set log performance.md-cache off" why?)
So I double confuse that the code I abstract in last mail maybe have some issue for this case.
Could you please share your comments?
Thanks & Best Regards,
George
-----Original Message-----
From: Lian, George (Nokia - CN/Hangzhou)
Sent: Friday, October 14, 2016 1:44 PM
To: 'Raghavendra Gowdappa' <rgowdapp at redhat.com>
Cc: Gluster-devel at gluster.org; I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS <I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX at internal.nsn.com>; Zhang, Bingxuan (Nokia - CN/Hangzhou) <bingxuan.zhang at nokia.com>; Zizka, Jan (Nokia - CZ/Prague) <jan.zizka at nokia.com>
Subject: RE: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file
Hi, Raghavendra,
Our version of GlusterFS is 3.6.9, and I also check the newest code of main branch, the function of " mdc_inode_iatt_set_validate" is almost same, from the following code of this function,
We could see a "TODO" comments inline, does it mean if we enhance write-behind feature, the "iatt" field in callback will be NULL, so that inode_invalidate will not be called? So the size of file will not update since "write behind" enabled ?
Is it the root cause for "tail" application failed with "file truncated" issue ?
LOCK (&mdc->lock);
{
if (!iatt || !iatt->ia_ctime) {
mdc->ia_time = 0;
goto unlock;
}
/*
* Invalidate the inode if the mtime or ctime has changed
* and the prebuf doesn't match the value we have cached.
* TODO: writev returns with a NULL iatt due to
* performance/write-behind, causing invalidation on writes.
*/
if (IA_ISREG(inode->ia_type) &&
((iatt->ia_mtime != mdc->md_mtime) ||
(iatt->ia_ctime != mdc->md_ctime)))
if (!prebuf || (prebuf->ia_ctime != mdc->md_ctime) ||
(prebuf->ia_mtime != mdc->md_mtime))
inode_invalidate(inode);
mdc_from_iatt (mdc, iatt);
time (&mdc->ia_time);
}
Best Regards,
George
-----Original Message-----
From: Raghavendra Gowdappa [mailto:rgowdapp at redhat.com]
Sent: Thursday, October 13, 2016 8:58 PM
To: Lian, George (Nokia - CN/Hangzhou) <george.lian at nokia.com>
Cc: Gluster-devel at gluster.org; I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS <I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX at internal.nsn.com>; Zhang, Bingxuan (Nokia - CN/Hangzhou) <bingxuan.zhang at nokia.com>; Zizka, Jan (Nokia - CZ/Prague) <jan.zizka at nokia.com>
Subject: Re: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file
----- Original Message -----
> From: "George Lian (Nokia - CN/Hangzhou)" <george.lian at nokia.com>
> To: Gluster-devel at gluster.org
> Cc: "I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX_GMS" <I_EXT_MBB_WCDMA_SWD3_DA1_MATRIX at internal.nsn.com>, "Bingxuan Zhang (Nokia
> - CN/Hangzhou)" <bingxuan.zhang at nokia.com>, "Jan Zizka (Nokia - CZ/Prague)" <jan.zizka at nokia.com>
> Sent: Thursday, October 13, 2016 2:33:53 PM
> Subject: [Gluster-devel] Issue about the size of fstat is less than the really size of the syslog file
>
> Hi, Dear Expert,
> We have use glusterfs as a network filesystem, and syslog store in there,
> some clients on different host may write the syslog file via “glusterfs”
> mount point.
> Now we encounter an issue when we “tail” the syslog file, it will occasional
> failed with error “ file truncated ”
> As we study and trace with the “tail” source code, it failed with the
> following code:
> if ( S_ISREG (mode) && stats.st_size < f[i].size )
> {
> error (0, 0, _("%s: file truncated"), quotef (name));
> /* Assume the file was truncated to 0,
> and therefore output all "new" data. */
> xlseek (fd, 0, SEEK_SET, name);
> f[i].size = 0;
> }
> When stats.st_size < f[i].size, what mean the size report by fstat is less
> than “tail” had read, it lead to “file truncated”, we also use “strace”
> tools to trace the tail application, the related tail strace log as the
> below:
> nanosleep({1, 0}, NULL) = 0
> fstat(3, {st_mode=S_IFREG|0644, st_size=192543105, ...}) = 0
> nanosleep({1, 0}, NULL) = 0
> fstat(3, {st_mode=S_IFREG|0644, st_size=192543105, ...}) = 0
> nanosleep({1, 0}, NULL) = 0
> fstat(3, {st_mode=S_IFREG|0644, st_size=192543105, ...}) = 0
> nanosleep({1, 0}, NULL) = 0
> fstat(3, {st_mode=S_IFREG|0644, st_size=192544549, ...}) = 0
> read(3, " Data … -"..., 8192) = 1444
> read(3, " Data.. "..., 8192) = 720
> read(3, "", 8192) = 0
> fstat(3, {st_mode=S_IFREG|0644, st_size=192544789, ...}) = 0
> write(1, “DATA…..” ) = 2164
> write(2, "tail: ", 6tail: ) = 6
> write(2, "/mnt/log/master/syslog: file tru"..., 38/mnt/log/master/syslog:
> file truncated) = 38
> as the above strace log, tail has read 1444+720=2164 bytes,
> but fstat tell “tail” 192544789 – 192543105 = 1664 which less than 2164, so
> it lead to “tail” application “file truncated”.
> And if we turn off “write-behind” feature, the issue will not be reproduced
> any more.
That seems strange. There are no writes happening on the fd/inode through which tail is reading/stating from. So, it seems strange that write-behind is involved here. I suspect whether any of md-cache/read-ahead/io-cache is causing the issue. Can you,
1. Turn off md-cache, read-ahead, io-cache xlators
2. mount glusterfs with --attribute-timeout=0
3. set write-behind on
and rerun the tests? If you don't hit the issue, you can experiment by turning on/off of md-cache, read-ahead and io-cache translators and see what are the minimal number of xlators that need to be turned off to not hit the issue (with write-behind on)?
regards,
Raghavendra
> So we think it may be related to cache consistence issue due to performance
> consider, but we still have concern that:
> The syslog file is used only with “Append” mode, so the size of file
> shouldn’t be reduced, when a client read the file, why “fstat” can’t return
> the really size match to the cache?
> From current investigation, we doubt that the current implement of
> “glusterfs” has a bug on “fstat” when cache is on.
> Your comments is our highly appreciated!
> Thanks & Best Regards
> George
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
More information about the Gluster-devel
mailing list