[Gluster-devel] could you help to check about a glusterfs issue seems to be related to ctime

Zhou, Cynthia (NSB - CN/Hangzhou) cynthia.zhou at nokia-sbell.com
Tue Mar 17 04:48:34 UTC 2020


Hi glusterfs experts,
Our product needs to tolerate the system date being changed to the future and then changed back.
How about a change like this?
https://review.gluster.org/#/c/glusterfs/+/24229/1/xlators/storage/posix/src/posix-metadata.c

When the time is changed to the future and then changed back, it should still be possible to update mdata, so that subsequent changes to the file are propagated to other clients.
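
For illustration only: the linked review is not reproduced here, and the sketch below is not a description of it. It shows one hypothetical way a "keep the latest time" rule could tolerate a clock that jumped to the future and came back, by accepting an older incoming time when the stored time is further ahead than an assumed skew limit. The function name next_stored_time and the max_skew parameter are made up for this example and are not GlusterFS identifiers.

```python
DAY = 86400.0
now = 1_583_900_000.0  # arbitrary "correct" wall-clock time, in seconds


def next_stored_time(stored: float, incoming: float, max_skew: float = 60.0) -> float:
    """Return the time to keep on the file after an update (hypothetical rule).

    stored   -- time currently recorded for the file
    incoming -- time stamped by the client performing the update
    max_skew -- assumed upper bound on normal clock skew between clients
    """
    if incoming > stored:
        # Normal case: time moves forward.
        return incoming
    if stored - incoming > max_skew:
        # Stored time is implausibly far ahead of this client's clock,
        # so treat it as left over from a wrong clock and replace it.
        return incoming
    # Small difference: likely two clients racing, keep the later time.
    return stored


after_clock_restored = next_stored_time(stored=now + 30 * DAY, incoming=now)
racing_writers = next_stored_time(stored=101.0, incoming=100.0)
```

With these assumptions, an update made after the clock is restored replaces the future time, while two writers whose times differ by less than max_skew still end up with the later one. The cost is that a client whose clock is wrong in the other direction could also overwrite a valid time, so the skew limit is a trade-off rather than a fix.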

cynthia

From: Zhou, Cynthia (NSB - CN/Hangzhou)
Sent: March 12, 2020 17:31
To: 'Kotresh Hiremath Ravishankar' <khiremat at redhat.com>
Cc: 'Gluster Devel' <gluster-devel at gluster.org>
Subject: RE: could you help to check about a glusterfs issue seems to be related to ctime

Hi,
One more question: I find that each client shows the same future timestamp. Where do those timestamps come from, since they differ from every timestamp stored on the bricks? And after I modify the file from the clients, the timestamp remains the same.
[root at mn-0:/home/robot]
# stat /mnt/export/testfile
  File: /mnt/export/testfile
  Size: 193             Blocks: 1          IO Block: 131072 regular file
Device: 28h/40d Inode: 10383279039841136109  Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (  615/_nokfsuifileshare)
Access: 2020-04-11 12:20:22.114365172 +0300
Modify: 2020-04-11 12:20:22.121552573 +0300
Change: 2020-04-11 12:20:22.121552573 +0300

[root at mn-0:/home/robot]
# date
Thu Mar 12 11:27:33 EET 2020
[root at mn-0:/home/robot]

[root at mn-0:/home/robot]
# stat /mnt/bricks/export/brick/testfile
  File: /mnt/bricks/export/brick/testfile
  Size: 193             Blocks: 16         IO Block: 4096   regular file
Device: fc02h/64514d    Inode: 512015      Links: 2
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (  615/_nokfsuifileshare)
Access: 2020-04-11 12:20:22.100395536 +0300
Modify: 2020-03-12 11:25:04.095981276 +0200
Change: 2020-03-12 11:25:04.095981276 +0200
Birth: 2020-04-11 08:53:26.805163816 +0300


[root at mn-1:/root]
# stat /mnt/bricks/export/brick/testfile
  File: /mnt/bricks/export/brick/testfile
  Size: 193             Blocks: 16         IO Block: 4096   regular file
Device: fc02h/64514d    Inode: 512015      Links: 2
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (  615/_nokfsuifileshare)
Access: 2020-04-11 12:20:22.100395536 +0300
Modify: 2020-03-12 11:25:04.094913452 +0200
Change: 2020-03-12 11:25:04.095913453 +0200
Birth: 2020-03-12 07:53:26.803783053 +0200



From: Zhou, Cynthia (NSB - CN/Hangzhou)
Sent: March 12, 2020 16:09
To: 'Kotresh Hiremath Ravishankar' <khiremat at redhat.com>
Cc: Gluster Devel <gluster-devel at gluster.org>
Subject: RE: could you help to check about a glusterfs issue seems to be related to ctime

Hi,
This is an abnormal test case. However, when it happens it has a big impact on the apps using those files, and it cannot be recovered automatically unless some xlators are disabled. I think this is unacceptable for the user apps.


cynthia

From: Kotresh Hiremath Ravishankar <khiremat at redhat.com>
Sent: March 12, 2020 14:37
To: Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com>
Cc: Gluster Devel <gluster-devel at gluster.org>
Subject: Re: could you help to check about a glusterfs issue seems to be related to ctime

All the perf xlators depend on time (mostly mtime, I guess). In my setup, only quick-read was enabled and hence disabling it worked for me.
All perf xlators need to be disabled to make it work correctly. But I still fail to understand how normal this kind of workload is.

Thanks,
Kotresh

On Thu, Mar 12, 2020 at 11:20 AM Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com> wrote:
When both quick-read and performance.io-cache are turned off, everything is back to normal.
I attached the glusterfs trace log captured with quick-read enabled and performance.io-cache still on,
while executing the command “cat /mnt/export/testfile”.
Can you help to find out why this still fails to show the correct content?
The file size shown is 141, but on the brick the file is actually longer than that.


cynthia


From: Zhou, Cynthia (NSB - CN/Hangzhou)
Sent: March 12, 2020 12:53
To: 'Kotresh Hiremath Ravishankar' <khiremat at redhat.com>
Cc: 'Gluster Devel' <gluster-devel at gluster.org>
Subject: RE: could you help to check about a glusterfs issue seems to be related to ctime

In my local test, this issue is gone only when both features.ctime and ctime.noatime are disabled.
Alternatively,
if I run echo 3 >/proc/sys/vm/drop_caches each time after some client changes the file, the cat command shows the correct data (same as on the brick).

cynthia

From: Zhou, Cynthia (NSB - CN/Hangzhou)
Sent: March 12, 2020 9:53
To: 'Kotresh Hiremath Ravishankar' <khiremat at redhat.com>
Cc: Gluster Devel <gluster-devel at gluster.org>
Subject: RE: could you help to check about a glusterfs issue seems to be related to ctime

Hi,
Thanks for your response!
I tried disabling quick-read:
[root at mn-0:/home/robot]
# gluster v get export all| grep quick
performance.quick-read                  off
performance.nfs.quick-read              off

However, the issue still exists:
two clients see different contents.

It seems the issue is completely gone only after I disable utime:
features.ctime                          off
ctime.noatime                           off


Do you know why this is?


Cynthia
Nokia storage team
From: Kotresh Hiremath Ravishankar <khiremat at redhat.com>
Sent: March 11, 2020 22:05
To: Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com>
Cc: Gluster Devel <gluster-devel at gluster.org>
Subject: Re: could you help to check about a glusterfs issue seems to be related to ctime

Hi,

I figured out what's happening. The issue is that the file has its c|a|m times set in the future (the file was created after the date was set to +30 days). This
was done from client-1. On client-2, which has the correct date, appending data doesn't update the mtime and ctime, because both new values are less than
the times already set on the file. This protection is required to keep the latest time when two clients are writing to the same file: we update the c|m|a time only if it's greater than
the existing time. As a result, the perf xlators on client-1, which rely on mtime, don't send the read to the server. They think nothing has changed, because in this case the times haven't
changed.
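
For illustration only (this is not GlusterFS code): a minimal Python sketch of the behaviour described above. The Brick and CachingClient classes are hypothetical stand-ins, and the cache here is deliberately simplistic in that it re-reads only when the mtime it sees has changed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Brick:
    """Toy stand-in for the server-side file: content plus stored mtime."""
    data: str = ""
    mtime: float = 0.0

    def write(self, text: str, client_time: float) -> None:
        self.data += text
        # The protection described above: the stored time only moves forward.
        if client_time > self.mtime:
            self.mtime = client_time


class CachingClient:
    """Toy client cache that trusts its copy while the mtime is unchanged."""

    def __init__(self, brick: Brick) -> None:
        self.brick = brick
        self.cached_data: Optional[str] = None
        self.cached_mtime: Optional[float] = None

    def read(self) -> str:
        if self.cached_mtime != self.brick.mtime:
            # mtime changed (or nothing cached yet): fetch from the brick.
            self.cached_data = self.brick.data
            self.cached_mtime = self.brick.mtime
        return self.cached_data


DAY = 86400.0
now = 1_583_900_000.0  # arbitrary "correct" wall-clock time, in seconds

brick = Brick()
client1 = CachingClient(brick)

brick.write("from mn0\n", now + 30 * DAY)  # client-1, clock 30 days ahead
first = client1.read()                     # populates client-1's cache
brick.write("from mn-1\n", now)            # client-2, correct clock
second = client1.read()                    # mtime unchanged, cache is reused
```

In this sketch the brick holds both lines after the second write, but second still contains only the first line, because the stored mtime never moved and the cache saw no reason to re-read.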

Workarounds:
1. Disabling quick-read solved the issue for me.
I don't know how realistic this kind of workload is. Is this a normal scenario?
The other option is to remove the protection that updates the time only if it's greater, but that would open up a race when two clients are updating the same file.
The file could end up keeping an older time rather than the latest one. This requires a code change and I don't think that should be done.
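
For illustration only (not GlusterFS code): a small Python sketch of the race mentioned above. Two clients stamp their writes at times 100 and 101, but the updates reach the server in the opposite order. The apply_updates helper and its keep_max flag are hypothetical names used only for this example.

```python
from typing import Iterable


def apply_updates(timestamps: Iterable[float], keep_max: bool) -> float:
    """Return the stored time after applying updates in arrival order.

    keep_max=True  models the current protection (only move forward).
    keep_max=False models "last update wins" with the protection removed.
    """
    stored = 0.0
    for ts in timestamps:
        if keep_max:
            stored = max(stored, ts)
        else:
            stored = ts
    return stored


# The write stamped 101 arrives first, the one stamped 100 arrives second.
arrival_order = [101.0, 100.0]

with_protection = apply_updates(arrival_order, keep_max=True)
without_protection = apply_updates(arrival_order, keep_max=False)
```

With the protection the stored time ends at 101, the latest stamp. Without it the stored time ends at 100, which is the "older time than the latest" outcome described above.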

Thanks,
Kotresh

On Wed, Mar 11, 2020 at 3:02 PM Kotresh Hiremath Ravishankar <khiremat at redhat.com> wrote:
Exactly, I am also curious about this. I will debug and update you on what exactly is happening.

Thanks,
Kotresh

On Wed, Mar 11, 2020 at 1:56 PM Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com> wrote:
I used to think the file was cached in some client-side buffer, because I checked the bricks on the different sn nodes and the file content is correct on all of them. But when I enable the client-side trace-level log and cat the file, I only see lookup/open/flush fops from the fuse-bridge side. I am wondering how the file content is served to the client side. Shouldn't a readv fop show up in the trace log?

cynthia

From: Zhou, Cynthia (NSB - CN/Hangzhou)
Sent: March 11, 2020 15:54
To: 'Kotresh Hiremath Ravishankar' <khiremat at redhat.com>
Subject: RE: could you help to check about a glusterfs issue seems to be related to ctime

Does that require the clients to be time-synced all the time? What if a client's time is out of sync for a while and then restored?
I ran a test: after the time has been restored and a client changes the file, the file's modify and access times remain wrong. Is that expected?

[root at mn-0:/home/robot]
# echo "fromm mn-0">>/mnt/export/testfile
[root at mn-0:/home/robot]
# stat /mnt/export/testfile
  File: /mnt/export/testfile
  Size: 30              Blocks: 1          IO Block: 131072 regular file
Device: 28h/40d Inode: 9855109080001305442  Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (  615/_nokfsuifileshare)
Access: 2020-05-10 09:33:59.713840197 +0300
Modify: 2020-05-10 09:33:59.713840197 +0300
Change: 2020-05-10 09:33:59.714413772 +0300  // still a future time
Birth: -
[root at mn-0:/home/robot]
# cat /mnt/export/testfil
cat: /mnt/export/testfil: No such file or directory
[root at mn-0:/home/robot]
# cat /mnt/export/testfile
from mn0
from mn-1
fromm mn-0
[root at mn-0:/home/robot]
# date
Wed 11 Mar 2020 09:05:58 AM EET
[root at mn-0:/home/robot]

cynthia

From: Kotresh Hiremath Ravishankar <khiremat at redhat.com>
Sent: March 11, 2020 15:41
To: Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com>
Subject: Re: could you help to check about a glusterfs issue seems to be related to ctime



On Wed, Mar 11, 2020 at 12:46 PM Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com> wrote:
But there are times when the ntp service goes wrong, and the time on the two storage nodes may not be synced.
Or do you mean that when we cannot guarantee that the time on the two clients is synced, we should not enable this ctime feature?
Yes, that's correct. The ctime feature relies on the time generated at the client (that's the utime xlator loaded in client) and hence
expects all clients to be ntp synced.

Without the ctime feature, is there some way to avoid this “file changed as we read it” issue?
Unfortunately no. That's the only way as of now.
cynthia

From: Kotresh Hiremath Ravishankar <khiremat at redhat.com>
Sent: March 11, 2020 15:12
To: Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com>
Subject: Re: could you help to check about a glusterfs issue seems to be related to ctime

Hi,

I have not looked at it yet. I will take a look and update you. But one of the prerequisites for the ctime feature is that the clients should be time-synced (ntp or other means).
Could you try your reproducer with the time of all clients synced and let me know the result?

Thanks,
Kotresh HR

On Wed, Mar 11, 2020 at 12:37 PM Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com> wrote:
I ran some tests.
After changing the date to the future, touching a file and writing something to it, I restored the time to normal.
Then I appended something to the file. The file's modify and access times are still in the future, which is not the same as on ext4.
I think this is wrong.

[root at mn-0:/home/robot]
# echo "fromm mn-0">>/mnt/export/testfile
[root at mn-0:/home/robot]
# stat /mnt/export/testfile
  File: /mnt/export/testfile
  Size: 30              Blocks: 1          IO Block: 131072 regular file
Device: 28h/40d Inode: 9855109080001305442  Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (  615/_nokfsuifileshare)
Access: 2020-05-10 09:33:59.713840197 +0300
Modify: 2020-05-10 09:33:59.713840197 +0300
Change: 2020-05-10 09:33:59.714413772 +0300
Birth: -
[root at mn-0:/home/robot]
# cat /mnt/export/testfil
cat: /mnt/export/testfil: No such file or directory
[root at mn-0:/home/robot]
# cat /mnt/export/testfile
from mn0
from mn-1
fromm mn-0
[root at mn-0:/home/robot]
# date
Wed 11 Mar 2020 09:05:58 AM EET
[root at mn-0:/home/robot]

cynthia

From: Zhou, Cynthia (NSB - CN/Hangzhou)
Sent: March 11, 2020 14:41
To: 'khiremat at redhat.com' <khiremat at redhat.com>
Subject: could you help to check about a glusterfs issue seems to be related to ctime

Hi glusterfs expert,
Good day!
Do you have time to check the ticket: https://bugzilla.redhat.com/show_bug.cgi?id=1811907 ?


After I disable the ctime feature this issue is gone. However, I then get the error “file changed as we read it” when using the tar command.
Do you have any idea?
Thanks!





Cynthia
Nokia storage team



--
Thanks and Regards,
Kotresh H R

