[Gluster-devel] a link issue maybe introduced in a bug fix " Don't let NFS cache stat after writes"

Wed Jan 24 09:04:52 UTC 2018

On Wed, Jan 24, 2018 at 2:24 PM, Pranith Kumar Karampuri <
pkarampu at redhat.com> wrote:

> hi,
>        In the same commit you mentioned earlier, there used to be this
> code:
> -/* Returns 1 if the stat seems to be filled with zeroes. */
> -int
> -nfs_zero_filled_stat (struct iatt *buf)
> -{
> -        if (!buf)
> -                return 1;
> -
> -        /* Do not use st_dev because it is transformed to store the xlator id
> -         * in place of the device number. Do not use st_ino because by this time
> -         * we've already mapped the root ino to 1 so it is not guaranteed to be
> -         * 0.
> -         */
> -        if ((buf->ia_nlink == 0) && (buf->ia_ctime == 0))
> -                return 1;
> -
> -        return 0;
> -}
> -
> -
>
> I moved this to a common library function that can be used in afr as well.
> Why was it there in NFS? +Niels for answering that question.
>
> If I give you a patch which will assert the error condition, would it be
> possible for you to figure out the first xlator which is unwinding the iatt
> with nlink count as zero but ctime as non-zero?
>
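
A minimal sketch of the kind of assert described above, assuming a cut-down
stand-in for struct iatt. The names struct iatt_sketch, iatt_is_half_zeroed()
and assert_iatt_not_half_zeroed() are hypothetical and are not GlusterFS
code. It flags a stat whose nlink is zero while ctime is non-zero, which the
zero-filled check does not recognise:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct iatt: only the two fields discussed in
 * this thread are modelled (assumption, not the real layout). */
struct iatt_sketch {
        uint32_t ia_nlink;
        int64_t  ia_ctime;
};

/* Returns true when nlink is 0 but ctime is not, i.e. the stat is not
 * "zero filled" yet still carries a link count of zero. */
static bool
iatt_is_half_zeroed (const struct iatt_sketch *buf)
{
        if (!buf)
                return false;

        return (buf->ia_nlink == 0) && (buf->ia_ctime != 0);
}

/* Hypothetical debugging hook: aborts when handed such a stat. */
static void
assert_iatt_not_half_zeroed (const struct iatt_sketch *buf)
{
        assert (!iatt_is_half_zeroed (buf));
}
```

Calling assert_iatt_not_half_zeroed() from each xlator's unwind path would
abort at the first layer that hands back such a stat.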

Hey,
      I went through the code, and I think the gf_fuse_stat2attr() function
could be sending both ctime and nlink count as zero to the kernel, as you
mentioned. So I guess we should wait for Niels' answer about why the nlink
count needs to be marked as zero. We may need to patch the fuse code
differently and set the entry and attr valid seconds/nseconds to zero, so
that we get a lookup on the entry.
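
The fuse-side idea can be sketched as follows. This is only an illustration
under assumptions: struct entry_out_sketch models just the cache-validity
fields of the kernel's struct fuse_entry_out, struct iatt_sketch is a
cut-down iatt, and fill_entry_validity() is a hypothetical helper, not
existing GlusterFS code. Timeouts are assumed to be non-negative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Cut-down iatt with only the fields the zero-filled check looks at. */
struct iatt_sketch {
        uint32_t ia_nlink;
        int64_t  ia_ctime;
};

/* Models only the validity fields of struct fuse_entry_out. */
struct entry_out_sketch {
        uint64_t entry_valid;
        uint64_t attr_valid;
        uint32_t entry_valid_nsec;
        uint32_t attr_valid_nsec;
};

/* Same condition as the check quoted earlier in this mail. */
static bool
is_zero_filled (const struct iatt_sketch *buf)
{
        return !buf || ((buf->ia_nlink == 0) && (buf->ia_ctime == 0));
}

/* Splits a non-negative timeout in seconds into whole seconds and nsecs. */
static void
split_timeout (double timeout, uint64_t *sec, uint32_t *nsec)
{
        *sec  = (uint64_t) timeout;
        *nsec = (uint32_t) ((timeout - (double) *sec) * 1e9);
}

/* For a zero-filled stat, report zero validity so the kernel does not
 * cache the entry or its attributes and sends a fresh lookup instead. */
static void
fill_entry_validity (struct entry_out_sketch *out,
                     const struct iatt_sketch *buf,
                     double entry_timeout, double attr_timeout)
{
        assert (out != NULL);

        if (is_zero_filled (buf)) {
                entry_timeout = 0.0;
                attr_timeout  = 0.0;
        }

        split_timeout (entry_timeout, &out->entry_valid,
                       &out->entry_valid_nsec);
        split_timeout (attr_timeout, &out->attr_valid,
                       &out->attr_valid_nsec);
}
```

With this shape, a zero-filled stat would never be handed to the kernel as a
cacheable result, whatever its nlink value is.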

> On Wed, Jan 24, 2018 at 1:03 PM, Lian, George (NSB - CN/Hangzhou) <
> george.lian at nokia-sbell.com> wrote:
>
>> Hi,  Pranith Kumar,
>>
>>
>>
>> Can you tell me why buf->ia_nlink needs to be set to “0” in function
>> gf_zero_fill_stat()? Which API or application depends on it?
>>
>> If I remove this line and also update the corresponding check in function
>> gf_is_zero_filled_stat,
>>
>> the issue seems to be gone, but I can’t confirm whether this will lead to
>> other issues.
>>
>>
>>
>> So could you please double check it and give your comments?
>>
>>
>>
>> My change is as the below:
>>
>>
>>
>> gf_boolean_t
>> gf_is_zero_filled_stat (struct iatt *buf)
>> {
>>         if (!buf)
>>                 return 1;
>>
>>         /* Do not use st_dev because it is transformed to store the xlator id
>>          * in place of the device number. Do not use st_ino because by this time
>>          * we've already mapped the root ino to 1 so it is not guaranteed to be
>>          * 0.
>>          */
>> //        if ((buf->ia_nlink == 0) && (buf->ia_ctime == 0))
>>         if (buf->ia_ctime == 0)
>>                 return 1;
>>
>>         return 0;
>> }
>>
>> void
>> gf_zero_fill_stat (struct iatt *buf)
>> {
>> //       buf->ia_nlink = 0;
>>         buf->ia_ctime = 0;
>> }
>>
>>
>>
>> Thanks & Best Regards
>>
>> George
>>
>> *From:* Lian, George (NSB - CN/Hangzhou)
>> *Sent:* Friday, January 19, 2018 10:03 AM
>> *To:* Pranith Kumar Karampuri <pkarampu at redhat.com>; Zhou, Cynthia (NSB
>> - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com>
>> *Cc:* Li, Deqian (NSB - CN/Hangzhou) <deqian.li at nokia-sbell.com>;
>> Gluster-devel at gluster.org; Sun, Ping (NSB - CN/Hangzhou) <
>> ping.sun at nokia-sbell.com>
>>
>> *Subject:* RE: [Gluster-devel] a link issue maybe introduced in a bug
>> fix " Don't let NFS cache stat after writes"
>>
>>
>>
>> Hi,
>>
>> >>> Cool, this works for me too. Send me a mail off-list once you are
>> available and we can figure out a way to get into a call and work on this.
>>
>>
>>
>> Have you reproduced the issue following the steps I listed in
>> https://bugzilla.redhat.com/show_bug.cgi?id=1531457 and in my last mail?
>>
>>
>>
>> If not, I would like you to try it yourself. The only difference between
>> your setup and mine is that I create only 2 bricks instead of 6.
>>
>>
>>
>> Cynthia can have a session with you if needed, since I am not
>> available next Monday and Tuesday.
>>
>>
>>
>> Thanks & Best Regards,
>>
>> George
>>
>>
>>
>> *From:* gluster-devel-bounces at gluster.org [mailto:gluster-devel-bounces@
>> gluster.org <gluster-devel-bounces at gluster.org>] *On Behalf Of *Pranith
>> Kumar Karampuri
>> *Sent:* Thursday, January 18, 2018 6:03 PM
>> *To:* Lian, George (NSB - CN/Hangzhou) <george.lian at nokia-sbell.com>
>> *Cc:* Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com>;
>> Li, Deqian (NSB - CN/Hangzhou) <deqian.li at nokia-sbell.com>;
>> Gluster-devel at gluster.org; Sun, Ping (NSB - CN/Hangzhou) <
>> ping.sun at nokia-sbell.com>
>> *Subject:* Re: [Gluster-devel] a link issue maybe introduced in a bug
>> fix " Don't let NFS cache stat after writes"
>>
>>
>>
>>
>>
>>
>>
>> On Thu, Jan 18, 2018 at 12:17 PM, Lian, George (NSB - CN/Hangzhou) <
>> george.lian at nokia-sbell.com> wrote:
>>
>> Hi,
>>
>> >>>I actually tried it with replica-2 and replica-3 and then distributed
>> replica-2 before replying to the earlier mail. We can have a debugging
>> session if you are okay with it.
>>
>>
>>
>> It is fine if you can’t reproduce the issue in your ENV.
>>
>> I have attached the detailed reproduction log to the Bugzilla, FYI.
>>
>>
>>
>> I am sorry, but I may be out of office on Monday and Tuesday next week,
>> so a debugging session next Wednesday would work for me.
>>
>>
>>
>> Cool, this works for me too. Send me a mail off-list once you are
>> available and we can figure out a way to get into a call and work on this.
>>
>>
>>
>>
>>
>>
>>
>> Pasting the detailed reproduction log here, FYI:
>>
>> *root at ubuntu:~# gluster peer probe ubuntu*
>>
>> *peer probe: success. Probe on localhost not needed*
>>
>> *root at ubuntu:~# gluster v create test replica 2 ubuntu:/home/gfs/b1
>> ubuntu:/home/gfs/b2 force*
>>
>> *volume create: test: success: please start the volume to access data*
>>
>> *root at ubuntu:~# gluster v start test*
>>
>> *volume start: test: success*
>>
>> *root at ubuntu:~# gluster v info test*
>>
>>
>>
>> *Volume Name: test*
>>
>> *Type: Replicate*
>>
>> *Volume ID: fef5fca3-81d9-46d3-8847-74cde6f701a5*
>>
>> *Status: Started*
>>
>> *Snapshot Count: 0*
>>
>> *Number of Bricks: 1 x 2 = 2*
>>
>> *Transport-type: tcp*
>>
>> *Bricks:*
>>
>> *Brick1: ubuntu:/home/gfs/b1*
>>
>> *Brick2: ubuntu:/home/gfs/b2*
>>
>> *Options Reconfigured:*
>>
>> *transport.address-family: inet*
>>
>> *nfs.disable: on*
>>
>> *performance.client-io-threads: off*
>>
>> *root at ubuntu:~# gluster v status*
>>
>> *Status of volume: test*
>>
>> *Gluster process                             TCP Port  RDMA Port  Online  Pid*
>> *------------------------------------------------------------------------------*
>> *Brick ubuntu:/home/gfs/b1                   49152     0          Y       7798*
>> *Brick ubuntu:/home/gfs/b2                   49153     0          Y       7818*
>> *Self-heal Daemon on localhost               N/A       N/A        Y       7839*
>>
>>
>>
>> *Task Status of Volume test*
>>
>>
>> *------------------------------------------------------------------------------*
>>
>> *There are no active volume tasks*
>>
>>
>>
>>
>>
>> *root at ubuntu:~# gluster v set test cluster.consistent-metadata on*
>>
>> *volume set: success*
>>
>>
>>
>> *root at ubuntu:~# ls /mnt/test*
>>
>> *ls: cannot access '/mnt/test': No such file or directory*
>>
>> *root at ubuntu:~# mkdir -p /mnt/test*
>>
>> *root at ubuntu:~# mount -t glusterfs ubuntu:/test /mnt/test*
>>
>>
>>
>> *root at ubuntu:~# cd /mnt/test*
>>
>> *root at ubuntu:/mnt/test# echo "abc">aaa*
>>
>> *root at ubuntu:/mnt/test# cp aaa bbb;link bbb ccc*
>>
>>
>>
>> *root at ubuntu:/mnt/test# kill -9 7818*
>>
>> *root at ubuntu:/mnt/test# cp aaa ddd;link ddd eee*
>>
>> *link: cannot create link 'eee' to 'ddd': No such file or directory*
>>
>>
>>
>>
>>
>> Best Regards,
>>
>> George
>>
>>
>>
>> *From:* gluster-devel-bounces at gluster.org [mailto:gluster-devel-bounces@
>> gluster.org] *On Behalf Of *Pranith Kumar Karampuri
>> *Sent:* Thursday, January 18, 2018 2:40 PM
>>
>>
>> *To:* Lian, George (NSB - CN/Hangzhou) <george.lian at nokia-sbell.com>
>> *Cc:* Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com>;
>> Gluster-devel at gluster.org; Li, Deqian (NSB - CN/Hangzhou) <
>> deqian.li at nokia-sbell.com>; Sun, Ping (NSB - CN/Hangzhou) <
>> ping.sun at nokia-sbell.com>
>> *Subject:* Re: [Gluster-devel] a link issue maybe introduced in a bug
>> fix " Don't let NFS cache stat after writes"
>>
>>
>>
>>
>>
>>
>>
>> On Thu, Jan 18, 2018 at 6:33 AM, Lian, George (NSB - CN/Hangzhou) <
>> george.lian at nokia-sbell.com> wrote:
>>
>> Hi,
>>
>> I suppose the number of bricks in your test is six, and you shut down
>> only 3 of the processes.
>>
>> When I reproduce the issue, I create a replicated volume with only 2
>> bricks, leave only ONE brick running, and set cluster.consistent-metadata
>> on.
>>
>> With these 2 test conditions, the issue is 100% reproducible.
>>
>>
>>
>> Hi,
>>
>>       I actually tried it with replica-2 and replica-3 and then
>> distributed replica-2 before replying to the earlier mail. We can have a
>> debugging session if you are okay with it.
>>
>> I am in the middle of a customer issue myself(That is the reason for this
>> delay :-( ) and thinking of wrapping it up early next week. Would that be
>> fine with you?
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> 16:44:28 :) ⚡ gluster v status
>>
>> Status of volume: r2
>>
>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>> ------------------------------------------------------------------------------
>> Brick localhost.localdomain:/home/gfs/r2_0  49152     0          Y       5309
>> Brick localhost.localdomain:/home/gfs/r2_1  49154     0          Y       5330
>> Brick localhost.localdomain:/home/gfs/r2_2  49156     0          Y       5351
>> Brick localhost.localdomain:/home/gfs/r2_3  49158     0          Y       5372
>> Brick localhost.localdomain:/home/gfs/r2_4  49159     0          Y       5393
>> Brick localhost.localdomain:/home/gfs/r2_5  49160     0          Y       5414
>> Self-heal Daemon on localhost               N/A       N/A        Y       5436
>>
>>
>>
>> Task Status of Volume r2
>>
>> ------------------------------------------------------------------------------
>>
>> There are no active volume tasks
>>
>>
>>
>> root at dhcp35-190 - ~
>>
>> 16:44:38 :) ⚡ kill -9 5309 5351 5393
>>
>>
>>
>> Best Regards,
>>
>> George
>>
>> *From:* gluster-devel-bounces at gluster.org [mailto:gluster-devel-bounces@
>> gluster.org] *On Behalf Of *Pranith Kumar Karampuri
>> *Sent:* Wednesday, January 17, 2018 7:27 PM
>> *To:* Lian, George (NSB - CN/Hangzhou) <george.lian at nokia-sbell.com>
>> *Cc:* Li, Deqian (NSB - CN/Hangzhou) <deqian.li at nokia-sbell.com>;
>> Gluster-devel at gluster.org; Zhou, Cynthia (NSB - CN/Hangzhou) <
>> cynthia.zhou at nokia-sbell.com>; Sun, Ping (NSB - CN/Hangzhou) <
>> ping.sun at nokia-sbell.com>
>>
>>
>> *Subject:* Re: [Gluster-devel] a link issue maybe introduced in a bug
>> fix " Don't let NFS cache stat after writes"
>>
>>
>>
>>
>>
>>
>>
>> On Mon, Jan 15, 2018 at 1:55 PM, Pranith Kumar Karampuri <
>> pkarampu at redhat.com> wrote:
>>
>>
>>
>>
>>
>> On Mon, Jan 15, 2018 at 8:46 AM, Lian, George (NSB - CN/Hangzhou) <
>> george.lian at nokia-sbell.com> wrote:
>>
>> Hi,
>>
>>
>>
>> Have you reproduced this issue? If yes, could you please confirm whether
>> it is an issue or not?
>>
>>
>>
>> Hi,
>>
>>        I tried recreating this on my laptop and on both master and 3.12
>> and I am not able to recreate the issue :-(.
>>
>> Here is the execution log:
>> https://paste.fedoraproject.org/paste/-csXUKrwsbrZAVW1KzggQQ
>>
>> Since I was doing this on my laptop, I changed shutting down of the
>> replica to killing the brick process to simulate this test.
>>
>> Let me know if I missed something.
>>
>>
>>
>>
>>
>> Sorry, I am held up with some issue at work, so I think I will get some
>> time day after tomorrow to look at this. In the mean time I am adding more
>> people who know about afr to see if they get a chance to work on this
>> before me.
>>
>>
>>
>>
>>
>> And if it is an issue,  do you have any solution for this issue?
>>
>>
>>
>> Thanks & Best Regards,
>>
>> George
>>
>>
>>
>> *From:* Lian, George (NSB - CN/Hangzhou)
>> *Sent:* Thursday, January 11, 2018 2:01 PM
>> *To:* Pranith Kumar Karampuri <pkarampu at redhat.com>
>> *Cc:* Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com>;
>> Gluster-devel at gluster.org; Li, Deqian (NSB - CN/Hangzhou) <
>> deqian.li at nokia-sbell.com>; Sun, Ping (NSB - CN/Hangzhou) <
>> ping.sun at nokia-sbell.com>
>> *Subject:* RE: [Gluster-devel] a link issue maybe introduced in a bug
>> fix " Don't let NFS cache stat after writes"
>>
>>
>>
>> Hi,
>>
>>
>>
>> Please see the detailed test steps at
>> https://bugzilla.redhat.com/show_bug.cgi?id=1531457
>>
>>
>>
>> How reproducible:
>>
>>
>>
>>
>>
>> Steps to Reproduce:
>>
>> 1. create a replicated volume named "test"
>>
>> 2. set the volume option cluster.consistent-metadata to on:
>>
>>   gluster v set test cluster.consistent-metadata on
>>
>> 3. mount the volume test on the client at /mnt/test
>>
>> 4. create a file aaa larger than 1 byte
>>
>>    echo "1234567890" >/mnt/test/aaa
>>
>> 5. shut down one replica node, say sn-1, leaving only sn-0 running
>>
>> 6. cp /mnt/test/aaa /mnt/test/bbb; link /mnt/test/bbb /mnt/test/ccc
>>
>>
>>
>>
>>
>> BRs
>>
>> George
>>
>>
>>
>> *From:* gluster-devel-bounces at gluster.org [mailto:gluster-devel-bounces@
>> gluster.org <gluster-devel-bounces at gluster.org>] *On Behalf Of *Pranith
>> Kumar Karampuri
>> *Sent:* Thursday, January 11, 2018 12:39 PM
>> *To:* Lian, George (NSB - CN/Hangzhou) <george.lian at nokia-sbell.com>
>> *Cc:* Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com>;
>> Gluster-devel at gluster.org; Li, Deqian (NSB - CN/Hangzhou) <
>> deqian.li at nokia-sbell.com>; Sun, Ping (NSB - CN/Hangzhou) <
>> ping.sun at nokia-sbell.com>
>> *Subject:* Re: [Gluster-devel] a link issue maybe introduced in a bug
>> fix " Don't let NFS cache stat after writes"
>>
>>
>>
>>
>>
>>
>>
>> On Thu, Jan 11, 2018 at 6:35 AM, Lian, George (NSB - CN/Hangzhou) <
>> george.lian at nokia-sbell.com> wrote:
>>
>> Hi,
>>
>> >>> In which protocol are you seeing this issue? Fuse/NFS/SMB?
>>
>> It is fuse, on a mount point created with the “mount -t glusterfs  …” command.
>>
>>
>>
>> Could you let me know the test you did so that I can try to re-create and
>> see what exactly is going on?
>>
>> Configuration of the volume and the steps to re-create the issue you are
>> seeing would be helpful in debugging the issue further.
>>
>>
>>
>>
>>
>> Thanks & Best Regards,
>>
>> George
>>
>>
>>
>> *From:* gluster-devel-bounces at gluster.org [mailto:gluster-devel-bounces@
>> gluster.org] *On Behalf Of *Pranith Kumar Karampuri
>> *Sent:* Wednesday, January 10, 2018 8:08 PM
>> *To:* Lian, George (NSB - CN/Hangzhou) <george.lian at nokia-sbell.com>
>> *Cc:* Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com>;
>> Zhong, Hua (NSB - CN/Hangzhou) <hua.zhong at nokia-sbell.com>; Li, Deqian
>> (NSB - CN/Hangzhou) <deqian.li at nokia-sbell.com>;
>> Gluster-devel at gluster.org; Sun, Ping (NSB - CN/Hangzhou) <
>> ping.sun at nokia-sbell.com>
>> *Subject:* Re: [Gluster-devel] a link issue maybe introduced in a bug
>> fix " Don't let NFS cache stat after writes"
>>
>>
>>
>>
>>
>>
>>
>> On Wed, Jan 10, 2018 at 11:09 AM, Lian, George (NSB - CN/Hangzhou) <
>> george.lian at nokia-sbell.com> wrote:
>>
>> Hi, Pranith Kumar,
>>
>>
>>
>> I have created a bug on Bugzilla:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1531457
>>
>> After investigating this link issue, I suspect it comes from your changes
>> to afr-dir-write.c for the issue "Don't let NFS cache stat after writes".
>> Your fix looks like this:
>>
>> --------------------------------------
>>
>>                 if (afr_txn_nothing_failed (frame, this)) {
>>                         /* if it did pre-op, it will do post-op changing ctime */
>>                         if (priv->consistent_metadata &&
>>                             afr_needs_changelog_update (local))
>>                                 afr_zero_fill_stat (local);
>>                         local->transaction.unwind (frame, this);
>>                 }
>>
>> The above fix sets ia_nlink to ‘0’ if the option consistent-metadata is
>> set to “on”.
>>
>> Creating a hard link to a file that was just created then fails, and the
>> error comes from the kernel function “vfs_link”:
>>
>> if (inode->i_nlink == 0 && !(inode->i_state & I_LINKABLE))
>>
>>              error =  -ENOENT;
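
The effect of that check can be modelled in user space. The sketch below is
not kernel code: struct inode_sketch, I_LINKABLE_SKETCH and link_precheck()
are hypothetical names, and the flag value is a placeholder rather than the
kernel's definition. It only shows that an inode whose link count has been
overwritten with 0 is refused with -ENOENT unless the linkable flag is set:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Placeholder bit standing in for the kernel's I_LINKABLE flag. */
#define I_LINKABLE_SKETCH (1UL << 0)

/* Only the two inode fields used by the quoted check are modelled. */
struct inode_sketch {
        unsigned int  i_nlink;
        unsigned long i_state;
};

/* Mirrors the quoted condition: a link count of zero without the
 * linkable flag is rejected with -ENOENT, anything else passes. */
static int
link_precheck (const struct inode_sketch *inode)
{
        assert (inode != NULL);

        if (inode->i_nlink == 0 && !(inode->i_state & I_LINKABLE_SKETCH))
                return -ENOENT;

        return 0;
}
```

A freshly created file normally has a link count of 1 and passes. If a cached
stat with nlink == 0 reaches the inode, the same call returns -ENOENT, which
matches the "No such file or directory" message from link.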
>>
>>
>>
>> could you please have a check and give some comments here?
>>
>>
>>
>> When stat is "zero filled", the understanding is that the higher layer
>> protocol doesn't send the stat value to the kernel, and the kernel sends a
>> separate lookup to get the latest stat value. In which protocol are you
>> seeing this issue? Fuse/NFS/SMB?
>>
>>
>>
>>
>>
>> Thanks & Best Regards,
>>
>> George
>>
>>
>>
>>
>> --
>>
>> Pranith

-- 
Pranith