[Gluster-devel] afr's return "struct stat" scheme

Krishna Srinivas krishna at zresearch.com
Wed Jan 9 11:37:17 UTC 2008


Hello,

On Jan 8, 2008 8:38 AM, LI Daobing <lidaobing at kingsoft.com> wrote:
> Hello,
>
> just a little review on the afr return "struct stat" scheme
>
> There are 18 functions[1] in xlator_fops that return a struct stat*
> in their cbk. In 1.3.7, afr implements 16 of them (all except fchmod and
> fchown, which return ENOSYS), and in TLA all of them are implemented.

Correct.

>
> Most of them use a scheme that returns a stable struct stat* as long as
> no child is down: they pick the successful return value from the child
> that appears first in the conf file. fstat and the 1.3.7 readv differ
> slightly from the others: they do not send the op to all children, only
> to the first successful child, so their result is also stable. In TLA,
> some functions were changed to serial calls (for example, mknod), and
> the returned struct stat also comes from the first successful child, so
> it is stable as well.

Correct. The idea is to always return the stat from the first child in the
subvols list that returns success.
We call create() serially in AFR to handle the case where, for example,
two AFR instances try to create the same file at the same time, each one
succeeds on a different child (assuming AFR has 2 subvols), and both
return success because one of their creates succeeded.
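
To make the rule concrete, here is a minimal sketch of "return the stat from the first child in the subvols list that succeeded". It is not the actual AFR code: struct child_reply, first_successful_child() and pick_stable_stat() are hypothetical names used only for illustration, and the replies are assumed to be stored in subvolume order.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical per-child reply record; NOT the real AFR data structure. */
struct child_reply {
    int         op_ret;  /* 0 on success, -1 on failure */
    struct stat stbuf;   /* meaningful only when op_ret == 0 */
};

/* Index of the first successful child in subvolume order, or -1 if none. */
static int first_successful_child(const struct child_reply *replies,
                                  size_t count)
{
    assert(replies != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (replies[i].op_ret == 0)
            return (int)i;
    }
    return -1;
}

/* Copy the stat of the first successful child into *out.
 * Returns 0 on success, -1 if every child failed. */
static int pick_stable_stat(const struct child_reply *replies, size_t count,
                            struct stat *out)
{
    int idx = first_successful_child(replies, count);
    if (idx < 0)
        return -1;
    *out = replies[idx].stbuf;
    return 0;
}
```

Because the choice depends only on the position in the subvols list and not on the order in which the replies arrive, the same child's stat is returned every time as long as the set of working children does not change.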

>
> But ftruncate, writev, lookup, and the TLA version of readv do not
> return a stable struct stat. ftruncate and writev pick the return value
> from the last child to return successfully, and lookup picks the child
> with the largest mtime. The TLA version of readv returns the struct stat
> from the rchild.

I have to check your findings on writev and ftruncate, thanks. Did that
fix your vi editing problem?
Regarding lookup, it returns the stat of the entry with the greatest mtime
while retaining the inode number of the first successful child. Ideally we
should choose the stat based on the createtime and version xattrs rather
than on mtime (this change will go in soon).
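
As a rough illustration of the current lookup behaviour described above (attributes from the copy with the greatest mtime, inode number from the first successful child), here is a small sketch. struct lookup_reply and merge_lookup_stat() are hypothetical names, not the AFR implementation, and ties on mtime are assumed to go to the earlier child.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical per-child lookup reply; NOT the real AFR data structure. */
struct lookup_reply {
    int         op_ret;  /* 0 on success, -1 on failure */
    struct stat stbuf;   /* meaningful only when op_ret == 0 */
};

/* Build the stat returned to the caller:
 *   - attributes come from the successful child with the greatest mtime
 *     (assumption: on a tie the earlier child wins),
 *   - st_ino comes from the first successful child, so the inode number
 *     does not change when a different copy has the newest mtime.
 * Returns 0 on success, -1 if every child failed. */
static int merge_lookup_stat(const struct lookup_reply *replies, size_t count,
                             struct stat *out)
{
    int first = -1;   /* first successful child */
    int latest = -1;  /* successful child with the greatest mtime */

    assert(out != NULL);
    for (size_t i = 0; i < count; i++) {
        if (replies[i].op_ret != 0)
            continue;
        if (first < 0)
            first = (int)i;
        if (latest < 0 ||
            replies[i].stbuf.st_mtime > replies[latest].stbuf.st_mtime)
            latest = (int)i;
    }
    if (first < 0)
        return -1;

    *out = replies[latest].stbuf;
    out->st_ino = replies[first].stbuf.st_ino;
    return 0;
}
```

Note that the mtime (and size, and so on) returned this way can still change from one lookup to the next if the children disagree, which is the instability being discussed.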


>
> Bug:
> When you use vim on a glusterfs file system (with afr whose children
> point to different machines), you sometimes get the warning
> "The file has been changed since reading it!!!". I have submitted this
> bug at https://savannah.nongnu.org/bugs/?21945, but the patch I provided
> only covers the writev and ftruncate functions, so it still
> can't fix this bug. I will provide an improved patch later.
>
> But is there a good reason for lookup to return the stat with the largest mtime?

It is just that the copy with the latest mtime will have the latest,
correct attributes.
However, as I said, we should really look at the createtime and version
xattrs to decide on the latest stat.
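
A possible shape for that comparison, purely as a sketch: the struct layout, the field widths and the order of comparison (createtime first, then version) are all assumptions made for illustration, not a description of the change that will go in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory copy of the two xattrs mentioned above.
 * Field names and widths are assumptions, not the on-disk format. */
struct afr_meta {
    uint32_t createtime;
    uint32_t version;
};

/* Returns 1 if a describes a newer copy than b, else 0.
 * Assumed order: compare createtime first, then version. */
static int meta_is_newer(const struct afr_meta *a, const struct afr_meta *b)
{
    assert(a != NULL && b != NULL);
    if (a->createtime != b->createtime)
        return a->createtime > b->createtime;
    return a->version > b->version;
}

/* Index of the newest copy among the children whose lookup succeeded
 * (ok[i] != 0), or -1 if none succeeded. On a tie the earlier child in
 * the list wins, so the result stays stable when the copies are in sync. */
static int latest_copy(const struct afr_meta *metas, const int *ok,
                       size_t count)
{
    int best = -1;
    for (size_t i = 0; i < count; i++) {
        if (!ok[i])
            continue;
        if (best < 0 || meta_is_newer(&metas[i], &metas[best]))
            best = (int)i;
    }
    return best;
}
```

With a rule like this, the copy chosen changes only when the metadata actually differs, so children that are in sync always yield the stat of the first successful one.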

>
> Also, if you use the read-volume option in afr, I suggest putting the
> `read-volume' volume first in the subvolumes list.

I did not understand this. Can you explain?

>
> Any suggestion?
>
> Thanks.
>
> [1]
>    1. lookup
>    2. stat
>    3. fstat
>    4. chmod
>    5. fchmod
>    6. chown
>    7. fchown
>    8. truncate
>    9. ftruncate
>   10. utimens
>   11. mknod
>   12. mkdir
>   13. symlink
>   14. rename
>   15. link
>   16. create
>   17. readv
>   18. writev
>
> --
> Best Regards,
>  LI Daobing
>
>
