[Gluster-devel] Solving Ctime Issue with legacy files [BUG 1593542]

Xavi Hernandez jahernan at redhat.com
Mon Jun 17 12:08:20 UTC 2019


Hi Kotresh,

On Mon, Jun 17, 2019 at 1:50 PM Kotresh Hiremath Ravishankar <
khiremat at redhat.com> wrote:

> Hi All,
>
> The ctime feature is enabled by default from release gluster-6. But as
> explained in bug [1] there is a known issue with legacy files, i.e., files
> created before the ctime feature was enabled. These files do not have the
> "trusted.glusterfs.mdata" xattr, which maintains the time attributes. So on
> accessing such a file, the xattr gets created with the latest time
> attributes. This is not correct, because all the time attributes (atime,
> mtime, ctime) get updated instead of only the required ones.
>
> There are a couple of approaches to solve this.
>
> 1. On accessing the files, let posix update the time attributes from the
> backend file on the respective replicas. This obviously results in
> inconsistent "trusted.glusterfs.mdata" xattr values within the replica set.
> AFR/EC should heal this xattr as part of metadata heal upon accessing the
> file. It can choose to replicate from any subvolume. Ideally we should
> consider the highest time among the replicas and treat it as the source,
> but any choice should be fine, since the replicas' time attributes are
> mostly in sync, with a maximum difference of a few seconds if I am not
> wrong.
>
>    But client-side self-heal is disabled by default for performance
> reasons [2]. If we choose to go with this approach, we need to consider
> enabling at least client-side metadata self-heal by default. Please share
> your thoughts on enabling it by default.
>
> 2. Don't let posix update the legacy files from the backend. On lookup
> cbk, let the utime xlator synchronously update the time attributes from
> the statbuf received.
>
> Both approaches are similar, as both result in updating the xattr during
> lookup. Please share your inputs on which approach is better.
>
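
For reference, whether a given backend file is a legacy file can be checked
directly on the brick: it simply lacks the trusted.glusterfs.mdata xattr. A
minimal sketch using the standard Linux xattr API (the brick path below is
only a hypothetical example, and this is not glusterfs code):

    /* Sketch: detect a legacy file on a brick by the absence of
     * trusted.glusterfs.mdata. Plain libc only. */
    #include <errno.h>
    #include <stdio.h>
    #include <sys/xattr.h>

    int main(int argc, char *argv[])
    {
            const char *path = argc > 1 ? argv[1] : "/bricks/brick1/file";
            ssize_t len = lgetxattr(path, "trusted.glusterfs.mdata", NULL, 0);

            if (len >= 0)
                    printf("%s: mdata present (%zd bytes)\n", path, len);
            else if (errno == ENODATA)
                    printf("%s: legacy file, no trusted.glusterfs.mdata\n", path);
            else
                    perror("lgetxattr");
            return 0;
    }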

I prefer the second approach. The first approach is not feasible for EC
volumes because self-heal requires that k bricks (in a k+r configuration)
agree on the value of this xattr; otherwise it considers the metadata
damaged and manual intervention is needed to fix it. During an upgrade, the
first r bricks will be upgraded without problems, but
trusted.glusterfs.mdata won't be healed because r < k. In fact this xattr
will be removed from the new bricks because the majority of bricks agree on
the xattr not being present. Once the (r+1)th brick is upgraded, it's
possible that posix sets different values for trusted.glusterfs.mdata,
which will cause self-heal to fail.
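
To make the rolling-upgrade scenario concrete, here is a small sketch of
the counting argument (an illustration only, not the actual AFR/EC heal
logic; the 4+2 configuration is just an example): heal can only pick a
source once at least k bricks agree on the xattr, and while the bricks
lacking it form the majority the xattr would be dropped.

    /* Illustration of the counting argument only; not AFR/EC code. */
    #include <stdbool.h>
    #include <stdio.h>

    int main(void)
    {
            const int k = 4, r = 2;          /* e.g. a 4+2 dispersed volume */
            const int total = k + r;

            for (int upgraded = 0; upgraded <= total; upgraded++) {
                    int with = upgraded;             /* bricks carrying mdata */
                    int without = total - upgraded;

                    bool healable = (with >= k);     /* heal can pick a source */
                    bool removed = (without > with); /* majority says "absent" */

                    printf("upgraded=%d healable=%s removed=%s\n", upgraded,
                           healable ? "yes" : "no", removed ? "yes" : "no");
            }
            return 0;
    }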

The second approach seems better to me if it is guarded by a new option
that enables this behavior. The utime xlator should only update the mdata
xattr if that option is set, and the option should only be settable once
all nodes have been upgraded (controlled by op-version). In this situation,
the first lookup on a file where utime detects that mdata is not set will
require a synchronous update. I think this is good enough because it will
only happen once per file. We'll need to consider cases where different
clients do lookups at the same time, but I think this can easily be solved
by ignoring the request if mdata is already present.
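
A minimal self-contained sketch of that guard order follows. The structs
and helpers below are hypothetical stand-ins, not existing utime xlator
code or glusterfs APIs: act only when the new option is on and mdata is
missing, do one synchronous update built from the statbuf, and treat a
concurrent update from another client as a no-op.

    /* Sketch with stand-in types; not the real utime xlator code. */
    #include <stdbool.h>
    #include <stdio.h>
    #include <sys/stat.h>

    struct inode_stub {
            bool has_mdata;              /* trusted.glusterfs.mdata present? */
    };

    struct opts_stub {
            bool update_legacy_mdata;    /* new option, op-version gated */
    };

    /* Stand-in for a synchronous setxattr of trusted.glusterfs.mdata. */
    static void set_mdata_sync(struct inode_stub *ino, const struct stat *st)
    {
            if (ino->has_mdata)
                    return;              /* a concurrent lookup already set it */
            /* Real code would serialize atime/mtime/ctime from *st into the
             * xattr value and send the setxattr down to the bricks. */
            ino->has_mdata = true;
            printf("mdata set from backend mtime=%lld\n",
                   (long long)st->st_mtime);
    }

    /* Simplified lookup callback: at most one update per legacy file. */
    static void utime_lookup_cbk(const struct opts_stub *opts,
                                 struct inode_stub *ino, const struct stat *st)
    {
            if (opts->update_legacy_mdata && !ino->has_mdata)
                    set_mdata_sync(ino, st);
            /* unwind the lookup to the caller as usual (omitted) */
    }

    int main(void)
    {
            struct opts_stub opts = { .update_legacy_mdata = true };
            struct inode_stub legacy = { .has_mdata = false };
            struct stat st = { 0 };

            st.st_mtime = 1560772100;                /* backend mtime */
            utime_lookup_cbk(&opts, &legacy, &st);   /* first lookup: updates */
            utime_lookup_cbk(&opts, &legacy, &st);   /* later lookups: no-op */
            return 0;
    }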

Xavi


>
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1593542
> [2] https://github.com/gluster/glusterfs/issues/473
>
> --
> Thanks and Regards,
> Kotresh H R
>

