[Gluster-devel] Quota problems without a way of fixing them

Raghavendra Gowdappa rgowdapp at redhat.com
Thu Jan 22 07:28:47 UTC 2015



----- Original Message -----
> From: "Joe Julian" <joe at julianfamily.org>
> To: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
> Cc: pkoro at grid.auth.gr, "Gluster-devel at gluster.org" <gluster-devel at gluster.org>
> Sent: Thursday, January 22, 2015 11:16:39 AM
> Subject: Re: [Gluster-devel] Quota problems without a way of fixing them
> 
> On 01/21/2015 09:32 PM, Raghavendra Gowdappa wrote:
> >
> > ----- Original Message -----
> >> From: "Joe Julian" <joe at julianfamily.org>
> >> To: "Gluster Devel" <gluster-devel at gluster.org>
> >> Cc: "Paschalis Korosoglou" <pkoro at grid.auth.gr>
> >> Sent: Thursday, January 22, 2015 12:54:44 AM
> >> Subject: [Gluster-devel] Quota problems without a way of fixing them
> >>
> >> Paschalis (PeterA in #gluster) has reported these bugs and we've tried to
> >> find the source of the problem to no avail. Worse yet, there's no way to
> >> just reset the quotas to match what's actually there, as far as I can
> >> tell.
> >>
> >> What should we look for to isolate the source of this problem since this
> >> is a
> >> production system with enough activity to make isolating the repro
> >> difficult
> >> at best, and debug logs have enough noise to make isolation nearly
> >> impossible?
> >>
> >> Finally, isn't there some simple way to trigger quota to rescan a path to
> >> reset trusted.glusterfs.quota.size ?
> > 1. Delete the following xattrs from all the files/directories on all the bricks
> >     a) trusted.glusterfs.quota.size
> >     b) trusted.glusterfs.quota.*.contri
> >     c) trusted.glusterfs.quota.dirty
> >
> > 2. Turn off md-cache
> >     # gluster volume set <volname> performance.stat-prefetch off
> >
> > 3. Mount glusterfs asking it to use readdir instead of readdirp
> >     # mount -t glusterfs -o use-readdirp=no <volfile-server>:<volfile-id>
> >     /mnt/glusterfs
> >
> > 4. Do a crawl on the mountpoint
> >     # find /mnt/glusterfs -exec stat \{} \; > /dev/null
> >
> > This should correct the accounting on bricks. Once done, you should see
> > correct values in quota list output. Please let us know if it doesn't work
> > for you.
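
For step 1 above, a rough sketch of what the xattr cleanup could look like on a single brick is below. It is only an illustration with a few assumptions: it has to be run as root directly on the brick backend directory (passed as the command-line argument), the function names are arbitrary, and it deliberately removes only the three xattrs listed in step 1 so that any other quota xattrs (for example the configured limits) are left untouched.

#define _XOPEN_SOURCE 700

#include <stdio.h>
#include <string.h>
#include <ftw.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/xattr.h>

/* matches only the three xattrs from step 1:
 * trusted.glusterfs.quota.size, trusted.glusterfs.quota.dirty and
 * trusted.glusterfs.quota.*.contri */
static int
is_quota_xattr (const char *name)
{
        size_t len = strlen (name);

        if (strcmp (name, "trusted.glusterfs.quota.size") == 0)
                return 1;

        if (strcmp (name, "trusted.glusterfs.quota.dirty") == 0)
                return 1;

        if ((strncmp (name, "trusted.glusterfs.quota.",
                      strlen ("trusted.glusterfs.quota.")) == 0) &&
            (len > strlen (".contri")) &&
            (strcmp (name + len - strlen (".contri"), ".contri") == 0))
                return 1;

        return 0;
}

static int
clear_quota_xattrs (const char *path, const struct stat *sb,
                    int typeflag, struct FTW *ftwbuf)
{
        char    names[65536] = {0, };
        ssize_t len          = 0, off = 0;

        /* list every xattr name on this entry and remove the quota ones */
        len = listxattr (path, names, sizeof (names));

        for (off = 0; off < len; off += strlen (names + off) + 1) {
                if (is_quota_xattr (names + off))
                        removexattr (path, names + off);
        }

        return 0;
}

int
main (int argc, char *argv[])
{
        if (argc != 2) {
                fprintf (stderr, "usage: %s <brick-backend-directory>\n",
                         argv[0]);
                return 1;
        }

        /* walk the brick without following symlinks */
        return nftw (argv[1], clear_quota_xattrs, 16, FTW_PHYS);
}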
> 
> But that could be a months-long process with the size of many of our
> users volumes. There should be a way to do this with a single directory
> tree.

If you can isolate a sub-directory tree where the size accounting has gone bad, this can be done by setting the xattr trusted.glusterfs.quota.dirty on a directory to 1 and then sending a lookup on that directory. What this does is add up the sizes of all immediate children and store the result as the value of trusted.glusterfs.quota.size on the directory. The catch is that the sizes of the immediate children themselves need not be accounted correctly, so the healing has to be done bottom-up: start with the bottom-most directories and work upwards to the top of the isolated subtree. We can have an algorithm like this:

#include <stdio.h>
#include <string.h>
#include <dirent.h>
#include <limits.h>
#include <sys/stat.h>
#include <sys/xattr.h>

void
heal (const char *path)
{
        char        value = 1;
        struct stat stbuf = {0, };

        /* mark the directory dirty so that quota re-adds the
         * contributions of its immediate children */
        setxattr (path, "trusted.glusterfs.quota.dirty",
                  (const void *) &value, sizeof (value), 0);

        /* now that the dirty xattr has been set, trigger a lookup so
         * that the directory is healed */
        stat (path, &stbuf);

        return;
}

void
crawl (DIR *dir, const char *path)
{
        struct dirent *entry = NULL;

        while ((entry = readdir (dir)) != NULL) {
                struct stat st = {0, };
                char        childpath[PATH_MAX] = {0, };

                if ((strcmp (entry->d_name, ".") == 0) ||
                    (strcmp (entry->d_name, "..") == 0))
                        continue;

                snprintf (childpath, sizeof (childpath), "%s/%s",
                          path, entry->d_name);

                /* d_type is not filled in by every filesystem, so use
                 * lstat to check whether this entry is a directory */
                if (lstat (childpath, &st) != 0)
                        continue;

                if (S_ISDIR (st.st_mode)) {
                        DIR *childdir = opendir (childpath);

                        if (childdir == NULL)
                                continue;

                        /* descend first, so that children are healed
                         * before their parent */
                        crawl (childdir, childpath);

                        closedir (childdir);
                }
        }

        heal (path);

        return;
}

Now call crawl on the isolated sub-directory (on the mountpoint). Note that the above is only a sketch; a proper tool should be written around this algorithm. We'll try to add a program to extras/utils which does this.
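
For illustration, a minimal driver around the above sketch could look like the following. It is meant to be compiled together with the heal()/crawl() functions above; the program name quota-heal and the sample path are only placeholders, and the argument must be a sub-directory under the glusterfs mountpoint.

/* compile together with heal()/crawl() above, e.g.
 *     gcc -o quota-heal quota-heal.c
 * and run it against the isolated sub-directory on the mountpoint:
 *     ./quota-heal /mnt/glusterfs/<isolated-subdir>
 */
int
main (int argc, char *argv[])
{
        DIR *dir = NULL;

        if (argc != 2) {
                fprintf (stderr, "usage: %s <sub-directory-on-mountpoint>\n",
                         argv[0]);
                return 1;
        }

        dir = opendir (argv[1]);
        if (dir == NULL) {
                perror ("opendir");
                return 1;
        }

        /* crawl() heals bottom-up and finally heals argv[1] itself */
        crawl (dir, argv[1]);

        closedir (dir);

        return 0;
}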

> 
> >    
> >> His production system has been unmanageable for months now. Is it possible
> >> for someone to spare some cycles to get this looked at?
> >>
> >> 2013-03-04 - https://bugzilla.redhat.com/show_bug.cgi?id=917901
> >> 2013-10-24 - https://bugzilla.redhat.com/show_bug.cgi?id=1023134
> > We are working on these bugs. We'll update on the bugzilla once we find
> > anything substantial.
> 
> 

