[Gluster-devel] Quota problems without a way of fixing them

Vijaikumar M vmallika at redhat.com
Thu Jan 22 07:57:03 UTC 2015


Hi Joe,

Please find the scripts attached, and check whether they solve the
problem:

1) 'quota-verify': finds directories whose quota accounting is wrong.
2) 'quota-heal': heals the directories identified by 'quota-verify'.

Usage of these scripts:

quota-verify needs to be executed for all the bricks on all the nodes in
the cluster where quota is enabled:
# quota-verify -b <brick_path1> >> logfile_node_1
# quota-verify -b <brick_path2> >> logfile_node_1


quota-heal needs to be executed on all the nodes. Please make sure that
no I/O is happening while the quota-heal script is running:
# quota-heal -l logfile_node_1
# quota-heal -l logfile_node_2
...
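
If a node has several bricks, the quota-verify runs above could be wrapped
in a small loop. This is only a minimal sketch: it assumes quota-verify is
on the PATH and takes -b exactly as shown above. The function name
verify_bricks and the QUOTA_VERIFY variable are made up for illustration.

```shell
#!/bin/sh
# Sketch: run quota-verify once per brick, appending all output to one log.
# QUOTA_VERIFY is a hypothetical override; only the -b option comes from
# the usage shown in this mail.
QUOTA_VERIFY=${QUOTA_VERIFY:-quota-verify}

verify_bricks() {
    logfile=$1
    shift
    for brick in "$@"; do
        # Stop at the first brick that fails so the log is not misleading.
        "$QUOTA_VERIFY" -b "$brick" >> "$logfile" || return 1
    done
}

# Example (brick paths are placeholders):
#   verify_bricks logfile_node_1 <brick_path1> <brick_path2>
```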



Thanks,
Vijay




On Thursday 22 January 2015 01:03 PM, Raghavendra Gowdappa wrote:
>
> ----- Original Message -----
>> From: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
>> To: "Joe Julian" <joe at julianfamily.org>
>> Cc: "Gluster-devel at gluster.org" <gluster-devel at gluster.org>, pkoro at grid.auth.gr
>> Sent: Thursday, January 22, 2015 12:58:47 PM
>> Subject: Re: [Gluster-devel] Quota problems without a way of fixing them
>>
>>
>>
>> ----- Original Message -----
>>> From: "Joe Julian" <joe at julianfamily.org>
>>> To: "Raghavendra Gowdappa" <rgowdapp at redhat.com>
>>> Cc: pkoro at grid.auth.gr, "Gluster-devel at gluster.org"
>>> <gluster-devel at gluster.org>
>>> Sent: Thursday, January 22, 2015 11:16:39 AM
>>> Subject: Re: [Gluster-devel] Quota problems without a way of fixing them
>>>
>>> On 01/21/2015 09:32 PM, Raghavendra Gowdappa wrote:
>>>> ----- Original Message -----
>>>>> From: "Joe Julian" <joe at julianfamily.org>
>>>>> To: "Gluster Devel" <gluster-devel at gluster.org>
>>>>> Cc: "Paschalis Korosoglou" <pkoro at grid.auth.gr>
>>>>> Sent: Thursday, January 22, 2015 12:54:44 AM
>>>>> Subject: [Gluster-devel] Quota problems without a way of fixing them
>>>>>
>>>>> Paschalis (PeterA in #gluster) has reported these bugs and we've tried to
>>>>> find the source of the problem to no avail. Worse yet, there's no way to
>>>>> just reset the quotas to match what's actually there, as far as I can
>>>>> tell.
>>>>>
>>>>> What should we look for to isolate the source of this problem, since this
>>>>> is a production system with enough activity to make isolating the repro
>>>>> difficult at best, and debug logs have enough noise to make isolation
>>>>> nearly impossible?
>>>>>
>>>>> Finally, isn't there some simple way to trigger quota to rescan a path to
>>>>> reset trusted.glusterfs.quota.size?
>>>> 1. Delete the following xattrs from all files and directories on all
>>>>    bricks:
>>>>      a) trusted.glusterfs.quota.size
>>>>      b) trusted.glusterfs.quota.*.contri
>>>>      c) trusted.glusterfs.quota.dirty
>>>>
>>>> 2. Turn off md-cache
>>>>      # gluster volume set <volname> performance.stat-prefetch off
>>>>
>>>> 3. Mount glusterfs with readdirp disabled, so that readdir is used instead
>>>>      # mount -t glusterfs -o use-readdirp=no <volfile-server>:<volfile-id> /mnt/glusterfs
>>>>
>>>> 4. Do a crawl on the mountpoint
>>>>      # find /mnt/glusterfs -exec stat \{} \; > /dev/null
>>>>
>>>> This should correct the accounting on the bricks. Once done, you should see
>>>> correct values in the quota list output. Please let us know if it doesn't work
>>>> for you.
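
Step 1 is the only step above without a command. One possible way to do it
with getfattr and setfattr (from the attr package) is sketched below. The
function names are illustrative, and this has not been tried on a live
brick. The filter matches only the three names listed in step 1, so other
xattrs are left untouched.

```shell
#!/bin/sh
# Keep only the xattr names listed in step 1 (names are read from stdin).
quota_xattr_names() {
    grep -E '^trusted\.glusterfs\.quota\.(size|dirty|.+\.contri)$'
}

# Remove the quota accounting xattrs from a single file or directory.
# Without -d, getfattr prints a "# file:" header and then one name per line.
strip_quota_xattrs() {
    getfattr -h -m '^trusted\.glusterfs\.quota\.' --absolute-names "$1" 2>/dev/null |
        quota_xattr_names |
        while IFS= read -r name; do
            setfattr -h -x "$name" "$1"
        done
}

# Walk one brick and strip every entry (paths with newlines are not handled).
strip_brick() {
    find "$1" | while IFS= read -r path; do
        strip_quota_xattrs "$path"
    done
}

# Example (as root, once per brick; the path is a placeholder):
#   strip_brick <brick_path>
```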
>>> But that could be a months-long process given the size of many of our
>>> users' volumes. There should be a way to do this for a single directory
>>> tree.
>> If you can isolate a sub-directory tree where size accounting has gone bad,
> But the problem with this approach is: how do we know whether the parents of
> this sub-directory have the correct size? If a sub-directory has the wrong
> size, then most likely the accounting of all its ancestors, up to the root,
> has gone bad as well. Hence I am skeptical about healing just a "part" of a
> directory tree.
>
>> this can be done by setting xattr trusted.glusterfs.quota.dirty of a
>> directory to 1 and sending a lookup on that directory. Basically what this
>> does is to add sizes of all immediate children and set that as the value of
>> trusted.glusterfs.quota.size on the directory. But, the catch here is that
>> the sizes of immediate children need not be accounted correctly. Hence this
>> healing should be done bottom up starting with bottom-most directory and
>> healing towards the top-level subdirectory which is isolated. We can have an
>> algorithm like this:
>>
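>> #include <dirent.h>
>> #include <stdio.h>
>> #include <string.h>
>> #include <sys/stat.h>
>> #include <sys/xattr.h>
>>
>> void
>> heal (const char *path)
>> {
>>         char value = 1;
>>         struct stat stbuf = {0, };
>>
>>         if (setxattr (path, "trusted.glusterfs.quota.dirty", &value,
>>                       sizeof (value), 0) != 0)
>>                 perror (path);
>>
>>         /* now that the dirty xattr has been set, trigger a lookup so that
>>            the directory is healed */
>>         stat (path, &stbuf);
>> }
>>
>> void
>> crawl (const char *path)
>> {
>>         char childpath[4096];
>>         struct dirent *entry = NULL;
>>         DIR *dir = opendir (path);
>>
>>         if (dir == NULL)
>>                 return;
>>
>>         while ((entry = readdir (dir)) != NULL) {
>>                 /* skip "." and "..", and anything that is not a directory
>>                    (this relies on readdir filling in d_type) */
>>                 if (entry->d_type != DT_DIR ||
>>                     strcmp (entry->d_name, ".") == 0 ||
>>                     strcmp (entry->d_name, "..") == 0)
>>                         continue;
>>
>>                 snprintf (childpath, sizeof (childpath), "%s/%s", path,
>>                           entry->d_name);
>>
>>                 /* heal the children first (bottom-up) */
>>                 crawl (childpath);
>>         }
>>
>>         closedir (dir);
>>
>>         /* all subdirectories are healed, now heal this directory */
>>         heal (path);
>> }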
>>
>> Now call crawl on the isolated sub-directory (on the mountpoint). Note that
>> the above is pseudo-code, and a tool should be written using the above
>> algorithm. We'll try to add a program to extras/utils which does this.
>>
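
The children-first order described above can also be had from the shell
with find -depth, which lists a directory's contents before the directory
itself. What follows is a minimal sketch, to be run as root against the
mountpoint. The function names and the HEAL_CMD hook are illustrative, and
the 0x01 value simply mirrors "char value = 1" in the quoted code, so treat
the exact encoding as an assumption.

```shell
#!/bin/sh
# List the directories under $1 children-first (post-order), so that each
# directory appears only after all of its subdirectories.
dirs_bottom_up() {
    find "$1" -depth -type d
}

# Mark one directory dirty, then stat it to trigger a lookup.
# Assumption: a single 0x01 byte is the value quota expects here.
heal_dir() {
    setfattr -n trusted.glusterfs.quota.dirty -v 0x01 "$1" &&
        stat "$1" > /dev/null
}

# Heal a whole subtree bottom-up. HEAL_CMD is a hypothetical hook that
# replaces heal_dir (useful for a dry run). Returns non-zero if any
# directory failed. Paths containing newlines are not handled.
heal_tree() {
    dirs_bottom_up "$1" | {
        rc=0
        while IFS= read -r dir; do
            "${HEAL_CMD:-heal_dir}" "$dir" || rc=1
        done
        [ "$rc" -eq 0 ]
    }
}

# Example (the path is a placeholder):
#   heal_tree /mnt/glusterfs/<isolated_subdir>
```

Setting HEAL_CMD=echo first prints the order in which directories would be
healed without touching any xattr.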
>>>>     
>>>>> His production system has been unmanageable for months now. Is it
>>>>> possible for someone to spare some cycles to get this looked at?
>>>>>
>>>>> 2013-03-04 - https://bugzilla.redhat.com/show_bug.cgi?id=917901
>>>>> 2013-10-24 - https://bugzilla.redhat.com/show_bug.cgi?id=1023134
>>>> We are working on these bugs. We'll update on the bugzilla once we find
>>>> anything substantial.
>>>
>> _______________________________________________
>> Gluster-devel mailing list
>> Gluster-devel at gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: quota-heal.gz
Type: application/gzip
Size: 1952 bytes
Desc: not available
URL: <http://www.gluster.org/pipermail/gluster-devel/attachments/20150122/3e79c42c/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: quota-verify.gz
Type: application/gzip
Size: 1924 bytes
Desc: not available
URL: <http://www.gluster.org/pipermail/gluster-devel/attachments/20150122/3e79c42c/attachment-0001.bin>

