[Gluster-users] Gluster 3.10.5: used disk size reported by quota and du mismatch

Mauro Tridici mauro.tridici at cmcc.it
Mon Sep 10 11:28:00 UTC 2018


Dear Hari,

The log files attached to my last mail were generated by running the quota-fsck script after deleting the files.
The quota-fsck script version I used is the one at the following link: https://review.gluster.org/#/c/19179/9..9/extras/quota/quota_fsck.py
I didn’t edit the log files, but during the run I forgot to redirect stderr and stdout to the same log file. Sorry, mea culpa!

Anyway, as you suggested, I ran the quota-fsck script again with the --fix-issues option.
When the script finished, I ran the du command again, but the problem is still there.

[root@s02 auto]# df -hT /tier2/ASC/
Filesystem     Type            Size  Used   Avail Use% Mounted on
s02-stg:tier2  fuse.glusterfs   10T  2,6T    7,5T  26% /tier2
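
To put a rough number on the mismatch, the used size in the df output above can be compared with the 22G total from the du output quoted further down. A minimal Python sketch, assuming both tools print binary (1024-based) units with a decimal comma in this locale; parse_human_size is a hypothetical helper, not part of any Gluster tool:

```python
# Binary multipliers, as printed by `df -h` and `du -h`.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_human_size(text):
    """Convert a size such as '2,6T' or '22G' to bytes (decimal comma allowed)."""
    text = text.strip()
    suffix = text[-1].upper()
    if suffix in UNITS:
        number, factor = text[:-1], UNITS[suffix]
    else:
        number, factor = text, 1  # plain byte count, no suffix
    return float(number.replace(",", ".")) * factor


quota_used = parse_human_size("2,6T")  # used column of the df output
du_total = parse_human_size("22G")     # total line of the quoted du output
gap_gib = (quota_used - du_total) / 1024**3
print(f"space accounted by quota but not seen by du: about {gap_gib:.0f} GiB")
```

Both inputs are rounded by the tools themselves, so the result is only an order-of-magnitude figure.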

I’m sorry to bother you so much.
Last time I used the script everything went smoothly, but this time it seems to be more difficult.

The new log files are attached.

Thank you,
Mauro


> On 10 Sep 2018, at 12:27, Hari Gowtham <hgowtham at redhat.com> wrote:
> 
> On Mon, Sep 10, 2018 at 3:13 PM Mauro Tridici <mauro.tridici at cmcc.it> wrote:
>> 
>> 
>> Dear Hari,
>> 
>> I followed your suggestions but, unfortunately, nothing has changed.
>> I ran both the quota-fsck script with the --fix-issues option and the "setfattr -n trusted.glusterfs.quota.dirty -v 0x3100" command against the files and directories you mentioned (on each available brick).
> 
> There may be an issue with the fix-issue option in the script. Since the
> directories with the accounting mismatch have been found, it is better to
> set the dirty xattr and then do a du (this way is easy and should resolve
> the issue). The script can be used when we don't know where the issue is.
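
The manual procedure described here (set the dirty xattr on every brick, then trigger a lookup with du through the mount) can be sketched as a dry run. This Python snippet only prints the commands and executes nothing; the brick layout /gluster/mnt1..12/brick is taken from the hari-20180910 script and the /tier2 mount from the df output in this thread, while the example directory and the function name are illustrative assumptions. The setfattr lines would have to be run as root on each server hosting the bricks, and du on a client mount.

```python
import shlex

# Assumptions taken from this thread, adjust for another setup.
BRICKS = [f"/gluster/mnt{i}/brick" for i in range(1, 13)]
MOUNT = "/tier2"
DIRTY_XATTR = "trusted.glusterfs.quota.dirty"
DIRTY_VALUE = "0x3100"  # value used with setfattr in this thread


def dirty_and_crawl_commands(rel_dir, bricks=BRICKS, mount=MOUNT):
    """Return the commands to mark rel_dir dirty on every brick, then du it."""
    rel_dir = rel_dir.strip("/")
    commands = [
        ["setfattr", "-n", DIRTY_XATTR, "-v", DIRTY_VALUE, f"{brick}/{rel_dir}"]
        for brick in bricks
    ]
    # The du through the FUSE mount triggers the lookup that heals accounting.
    commands.append(["du", "-hs", f"{mount}/{rel_dir}"])
    return commands


for command in dirty_and_crawl_commands("ASC/orientgate"):
    print(shlex.join(command))
```

Printing the commands first makes it easy to review the paths before touching any xattr.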
> 
>> The disk quota assigned to the /tier2/ASC directory appears to be partially used (about 2,6 TB), but the "real and current" situation is the following (I deleted all files in the primavera directory):
> 
> If the files have been deleted, then the state recorded in the script's
> log file is outdated. The folders I suggested were based on the old log
> file, so setting the dirty xattr and then doing a lookup (du on that dir)
> might not help.
> 
>> 
>> [root@s03 qc]# du -hsc /tier2/ASC/*
>> 22G /tier2/ASC/orientgate
>> 26K /tier2/ASC/primavera
>> 22G total
>> 
>> So I think the problem must be in either the "orientgate" or the "primavera" directory, right?
>> For this reason, to collect fresh logs, I ran the check script again starting from the top-level directory "ASC", using the following bash script (named hari-20180910) based on the new version of quota_fsck (rel. 9):
>> 
>> hari-20180910 script:
>> 
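>> #!/bin/bash
>> 
>> #set -xv
>> 
>> # Name the log file after the server this runs on.
>> host=$(hostname)
>> 
>> # Check the ASC sub-directory on each of the 12 local bricks,
>> # appending the script's stdout to a single per-host log.
>> for i in {1..12}
>> do
>>     ./quota_fsck_r9.py --full-logs --sub-dir ASC "/gluster/mnt$i/brick" >> "$host.log"
>> done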
>> 
>> The log files generated by the script are attached.
>> 
>> SOME IMPORTANT NOTES:
>> 
>> - the "primavera" directory no longer appears in the new log files
>> 
>> Is there something more that I can do?
> Since files were deleted, the accounting will have changed again.
> 
> We need to start over from the beginning, as the suggestions above may
> no longer hold.
> 
> I find that the log files have been edited. A few lines are missing. Can
> you send the actual log file from running the script?
> I would also recommend running the script after all the files are
> deleted (or other major modifications are done),
> so that we can fix everything once at the end.
> 
> If the script's fix-issue argument doesn't work on the directory or
> subdirectory where you find the mismatch, then you can send the whole
> file.
> I will check the log and let you know where you need to do the lookup.
> 
>> 
>> Thank you very much for your patience.
>> Regards,
>> Mauro
>> 
>> 
>> On 10 Sep 2018, at 10:51, Hari Gowtham <hgowtham at redhat.com> wrote:
>> 
>> Hi,
>> 
>> Looking at the logs, I can see that the following paths:
>> 
>> /orientgate/ftp/climate/3_urban_adaptation_health/6_budapest_veszprem_hungary/RHMSS_CMCC-CM_NMMB_Balkan_8km_1971-2005
>> /orientgate/ftp/climate/3_urban_adaptation_health/6_budapest_veszprem_hungary/RHMSS_ERA40_NMMB_Balkan_8km_1971-2000
>> /orientgate/ftp/climate/3_urban_adaptation_health/6_budapest_veszprem_hungary/RHMSS_CMCC-CM_NMMB-RCP8.5_Balkan_8km_2010-2100
>> /primavera/cam
>> 
>> have a mismatch.
>> 
>> You can try setting the dirty xattr on these and then doing a du on them.
>> 
>> A few corrections to my earlier comments.
>> The contri size in the xattr and the aggregated size have to be checked.
>> 
>> On Mon, Sep 10, 2018 at 1:16 PM Mauro Tridici <mauro.tridici at cmcc.it> wrote:
>> 
>> 
>> 
>> Hi Hari,
>> 
>> thank you very much for your help.
>> I will try the latest available version of the quota_fsck script and give you feedback as soon as possible.
>> 
>> Thank you again for the detailed explanation.
>> Regards,
>> Mauro
>> 
>> On 10 Sep 2018, at 09:17, Hari Gowtham <hgowtham at redhat.com> wrote:
>> 
>> Hi Mauro,
>> 
>> The problem might be somewhere else, so setting the xattr and
>> doing the lookup might not have fixed the issue.
>> 
>> To resolve this we need to read the log file produced by the fsck
>> script. In this log file we need to look at the size reported by the
>> xattr (the value after "SIZE:" in the log file) and the size reported by
>> the stat on the file (the value after "st_size=").
>> 
>> 
>> The contri size in the xattr and the aggregated size have to be checked.
>> 
>> These two should be the same. If they mismatch, then we have to find
>> the topmost dir which has the mismatch.
>> 
>> 
>> The bottom-most dir/file has to be found. Replace "top" with "bottom"
>> in the following places as well.
>> 
>> On this topmost directory you have to set the dirty xattr and then
>> do a lookup.
>> 
>> If there are two different directories without a common top directory,
>> then both of them have to undergo the above process.
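
Once the mismatching paths have been read out of the fsck log, picking the bottom-most ones (per the correction above) is mechanical. A minimal Python sketch; the input list is hypothetical and would come from inspecting the log by hand, since the exact log format is not shown in this thread, and bottom_most is an illustrative name rather than part of quota_fsck:

```python
from pathlib import PurePosixPath


def bottom_most(mismatched):
    """Return the mismatching paths that have no mismatching path beneath them."""
    paths = {PurePosixPath(p) for p in mismatched}
    # A path is kept only if it is not an ancestor of any other listed path.
    return sorted(
        str(p) for p in paths if not any(p in q.parents for q in paths)
    )


# Hypothetical input: directories whose sizes disagree in the log.
candidates = [
    "/orientgate",
    "/orientgate/ftp",
    "/orientgate/ftp/climate",
    "/primavera/cam",
]
for path in bottom_most(candidates):
    print(path)
```

Directories without a common mismatching parent, such as the two branches in this example, each yield their own entry, so each would get the dirty xattr and a lookup.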
>> 
>> The fsck script should work fine. Can you try "--fix-issue" with
>> the latest script instead of the 6th patch used above?
>> 
>> 
>> 
>> 
>> 
>> 
>> --
>> Regards,
>> Hari Gowtham.
>> 
>> 
>> 
>> -------------------------
>> Mauro Tridici
>> 
>> Fondazione CMCC
>> CMCC Supercomputing Center
>> presso Complesso Ecotekne - Università del Salento -
>> Strada Prov.le Lecce - Monteroni sn
>> 73100 Lecce  IT
>> http://www.cmcc.it
>> 
>> mobile: (+39) 327 5630841
>> email: mauro.tridici at cmcc.it
>> 
> 
> 
> -- 
> Regards,
> Hari Gowtham.



-------------- next part --------------
A non-text attachment was scrubbed...
Name: log_after_deletion.tar.gz
Type: application/x-gzip
Size: 51659 bytes
Desc: not available
URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20180910/0bc73f33/attachment.gz>