[Gluster-users] Gluster 3.10.5: used disk size reported by quota and du mismatch

Mauro Tridici mauro.tridici at cmcc.it
Wed Jul 11 08:23:25 UTC 2018


Hi Hari, Hi Sanoj,

thank you very much for your patience and your support! 
The problem has been solved following your instructions :-)

N.B.: in order to reduce the running time, I executed the “du” command as follows:

for i in {1..12}
do
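 # du only the affected subdirectory on this brick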
 du /gluster/mnt$i/brick/CSP/ans004/ftp
done

rather than at the top level of each brick tree ("/gluster/mnt$i/brick").

I hope that was the right approach :-)

Thank you again for helping me to solve this issue.
Have a good day.
Mauro


> On 11 Jul 2018, at 09:16, Hari Gowtham <hgowtham at redhat.com> wrote:
> 
> Hi,
> 
> There was an accounting issue in your setup.
> The directories ans004/ftp/CMCC-CM2-VHR4-CTR/atm/hist and ans004/ftp/CMCC-CM2-VHR4
> had wrong size values on them.
> 
> To fix it, you will have to set the dirty xattr (an internal gluster
> xattr) on these directories, which will mark them for recalculation.
> Then do a du on the mount after setting the xattrs; this triggers a
> stat that recalculates and updates the right values.
> 
> To set dirty xattr:
> setfattr -n trusted.glusterfs.quota.dirty -v 0x3100 <path to the directory>
> This has to be done for both directories, one after the other, on each brick.
> Once done for all the bricks, issue the du command.
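> 
> A minimal sketch of how this could be scripted on each node (assuming the
> /gluster/mnt{1..12}/brick layout from this thread; the brick-relative
> directory paths below are my assumption and must be adjusted to your setup):
> 
> #!/bin/bash
> # Hypothetical helper: mark both mis-accounted directories dirty on every
> # local brick so that quota recalculates their sizes.
> for i in {1..12}
> do
>   for d in CSP/ans004/ftp/CMCC-CM2-VHR4-CTR/atm/hist CSP/ans004/ftp/CMCC-CM2-VHR4
>   do
>     setfattr -n trusted.glusterfs.quota.dirty -v 0x3100 "/gluster/mnt$i/brick/$d"
>   done
> done
> 
> After running it on all the nodes, a du on the client mount
> (for example: du /tier2/CSP/ans004) triggers the recalculation.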
> 
> Thanks to Sanoj for the guidance
> On Tue, Jul 10, 2018 at 6:37 PM Mauro Tridici <mauro.tridici at cmcc.it> wrote:
>> 
>> 
>> Hi Hari,
>> 
>> sorry for the late reply.
>> Yes, the gluster volume is a single volume that is spread across all 3 nodes and has 36 bricks.
>> 
>> In attachment you can find a tar.gz file containing:
>> 
>> - gluster volume status command output;
>> - gluster volume info command output;
>> - the output of the following script execution (it generated one log file per server: s01.log, s02.log, s03.log).
>> 
>> This is the “check.sh” script that has been executed on each server (servers are s01, s02, s03).
>> 
>> #!/bin/bash
>> 
>> #set -xv
>> 
>> host=$(hostname)
>> 
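>> # run the quota fsck script on each of the 12 local bricks, appending the output to one log per host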
>> for i in {1..12}
>> do
>> ./quota_fsck_new-6.py --full-logs --sub-dir CSP/ans004 /gluster/mnt$i/brick >> $host.log
>> done
>> 
>> Many thanks,
>> Mauro
>> 
>> 
>> On 10 Jul 2018, at 12:12, Hari Gowtham <hgowtham at redhat.com> wrote:
>> 
>> Hi Mauro,
>> 
>> Can you send the gluster v status command output?
>> 
>> Is it a single volume that is spread across all 3 nodes and has 36 bricks?
>> If yes, you will have to run it on all the bricks.
>> 
>> In the command, use the sub-dir option if you are running it only for the
>> directory where the limit is set; if you are running it on the brick
>> mount path, you can remove it.
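>> 
>> For example (a sketch; the brick path is taken from your df output and the
>> script name may differ on your machine):
>> 
>> # only for the directory where the quota limit is set:
>> ./quota_fsck_new.py --full-logs --sub-dir CSP/ans004 /gluster/mnt1/brick
>> # for the whole brick, without the sub-dir option:
>> ./quota_fsck_new.py --full-logs /gluster/mnt1/brick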
>> 
>> The full-logs option will consume a lot of space, as it is going to record
>> the xattrs for each entry inside the path we are running it on. This data
>> is needed to cross-check and verify quota's marker functionality.
>> 
>> To reduce resource consumption you can run it on one replica set alone
>> (if it's a replicate volume), but it's better to run it on all the
>> bricks, if possible and if the space consumed is fine with you.
>> 
>> Make sure you run it with the script from the link provided above by Sanoj (patch set 6).
>> On Tue, Jul 10, 2018 at 2:54 PM Mauro Tridici <mauro.tridici at cmcc.it> wrote:
>> 
>> 
>> 
>> Hi Hari,
>> 
>> thank you very much for your answer.
>> I will try to use the script mentioned above, pointing it at each backend brick.
>> 
>> So, if I understand correctly, since I have a gluster cluster composed of 3 nodes (with 12 bricks on each node), I have to execute the script 36 times. Right?
>> 
>> You can find below the “df” command output executed on a cluster node:
>> 
>> /dev/mapper/cl_s01-gluster           100G   33M    100G   1% /gluster
>> /dev/mapper/gluster_vgd-gluster_lvd  9,0T  5,6T    3,5T  62% /gluster/mnt2
>> /dev/mapper/gluster_vge-gluster_lve  9,0T  5,7T    3,4T  63% /gluster/mnt3
>> /dev/mapper/gluster_vgj-gluster_lvj  9,0T  5,7T    3,4T  63% /gluster/mnt8
>> /dev/mapper/gluster_vgc-gluster_lvc  9,0T  5,6T    3,5T  62% /gluster/mnt1
>> /dev/mapper/gluster_vgl-gluster_lvl  9,0T  5,8T    3,3T  65% /gluster/mnt10
>> /dev/mapper/gluster_vgh-gluster_lvh  9,0T  5,7T    3,4T  64% /gluster/mnt6
>> /dev/mapper/gluster_vgf-gluster_lvf  9,0T  5,7T    3,4T  63% /gluster/mnt4
>> /dev/mapper/gluster_vgm-gluster_lvm  9,0T  5,4T    3,7T  60% /gluster/mnt11
>> /dev/mapper/gluster_vgn-gluster_lvn  9,0T  5,4T    3,7T  60% /gluster/mnt12
>> /dev/mapper/gluster_vgg-gluster_lvg  9,0T  5,7T    3,4T  64% /gluster/mnt5
>> /dev/mapper/gluster_vgi-gluster_lvi  9,0T  5,7T    3,4T  63% /gluster/mnt7
>> /dev/mapper/gluster_vgk-gluster_lvk  9,0T  5,8T    3,3T  65% /gluster/mnt9
>> 
>> I will execute the following command and will post the output here.
>> 
>> ./quota_fsck_new.py --full-logs --sub-dir /gluster/mnt{1..12}
>> 
>> Thank you again for your support.
>> Regards,
>> Mauro
>> 
>> On 10 Jul 2018, at 11:02, Hari Gowtham <hgowtham at redhat.com> wrote:
>> 
>> Hi,
>> 
>> To my understanding, there is no explicit command to back up all the
>> quota limits; I need to look into this further.
>> But you can do the following to back them up and set them again.
>> "gluster volume quota <volname> list" will print all the quota
>> limits on that particular volume.
>> You will have to make a note of the directories with their respective limits.
>> Once noted down, you can disable quota on the volume and then enable it.
>> Once enabled, you will have to set each limit explicitly on the volume.
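>> 
>> For example, something along these lines (a sketch for your tier2 volume;
>> the limit-usage value has to match what the list command reported):
>> 
>> # save the current limits for reference
>> gluster volume quota tier2 list > /root/quota_limits.txt
>> # reset the quota accounting
>> gluster volume quota tier2 disable
>> gluster volume quota tier2 enable
>> # re-apply each saved limit explicitly, e.g.:
>> gluster volume quota tier2 limit-usage /CSP/ans004 1TB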
>> 
>> Before doing this, we suggest you try running the script mentioned
>> above with the backend brick path instead of the mount path.
>> You need to run it on the machines where the backend bricks are
>> located, not on the mount.
>> On Mon, Jul 9, 2018 at 9:01 PM Mauro Tridici <mauro.tridici at cmcc.it> wrote:
>> 
>> 
>> Hi Sanoj,
>> 
>> could you provide me with the command I need in order to back up all the quota limits?
>> If there is no solution for this kind of problem, I would like to try to follow your “backup” suggestion.
>> 
>> Do you think that I should contact gluster developers too?
>> 
>> Thank you very much.
>> Regards,
>> Mauro
>> 
>> 
>> On 5 Jul 2018, at 09:56, Mauro Tridici <mauro.tridici at cmcc.it> wrote:
>> 
>> Hi Sanoj,
>> 
>> unfortunately the output of the command execution was not helpful.
>> 
>> [root at s01 ~]# find /tier2/CSP/ans004  | xargs getfattr -d -m. -e hex
>> [root at s01 ~]#
>> 
>> Do you have some other idea in order to detect the cause of the issue?
>> 
>> Thank you again,
>> Mauro
>> 
>> 
>> On 5 Jul 2018, at 09:08, Sanoj Unnikrishnan <sunnikri at redhat.com> wrote:
>> 
>> Hi Mauro,
>> 
>> Due to a script issue, not all the necessary xattrs were captured.
>> Could you provide the xattrs with:
>> find /tier2/CSP/ans004 | xargs getfattr -d -m. -e hex
>> 
>> Meanwhile, if you are being impacted, you could do the following:
>> 1. back up the quota limits
>> 2. disable quota
>> 3. enable quota
>> 4. freshly set the limits
>> 
>> Please capture the xattr values first, so that we can find out what went wrong.
>> Regards,
>> Sanoj
>> 
>> 
>> On Tue, Jul 3, 2018 at 4:09 PM, Mauro Tridici <mauro.tridici at cmcc.it> wrote:
>> 
>> 
>> Dear Sanoj,
>> 
>> thank you very much for your support.
>> I just downloaded and executed the script you suggested.
>> 
>> This is the full command I executed:
>> 
>> ./quota_fsck_new.py --full-logs --sub-dir /tier2/CSP/ans004/ /gluster
>> 
>> In attachment, you can find the logs generated by the script.
>> What can I do now?
>> 
>> Thank you very much for your patience.
>> Mauro
>> 
>> 
>> 
>> 
>> On 3 Jul 2018, at 11:34, Sanoj Unnikrishnan <sunnikri at redhat.com> wrote:
>> 
>> Hi Mauro,
>> 
>> This may be an issue with the update of backend xattrs.
>> To root-cause it further and provide a resolution, could you provide me with the logs generated by running the following fsck script?
>> https://review.gluster.org/#/c/19179/6/extras/quota/quota_fsck.py
>> 
>> Try running the script and reply with the logs generated.
>> 
>> Thanks,
>> Sanoj
>> 
>> 
>> On Mon, Jul 2, 2018 at 2:21 PM, Mauro Tridici <mauro.tridici at cmcc.it> wrote:
>> 
>> 
>> Dear Users,
>> 
>> I just noticed that, after some data deletions executed inside the "/tier2/CSP/ans004" folder, the amount of used disk space reported by the quota command doesn't reflect the value indicated by the du command.
>> From what I found on the web, this seems to be a bug in previous versions of GlusterFS that was already fixed.
>> In my case, unfortunately, the problem seems to still be there.
>> 
>> How can I solve this issue? Is it possible to do it without scheduling a downtime period?
>> 
>> Thank you very much in advance,
>> Mauro
>> 
>> [root at s01 ~]# glusterfs -V
>> glusterfs 3.10.5
>> Repository revision: git://git.gluster.org/glusterfs.git
>> Copyright (c) 2006-2016 Red Hat, Inc. <https://www.gluster.org/>
>> GlusterFS comes with ABSOLUTELY NO WARRANTY.
>> It is licensed to you under your choice of the GNU Lesser
>> General Public License, version 3 or any later version (LGPLv3
>> or later), or the GNU General Public License, version 2 (GPLv2),
>> in all cases as published by the Free Software Foundation.
>> 
>> [root at s01 ~]# gluster volume quota tier2 list /CSP/ans004
>>                Path                   Hard-limit  Soft-limit      Used  Available  Soft-limit exceeded? Hard-limit exceeded?
>> -------------------------------------------------------------------------------------------------------------------------------
>> /CSP/ans004                                1.0TB     99%(1013.8GB)    3.9TB  0Bytes             Yes                  Yes
>> 
>> [root at s01 ~]# du -hs /tier2/CSP/ans004/
>> 295G /tier2/CSP/ans004/
>> 
>> 
>> 
>> 
>> --
>> Regards,
>> Hari Gowtham.
>> 
>> 
>> 
>> 
>> 
>> --
>> Regards,
>> Hari Gowtham.
>> 
>> 
> 
> 
> -- 
> Regards,
> Hari Gowtham.


-------------------------
Mauro Tridici

Fondazione CMCC
CMCC Supercomputing Center
presso Complesso Ecotekne - Università del Salento -
Strada Prov.le Lecce - Monteroni sn
73100 Lecce  IT
http://www.cmcc.it

mobile: (+39) 327 5630841
email: mauro.tridici at cmcc.it
