[Gluster-users] Gluster 3.10.5: used disk size reported by quota and du mismatch
Mauro Tridici
mauro.tridici at cmcc.it
Sat Sep 8 18:55:11 UTC 2018
Oops, sorry.
The attached ZIP file was quarantined by the mail security system policies (it seems that .zip and .docm files are not allowed).
So, I am sending you the log files in tar.gz format instead.
Thank you,
Mauro
> On 08 Sep 2018, at 20:46, Mauro Tridici <mauro.tridici at cmcc.it> wrote:
>
>
> Hi Hari, Hi Sanoj,
>
> sorry to disturb you again, but I have a problem similar to the one described (and solved) below.
> As you can see from the following output, there is a mismatch between the used disk size reported by quota and the value returned by the du command.
>
> [root at s01 qc]# df -hT /tier2/ASC
> Filesystem Type Size Used Avail Use% Mounted on
> s01-stg:tier2 fuse.glusterfs 10T 4,5T 5,6T 45% /tier2
>
> [root at s01 qc]# gluster volume quota tier2 list /ASC
> Path Hard-limit Soft-limit Used Available Soft-limit exceeded? Hard-limit exceeded?
> -------------------------------------------------------------------------------------------------------------------------------
> /ASC 10.0TB 99%(9.9TB) 4.4TB 5.6TB No No
>
> [root at s01 qc]# du -hs /tier2/ASC
> 1,9T /tier2/ASC
>
> I already executed the “quota_fsck_new-6.py” script to identify the folder to be fixed.
> Attached you can find the output produced by running a customized “check” script on each Gluster server (s01, s02 and s03 are the server names).
>
> check script:
>
> #!/bin/bash
>
> #set -xv
>
> host=$(hostname)
>
> for i in {1..12}
> do
>     # run the quota fsck script against each of the 12 bricks on this server,
>     # appending the output to a per-host log file
>     ./quota_fsck_new-6.py --full-logs --sub-dir ASC /gluster/mnt$i/brick >> $host.log
> done
>
> I don’t know how to interpret the log files in order to identify the critical folder.
> Anyway, since the mismatch was detected after some files were deleted from the /tier2/ASC/primavera/cam directory, I suspect the problem is there.
> So, I tried to execute the following customized fix script:
>
> fix script:
>
> #!/bin/bash
>
> #set -xv
>
> host=$(hostname)
>
> for i in {1..12}
> do
>     # mark the affected directories dirty so that quota recalculates their usage
>     setfattr -n trusted.glusterfs.quota.dirty -v 0x3100 /gluster/mnt$i/brick/ASC/primavera/cam
>     setfattr -n trusted.glusterfs.quota.dirty -v 0x3100 /gluster/mnt$i/brick/ASC/primavera
>     setfattr -n trusted.glusterfs.quota.dirty -v 0x3100 /gluster/mnt$i/brick/ASC
>
>     # after the setfattr procedure, comment out the setfattr lines, uncomment the du lines and execute the script again
>     #du /gluster/mnt$i/brick/ASC/primavera/cam
>     #du /gluster/mnt$i/brick/ASC/primavera
>     #du /gluster/mnt$i/brick/ASC
> done
>
> Unfortunately, the problem seems to still be there.
> I also tried to run the quota_fsck script with the --fix-issues option, but nothing changed.
> Could you please help me solve this issue?
>
> Thank you very much in advance,
> Mauro
>
> <logs.zip>
>
>> On 11 Jul 2018, at 10:23, Mauro Tridici <mauro.tridici at cmcc.it> wrote:
>>
>> Hi Hari, Hi Sanoj,
>>
>> thank you very much for your patience and your support!
>> The problem has been solved following your instructions :-)
>>
>> N.B.: in order to reduce the running time, I executed the “du” command as follows:
>>
>> for i in {1..12}
>> do
>>     du /gluster/mnt$i/brick/CSP/ans004/ftp
>> done
>>
>> and not on each brick at the "/gluster/mnt$i/brick" tree level.
>>
>> I hope it was a correct idea :-)
>>
>> Thank you again for helping me to solve this issue.
>> Have a good day.
>> Mauro
>>
>>
>>> On 11 Jul 2018, at 09:16, Hari Gowtham <hgowtham at redhat.com> wrote:
>>>
>>> Hi,
>>>
>>> There was an accounting issue in your setup.
>>> The directories ans004/ftp/CMCC-CM2-VHR4-CTR/atm/hist and ans004/ftp/CMCC-CM2-VHR4
>>> had wrong size values on them.
>>>
>>> To fix it, you will have to set the dirty xattr (an internal Gluster
>>> xattr) on these directories, which marks them so that their values are
>>> calculated again. Then do a du on the mount after setting the xattrs;
>>> this triggers a stat that recalculates and updates the right values.
>>>
>>> To set the dirty xattr:
>>> setfattr -n trusted.glusterfs.quota.dirty -v 0x3100 <path to the directory>
>>> This has to be done for both directories, one after the other, on each brick.
>>> Once done for all the bricks, issue the du command.
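>>>
>>> For example, a minimal sketch of the per-brick procedure (this assumes the same twelve /gluster/mnt$i/brick paths and the /tier2 client mount used elsewhere in this thread, and that both directories live under CSP on each brick; adjust the paths to your actual layout):
>>>
>>> #!/bin/bash
>>> # on each of the three servers, mark both directories dirty on every local brick
>>> for i in {1..12}
>>> do
>>>     setfattr -n trusted.glusterfs.quota.dirty -v 0x3100 /gluster/mnt$i/brick/CSP/ans004/ftp/CMCC-CM2-VHR4-CTR/atm/hist
>>>     setfattr -n trusted.glusterfs.quota.dirty -v 0x3100 /gluster/mnt$i/brick/CSP/ans004/ftp/CMCC-CM2-VHR4
>>> done
>>>
>>> # once every brick on every server has been marked, trigger the recalculation
>>> # with a du (i.e. a stat) through a client mount:
>>> du /tier2/CSP/ans004/ftp/CMCC-CM2-VHR4-CTR/atm/hist
>>> du /tier2/CSP/ans004/ftp/CMCC-CM2-VHR4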
>>>
>>> Thanks to Sanoj for the guidance.
>>> On Tue, Jul 10, 2018 at 6:37 PM Mauro Tridici <mauro.tridici at cmcc.it> wrote:
>>>>
>>>>
>>>> Hi Hari,
>>>>
>>>> sorry for the late reply.
>>>> Yes, the Gluster volume is a single volume that is spread across all 3 nodes and has 36 bricks.
>>>>
>>>> Attached you can find a tar.gz file containing:
>>>>
>>>> - gluster volume status command output;
>>>> - gluster volume info command output;
>>>> - the output of the following script execution (it generated 3 files per server: s01.log, s02.log, s03.log).
>>>>
>>>> This is the “check.sh” script that has been executed on each server (servers are s01, s02, s03).
>>>>
>>>> #!/bin/bash
>>>>
>>>> #set -xv
>>>>
>>>> host=$(hostname)
>>>>
>>>> for i in {1..12}
>>>> do
>>>>     ./quota_fsck_new-6.py --full-logs --sub-dir CSP/ans004 /gluster/mnt$i/brick >> $host.log
>>>> done
>>>>
>>>> Many thanks,
>>>> Mauro
>>>>
>>>>
>>>> On 10 Jul 2018, at 12:12, Hari Gowtham <hgowtham at redhat.com> wrote:
>>>>
>>>> Hi Mauro,
>>>>
>>>> Can you send the gluster v status command output?
>>>>
>>>> Is it a single volume that is spread across all 3 nodes and has 36 bricks?
>>>> If yes, you will have to run it on all the bricks.
>>>>
>>>> In the command, use the --sub-dir option if you are running the check only for the
>>>> directory where the limit is set; if you are running it on the brick
>>>> mount path, you can drop the option (see the sketch below).
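>>>>
>>>> For example, a sketch of the two ways to invoke it (assuming the local copy of the script is named quota_fsck_new-6.py as elsewhere in this thread, a single brick at the hypothetical path /gluster/mnt1/brick, and a quota limit set on CSP/ans004; adjust to your layout):
>>>>
>>>> # check only the subtree where the quota limit is set
>>>> ./quota_fsck_new-6.py --full-logs --sub-dir CSP/ans004 /gluster/mnt1/brick
>>>>
>>>> # or check the whole brick, without --sub-dir
>>>> ./quota_fsck_new-6.py --full-logs /gluster/mnt1/brick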
>>>>
>>>> The --full-logs option will consume a lot of space, as it records the
>>>> xattrs for each entry inside the path we are running it on. This data
>>>> is needed to cross-check and verify quota's marker functionality.
>>>>
>>>> To reduce resource consumption, you can run it on one replica set alone
>>>> (if it is a replicate volume), but it is better to run it on all the
>>>> bricks if possible and if the space consumed is fine with you.
>>>>
>>>> Make sure you use the script from the link provided above by Sanoj (patch set 6).
>>>> On Tue, Jul 10, 2018 at 2:54 PM Mauro Tridici <mauro.tridici at cmcc.it> wrote:
>>>>
>>>>
>>>>
>>>> Hi Hari,
>>>>
>>>> thank you very much for your answer.
>>>> I will try to use the script mentioned above, pointing it to each backend brick.
>>>>
>>>> So, if I understand correctly, since I have a Gluster cluster composed of 3 nodes (with 12 bricks on each node), I have to execute the script 36 times. Right?
>>>>
>>>> You can find below the “df” command output executed on a cluster node:
>>>>
>>>> /dev/mapper/cl_s01-gluster 100G 33M 100G 1% /gluster
>>>> /dev/mapper/gluster_vgd-gluster_lvd 9,0T 5,6T 3,5T 62% /gluster/mnt2
>>>> /dev/mapper/gluster_vge-gluster_lve 9,0T 5,7T 3,4T 63% /gluster/mnt3
>>>> /dev/mapper/gluster_vgj-gluster_lvj 9,0T 5,7T 3,4T 63% /gluster/mnt8
>>>> /dev/mapper/gluster_vgc-gluster_lvc 9,0T 5,6T 3,5T 62% /gluster/mnt1
>>>> /dev/mapper/gluster_vgl-gluster_lvl 9,0T 5,8T 3,3T 65% /gluster/mnt10
>>>> /dev/mapper/gluster_vgh-gluster_lvh 9,0T 5,7T 3,4T 64% /gluster/mnt6
>>>> /dev/mapper/gluster_vgf-gluster_lvf 9,0T 5,7T 3,4T 63% /gluster/mnt4
>>>> /dev/mapper/gluster_vgm-gluster_lvm 9,0T 5,4T 3,7T 60% /gluster/mnt11
>>>> /dev/mapper/gluster_vgn-gluster_lvn 9,0T 5,4T 3,7T 60% /gluster/mnt12
>>>> /dev/mapper/gluster_vgg-gluster_lvg 9,0T 5,7T 3,4T 64% /gluster/mnt5
>>>> /dev/mapper/gluster_vgi-gluster_lvi 9,0T 5,7T 3,4T 63% /gluster/mnt7
>>>> /dev/mapper/gluster_vgk-gluster_lvk 9,0T 5,8T 3,3T 65% /gluster/mnt9
>>>>
>>>> I will execute the following command and post the output here.
>>>>
>>>> ./quota_fsck_new.py --full-logs --sub-dir /gluster/mnt{1..12}
>>>>
>>>> Thank you again for your support.
>>>> Regards,
>>>> Mauro
>>>>
>>>> On 10 Jul 2018, at 11:02, Hari Gowtham <hgowtham at redhat.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> There is no explicit command to back up all the quota limits, as far as I
>>>> understand; I need to look further into this.
>>>> But you can do the following to back them up and set them again.
>>>> "gluster volume quota <volname> list" will print all the quota
>>>> limits set on that particular volume.
>>>> You will have to make a note of the directories with their respective limits.
>>>> Once noted down, you can disable quota on the volume and then enable it.
>>>> Once enabled, you will have to set each limit explicitly on the volume again (see the sketch below).
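>>>>
>>>> A minimal sketch of that procedure, assuming a volume named tier2 and using the "gluster volume quota ... limit-usage" command to re-apply each saved limit (the backup file name and the example path/limit are illustrative):
>>>>
>>>> # save the current limits before touching anything
>>>> gluster volume quota tier2 list > quota_limits_backup.txt
>>>>
>>>> # disable and re-enable quota (disabling drops all existing limits)
>>>> gluster volume quota tier2 disable
>>>> gluster volume quota tier2 enable
>>>>
>>>> # re-apply each limit noted in the backup, for example:
>>>> gluster volume quota tier2 limit-usage /CSP/ans004 1TB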
>>>>
>>>> Before doing this, we suggest you try running the script
>>>> mentioned above with the backend brick path instead of the mount path.
>>>> You need to run it on the machines where the backend bricks are
>>>> located, not on the mount.
>>>> On Mon, Jul 9, 2018 at 9:01 PM Mauro Tridici <mauro.tridici at cmcc.it> wrote:
>>>>
>>>>
>>>> Hi Sanoj,
>>>>
>>>> could you provide me with the command that I need in order to back up all quota limits?
>>>> If there is no solution for this kind of problem, I would like to try to follow your “backup” suggestion.
>>>>
>>>> Do you think that I should contact gluster developers too?
>>>>
>>>> Thank you very much.
>>>> Regards,
>>>> Mauro
>>>>
>>>>
>>>> On 05 Jul 2018, at 09:56, Mauro Tridici <mauro.tridici at cmcc.it> wrote:
>>>>
>>>> Hi Sanoj,
>>>>
>>>> unfortunately the output of the command execution was not helpful.
>>>>
>>>> [root at s01 ~]# find /tier2/CSP/ans004 | xargs getfattr -d -m. -e hex
>>>> [root at s01 ~]#
>>>>
>>>> Do you have some other idea in order to detect the cause of the issue?
>>>>
>>>> Thank you again,
>>>> Mauro
>>>>
>>>>
>>>> On 05 Jul 2018, at 09:08, Sanoj Unnikrishnan <sunnikri at redhat.com> wrote:
>>>>
>>>> Hi Mauro,
>>>>
>>>> Due to a script issue, not all the necessary xattrs were captured.
>>>> Could you provide the xattrs with:
>>>> find /tier2/CSP/ans004 | xargs getfattr -d -m. -e hex
>>>>
>>>> Meanwhile, if you are being impacted, you could do the following:
>>>> - back up the quota limits
>>>> - disable quota
>>>> - enable quota
>>>> - freshly set the limits
>>>>
>>>> Please capture the xattr values first, so that we can get to know what went wrong.
>>>> Regards,
>>>> Sanoj
>>>>
>>>>
>>>> On Tue, Jul 3, 2018 at 4:09 PM, Mauro Tridici <mauro.tridici at cmcc.it> wrote:
>>>>
>>>>
>>>> Dear Sanoj,
>>>>
>>>> thank you very much for your support.
>>>> I just downloaded and executed the script you suggested.
>>>>
>>>> This is the full command I executed:
>>>>
>>>> ./quota_fsck_new.py --full-logs --sub-dir /tier2/CSP/ans004/ /gluster
>>>>
>>>> Attached you can find the logs generated by the script.
>>>> What can I do now?
>>>>
>>>> Thank you very much for your patience.
>>>> Mauro
>>>>
>>>>
>>>>
>>>>
>>>> On 03 Jul 2018, at 11:34, Sanoj Unnikrishnan <sunnikri at redhat.com> wrote:
>>>>
>>>> Hi Mauro,
>>>>
>>>> This may be an issue with the update of the backend xattrs.
>>>> To root-cause it further and provide a resolution, could you provide me with the logs generated by running the following fsck script:
>>>> https://review.gluster.org/#/c/19179/6/extras/quota/quota_fsck.py
>>>>
>>>> Try running the script and reply with the logs generated.
>>>>
>>>> Thanks,
>>>> Sanoj
>>>>
>>>>
>>>> On Mon, Jul 2, 2018 at 2:21 PM, Mauro Tridici <mauro.tridici at cmcc.it> wrote:
>>>>
>>>>
>>>> Dear Users,
>>>>
>>>> I just noticed that, after some data deletions inside the "/tier2/CSP/ans004" folder, the amount of used disk space reported by the quota command doesn't match the value reported by the du command.
>>>> Searching the web, it seems this is a bug in previous versions of GlusterFS that was already fixed.
>>>> In my case, unfortunately, the problem still seems to be there.
>>>>
>>>> How can I solve this issue? Is it possible to do it without any downtime?
>>>>
>>>> Thank you very much in advance,
>>>> Mauro
>>>>
>>>> [root at s01 ~]# glusterfs -V
>>>> glusterfs 3.10.5
>>>> Repository revision: git://git.gluster.org/glusterfs.git
>>>> Copyright (c) 2006-2016 Red Hat, Inc. <https://www.gluster.org/>
>>>> GlusterFS comes with ABSOLUTELY NO WARRANTY.
>>>> It is licensed to you under your choice of the GNU Lesser
>>>> General Public License, version 3 or any later version (LGPLv3
>>>> or later), or the GNU General Public License, version 2 (GPLv2),
>>>> in all cases as published by the Free Software Foundation.
>>>>
>>>> [root at s01 ~]# gluster volume quota tier2 list /CSP/ans004
>>>> Path Hard-limit Soft-limit Used Available Soft-limit exceeded? Hard-limit exceeded?
>>>> -------------------------------------------------------------------------------------------------------------------------------
>>>> /CSP/ans004 1.0TB 99%(1013.8GB) 3.9TB 0Bytes Yes Yes
>>>>
>>>> [root at s01 ~]# du -hs /tier2/CSP/ans004/
>>>> 295G /tier2/CSP/ans004/
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Regards,
>>>> Hari Gowtham.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Regards,
>>>> Hari Gowtham.
>>>>
>>>>
>>>
>>>
>>> --
>>> Regards,
>>> Hari Gowtham.
>>
>>
>>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: logs.tar.gz
Type: application/x-gzip
Size: 1240645 bytes
Desc: not available
URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20180908/c996fbfa/attachment-0001.gz>