[Gluster-devel] [Gluster-users] Fwd: dht_is_subvol_filled messages on client

Thu May 5 11:07:49 UTC 2016

On 05/05/16 11:31, Kaushal M wrote:
> On Thu, May 5, 2016 at 2:36 PM, David Gossage
> <dgossage at carouselchecks.com> wrote:
>>
>>
>>
>> On Thu, May 5, 2016 at 3:28 AM, Serkan Çoban <cobanserkan at gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> You can find the output at the link below:
>>> https://www.dropbox.com/s/wzrh5yp494ogksc/status_detail.txt?dl=0
>>>
>>> Thanks,
>>> Serkan
>>
>>
>> Maybe not an issue, but playing "one of these things is not like the other", I
>> notice that, at a quick glance, only one of all the bricks seems to be different:
>>
>> Brick                : Brick 1.1.1.235:/bricks/20
>> TCP Port             : 49170
>> RDMA Port            : 0
>> Online               : Y
>> Pid                  : 26736
>> File System          : ext4
>> Device               : /dev/mapper/vol0-vol_root
>> Mount Options        : rw,relatime,data=ordered
>> Inode Size           : 256
>> Disk Space Free      : 86.1GB
>> Total Disk Space     : 96.0GB
>> Inode Count          : 6406144
>> Free Inodes          : 6381374
>>
>> Every brick but this one seems to be 7TB and xfs.
>
> Looks like the brick fs isn't mounted, and the root-fs is being used
> instead. But that still leaves enough inodes free.
>
> What I suspect is that one of the cluster translators is mixing up
> stats when aggregating from multiple bricks.
> From the log snippet you gave in the first mail, it seems like the
> disperse translator is possibly involved.

Currently ec takes the number of potential files in the subvolume
(f_files) as the maximum across all its subvolumes, but it takes the
available count (f_ffree) as the minimum across all its subvolumes.

This causes max to be ~781,000,000, but free will be ~6,300,000. This
gives ~0.8% available, i.e. almost 100% full.
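
As a rough illustration of the arithmetic above, here is a minimal Python
sketch. It is not the actual ec code (which is C); the StatFs class and the
aggregate() helper are hypothetical names. The ~781,000,000 total is the
approximate figure quoted above, and the free count of the healthy bricks is
assumed to be close to their total. The root-fs numbers are taken from the
status output quoted below.

```python
from dataclasses import dataclass


@dataclass
class StatFs:
    f_files: int  # total inodes reported by the brick filesystem
    f_ffree: int  # free inodes reported by the brick filesystem


def aggregate(bricks):
    """Combine brick stats the way described above: max of totals, min of free."""
    return StatFs(
        f_files=max(b.f_files for b in bricks),
        f_ffree=min(b.f_ffree for b in bricks),
    )


def free_inode_percent(stat):
    """Percentage of inodes that appear free after aggregation."""
    return 100.0 * stat.f_ffree / stat.f_files


# Assumption: 19 healthy xfs bricks with ~781M inodes, nearly all free.
xfs_brick = StatFs(f_files=781_000_000, f_ffree=781_000_000)
# The one brick that landed on the root filesystem (from the status output).
root_fs_brick = StatFs(f_files=6_406_144, f_ffree=6_381_374)

bricks = [xfs_brick] * 19 + [root_fs_brick]
combined = aggregate(bricks)
free_pct = free_inode_percent(combined)

# Total comes from the big bricks, free comes from the small one,
# so the subvolume looks almost full (about 0.8% free).
print(f"f_files={combined.f_files} f_ffree={combined.f_ffree}")
print(f"{free_pct:.2f}% free, {100.0 - free_pct:.2f}% used")
```

With a single mismatched brick, the total is dominated by the large bricks
while the free count is dominated by the small one, which is why the
subvolume appears nearly full even though every brick is almost empty.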

Given the circumstances I think it's the correct thing to do.

Xavi

>
> BTW, how large is the volume you have? Those are a lot of bricks!
>
> ~kaushal
>
>
>>
>>
>>
>>>
>>>
>>> On Thu, May 5, 2016 at 9:33 AM, Xavier Hernandez <xhernandez at datalab.es>
>>> wrote:
>>>> Can you post the result of 'gluster volume status v0 detail' ?
>>>>
>>>>
>>>> On 05/05/16 06:49, Serkan Çoban wrote:
>>>>>
>>>>> Hi, can anyone suggest something for this issue? df and du show no issue
>>>>> for the bricks, yet one subvolume is not being used by gluster.
>>>>>
>>>>> On Wed, May 4, 2016 at 4:40 PM, Serkan Çoban <cobanserkan at gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I changed cluster.min-free-inodes to "0" and remounted the volume on
>>>>>> the clients. The inode-full messages no longer appear in syslog, but I
>>>>>> see the disperse-56 subvolume is still not being used.
>>>>>> Is there anything I can do to resolve this issue? Maybe I can destroy and
>>>>>> recreate the volume, but I am not sure it will fix this issue...
>>>>>> Maybe the disperse size 16+4 is too big; should I change it to 8+2?
>>>>>>
>>>>>> On Tue, May 3, 2016 at 2:36 PM, Serkan Çoban <cobanserkan at gmail.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> I also checked the df output; all 20 bricks look the same, like below:
>>>>>>> /dev/sdu1 7.3T 34M 7.3T 1% /bricks/20
>>>>>>>
>>>>>>> On Tue, May 3, 2016 at 1:40 PM, Raghavendra G
>>>>>>> <raghavendra at gluster.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, May 2, 2016 at 11:41 AM, Serkan Çoban
>>>>>>>> <cobanserkan at gmail.com>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> 1. What is the output of du -hs <back-end-export>? Please get this
>>>>>>>>>> information for each of the bricks that are part of disperse.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Sorry, I needed the df output of the filesystem containing the brick, not
>>>>>>>> du. Sorry about that.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> There are 20 bricks in disperse-56 and the du -hs output is like:
>>>>>>>>> 80K /bricks/20
>>>>>>>>> 80K /bricks/20
>>>>>>>>> 80K /bricks/20
>>>>>>>>> 80K /bricks/20
>>>>>>>>> 80K /bricks/20
>>>>>>>>> 80K /bricks/20
>>>>>>>>> 80K /bricks/20
>>>>>>>>> 80K /bricks/20
>>>>>>>>> 1.8M /bricks/20
>>>>>>>>> 80K /bricks/20
>>>>>>>>> 80K /bricks/20
>>>>>>>>> 80K /bricks/20
>>>>>>>>> 80K /bricks/20
>>>>>>>>> 80K /bricks/20
>>>>>>>>> 80K /bricks/20
>>>>>>>>> 80K /bricks/20
>>>>>>>>> 80K /bricks/20
>>>>>>>>> 80K /bricks/20
>>>>>>>>> 80K /bricks/20
>>>>>>>>> 80K /bricks/20
>>>>>>>>>
>>>>>>>>> I see that gluster is not writing to this disperse set. All other
>>>>>>>>> disperse sets are filled with 13GB but this one is empty. I see the
>>>>>>>>> directory structure created but no files in the directories.
>>>>>>>>> How can I fix the issue? I will try to rebalance but I don't think
>>>>>>>>> it will write to this disperse set...
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sat, Apr 30, 2016 at 9:22 AM, Raghavendra G
>>>>>>>>> <raghavendra at gluster.com>
>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, Apr 29, 2016 at 12:32 AM, Serkan Çoban
>>>>>>>>>> <cobanserkan at gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Hi, I could not get an answer from the users list, so I am asking the
>>>>>>>>>>> devel list.
>>>>>>>>>>>
>>>>>>>>>>> I am getting the following message in the client logs:
>>>>>>>>>>>
>>>>>>>>>>> [dht-diskusage.c:277:dht_is_subvol_filled] 0-v0-dht:
>>>>>>>>>>> inodes on subvolume 'v0-disperse-56' are at (100.00 %), consider
>>>>>>>>>>> adding more bricks.
>>>>>>>>>>>
>>>>>>>>>>> My cluster is empty; there are only a couple of GB of files for
>>>>>>>>>>> testing. Why does this message appear in syslog?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> dht uses disk usage information from backend export.
>>>>>>>>>>
>>>>>>>>>> 1. What is the output of du -hs <back-end-export>? Please get this
>>>>>>>>>> information for each of the bricks that are part of disperse.
>>>>>>>>>> 2. Once you get du information from each brick, the value seen by dht
>>>>>>>>>> will be based on how cluster/disperse aggregates du info (basically
>>>>>>>>>> the statfs fop).
>>>>>>>>>>
>>>>>>>>>> The reason for 100% disk usage may be:
>>>>>>>>>> In case of 1, the backend fs might be shared by data other than the brick.
>>>>>>>>>> In case of 2, some issue with the aggregation.
>>>>>>>>>>
>>>>>>>>>>> Is it safe to ignore it?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> dht will try not to place data files on the subvol in question
>>>>>>>>>> (v0-disperse-56). Hence lookup cost will be two hops for files hashing
>>>>>>>>>> to disperse-56 (note that other fops like read/write/open still have
>>>>>>>>>> the cost of a single hop and don't suffer from this penalty). Other than
>>>>>>>>>> that there is no significant harm unless disperse-56 is really running
>>>>>>>>>> out of space.
>>>>>>>>>>
>>>>>>>>>> regards,
>>>>>>>>>> Raghavendra
>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> Gluster-devel mailing list
>>>>>>>>>>> Gluster-devel at gluster.org
>>>>>>>>>>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Raghavendra G
>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Raghavendra G
>>>>>
>>>>> _______________________________________________
>>>>> Gluster-users mailing list
>>>>> Gluster-users at gluster.org
>>>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>>>
>>>>