[Gluster-users] df reports wrong full capacity for distributed volumes (Glusterfs 3.12.6-1)

Nithya Balachandran nbalacha at redhat.com
Thu Mar 1 05:32:26 UTC 2018


Hi Jose,

On 28 February 2018 at 22:31, Jose V. Carrión <jocarbur at gmail.com> wrote:

> Hi Nithya,
>
> My initial setup was composed of 2 similar nodes: stor1data and stor2data.
> A month ago I expanded both volumes with a new node: stor3data (2 bricks
> per volume).
> After adding the new peer with its bricks, I ran the 'rebalance force'
> operation. This task finished successfully (you can see the info below)
> and the number of files on the 3 nodes was very similar.
>
> For volumedisk1 I only have files of 500MB and they are continuously
> written in sequential mode. The filename pattern of the written files is:
>
> run.node1.0000.rd
> run.node2.0000.rd
> run.node1.0001.rd
> run.node2.0001.rd
> run.node1.0002.rd
> run.node2.0002.rd
> ...........
> ...........
> run.node1.X.rd
> run.node2.X.rd
>
> (  X ranging from 0000 to infinite )
>
> Curiously, stor1data and stor2data maintain very similar usage in bytes:
>
> Filesystem            1K-blocks    Used         Available    Use%  Mounted on
> /dev/sdc1             52737613824  17079174264  35658439560  33%   /mnt/glusterfs/vol1 -> stor1data
> /dev/sdc1             52737613824  17118810848  35618802976  33%   /mnt/glusterfs/vol1 -> stor2data
>
> However, the usage on stor3data differs considerably (by about 1TB):
> Filesystem            1K-blocks    Used         Available    Use%  Mounted on
> /dev/sdc1             52737613824  15479191748  37258422076  30%   /mnt/disk_c/glusterfs/vol1 -> stor3data
> /dev/sdd1             52737613824  15566398604  37171215220  30%   /mnt/disk_d/glusterfs/vol1 -> stor3data
>
> Looking at inodes:
>
> Filesystem            Inodes      IUsed   IFree       IUse%  Mounted on
> /dev/sdc1             5273970048  851053  5273118995  1%     /mnt/glusterfs/vol1 -> stor1data
> /dev/sdc1             5273970048  849388  5273120660  1%     /mnt/glusterfs/vol1 -> stor2data
>
> /dev/sdc1             5273970048  846877  5273123171  1%     /mnt/disk_c/glusterfs/vol1 -> stor3data
> /dev/sdd1             5273970048  845250  5273124798  1%     /mnt/disk_d/glusterfs/vol1 -> stor3data
>
> 851053 (stor1) - 845250 (stor3) = a difference of 5803 files!
>

The inode numbers are a little misleading here - gluster uses some to
create its own internal files and directory structures. Based on the
average file size, I think this would actually work out to a difference of
around 2000 files.


>
> In addition, correct me if I'm wrong, but stor3data should have a 50%
> probability of storing a new file (even taking into account the DHT
> algorithm and the filename patterns).
>
Theoretically yes, but again, it depends on the filenames and their hash
distribution.
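
As a rough illustration (a few lines of Python using a generic md5 hash, not
gluster's actual DHT hash or per-directory layout ranges), a strictly
patterned set of names like yours still lands close to, but not exactly at,
25% per brick:

import hashlib
from collections import Counter

# Hypothetical brick labels, only for counting; the filenames follow the
# pattern described above (run.nodeN.XXXX.rd).
bricks = ["stor1", "stor2", "stor3_c", "stor3_d"]
counts = Counter()

for x in range(5000):
    for node in ("node1", "node2"):
        name = "run.%s.%04d.rd" % (node, x)
        # Map the filename hash onto one of the four bricks.
        h = int.from_bytes(hashlib.md5(name.encode()).digest()[:4], "big")
        counts[bricks[h % len(bricks)]] += 1

print(counts)   # close to 2500 files per brick, but usually not exactly equal

The real layout also depends on the hash ranges assigned to each brick per
directory, so small per-brick differences like the ones you see are expected.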

Please send us the output of:
gluster volume rebalance <volname> status

for the volume.

Regards,
Nithya


> Thanks,
> Greetings.
>
> Jose V.
>
> Status of volume: volumedisk0
> Gluster process                                     TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick stor1data:/mnt/glusterfs/vol0/brick1          49152     0          Y       13533
> Brick stor2data:/mnt/glusterfs/vol0/brick1          49152     0          Y       13302
> Brick stor3data:/mnt/disk_b1/glusterfs/vol0/brick1  49152     0          Y       17371
> Brick stor3data:/mnt/disk_b2/glusterfs/vol0/brick1  49153     0          Y       17391
> NFS Server on localhost                             N/A       N/A        N       N/A
> NFS Server on stor3data                             N/A       N/A        N       N/A
> NFS Server on stor2data                             N/A       N/A        N       N/A
>
> Task Status of Volume volumedisk0
> ------------------------------------------------------------------------------
> Task                 : Rebalance
> ID                   : 7f5328cb-ed25-4627-9196-fb3e29e0e4ca
> Status               : completed
>
> Status of volume: volumedisk1
> Gluster process                                     TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick stor1data:/mnt/glusterfs/vol1/brick1          49153     0          Y       13579
> Brick stor2data:/mnt/glusterfs/vol1/brick1          49153     0          Y       13344
> Brick stor3data:/mnt/disk_c/glusterfs/vol1/brick1   49154     0          Y       17439
> Brick stor3data:/mnt/disk_d/glusterfs/vol1/brick1   49155     0          Y       17459
> NFS Server on localhost                             N/A       N/A        N       N/A
> NFS Server on stor3data                             N/A       N/A        N       N/A
> NFS Server on stor2data                             N/A       N/A        N       N/A
>
> Task Status of Volume volumedisk1
> ------------------------------------------------------------------------------
> Task                 : Rebalance
> ID                   : d0048704-beeb-4a6a-ae94-7e7916423fd3
> Status               : completed
>
>
> 2018-02-28 15:40 GMT+01:00 Nithya Balachandran <nbalacha at redhat.com>:
>
>> Hi Jose,
>>
>> On 28 February 2018 at 18:28, Jose V. Carrión <jocarbur at gmail.com> wrote:
>>
>>> Hi Nithya,
>>>
>>> I applied the workaround for this bug and now df shows the right size:
>>
>> That is good to hear.
>>
>>
>>
>>> [root at stor1 ~]# df -h
>>> Filesystem            Size  Used Avail Use% Mounted on
>>> /dev/sdb1              26T  1,1T   25T   4% /mnt/glusterfs/vol0
>>> /dev/sdc1              50T   16T   34T  33% /mnt/glusterfs/vol1
>>> stor1data:/volumedisk0
>>>                       101T  3,3T   97T   4% /volumedisk0
>>> stor1data:/volumedisk1
>>>                       197T   61T  136T  31% /volumedisk1
>>>
>>>
>>> [root at stor2 ~]# df -h
>>> Filesystem            Size  Used Avail Use% Mounted on
>>> /dev/sdb1              26T  1,1T   25T   4% /mnt/glusterfs/vol0
>>> /dev/sdc1              50T   16T   34T  33% /mnt/glusterfs/vol1
>>> stor2data:/volumedisk0
>>>                       101T  3,3T   97T   4% /volumedisk0
>>> stor2data:/volumedisk1
>>>                       197T   61T  136T  31% /volumedisk1
>>>
>>>
>>> [root at stor3 ~]# df -h
>>> Filesystem            Size  Used Avail Use% Mounted on
>>> /dev/sdb1              25T  638G   24T   3% /mnt/disk_b1/glusterfs/vol0
>>> /dev/sdb2              25T  654G   24T   3% /mnt/disk_b2/glusterfs/vol0
>>> /dev/sdc1              50T   15T   35T  30% /mnt/disk_c/glusterfs/vol1
>>> /dev/sdd1              50T   15T   35T  30% /mnt/disk_d/glusterfs/vol1
>>> stor3data:/volumedisk0
>>>                       101T  3,3T   97T   4% /volumedisk0
>>> stor3data:/volumedisk1
>>>                       197T   61T  136T  31% /volumedisk1
>>>
>>>
>>> However I'm concerned because, as you can see, volumedisk0 on
>>> stor3data is composed of 2 bricks on the same disk but on different
>>> partitions (/dev/sdb1 and /dev/sdb2).
>>> After applying the workaround, the shared-brick-count parameter was
>>> set to 1 for all the bricks on all the servers (see below). Could this
>>> be an issue?
>>>
>> No, this is correct. The shared-brick-count will be > 1 only if multiple
>> bricks share the same partition.
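
To make its role concrete, a simplified model (not gluster's actual code) of
how the df size for the volume is put together is: each brick contributes its
filesystem size divided by the number of bricks of this volume that live on
the same filesystem, so a shared partition is not counted twice. With your
layout every divisor is 1 and the four ~49.1TB bricks add up to the full
~196TB:

TB = 1024 ** 4

# (brick, filesystem size, shared-brick-count) -- sizes taken from the
# ~49.1TB partitions backing volumedisk1, counts as expected in this setup.
bricks = [
    ("stor1data:/mnt/glusterfs/vol1/brick1",        49.1 * TB, 1),
    ("stor2data:/mnt/glusterfs/vol1/brick1",        49.1 * TB, 1),
    ("stor3data:/mnt/disk_c/glusterfs/vol1/brick1", 49.1 * TB, 1),
    ("stor3data:/mnt/disk_d/glusterfs/vol1/brick1", 49.1 * TB, 1),
]

total = sum(size / max(count, 1) for _name, size, count in bricks)
print("aggregated volume size: %.1f TB" % (total / TB))   # ~196.4 TB

If two of those bricks really did share one partition, their
shared-brick-count would be 2 and each would contribute half of that
partition, which keeps the sum correct.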
>>
>>
>>
>>> Also, I can see that stor3data is now unbalanced with respect to
>>> stor1data and stor2data. The three nodes have the same brick sizes, but
>>> the stor3data bricks have used about 1TB less than those of stor1data
>>> and stor2data:
>>>
>>
>>
>> This does not necessarily indicate a problem. The distribution need not
>> be exactly equal and depends on the filenames. Can you provide more
>> information on the kind of dataset (how many files, sizes etc) on this
>> volume? Did you create the volume with all 4 bricks or add some later?
>>
>> Regards,
>> Nithya
>>
>>>
>>> stor1data:
>>> /dev/sdb1              26T  1,1T   25T   4% /mnt/glusterfs/vol0
>>> /dev/sdc1              50T   16T   34T  33% /mnt/glusterfs/vol1
>>>
>>> stor2data bricks:
>>> /dev/sdb1              26T  1,1T   25T   4% /mnt/glusterfs/vol0
>>> /dev/sdc1              50T   16T   34T  33% /mnt/glusterfs/vol1
>>>
>>> stor3data bricks:
>>> /dev/sdb1              25T  638G   24T   3% /mnt/disk_b1/glusterfs/vol0
>>> /dev/sdb2              25T  654G   24T   3% /mnt/disk_b2/glusterfs/vol0
>>> /dev/sdc1              50T   15T   35T  30% /mnt/disk_c/glusterfs/vol1
>>> /dev/sdd1              50T   15T   35T  30% /mnt/disk_d/glusterfs/vol1
>>>
>>>
>>> [root at stor1 ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/*
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3:    option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3:    option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3:    option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3:    option shared-brick-count 0
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol:3:    option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol.rpmsave:3:    option shared-brick-count 0
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol:3:    option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol.rpmsave:3:    option shared-brick-count 0
>>>
>>> [root at stor2 ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/*
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3:    option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3:    option shared-brick-count 0
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3:    option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3:    option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol:3:    option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol.rpmsave:3:    option shared-brick-count 0
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol:3:    option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol.rpmsave:3:    option shared-brick-count 0
>>>
>>> [root at stor3t ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/*
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3:    option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3:    option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3:    option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3:    option shared-brick-count 0
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol:3:    option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol.rpmsave:3:    option shared-brick-count 0
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol:3:    option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol.rpmsave:3:    option shared-brick-count 0
>>>
>>> Thanks for your help,
>>> Greetings.
>>>
>>> Jose V.
>>>
>>>
>>> 2018-02-28 5:07 GMT+01:00 Nithya Balachandran <nbalacha at redhat.com>:
>>>
>>>> Hi Jose,
>>>>
>>>> There is a known issue with gluster 3.12.x builds (see [1]) so you may
>>>> be running into this.
>>>>
>>>> The "shared-brick-count" values seem fine on stor1. Please send us "grep
>>>> -n "share" /var/lib/glusterd/vols/volumedisk1/*" results for the other
>>>> nodes so we can check if they are the cause.
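
If it is easier, the same information can be collected with a small Python
sketch (just a convenience script, equivalent to that grep) that walks the
volume's volfiles and prints every shared-brick-count value, so the three
nodes can be compared side by side:

import glob
import re

# Path for this volume; adjust for other volumes.
VOLDIR = "/var/lib/glusterd/vols/volumedisk1"

for volfile in sorted(glob.glob(VOLDIR + "/*.vol")):
    with open(volfile) as f:
        for lineno, line in enumerate(f, start=1):
            m = re.search(r"option\s+shared-brick-count\s+(\d+)", line)
            if m:
                print("%s:%d: shared-brick-count %s" % (volfile, lineno, m.group(1)))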
>>>>
>>>>
>>>> Regards,
>>>> Nithya
>>>>
>>>>
>>>>
>>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1517260
>>>>
>>>> On 28 February 2018 at 03:03, Jose V. Carrión <jocarbur at gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> A few days ago all my glusterfs configuration was working fine. Today I
>>>>> realized that the total size reported by the df command has changed and
>>>>> is smaller than the aggregated capacity of all the bricks in the volume.
>>>>>
>>>>> I checked that the status of all volumes is fine, all the glusterd
>>>>> daemons are running and there are no errors in the logs; however, df
>>>>> shows a wrong total size.
>>>>>
>>>>> My configuration for one volume: volumedisk1
>>>>> [root at stor1 ~]# gluster volume status volumedisk1  detail
>>>>>
>>>>> Status of volume: volumedisk1
>>>>> ------------------------------------------------------------
>>>>> ------------------
>>>>> Brick                : Brick stor1data:/mnt/glusterfs/vol1/brick1
>>>>> TCP Port             : 49153
>>>>> RDMA Port            : 0
>>>>> Online               : Y
>>>>> Pid                  : 13579
>>>>> File System          : xfs
>>>>> Device               : /dev/sdc1
>>>>> Mount Options        : rw,noatime
>>>>> Inode Size           : 512
>>>>> Disk Space Free      : 35.0TB
>>>>> Total Disk Space     : 49.1TB
>>>>> Inode Count          : 5273970048
>>>>> Free Inodes          : 5273123069
>>>>> ------------------------------------------------------------
>>>>> ------------------
>>>>> Brick                : Brick stor2data:/mnt/glusterfs/vol1/brick1
>>>>> TCP Port             : 49153
>>>>> RDMA Port            : 0
>>>>> Online               : Y
>>>>> Pid                  : 13344
>>>>> File System          : xfs
>>>>> Device               : /dev/sdc1
>>>>> Mount Options        : rw,noatime
>>>>> Inode Size           : 512
>>>>> Disk Space Free      : 35.0TB
>>>>> Total Disk Space     : 49.1TB
>>>>> Inode Count          : 5273970048
>>>>> Free Inodes          : 5273124718
>>>>> ------------------------------------------------------------
>>>>> ------------------
>>>>> Brick                : Brick stor3data:/mnt/disk_c/glusterfs/vol1/brick1
>>>>> TCP Port             : 49154
>>>>> RDMA Port            : 0
>>>>> Online               : Y
>>>>> Pid                  : 17439
>>>>> File System          : xfs
>>>>> Device               : /dev/sdc1
>>>>> Mount Options        : rw,noatime
>>>>> Inode Size           : 512
>>>>> Disk Space Free      : 35.7TB
>>>>> Total Disk Space     : 49.1TB
>>>>> Inode Count          : 5273970048
>>>>> Free Inodes          : 5273125437
>>>>> ------------------------------------------------------------
>>>>> ------------------
>>>>> Brick                : Brick stor3data:/mnt/disk_d/glusterfs/vol1/brick1
>>>>> TCP Port             : 49155
>>>>> RDMA Port            : 0
>>>>> Online               : Y
>>>>> Pid                  : 17459
>>>>> File System          : xfs
>>>>> Device               : /dev/sdd1
>>>>> Mount Options        : rw,noatime
>>>>> Inode Size           : 512
>>>>> Disk Space Free      : 35.6TB
>>>>> Total Disk Space     : 49.1TB
>>>>> Inode Count          : 5273970048
>>>>> Free Inodes          : 5273127036
>>>>>
>>>>>
>>>>> Then the full size for volumedisk1 should be: 49.1TB + 49.1TB + 49.1TB +
>>>>> 49.1TB = 196.4 TB, but df shows:
>>>>>
>>>>> [root at stor1 ~]# df -h
>>>>> Filesystem            Size  Used Avail Use% Mounted on
>>>>> /dev/sda2              48G   21G   25G  46% /
>>>>> tmpfs                  32G   80K   32G   1% /dev/shm
>>>>> /dev/sda1             190M   62M  119M  35% /boot
>>>>> /dev/sda4             395G  251G  124G  68% /data
>>>>> /dev/sdb1              26T  601G   25T   3% /mnt/glusterfs/vol0
>>>>> /dev/sdc1              50T   15T   36T  29% /mnt/glusterfs/vol1
>>>>> stor1data:/volumedisk0
>>>>>                        76T  1,6T   74T   3% /volumedisk0
>>>>> stor1data:/volumedisk1
>>>>>                       148T   42T  106T  29% /volumedisk1
>>>>>
>>>>> That is exactly one brick less: 196.4 TB - 49.1 TB ≈ 148 TB
>>>>>
>>>>> It's a production system so I hope you can help me.
>>>>>
>>>>> Thanks in advance.
>>>>>
>>>>> Jose V.
>>>>>
>>>>>
>>>>> Below some other data of my configuration:
>>>>>
>>>>> [root at stor1 ~]# gluster volume info
>>>>>
>>>>> Volume Name: volumedisk0
>>>>> Type: Distribute
>>>>> Volume ID: 0ee52d94-1131-4061-bcef-bd8cf898da10
>>>>> Status: Started
>>>>> Snapshot Count: 0
>>>>> Number of Bricks: 4
>>>>> Transport-type: tcp
>>>>> Bricks:
>>>>> Brick1: stor1data:/mnt/glusterfs/vol0/brick1
>>>>> Brick2: stor2data:/mnt/glusterfs/vol0/brick1
>>>>> Brick3: stor3data:/mnt/disk_b1/glusterfs/vol0/brick1
>>>>> Brick4: stor3data:/mnt/disk_b2/glusterfs/vol0/brick1
>>>>> Options Reconfigured:
>>>>> performance.cache-size: 4GB
>>>>> cluster.min-free-disk: 1%
>>>>> performance.io-thread-count: 16
>>>>> performance.readdir-ahead: on
>>>>>
>>>>> Volume Name: volumedisk1
>>>>> Type: Distribute
>>>>> Volume ID: 591b7098-800e-4954-82a9-6b6d81c9e0a2
>>>>> Status: Started
>>>>> Snapshot Count: 0
>>>>> Number of Bricks: 4
>>>>> Transport-type: tcp
>>>>> Bricks:
>>>>> Brick1: stor1data:/mnt/glusterfs/vol1/brick1
>>>>> Brick2: stor2data:/mnt/glusterfs/vol1/brick1
>>>>> Brick3: stor3data:/mnt/disk_c/glusterfs/vol1/brick1
>>>>> Brick4: stor3data:/mnt/disk_d/glusterfs/vol1/brick1
>>>>> Options Reconfigured:
>>>>> cluster.min-free-inodes: 6%
>>>>> performance.cache-size: 4GB
>>>>> cluster.min-free-disk: 1%
>>>>> performance.io-thread-count: 16
>>>>> performance.readdir-ahead: on
>>>>>
>>>>> [root at stor1 ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/*
>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3:    option shared-brick-count 1
>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3:    option shared-brick-count 1
>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3:    option shared-brick-count 0
>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3:    option shared-brick-count 0
>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol:3:    option shared-brick-count 0
>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol.rpmsave:3:    option shared-brick-count 0
>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol:3:    option shared-brick-count 0
>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol.rpmsave:3:    option shared-brick-count 0
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Gluster-users mailing list
>>>>> Gluster-users at gluster.org
>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>>>
>>>>
>>>>
>>>
>>
>