[Gluster-users] df reports wrong full capacity for distributed volumes (Glusterfs 3.12.6-1)
Nithya Balachandran
nbalacha at redhat.com
Thu Mar 1 05:32:26 UTC 2018
Hi Jose,
On 28 February 2018 at 22:31, Jose V. Carrión <jocarbur at gmail.com> wrote:
> Hi Nithya,
>
> My initial setup was composed of 2 similar nodes: stor1data and stor2data.
> A month ago I expanded both volumes with a new node: stor3data (2 bricks
> per volume).
> Of course, after adding the new peer with its bricks I ran the 'balance
> force' operation. This task finished successfully (you can see info below)
> and the number of files on the 3 nodes was very similar.
>
> For volumedisk1 I only have files of 500MB and they are continuously
> written in sequential mode. The filename pattern of written files is:
>
> run.node1.0000.rd
> run.node2.0000.rd
> run.node1.0001.rd
> run.node2.0001.rd
> run.node1.0002.rd
> run.node2.0002.rd
> ...........
> ...........
> run.node1.X.rd
> run.node2.X.rd
>
> ( X ranging from 0000 to infinite )
>
> Curiously stor1data and stor2data maintain similar ratios in bytes:
>
> Filesystem   1K-blocks        Used   Available Use% Mounted on
> /dev/sdc1  52737613824 17079174264 35658439560  33% /mnt/glusterfs/vol1 -> stor1data
> /dev/sdc1  52737613824 17118810848 35618802976  33% /mnt/glusterfs/vol1 -> stor2data
>
> However the ratio on stor3data differs too much (1TB):
> Filesystem   1K-blocks        Used   Available Use% Mounted on
> /dev/sdc1  52737613824 15479191748 37258422076  30% /mnt/disk_c/glusterfs/vol1 -> stor3data
> /dev/sdd1  52737613824 15566398604 37171215220  30% /mnt/disk_d/glusterfs/vol1 -> stor3data
>
> Thinking in inodes:
>
> Filesystem     Inodes  IUsed      IFree IUse% Mounted on
> /dev/sdc1  5273970048 851053 5273118995    1% /mnt/glusterfs/vol1 -> stor1data
> /dev/sdc1  5273970048 849388 5273120660    1% /mnt/glusterfs/vol1 -> stor2data
>
> /dev/sdc1  5273970048 846877 5273123171    1% /mnt/disk_c/glusterfs/vol1 -> stor3data
> /dev/sdd1  5273970048 845250 5273124798    1% /mnt/disk_d/glusterfs/vol1 -> stor3data
>
> 851053 (stor1) - 845250 (stor3) = a difference of 5803 files!
>
The inode numbers are a little misleading here - gluster uses some to
create its own internal files and directory structures. Based on the
average file size, I think this would actually work out to a difference of
around 2000 files.
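
To put the two measurements side by side, here is a minimal sketch that computes the largest gap between bricks in used inodes and in used space. The numbers are copied from the df output quoted above; the `spread` helper is just an illustrative name. It makes no attempt to separate gluster-internal inodes from user files, so it only shows the raw gaps.

```python
# Figures copied from the `df` and `df -i` output quoted in this thread.
KIB_PER_TIB = 1024 ** 3

used_kib = {
    "stor1data": 17079174264,
    "stor2data": 17118810848,
    "stor3data/disk_c": 15479191748,
    "stor3data/disk_d": 15566398604,
}
inodes_used = {
    "stor1data": 851053,
    "stor2data": 849388,
    "stor3data/disk_c": 846877,
    "stor3data/disk_d": 845250,
}


def spread(values):
    """Difference between the largest and the smallest value."""
    values = list(values)
    return max(values) - min(values)


inode_gap = spread(inodes_used.values())
used_gap_kib = spread(used_kib.values())

print(f"largest inode gap: {inode_gap}")
print(f"largest used-space gap: {used_gap_kib / KIB_PER_TIB:.2f} TiB")
```

The inode gap includes whatever internal entries each brick holds, which is why it should not be read directly as a file count.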
>
> In addition, correct me if I'm wrong, stor3data should have a 50%
> probability of storing a new file (even taking into account the DHT
> algorithm and the filename patterns)
>
Theoretically yes, but again, it depends on the filenames and their hash
distribution.
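
As a rough illustration of why name-based placement need not come out exactly even, the sketch below hashes the filename pattern from this thread into four equal ranges. This is not gluster's implementation: `zlib.crc32` is only a stand-in for the hash DHT actually uses, the equal-width ranges are an assumption (real layouts are assigned per directory), and `pick_brick` is a hypothetical helper.

```python
import zlib
from collections import Counter

# Brick labels taken from the thread; the placement logic below is a stand-in.
BRICKS = ["stor1data", "stor2data", "stor3data/disk_c", "stor3data/disk_d"]


def pick_brick(filename, bricks=BRICKS):
    """Map a filename to a brick by splitting a 32-bit hash space evenly.

    Assumption: CRC32 replaces gluster's own hash function, and every
    brick gets an equal slice of the hash range.
    """
    h = zlib.crc32(filename.encode("utf-8"))  # value in 0 .. 2**32 - 1
    width = 2 ** 32 // len(bricks)
    return bricks[min(h // width, len(bricks) - 1)]


# Same naming pattern as the files written to volumedisk1.
names = [f"run.node{n}.{i:04d}.rd" for i in range(1000) for n in (1, 2)]
counts = Counter(pick_brick(name) for name in names)

for brick in BRICKS:
    print(brick, counts[brick])
```

With names this similar, the per-brick counts depend entirely on how the hash spreads them, so two bricks on one node are not guaranteed exactly half of the new files.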
Please send us the output of:
gluster volume rebalance <volname> status
for the volume.
Regards,
Nithya
> Thanks,
> Greetings.
>
> Jose V.
>
> Status of volume: volumedisk0
> Gluster process                                    TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick stor1data:/mnt/glusterfs/vol0/brick1         49152     0          Y       13533
> Brick stor2data:/mnt/glusterfs/vol0/brick1         49152     0          Y       13302
> Brick stor3data:/mnt/disk_b1/glusterfs/vol0/brick1 49152     0          Y       17371
> Brick stor3data:/mnt/disk_b2/glusterfs/vol0/brick1 49153     0          Y       17391
> NFS Server on localhost                            N/A       N/A        N       N/A
> NFS Server on stor3data                            N/A       N/A        N       N/A
> NFS Server on stor2data                            N/A       N/A        N       N/A
>
> Task Status of Volume volumedisk0
> ------------------------------------------------------------------------------
> Task : Rebalance
> ID : 7f5328cb-ed25-4627-9196-fb3e29e0e4ca
> Status : completed
>
> Status of volume: volumedisk1
> Gluster process                                    TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick stor1data:/mnt/glusterfs/vol1/brick1         49153     0          Y       13579
> Brick stor2data:/mnt/glusterfs/vol1/brick1         49153     0          Y       13344
> Brick stor3data:/mnt/disk_c/glusterfs/vol1/brick1  49154     0          Y       17439
> Brick stor3data:/mnt/disk_d/glusterfs/vol1/brick1  49155     0          Y       17459
> NFS Server on localhost                            N/A       N/A        N       N/A
> NFS Server on stor3data                            N/A       N/A        N       N/A
> NFS Server on stor2data                            N/A       N/A        N       N/A
>
> Task Status of Volume volumedisk1
> ------------------------------------------------------------------------------
> Task : Rebalance
> ID : d0048704-beeb-4a6a-ae94-7e7916423fd3
> Status : completed
>
>
> 2018-02-28 15:40 GMT+01:00 Nithya Balachandran <nbalacha at redhat.com>:
>
>> Hi Jose,
>>
>> On 28 February 2018 at 18:28, Jose V. Carrión <jocarbur at gmail.com> wrote:
>>
>>> Hi Nithya,
>>>
>>> I applied the workaround for this bug and now df shows the right size:
>>>
>> That is good to hear.
>>
>>
>>
>>> [root at stor1 ~]# df -h
>>> Filesystem Size Used Avail Use% Mounted on
>>> /dev/sdb1 26T 1,1T 25T 4% /mnt/glusterfs/vol0
>>> /dev/sdc1 50T 16T 34T 33% /mnt/glusterfs/vol1
>>> stor1data:/volumedisk0
>>> 101T 3,3T 97T 4% /volumedisk0
>>> stor1data:/volumedisk1
>>> 197T 61T 136T 31% /volumedisk1
>>>
>>>
>>> [root at stor2 ~]# df -h
>>> Filesystem Size Used Avail Use% Mounted on
>>> /dev/sdb1 26T 1,1T 25T 4% /mnt/glusterfs/vol0
>>> /dev/sdc1 50T 16T 34T 33% /mnt/glusterfs/vol1
>>> stor2data:/volumedisk0
>>> 101T 3,3T 97T 4% /volumedisk0
>>> stor2data:/volumedisk1
>>> 197T 61T 136T 31% /volumedisk1
>>>
>>>
>>> [root at stor3 ~]# df -h
>>> Filesystem Size Used Avail Use% Mounted on
>>> /dev/sdb1 25T 638G 24T 3% /mnt/disk_b1/glusterfs/vol0
>>> /dev/sdb2 25T 654G 24T 3% /mnt/disk_b2/glusterfs/vol0
>>> /dev/sdc1 50T 15T 35T 30% /mnt/disk_c/glusterfs/vol1
>>> /dev/sdd1 50T 15T 35T 30% /mnt/disk_d/glusterfs/vol1
>>> stor3data:/volumedisk0
>>> 101T 3,3T 97T 4% /volumedisk0
>>> stor3data:/volumedisk1
>>> 197T 61T 136T 31% /volumedisk1
>>>
>>>
>>> However I'm concerned because, as you can see, volumedisk0 on
>>> stor3data is composed of 2 bricks on the same disk but on different
>>> partitions (/dev/sdb1 and /dev/sdb2).
>>> After applying the workaround, the shared-brick-count parameter was
>>> set to 1 on all the bricks and all the servers (see below). Could
>>> this be an issue?
>>>
>> No, this is correct. The shared-brick-count will be > 1 only if multiple
>> bricks share the same partition.
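
The rule above can be sketched as a count of bricks per (host, partition) pair. This is only an illustration of the rule as stated in this thread, not glusterd's code; `expected_shared_brick_count` is a hypothetical helper, and the device names are the ones shown in the df output for volumedisk0.

```python
from collections import Counter

# (host, brick path, backing partition) as shown by df in this thread.
bricks = [
    ("stor1data", "/mnt/glusterfs/vol0/brick1", "/dev/sdb1"),
    ("stor2data", "/mnt/glusterfs/vol0/brick1", "/dev/sdb1"),
    ("stor3data", "/mnt/disk_b1/glusterfs/vol0/brick1", "/dev/sdb1"),
    ("stor3data", "/mnt/disk_b2/glusterfs/vol0/brick1", "/dev/sdb2"),
]


def expected_shared_brick_count(bricks):
    """For each brick, count the bricks on the same host and partition."""
    per_partition = Counter((host, dev) for host, _path, dev in bricks)
    return {(host, path): per_partition[(host, dev)] for host, path, dev in bricks}


counts = expected_shared_brick_count(bricks)
for (host, path), count in counts.items():
    print(f"{host}:{path} -> {count}")
```

Because /dev/sdb1 and /dev/sdb2 are separate partitions, each stor3data brick counts only itself, which matches the value of 1 reported after the workaround.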
>>
>>
>>
>>> Also, I can see that stor3data is now unbalanced with respect to stor1data
>>> and stor2data. The three nodes have bricks of the same size, but the
>>> stor3data bricks have used 1TB less than stor1data and stor2data:
>>>
>>
>>
>> This does not necessarily indicate a problem. The distribution need not
>> be exactly equal and depends on the filenames. Can you provide more
>> information on the kind of dataset (how many files, sizes etc) on this
>> volume? Did you create the volume with all 4 bricks or add some later?
>>
>> Regards,
>> Nithya
>>
>>>
>>> stor1data:
>>> /dev/sdb1 26T 1,1T 25T 4% /mnt/glusterfs/vol0
>>> /dev/sdc1 50T 16T 34T 33% /mnt/glusterfs/vol1
>>>
>>> stor2data bricks:
>>> /dev/sdb1 26T 1,1T 25T 4% /mnt/glusterfs/vol0
>>> /dev/sdc1 50T 16T 34T 33% /mnt/glusterfs/vol1
>>>
>>> stor3data bricks:
>>> /dev/sdb1 25T 638G 24T 3% /mnt/disk_b1/glusterfs/vol0
>>> /dev/sdb2 25T 654G 24T 3% /mnt/disk_b2/glusterfs/vol0
>>> /dev/sdc1 50T 15T 35T 30% /mnt/disk_c/glusterfs/vol1
>>> /dev/sdd1 50T 15T 35T 30% /mnt/disk_d/glusterfs/vol1
>>>
>>>
>>> [root at stor1 ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/*
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>
>>> [root at stor2 ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/*
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>
>>> [root at stor3t ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/*
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>
>>> Thanks for your help,
>>> Greetings.
>>>
>>> Jose V.
>>>
>>>
>>> 2018-02-28 5:07 GMT+01:00 Nithya Balachandran <nbalacha at redhat.com>:
>>>
>>>> Hi Jose,
>>>>
>>>> There is a known issue with gluster 3.12.x builds (see [1]) so you may
>>>> be running into this.
>>>>
>>>> The "shared-brick-count" values seem fine on stor1. Please send us "grep
>>>> -n "share" /var/lib/glusterd/vols/volumedisk1/*" results for the other
>>>> nodes so we can check if they are the cause.
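
When comparing this grep output across several nodes, a small parser can make the values easier to read. The sketch below is one possible way to do it, not a gluster tool: `parse_shared_brick_counts` is a hypothetical helper, it assumes volfile names follow the `<volume>.<host>.<brick-path>.vol` pattern seen in this thread, and it skips the `.rpmsave` copies on the assumption that only the live `.vol` files are of interest. The sample lines are taken from the stor1 output, with the wrapped paths joined.

```python
import re

# Matches grep -n output such as:
#   /path/to/file.vol:3:    option shared-brick-count 1
LINE_RE = re.compile(
    r"^(?P<path>\S+):(?P<lineno>\d+):\s*option shared-brick-count (?P<count>\d+)\s*$"
)


def parse_shared_brick_counts(grep_output):
    """Return {(brick_host, volfile_name): count} for live .vol files."""
    result = {}
    for line in grep_output.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None or match.group("path").endswith(".rpmsave"):
            continue  # not a matching line, or a leftover package backup
        name = match.group("path").rsplit("/", 1)[-1]
        host = name.split(".")[1]  # assumed layout: <volume>.<host>.<brick>.vol
        result[(host, name)] = int(match.group("count"))
    return result


sample = """\
/var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3:    option shared-brick-count 1
/var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3:    option shared-brick-count 1
/var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3:    option shared-brick-count 0
"""

counts = parse_shared_brick_counts(sample)
for (host, name), count in sorted(counts.items()):
    print(host, name, count)
```

Running the same function on the output from each node gives one table per node that can be compared entry by entry.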
>>>>
>>>>
>>>> Regards,
>>>> Nithya
>>>>
>>>>
>>>>
>>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1517260
>>>>
>>>> On 28 February 2018 at 03:03, Jose V. Carrión <jocarbur at gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Some days ago my whole glusterfs configuration was working fine. Today I
>>>>> realized that the total size reported by the df command has changed and
>>>>> is smaller than the aggregated capacity of all the bricks in the volume.
>>>>>
>>>>> I checked that the status of all the volumes is fine, all the glusterd
>>>>> daemons are running, and there are no errors in the logs; however, df
>>>>> shows a wrong total size.
>>>>>
>>>>> My configuration for one volume: volumedisk1
>>>>> [root at stor1 ~]# gluster volume status volumedisk1 detail
>>>>>
>>>>> Status of volume: volumedisk1
>>>>> ------------------------------------------------------------------------------
>>>>> Brick : Brick stor1data:/mnt/glusterfs/vol1/brick1
>>>>> TCP Port : 49153
>>>>> RDMA Port : 0
>>>>> Online : Y
>>>>> Pid : 13579
>>>>> File System : xfs
>>>>> Device : /dev/sdc1
>>>>> Mount Options : rw,noatime
>>>>> Inode Size : 512
>>>>> Disk Space Free : 35.0TB
>>>>> Total Disk Space : 49.1TB
>>>>> Inode Count : 5273970048
>>>>> Free Inodes : 5273123069
>>>>> ------------------------------------------------------------------------------
>>>>> Brick : Brick stor2data:/mnt/glusterfs/vol1/brick1
>>>>> TCP Port : 49153
>>>>> RDMA Port : 0
>>>>> Online : Y
>>>>> Pid : 13344
>>>>> File System : xfs
>>>>> Device : /dev/sdc1
>>>>> Mount Options : rw,noatime
>>>>> Inode Size : 512
>>>>> Disk Space Free : 35.0TB
>>>>> Total Disk Space : 49.1TB
>>>>> Inode Count : 5273970048
>>>>> Free Inodes : 5273124718
>>>>> ------------------------------------------------------------------------------
>>>>> Brick : Brick stor3data:/mnt/disk_c/glusterfs/vol1/brick1
>>>>> TCP Port : 49154
>>>>> RDMA Port : 0
>>>>> Online : Y
>>>>> Pid : 17439
>>>>> File System : xfs
>>>>> Device : /dev/sdc1
>>>>> Mount Options : rw,noatime
>>>>> Inode Size : 512
>>>>> Disk Space Free : 35.7TB
>>>>> Total Disk Space : 49.1TB
>>>>> Inode Count : 5273970048
>>>>> Free Inodes : 5273125437
>>>>> ------------------------------------------------------------------------------
>>>>> Brick : Brick stor3data:/mnt/disk_d/glusterfs/vol1/brick1
>>>>> TCP Port : 49155
>>>>> RDMA Port : 0
>>>>> Online : Y
>>>>> Pid : 17459
>>>>> File System : xfs
>>>>> Device : /dev/sdd1
>>>>> Mount Options : rw,noatime
>>>>> Inode Size : 512
>>>>> Disk Space Free : 35.6TB
>>>>> Total Disk Space : 49.1TB
>>>>> Inode Count : 5273970048
>>>>> Free Inodes : 5273127036
>>>>>
>>>>>
>>>>> So the full size of volumedisk1 should be: 49.1TB + 49.1TB + 49.1TB
>>>>> + 49.1TB = *196,4 TB*, but df shows:
>>>>>
>>>>> [root at stor1 ~]# df -h
>>>>> Filesystem Size Used Avail Use% Mounted on
>>>>> /dev/sda2 48G 21G 25G 46% /
>>>>> tmpfs 32G 80K 32G 1% /dev/shm
>>>>> /dev/sda1 190M 62M 119M 35% /boot
>>>>> /dev/sda4 395G 251G 124G 68% /data
>>>>> /dev/sdb1 26T 601G 25T 3% /mnt/glusterfs/vol0
>>>>> /dev/sdc1 50T 15T 36T 29% /mnt/glusterfs/vol1
>>>>> stor1data:/volumedisk0
>>>>> 76T 1,6T 74T 3% /volumedisk0
>>>>> stor1data:/volumedisk1
>>>>> *148T* 42T 106T 29% /volumedisk1
>>>>>
>>>>> Exactly one brick is missing: 196,4 TB - 49,1 TB = 147,3 TB, which df shows as 148T
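
The same arithmetic can be checked from the raw 1K-block counts instead of the rounded 49.1TB figures. This is a small sketch under the assumption that every volumedisk1 brick has the 52737613824 1K-blocks that df reports elsewhere in this thread; `total_tib` is an illustrative helper name.

```python
KIB_PER_TIB = 1024 ** 3

# 1K-blocks reported by df for each of the four volumedisk1 bricks.
BRICK_KIB = 52737613824


def total_tib(brick_sizes_kib):
    """Sum brick sizes given in KiB and convert the total to TiB."""
    return sum(brick_sizes_kib) / KIB_PER_TIB


all_four = total_tib([BRICK_KIB] * 4)
only_three = total_tib([BRICK_KIB] * 3)

print(f"four bricks:  {all_four:.2f} TiB")
print(f"three bricks: {only_three:.2f} TiB")
```

The three-brick total is the one that lines up with the 148T that df reported before the workaround.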
>>>>>
>>>>> It's a production system so I hope you can help me.
>>>>>
>>>>> Thanks in advance.
>>>>>
>>>>> Jose V.
>>>>>
>>>>>
>>>>> Below is some more data about my configuration:
>>>>>
>>>>> [root at stor1 ~]# gluster volume info
>>>>>
>>>>> Volume Name: volumedisk0
>>>>> Type: Distribute
>>>>> Volume ID: 0ee52d94-1131-4061-bcef-bd8cf898da10
>>>>> Status: Started
>>>>> Snapshot Count: 0
>>>>> Number of Bricks: 4
>>>>> Transport-type: tcp
>>>>> Bricks:
>>>>> Brick1: stor1data:/mnt/glusterfs/vol0/brick1
>>>>> Brick2: stor2data:/mnt/glusterfs/vol0/brick1
>>>>> Brick3: stor3data:/mnt/disk_b1/glusterfs/vol0/brick1
>>>>> Brick4: stor3data:/mnt/disk_b2/glusterfs/vol0/brick1
>>>>> Options Reconfigured:
>>>>> performance.cache-size: 4GB
>>>>> cluster.min-free-disk: 1%
>>>>> performance.io-thread-count: 16
>>>>> performance.readdir-ahead: on
>>>>>
>>>>> Volume Name: volumedisk1
>>>>> Type: Distribute
>>>>> Volume ID: 591b7098-800e-4954-82a9-6b6d81c9e0a2
>>>>> Status: Started
>>>>> Snapshot Count: 0
>>>>> Number of Bricks: 4
>>>>> Transport-type: tcp
>>>>> Bricks:
>>>>> Brick1: stor1data:/mnt/glusterfs/vol1/brick1
>>>>> Brick2: stor2data:/mnt/glusterfs/vol1/brick1
>>>>> Brick3: stor3data:/mnt/disk_c/glusterfs/vol1/brick1
>>>>> Brick4: stor3data:/mnt/disk_d/glusterfs/vol1/brick1
>>>>> Options Reconfigured:
>>>>> cluster.min-free-inodes: 6%
>>>>> performance.cache-size: 4GB
>>>>> cluster.min-free-disk: 1%
>>>>> performance.io-thread-count: 16
>>>>> performance.readdir-ahead: on
>>>>>
>>>>> [root at stor1 ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/*
>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1
>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3: option shared-brick-count 0
>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 0
>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 0
>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Gluster-users mailing list
>>>>> Gluster-users at gluster.org
>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>>>
>>>>
>>>>
>>>
>>
>