[Bugs] [Bug 1637196] New: Disperse volume 'df' usage is extremely incorrect after replace-brick.

bugzilla at redhat.com bugzilla at redhat.com
Mon Oct 8 21:16:47 UTC 2018


https://bugzilla.redhat.com/show_bug.cgi?id=1637196

            Bug ID: 1637196
           Summary: Disperse volume 'df' usage is extremely incorrect
                    after replace-brick.
           Product: GlusterFS
           Version: 3.12
         Component: disperse
          Severity: medium
          Assignee: bugs at gluster.org
          Reporter: jbyers at stonefly.com
                CC: bugs at gluster.org



Disperse volume 'df' usage is extremely incorrect after replace-brick.

Disperse volume 'df' usage statistics are extremely incorrect
after a replace-brick in which the source brick is down. On a
3-brick, redundancy-1 disperse volume, the available space is
reduced by 50%, and inode usage jumps from 1% to 51%, even on
an empty volume. The 'df' numbers are wrong on both FUSE and
NFS v3 mounts. Stopping/starting the disperse volume and
remounting the clients does not correct the 'df' usage
numbers.
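
For reference, the usable capacity of a disperse volume is the
per-brick size multiplied by the number of data bricks (the
redundancy bricks add no capacity). A quick sanity check of the
expected 'df' total for the 2+1 volume on 100G bricks used in the
test plan below (the numbers here are taken from that test plan):

```shell
# Expected usable capacity of a disperse volume:
#   data-brick count x per-brick size (redundancy bricks add nothing).
data_bricks=2
brick_size_gb=100
expected_gb=$((data_bricks * brick_size_gb))
echo "expected 'df' size: ${expected_gb}G"   # 200G, matching the initial 'df' output
```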

When the replace-brick is done while the source brick is
running, the 'df' usage statistics afterward appear to be
correct.

It looks as though only the statfs() numbers that 'df' uses
are incorrect; the actual disperse volume space and inode
usage appear to be fine. In that sense the issue is cosmetic,
except for any applications or features that consume and trust
these numbers.
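
The raw statfs fields that 'df' consumes can be read directly with
stat(1), which makes it easy to compare the client mount against the
bricks and separate the (cosmetic) statfs problem from real usage. A
sketch; MNT defaults to / so it runs anywhere, but point it at the
FUSE mount (e.g. /mnt/disp-vol-fuse) on an affected system:

```shell
# Dump the statfs fields behind 'df': total blocks, free blocks,
# total inodes, free inodes, and the fundamental block size.
MNT=${MNT:-/}
stat -f -c 'blocks=%b bfree=%f files=%c ffree=%d bsize=%S' "$MNT"
```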

Test plan:

# gluster --version
glusterfs 3.12.14

##### Start with empty bricks, on separate file-systems.
# df -h /exports/brick-*
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdd        100G   33M  100G   1% /exports/brick-1
/dev/sde        100G   33M  100G   1% /exports/brick-2
/dev/sdf        100G   33M  100G   1% /exports/brick-3
/dev/sdg        100G   33M  100G   1% /exports/brick-4
/dev/sdh        100G   33M  100G   1% /exports/brick-5
/dev/sdi        100G   33M  100G   1% /exports/brick-6
# df -h -i /exports/brick-*
Filesystem     Inodes IUsed IFree IUse% Mounted on
/dev/sdd          50M     3   50M    1% /exports/brick-1
/dev/sde          50M     3   50M    1% /exports/brick-2
/dev/sdf          50M     3   50M    1% /exports/brick-3
/dev/sdg          50M     3   50M    1% /exports/brick-4
/dev/sdh          50M     3   50M    1% /exports/brick-5
/dev/sdi          50M     3   50M    1% /exports/brick-6

##### Create the disperse volume:
# mkdir /exports/brick-1/disp-vol /exports/brick-2/disp-vol
/exports/brick-3/disp-vol /exports/brick-4/disp-vol /exports/brick-5/disp-vol
/exports/brick-6/disp-vol
# gluster volume create disp-vol disperse-data 2 redundancy 1 transport tcp
10.0.0.28:/exports/brick-1/disp-vol/ 10.0.0.28:/exports/brick-2/disp-vol/
10.0.0.28:/exports/brick-3/disp-vol/ force
volume create: disp-vol: success: please start the volume to access data
# gluster volume start disp-vol
volume start: disp-vol: success

##### Mount the disperse volume using both FUSE and NFS v3:
# mkdir /mnt/disp-vol-fuse
# mkdir /mnt/disp-vol-nfs
# mount -t glusterfs -o acl,log-level=WARNING,fuse-mountopts=noatime
127.0.0.1:/disp-vol /mnt/disp-vol-fuse/
# gluster volume set disp-vol nfs.disable off
Gluster NFS is being deprecated in favor of NFS-Ganesha Enter "yes" to continue
using Gluster NFS (y/n) yes
volume set: success
# mount 127.0.0.1:/disp-vol /mnt/disp-vol-nfs/

##### Initially, the space and inode usage numbers are correct:
# df -h /mnt/disp-vol-*
Filesystem           Size  Used Avail Use% Mounted on
127.0.0.1:/disp-vol  200G   65M  200G   1% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol  200G   64M  200G   1% /mnt/disp-vol-nfs
# df -h -i /mnt/disp-vol-*
Filesystem          Inodes IUsed IFree IUse% Mounted on
127.0.0.1:/disp-vol    50M    22   50M    1% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol    50M    22   50M    1% /mnt/disp-vol-nfs
# df -h -i /exports/brick-*
Filesystem     Inodes IUsed IFree IUse% Mounted on
/dev/sdd          50M    22   50M    1% /exports/brick-1
/dev/sde          50M    20   50M    1% /exports/brick-2
/dev/sdf          50M    20   50M    1% /exports/brick-3

##### Create a file to use up some space:
# fallocate -l 25G /mnt/disp-vol-fuse/file.1
# df -h /mnt/disp-vol-*
Filesystem           Size  Used Avail Use% Mounted on
127.0.0.1:/disp-vol  200G   26G  175G  13% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol  200G   26G  175G  13% /mnt/disp-vol-nfs
# df -h -i /mnt/disp-vol-*
Filesystem          Inodes IUsed IFree IUse% Mounted on
127.0.0.1:/disp-vol    50M    26   50M    1% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol    50M    26   50M    1% /mnt/disp-vol-nfs
# df -h /exports/brick-*
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdd        100G   13G   88G  13% /exports/brick-1
/dev/sde        100G   13G   88G  13% /exports/brick-2
/dev/sdf        100G   13G   88G  13% /exports/brick-3
# df -h -i /exports/brick-*
Filesystem     Inodes IUsed IFree IUse% Mounted on
/dev/sdd          50M    26   50M    1% /exports/brick-1
/dev/sde          50M    24   50M    1% /exports/brick-2
/dev/sdf          50M    24   50M    1% /exports/brick-3

##### Perform the first replace-brick with the source brick being up:
# gluster volume replace-brick disp-vol 10.0.0.28:/exports/brick-1/disp-vol/
10.0.0.28:/exports/brick-4/disp-vol/ commit force
volume replace-brick: success: replace-brick commit force operation successful
# gluster volume heal disp-vol info
Brick 10.0.0.28:/exports/brick-4/disp-vol
Status: Connected
Number of entries: 0
Brick 10.0.0.28:/exports/brick-2/disp-vol
/file.1
Status: Connected
Number of entries: 1
Brick 10.0.0.28:/exports/brick-3/disp-vol
/file.1
Status: Connected
Number of entries: 1

##### After first replace-brick with up source brick, the space and inode usage
numbers are correct:
# df -h /mnt/disp-vol-*
Filesystem           Size  Used Avail Use% Mounted on
127.0.0.1:/disp-vol  200G   26G  175G  13% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol  200G   26G  175G  13% /mnt/disp-vol-nfs
# df -h -i /mnt/disp-vol-*
Filesystem          Inodes IUsed IFree IUse% Mounted on
127.0.0.1:/disp-vol    50M    24   50M    1% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol    50M    24   50M    1% /mnt/disp-vol-nfs
# df -h /exports/brick-*
Filesystem      Size  Used Avail Use% Mounted on
/dev/sde        100G   13G   88G  13% /exports/brick-2
/dev/sdf        100G   13G   88G  13% /exports/brick-3
/dev/sdg        100G  8.1G   92G   9% /exports/brick-4
# gluster volume heal disp-vol info
Brick 10.0.0.28:/exports/brick-4/disp-vol
Status: Connected
Number of entries: 0
Brick 10.0.0.28:/exports/brick-2/disp-vol
Status: Connected
Number of entries: 0
Brick 10.0.0.28:/exports/brick-3/disp-vol
Status: Connected
Number of entries: 0
##### Still good after healing is done:
# df -h /exports/brick-*
Filesystem      Size  Used Avail Use% Mounted on
/dev/sde        100G   13G   88G  13% /exports/brick-2
/dev/sdf        100G   13G   88G  13% /exports/brick-3
/dev/sdg        100G   13G   88G  13% /exports/brick-4
# df -h -i /exports/brick-*
Filesystem     Inodes IUsed IFree IUse% Mounted on
/dev/sde          50M    24   50M    1% /exports/brick-2
/dev/sdf          50M    24   50M    1% /exports/brick-3
/dev/sdg          50M    24   50M    1% /exports/brick-4
# df -h /mnt/disp-vol-*
Filesystem           Size  Used Avail Use% Mounted on
127.0.0.1:/disp-vol  200G   26G  175G  13% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol  200G   26G  175G  13% /mnt/disp-vol-nfs
# df -h -i /mnt/disp-vol-*
Filesystem          Inodes IUsed IFree IUse% Mounted on
127.0.0.1:/disp-vol    50M    24   50M    1% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol    50M    24   50M    1% /mnt/disp-vol-nfs

##### Kill brick-2 process to simulate failure:
# gluster volume status disp-vol
Status of volume: disp-vol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.0.0.28:/exports/brick-4/disp-vol   62003     0          Y       110996
Brick 10.0.0.28:/exports/brick-2/disp-vol   62001     0          Y       107148
Brick 10.0.0.28:/exports/brick-3/disp-vol   62002     0          Y       107179
NFS Server on localhost                     2049      0          Y       111004
Self-heal Daemon on localhost               N/A       N/A        Y       111015
Task Status of Volume disp-vol
------------------------------------------------------------------------------
There are no active volume tasks
# kill 107148
##### Before the replace-brick, the 'df' numbers are still good:
# df -h /mnt/disp-vol-*
Filesystem           Size  Used Avail Use% Mounted on
127.0.0.1:/disp-vol  200G   26G  175G  13% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol  200G   26G  175G  13% /mnt/disp-vol-nfs
# df -h -i /mnt/disp-vol-*
Filesystem          Inodes IUsed IFree IUse% Mounted on
127.0.0.1:/disp-vol    50M    24   50M    1% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol    50M    24   50M    1% /mnt/disp-vol-nfs

##### After the replace-brick with a down source brick, the
'df' numbers go haywire: the volume size is reduced by 50%,
and inode use jumps from 1% to 51%:

# gluster volume replace-brick disp-vol 10.0.0.28:/exports/brick-2/disp-vol/
10.0.0.28:/exports/brick-5/disp-vol/ commit force
volume replace-brick: success: replace-brick commit force operation successful
# df -h /mnt/disp-vol-*
Filesystem           Size  Used Avail Use% Mounted on
127.0.0.1:/disp-vol  100G   13G   88G  13% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol  100G   13G   88G  13% /mnt/disp-vol-nfs
# df -h -i /mnt/disp-vol-*
Filesystem          Inodes IUsed IFree IUse% Mounted on
127.0.0.1:/disp-vol    50M   26M   25M   51% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol    50M   26M   25M   51% /mnt/disp-vol-nfs
# gluster volume heal disp-vol info
Brick 10.0.0.28:/exports/brick-4/disp-vol
/file.1
Status: Connected
Number of entries: 1
Brick 10.0.0.28:/exports/brick-5/disp-vol
Status: Connected
Number of entries: 0
Brick 10.0.0.28:/exports/brick-3/disp-vol
/file.1
Status: Connected
Number of entries: 1
# df -h /exports/brick-*
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdf        100G   13G   88G  13% /exports/brick-3
/dev/sdg        100G   13G   88G  13% /exports/brick-4
/dev/sdh        100G  2.1G   98G   3% /exports/brick-5
# df -h -i /exports/brick-*
Filesystem     Inodes IUsed IFree IUse% Mounted on
/dev/sdf          50M    24   50M    1% /exports/brick-3
/dev/sdg          50M    24   50M    1% /exports/brick-4
/dev/sdh          50M    24   50M    1% /exports/brick-5

##### 'df' numbers are no better after healing is done:
# gluster volume heal disp-vol info
Brick 10.0.0.28:/exports/brick-4/disp-vol
Status: Connected
Number of entries: 0
Brick 10.0.0.28:/exports/brick-5/disp-vol
Status: Connected
Number of entries: 0
Brick 10.0.0.28:/exports/brick-3/disp-vol
Status: Connected
Number of entries: 0
# df -h /mnt/disp-vol-*
Filesystem           Size  Used Avail Use% Mounted on
127.0.0.1:/disp-vol  100G   13G   88G  13% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol  100G   13G   88G  13% /mnt/disp-vol-nfs
# df -h -i /mnt/disp-vol-*
Filesystem          Inodes IUsed IFree IUse% Mounted on
127.0.0.1:/disp-vol    50M   26M   25M   51% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol    50M   26M   25M   51% /mnt/disp-vol-nfs
# df -h /exports/brick-*
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdf        100G   13G   88G  13% /exports/brick-3
/dev/sdg        100G   13G   88G  13% /exports/brick-4
/dev/sdh        100G   13G   88G  13% /exports/brick-5
# df -h -i /exports/brick-*
Filesystem     Inodes IUsed IFree IUse% Mounted on
/dev/sdf          50M    24   50M    1% /exports/brick-3
/dev/sdg          50M    24   50M    1% /exports/brick-4
/dev/sdh          50M    24   50M    1% /exports/brick-5

##### Stopping/starting the disperse volume, and remounting clients does not
help:
# gluster volume stop disp-vol
Stopping volume will make its data inaccessible. Do you want to continue? (y/n)
y
volume stop: disp-vol: success
# gluster volume start disp-vol
volume start: disp-vol: success
# mount -t glusterfs -o acl,log-level=WARNING,fuse-mountopts=noatime
127.0.0.1:/disp-vol /mnt/disp-vol-fuse/
# mount 127.0.0.1:/disp-vol /mnt/disp-vol-nfs/
# df -h /mnt/disp-vol-*
Filesystem           Size  Used Avail Use% Mounted on
127.0.0.1:/disp-vol  100G   13G   88G  13% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol  100G   13G   88G  13% /mnt/disp-vol-nfs
# df -h -i /mnt/disp-vol-*
Filesystem          Inodes IUsed IFree IUse% Mounted on
127.0.0.1:/disp-vol    50M   26M   25M   51% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol    50M   26M   25M   51% /mnt/disp-vol-nfs

##### Simulate a second brick failure, and replacement:
# gluster volume status disp-vol
Status of volume: disp-vol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.0.0.28:/exports/brick-4/disp-vol   62001     0          Y       121258
Brick 10.0.0.28:/exports/brick-5/disp-vol   62004     0          Y       121278
Brick 10.0.0.28:/exports/brick-3/disp-vol   62005     0          Y       121298
NFS Server on localhost                     2049      0          Y       121319
Self-heal Daemon on localhost               N/A       N/A        Y       121328
Task Status of Volume disp-vol
------------------------------------------------------------------------------
There are no active volume tasks

# kill 121298
# df -h /mnt/disp-vol-*
Filesystem           Size  Used Avail Use% Mounted on
127.0.0.1:/disp-vol  100G   13G   88G  13% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol  100G   13G   88G  13% /mnt/disp-vol-nfs
# df -h -i /mnt/disp-vol-*
Filesystem          Inodes IUsed IFree IUse% Mounted on
127.0.0.1:/disp-vol    25M    12   25M    1% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol    25M    12   25M    1% /mnt/disp-vol-nfs
# gluster volume replace-brick disp-vol 10.0.0.28:/exports/brick-3/disp-vol/
10.0.0.28:/exports/brick-6/disp-vol/ commit force
volume replace-brick: success: replace-brick commit force operation successful

##### After the second replace-brick with a down source brick,
the volume size reported by 'df' drops by another 33%. Inode
usage fell back from 51%, but it is now lower than the value
the volume started with, which is suspicious, and the total
inode count has dropped from the original 50M to 17M!
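
The shrinking totals follow a pattern worth noting: each
down-source replace-brick appears to divide the true 200G capacity
by one more, i.e. 200G/2 = 100G after the first and 200G/3 ~ 67G
after the second. This is only an observation about the reported
numbers, not a confirmed root cause:

```shell
# Observed pattern (hypothesis only): reported total = true capacity
# divided by (1 + number of down-source replace-bricks performed).
true_gb=200
echo "after 1st replace-brick: $((true_gb / 2))G"   # 100G, as 'df' reported
echo "after 2nd replace-brick: ~$((true_gb / 3))G"  # 66G, close to the 67G reported
```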

# df -h /mnt/disp-vol-*
Filesystem           Size  Used Avail Use% Mounted on
127.0.0.1:/disp-vol   67G  8.4G   59G  13% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol   67G  8.4G   59G  13% /mnt/disp-vol-nfs
# df -h -i /mnt/disp-vol-*
Filesystem          Inodes IUsed IFree IUse% Mounted on
127.0.0.1:/disp-vol    17M     8   17M    1% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol    17M     8   17M    1% /mnt/disp-vol-nfs
# df -h /exports/brick-*
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdg        100G   13G   88G  13% /exports/brick-4
/dev/sdh        100G   13G   88G  13% /exports/brick-5
/dev/sdi        100G  2.1G   98G   3% /exports/brick-6
# df -h -i /exports/brick-*
Filesystem     Inodes IUsed IFree IUse% Mounted on
/dev/sdg          50M    24   50M    1% /exports/brick-4
/dev/sdh          50M    24   50M    1% /exports/brick-5
/dev/sdi          50M    24   50M    1% /exports/brick-6
# gluster volume heal disp-vol info
Brick 10.0.0.28:/exports/brick-4/disp-vol
Status: Connected
Number of entries: 0
Brick 10.0.0.28:/exports/brick-5/disp-vol
Status: Connected
Number of entries: 0
Brick 10.0.0.28:/exports/brick-6/disp-vol
Status: Connected
Number of entries: 0
##### 'df' numbers are no better after healing is done:
# df -h /mnt/disp-vol-*
Filesystem           Size  Used Avail Use% Mounted on
127.0.0.1:/disp-vol   67G  8.4G   59G  13% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol   67G  8.4G   59G  13% /mnt/disp-vol-nfs
# df -h -i /mnt/disp-vol-*
Filesystem          Inodes IUsed IFree IUse% Mounted on
127.0.0.1:/disp-vol    17M     8   17M    1% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol    17M     8   17M    1% /mnt/disp-vol-nfs
# df -h /exports/brick-*
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdg        100G   13G   88G  13% /exports/brick-4
/dev/sdh        100G   13G   88G  13% /exports/brick-5
/dev/sdi        100G   13G   88G  13% /exports/brick-6
# df -h -i /exports/brick-*
Filesystem     Inodes IUsed IFree IUse% Mounted on
/dev/sdg          50M    24   50M    1% /exports/brick-4
/dev/sdh          50M    24   50M    1% /exports/brick-5
/dev/sdi          50M    24   50M    1% /exports/brick-6
# gluster volume info disp-vol
Volume Name: disp-vol
Type: Disperse
Volume ID: fb9cccb8-311f-49ac-948d-60e4894da0b6
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 10.0.0.28:/exports/brick-4/disp-vol
Brick2: 10.0.0.28:/exports/brick-5/disp-vol
Brick3: 10.0.0.28:/exports/brick-6/disp-vol
Options Reconfigured:
transport.address-family: inet
nfs.disable: off

##### Note that although 'df' says the disperse volume is
only 67G, it really still has 200G of space.

# df -h /mnt/disp-vol-*
Filesystem           Size  Used Avail Use% Mounted on
127.0.0.1:/disp-vol   67G  8.4G   59G  13% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol   67G  8.4G   59G  13% /mnt/disp-vol-nfs

# fallocate -l 25G /mnt/disp-vol-fuse/file.2
# fallocate -l 25G /mnt/disp-vol-fuse/file.3
# fallocate -l 25G /mnt/disp-vol-fuse/file.4
# fallocate -l 25G /mnt/disp-vol-fuse/file.5
# fallocate -l 25G /mnt/disp-vol-fuse/file.6
# fallocate -l 25G /mnt/disp-vol-fuse/file.7
# fallocate -l 25G /mnt/disp-vol-fuse/file.8
fallocate: /mnt/disp-vol-fuse/file.8: fallocate failed: No space left on device
# df -h /mnt/disp-vol-*
Filesystem           Size  Used Avail Use% Mounted on
127.0.0.1:/disp-vol   67G   62G  5.4G  93% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol   67G   62G  5.4G  93% /mnt/disp-vol-nfs
# du -sh /mnt/disp-vol-fuse/
176G    /mnt/disp-vol-fuse/
# du -sh /mnt/disp-vol-nfs/
176G    /mnt/disp-vol-nfs/

# df -h -i /mnt/disp-vol-*
Filesystem          Inodes IUsed IFree IUse% Mounted on
127.0.0.1:/disp-vol   5.4M    15  5.4M    1% /mnt/disp-vol-fuse
127.0.0.1:/disp-vol   5.4M    15  5.4M    1% /mnt/disp-vol-nfs
