[Bugs] [Bug 1762438] New: DHT- gluster rebalance status shows wrong data size after rebalance is completed successfully

bugzilla at redhat.com bugzilla at redhat.com
Wed Oct 16 18:04:53 UTC 2019


https://bugzilla.redhat.com/show_bug.cgi?id=1762438

            Bug ID: 1762438
           Summary: DHT- gluster rebalance status shows wrong data size
                    after rebalance is completed successfully
           Product: GlusterFS
           Version: mainline
          Hardware: x86_64
                OS: Linux
            Status: NEW
         Component: glusterd
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: srakonde at redhat.com
                CC: amukherj at redhat.com, bmekala at redhat.com,
                    bshetty at redhat.com, bugs at gluster.org,
                    nchilaka at redhat.com, rhs-bugs at redhat.com,
                    saraut at redhat.com, spalai at redhat.com,
                    storage-qa-internal at redhat.com, vbellur at redhat.com
        Depends On: 1761486
  Target Milestone: ---
    Classification: Community



+++ This bug was initially created as a clone of Bug #1761486 +++

Description of problem:
=======================
While doing a gluster rebalance from 1*(4+2) to 2*(4+2) and then to 3*(4+2),
the rebalance status shows inconsistent (wrong) data sizes after the rebalance
has completed. The status shows correct data for about 4-6 hours after
completion, and then the values no longer match. This was seen in both the
2*(4+2) and 3*(4+2) configurations. Although the migrated data size was in the
GB range, some time after the rebalance completed the status reported it in PB.


Version-Release number of selected component (if applicable):
============================================================
glusterfs-6.0-15.el7rhgs.x86_64

How reproducible:
================
2/2

Steps:
=====
1. Create a RHHI-V setup with a 3-node configuration.
2. Create a disperse volume of 1*(4+2).
3. Start fio I/O from a gluster client and wait until it completes (say 400GB).
4. Add bricks so the volume becomes 2*(4+2), start rebalance and wait until it
   completes.
5. After 4-5 hours check the rebalance status; it shows an incorrect data size.
6. Repeat steps 3 and 4, so the volume becomes 3*(4+2).
7. Start rebalance and wait until the rebalance is completed.
8. Check the rebalance status as mentioned in step 5.


Actual results:
==============
Shows the wrong migrated data size (in petabytes)


Expected results:
=================
Should show the actual migrated data size


--- Additional comment from bipin on 2019-10-14 19:19:54 IST ---

[root at rhsqa-grafton7-nic2 glusterfs]# gluster volume info 
Volume Name: backupvol
Type: Distributed-Disperse
Volume ID: a2926822-a9a4-4ca7-8894-cf71d9ae1057
Status: Started
Snapshot Count: 0
Number of Bricks: 3 x (4 + 2) = 18
Transport-type: tcp
Bricks:
Brick1: rhsqa-grafton7.lab.eng.blr.redhat.com:/gluster/brick1/b1
Brick2: rhsqa-grafton8.lab.eng.blr.redhat.com:/gluster/brick1/b1
Brick3: rhsqa-grafton9.lab.eng.blr.redhat.com:/gluster/brick1/b1
Brick4: rhsqa-grafton7.lab.eng.blr.redhat.com:/gluster/brick2/b2
Brick5: rhsqa-grafton8.lab.eng.blr.redhat.com:/gluster/brick2/b2
Brick6: rhsqa-grafton9.lab.eng.blr.redhat.com:/gluster/brick2/b2
Brick7: rhsqa-grafton7.lab.eng.blr.redhat.com:/gluster/brick3/b3
Brick8: rhsqa-grafton8.lab.eng.blr.redhat.com:/gluster/brick3/b3
Brick9: rhsqa-grafton9.lab.eng.blr.redhat.com:/gluster/brick3/b3
Brick10: rhsqa-grafton7.lab.eng.blr.redhat.com:/gluster/brick4/b4
Brick11: rhsqa-grafton8.lab.eng.blr.redhat.com:/gluster/brick4/b4
Brick12: rhsqa-grafton9.lab.eng.blr.redhat.com:/gluster/brick4/b4
Brick13: rhsqa-grafton7.lab.eng.blr.redhat.com:/gluster/brick5/b5
Brick14: rhsqa-grafton8.lab.eng.blr.redhat.com:/gluster/brick5/b5
Brick15: rhsqa-grafton9.lab.eng.blr.redhat.com:/gluster/brick5/b5
Brick16: rhsqa-grafton7.lab.eng.blr.redhat.com:/gluster/brick6/b6
Brick17: rhsqa-grafton8.lab.eng.blr.redhat.com:/gluster/brick6/b6
Brick18: rhsqa-grafton9.lab.eng.blr.redhat.com:/gluster/brick6/b6
Options Reconfigured:
nfs.disable: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.export-dirs: off
nfs.export-volumes: off
performance.read-ahead: off
performance.io-thread-count: 24
performance.md-cache-timeout: 3
performance.cache-refresh-timeout: 3
performance.cache-size: 7GB
performance.parallel-readdir: on
client.event-threads: 5
server.event-threads: 5
features.cache-invalidation-timeout: 600
performance.cache-invalidation: on
network.inode-lru-limit: 65536
cluster.brick-multiplex: on

[root at rhsqa-grafton7-nic2 glusterfs]# df -Th
Filesystem                                              Type            Size 
Used Avail Use% Mounted on
devtmpfs                                                devtmpfs        126G   
 0  126G   0% /dev
tmpfs                                                   tmpfs           126G  
16K  126G   1% /dev/shm
tmpfs                                                   tmpfs           126G  
11M  126G   1% /run
tmpfs                                                   tmpfs           126G   
 0  126G   0% /sys/fs/cgroup
/dev/mapper/rhel_rhsqa--grafton7-root                   xfs              50G 
5.2G   45G  11% /
/dev/mapper/GLUSTER_vg2-GLUSTER_lv2                     xfs             1.9T  
81G  1.8T   5% /gluster/brick2
/dev/mapper/GLUSTER_vg4-GLUSTER_lv4                     xfs             1.9T  
38G  1.8T   3% /gluster/brick4
/dev/mapper/GLUSTER_vg3-GLUSTER_lv3                     xfs             1.9T  
38G  1.8T   3% /gluster/brick3
/dev/mapper/GLUSTER_vg1-GLUSTER_lv1                     xfs             1.9T 
115G  1.7T   7% /gluster/brick1
/dev/mapper/GLUSTER_vg6-GLUSTER_lv6                     xfs             1.9T  
41G  1.8T   3% /gluster/brick6
/dev/mapper/GLUSTER_vg5-GLUSTER_lv5                     xfs             1.9T  
41G  1.8T   3% /gluster/brick5
/dev/mapper/gluster_vg_sdp2-gluster_lv_commvault_server xfs             500G  
64G  437G  13% /gluster_bricks/commvault_server
/dev/mapper/rhel_rhsqa--grafton7-home                   xfs             876G  
34M  876G   1% /home
/dev/sda1                                               xfs            1014M 
200M  815M  20% /boot
/dev/mapper/gluster_vg_sdp1-gluster_lv_engine           xfs             100G 
5.9G   95G   6% /gluster_bricks/engine
rhsqa-grafton7.lab.eng.blr.redhat.com:/engine           fuse.glusterfs  100G 
6.9G   94G   7%
/rhev/data-center/mnt/glusterSD/rhsqa-grafton7.lab.eng.blr.redhat.com:_engine
rhsqa-grafton7.lab.eng.blr.redhat.com:/commvault_server fuse.glusterfs  500G  
69G  432G  14%
/rhev/data-center/mnt/glusterSD/rhsqa-grafton7.lab.eng.blr.redhat.com:_commvault__server
tmpfs                                                   tmpfs            26G   
 0   26G   0% /run/user/0


[root at rhsqa-grafton8-nic2 ~]# df -Th
Filesystem                                              Type            Size 
Used Avail Use% Mounted on
devtmpfs                                                devtmpfs        126G   
 0  126G   0% /dev
tmpfs                                                   tmpfs           126G  
16K  126G   1% /dev/shm
tmpfs                                                   tmpfs           126G  
11M  126G   1% /run
tmpfs                                                   tmpfs           126G   
 0  126G   0% /sys/fs/cgroup
/dev/mapper/rhel_rhsqa--grafton8-root                   xfs              50G 
3.1G   47G   7% /
/dev/mapper/GLUSTER_vg5-GLUSTER_lv5                     xfs             1.9T  
41G  1.8T   3% /gluster/brick5
/dev/mapper/GLUSTER_vg2-GLUSTER_lv2                     xfs             1.9T  
81G  1.8T   5% /gluster/brick2
/dev/mapper/rhel_rhsqa--grafton8-home                   xfs             876G  
34M  876G   1% /home
/dev/mapper/GLUSTER_vg1-GLUSTER_lv1                     xfs             1.9T  
81G  1.8T   5% /gluster/brick1
/dev/mapper/GLUSTER_vg3-GLUSTER_lv3                     xfs             1.9T  
38G  1.8T   3% /gluster/brick3
/dev/mapper/GLUSTER_vg6-GLUSTER_lv6                     xfs             1.9T  
41G  1.8T   3% /gluster/brick6
/dev/mapper/GLUSTER_vg4-GLUSTER_lv4                     xfs             1.9T  
38G  1.8T   3% /gluster/brick4
/dev/sda1                                               xfs            1014M 
200M  815M  20% /boot
/dev/mapper/gluster_vg_sdp2-gluster_lv_commvault_server xfs             500G  
64G  437G  13% /gluster_bricks/commvault_server
/dev/mapper/gluster_vg_sdp1-gluster_lv_engine           xfs             100G 
5.9G   95G   6% /gluster_bricks/engine
rhsqa-grafton7.lab.eng.blr.redhat.com:/engine           fuse.glusterfs  100G 
6.9G   94G   7%
/rhev/data-center/mnt/glusterSD/rhsqa-grafton7.lab.eng.blr.redhat.com:_engine
rhsqa-grafton7.lab.eng.blr.redhat.com:/commvault_server fuse.glusterfs  500G  
69G  432G  14%
/rhev/data-center/mnt/glusterSD/rhsqa-grafton7.lab.eng.blr.redhat.com:_commvault__server
tmpfs                                                   tmpfs            26G   
 0   26G   0% /run/user/0


[root at rhsqa-grafton9 glusterfs]# df -Th
Filesystem                                              Type            Size 
Used Avail Use% Mounted on
devtmpfs                                                devtmpfs        126G   
 0  126G   0% /dev
tmpfs                                                   tmpfs           126G  
16K  126G   1% /dev/shm
tmpfs                                                   tmpfs           126G  
27M  126G   1% /run
tmpfs                                                   tmpfs           126G   
 0  126G   0% /sys/fs/cgroup
/dev/mapper/rhel_rhsqa--grafton9-root                   xfs              50G 
3.1G   47G   7% /
/dev/mapper/GLUSTER_vg6-GLUSTER_lv6                     xfs             1.9T  
41G  1.8T   3% /gluster/brick6
/dev/mapper/gluster_vg_sdp2-gluster_lv_commvault_server xfs             500G  
64G  437G  13% /gluster_bricks/commvault_server
/dev/mapper/GLUSTER_vg1-GLUSTER_lv1                     xfs             1.9T  
77G  1.8T   5% /gluster/brick1
/dev/mapper/GLUSTER_vg4-GLUSTER_lv4                     xfs             1.9T  
38G  1.8T   3% /gluster/brick4
/dev/mapper/GLUSTER_vg5-GLUSTER_lv5                     xfs             1.9T  
41G  1.8T   3% /gluster/brick5
/dev/mapper/GLUSTER_vg2-GLUSTER_lv2                     xfs             1.9T  
80G  1.8T   5% /gluster/brick2
/dev/mapper/GLUSTER_vg3-GLUSTER_lv3                     xfs             1.9T  
38G  1.8T   3% /gluster/brick3
/dev/mapper/gluster_vg_sdp1-gluster_lv_engine           xfs             100G 
5.9G   95G   6% /gluster_bricks/engine
/dev/mapper/rhel_rhsqa--grafton9-home                   xfs             876G  
34M  876G   1% /home
/dev/sda1                                               xfs            1014M 
200M  815M  20% /boot
rhsqa-grafton7.lab.eng.blr.redhat.com:/engine           fuse.glusterfs  100G 
6.9G   94G   7%
/rhev/data-center/mnt/glusterSD/rhsqa-grafton7.lab.eng.blr.redhat.com:_engine
rhsqa-grafton7.lab.eng.blr.redhat.com:/commvault_server fuse.glusterfs  500G  
69G  432G  14%
/rhev/data-center/mnt/glusterSD/rhsqa-grafton7.lab.eng.blr.redhat.com:_commvault__server
tmpfs                                                   tmpfs            26G   
 0   26G   0% /run/user/0

--- Additional comment from bipin on 2019-10-14 19:21:45 IST ---

While the re-balance is in progress:
====================================
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
   rhsqa-grafton9.lab.eng.blr.redhat.com                0        0Bytes            36             0             0          in progress        0:12:12
   rhsqa-grafton8.lab.eng.blr.redhat.com                0        0Bytes            36             0             0          in progress        0:12:12
                               localhost                4       100.8GB            20             0             0          in progress        0:12:12
Estimated time left for rebalance to complete :        0:18:57

                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
   rhsqa-grafton9.lab.eng.blr.redhat.com                1        28.7GB            36             0             0          in progress        0:13:13
   rhsqa-grafton8.lab.eng.blr.redhat.com                0        0Bytes            36             0             0          in progress        0:13:13
                               localhost                4       100.8GB            20             0             0          in progress        0:13:13
Estimated time left for rebalance to complete :        0:18:26

                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
   rhsqa-grafton9.lab.eng.blr.redhat.com                3        86.6GB            36             0             0          in progress        0:17:30
   rhsqa-grafton8.lab.eng.blr.redhat.com                2        57.8GB            36             0             0          in progress        0:17:30
                               localhost                4       100.8GB            20             0             0          in progress        0:17:30
Estimated time left for rebalance to complete :        0:19:45


After re-balance is completed:
==============================
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
   rhsqa-grafton9.lab.eng.blr.redhat.com                7       200.9GB            36             0             0            completed        0:24:27
   rhsqa-grafton8.lab.eng.blr.redhat.com                3       127.9GB            36             0             0            completed        0:27:18
                               localhost                6       198.1GB            20             0             0            completed        0:26:32


After 4-6 hours it shows incorrect data size:
=============================================
[root at rhsqa-grafton7-nic2 tmp]# gluster volume rebalance backupvol status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
   rhsqa-grafton9.lab.eng.blr.redhat.com                7       200.9GB            36             0             0            completed        0:24:27
   rhsqa-grafton8.lab.eng.blr.redhat.com                3     16384.0PB            36             0             0            completed        0:27:18
                               localhost                6     16384.0PB            20             0             0            completed        0:26:32






Rebalance status:
=================                 

                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
   rhsqa-grafton9.lab.eng.blr.redhat.com                7       200.9GB            36             0             0            completed        0:24:27
   rhsqa-grafton8.lab.eng.blr.redhat.com                3       127.9GB            36             0             0            completed        0:27:18
                               localhost                6       198.1GB            20             0             0            completed        0:26:32



The file sizes in the rebalance logs below match the rebalance status above.

Rebalance log for grafton7:
==========================
[2019-10-14 04:50:42.971557] I [MSGID: 109028]
[dht-rebalance.c:5059:gf_defrag_status_get] 0-backupvol-dht: Rebalance is
completed. Time taken is 1467.00 secs
[2019-10-14 04:50:42.971586] I [MSGID: 109028]
[dht-rebalance.c:5065:gf_defrag_status_get] 0-backupvol-dht: Files migrated: 7,
size: 215754080256, lookups: 36, failures: 0, skipped: 0
[2019-10-14 04:50:42.973059] W [glusterfsd.c:1560:cleanup_and_exit]
(-->/lib64/libpthread.so.0(+0x7ea5) [0x7f0e936d1ea5]
-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x5592b03111f5]
-->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x5592b031105b] ) 0-: received
signum (15), shutting down


Rebalance log for grafton8:
===========================
[2019-10-14 04:53:27.016118] I [dht-rebalance.c:4991:gf_defrag_status_get]
0-glusterfs: TIME: Estimated total time to complete (size)= 2755 seconds,
seconds left = 1123
[2019-10-14 04:53:27.016147] I [MSGID: 109028]
[dht-rebalance.c:5059:gf_defrag_status_get] 0-glusterfs: Rebalance is in
progress. Time taken is 1632.00 secs
[2019-10-14 04:53:27.016159] I [MSGID: 109028]
[dht-rebalance.c:5065:gf_defrag_status_get] 0-glusterfs: Files migrated: 2,
size: 62050402304, lookups: 36, failures: 0, skipped: 0
[2019-10-14 04:53:33.285796] I [MSGID: 109022]
[dht-rebalance.c:2233:dht_migrate_file] 0-backupvol-dht: completed migration of
/f.1.0 from subvolume backupvol-disperse-1 to backupvol-disperse-0
[2019-10-14 04:53:33.288968] I [MSGID: 109028]
[dht-rebalance.c:5059:gf_defrag_status_get] 0-backupvol-dht: Rebalance is
completed. Time taken is 1638.00 secs
[2019-10-14 04:53:33.288990] I [MSGID: 109028]
[dht-rebalance.c:5065:gf_defrag_status_get] 0-backupvol-dht: Files migrated: 3,
size: 137363980288, lookups: 36, failures: 0, skipped: 0


Rebalance log for grafton9:
==========================
[2019-10-14 04:50:42.968102] W [MSGID: 114031]
[client-rpc-fops_v2.c:2634:client4_0_lookup_cbk] 0-backupvol-client-3: remote
operation failed. Path: /test/f.3.0 (52789828-ad20-480d-b936-648b531a315a) [No
such file or directory]
[2019-10-14 04:50:42.971557] I [MSGID: 109028]
[dht-rebalance.c:5059:gf_defrag_status_get] 0-backupvol-dht: Rebalance is
completed. Time taken is 1467.00 secs
[2019-10-14 04:50:42.971586] I [MSGID: 109028]
[dht-rebalance.c:5065:gf_defrag_status_get] 0-backupvol-dht: Files migrated: 7,
size: 215754080256, lookups: 36, failures: 0, skipped: 0

--- Additional comment from bipin on 2019-10-14 19:24:42 IST ---

Client snippet:
==============
[root at dhcp43-65 backupvol_test]# df -Th
Filesystem                                            Type            Size 
Used Avail Use% Mounted on
/dev/mapper/rhel_dhcp43--65-root                      xfs              44G 
1.5G   43G   4% /
devtmpfs                                              devtmpfs        3.9G    
0  3.9G   0% /dev
tmpfs                                                 tmpfs           3.9G    
0  3.9G   0% /dev/shm
tmpfs                                                 tmpfs           3.9G 
8.5M  3.9G   1% /run
tmpfs                                                 tmpfs           3.9G    
0  3.9G   0% /sys/fs/cgroup
/dev/sda1                                             xfs            1014M 
145M  870M  15% /boot
tmpfs                                                 tmpfs           783M    
0  783M   0% /run/user/0
rhsqa-grafton7-nic2.lab.eng.blr.redhat.com:/backupvol fuse.glusterfs   15T 
764G   14T   6% /mnt/backupvol_test


mount -t glusterfs rhsqa-grafton7-nic2.lab.eng.blr.redhat.com:/backupvol /mnt/backupvol_test/

[root at dhcp43-65 mnt]# rpm -qa | grep glusterfs
glusterfs-libs-6.0-16.el7.x86_64
glusterfs-6.0-16.el7.x86_64
glusterfs-client-xlators-6.0-16.el7.x86_64
glusterfs-fuse-6.0-16.el7.x86_64



The data was created using fio with fairly large files (just a background on
how the data was generated). Each file had a size of about 6-8GB.

[root at rhsqa-grafton7-nic2 b2]# ls
f.0.0  f.1.0  f.2.0  f.3.0  f.4.0  f.6.0  test
[root at rhsqa-grafton7-nic2 b2]# tree
.
├── f.0.0
├── f.1.0
├── f.2.0
├── f.3.0
├── f.4.0
├── f.6.0
└── test
    ├── f.5.0
    ├── f.7.0
    ├── f.8.0
    └── f.9.0

1 directory, 10 files

[root at rhsqa-grafton7-nic2 b2]# ls -lstah *
7.2G -rw-------. 2 root root 7.2G Oct 14 09:54 f.6.0
 18G -rw-------. 2 root root  18G Oct 14 09:54 f.1.0
7.2G -rw-------. 2 root root 7.2G Oct 14 09:54 f.0.0
7.3G -rw-------. 2 root root 7.3G Oct 14 09:54 f.3.0
7.3G -rw-------. 2 root root 7.3G Oct 14 09:54 f.2.0
7.3G -rw-------. 2 root root 7.3G Oct 14 09:54 f.4.0

test:
total 27G
   0 drwxr-xr-x. 2 root root   58 Oct 14 10:22 .
8.0K drwxr-xr-x. 4 root root 8.0K Oct 14 10:21 ..
7.0G -rw-------. 2 root root 7.0G Oct 10 10:44 f.7.0
7.0G -rw-------. 2 root root 7.0G Oct 10 10:44 f.9.0
6.2G -rw-------. 2 root root 6.2G Oct 10 10:44 f.5.0
6.9G -rw-------. 2 root root 6.9G Oct 10 10:44 f.8.0

--- Additional comment from bipin on 2019-10-14 19:39:22 IST ---

Logs @ http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1761486/

--- Additional comment from Susant Kumar Palai on 2019-10-15 15:35:37 IST ---

Reproducer:

- create a single-brick volume
- create 10 files, each of size 1GB
- add 5 bricks
- trigger rebalance: gluster v rebalance VOLNAME start force
- most likely the rebalance status will show a few files getting migrated
- kill and restart the glusterd process
- the size information will now show a random, overflowed value

[root at vm3 mnt]# gluster v rebalance test2 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost              624         2.8GB           712            11             0            completed        0:00:56
volume rebalance: test2: success
[root at vm3 mnt]# pkill gluster
[root at vm3 mnt]# glusterd
[root at vm3 mnt]# gluster v rebalance test2 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost              624     16384.0PB           712            11             0            completed        0:00:56
volume rebalance: test2: success
[root at vm3 mnt]# 


I am not entirely familiar with the glusterd code, but I guess the size data is
not fetched correctly from the glusterd store, and that is why it could be
printing uninitialized data. (I could be wrong.)

        fprintf(fp, "Volume%d.rebalance.id: %s\n", count, id_str);
        fprintf(fp, "Volume%d.rebalance.status: %s\n", count,
rebal_status_str);
        fprintf(fp, "Volume%d.rebalance.failures: %" PRIu64 "\n", count,
                volinfo->rebal.rebalance_failures);
        fprintf(fp, "Volume%d.rebalance.skipped: %" PRIu64 "\n", count,
                volinfo->rebal.skipped_files);
        fprintf(fp, "Volume%d.rebalance.lookedup: %" PRIu64 "\n", count,
                volinfo->rebal.lookedup_files);
        fprintf(fp, "Volume%d.rebalance.files: %" PRIu64 "\n", count,
                volinfo->rebal.rebalance_files);
        fprintf(fp, "Volume%d.rebalance.data: %s\n", count, rebal_data);
        fprintf(fp, "Volume%d.time_left: %" PRIu64 "\n", count,
                volinfo->rebal.time_left);


Moving this to Glusterd Component for a quick resolution.

--- Additional comment from Sanju on 2019-10-15 16:21:52 IST ---

Bipin/Susant,

Is the rebalance status showing incorrect values only after a glusterd restart,
or just 4-6 hours after the rebalance is complete?

Thanks,
Sanju

--- Additional comment from bipin on 2019-10-15 16:57:01 IST ---

Sanju,

In my test I never restarted glusterd. The issue was seen some hours after the
rebalance had completed successfully.

--- Additional comment from Susant Kumar Palai on 2019-10-15 17:36:38 IST ---

I restarted glusterd and it immediately showed wrong size info as mentioned in
the previous comment.

--- Additional comment from Sayalee on 2019-10-16 11:55:50 IST ---

Did below steps on 3.4.4 (3.12.2-47.5) build:

1) Create a 1X(4+2) volume.
2) Mount the volume using FUSE.
3) From mount point, create 10 files of 10G each. Wait for the IO to complete.
4) Add bricks to the volume to make it 2X(4+2).
5) Trigger rebalance and wait for it to complete.
6) Check the rebalance status once it is complete.
7) Wait for 4-5 hours and check the rebalance status again.

Observations:
============

Tue Oct 15 17:56:20 IST 2019
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                             10.70.35.95                0        0Bytes            10             0             0          in progress        0:05:33
                             10.70.35.47                0        0Bytes            10             0             0          in progress        0:05:33
                               localhost                0        0Bytes            10             0             0            completed        0:00:00
The estimated time for rebalance to complete will be unavailable for the first 10 minutes.
volume rebalance: say1: success

Tue Oct 15 18:06:21 IST 2019  
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                             10.70.35.95                3        30.0GB            10             0             0            completed        0:15:19
                             10.70.35.47                1        10.0GB            10             0             0            completed        0:13:41
                               localhost                0        0Bytes            10             0             0            completed        0:00:00
volume rebalance: say1: success

***Checked the status after some hours***

Tue Oct 15 23:46:27 IST 2019
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                             10.70.35.95                3        30.0GB            10             0             0            completed        0:15:19
                             10.70.35.47                1        10.0GB            10             0             0            completed        0:13:41
                               localhost                0        0Bytes            10             0             0            completed        0:00:00
volume rebalance: say1: success



Wed Oct 16 10:26:36 IST 2019
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                             10.70.35.95                3        30.0GB            10             0             0            completed        0:15:19
                             10.70.35.47                1        10.0GB            10             0             0            completed        0:13:41
                               localhost                0        0Bytes            10             0             0            completed        0:00:00
volume rebalance: say1: success


====> Did not hit the issue.

--- Additional comment from Sayalee on 2019-10-16 12:02:36 IST ---


Also tried out the reproducer steps from Comment 6 on the 3.4.4 (3.12.2-47.5)
build:


Observations:
============

[root at dhcp35-125 home]# gluster v rebal say2 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                             10.70.35.95                7         7.0GB            10             0             0            completed        0:01:04
                             10.70.35.47                0        0Bytes             1             0             0            completed        0:00:00
volume rebalance: say2: success
[root at dhcp35-125 home]# 

[root at dhcp35-125 home]# pkill gluster   <=== (NOTE: The glusterd process was
killed and started on all the nodes that were part of the volume, but the
observation is provided from just one node)

[root at dhcp35-125 home]# glusterd

[root at dhcp35-125 home]# pidof glusterd
2001

[root at dhcp35-125 home]# gluster v rebal say2 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                             10.70.35.95                7     16384.0PB            10             0             0            completed        0:01:04
                             10.70.35.47                0        0Bytes             1             0             0            completed        0:00:00
volume rebalance: say2: success

====> The issue was hit with the steps in Comment 6

--- Additional comment from Sanju on 2019-10-16 15:28:22 IST ---

Looks like this is a day-1 issue.

From gdb:
glusterd_store_retrieve_node_state (volinfo=volinfo at entry=0x555555e67fc0) at
glusterd-store.c:2980
2980                volinfo->rebal.rebalance_data = atoi(value);
4: value = 0x5555557dfea0 "3145728000"
3: key = 0x5555557e41d0 "size"
2: volinfo->rebal.rebalance_data = 0
1: volinfo->volname = "test1", '\000' <repeats 250 times>
(gdb) n
3048            GF_FREE(key);
4: value = 0x5555557dfea0 "3145728000"
3: key = 0x5555557e41d0 "size"
2: volinfo->rebal.rebalance_data = 18446744072560312320
1: volinfo->volname = "test1", '\000' <repeats 250 times>
(gdb) p atoi(value)
$20 = -1149239296
(gdb) p/u atoi(value)
$21 = 3145728000

The issue here is that atoi() returns a negative value because of integer
overflow. The gdb statements below prove it.

(gdb) set volinfo->rebal.rebalance_data=atoi("314572800")
(gdb) p volinfo->rebal.rebalance_data 
$41 = 314572800
(gdb) set volinfo->rebal.rebalance_data=atoi("3145728000")
(gdb) p volinfo->rebal.rebalance_data 
$42 = 18446744072560312320            <-- wrong value
(gdb) 

Will send out the patch to fix this.

Thanks,
Sanju
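
For illustration, here is a minimal standalone C sketch (not taken from the
glusterd sources) of the conversion seen in the gdb session above: atoi()
truncates a value larger than INT_MAX to a negative int, and assigning that
negative int to an unsigned 64-bit field converts it modulo 2^64 into the huge
number that gets rendered as 16384.0PB. The exact truncated value is what
glibc on x86_64 happens to produce; the out-of-range conversion itself is
implementation-defined.

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* the "size" string read back from node_state.info */
    const char *value = "3145728000";

    /* 3145728000 > INT_MAX, so atoi() wraps to -1149239296 on glibc/x86_64 */
    int as_int = atoi(value);

    /* assigning the negative int to a uint64_t field adds 2^64 */
    uint64_t rebalance_data = atoi(value);

    printf("atoi(\"%s\")       = %d\n", value, as_int);
    printf("stored as uint64_t = %" PRIu64 "\n", rebalance_data);
    /* expected output: -1149239296 and 18446744072560312320 (16384.0PB) */
    return 0;
}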

--- Additional comment from Sanju on 2019-10-16 16:54:37 IST ---

Even after using a suffix for unsigned int, I see the same wrong value:

(gdb) set volinfo->rebal.rebalance_data = atoi("3145728000")
(gdb) p/u volinfo->rebal.rebalance_data 
$57 = 18446744072560312320
(gdb) 

RCA:
When glusterd is restarted, it fetches information from /var/lib/glusterd/ and
restores it into its in-memory state. As part of this restore, glusterd reads
the contents of /var/lib/glusterd/vols/volname/node_state.info, where the
rebalance-related information is stored. Once control reaches
glusterd_store_retrieve_node_state(), it fetches all the key-value pairs one by
one; "size" is one such key. In the gdb output above, the value of size is
3145728000, and it is assigned to volinfo->rebal.rebalance_data after being
converted to an integer:

        volinfo->rebal.rebalance_data = atoi(value);

After this assignment, volinfo->rebal.rebalance_data holds
18446744072560312320. So the volinfo carries an incorrect size in memory, and
/var/lib/glusterd/vols/test1/node_state.info is also updated with this wrong
value. Since 18446744072560312320 bytes is just under 2^64 bytes, i.e. about
16384 PiB, this value is shown as 16384.0PB in the rebalance status output.

[root at dhcp35-84 ~]# cat /var/lib/glusterd/vols/test1/node_state.info
rebalance_status=1
status=3
rebalance_op=19
rebalance-id=47c58350-7fc1-4bbf-96aa-01ca7ec0c4e3
rebalanced-files=2
size=18446744072560312320
scanned=6
failures=0
skipped=1
run-time=318.000000
[root at dhcp35-84 ~]# 

Thanks,
Sanju
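
A minimal sketch of the direction such a fix could take, assuming the 64-bit
counters in node_state.info are parsed with strtoull() instead of atoi(); this
is not the actual patch, and the helper name parse_uint64() is hypothetical:

#include <errno.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a base-10 unsigned 64-bit value such as the "size" key from
 * node_state.info. Returns 0 on success, -1 on a parse failure. */
static int
parse_uint64(const char *value, uint64_t *out)
{
    char *end = NULL;

    errno = 0;
    *out = strtoull(value, &end, 10);
    if (errno != 0 || end == value || *end != '\0')
        return -1;
    return 0;
}

/* Usage, mirroring the assignment shown in the RCA above:
 *
 *     if (parse_uint64(value, &volinfo->rebal.rebalance_data) != 0)
 *         volinfo->rebal.rebalance_data = 0;
 *
 * parse_uint64("3145728000", ...) yields 3145728000 rather than
 * 18446744072560312320.
 */

The actual change may instead use sscanf() with SCNu64 or an existing
string-to-uint64 helper from libglusterfs; the point from the RCA is that the
parse has to go through a 64-bit-wide conversion.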


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1761486
[Bug 1761486] DHT- gluster rebalance status shows wrong data size after
rebalance is completed successfully