[Bugs] [Bug 1687051] gluster volume heal failed when online upgrading from 3.12 to 5.x and when rolling back online upgrade from 4.1.4 to 3.12.15

bugzilla at redhat.com bugzilla at redhat.com
Fri Mar 22 17:01:42 UTC 2019


https://bugzilla.redhat.com/show_bug.cgi?id=1687051

Amgad <amgad.saleh at nokia.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|needinfo?(amgad.saleh at nokia |
                   |.com)                       |



--- Comment #48 from Amgad <amgad.saleh at nokia.com> ---
That's not the case here.
In my scenario, heal was performed after the rollback (from 5.5 to 3.12.15) was
done on gfs-1 (gfs-2 and gfs-3new are still on 5.5), and all volumes/bricks were
up.

I also ran another test: during the rollback of gfs-1, a client generated
128 files. All of them existed on nodes gfs-2 and gfs-3new, but not on gfs-1.
Heal kept failing despite all bricks being online.
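(For scale, the client write was 128 x 1 MiB files; a minimal sketch of how that
load could be reproduced from a client mount is below. The mount point
/mnt/glustervol3 is an assumption, not the actual path the client used.)

# hypothetical client-side reproduction; file sizes match the 1048576-byte
# test_file.* entries listed on the bricks further down
for i in $(seq 0 127); do
    dd if=/dev/zero of=/mnt/glustervol3/test_file.$i bs=1M count=1
done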

Here are the outputs:
==================
1) On gfs-1, the one rolled-back to 3.12.15

[root at gfs-1 ansible2]# gluster --version
glusterfs 3.12.15
Repository revision: git://git.gluster.org/glusterfs.git
Copyright (c) 2006-2016 Red Hat, Inc. <https://www.gluster.org/>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
It is licensed to you under your choice of the GNU Lesser
General Public License, version 3 or any later version (LGPLv3
or later), or the GNU General Public License, version 2 (GPLv2),
in all cases as published by the Free Software Foundation.
[root at gfs-1 ansible2]# gluster volume status
Status of volume: glustervol1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.76.153.206:/mnt/data1/1            49152     0          Y       10712
Brick 10.76.153.213:/mnt/data1/1            49155     0          Y       20297
Brick 10.76.153.207:/mnt/data1/1            49155     0          Y       21395
Self-heal Daemon on localhost               N/A       N/A        Y       10703
Self-heal Daemon on 10.76.153.213           N/A       N/A        Y       20336
Self-heal Daemon on 10.76.153.207           N/A       N/A        Y       21422

Task Status of Volume glustervol1
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: glustervol2
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.76.153.206:/mnt/data2/2            49153     0          Y       10721
Brick 10.76.153.213:/mnt/data2/2            49156     0          Y       20312
Brick 10.76.153.207:/mnt/data2/2            49156     0          Y       21404
Self-heal Daemon on localhost               N/A       N/A        Y       10703
Self-heal Daemon on 10.76.153.213           N/A       N/A        Y       20336
Self-heal Daemon on 10.76.153.207           N/A       N/A        Y       21422

Task Status of Volume glustervol2
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: glustervol3
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.76.153.206:/mnt/data3/3            49154     0          Y       10731
Brick 10.76.153.213:/mnt/data3/3            49157     0          Y       20327
Brick 10.76.153.207:/mnt/data3/3            49157     0          Y       21413
Self-heal Daemon on localhost               N/A       N/A        Y       10703
Self-heal Daemon on 10.76.153.207           N/A       N/A        Y       21422
Self-heal Daemon on 10.76.153.213           N/A       N/A        Y       20336

Task Status of Volume glustervol3
------------------------------------------------------------------------------
There are no active volume tasks

[root at gfs-1 ansible2]# for i in glustervol1 glustervol2 glustervol3; do gluster
volume heal $i; done
Launching heal operation to perform index self heal on volume glustervol1 has
been unsuccessful:
Commit failed on 10.76.153.213. Please check log file for details.
Commit failed on 10.76.153.207. Please check log file for details.
Launching heal operation to perform index self heal on volume glustervol2 has
been unsuccessful:
Commit failed on 10.76.153.213. Please check log file for details.
Commit failed on 10.76.153.207. Please check log file for details.
Launching heal operation to perform index self heal on volume glustervol3 has
been unsuccessful:
Commit failed on 10.76.153.207. Please check log file for details.
Commit failed on 10.76.153.213. Please check log file for details.
[root at gfs-1 ansible2]#
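
Since the CLI only says "Commit failed ... Please check log file for details", the
next step would be to pull the error/warning lines from glusterd's log on the two
5.5 peers that rejected the commit. A hedged sketch, assuming the default log
location (the exact messages are not confirmed here):

# run on 10.76.153.213 and 10.76.153.207, the peers reporting "Commit failed"
grep -E ' (E|W) \[' /var/log/glusterfs/glusterd.log   | tail -n 30
grep -E ' (E|W) \[' /var/log/glusterfs/glustershd.log | tail -n 30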
[root at gfs-1 ansible2]# gluster volume heal glustervol3 info
Brick 10.76.153.206:/mnt/data3/3
Status: Connected
Number of entries: 0

Brick 10.76.153.213:/mnt/data3/3
/test_file.0 
/ 
/test_file.1 
/test_file.2 
/test_file.3 
/test_file.4 
..
/test_file.125 
/test_file.126 
/test_file.127 
Status: Connected
Number of entries: 129

Brick 10.76.153.207:/mnt/data3/3
/test_file.0 
/ 
/test_file.1 
/test_file.2 
/test_file.3 
/test_file.4 
...
/test_file.125 
/test_file.126 
/test_file.127 
Status: Connected
Number of entries: 129
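
(A shorter way to see the same pending-heal counts per brick, without listing
every entry, would be the statistics subcommand; this is a sketch, not output
from this setup:)

gluster volume heal glustervol3 statistics heal-count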

[root at gfs-1 ansible2]# ls -ltr /mnt/data3/3/         ====> none of the
test_file.* files exist
total 8
-rw-------. 2 root root  0 Mar 11 15:52 c2file3
-rw-------. 2 root root 66 Mar 11 16:37 c1file3
-rw-------. 2 root root 91 Mar 22 16:36 c1file2
[root at gfs-1 ansible2]#

2) On gfs-2, on 5.5
[root at gfs-2 ansible2]# gluster --version
glusterfs 5.5
Repository revision: git://git.gluster.org/glusterfs.git
Copyright (c) 2006-2016 Red Hat, Inc. <https://www.gluster.org/>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
It is licensed to you under your choice of the GNU Lesser
General Public License, version 3 or any later version (LGPLv3
or later), or the GNU General Public License, version 2 (GPLv2),
in all cases as published by the Free Software Foundation.
[root at gfs-2 ansible2]#
[root at gfs-2 ansible2]# gluster volume status
Status of volume: glustervol1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.76.153.206:/mnt/data1/1            49152     0          Y       10712
Brick 10.76.153.213:/mnt/data1/1            49155     0          Y       20297
Brick 10.76.153.207:/mnt/data1/1            49155     0          Y       21395
Self-heal Daemon on localhost               N/A       N/A        Y       20336
Self-heal Daemon on 10.76.153.206           N/A       N/A        Y       10703
Self-heal Daemon on 10.76.153.207           N/A       N/A        Y       21422

Task Status of Volume glustervol1
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: glustervol2
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.76.153.206:/mnt/data2/2            49153     0          Y       10721
Brick 10.76.153.213:/mnt/data2/2            49156     0          Y       20312
Brick 10.76.153.207:/mnt/data2/2            49156     0          Y       21404
Self-heal Daemon on localhost               N/A       N/A        Y       20336
Self-heal Daemon on 10.76.153.206           N/A       N/A        Y       10703
Self-heal Daemon on 10.76.153.207           N/A       N/A        Y       21422

Task Status of Volume glustervol2
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: glustervol3
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.76.153.206:/mnt/data3/3            49154     0          Y       10731
Brick 10.76.153.213:/mnt/data3/3            49157     0          Y       20327
Brick 10.76.153.207:/mnt/data3/3            49157     0          Y       21413
Self-heal Daemon on localhost               N/A       N/A        Y       20336
Self-heal Daemon on 10.76.153.206           N/A       N/A        Y       10703
Self-heal Daemon on 10.76.153.207           N/A       N/A        Y       21422

Task Status of Volume glustervol3
------------------------------------------------------------------------------
There are no active volume tasks

** gluster volume heal glustervol3 info has the same output as on gfs-1

[root at gfs-2 ansible2]# ls -ltr /mnt/data3/3/          =====> all the
test_file.* files are there
total 131080
-rw-------. 2 root root       0 Mar 11 15:52 c2file3
-rw-------. 2 root root      66 Mar 11 16:37 c1file3
-rw-------. 2 root root      91 Mar 22 16:36 c1file2
-rw-------. 2 root root 1048576 Mar 22 16:43 test_file.0
-rw-------. 2 root root 1048576 Mar 22 16:43 test_file.1
-rw-------. 2 root root 1048576 Mar 22 16:43 test_file.2
-rw-------. 2 root root 1048576 Mar 22 16:43 test_file.3
-rw-------. 2 root root 1048576 Mar 22 16:43 test_file.4
-rw-------. 2 root root 1048576 Mar 22 16:43 test_file.5
........
-rw-------. 2 root root 1048576 Mar 22 16:43 test_file.123
-rw-------. 2 root root 1048576 Mar 22 16:43 test_file.124
-rw-------. 2 root root 1048576 Mar 22 16:43 test_file.125
-rw-------. 2 root root 1048576 Mar 22 16:43 test_file.126
-rw-------. 2 root root 1048576 Mar 22 16:43 test_file.127
[root at gfs-2 ansible2]# 
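
To double-check that the pending-heal markers on the healthy copies really point
at the gfs-1 brick, one could dump the AFR xattrs of a file directly on this
brick. A sketch only; the exact trusted.afr.* key names depend on the volume's
client IDs:

getfattr -d -m . -e hex /mnt/data3/3/test_file.0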

3) On gfs-3new, same as gfs-2
[root at gfs-3new ansible2]# ls -ltr /mnt/data3/3/
total 131080
-rw-------. 2 root root       0 Mar 11 15:52 c2file3
-rw-------. 2 root root      66 Mar 11 16:37 c1file3
-rw-------. 2 root root      91 Mar 22 16:36 c1file2
-rw-------. 2 root root 1048576 Mar 22 16:43 test_file.0
-rw-------. 2 root root 1048576 Mar 22 16:43 test_file.1
-rw-------. 2 root root 1048576 Mar 22 16:43 test_file.2
-rw-------. 2 root root 1048576 Mar 22 16:43 test_file.3
-rw-------. 2 root root 1048576 Mar 22 16:43 test_file.4
-rw-------. 2 root root 1048576 Mar 22 16:43 test_file.5
.....
-rw-------. 2 root root 1048576 Mar 22 16:43 test_file.122
-rw-------. 2 root root 1048576 Mar 22 16:43 test_file.123
-rw-------. 2 root root 1048576 Mar 22 16:43 test_file.124
-rw-------. 2 root root 1048576 Mar 22 16:43 test_file.125
-rw-------. 2 root root 1048576 Mar 22 16:43 test_file.126
-rw-------. 2 root root 1048576 Mar 22 16:43 test_file.127
[root at gfs-3new ansible2]# 

I'm attaching the logs for this case as well.

Regards,
Amgad

-- 
You are receiving this mail because:
You are on the CC list for the bug.

