[Bugs] [Bug 1687051] gluster volume heal failed when online upgrading from 3.12 to 5.x and when rolling back online upgrade from 4.1.4 to 3.12.15

bugzilla@redhat.com
Tue Mar 12 16:20:27 UTC 2019


https://bugzilla.redhat.com/show_bug.cgi?id=1687051



--- Comment #15 from Amgad <amgad.saleh@nokia.com> ---
Case 2) online upgrade from 3.12.15 to 4.1.4 and rollback:

A) I have a cluster of 3 replicas: gfs-1 (10.76.153.206), gfs-2
(10.76.153.213), and gfs-3new (10.76.153.207), running 3.12.15.
When gfs-1 was online upgraded from 3.12.15 to 4.1.4, heal succeeded. Continuing
with gfs-2 and then gfs-3new, the online upgrade and heal succeeded as well.
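
A typical per-node sequence for this kind of online upgrade looks roughly like the
following (a minimal sketch, one replica at a time; the yum/systemctl invocations and
the assumption that a 4.1 repo is already enabled are illustrative, not the exact
commands used here):

systemctl stop glusterd                    # stop the management daemon on this node only
pkill glusterfs; pkill glusterfsd          # stop remaining self-heal/brick processes
yum -y update glusterfs\*                  # install the 4.1.4 packages from the enabled repo
systemctl start glusterd                   # bricks and self-heal daemon come back on restart
gluster peer status                        # confirm the node rejoined the cluster
for i in `gluster volume list`; do gluster volume heal $i; done   # heal before touching the next node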

1) Here are the outputs after gfs-1 was online upgraded from 3.12.15 to 4.1.4.
Logs uploaded: gfs-1-logs-gfs-1-UpgFrom3.12.15-to-4.1.4.tgz,
gfs-2-logs-gfs-1-UpgFrom3.12.15-to-4.1.4.tgz, and
gfs-3new-logs-gfs-1-UpgFrom3.12.15-to-4.1.4.tgz - see the latest upgrade case.

[root@gfs-1 ansible1]# gluster volume info

Volume Name: glustervol1
Type: Replicate
Volume ID: 28b16639-7c58-4f28-975b-5ea17274e87b
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.76.153.206:/mnt/data1/1
Brick2: 10.76.153.213:/mnt/data1/1
Brick3: 10.76.153.207:/mnt/data1/1
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet

Volume Name: glustervol2
Type: Replicate
Volume ID: 8637eee7-20b7-4a88-b497-192b4626093d
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.76.153.206:/mnt/data2/2
Brick2: 10.76.153.213:/mnt/data2/2
Brick3: 10.76.153.207:/mnt/data2/2
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet

Volume Name: glustervol3
Type: Replicate
Volume ID: f8c21e8c-0a9a-40ba-b098-931a4219de0f
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.76.153.206:/mnt/data3/3
Brick2: 10.76.153.213:/mnt/data3/3
Brick3: 10.76.153.207:/mnt/data3/3
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
[root@gfs-1 ansible1]# 
[root@gfs-1 ansible1]# gluster volume status
Status of volume: glustervol1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.76.153.206:/mnt/data1/1            49155     0          Y       30270
Brick 10.76.153.213:/mnt/data1/1            49152     0          Y       12726
Brick 10.76.153.207:/mnt/data1/1            49152     0          Y       26671
Self-heal Daemon on localhost               N/A       N/A        Y       30260
Self-heal Daemon on 10.76.153.213           N/A       N/A        Y       12716
Self-heal Daemon on 10.76.153.207           N/A       N/A        Y       26661

Task Status of Volume glustervol1
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: glustervol2
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.76.153.206:/mnt/data2/2            49156     0          Y       30279
Brick 10.76.153.213:/mnt/data2/2            49153     0          Y       12735
Brick 10.76.153.207:/mnt/data2/2            49153     0          Y       26680
Self-heal Daemon on localhost               N/A       N/A        Y       30260
Self-heal Daemon on 10.76.153.213           N/A       N/A        Y       12716
Self-heal Daemon on 10.76.153.207           N/A       N/A        Y       26661

Task Status of Volume glustervol2
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: glustervol3
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.76.153.206:/mnt/data3/3            49157     0          Y       30288
Brick 10.76.153.213:/mnt/data3/3            49154     0          Y       12744
Brick 10.76.153.207:/mnt/data3/3            49154     0          Y       26689
Self-heal Daemon on localhost               N/A       N/A        Y       30260
Self-heal Daemon on 10.76.153.213           N/A       N/A        Y       12716
Self-heal Daemon on 10.76.153.207           N/A       N/A        Y       26661

Task Status of Volume glustervol3
------------------------------------------------------------------------------
There are no active volume tasks

[root@gfs-1 ansible1]# for i in `gluster volume list`; do gluster volume heal $i; done
Launching heal operation to perform index self heal on volume glustervol1 has
been successful 
Use heal info commands to check status.
Launching heal operation to perform index self heal on volume glustervol2 has
been successful 
Use heal info commands to check status.
Launching heal operation to perform index self heal on volume glustervol3 has
been successful 
Use heal info commands to check status.
[root@gfs-1 ansible1]# 
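
To confirm nothing was left pending after triggering the heals, the status can be
checked per volume as the CLI output suggests (a quick sketch; only the standard
"heal info" commands are assumed):

for i in `gluster volume list`; do
    gluster volume heal $i info            # lists entries still pending heal, per brick
    gluster volume heal $i info summary    # compact per-brick counts, on releases that support it
done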
=======================

2) Here are the outputs after all three nodes were online upgraded from 3.12.15 to 4.1.4.
For logs, see the uploads for B), which include this case as well.

[root@gfs-3new ansible1]# gluster volume info

Volume Name: glustervol1
Type: Replicate
Volume ID: 28b16639-7c58-4f28-975b-5ea17274e87b
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.76.153.206:/mnt/data1/1
Brick2: 10.76.153.213:/mnt/data1/1
Brick3: 10.76.153.207:/mnt/data1/1
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet

Volume Name: glustervol2
Type: Replicate
Volume ID: 8637eee7-20b7-4a88-b497-192b4626093d
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.76.153.206:/mnt/data2/2
Brick2: 10.76.153.213:/mnt/data2/2
Brick3: 10.76.153.207:/mnt/data2/2
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet

Volume Name: glustervol3
Type: Replicate
Volume ID: f8c21e8c-0a9a-40ba-b098-931a4219de0f
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.76.153.206:/mnt/data3/3
Brick2: 10.76.153.213:/mnt/data3/3
Brick3: 10.76.153.207:/mnt/data3/3
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
[root@gfs-3new ansible1]# 
[root@gfs-3new ansible1]# gluster volume status
Status of volume: glustervol1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.76.153.206:/mnt/data1/1            49155     0          Y       30270
Brick 10.76.153.213:/mnt/data1/1            49155     0          Y       13874
Brick 10.76.153.207:/mnt/data1/1            49155     0          Y       28144
Self-heal Daemon on localhost               N/A       N/A        Y       28134
Self-heal Daemon on 10.76.153.213           N/A       N/A        Y       13864
Self-heal Daemon on 10.76.153.206           N/A       N/A        Y       30260

Task Status of Volume glustervol1
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: glustervol2
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.76.153.206:/mnt/data2/2            49156     0          Y       30279
Brick 10.76.153.213:/mnt/data2/2            49156     0          Y       13883
Brick 10.76.153.207:/mnt/data2/2            49156     0          Y       28153
Self-heal Daemon on localhost               N/A       N/A        Y       28134
Self-heal Daemon on 10.76.153.206           N/A       N/A        Y       30260
Self-heal Daemon on 10.76.153.213           N/A       N/A        Y       13864

Task Status of Volume glustervol2
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: glustervol3
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.76.153.206:/mnt/data3/3            49157     0          Y       30288
Brick 10.76.153.213:/mnt/data3/3            49157     0          Y       13892
Brick 10.76.153.207:/mnt/data3/3            49157     0          Y       28162
Self-heal Daemon on localhost               N/A       N/A        Y       28134
Self-heal Daemon on 10.76.153.206           N/A       N/A        Y       30260
Self-heal Daemon on 10.76.153.213           N/A       N/A        Y       13864

Task Status of Volume glustervol3
------------------------------------------------------------------------------
There are no active volume tasks

[root@gfs-3new ansible1]# 
[root@gfs-3new ansible1]# for i in `gluster volume list`; do gluster volume heal $i; done
Launching heal operation to perform index self heal on volume glustervol1 has
been successful 
Use heal info commands to check status.
Launching heal operation to perform index self heal on volume glustervol2 has
been successful 
Use heal info commands to check status.
Launching heal operation to perform index self heal on volume glustervol3 has
been successful 
Use heal info commands to check status.
[root@gfs-3new ansible1]# 

=======================

B) Here are the outputs after gfs-1 was rolled back online from 4.1.4 to 3.12.15.
The rollback succeeded, but "gluster volume heal" was unsuccessful.
Logs uploaded: gfs-1-logs-gfs-1-RollbackFrom4.1.4-to-3.12.15.tgz,
gfs-2-logs-gfs-1-RollbackFrom4.1.4-to-3.12.15.tgz, and
gfs-3new-logs-gfs-1-RollbackFrom4.1.4-to-3.12.15.tgz - these also include case 2)
right before the rollback.
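
The rollback on gfs-1 mirrors the upgrade sequence in reverse (a minimal sketch;
the yum downgrade invocation and the assumption that the 3.12.15 packages are
available from the enabled repos are illustrative, not the exact commands used here):

systemctl stop glusterd                    # take only gfs-1 down
pkill glusterfs; pkill glusterfsd
yum -y downgrade glusterfs\*               # assumes 3.12.15 is the version the repos offer
systemctl start glusterd
gluster peer status                        # gfs-1 rejoins the 4.1.4 peers while running 3.12.15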

[root@gfs-1 ansible1]# gluster volume info

Volume Name: glustervol1
Type: Replicate
Volume ID: 28b16639-7c58-4f28-975b-5ea17274e87b
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.76.153.206:/mnt/data1/1
Brick2: 10.76.153.213:/mnt/data1/1
Brick3: 10.76.153.207:/mnt/data1/1
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off

Volume Name: glustervol2
Type: Replicate
Volume ID: 8637eee7-20b7-4a88-b497-192b4626093d
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.76.153.206:/mnt/data2/2
Brick2: 10.76.153.213:/mnt/data2/2
Brick3: 10.76.153.207:/mnt/data2/2
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off

Volume Name: glustervol3
Type: Replicate
Volume ID: f8c21e8c-0a9a-40ba-b098-931a4219de0f
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.76.153.206:/mnt/data3/3
Brick2: 10.76.153.213:/mnt/data3/3
Brick3: 10.76.153.207:/mnt/data3/3
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
[root@gfs-1 ansible1]# 
[root@gfs-1 ansible1]# gluster volume status
Status of volume: glustervol1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.76.153.206:/mnt/data1/1            49152     0          Y       32078
Brick 10.76.153.213:/mnt/data1/1            49155     0          Y       13874
Brick 10.76.153.207:/mnt/data1/1            49155     0          Y       28144
Self-heal Daemon on localhost               N/A       N/A        Y       32068
Self-heal Daemon on 10.76.153.213           N/A       N/A        Y       13864
Self-heal Daemon on 10.76.153.207           N/A       N/A        Y       28134

Task Status of Volume glustervol1
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: glustervol2
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.76.153.206:/mnt/data2/2            49153     0          Y       32087
Brick 10.76.153.213:/mnt/data2/2            49156     0          Y       13883
Brick 10.76.153.207:/mnt/data2/2            49156     0          Y       28153
Self-heal Daemon on localhost               N/A       N/A        Y       32068
Self-heal Daemon on 10.76.153.213           N/A       N/A        Y       13864
Self-heal Daemon on 10.76.153.207           N/A       N/A        Y       28134

Task Status of Volume glustervol2
------------------------------------------------------------------------------
There are no active volume tasks

Status of volume: glustervol3
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.76.153.206:/mnt/data3/3            49154     0          Y       32096
Brick 10.76.153.213:/mnt/data3/3            49157     0          Y       13892
Brick 10.76.153.207:/mnt/data3/3            49157     0          Y       28162
Self-heal Daemon on localhost               N/A       N/A        Y       32068
Self-heal Daemon on 10.76.153.213           N/A       N/A        Y       13864
Self-heal Daemon on 10.76.153.207           N/A       N/A        Y       28134

Task Status of Volume glustervol3
------------------------------------------------------------------------------
There are no active volume tasks

[root@gfs-1 ansible1]# for i in `gluster volume list`; do gluster volume heal $i; done
Launching heal operation to perform index self heal on volume glustervol1 has
been unsuccessful:
Commit failed on 10.76.153.207. Please check log file for details.
Commit failed on 10.76.153.213. Please check log file for details.
Launching heal operation to perform index self heal on volume glustervol2 has
been unsuccessful:
Commit failed on 10.76.153.213. Please check log file for details.
Commit failed on 10.76.153.207. Please check log file for details.
Launching heal operation to perform index self heal on volume glustervol3 has
been unsuccessful:
Commit failed on 10.76.153.207. Please check log file for details.
Commit failed on 10.76.153.213. Please check log file for details.
[root@gfs-1 ansible1]#
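
Since the CLI only reports "Commit failed on <peer>", the actual error has to come from
the peers that rejected the commit. A quick way to look (a sketch, assuming the default
log locations):

# On 10.76.153.213 and 10.76.153.207, check glusterd's log around the time the heal was launched:
grep -iE "heal|commit" /var/log/glusterfs/glusterd.log | tail -50
# The self-heal daemon log may also show why the heal operation was rejected:
tail -50 /var/log/glusterfs/glustershd.log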
