[Bugs] [Bug 1468261] Regression: non-disruptive(in-service) upgrade on EC volume fails

bugzilla at redhat.com bugzilla at redhat.com
Wed Jul 12 07:33:33 UTC 2017


https://bugzilla.redhat.com/show_bug.cgi?id=1468261

Ashish Pandey <aspandey at redhat.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |aspandey at redhat.com



--- Comment #6 from Ashish Pandey <aspandey at redhat.com> ---

Description of problem:
====================
A non-disruptive (in-service) upgrade of an EC volume fails with client I/O
errors; this is a regression.


Client IO:
tar: linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_handles.h: Cannot
open: Input/output error
linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_import.h
tar: linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_import.h: Cannot
open: Input/output error
linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_intent.h
tar: linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_intent.h: Cannot
open: Input/output error
linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_kernelcomm.h
tar: linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_kernelcomm.h:
Cannot open: Input/output error
linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_lib.h
tar: linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_lib.h: Cannot
open: Input/output error
linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_linkea.h
tar: linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_linkea.h: Cannot
open: Input/output error
linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_lmv.h
tar: linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_lmv.h: Cannot
open: Input/output error
linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_log.h
tar: linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_log.h: Cannot
open: Input/output error
linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_mdc.h
tar: linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_mdc.h: Cannot
open: Input/output error
linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_mds.h
tar: linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_mds.h: Cannot
open: Input/output error
linux-4.11.7/drivers/staging/lustre/lustre/include/lustre_net.h



Client fuse logs:
[2017-06-27 06:31:41.488462] W [MSGID: 122035] [ec-common.c:464:ec_child_select]
0-ecv-disperse-0: Executing operation with some subvolumes unavailable (4)
[2017-06-27 06:31:41.492350] W [MSGID: 122040]
[ec-common.c:990:ec_prepare_update_cbk] 0-ecv-disperse-0: Failed to get size
and version [Input/output error]
[2017-06-27 06:31:41.495012] W [MSGID: 122035]
[ec-common.c:464:ec_child_select] 0-ecv-disperse-0: Executing operation with
some subvolumes unavailable (4)
[2017-06-27 06:31:41.498939] W [MSGID: 122040]
[ec-common.c:990:ec_prepare_update_cbk] 0-ecv-disperse-0: Failed to get size
and version [Input/output error]
[2017-06-27 06:31:41.500037] W [MSGID: 122035]
[ec-common.c:464:ec_child_select] 0-ecv-disperse-0: Executing operation with
some subvolumes unavailable (4)
[2017-06-27 06:31:41.501771] W [MSGID: 122040]
[ec-common.c:990:ec_prepare_update_cbk] 0-ecv-disperse-0: Failed to get size
and version [Input/output error]
[2017-06-27 06:31:41.502741] W [MSGID: 122035]
[ec-common.c:464:ec_child_select] 0-ecv-disperse-0: Executing operation with
some subvolumes unavailable (4)
[2017-06-27 06:31:41.510185] W [MSGID: 122040]
[ec-common.c:990:ec_prepare_update_cbk] 0-ecv-disperse-0: Failed to get size
and version [Input/output error]
[2017-06-27 06:31:41.512205] W [MSGID: 122035]
[ec-common.c:464:ec_child_select] 0-ecv-disperse-0: Executing operation with
some subvolumes unavailable (4)
[2017-06-27 06:31:41.517462] W [MSGID: 122040]
[ec-common.c:990:ec_prepare_update_cbk] 0-ecv-disperse-0: Failed to get size
and version [Input/output error]
[2017-06-27 06:31:41.520244] W [MSGID: 122035]
[ec-common.c:464:ec_child_select] 0-ecv-disperse-0: Executing operation with
some subvolumes unavailable (4)
[2017-06-27 06:31:41.522030] W [MSGID: 122040]
[ec-common.c:990:ec_prepare_update_cbk] 0-ecv-disperse-0: Failed to get size
and version [Input/output error]
[2017-06-27 06:31:41.530202] W [MSGID: 122035]
[ec-common.c:464:ec_child_select] 0-ecv-disperse-0: Executing operation with
some subvolumes unavailable (4)
[2017-06-27 06:31:41.533945] W [MSGID: 122040]
[ec-common.c:990:ec_prepare_update_cbk] 0-ecv-disperse-0: Failed to get size
and version [Input/output error]
[2017-06-27 06:31:41.536465] W [MSGID: 122035]
[ec-common.c:464:ec_child_select] 0-ecv-disperse-0: Executing operation with
some subvolumes unavailable (4)
[2017-06-27 06:31:41.539042] W [MSGID: 122040]
[ec-common.c:990:ec_prepare_update_cbk] 0-ecv-disperse-0: Failed to get size
and version [Input/output error]
[2017-06-27 06:31:41.540564] W [MSGID: 122035]
[ec-common.c:464:ec_child_select] 0-ecv-disperse-0: Executing operation with
some subvolumes unavailable (4)
[2017-06-27 06:31:41.544238] W [MSGID: 122040]
[ec-common.c:990:ec_prepare_update_cbk] 0-ecv-disperse-0: Failed to get size
and version [Input/output error]
[2017-06-27 06:31:41.545663] W [MSGID: 122035]
[ec-common.c:464:ec_child_select] 0-ecv-disperse-0: Executing operation with
some subvolumes unavailable (4)
[2017-06-27 06:31:41.550015] W [MSGID: 122040]
[ec-common.c:990:ec_prepare_update_cbk] 0-ecv-disperse-0: Failed to get size
and version [Input/output error]
[2017-06-27 06:31:41.552186] W [MSGID: 122035]
[ec-common.c:464:ec_child_select] 0-ecv-disperse-0:
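
The warnings above alternate between two message IDs, 122035 and 122040. A minimal Python sketch for summarizing such an excerpt follows. It assumes the value in parentheses after "some subvolumes unavailable" is a hexadecimal bitmask in which bit i stands for subvolume index i; the report does not state this. The helper names summarize and down_subvolumes are hypothetical.

```python
import re
from collections import Counter

# Patterns applied after the hard-wrapped log lines are joined back together.
MSGID_RE = re.compile(r"\[MSGID: (\d+)\]")
UNAVAILABLE_RE = re.compile(r"some subvolumes unavailable \(([0-9A-Fa-f]+)\)")


def summarize(log_text):
    """Return (MSGID counts, set of unavailable-subvolume masks) for an excerpt."""
    # The mail archive wraps each log entry over several lines, so collapse
    # all whitespace before matching.
    flat = " ".join(log_text.split())
    counts = Counter(MSGID_RE.findall(flat))
    masks = {int(m, 16) for m in UNAVAILABLE_RE.findall(flat)}
    return counts, masks


def down_subvolumes(mask):
    """Assumption: bit i of the mask marks subvolume index i (0-based) as down."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


sample = """
[2017-06-27 06:31:41.495012] W [MSGID: 122035]
[ec-common.c:464:ec_child_select] 0-ecv-disperse-0: Executing operation with
some subvolumes unavailable (4)
[2017-06-27 06:31:41.498939] W [MSGID: 122040]
[ec-common.c:990:ec_prepare_update_cbk] 0-ecv-disperse-0: Failed to get size
and version [Input/output error]
"""

counts, masks = summarize(sample)
print(dict(counts))                                  # one entry per MSGID
print([down_subvolumes(m) for m in sorted(masks)])   # indices flagged as down
```

Under that assumption, a mask of 4 flags only subvolume index 2, the third brick, which would be consistent with node #3 being taken down if the bricks are ordered by node.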


Version-Release number of selected component (if applicable):
============
3.8.4.28-->3.8.4.29
3.8.4.29-->3.8.4-31

How reproducible:
======
2/2

Steps to Reproduce:
1. Create a 4+2 EC volume across 6 nodes.
2. Keep a Linux kernel untar running on the client throughout the upgrade
procedure.
3. Upgrade node #1 and node #2 (kill glusterfsd and glusterfs, stop glusterd,
upgrade the RPMs, then start glusterd).
4. Wait for healing to complete.
5. Confirm that the heal has completed while the kernel untar is still running.
6. Upgrade node #3 (kill glusterfsd and glusterfs, stop glusterd).

At this step the client I/O fails with Input/output errors.
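
The reasoning behind the steps above can be sketched as a small check, assuming one brick per node (the report does not say so) and that a 4+2 volume can serve I/O while at least 4 of its 6 bricks are online. The parser assumes that `gluster volume heal <VOLNAME> info` prints one "Number of entries: N" line per brick. The function names, host names and brick paths are hypothetical.

```python
import re

# Assumption: heal-info output contains one "Number of entries: N" line per
# brick; only those lines are parsed here.
ENTRIES_RE = re.compile(r"^Number of entries:\s*(\d+)\s*$", re.MULTILINE)


def pending_heal_entries(heal_info_output):
    """Sum the per-brick pending heal entries found in heal-info text."""
    return sum(int(n) for n in ENTRIES_RE.findall(heal_info_output))


def can_take_down(total_bricks, redundancy, bricks_down, pending_entries):
    """Hypothetical gate for taking one more brick offline.

    Treated as safe only when no heals are pending and the bricks left
    online still number at least total_bricks - redundancy.
    """
    data_bricks = total_bricks - redundancy
    remaining = total_bricks - bricks_down - 1
    return pending_entries == 0 and remaining >= data_bricks


# Illustrative heal-info text; host names and brick paths are made up.
sample_heal_info = """\
Brick node1:/bricks/ecv
Status: Connected
Number of entries: 0

Brick node2:/bricks/ecv
Status: Connected
Number of entries: 0
"""

pending = pending_heal_entries(sample_heal_info)
# Step 6: nodes #1 and #2 are back and healed, so no brick is down yet.
print(can_take_down(total_bricks=6, redundancy=2, bricks_down=0,
                    pending_entries=pending))
```

By this check, taking node #3 down in step 6 leaves 5 of 6 bricks online with no heals pending, so I/O would be expected to continue. That expectation is why the observed errors point to a regression rather than to a loss of redundancy.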
