[Bugs] [Bug 1318136] New: Distributed-Replicate sharding corrupt VMs in ESXi

bugzilla at redhat.com bugzilla at redhat.com
Wed Mar 16 07:35:41 UTC 2016


https://bugzilla.redhat.com/show_bug.cgi?id=1318136

            Bug ID: 1318136
           Summary: Distributed-Replicate sharding corrupt VMs in ESXi
           Product: GlusterFS
           Version: 3.7.8
         Component: glusterd
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: mahdi.adnan at outlook.com
                CC: bugs at gluster.org



Description of problem: Creating a shard-enabled Distributed-Replicate volume
and mounting it in ESXi leads to corrupted VM disks. I cannot move, migrate,
clone, or create a VM on the mounted volume.


Version-Release number of selected component (if applicable): Gluster 3.7.8 -
CentOS 7.2 - ESXi 6.0 U1


How reproducible:


Steps to Reproduce:
1. Create a Distributed-Replicate volume, enable sharding, and add it to the
virt group.
2. Mount the volume in ESXi.
3. Clone a VM or create a new VM in the mounted volume.
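
For reference, step 1 might correspond to gluster CLI invocations like the ones
assembled below. This is only a sketch, not the reporter's exact commands: the
replica count of 2 is inferred from the "3 x 2 = 6" line in the volume info,
the trailing "force" is an assumption (all bricks sit on one server, which the
CLI normally refuses for replicas), and the commands are printed rather than
executed.

```python
import shlex

VOLUME = "m"
HOST = "192.168.208.138"
# Brick paths as listed in the volume info of this report.
BRICKS = [f"{HOST}:/mnt/b{i}/{VOLUME}" for i in range(1, 7)]


def reproduction_commands(volume=VOLUME, bricks=BRICKS, replica=2):
    """Return the gluster CLI calls for step 1 as argument lists (not executed)."""
    return [
        # replica=2 and "force" are assumptions, see the note above.
        ["gluster", "volume", "create", volume, "replica", str(replica), *bricks, "force"],
        ["gluster", "volume", "set", volume, "features.shard", "on"],
        ["gluster", "volume", "set", volume, "features.shard-block-size", "256MB"],
        ["gluster", "volume", "set", volume, "group", "virt"],
        ["gluster", "volume", "start", volume],
    ]


if __name__ == "__main__":
    for cmd in reproduction_commands():
        print(shlex.join(cmd))
```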

Actual results:
Error: the virtual disk is either corrupted or not a supported format.


Expected results:
The VM is created or migrated without errors.

Additional info:

Volume Info:

Volume Name: m
Type: Distributed-Replicate
Volume ID: db22b441-089d-413d-bbc7-b65dfe559ef9
Status: Started
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: 192.168.208.138:/mnt/b1/m
Brick2: 192.168.208.138:/mnt/b2/m
Brick3: 192.168.208.138:/mnt/b3/m
Brick4: 192.168.208.138:/mnt/b4/m
Brick5: 192.168.208.138:/mnt/b5/m
Brick6: 192.168.208.138:/mnt/b6/m
Options Reconfigured:
features.shard-block-size: 256MB
cluster.server-quorum-type: server
cluster.quorum-type: auto
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
features.shard: on
performance.readdir-ahead: on
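
A quick consistency check of the volume info above, as a rough sketch: it
parses the brick layout line and the brick list (copied from this report) and
shows that all six bricks live on a single host, so each replica pair shares
one server. The helper name summarize is hypothetical.

```python
import re

VOLUME_INFO = """\
Type: Distributed-Replicate
Number of Bricks: 3 x 2 = 6
Brick1: 192.168.208.138:/mnt/b1/m
Brick2: 192.168.208.138:/mnt/b2/m
Brick3: 192.168.208.138:/mnt/b3/m
Brick4: 192.168.208.138:/mnt/b4/m
Brick5: 192.168.208.138:/mnt/b5/m
Brick6: 192.168.208.138:/mnt/b6/m
"""


def summarize(info):
    """Extract the distribute/replica counts and the set of brick hosts."""
    m = re.search(r"Number of Bricks:\s*(\d+)\s*x\s*(\d+)\s*=\s*(\d+)", info)
    subvols, replica, total = map(int, m.groups())
    bricks = re.findall(r"^Brick\d+:\s*(\S+?):(/\S+)$", info, flags=re.M)
    return {
        "subvolumes": subvols,
        "replica": replica,
        "total": total,
        "listed": len(bricks),
        "hosts": {host for host, _path in bricks},
    }


if __name__ == "__main__":
    print(summarize(VOLUME_INFO))
```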


NFS.log:

[2016-03-16 07:30:04.770129] I [MSGID: 109036]
[dht-common.c:8043:dht_log_new_layout_for_dir_selfheal] 0-m-dht: Setting layout
of /Asterisk-B with [Subvol_name: m-replicate-0, Err: -1 , Start: 2863311530 ,
Stop: 4294967295 , Hash: 1 ], [Subvol_name: m-replicate-1, Err: -1 , Start: 0 ,
Stop: 1431655764 , Hash: 1 ], [Subvol_name: m-replicate-2, Err: -1 , Start:
1431655765 , Stop: 2863311529 , Hash: 1 ], 
[2016-03-16 07:30:05.192586] I [MSGID: 109066] [dht-rename.c:1413:dht_rename]
0-m-dht: renaming /Asterisk-B/Asterisk-B-000001.vmdk~
(hash=m-replicate-0/cache=m-replicate-0) => /Asterisk-B/Asterisk-B-000001.vmdk
(hash=m-replicate-0/cache=m-replicate-0)
[2016-03-16 07:30:05.326397] W [MSGID: 112032] [nfs3.c:3622:nfs3svc_rmdir_cbk]
0-nfs: 610b075: /Asterisk-B => -1 (Directory not empty) [Directory not empty]
[2016-03-16 07:30:05.329332] W [MSGID: 112032] [nfs3.c:3622:nfs3svc_rmdir_cbk]
0-nfs: 610b079: /Asterisk-B => -1 (Directory not empty) [Directory not empty]
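
The DHT layout logged for /Asterisk-B can be checked for gaps or overlaps. The
sketch below parses the three Start/Stop ranges from the first log entry
(re-joined onto one string here) and confirms that they tile the 32-bit hash
space without holes, so the layout itself looks sane. The function names are
hypothetical.

```python
import re

LAYOUT_LOG = (
    "Setting layout of /Asterisk-B with "
    "[Subvol_name: m-replicate-0, Err: -1 , Start: 2863311530 , Stop: 4294967295 , Hash: 1 ], "
    "[Subvol_name: m-replicate-1, Err: -1 , Start: 0 , Stop: 1431655764 , Hash: 1 ], "
    "[Subvol_name: m-replicate-2, Err: -1 , Start: 1431655765 , Stop: 2863311529 , Hash: 1 ], "
)


def parse_layout(log):
    """Return (start, stop, subvolume) tuples sorted by start offset."""
    pattern = r"Subvol_name: (\S+), Err: -?\d+ , Start: (\d+) , Stop: (\d+)"
    return sorted(
        (int(start), int(stop), name) for name, start, stop in re.findall(pattern, log)
    )


def covers_hash_space(ranges, top=2**32 - 1):
    """True if the ranges are contiguous and span 0..top exactly."""
    expected = 0
    for start, stop, _name in ranges:
        if start != expected or stop < start:
            return False
        expected = stop + 1
    return expected == top + 1


if __name__ == "__main__":
    ranges = parse_layout(LAYOUT_LOG)
    for start, stop, name in ranges:
        print(f"{name}: {start:#010x}..{stop:#010x}")
    print("covers full hash space:", covers_hash_space(ranges))
```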


ESXi VMkernel.log:

2016-03-16T07:30:05.605Z cpu35:32825)WARNING: NFS: 4566: Short read for object
b00f 60 281d92d6 653bd40b 4c474f3a 41b422db 3d419d08 5db6c7bb f99e55fe 5c536468
dfe44d99a34381c7 f652310f 0 431200000000 offset: 0x0 requested: 0x200 read:
0x94
2016-03-16T07:30:05.608Z cpu34:35990)WARNING: NFS: 4566: Short read for object
b00f 60 281d92d6 653bd40b 4c474f3a 41b422db 3d419d08 5db6c7bb f99e55fe 5c536468
dfe44d99a34381c7 f652310f 0 431200000000 offset: 0x0 requested: 0x200 read:
0x94
2016-03-16T07:30:05.610Z cpu34:35990)WARNING: NFS: 4566: Short read for object
b00f 60 281d92d6 653bd40b 4c474f3a 41b422db 3d419d08 5db6c7bb f99e55fe 5c536468
dfe44d99a34381c7 f652310f 0 431200000000 offset: 0x0 requested: 0x200 read:
0x94
2016-03-16T07:30:07.579Z cpu6:33451)NMP: nmp_ResetDeviceLogThrottling:3345:
last error status from device naa.618e7283727721a01da402e8058b1c21 repeated 7
times
2016-03-16T07:30:26.724Z cpu4:33530)NMP: nmp_ThrottleLogForDevice:3178: Cmd
0x1a (0x439e182fa080, 0) to dev "naa.618e7283727721a01da402e8058b1c21" on path
"vmhba0:C2:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
Act:NONE
2016-03-16T07:30:30.145Z cpu4:33530)ScsiDeviceIO: 2645: Cmd(0x439e16d9cb40)
0x1a, CmdSN 0x62ff from world 0 to dev "naa.618e7283727721a01da402e8058b1c21"
failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
2016-03-16T07:30:30.149Z cpu4:33530)ScsiDeviceIO: 2645: Cmd(0x439e15dabc40)
0x1a, CmdSN 0x6304 from world 0 to dev "naa.618e7283727721a01da402e8058b1c21"
failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
2016-03-16T07:30:30.155Z cpu4:33530)ScsiDeviceIO: 2645: Cmd(0x439e17ea0100)
0x1a, CmdSN 0x6309 from world 0 to dev "naa.618e7283727721a01da402e8058b1c21"
failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
2016-03-16T07:30:30.785Z cpu21:89337)WARNING: NFS: 2208: Failed to get
attributes (No connection)
2016-03-16T07:30:30.785Z cpu21:89337)NFS: 2264: Failed to get object 60
f73a15b1 1413267 4c474f3a 9d4bd56f df417ad4 5f9b7995 f70bd841 0 0 1000000 0 0
:No connection
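
To make the "Short read" warnings easier to interpret, the sketch below decodes
the hexadecimal fields of one such line (re-joined onto one string here): the
NFS client requested 0x200 = 512 bytes at offset 0 and received only
0x94 = 148. It decodes the numbers only and makes no claim about the cause.
The function name is hypothetical.

```python
import re

SHORT_READ = (
    "2016-03-16T07:30:05.605Z cpu35:32825)WARNING: NFS: 4566: Short read for object "
    "b00f 60 281d92d6 653bd40b 4c474f3a 41b422db 3d419d08 5db6c7bb f99e55fe 5c536468 "
    "dfe44d99a34381c7 f652310f 0 431200000000 offset: 0x0 requested: 0x200 read: 0x94"
)


def decode_short_read(line):
    """Return offset, requested, read, and missing byte counts as integers."""
    m = re.search(
        r"offset: (0x[0-9a-f]+) requested: (0x[0-9a-f]+) read: (0x[0-9a-f]+)", line
    )
    if m is None:
        raise ValueError("not a short-read line")
    offset, requested, read = (int(value, 16) for value in m.groups())
    return {
        "offset": offset,
        "requested": requested,
        "read": read,
        "missing": requested - read,
    }


if __name__ == "__main__":
    print(decode_short_read(SHORT_READ))
```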


