[Gluster-users] Replicated striped data loss

Mahdi Adnan mahdi.adnan at earthlinktele.com
Tue Mar 15 12:51:14 UTC 2016


And here's the log from the ESXi host:

2016-03-15T12:50:17.982Z cpu41:33260)WARNING: NFS: 4566: Short read for 
object b00f 60 6b5d5087 c5d851fc 4c474f3a 8efd48b3 9d4617b1 a556c0bc 
30c9df6b 16c12f10 250219b3f4146a7 431899d1 0 431200000000 offset: 0x0 
requested: 0x200 read: 0x95
2016-03-15T12:50:17.983Z cpu41:33260)WARNING: NFS: 4566: Short read for 
object b00f 60 6b5d5087 c5d851fc 4c474f3a 8efd48b3 9d4617b1 a556c0bc 
30c9df6b 16c12f10 250219b3f4146a7 431899d1 0 431200000000 offset: 0x0 
requested: 0x200 read: 0x95
2016-03-15T12:50:17.984Z cpu41:33260)WARNING: NFS: 4566: Short read for 
object b00f 60 6b5d5087 c5d851fc 4c474f3a 8efd48b3 9d4617b1 a556c0bc 
30c9df6b 16c12f10 250219b3f4146a7 431899d1 0 431200000000 offset: 0x0 
requested: 0x200 read: 0x95
2016-03-15T12:50:17.995Z cpu41:35990)WARNING: NFS: 4566: Short read for 
object b00f 60 6b5d5087 c5d851fc 4c474f3a 8efd48b3 9d4617b1 a556c0bc 
30c9df6b 16c12f10 250219b3f4146a7 431899d1 0 431200000000 offset: 0x0 
requested: 0x200 read: 0x95
2016-03-15T12:50:18.000Z cpu41:35990)WARNING: NFS: 4566: Short read for 
object b00f 60 6b5d5087 c5d851fc 4c474f3a 8efd48b3 9d4617b1 a556c0bc 
30c9df6b 16c12f10 250219b3f4146a7 431899d1 0 431200000000 offset: 0x0 
requested: 0x200 read: 0x95
2016-03-15T12:50:18.031Z cpu41:35990)WARNING: NFS: 4566: Short read for 
object b00f 60 6b5d5087 c5d851fc 4c474f3a 8efd48b3 9d4617b1 a556c0bc 
30c9df6b 16c12f10 250219b3f4146a7 431899d1 0 431200000000 offset: 0x0 
requested: 0x200 read: 0x95
2016-03-15T12:50:18.032Z cpu41:35990)WARNING: NFS: 4566: Short read for 
object b00f 60 6b5d5087 c5d851fc 4c474f3a 8efd48b3 9d4617b1 a556c0bc 
30c9df6b 16c12f10 250219b3f4146a7 431899d1 0 431200000000 offset: 0x0 
requested: 0x200 read: 0x95
2016-03-15T12:50:18.032Z cpu41:35990)WARNING: NFS: 4566: Short read for 
object b00f 60 6b5d5087 c5d851fc 4c474f3a 8efd48b3 9d4617b1 a556c0bc 
30c9df6b 16c12f10 250219b3f4146a7 431899d1 0 431200000000 offset: 0x0 
requested: 0x200 read: 0x95
2016-03-15T12:50:18.043Z cpu41:35990)WARNING: NFS: 4566: Short read for 
object b00f 60 6b5d5087 c5d851fc 4c474f3a 8efd48b3 9d4617b1 a556c0bc 
30c9df6b 16c12f10 250219b3f4146a7 431899d1 0 431200000000 offset: 0x0 
requested: 0x200 read: 0x95
2016-03-15T12:50:18.048Z cpu41:35990)WARNING: NFS: 4566: Short read for 
object b00f 60 6b5d5087 c5d851fc 4c474f3a 8efd48b3 9d4617b1 a556c0bc 
30c9df6b 16c12f10 250219b3f4146a7 431899d1 0 431200000000 offset: 0x0 
requested: 0x200 read: 0x95

Respectfully,
Mahdi A. Mahdi

Skype: mahdi.adnan at outlook.com
On 03/15/2016 03:06 PM, Mahdi Adnan wrote:
> [2016-03-15 14:12:01.421615] I [MSGID: 109036] 
> [dht-common.c:8043:dht_log_new_layout_for_dir_selfheal] 0-v-dht: 
> Setting layout of /New Virtual Machine_2 with [Subvol_name: 
> v-replicate-0, Err: -1 , Start: 0 , Stop: 1431655764 , Hash: 1 ], 
> [Subvol_name: v-replicate-1, Err: -1 , Start: 1431655765 , Stop: 
> 2863311529 , Hash: 1 ], [Subvol_name: v-replicate-2, Err: -1 , Start: 
> 2863311530 , Stop: 4294967295 , Hash: 1 ],
> [2016-03-15 14:12:02.001167] I [MSGID: 109066] 
> [dht-rename.c:1413:dht_rename] 0-v-dht: renaming /New Virtual 
> Machine_2/New Virtual Machine.vmdk~ 
> (hash=v-replicate-2/cache=v-replicate-2) => /New Virtual Machine_2/New 
> Virtual Machine.vmdk (hash=v-replicate-2/cache=v-replicate-2)
> [2016-03-15 14:12:02.248164] W [MSGID: 112032] 
> [nfs3.c:3622:nfs3svc_rmdir_cbk] 0-nfs: 3fed7d9f: /New Virtual 
> Machine_2 => -1 (Directory not empty) [Directory not empty]
> [2016-03-15 14:12:02.259015] W [MSGID: 112032] 
> [nfs3.c:3622:nfs3svc_rmdir_cbk] 0-nfs: 3fed7da3: /New Virtual 
> Machine_2 => -1 (Directory not empty) [Directory not empty]
>
>
> Respectfully,
> Mahdi A. Mahdi
>
> On 03/15/2016 03:03 PM, Krutika Dhananjay wrote:
>> Hmm ok. Could you share the nfs.log content?
>>
>> -Krutika
>>
>> On Tue, Mar 15, 2016 at 1:45 PM, Mahdi Adnan 
>> <mahdi.adnan at earthlinktele.com> wrote:
>>
>>     Okay, here's what I did:
>>
>>     Volume Name: v
>>     Type: Distributed-Replicate
>>     Volume ID: b348fd8e-b117-469d-bcc0-56a56bdfc930
>>     Status: Started
>>     Number of Bricks: 3 x 2 = 6
>>     Transport-type: tcp
>>     Bricks:
>>     Brick1: gfs001:/bricks/b001/v
>>     Brick2: gfs001:/bricks/b002/v
>>     Brick3: gfs001:/bricks/b003/v
>>     Brick4: gfs002:/bricks/b004/v
>>     Brick5: gfs002:/bricks/b005/v
>>     Brick6: gfs002:/bricks/b006/v
>>     Options Reconfigured:
>>     features.shard-block-size: 128MB
>>     features.shard: enable
>>     cluster.server-quorum-type: server
>>     cluster.quorum-type: auto
>>     network.remote-dio: enable
>>     cluster.eager-lock: enable
>>     performance.stat-prefetch: off
>>     performance.io-cache: off
>>     performance.read-ahead: off
>>     performance.quick-read: off
>>     performance.readdir-ahead: on
>>
>>
>>     Same error, and mounting using glusterfs still works just fine.
>>
>>     Respectfully,
>>     Mahdi A. Mahdi
>>
>>
>>     On 03/15/2016 11:04 AM, Krutika Dhananjay wrote:
>>>     OK but what if you use it with replication? Do you still see the
>>>     error? I think not.
>>>     Could you give it a try and tell me what you find?
>>>
>>>     -Krutika
>>>
>>>     On Tue, Mar 15, 2016 at 1:23 PM, Mahdi Adnan
>>>     <mahdi.adnan at earthlinktele.com> wrote:
>>>
>>>         Hi,
>>>
>>>         I have created the following volume;
>>>
>>>         Volume Name: v
>>>         Type: Distribute
>>>         Volume ID: 90de6430-7f83-4eda-a98f-ad1fabcf1043
>>>         Status: Started
>>>         Number of Bricks: 3
>>>         Transport-type: tcp
>>>         Bricks:
>>>         Brick1: gfs001:/bricks/b001/v
>>>         Brick2: gfs001:/bricks/b002/v
>>>         Brick3: gfs001:/bricks/b003/v
>>>         Options Reconfigured:
>>>         features.shard-block-size: 128MB
>>>         features.shard: enable
>>>         cluster.server-quorum-type: server
>>>         cluster.quorum-type: auto
>>>         network.remote-dio: enable
>>>         cluster.eager-lock: enable
>>>         performance.stat-prefetch: off
>>>         performance.io-cache: off
>>>         performance.read-ahead: off
>>>         performance.quick-read: off
>>>         performance.readdir-ahead: on
>>>
>>>         And after mounting it in ESXi and trying to clone a VM to
>>>         it, I got the same error.
>>>
>>>
>>>         Respectfully,
>>>         Mahdi A. Mahdi
>>>
>>>
>>>         On 03/15/2016 10:44 AM, Krutika Dhananjay wrote:
>>>>         Hi,
>>>>
>>>>         Do not use sharding and stripe together in the same volume
>>>>         because
>>>>         a) It is not recommended and there is no point in using
>>>>         both. Using sharding alone on your volume should work fine.
>>>>         b) Nobody tested it.
>>>>         c) Like Niels said, stripe feature is virtually deprecated.
>>>>
>>>>         I would suggest that you create an nx3 volume where n is
>>>>         the number of distribute subvols you prefer, enable group
>>>>         virt options on it, and enable sharding on it,
>>>>         set the shard-block-size that you feel appropriate and then
>>>>         just start off with VM image creation etc.
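>>>>
>>>>         For example, roughly (the volume name "vms", the third host
>>>>         gfs003, and the brick paths below are just placeholders):
>>>>
>>>>         # gluster volume create vms replica 3 \
>>>>               gfs001:/bricks/b001/vms gfs002:/bricks/b001/vms gfs003:/bricks/b001/vms
>>>>         # gluster volume set vms group virt
>>>>         # gluster volume set vms features.shard on
>>>>         # gluster volume set vms features.shard-block-size 128MB
>>>>         # gluster volume start vms
>>>>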
>>>>         If you run into any issues even after you do this, let us
>>>>         know and we'll help you out.
>>>>
>>>>         -Krutika
>>>>
>>>>         On Tue, Mar 15, 2016 at 1:07 PM, Mahdi Adnan
>>>>         <mahdi.adnan at earthlinktele.com> wrote:
>>>>
>>>>             Thanks Krutika,
>>>>
>>>>             I have deleted the volume and created a new one.
>>>>             I found that it may be an issue with NFS itself: I created
>>>>             a new striped volume, enabled sharding, and mounted it via
>>>>             glusterfs, and it worked just fine; if I mount it with NFS,
>>>>             it fails and gives me the same errors.
>>>>
>>>>             Respectfully,
>>>>             Mahdi A. Mahdi
>>>>
>>>>             On 03/15/2016 06:24 AM, Krutika Dhananjay wrote:
>>>>>             Hi,
>>>>>
>>>>>             So could you share the xattrs associated with the file
>>>>>             at
>>>>>             <BRICK_PATH>/.glusterfs/c3/e8/c3e88cc1-7e0a-4d46-9685-2d12131a5e1c
>>>>>
>>>>>             Here's what you need to execute:
>>>>>
>>>>>             # getfattr -d -m . -e hex
>>>>>             /mnt/b1/v/.glusterfs/c3/e8/c3e88cc1-7e0a-4d46-9685-2d12131a5e1c
>>>>>             on the first node and
>>>>>
>>>>>             # getfattr -d -m . -e hex
>>>>>             /mnt/b2/v/.glusterfs/c3/e8/c3e88cc1-7e0a-4d46-9685-2d12131a5e1c
>>>>>             on the second.
>>>>>
>>>>>
>>>>>             Also, it is normally advised to use a replica 3 volume
>>>>>             as opposed to a replica 2 volume, to guard against
>>>>>             split-brains.
>>>>>
>>>>>             -Krutika
>>>>>
>>>>>             On Mon, Mar 14, 2016 at 3:17 PM, Mahdi Adnan
>>>>>             <mahdi.adnan at earthlinktele.com> wrote:
>>>>>
>>>>>                 Sorry for the serial posting, but I got new logs that
>>>>>                 might help.
>>>>>
>>>>>                 These messages appear during the migration:
>>>>>
>>>>>                 /var/log/glusterfs/nfs.log
>>>>>
>>>>>
>>>>>                 [2016-03-14 09:45:04.573765] I [MSGID: 109036]
>>>>>                 [dht-common.c:8043:dht_log_new_layout_for_dir_selfheal]
>>>>>                 0-testv-dht: Setting layout of /New Virtual
>>>>>                 Machine_1 with [Subvol_name: testv-stripe-0, Err:
>>>>>                 -1 , Start: 0 , Stop: 4294967295 , Hash: 1 ],
>>>>>                 [2016-03-14 09:45:04.957499] E
>>>>>                 [shard.c:369:shard_modify_size_and_block_count]
>>>>>                 (-->/usr/lib64/glusterfs/3.7.8/xlator/cluster/distribute.so(dht_file_setattr_cbk+0x14f)
>>>>>                 [0x7f27a13c067f]
>>>>>                 -->/usr/lib64/glusterfs/3.7.8/xlator/features/shard.so(shard_common_setattr_cbk+0xcc)
>>>>>                 [0x7f27a116681c]
>>>>>                 -->/usr/lib64/glusterfs/3.7.8/xlator/features/shard.so(shard_modify_size_and_block_count+0xdd)
>>>>>                 [0x7f27a116584d] ) 0-testv-shard: Failed to get
>>>>>                 trusted.glusterfs.shard.file-size for
>>>>>                 c3e88cc1-7e0a-4d46-9685-2d12131a5e1c
>>>>>                 [2016-03-14 09:45:04.957577] W [MSGID: 112199]
>>>>>                 [nfs3-helpers.c:3418:nfs3_log_common_res]
>>>>>                 0-nfs-nfsv3: /New Virtual Machine_1/New Virtual
>>>>>                 Machine-flat.vmdk => (XID: 3fec5a26, SETATTR: NFS:
>>>>>                 22(Invalid argument for operation), POSIX:
>>>>>                 22(Invalid argument)) [Invalid argument]
>>>>>                 [2016-03-14 09:45:05.079657] E [MSGID: 112069]
>>>>>                 [nfs3.c:3649:nfs3_rmdir_resume] 0-nfs-nfsv3: No
>>>>>                 such file or directory: (192.168.221.52:826) testv :
>>>>>                 00000000-0000-0000-0000-000000000001
>>>>>
>>>>>
>>>>>
>>>>>                 Respectfully,
>>>>>                 Mahdi A. Mahdi
>>>>>
>>>>>                 On 03/14/2016 11:14 AM, Mahdi Adnan wrote:
>>>>>>                 So I have deployed a new server (Cisco UCS
>>>>>>                 C220 M4) and created a new volume:
>>>>>>
>>>>>>                 Volume Name: testv
>>>>>>                 Type: Stripe
>>>>>>                 Volume ID: 55cdac79-fe87-4f1f-90c0-15c9100fe00b
>>>>>>                 Status: Started
>>>>>>                 Number of Bricks: 1 x 2 = 2
>>>>>>                 Transport-type: tcp
>>>>>>                 Bricks:
>>>>>>                 Brick1: 10.70.0.250:/mnt/b1/v
>>>>>>                 Brick2: 10.70.0.250:/mnt/b2/v
>>>>>>                 Options Reconfigured:
>>>>>>                 nfs.disable: off
>>>>>>                 features.shard-block-size: 64MB
>>>>>>                 features.shard: enable
>>>>>>                 cluster.server-quorum-type: server
>>>>>>                 cluster.quorum-type: auto
>>>>>>                 network.remote-dio: enable
>>>>>>                 cluster.eager-lock: enable
>>>>>>                 performance.stat-prefetch: off
>>>>>>                 performance.io-cache: off
>>>>>>                 performance.read-ahead: off
>>>>>>                 performance.quick-read: off
>>>>>>                 performance.readdir-ahead: off
>>>>>>
>>>>>>                 Same error.
>>>>>>
>>>>>>                 Can anyone share with me the info of a working
>>>>>>                 striped volume?
>>>>>>
>>>>>>                 On 03/14/2016 09:02 AM, Mahdi Adnan wrote:
>>>>>>>                 I have a pool of two bricks in the same server;
>>>>>>>
>>>>>>>                 Volume Name: k
>>>>>>>                 Type: Stripe
>>>>>>>                 Volume ID: 1e9281ce-2a8b-44e8-a0c6-e3ebf7416b2b
>>>>>>>                 Status: Started
>>>>>>>                 Number of Bricks: 1 x 2 = 2
>>>>>>>                 Transport-type: tcp
>>>>>>>                 Bricks:
>>>>>>>                 Brick1: gfs001:/bricks/t1/k
>>>>>>>                 Brick2: gfs001:/bricks/t2/k
>>>>>>>                 Options Reconfigured:
>>>>>>>                 features.shard-block-size: 64MB
>>>>>>>                 features.shard: on
>>>>>>>                 cluster.server-quorum-type: server
>>>>>>>                 cluster.quorum-type: auto
>>>>>>>                 network.remote-dio: enable
>>>>>>>                 cluster.eager-lock: enable
>>>>>>>                 performance.stat-prefetch: off
>>>>>>>                 performance.io-cache: off
>>>>>>>                 performance.read-ahead: off
>>>>>>>                 performance.quick-read: off
>>>>>>>                 performance.readdir-ahead: off
>>>>>>>
>>>>>>>                 Same issue.
>>>>>>>                 glusterfs 3.7.8 built on Mar 10 2016 20:20:45.
>>>>>>>
>>>>>>>
>>>>>>>                 Respectfully,
>>>>>>>                 Mahdi A. Mahdi
>>>>>>>
>>>>>>>                 Systems Administrator
>>>>>>>                 IT Department
>>>>>>>                 Earthlink Telecommunications
>>>>>>>                 <https://www.facebook.com/earthlinktele>
>>>>>>>
>>>>>>>                 Cell: 07903316180
>>>>>>>                 Work: 3352
>>>>>>>                 Skype: mahdi.adnan at outlook.com
>>>>>>>                 On 03/14/2016 08:11 AM, Niels de Vos wrote:
>>>>>>>>                 On Mon, Mar 14, 2016 at 08:12:27AM +0530, Krutika Dhananjay wrote:
>>>>>>>>>                 It would be better to use sharding over stripe for your vm use case. It
>>>>>>>>>                 offers better distribution and utilisation of bricks and better heal
>>>>>>>>>                 performance.
>>>>>>>>>                 And it is well tested.
>>>>>>>>                 Basically the "striping" feature is deprecated, "sharding" is its
>>>>>>>>                 improved replacement. I expect to see "striping" completely dropped in
>>>>>>>>                 the next major release.
>>>>>>>>
>>>>>>>>                 Niels
>>>>>>>>
>>>>>>>>
>>>>>>>>>                 Couple of things to note before you do that:
>>>>>>>>>                 1. Most of the bug fixes in sharding have gone into 3.7.8. So it is advised
>>>>>>>>>                 that you use 3.7.8 or above.
>>>>>>>>>                 2. When you enable sharding on a volume, already existing files in the
>>>>>>>>>                 volume do not get sharded. Only the files that are newly created from the
>>>>>>>>>                 time sharding is enabled will.
>>>>>>>>>                      If you do want to shard the existing files, then you would need to cp
>>>>>>>>>                 them to a temp name within the volume, and then rename them back to the
>>>>>>>>>                 original file name.
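>>>>>>>>>
>>>>>>>>>                 As a rough example from a client mount (the mount
>>>>>>>>>                 point /mnt/vol and the file name vm-flat.vmdk are
>>>>>>>>>                 placeholders):
>>>>>>>>>
>>>>>>>>>                 # cd /mnt/vol
>>>>>>>>>                 # cp vm-flat.vmdk vm-flat.vmdk.tmp
>>>>>>>>>                 # mv vm-flat.vmdk.tmp vm-flat.vmdk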
>>>>>>>>>
>>>>>>>>>                 HTH,
>>>>>>>>>                 Krutika
>>>>>>>>>
>>>>>>>>>                 On Sun, Mar 13, 2016 at 11:49 PM, Mahdi Adnan
>>>>>>>>>                 <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>>                 I couldn't find anything related to cache in the HBAs.
>>>>>>>>>>                 What logs are useful in my case? I see only the brick
>>>>>>>>>>                 logs, which contain nothing during the failure.
>>>>>>>>>>
>>>>>>>>>>                 ###
>>>>>>>>>>                 [2016-03-13 18:05:19.728614] E [MSGID: 113022] [posix.c:1232:posix_mknod]
>>>>>>>>>>                 0-vmware-posix: mknod on
>>>>>>>>>>                 /bricks/b003/vmware/.shard/17d75e20-16f1-405e-9fa5-99ee7b1bd7f1.511 failed
>>>>>>>>>>                 [File exists]
>>>>>>>>>>                 [2016-03-13 18:07:23.337086] E [MSGID: 113022] [posix.c:1232:posix_mknod]
>>>>>>>>>>                 0-vmware-posix: mknod on
>>>>>>>>>>                 /bricks/b003/vmware/.shard/eef2d538-8eee-4e58-bc88-fbf7dc03b263.4095 failed
>>>>>>>>>>                 [File exists]
>>>>>>>>>>                 [2016-03-13 18:07:55.027600] W [trash.c:1922:trash_rmdir] 0-vmware-trash:
>>>>>>>>>>                 rmdir issued on /.trashcan/, which is not permitted
>>>>>>>>>>                 [2016-03-13 18:07:55.027635] I [MSGID: 115056]
>>>>>>>>>>                 [server-rpc-fops.c:459:server_rmdir_cbk] 0-vmware-server: 41987: RMDIR
>>>>>>>>>>                 /.trashcan/internal_op (00000000-0000-0000-0000-000000000005/internal_op)
>>>>>>>>>>                 ==> (Operation not permitted) [Operation not permitted]
>>>>>>>>>>                 [2016-03-13 18:11:34.353441] I [login.c:81:gf_auth] 0-auth/login: allowed
>>>>>>>>>>                 user names: c0c72c37-477a-49a5-a305-3372c1c2f2b4
>>>>>>>>>>                 [2016-03-13 18:11:34.353463] I [MSGID: 115029]
>>>>>>>>>>                 [server-handshake.c:612:server_setvolume] 0-vmware-server: accepted client
>>>>>>>>>>                 from gfs002-2727-2016/03/13-20:17:43:613597-vmware-client-4-0-0 (version:
>>>>>>>>>>                 3.7.8)
>>>>>>>>>>                 [2016-03-13 18:11:34.591139] I [login.c:81:gf_auth] 0-auth/login: allowed
>>>>>>>>>>                 user names: c0c72c37-477a-49a5-a305-3372c1c2f2b4
>>>>>>>>>>                 [2016-03-13 18:11:34.591173] I [MSGID: 115029]
>>>>>>>>>>                 [server-handshake.c:612:server_setvolume] 0-vmware-server: accepted client
>>>>>>>>>>                 from gfs002-2719-2016/03/13-20:17:42:609388-vmware-client-4-0-0 (version:
>>>>>>>>>>                 3.7.8)
>>>>>>>>>>                 ###
>>>>>>>>>>
>>>>>>>>>>                 ESXi just keeps telling me "Cannot clone T: The virtual disk is either
>>>>>>>>>>                 corrupted or not a supported format.
>>>>>>>>>>                 error
>>>>>>>>>>                 3/13/2016 9:06:20 PM
>>>>>>>>>>                 Clone virtual machine
>>>>>>>>>>                 T
>>>>>>>>>>                 VCENTER.LOCAL\Administrator
>>>>>>>>>>                 "
>>>>>>>>>>
>>>>>>>>>>                 My setup is two servers with a floating IP controlled by
>>>>>>>>>>                 CTDB, and my ESXi server mounts the NFS export via the
>>>>>>>>>>                 floating IP.
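>>>>>>>>>>
>>>>>>>>>>                 (Roughly how that mount looks on the ESXi side; the
>>>>>>>>>>                 floating IP 192.168.221.100, the export name /vmware and
>>>>>>>>>>                 the datastore name gluster_ds are placeholders:)
>>>>>>>>>>
>>>>>>>>>>                 # esxcli storage nfs add -H 192.168.221.100 -s /vmware -v gluster_ds
>>>>>>>>>>                 # esxcli storage nfs list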
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>                 On 03/13/2016 08:40 PM, pkoelle wrote:
>>>>>>>>>>
>>>>>>>>>>>                 On 13.03.2016 at 18:22, David Gossage wrote:
>>>>>>>>>>>
>>>>>>>>>>>>                 On Sun, Mar 13, 2016 at 11:07 AM, Mahdi Adnan
>>>>>>>>>>>>                 <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>                 My HBAs are LSISAS1068E, and the filesystem is XFS.
>>>>>>>>>>>>>                 I tried EXT4 and it did not help.
>>>>>>>>>>>>>                 I have created a striped volume on one server with two
>>>>>>>>>>>>>                 bricks: same issue.
>>>>>>>>>>>>>                 I also tried a replicated volume with just sharding
>>>>>>>>>>>>>                 enabled: same issue. As soon as I disable sharding it
>>>>>>>>>>>>>                 works just fine; neither sharding nor striping works
>>>>>>>>>>>>>                 for me.
>>>>>>>>>>>>>                 I followed up on some threads in the mailing list and
>>>>>>>>>>>>>                 tried some of the fixes that worked for others; none
>>>>>>>>>>>>>                 worked for me. :(
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>                 Is it possible the LSI has write-cache enabled?
>>>>>>>>>>>>
>>>>>>>>>>>                 Why is that relevant? Even the backing filesystem has no idea if there is
>>>>>>>>>>>                 a RAID or write cache or whatever. There are blocks and sync(), end of
>>>>>>>>>>>                 story.
>>>>>>>>>>>                 If you lose power and screw up your recovery, OR do funky stuff with SAS
>>>>>>>>>>>                 multipathing, that might be an issue with a controller cache. AFAIK that's
>>>>>>>>>>>                 not what we are talking about.
>>>>>>>>>>>
>>>>>>>>>>>                 I'm afraid that unless the OP has some logs from the server, a
>>>>>>>>>>>                 reproducible test case, or a backtrace from client or server, this isn't
>>>>>>>>>>>                 getting us anywhere.
>>>>>>>>>>>
>>>>>>>>>>>                 cheers
>>>>>>>>>>>                 Paul
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>                 On 03/13/2016 06:54 PM, David Gossage wrote:
>>>>>>>>>>>>>                 On Sun, Mar 13, 2016 at 8:16 AM, Mahdi Adnan
>>>>>>>>>>>>>                 <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>                 Okay, so I enabled sharding on my test volume and it did
>>>>>>>>>>>>>>                 not help; stupidly enough, I enabled it on a production
>>>>>>>>>>>>>>                 "Distributed-Replicate" volume and it corrupted half of
>>>>>>>>>>>>>>                 my VMs.
>>>>>>>>>>>>>>                 I have updated Gluster to the latest version and nothing
>>>>>>>>>>>>>>                 seems to have changed in my situation.
>>>>>>>>>>>>>>                 Below is the info of my volume:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>                 I was pointing at the settings in that email as an example of
>>>>>>>>>>>>>                 corruption fixing. I wouldn't recommend enabling sharding if you
>>>>>>>>>>>>>                 haven't gotten the base working yet on that cluster. What HBAs are
>>>>>>>>>>>>>                 you using, and what is the filesystem layout for the bricks?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>                 Number of Bricks: 3 x 2 = 6
>>>>>>>>>>>>>>                 Transport-type: tcp
>>>>>>>>>>>>>>                 Bricks:
>>>>>>>>>>>>>>                 Brick1: gfs001:/bricks/b001/vmware
>>>>>>>>>>>>>>                 Brick2: gfs002:/bricks/b004/vmware
>>>>>>>>>>>>>>                 Brick3: gfs001:/bricks/b002/vmware
>>>>>>>>>>>>>>                 Brick4: gfs002:/bricks/b005/vmware
>>>>>>>>>>>>>>                 Brick5: gfs001:/bricks/b003/vmware
>>>>>>>>>>>>>>                 Brick6: gfs002:/bricks/b006/vmware
>>>>>>>>>>>>>>                 Options Reconfigured:
>>>>>>>>>>>>>>                 performance.strict-write-ordering: on
>>>>>>>>>>>>>>                 cluster.server-quorum-type: server
>>>>>>>>>>>>>>                 cluster.quorum-type: auto
>>>>>>>>>>>>>>                 network.remote-dio: enable
>>>>>>>>>>>>>>                 performance.stat-prefetch: disable
>>>>>>>>>>>>>>                 performance.io-cache: off
>>>>>>>>>>>>>>                 performance.read-ahead: off
>>>>>>>>>>>>>>                 performance.quick-read: off
>>>>>>>>>>>>>>                 cluster.eager-lock: enable
>>>>>>>>>>>>>>                 features.shard-block-size: 16MB
>>>>>>>>>>>>>>                 features.shard: on
>>>>>>>>>>>>>>                 performance.readdir-ahead: off
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>                 On 03/12/2016 08:11 PM, David Gossage wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>                 On Sat, Mar 12, 2016 at 10:21 AM, Mahdi Adnan
>>>>>>>>>>>>>>                 <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>                 Both servers have HBAs, no RAID, and I can set up a replicated
>>>>>>>>>>>>>>>                 or dispersed volume without any issues.
>>>>>>>>>>>>>>>                 The logs are clean; when I tried to migrate a VM and got the
>>>>>>>>>>>>>>>                 error, nothing showed up in the logs.
>>>>>>>>>>>>>>>                 I tried mounting the volume on my laptop and it mounted fine,
>>>>>>>>>>>>>>>                 but if I use dd to create a data file it just hangs and I
>>>>>>>>>>>>>>>                 can't cancel it, and I can't unmount it or anything; I just
>>>>>>>>>>>>>>>                 have to reboot.
>>>>>>>>>>>>>>>                 The same servers have another volume on other bricks in a
>>>>>>>>>>>>>>>                 distributed-replicated setup, and it works fine.
>>>>>>>>>>>>>>>                 I have even tried the same setup in a virtual environment
>>>>>>>>>>>>>>>                 (created two VMs, installed Gluster, and created a replicated
>>>>>>>>>>>>>>>                 striped volume) and got the same thing again: data corruption.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>                 I'd look through the mail archives for a topic called "Shard in
>>>>>>>>>>>>>>                 Production", I think. The shard portion may not be relevant, but
>>>>>>>>>>>>>>                 it does discuss certain settings that had to be applied to avoid
>>>>>>>>>>>>>>                 corruption with VMs. You may want to try disabling
>>>>>>>>>>>>>>                 performance.readdir-ahead as well.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>                 On 03/12/2016 07:02 PM, David Gossage wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>                 On Sat, Mar 12, 2016 at 9:51 AM, Mahdi Adnan
>>>>>>>>>>>>>>>                 <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>                 Thanks David,
>>>>>>>>>>>>>>>>                 My settings are all defaults; I have just created the pool
>>>>>>>>>>>>>>>>                 and started it.
>>>>>>>>>>>>>>>>                 I have applied the settings you recommended and it seems to
>>>>>>>>>>>>>>>>                 be the same issue:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>                 Type: Striped-Replicate
>>>>>>>>>>>>>>>>                 Volume ID: 44adfd8c-2ed1-4aa5-b256-d12b64f7fc14
>>>>>>>>>>>>>>>>                 Status: Started
>>>>>>>>>>>>>>>>                 Number of Bricks: 1 x 2 x 2 = 4
>>>>>>>>>>>>>>>>                 Transport-type: tcp
>>>>>>>>>>>>>>>>                 Bricks:
>>>>>>>>>>>>>>>>                 Brick1: gfs001:/bricks/t1/s
>>>>>>>>>>>>>>>>                 Brick2: gfs002:/bricks/t1/s
>>>>>>>>>>>>>>>>                 Brick3: gfs001:/bricks/t2/s
>>>>>>>>>>>>>>>>                 Brick4: gfs002:/bricks/t2/s
>>>>>>>>>>>>>>>>                 Options Reconfigured:
>>>>>>>>>>>>>>>>                 performance.stat-prefetch: off
>>>>>>>>>>>>>>>>                 network.remote-dio: on
>>>>>>>>>>>>>>>>                 cluster.eager-lock: enable
>>>>>>>>>>>>>>>>                 performance.io-cache: off
>>>>>>>>>>>>>>>>                 performance.read-ahead: off
>>>>>>>>>>>>>>>>                 performance.quick-read: off
>>>>>>>>>>>>>>>>                 performance.readdir-ahead: on
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>                 Is there a RAID controller perhaps doing any caching?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>                 Are any errors being reported in the gluster logs during the
>>>>>>>>>>>>>>>                 migration process?
>>>>>>>>>>>>>>>                 Since they aren't in use yet, have you tested making just
>>>>>>>>>>>>>>>                 mirrored bricks, using different pairings of servers two at
>>>>>>>>>>>>>>>                 a time, to see if the problem follows a certain machine or
>>>>>>>>>>>>>>>                 network port?
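>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>                 For example, roughly (the volume name "pairtest" and the
>>>>>>>>>>>>>>>                 brick paths below are placeholders):
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>                 # gluster volume create pairtest replica 2 \
>>>>>>>>>>>>>>>                       gfs001:/bricks/t1/pairtest gfs002:/bricks/t1/pairtest
>>>>>>>>>>>>>>>                 # gluster volume start pairtest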
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>                 On 03/12/2016 03:25 PM, David Gossage wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>                 On Sat, Mar 12, 2016 at 1:55 AM, Mahdi Adnan
>>>>>>>>>>>>>>>>                 <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>                 Dears,
>>>>>>>>>>>>>>>>>                 I have created a replicated striped volume with two bricks
>>>>>>>>>>>>>>>>>                 and two servers, but I can't use it because when I mount it
>>>>>>>>>>>>>>>>>                 in ESXi and try to migrate a VM to it, the data gets
>>>>>>>>>>>>>>>>>                 corrupted.
>>>>>>>>>>>>>>>>>                 Does anyone have any idea why this is happening?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>                 Dell 2950 x2
>>>>>>>>>>>>>>>>>                 Seagate 15k 600GB
>>>>>>>>>>>>>>>>>                 CentOS 7.2
>>>>>>>>>>>>>>>>>                 Gluster 3.7.8
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>                 Appreciate your help.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>                 Most reports of this I have seen end up being settings-related.
>>>>>>>>>>>>>>>>                 Post your gluster volume info. Below are what I have seen as the
>>>>>>>>>>>>>>>>                 most commonly recommended settings (the gluster volume set
>>>>>>>>>>>>>>>>                 equivalents are sketched after the list).
>>>>>>>>>>>>>>>>                 I'd hazard a guess you may have the read-ahead cache or prefetch
>>>>>>>>>>>>>>>>                 options on.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>                 quick-read=off
>>>>>>>>>>>>>>>>                 read-ahead=off
>>>>>>>>>>>>>>>>                 io-cache=off
>>>>>>>>>>>>>>>>                 stat-prefetch=off
>>>>>>>>>>>>>>>>                 eager-lock=enable
>>>>>>>>>>>>>>>>                 remote-dio=on
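>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>                 (As a rough sketch, assuming the volume is named "v", these
>>>>>>>>>>>>>>>>                 map to the following full option keys:)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>                 # gluster volume set v performance.quick-read off
>>>>>>>>>>>>>>>>                 # gluster volume set v performance.read-ahead off
>>>>>>>>>>>>>>>>                 # gluster volume set v performance.io-cache off
>>>>>>>>>>>>>>>>                 # gluster volume set v performance.stat-prefetch off
>>>>>>>>>>>>>>>>                 # gluster volume set v cluster.eager-lock enable
>>>>>>>>>>>>>>>>                 # gluster volume set v network.remote-dio on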
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>                 Mahdi Adnan
>>>>>>>>>>>>>>>>>                 System Admin
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
