[Gluster-users] Replicated striped data loss
Mahdi Adnan
mahdi.adnan at earthlinktele.com
Tue Mar 15 12:51:14 UTC 2016
And here's the log from the ESXi host:
2016-03-15T12:50:17.982Z cpu41:33260)WARNING: NFS: 4566: Short read for
object b00f 60 6b5d5087 c5d851fc 4c474f3a 8efd48b3 9d4617b1 a556c0bc
30c9df6b 16c12f10 250219b3f4146a7 431899d1 0 431200000000 offset: 0x0
requested: 0x200 read: 0x95
2016-03-15T12:50:17.983Z cpu41:33260)WARNING: NFS: 4566: Short read for
object b00f 60 6b5d5087 c5d851fc 4c474f3a 8efd48b3 9d4617b1 a556c0bc
30c9df6b 16c12f10 250219b3f4146a7 431899d1 0 431200000000 offset: 0x0
requested: 0x200 read: 0x95
2016-03-15T12:50:17.984Z cpu41:33260)WARNING: NFS: 4566: Short read for
object b00f 60 6b5d5087 c5d851fc 4c474f3a 8efd48b3 9d4617b1 a556c0bc
30c9df6b 16c12f10 250219b3f4146a7 431899d1 0 431200000000 offset: 0x0
requested: 0x200 read: 0x95
2016-03-15T12:50:17.995Z cpu41:35990)WARNING: NFS: 4566: Short read for
object b00f 60 6b5d5087 c5d851fc 4c474f3a 8efd48b3 9d4617b1 a556c0bc
30c9df6b 16c12f10 250219b3f4146a7 431899d1 0 431200000000 offset: 0x0
requested: 0x200 read: 0x95
2016-03-15T12:50:18.000Z cpu41:35990)WARNING: NFS: 4566: Short read for
object b00f 60 6b5d5087 c5d851fc 4c474f3a 8efd48b3 9d4617b1 a556c0bc
30c9df6b 16c12f10 250219b3f4146a7 431899d1 0 431200000000 offset: 0x0
requested: 0x200 read: 0x95
2016-03-15T12:50:18.031Z cpu41:35990)WARNING: NFS: 4566: Short read for
object b00f 60 6b5d5087 c5d851fc 4c474f3a 8efd48b3 9d4617b1 a556c0bc
30c9df6b 16c12f10 250219b3f4146a7 431899d1 0 431200000000 offset: 0x0
requested: 0x200 read: 0x95
2016-03-15T12:50:18.032Z cpu41:35990)WARNING: NFS: 4566: Short read for
object b00f 60 6b5d5087 c5d851fc 4c474f3a 8efd48b3 9d4617b1 a556c0bc
30c9df6b 16c12f10 250219b3f4146a7 431899d1 0 431200000000 offset: 0x0
requested: 0x200 read: 0x95
2016-03-15T12:50:18.032Z cpu41:35990)WARNING: NFS: 4566: Short read for
object b00f 60 6b5d5087 c5d851fc 4c474f3a 8efd48b3 9d4617b1 a556c0bc
30c9df6b 16c12f10 250219b3f4146a7 431899d1 0 431200000000 offset: 0x0
requested: 0x200 read: 0x95
2016-03-15T12:50:18.043Z cpu41:35990)WARNING: NFS: 4566: Short read for
object b00f 60 6b5d5087 c5d851fc 4c474f3a 8efd48b3 9d4617b1 a556c0bc
30c9df6b 16c12f10 250219b3f4146a7 431899d1 0 431200000000 offset: 0x0
requested: 0x200 read: 0x95
2016-03-15T12:50:18.048Z cpu41:35990)WARNING: NFS: 4566: Short read for
object b00f 60 6b5d5087 c5d851fc 4c474f3a 8efd48b3 9d4617b1 a556c0bc
30c9df6b 16c12f10 250219b3f4146a7 431899d1 0 431200000000 offset: 0x0
requested: 0x200 read: 0x95
Respectfully,
Mahdi A. Mahdi
Skype: mahdi.adnan at outlook.com
On 03/15/2016 03:06 PM, Mahdi Adnan wrote:
> [2016-03-15 14:12:01.421615] I [MSGID: 109036]
> [dht-common.c:8043:dht_log_new_layout_for_dir_selfheal] 0-v-dht:
> Setting layout of /New Virtual Machine_2 with [Subvol_name:
> v-replicate-0, Err: -1 , Start: 0 , Stop: 1431655764 , Hash: 1 ],
> [Subvol_name: v-replicate-1, Err: -1 , Start: 1431655765 , Stop:
> 2863311529 , Hash: 1 ], [Subvol_name: v-replicate-2, Err: -1 , Start:
> 2863311530 , Stop: 4294967295 , Hash: 1 ],
> [2016-03-15 14:12:02.001167] I [MSGID: 109066]
> [dht-rename.c:1413:dht_rename] 0-v-dht: renaming /New Virtual
> Machine_2/New Virtual Machine.vmdk~
> (hash=v-replicate-2/cache=v-replicate-2) => /New Virtual Machine_2/New
> Virtual Machine.vmdk (hash=v-replicate-2/cache=v-replicate-2)
> [2016-03-15 14:12:02.248164] W [MSGID: 112032]
> [nfs3.c:3622:nfs3svc_rmdir_cbk] 0-nfs: 3fed7d9f: /New Virtual
> Machine_2 => -1 (Directory not empty) [Directory not empty]
> [2016-03-15 14:12:02.259015] W [MSGID: 112032]
> [nfs3.c:3622:nfs3svc_rmdir_cbk] 0-nfs: 3fed7da3: /New Virtual
> Machine_2 => -1 (Directory not empty) [Directory not empty]
>
>
> Respectfully,
> Mahdi A. Mahdi
>
> On 03/15/2016 03:03 PM, Krutika Dhananjay wrote:
>> Hmm ok. Could you share the nfs.log content?
>>
>> -Krutika
>>
>> On Tue, Mar 15, 2016 at 1:45 PM, Mahdi Adnan
>> <mahdi.adnan at earthlinktele.com> wrote:
>>
>> Okay, here's what I did:
>>
>> Volume Name: v
>> Type: Distributed-Replicate
>> Volume ID: b348fd8e-b117-469d-bcc0-56a56bdfc930
>> Status: Started
>> Number of Bricks: 3 x 2 = 6
>> Transport-type: tcp
>> Bricks:
>> Brick1: gfs001:/bricks/b001/v
>> Brick2: gfs001:/bricks/b002/v
>> Brick3: gfs001:/bricks/b003/v
>> Brick4: gfs002:/bricks/b004/v
>> Brick5: gfs002:/bricks/b005/v
>> Brick6: gfs002:/bricks/b006/v
>> Options Reconfigured:
>> features.shard-block-size: 128MB
>> features.shard: enable
>> cluster.server-quorum-type: server
>> cluster.quorum-type: auto
>> network.remote-dio: enable
>> cluster.eager-lock: enable
>> performance.stat-prefetch: off
>> performance.io-cache: off
>> performance.read-ahead: off
>> performance.quick-read: off
>> performance.readdir-ahead: on
>>
>>
>> Same error, and mounting using glusterfs still works just fine.
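>>
>> For reference, the two mounts I'm comparing look roughly like this
>> (the server name, volume name and mount points are only illustrative):
>>
>> # FUSE mount -- this one works fine
>> mount -t glusterfs gfs001:/v /mnt/v-fuse
>>
>> # Gluster NFS (v3) mount -- this is the one that fails
>> mount -t nfs -o vers=3 gfs001:/v /mnt/v-nfs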
>>
>> Respectfully,
>> Mahdi A. Mahdi
>>
>>
>> On 03/15/2016 11:04 AM, Krutika Dhananjay wrote:
>>> OK but what if you use it with replication? Do you still see the
>>> error? I think not.
>>> Could you give it a try and tell me what you find?
>>>
>>> -Krutika
>>>
>>> On Tue, Mar 15, 2016 at 1:23 PM, Mahdi Adnan
>>> <mahdi.adnan at earthlinktele.com> wrote:
>>>
>>> Hi,
>>>
>>> I have created the following volume;
>>>
>>> Volume Name: v
>>> Type: Distribute
>>> Volume ID: 90de6430-7f83-4eda-a98f-ad1fabcf1043
>>> Status: Started
>>> Number of Bricks: 3
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: gfs001:/bricks/b001/v
>>> Brick2: gfs001:/bricks/b002/v
>>> Brick3: gfs001:/bricks/b003/v
>>> Options Reconfigured:
>>> features.shard-block-size: 128MB
>>> features.shard: enable
>>> cluster.server-quorum-type: server
>>> cluster.quorum-type: auto
>>> network.remote-dio: enable
>>> cluster.eager-lock: enable
>>> performance.stat-prefetch: off
>>> performance.io-cache: off
>>> performance.read-ahead: off
>>> performance.quick-read: off
>>> performance.readdir-ahead: on
>>>
>>> and after mounting it in ESXi and trying to clone a VM to
>>> it, I got the same error.
>>>
>>>
>>> Respectfully,
>>> Mahdi A. Mahdi
>>>
>>>
>>> On 03/15/2016 10:44 AM, Krutika Dhananjay wrote:
>>>> Hi,
>>>>
>>>> Do not use sharding and stripe together in the same volume
>>>> because
>>>> a) It is not recommended and there is no point in using
>>>> both. Using sharding alone on your volume should work fine.
>>>> b) Nobody has tested that combination.
>>>> c) Like Niels said, the stripe feature is virtually deprecated.
>>>>
>>>> I would suggest that you create an n x 3 volume, where n is
>>>> the number of distribute subvolumes you prefer, apply the
>>>> group "virt" options to it, enable sharding on it, set a
>>>> shard-block-size you feel is appropriate, and then just start
>>>> off with VM image creation etc. (a rough command sketch follows).
>>>> If you run into any issues even after you do this, let us
>>>> know and we'll help you out.
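>>>>
>>>> Roughly, assuming three servers gfs001-gfs003 and brick paths like
>>>> /bricks/b001/v (the hostnames and paths here are only examples):
>>>>
>>>> gluster volume create v replica 3 \
>>>>     gfs001:/bricks/b001/v gfs002:/bricks/b001/v gfs003:/bricks/b001/v
>>>> gluster volume set v group virt
>>>> gluster volume set v features.shard enable
>>>> gluster volume set v features.shard-block-size 64MB
>>>> gluster volume start v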
>>>>
>>>> -Krutika
>>>>
>>>> On Tue, Mar 15, 2016 at 1:07 PM, Mahdi Adnan
>>>> <mahdi.adnan at earthlinktele.com> wrote:
>>>>
>>>> Thanks Krutika,
>>>>
>>>> I have deleted the volume and created a new one.
>>>> I found that it may be an issue with NFS itself: I
>>>> created a new striped volume, enabled sharding, and
>>>> mounted it via glusterfs, and it worked just fine; if
>>>> I mount it with NFS, it fails and gives me the same
>>>> errors.
>>>>
>>>> Respectfully,
>>>> Mahdi A. Mahdi
>>>>
>>>> On 03/15/2016 06:24 AM, Krutika Dhananjay wrote:
>>>>> Hi,
>>>>>
>>>>> So could you share the xattrs associated with the file
>>>>> at
>>>>> <BRICK_PATH>/.glusterfs/c3/e8/c3e88cc1-7e0a-4d46-9685-2d12131a5e1c
>>>>>
>>>>> Here's what you need to execute:
>>>>>
>>>>> # getfattr -d -m . -e hex
>>>>> /mnt/b1/v/.glusterfs/c3/e8/c3e88cc1-7e0a-4d46-9685-2d12131a5e1c
>>>>> on the first node and
>>>>>
>>>>> # getfattr -d -m . -e hex
>>>>> /mnt/b2/v/.glusterfs/c3/e8/c3e88cc1-7e0a-4d46-9685-2d12131a5e1c
>>>>> on the second.
>>>>>
>>>>>
>>>>> Also, it is normally advised to use a replica 3 volume
>>>>> as opposed to a replica 2 volume to guard against
>>>>> split-brains.
>>>>>
>>>>> -Krutika
>>>>>
>>>>> On Mon, Mar 14, 2016 at 3:17 PM, Mahdi Adnan
>>>>> <mahdi.adnan at earthlinktele.com> wrote:
>>>>>
>>>>> Sorry for the serial posting, but I got new logs that
>>>>> might help.
>>>>>
>>>>> The messages appear during the migration:
>>>>>
>>>>> /var/log/glusterfs/nfs.log
>>>>>
>>>>>
>>>>> [2016-03-14 09:45:04.573765] I [MSGID: 109036]
>>>>> [dht-common.c:8043:dht_log_new_layout_for_dir_selfheal]
>>>>> 0-testv-dht: Setting layout of /New Virtual
>>>>> Machine_1 with [Subvol_name: testv-stripe-0, Err:
>>>>> -1 , Start: 0 , Stop: 4294967295 , Hash: 1 ],
>>>>> [2016-03-14 09:45:04.957499] E
>>>>> [shard.c:369:shard_modify_size_and_block_count]
>>>>> (-->/usr/lib64/glusterfs/3.7.8/xlator/cluster/distribute.so(dht_file_setattr_cbk+0x14f)
>>>>> [0x7f27a13c067f]
>>>>> -->/usr/lib64/glusterfs/3.7.8/xlator/features/shard.so(shard_common_setattr_cbk+0xcc)
>>>>> [0x7f27a116681c]
>>>>> -->/usr/lib64/glusterfs/3.7.8/xlator/features/shard.so(shard_modify_size_and_block_count+0xdd)
>>>>> [0x7f27a116584d] ) 0-testv-shard: Failed to get
>>>>> trusted.glusterfs.shard.file-size for
>>>>> c3e88cc1-7e0a-4d46-9685-2d12131a5e1c
>>>>> [2016-03-14 09:45:04.957577] W [MSGID: 112199]
>>>>> [nfs3-helpers.c:3418:nfs3_log_common_res]
>>>>> 0-nfs-nfsv3: /New Virtual Machine_1/New Virtual
>>>>> Machine-flat.vmdk => (XID: 3fec5a26, SETATTR: NFS:
>>>>> 22(Invalid argument for operation), POSIX:
>>>>> 22(Invalid argument)) [Invalid argument]
>>>>> [2016-03-14 09:45:05.079657] E [MSGID: 112069]
>>>>> [nfs3.c:3649:nfs3_rmdir_resume] 0-nfs-nfsv3: No
>>>>> such file or directory: (192.168.221.52:826) testv :
>>>>> 00000000-0000-0000-0000-000000000001
>>>>>
>>>>>
>>>>>
>>>>> Respectfully,
>>>>> Mahdi A. Mahdi
>>>>>
>>>>> On 03/14/2016 11:14 AM, Mahdi Adnan wrote:
>>>>>> So I have deployed a new server, a Cisco UCS
>>>>>> C220M4, and created a new volume:
>>>>>>
>>>>>> Volume Name: testv
>>>>>> Type: Stripe
>>>>>> Volume ID: 55cdac79-fe87-4f1f-90c0-15c9100fe00b
>>>>>> Status: Started
>>>>>> Number of Bricks: 1 x 2 = 2
>>>>>> Transport-type: tcp
>>>>>> Bricks:
>>>>>> Brick1: 10.70.0.250:/mnt/b1/v
>>>>>> Brick2: 10.70.0.250:/mnt/b2/v
>>>>>> Options Reconfigured:
>>>>>> nfs.disable: off
>>>>>> features.shard-block-size: 64MB
>>>>>> features.shard: enable
>>>>>> cluster.server-quorum-type: server
>>>>>> cluster.quorum-type: auto
>>>>>> network.remote-dio: enable
>>>>>> cluster.eager-lock: enable
>>>>>> performance.stat-prefetch: off
>>>>>> performance.io-cache: off
>>>>>> performance.read-ahead: off
>>>>>> performance.quick-read: off
>>>>>> performance.readdir-ahead: off
>>>>>>
>>>>>> Same error.
>>>>>>
>>>>>> Can anyone share with me the info of a working
>>>>>> striped volume?
>>>>>>
>>>>>> On 03/14/2016 09:02 AM, Mahdi Adnan wrote:
>>>>>>> I have a pool of two bricks in the same server;
>>>>>>>
>>>>>>> Volume Name: k
>>>>>>> Type: Stripe
>>>>>>> Volume ID: 1e9281ce-2a8b-44e8-a0c6-e3ebf7416b2b
>>>>>>> Status: Started
>>>>>>> Number of Bricks: 1 x 2 = 2
>>>>>>> Transport-type: tcp
>>>>>>> Bricks:
>>>>>>> Brick1: gfs001:/bricks/t1/k
>>>>>>> Brick2: gfs001:/bricks/t2/k
>>>>>>> Options Reconfigured:
>>>>>>> features.shard-block-size: 64MB
>>>>>>> features.shard: on
>>>>>>> cluster.server-quorum-type: server
>>>>>>> cluster.quorum-type: auto
>>>>>>> network.remote-dio: enable
>>>>>>> cluster.eager-lock: enable
>>>>>>> performance.stat-prefetch: off
>>>>>>> performance.io-cache: off
>>>>>>> performance.read-ahead: off
>>>>>>> performance.quick-read: off
>>>>>>> performance.readdir-ahead: off
>>>>>>>
>>>>>>> same issue ...
>>>>>>> glusterfs 3.7.8 built on Mar 10 2016 20:20:45.
>>>>>>>
>>>>>>>
>>>>>>> Respectfully,
>>>>>>> Mahdi A. Mahdi
>>>>>>>
>>>>>>> Systems Administrator
>>>>>>> IT Department
>>>>>>> Earthlink Telecommunications
>>>>>>>
>>>>>>> Cell: 07903316180
>>>>>>> Work: 3352
>>>>>>> Skype: mahdi.adnan at outlook.com
>>>>>>> On 03/14/2016 08:11 AM, Niels de Vos wrote:
>>>>>>>> On Mon, Mar 14, 2016 at 08:12:27AM +0530, Krutika Dhananjay wrote:
>>>>>>>>> It would be better to use sharding over stripe for your vm use case. It
>>>>>>>>> offers better distribution and utilisation of bricks and better heal
>>>>>>>>> performance.
>>>>>>>>> And it is well tested.
>>>>>>>> Basically the "striping" feature is deprecated, "sharding" is its
>>>>>>>> improved replacement. I expect to see "striping" completely dropped in
>>>>>>>> the next major release.
>>>>>>>>
>>>>>>>> Niels
>>>>>>>>
>>>>>>>>
>>>>>>>>> Couple of things to note before you do that:
>>>>>>>>> 1. Most of the bug fixes in sharding have gone into 3.7.8. So it is advised
>>>>>>>>> that you use 3.7.8 or above.
>>>>>>>>> 2. When you enable sharding on a volume, already existing files in the
>>>>>>>>> volume do not get sharded. Only the files that are newly created from the
>>>>>>>>> time sharding is enabled will.
>>>>>>>>> If you do want to shard the existing files, then you would need to cp
>>>>>>>>> them to a temp name within the volume, and then rename them back to the
>>>>>>>>> original file name (a rough sketch follows).
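>>>>>>>>>
>>>>>>>>> For example, from a FUSE mount of the volume (the mount point and file
>>>>>>>>> name below are only illustrative):
>>>>>>>>>
>>>>>>>>> # the copy is a newly created file, so it gets sharded
>>>>>>>>> cp /mnt/v/vm1-flat.vmdk /mnt/v/vm1-flat.vmdk.tmp
>>>>>>>>> # rename it back over the original name
>>>>>>>>> mv /mnt/v/vm1-flat.vmdk.tmp /mnt/v/vm1-flat.vmdk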
>>>>>>>>>
>>>>>>>>> HTH,
>>>>>>>>> Krutika
>>>>>>>>>
>>>>>>>>> On Sun, Mar 13, 2016 at 11:49 PM, Mahdi Adnan
>>>>>>>>> <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>> I couldn't find anything related to cache in the HBAs.
>>>>>>>>>> What logs are useful in my case? I see only the brick logs, which contain
>>>>>>>>>> nothing during the failure.
>>>>>>>>>>
>>>>>>>>>> ###
>>>>>>>>>> [2016-03-13 18:05:19.728614] E [MSGID: 113022] [posix.c:1232:posix_mknod]
>>>>>>>>>> 0-vmware-posix: mknod on
>>>>>>>>>> /bricks/b003/vmware/.shard/17d75e20-16f1-405e-9fa5-99ee7b1bd7f1.511 failed
>>>>>>>>>> [File exists]
>>>>>>>>>> [2016-03-13 18:07:23.337086] E [MSGID: 113022] [posix.c:1232:posix_mknod]
>>>>>>>>>> 0-vmware-posix: mknod on
>>>>>>>>>> /bricks/b003/vmware/.shard/eef2d538-8eee-4e58-bc88-fbf7dc03b263.4095 failed
>>>>>>>>>> [File exists]
>>>>>>>>>> [2016-03-13 18:07:55.027600] W [trash.c:1922:trash_rmdir] 0-vmware-trash:
>>>>>>>>>> rmdir issued on /.trashcan/, which is not permitted
>>>>>>>>>> [2016-03-13 18:07:55.027635] I [MSGID: 115056]
>>>>>>>>>> [server-rpc-fops.c:459:server_rmdir_cbk] 0-vmware-server: 41987: RMDIR
>>>>>>>>>> /.trashcan/internal_op (00000000-0000-0000-0000-000000000005/internal_op)
>>>>>>>>>> ==> (Operation not permitted) [Operation not permitted]
>>>>>>>>>> [2016-03-13 18:11:34.353441] I [login.c:81:gf_auth] 0-auth/login: allowed
>>>>>>>>>> user names: c0c72c37-477a-49a5-a305-3372c1c2f2b4
>>>>>>>>>> [2016-03-13 18:11:34.353463] I [MSGID: 115029]
>>>>>>>>>> [server-handshake.c:612:server_setvolume] 0-vmware-server: accepted client
>>>>>>>>>> from gfs002-2727-2016/03/13-20:17:43:613597-vmware-client-4-0-0 (version:
>>>>>>>>>> 3.7.8)
>>>>>>>>>> [2016-03-13 18:11:34.591139] I [login.c:81:gf_auth] 0-auth/login: allowed
>>>>>>>>>> user names: c0c72c37-477a-49a5-a305-3372c1c2f2b4
>>>>>>>>>> [2016-03-13 18:11:34.591173] I [MSGID: 115029]
>>>>>>>>>> [server-handshake.c:612:server_setvolume] 0-vmware-server: accepted client
>>>>>>>>>> from gfs002-2719-2016/03/13-20:17:42:609388-vmware-client-4-0-0 (version:
>>>>>>>>>> 3.7.8)
>>>>>>>>>> ###
>>>>>>>>>>
>>>>>>>>>> ESXi just keeps telling me "Cannot clone T: The virtual disk is either
>>>>>>>>>> corrupted or not a supported format.
>>>>>>>>>> error
>>>>>>>>>> 3/13/2016 9:06:20 PM
>>>>>>>>>> Clone virtual machine
>>>>>>>>>> T
>>>>>>>>>> VCENTER.LOCAL\Administrator
>>>>>>>>>> "
>>>>>>>>>>
>>>>>>>>>> My setup is two servers with a floating IP controlled by CTDB, and my
>>>>>>>>>> ESXi host mounts the NFS export via the floating IP.
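>>>>>>>>>>
>>>>>>>>>> The datastore is added on the ESXi side with something like this (the
>>>>>>>>>> floating IP, export path and datastore name here are only examples):
>>>>>>>>>>
>>>>>>>>>> esxcli storage nfs add --host 10.70.0.100 --share /vmware --volume-name gluster-nfs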
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 03/13/2016 08:40 PM, pkoelle wrote:
>>>>>>>>>>
>>>>>>>>>>> Am 13.03.2016 um 18:22 schrieb David Gossage:
>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Mar 13, 2016 at 11:07 AM, Mahdi Adnan
>>>>>>>>>>>> <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>> My HBAs are LSISAS1068E, and the filesystem is XFS.
>>>>>>>>>>>>> I tried EXT4 and it did not help.
>>>>>>>>>>>>> I have created a striped volume on one server with two bricks, same
>>>>>>>>>>>>> issue.
>>>>>>>>>>>>> And I tried a replicated volume with just sharding enabled, same issue;
>>>>>>>>>>>>> as soon as I disable sharding it works just fine. Neither sharding
>>>>>>>>>>>>> nor striping works for me.
>>>>>>>>>>>>> I did follow up with some of the threads on the mailing list and tried
>>>>>>>>>>>>> some of the fixes that worked for others; none worked for me. :(
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>> Is it possible the LSI has write-cache enabled?
>>>>>>>>>>>>
>>>>>>>>>>> Why is that relevant? Even the backing filesystem has no idea if there is
>>>>>>>>>>> a RAID or write cache or whatever. There are blocks and sync(), end of
>>>>>>>>>>> story.
>>>>>>>>>>> If you lose power and screw up your recovery OR do funky stuff with SAS
>>>>>>>>>>> multipathing, that might be an issue with a controller cache. AFAIK that's
>>>>>>>>>>> not what we are talking about.
>>>>>>>>>>>
>>>>>>>>>>> I'm afraid that unless the OP has some logs from the server, a
>>>>>>>>>>> reproducible test case, or a backtrace from client or server, this isn't
>>>>>>>>>>> getting us anywhere.
>>>>>>>>>>>
>>>>>>>>>>> cheers
>>>>>>>>>>> Paul
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> On 03/13/2016 06:54 PM, David Gossage wrote:
>>>>>>>>>>>>> On Sun, Mar 13, 2016 at 8:16 AM, Mahdi Adnan
>>>>>>>>>>>>> <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Okay, so I enabled sharding on my test volume and it did not help;
>>>>>>>>>>>>>> stupidly enough, I enabled it on a production volume
>>>>>>>>>>>>>> (Distributed-Replicate) and it corrupted half of my VMs.
>>>>>>>>>>>>>> I have updated Gluster to the latest and nothing seems to have changed
>>>>>>>>>>>>>> in my situation.
>>>>>>>>>>>>>> Below is the info of my volume:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>> I was pointing at the settings in that email as an example for fixing
>>>>>>>>>>>>> the corruption. I wouldn't recommend enabling sharding if you haven't
>>>>>>>>>>>>> gotten the base working yet on that cluster. What HBAs are you using,
>>>>>>>>>>>>> and what is the filesystem layout for the bricks?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Number of Bricks: 3 x 2 = 6
>>>>>>>>>>>>>> Transport-type: tcp
>>>>>>>>>>>>>> Bricks:
>>>>>>>>>>>>>> Brick1: gfs001:/bricks/b001/vmware
>>>>>>>>>>>>>> Brick2: gfs002:/bricks/b004/vmware
>>>>>>>>>>>>>> Brick3: gfs001:/bricks/b002/vmware
>>>>>>>>>>>>>> Brick4: gfs002:/bricks/b005/vmware
>>>>>>>>>>>>>> Brick5: gfs001:/bricks/b003/vmware
>>>>>>>>>>>>>> Brick6: gfs002:/bricks/b006/vmware
>>>>>>>>>>>>>> Options Reconfigured:
>>>>>>>>>>>>>> performance.strict-write-ordering: on
>>>>>>>>>>>>>> cluster.server-quorum-type: server
>>>>>>>>>>>>>> cluster.quorum-type: auto
>>>>>>>>>>>>>> network.remote-dio: enable
>>>>>>>>>>>>>> performance.stat-prefetch: disable
>>>>>>>>>>>>>> performance.io-cache: off
>>>>>>>>>>>>>> performance.read-ahead: off
>>>>>>>>>>>>>> performance.quick-read: off
>>>>>>>>>>>>>> cluster.eager-lock: enable
>>>>>>>>>>>>>> features.shard-block-size: 16MB
>>>>>>>>>>>>>> features.shard: on
>>>>>>>>>>>>>> performance.readdir-ahead: off
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 03/12/2016 08:11 PM, David Gossage wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sat, Mar 12, 2016 at 10:21 AM, Mahdi Adnan
>>>>>>>>>>>>>> <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Both servers have HBAs, no RAID, and I can set up a replicated or
>>>>>>>>>>>>>>> disperse volume without any issues.
>>>>>>>>>>>>>>> Logs are clean, and when I tried to migrate a VM and got the error,
>>>>>>>>>>>>>>> nothing showed up in the logs.
>>>>>>>>>>>>>>> I tried mounting the volume on my laptop and it mounted fine, but if I
>>>>>>>>>>>>>>> use dd to create a data file (roughly the test shown below) it just
>>>>>>>>>>>>>>> hangs and I can't cancel it, and I can't unmount it or anything; I
>>>>>>>>>>>>>>> just have to reboot.
>>>>>>>>>>>>>>> The same servers have another volume on other bricks in a
>>>>>>>>>>>>>>> distributed-replicated setup, which works fine.
>>>>>>>>>>>>>>> I have even tried the same setup in a virtual environment (created two
>>>>>>>>>>>>>>> VMs, installed gluster, and created a replicated striped volume) and
>>>>>>>>>>>>>>> again the same thing, data corruption.
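>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The write test that hangs is roughly the following (the mount point,
>>>>>>>>>>>>>>> size and flags are only illustrative):
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> dd if=/dev/zero of=/mnt/v/testfile bs=1M count=100 oflag=direct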
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'd look through the mail archives for a topic called "Shard in
>>>>>>>>>>>>>> Production", I think. The shard portion may not be relevant, but it
>>>>>>>>>>>>>> does discuss certain settings that had to be applied with regard to
>>>>>>>>>>>>>> avoiding corruption with VMs. You may want to try disabling
>>>>>>>>>>>>>> performance.readdir-ahead as well.
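>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Something like this, if you want to try it (the volume name "v" is
>>>>>>>>>>>>>> only an example):
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> gluster volume set v performance.readdir-ahead off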
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 03/12/2016 07:02 PM, David Gossage wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sat, Mar 12, 2016 at 9:51 AM, Mahdi Adnan
>>>>>>>>>>>>>>> <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks David,
>>>>>>>>>>>>>>>> My settings are all defaults; I just created the pool and started it.
>>>>>>>>>>>>>>>> I have applied the settings you recommended and it seems to be the
>>>>>>>>>>>>>>>> same issue:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Type: Striped-Replicate
>>>>>>>>>>>>>>>> Volume ID: 44adfd8c-2ed1-4aa5-b256-d12b64f7fc14
>>>>>>>>>>>>>>>> Status: Started
>>>>>>>>>>>>>>>> Number of Bricks: 1 x 2 x 2 = 4
>>>>>>>>>>>>>>>> Transport-type: tcp
>>>>>>>>>>>>>>>> Bricks:
>>>>>>>>>>>>>>>> Brick1: gfs001:/bricks/t1/s
>>>>>>>>>>>>>>>> Brick2: gfs002:/bricks/t1/s
>>>>>>>>>>>>>>>> Brick3: gfs001:/bricks/t2/s
>>>>>>>>>>>>>>>> Brick4: gfs002:/bricks/t2/s
>>>>>>>>>>>>>>>> Options Reconfigured:
>>>>>>>>>>>>>>>> performance.stat-prefetch: off
>>>>>>>>>>>>>>>> network.remote-dio: on
>>>>>>>>>>>>>>>> cluster.eager-lock: enable
>>>>>>>>>>>>>>>> performance.io-cache: off
>>>>>>>>>>>>>>>> performance.read-ahead: off
>>>>>>>>>>>>>>>> performance.quick-read: off
>>>>>>>>>>>>>>>> performance.readdir-ahead: on
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Is there a RAID controller perhaps doing any caching?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In the gluster logs, are any errors being reported during the
>>>>>>>>>>>>>>> migration process?
>>>>>>>>>>>>>>> Since they aren't in use yet, have you tested making just mirrored
>>>>>>>>>>>>>>> bricks using different pairings of servers, two at a time, to see if
>>>>>>>>>>>>>>> the problem follows a certain machine or network ports?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 03/12/2016 03:25 PM, David Gossage wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sat, Mar 12, 2016 at 1:55 AM, Mahdi Adnan
>>>>>>>>>>>>>>>> <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Dears,
>>>>>>>>>>>>>>>>> I have created a replicated striped volume with two bricks and two
>>>>>>>>>>>>>>>>> servers, but I can't use it because when I mount it in ESXi and try
>>>>>>>>>>>>>>>>> to migrate a VM to it, the data gets corrupted.
>>>>>>>>>>>>>>>>> Does anyone have any idea why this is happening?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Dell 2950 x2
>>>>>>>>>>>>>>>>> Seagate 15k 600GB
>>>>>>>>>>>>>>>>> CentOS 7.2
>>>>>>>>>>>>>>>>> Gluster 3.7.8
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Appreciate your help.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Most reports of this I have seen end up being settings related. Post
>>>>>>>>>>>>>>>> your gluster volume info. Below are what I have seen as the most
>>>>>>>>>>>>>>>> commonly recommended settings (a command sketch follows the list).
>>>>>>>>>>>>>>>> I'd hazard a guess you may have some of the read-ahead cache or
>>>>>>>>>>>>>>>> prefetch options on.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> quick-read=off
>>>>>>>>>>>>>>>> read-ahead=off
>>>>>>>>>>>>>>>> io-cache=off
>>>>>>>>>>>>>>>> stat-prefetch=off
>>>>>>>>>>>>>>>> eager-lock=enable
>>>>>>>>>>>>>>>> remote-dio=on
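>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Applied with something like the following (the volume name "v" is
>>>>>>>>>>>>>>>> only an example):
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> gluster volume set v performance.quick-read off
>>>>>>>>>>>>>>>> gluster volume set v performance.read-ahead off
>>>>>>>>>>>>>>>> gluster volume set v performance.io-cache off
>>>>>>>>>>>>>>>> gluster volume set v performance.stat-prefetch off
>>>>>>>>>>>>>>>> gluster volume set v cluster.eager-lock enable
>>>>>>>>>>>>>>>> gluster volume set v network.remote-dio on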
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Mahdi Adnan
>>>>>>>>>>>>>>>>> System Admin
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users