[Gluster-users] Replicated striped data loss
Mahdi Adnan
mahdi.adnan at earthlinktele.com
Tue Mar 15 12:06:25 UTC 2016
[2016-03-15 14:12:01.421615] I [MSGID: 109036]
[dht-common.c:8043:dht_log_new_layout_for_dir_selfheal] 0-v-dht: Setting
layout of /New Virtual Machine_2 with [Subvol_name: v-replicate-0, Err:
-1 , Start: 0 , Stop: 1431655764 , Hash: 1 ], [Subvol_name:
v-replicate-1, Err: -1 , Start: 1431655765 , Stop: 2863311529 , Hash: 1
], [Subvol_name: v-replicate-2, Err: -1 , Start: 2863311530 , Stop:
4294967295 , Hash: 1 ],
[2016-03-15 14:12:02.001167] I [MSGID: 109066]
[dht-rename.c:1413:dht_rename] 0-v-dht: renaming /New Virtual
Machine_2/New Virtual Machine.vmdk~
(hash=v-replicate-2/cache=v-replicate-2) => /New Virtual Machine_2/New
Virtual Machine.vmdk (hash=v-replicate-2/cache=v-replicate-2)
[2016-03-15 14:12:02.248164] W [MSGID: 112032]
[nfs3.c:3622:nfs3svc_rmdir_cbk] 0-nfs: 3fed7d9f: /New Virtual Machine_2
=> -1 (Directory not empty) [Directory not empty]
[2016-03-15 14:12:02.259015] W [MSGID: 112032]
[nfs3.c:3622:nfs3svc_rmdir_cbk] 0-nfs: 3fed7da3: /New Virtual Machine_2
=> -1 (Directory not empty) [Directory not empty]
Respectfully,
Mahdi A. Mahdi
On 03/15/2016 03:03 PM, Krutika Dhananjay wrote:
> Hmm ok. Could you share the nfs.log content?
>
> -Krutika
>
> On Tue, Mar 15, 2016 at 1:45 PM, Mahdi Adnan
> <mahdi.adnan at earthlinktele.com> wrote:
>
> Okay, here's what I did:
>
> Volume Name: v
> Type: Distributed-Replicate
> Volume ID: b348fd8e-b117-469d-bcc0-56a56bdfc930
> Status: Started
> Number of Bricks: 3 x 2 = 6
> Transport-type: tcp
> Bricks:
> Brick1: gfs001:/bricks/b001/v
> Brick2: gfs001:/bricks/b002/v
> Brick3: gfs001:/bricks/b003/v
> Brick4: gfs002:/bricks/b004/v
> Brick5: gfs002:/bricks/b005/v
> Brick6: gfs002:/bricks/b006/v
> Options Reconfigured:
> features.shard-block-size: 128MB
> features.shard: enable
> cluster.server-quorum-type: server
> cluster.quorum-type: auto
> network.remote-dio: enable
> cluster.eager-lock: enable
> performance.stat-prefetch: off
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> performance.readdir-ahead: on
>
>
> Same error,
> and mounting it using the GlusterFS client still works just fine.
>
> Respectfully,
> Mahdi A. Mahdi
>
>
> On 03/15/2016 11:04 AM, Krutika Dhananjay wrote:
>> OK but what if you use it with replication? Do you still see the
>> error? I think not.
>> Could you give it a try and tell me what you find?
>>
>> -Krutika
>>
>> On Tue, Mar 15, 2016 at 1:23 PM, Mahdi Adnan
>> <mahdi.adnan at earthlinktele.com> wrote:
>>
>> Hi,
>>
>> I have created the following volume;
>>
>> Volume Name: v
>> Type: Distribute
>> Volume ID: 90de6430-7f83-4eda-a98f-ad1fabcf1043
>> Status: Started
>> Number of Bricks: 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: gfs001:/bricks/b001/v
>> Brick2: gfs001:/bricks/b002/v
>> Brick3: gfs001:/bricks/b003/v
>> Options Reconfigured:
>> features.shard-block-size: 128MB
>> features.shard: enable
>> cluster.server-quorum-type: server
>> cluster.quorum-type: auto
>> network.remote-dio: enable
>> cluster.eager-lock: enable
>> performance.stat-prefetch: off
>> performance.io-cache: off
>> performance.read-ahead: off
>> performance.quick-read: off
>> performance.readdir-ahead: on
>>
>> After mounting it in ESXi and trying to clone a VM to it,
>> I got the same error.
>>
>>
>> Respectfully,
>> Mahdi A. Mahdi
>>
>>
>> On 03/15/2016 10:44 AM, Krutika Dhananjay wrote:
>>> Hi,
>>>
>>> Do not use sharding and stripe together in the same volume,
>>> because:
>>> a) It is not recommended and there is no point in using
>>> both. Using sharding alone on your volume should work fine.
>>> b) That combination has not been tested.
>>> c) Like Niels said, the stripe feature is virtually deprecated.
>>>
>>> I would suggest that you create an n x 3 volume, where n is the
>>> number of distribute subvolumes you prefer, enable the group virt
>>> options on it, enable sharding on it,
>>> set the shard-block-size that you feel is appropriate, and then
>>> just start off with VM image creation etc.
>>> If you run into any issues even after you do this, let us
>>> know and we'll help you out.
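>>>
>>> A minimal sketch of those steps with the gluster CLI (the hostnames,
>>> brick paths, block size and <VOLNAME> below are placeholders, not taken
>>> from this thread; adjust them to your setup):
>>>
>>> # gluster volume create <VOLNAME> replica 3 \
>>>       server1:/bricks/b1/<VOLNAME> server2:/bricks/b1/<VOLNAME> server3:/bricks/b1/<VOLNAME>
>>> # gluster volume set <VOLNAME> group virt
>>> # gluster volume set <VOLNAME> features.shard on
>>> # gluster volume set <VOLNAME> features.shard-block-size 128MB
>>> # gluster volume start <VOLNAME>
>>>
>>> Adding further triplets of bricks to the create command gives you more
>>> distribute subvolumes (a larger n).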
>>>
>>> -Krutika
>>>
>>> On Tue, Mar 15, 2016 at 1:07 PM, Mahdi Adnan
>>> <mahdi.adnan at earthlinktele.com> wrote:
>>>
>>> Thanks Krutika,
>>>
>>> I have deleted the volume and created a new one.
>>> It looks like it may be an issue with NFS itself: I
>>> created a new striped volume, enabled sharding,
>>> and mounted it via GlusterFS, and it worked just fine; if
>>> I mount it with NFS, it fails and gives me the same
>>> errors.
>>>
>>> Respectfully,
>>> Mahdi A. Mahdi
>>>
>>> On 03/15/2016 06:24 AM, Krutika Dhananjay wrote:
>>>> Hi,
>>>>
>>>> So could you share the xattrs associated with the file
>>>> at
>>>> <BRICK_PATH>/.glusterfs/c3/e8/c3e88cc1-7e0a-4d46-9685-2d12131a5e1c
>>>>
>>>> Here's what you need to execute:
>>>>
>>>> # getfattr -d -m . -e hex
>>>> /mnt/b1/v/.glusterfs/c3/e8/c3e88cc1-7e0a-4d46-9685-2d12131a5e1c
>>>> on the first node and
>>>>
>>>> # getfattr -d -m . -e hex
>>>> /mnt/b2/v/.glusterfs/c3/e8/c3e88cc1-7e0a-4d46-9685-2d12131a5e1c
>>>> on the second.
>>>>
>>>>
>>>> Also, it is normally advised to use a replica 3 volume
>>>> as opposed to a replica 2 volume to guard against
>>>> split-brains.
>>>>
>>>> -Krutika
>>>>
>>>> On Mon, Mar 14, 2016 at 3:17 PM, Mahdi Adnan
>>>> <mahdi.adnan at earthlinktele.com> wrote:
>>>>
>>>> Sorry for the serial posting, but I got new logs that
>>>> might help.
>>>>
>>>> These messages appear during the migration:
>>>>
>>>> /var/log/glusterfs/nfs.log
>>>>
>>>>
>>>> [2016-03-14 09:45:04.573765] I [MSGID: 109036]
>>>> [dht-common.c:8043:dht_log_new_layout_for_dir_selfheal]
>>>> 0-testv-dht: Setting layout of /New Virtual
>>>> Machine_1 with [Subvol_name: testv-stripe-0, Err:
>>>> -1 , Start: 0 , Stop: 4294967295 , Hash: 1 ],
>>>> [2016-03-14 09:45:04.957499] E
>>>> [shard.c:369:shard_modify_size_and_block_count]
>>>> (-->/usr/lib64/glusterfs/3.7.8/xlator/cluster/distribute.so(dht_file_setattr_cbk+0x14f)
>>>> [0x7f27a13c067f]
>>>> -->/usr/lib64/glusterfs/3.7.8/xlator/features/shard.so(shard_common_setattr_cbk+0xcc)
>>>> [0x7f27a116681c]
>>>> -->/usr/lib64/glusterfs/3.7.8/xlator/features/shard.so(shard_modify_size_and_block_count+0xdd)
>>>> [0x7f27a116584d] ) 0-testv-shard: Failed to get
>>>> trusted.glusterfs.shard.file-size for
>>>> c3e88cc1-7e0a-4d46-9685-2d12131a5e1c
>>>> [2016-03-14 09:45:04.957577] W [MSGID: 112199]
>>>> [nfs3-helpers.c:3418:nfs3_log_common_res]
>>>> 0-nfs-nfsv3: /New Virtual Machine_1/New Virtual
>>>> Machine-flat.vmdk => (XID: 3fec5a26, SETATTR: NFS:
>>>> 22(Invalid argument for operation), POSIX:
>>>> 22(Invalid argument)) [Invalid argument]
>>>> [2016-03-14 09:45:05.079657] E [MSGID: 112069]
>>>> [nfs3.c:3649:nfs3_rmdir_resume] 0-nfs-nfsv3: No
>>>> such file or directory: (192.168.221.52:826) testv :
>>>> 00000000-0000-0000-0000-000000000001
>>>>
>>>>
>>>>
>>>> Respectfully,
>>>> Mahdi A. Mahdi
>>>> On 03/14/2016 11:14 AM, Mahdi Adnan wrote:
>>>>> So I have deployed a new server (a Cisco UCS C220M4)
>>>>> and created a new volume:
>>>>>
>>>>> Volume Name: testv
>>>>> Type: Stripe
>>>>> Volume ID: 55cdac79-fe87-4f1f-90c0-15c9100fe00b
>>>>> Status: Started
>>>>> Number of Bricks: 1 x 2 = 2
>>>>> Transport-type: tcp
>>>>> Bricks:
>>>>> Brick1: 10.70.0.250:/mnt/b1/v
>>>>> Brick2: 10.70.0.250:/mnt/b2/v
>>>>> Options Reconfigured:
>>>>> nfs.disable: off
>>>>> features.shard-block-size: 64MB
>>>>> features.shard: enable
>>>>> cluster.server-quorum-type: server
>>>>> cluster.quorum-type: auto
>>>>> network.remote-dio: enable
>>>>> cluster.eager-lock: enable
>>>>> performance.stat-prefetch: off
>>>>> performance.io-cache: off
>>>>> performance.read-ahead: off
>>>>> performance.quick-read: off
>>>>> performance.readdir-ahead: off
>>>>>
>>>>> Same error.
>>>>>
>>>>> Can anyone share the volume info of a working
>>>>> striped volume?
>>>>>
>>>>> On 03/14/2016 09:02 AM, Mahdi Adnan wrote:
>>>>>> I have a volume with two bricks on the same server:
>>>>>>
>>>>>> Volume Name: k
>>>>>> Type: Stripe
>>>>>> Volume ID: 1e9281ce-2a8b-44e8-a0c6-e3ebf7416b2b
>>>>>> Status: Started
>>>>>> Number of Bricks: 1 x 2 = 2
>>>>>> Transport-type: tcp
>>>>>> Bricks:
>>>>>> Brick1: gfs001:/bricks/t1/k
>>>>>> Brick2: gfs001:/bricks/t2/k
>>>>>> Options Reconfigured:
>>>>>> features.shard-block-size: 64MB
>>>>>> features.shard: on
>>>>>> cluster.server-quorum-type: server
>>>>>> cluster.quorum-type: auto
>>>>>> network.remote-dio: enable
>>>>>> cluster.eager-lock: enable
>>>>>> performance.stat-prefetch: off
>>>>>> performance.io-cache: off
>>>>>> performance.read-ahead: off
>>>>>> performance.quick-read: off
>>>>>> performance.readdir-ahead: off
>>>>>>
>>>>>> Same issue.
>>>>>> glusterfs 3.7.8 built on Mar 10 2016 20:20:45.
>>>>>>
>>>>>>
>>>>>> Respectfully,
>>>>>> Mahdi A. Mahdi
>>>>>>
>>>>>> Systems Administrator
>>>>>> IT Department
>>>>>> Earthlink Telecommunications
>>>>>>
>>>>>> Cell: 07903316180
>>>>>> Work: 3352
>>>>>> Skype: mahdi.adnan at outlook.com
>>>>>> On 03/14/2016 08:11 AM, Niels de Vos wrote:
>>>>>>> On Mon, Mar 14, 2016 at 08:12:27AM +0530, Krutika Dhananjay wrote:
>>>>>>>> It would be better to use sharding over stripe for your vm use case. It
>>>>>>>> offers better distribution and utilisation of bricks and better heal
>>>>>>>> performance.
>>>>>>>> And it is well tested.
>>>>>>> Basically the "striping" feature is deprecated, "sharding" is its
>>>>>>> improved replacement. I expect to see "striping" completely dropped in
>>>>>>> the next major release.
>>>>>>>
>>>>>>> Niels
>>>>>>>
>>>>>>>
>>>>>>>> Couple of things to note before you do that:
>>>>>>>> 1. Most of the bug fixes in sharding have gone into 3.7.8. So it is advised
>>>>>>>> that you use 3.7.8 or above.
>>>>>>>> 2. When you enable sharding on a volume, already existing files in the
>>>>>>>> volume do not get sharded. Only the files that are newly created from the
>>>>>>>> time sharding is enabled will.
>>>>>>>> If you do want to shard the existing files, then you would need to cp
>>>>>>>> them to a temp name within the volume, and then rename them back to the
>>>>>>>> original file name, along the lines of the sketch below.
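>>>>>>>>
>>>>>>>> A rough sketch of that cp-and-rename step, assuming a FUSE mount at
>>>>>>>> /mnt/v and an existing image called vm1-flat.vmdk (both are placeholder
>>>>>>>> names for illustration only):
>>>>>>>>
>>>>>>>> # cd /mnt/v
>>>>>>>> # cp vm1-flat.vmdk vm1-flat.vmdk.tmp   # copy made after sharding was enabled, so it gets sharded
>>>>>>>> # mv vm1-flat.vmdk.tmp vm1-flat.vmdk   # rename back over the original name
>>>>>>>>
>>>>>>>> The VM using the image should be powered off while the copy and rename are done.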
>>>>>>>>
>>>>>>>> HTH,
>>>>>>>> Krutika
>>>>>>>>
>>>>>>>> On Sun, Mar 13, 2016 at 11:49 PM, Mahdi Adnan
>>>>>>>> <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>> I couldn't find anything related to cache in the HBAs.
>>>>>>>>> Which logs are useful in my case? I see only the brick logs, and they
>>>>>>>>> contain nothing during the failure.
>>>>>>>>>
>>>>>>>>> ###
>>>>>>>>> [2016-03-13 18:05:19.728614] E [MSGID: 113022] [posix.c:1232:posix_mknod]
>>>>>>>>> 0-vmware-posix: mknod on
>>>>>>>>> /bricks/b003/vmware/.shard/17d75e20-16f1-405e-9fa5-99ee7b1bd7f1.511 failed
>>>>>>>>> [File exists]
>>>>>>>>> [2016-03-13 18:07:23.337086] E [MSGID: 113022] [posix.c:1232:posix_mknod]
>>>>>>>>> 0-vmware-posix: mknod on
>>>>>>>>> /bricks/b003/vmware/.shard/eef2d538-8eee-4e58-bc88-fbf7dc03b263.4095 failed
>>>>>>>>> [File exists]
>>>>>>>>> [2016-03-13 18:07:55.027600] W [trash.c:1922:trash_rmdir] 0-vmware-trash:
>>>>>>>>> rmdir issued on /.trashcan/, which is not permitted
>>>>>>>>> [2016-03-13 18:07:55.027635] I [MSGID: 115056]
>>>>>>>>> [server-rpc-fops.c:459:server_rmdir_cbk] 0-vmware-server: 41987: RMDIR
>>>>>>>>> /.trashcan/internal_op (00000000-0000-0000-0000-000000000005/internal_op)
>>>>>>>>> ==> (Operation not permitted) [Operation not permitted]
>>>>>>>>> [2016-03-13 18:11:34.353441] I [login.c:81:gf_auth] 0-auth/login: allowed
>>>>>>>>> user names: c0c72c37-477a-49a5-a305-3372c1c2f2b4
>>>>>>>>> [2016-03-13 18:11:34.353463] I [MSGID: 115029]
>>>>>>>>> [server-handshake.c:612:server_setvolume] 0-vmware-server: accepted client
>>>>>>>>> from gfs002-2727-2016/03/13-20:17:43:613597-vmware-client-4-0-0 (version:
>>>>>>>>> 3.7.8)
>>>>>>>>> [2016-03-13 18:11:34.591139] I [login.c:81:gf_auth] 0-auth/login: allowed
>>>>>>>>> user names: c0c72c37-477a-49a5-a305-3372c1c2f2b4
>>>>>>>>> [2016-03-13 18:11:34.591173] I [MSGID: 115029]
>>>>>>>>> [server-handshake.c:612:server_setvolume] 0-vmware-server: accepted client
>>>>>>>>> from gfs002-2719-2016/03/13-20:17:42:609388-vmware-client-4-0-0 (version:
>>>>>>>>> 3.7.8)
>>>>>>>>> ###
>>>>>>>>>
>>>>>>>>> ESXi just keeps telling me "Cannot clone T: The virtual disk is either
>>>>>>>>> corrupted or not a supported format." The task details are:
>>>>>>>>> error
>>>>>>>>> 3/13/2016 9:06:20 PM
>>>>>>>>> Clone virtual machine
>>>>>>>>> T
>>>>>>>>> VCENTER.LOCAL\Administrator
>>>>>>>>>
>>>>>>>>> My setup is two servers with a floating IP controlled by CTDB, and my ESXi
>>>>>>>>> server mounts the NFS export via the floating IP.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 03/13/2016 08:40 PM, pkoelle wrote:
>>>>>>>>>
>>>>>>>>>> On 13.03.2016 at 18:22, David Gossage wrote:
>>>>>>>>>>
>>>>>>>>>>> On Sun, Mar 13, 2016 at 11:07 AM, Mahdi Adnan
>>>>>>>>>>> <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> My HBAs are LSISAS1068E, and the filesystem is XFS.
>>>>>>>>>>>> I tried EXT4 and it did not help.
>>>>>>>>>>>> I have created a striped volume on one server with two bricks: same issue.
>>>>>>>>>>>> I also tried a replicated volume with just sharding enabled: same issue.
>>>>>>>>>>>> As soon as I disable sharding it works just fine, so neither sharding
>>>>>>>>>>>> nor striping works for me.
>>>>>>>>>>>> I did follow up on some threads in the mailing list and tried some of
>>>>>>>>>>>> the fixes that worked for others; none worked for me. :(
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>> Is it possible the LSI has write-cache enabled?
>>>>>>>>>>>
>>>>>>>>>> Why is that relevant? Even the backing filesystem has no idea if there is
>>>>>>>>>> a RAID or write cache or whatever. There are blocks and sync(), end of
>>>>>>>>>> story.
>>>>>>>>>> If you lose power and screw up your recovery OR do funky stuff with SAS
>>>>>>>>>> multipathing that might be an issue with a controller cache. AFAIK that's
>>>>>>>>>> not what we are talking about.
>>>>>>>>>>
>>>>>>>>>> I'm afraid that unless the OP has some logs from the server, a
>>>>>>>>>> reproducible test case, or a backtrace from client or server, this isn't
>>>>>>>>>> getting us anywhere.
>>>>>>>>>>
>>>>>>>>>> cheers
>>>>>>>>>> Paul
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> On 03/13/2016 06:54 PM, David Gossage wrote:
>>>>>>>>>>>> On Sun, Mar 13, 2016 at 8:16 AM, Mahdi Adnan
>>>>>>>>>>>> <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Okay, so I enabled sharding on my test volume and it did not help.
>>>>>>>>>>>>> Stupidly enough, I also enabled it on a production volume
>>>>>>>>>>>>> (Distributed-Replicate) and it corrupted half of my VMs.
>>>>>>>>>>>>> I have updated Gluster to the latest version and nothing seems to have
>>>>>>>>>>>>> changed in my situation.
>>>>>>>>>>>>> Below is the info of my volume:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>> I was pointing at the settings in that email as an example of settings
>>>>>>>>>>>> used to fix corruption. I wouldn't recommend enabling sharding if you
>>>>>>>>>>>> haven't gotten the base working yet on that cluster. What HBAs are you
>>>>>>>>>>>> using, and what is the filesystem layout for the bricks?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Number of Bricks: 3 x 2 = 6
>>>>>>>>>>>>> Transport-type: tcp
>>>>>>>>>>>>> Bricks:
>>>>>>>>>>>>> Brick1: gfs001:/bricks/b001/vmware
>>>>>>>>>>>>> Brick2: gfs002:/bricks/b004/vmware
>>>>>>>>>>>>> Brick3: gfs001:/bricks/b002/vmware
>>>>>>>>>>>>> Brick4: gfs002:/bricks/b005/vmware
>>>>>>>>>>>>> Brick5: gfs001:/bricks/b003/vmware
>>>>>>>>>>>>> Brick6: gfs002:/bricks/b006/vmware
>>>>>>>>>>>>> Options Reconfigured:
>>>>>>>>>>>>> performance.strict-write-ordering: on
>>>>>>>>>>>>> cluster.server-quorum-type: server
>>>>>>>>>>>>> cluster.quorum-type: auto
>>>>>>>>>>>>> network.remote-dio: enable
>>>>>>>>>>>>> performance.stat-prefetch: disable
>>>>>>>>>>>>> performance.io-cache: off
>>>>>>>>>>>>> performance.read-ahead: off
>>>>>>>>>>>>> performance.quick-read: off
>>>>>>>>>>>>> cluster.eager-lock: enable
>>>>>>>>>>>>> features.shard-block-size: 16MB
>>>>>>>>>>>>> features.shard: on
>>>>>>>>>>>>> performance.readdir-ahead: off
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 03/12/2016 08:11 PM, David Gossage wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sat, Mar 12, 2016 at 10:21 AM, Mahdi Adnan
>>>>>>>>>>>>> <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Both servers have HBAs, no RAID, and I can set up a replicated or
>>>>>>>>>>>>>> dispersed volume without any issues.
>>>>>>>>>>>>>> Logs are clean, and when I tried to migrate a VM and got the error,
>>>>>>>>>>>>>> nothing showed up in the logs.
>>>>>>>>>>>>>> I tried mounting the volume on my laptop and it mounted fine, but if I
>>>>>>>>>>>>>> use dd to create a data file it just hangs and I can't cancel it, and I
>>>>>>>>>>>>>> can't unmount it or anything; I just have to reboot.
>>>>>>>>>>>>>> The same servers have another volume on other bricks in a distributed
>>>>>>>>>>>>>> replica, and it works fine.
>>>>>>>>>>>>>> I have even tried the same setup in a virtual environment (created two
>>>>>>>>>>>>>> VMs, installed Gluster, and created a replicated striped volume) and got
>>>>>>>>>>>>>> the same thing: data corruption.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>> I'd look through the mail archives for a topic I think is called
>>>>>>>>>>>>> "Shard in Production". The shard portion may not be relevant, but it
>>>>>>>>>>>>> does discuss certain settings that had to be applied to avoid
>>>>>>>>>>>>> corruption with VMs. You may also want to try disabling
>>>>>>>>>>>>> performance.readdir-ahead.
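>>>>>>>>>>>>>
>>>>>>>>>>>>> For example, a one-line sketch of that last suggestion (<VOLNAME> is a
>>>>>>>>>>>>> placeholder for the affected volume):
>>>>>>>>>>>>>
>>>>>>>>>>>>> # gluster volume set <VOLNAME> performance.readdir-ahead off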
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 03/12/2016 07:02 PM, David Gossage wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sat, Mar 12, 2016 at 9:51 AM, Mahdi Adnan
>>>>>>>>>>>>>> <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks David,
>>>>>>>>>>>>>>> My settings were all defaults; I had just created the volume and
>>>>>>>>>>>>>>> started it.
>>>>>>>>>>>>>>> I have now applied the settings you recommended, and it seems to be
>>>>>>>>>>>>>>> the same issue:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Type: Striped-Replicate
>>>>>>>>>>>>>>> Volume ID: 44adfd8c-2ed1-4aa5-b256-d12b64f7fc14
>>>>>>>>>>>>>>> Status: Started
>>>>>>>>>>>>>>> Number of Bricks: 1 x 2 x 2 = 4
>>>>>>>>>>>>>>> Transport-type: tcp
>>>>>>>>>>>>>>> Bricks:
>>>>>>>>>>>>>>> Brick1: gfs001:/bricks/t1/s
>>>>>>>>>>>>>>> Brick2: gfs002:/bricks/t1/s
>>>>>>>>>>>>>>> Brick3: gfs001:/bricks/t2/s
>>>>>>>>>>>>>>> Brick4: gfs002:/bricks/t2/s
>>>>>>>>>>>>>>> Options Reconfigured:
>>>>>>>>>>>>>>> performance.stat-prefetch: off
>>>>>>>>>>>>>>> network.remote-dio: on
>>>>>>>>>>>>>>> cluster.eager-lock: enable
>>>>>>>>>>>>>>> performance.io-cache: off
>>>>>>>>>>>>>>> performance.read-ahead: off
>>>>>>>>>>>>>>> performance.quick-read: off
>>>>>>>>>>>>>>> performance.readdir-ahead: on
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Is there a RAID controller perhaps doing any caching?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Are any errors being reported in the Gluster logs during the migration
>>>>>>>>>>>>>> process?
>>>>>>>>>>>>>> Since they aren't in use yet, have you tested making just mirrored
>>>>>>>>>>>>>> bricks using different pairings of servers, two at a time, to see if
>>>>>>>>>>>>>> the problem follows a certain machine or its network ports?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 03/12/2016 03:25 PM, David Gossage wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sat, Mar 12, 2016 at 1:55 AM, Mahdi Adnan
>>>>>>>>>>>>>>> <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Dears,
>>>>>>>>>>>>>>>> I have created a replicated striped volume with two bricks and two
>>>>>>>>>>>>>>>> servers, but I can't use it: when I mount it in ESXi and try to
>>>>>>>>>>>>>>>> migrate a VM to it, the data gets corrupted.
>>>>>>>>>>>>>>>> Does anyone have any idea why this is happening?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Dell 2950 x2
>>>>>>>>>>>>>>>> Seagate 15k 600GB
>>>>>>>>>>>>>>>> CentOS 7.2
>>>>>>>>>>>>>>>> Gluster 3.7.8
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Appreciate your help.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Most reports of this that I have seen end up being settings related.
>>>>>>>>>>>>>>> Post your gluster volume info. Below are what I have seen as the most
>>>>>>>>>>>>>>> commonly recommended settings.
>>>>>>>>>>>>>>> I'd hazard a guess that you may have the read-ahead cache or prefetch
>>>>>>>>>>>>>>> turned on.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> quick-read=off
>>>>>>>>>>>>>>> read-ahead=off
>>>>>>>>>>>>>>> io-cache=off
>>>>>>>>>>>>>>> stat-prefetch=off
>>>>>>>>>>>>>>> eager-lock=enable
>>>>>>>>>>>>>>> remote-dio=on
>>>>>>>>>>>>>>>
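>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> A minimal sketch of applying those with the gluster CLI (<VOLNAME> is
>>>>>>>>>>>>>>> a placeholder for your volume name):
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> # gluster volume set <VOLNAME> performance.quick-read off
>>>>>>>>>>>>>>> # gluster volume set <VOLNAME> performance.read-ahead off
>>>>>>>>>>>>>>> # gluster volume set <VOLNAME> performance.io-cache off
>>>>>>>>>>>>>>> # gluster volume set <VOLNAME> performance.stat-prefetch off
>>>>>>>>>>>>>>> # gluster volume set <VOLNAME> cluster.eager-lock enable
>>>>>>>>>>>>>>> # gluster volume set <VOLNAME> network.remote-dio on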
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Mahdi Adnan
>>>>>>>>>>>>>>>> System Admin
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>