[Gluster-users] Gluster 3.8.10 rebalance VMs corruption

Krutika Dhananjay kdhananj at redhat.com
Tue Mar 21 12:02:55 UTC 2017


Hi,

So it looks like Satheesaran managed to recreate this issue. We will be
seeking his help in debugging this. It will be easier that way.

-Krutika

On Tue, Mar 21, 2017 at 1:35 PM, Mahdi Adnan <mahdi.adnan at outlook.com>
wrote:

> Hello and thank you for your email.
> Actually no, I didn't check the GFIDs of the VMs.
> If it helps, I can set up a new test cluster and get all the data you
> need.
>
>
> From: Nithya Balachandran
> Sent: Monday, March 20, 20:57
> Subject: Re: [Gluster-users] Gluster 3.8.10 rebalance VMs corruption
> To: Krutika Dhananjay
> Cc: Mahdi Adnan, Gowdappa, Raghavendra, Susant Palai,
> gluster-users at gluster.org List
>
> Hi,
>
> Do you know the GFIDs of the VM images which were corrupted?
>
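> If needed, the GFID of a file can be read off the brick backend with
> something like the following (the file path here is only illustrative;
> the brick path is taken from your volume info):
>
>     getfattr -n trusted.gfid -e hex /mnt/disk1/vmware2/<path-to-vm-image>
>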
> Regards,
>
> Nithya
>
> On 20 March 2017 at 20:37, Krutika Dhananjay <kdhananj at redhat.com> wrote:
>
> I looked at the logs.
>
> From the point the new graph is switched to (after the add-brick command
> you shared, which adds bricks 41 through 44), i.e. line 3011 onwards in
> nfs-gfapi.log, I see the following kinds of errors:
>
> 1. Lookups on a bunch of files failed with ENOENT on both replicas, which
> protocol/client converts to ESTALE. I am guessing these entries got
> migrated to other subvolumes, leading to the 'No such file or directory'
> errors.
>
> DHT, and thereafter shard, get the same error code and log the following:
>
> [2017-03-17 14:04:26.353444] E [MSGID: 109040]
> [dht-helper.c:1198:dht_migration_complete_check_task] 17-vmware2-dht:
> <gfid:a68ce411-e381-46a3-93cd-d2af6a7c3532>: failed to lookup the file
> on vmware2-dht [Stale file handle]
>
> [2017-03-17 14:04:26.353528] E [MSGID: 133014]
> [shard.c:1253:shard_common_stat_cbk] 17-vmware2-shard: stat failed:
> a68ce411-e381-46a3-93cd-d2af6a7c3532 [Stale file handle]
>
> which is fine.
>
> 2. The other kind is AFR logging a possible split-brain, which I suppose
> is harmless too:
>
> [2017-03-17 14:23:36.968883] W [MSGID: 108008]
> [afr-read-txn.c:228:afr_read_txn] 17-vmware2-replicate-13: Unreadable
> subvolume -1 found with event generation 2 for gfid
> 74d49288-8452-40d4-893e-ff4672557ff9. (Possible split-brain)
>
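> Whether any file is actually in split-brain can be confirmed with
> something like:
>
>     gluster volume heal vmware2 info split-brain
>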
> Since you are saying the bug is hit only on VMs that were undergoing IO
> while the rebalance was running (as opposed to those that remained
> powered off), rebalance + IO could be causing some issue.
>
> CC'ing DHT devs
>
> Raghavendra/Nithya/Susant,
>
> Could you take a look?
>
> -Krutika
>
>
> On Sun, Mar 19, 2017 at 4:55 PM, Mahdi Adnan <mahdi.adnan at outlook.com>
> wrote:
>
> Thank you for your email, mate.
>
> Yes, I'm aware of this, but to save costs I chose replica 2; this
> cluster is all flash.
>
> In version 3.7.x I had issues with the ping timeout: if one host went
> down for a few seconds, the whole cluster would hang and become
> unavailable, so to avoid this I adjusted the ping timeout to 5 seconds.
>
> As for choosing Ganesha over gfapi: VMware does not support Gluster
> (FUSE or gfapi), so I'm stuck with NFS for this volume.
>
> The other volume is mounted using gfapi in the oVirt cluster.
>
>
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
> *From:* Krutika Dhananjay <kdhananj at redhat.com>
> *Sent:* Sunday, March 19, 2017 2:01:49 PM
>
> *To:* Mahdi Adnan
> *Cc:* gluster-users at gluster.org
> *Subject:* Re: [Gluster-users] Gluster 3.8.10 rebalance VMs corruption
>
>
>
> While I'm still going through the logs, just wanted to point out a couple
> of things:
>
> 1. It is recommended that you use 3-way replication (replica count 3)
> for the VM store use case.
>
> 2. network.ping-timeout at 5 seconds is way too low. Please change it to
> 30 (see the example below).
>
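> For example, the timeout can be raised with something like:
>
>     gluster volume set vmware2 network.ping-timeout 30
>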
> Is there any specific reason for using NFS-Ganesha over gfapi/FUSE?
>
> Will get back with anything else I might find or more questions if I have
> any.
>
> -Krutika
>
> On Sun, Mar 19, 2017 at 2:36 PM, Mahdi Adnan <mahdi.adnan at outlook.com>
> wrote:
>
> Thanks mate,
>
> Kindly check the attachment.
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
> *From:* Krutika Dhananjay <kdhananj at redhat.com>
> *Sent:* Sunday, March 19, 2017 10:00:22 AM
>
> *To:* Mahdi Adnan
> *Cc:* gluster-users at gluster.org
> *Subject:* Re: [Gluster-users] Gluster 3.8.10 rebalance VMs corruption
>
>
>
> In that case could you share the ganesha-gfapi logs?
>
> -Krutika
>
> On Sun, Mar 19, 2017 at 12:13 PM, Mahdi Adnan <mahdi.adnan at outlook.com>
> wrote:
>
> I have two volumes: one is mounted using libgfapi for the oVirt mount,
> and the other is exported via NFS-Ganesha for VMware, which is the one
> I'm testing now.
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
> *From:* Krutika Dhananjay <kdhananj at redhat.com>
> *Sent:* Sunday, March 19, 2017 8:02:19 AM
>
> *To:* Mahdi Adnan
> *Cc:* gluster-users at gluster.org
> *Subject:* Re: [Gluster-users] Gluster 3.8.10 rebalance VMs corruption
>
>
>
> On Sat, Mar 18, 2017 at 10:36 PM, Mahdi Adnan <mahdi.adnan at outlook.com>
> wrote:
>
> Kindly check the attached new log file. I don't know if it's helpful or
> not, but I couldn't find the log with the name you just described.
>
>
> No. Are you using FUSE or libgfapi for accessing the volume? Or is it NFS?
>
>
>
> -Krutika
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
> *From:* Krutika Dhananjay <kdhananj at redhat.com>
> *Sent:* Saturday, March 18, 2017 6:10:40 PM
>
> *To:* Mahdi Adnan
> *Cc:* gluster-users at gluster.org
> *Subject:* Re: [Gluster-users] Gluster 3.8.10 rebalance VMs corruption
>
>
>
> mnt-disk11-vmware2.log seems like a brick log. Could you attach the FUSE
> mount logs? They should be right under the /var/log/glusterfs/ directory,
> named after the mount point, only hyphenated (see the example below).
>
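> For example, for a volume FUSE-mounted at /mnt/vmware2 (an illustrative
> mount point), the corresponding log would be:
>
>     /var/log/glusterfs/mnt-vmware2.log
>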
> -Krutika
>
> On Sat, Mar 18, 2017 at 7:27 PM, Mahdi Adnan <mahdi.adnan at outlook.com>
> wrote:
>
> Hello Krutika,
>
> Kindly check the attached logs.
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
> *From:* Krutika Dhananjay <kdhananj at redhat.com>
>
> *Sent:* Saturday, March 18, 2017 3:29:03 PM
> *To:* Mahdi Adnan
> *Cc:* gluster-users at gluster.org
> *Subject:* Re: [Gluster-users] Gluster 3.8.10 rebalance VMs corruption
>
>
>
> Hi Mahdi,
>
> Could you attach mount, brick and rebalance logs?
>
> -Krutika
>
> On Sat, Mar 18, 2017 at 12:14 AM, Mahdi Adnan <mahdi.adnan at outlook.com>
> wrote:
>
> Hi,
>
> I upgraded to Gluster 3.8.10 today and ran the add-brick procedure on a
> volume containing a few VMs.
>
> After the rebalance completed, I rebooted the VMs; some ran just fine,
> and others just crashed.
>
> Windows boots into recovery mode, and Linux throws XFS errors and does
> not boot.
>
> I ran the test again and it happened just like the first time, but I
> noticed that only VMs doing disk IO are affected by this bug.
>
> The VMs that were powered off started fine, and even the md5 of the disk
> file did not change after the rebalance.
>
> Can anyone else confirm this?
>
> Volume info:
>
>
> Volume Name: vmware2
> Type: Distributed-Replicate
> Volume ID: 02328d46-a285-4533-aa3a-fb9bfeb688bf
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 22 x 2 = 44
> Transport-type: tcp
> Bricks:
> Brick1: gluster01:/mnt/disk1/vmware2
> Brick2: gluster03:/mnt/disk1/vmware2
> Brick3: gluster02:/mnt/disk1/vmware2
> Brick4: gluster04:/mnt/disk1/vmware2
> Brick5: gluster01:/mnt/disk2/vmware2
> Brick6: gluster03:/mnt/disk2/vmware2
> Brick7: gluster02:/mnt/disk2/vmware2
> Brick8: gluster04:/mnt/disk2/vmware2
> Brick9: gluster01:/mnt/disk3/vmware2
> Brick10: gluster03:/mnt/disk3/vmware2
> Brick11: gluster02:/mnt/disk3/vmware2
> Brick12: gluster04:/mnt/disk3/vmware2
> Brick13: gluster01:/mnt/disk4/vmware2
> Brick14: gluster03:/mnt/disk4/vmware2
> Brick15: gluster02:/mnt/disk4/vmware2
> Brick16: gluster04:/mnt/disk4/vmware2
> Brick17: gluster01:/mnt/disk5/vmware2
> Brick18: gluster03:/mnt/disk5/vmware2
> Brick19: gluster02:/mnt/disk5/vmware2
> Brick20: gluster04:/mnt/disk5/vmware2
> Brick21: gluster01:/mnt/disk6/vmware2
> Brick22: gluster03:/mnt/disk6/vmware2
> Brick23: gluster02:/mnt/disk6/vmware2
> Brick24: gluster04:/mnt/disk6/vmware2
> Brick25: gluster01:/mnt/disk7/vmware2
> Brick26: gluster03:/mnt/disk7/vmware2
> Brick27: gluster02:/mnt/disk7/vmware2
> Brick28: gluster04:/mnt/disk7/vmware2
> Brick29: gluster01:/mnt/disk8/vmware2
> Brick30: gluster03:/mnt/disk8/vmware2
> Brick31: gluster02:/mnt/disk8/vmware2
> Brick32: gluster04:/mnt/disk8/vmware2
> Brick33: gluster01:/mnt/disk9/vmware2
> Brick34: gluster03:/mnt/disk9/vmware2
> Brick35: gluster02:/mnt/disk9/vmware2
> Brick36: gluster04:/mnt/disk9/vmware2
> Brick37: gluster01:/mnt/disk10/vmware2
> Brick38: gluster03:/mnt/disk10/vmware2
> Brick39: gluster02:/mnt/disk10/vmware2
> Brick40: gluster04:/mnt/disk10/vmware2
> Brick41: gluster01:/mnt/disk11/vmware2
> Brick42: gluster03:/mnt/disk11/vmware2
> Brick43: gluster02:/mnt/disk11/vmware2
> Brick44: gluster04:/mnt/disk11/vmware2
> Options Reconfigured:
> cluster.server-quorum-type: server
> nfs.disable: on
> performance.readdir-ahead: on
> transport.address-family: inet
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: enable
> features.shard: on
> cluster.data-self-heal-algorithm: full
> features.cache-invalidation: on
> ganesha.enable: on
> features.shard-block-size: 256MB
> client.event-threads: 2
> server.event-threads: 2
> cluster.favorite-child-policy: size
> storage.build-pgfid: off
> network.ping-timeout: 5
> cluster.enable-shared-storage: enable
> nfs-ganesha: enable
> cluster.server-quorum-ratio: 51%
>
> Adding bricks:
>
> gluster volume add-brick vmware2 replica 2 gluster01:/mnt/disk11/vmware2
> gluster03:/mnt/disk11/vmware2 gluster02:/mnt/disk11/vmware2
> gluster04:/mnt/disk11/vmware2
>
> Starting fix-layout:
>
> gluster volume rebalance vmware2 fix-layout start
>
> Starting rebalance:
>
> gluster volume rebalance vmware2 start
>
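> For reference, rebalance progress and per-node file counts can be
> checked with:
>
>     gluster volume rebalance vmware2 status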
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users