[Gluster-users] Rebalance + VM corruption - current status and request for feedback

Krutika Dhananjay kdhananj at redhat.com
Mon May 29 06:20:29 UTC 2017


Hi,

I took a look at your logs.
This very much looks like an issue caused by a version mismatch between the
glusterfs client and server packages.
Your client (mount) is still running 3.7.20, as confirmed by the following
log messages:

[2017-05-26 08:58:23.647458] I [MSGID: 100030] [glusterfsd.c:2338:main]
0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.7.20
(args: /usr/sbin/glusterfs --volfile-server=s1 --volfile-server=s2
--volfile-server=s3 --volfile-server=s4 --volfile-id=/testvol
/rhev/data-center/mnt/glusterSD/s1:_testvol)
[2017-05-26 08:58:40.901204] I [MSGID: 100030] [glusterfsd.c:2338:main]
0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.7.20
(args: /usr/sbin/glusterfs --volfile-server=s1 --volfile-server=s2
--volfile-server=s3 --volfile-server=s4 --volfile-id=/testvol
/rhev/data-center/mnt/glusterSD/s1:_testvol)
[2017-05-26 08:58:48.923452] I [MSGID: 100030] [glusterfsd.c:2338:main]
0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.7.20
(args: /usr/sbin/glusterfs --volfile-server=s1 --volfile-server=s2
--volfile-server=s3 --volfile-server=s4 --volfile-id=/testvol
/rhev/data-center/mnt/glusterSD/s1:_testvol)

whereas the servers have rightly been upgraded to 3.10.2, as seen in the
rebalance log:

[2017-05-26 09:36:36.075940] I [MSGID: 100030] [glusterfsd.c:2475:main]
0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.10.2
(args: /usr/sbin/glusterfs -s localhost --volfile-id rebalance/testvol
--xlator-option *dht.use-readdirp=yes --xlator-option
*dht.lookup-unhashed=yes --xlator-option *dht.assert-no-child-down=yes
--xlator-option *replicate*.data-self-heal=off --xlator-option
*replicate*.metadata-self-heal=off --xlator-option
*replicate*.entry-self-heal=off --xlator-option *dht.readdir-optimize=on
--xlator-option *dht.rebalance-cmd=5 --xlator-option
*dht.node-uuid=7c0bf49e-1ede-47b1-b9a5-bfde6e60f07b --xlator-option
*dht.commit-hash=3376396580 --socket-file
/var/run/gluster/gluster-rebalance-801faefa-a583-46b4-8eef-e0ec160da9ea.sock
--pid-file
/var/lib/glusterd/vols/testvol/rebalance/7c0bf49e-1ede-47b1-b9a5-bfde6e60f07b.pid
-l /var/log/glusterfs/testvol-rebalance.log)


Could you upgrade all packages to 3.10.2 and try again?
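
For example, one quick way to confirm the installed version on the client
and on each server (your nodes are CentOS, so assuming RPM-based installs):

# rpm -qa | grep glusterfs
# glusterfs --version

Both should report 3.10.2 on every machine once the upgrade is complete.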

-Krutika


On Fri, May 26, 2017 at 4:46 PM, Mahdi Adnan <mahdi.adnan at outlook.com>
wrote:

> Hi,
>
>
> Attached are the logs for both the rebalance and the mount.
>
>
>
> --
>
> Respectfully
> *Mahdi A. Mahdi*
>
> ------------------------------
> *From:* Krutika Dhananjay <kdhananj at redhat.com>
> *Sent:* Friday, May 26, 2017 1:12:28 PM
> *To:* Mahdi Adnan
> *Cc:* gluster-user; Gandalf Corvotempesta; Lindsay Mathieson; Kevin
> Lemonnier
> *Subject:* Re: Rebalance + VM corruption - current status and request for
> feedback
>
> Could you provide the rebalance and mount logs?
>
> -Krutika
>
> On Fri, May 26, 2017 at 3:17 PM, Mahdi Adnan <mahdi.adnan at outlook.com>
> wrote:
>
>> Good morning,
>>
>>
>> So I have tested the new Gluster 3.10.2, and after starting the rebalance,
>> two VMs were paused due to a storage error and a third one was not
>> responding.
>>
>> After the rebalance completed, I started the VMs, but they did not boot
>> and threw an XFS wrong-inode error on the screen.
>>
>>
>> My setup:
>>
>> 4 nodes running CentOS 7.3 with Gluster 3.10.2
>>
>> 4 bricks in a distributed-replicate volume with the virt group applied.
>>
>> I added the volume to oVirt and created three VMs, then ran a loop to
>> create a 5GB file inside each VM.
>>
>> Added new 4 bricks to the existing nodes.
>>
>> Started the rebalance with force to bypass the warning message (a rough
>> sketch of the commands follows below).
>>
>> VMs started to fail after rebalancing.
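>>
>> A rough sketch of those steps, in case it helps anyone reproduce this
>> (the brick paths here are examples, not the exact ones I used, and I ran
>> them on one of the gluster nodes):
>>
>> # gluster volume set testvol group virt
>> # gluster volume add-brick testvol s1:/bricks/b2 s2:/bricks/b2 s3:/bricks/b2 s4:/bricks/b2
>> # gluster volume rebalance testvol start force
>>
>> The write loop inside each VM was along these lines:
>>
>> # while true; do dd if=/dev/zero of=/root/bigfile bs=1M count=5120; done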
>>
>>
>>
>>
>> --
>>
>> Respectfully
>> *Mahdi A. Mahdi*
>>
>> ------------------------------
>> *From:* Krutika Dhananjay <kdhananj at redhat.com>
>> *Sent:* Wednesday, May 17, 2017 6:59:20 AM
>> *To:* gluster-user
>> *Cc:* Gandalf Corvotempesta; Lindsay Mathieson; Kevin Lemonnier; Mahdi
>> Adnan
>> *Subject:* Rebalance + VM corruption - current status and request for
>> feedback
>>
>> Hi,
>>
>> In the past couple of weeks, we've sent the following fixes concerning VM
>> corruption upon doing rebalance:
>> https://review.gluster.org/#/q/status:merged+project:glusterfs+branch:master+topic:bug-1440051
>>
>> These fixes are very much part of the latest 3.10.2 release.
>>
>> Satheesaran at Red Hat has also verified that the fixes work, and he's no
>> longer seeing corruption issues.
>>
>> I'd like to hear feedback on these fixes from the users themselves (in
>> your test environments to begin with) before changing the status of the
>> bug to CLOSED.
>>
>> Although 3.10.2 has a patch that prevents rebalance sub-commands from
>> being executed on sharded volumes, you can override the check by using the
>> 'force' option.
>>
>> For example,
>>
>> # gluster volume rebalance myvol start force
>>
>> Very much looking forward to hearing from you all.
>>
>> Thanks,
>> Krutika
>>
>
>

