[Gluster-users] Very poor heal behaviour in 3.7.9
Lindsay Mathieson
lindsay.mathieson at gmail.com
Mon Mar 28 01:08:12 UTC 2016
On 27/03/2016 12:33 AM, Lindsay Mathieson wrote:
> On 26/03/2016 11:58 PM, Pranith Kumar Karampuri wrote:
>>> Is that the same issue I posted earlier re "gluster volume heal
>>> info" appearing to block I/O?
>>>
>> I don't think it is heal info that is blocking I/O. I think it is
>> the client triggering heal and blocking the fop until heal completes that
>> results in this pattern. This data-heal disabling should get you out
>> of this problem.
>
>
> I tried it earlier and it didn't seem to help.
>
> Does anything need to be restarted after cluster.data-self-heal is set
> off?
Tried again this morning. It replicates 100% the behaviour I noted earlier:
> After testing the heal process by killing glusterfsd on a node I
> noticed the following.
>
> - I/O continued at normal speed while glusterfsd was down.
>
> - After restarting glusterfsd, I/O still continued as normal
>
> - performing a "gluster volume heal datastore2 info" would show some
> info then hang.
>
> - I/O on the cluster would cease, e.g. in a VM where I was running a
> command line build of a large project, the build just stopped. The VM
> itself was mostly responsive but anything that involved accessing the
> disk hung.
>
> - if I killed the "gluster volume heal datastore2 info" command then
> I/O in the VMs resumed at a normal pace.
>
> - if I then reissued the "gluster volume heal datastore2 info" command
> I/O would continue for a short while (seconds - minutes) before
> hanging again.
>
> - killing the heal info command would resume I/O again.
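For anyone wanting to repeat the steps above without leaving a hung CLI behind, here is a rough sketch in Python. It assumes that simply killing the command after a timeout is enough to let I/O resume, as it was when I killed it by hand. The function name, the 30 second default and the `runner` parameter are made up for illustration; only the `gluster volume heal <volume> info` command itself is from the steps above.

```python
import subprocess


def heal_info(volume, timeout_s=30, runner=subprocess.run):
    """Run 'gluster volume heal <volume> info', giving up after timeout_s seconds.

    Returns the command's stdout as text, or None if it did not finish in time.
    'runner' is a hypothetical hook so the call can be replaced when testing.
    """
    cmd = ["gluster", "volume", "heal", volume, "info"]
    try:
        result = runner(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before re-raising the timeout
        return None
    return result.stdout
```

Calling `heal_info("datastore2")` on a node would then return either the heal info text or `None` if the command hung past the timeout.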
iowait and cpu are under 4% on all three nodes.
Even after I shut down all VMs on datastore2, "gluster volume heal
datastore2 info" hung indefinitely with no output.
I had to stop/start datastore2 before the info would work; it then
returned very quickly with:
Brick vnb.proxmox.softlog:/tank/vmdata/datastore2
Number of entries: 0
Brick vng.proxmox.softlog:/tank/vmdata/datastore2
/.shard - Possibly undergoing heal
Number of entries: 1
Brick vna.proxmox.softlog:/tank/vmdata/datastore2
/.shard - Possibly undergoing heal
Number of entries: 1
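To watch whether those entries ever clear, output like the above could be turned into per-brick lists. This is only a sketch in Python: the layout it expects ("Brick ..." lines, entry lines, "Number of entries: N") is assumed from the output pasted here and may differ in other versions, and the function name is made up for illustration.

```python
def parse_heal_info(text):
    """Parse heal info output into a dict of {brick: [entry lines]}.

    Assumes the layout shown above: a 'Brick <host>:<path>' line, then zero
    or more entry lines, then a 'Number of entries: N' line.
    """
    bricks = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Brick "):
            current = line[len("Brick "):]
            bricks[current] = []
        elif not line or line.startswith("Number of entries:"):
            continue  # blank lines and the count line carry no entry
        elif current is not None:
            bricks[current].append(line)
    return bricks


SAMPLE = """\
Brick vnb.proxmox.softlog:/tank/vmdata/datastore2
Number of entries: 0
Brick vng.proxmox.softlog:/tank/vmdata/datastore2
/.shard - Possibly undergoing heal
Number of entries: 1
Brick vna.proxmox.softlog:/tank/vmdata/datastore2
/.shard - Possibly undergoing heal
Number of entries: 1
"""

pending = {brick: entries for brick, entries in parse_heal_info(SAMPLE).items() if entries}
```

With the sample above, `pending` holds only the vng and vna bricks, each with the single /.shard entry.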
Unfortunately it's stayed that way for 10 minutes now.
I'd like to recheck this behaviour under 3.7.7 - can I just revert to
that (debian packages) without recreating the datastore?
thanks,
--
Lindsay Mathieson