<div dir="ltr"><div>The next time this happens, could you collect statedumps of the brick processes where this activity is occurring, at 10-second intervals?<br><br></div>The statedump documentation explains how to take one: <a href="https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump/">https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump/</a><br><div><div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, May 16, 2017 at 7:43 PM, Jan Wrona <span dir="ltr">&lt;<a href="mailto:wrona@cesnet.cz" target="_blank">wrona@cesnet.cz</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<p>Hi,</p>
<p>I have three servers in a linked-list topology [1], GlusterFS
3.8.10, CentOS 7. Each server has two bricks, both on the same XFS
filesystem. The XFS filesystem spans the whole MD RAID device:<br>
<tt>md5 : active raid5 sdj1[6] sdh1[8] sde1[2] sdg1[9] sdd1[1]
sdi1[5] sdf1[3] sdc1[0]</tt><tt><br>
</tt><tt> 6836411904 blocks super 1.2 level 5, 512k chunk,
algorithm 2 [8/8] [UUUUUUUU]</tt><tt><br>
</tt><tt> bitmap: 2/8 pages [8KB], 65536KB chunk</tt><br>
</p>
<p>Everything works fine until one of the RAID devices starts its
regular check. During the check, the client's mount sometimes
stops responding completely. I'm mounting using Pacemaker's
Filesystem OCF RA [2] with <span class="gmail-m_2626572502731844984pl-s">OCF_CHECK_LEVEL=20,
which basically tries to write a small status file to the
filesystem every 2 minutes to see if it's OK. But even this small
write operation sometimes times out (after 2 minutes) during the
check. Pacemaker then remounts the Gluster volume and everything goes
back to normal.<br>
</span></p>
<p>I understand that the RAID check consumes a lot of I/O
bandwidth, but the underlying XFS remains responsive (it is slower,
of course, but nowhere near as slow as Gluster). The check
intervals on the servers do not overlap. I've even decreased
/proc/sys/dev/raid/speed_<wbr>limit_max from the default 200 MB/s
to 50 MB/s, but that helped only a little; the mount still tends
to freeze for a few seconds during the check.</p>
<p>What are your suggestions to solve this issue?</p>
<p>Regards,<br>
Jan Wrona<br>
</p>
<p>[1]
<a class="gmail-m_2626572502731844984moz-txt-link-freetext" href="https://joejulian.name/blog/how-to-expand-glusterfs-replicated-clusters-by-one-server/" target="_blank">https://joejulian.name/blog/<wbr>how-to-expand-glusterfs-<wbr>replicated-clusters-by-one-<wbr>server/</a><br>
[2]
<a class="gmail-m_2626572502731844984moz-txt-link-freetext" href="https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/Filesystem" target="_blank">https://github.com/<wbr>ClusterLabs/resource-agents/<wbr>blob/master/heartbeat/<wbr>Filesystem</a><br>
</p>
</div>
<br>______________________________<wbr>_________________<br>
Gluster-users mailing list<br>
<a href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a><br>
<a href="http://lists.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">http://lists.gluster.org/<wbr>mailman/listinfo/gluster-users</a><br></blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature"><div dir="ltr">Pranith<br></div></div>
</div></div></div></div>