<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Jan 19, 2017 at 8:06 AM, Jeff Darcy <span dir="ltr">&lt;<a href="mailto:jdarcy@redhat.com" target="_blank">jdarcy@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span class="gmail-">&gt; The more relevant question would be with TCP_KEEPALIVE and TCP_USER_TIMEOUT<br>

&gt; on sockets, do we really need the ping-pong framework in clients? We might need<br>

&gt; that in transport/rdma setups, but my question is concentrating on<br>

&gt; transport/socket. In other words, I would like to hear why we need a heart-beat<br>

&gt; mechanism in the first place. One scenario might be a healthy socket level<br>

&gt; connection but an unhealthy brick/client (like a deadlocked one).<br>

<br>

</span>This is an important case to consider.  On the one hand, I think it answers<br>

your question about TCP_KEEPALIVE.  What we really care about is whether a<br>

brick&#39;s request queue is moving.  In other words, what&#39;s the time since the<br>

last reply from that brick, and does that time exceed some threshold?  On a<br>

busy system, we don&#39;t even need ping packets to know that.  We can just use<br>

responses on other requests to set/reset that timer.  We only need to send<br>

ping packets when our *outbound* queue has remained empty for some fraction<br>

of our timeout.<br>

<br>

However, it&#39;s important that our measurements be *end to end* and not just<br>

at the transport level.  This is particularly true with multiplexing,<br>

where multiple bricks will share and contend on various resources.  We<br>

should ping *through* client and server, with separate translators above<br>

and below each.  This would give us a true end-to-end ping *for that<br>

brick*, and also keep the code nicely modular.<br>

</blockquote></div><br></div><div class="gmail_extra">+1 to this. Having ping and pong xlators immediately above and below the protocol translators would also address the problem of epoll threads getting blocked in gluster&#39;s xlator stacks on busy systems.</div><div class="gmail_extra"><br></div><div class="gmail_extra">Having said that, I do see value in Rafi&#39;s patch that prompted this thread. Would it not help to prioritize ping-pong traffic in all parts of the gluster stack, including the send queue on the client?</div><div class="gmail_extra"><br></div><div class="gmail_extra">Regards,</div><div class="gmail_extra">Vijay</div></div>
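<div dir="ltr"><div class="gmail_extra">P.S. To make the "reset the timer on any reply, ping only when the outbound queue has been idle" idea a bit more concrete, here is a rough sketch in Python. It is not gluster code: the class name, the method names and the <code>ping_fraction</code> parameter are all made up for illustration, and timestamps are passed in explicitly so the logic is easy to follow.</div>
<pre>
```python
class LivenessTracker:
    """Per-brick liveness bookkeeping (hypothetical helper, not a gluster API)."""

    def __init__(self, timeout, ping_fraction, now):
        # timeout: seconds without any reply before the brick is considered stuck.
        # ping_fraction: share of the timeout the outbound queue may stay idle
        # before a ping should be sent (an assumed tunable).
        self.timeout = timeout
        self.ping_after = timeout * ping_fraction
        self.last_send = now
        self.last_reply = now

    def on_send(self, now):
        # Call for every outbound request, pings included.
        self.last_send = now

    def on_reply(self, now):
        # Any reply (to a ping or to a regular request) proves the brick's
        # request queue is moving, so it resets the liveness timer.
        self.last_reply = now

    def should_ping(self, now):
        # Ping only when nothing has been sent for ping_after seconds;
        # on a busy connection regular traffic does the job.
        return now - self.last_send >= self.ping_after

    def is_dead(self, now):
        # End-to-end check: time since the last reply exceeds the threshold.
        return now - self.last_reply > self.timeout
```
</pre>
<div class="gmail_extra">With <code>timeout=42.0</code> and <code>ping_fraction=0.5</code> (example values only), a connection that has sent nothing for 21 seconds would get a ping, while a busy connection would never need one because ordinary replies keep resetting the timer. The caller is expected to invoke <code>should_ping</code> and <code>is_dead</code> from a periodic timer.</div></div>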