[Gluster-devel] Priority based ping packet for 3.10

Jeff Darcy jdarcy at redhat.com
Thu Jan 19 13:06:09 UTC 2017


> The more relevant question would be with TCP_KEEPALIVE and TCP_USER_TIMEOUT
> on sockets, do we really need ping-pong framework in Clients? We might need
> that in transport/rdma setups, but my question is concentrating on
> transport/socket. In other words, I would like to hear why we need a heart-beat
> mechanism in the first place. One scenario might be a healthy socket-level
> connection but an unhealthy brick/client (like a deadlocked one).

This is an important case to consider.  On the one hand, I think it answers
your question about TCP_KEEPALIVE.  What we really care about is whether a
brick's request queue is moving.  In other words, what's the time since the
last reply from that brick, and does that time exceed some threshold?  On a
busy system, we don't even need ping packets to know that.  We can just use
responses to other requests to set/reset that timer.  We only need to send
ping packets when our *outbound* queue has remained empty for some fraction
of our timeout.
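
As a rough illustration of the timer logic described above, here is a
minimal sketch.  It is not taken from the GlusterFS code: the class name,
method names, and the ping_fraction parameter are all hypothetical, and
timestamps are passed in explicitly to keep the example deterministic.

```python
class BrickLiveness:
    """Tracks whether one brick's request queue appears to be moving.

    Hypothetical sketch: any reply (to a real request or to a ping) resets
    the liveness timer, and a ping is only wanted when nothing has been
    sent for some fraction of the timeout.
    """

    def __init__(self, timeout, ping_fraction=0.5, now=0.0):
        if not 0.0 < ping_fraction < 1.0:
            raise ValueError("ping_fraction must be strictly between 0 and 1")
        self.timeout = timeout
        # Idle time on the outbound side after which a ping is worth sending.
        self.ping_after = timeout * ping_fraction
        self.last_reply = now
        self.last_send = now

    def on_send(self, now):
        """Record that a request (or a ping) was sent to the brick."""
        self.last_send = now

    def on_reply(self, now):
        """Record that any reply arrived from the brick."""
        self.last_reply = now

    def should_ping(self, now):
        """True when the outbound side has been idle long enough."""
        return now - self.last_send >= self.ping_after

    def is_unresponsive(self, now):
        """True when no reply of any kind has arrived within the timeout."""
        return now - self.last_reply > self.timeout


brick = BrickLiveness(timeout=10.0, ping_fraction=0.5, now=0.0)
brick.on_send(1.0)                # ordinary request goes out
brick.on_reply(2.0)               # its reply resets the liveness timer
if brick.should_ping(6.0):        # idle for 5s, i.e. half the timeout
    brick.on_send(6.0)            # the ping itself counts as outbound traffic
```

On a busy connection should_ping() never becomes true, because regular
traffic keeps moving last_send forward, which is the point made above.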

However, it's important that our measurements be *end to end* and not just
at the transport level.  This is particularly true with multiplexing,
where multiple bricks will share and contend on various resources.  We
should ping *through* client and server, with separate translators above
and below each.  This would give us a true end-to-end ping *for that
brick*, and also keep the code nicely modular.
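
To make the layering concrete, here is a toy model of that idea.  It is only
a sketch under stated assumptions: the Layer, PingOrigin, and PingResponder
classes are hypothetical stand-ins for translators, not GlusterFS APIs, and a
"stalled" layer simply never produces a reply.

```python
class Layer:
    """A stand-in for one translator in a stack (hypothetical model)."""

    def __init__(self, name, below=None):
        self.name = name
        self.below = below
        self.stalled = False   # models a stuck queue or deadlocked process

    def handle(self, request):
        if self.stalled or self.below is None:
            return None        # no reply ever comes back up the stack
        return self.below.handle(request)


class PingResponder(Layer):
    """Sits below the server side, so a reply proves the whole path works."""

    def handle(self, request):
        if request == "PING" and not self.stalled:
            return "PONG"
        return super().handle(request)


class PingOrigin(Layer):
    """Sits above the client side and originates the end-to-end ping."""

    def probe(self):
        return self.handle("PING") == "PONG"


# Stack for one brick: origin -> client -> server -> responder
responder = PingResponder("ping-responder")
server = Layer("server", below=responder)
client = Layer("client", below=server)
origin = PingOrigin("ping-origin", below=client)

healthy = origin.probe()          # every layer passes the ping through
server.stalled = True             # transport may be fine, but the brick is stuck
stuck = origin.probe()            # the end-to-end ping now goes unanswered
```

Because the ping enters above the client and is answered below the server,
a stall anywhere in between shows up as a missing reply, which a
transport-level keepalive would not reveal.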

