[Gluster-devel] decoupling network.ping-timeout and transport.tcp-user-timeout

Milind Changire mchangir at redhat.com
Wed Jan 11 09:51:05 UTC 2017


The management connection uses network.ping-timeout to time out and
retry the connection against a different server if the existing
end-point is unreachable from the client.
Given the parameters involved in the TCP/IP network stack, the other
network connections need to be controlled with the socket-level
tunables:
* SO_KEEPALIVE
* TCP_KEEPIDLE
* TCP_KEEPINTVL
* TCP_KEEPCNT
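
For reference, a minimal Python sketch of setting these four options on
a plain TCP socket. The helper name enable_keepalive and the values
20/2/9 are illustrative assumptions, not anything gluster defines; the
option constants are the Linux ones described in socket(7) and tcp(7).

```python
import socket


def enable_keepalive(sock, idle_s, interval_s, count):
    """Turn on TCP keepalive and set its three timing knobs (Linux)."""
    # SO_KEEPALIVE is a socket-level switch; the rest are TCP-level options.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Seconds the connection must stay idle before the first probe is sent.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    # Seconds between successive probes.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    # Number of unanswered probes before the connection is dropped.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)


# Illustrative values only.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock, idle_s=20, interval_s=2, count=9)
```

The options can be set before connect(), so a transport layer could
apply them right after creating the socket.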

So, I'd like to decouple network.ping-timeout and
transport.tcp-user-timeout since they are tunables for different
aspects of the gluster application. network.ping-timeout monitors
brick/node level responsiveness, and transport.tcp-user-timeout is one
of the attributes used to manage the state of the socket.
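
To make the socket-level side concrete, here is a small Python sketch
of the TCP_USER_TIMEOUT option from tcp(7), which bounds how long
transmitted data may stay unacknowledged before the kernel closes the
connection. The helper name set_user_timeout and the 42 second value
are assumptions for illustration; how gluster itself converts
transport.tcp-user-timeout into this option is not shown here.

```python
import socket


def set_user_timeout(sock, timeout_s):
    """Set TCP_USER_TIMEOUT (Linux); the kernel expects milliseconds."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_USER_TIMEOUT,
                    timeout_s * 1000)


# Illustrative value only.
user_timeout_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
set_user_timeout(user_timeout_sock, 42)
```

This works purely at the TCP layer: it knows nothing about whether the
brick process behind the socket is responsive, which is the part an
application-level ping covers.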

That said, we could do away with network.ping-timeout altogether and
stick with transport.tcp-user-timeout for all types of sockets. It
becomes increasingly difficult to work with different tunables across
gluster.

I believe there have not been many cases in which the community has
found the existing defaults for socket timeouts unusable. So we could
stick with the system defaults, add the following socket-level
tunables, and open them up for configuration:
* client.tcp-user-timeout
      which sets transport.tcp-user-timeout
* client.keepalive-time
      which sets transport.socket.keepalive-time
* client.keepalive-interval
      which sets transport.socket.keepalive-interval
* client.keepalive-count
      which sets transport.socket.keepalive-count
* server.tcp-user-timeout
      which sets transport.tcp-user-timeout
* server.keepalive-time
      which sets transport.socket.keepalive-time
* server.keepalive-interval
      which sets transport.socket.keepalive-interval
* server.keepalive-count
      which sets transport.socket.keepalive-count
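
The mapping above could be sketched as a small lookup. This is only an
illustration of the proposal in Python; the names SOCKET_TUNABLES and
resolve are hypothetical and do not correspond to existing gluster
code.

```python
# Proposed option suffix -> transport option it would set (from the list above).
SOCKET_TUNABLES = {
    "tcp-user-timeout": "transport.tcp-user-timeout",
    "keepalive-time": "transport.socket.keepalive-time",
    "keepalive-interval": "transport.socket.keepalive-interval",
    "keepalive-count": "transport.socket.keepalive-count",
}


def resolve(option):
    """Split 'client.keepalive-time' into (side, transport option)."""
    side, _, name = option.partition(".")
    if side not in ("client", "server") or name not in SOCKET_TUNABLES:
        raise ValueError("not a proposed socket tunable: " + option)
    return side, SOCKET_TUNABLES[name]
```

Both the client.* and server.* variants resolve to the same transport
option; only the side on which it is applied differs.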

However, these settings would affect all sockets in gluster.
In cases where aggressive timeouts are needed, the community can use
gluster options which have a 1:1 mapping with the socket-level options
documented in tcp(7).
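
As a rough guide to what "aggressive" means here, the time for
keepalive to give up on a silent peer on an idle connection is
approximately the idle time plus interval times probe count. A Python
sketch of that arithmetic follows; the function name is hypothetical,
the first set of values is the Linux defaults described in tcp(7)
(tcp_keepalive_time, tcp_keepalive_intvl, tcp_keepalive_probes), and
the second set is purely illustrative.

```python
def keepalive_detection_seconds(idle_s, interval_s, count):
    """Approximate time until an idle connection to a dead peer is dropped."""
    return idle_s + interval_s * count


# Linux system defaults per tcp(7): 7200 s idle, 75 s interval, 9 probes.
system_default = keepalive_detection_seconds(7200, 75, 9)   # 7875 s

# An illustrative aggressive setting.
aggressive = keepalive_detection_seconds(20, 2, 9)           # 38 s
```

With system defaults a dead peer on an idle connection goes unnoticed
for over two hours, so the choice of defaults matters a great deal.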

Please share your thoughts about the risks or effectiveness of the
decoupling.

-- 
Milind

