[Gluster-devel] Review attention needed for refactoring of ping-timer implementation in glusterfs

Krishnan Parthasarathi kparthas at redhat.com
Mon Apr 14 11:22:00 UTC 2014


All,

The patch, http://review.gluster.org/5202, refactors the existing ping-timer implementation
so that any new rpc program introduced into the glusterfs codebase gets, for 'free',
the heart-beating mechanism that is already in use between gluster client(s)
and bricks. The problem it's trying to solve is the lack of a heart-beating
mechanism among glusterd processes in a cluster. Without this, when a node goes down,
the cluster is likely to appear 'hung' until the other peers detect the network
disconnection. This can take up to 30 minutes (default TCP retransmission timeout).

This patch also moves the ping-timer logic to the 'right' layer.
Previously, the client xlator had its own private ping timer implementation.
With this patch, this implementation is moved into the 'rpc' layer, so that
other message channels like glusterd-glusterd can benefit from the ping-timer.
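
To illustrate the idea (this is only a rough sketch, not the code in the patch), a
ping timer attached to a connection could look something like the following. It is
written in Python for brevity; the class name, callbacks and the 10-second timeout
are all made up for the example, and time is passed in explicitly so the behaviour
is easy to follow.

```python
class PingTimer:
    """Toy heart-beat for a single connection (hypothetical names throughout)."""

    def __init__(self, timeout, send_ping, on_disconnect):
        self.timeout = timeout              # seconds to wait for a reply to a ping
        self.send_ping = send_ping          # callback that puts a ping on the wire
        self.on_disconnect = on_disconnect  # callback run when the peer is declared dead
        self.ping_sent_at = None            # None means no ping is outstanding
        self.connected = True

    def tick(self, now):
        """Called periodically by whoever owns the connection."""
        if not self.connected:
            return
        if self.ping_sent_at is None:
            # Nothing outstanding: send a ping and remember when it went out.
            self.ping_sent_at = now
            self.send_ping()
        elif now - self.ping_sent_at >= self.timeout:
            # No reply within the timeout: treat the peer as disconnected.
            self.connected = False
            self.on_disconnect()

    def pong_received(self):
        """Called when the reply to the outstanding ping arrives."""
        self.ping_sent_at = None


events = []
timer = PingTimer(timeout=10,
                  send_ping=lambda: events.append("ping"),
                  on_disconnect=lambda: events.append("disconnect"))

timer.tick(now=0)      # sends the first ping
timer.pong_received()  # peer answered in time
timer.tick(now=5)      # sends the next ping
timer.tick(now=12)     # 7s without a reply, still waiting
timer.tick(now=15)     # 10s without a reply, peer declared dead
print(events)
```

Because nothing in the sketch depends on who owns the connection, the same timer
could serve a client-brick or a glusterd-glusterd channel, which is the reason for
keeping this logic at the connection level rather than inside one xlator.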

This patch has been out for review for quite some time. It would be
really helpful if it gets some review attention. It has been tested in the
following scenarios:

- Dropped both incoming and outgoing glusterd packets, using iptables.
  To block incoming packets,
  eg. iptables -I INPUT -p tcp --dport 553:24007 -j DROP

  To block outgoing packets,
  eg. iptables -I OUTPUT 1 -p tcp --dport 553:24007 -j DROP

  // please use the above iptables rules carefully and only on your test machines :-)
  
- Tested whether an 'old' client, one without the new ping timer implementation, works with a 'new' server,
  one with the ping timer implementation as in this patch.

thanks,
Krish



