[Gluster-users] All issues resolved after disabling RDMA

Mauro M. gluster at ezplanet.net
Mon Dec 14 15:34:49 UTC 2015


I have been experiencing several issues with glusterfs for several months.

These started more or less after upgrading from release 3.5 to 3.7 (I
skipped the 3.6 series). At almost the same time I had introduced an
InfiniBand point-to-point network between my two gluster bricks.

The symptoms were failures to start the volume even when both nodes were
up and running, failed synchronizations, and inexplicable split-brain, even
for files that I was certain were accessed by a single client only.

I was about to give up on glusterfs altogether. As a last resort I tried
disabling RDMA (over InfiniBand) once more, and this time I rebuilt the
bricks from scratch using only TCP from the start. I had tried disabling
RDMA before, but without starting from scratch, so I must have been
experiencing what were latent issues.
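
For reference, creating a TCP-only replicated volume from scratch might
look roughly like the sketch below. The volume name, hostnames and brick
paths are hypothetical, and "replica 2" is only assumed from the two-node
setup described above; these are not the exact commands used here. The
script prints the commands rather than running them, so they can be
reviewed first.

```shell
#!/bin/sh
# Hypothetical names: adjust volume, hosts and brick paths to your setup.
VOLUME="gv0"
BRICK_A="node1:/data/brick1/gv0"
BRICK_B="node2:/data/brick1/gv0"

# Build the create command with an explicit TCP-only transport.
# "replica 2" is an assumption based on a two-brick mirrored setup.
create_cmd() {
    echo "gluster volume create $VOLUME replica 2 transport tcp $BRICK_A $BRICK_B"
}

# Start the volume once it has been created.
start_cmd() {
    echo "gluster volume start $VOLUME"
}

# Show the volume details so the transport type can be checked.
info_cmd() {
    echo "gluster volume info $VOLUME"
}

# Dry run: print the commands in the order they would be executed.
create_cmd
start_cmd
info_cmd
```

After running the printed commands on a real cluster, the "Transport-type"
line in the volume info output should read tcp.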

I cannot tell whether the RDMA defects are caused by gluster, the hardware
or the operating system. However, now that I am using TCP only over
InfiniBand, I have a stable cluster with one active node and a second
node which I usually leave turned off and which synchronizes perfectly
every time I turn it back on.

I hope this helps.

Mauro



