[Gluster-users] Behaviour of two node degraded cluster

Allan Latham alatham at flexsys-group.de
Sat Jul 20 07:36:18 UTC 2013

Hi all

We are running tests on gluster to see if it is suitable for inclusion
in a live environment.

Software is 3.4.0beta4
Cluster is Proxmox with 2 nodes + quorum disc.
Gluster is set to replicate mode - 2 replicas.
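For reference, the setup described above looks roughly like the following (the hostnames `node1`/`node2`, the brick path `/data/brick1` and the volume name `testvol` are placeholders, not the actual configuration):

```shell
# Create a 2-replica volume across the two Proxmox nodes
gluster volume create testvol replica 2 \
    node1:/data/brick1 node2:/data/brick1
gluster volume start testvol

# Mount it with the native FUSE client (done on each node)
mount -t glusterfs node1:/testvol /mnt/testvol
```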

Many tests are very satisfactory but last week we discovered two facts
which make gluster unsuitable for our application.

I assume this is a misunderstanding or misconfiguration on my part -
once again I ask for your help.

The intended use is that we require the data on gluster volumes to be
available when the cluster is degraded - i.e. running on a single node
(+ quorum disc).

This happens for ext4/drbd primary/secondary mode. The virtual server
moves to the surviving node (if it is not already there).

This happens with unison/inotify. Reads are unaffected on the surviving
node. Writes are queued - to be transferred to the failed node when it is
later restored to the cluster.

Preliminary tests with gfs2/drbd primary/primary indicate that writes
are blocked for about 60 seconds, then continue normally on the
surviving node. The updates are transferred to the failed node when it is
later restored to the cluster. If gluster is eliminated in these trials
we will put the effort into more testing of gfs2/drbd.

Gluster behaves differently:

1. when one node dies the volume is half-unmounted on the surviving node.
i.e. it still shows with the mount command but we get the error
'transport endpoint disconnected'.

2. it is impossible to mount the volume again although a local copy of
all the data is available in the bricks. umount reports no error and
mount then correctly shows the gluster mount is not there. A subsequent
mount command of the gluster volume waits a long time and then reports
(via the logs) that the other server is dead.
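For anyone trying to reproduce this, the symptoms can be inspected along these lines (the log filename and volume name are assumptions; the FUSE client log is normally under /var/log/glusterfs/, named after the mount point):

```shell
# The FUSE client logs to a file named after the mount point
tail -n 50 /var/log/glusterfs/mnt-testvol.log

# From the surviving node, check what gluster thinks is still alive
gluster peer status
gluster volume status testvol
```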

The reason why this is unworkable is that it makes a virtual server
which uses a gluster volume depend on BOTH nodes being online. This is
the exact opposite of high-availability.

What have I configured wrong?

I can partly understand the logic of this behaviour - you cannot
possibly replicate to 2 nodes if only a single node is available.
However to deny even read access to the available data cannot be right.
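One thing we could experiment with is the quorum-related volume options, though I am not sure whether they govern this particular behaviour in 3.4 (the volume name is a placeholder):

```shell
# Show which options have been changed from their defaults
gluster volume info testvol

# Relax client-side quorum so a lone replica keeps serving
# (at the documented risk of split-brain)
gluster volume set testvol cluster.quorum-type none
```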

What I really wanted was that 'writes' are queued and written later when
the dead node is available again (i.e. the same behaviour as gfs2 and
unison).

Any help or clarification would be appreciated.

My question in its simplest form is:

Is this the intended behaviour in these circumstances?
Is it possible to configure for the behaviour I expected?
If so, how do I do that?

Thanks in advance
