[Gluster-users] 2 replica setup fails when one underlying brick fails

Johannes von Bargen jo at mediaparadise.net
Tue Jan 8 22:40:07 UTC 2013


Hello gluster-users!

I experienced some odd behavior with glusterfs. Now I am a bit confused.

On one node, the RAID controller hosting the array underneath a gluster brick
decided to fail. My replicated gluster volume failed along with it.

I recreated a similar setup with VMs to simulate what happened.

My glusterfs version is 3.2.5 on Ubuntu 12.04.

node1:
    brick1: xfs on /brick1
node2:
    brick2: xfs on /brick2
    
For my simulation I set up the bricks without separate file systems
underneath; my bricks are just ordinary directories within the root fs.
   
Gluster volume gv0 is a replica 2 setup made from these two bricks, and on
both nodes localhost:gv0 is mounted on /gv0.
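
For reference, the setup was created roughly like this (node names and exact
command lines are reproduced from memory, so treat this as a sketch rather
than an exact transcript):

    # on node1: probe the peer, create and start the replicated volume
    $ gluster peer probe node2
    $ gluster volume create gv0 replica 2 transport tcp node1:/brick1 node2:/brick2
    $ gluster volume start gv0

    # on both nodes: mount the volume locally
    $ mount -t glusterfs localhost:gv0 /gv0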

Everything works fine so far. When I simulate a complete failure of a node by
simply powering off one VM, the mounted glusterfs is unresponsive at first,
but after about 30 seconds the volume is usable again, from both nodes. This
is roughly what I would expect in this situation.

Then I went on to simulate what happened with my hardware setup at work by
simply renaming the /brickX directory on one node to something else. I thought
this is close to my hardware failure from gluster's point of view; at least it
is some kind of severe failure of the layer beneath the brick.
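
Concretely, the simulated failure on node2 was nothing more than a rename of
the brick directory, something along these lines (the new name is just an
example):

    # on node2, while the volume is mounted and in use on both nodes:
    $ mv /brick2 /brick2.broken    # pull the brick directory out from under glusterfsd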

What I would expect to happen:

The gluster daemon on the node with the renamed brick dir recognises that
something is seriously wrong, causing that brick to fail. No problem, since
the clients simply carry on with the remaining gluster server and its intact
underlying brick. The mounted fs just continues as if nothing happened.

This is what actually happened:

The mounted glusterfs on both nodes simply failed and did not come up
again, even after 15 minutes of waiting.
   
The failure was easy to see:

    $ ls -la
    [...]
    d?????????  ? ?    ?           ?            ? gv0/
    drwxr-xr-x  3 root root     4096 Nov  7 23:17 home/
    [...]
    
The mounted fs was not accessible on either node.

    $ cd /gv0
    -bash: cd: /gv0: Eingabe-/Ausgabefehler
    
(German for I/O error.)
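
If it helps with the diagnosis, I can rerun the simulation and post the output
of the usual checks from both nodes, for example:

    $ gluster volume info gv0      # volume definition and status
    $ gluster peer status          # connectivity between the two nodes
    $ ps aux | grep gluster        # which gluster processes are still running
    $ ls /var/log/glusterfs/       # client and brick logs live here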

I thought this is exactly the kind of case where a replicated gluster setup
would save me.

Am I wrong? Am I missing something? Is this kind of failure for some
reason out of scope?


Background:

We need highly available storage space for:
    * backing up data from other servers (/gv0/backup exported via samba)
    * postfix+dovecot servers running on the gluster nodes, operating on the
same directory structure in /gv0/mailboxes
    


Greetings,

JvB


