[Gluster-users] Self healing of 3.3.0 cause our 2 bricks replicated cluster freeze (client read/write timeout)

ZHANG Cheng czhang.oss at gmail.com
Thu Nov 29 05:24:40 UTC 2012

I dig out an gluster-users m-list thread dated 2011-June at

In this post, Marco Agostini said:
Craig Carl said me, three days ago:
 that happens because Gluster's self heal is a blocking operation. We
are working on a non-blocking self heal, we are hoping to ship it in
early September.

Looks like even with release of 3.3.1, self heal is still a blocking
operation. I am wondering why the official Administration Guide
doesn't warn the reader about such important thing regarding
production operation.

On Mon, Nov 26, 2012 at 5:46 PM, ZHANG Cheng <czhang.oss at gmail.com> wrote:
> Early this morning our 2 bricks replicated cluster had an outage. The
> disk space for one of the brick server (brick02) was used up. When we
> responded to the disk full alert, the issue already lasted for a few
> hours. We reclaimed some disk space, and reboot the brick02 server,
> expecting once it come back it will go self healing.
> It did go self healing, but just after couple minutes, access to
> gluster filesystem freeze. Tons of "nfs: server brick not responding,
> still trying" popped up in dmesg. The load average on app server went
> up to 200 something from usual 0.10. We had to shutdown brick02 server
> or stop gluster server process on it, to get the gluster cluster back
> working.
> How could we deal with this issue? Thanks in advance.
> Our gluster setup is followed the official doc.
> gluster> volume info
> Volume Name: staticvol
> Type: Replicate
> Volume ID: fdcbf635-5faf-45d6-ab4e-be97c74d7715
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: brick01:/exports/static
> Brick2: brick02:/exports/static
> Underlying filesystem is xfs (on a lvm volume), as:
> /dev/mapper/vg_node-brick on /exports/static type xfs
> (rw,noatime,nodiratime,nobarrier,logbufs=8)
> The brick servers don't act as gluster client.
> Our app servers are the gluster client, mount via nfs.
> brick:/staticvol on /mnt/gfs-static type nfs
> (rw,noatime,nodiratime,vers=3,rsize=8192,wsize=8192,addr=
> brick is a DNS round-robin record for brick01 and brick02.

More information about the Gluster-users mailing list