[Gluster-users] Random and frequent split brain

Nilesh Govindrajan me at nileshgr.com
Thu Jul 17 01:56:10 UTC 2014


I'm having a weird issue. I have this config:

node2 ~ # gluster peer status
Number of Peers: 1

Hostname: sto1
Uuid: f7570524-811a-44ed-b2eb-d7acffadfaa5
State: Peer in Cluster (Connected)

node1 ~ # gluster peer status
Number of Peers: 1

Hostname: sto2
Port: 24007
Uuid: 3a69faa9-f622-4c35-ac5e-b14a6826f5d9
State: Peer in Cluster (Connected)

Volume Name: home
Type: Replicate
Volume ID: 54fef941-2e33-4acf-9e98-1f86ea4f35b7
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Brick1: sto1:/data/gluster/home
Brick2: sto2:/data/gluster/home
Options Reconfigured:
performance.write-behind-window-size: 2GB
performance.flush-behind: on
performance.cache-size: 2GB
cluster.choose-local: on
storage.linux-aio: on
transport.keepalive: on
performance.quick-read: on
performance.io-cache: on
performance.stat-prefetch: on
performance.read-ahead: on
cluster.data-self-heal-algorithm: diff
nfs.disable: on
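For context, options like the ones listed above are applied per-volume with `gluster volume set`; a minimal sketch using the volume name from this post (any single option/value pair works the same way):

```shell
# Set one option on the "home" volume (run on any peer in the cluster)
gluster volume set home cluster.data-self-heal-algorithm diff

# Confirm it appears under "Options Reconfigured"
gluster volume info home
```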

(output of `gluster volume info home` above)
sto1 and sto2 are aliases for node1 and node2 respectively.

As you can see, NFS is disabled, so I'm using the native FUSE mount on both nodes.
The volume contains files and PHP scripts that are served by various
websites. When both nodes are active, many files go into split brain,
and the mount on node2 returns 'Input/output error' for many of them,
which causes HTTP 500 errors.
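Since NFS is disabled, the FUSE mount on each node would look something like the following sketch (the mount point /mnt/home is an assumption, not taken from the post):

```shell
# Native FUSE mount of the replicated volume (server name from the post,
# mount point is hypothetical)
mount -t glusterfs sto1:/home /mnt/home
```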

I delete the affected files from the brick using find -samefile. That fixes it for a
few minutes, and then the problem is back.
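The cleanup described above can be sketched roughly as follows; the file path is a placeholder, and the commands are assumptions about the workflow rather than the poster's exact invocations (non-zero trusted.afr.* counters on both bricks are the usual split-brain signature):

```shell
# List files Gluster currently flags as split-brain (run on a server node)
gluster volume heal home info split-brain

# Inspect the AFR changelog xattrs on each brick's copy of a suspect file
getfattr -d -m . -e hex /data/gluster/home/path/to/file   # on sto1
getfattr -d -m . -e hex /data/gluster/home/path/to/file   # on sto2

# On ONE brick only: remove the bad copy together with its hard links
# (e.g. the .glusterfs gfid link), so the good replica can heal back
find /data/gluster/home -samefile /data/gluster/home/path/to/file -delete
```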

What could be the issue? This happens even when I use the NFS mounting method instead.

Gluster 3.4.4 on Gentoo.
