[Gluster-devel] split brain with all-zero pending matrix

Emmanuel Dreyfus manu at netbsd.org
Fri Jun 21 00:53:31 UTC 2013


On 3.4.0beta3, after using the volume for a while, I get split-brain errors 
whose pending matrix is all zeros, which gives no hint about which copy to 
prefer.

[2013-06-20 08:44:00.665731] E [afr-self-heal-common.c:197:afr_sh_print_split_brain_log] 0-gfs34-replicate-1: Unable to self-heal contents of '/manu/netbsd/usr/src/tools/gcc/obj/build/gcc/cfgbuild.o' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix:  [ [ 0 0 ] [ 0 0 ] ]
[2013-06-20 08:44:00.666431] E [afr-self-heal-common.c:2212:afr_self_heal_completion_cbk] 0-gfs34-replicate-1: background  data self-heal failed on /manu/netbsd/usr/src/tools/gcc/obj/build/gcc/cfgbuild.o
[2013-06-20 08:44:00.667201] W [afr-open.c:213:afr_open] 0-gfs34-replicate-1: failed to open as split brain seen, returning EIO
[2013-06-20 08:44:00.668193] W [fuse-bridge.c:875:fuse_fd_cbk] 0-glusterfs-fuse: 8711927: OPEN() /manu/netbsd/usr/src/tools/gcc/obj/build/gcc/cfgbuild.o => -1 (Input/output error)
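
For readers who want to cross-check the pending matrix against the bricks: as
far as I understand AFR, each row is read from a trusted.afr.<volume>-client-<N>
extended attribute on the brick copy (visible with something like
"getfattr -d -m . -e hex <file>"), holding three big-endian 32-bit counters
for data, metadata and entry operations. The sketch below decodes one such
value under that assumption; the function name is mine, not part of glusterfs.

```python
import struct


def decode_afr_changelog(hex_value):
    """Decode an AFR changelog xattr value into its three pending counters.

    Assumes the value is 12 bytes: data, metadata and entry counters,
    each an unsigned 32-bit integer in network byte order.
    """
    text = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(text)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    data, metadata, entry = struct.unpack(">3I", raw)
    return {"data": data, "metadata": metadata, "entry": entry}
```

An all-zero value such as 0x000000000000000000000000 decodes to zero pending
operations of every kind, which matches the [ [ 0 0 ] [ 0 0 ] ] matrix in the
log above.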

On the bricks (ls -l, md5 and first 32 bytes):
brick0
-rw-r--r--  2 manu  manu  7216 Jun 20 09:56 /export/wd3a/manu/netbsd/usr/src/tools/gcc/obj/build/gcc/cfgbuild.o
MD5 (/export/wd3a/manu/netbsd/usr/src/tools/gcc/obj/build/gcc/cfgbuild.o) = 24ee57aa8e2aeb6102ba170fb81bbf22
00000000  7f 45 4c 46 01 01 01 00  00 00 00 00 00 00 00 00  |.ELF............|
00000010  01 00 03 00 01 00 00 00  00 00 00 00 00 00 00 00  |................|
brick1
-rw-r--r--  2 manu  manu  7216 Jun 20 09:56 /export/wd1a/manu/netbsd/usr/src/tools/gcc/obj/build/gcc/cfgbuild.o
MD5 (/export/wd1a/manu/netbsd/usr/src/tools/gcc/obj/build/gcc/cfgbuild.o) = 24ee57aa8e2aeb6102ba170fb81bbf22
00000000  7f 45 4c 46 01 01 01 00  00 00 00 00 00 00 00 00  |.ELF............|
00000010  01 00 03 00 01 00 00 00  00 00 00 00 00 00 00 00  |................|
brick2
-rw-r--r--  2 manu  manu  6256 Jun 20 09:56 /export/wd3a/manu/netbsd/usr/src/tools/gcc/obj/build/gcc/cfgbuild.o
MD5 (/export/wd3a/manu/netbsd/usr/src/tools/gcc/obj/build/gcc/cfgbuild.o) = 58a4b8c5929cac2f799c60b3dd2acc2f
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
brick3
-rw-r--r--  2 manu  manu  7216 Jun 20 09:56 /export/wd1a/manu/netbsd/usr/src/tools/gcc/obj/build/gcc/cfgbuild.o
MD5 (/export/wd1a/manu/netbsd/usr/src/tools/gcc/obj/build/gcc/cfgbuild.o) = 2bc383747432c49793a2cf1a2c0e8cfa
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
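
Since there are many affected files, the comparison above can be scripted. The
following is only a sketch: it takes a list of paths to the copies of one file
(the paths are whatever is reachable from where it runs, for example the local
brick directory on each server, or copies gathered in one place) and groups
them by size and MD5, also keeping the first 32 bytes for a quick look at the
header. The function names are hypothetical.

```python
import hashlib


def summarize_copy(path, head_len=32):
    """Return size, MD5 and the first head_len bytes (as hex) of one copy."""
    with open(path, "rb") as f:
        content = f.read()
    return {
        "path": path,
        "size": len(content),
        "md5": hashlib.md5(content).hexdigest(),
        "head": content[:head_len].hex(),
    }


def group_by_digest(paths):
    """Group copies that share the same (size, md5) pair."""
    groups = {}
    for path in paths:
        info = summarize_copy(path)
        groups.setdefault((info["size"], info["md5"]), []).append(path)
    return groups
```

A file whose copies all agree ends up in a single group; more than one group
means the copies have diverged, as they do for cfgbuild.o here.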

That leaves no doubt about which copies are bad (those on brick2 and brick3 
begin with zeros instead of an ELF header and differ from each other), but I 
still wonder how we got into this situation. There are a lot of such files, 
which makes the glusterfs volume unusable at this point.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu at netbsd.org



