[Bugs] [Bug 1229226] New: Gluster split-brain not logged and data integrity not enforced

bugzilla at redhat.com
Mon Jun 8 10:04:28 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1229226

            Bug ID: 1229226
           Summary: Gluster split-brain not logged and data integrity not
                    enforced
           Product: GlusterFS
           Version: 3.7.0
         Component: replicate
          Severity: high
          Priority: high
          Assignee: bugs at gluster.org
          Reporter: dblack at redhat.com
                CC: bugs at gluster.org, gluster-bugs at redhat.com



Description of problem:
A split-brain state can be reproduced consistently in which the split-brain is
never logged and EIO is never returned, leaving the file available for
error-free read/write access while it is in split-brain.

Version-Release number of selected component (if applicable):
[root@n2 ~]# gluster --version
glusterfs 3.7.0 built on May 20 2015 13:30:05
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General
Public License.

[root@n2 ~]# rpm -qa | grep gluster
glusterfs-libs-3.7.0-2.el7.x86_64
glusterfs-cli-3.7.0-2.el7.x86_64
glusterfs-3.7.0-2.el7.x86_64
glusterfs-fuse-3.7.0-2.el7.x86_64
glusterfs-client-xlators-3.7.0-2.el7.x86_64
glusterfs-server-3.7.0-2.el7.x86_64
glusterfs-api-3.7.0-2.el7.x86_64
glusterfs-geo-replication-3.7.0-2.el7.x86_64

How reproducible:
Consistently

Steps to Reproduce:
1. A test file is created:

[root@n1 ~]# dd if=/dev/urandom of=/rhgs/client/rep01/file002 bs=1k count=1k


2. Confirm that file hashes to bricks on n1 and n2:

[root@n1 ~]# ls -lh /rhgs/bricks/rep01/file002
-rw-r--r-- 2 root root 22 Jun  3 12:18 /rhgs/bricks/rep01/file002

[root@n2 ~]# ls -lh /rhgs/bricks/rep01/file002
-rw-r--r-- 2 root root 22 Jun  3 12:18 /rhgs/bricks/rep01/file002


3. A network split is induced by using iptables to drop all packets from n1 to
n2, and data is appended to the test file from n1:

#!/bin/bash
# Print the command to stderr, then run it. Echoing to stderr keeps the
# command line out of any file that the caller redirects stdout into.
exe() { echo "\$ $*" >&2 ; "$@" ; }

if [ "$HOSTNAME" = "n1" ]; then
   echo "Inducing network split with iptables..."
   exe iptables -F
   exe iptables -A OUTPUT -d n2 -j DROP
   echo "Adding 1MB of random data to file002..."
   exe dd if=/dev/urandom bs=1k count=1k >> /rhgs/client/rep01/file002
   echo "Generating md5sum for file002..."
   exe md5sum /rhgs/client/rep01/file002
else
   echo "Wrong host!"
fi


4. Data is appended to the test file from n2:

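#!/bin/bash
# Print the command to stderr, then run it. Echoing to stderr keeps the
# command line out of any file that the caller redirects stdout into.
exe() { echo "\$ $*" >&2 ; "$@" ; }

if [ "$HOSTNAME" = "n2" ]; then
   echo "Adding 2MB of random data to file002..."
   exe dd if=/dev/urandom bs=1k count=2k >> /rhgs/client/rep01/file002
   echo "Generating md5sum for file002..."
   exe md5sum /rhgs/client/rep01/file002
else
   echo "Wrong host!"
fi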


5. Correct the network split and stat the file from the client:

#!/bin/bash
# Print the command to stderr, then run it.
exe() { echo "\$ $*" >&2 ; "$@" ; }

if [ "$HOSTNAME" = "n1" ]; then
   echo "Correcting network split with iptables..."
   exe iptables -F OUTPUT
   echo "Statting file002 to induce heal..."
   exe stat /rhgs/client/rep01/file002
else
   echo "Wrong host!"
fi


6. Cat the file (this should result in EIO, but does not):

[root@n1 ~]# cat /rhgs/client/rep01/file002 > /dev/null


7. Add new data to the file from n1:

[root@n1 ~]# dd if=/dev/urandom bs=1k count=1k >> /rhgs/client/rep01/file002
1024+0 records in
1024+0 records out
1048576 bytes (1.0 MB) copied, 0.138334 s, 7.6 MB/s


8. Look for expected split-brain errors in the gluster logs (nothing is
returned):

[root@n1 ~]# grep -i split /var/log/glusterfs/{*,bricks/*} 2>/dev/null

[root@n2 ~]# grep -i split /var/log/glusterfs/{*,bricks/*} 2>/dev/null


9. Confirm that the two brick copies differ and that each copy considers itself
WISE (each brick's changelog xattr blames the other brick):

[root@n1 ~]# md5sum /rhgs/bricks/rep01/file002
d70a816aab125567c185bc047f4358b0  /rhgs/bricks/rep01/file002
[root@n1 ~]# getfattr -d -m . -e hex /rhgs/bricks/rep01/file002
getfattr: Removing leading '/' from absolute path names
# file: rhgs/bricks/rep01/file002
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.rep01-client-0=0x000000000000000000000000
trusted.afr.rep01-client-1=0x000000910000000000000000
trusted.bit-rot.version=0x0200000000000000556dd9df000a770c
trusted.gfid=0x8740772d4f204ce183f010a80e76015c

[root@n2 ~]# md5sum /rhgs/bricks/rep01/file002
bcb17a86bf54db36fa874030fde8da4b  /rhgs/bricks/rep01/file002
[root@n2 ~]# getfattr -d -m . -e hex /rhgs/bricks/rep01/file002
getfattr: Removing leading '/' from absolute path names
# file: rhgs/bricks/rep01/file002
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.rep01-client-0=0x000000310000000000000000
trusted.afr.rep01-client-1=0x000000000000000000000000
trusted.bit-rot.version=0x0200000000000000556dd9de000db404
trusted.gfid=0x8740772d4f204ce183f010a80e76015c
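
The trusted.afr.* values above can be read as three 32-bit big-endian
counters (pending data, metadata, and entry operations), which is the layout
AFR uses for its changelog xattrs. The following is a minimal sketch for
decoding them, not part of GlusterFS: the helper names afr_field, decode_afr,
and is_data_split_brain are hypothetical, and the input values are copied from
the getfattr output in step 9. It should run under bash or a POSIX sh.

```shell
# Hypothetical helper: print field N (1=data, 2=metadata, 3=entry) of an AFR
# changelog xattr as a decimal number. Each field is 8 hex digits.
afr_field() {
    hex=${1#0x}                               # strip the leading "0x"
    start=$(( ($2 - 1) * 8 + 1 ))             # 1-based offset of the field
    digits=$(printf '%s' "$hex" | cut -c"$start"-"$((start + 7))")
    echo $((0x$digits))
}

# Hypothetical helper: decode all three counters of one xattr value.
decode_afr() {
    hex=${1#0x}
    if [ "${#hex}" -ne 24 ]; then             # 3 counters x 8 hex digits
        echo "expected 24 hex digits, got ${#hex}" >&2
        return 1
    fi
    echo "data=$(afr_field "$1" 1) metadata=$(afr_field "$1" 2) entry=$(afr_field "$1" 3)"
}

# Hypothetical check: the file is in data split-brain when each brick holds
# a non-zero pending data count against the other brick.
is_data_split_brain() {
    [ "$(afr_field "$1" 1)" -ne 0 ] && [ "$(afr_field "$2" 1)" -ne 0 ]
}

# n1's brick: trusted.afr.rep01-client-1 (pending operations against n2)
decode_afr 0x000000910000000000000000   # prints data=145 metadata=0 entry=0
# n2's brick: trusted.afr.rep01-client-0 (pending operations against n1)
decode_afr 0x000000310000000000000000   # prints data=49 metadata=0 entry=0

if is_data_split_brain 0x000000910000000000000000 0x000000310000000000000000; then
    echo "data split-brain: both bricks blame each other"
fi
```

Read this way, n1's copy records 145 pending data operations against n2's
brick and n2's copy records 49 against n1's brick, so neither copy can serve
as the heal source.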

Actual results:
The split-brain file can be stat'ed, listed, read with cat, and appended to
from the client without error, and no split-brain message is logged.


Expected results:
File operations on the split-brain file should fail with EIO, and the
split-brain should be logged.


Additional info:
Topology for volume rep01:

Distribute set
 |    
 +-- Replica set 0
 |    |    
 |    +-- Brick 0: n1:/rhgs/bricks/rep01
 |    |    
 |    +-- Brick 1: n2:/rhgs/bricks/rep01
 |    
 +-- Replica set 1
       |    
       +-- Brick 0: n3:/rhgs/bricks/rep01
       |    
       +-- Brick 1: n4:/rhgs/bricks/rep01


[root@n1 ~]# gluster volume info rep01

Volume Name: rep01
Type: Distributed-Replicate
Volume ID: 6ff17d21-035d-47e7-8bd1-d4a9e850be31
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: n1:/rhgs/bricks/rep01
Brick2: n2:/rhgs/bricks/rep01
Brick3: n3:/rhgs/bricks/rep01
Brick4: n4:/rhgs/bricks/rep01
Options Reconfigured:
performance.readdir-ahead: on


Client mounts on n1 and n2:

[root@n1 ~]# grep client /etc/fstab
n1:rep01        /rhgs/client/rep01    glusterfs    _netdev    0 0
[root@n1 ~]# mount | grep client
n1:rep01 on /rhgs/client/rep01 type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)

[root@n2 ~]# grep client /etc/fstab
n1:rep01        /rhgs/client/rep01    glusterfs    _netdev    0 0
[root@n2 ~]# mount | grep client
n1:rep01 on /rhgs/client/rep01 type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)


