[Gluster-users] [gluster] possible split-brain issue

Pranith Kumar Karampuri pkarampu at redhat.com
Thu Jan 29 09:28:46 UTC 2015


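Arnold Yang,
        From the extended attributes, the directories /export/vdb1/brick/
and /export/vdb1/brick/mpdis/ are in metadata split-brain: on each brick
the trusted.afr changelog holds a non-zero pending metadata count against
the other brick. You can follow the document
https://github.com/gluster/glusterfs/blob/master/doc/split-brain.md to
fix the split-brain.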
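
For anyone reading the attribute dumps quoted below, here is a rough
Python sketch of how the pending counts can be decoded. It assumes the
AFR changelog layout of three big-endian 32-bit counters (data, metadata,
entry) and that dmf-wpst-1 hosts the first brick (gv0-client-0). The
function names are only illustrative and are not part of any gluster tool.

```python
import struct


def decode_afr(hex_value):
    """Return (data, metadata, entry) pending counts from a trusted.afr.* value."""
    digits = hex_value[2:] if hex_value.startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    # Three unsigned 32-bit big-endian counters.
    return struct.unpack(">III", raw)


def metadata_matrix(per_brick):
    """Row i holds what brick i records against client-0 and client-1."""
    return [[decode_afr(value)[1] for value in row] for row in per_brick]


def is_split_brain(matrix):
    """Two-brick case only: each brick blames the other."""
    return matrix[0][1] > 0 and matrix[1][0] > 0


# Values copied from the getfattr output in this thread.
# Row order (dmf-wpst-1 first, dmf-wpst-2 second) is an assumption.
ROOT = [
    ["0x000000000000000000000000", "0x000000000000001400000000"],
    ["0x000000000000001500000000", "0x000000000000000000000000"],
]
MPDIS = [
    ["0x000000000000000000000000", "0x000000000000000400000000"],
    ["0x000000000000000200000000", "0x000000000000000000000000"],
]
DMF1 = [
    ["0x000000000000000000000000", "0x000000000000000000000000"],
    ["0x000000000000000000000000", "0x000000000000000000000000"],
]

for name, xattrs in (("/", ROOT), ("/mpdis", MPDIS), ("dmf1", DMF1)):
    matrix = metadata_matrix(xattrs)
    verdict = "metadata split-brain" if is_split_brain(matrix) else "no metadata split-brain"
    print(name, matrix, verdict)
```

The matrices for / and /mpdis come out as [[0, 20], [21, 0]] and
[[0, 4], [2, 0]], which matches the "Pending matrix" values in the client
log.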

/export/vdb1/brick/mpdis/test.rep.00.00.00.00.dmf1 and
/export/vdb1/brick/mpdis/test.rep.00.00.00.00.dmf2 don't seem to be in
split-brain as per the extended attributes. Could you send the stat
output of these two files from both the bricks?
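
In case it helps, here is a small Python sketch that gathers roughly the
same fields `stat` prints, so the two bricks can be compared side by
side. The default paths are the brick paths from this thread; the
function name is only illustrative. Run it on each brick server.

```python
import os
import stat
import time

# Brick paths taken from this thread; adjust if the layout differs.
PATHS = [
    "/export/vdb1/brick/mpdis/test.rep.00.00.00.00.dmf1",
    "/export/vdb1/brick/mpdis/test.rep.00.00.00.00.dmf2",
]


def describe(path):
    """Collect the ownership, permission, size and time fields of a path."""
    st = os.lstat(path)  # lstat: do not follow symlinks
    return {
        "path": path,
        "mode": stat.filemode(st.st_mode),
        "uid": st.st_uid,
        "gid": st.st_gid,
        "size": st.st_size,
        "mtime": time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(st.st_mtime)),
        "ctime": time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(st.st_ctime)),
    }


for path in PATHS:
    try:
        info = describe(path)
    except FileNotFoundError:
        print(path, "not found on this host")
        continue
    print(" ".join("%s=%s" % (key, value) for key, value in info.items()))
```

A difference in mode, uid or gid between the bricks would show up
directly when the two outputs are compared.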

Pranith
On 01/23/2015 02:18 PM, Arnold Yang wrote:
>
> Hi Pranith,
>
> No worries!
>
> Here is the output of the other brick:
>
> [root at dmf-wpst-2 ~]# getfattr -d -m. -e hex /export/vdb1/brick/
>
> getfattr: Removing leading '/' from absolute path names
>
> # file: export/vdb1/brick/
>
> security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
>
> trusted.afr.gv0-client-0=0x000000000000001500000000
>
> trusted.afr.gv0-client-1=0x000000000000000000000000
>
> trusted.gfid=0x00000000000000000000000000000001
>
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>
> trusted.glusterfs.volume-id=0x51de44c3f01e486da6b710c7b7a270d7
>
> [root at dmf-wpst-2 ~]#  getfattr -d -m. -e hex /export/vdb1/brick/mpdis/
>
> getfattr: Removing leading '/' from absolute path names
>
> # file: export/vdb1/brick/mpdis/
>
> security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
>
> trusted.afr.gv0-client-0=0x000000000000000200000000
>
> trusted.afr.gv0-client-1=0x000000000000000000000000
>
> trusted.gfid=0x8ff7afeb996244cd9d1bf213568398d7
>
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>
> [root at dmf-wpst-2 ~]# getfattr -d -m. -e hex 
> /export/vdb1/brick/mpdis/test.rep.00.00.00.00.dmf1
>
> getfattr: Removing leading '/' from absolute path names
>
> # file: export/vdb1/brick/mpdis/test.rep.00.00.00.00.dmf1
>
> security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
>
> trusted.afr.gv0-client-0=0x000000000000000000000000
>
> trusted.afr.gv0-client-1=0x000000000000000000000000
>
> trusted.gfid=0x85ed306b179b46819d7c02eb336543b8
>
> [root at dmf-wpst-2 ~]# getfattr -d -m. -e hex 
> /export/vdb1/brick/mpdis/test.rep.00.00.00.00.dmf2
>
> getfattr: Removing leading '/' from absolute path names
>
> # file: export/vdb1/brick/mpdis/test.rep.00.00.00.00.dmf2
>
> security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
>
> trusted.afr.gv0-client-0=0x000000000000000000000000
>
> trusted.afr.gv0-client-1=0x000000000000000000000000
>
> trusted.gfid=0xa826a389e7a042c2b5175a1acbecae9b
>
> *From:*Pranith Kumar Karampuri [mailto:pkarampu at redhat.com]
> *Sent:* Friday, January 23, 2015 3:42 PM
> *To:* Arnold Yang; Jifeng Li; Gluster-users at gluster.org
> *Subject:* Re: [Gluster-users] [gluster] possible split-brain issue
>
> hi Arnold,
>    It seems you gave the output from only one brick. Could you also
> provide it from the other brick? Sorry I didn't make that clear in
> my earlier mail.
>
> Pranith
>
> On 01/23/2015 10:44 AM, Arnold Yang wrote:
>
>     Hi Pranith,
>
>     Here is the output of the commands you provided. If you need
>     anything more, please tell us!
>
>     Thanks!
>
>     [root at dmf-wpst-1 ~]# getfattr -d -m. -e hex /export/vdb1/brick/
>
>     getfattr: Removing leading '/' from absolute path names
>
>     # file: export/vdb1/brick/
>
>     security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
>
>     trusted.afr.gv0-client-0=0x000000000000000000000000
>
>     trusted.afr.gv0-client-1=0x000000000000001400000000
>
>     trusted.gfid=0x00000000000000000000000000000001
>
>     trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>
>     trusted.glusterfs.volume-id=0x51de44c3f01e486da6b710c7b7a270d7
>
>     [root at dmf-wpst-1 ~]# getfattr -d -m. -e hex /export/vdb1/brick/mpdis/
>
>     getfattr: Removing leading '/' from absolute path names
>
>     # file: export/vdb1/brick/mpdis/
>
>     security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
>
>     trusted.afr.gv0-client-0=0x000000000000000000000000
>
>     trusted.afr.gv0-client-1=0x000000000000000400000000
>
>     trusted.gfid=0x8ff7afeb996244cd9d1bf213568398d7
>
>     trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>
>     [root at dmf-wpst-1 ~]# getfattr -d -m. -e hex
>     /export/vdb1/brick/mpdis/test.rep.00.00.00.00.dmf1
>
>     getfattr: Removing leading '/' from absolute path names
>
>     # file: export/vdb1/brick/mpdis/test.rep.00.00.00.00.dmf1
>
>     security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
>
>     trusted.afr.gv0-client-0=0x000000000000000000000000
>
>     trusted.afr.gv0-client-1=0x000000000000000000000000
>
>     trusted.gfid=0x85ed306b179b46819d7c02eb336543b8
>
>     [root at dmf-wpst-1 ~]# getfattr -d -m. -e hex
>     /export/vdb1/brick/mpdis/test.rep.00.00.00.00.dmf2
>
>     getfattr: Removing leading '/' from absolute path names
>
>     # file: export/vdb1/brick/mpdis/test.rep.00.00.00.00.dmf2
>
>     security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
>
>     trusted.afr.gv0-client-0=0x000000000000000000000000
>
>     trusted.afr.gv0-client-1=0x000000000000000000000000
>
>     trusted.gfid=0xa826a389e7a042c2b5175a1acbecae9b
>
>     *From:*Pranith Kumar Karampuri [mailto:pkarampu at redhat.com]
>     *Sent:* Thursday, January 22, 2015 12:14 AM
>     *To:* Jifeng Li; Gluster-users at gluster.org
>     <mailto:Gluster-users at gluster.org>; Arnold Yang
>     *Subject:* Re: [Gluster-users] [gluster] possible split-brain issue
>
>     On 01/14/2015 04:48 PM, Jifeng Li wrote:
>
>         Hi ,
>
>         [issue]: To check that the GlusterFS mount point works, a script
>         periodically uses HTTP PUT to write a file to a subdirectory under
>         the mount point, which is used as the Apache DocumentRoot. After
>         running for some time, the errors below appear:
>
>         [2015-01-14 09:18:40.915639] E
>         [afr-self-heal-common.c:2868:afr_log_self_heal_completion_status]
>         0-gv0-replicate-0:  metadata self heal  failed,   on /mpdis
>
>         [2015-01-14 09:18:41.924584] E
>         [afr-self-heal-common.c:233:afr_sh_print_split_brain_log]
>         0-gv0-replicate-0: Unable to self-heal contents of '/'
>         (possible split-brain). Please delete the file from all but
>         the preferred subvolume.- Pending matrix:  [ [ 0 20 ] [ 21 0 ] ]
>
>         [2015-01-14 09:18:41.925182] E
>         [afr-self-heal-common.c:2868:afr_log_self_heal_completion_status]
>         0-gv0-replicate-0:  metadata self heal  failed,   on /
>
>         [2015-01-14 09:18:41.934827] E
>         [afr-self-heal-common.c:233:afr_sh_print_split_brain_log]
>         0-gv0-replicate-0: Unable to self-heal contents of '/mpdis'
>         (possible split-brain). Please delete the file from all but
>         the preferred subvolume.- Pending matrix:  [ [ 0 4 ] [ 2 0 ] ]
>
>         [2015-01-14 09:18:41.935375] E
>         [afr-self-heal-common.c:2868:afr_log_self_heal_completion_status]
>         0-gv0-replicate-0:  metadata self heal  failed,   on /mpdis
>
>         [2015-01-14 09:18:42.943742] E
>         [afr-self-heal-common.c:233:afr_sh_print_split_brain_log]
>         0-gv0-replicate-0: Unable to self-heal contents of '/'
>         (possible split-brain). Please delete the file from all but
>         the preferred subvolume.- Pending matrix:  [ [ 0 20 ] [ 21 0 ] ]
>
>         [2015-01-14 09:18:42.944432] E
>         [afr-self-heal-common.c:2868:afr_log_self_heal_completion_status]
>         0-gv0-replicate-0:  metadata self heal  failed,   on /
>
>         [2015-01-14 09:18:42.946664] E
>         [afr-self-heal-common.c:233:afr_sh_print_split_brain_log]
>         0-gv0-replicate-0: Unable to self-heal contents of '/mpdis'
>         (possible split-brain). Please delete the file from all but
>         the preferred subvolume.- Pending matrix:  [ [ 0 4 ] [ 2 0 ] ]
>
>         [2015-01-14 09:18:42.947323] E
>         [afr-self-heal-common.c:2868:afr_log_self_heal_completion_status]
>         0-gv0-replicate-0:  metadata self heal  failed,   on /mpdis
>
>         [2015-01-14 09:18:43.955929] E
>         [afr-self-heal-common.c:233:afr_sh_print_split_brain_log]
>         0-gv0-replicate-0: Unable to self-heal contents of '/'
>         (possible split-brain). Please delete the file from all but
>         the preferred subvolume.- Pending matrix:  [ [ 0 20 ] [ 21 0 ] ]
>
>         [2015-01-14 09:18:43.956701] E
>         [afr-self-heal-common.c:2868:afr_log_self_heal_completion_status]
>         0-gv0-replicate-0:  metadata self heal  failed,   on /
>
>         [2015-01-14 09:18:43.958874] E
>         [afr-self-heal-common.c:233:afr_sh_print_split_brain_log]
>         0-gv0-replicate-0: Unable to self-heal contents of '/mpdis'
>         (possible split-brain). Please delete the file from all but
>         the preferred subvolume.- Pending matrix:  [ [ 0 4 ] [ 2 0 ] ]
>
>         Besides, I see the Input/output errors shown below when listing
>         the files under the mount point:
>
>         [root at dmf-wpst-2 mpdis]# ll
>
>         total 0
>
>         -rwxr-xr-x. 1 apache apache 0 Jan 14 04:21
>         test.rep.00.00.00.00.dmf1
>
>         -rw-r--r--. 1 apache apache 0 Jan 14 04:21
>         test.rep.00.00.00.00.dmf2
>
>         [root at dmf-wpst-2 mpdis]# ll
>
>         total 0
>
>         -rwxr-xr-x. 1 apache apache 0 Jan 14 04:21
>         test.rep.00.00.00.00.dmf1
>
>         -rw-r--r--. 1 apache apache 0 Jan 14 04:21
>         test.rep.00.00.00.00.dmf2
>
>         [root at dmf-wpst-2 mpdis]# ll
>
>         ls: cannot open directory .: Input/output error
>
>         [root at dmf-wpst-2 mpdis]# ll
>
>         ls: cannot access test.rep.00.00.00.00.dmf1: Input/output error
>
>         ls: cannot access test.rep.00.00.00.00.dmf2: Input/output error
>
>         total 0
>
>         ?????????? ? ? ? ?            ? test.rep.00.00.00.00.dmf1
>
>         ?????????? ? ? ? ?            ? test.rep.00.00.00.00.dmf2
>
>         *    Any tips about debugging further or getting this fixed up
>         would be appreciated. *
>
>         [version]: 3.5.3
>
>         [environment]: two virtual servers, each with one brick:
>
>         [root at dmf-wpst-2 mpdis]# gluster volume status
>
>         Status of volume: gv0
>
>         Gluster process                              Port    Online  Pid
>
>         ------------------------------------------------------------------------------
>
>         Brick dmf-ha-1-glusterfs:/export/vdb1/brick  49152   Y       332
>
>         Brick dmf-ha-2-glusterfs:/export/vdb1/brick  49154   Y       19396
>
>         Self-heal Daemon on localhost                N/A     Y       19410
>
>         Self-heal Daemon on 10.175.123.246           N/A     Y       999
>
>         [root at dmf-wpst-1 mpdis]# gluster volume info
>
>         Volume Name: gv0
>
>         Type: Replicate
>
>         Volume ID: 51de44c3-f01e-486d-a6b7-10c7b7a270d7
>
>         Status: Started
>
>         Number of Bricks: 1 x 2 = 2
>
>         Transport-type: tcp
>
>         Bricks:
>
>         Brick1: dmf-ha-1-glusterfs:/export/vdb1/brick
>
>         Brick2: dmf-ha-2-glusterfs:/export/vdb1/brick
>
>         Options Reconfigured:
>
>         nfs.disable: ON
>
>         network.ping-timeout: 2
>
>         storage.bd-aio: on
>
>         storage.linux-aio: on
>
>         cluster.eager-lock: on
>
>         performance.client-io-threads: on
>
>         performance.cache-refresh-timeout: 60
>
>         performance.io-thread-count: 64
>
>         performance.cache-size: 8GB
>
>         cluster.server-quorum-type: none
>
>           [mount-point info]:
>
>         1.mount command
>
>         glusterfs -p /var/run/glusterfs.pid
>         --volfile-server=dmf-ha-1-glusterfs
>         --volfile-server=dmf-ha-2-glusterfs --volfile-id=gv0 /dmfcontents
>
>         2.mount point directory hierarchy
>
>         [root at dmf-wpst-2 /]# ls -ld /dmfcontents/
>
>         drwxr-xr-x. 5 root root 71 Jan 14 04:39 /dmfcontents/
>
>         [root at dmf-wpst-2 /]# ls -ld /dmfcontents/mpdis/
>
>         drwxr-xr-x. 2 apache apache 89 Jan 14 04:39 /dmfcontents/mpdis/
>
>     hi Jifeng Li,
>          Sorry for the delay in response. Could you post the output of:
>     'getfattr -d -m. -e hex <brick-path>'
>     'getfattr -d -m. -e hex <brick-path>/mpdis'
>     'getfattr -d -m. -e hex <brick-path>/mpdis/test.rep.00.00.00.00.dmf1'
>     'getfattr -d -m. -e hex <brick-path>/mpdis/test.rep.00.00.00.00.dmf2'
>
>     Pranith
>