[Gluster-users] Monitoring and solving split-brain

Ravishankar N ravishankar at redhat.com
Wed Oct 14 17:13:53 UTC 2015



On 10/14/2015 10:05 PM, Игорь Бирюлин wrote:
> Thanks for your reply.
>
> If I list the file on the mount point (/repo):
> # ls /repo/xxx/keyrings/debian-keyring.gpg
> ls: cannot access /repo/xxx/keyrings/debian-keyring.gpg: Input/output error
> #
> In the log /var/log/glusterfs/repo.log I see:
> [2015-10-14 16:27:36.006815] W [MSGID: 108008] [afr-self-heal-name.c:359:afr_selfheal_name_gfid_mismatch_check] 0-repofiles-replicate-0: GFID mismatch for <gfid:4a99bf9d-7423-47d9-a09d-fabaa333eccf>/debian-keyring.gpg 69aaeee6-624b-400a-aa46-b5c6166c014c on repofiles-client-1 and b95ad06e-786a-44e5-ba71-af661982071f on repofiles-client-0

So the file has ended up in GFID split-brain (the trusted.gfid value
differs between the two bricks, as seen in your output below), which the
split-brain resolution commands cannot handle. Those commands can only
resolve data and metadata split-brain. I'm afraid you'll need to manually
delete one copy of the file, together with its .glusterfs hardlink, from
the brick that holds it. I'm not sure why the parent directory was not
listed in the 'gluster v heal VOLNAME info split-brain' output.
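
For reference, a minimal sketch of how the .glusterfs hardlink for a
regular file can be located from its trusted.gfid value. The helper name
gfid_link_path is made up for illustration, and the script only prints the
two paths; it deletes nothing. The GFID and brick path are copied from the
second node's getfattr output further down in this thread; picking that
copy as the one to discard is purely illustrative, since deciding which
copy to keep is up to the administrator.

```shell
#!/bin/sh
# Sketch only: prints the paths a manual cleanup would remove; deletes nothing.

# gfid_link_path BRICK_ROOT GFID_HEX   (hypothetical helper name)
# Map a trusted.gfid value, as printed by `getfattr -e hex`, to the
# brick-side hardlink:
#   BRICK_ROOT/.glusterfs/<hex chars 1-2>/<hex chars 3-4>/<gfid in UUID form>
gfid_link_path() {
    brick=$1
    hex=${2#0x}                                  # drop the 0x prefix if present
    a=$(printf '%s' "$hex" | cut -c1-2)
    b=$(printf '%s' "$hex" | cut -c3-4)
    # Re-insert the dashes of the canonical 8-4-4-4-12 UUID layout.
    uuid=$(printf '%s' "$hex" |
        sed 's/^\(.\{8\}\)\(.\{4\}\)\(.\{4\}\)\(.\{4\}\)\(.\{12\}\)$/\1-\2-\3-\4-\5/')
    printf '%s/.glusterfs/%s/%s/%s\n' "$brick" "$a" "$b" "$uuid"
}

# Values copied from the second node's getfattr output in this thread.
BRICK=/storage/gluster_brick_repofiles
FILE=xxx/keyrings/debian-keyring.gpg
GFID=0x69aaeee6624b400aaa46b5c6166c014c

echo "would remove: $BRICK/$FILE"
echo "would remove: $(gfid_link_path "$BRICK" "$GFID")"
```

Once both paths are removed on the chosen brick, a lookup of the file from
the mount (or a `gluster volume heal repofiles` run) should let self-heal
recreate it from the remaining copy.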

> [2015-10-14 16:27:36.008996] W [fuse-bridge.c:451:fuse_entry_cbk] 0-glusterfs-fuse: 65961: LOOKUP() /xxx/keyrings/debian-keyring.gpg => -1 (Input/output error)
>
> On the first node getfattr returns:
> # getfattr -d -m . -e hex /storage/gluster_brick_repofiles/xxx/keyrings/debian-keyring.gpg
> getfattr: Removing leading '/' from absolute path names
> # file: storage/gluster_brick_repofiles/xxx/keyrings/debian-keyring.gpg
> trusted.afr.dirty=0x000000000000000000000000
> trusted.afr.repofiles-client-1=0x000000020000000100000000
> trusted.bit-rot.version=0x020000000000000055fdf0910003b37b
> trusted.gfid=0xb95ad06e786a44e5ba71af661982071f
> # ls -l /storage/gluster_brick_repofiles/xxx/keyrings/debian-keyring.gpg
> -rw-r--r-- 2 root root 3456271 Oct 13 19:00 /storage/gluster_brick_repofiles/xxx/keyrings/debian-keyring.gpg
> #
>
> On the second node getfattr returns:
> # getfattr -d -m . -e hex /storage/gluster_brick_repofiles/xxx/keyrings/debian-keyring.gpg
> getfattr: Removing leading '/' from absolute path names
> # file: storage/gluster_brick_repofiles/xxx/keyrings/debian-keyring.gpg
> trusted.afr.dirty=0x000000000000000000000000
> trusted.afr.repofiles-client-0=0x000000000000000000000000
> trusted.bit-rot.version=0x020000000000000055f97b57000dc3c6
> trusted.gfid=0x69aaeee6624b400aaa46b5c6166c014c
> # ls -l /storage/gluster_brick_repofiles/xxx/keyrings/debian-keyring.gpg
> -rw-r--r-- 2 root root 3450346 Oct  9 16:22 /storage/gluster_brick_repofiles/xxx/keyrings/debian-keyring.gpg
> #
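
To make the trusted.afr values above easier to read: each one is 12 bytes,
holding three 32-bit big-endian counters for pending data, metadata and
entry operations, in that order. The sketch below only splits and prints
those counters; decode_afr is a made-up helper name, not a Gluster tool.

```shell
#!/bin/sh
# decode_afr HEXVALUE   (hypothetical helper name)
# Split a 12-byte trusted.afr.* value into its three 32-bit counters:
# pending data, metadata and entry operations.
decode_afr() {
    hex=${1#0x}                                  # drop the 0x prefix if present
    d=$(printf '%s' "$hex" | cut -c1-8)          # data counter
    m=$(printf '%s' "$hex" | cut -c9-16)         # metadata counter
    e=$(printf '%s' "$hex" | cut -c17-24)        # entry counter
    printf 'data=%d metadata=%d entry=%d\n' "0x$d" "0x$m" "0x$e"
}

# First node, trusted.afr.repofiles-client-1:
decode_afr 0x000000020000000100000000
# Second node, trusted.afr.repofiles-client-0:
decode_afr 0x000000000000000000000000
```

For the two values in this thread, the first brick records pending data
and metadata operations against the second brick, while the second brick
records nothing against the first. Only one side blames the other, which
fits the empty 'info split-brain' output for data and metadata; the
conflict here is the differing trusted.gfid.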
>
> Best regards,
> Igor
>
> 2015-10-14 19:14 GMT+03:00 Ravishankar N <ravishankar at redhat.com>:
>
>     On 10/14/2015 07:02 PM, Игорь Бирюлин wrote:
>>     Hello,
>>     today I found split-brain in my 2-node replica set. The 'ls'
>>     command started returning 'Input/output error'.
>
>     What does the mount log (/var/log/glusterfs/<path-to-mount>.log)
>     say when you get this error?
>
>     Can you run getfattr as root for the file from *both* bricks and
>     share the result?
>     `getfattr -d -m . -e hex
>     /storage/gluster_brick_repofiles/xxx/keyrings/debian-keyring.gpg`
>
>     Thanks.
>     Ravi
>
>
>>     But the command 'gluster v heal VOLNAME info split-brain' does not
>>     show the problem files:
>>     # gluster v heal repofiles info split-brain
>>     Brick dist-int-master03.xxx:/storage/gluster_brick_repofiles
>>     Number of entries in split-brain: 0
>>
>>     Brick dist-int-master04.xxx:/storage/gluster_brick_repofiles
>>     Number of entries in split-brain: 0
>>     #
>>     In the output of 'gluster v heal VOLNAME info' I see the problem
>>     files (/xxx/keyrings/debian-keyring.gpg, /repos.json), but without
>>     split-brain markers:
>>     # gluster v heal repofiles info
>>     Brick dist-int-master03.xxx:/storage/gluster_brick_repofiles
>>     /xxx/keyrings/debian-keyring.gpg
>>     <gfid:09ec49c9-911a-4b83-abe8-080fe79e7c69>
>>     <gfid:35c51b11-a7fb-496d-9e88-6d5a54fda7da>
>>     /repos.json
>>     <gfid:4f5cb2b5-30e2-43b0-a935-cfc42af883bf>
>>     <gfid:9d2fc354-37c0-47a7-b9f3-379504cba797>
>>     <gfid:cd86a246-9fc4-47d2-bb4d-67566677f77a>
>>     <gfid:b932eed0-07e9-45c5-943e-7478e9f654b4>
>>     <gfid:28bf2ffe-948c-4c7d-bce6-966242338581>
>>     <gfid:ee5659ae-1335-42c5-a852-790387b4213b>
>>     <gfid:fdfb6b8c-3c04-435a-b8d3-8d8341b66409>
>>     Number of entries: 11
>>
>>     Brick dist-int-master04.xxx:/storage/gluster_brick_repofiles
>>     Number of entries: 0
>>     #
>>
>>     I couldn't resolve the split-brain with the new standard command:
>>     # gluster v heal repofiles  split-brain bigger-file /repos.json
>>     Lookup failed on /repos.json:Input/output error
>>     Volume heal failed.
>>     #
>>
>>     Additional info:
>>     # gluster v info
>>      Volume Name: repofiles
>>      Type: Replicate
>>      Volume ID: 4b0e2a74-f1ca-4fe7-8518-23919e1b5fa0
>>      Status: Started
>>      Number of Bricks: 1 x 2 = 2
>>      Transport-type: tcp
>>      Bricks:
>>      Brick1: dist-int-master03.xxx:/storage/gluster_brick_repofiles
>>      Brick2: dist-int-master04.xxx:/storage/gluster_brick_repofiles
>>      Options Reconfigured:
>>      performance.readdir-ahead: on
>>      client.event-threads: 4
>>      server.event-threads: 4
>>      cluster.lookup-optimize: on
>>     # cat /etc/issue
>>     Ubuntu 14.04.3 LTS \n \l
>>     # dpkg -l | grep glusterfs
>>     ii  glusterfs-client  3.7.5-ubuntu1~trusty1  amd64  clustered file-system (client package)
>>     ii  glusterfs-common  3.7.5-ubuntu1~trusty1  amd64  GlusterFS common libraries and translator modules
>>     ii  glusterfs-server  3.7.5-ubuntu1~trusty1  amd64  clustered file-system (server package)
>>     #
>>
>>     I have 2 questions:
>>     1. Why doesn't 'gluster v heal VOLNAME info split-brain' show the
>>     actual split-brain? Why don't I see markers like 'possible in
>>     split-brain' in 'gluster v heal VOLNAME info'?
>>     How can I monitor my gluster installation if these commands
>>     don't show problems?
>>     2. Why doesn't 'gluster volume heal VOLNAME split-brain bigger-file
>>     FILE' resolve the split-brain? I understand that I can resolve
>>     split-brain by removing files from a brick, but I wanted to use
>>     this killer feature.
>>
>>     Best regards,
>>     Igor
>>
>>
>>     _______________________________________________
>>     Gluster-users mailing list
>>     Gluster-users at gluster.org <mailto:Gluster-users at gluster.org>
>>     http://www.gluster.org/mailman/listinfo/gluster-users
>
>
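
On the monitoring question in the original message: since 'info
split-brain' reported 0 entries here while plain 'heal info' listed 11,
one possible approach is to watch the plain 'heal info' count as well.
The sketch below sums the "Number of entries: N" lines seen in the output
quoted in this thread. The function name pending_heals and the idea of
alerting on a non-zero total are assumptions for illustration, not a
Gluster feature.

```shell
#!/bin/sh
# pending_heals   (hypothetical helper name)
# Read `gluster volume heal VOLNAME info` output on stdin and print the sum
# of all "Number of entries: N" lines. Lines of the form
# "Number of entries in split-brain: N" are deliberately not matched.
pending_heals() {
    awk -F': *' '/^Number of entries:/ { total += $2 } END { print total + 0 }'
}

# Assumed usage on a server node (volume name taken from this thread):
#   gluster volume heal repofiles info | pending_heals
#
# Demonstration with an excerpt of the output quoted in this thread:
pending_heals <<'EOF'
Brick dist-int-master03.xxx:/storage/gluster_brick_repofiles
/xxx/keyrings/debian-keyring.gpg
/repos.json
Number of entries: 11

Brick dist-int-master04.xxx:/storage/gluster_brick_repofiles
Number of entries: 0
EOF
```

A non-zero total can be transient while self-heal is catching up, so a
check built on this would probably want to alert only when the count stays
above zero across several runs. Searching the mount log for "GFID mismatch"
warnings like the one quoted earlier may be a useful complement.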
