[Gluster-users] after hard reboot, split-brain happened, but nothing showed in gluster volume heal info command !

Zhou, Cynthia (NSB - CN/Hangzhou) cynthia.zhou at nokia-sbell.com
Thu Sep 28 06:41:44 UTC 2017


The version I am using is glusterfs 3.6.9.
Best regards,
Cynthia (周琳)
MBB SM HETRAN SW3 MATRIX
Storage
Mobile: +86 (0)18657188311

From: Karthik Subrahmanya [mailto:ksubrahm at redhat.com]
Sent: Thursday, September 28, 2017 2:37 PM
To: Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com>
Cc: Gluster-users at gluster.org; gluster-devel at gluster.org
Subject: Re: [Gluster-users] after hard reboot, split-brain happened, but nothing showed in gluster volume heal info command !



On Thu, Sep 28, 2017 at 11:41 AM, Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com> wrote:
Hi,
Thanks for the reply!
I’ve checked [1], but the problem is that the command “gluster volume heal <volume-name> info” shows nothing, so these split-brain entries can only be detected when an application tries to access them.
I can find the gfid mismatch for those split-brain entries in the mount log; however, nothing shows up in the shd log. The shd does not know about those split-brain entries because there is nothing in the indices/xattrop directory.
I guess it was there before and then got cleared by one of the heal processes, either client side or server side. I wanted to check that by examining the logs.
Which version of gluster are you running, by the way?

The logs are not available right now. When the issue is reproduced, I will provide them to you. Thanks!
Ok.


From: Karthik Subrahmanya [mailto:ksubrahm at redhat.com]
Sent: Thursday, September 28, 2017 2:02 PM
To: Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com>
Cc: Gluster-users at gluster.org; gluster-devel at gluster.org
Subject: Re: [Gluster-users] after hard reboot, split-brain happened, but nothing showed in gluster volume heal info command !

Hi,
To resolve the gfid split-brain you can follow the steps at [1].
Since the pending markers are not set on the files, they do not show up in the heal info output.
To debug this issue, I need some more data from you. Could you provide the following?
1. volume info
2. mount log
3. brick logs
4. shd log

May I also know which version of gluster you are running? From the info you have provided, it looks like an old version.
If it is, it would be great if you could upgrade to one of the latest supported releases.
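
For readers following along, the "pending markers" above are the trusted.afr.* extended attributes that getfattr prints on each brick. Below is a minimal Python sketch for decoding one of them. It assumes the commonly documented layout of three big-endian 32-bit counters (data, metadata, entry); the function names are illustrative and not part of any gluster tool.

```python
import struct


def decode_afr_changelog(hex_value: str) -> dict:
    """Split a trusted.afr.* value (as printed by `getfattr -e hex`)
    into its three pending counters.

    Assumption: the value is 12 bytes holding three big-endian
    unsigned 32-bit integers in the order data, metadata, entry.
    """
    digits = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if len(raw) != 12:
        raise ValueError(f"expected 12 bytes, got {len(raw)}")
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


def has_pending_heal(hex_value: str) -> bool:
    """True if any of the three counters is non-zero."""
    return any(decode_afr_changelog(hex_value).values())
```

An all-zero value, such as the ones reported later in this thread, decodes to zero for all three counters, which is consistent with nothing being listed in heal info.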

[1] http://docs.gluster.org/en/latest/Troubleshooting/split-brain/#fixing-directory-entry-split-brain

Thanks & Regards,
Karthik
On Wed, Sep 27, 2017 at 9:42 AM, Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com> wrote:

Hi gluster experts,

I have run into a tough “split-brain” problem. Sometimes, after a hard reboot, we find some files in split-brain; however, neither the files nor their parent directory are shown by the command “gluster volume heal <volume-name> info”, and there is no entry for them in the .glusterfs/indices/xattrop directory. Can you help shed some light on this issue? Thanks!



The following is some information from our environment.

Checking from the sn-0 client, nothing is reported as being in split-brain!

[root at sn-0:/mnt/bricks/services/brick/netserv/ethip]
# gluster v heal services info
Brick sn-0:/mnt/bricks/services/brick/
Number of entries: 0

Brick sn-1:/mnt/bricks/services/brick/
Number of entries: 0

[root at sn-0:/mnt/bricks/services/brick/netserv/ethip]
[root at sn-0:/mnt/bricks/services/brick/netserv/ethip]
# gluster v heal services info split-brain
Gathering list of split brain entries on volume services has been successful

Brick sn-0.local:/mnt/bricks/services/brick
Number of entries: 0

Brick sn-1.local:/mnt/bricks/services/brick
Number of entries: 0

[root at sn-0:/mnt/bricks/services/brick/netserv/ethip]
# ls -l /mnt/services/netserv/ethip/
ls: cannot access '/mnt/services/netserv/ethip/sn-2': Input/output error
ls: cannot access '/mnt/services/netserv/ethip/mn-1': Input/output error
total 3
-rw-r--r-- 1 root root 144 Sep 26 20:35 as-0
-rw-r--r-- 1 root root 144 Sep 26 20:35 as-1
-rw-r--r-- 1 root root 145 Sep 26 20:35 as-2
-rw-r--r-- 1 root root 237 Sep 26 20:36 mn-0
-????????? ? ?    ?      ?            ? mn-1
-rw-r--r-- 1 root root  73 Sep 26 20:35 sn-0
-rw-r--r-- 1 root root  73 Sep 26 20:35 sn-1
-????????? ? ?    ?      ?            ? sn-2
[root at sn-0:/mnt/bricks/services/brick/netserv/ethip]
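
Since heal info reports nothing, one way to spot such entries is to stat everything under the mount and collect the names that fail with an I/O error, as ls does above. The following is a minimal Python sketch of that idea; the function name is hypothetical and it checks a single directory only.

```python
import errno
import os


def find_io_error_entries(directory: str) -> list:
    """Return the names in `directory` whose lstat() fails with EIO.

    This mirrors what `ls -l` ran into above: the names can be listed,
    but looking up the individual entry returns Input/output error.
    """
    bad = []
    for name in sorted(os.listdir(directory)):
        try:
            os.lstat(os.path.join(directory, name))
        except OSError as exc:
            if exc.errno == errno.EIO:
                bad.append(name)
            else:
                raise  # any other error is unexpected here; surface it
    return bad
```

Run against /mnt/services/netserv/ethip/ in the state shown above, this would be expected to report mn-1 and sn-2, although it has not been run on that system.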

Checking from the glusterfs server side, the gfid of mn-1 differs between sn-0 and sn-1.

[SN-0]
[root at sn-0:/mnt/bricks/services/brick/.glusterfs/53/a3]
# getfattr -m . -d -e hex /mnt/bricks/services/brick/netserv/ethip
getfattr: Removing leading '/' from absolute path names
# file: mnt/bricks/services/brick/netserv/ethip
trusted.gfid=0xee71d19ac0f84f60b11eb42a083644e4
trusted.glusterfs.dht=0x000000010000000000000000ffffffff

[root at sn-0:/mnt/bricks/services/brick/netserv/ethip]
# getfattr -m . -d -e hex mn-1
# file: mn-1
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.services-client-0=0x000000000000000000000000
trusted.afr.services-client-1=0x000000000000000000000000
trusted.gfid=0x53a33f437464475486f31c4e44d83afd
[root at sn-0:/mnt/bricks/services/brick/netserv/ethip]
# stat mn-1
  File: mn-1
  Size: 237              Blocks: 16         IO Block: 4096   regular file
Device: fd51h/64849d    Inode: 2536        Links: 2
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2017-09-26 20:30:25.679000000 +0300
Modify: 2017-09-26 20:30:24.604000000 +0300
Change: 2017-09-26 20:30:24.610000000 +0300
Birth: -
[root at sn-0:/mnt/bricks/services/brick/.glusterfs/indices/xattrop]
# ls
xattrop-63f8bbcb-7fa6-4fc8-b721-675a05de0ab3
[root at sn-0:/mnt/bricks/services/brick/.glusterfs/indices/xattrop]

[root at sn-0:/mnt/bricks/services/brick/.glusterfs/53/a3]
# ls
53a33f43-7464-4754-86f3-1c4e44d83afd
[root at sn-0:/mnt/bricks/services/brick/.glusterfs/53/a3]
# stat 53a33f43-7464-4754-86f3-1c4e44d83afd
  File: 53a33f43-7464-4754-86f3-1c4e44d83afd
  Size: 237              Blocks: 16         IO Block: 4096   regular file
Device: fd51h/64849d    Inode: 2536        Links: 2
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2017-09-26 20:30:25.679000000 +0300
Modify: 2017-09-26 20:30:24.604000000 +0300
Change: 2017-09-26 20:30:24.610000000 +0300
Birth: -
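
The relation between the trusted.gfid value and the file under .glusterfs shown above (same inode 2536, Links: 2) can be written down as a small helper. This is only a sketch based on the paths visible in this output; the function name is illustrative.

```python
import uuid
from pathlib import PurePosixPath


def gfid_backend_path(brick_root: str, gfid_hex: str) -> PurePosixPath:
    """Map a trusted.gfid hex value to its entry under <brick>/.glusterfs.

    The gfid is formatted as a UUID; its first two and next two hex
    digits name the two directory levels, as in .glusterfs/53/a3/ above.
    """
    digits = gfid_hex[2:] if gfid_hex.lower().startswith("0x") else gfid_hex
    gfid = str(uuid.UUID(hex=digits))  # e.g. 53a33f43-7464-4754-86f3-1c4e44d83afd
    return PurePosixPath(brick_root) / ".glusterfs" / gfid[0:2] / gfid[2:4] / gfid
```

For the sn-0 copy of mn-1, gfid_backend_path("/mnt/bricks/services/brick", "0x53a33f437464475486f31c4e44d83afd") gives /mnt/bricks/services/brick/.glusterfs/53/a3/53a33f43-7464-4754-86f3-1c4e44d83afd, which matches the file stat'ed above.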

#
[SN-1]

[root at sn-1:/mnt/bricks/services/brick/.glusterfs/f7/f1]
#  getfattr -m . -d -e hex /mnt/bricks/services/brick/netserv/ethip
getfattr: Removing leading '/' from absolute path names
# file: mnt/bricks/services/brick/netserv/ethip
trusted.gfid=0xee71d19ac0f84f60b11eb42a083644e4
trusted.glusterfs.dht=0x000000010000000000000000ffffffff

[root at sn-1:/mnt/bricks/services/brick/.glusterfs/f7/f1]
#
[root at sn-1:/mnt/bricks/services/brick/netserv/ethip]
# getfattr -m . -d -e hex mn-1
# file: mn-1
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.services-client-0=0x000000000000000000000000
trusted.afr.services-client-1=0x000000000000000000000000
trusted.gfid=0xf7f10f980acc4041a015e48018571d4a

[root at sn-1:/mnt/bricks/services/brick/netserv/ethip]
# stat mn-1
  File: mn-1
  Size: 237              Blocks: 16         IO Block: 4096   regular file
Device: fd41h/64833d    Inode: 2608        Links: 2
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2017-09-26 20:31:48.231000000 +0300
Modify: 2017-09-26 20:31:46.872000000 +0300
Change: 2017-09-26 20:31:46.875000000 +0300
Birth: -
[root at sn-1:/mnt/bricks/services/brick/.glusterfs/indices/xattrop]
# ls
xattrop-240713ea-eda3-4914-a55d-7dd4aed724ed
[root at sn-1:/mnt/bricks/services/brick/.glusterfs/indices/xattrop]

[root at sn-1:/mnt/bricks/services/brick/.glusterfs/f7/f1]
# stat f7f10f98-0acc-4041-a015-e48018571d4a
  File: f7f10f98-0acc-4041-a015-e48018571d4a
  Size: 237              Blocks: 16         IO Block: 4096   regular file
Device: fd41h/64833d    Inode: 2608        Links: 2
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2017-09-26 20:31:48.231000000 +0300
Modify: 2017-09-26 20:31:46.872000000 +0300
Change: 2017-09-26 20:31:46.875000000 +0300
Birth: -
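
The comparison done by hand above (same path, different trusted.gfid on the two bricks) can also be scripted. The sketch below parses the text printed by getfattr -m . -d -e hex on each brick and compares the gfid values; the function names are hypothetical and the parser only handles the simple name=value lines seen in this output.

```python
def parse_getfattr(output: str) -> dict:
    """Parse `getfattr -d -e hex` text into a {name: value} dict.

    Comment lines ("# file: ...") and messages without "=" are skipped.
    """
    attrs = {}
    for line in output.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        attrs[name] = value
    return attrs


def gfid_mismatch(output_a: str, output_b: str) -> bool:
    """True if the two getfattr outputs carry different trusted.gfid values."""
    gfid_a = parse_getfattr(output_a).get("trusted.gfid")
    gfid_b = parse_getfattr(output_b).get("trusted.gfid")
    if gfid_a is None or gfid_b is None:
        raise ValueError("trusted.gfid missing from one of the outputs")
    return gfid_a.lower() != gfid_b.lower()
```

Fed with the two mn-1 outputs above, gfid_mismatch returns True; fed with the two outputs for the parent directory netserv/ethip, it returns False.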





