[Gluster-devel] [Gluster-users] gluster volume heal info split brain command not showing files in split-brain

ABHISHEK PALIWAL abhishpaliwal at gmail.com
Fri Mar 18 05:48:21 UTC 2016


On Fri, Mar 18, 2016 at 1:41 AM, Anuradha Talur <atalur at redhat.com> wrote:

>
>
> ----- Original Message -----
> > From: "ABHISHEK PALIWAL" <abhishpaliwal at gmail.com>
> > To: "Anuradha Talur" <atalur at redhat.com>
> > Cc: gluster-users at gluster.org, gluster-devel at gluster.org
> > Sent: Thursday, March 17, 2016 4:00:58 PM
> > Subject: Re: [Gluster-users] gluster volume heal info split brain
> command not showing files in split-brain
> >
> > Hi Anuradha,
> >
> > Please confirm me, this is bug in glusterfs or we need to do something at
> > our end.
> >
> > Because this problem is stopping our development.
> Hi Abhishek,
>
> When you say file is not getting sync, do you mean that the files are not
> in sync after healing or that the existing GFID mismatch that you tried to
> heal failed?
> In one of the previous mails, you said that the GFID mismatch problem is
> resolved, is it not so?
>

As I mentioned, I have two scenarios:
1. In the first scenario, files are in split-brain but are not reported by
the split-brain or heal info commands. We identify those files only when
I/O errors occur on them (the method described in the link you shared
earlier), but this is not reliable in our case: other modules depend on
these files and cannot wait while a heal is in progress. It also means we
have to manually identify every file that returns an I/O error, which is
not a sound approach. It would be better if the split-brain or heal info
command identified the files, so that we could run self-heal on exactly
those files based on its output.

2. In the second scenario, we have one log file with a fixed size that
wraps its data, and the system writes to it continuously, even while the
other brick is down or rebooting. We have two bricks in replica mode; when
one goes down and comes back up, this file remains out of sync. For this
file:
A. It is not reported by the split-brain or heal info commands.
B. We do not get any I/O error.
C. There is no GFID mismatch.

Here is the getfattr output for this file:

Brick B, which rebooted and has the file out of sync:

getfattr -d -m . -e hex
opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml

# file:
opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
trusted.afr.c_glusterfs-client-1=0x000000000000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x000000000000000b56d6dd1d000ec7a9
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae


Brick A, where the file was being updated while Brick B was rebooting:

getfattr -d -m . -e hex
opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
trusted.afr.c_glusterfs-client-0=0x000000080000000000000000
trusted.afr.c_glusterfs-client-2=0x000000020000000000000000
trusted.afr.c_glusterfs-client-4=0x000000020000000000000000
trusted.afr.c_glusterfs-client-6=0x000000020000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x000000000000000b56d6dcb7000c87e7
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
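
To make the two dumps above easier to compare, here is a minimal Python
sketch that decodes them. It assumes the usual AFR changelog layout of three
big-endian 32-bit counters (pending data, metadata and entry operations).
The helper names parse_getfattr and decode_afr are illustrative only and are
not part of any Gluster tool. Which client index maps to which brick depends
on the volume layout and is not assumed here.

```python
import struct


def parse_getfattr(dump: str) -> dict:
    """Parse `getfattr -d -m . -e hex` output into {xattr name: raw bytes}."""
    attrs = {}
    for line in dump.splitlines():
        name, sep, value = line.strip().partition("=0x")
        if sep:  # skips blank lines, "# file:" headers and path lines
            attrs[name] = bytes.fromhex(value)
    return attrs


def decode_afr(value: bytes) -> dict:
    """Split a 12-byte AFR changelog into three big-endian 32-bit counters."""
    data, metadata, entry = struct.unpack(">III", value)
    return {"data": data, "metadata": metadata, "entry": entry}


# Values copied from the getfattr output in this mail.
brick_a = parse_getfattr("""
trusted.afr.c_glusterfs-client-0=0x000000080000000000000000
trusted.afr.c_glusterfs-client-2=0x000000020000000000000000
trusted.afr.c_glusterfs-client-4=0x000000020000000000000000
trusted.afr.c_glusterfs-client-6=0x000000020000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
""")
brick_b = parse_getfattr("""
trusted.afr.c_glusterfs-client-1=0x000000000000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
""")

for label, attrs in (("A", brick_a), ("B", brick_b)):
    for name, value in sorted(attrs.items()):
        if name.startswith("trusted.afr."):
            print(label, name, decode_afr(value))

print("GFID match:", brick_a["trusted.gfid"] == brick_b["trusted.gfid"])
```

Decoded this way, Brick A records pending data operations (8 against
client-0 and 2 against each of client-2, client-4 and client-6), Brick B
records none, and the GFIDs on both bricks are identical.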

This scenario is not 100% reproducible, but we can reproduce it once or
twice in 20 cycles.


> To your question about finding the files in split-brain, can you try
> running gluster volume heal <volname> info? Heal info is also supposed to
> show
> the files in split-brain.


The heal info command does not list these files either.
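
For context, this is roughly how we would like to consume the command
output. It is a minimal Python sketch that assumes the output format shown
earlier in this thread ("Brick ..." lines followed by the entries and a
"Number of entries ..." line); other Gluster versions may print additional
lines. The function names are hypothetical.

```python
import subprocess


def run_heal_info(volname: str) -> str:
    """Run the split-brain query from this thread and return its raw output."""
    cmd = ["gluster", "volume", "heal", volname, "info", "split-brain"]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def parse_heal_info(output: str) -> dict:
    """Map each brick to the entries listed under it in heal info output."""
    entries = {}
    brick = None
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("Brick "):
            brick = line[len("Brick "):]
            entries[brick] = []
        elif not line or brick is None:
            continue  # blank line, or text before the first brick header
        elif line.startswith(("Number of entries", "Status:")):
            continue  # summary lines, not file entries
        else:
            entries[brick].append(line)
    return entries


# Output copied from the earlier mail in this thread.
sample = """\
Brick 10.32.1.144:/opt/lvmdir/c2/brick
Number of entries in split-brain: 0

Brick 10.32.0.48:/opt/lvmdir/c2/brick
Number of entries in split-brain: 0
"""
result = parse_heal_info(sample)
print(result)
```

With the output we currently get, every brick maps to an empty list, so a
script built on this has nothing to heal even though the files are out of
sync.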


> If the GFID mismatch is not resolved yet, it would really help understand
> the underlying problem if you give the output of getfattr -m. -de hex
> <path-to-parent-directory-of-the-file-in-GFID-mismatch>.
>

It resolved the problem, but we don't want to rely on that solution
(described in the link you provided); we want consistent, reliable output
from the split-brain or heal info commands.

I have already shared the parent directory output you are asking for
earlier in this mail chain. I am attaching it once more for your reference.

Please let me know if you have any questions about the above scenarios.

Regards,
Abhishek

> >
> > Regards,
> > Abhishek
> >
> > On Thu, Mar 17, 2016 at 1:54 PM, ABHISHEK PALIWAL <
> abhishpaliwal at gmail.com>
> > wrote:
> >
> > > Hi Anuradha,
> > >
> > > But in this case I need to do tail on each file which is time taking
> > > process and other end I can't pause my module until these file is
> getting
> > > healed.
> > >
> > > Any how I need the output of the split-brain to resolve this problem.
> > >
> > > Regards,
> > > Abhishek
> > >
> > > On Wed, Mar 16, 2016 at 6:21 PM, ABHISHEK PALIWAL <
> abhishpaliwal at gmail.com
> > > > wrote:
> > >
> > >> Hi Anuradha,
> > >>
> > >> The issue is resolved but we have one more issue something similar to
> > >> this one in which the file is not getting sync after the steps
> followed,
> > >> mentioned in the link which you shared in the previous mail.
> > >>
> > >> And problem is that why split-brain command is not showing split-brain
> > >> entries.
> > >>
> > >> Regards,
> > >> Abhishek
> > >>
> > >> On Wed, Mar 16, 2016 at 6:06 PM, Anuradha Talur <atalur at redhat.com>
> > >> wrote:
> > >>
> > >>>
> > >>>
> > >>> ----- Original Message -----
> > >>> > From: "Anuradha Talur" <atalur at redhat.com>
> > >>> > To: "ABHISHEK PALIWAL" <abhishpaliwal at gmail.com>
> > >>> > Cc: gluster-users at gluster.org, gluster-devel at gluster.org
> > >>> > Sent: Wednesday, March 16, 2016 5:32:26 PM
> > >>> > Subject: Re: [Gluster-users] gluster volume heal info split brain
> > >>> command not showing files in split-brain
> > >>> >
> > >>> >
> > >>> >
> > >>> > ----- Original Message -----
> > >>> > > From: "ABHISHEK PALIWAL" <abhishpaliwal at gmail.com>
> > >>> > > To: "Anuradha Talur" <atalur at redhat.com>
> > >>> > > Cc: gluster-users at gluster.org, gluster-devel at gluster.org
> > >>> > > Sent: Wednesday, March 16, 2016 4:39:26 PM
> > >>> > > Subject: Re: [Gluster-users] gluster volume heal info split brain
> > >>> command
> > >>> > > not showing files in split-brain
> > >>> > >
> > >>> > > Hi Anuradha,
> > >>> > >
> > >>> > > I am doing the same which is mentioned in the link you shared
> and It
> > >>> has
> > >>> > > been resolved the issue.
> > >>> > >
> > >>> > > But my question is if it is the split-brain scenario then why the
> > >>> command
> > >>> > > "gluster volume heal info split-brain"
> > >>> > >  not showing these files in the output even not the parent
> directory
> > >>> is
> > >>> > > present in split-brain.
> > >>> > >
> > >>> > > Please find the requested logs
> > >>>
> > >>> Abhishek,
> > >>>
> > >>> Yes, ideally it should show. I will look into it. The only reason I
> can
> > >>> think of, is when parent directory did not have any pending markers
> to
> > >>> indicate split-brain; which is why I asked getfattr output for the
> parent
> > >>> directory too. But if the issue is resolved, there isn't much info
> we can
> > >>> get out of it. Thanks for sharing the logs. Will see what could have
> > >>> caused
> > >>> this.
> > >>>
> > >>> > Abhishek,
> > >>> >
> > >>> > Yes, ideally it should show. I will look into it.
> > >>> > I saw another case with this issue. The parent directory did not
> have
> > >>> any pe
> > >>> > >
> > >>> > > On Wed, Mar 16, 2016 at 4:20 PM, Anuradha Talur <
> atalur at redhat.com>
> > >>> wrote:
> > >>> > >
> > >>> > > > Hi Abhishek,
> > >>> > > >
> > >>> > > > The files that are reporting i/o error have gfid-mismatch. This
> > >>> situation
> > >>> > > > is called directory or entry split-brain. You can find steps to
> > >>> resolve
> > >>> > > > this kind of split brain here :
> > >>> > > >
> > >>>
> https://gluster.readthedocs.org/en/latest/Troubleshooting/split-brain/ .
> > >>> > > >
> > >>> > > > Ideally, the parent directories of these files have to be
> listed
> > >>> in heal
> > >>> > > > info split-brain output. Can you please get extended
> attributes of
> > >>> parent
> > >>> > > > directories of the files that show i/o error (Same getfattr
> > >>> command that
> > >>> > > > you previously used.) ?
> > >>> > > >
> > >>> > > > ----- Original Message -----
> > >>> > > > > From: "ABHISHEK PALIWAL" <abhishpaliwal at gmail.com>
> > >>> > > > > To: "Anuradha Talur" <atalur at redhat.com>
> > >>> > > > > Cc: gluster-users at gluster.org, gluster-devel at gluster.org
> > >>> > > > > Sent: Thursday, March 10, 2016 11:22:35 AM
> > >>> > > > > Subject: Re: [Gluster-users] gluster volume heal info split
> brain
> > >>> > > > command not showing files in split-brain
> > >>> > > > >
> > >>> > > > > Hi Anuradha,
> > >>> > > > >
> > >>> > > > > Please find the glusterfs and glusterd logs directory as an
> > >>> attachment.
> > >>> > > > >
> > >>> > > > > Regards,
> > >>> > > > > Abhishek
> > >>> > > > >
> > >>> > > > >
> > >>> > > > >
> > >>> > > > > On Wed, Mar 9, 2016 at 5:54 PM, ABHISHEK PALIWAL <
> > >>> > > > abhishpaliwal at gmail.com>
> > >>> > > > > wrote:
> > >>> > > > >
> > >>> > > > > > Hi Anuradha,
> > >>> > > > > >
> > >>> > > > > > Sorry for late reply.
> > >>> > > > > >
> > >>> > > > > > Please find the requested logs below:
> > >>> > > > > >
> > >>> > > > > > Remote: 10.32.0.48
> > >>> > > > > > Local : 10.32.1.144
> > >>> > > > > >
> > >>> > > > > > Local:
> > >>> > > > > > #gluster volume heal c_glusterfs info split-brain
> > >>> > > > > > Brick 10.32.1.144:/opt/lvmdir/c2/brick
> > >>> > > > > > Number of entries in split-brain: 0
> > >>> > > > > >
> > >>> > > > > > Brick 10.32.0.48:/opt/lvmdir/c2/brick
> > >>> > > > > > Number of entries in split-brain: 0
> > >>> > > > > >
> > >>> > > > > > Remote:
> > >>> > > > > > #gluster volume heal c_glusterfs info split-brain
> > >>> > > > > > Brick 10.32.1.144:/opt/lvmdir/c2/brick
> > >>> > > > > > Number of entries in split-brain: 0
> > >>> > > > > >
> > >>> > > > > > Brick 10.32.0.48:/opt/lvmdir/c2/brick
> > >>> > > > > > Number of entries in split-brain: 0
> > >>> > > > > >
> > >>> > > > > > auto-sync.sh.
> > >>> > > > > > Here you can see that i/o error is detected. Below is the
> > >>> required
> > >>> > > > > > meta
> > >>> > > > > > data from both the bricks.
> > >>> > > > > >
> > >>> > > > > > 1)
> > >>> > > > > > stat: cannot stat
> > >>> '/mnt/c//public_html/cello/ior_files/nameroot.ior':
> > >>> > > > > > Input/output error
> > >>> > > > > > Remote:
> > >>> > > > > >
> > >>> > > > > > getfattr -d -m . -e hex
> > >>> > > > > >
> opt/lvmdir/c2/brick/public_html/cello/ior_files/nameroot.ior
> > >>> > > > > > # file:
> > >>> opt/lvmdir/c2/brick/public_html/cello/ior_files/nameroot.ior
> > >>> > > > > > trusted.afr.dirty=0x000000000000000000000000
> > >>> > > > > > trusted.bit-rot.version=0x000000000000000256ded2f6000ad80f
> > >>> > > > > > trusted.gfid=0x771221a7bb3c4f1aade40ce9e38a95ee
> > >>> > > > > >
> > >>> > > > > > Local:
> > >>> > > > > >
> > >>> > > > > > getfattr -d -m . -e hex
> > >>> > > > > >
> opt/lvmdir/c2/brick/public_html/cello/ior_files/nameroot.ior
> > >>> > > > > > # file:
> > >>> opt/lvmdir/c2/brick/public_html/cello/ior_files/nameroot.ior
> > >>> > > > > > trusted.bit-rot.version=0x000000000000000256ded38f000e3a51
> > >>> > > > > > trusted.gfid=0x8ea33f46703c4e2d95c09153c1b858fd
> > >>> > > > > >
> > >>> > > > > >
> > >>> > > > > > 2)
> > >>> > > > > > stat: cannot stat '/mnt/c//security/corbasecurity':
> > >>> Input/output
> > >>> > > > > > error
> > >>> > > > > > Remote:
> > >>> > > > > >
> > >>> > > > > > getfattr -d -m . -e hex
> > >>> opt/lvmdir/c2/brick/security/corbasecurity
> > >>> > > > > > # file: opt/lvmdir/c2/brick/security/corbasecurity
> > >>> > > > > > trusted.afr.dirty=0x000000000000000000000000
> > >>> > > > > > trusted.bit-rot.version=0x000000000000000256ded2f6000ad80f
> > >>> > > > > > trusted.gfid=0xd298b7a0c8834f3e99abb39741363013
> > >>> > > > > >
> > >>> > > > > > Local:
> > >>> > > > > >
> > >>> > > > > > getfattr -d -m . -e hex
> > >>> opt/lvmdir/c2/brick/security/corbasecurity
> > >>> > > > > > # file: opt/lvmdir/c2/brick/security/corbasecurity
> > >>> > > > > > trusted.bit-rot.version=0x000000000000000256ded38f000e3a51
> > >>> > > > > > trusted.gfid=0x890df0f706184b52803fac3242a2f15b
> > >>> > > > > >
> > >>> > > > > > I observed that getfattr command output doesn't show all
> the
> > >>> fields
> > >>> > > > > > all
> > >>> > > > > > the times.
> > >>> > > > > >
> > >>> > > > > > Here you can check that gluster split-brain command hasn't
> > >>> reported
> > >>> > > > > > any
> > >>> > > > > > split-brains but resulted in IO errors when accessed few
> files.
> > >>> > > > > >
> > >>> > > > > > Could you please tell me if "split-brain" command doesn't
> > >>> reported
> > >>> > > > > > any
> > >>> > > > > > entry as output, then is there any way through which we can
> > >>> find out
> > >>> > > > that
> > >>> > > > > > the files are in split-brain if we are getting the IO
> error on
> > >>> those
> > >>> > > > file.
> > >>> > > > > >
> > >>> > > > > >
> > >>> > > > > > Regards,
> > >>> > > > > > Abhishek
> > >>> > > > > >
> > >>> > > > > >
> > >>> > > > > >
> > >>> > > > > > On Thu, Mar 3, 2016 at 5:32 PM, Anuradha Talur <
> > >>> atalur at redhat.com>
> > >>> > > > wrote:
> > >>> > > > > >
> > >>> > > > > >>
> > >>> > > > > >>
> > >>> > > > > >> ----- Original Message -----
> > >>> > > > > >> > From: "ABHISHEK PALIWAL" <abhishpaliwal at gmail.com>
> > >>> > > > > >> > To: gluster-users at gluster.org,
> gluster-devel at gluster.org
> > >>> > > > > >> > Sent: Thursday, March 3, 2016 12:10:42 PM
> > >>> > > > > >> > Subject: [Gluster-users] gluster volume heal info split
> > >>> brain
> > >>> > > > command
> > >>> > > > > >> not     showing files in split-brain
> > >>> > > > > >> >
> > >>> > > > > >> >
> > >>> > > > > >> > Hello,
> > >>> > > > > >> >
> > >>> > > > > >> > In gluster, we use the command "gluster volume heal
> > >>> c_glusterfs
> > >>> > > > > >> > info
> > >>> > > > > >> > split-brain" to find the files that are in split-brain
> > >>> scenario.
> > >>> > > > > >> > We run heal script (developed by Windriver prime team)
> on
> > >>> the
> > >>> > > > > >> > files
> > >>> > > > > >> reported
> > >>> > > > > >> > by above command to resolve split-brain issue.
> > >>> > > > > >> >
> > >>> > > > > >> > But we observed that the above command is not showing
> all
> > >>> files
> > >>> > > > > >> > that
> > >>> > > > > >> are in
> > >>> > > > > >> > split-brain,
> > >>> > > > > >> > even though split brain scenario actually exists on the
> > >>> node.
> > >>> > > > > >> >
> > >>> > > > > >> > Now a days this issue is seen more often and IO errors
> are
> > >>> > > > > >> > reported
> > >>> > > > when
> > >>> > > > > >> > tried to access these files under split-brain.
> > >>> > > > > >> >
> > >>> > > > > >> > Can you please check why this gluster command is not
> > >>> showing files
> > >>> > > > under
> > >>> > > > > >> > split-brain?
> > >>> > > > > >> > We can provide you required logs and support to resolve
> this
> > >>> > > > > >> > issue.
> > >>> > > > > >> Hi,
> > >>> > > > > >>
> > >>> > > > > >> Could you paste the output of getfattr -m. -de hex
> > >>> > > > > >> <path-to-files-in-split-brain> from all the bricks that
> the
> > >>> files
> > >>> > > > > >> lie
> > >>> > > > in?
> > >>> > > > > >>
> > >>> > > > > >> >
> > >>> > > > > >> > Please reply on this because I am not getting any reply
> > >>> from the
> > >>> > > > > >> community.
> > >>> > > > > >> >
> > >>> > > > > >> > --
> > >>> > > > > >> >
> > >>> > > > > >> > Regards
> > >>> > > > > >> > Abhishek Paliwal
> > >>> > > > > >> >
> > >>> > > > > >> > _______________________________________________
> > >>> > > > > >> > Gluster-users mailing list
> > >>> > > > > >> > Gluster-users at gluster.org
> > >>> > > > > >> > http://www.gluster.org/mailman/listinfo/gluster-users
> > >>> > > > > >>
> > >>> > > > > >> --
> > >>> > > > > >> Thanks,
> > >>> > > > > >> Anuradha.
> > >>> > > > > >>
> > >>> > > > > >
> > >>> > > > >
> > >>> > > >
> > >>> > > > --
> > >>> > > > Thanks,
> > >>> > > > Anuradha.
> > >>> > > >
> > >>> > >
> > >>> > >
> > >>> > >
> > >>> > > --
> > >>> > >
> > >>> > >
> > >>> > >
> > >>> > >
> > >>> > > Regards
> > >>> > > Abhishek Paliwal
> > >>> > >
> > >>> >
> > >>> > --
> > >>> > Thanks,
> > >>> > Anuradha.
> > >>> >
> > >>>
> > >>> --
> > >>> Thanks,
> > >>> Anuradha.
> > >>>
> > >>
> > >>
> > >>
> > >> --
> > >>
> > >>
> > >>
> > >>
> > >> Regards
> > >> Abhishek Paliwal
> > >>
> > >
> > >
> > >
> > > --
> > >
> > >
> > >
> > >
> > > Regards
> > > Abhishek Paliwal
> > >
> >
> >
> >
> > --
> >
> >
> >
> >
> > Regards
> > Abhishek Paliwal
> >
>
> --
> Thanks,
> Anuradha.
>



-- 




Regards
Abhishek Paliwal
-------------- next part --------------
A non-text attachment was scrubbed...
Name: split-brain.log
Type: text/x-log
Size: 6716 bytes
Desc: not available
URL: <http://www.gluster.org/pipermail/gluster-devel/attachments/20160318/ecaa62dd/attachment-0001.bin>

