[Gluster-devel] [Gluster-users] gluster volume heal info split brain command not showing files in split-brain

ABHISHEK PALIWAL abhishpaliwal at gmail.com
Mon Mar 21 05:14:53 UTC 2016


Hi Anuradha,

Do you have any pointers on the scenarios I described in my earlier mail (quoted below)?

Regards,
Abhishek

On Fri, Mar 18, 2016 at 11:18 AM, ABHISHEK PALIWAL <abhishpaliwal at gmail.com>
wrote:

>
>
> On Fri, Mar 18, 2016 at 1:41 AM, Anuradha Talur <atalur at redhat.com> wrote:
>
>>
>>
>> ----- Original Message -----
>> > From: "ABHISHEK PALIWAL" <abhishpaliwal at gmail.com>
>> > To: "Anuradha Talur" <atalur at redhat.com>
>> > Cc: gluster-users at gluster.org, gluster-devel at gluster.org
>> > Sent: Thursday, March 17, 2016 4:00:58 PM
>> > Subject: Re: [Gluster-users] gluster volume heal info split brain command not showing files in split-brain
>> >
>> > Hi Anuradha,
>> >
>> > Please confirm whether this is a bug in glusterfs or whether we need to
>> > change something at our end.
>> >
>> > This problem is blocking our development.
>> Hi Abhishek,
>>
>> When you say file is not getting sync, do you mean that the files are not
>> in sync after healing or that the existing GFID mismatch that you tried to
>> heal failed?
>> In one of the previous mails, you said that the GFID mismatch problem is
>> resolved, is it not so?
>>
>
> As I mentioned, I have two scenarios:
> 1. In the first scenario, files are in split-brain but are not reported by
> the split-brain and heal info commands. We identify those files only when
> I/O errors occur on them (the same method described in the link you shared
> earlier). This method is not reliable in our case, because other modules
> depend on these files and cannot wait while the heal is in progress. It
> also requires us to identify manually the files that return I/O errors,
> which is not a sound approach. It would be better if the split-brain or
> heal info command identified the files, so that we could run self-heal on
> only the files listed in its output.
>
> 2. In the second scenario, we have one log file that has a fixed size,
> wraps its data, and is written continuously by the system, even while the
> other brick is down or rebooting. We have two bricks in replica mode. When
> one brick goes down and comes back up, this file remains out of sync. For
> this file:
> A. It is not reported by the split-brain and heal info commands.
> B. We do not get any I/O error.
> C. There is no GFID mismatch.
>
> Here is the getfattr output for this file.
>
> Brick B, which rebooted and has the file out of sync:
>
> getfattr -d -m . -e hex
> opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
>
> # file:
> opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> trusted.afr.c_glusterfs-client-1=0x000000000000000000000000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.bit-rot.version=0x000000000000000b56d6dd1d000ec7a9
> trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
>
>
> Brick A, where the file was being updated while Brick B was rebooting:
>
> getfattr -d -m . -e hex
> opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
> trusted.afr.c_glusterfs-client-0=0x000000080000000000000000
> trusted.afr.c_glusterfs-client-2=0x000000020000000000000000
> trusted.afr.c_glusterfs-client-4=0x000000020000000000000000
> trusted.afr.c_glusterfs-client-6=0x000000020000000000000000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.bit-rot.version=0x000000000000000b56d6dcb7000c87e7
> trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
>
> This scenario is not 100% reproducible, but we can reproduce it once or
> twice in 20 cycles.
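
For reference, each trusted.afr.* value in the getfattr output above can be read as three big-endian 32-bit counters: pending data, metadata, and entry operations, in that order. The following is a minimal Python sketch, not part of the original mail, that decodes the values copied from that output. The function name decode_afr is hypothetical and chosen only for illustration.

```python
import struct


def decode_afr(hex_value):
    """Split a 12-byte trusted.afr.* value into its three pending counters."""
    digits = hex_value[2:] if hex_value.startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    # Three unsigned 32-bit big-endian integers: data, metadata, entry.
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


# Values copied verbatim from the getfattr output quoted in this mail.
BRICK_A = {
    "trusted.afr.c_glusterfs-client-0": "0x000000080000000000000000",
    "trusted.afr.c_glusterfs-client-2": "0x000000020000000000000000",
    "trusted.afr.c_glusterfs-client-4": "0x000000020000000000000000",
    "trusted.afr.c_glusterfs-client-6": "0x000000020000000000000000",
}
BRICK_B = {
    "trusted.afr.c_glusterfs-client-1": "0x000000000000000000000000",
}

for brick, xattrs in (("Brick A", BRICK_A), ("Brick B", BRICK_B)):
    for name in sorted(xattrs):
        print(brick, name, decode_afr(xattrs[name]))
```

Read this way, Brick A records 8 pending data operations against client-0 and 2 each against client-2, client-4, and client-6, while the only afr attribute on Brick B is all zeros. Which client index corresponds to which brick is not stated in this thread, so the sketch does not try to interpret that.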
>
>
>> To your question about finding the files in split-brain, can you try
>> running gluster volume heal <volname> info? Heal info is also supposed to
>> show the files in split-brain.
>
>
> This heal info command is also not working.
>
>
>> If the GFID mismatch is not resolved yet, it would really help understand
>> the underlying problem if you give the output of getfattr -m. -de hex
>> <path-to-parent-directory-of-the-file-in-GFID-mismatch>.
>>
>
> It resolved the problem, but we don't want to rely on that solution (the
> one described in the link you provided). We want consistent results from
> the split-brain or heal info commands.
>
> I have already shared the parent directory output you are asking for
> earlier in this mail chain. I am attaching it once more for your reference.
>
> Please let me know if you have any questions about the above scenarios.
>
> Regards,
> Abhishek
>
>> >
>> > Regards,
>> > Abhishek
>> >
>> > On Thu, Mar 17, 2016 at 1:54 PM, ABHISHEK PALIWAL <abhishpaliwal at gmail.com>
>> > wrote:
>> >
>> > > Hi Anuradha,
>> > >
>> > > But in this case I need to run tail on each file, which is a
>> > > time-consuming process, and I can't pause my module until these files
>> > > are healed.
>> > >
>> > > In any case, I need the split-brain output to resolve this problem.
>> > >
>> > > Regards,
>> > > Abhishek
>> > >
>> > > On Wed, Mar 16, 2016 at 6:21 PM, ABHISHEK PALIWAL <abhishpaliwal at gmail.com>
>> > > wrote:
>> > >
>> > >> Hi Anuradha,
>> > >>
>> > >> That issue is resolved, but we have another, similar issue in which
>> > >> the file does not sync even after following the steps in the link you
>> > >> shared in the previous mail.
>> > >>
>> > >> The remaining question is why the split-brain command does not show
>> > >> the split-brain entries.
>> > >>
>> > >> Regards,
>> > >> Abhishek
>> > >>
>> > >> On Wed, Mar 16, 2016 at 6:06 PM, Anuradha Talur <atalur at redhat.com>
>> > >> wrote:
>> > >>
>> > >>>
>> > >>>
>> > >>> ----- Original Message -----
>> > >>> > From: "Anuradha Talur" <atalur at redhat.com>
>> > >>> > To: "ABHISHEK PALIWAL" <abhishpaliwal at gmail.com>
>> > >>> > Cc: gluster-users at gluster.org, gluster-devel at gluster.org
>> > >>> > Sent: Wednesday, March 16, 2016 5:32:26 PM
>> > >>> > Subject: Re: [Gluster-users] gluster volume heal info split brain command not showing files in split-brain
>> > >>> >
>> > >>> >
>> > >>> >
>> > >>> > ----- Original Message -----
>> > >>> > > From: "ABHISHEK PALIWAL" <abhishpaliwal at gmail.com>
>> > >>> > > To: "Anuradha Talur" <atalur at redhat.com>
>> > >>> > > Cc: gluster-users at gluster.org, gluster-devel at gluster.org
>> > >>> > > Sent: Wednesday, March 16, 2016 4:39:26 PM
>> > >>> > > Subject: Re: [Gluster-users] gluster volume heal info split brain command not showing files in split-brain
>> > >>> > >
>> > >>> > > Hi Anuradha,
>> > >>> > >
>> > >>> > > I followed the steps in the link you shared, and they resolved the
>> > >>> > > issue.
>> > >>> > >
>> > >>> > > But my question is: if this is a split-brain scenario, why does the
>> > >>> > > command "gluster volume heal info split-brain" not show these files
>> > >>> > > in its output? Not even the parent directory is listed as being in
>> > >>> > > split-brain.
>> > >>> > >
>> > >>> > > Please find the requested logs
>> > >>>
>> > >>> Abhishek,
>> > >>>
>> > >>> Yes, ideally it should show. I will look into it. The only reason I can
>> > >>> think of is that the parent directory did not have any pending markers
>> > >>> to indicate split-brain, which is why I asked for the getfattr output
>> > >>> of the parent directory too. But if the issue is resolved, there isn't
>> > >>> much info we can get out of it. Thanks for sharing the logs. Will see
>> > >>> what could have caused this.
>> > >>>
>> > >>> > Abhishek,
>> > >>> >
>> > >>> > Yes, ideally it should show. I will look into it.
>> > >>> > I saw another case with this issue. The parent directory did not
>> have
>> > >>> any pe
>> > >>> > >
>> > >>> > > On Wed, Mar 16, 2016 at 4:20 PM, Anuradha Talur <atalur at redhat.com> wrote:
>> > >>> > >
>> > >>> > > > Hi Abhishek,
>> > >>> > > >
>> > >>> > > > The files that are reporting i/o error have gfid-mismatch. This
>> > >>> > > > situation is called directory or entry split-brain. You can find
>> > >>> > > > steps to resolve this kind of split brain here:
>> > >>> > > > https://gluster.readthedocs.org/en/latest/Troubleshooting/split-brain/ .
>> > >>> > > >
>> > >>> > > > Ideally, the parent directories of these files have to be listed in
>> > >>> > > > heal info split-brain output. Can you please get the extended
>> > >>> > > > attributes of the parent directories of the files that show i/o
>> > >>> > > > error (same getfattr command that you previously used)?
>> > >>> > > >
>> > >>> > > > ----- Original Message -----
>> > >>> > > > > From: "ABHISHEK PALIWAL" <abhishpaliwal at gmail.com>
>> > >>> > > > > To: "Anuradha Talur" <atalur at redhat.com>
>> > >>> > > > > Cc: gluster-users at gluster.org, gluster-devel at gluster.org
>> > >>> > > > > Sent: Thursday, March 10, 2016 11:22:35 AM
>> > >>> > > > > Subject: Re: [Gluster-users] gluster volume heal info split brain command not showing files in split-brain
>> > >>> > > > >
>> > >>> > > > > Hi Anuradha,
>> > >>> > > > >
>> > >>> > > > > Please find the glusterfs and glusterd logs directory as an attachment.
>> > >>> > > > >
>> > >>> > > > > Regards,
>> > >>> > > > > Abhishek
>> > >>> > > > >
>> > >>> > > > >
>> > >>> > > > >
>> > >>> > > > > On Wed, Mar 9, 2016 at 5:54 PM, ABHISHEK PALIWAL <abhishpaliwal at gmail.com>
>> > >>> > > > > wrote:
>> > >>> > > > >
>> > >>> > > > > > Hi Anuradha,
>> > >>> > > > > >
>> > >>> > > > > > Sorry for late reply.
>> > >>> > > > > >
>> > >>> > > > > > Please find the requested logs below:
>> > >>> > > > > >
>> > >>> > > > > > Remote: 10.32.0.48
>> > >>> > > > > > Local : 10.32.1.144
>> > >>> > > > > >
>> > >>> > > > > > Local:
>> > >>> > > > > > #gluster volume heal c_glusterfs info split-brain
>> > >>> > > > > > Brick 10.32.1.144:/opt/lvmdir/c2/brick
>> > >>> > > > > > Number of entries in split-brain: 0
>> > >>> > > > > >
>> > >>> > > > > > Brick 10.32.0.48:/opt/lvmdir/c2/brick
>> > >>> > > > > > Number of entries in split-brain: 0
>> > >>> > > > > >
>> > >>> > > > > > Remote:
>> > >>> > > > > > #gluster volume heal c_glusterfs info split-brain
>> > >>> > > > > > Brick 10.32.1.144:/opt/lvmdir/c2/brick
>> > >>> > > > > > Number of entries in split-brain: 0
>> > >>> > > > > >
>> > >>> > > > > > Brick 10.32.0.48:/opt/lvmdir/c2/brick
>> > >>> > > > > > Number of entries in split-brain: 0
>> > >>> > > > > >
>> > >>> > > > > > auto-sync.sh.
>> > >>> > > > > > Here you can see that an i/o error is detected. Below is the required
>> > >>> > > > > > metadata from both the bricks.
>> > >>> > > > > >
>> > >>> > > > > > 1)
>> > >>> > > > > > stat: cannot stat '/mnt/c//public_html/cello/ior_files/nameroot.ior': Input/output error
>> > >>> > > > > > Remote:
>> > >>> > > > > >
>> > >>> > > > > > getfattr -d -m . -e hex opt/lvmdir/c2/brick/public_html/cello/ior_files/nameroot.ior
>> > >>> > > > > > # file: opt/lvmdir/c2/brick/public_html/cello/ior_files/nameroot.ior
>> > >>> > > > > > trusted.afr.dirty=0x000000000000000000000000
>> > >>> > > > > > trusted.bit-rot.version=0x000000000000000256ded2f6000ad80f
>> > >>> > > > > > trusted.gfid=0x771221a7bb3c4f1aade40ce9e38a95ee
>> > >>> > > > > >
>> > >>> > > > > > Local:
>> > >>> > > > > >
>> > >>> > > > > > getfattr -d -m . -e hex opt/lvmdir/c2/brick/public_html/cello/ior_files/nameroot.ior
>> > >>> > > > > > # file: opt/lvmdir/c2/brick/public_html/cello/ior_files/nameroot.ior
>> > >>> > > > > > trusted.bit-rot.version=0x000000000000000256ded38f000e3a51
>> > >>> > > > > > trusted.gfid=0x8ea33f46703c4e2d95c09153c1b858fd
>> > >>> > > > > >
>> > >>> > > > > >
>> > >>> > > > > > 2)
>> > >>> > > > > > stat: cannot stat '/mnt/c//security/corbasecurity': Input/output error
>> > >>> > > > > > Remote:
>> > >>> > > > > >
>> > >>> > > > > > getfattr -d -m . -e hex opt/lvmdir/c2/brick/security/corbasecurity
>> > >>> > > > > > # file: opt/lvmdir/c2/brick/security/corbasecurity
>> > >>> > > > > > trusted.afr.dirty=0x000000000000000000000000
>> > >>> > > > > > trusted.bit-rot.version=0x000000000000000256ded2f6000ad80f
>> > >>> > > > > > trusted.gfid=0xd298b7a0c8834f3e99abb39741363013
>> > >>> > > > > >
>> > >>> > > > > > Local:
>> > >>> > > > > >
>> > >>> > > > > > getfattr -d -m . -e hex opt/lvmdir/c2/brick/security/corbasecurity
>> > >>> > > > > > # file: opt/lvmdir/c2/brick/security/corbasecurity
>> > >>> > > > > > trusted.bit-rot.version=0x000000000000000256ded38f000e3a51
>> > >>> > > > > > trusted.gfid=0x890df0f706184b52803fac3242a2f15b
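
The GFID mismatch visible in the two listings above can also be checked mechanically. The following is a minimal Python sketch, not part of the original mail: it parses the text printed by `getfattr -d -m . -e hex` into a dictionary and compares the trusted.gfid values from the two bricks. The helper names parse_getfattr and gfid_mismatch are hypothetical, and the sample strings are copied from the corbasecurity output in this thread.

```python
def parse_getfattr(text):
    """Parse the text printed by `getfattr -d -e hex` into {attribute: value}."""
    attrs = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# file: ..." header.
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:
            attrs[name] = value
    return attrs


def gfid_mismatch(output_a, output_b):
    """Return True when both outputs carry a trusted.gfid and the values differ."""
    gfid_a = parse_getfattr(output_a).get("trusted.gfid")
    gfid_b = parse_getfattr(output_b).get("trusted.gfid")
    return gfid_a is not None and gfid_b is not None and gfid_a != gfid_b


# Sample outputs copied from the thread (file: security/corbasecurity).
REMOTE = """# file: opt/lvmdir/c2/brick/security/corbasecurity
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x000000000000000256ded2f6000ad80f
trusted.gfid=0xd298b7a0c8834f3e99abb39741363013
"""
LOCAL = """# file: opt/lvmdir/c2/brick/security/corbasecurity
trusted.bit-rot.version=0x000000000000000256ded38f000e3a51
trusted.gfid=0x890df0f706184b52803fac3242a2f15b
"""

print(gfid_mismatch(REMOTE, LOCAL))  # True: the two bricks disagree on the GFID
```

The sketch only compares text that has already been collected from each brick; how the output is gathered from the remote brick is left open.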
>> > >>> > > > > >
>> > >>> > > > > > I observed that the getfattr output doesn't always show all the fields.
>> > >>> > > > > >
>> > >>> > > > > > Here you can see that the gluster split-brain command hasn't reported any
>> > >>> > > > > > split-brain entries, but accessing a few files resulted in IO errors.
>> > >>> > > > > >
>> > >>> > > > > > If the "split-brain" command doesn't report any entries, is there any
>> > >>> > > > > > other way to find out whether files are in split-brain when we get IO
>> > >>> > > > > > errors on them?
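
The manual check described in this thread (accessing each file and watching for Input/output errors) could be scripted along the following lines. This is only a sketch under stated assumptions, not a replacement for the heal info output: the function name find_eio_paths and its stat_fn parameter are hypothetical, /mnt/c is the mount point that appears in the stat errors quoted above, and an EIO result only shows that the access failed, not why it failed.

```python
import errno
import os


def find_eio_paths(mount_point, stat_fn=os.lstat):
    """Walk mount_point and return the paths whose stat call fails with EIO."""
    failing = []
    # os.walk silently skips directories it cannot list (onerror is None).
    for dirpath, dirnames, filenames in os.walk(mount_point):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                stat_fn(path)
            except OSError as exc:
                # Record only I/O errors; other failures are ignored here.
                if exc.errno == errno.EIO:
                    failing.append(path)
    return failing


if __name__ == "__main__":
    # Mount point taken from the stat errors quoted in this thread.
    for path in find_eio_paths("/mnt/c"):
        print(path)
```

The stat_fn parameter exists only so the walk can be exercised without a real GlusterFS mount. Note that the sketch uses lstat, which does not follow symbolic links, whereas the thread used the stat command.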
>> > >>> > > > > >
>> > >>> > > > > >
>> > >>> > > > > > Regards,
>> > >>> > > > > > Abhishek
>> > >>> > > > > >
>> > >>> > > > > >
>> > >>> > > > > >
>> > >>> > > > > > On Thu, Mar 3, 2016 at 5:32 PM, Anuradha Talur <atalur at redhat.com> wrote:
>> > >>> > > > > >
>> > >>> > > > > >>
>> > >>> > > > > >>
>> > >>> > > > > >> ----- Original Message -----
>> > >>> > > > > >> > From: "ABHISHEK PALIWAL" <abhishpaliwal at gmail.com>
>> > >>> > > > > >> > To: gluster-users at gluster.org, gluster-devel at gluster.org
>> > >>> > > > > >> > Sent: Thursday, March 3, 2016 12:10:42 PM
>> > >>> > > > > >> > Subject: [Gluster-users] gluster volume heal info split brain command not showing files in split-brain
>> > >>> > > > > >> >
>> > >>> > > > > >> >
>> > >>> > > > > >> > Hello,
>> > >>> > > > > >> >
>> > >>> > > > > >> > In gluster, we use the command "gluster volume heal c_glusterfs info
>> > >>> > > > > >> > split-brain" to find the files that are in split-brain. We run a heal
>> > >>> > > > > >> > script (developed by the Windriver prime team) on the files reported by
>> > >>> > > > > >> > that command to resolve the split-brain.
>> > >>> > > > > >> >
>> > >>> > > > > >> > But we observed that the command does not show all files that are in
>> > >>> > > > > >> > split-brain, even though a split-brain actually exists on the node.
>> > >>> > > > > >> >
>> > >>> > > > > >> > This issue is now seen more often, and IO errors are reported when we try
>> > >>> > > > > >> > to access the files in split-brain.
>> > >>> > > > > >> >
>> > >>> > > > > >> > Can you please check why this gluster command is not showing files in
>> > >>> > > > > >> > split-brain? We can provide the required logs and support to resolve this
>> > >>> > > > > >> > issue.
>> > >>> > > > > >> Hi,
>> > >>> > > > > >>
>> > >>> > > > > >> Could you paste the output of getfattr -m. -de hex
>> > >>> > > > > >> <path-to-files-in-split-brain> from all the bricks that the files lie in?
>> > >>> > > > > >>
>> > >>> > > > > >> >
>> > >>> > > > > >> > Please reply to this, because I have not received any reply from the
>> > >>> > > > > >> > community.
>> > >>> > > > > >> >
>> > >>> > > > > >> > --
>> > >>> > > > > >> >
>> > >>> > > > > >> > Regards
>> > >>> > > > > >> > Abhishek Paliwal
>> > >>> > > > > >> >
>> > >>> > > > > >> > _______________________________________________
>> > >>> > > > > >> > Gluster-users mailing list
>> > >>> > > > > >> > Gluster-users at gluster.org
>> > >>> > > > > >> > http://www.gluster.org/mailman/listinfo/gluster-users
>> > >>> > > > > >>
>> > >>> > > > > >> --
>> > >>> > > > > >> Thanks,
>> > >>> > > > > >> Anuradha.
>> > >>> > > > > >>
>> > >>> > > > > >
>> > >>> > > > >
>> > >>> > > >
>> > >>> > > > --
>> > >>> > > > Thanks,
>> > >>> > > > Anuradha.
>> > >>> > > >
>> > >>> > >
>> > >>> > >
>> > >>> > >
>> > >>> > > --
>> > >>> > >
>> > >>> > >
>> > >>> > >
>> > >>> > >
>> > >>> > > Regards
>> > >>> > > Abhishek Paliwal
>> > >>> > >
>> > >>> >
>> > >>> > --
>> > >>> > Thanks,
>> > >>> > Anuradha.
>> > >>> >
>> > >>>
>> > >>> --
>> > >>> Thanks,
>> > >>> Anuradha.
>> > >>>
>>
>> --
>> Thanks,
>> Anuradha.
>>
>
>
>
> --
>
>
>
>
> Regards
> Abhishek Paliwal
>



-- 
Regards
Abhishek Paliwal