[Gluster-devel] Query on healing process

ABHISHEK PALIWAL abhishpaliwal at gmail.com
Fri Mar 4 06:40:29 UTC 2016


Hi Ravi,

3. On the rebooted node, do you have ssl enabled by any chance? There is a
bug for "Not able to fetch volfile" when ssl is enabled:
https://bugzilla.redhat.com/show_bug.cgi?id=1258931

->>>>> I have checked; ssl is disabled, but I am still getting these errors:

# gluster volume heal c_glusterfs info
c_glusterfs: Not able to fetch volfile from glusterd
Volume heal failed.

# gluster volume heal c_glusterfs info split-brain
c_glusterfs: Not able to fetch volfile from glusterd
Volume heal failed.
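
To narrow down why heal info cannot fetch the volfile, I am also checking
that glusterd and the self-heal daemon are actually running (generic checks
on my side, not something suggested in this thread):

# gluster volume status c_glusterfs
# ps aux | grep glustershd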

Based on your observation, I understand that this is not a split-brain
problem, but *is there any way to find out which files are not in
split-brain but also not in sync?*
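
As a crude back-end check (my own sketch, not an official method), I am
comparing checksums of the same file on both boards, since heal info
reports nothing:

# md5sum /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
# lhsh 002500 md5sum /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml

Differing checksums while heal info stays empty would confirm the file is
out of sync without being flagged.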

# getfattr -m . -d -e hex /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
getfattr: Removing leading '/' from absolute path names
# file: opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
trusted.afr.c_glusterfs-client-0=0x000000000000000000000000
trusted.afr.c_glusterfs-client-2=0x000000000000000000000000
trusted.afr.c_glusterfs-client-4=0x000000000000000000000000
trusted.afr.c_glusterfs-client-6=0x000000000000000000000000
trusted.afr.c_glusterfs-client-8=0x000000060000000000000000  // client-8 is
the latest client in our case, and the first 8 hex digits (00000006) show
pending operations in the data changelog.
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x000000000000001356d86c0c000217fd
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
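
Decoding those 24 hex digits (the 4-byte data/metadata/entry pending-count
layout is the standard AFR changelog format; the one-liner is just my
sketch):

# v=000000060000000000000000
# echo "data=$((16#${v:0:8})) metadata=$((16#${v:8:8})) entry=$((16#${v:16:8}))"
data=6 metadata=0 entry=0

A non-zero data count means this brick has pending data heals recorded
against the peer that client-8 represents.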

# lhsh 002500 getfattr -m . -d -e hex /opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
getfattr: Removing leading '/' from absolute path names
# file: opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
trusted.afr.c_glusterfs-client-1=0x000000000000000000000000  // all zero
here, so there is no split-brain, but the file is still out of sync
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x000000000000001156d86c290005735c
trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae

# gluster volume info

Volume Name: c_glusterfs
Type: Replicate
Volume ID: c6a61455-d378-48bf-ad40-7a3ce897fc9c
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.32.0.48:/opt/lvmdir/c2/brick
Brick2: 10.32.1.144:/opt/lvmdir/c2/brick
Options Reconfigured:
performance.readdir-ahead: on
network.ping-timeout: 4
nfs.disable: on


# gluster --version
glusterfs 3.7.8 built on Feb 17 2016 07:49:49
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General
Public License.
# gluster volume heal info heal-failed
Usage: volume heal <VOLNAME> [enable | disable | full |statistics
[heal-count [replica <HOSTNAME:BRICKNAME>]] |info [healed | heal-failed |
split-brain] |split-brain {bigger-file <FILE> |source-brick
<HOSTNAME:BRICKNAME> [<FILE>]}]
# gluster volume heal c_glusterfs info heal-failed
Command not supported. Please use "gluster volume heal c_glusterfs info"
and logs to find the heal information.
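
Since heal-failed is no longer supported, I take the pointer to "logs" to
mean the self-heal daemon log; a sketch of what I am checking (the path is
the default glusterfs log location, an assumption for our install):

# grep -iE 'heal|split-brain' /var/log/glusterfs/glustershd.log | tail -20
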
# lhsh 002500
 _______  _____   _____              _____ __   _ _     _ _     _
 |       |_____] |_____]      |        |   | \  | |     |  \___/
 |_____  |       |            |_____ __|__ |  \_| |_____| _/   \_

002500> gluster --version
glusterfs 3.7.8 built on Feb 17 2016 07:49:49
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General
Public License.
002500>

Regards,
Abhishek

On Thu, Mar 3, 2016 at 4:54 PM, ABHISHEK PALIWAL <abhishpaliwal at gmail.com>
wrote:

>
> On Thu, Mar 3, 2016 at 4:10 PM, Ravishankar N <ravishankar at redhat.com>
> wrote:
>
>> Hi,
>>
>> On 03/03/2016 11:14 AM, ABHISHEK PALIWAL wrote:
>>
>> Hi Ravi,
>>
>> As discussed earlier, I investigated this issue and found that healing
>> is not triggered: the "gluster volume heal c_glusterfs info split-brain"
>> command shows no entries, even though the file appears to be in a
>> split-brain state.
>>
>>
>> A couple of observations from the 'commands_output' file:
>>
>> getfattr -d -m . -e hex opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
>> The afr xattrs do not indicate that the file is in split brain:
>> # file: opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
>> trusted.afr.c_glusterfs-client-1=0x000000000000000000000000
>> trusted.afr.dirty=0x000000000000000000000000
>> trusted.bit-rot.version=0x000000000000000b56d6dd1d000ec7a9
>> trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
>>
>>
>>
>> getfattr -d -m . -e hex opt/lvmdir/c2/brick/logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
>> trusted.afr.c_glusterfs-client-0=0x000000080000000000000000
>> trusted.afr.c_glusterfs-client-2=0x000000020000000000000000
>> trusted.afr.c_glusterfs-client-4=0x000000020000000000000000
>> trusted.afr.c_glusterfs-client-6=0x000000020000000000000000
>> trusted.afr.dirty=0x000000000000000000000000
>> trusted.bit-rot.version=0x000000000000000b56d6dcb7000c87e7
>> trusted.gfid=0x9f5e354ecfda40149ddce7d5ffe760ae
>>
>> 1. There doesn't seem to be a split-brain going by the trusted.afr*
>> xattrs.
>>
>
> If it is not a split-brain problem, then how can I resolve it?
>
>
>> 2. You seem to have re-used the bricks from another volume/setup. For
>> replica 2, only trusted.afr.c_glusterfs-client-0 and
>> trusted.afr.c_glusterfs-client-1 should be present, but I see 4 xattrs:
>> client-0, 2, 4 and 6.
>>
>
> Could you please suggest why these entries are there? I am not able to
> work out the scenario. I am rebooting one board multiple times to
> reproduce the issue, and after every reboot I do a remove-brick and
> add-brick on the same volume for the second board (a sketch of this cycle
> follows below).
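>
> A sketch of that per-reboot cycle (the exact flags are my reconstruction,
> not copied from our scripts; the IP and path are the ones from this
> setup):
>
> # gluster volume remove-brick c_glusterfs replica 1 \
>     10.32.1.144:/opt/lvmdir/c2/brick force
> # gluster volume add-brick c_glusterfs replica 2 \
>     10.32.1.144:/opt/lvmdir/c2/brick
>
> I assume each add-brick creates a fresh client id, which would explain
> the growing client-0/2/4/6/8 xattr list.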
>
>
>> 3. On the rebooted node, do you have ssl enabled by any chance? There is
>> a bug for "Not able to fetch volfile" when ssl is enabled:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1258931
>>
>> Btw, for data and metadata split-brains you can use the gluster CLI
>> https://github.com/gluster/glusterfs-specs/blob/master/done/Features/heal-info-and-split-brain-resolution.md
>> instead of modifying the file from the back end (a usage sketch follows).
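>>
>> A usage sketch of that CLI (the subcommands are taken from the usage
>> string printed earlier in this thread, with our volume and file
>> substituted in):
>>
>> # gluster volume heal c_glusterfs split-brain bigger-file \
>>     /logfiles/availability/CELLO_AVAILABILITY2_LOG.xml
>> # gluster volume heal c_glusterfs split-brain source-brick \
>>     10.32.0.48:/opt/lvmdir/c2/brick \
>>     /logfiles/availability/CELLO_AVAILABILITY2_LOG.xml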
>>
>
> But you are saying it is not a split-brain problem, and even the
> split-brain command is not showing any file, so how can I find which file
> is bigger? Also, in my case the file size is fixed at 2 MB; it is
> overwritten every time.
>
>>
>> -Ravi
>>
>>
>> So what I have done is manually delete the gfid entry of that file from
>> the .glusterfs directory and follow the instructions in the following
>> link to trigger a heal:
>>
>>
>> https://github.com/gluster/glusterfs/blob/master/doc/debugging/split-brain.md
>>
>> and this works fine for me.
>>
>> But my question is: why does the split-brain command not show any file
>> in its output?
>>
>> Here I am attaching all the logs I got from the node, along with the
>> output of commands from both boards.
>>
>> In this tar file two directories are present
>>
>> 000300 - logs for the board which is running continuously
>> 002500 - logs for the board which was rebooted
>>
>> I am waiting for your reply; please help me out with this issue.
>>
>> Thanks in advance.
>>
>> Regards,
>> Abhishek
>>
>> On Fri, Feb 26, 2016 at 1:21 PM, ABHISHEK PALIWAL <abhishpaliwal at gmail.com> wrote:
>>
>>> On Fri, Feb 26, 2016 at 10:28 AM, Ravishankar N <ravishankar at redhat.com> wrote:
>>>
>>>> On 02/26/2016 10:10 AM, ABHISHEK PALIWAL wrote:
>>>>
>>>> Yes correct
>>>>
>>>>
>>>> Okay, so when you say the files are not in sync for some time, are
>>>> you getting stale data when accessing from the mount?
>>>> I'm not able to figure out why heal info shows zero when the files are
>>>> not in sync, despite all IO happening from the mounts. Could you provide
>>>> the output of getfattr -d -m . -e hex /brick/file-name from both bricks
>>>> when you hit this issue?
>>>>
>>>> I'll provide the logs once I get them. Here, "delay" means we are
>>>> powering on the second board after 10 minutes.
>>>>
>>>>
>>>> On Feb 26, 2016 9:57 AM, "Ravishankar N" <ravishankar at redhat.com> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> On 02/26/2016 08:29 AM, ABHISHEK PALIWAL wrote:
>>>>>
>>>>> Hi Ravi,
>>>>>
>>>>> Thanks for the response.
>>>>>
>>>>> We are using GlusterFS 3.7.8
>>>>>
>>>>> Here is the use case:
>>>>>
>>>>> We have a logging file which saves logs of the events for every board
>>>>> of a node, and these files are kept in sync using GlusterFS. The
>>>>> system is in replica-2 mode, which means that when one brick in a
>>>>> replicated volume goes offline, the glusterd daemons on the other
>>>>> nodes keep track of all the files that are not replicated to the
>>>>> offline brick. When the offline brick becomes available again, the
>>>>> cluster initiates a healing process, replicating the updated files to
>>>>> that brick. But in our case, we see that the log file of one board is
>>>>> not in sync and its format is corrupted.
>>>>>
>>>>>
>>>>> Just to understand you correctly: you have mounted the 2-node
>>>>> replica-2 volume on both these nodes and are writing to a logging
>>>>> file from the mounts, right?
>>>>>
>>>>>
>>>>> Even the outcome of "gluster volume heal c_glusterfs info" shows that
>>>>> there are no pending heals.
>>>>>
>>>>> Also, the logging file which is updated is of fixed size, and new
>>>>> entries wrap around, overwriting the old entries.
>>>>>
>>>>> This way we have seen that, after a few restarts, the contents of the
>>>>> same file on the two bricks are different, but volume heal info shows
>>>>> zero entries.
>>>>>
>>>>> Solution:
>>>>>
>>>>> But when we put a delay of more than 5 minutes before the healing,
>>>>> everything works fine.
>>>>>
>>>>> Regards,
>>>>> Abhishek
>>>>>
>>>>> On Fri, Feb 26, 2016 at 6:35 AM, Ravishankar N <ravishankar at redhat.com> wrote:
>>>>>
>>>>>> On 02/25/2016 06:01 PM, ABHISHEK PALIWAL wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Here I have one query regarding the time taken by the healing
>>>>>> process. In the current two-node setup, when we rebooted one node,
>>>>>> the self-healing process started in less than a 5-minute interval on
>>>>>> the board, which resulted in the corruption of some files' data.
>>>>>>
>>>>>>
>>>>>> Heal should start immediately after the brick process comes up. What
>>>>>> version of gluster are you using? What do you mean by corruption of data?
>>>>>> Also, how did you observe that the heal started after 5 minutes?
>>>>>> -Ravi
>>>>>>
>>>>>>
>>>>>> And to resolve it I searched on Google and found the following link:
>>>>>> https://support.rackspace.com/how-to/glusterfs-troubleshooting/
>>>>>>
>>>>>> It mentions that the healing process can take up to 10 minutes to
>>>>>> start.
>>>>>>
>>>>>> Here is the statement from the link:
>>>>>>
>>>>>> "Healing replicated volumes
>>>>>>
>>>>>> When any brick in a replicated volume goes offline, the glusterd
>>>>>> daemons on the remaining nodes keep track of all the files that are not
>>>>>> replicated to the offline brick. When the offline brick becomes available
>>>>>> again, the cluster initiates a healing process, replicating the updated
>>>>>> files to that brick. *The start of this process can take up to 10
>>>>>> minutes, based on observation.*"
>>>>>>
>>>>>> After allowing more than 5 minutes, the file-corruption problem was
>>>>>> resolved.
>>>>>>
>>>>>> So here my question is: is there any way to reduce the time the
>>>>>> healing process takes to start? (A possible knob is sketched below.)
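>>>>>>
>>>>>> A possible knob (my assumption, not confirmed in this thread): the
>>>>>> self-heal daemon's periodic index crawl is governed by
>>>>>> cluster.heal-timeout, which defaults to 600 seconds and would match
>>>>>> the "up to 10 minutes" observation above. A sketch:
>>>>>>
>>>>>> # gluster volume set c_glusterfs cluster.heal-timeout 60
>>>>>>
>>>>>> Lowering it makes the crawl more frequent, at the cost of extra
>>>>>> background work.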
>>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Abhishek Paliwal
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Gluster-devel mailing list
>>>>>> Gluster-devel at gluster.org
>>>>>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Regards
>>>>> Abhishek Paliwal
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>>
>>>
>>>
>>>
>>> Regards
>>> Abhishek Paliwal
>>>
>>
>>
>>
>> --
>>
>>
>>
>>
>> Regards
>> Abhishek Paliwal
>>
>>
>>
>>
>
>
> --
>
>
>
>
> Regards
> Abhishek Paliwal
>



-- 




Regards
Abhishek Paliwal

