[Gluster-users] Question about "Possibly undergoing heal" on a file being reported.

Fri May 6 05:04:41 UTC 2016

Thanks for the response. `heal info` reports 'Possibly undergoing
heal' only when the self-heal daemon is healing the file, not when
there is I/O from the mount. Could you provide statedumps of the 2
bricks (and of the mount too, if you know which mount this VM image is
being accessed from)?

The command is `kill -USR1 <pid>`, where <pid> is the process ID of the
brick or FUSE mount. The statedump will be saved in the directory printed
by `gluster --print-statedumpdir`.
I want to check whether any stale locks are being held on the bricks.
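A minimal Python sketch of the two steps above, in case scripting them across several bricks is easier. The function names are hypothetical; `kill -USR1` and `gluster --print-statedumpdir` are the only parts taken from this mail.

```python
import os
import signal
import subprocess


def request_statedump(pid: int) -> None:
    """Send SIGUSR1 to a brick or FUSE mount process (same as `kill -USR1 <pid>`)."""
    if pid <= 1:
        # Guard: pid 0 or a negative pid would signal a whole process group,
        # and pid 1 is init. Neither is ever a brick or mount process.
        raise ValueError(f"refusing to signal pid {pid}")
    os.kill(pid, signal.SIGUSR1)


def statedump_dir() -> str:
    """Return the directory printed by `gluster --print-statedumpdir`."""
    result = subprocess.run(
        ["gluster", "--print-statedumpdir"],
        capture_output=True,
        text=True,
        check=True,  # raise if the gluster CLI exits with an error
    )
    return result.stdout.strip()
```

Call `request_statedump()` once per brick or mount PID, then look in the directory returned by `statedump_dir()` for the new dump files.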

Thanks,
Ravi

On 05/06/2016 01:22 AM, Richard Klein (RSI) wrote:
> I agree there is activity, but it is very low I/O, like updating log files.  It shouldn't be high enough I/O to keep the file permanently in the "Possibly undergoing healing" state for days.  But just to make sure, I powered off the VM; there is now no activity at all, and "trusted.afr.dirty" is still changing.  I will leave the VM powered off until tomorrow.  I agree with you that it shouldn't, but that is my dilemma.
>
> Thanks for the insight,
>
> Richard Klein
> RSI
>
>> -----Original Message-----
>> From: gluster-users-bounces at gluster.org [mailto:gluster-users-bounces at gluster.org]
>> On Behalf Of Joe Julian
>> Sent: Thursday, May 05, 2016 1:44 PM
>> To: gluster-users at gluster.org
>> Subject: Re: [Gluster-users] Question about "Possibly undergoing heal" on a file
>> being reported.
>>
>> FYI, that's not "no activity". The file is clearly changing. The dirty state flipping
>> back and forth between 1 and 0 is a byproduct of writes occurring. The clients
>> set the flag, do the write, then clear the flag.
>> My guess is that's why it's only "possibly" undergoing self-heal. The write may
>> have still been pending at the moment of the check.
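For reading the `trusted.afr.dirty` values quoted further down: my understanding (an assumption, not something stated in this thread) is that AFR changelog xattrs are 12 bytes holding three big-endian 32-bit counters, for data, metadata and entry operations in that order. A small Python sketch that splits a `getfattr -e hex` value along those lines; the names are hypothetical:

```python
import struct
from typing import NamedTuple


class AfrCounters(NamedTuple):
    data: int
    metadata: int
    entry: int


def decode_afr_xattr(hex_value: str) -> AfrCounters:
    """Split a 12-byte AFR xattr, as printed by `getfattr -e hex`, into three counters."""
    # Accept the value with or without the leading "0x".
    text = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(text)
    if len(raw) != 12:
        raise ValueError(f"expected 12 bytes, got {len(raw)}")
    # ">III": three unsigned 32-bit integers in big-endian (network) byte order.
    return AfrCounters(*struct.unpack(">III", raw))
```

With the value reported from n1c1cl1, `decode_afr_xattr("0xe68000000000000000000000")` gives a nonzero `data` field and zero `metadata` and `entry` fields.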
>>
>> On 05/05/2016 10:22 AM, Richard Klein (RSI) wrote:
>>> There are 2 hosts involved and we have a replica value of 2.  The hosts are
>>> called n1c1cl1 and n1c2cl1.  Below is the info you requested. The file name in
>>> gluster is "/97f52c71-80bd-4c2b-8e47-3c8c77712687".
>>> -- From the n1c1cl1 brick --
>>>
>>> [root at n1c1cl1 ~]# ll -h /data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
>>> -rwxr--r--. 2 root root 3.7G May  5 12:10 /data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
>>>
>>> [root at n1c1cl1 ~]# getfattr -d -m . -e hex /data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
>>> getfattr: Removing leading '/' from absolute path names
>>> # file: data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a64656661756c745f743a733000
>>> trusted.afr.dirty=0xe68000000000000000000000
>>> trusted.bit-rot.version=0x020000000000000057196a8d000e1606
>>> trusted.gfid=0xb1a49bd1ea01479f9a8277992461e85f
>>>
>>> -- From the n1c2cl1 brick --
>>>
>>> [root at n1c2cl1 ~]# ll -h /data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
>>> -rwxr--r--. 2 root root 3.7G May  5 12:16 /data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
>>>
>>> [root at n1c2cl1 ~]# getfattr -d -m . -e hex /data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
>>> getfattr: Removing leading '/' from absolute path names
>>> # file: data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a64656661756c745f743a733000
>>> trusted.afr.dirty=0xd38000000000000000000000
>>> trusted.bit-rot.version=0x020000000000000057196a8d000e20ae
>>> trusted.gfid=0xb1a49bd1ea01479f9a8277992461e85f
>>>
>>> --
>>>
>>> The "trusted.afr.dirty" is changing about 2 or 3 times a minute on both files.
>>> Let me know if you need further info and thanks.
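To see at a glance which attributes differ between the two bricks, the two `getfattr` dumps above can be parsed and compared. A rough Python sketch, assuming the raw command output (without the mail quote markers) is available as text; the function names are made up for illustration:

```python
from typing import Dict, Optional, Tuple


def parse_getfattr(dump: str) -> Dict[str, str]:
    """Map attribute name -> value from `getfattr -d -e hex` text output."""
    attrs: Dict[str, str] = {}
    for line in dump.splitlines():
        line = line.strip()
        # Skip blank lines, the "# file:" header and getfattr's own messages.
        if not line or line.startswith("#") or line.startswith("getfattr:"):
            continue
        if "=" not in line:
            continue
        name, _, value = line.partition("=")
        attrs[name] = value
    return attrs


def differing_attrs(
    a: Dict[str, str], b: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {name: (value_a, value_b)} for attributes whose values differ."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}
```

For the two dumps in this mail it reports `trusted.afr.dirty` and `trusted.bit-rot.version` as differing, while `trusted.gfid` and `security.selinux` match.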
>>> Richard Klein
>>> RSI
>>>
>>>
>>>
>>> From: Ravishankar N [mailto:ravishankar at redhat.com]
>>> Sent: Wednesday, May 04, 2016 8:52 PM
>>> To: Richard Klein (RSI); gluster-users at gluster.org
>>> Subject: Re: [Gluster-users] Question about "Possibly undergoing heal" on a
>>> file being reported.
>>>
>>> On 05/05/2016 01:50 AM, Richard Klein (RSI) wrote:
>>>> First time e-mailer to the group, greetings all.  We are using Gluster 3.7.6 in
>>>> Cloudstack on CentOS7 with KVM.  Gluster is our primary storage.  All is going
>>>> well but we have a test VM QCOW2 volume that gets stuck in the "Possibly
>>>> undergoing healing" state.  By stuck I mean it stays in that state for over 24 hrs.  This
>>>> is a test VM with no activity on it and we have removed the swap file on the
>>>> guest as well thinking that may be causing high I/O.  All the tools show that the
>>>> VM is basically idle with low I/O.  The only way I can clear it up is to power
>>>> the VM off, move the QCOW2 volume from the Gluster mount then back
>>>> (basically remove and recreate it) then power the VM back on.  Once I do this
>>>> process all is well again but then it happened again on the same volume/file.
>>>>
>>>> One additional note, I have even powered off the VM completely and the
>>>> QCOW2 file still stays in this state.
>>>
>>> When this happens, can you share the output of the extended attributes of
>>> the file in question from all the bricks of the replica in which the file resides?
>>> `getfattr -d -m . -e hex /path/to/bricks/file-name`
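If it is easier to collect this from a script than to paste `getfattr` output, the same information can be read with Python's `os.listxattr` and `os.getxattr` (Linux only). A minimal sketch; the function name is hypothetical, and it has to run as root on the brick host because `trusted.*` attributes are not visible to unprivileged users:

```python
import os
from typing import Dict


def dump_xattrs_hex(path: str) -> Dict[str, str]:
    """Return every visible extended attribute of `path` as name -> '0x...' hex string."""
    return {
        name: "0x" + os.getxattr(path, name).hex()
        for name in sorted(os.listxattr(path))
    }
```

Run it against the file's path on each brick, not against the path on the Gluster mount.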
>>>
>>> Also what is the size of this VM image file?
>>>
>>> Thanks,
>>> Ravi
>>>
>>>
>>>
>>>> Is there a way to stop/abort or force the heal to finish?  Any help with a
>>>> direction would be appreciated.
>>>> Thanks,
>>>>
>>>> Richard Klein
>>>> RSI
>>>
>>>
>>>
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> http://www.gluster.org/mailman/listinfo/gluster-users