[Gluster-users] Files won't heal, although no obvious problem visible

Pavel Cernohorsky pavel.cernohorsky at appeartv.com
Wed Nov 23 12:40:52 UTC 2016


I am afraid I do not know how we got into this strange state; I do not 
know Gluster in enough detail. When does the trusted.afr.dirty flag get 
set, and when does the trusted.afr.xxx-client-xxx flag get set? From 
what you are saying, it seems that you always expect them to be set / 
cleared at the same moment.
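
For reference, this is roughly how I have been inspecting those 
attributes directly on a brick (the file path below is just a 
placeholder):

# dump all extended attributes of the file as stored on the brick
getfattr -d -m . -e hex /opt/data/hdd5/gluster/<path-to-file>
# a clean replica should show all-zero AFR values, e.g.
# trusted.afr.dirty=0x000000000000000000000000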

If it helps, you can find the full volume configuration at the end of 
my message.

Is there anything more I can do to help figure out what happened / fix 
the problem?

Thanks for your help, kind regards,

Pavel


Volume Name: hot
Type: Distributed-Replicate
Volume ID: 4d09dd56-97b6-4b63-8765-0a08574e8ddd
Status: Started
Snapshot Count: 0
Number of Bricks: 12 x (2 + 1) = 36
Transport-type: tcp
Bricks:
Brick1: 10.10.27.10:/opt/data/hdd1/gluster
Brick2: 10.10.27.12:/opt/data/hdd1/gluster
Brick3: 10.10.27.11:/opt/data/ssd/arbiter1 (arbiter)
... similar triplets here ...
Brick34: 10.10.27.12:/opt/data/hdd8/gluster
Brick35: 10.10.27.11:/opt/data/hdd8/gluster
Brick36: 10.10.27.10:/opt/data/ssd/arbiter12 (arbiter)
Options Reconfigured:
performance.flush-behind: off
performance.write-behind: off
performance.open-behind: off
performance.nfs.write-behind: off
cluster.background-self-heal-count: 1
performance.io-cache: off
network.ping-timeout: 1
network.inode-lru-limit: 1024
performance.nfs.flush-behind: off
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
cluster.self-heal-daemon: off
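
For reference, when I temporarily enabled the daemon to run the heal 
mentioned below, it was with roughly the following commands:

gluster volume set hot cluster.self-heal-daemon on
gluster volume heal hot          # launch the index self-heal
gluster volume heal hot info     # check that the entries disappear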


On 11/23/2016 01:22 PM, Ravishankar N wrote:
> On 11/23/2016 04:56 PM, Pavel Cernohorsky wrote:
>> Hello, thanks for your reply, answers are in the text.
>>
>> On 11/23/2016 11:55 AM, Ravishankar N wrote:
>>> On 11/23/2016 03:56 PM, Pavel Cernohorsky wrote:
>>>> The "hot-client-21" is, based on the vol-file, the following of the 
>>>> bricks:
>>>> option remote-subvolume /opt/data/hdd5/gluster
>>>> option remote-host 10.10.27.11
>>>>
>>>> I have the self-heal daemon disabled, but when I try to trigger 
>>>> healing manually (gluster volume heal <volname>), I get: "Launching 
>>>> heal operation to perform index self heal on volume <volname> has 
>>>> been unsuccessful on bricks that are down. Please check if all 
>>>> brick processes are running.", although all the bricks are online 
>>>> (gluster volume status <volname>).
>>>
>>> Can you enable the self-heal daemon and try again? `gluster 
>>> volume heal <volname>` requires the shd to be enabled. The error 
>>> message that you get is inappropriate and is being fixed.
>>
>> When I enabled the self-heal daemon, I was able to start healing, 
>> and the files were actually healed. What does the self-heal daemon 
>> do in addition to the automated healing when you read the file?
>
>
> The lookup/read code-path doesn't seem to consider a file with only 
> the afr.dirty xattr being non-zero as a candidate for heal (while the 
> self-heal-daemon code-path does). I'm not sure at this point if it 
> should, because afr.dirty being set on all bricks without any 
> trusted.afr.xxx-client-xxx being set doesn't seem to be something that 
> should be hit under normal circumstances. I'll need to think about 
> this more.
>
>>
>> The original reason for disabling the self-heal daemon was to be 
>> able to control the amount of resources used by healing, because 
>> "cluster.background-self-heal-count: 1" did not help very much and 
>> the amount of both network and disk I/O consumed was just extreme.
>>
>> And I am also pretty sure we saw a similar problem (not sure about 
>> the attributes) before we disabled the shd.
>>
>>>
>>>>
>>>> When I try to just md5sum the file, to trigger automated healing 
>>>> on file access, I get the result, but the file is still not 
>>>> healed. This usually works when I do not get 3 entries for the 
>>>> same file in the heal info.
>>>
>>> Is the file size for 99705_544c0cd369a84ebcaf095b4a9f6d682a.mp4 
>>> non-zero on the 2 data bricks (i.e. on 10.10.27.11 and 10.10.27.10) 
>>> and do they match?
>>> Do the md5sums match with what you got on the mount when you 
>>> calculate it directly on these bricks?
>>
>> The file has a non-zero size on both data bricks, and the md5 sum 
>> was the same on both of them before they were healed; after the 
>> healing (enabling the shd and starting the heal), the md5 did not 
>> change on either of the data bricks. The mount point reports the 
>> same md5 as all the other attempts directly on the bricks. So what 
>> is actually happening there? Why was the file blamed (and not 
>> unblamed after healing)?
>
> That means there was no real heal pending. But because the dirty 
> xattr was set, the shd picked a brick as the source and did the heal 
> anyway. We would need to find out how we ended up in the 'only 
> afr.dirty xattr was set' state for the file.
>
> -Ravi
>>
>> Thanks for your answers,
>> Pavel
>>
>
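
For completeness, the brick-versus-mount comparison discussed above was 
done roughly like this (the file path and the mount point below are 
placeholders):

# checksum the copy stored on each of the two data bricks
md5sum /opt/data/hdd5/gluster/<path-to-file>
# checksum through the client mount, which also sends a lookup/read on
# the file and would normally trigger a heal of a blamed file
md5sum /mnt/hot/<path-to-file>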


