[Gluster-users] Self healing does not see files to heal
Дмитрий Глушенок
glush at jet.msk.su
Wed Aug 17 08:18:15 UTC 2016
Hello Ravi,
Thank you for the reply. I found the bug number (for those who will find this email via Google): https://bugzilla.redhat.com/show_bug.cgi?id=1112158
Accessing the removed file from the mount point does not always work, because we have to find a particular client whose DHT lookup points to the brick with the removed file. Otherwise the file is served from the good brick and self-healing does not happen (just verified). Or, by accessing, did you mean something like touch?
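For reference, both of the following send a named lookup from the client and should trigger the heal when that client resolves to the bad brick (a minimal sketch, assuming the volume is mounted at /mnt):

[root at srv01 ~]# stat /mnt/passwd
[root at srv01 ~]# touch /mnt/passwd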
--
Dmitry Glushenok
Jet Infosystems
> On 17 Aug 2016, at 4:24, Ravishankar N <ravishankar at redhat.com> wrote:
>
> On 08/16/2016 10:44 PM, Дмитрий Глушенок wrote:
>> Hello,
>>
>> While testing healing after a bitrot error, it was found that self-healing cannot heal files which were manually deleted from a brick. Gluster 3.8.1:
>>
>> - Create the volume, mount it locally, and copy a test file to it
>> [root at srv01 ~]# gluster volume create test01 replica 2 srv01:/R1/test01 srv02:/R1/test01
>> volume create: test01: success: please start the volume to access data
>> [root at srv01 ~]# gluster volume start test01
>> volume start: test01: success
>> [root at srv01 ~]# mount -t glusterfs srv01:/test01 /mnt
>> [root at srv01 ~]# cp /etc/passwd /mnt
>> [root at srv01 ~]# ls -l /mnt
>> total 2
>> -rw-r--r--. 1 root root 1505 Aug 16 19:59 passwd
>>
>> - Then remove the test file from the first brick, as we have to do in case of a bitrot error in the file
>
> You also need to remove all hard-links to the corrupted file from the brick, including the one in the .glusterfs folder.
> There is a bug in heal-full that prevents it from crawling all bricks of the replica. The right way to heal corrupted files as of now is to access them from the mount point, like you did, after removing the hard-links; see the sketch below. The list of corrupted files can be obtained with the scrub status command.
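> For example, a sketch of removing both hard-links and then checking the scrub report (the GFID below is illustrative; read the real one from the brick with getfattr, which ships in the attr package):
>
> [root at srv01 ~]# getfattr -n trusted.gfid -e hex /R1/test01/passwd
> # suppose it prints trusted.gfid=0x6ff4...; the first two bytes of the GFID
> # give the two directory levels, so the second hard-link lives at
> # /R1/test01/.glusterfs/6f/f4/6ff4...
> [root at srv01 ~]# rm /R1/test01/passwd /R1/test01/.glusterfs/6f/f4/6ff4...
> [root at srv01 ~]# gluster volume bitrot test01 scrub status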
>
> Hope this helps,
> Ravi
>
>> [root at srv01 ~]# rm /R1/test01/passwd
>> [root at srv01 ~]# ls -l /mnt
>> total 0
>> [root at srv01 ~]#
>>
>> - Issue a full self-heal
>> [root at srv01 ~]# gluster volume heal test01 full
>> Launching heal operation to perform full self heal on volume test01 has been successful
>> Use heal info commands to check status
>> [root at srv01 ~]# tail -2 /var/log/glusterfs/glustershd.log
>> [2016-08-16 16:59:56.483767] I [MSGID: 108026] [afr-self-heald.c:611:afr_shd_full_healer] 0-test01-replicate-0: starting full sweep on subvol test01-client-0
>> [2016-08-16 16:59:56.486560] I [MSGID: 108026] [afr-self-heald.c:621:afr_shd_full_healer] 0-test01-replicate-0: finished full sweep on subvol test01-client-0
>>
>> - Now we still see no files at the mount point (it became empty right after removing the file from the brick)
>> [root at srv01 ~]# ls -l /mnt
>> total 0
>> [root at srv01 ~]#
>>
>> - Then try to access the file by its full name (lookup-optimize and readdir-optimize are turned off by default). Now glusterfs shows the file!
>> [root at srv01 ~]# ls -l /mnt/passwd
>> -rw-r--r--. 1 root root 1505 Aug 16 19:59 /mnt/passwd
>>
>> - And it reappeared on the brick
>> [root at srv01 ~]# ls -l /R1/test01/
>> total 4
>> -rw-r--r--. 2 root root 1505 Aug 16 19:59 passwd
>> [root at srv01 ~]#
>>
>> Is this a bug, or can we tell self-heal to scan all files on all bricks in the volume?
>>
>> --
>> Dmitry Glushenok
>> Jet Infosystems
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users