[Gluster-devel] Need inputs for fixing 866908

Thu Dec 13 05:53:27 UTC 2012

On 12/12/2012 02:09 PM, Pranith Kumar Karampuri wrote:
> hi,
>    https://bugzilla.redhat.com/show_bug.cgi?id=866908.
> The "gluster volume heal <vol_name> info" command outputs entries which are in the ".glusterfs" and ".landfill" directories.
>
> Reason for the issue:
> Whenever a brick is restarted and self-heal has to remove a directory 'dir', it sets a flag in the rmdir fop indicating that this is an 'rm -rf'. Because of that flag, posix moves 'dir' to '.glusterfs/landfill', and the janitor thread removes it asynchronously. The gfid-handles for 'dir' and the files under 'dir' exist until the janitor thread deletes them from landfill. The janitor thread runs every 10 minutes. Self-heald depends on the presence of a gfid-handle to determine that a file exists in the filesystem. Because of this behaviour, stale entries show up in the output of 'gluster volume heal <volname> info'.
>
> The issue is transient: after 10 minutes the stale entries no longer show up in the output. One way to fix it is to wake up the janitor thread as soon as an rmdir fop arrives with the special flag indicating that an rm -rf needs to happen inside landfill. Alternatively, we could close the issue by documenting the behaviour as a known issue. Let me know your inputs.
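
To make the "wake the janitor early" idea concrete, here is a minimal sketch in C. It is not the posix translator code: every name in it (struct janitor, janitor_wake, janitor_thread and so on) is hypothetical, and the actual landfill sweep is left as a comment. It only shows a periodic thread that sleeps on a timed condition-variable wait, which an rmdir path could interrupt.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical sketch; none of these names come from GlusterFS. */
struct janitor {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            wake_requested; /* set by the (assumed) rmdir path */
    bool            stop;           /* set to shut the thread down */
    int             interval_secs;  /* 600 would match "every 10 minutes" */
    int             sweeps;         /* number of landfill sweeps completed */
};

void janitor_init(struct janitor *j, int interval_secs)
{
    assert(interval_secs > 0);
    pthread_mutex_init(&j->lock, NULL);
    pthread_cond_init(&j->cond, NULL);
    j->wake_requested = false;
    j->stop = false;
    j->interval_secs = interval_secs;
    j->sweeps = 0;
}

/* Would be called when an rmdir fop carries the 'rm -rf' flag. */
void janitor_wake(struct janitor *j)
{
    pthread_mutex_lock(&j->lock);
    j->wake_requested = true;
    pthread_cond_signal(&j->cond);
    pthread_mutex_unlock(&j->lock);
}

void janitor_stop(struct janitor *j)
{
    pthread_mutex_lock(&j->lock);
    j->stop = true;
    pthread_cond_signal(&j->cond);
    pthread_mutex_unlock(&j->lock);
}

int janitor_sweeps(struct janitor *j)
{
    pthread_mutex_lock(&j->lock);
    int n = j->sweeps;
    pthread_mutex_unlock(&j->lock);
    return n;
}

void *janitor_thread(void *arg)
{
    struct janitor *j = arg;

    pthread_mutex_lock(&j->lock);
    while (!j->stop) {
        /* Absolute deadline for the normal periodic run. */
        struct timespec deadline;
        clock_gettime(CLOCK_REALTIME, &deadline);
        deadline.tv_sec += j->interval_secs;

        /* Sleep until the deadline, an early wake-up, or shutdown. */
        int rc = 0;
        while (!j->stop && !j->wake_requested && rc != ETIMEDOUT)
            rc = pthread_cond_timedwait(&j->cond, &j->lock, &deadline);
        if (j->stop)
            break;
        j->wake_requested = false;

        pthread_mutex_unlock(&j->lock);
        /* Placeholder: delete everything under landfill here. */
        pthread_mutex_lock(&j->lock);
        j->sweeps++;
    }
    pthread_mutex_unlock(&j->lock);
    return NULL;
}
```

With something of this shape, the rmdir path would call janitor_wake() after moving 'dir' into landfill, and the sweep would start right away instead of waiting out the rest of the interval. The gfid-handles would still exist for as long as the sweep itself takes, so this narrows the window rather than closing it.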

The problem as reported seems to be that self-heal-daemon tells the
CLI there are entries to be healed, and some of them are not
relevant because they are part of .landfill. I cannot think of a
foolproof solution for this right away, as an rm -rf might happen after
self-heal-daemon has sent out a response that contains entries needing
self-heal. I think a better way out here is to educate users that
entries from .landfill seen in the output of heal * commands are
transient and not relevant for self-healing.

-Vijay