[Gluster-devel] Geo-replication fills up inode by saving processed changelog files

Aravinda avishwan at redhat.com
Mon Dec 15 06:16:50 UTC 2014


On 12/15/2014 11:21 AM, Vijay Bellur wrote:
> On 12/15/2014 11:13 AM, Aravinda wrote:
>> On 12/15/2014 10:39 AM, Vijay Bellur wrote:
>>> On 12/11/2014 11:40 AM, Aravinda wrote:
>>>> Hi,
>>>>
>>>>
>>>> While geo-replication is running, it keeps processed changelog files in
>>>> $WORKING_DIR/.processed or $WORKING_DIR/.history/.processed. These
>>>> files are useful for debugging processed changelogs, but they eat up
>>>> disk space and available inodes. The changelog files saved in the
>>>> processed directory are duplicates of the changelogs available in the
>>>> brick backend (the only difference is the format: changelogs in the
>>>> processed dir are parsed and human readable).
>>>
>>> Do we consume 2 inodes per changelog - one in $WORKING_DIR and another
>>> in the brick? If yes, how do we avoid consuming inodes in the
>>> filesystem that contains the brick?
>> Changelogs generated by the changelog translator are saved in the brick.
>> When a consumer registers a working dir through the libgfchangelog API,
>> the library processes the brick changelogs and copies them to the
>> working dir.
>>>
>
> How are the changelogs in the bricks cleaned up?
Changelogs in the bricks are not cleaned up; they remain with the data
(even in snapshots).
With these changelogs we can get historical changes whenever required,
using the changelog history API.
>
>
>>>>
>>>> How about keeping only a reference to the changelog file after it is
>>>> processed? For debugging there will be an additional step: look up the
>>>> changelog in the backend ($BRICK/.glusterfs/changelogs) using this
>>>> reference.
>>>>
>>>> After syncing data to the slave (in geo-replication):
>>>> echo $changelog_filename >> $WORKING_DIR/.processed_files
>>>> rm $WORKING_DIR/.processing/$changelog_filename
>>>>
>>>> We need to modify `gf_changelog_done` and `gf_history_changelog_done`
>>>> functions in
>>>> libgfchangelog($GLUSTER_SRC/xlators/features/changelog/lib/src)
>>>>
>>>> Any thoughts?
>>>
>>> Archiving retired changelogs may be an option. What would be the
>>> scenario when there are multiple changelog consumers (apart from
>>> geo-replication)?
>> Each consumer registers a working dir, so archiving will not affect other
>> consumers. The only issue could be that backend changelogs are not
>> copied into sosreports (it may be difficult to debug without a copy of
>> the changelog in the working dir).
>
> The sosreport plugin does pick up everything in the working directory.
> If the changelogs get archived in the working directory, this should not
> be a problem, right?
Yeah, if we archive, then they will be available in sosreports. But as I
mentioned in my first mail, if I delete the changelogs in the working dir
(after syncing to the slave, in the processed dir) and keep only references
in the working dir, then the sosreport will not have them.
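
A minimal sketch of the reference-only approach, assuming the working-dir
layout from the first mail ($WORKING_DIR/.processing plus a
$WORKING_DIR/.processed_files list). The function names mark_changelog_done
and find_backend_changelog are hypothetical and exist only for illustration;
the real change would go into gf_changelog_done and
gf_history_changelog_done in libgfchangelog.

```shell
#!/bin/sh
# Hypothetical helpers, not part of GlusterFS. The directory layout
# (.processing, .processed_files, $BRICK/.glusterfs/changelogs) is taken
# from the proposal in this thread.

# Record a processed changelog by name, then delete the working-dir copy.
mark_changelog_done() {
    working_dir=$1
    changelog_filename=$2
    src="$working_dir/.processing/$changelog_filename"

    # Do not record a reference for a file that is not there.
    [ -f "$src" ] || return 1

    # Append the reference before removing the file, so that a crash between
    # the two steps leaves a stale file rather than a missing record.
    printf '%s\n' "$changelog_filename" >> "$working_dir/.processed_files" || return 1
    rm -f -- "$src"
}

# Debugging step: resolve a recorded reference to the raw changelog that
# still lives in the brick backend.
find_backend_changelog() {
    brick=$1
    changelog_filename=$2
    find "$brick/.glusterfs/changelogs" -type f -name "$changelog_filename" -print
}
```

With this, the working dir holds one growing list file instead of one inode
per processed changelog. The trade-off is the one described above: a
sosreport would capture only the names, and the raw (unparsed) changelogs
would have to be fetched from the brick separately.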
>
> -Vijay
>

--
regards
Aravinda

