[Gluster-devel] 3.6.1 issue

Vijay Bellur vbellur at redhat.com
Mon Dec 22 14:23:44 UTC 2014


On 12/21/2014 11:10 PM, David F. Robinson wrote:
> So for now it is up to all of the individual users to know they cannot use tar without the -P switch if they are accessing a data storage system that uses gluster?
>

Setting the volume option cluster.read-hash-mode to 2 could help here.
Could you please check whether this resolves the problem without the -P
switch?
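
To check an extracted tree for leftover placeholders, something like the
following Python sketch might be enough. It assumes the placeholders look
the way the quoted message below describes them (empty regular files with
all permission bits set to 0); the function name is made up for this
example and is not part of tar or GlusterFS.

```python
import os
import stat


def find_leftover_placeholders(root):
    """Return sorted paths of empty regular files whose permission bits are all 0.

    Assumption: a placeholder that tar failed to replace is an empty
    regular file with mode 0, as described in the quoted message.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # lstat so that real symlinks are inspected, not followed
            st = os.lstat(path)
            if (stat.S_ISREG(st.st_mode)
                    and st.st_size == 0
                    and stat.S_IMODE(st.st_mode) == 0):
                hits.append(path)
    return sorted(hits)
```

An empty result after extracting without -P would suggest the symlinks
were created properly; any path returned is worth comparing against the
archive listing.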

-Vijay


>> On Dec 21, 2014, at 12:30 PM, Vijay Bellur <vbellur at redhat.com> wrote:
>>
>>> On 12/20/2014 12:09 PM, David F. Robinson wrote:
>>> Seems to work with "-xPf".  I obviously couldn't check all of the files,
>>> but the two specific ones that I noted in my original email do not show
>>> any problems when using -P...
>>
>> This is related to the way tar extracts symbolic links by default and how that interacts with GlusterFS. In a nutshell, tar creates a symbolic link on the destination in the following steps:
>>
>> a) Create an empty regular placeholder file with all permission bits set to 0, using the name of the symbolic link.
>>
>> b) Record the device number, inode number and mtime of the placeholder file through stat.
>>
>> c) After the first pass of extraction is complete, a second pass puts the symbolic links in place. In this pass, stat is called on the placeholder file again. Only if every attribute recorded in b) matches the fresh stat result is the placeholder unlinked and the symbolic link created. If any attribute differs, neither the unlink nor the symlink creation happens.
>>
>> In the case of replicated GlusterFS volumes, the mtime of a placeholder file can differ across nodes at creation time. If the stat calls in steps b) and c) land on different nodes, tar is quite likely to skip creating the symbolic links and leave the placeholder files behind.
>>
>> A little more detail about this particular implementation behavior of symlinks for tar can be found at [1].
>>
>> To work around this behavior, use the -P switch with the tar command during extraction. tar then creates the symbolic link directly and skips the steps above.
>>
>> Keeping timestamps in sync across the cluster will help to an extent in preventing this situation. There are ongoing refinements in replicate's selection of read-child which will help in addressing this problem.
>>
>> -Vijay
>>
>> [1] http://lists.debian.org/debian-user/2003/03/msg03249.html
>>
>
>
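
The comparison in steps a) to c) of the quoted explanation can be
illustrated with a small Python model. This is only a sketch of the logic
as described above, not tar's actual code; the StatInfo type, the function
name and the device, inode and mtime values are all made up for
illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatInfo:
    dev: int       # device number
    ino: int       # inode number
    mtime_ns: int  # modification time in nanoseconds


def can_replace_placeholder(recorded: StatInfo, current: StatInfo) -> bool:
    """Model of step c): replace the placeholder only if all three
    attributes recorded in step b) still match the fresh stat result."""
    return (recorded.dev == current.dev
            and recorded.ino == current.ino
            and recorded.mtime_ns == current.mtime_ns)


# Hypothetical views of one placeholder as reported by two replicas:
# device and inode agree, but the mtimes differ slightly.
replica_a = StatInfo(dev=42, ino=1001, mtime_ns=1_419_000_000_100_000_000)
replica_b = StatInfo(dev=42, ino=1001, mtime_ns=1_419_000_000_350_000_000)

# Both stat calls answered by the same replica: symlink gets created.
same_replica = can_replace_placeholder(replica_a, replica_a)
# Step b) answered by one replica, step c) by the other: placeholder stays.
different_replica = can_replace_placeholder(replica_a, replica_b)

print(same_replica)       # True
print(different_replica)  # False
```

The second case corresponds to the situation on a replicated volume where
the two stat calls land on different nodes.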