[Gluster-devel] Duplicate entries and other weirdness in a 3*4 volume

Anders Blomdell anders.blomdell at control.lth.se
Mon Jul 21 14:03:14 UTC 2014


On 2014-07-21 14:36, Soumya Koduri wrote:
> 
> 
> On 07/21/2014 05:35 PM, Xavier Hernandez wrote:
>> On Monday 21 July 2014 13:53:19 Anders Blomdell wrote:
>>> On 2014-07-21 13:49, Pranith Kumar Karampuri wrote:
>>>> On 07/21/2014 05:17 PM, Anders Blomdell wrote:
>>>>> On 2014-07-21 13:36, Pranith Kumar Karampuri wrote:
>>>>>> On 07/21/2014 05:03 PM, Anders Blomdell wrote:
>>>>>>> On 2014-07-19 04:43, Pranith Kumar Karampuri wrote:
>>>>>>>> On 07/18/2014 07:57 PM, Anders Blomdell wrote:
>>>>>>>>> During testing of a 3*4 gluster (from master as of
>>>>>>>>> yesterday), I encountered two major weirdnesses:
>>>>>>>>> 
>>>>>>>>> 1. A 'rm -rf <some_dir>' needed several invocations to
>>>>>>>>>    finish, each time reporting a number of lines like these:
>>>>>>>>> 
>>>>>>>>>      rm: cannot remove ‘a/b/c/d/e/f’: Directory not empty
>>>>>>>>> 
>>>>>>>>> 2. After having successfully deleted all files from the
>>>>>>>>>    volume, I have a single directory that is duplicated
>>>>>>>>>    in gluster-fuse, like this:
>>>>>>>>> 
>>>>>>>>>      # ls -l /mnt/gluster
>>>>>>>>>      total 24
>>>>>>>>>      drwxr-xr-x 2 root root 12288 18 jul 16.17 work2/
>>>>>>>>>      drwxr-xr-x 2 root root 12288 18 jul 16.17 work2/
>>>>>>>>> 
>>>>>>>>> any idea on how to debug this issue?
>>>>>>>> 
>>>>>>>> What are the steps to recreate? We need to first find
>>>>>>>> what led to this, then probably which xlator leads to it.
>>>>>>> Would a pcap network dump + the result from 'tar -c
>>>>>>> --xattrs /brick/a/gluster' on all the hosts before and
>>>>>>> after the following commands are run be of any help?
>>>>>>> 
>>>>>>>   # mount -t glusterfs gluster-host:/test /mnt/gluster
>>>>>>>   # mkdir /mnt/gluster/work2
>>>>>>>   # ls /mnt/gluster
>>>>>>>   work2  work2
>>>>>> 
>>>>>> Are you using ext4?
>>>>> 
>>>>> Yes
>>>>> 
>>>>>> Is this on latest upstream?
>>>>> 
>>>>> kernel is 3.14.9-200.fc20.x86_64, if that is latest upstream
>>>>> I don't know. gluster is from master as of end of last week.
>>>>> 
>>>>> If there are known issues with ext4 I could switch to
>>>>> something else, but during the last 15 years or so, I have
>>>>> had very few problems with ext2/3/4; that's the reason for
>>>>> choosing it.
>>>> 
>>>> The problem is afrv2 + dht + ext4 offsets. Soumya and Xavier
>>>> were working on it last I heard (CCed).
>>> Should I switch to xfs or be guinea pig for testing a fixed
>>> version?
>> 
>> There is a patch for this [1]. It should work for this particular
>> configuration, but there are some limitations in the general case,
>> especially for future scalability, that we tried to solve but it
>> seems quite difficult. Maybe Soumya has newer information about
>> that.
>> 
>> XFS should work without problems if you need it.
As long as it does not start using 64-bit offsets as well :-)
Sounds like I should go for XFS right now? Tell me if you need testers.

> That's right. This patch works fine with the current supported/limited
> configuration. But we need a much more generalized approach, or maybe
> a design change as Xavi had suggested, to make it more scalable.
Is that the patch in [1] you are referring to?

> The problem in short -- 'ext4' uses large offsets, occupying even the
> bits which GlusterFS may need to store the subvol id along with the
> offset. This can end up with a few offsets being modified when given
> back to the filesystem, resulting in missing files etc. Avati has
> proposed a solution to overcome this issue based on the assumption
> that "both EXT4/XFS are tolerant in terms of the accuracy of the
> value presented back in seekdir(), i.e. a seekdir(val) actually seeks
> to the entry which has the 'closest' true offset." For more info,
> please check http://review.gluster.org/#/c/4711/.
This is AFAICT already in the version that failed, as commit 
e0616e9314c8323dc59fca7cad6972f08d72b936
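
For readers of the archive, here is a rough sketch of the kind of
encoding being discussed. This is NOT GlusterFS source; SUBVOL_BITS,
encode_off and decode_off are invented names, and the real bit layout
differs. It only illustrates why handing a modified offset back to
seekdir() depends on the filesystem tolerating approximate values:

```python
# Hypothetical sketch: stuffing a subvolume id into a directory
# offset. Names and bit layout are made up for illustration.

SUBVOL_BITS = 2                      # bits stolen from the d_off
SUBVOL_MASK = (1 << SUBVOL_BITS) - 1

def encode_off(d_off, subvol_id):
    """Replace the low bits of an ext4 d_off with the subvol id.
    The original low bits are lost, so the offset later handed
    back to seekdir() is only approximately right."""
    assert 0 <= subvol_id <= SUBVOL_MASK
    return (d_off & ~SUBVOL_MASK) | subvol_id

def decode_off(off):
    """Recover (approximate d_off, subvol_id)."""
    return off & ~SUBVOL_MASK, off & SUBVOL_MASK

# Lossless when the low bits of d_off happen to be zero:
print(decode_off(encode_off(100, 1)))   # (100, 1)
# Lossy otherwise; this only works if ext4/XFS seek to the
# entry with the closest true offset:
print(decode_off(encode_off(103, 1)))   # (100, 1)
```

Each additional translator that needs its own id would widen
SUBVOL_BITS, and with it the gap between the true and reconstructed
offset, which is the scalability concern Soumya raises.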
 
> But this offset gap widens as and when more translators (which need
> to store a subvol-id) get added to the gluster stack, which may
> eventually result in an issue similar to the one you are facing now.
> 
> Thanks, Soumya
> 
>> Xavi
>> 
>> [1] http://review.gluster.org/8201/
>> 
Thanks!

/Anders

-- 
Anders Blomdell                  Email: anders.blomdell at control.lth.se
Department of Automatic Control
Lund University                  Phone:    +46 46 222 4625
P.O. Box 118                     Fax:      +46 46 138118
SE-221 00 Lund, Sweden


