[Gluster-users] gluster 3.4 self-heal

Ivano Talamo Ivano.Talamo at roma1.infn.it
Wed May 28 07:19:12 UTC 2014


Hi Ravishankar,

thank you for the explanation.
I expected a performance hit after such a long shutdown; the only
problem is that I couldn't tell whether the healing was actually
progressing or not.
After launching gluster volume heal vol1 full I can see the number of
files in the .glusterfs/indices/xattrop/ directory decreasing, but at
this rate it would take two weeks to finish; maybe I would rather delete
the volume and recreate it from scratch with 3.5.
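
For what it's worth, this is roughly how I am watching the rate (the
brick path below is only a placeholder for our real brick directory):

   # count the pending heal entries on the #1 brick
   ls /path/to/brick/.glusterfs/indices/xattrop/ | wc -l

   # repeat every 10 minutes to estimate how fast the backlog drains
   watch -n 600 "ls /path/to/brick/.glusterfs/indices/xattrop/ | wc -l"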

Thanks
Ivano

On 5/27/14 7:35 PM, Ravishankar N wrote:
> On 05/27/2014 08:47 PM, Ivano Talamo wrote:
>> Dear all,
>>
>> we have a replicated volume (2 servers with 1 brick each) on 
>> Scientific Linux 6.2 with gluster 3.4.
>> Everything was running fine until we shut down one of the two servers 
>> and kept it down for 2 months.
>> When it came back up again the volume needed to be healed, and we see 
>> the following symptoms
>> (call #1 the always-up server, #2 the server that was kept down):
>>
>> - doing I/O on the volume gives very bad performance (impossible to 
>> keep VM images on it)
>>
> A replica's bricks are not supposed to be intentionally kept down even 
> for hours, let alone months :-( . If you do, then when the brick does 
> come back up there will be a lot of data to heal, so a performance hit 
> is expected.
>> - on #1 there are 3997354 files in .glusterfs/indices/xattrop/ and the 
>> number doesn't go down
>>
> When #2 was down, did the I/O involve directory renames? (Check whether 
> there are entries in .glusterfs/landfill on #2.) If yes, then this is a 
> known issue and a fix is in progress: http://review.gluster.org/#/c/7879/
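> For example, something like this on #2 should show whether landfill has 
> entries (the brick path is just a placeholder for your real one):
>
>      ls /path/to/brick/.glusterfs/landfill/ | wc -l
>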
>
>> - on #1, the first run of gluster volume heal vol1 info takes a long 
>> time to finish and doesn't show anything.
> This is fixed in glusterfs 3.5, where heal info is much more responsive.
>> After that it prints "Another transaction is in progress. Please try 
>> again after sometime."
>>
>> Furthermore, on #1 glustershd.log is full of messages like these:
>> [2014-05-27 15:07:44.145326] W 
>> [client-rpc-fops.c:1538:client3_3_inodelk_cbk] 0-vol1-client-0: 
>> remote operation failed: No such file or directory
>> [2014-05-27 15:07:44.145880] W 
>> [client-rpc-fops.c:1640:client3_3_entrylk_cbk] 0-vol1-client-0: 
>> remote operation failed: No such file or directory
>> [2014-05-27 15:07:44.146070] E 
>> [afr-self-heal-entry.c:2296:afr_sh_post_nonblocking_entry_cbk] 
>> 0-vol1-replicate-0: Non Blocking entrylks failed for 
>> <gfid:bfbe65db-7426-4ca0-bf0b-7d1a28de2052>.
>> [2014-05-27 15:13:34.772856] E 
>> [afr-self-heal-data.c:1270:afr_sh_data_open_cbk] 0-vol1-replicate-0: 
>> open of <gfid:18a358e0-23d3-4f56-8d74-f5cc38a0d0ea> failed on child 
>> vol1-client-0 (No such file or directory)
>>
>> On the #2 bricks I see some updates, i.e. new filenames appearing, and 
>> .glusterfs/indices/xattrop/ is usually empty.
>>
>> Do you know what's happening? How can we fix this?
> You could try a `gluster volume heal vol1 full` to see if the bricks 
> get synced.
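> For example (with vol1 being your volume name), after kicking off the 
> full heal you could check what has been healed so far; "info healed" 
> may not be available on every build, so treat this as a sketch:
>
>      gluster volume heal vol1 full
>      gluster volume heal vol1 info healed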
>
> Regards,
> Ravi
>>
>> thank you,
>> Ivano
>>
>>
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>
