[Gluster-users] gluster 3.4 self-heal

Ravishankar N ravishankar at redhat.com
Tue May 27 17:35:42 UTC 2014


On 05/27/2014 08:47 PM, Ivano Talamo wrote:
> Dear all,
>
> we have a replicated volume (2 servers with 1 brick each) on 
> Scientific Linux 6.2 with gluster 3.4.
> Everything was running fine until we shut down one of the two servers 
> and kept it down for 2 months.
> When it came up again the volume needed to be healed and we see the 
> following symptoms
> (call #1 the always-up server, #2 the server that was kept down):
>
> -doing I/O on the volume gives very bad performance (impossible to keep 
> VM images on it)
>
A replica's bricks are not supposed to be intentionally kept down even 
for hours, let alone months :-(. If you do, then when the brick comes 
back up there will be a huge backlog of files to heal, so a performance 
hit is expected.
> -on #1 there are 3997354 files in .glusterfs/indices/xattrop/ and the 
> number doesn't go down
>
When #2 was down, did the I/O involve directory renames? (Check whether 
there are entries in .glusterfs/landfill on #2; a quick way to do that is 
sketched below.) If yes, then this is a known issue and a fix is in 
progress: http://review.gluster.org/#/c/7879/
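
A rough way to check both things from the shell (this assumes the brick's 
backing directory is /path/to/brick -- substitute your actual brick path):

    # on #2: anything left behind in landfill?
    ls /path/to/brick/.glusterfs/landfill | wc -l

    # on #1: size of the pending-heal backlog; re-run this periodically
    # to see whether the number is actually shrinking
    ls /path/to/brick/.glusterfs/indices/xattrop | wc -l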

> -on #1, gluster volume heal vol1 info takes a long time to finish the 
> first time and doesn't show anything.
This is fixed in glusterfs 3.5, where heal info is much more responsive.
> after that it prints "Another transaction is in progress. Please try 
> again after sometime."
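That message usually just means the previous heal info crawl is still 
holding its lock, so waiting for it to finish and retrying is generally 
all you can do on 3.4. In the meantime you can get some visibility into 
what the self-heal daemon has done without waiting for the full crawl; 
assuming your volume is vol1, something like:

    # files healed so far, heals that failed, and split-brain entries
    gluster volume heal vol1 info healed
    gluster volume heal vol1 info heal-failed
    gluster volume heal vol1 info split-brain

(If I remember correctly all three sub-commands are available on 3.4.)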
>
> Furthermore on #1 glustershd.log is full of messages like this:
> [2014-05-27 15:07:44.145326] W 
> [client-rpc-fops.c:1538:client3_3_inodelk_cbk] 0-vol1-client-0: remote 
> operation failed: No such file or directory
> [2014-05-27 15:07:44.145880] W 
> [client-rpc-fops.c:1640:client3_3_entrylk_cbk] 0-vol1-client-0: remote 
> operation failed: No such file or directory
> [2014-05-27 15:07:44.146070] E 
> [afr-self-heal-entry.c:2296:afr_sh_post_nonblocking_entry_cbk] 
> 0-vol1-replicate-0: Non Blocking entrylks failed for 
> <gfid:bfbe65db-7426-4ca0-bf0b-7d1a28de2052>.
> [2014-05-27 15:13:34.772856] E 
> [afr-self-heal-data.c:1270:afr_sh_data_open_cbk] 0-vol1-replicate-0: 
> open of <gfid:18a358e0-23d3-4f56-8d74-f5cc38a0d0ea> failed on child 
> vol1-client-0 (No such file or directory)
>
> On #2's bricks I see some updates, i.e. new filenames appearing, and 
> .glusterfs/indices/xattrop/ is usually empty.
>
> Do you know what's happening? How can we fix this?
You could try a `gluster volume heal vol1 full` to see if the bricks get 
synced.
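Roughly (index heal should eventually catch up on its own, but the full 
variant forces a complete crawl of the bricks; the brick path below is 
again just a placeholder for your setup):

    # trigger a full crawl-based heal of vol1
    gluster volume heal vol1 full

    # then keep an eye on progress, e.g. by watching the xattrop backlog
    # on #1 shrink over time
    watch -n 60 'ls /path/to/brick/.glusterfs/indices/xattrop | wc -l'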

Regards,
Ravi
>
> thank you,
> Ivano
>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://supercolony.gluster.org/mailman/listinfo/gluster-users
