[Gluster-users] Pretty much any operation related to Gluster mounted fs hangs for a while

Tiago Santos tiago at musthavemenus.com
Mon Jan 26 18:50:15 UTC 2015


Thanks for your input, Anirban.

I ran the commands on both servers, with the following results:


root@web3:/var/www/site-images# time getfattr -m . -d -e hex templates/assets/prod/temporary/13/user_1339200.png

real 0m34.524s
user 0m0.004s
sys 0m0.000s


root@web4:/var/www/site-images# time getfattr -m . -d -e hex templates/assets/prod/temporary/13/user_1339200.png
getfattr: templates/assets/prod/temporary/13/user_1339200.png: Input/output error

real 0m11.315s
user 0m0.001s
sys 0m0.003s
root@web4:/var/www/site-images# ls templates/assets/prod/temporary/13/user_1339200.png
ls: cannot access templates/assets/prod/temporary/13/user_1339200.png: Input/output error


Not sure if that elucidates the issue.


Also, I saw at /var/log/gluster.log a zillion entries like these:

[2015-01-26 17:35:39.973268] W [client-rpc-fops.c:2779:client3_3_lookup_cbk] 0-site-images-client-1: remote operation failed: Transport endpoint is not connected. Path: /templates/apache/template/prod/facebook/9616964 (00000000-0000-0000-0000-000000000000)
[2015-01-26 17:35:39.973435] W [client-rpc-fops.c:2779:client3_3_lookup_cbk] 0-site-images-client-1: remote operation failed: Transport endpoint is not connected. Path: /templates/apache/template/prod/facebook/9594915 (00000000-0000-0000-0000-000000000000)
[2015-01-26 17:35:39.973571] W [client-rpc-fops.c:2779:client3_3_lookup_cbk] 0-site-images-client-1: remote operation failed: Transport endpoint is not connected. Path: /templates/apache/template/prod/facebook/9681971 (00000000-0000-0000-0000-000000000000)
[2015-01-26 17:35:39.973686] W [client-rpc-fops.c:2779:client3_3_lookup_cbk] 0-site-images-client-1: remote operation failed: Transport endpoint is not connected. Path: /templates/apache/template/prod/facebook/19615 (00000000-0000-0000-0000-000000000000)
[2015-01-26 17:35:39.973802] W [client-rpc-fops.c:2779:client3_3_lookup_cbk] 0-site-images-client-1: remote operation failed: Transport endpoint is not connected. Path: /templates/apache/template/prod/facebook/130392 (00000000-0000-0000-0000-000000000000)


I talked with some people on #gluster, who suggested it could be a network
issue. I'm still looking into it, but since the problem also happens locally
(within the same server), would that still be a valid explanation?


Also, less often, I see entries like these:

[2015-01-26 17:41:25.956418] E [afr-self-heal-common.c:1615:afr_sh_common_lookup_cbk] 0-site-images-replicate-0: Conflicting entries for /webhost/sites/clipart/assets/apache/images/graphics/215126/image1.png
[2015-01-26 17:41:26.588753] E [afr-self-heal-common.c:1615:afr_sh_common_lookup_cbk] 0-site-images-replicate-0: Conflicting entries for /webhost/sites/clipart/assets/apache/images/graphics/215126/image1.png


Are those a definitive indication of a split-brain? Or are they normal
until self-heal takes care of recently updated files?






On Mon, Jan 26, 2015 at 2:25 PM, A Ghoshal <a.ghoshal at tcs.com> wrote:

>  I am plagued with something of this sort, too!
>
> What I mostly see when I explore these things is that
>
> A) it's a split-brain.
> B) the split-brain is because the gfid's on the two replicas are at odds.
>
> You could check that out by
> 1. On each server, first 'cd' to where your brick is mounted.
> 2. getfattr -m . -d -e hex
> templates/assets/prod/temporary/13/user_1339200.png
>
> You will see a trusted.gfid kind of extended attribute. If it's not the
> same on both servers, there's a problem.
>
> Thanks,
> Anirban
>
>
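
The gfid check described in the steps above could be scripted roughly as
follows. This is only a sketch: it assumes the getfattr output from each
brick has already been saved to a file, and the function names and file
names (extract_gfid, compare_gfid, web3.attrs, web4.attrs) are made up for
illustration, not part of any Gluster tooling.

```shell
# Print the value of the trusted.gfid line from `getfattr -m . -d -e hex`
# output read on stdin (prints nothing if the attribute is absent).
extract_gfid() {
    sed -n 's/^trusted\.gfid=//p'
}

# Compare the trusted.gfid recorded in two saved getfattr dumps,
# one per brick. Returns 0 on match, 1 on mismatch, 2 if either is missing.
compare_gfid() {
    gfid_a=$(extract_gfid < "$1")
    gfid_b=$(extract_gfid < "$2")
    if [ -z "$gfid_a" ] || [ -z "$gfid_b" ]; then
        echo "trusted.gfid missing in at least one dump"
        return 2
    fi
    if [ "$gfid_a" = "$gfid_b" ]; then
        echo "gfid match: $gfid_a"
        return 0
    fi
    echo "gfid MISMATCH: $gfid_a vs $gfid_b"
    return 1
}
```

Usage would be something like running the getfattr command from step 2 in
the brick directory on each server, redirecting the output to web3.attrs
and web4.attrs respectively, copying both files to one machine, and then
calling: compare_gfid web3.attrs web4.attrs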

Regards,
-- 
*Tiago Santos*
MustHaveMenus.com