[Gluster-users] Pretty much any operation related to Gluster mounted fs hangs for a while

A Ghoshal a.ghoshal at tcs.com
Mon Jan 26 21:04:19 UTC 2015


Yep, so it is indeed a split-brain caused by a mismatch of the trusted.gfid attribute. 
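
To make the comparison concrete, here is a minimal sketch of the check, assuming the `getfattr -m . -d -e hex` output from each brick (as quoted below) has been saved to a file. The function names `gfid_of`, `same_gfid` and `afr_counters` are made up for illustration. The split of a trusted.afr value into three 32-bit pending counters (data, metadata, entry) is the AFR changelog layout as I understand it, so treat that part as an assumption to verify against your version.

```shell
# Extract the trusted.gfid value (hex, without the 0x prefix) from a file
# holding saved `getfattr -m . -d -e hex` output.
gfid_of() {
    sed -n 's/^trusted\.gfid=0x//p' "$1"
}

# Succeeds only if both saved outputs carry the same gfid. Replicas of one
# file should agree; a difference is the gfid mismatch discussed here.
same_gfid() {
    [ "$(gfid_of "$1")" = "$(gfid_of "$2")" ]
}

# Split a trusted.afr.* value into three 32-bit counters, printed in decimal.
# Assumed layout: data, metadata, entry (in that order).
afr_counters() (
    hex=${1#0x}
    d=$(printf '%s' "$hex" | cut -c1-8)
    m=$(printf '%s' "$hex" | cut -c9-16)
    e=$(printf '%s' "$hex" | cut -c17-24)
    echo "data=$((0x$d)) metadata=$((0x$m)) entry=$((0x$e))"
)

# Example (file names are placeholders):
#   same_gfid web3.txt web4.txt || echo "gfid mismatch"
#   afr_counters 0x000000020000000900000000
```

With the values from this thread, the two bricks report different gfids, and web3 holds non-zero pending counters while web4 holds all zeros.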

Sadly, I don't know precisely what causes it. Communication loss might be one of the triggers. I am guessing the files with the problem are dynamic, correct? In our setup (also replica 2), communication is never a problem, but we do see this when one of the servers reboots. Maybe it is some obscure, hard-to-understand race between background self-heal and the self-heal daemon...

In any case, the normal procedure for split-brain recovery should work for you if you want to get your files working again. It's easy to find on Google. I use the instructions on Joe Julian's blog myself.
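
For reference, a rough sketch of the usual manual fix as I understand it, not a quote from that blog: pick the copy to discard, remove it and its gfid hard link from that brick, then access the file through the client mount so self-heal recreates it from the good copy. The helper names `gfid_backing_path` and `print_cleanup` are hypothetical, and the `.glusterfs/<aa>/<bb>/<uuid>` layout is an assumption to confirm on your bricks. The script only prints the `rm` commands, so nothing is deleted until you have reviewed them.

```shell
# Map a hex gfid (as printed by getfattr -e hex) to its hard link under the
# brick's .glusterfs directory: .glusterfs/<aa>/<bb>/<uuid-with-dashes>.
gfid_backing_path() (
    hex=${1#0x}
    part() { printf '%s' "$hex" | cut -c"$1"; }
    uuid="$(part 1-8)-$(part 9-12)-$(part 13-16)-$(part 17-20)-$(part 21-32)"
    printf '.glusterfs/%s/%s/%s\n' "$(part 1-2)" "$(part 3-4)" "$uuid"
)

# Print (do not run) the removals for the copy chosen as "bad".
# Arguments: brick root, file path relative to the brick, hex gfid on that brick.
print_cleanup() (
    brick=$1
    file=$2
    gfid=$3
    echo "rm -f $brick/$(gfid_backing_path "$gfid")"
    echo "rm -f $brick/$file"
)

# Example (which brick holds the bad copy is the operator's decision):
#   print_cleanup /path/to/brick path/to/file 0x<gfid-from-that-brick>
```

After running the printed commands on the chosen brick, a `stat` of the file through the client mount (not the brick) should trigger the heal.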


 -----Tiago Santos <tiago at musthavemenus.com> wrote: -----

 =======================
 To: A Ghoshal <a.ghoshal at tcs.com>
 From: Tiago Santos <tiago at musthavemenus.com>
 Date: 01/27/2015 02:11AM 
 Cc: gluster-users <gluster-users at gluster.org>
 Subject: Re: [Gluster-users] Pretty much any operation related to Gluster mounted fs hangs for a while
 =======================
   Oh, right!

Here are the outputs:


root@web3:/export/images1-1/brick# time getfattr -m . -d -e hex templates/assets/prod/temporary/13/user_1339200.png
# file: templates/assets/prod/temporary/13/user_1339200.png
trusted.afr.site-images-client-0=0x000000000000000400000000
trusted.afr.site-images-client-1=0x000000020000000900000000
trusted.gfid=0x10e5894c474a4cb1898b71e872cdf527

real 0m0.024s
user 0m0.001s
sys 0m0.001s



root@web4:/export/images2-1/brick# time getfattr -m . -d -e hex templates/assets/prod/temporary/13/user_1339200.png
# file: templates/assets/prod/temporary/13/user_1339200.png
trusted.afr.site-images-client-0=0x000000000000000000000000
trusted.afr.site-images-client-1=0x000000000000000000000000
trusted.gfid=0xd02f14fcb6724ceba4a330eb606910f3

real 0m0.003s
user 0m0.000s
sys 0m0.006s


Not sure exactly what that means. I'm googling, and would appreciate it if you
guys could shed some light.

Thanks!
--
Tiago




On Mon, Jan 26, 2015 at 6:16 PM, A Ghoshal <a.ghoshal at tcs.com> wrote:

>
> Actually, you ran getfattr on the mounted volume, which is why the requisite
> extended attributes never showed up...
>
> Your bricks are mounted elsewhere:
>  /export/images1-1/brick and /export/images2-1/brick
>
> Btw, what version of Linux do you use? And, are the files you observe the
> input/output errors on soft-links?
>
>  -----Tiago Santos <tiago at musthavemenus.com> wrote: -----
>
>  =======================
>  To: A Ghoshal <a.ghoshal at tcs.com>
>  From: Tiago Santos <tiago at musthavemenus.com>
>  Date: 01/27/2015 12:20AM
>  Cc: gluster-users <gluster-users at gluster.org>
>  Subject: Re: [Gluster-users] Pretty much any operation related to Gluster
> mounted fs hangs for a while
>  =======================
>    Thanks for your input, Anirban.
>
> I ran the commands on both servers, with the following results:
>
>
> root@web3:/var/www/site-images# time getfattr -m . -d -e hex templates/assets/prod/temporary/13/user_1339200.png
>
> real 0m34.524s
> user 0m0.004s
> sys 0m0.000s
>
>
> root@web4:/var/www/site-images# time getfattr -m . -d -e hex templates/assets/prod/temporary/13/user_1339200.png
> getfattr: templates/assets/prod/temporary/13/user_1339200.png: Input/output error
>
> real 0m11.315s
> user 0m0.001s
> sys 0m0.003s
> root@web4:/var/www/site-images# ls templates/assets/prod/temporary/13/user_1339200.png
> ls: cannot access templates/assets/prod/temporary/13/user_1339200.png: Input/output error
>
>
    