[Gluster-users] Pretty much any operation related to Gluster mounted fs hangs for a while
Joe Julian
joe at julianfamily.org
Mon Jan 26 22:36:11 UTC 2015
Check your client logs. Perhaps the client isn't actually connecting to
both servers.
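
For anyone following along, here is a rough sketch of how the xattr values in the brick output quoted below can be read. It is plain Python and not part of Gluster. The helper names (parse_getfattr, decode_afr_changelog, gfid_of) are made up for illustration, and it assumes the usual AFR changelog layout of three big-endian 32-bit pending counters (data, metadata, entry). The sample strings are copied from the getfattr output in this thread.

```python
import struct
import uuid


def parse_getfattr(text):
    """Collect name=value pairs from `getfattr -d -e hex` output."""
    attrs = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# file: ..." header line.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        attrs[name] = value
    return attrs


def _hex_to_bytes(value):
    """Turn a '0x...' string as printed by getfattr into raw bytes."""
    return bytes.fromhex(value[2:] if value.startswith("0x") else value)


def decode_afr_changelog(value):
    """Split a trusted.afr.* value into its three pending counters.

    Assumption: 12 bytes, laid out as data / metadata / entry,
    each an unsigned 32-bit big-endian integer.
    """
    raw = _hex_to_bytes(value)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


def gfid_of(attrs):
    """Return trusted.gfid as a uuid.UUID so two bricks can be compared."""
    return uuid.UUID(bytes=_hex_to_bytes(attrs["trusted.gfid"]))


# Values copied from the brick output quoted in this thread.
WEB3 = """\
# file: templates/assets/prod/temporary/13/user_1339200.png
trusted.afr.site-images-client-0=0x000000000000000400000000
trusted.afr.site-images-client-1=0x000000020000000900000000
trusted.gfid=0x10e5894c474a4cb1898b71e872cdf527
"""

WEB4 = """\
# file: templates/assets/prod/temporary/13/user_1339200.png
trusted.afr.site-images-client-0=0x000000000000000000000000
trusted.afr.site-images-client-1=0x000000000000000000000000
trusted.gfid=0xd02f14fcb6724ceba4a330eb606910f3
"""

if __name__ == "__main__":
    bricks = {"web3": parse_getfattr(WEB3), "web4": parse_getfattr(WEB4)}
    for host, attrs in bricks.items():
        print(host, "gfid", gfid_of(attrs))
        for name, value in sorted(attrs.items()):
            if name.startswith("trusted.afr."):
                print("  ", name, decode_afr_changelog(value))
    if gfid_of(bricks["web3"]) != gfid_of(bricks["web4"]):
        print("GFID mismatch between bricks")
```

With the values from this thread, the web3 brick shows non-zero pending counters while web4 shows all zeros, and the two trusted.gfid values differ, which matches the mismatched-GFID diagnosis further down the thread.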
On 01/26/2015 02:12 PM, Tiago Santos wrote:
> That's what I meant. Sorry for the confusion.
>
> I'm writing on Client1 (same server as Brick1). Client2 (mounted
> Brick2, on server2) has nothing writing to it (so far).
>
> What I'm wondering is how I ended up with a split-brain if I'm only
> writing on one client.
>
> On Mon, Jan 26, 2015 at 8:04 PM, Joe Julian <joe at julianfamily.org> wrote:
>
> Nothing but GlusterFS should be writing to bricks. Mount a client
> and write there.
>
>
> On 01/26/2015 01:38 PM, Tiago Santos wrote:
>> Right.
>>
>> I have Brick1 being constantly written to, but nothing is
>> writing to Brick2. It just gets "healed" data from Brick1.
>>
>> This setup is not in production yet, and there are no
>> applications using that data. I have rsyncs constantly updating
>> Brick1 (bringing data from the production servers), and then Gluster
>> updates Brick2.
>>
>> Which makes me wonder how I could be creating files on multiple
>> replicas during a split-brain.
>>
>> Could it be that, after a split-brain event, I keep updating
>> versions of the same file on Brick1 (only), and Gluster sees them
>> as different versions and gets confused?
>>
>> Anyway, while we talk I'm going to run Joe's precious procedure on
>> split-brain recovery.
>>
>>
>>
>>
>>
>> On Mon, Jan 26, 2015 at 7:23 PM, Joe Julian <joe at julianfamily.org> wrote:
>>
>> Mismatched GFIDs would happen if a file is created on
>> multiple replicas during a split-brain event. The GFID is
>> assigned at file creation.
>>
>>
>> On 01/26/2015 01:04 PM, A Ghoshal wrote:
>>
>> Yep, so it is indeed a split-brain caused by a mismatch
>> of the trusted.gfid attribute.
>>
>> Sadly, I don't know precisely what causes it.
>> Communication loss might be one of the triggers. I am
>> guessing the files with the problem are dynamic, correct?
>> In our setup (also replica 2), communication is never a
>> problem, but we do see this when one of the servers
>> reboots. Maybe it is some obscure race between background
>> self-heal and the self-heal daemon...
>>
>> In any case, the normal procedure for split-brain recovery
>> should work for you if you want to get your files working
>> again. It's easy to find on Google. I use the
>> instructions on Joe Julian's blog myself.
>>
>>
>> -----Tiago Santos <tiago at musthavemenus.com> wrote: -----
>>
>> =======================
>> To: A Ghoshal <a.ghoshal at tcs.com>
>> From: Tiago Santos <tiago at musthavemenus.com>
>> Date: 01/27/2015 02:11AM
>> Cc: gluster-users <gluster-users at gluster.org>
>> Subject: Re: [Gluster-users] Pretty much any operation related to Gluster mounted fs hangs for a while
>> =======================
>> Oh, right!
>>
>> Here are the outputs:
>>
>>
>> root at web3:/export/images1-1/brick# time getfattr -m . -d -e hex templates/assets/prod/temporary/13/user_1339200.png
>> # file: templates/assets/prod/temporary/13/user_1339200.png
>> trusted.afr.site-images-client-0=0x000000000000000400000000
>> trusted.afr.site-images-client-1=0x000000020000000900000000
>> trusted.gfid=0x10e5894c474a4cb1898b71e872cdf527
>>
>> real 0m0.024s
>> user 0m0.001s
>> sys 0m0.001s
>>
>>
>>
>> root at web4:/export/images2-1/brick# time getfattr -m . -d -e hex templates/assets/prod/temporary/13/user_1339200.png
>> # file: templates/assets/prod/temporary/13/user_1339200.png
>> trusted.afr.site-images-client-0=0x000000000000000000000000
>> trusted.afr.site-images-client-1=0x000000000000000000000000
>> trusted.gfid=0xd02f14fcb6724ceba4a330eb606910f3
>>
>> real 0m0.003s
>> user 0m0.000s
>> sys 0m0.006s
>>
>>
>> Not sure exactly what that means. I'm googling, and would
>> appreciate it if you guys could shed some light.
>>
>> Thanks!
>> --
>> Tiago
>>
>>
>>
>>
>> On Mon, Jan 26, 2015 at 6:16 PM, A Ghoshal <a.ghoshal at tcs.com> wrote:
>>
>> Actually, you ran getfattr on the volume mount, which is
>> why the requisite extended attributes never showed up.
>>
>> Your bricks are mounted elsewhere:
>> /export/images1-1/brick and /export/images2-1/brick
>>
>> By the way, what version of Linux do you use? And are the
>> files you see the input/output errors on soft links?
>>
>> -----Tiago Santos <tiago at musthavemenus.com> wrote: -----
>>
>> =======================
>> To: A Ghoshal <a.ghoshal at tcs.com>
>> From: Tiago Santos <tiago at musthavemenus.com>
>> Date: 01/27/2015 12:20AM
>> Cc: gluster-users <gluster-users at gluster.org>
>> Subject: Re: [Gluster-users] Pretty much any operation related to Gluster mounted fs hangs for a while
>> =======================
>> Thanks for your input, Anirban.
>>
>> I ran the commands on both servers, with the
>> following results:
>>
>>
>> root at web3:/var/www/site-images# time getfattr -m . -d -e hex templates/assets/prod/temporary/13/user_1339200.png
>>
>> real 0m34.524s
>> user 0m0.004s
>> sys 0m0.000s
>>
>>
>> root at web4:/var/www/site-images# time getfattr -m . -d -e hex templates/assets/prod/temporary/13/user_1339200.png
>> getfattr: templates/assets/prod/temporary/13/user_1339200.png: Input/output error
>>
>> real 0m11.315s
>> user 0m0.001s
>> sys 0m0.003s
>> root at web4:/var/www/site-images# ls templates/assets/prod/temporary/13/user_1339200.png
>> ls: cannot access templates/assets/prod/temporary/13/user_1339200.png: Input/output error
>>
>>
>>
>>
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org <mailto:Gluster-users at gluster.org>
>> http://www.gluster.org/mailman/listinfo/gluster-users
>>
>>
>>
>>
>>
>>
>> --
>> *Tiago Santos*
>> MustHaveMenus.com
>
>
>
>
> --
> *Tiago Santos*
> MustHaveMenus.com