[Gluster-users] frequent split-brain with Gluster + Samba + Win client
Pranith Kumar Karampuri
pkarampu at redhat.com
Thu Aug 7 09:53:25 UTC 2014
On 08/07/2014 03:18 PM, Tiemen Ruiten wrote:
> Hello Pranith,
>
> Thanks for your reply. I'm using 3.5.2.
>
> Is it possible that Windows doesn't release the files after a write
> happens?
>
> I ask because the self-heal often simply never occurs. Just this
> morning we discovered that a web server reading from the other node
> was still seeing pre-edit content in files that had been changed days ago.
>
> How can I ensure that everything syncs reliably and consistently when
> mounting from SMB? Is Samba VFS more reliable in this respect?
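> (If we were to switch to the vfs_glusterfs plugin, my understanding is
> that the share would be defined roughly like this in smb.conf; the
> option names are from the vfs_glusterfs module, the values are just
> our volume, so treat it as a sketch rather than a tested config:)
>
>   [fl-webroot]
>       vfs objects = glusterfs
>       glusterfs:volume = fl-webroot
>       glusterfs:volfile_server = localhost
>       path = /
>       read only = no
>       kernel share modes = no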
It should happen automatically. Even the mount *must* serve reads from
the good copy. In what scenario did you observe reads being served from
the stale brick?
Could you give the output of 'getfattr -d -m. -e hex
<path-of-file-on-brick>' from both bricks?
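On a healthy replica the trusted.afr.* changelog counters are all zeroes
on both bricks; non-zero counters on one brick blaming the other mean
heals are pending, and non-zero counters on both bricks blaming each
other would be an actual split-brain. The output looks roughly like the
sketch below; the path and values are only an illustration built from
the file names in your logs:

  getfattr -d -m. -e hex /export/glu/web/flash/webroot/LandingPage_Saturn_Production/Services/v2/PartnerApiService.asmx
  # file: export/glu/web/flash/webroot/LandingPage_Saturn_Production/Services/v2/PartnerApiService.asmx
  trusted.afr.fl-webroot-client-0=0x000000020000000100000000
  trusted.afr.fl-webroot-client-1=0x000000000000000000000000
  trusted.gfid=0xa89d7a072e3d41eeadcccb2fba3d2282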
Is it possible to provide self-heal-daemon logs so that we can inspect
what is happening?
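The self-heal daemon normally logs to /var/log/glusterfs/glustershd.log
on each node (the exact path can differ per distribution); the last few
hundred lines from both nodes around the time of a failed heal would be
enough, for example:

  tail -n 500 /var/log/glusterfs/glustershd.log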
Pranith
>
> Tiemen
>
> On 7 August 2014 03:14, Pranith Kumar Karampuri <pkarampu at redhat.com
> <mailto:pkarampu at redhat.com>> wrote:
>
> hi Tiemen,
> From the logs you have pasted, it doesn't look like there are any
> split-brains; the volume is just performing self-heals. What version
> of glusterfs are you using? Self-heals sometimes don't happen while
> data operations from the mount are in progress, because those are
> given higher priority. Missing files should be created once the
> self-heal completes on the parent directory of those files.
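> If the mount stays busy for long periods, you can also kick the heal
> off by hand once the writes settle and watch the pending entries
> drain, for example (volume name taken from your output):
>
>   gluster volume heal fl-webroot
>   gluster volume heal fl-webroot info
>   # force a full crawl if entries never drain on their own
>   gluster volume heal fl-webroot full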
>
> Pranith
>
>
> On 08/07/2014 01:40 AM, Tiemen Ruiten wrote:
>> Sorry, I seem to have messed up the subject.
>>
>> I should add, I'm mounting these volumes through GlusterFS FUSE,
>> not the Samba VFS plugin.
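>> For reference, the current setup is just a FUSE mount of the volume
>> with an ordinary Samba share pointed at it; schematically something
>> like this (mount point and share options simplified):
>>
>>   # FUSE mount of the replicated volume on the Samba/CTDB node
>>   mount -t glusterfs ankh.int.rdmedia.com:/fl-webroot /mnt/fl-webroot
>>
>>   # smb.conf share exported from that mount
>>   [fl-webroot]
>>       path = /mnt/fl-webroot
>>       read only = no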
>>
>> On 06-08-14 21:47, Tiemen Ruiten wrote:
>>> Hello,
>>>
>>> I'm running into some serious problems with Gluster + CTDB and
>>> Samba. What I have:
>>>
>>> A two-node replicated Gluster cluster set up to share volumes over
>>> Samba, configured according to this guide:
>>> https://download.gluster.org/pub/gluster/glusterfs/doc/Gluster_CTDB_setup.v1.pdf
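>>> (The volume itself is a plain two-brick replica, created more or less
>>> like this, with the same brick paths that show up in the heal output
>>> below:)
>>>
>>>   gluster volume create fl-webroot replica 2 \
>>>       ankh.int.rdmedia.com:/export/glu/web/flash/webroot \
>>>       morpork.int.rdmedia.com:/export/glu/web/flash/webroot
>>>   gluster volume start fl-webroot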
>>>
>>> When we edit or copy files on the volume via SMB (from a Windows
>>> client accessing a Samba file share), this inevitably leads to a
>>> split-brain scenario. For example:
>>>
>>> gluster> volume heal fl-webroot info
>>> Brick ankh.int.rdmedia.com:/export/glu/web/flash/webroot/
>>> <gfid:0b162618-e46f-4921-92d0-c0fdb5290bf5>
>>> <gfid:a259de7d-69fc-47bd-90e7-06a33b3e6cc8>
>>> Number of entries: 2
>>>
>>> Brick morpork.int.rdmedia.com:/export/glu/web/flash/webroot/
>>> /LandingPage_Saturn_Production/images
>>> /LandingPage_Saturn_Production
>>> /LandingPage_Saturn_Production/Services/v2
>>> /LandingPage_Saturn_Production/images/country/be
>>> /LandingPage_Saturn_Production/bin
>>> /LandingPage_Saturn_Production/Services
>>> /LandingPage_Saturn_Production/images/generic
>>> /LandingPage_Saturn_Production/aspnet_client/system_web
>>> /LandingPage_Saturn_Production/images/country
>>> /LandingPage_Saturn_Production/Scripts
>>> /LandingPage_Saturn_Production/aspnet_client
>>> /LandingPage_Saturn_Production/images/country/fr
>>> Number of entries: 12
>>>
>>> gluster> volume heal fl-webroot info
>>> Brick ankh.int.rdmedia.com:/export/glu/web/flash/webroot/
>>> <gfid:0b162618-e46f-4921-92d0-c0fdb5290bf5>
>>> <gfid:a259de7d-69fc-47bd-90e7-06a33b3e6cc8>
>>> Number of entries: 2
>>>
>>> Brick morpork.int.rdmedia.com:/export/glu/web/flash/webroot/
>>> /LandingPage_Saturn_Production/images
>>> /LandingPage_Saturn_Production
>>> /LandingPage_Saturn_Production/Services/v2
>>> /LandingPage_Saturn_Production/images/country/be
>>> /LandingPage_Saturn_Production/bin
>>> /LandingPage_Saturn_Production/Services
>>> /LandingPage_Saturn_Production/images/generic
>>> /LandingPage_Saturn_Production/aspnet_client/system_web
>>> /LandingPage_Saturn_Production/images/country
>>> /LandingPage_Saturn_Production/Scripts
>>> /LandingPage_Saturn_Production/aspnet_client
>>> /LandingPage_Saturn_Production/images/country/fr
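>>> (To check whether these are real split-brains rather than just
>>> pending heals, we also run the following, assuming the info
>>> split-brain subcommand behaves the same on 3.5:)
>>>
>>>   gluster volume heal fl-webroot info split-brain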
>>>
>>>
>>>
>>> Sometimes self-heal works, sometimes it doesn't:
>>>
>>> [2014-08-06 19:32:17.986790] E
>>> [afr-self-heal-common.c:2868:afr_log_self_heal_completion_status] 0-fl-webroot-replicate-0:
>>> entry self heal failed, on
>>> /LandingPage_Saturn_Production/Services/v2
>>> [2014-08-06 19:32:18.008330] W
>>> [client-rpc-fops.c:2772:client3_3_lookup_cbk]
>>> 0-fl-webroot-client-0: remote operation failed: No such file or
>>> directory. Path: <gfid:a89d7a07-2e3d-41ee-adcc-cb2fba3d2282>
>>> (a89d7a07-2e3d-41ee-adcc-cb2fba3d2282)
>>> [2014-08-06 19:32:18.024057] I
>>> [afr-self-heal-common.c:2868:afr_log_self_heal_completion_status] 0-fl-webroot-replicate-0:
>>> gfid or missing entry self heal is started, metadata self heal
>>> is successfully completed, backgroung data self heal is
>>> successfully completed, data self heal from fl-webroot-client-1
>>> to sinks fl-webroot-client-0, with 0 bytes on
>>> fl-webroot-client-0, 168 bytes on fl-webroot-client-1, data -
>>> Pending matrix: [ [ 0 0 ] [ 1 0 ] ] metadata self heal from
>>> source fl-webroot-client-1 to fl-webroot-client-0, metadata -
>>> Pending matrix: [ [ 0 0 ] [ 2 0 ] ], on
>>> /LandingPage_Saturn_Production/Services/v2/PartnerApiService.asmx
>>>
>>> *More seriously, some files are simply missing on one of the nodes,
>>> without any error in the logs or any mention in the output of
>>> gluster volume heal $volume info.*
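>>> (A quick ad-hoc way to spot such files is to diff the brick listings
>>> directly, excluding the .glusterfs metadata directory, e.g.:)
>>>
>>>   ssh ankh.int.rdmedia.com 'cd /export/glu/web/flash/webroot && find . -path ./.glusterfs -prune -o -print | sort' > ankh.lst
>>>   ssh morpork.int.rdmedia.com 'cd /export/glu/web/flash/webroot && find . -path ./.glusterfs -prune -o -print | sort' > morpork.lst
>>>   diff ankh.lst morpork.lst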
>>>
>>> Of course I can provide any log file necessary.
>
>