[Gluster-users] frequent split-brain with Gluster + Samba + Win client

Tiemen Ruiten t.ruiten at rdmedia.com
Thu Aug 7 09:48:37 UTC 2014


Hello Pranith,

Thanks for your reply. I'm using 3.5.2.

Is it possible that Windows doesn't release the files after a write
happens?

I ask because the self-heal often never occurs. Just this morning we discovered
that when a web server read from the other node, some files that had been
changed days ago still had the content from before the edit.
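
To keep an eye on this, something like the following rough sketch could track the heal backlog per brick. It only parses text in the format printed by `gluster volume heal <volume> info` as quoted further down in this mail; the function names and the abridged sample are illustrative assumptions, not part of any Gluster tooling.

```python
def parse_heal_info(text):
    """Return {brick: [entries]} parsed from `volume heal <vol> info` text."""
    bricks = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("Brick "):
            # A new brick section starts; following lines belong to it.
            current = line[len("Brick "):]
            bricks[current] = []
        elif line.startswith("Number of entries:"):
            # Summary line; the count is recomputed from the entries.
            continue
        elif current is not None:
            bricks[current].append(line)
    return bricks


def unresolved_gfids(entries):
    """Entries listed only as <gfid:...>, i.e. without a resolved path."""
    return [e for e in entries if e.startswith("<gfid:")]


# Abridged from the output quoted in this thread (second brick shortened).
SAMPLE = """\
gluster> volume heal fl-webroot info
Brick ankh.int.rdmedia.com:/export/glu/web/flash/webroot/
<gfid:0b162618-e46f-4921-92d0-c0fdb5290bf5>
<gfid:a259de7d-69fc-47bd-90e7-06a33b3e6cc8>
Number of entries: 2

Brick morpork.int.rdmedia.com:/export/glu/web/flash/webroot/
/LandingPage_Saturn_Production/images
/LandingPage_Saturn_Production
"""

if __name__ == "__main__":
    for brick, entries in parse_heal_info(SAMPLE).items():
        print(brick, len(entries), "pending,",
              len(unresolved_gfids(entries)), "gfid-only")
```

Running this periodically and comparing the counts would show whether the backlog ever drains or whether the same entries stay listed for days.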

How can I ensure that everything syncs reliably and consistently when
clients write over SMB? Is the Samba VFS plugin more reliable in this respect?

Tiemen

On 7 August 2014 03:14, Pranith Kumar Karampuri <pkarampu at redhat.com> wrote:

> Hi Tiemen,
>
> From the logs you pasted, there don't appear to be any split-brains;
> Gluster is just performing self-heals. What version of glusterfs are you
> using? Self-heals are sometimes deferred while data operations from the
> mount are in progress, because those operations are given higher priority.
> Missing files should be created once the self-heal completes on their
> parent directory.
>
> Pranith
>
>
> On 08/07/2014 01:40 AM, Tiemen Ruiten wrote:
>
> Sorry, I seem to have messed up the subject.
>
> I should add, I'm mounting these volumes through GlusterFS FUSE, not the
> Samba VFS plugin.
>
> On 06-08-14 21:47, Tiemen Ruiten wrote:
>
>   Hello,
>
> I'm running into some serious problems with Gluster + CTDB and Samba. What
> I have:
>
>  A two-node replicated Gluster cluster, set up to share volumes through
> Samba according to this guide:
> https://download.gluster.org/pub/gluster/glusterfs/doc/Gluster_CTDB_setup.v1.pdf
>
>  When we edit files in the volume or copy files into it over SMB (from a
> Windows client connected to a Samba file share), this inevitably leads to
> a split-brain scenario. For example:
>
> gluster> volume heal fl-webroot info
> Brick ankh.int.rdmedia.com:/export/glu/web/flash/webroot/
> <gfid:0b162618-e46f-4921-92d0-c0fdb5290bf5>
> <gfid:a259de7d-69fc-47bd-90e7-06a33b3e6cc8>
> Number of entries: 2
>
> Brick morpork.int.rdmedia.com:/export/glu/web/flash/webroot/
> /LandingPage_Saturn_Production/images
> /LandingPage_Saturn_Production
> /LandingPage_Saturn_Production/Services/v2
> /LandingPage_Saturn_Production/images/country/be
> /LandingPage_Saturn_Production/bin
> /LandingPage_Saturn_Production/Services
> /LandingPage_Saturn_Production/images/generic
> /LandingPage_Saturn_Production/aspnet_client/system_web
> /LandingPage_Saturn_Production/images/country
> /LandingPage_Saturn_Production/Scripts
> /LandingPage_Saturn_Production/aspnet_client
> /LandingPage_Saturn_Production/images/country/fr
> Number of entries: 12
>
> gluster> volume heal fl-webroot info
> Brick ankh.int.rdmedia.com:/export/glu/web/flash/webroot/
> <gfid:0b162618-e46f-4921-92d0-c0fdb5290bf5>
> <gfid:a259de7d-69fc-47bd-90e7-06a33b3e6cc8>
> Number of entries: 2
>
> Brick morpork.int.rdmedia.com:/export/glu/web/flash/webroot/
> /LandingPage_Saturn_Production/images
> /LandingPage_Saturn_Production
> /LandingPage_Saturn_Production/Services/v2
> /LandingPage_Saturn_Production/images/country/be
> /LandingPage_Saturn_Production/bin
> /LandingPage_Saturn_Production/Services
> /LandingPage_Saturn_Production/images/generic
> /LandingPage_Saturn_Production/aspnet_client/system_web
> /LandingPage_Saturn_Production/images/country
> /LandingPage_Saturn_Production/Scripts
> /LandingPage_Saturn_Production/aspnet_client
> /LandingPage_Saturn_Production/images/country/fr
>
>
>
>  Sometimes self-heal works, sometimes it doesn't:
>
> [2014-08-06 19:32:17.986790] E
> [afr-self-heal-common.c:2868:afr_log_self_heal_completion_status]
> 0-fl-webroot-replicate-0:  entry self heal  failed,   on
> /LandingPage_Saturn_Production/Services/v2
> [2014-08-06 19:32:18.008330] W
> [client-rpc-fops.c:2772:client3_3_lookup_cbk] 0-fl-webroot-client-0: remote
> operation failed: No such file or directory. Path:
> <gfid:a89d7a07-2e3d-41ee-adcc-cb2fba3d2282>
> (a89d7a07-2e3d-41ee-adcc-cb2fba3d2282)
> [2014-08-06 19:32:18.024057] I
> [afr-self-heal-common.c:2868:afr_log_self_heal_completion_status]
> 0-fl-webroot-replicate-0:  gfid or missing entry self heal  is started,
> metadata self heal  is successfully completed, backgroung data self heal
> is successfully completed,  data self heal from fl-webroot-client-1  to
> sinks  fl-webroot-client-0, with 0 bytes on fl-webroot-client-0, 168 bytes
> on fl-webroot-client-1,  data - Pending matrix:  [ [ 0 0 ] [ 1 0 ] ]
> metadata self heal from source fl-webroot-client-1 to fl-webroot-client-0,
> metadata - Pending matrix:  [ [ 0 0 ] [ 2 0 ] ], on
> /LandingPage_Saturn_Production/Services/v2/PartnerApiService.asmx
>
>  *More seriously, some files are simply missing on one of the nodes
> without any error in the logs or notice when running gluster volume heal
> $volume info.*
>
>  Of course I can provide any log file necessary.
>
>