[Gluster-users] frequent split-brain with Gluster + Samba + Win client

Pranith Kumar Karampuri pkarampu at redhat.com
Thu Aug 7 01:14:43 UTC 2014


Hi Tiemen,
 Judging from the logs you pasted, there don't seem to be any 
split-brains; the volume is just performing self-heals. Which version of 
glusterfs are you using? Self-heal can be delayed while data operations 
from the mount are in progress, because those operations are given 
higher priority. Missing files should be created once self-heal 
completes on their parent directory.
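
To illustrate the difference between a pending self-heal and a 
split-brain, here is a minimal Python sketch that reads the "Pending 
matrix" values AFR prints in the client log. It assumes a two-brick 
replica, and it assumes that row i holds what brick i records as pending 
against each brick, so a split-brain would show non-zero counts in both 
off-diagonal cells. The function names are made up for this example and 
are not part of glusterfs.

```python
import re

# Matches a 2x2 matrix as printed in the AFR self-heal completion message,
# e.g. "Pending matrix:  [ [ 0 0 ] [ 1 0 ] ]".
PENDING_RE = re.compile(
    r"Pending\s+matrix:\s*\[\s*\[\s*(\d+)\s+(\d+)\s*\]"
    r"\s*\[\s*(\d+)\s+(\d+)\s*\]\s*\]"
)

# Fragment of the log line quoted further down in this thread, unwrapped.
SAMPLE = (
    "data - Pending matrix:  [ [ 0 0 ] [ 1 0 ] ]  metadata self heal from "
    "source fl-webroot-client-1 to fl-webroot-client-0,  metadata - "
    "Pending matrix:  [ [ 0 0 ] [ 2 0 ] ]"
)


def parse_pending_matrices(log_text):
    """Return every 2x2 pending matrix in log_text as [[a, b], [c, d]]."""
    return [
        [[int(a), int(b)], [int(c), int(d)]]
        for a, b, c, d in PENDING_RE.findall(log_text)
    ]


def classify(matrix):
    """Classify a 2x2 pending matrix (assumed layout: row = accusing brick)."""
    zero_blames_one = matrix[0][1] > 0
    one_blames_zero = matrix[1][0] > 0
    if zero_blames_one and one_blames_zero:
        # Each brick records pending operations against the other.
        return "split-brain"
    if one_blames_zero:
        return "heal 1 -> 0"
    if zero_blames_one:
        return "heal 0 -> 1"
    return "clean"


if __name__ == "__main__":
    for m in parse_pending_matrices(SAMPLE):
        print(m, classify(m))
```

Under those assumptions, both matrices in the quoted log have only 
client-1 recording pending operations against client-0, which is an 
ordinary heal from client-1 to client-0 and not a split-brain.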

Pranith

On 08/07/2014 01:40 AM, Tiemen Ruiten wrote:
> Sorry, I seem to have messed up the subject.
>
> I should add, I'm mounting these volumes through GlusterFS FUSE, not 
> the Samba VFS plugin.
>
> On 06-08-14 21:47, Tiemen Ruiten wrote:
>> Hello,
>>
>> I'm running into some serious problems with Gluster + CTDB and Samba. 
>> What I have:
>>
>> A two-node replicated Gluster cluster set up to share volumes using 
>> Samba, configured according to this guide: 
>> https://download.gluster.org/pub/gluster/glusterfs/doc/Gluster_CTDB_setup.v1.pdf
>>
>> When we edit or copy files into the volume via SMB (from a Windows 
>> client accessing a Samba file share), this inevitably leads to 
>> a split-brain scenario. For example:
>>
>> gluster> volume heal fl-webroot info
>> Brick ankh.int.rdmedia.com:/export/glu/web/flash/webroot/
>> <gfid:0b162618-e46f-4921-92d0-c0fdb5290bf5>
>> <gfid:a259de7d-69fc-47bd-90e7-06a33b3e6cc8>
>> Number of entries: 2
>>
>> Brick morpork.int.rdmedia.com:/export/glu/web/flash/webroot/
>> /LandingPage_Saturn_Production/images
>> /LandingPage_Saturn_Production
>> /LandingPage_Saturn_Production/Services/v2
>> /LandingPage_Saturn_Production/images/country/be
>> /LandingPage_Saturn_Production/bin
>> /LandingPage_Saturn_Production/Services
>> /LandingPage_Saturn_Production/images/generic
>> /LandingPage_Saturn_Production/aspnet_client/system_web
>> /LandingPage_Saturn_Production/images/country
>> /LandingPage_Saturn_Production/Scripts
>> /LandingPage_Saturn_Production/aspnet_client
>> /LandingPage_Saturn_Production/images/country/fr
>> Number of entries: 12
>>
>>
>> Sometimes self-heal works, sometimes it doesn't:
>>
>> [2014-08-06 19:32:17.986790] E 
>> [afr-self-heal-common.c:2868:afr_log_self_heal_completion_status] 
>> 0-fl-webroot-replicate-0:  entry self heal  failed,   on 
>> /LandingPage_Saturn_Production/Services/v2
>> [2014-08-06 19:32:18.008330] W 
>> [client-rpc-fops.c:2772:client3_3_lookup_cbk] 0-fl-webroot-client-0: 
>> remote operation failed: No such file or directory. Path: 
>> <gfid:a89d7a07-2e3d-41ee-adcc-cb2fba3d2282> 
>> (a89d7a07-2e3d-41ee-adcc-cb2fba3d2282)
>> [2014-08-06 19:32:18.024057] I 
>> [afr-self-heal-common.c:2868:afr_log_self_heal_completion_status] 
>> 0-fl-webroot-replicate-0:  gfid or missing entry self heal is 
>> started, metadata self heal  is successfully completed, backgroung 
>> data self heal  is successfully completed,  data self heal from 
>> fl-webroot-client-1  to sinks fl-webroot-client-0, with 0 bytes on 
>> fl-webroot-client-0, 168 bytes on fl-webroot-client-1,  data - 
>> Pending matrix:  [ [ 0 0 ] [ 1 0 ] ]  metadata self heal from source 
>> fl-webroot-client-1 to fl-webroot-client-0,  metadata - Pending 
>> matrix:  [ [ 0 0 ] [ 2 0 ] ], on 
>> /LandingPage_Saturn_Production/Services/v2/PartnerApiService.asmx
>>
>> *More seriously, some files are simply missing on one of the nodes 
>> without any error in the logs or notice when running gluster volume 
>> heal $volume info.*
>>
>> Of course I can provide any log file necessary.
>>
>> -- 
>> Tiemen Ruiten
>> Systems Engineer
>> R&D Media
