[Gluster-users] Gluster does not seem to detect a split-brain situation

Sjors Gielen sjors at sjorsgielen.nl
Sun Jun 7 19:21:19 UTC 2015


Oops! I accidentally ran the command as non-root on Curacao; that's why there
was no output. The actual output is:

curacao# getfattr -m . -d -e hex /export/sdb1/data/Case/21000355/studies.dat
getfattr: Removing leading '/' from absolute path names
# file: export/sdb1/data/Case/21000355/studies.dat
trusted.afr.data-client-0=0x000000000000000000000000
trusted.afr.data-client-1=0x000000000000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.gfid=0xfb34574974cf4804b8b80789738c0f81
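
To make the trusted.afr.* values above easier to read, here is a minimal
Python sketch that decodes them. It assumes the commonly documented AFR
changelog layout of three big-endian 32-bit counters (pending data, metadata
and entry operations); that layout is an assumption about this GlusterFS
version, and the function names are hypothetical helpers, not part of any
Gluster tool.

```python
import struct


def parse_getfattr_dump(text):
    """Collect name=value pairs from `getfattr -d -e hex` output."""
    attrs = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, the "# file:" header and getfattr's own notices.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        attrs[name] = value
    return attrs


def decode_afr_changelog(hex_value):
    """Split a trusted.afr.* value into pending counters.

    Assumption: 12 bytes, three big-endian uint32 values in the order
    data, metadata, entry.
    """
    digits = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


# The curacao output from this message, pasted verbatim.
CURACAO_DUMP = """\
# file: export/sdb1/data/Case/21000355/studies.dat
trusted.afr.data-client-0=0x000000000000000000000000
trusted.afr.data-client-1=0x000000000000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.gfid=0xfb34574974cf4804b8b80789738c0f81
"""

if __name__ == "__main__":
    for name, value in parse_getfattr_dump(CURACAO_DUMP).items():
        if name.startswith("trusted.afr."):
            print(name, decode_afr_changelog(value))
```

Under that assumption, every counter on curacao decodes to zero, i.e. no
pending operations are recorded against either brick.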

For reference, the output on bonaire:

bonaire# getfattr -m . -d -e hex /export/sdb1/data/Case/21000355/studies.dat
getfattr: Removing leading '/' from absolute path names
# file: export/sdb1/data/Case/21000355/studies.dat
trusted.gfid=0xfb34574974cf4804b8b80789738c0f81
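
Both bricks report the same trusted.gfid. The "mismatching ino/dev" warning
quoted further down compares a file with its handle under .glusterfs, so a
rough way to check one file by hand might look like the sketch below. It
assumes the handle layout visible in that log line
(.glusterfs/<first two hex digits>/<next two>/<gfid as UUID>) and that the
handle of a regular file is a hard link to it; the function names are
hypothetical.

```python
import os
import uuid


def gfid_handle_path(brick_root, gfid_hex):
    """Build the .glusterfs handle path for a trusted.gfid value.

    Assumption: handles live at .glusterfs/aa/bb/<uuid>, where aa and bb
    are the first two pairs of hex digits of the gfid.
    """
    digits = gfid_hex[2:] if gfid_hex.lower().startswith("0x") else gfid_hex
    gfid = str(uuid.UUID(hex=digits))
    return os.path.join(brick_root, ".glusterfs", gfid[0:2], gfid[2:4], gfid)


def same_inode(path_a, path_b):
    """Return True if both paths refer to the same inode on the same device."""
    a, b = os.stat(path_a), os.stat(path_b)
    return (a.st_ino, a.st_dev) == (b.st_ino, b.st_dev)


if __name__ == "__main__":
    brick = "/export/sdb1/data"
    handle = gfid_handle_path(brick, "0xfb34574974cf4804b8b80789738c0f81")
    print(handle)
    target = os.path.join(brick, "Case/21000355/studies.dat")
    # Only meaningful when run as root on a brick server.
    if os.path.exists(target) and os.path.exists(handle):
        print("same inode:", same_inode(target, handle))
```

Run as root on a brick server, this would print the expected handle path and
whether the file and its handle share an inode.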

On Sun, 7 Jun 2015 at 21:13, Sjors Gielen <sjors at sjorsgielen.nl> wrote:

> I'm reading about quorums; I haven't set up anything like that yet.
>
> (In reply to Joe Julian, who responded off-list)
>
> The output of getfattr on bonaire:
>
> bonaire# getfattr -m . -d -e hex /export/sdb1/data/Case/21000355/studies.dat
> getfattr: Removing leading '/' from absolute path names
> # file: export/sdb1/data/Case/21000355/studies.dat
> trusted.gfid=0xfb34574974cf4804b8b80789738c0f81
>
> On curacao, the command gives no output.
>
> From `gluster volume status`, it seems that while the brick
> "curacao:/export/sdb1/data" is online, it has no associated port number.
> Curacao can connect to the port number provided by Bonaire just fine. There
> are no firewalls on or between the two machines; they are on the same
> subnet, connected by Ethernet cables and two switches.
>
> By the way, warning messages have just started appearing in
> /var/log/glusterfs/bricks/export-sdb1-data.log on Bonaire, saying
> "mismatching ino/dev between file X and handle Y". They may have started
> only now, even though I launched the full self-heal hours ago.
>
> [2015-06-07 19:10:39.624393] W [posix-handle.c:727:posix_handle_hard]
> 0-data-posix: mismatching ino/dev between file
> /export/sdb1/data/Archive/S21/21008971/studies.dat (9127104621/2065) and
> handle
> /export/sdb1/data/.glusterfs/97/c2/97c2a65d-36e0-4566-a5c1-5925f97af1fd
> (9190215976/2065)
>
> Thanks again!
> Sjors
>
> On Sun, 7 Jun 2015 at 19:13, Sjors Gielen <sjors at sjorsgielen.nl> wrote:
>
>> Hi all,
>>
>> I work at a small, 8-person company that uses Gluster for its primary
>> data storage. We have a volume called "data" that is replicated over two
>> servers (details below). This worked perfectly for over a year, but lately
>> we've been noticing some mismatches between the two bricks, so it seems
>> there has been some split-brain situation that is not being detected or
>> resolved. I have two questions about this:
>>
>> 1) I expected Gluster to (eventually) detect a situation like this; why
>> doesn't it?
>> 2) How do I fix this situation? I've tried an explicit 'heal', but that
>> didn't seem to change anything.
>>
>> Thanks a lot for your help!
>> Sjors
>>
>> ------8<------
>>
>> Volume & peer info: http://pastebin.com/PN7tRXdU
>> curacao# md5sum /export/sdb1/data/Case/21000355/studies.dat
>> 7bc2daec6be953ffae920d81fe6fa25c  /export/sdb1/data/Case/21000355/studies.dat
>> bonaire# md5sum /export/sdb1/data/Case/21000355/studies.dat
>> 28c950a1e2a5f33c53a725bf8cd72681  /export/sdb1/data/Case/21000355/studies.dat
>>
>> # mallorca is one of the clients
>> mallorca# md5sum /data/Case/21000355/studies.dat
>> 7bc2daec6be953ffae920d81fe6fa25c  /data/Case/21000355/studies.dat
>>
>> I expected an input/output error after reading this file, because of the
>> split-brain situation, but got none. There are no entries in the GlusterFS
>> logs of either bonaire or curacao.
>>
>> bonaire# gluster volume heal data full
>> Launching heal operation to perform full self heal on volume data has been successful
>> Use heal info commands to check status
>> bonaire# gluster volume heal data info
>> Brick bonaire:/export/sdb1/data/
>> Number of entries: 0
>>
>> Brick curacao:/export/sdb1/data/
>> Number of entries: 0
>>
>> (Same output on curacao, and hours after this, the md5sums on both bricks
>> still differ.)
>>
>> curacao# gluster --version
>> glusterfs 3.6.2 built on Mar  2 2015 14:05:34
>> Repository revision: git://git.gluster.com/glusterfs.git
>> (Same version on Bonaire)
>>
>