[Gluster-devel] glusterfs3.2.7 split brain on a server, while it's normal on another server
Pranith Kumar K
pkarampu at redhat.com
Wed Jan 9 10:05:49 UTC 2013
On 01/09/2013 11:03 AM, Song wrote:
>
> Hi,
>
> We have a glusterfs cluster running version 3.2.7. The volume info is
> below:
>
> Volume Name: gfs1
>
> Type: Distributed-Replicate
>
> Status: Started
>
> Number of Bricks: 94 x 3 = 282
>
> Transport-type: tcp
>
> We native-mount the volume on all cluster servers. When we access the
> file "/XMTEXT/gfs1_000/000/000/095" on one server, it fails with a
> split-brain error.
>
> However, we can access the same file from another server.
>
> Also, after re-mounting the volume on the failing server, access to
> the same file works again.
>
> Has glusterfs cached some information? This has happened more
> than once.
>
> The log when the split brain occurs is as follows:
>
> [2013-01-07 09:57:29.554505] W
> [afr-common.c:931:afr_detect_self_heal_by_lookup_status]
> 0-gfs1-replicate-5: split brain detected during lookup of
> /XMTEXT/gfs1_000/000/000/095.
>
> [2013-01-07 09:57:29.554566] I
> [afr-common.c:1039:afr_launch_self_heal] 0-gfs1-replicate-5:
> background data gfid self-heal triggered. path:
> /XMTEXT/gfs1_000/000/000/095
>
> [2013-01-07 09:57:29.555299] I
> [afr-self-heal-common.c:1290:sh_missing_entries_create]
> 0-gfs1-replicate-5: no missing files - /XMTEXT/gfs1_000/000/000/095.
> proceeding to metadata check
>
> [2013-01-07 09:57:29.555507] I
> [afr-self-heal-common.c:1050:afr_sh_missing_entries_done]
> 0-gfs1-replicate-5: split brain found, aborting selfheal of
> /XMTEXT/gfs1_000/000/000/095
>
> [2013-01-07 09:57:29.555531] E
> [afr-self-heal-common.c:2190:afr_self_heal_completion_cbk]
> 0-gfs1-replicate-5: background data gfid self-heal failed on
> /XMTEXT/gfs1_000/000/000/095
>
> [2013-01-07 09:57:35.598229] W
> [afr-common.c:931:afr_detect_self_heal_by_lookup_status]
> 0-gfs1-replicate-5: split brain detected during lookup of
> /XMTEXT/gfs1_000/000/000/095.
>
> [2013-01-07 09:57:35.598282] I
> [afr-common.c:1039:afr_launch_self_heal] 0-gfs1-replicate-5:
> background data gfid self-heal triggered. path:
> /XMTEXT/gfs1_000/000/000/095
>
> [2013-01-07 09:57:35.598939] I
> [afr-self-heal-common.c:1290:sh_missing_entries_create]
> 0-gfs1-replicate-5: no missing files - /XMTEXT/gfs1_000/000/000/095.
> proceeding to metadata check
>
> [2013-01-07 09:57:35.599139] I
> [afr-self-heal-common.c:1050:afr_sh_missing_entries_done]
> 0-gfs1-replicate-5: split brain found, aborting selfheal of
> /XMTEXT/gfs1_000/000/000/095
>
> [2013-01-07 09:57:35.599176] E
> [afr-self-heal-common.c:2190:afr_self_heal_completion_cbk]
> 0-gfs1-replicate-5: background data gfid self-heal failed on
> /XMTEXT/gfs1_000/000/000/095
>
> [2013-01-07 09:57:38.192819] W
> [afr-common.c:931:afr_detect_self_heal_by_lookup_status]
> 0-gfs1-replicate-5: split brain detected during lookup of
> /XMTEXT/gfs1_000/000/000/095.
>
> [2013-01-07 09:57:38.192875] I
> [afr-common.c:1039:afr_launch_self_heal] 0-gfs1-replicate-5:
> background data gfid self-heal triggered. path:
> /XMTEXT/gfs1_000/000/000/095
>
> [2013-01-07 09:57:38.193486] I
> [afr-self-heal-common.c:1290:sh_missing_entries_create]
> 0-gfs1-replicate-5: no missing files - /XMTEXT/gfs1_000/000/000/095.
> proceeding to metadata check
>
> [2013-01-07 09:57:38.193708] I
> [afr-self-heal-common.c:1050:afr_sh_missing_entries_done]
> 0-gfs1-replicate-5: split brain found, aborting selfheal of
> /XMTEXT/gfs1_000/000/000/095
>
> [2013-01-07 09:57:38.193731] E
> [afr-self-heal-common.c:2190:afr_self_heal_completion_cbk]
> 0-gfs1-replicate-5: background data gfid self-heal failed on
> /XMTEXT/gfs1_000/000/000/095
>
> [2013-01-07 09:57:38.193937] W [afr-open.c:168:afr_open]
> 0-gfs1-replicate-5: failed to open as split brain seen, returning EIO
>
> [2013-01-07 09:57:38.194033] W [fuse-bridge.c:693:fuse_fd_cbk]
> 0-glusterfs-fuse: 3162527: OPEN() /XMTEXT/gfs1_000/000/000/095 => -1
> (Input/output error)
>
> [2013-01-07 10:08:12.569821] W
> [afr-common.c:931:afr_detect_self_heal_by_lookup_status]
> 0-gfs1-replicate-5: split brain detected during lookup of
> /XMTEXT/gfs1_000/000/000/095.
>
> [2013-01-07 10:08:12.569891] I
> [afr-common.c:1039:afr_launch_self_heal] 0-gfs1-replicate-5:
> background data gfid self-heal triggered. path:
> /XMTEXT/gfs1_000/000/000/095
>
> [2013-01-07 10:08:12.571538] I
> [afr-self-heal-common.c:1290:sh_missing_entries_create]
> 0-gfs1-replicate-5: no missing files - /XMTEXT/gfs1_000/000/000/095.
> proceeding to metadata check
>
> [2013-01-07 10:08:12.572684] I
> [afr-self-heal-common.c:1050:afr_sh_missing_entries_done]
> 0-gfs1-replicate-5: split brain found, aborting selfheal of
> /XMTEXT/gfs1_000/000/000/095
>
> [2013-01-07 10:08:12.572732] E
> [afr-self-heal-common.c:2190:afr_self_heal_completion_cbk]
> 0-gfs1-replicate-5: background data gfid self-heal failed on
> /XMTEXT/gfs1_000/000/000/095
>
> [2013-01-07 10:08:12.580006] W [afr-open.c:168:afr_open]
> 0-gfs1-replicate-5: failed to open as split brain seen, returning EIO
>
> [2013-01-07 10:08:12.580103] W [fuse-bridge.c:693:fuse_fd_cbk]
> 0-glusterfs-fuse: 3164490: OPEN() /XMTEXT/gfs1_000/000/000/095 => -1
> (Input/output error)
>
> Thanks!
Song,
It seems like the file is in gfid split-brain. To confirm, could
you provide the output of the following command, run against the
file on each of the backends?
getfattr -d -m . -e hex <file-in-split-brain>
Pranith.
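
Once the getfattr output has been collected from each backend, the gfid comparison could be sketched roughly as below. This is only an illustration, not part of glusterfs: the function names, the brick labels and the sample xattr values are all hypothetical, and it assumes the usual getfattr dump layout of a "# file: ..." comment line followed by name=0x... lines. If the trusted.gfid value differs between replicas of the same file, that is consistent with a gfid split-brain.

```python
def parse_getfattr(text):
    """Parse `getfattr -d -m . -e hex` output into {xattr_name: hex_value}."""
    attrs = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# file: ..." comment line.
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:
            attrs[name] = value.lower()
    return attrs


def gfids_by_brick(outputs):
    """Map each brick label to its trusted.gfid value (None if absent)."""
    return {
        brick: parse_getfattr(text).get("trusted.gfid")
        for brick, text in outputs.items()
    }


def is_gfid_split_brain(outputs):
    """Return True if the replicas report more than one distinct gfid."""
    gfids = {g for g in gfids_by_brick(outputs).values() if g is not None}
    return len(gfids) > 1


# Hypothetical sample data: brick labels, file path and gfid values are
# made up for illustration and do not come from the reported cluster.
SAMPLE = {
    "brick-a": "# file: path/to/file\ntrusted.gfid=0x" + "11" * 16 + "\n",
    "brick-b": "# file: path/to/file\ntrusted.gfid=0x" + "11" * 16 + "\n",
    "brick-c": "# file: path/to/file\ntrusted.gfid=0x" + "22" * 16 + "\n",
}

if __name__ == "__main__":
    for brick, gfid in sorted(gfids_by_brick(SAMPLE).items()):
        print(brick, gfid)
    print("gfid mismatch:", is_gfid_split_brain(SAMPLE))
```

In practice the dictionary would be filled with the real getfattr output captured on each of the three bricks of the affected replica set.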