[Gluster-users] Failed to mount nfs due to split-brain and Input/Output Error

Anh Vo vtqanh at gmail.com
Tue Jul 3 20:17:18 UTC 2018


Actually, we just discovered that the heal info command returns
different results depending on which node of our 3-replica setup it is
run on.
When we ran it on node2, it did not report "/" as being in split-brain, but
when I run it on node0 and node1 I see:

x at gfs-vm001:~$ sudo gluster volume heal gv0 info | tee heal-info
Brick gfs-vm000:/gluster/brick/brick0
<gfid:81289110-867b-42ff-ba3b-1373a187032b>
/ - Is in split-brain

Status: Connected
Number of entries: 2

Brick gfs-vm001:/gluster/brick/brick0
/ - Is in split-brain

<gfid:81289110-867b-42ff-ba3b-1373a187032b>
Status: Connected
Number of entries: 2

Brick gfs-vm002:/gluster/brick/brick0
/ - Is in split-brain

Status: Connected
Number of entries: 1
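
One way to pin down the discrepancy is to save the heal info output from
each node and diff the per-brick entries. Below is a minimal Python sketch,
assuming the output has the same layout as above. parse_heal_info,
diff_heal_info and the file names in the usage note are made up for
illustration; they are not part of gluster.

```python
import sys


def parse_heal_info(text):
    """Parse `gluster volume heal <vol> info` output into {brick: set(entries)}."""
    bricks = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("Brick "):
            # A new brick section starts here.
            current = line[len("Brick "):]
            bricks[current] = set()
        elif line.startswith("Status:") or line.startswith("Number of entries:"):
            # Summary lines, not heal entries.
            continue
        elif current is not None:
            # Anything else inside a brick section is treated as an entry.
            bricks[current].add(line)
    return bricks


def diff_heal_info(a, b):
    """Return {brick: (only_in_a, only_in_b)} for bricks whose entries differ."""
    diff = {}
    for brick in sorted(set(a) | set(b)):
        entries_a = a.get(brick, set())
        entries_b = b.get(brick, set())
        if entries_a != entries_b:
            diff[brick] = (entries_a - entries_b, entries_b - entries_a)
    return diff


if __name__ == "__main__" and len(sys.argv) == 3:
    # Two saved outputs, e.g. produced with `... heal gv0 info | tee <file>`.
    with open(sys.argv[1]) as f1, open(sys.argv[2]) as f2:
        result = diff_heal_info(parse_heal_info(f1.read()), parse_heal_info(f2.read()))
    for brick, (only_first, only_second) in result.items():
        print(brick)
        print("  only in first: ", sorted(only_first))
        print("  only in second:", sorted(only_second))
```

Saved as, say, heal_diff.py (a hypothetical name), it would be run as
"python3 heal_diff.py heal-info.node1 heal-info.node2" against two captured
outputs. No output means both nodes reported the same entries for every brick.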


I ran getfattr -d -m . -e hex /gluster/brick/brick0 on all three nodes, and
node2's attributes differ from those on the other two:
node0:
sudo getfattr -d -m . -e hex /gluster/brick/brick0
getfattr: Removing leading '/' from absolute path names
# file: gluster/brick/brick0
trusted.afr.gv0-client-2=0x000000000000000100000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x7fa3aac372d543f987ed0c66b77f02e2

node1:
sudo getfattr -d -m . -e hex /gluster/brick/brick0
getfattr: Removing leading '/' from absolute path names
# file: gluster/brick/brick0
trusted.afr.gv0-client-2=0x000000000000000100000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x7fa3aac372d543f987ed0c66b77f02e2

node2:
sudo getfattr -d -m . -e hex /gluster/brick/brick0
getfattr: Removing leading '/' from absolute path names
# file: gluster/brick/brick0
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.gv0-client-0=0x000000000000000200000000
trusted.afr.gv0-client-1=0x000000000000000200000000
trusted.afr.gv0-client-2=0x000000000000000000000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x7fa3aac372d543f987ed0c66b77f02e2
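
To read these values: as far as I understand the AFR changelog format, each
trusted.afr.<volume>-client-N value is 12 bytes holding three big-endian
32-bit counters, for pending data, metadata and entry operations against
that brick. Below is a small Python sketch that decodes the values above
under that assumption. decode_afr_pending is a hypothetical helper name, and
that client-0/1/2 correspond to the bricks on node0/1/2 is also an
assumption based on brick order.

```python
import struct


def decode_afr_pending(hex_value):
    """Split a trusted.afr.* hex value into data/metadata/entry pending counters."""
    digits = hex_value[2:] if hex_value.startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    # Assumed layout: three unsigned 32-bit integers, big-endian.
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


# Values copied from the getfattr output above.
xattrs = {
    "node0": {"gv0-client-2": "0x000000000000000100000000"},
    "node1": {"gv0-client-2": "0x000000000000000100000000"},
    "node2": {
        "gv0-client-0": "0x000000000000000200000000",
        "gv0-client-1": "0x000000000000000200000000",
        "gv0-client-2": "0x000000000000000000000000",
    },
}


def summarize(all_xattrs):
    """Return one readable line per (node, client) pair."""
    lines = []
    for node, attrs in sorted(all_xattrs.items()):
        for client, value in sorted(attrs.items()):
            lines.append("%s -> %s: %s" % (node, client, decode_afr_pending(value)))
    return lines


for line in summarize(xattrs):
    print(line)
```

Decoded this way, node0 and node1 each record one pending metadata operation
against client-2, while node2 records two against each of client-0 and
client-1, and every data and entry counter is zero. If the layout assumption
holds, that would be consistent with the bricks blaming each other for
metadata on "/".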

Where do I go from here? Thanks

On Tue, Jul 3, 2018 at 11:54 AM, Anh Vo <vtqanh at gmail.com> wrote:

> I am trying to mount a gluster volume over NFS and mount.nfs fails.
> Looking at nfs.log I am seeing the entries below.
>
> Heal info does not show the gfid mentioned in them
> (00000000-0000-0000-0000-000000000001) as being in split-brain.
>
> [2018-07-03 18:16:27.694953] W [MSGID: 112199]
> [nfs3-helpers.c:3414:nfs3_log_common_res] 0-nfs-nfsv3: / => (XID:
> c3ac3cc5, FSINFO: NFS: 5(I/O error), POSIX: 5(Input/output error))
> [2018-07-03 18:16:28.204685] W [MSGID: 112199]
> [nfs3-helpers.c:3414:nfs3_log_common_res] 0-nfs-nfsv3: / => (XID:
> c4ac3cc5, FSINFO: NFS: 5(I/O error), POSIX: 5(Input/output error))
> The message "E [MSGID: 108008] [afr-read-txn.c:90:afr_read_txn_refresh_done]
> 0-gv0-replicate-0: Failing STAT on gfid 00000000-0000-0000-0000-000000000001:
> split-brain observed. [Input/output error]" repeated 2 times between
> [2018-07-03 18:16:27.694903] and [2018-07-03 18:17:02.310689]
> [2018-07-03 18:17:02.310722] W [MSGID: 112199]
> [nfs3-helpers.c:3414:nfs3_log_common_res] 0-nfs-nfsv3: / => (XID:
> 2a6f2526, FSINFO: NFS: 5(I/O error), POSIX: 5(Input/output error))
> [2018-07-03 18:17:02.628990] E [MSGID: 108008] [afr-read-txn.c:90:afr_read_txn_refresh_done]
> 0-gv0-replicate-0: Failing STAT on gfid 00000000-0000-0000-0000-000000000001:
> split-brain observed. [Input/output error]
> [2018-07-03 18:17:02.629023] W [MSGID: 112199]
> [nfs3-helpers.c:3414:nfs3_log_common_res] 0-nfs-nfsv3: / => (XID:
> 2b6f2526, FSINFO: NFS: 5(I/O error), POSIX: 5(Input/output error))
> [2018-07-03 18:17:00.398601] I [MSGID: 108031]
> [afr-common.c:2458:afr_local_discovery_cbk] 0-gv0-replicate-0: selecting
> local read_child gv0-client-2
> [2018-07-03 18:17:01.666671] W [MSGID: 108027] [afr-common.c:2821:afr_discover_done]
> 0-gv0-replicate-0: no read subvols for /
> [2018-07-03 18:51:43.509385] W [MSGID: 108027] [afr-common.c:2821:afr_discover_done]
> 0-gv0-replicate-0: no read subvols for /
> [2018-07-03 18:51:43.936826] E [MSGID: 108008] [afr-read-txn.c:90:afr_read_txn_refresh_done]
> 0-gv0-replicate-0: Failing STAT on gfid 00000000-0000-0000-0000-000000000001:
> split-brain observed. [Input/output error]
> [2018-07-03 18:51:43.936868] W [MSGID: 112199]
> [nfs3-helpers.c:3414:nfs3_log_common_res] 0-nfs-nfsv3: / => (XID:
> 19b1731e, FSINFO: NFS: 5(I/O error), POSIX: 5(Input/output error))
> [2018-07-03 18:51:44.278901] W [MSGID: 112199]
> [nfs3-helpers.c:3414:nfs3_log_common_res] 0-nfs-nfsv3: / => (XID:
> 1ab1731e, FSINFO: NFS: 5(I/O error), POSIX: 5(Input/output error))
>
>