[Bugs] [Bug 1246052] New: Deceiving log messages like "Failing STAT on gfid : split-brain observed. [Input/output error]" reported

bugzilla at redhat.com bugzilla at redhat.com
Thu Jul 23 11:47:42 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1246052

            Bug ID: 1246052
           Summary: Deceiving log messages like "Failing STAT on gfid :
                    split-brain observed. [Input/output error]" reported
           Product: GlusterFS
           Version: mainline
         Component: replicate
          Keywords: Triaged
          Severity: medium
          Assignee: bugs at gluster.org
          Reporter: kdhananj at redhat.com
                CC: bugs at gluster.org, gluster-bugs at redhat.com,
                    ndevos at redhat.com, pkarampu at redhat.com,
                    saujain at redhat.com, storage-qa-internal at redhat.com
        Depends On: 1240657



+++ This bug was initially created as a clone of Bug #1240657 +++

Description of problem:
When I delete a directory, I see error messages in ganesha-gfapi.log like
these:

[2015-07-07 18:04:34.786903] W [MSGID: 114031]
[client-rpc-fops.c:531:client3_3_stat_cbk] 0-vol3-client-8: remote operation
failed [No such file or directory]
[2015-07-07 18:04:34.787612] E [MSGID: 108008]
[afr-read-txn.c:76:afr_read_txn_refresh_done] 0-vol3-replicate-3: Failing STAT
on gfid 18a973c4-73d3-48b8-942c-33a6f1a8e6b4: split-brain observed.
[Input/output error]
[2015-07-07 18:04:34.787954] E [MSGID: 108008]
[afr-read-txn.c:76:afr_read_txn_refresh_done] 0-vol3-replicate-1: Failing STAT
on gfid 18a973c4-73d3-48b8-942c-33a6f1a8e6b4: split-brain observed.
[Input/output error]
[2015-07-07 18:04:34.788090] E [MSGID: 108008]
[afr-read-txn.c:76:afr_read_txn_refresh_done] 0-vol3-replicate-5: Failing STAT
on gfid 18a973c4-73d3-48b8-942c-33a6f1a8e6b4: split-brain observed.
[Input/output error]
[2015-07-07 18:04:34.788191] E [MSGID: 108008]
[afr-read-txn.c:76:afr_read_txn_refresh_done] 0-vol3-replicate-0: Failing STAT
on gfid 18a973c4-73d3-48b8-942c-33a6f1a8e6b4: split-brain observed.
[Input/output error]
[2015-07-07 18:04:34.788240] E [MSGID: 108008]
[afr-read-txn.c:76:afr_read_txn_refresh_done] 0-vol3-replicate-2: Failing STAT
on gfid 18a973c4-73d3-48b8-942c-33a6f1a8e6b4: split-brain observed.
[Input/output error]
[2015-07-07 18:04:34.788478] E [MSGID: 108008]
[afr-read-txn.c:76:afr_read_txn_refresh_done] 0-vol3-replicate-4: Failing STAT
on gfid 18a973c4-73d3-48b8-942c-33a6f1a8e6b4: split-brain observed.
[Input/output error]
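
For triage, it can help to check which replicate subvolumes report a given
gfid. The following is only a sketch: it assumes the log text has the shape
shown above, and the names ENTRY, SAMPLE and split_brain_reports are made up
for illustration, not part of GlusterFS.

```python
import re
from collections import defaultdict

# Matches one MSGID 108008 entry after wrapped lines have been re-joined.
# The pattern is derived only from the excerpt above (an assumption).
ENTRY = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] E \[MSGID: 108008\] "
    r"\[[^\]]+\] (?P<subvol>\S+): Failing (?P<fop>\w+) on gfid "
    r"(?P<gfid>[0-9a-f-]{36}): split-brain observed\."
)

# Three entries copied from the report, wrapped the way the mail wraps them.
SAMPLE = """\
[2015-07-07 18:04:34.786903] W [MSGID: 114031]
[client-rpc-fops.c:531:client3_3_stat_cbk] 0-vol3-client-8: remote operation
failed [No such file or directory]
[2015-07-07 18:04:34.787612] E [MSGID: 108008]
[afr-read-txn.c:76:afr_read_txn_refresh_done] 0-vol3-replicate-3: Failing STAT
on gfid 18a973c4-73d3-48b8-942c-33a6f1a8e6b4: split-brain observed.
[Input/output error]
[2015-07-07 18:04:34.787954] E [MSGID: 108008]
[afr-read-txn.c:76:afr_read_txn_refresh_done] 0-vol3-replicate-1: Failing STAT
on gfid 18a973c4-73d3-48b8-942c-33a6f1a8e6b4: split-brain observed.
[Input/output error]
"""


def split_brain_reports(log_text):
    """Map each gfid to the sorted list of subvolumes that logged it."""
    flat = " ".join(log_text.split())  # undo line wrapping
    by_gfid = defaultdict(set)
    for match in ENTRY.finditer(flat):
        by_gfid[match.group("gfid")].add(match.group("subvol"))
    return {gfid: sorted(subvols) for gfid, subvols in by_gfid.items()}
```

Run over the full excerpt above, this would list the same gfid under
0-vol3-replicate-0 through 0-vol3-replicate-5.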


The directory deletion itself succeeds. The test was done on a mount with
vers=4.

Version-Release number of selected component (if applicable):
nfs-ganesha-2.2.0-4.el6rhs.x86_64
glusterfs-3.7.1-7.el6rhs.x86_64

How reproducible:
always

Actual results:
as described above

Expected results:
The above log messages may be confusing while debugging an issue, hence we
should try to avoid this kind of confusing log message.

Additional info:

--- Additional comment from Saurabh on 2015-07-07 08:49:18 EDT ---



--- Additional comment from Soumya Koduri on 2015-07-08 06:48:54 EDT ---

Could you please provide the steps which led to this issue? Normal directory
removal operations work for us.

Also, please CC the NFS team on such bugs so that we do not miss them.
Thanks!

--- Additional comment from Saurabh on 2015-07-08 07:03:37 EDT ---

rm -rf /mount-point/dir-name
or rmdir /mount-point/dir-name

--- Additional comment from Soumya Koduri on 2015-07-08 07:05:40 EDT ---

Please describe the tests you were running before you hit the issue, whether
it is consistently reproducible, and the volume setup details (for example,
whether any other features are enabled or any bricks are unavailable).

--- Additional comment from Saurabh on 2015-07-08 07:20:52 EDT ---

It is pretty straightforward, hence I just wrote the description.

1. create a volume of type 6x2 and start it
2. after configuring nfs-ganesha, mount the volume with vers=4
3. mkdir /mount-point/<dirname>
4. rmdir /mount-point/<dirname>
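
Steps 3 and 4 can be scripted to check whether new messages appear. This is a
rough sketch, not a supported tool: the mount point, the log path, the
directory name and the function names are all placeholders, and it assumes
the log has been flushed by the time it is read again.

```python
import os


def count_marker(log_path, marker="split-brain observed"):
    """Count occurrences of marker in the log; a missing log counts as zero."""
    try:
        with open(log_path, errors="replace") as fh:
            return fh.read().count(marker)
    except FileNotFoundError:
        return 0


def reproduce(mount_point, log_path, dirname="bz1240657-test"):
    """Run steps 3 and 4, then return how many new marker messages were logged."""
    before = count_marker(log_path)
    target = os.path.join(mount_point, dirname)  # hypothetical directory name
    os.mkdir(target)  # step 3
    os.rmdir(target)  # step 4
    return count_marker(log_path) - before
```

A non-zero return value on an otherwise healthy volume would match the
behaviour reported here.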

--- Additional comment from Soumya Koduri on 2015-07-08 07:55:14 EDT ---

Thanks Saurabh. Have changed the bug summary to reflect that.

--- Additional comment from Niels de Vos on 2015-07-20 08:45:10 EDT ---

These messages are related to AFR, so I am changing the component.

When a directory (or file) over NFS gets removed, a stat() on the filehandle
gets done afterwards. This is needed for updating the inode-cache that could
still be valid for hardlinks.

It is not clear to me why a stat() on a GFID could return EIO instead of
ENOENT.
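
For comparison, the expected errno can be seen on a local filesystem. The
sketch below uses a path-based stat() rather than a filehandle or GFID
lookup, so it illustrates only the ENOENT expectation, not the AFR code path;
the function and directory names are made up for the example.

```python
import errno
import os


def stat_errno_after_rmdir(parent):
    """Create and remove a directory under parent, stat it, return the errno name."""
    path = os.path.join(parent, "removed-dir")  # hypothetical name
    os.mkdir(path)
    os.rmdir(path)
    try:
        os.stat(path)
    except OSError as exc:
        # errno.errorcode maps the number to its symbolic name, e.g. "ENOENT"
        return errno.errorcode[exc.errno]
    return None  # stat unexpectedly succeeded
```

On a local filesystem this yields ENOENT ("No such file or directory"),
whereas the log above shows the STAT failing with EIO ("Input/output error").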


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1240657
[Bug 1240657] Deceiving log messages like "Failing STAT on gfid :
split-brain observed. [Input/output error]" reported