[Bugs] [Bug 1457616] New: "split-brain observed [Input/output error]" error messages in samba logs during parallel rm -rf

bugzilla at redhat.com bugzilla at redhat.com
Thu Jun 1 04:00:51 UTC 2017


https://bugzilla.redhat.com/show_bug.cgi?id=1457616

            Bug ID: 1457616
           Summary: "split-brain observed [Input/output error]" error
                    messages in samba logs during parallel rm -rf
           Product: GlusterFS
           Version: 3.11
         Component: replicate
          Severity: medium
          Assignee: bugs at gluster.org
          Reporter: ravishankar at redhat.com
                CC: bugs at gluster.org
        Depends On: 1454689, 1456582



+++ This bug was initially created as a clone of Bug #1456582 +++

+++ This bug was initially created as a clone of Bug #1454689 +++

Description of problem:
=======================
"split-brain observed [Input/output error]" error messages in samba logs during
parallel rm -rf.

Version-Release number of selected component (if applicable):
3.8.4-25.el7rhgs.x86_64

How reproducible:
1/1

Steps to Reproduce:
===================
1) Create a distrbuted-replicate volume and start it.
2) Set nl-cache, parallel readdirp, md-cache settings on the volume. (Please
see gluster volume distrep info below output for more info)
3) cifs mount on multiple clients.
4) Create a very large data set which contains small files, empty directories,
directory with files.
5) Simultaneously issue rm -rf * from multiple clients.

Check samba logs and rm -rf * terminal.

Actual results:
===============
When issued rm -rf * command simultaneously from multiple clients, seeing
"split-brain observed [Input/output error]" error messages in samba logs and rm
-rf * throws permission denied errors as below,

rm: cannot remove ‘file_dstdir/10.70.47.52/thrd_02/d_005/d_005’: Permission
denied
rm: cannot remove ‘file_dstdir/10.70.47.52/thrd_02/d_005/d_000’: Permission
denied
rm: cannot remove ‘file_dstdir/10.70.47.52/thrd_00/d_002/d_005’: Permission
denied
rm: cannot remove ‘file_dstdir/10.70.47.52/thrd_00/d_002/d_000’: Permission
denied
rm: cannot remove ‘file_dstdir/10.70.47.52/thrd_04/d_001/d_000’: Permission
denied
rm: cannot remove ‘file_srcdir/10.70.47.52/thrd_01/d_002/d_000’: Permission
denied
rm: cannot remove ‘file_srcdir/10.70.47.15/thrd_05/d_003/d_000’: Permission
denied
rm: cannot remove ‘new/file_dstdir/10.70.47.15/thrd_04/d_004/d_008’: Permission
denied

Expected results:
=================
No split-brain errors in samba logs and rm -rf * should not throw any errors.

--- Additional comment from Red Hat Bugzilla Rules Engine on 2017-05-23
06:42:36 EDT ---

Samba logs split-brain output snippet:
======================================
[2017-05-23 10:20:06.930523] W [MSGID: 108008]
[afr-read-txn.c:229:afr_read_txn] 4-distrep-replicate-2: Unreadable subvolume
-1 found with event generation 2 for gfid 5c268304-8143-46c0-b4e4-561555fc003f.
(Possible split-brain)
[2017-05-23 10:20:06.930704] E [MSGID: 108008]
[afr-read-txn.c:90:afr_read_txn_refresh_done] 4-distrep-replicate-2: Failing
FSTAT on gfid 5c268304-8143-46c0-b4e4-561555fc003f: split-brain observed.
[Input/output error]
[2017-05-23 10:20:06.931078] W [MSGID: 108008]
[afr-read-txn.c:229:afr_read_txn] 4-distrep-replicate-3: Unreadable subvolume
-1 found with event generation 2 for gfid 5c268304-8143-46c0-b4e4-561555fc003f.
(Possible split-brain)
[2017-05-23 10:20:06.931114] E [MSGID: 108008]
[afr-read-txn.c:90:afr_read_txn_refresh_done] 4-distrep-replicate-3: Failing
FSTAT on gfid 5c268304-8143-46c0-b4e4-561555fc003f: split-brain observed.
[Input/output error]
[2017-05-23 10:20:06.932575] E [MSGID: 108008]
[afr-read-txn.c:90:afr_read_txn_refresh_done] 4-distrep-replicate-7: Failing
FSTAT on gfid 5c268304-8143-46c0-b4e4-561555fc003f: split-brain observed.
[Input/output error]
[2017-05-23 10:20:06.932753] E [MSGID: 108008]
[afr-read-txn.c:90:afr_read_txn_refresh_done] 4-distrep-replicate-4: Failing
FSTAT on gfid 5c268304-8143-46c0-b4e4-561555fc003f: split-brain observed.
[Input/output error]
[2017-05-23 10:20:06.933165] E [MSGID: 108008]
[afr-read-txn.c:90:afr_read_txn_refresh_done] 4-distrep-replicate-6: Failing
FSTAT on gfid 5c268304-8143-46c0-b4e4-561555fc003f: split-brain observed.
[Input/output error]
[2017-05-23 10:20:06.933356] E [MSGID: 108008]
[afr-read-txn.c:90:afr_read_txn_refresh_done] 4-distrep-replicate-5: Failing
FSTAT on gfid 5c268304-8143-46c0-b4e4-561555fc003f: split-brain observed.
[Input/output error]
[2017-05-23 10:20:06.934668] W [MSGID: 108008]
[afr-read-txn.c:229:afr_read_txn] 4-distrep-replicate-5: Unreadable subvolume
-1 found with event generation 2 for gfid 5c268304-8143-46c0-b4e4-561555fc003f.
(Possible split-brain)
[2017-05-23 10:20:06.936800] W [MSGID: 108008]
[afr-read-txn.c:229:afr_read_txn] 4-distrep-replicate-6: Unreadable subvolume
-1 found with event generation 2 for gfid 5c268304-8143-46c0-b4e4-561555fc003f.
(Possible split-brain)
[2017-05-23 10:20:06.938764] W [MSGID: 108008]
[afr-read-txn.c:229:afr_read_txn] 4-distrep-replicate-7: Unreadable subvolume
-1 found with event generation 2 for gfid 5c268304-8143-46c0-b4e4-561555fc003f.
(Possible split-brain)
[2017-05-23 10:20:36.839637] W [MSGID: 108008]
[afr-read-txn.c:229:afr_read_txn] 4-distrep-replicate-2: Unreadable subvolume
-1 found with event generation 2 for gfid cde39e6f-bdab-4a8c-84f5-4e2f9faebace.
(Possible split-brain)
[2017-05-23 10:20:36.839703] E [MSGID: 108008]
[afr-read-txn.c:90:afr_read_txn_refresh_done] 4-distrep-replicate-2: Failing
FSTAT on gfid cde39e6f-bdab-4a8c-84f5-4e2f9faebace: split-brain observed.
[Input/output error]
[2017-05-23 10:20:36.840962] W [MSGID: 108008]
[afr-read-txn.c:229:afr_read_txn] 4-distrep-replicate-0: Unreadable subvolume
-1 found with event generation 2 for gfid cde39e6f-bdab-4a8c-84f5-4e2f9faebace.
(Possible split-brain)
[2017-05-23 10:20:36.841024] E [MSGID: 108008]
[afr-read-txn.c:90:afr_read_txn_refresh_done] 4-distrep-replicate-0: Failing
FSTAT on gfid cde39e6f-bdab-4a8c-84f5-4e2f9faebace: split-brain observed.
[Input/output error]
[2017-05-23 10:20:36.841372] W [MSGID: 108008]
[afr-read-txn.c:229:afr_read_txn] 4-distrep-replicate-1: Unreadable subvolume
-1 found with event generation 2 for gfid cde39e6f-bdab-4a8c-84f5-4e2f9faebace.
(Possible split-brain)
[2017-05-23 10:20:36.841425] E [MSGID: 108008]
[afr-read-txn.c:90:afr_read_txn_refresh_done] 4-distrep-replicate-1: Failing
FSTAT on gfid cde39e6f-bdab-4a8c-84f5-4e2f9faebace: split-brain observed.
[Input/output error]
[2017-05-23 10:20:36.843287] E [MSGID: 108008]
[afr-read-txn.c:90:afr_read_txn_refresh_done] 4-distrep-replicate-5: Failing
FSTAT on gfid cde39e6f-bdab-4a8c-84f5-4e2f9faebace: split-brain observed.
[Input/output error]
[2017-05-23 10:20:36.843445] E [MSGID: 108008]
[afr-read-txn.c:90:afr_read_txn_refresh_done] 4-distrep-replicate-3: Failing
FSTAT on gfid cde39e6f-bdab-4a8c-84f5-4e2f9faebace: split-brain observed.
[Input/output error]
[2017-05-23 10:20:36.843561] E [MSGID: 108008]
[afr-read-txn.c:90:afr_read_txn_refresh_done] 4-distrep-replicate-7: Failing
FSTAT on gfid cde39e6f-bdab-4a8c-84f5-4e2f9faebace: split-brain observed.
[Input/output error]
[2017-05-23 10:20:36.844088] W [MSGID: 108008]
[afr-read-txn.c:229:afr_read_txn] 4-distrep-replicate-5: Unreadable subvolume
-1 found with event generation 2 for gfid cde39e6f-bdab-4a8c-84f5-4e2f9faebace.
(Possible split-brain)
[2017-05-23 10:20:36.844355] E [MSGID: 108008]
[afr-read-txn.c:90:afr_read_txn_refresh_done] 4-distrep-replicate-6: Failing
FSTAT on gfid cde39e6f-bdab-4a8c-84f5-4e2f9faebace: split-brain observed.
[Input/output error]
[2017-05-23 10:20:36.853341] W [MSGID: 108008]
[afr-read-txn.c:229:afr_read_txn] 4-distrep-replicate-6: Unreadable subvolume
-1 found with event generation 2 for gfid cde39e6f-bdab-4a8c-84f5-4e2f9faebace.
(Possible split-brain)
[2017-05-23 10:20:36.864483] W [MSGID: 108008]
[afr-read-txn.c:229:afr_read_txn] 4-distrep-replicate-7: Unreadable subvolume
-1 found with event generation 2 for gfid cde39e6f-bdab-4a8c-84f5-4e2f9faebace.
(Possible split-brain)

[root at dhcp47-127 samba]# gluster volume heal distrep info split-brain
Brick 10.70.47.127:/bricks/brick3/b3
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.46.181:/bricks/brick3/b3
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.46.47:/bricks/brick3/b3
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.47.140:/bricks/brick3/b3
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.47.127:/bricks/brick4/b4
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.46.181:/bricks/brick4/b4
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.46.47:/bricks/brick4/b4
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.47.140:/bricks/brick4/b4
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.47.127:/bricks/brick5/b5
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.46.181:/bricks/brick5/b5
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.46.47:/bricks/brick5/b5
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.47.140:/bricks/brick5/b5
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.47.127:/bricks/brick6/b6
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.46.181:/bricks/brick6/b6
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.46.47:/bricks/brick6/b6
Status: Connected
Number of entries in split-brain: 0

Brick 10.70.47.140:/bricks/brick6/b6
Status: Connected
Number of entries in split-brain: 0

[root at dhcp47-127 samba]# gluster v info distrep

Volume Name: distrep
Type: Distributed-Replicate
Volume ID: 251fa596-4fdd-4e44-9b8c-4382b7139751
Status: Started
Snapshot Count: 0
Number of Bricks: 8 x 2 = 16
Transport-type: tcp
Bricks:
Brick1: 10.70.47.127:/bricks/brick3/b3
Brick2: 10.70.46.181:/bricks/brick3/b3
Brick3: 10.70.46.47:/bricks/brick3/b3
Brick4: 10.70.47.140:/bricks/brick3/b3
Brick5: 10.70.47.127:/bricks/brick4/b4
Brick6: 10.70.46.181:/bricks/brick4/b4
Brick7: 10.70.46.47:/bricks/brick4/b4
Brick8: 10.70.47.140:/bricks/brick4/b4
Brick9: 10.70.47.127:/bricks/brick5/b5
Brick10: 10.70.46.181:/bricks/brick5/b5
Brick11: 10.70.46.47:/bricks/brick5/b5
Brick12: 10.70.47.140:/bricks/brick5/b5
Brick13: 10.70.47.127:/bricks/brick6/b6
Brick14: 10.70.46.181:/bricks/brick6/b6
Brick15: 10.70.46.47:/bricks/brick6/b6
Brick16: 10.70.47.140:/bricks/brick6/b6
Options Reconfigured:
nfs.disable: on
transport.address-family: inet
performance.parallel-readdir: on
performance.rda-cache-limit: 10MB
features.uss: on
features.quota: on
features.inode-quota: on
features.quota-deem-statfs: on
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.stat-prefetch: on
performance.cache-invalidation: on
performance.md-cache-timeout: 600
network.inode-lru-limit: 50000
performance.nl-cache: on

[root at dhcp47-127 samba]# gluster v status distrep
Status of volume: distrep
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.47.127:/bricks/brick3/b3        49165     0          Y       9881 
Brick 10.70.46.181:/bricks/brick3/b3        49164     0          Y       4970 
Brick 10.70.46.47:/bricks/brick3/b3         49164     0          Y       10968
Brick 10.70.47.140:/bricks/brick3/b3        49163     0          Y       29273
Brick 10.70.47.127:/bricks/brick4/b4        49166     0          Y       9901 
Brick 10.70.46.181:/bricks/brick4/b4        49165     0          Y       5003 
Brick 10.70.46.47:/bricks/brick4/b4         49165     0          Y       10990
Brick 10.70.47.140:/bricks/brick4/b4        49164     0          Y       29294
Brick 10.70.47.127:/bricks/brick5/b5        49168     0          Y       4735 
Brick 10.70.46.181:/bricks/brick5/b5        49167     0          Y       26844
Brick 10.70.46.47:/bricks/brick5/b5         49167     0          Y       832  
Brick 10.70.47.140:/bricks/brick5/b5        49166     0          Y       19171
Brick 10.70.47.127:/bricks/brick6/b6        49169     0          Y       11755
Brick 10.70.46.181:/bricks/brick6/b6        49168     0          Y       16967
Brick 10.70.46.47:/bricks/brick6/b6         49168     0          Y       23623
Brick 10.70.47.140:/bricks/brick6/b6        49167     0          Y       9394 
Snapshot Daemon on localhost                49167     0          Y       12337
Self-heal Daemon on localhost               N/A       N/A        Y       7828 
Quota Daemon on localhost                   N/A       N/A        Y       7847 
Snapshot Daemon on dhcp46-47.lab.eng.blr.re
dhat.com                                    49166     0          Y       12714
Self-heal Daemon on dhcp46-47.lab.eng.blr.r
edhat.com                                   N/A       N/A        Y       5839 
Quota Daemon on dhcp46-47.lab.eng.blr.redha
t.com                                       N/A       N/A        Y       5861 
Snapshot Daemon on dhcp46-181.lab.eng.blr.r
edhat.com                                   49166     0          Y       6657 
Self-heal Daemon on dhcp46-181.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       31907
Quota Daemon on dhcp46-181.lab.eng.blr.redh
at.com                                      N/A       N/A        Y       31931
Snapshot Daemon on dhcp47-140.lab.eng.blr.r
edhat.com                                   49165     0          Y       31098
Self-heal Daemon on dhcp47-140.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       23989
Quota Daemon on dhcp47-140.lab.eng.blr.redh
at.com                                      N/A       N/A        Y       24002

Task Status of Volume distrep
------------------------------------------------------------------------------
There are no active volume tasks

--- Additional comment from Ravishankar N on 2017-05-23 08:48:21 EDT ---

>From a quick look at both samba logs, I see that fstat_cbk has failed with
EACCESS, example:
19059:[2017-05-23 09:13:13.441530] W [MSGID: 114031]
[client-rpc-fops.c:1495:client3_3_fstat_cbk] 4-distrep-client-8: remote
operation failed [Operation not permitted]
19061:[2017-05-23 09:13:13.441635] W [MSGID: 114031]
[client-rpc-fops.c:1495:client3_3_fstat_cbk] 4-distrep-client-12: remote
operation failed [Operation not permitted]
19066:[2017-05-23 09:13:13.453797] W [MSGID: 114031]
[client-rpc-fops.c:1495:client3_3_fstat_cbk] 4-distrep-client-4: remote
operation failed [Operation not permitted]


The corresponding brick logs also have the error.
1290:[2017-05-23 09:12:33.023868] W [MSGID: 113006]
[posix.c:6280:posix_do_readdir] 0-distrep-posix: pfd is NULL, fd=0x7f1ea0201ad0
[Operation not permitted]
1308:[2017-05-23 09:13:06.032149] E [MSGID: 115081]
[server-rpc-fops.c:1203:server_fstat_cbk] 0-distrep-server: 614392: FSTAT -2
(dd285c20-ab48-4d17-88f0-50db96c9affc) ==> (Operation not permitted) [Operation
not permitted]
1309:[2017-05-23 09:13:06.038226] W [MSGID: 113006]
[posix.c:6280:posix_do_readdir] 0-distrep-posix: pfd is NULL, fd=0x7f1ea02d3b00
[Operation not permitted]
1316:[2017-05-23 09:13:13.435376] E [MSGID: 115081]
[server-rpc-fops.c:1203:server_fstat_cbk] 0-distrep-server: 576531: FSTAT -2
(232e37a8-30ea-4efe-962f-bbb413fc31de) ==> (Operation not permitted) [Op

1. These seem to be the cause of the errors seen by the application. We need to
find why that is.

2. Though possibly unrelated, we also need to see why AFR throws EIO after
inode refresh without any files in split-brain.

--- Additional comment from Worker Ant on 2017-05-29 12:46:21 EDT ---

REVIEW: https://review.gluster.org/17413 (afr: add errno to
afr_inode_refresh_done()) posted (#1) for review on master by Ravishankar N
(ravishankar at redhat.com)

--- Additional comment from Worker Ant on 2017-05-29 12:47:05 EDT ---

REVIEW: https://review.gluster.org/17414 (posix: use the correct op_errno)
posted (#1) for review on master by Ravishankar N (ravishankar at redhat.com)

--- Additional comment from Worker Ant on 2017-05-30 21:50:39 EDT ---

COMMIT: https://review.gluster.org/17413 committed in master by Pranith Kumar
Karampuri (pkarampu at redhat.com) 
------
commit feaea7fa541b81a4988b8f394037bfedb5017f4c
Author: Ravishankar N <ravishankar at redhat.com>
Date:   Mon May 29 21:56:12 2017 +0530

    afr: add errno to afr_inode_refresh_done()

    Problem:
    When parellel `rm -rf`s were being done from cifs clients, opendir might
    fail on some replicas with ENOENT. DHT ignores partial opendir failures
    in dht_fd_cbk() and winds readdirs on those replicas. Afr inode refresh
    (as a part of readdirp read_txn) sees in its fd context that the state
    of the fds is *not* AFR_FD_OPENED and bails out to
    afr_inode_refresh_done() without doing a refresh. When this happens, the
    errno is set as EIO due to lack of readable subvols, logging split-brain
    messages in the logs.

    Fix:
    Introduce an errno argument to afr_inode_refresh_do() to bail out with
    the right error value when inode refresh is not performed.

    Change-Id: I075707fbb73fd93a923b77b923a96aac79e847f9
    BUG: 1456582
    Signed-off-by: Ravishankar N <ravishankar at redhat.com>
    Reviewed-on: https://review.gluster.org/17413
    Smoke: Gluster Build System <jenkins at build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu at redhat.com>

--- Additional comment from Worker Ant on 2017-05-31 02:54:44 EDT ---

REVIEW: https://review.gluster.org/17414 (posix: use the correct op_errno)
posted (#2) for review on master by Ravishankar N (ravishankar at redhat.com)

--- Additional comment from Worker Ant on 2017-05-31 10:34:50 EDT ---

COMMIT: https://review.gluster.org/17414 committed in master by Pranith Kumar
Karampuri (pkarampu at redhat.com) 
------
commit de92c363c95d16966dbcc9d8763fd4448dd84d13
Author: Ravishankar N <ravishankar at redhat.com>
Date:   Mon May 29 21:38:14 2017 +0530

    posix: use the correct op_errno

    Problem:
    If readdir/fstat was performed on a directory that was removed,
    posix_fd_ctx_get() fails with ENOENT but we incorrectly use the ret
    value (-1 in this case) as op_errno, logging "Operation not permitted"
    messages in the brick logs. Also in case of fstat, the -1 op_errno was
    also propagated to the client via stack unwind, causing the message to
    appear in protocol/client logs as well.

    Fix:
    Use the right op_errno in readdir, fstat and writev. Also, if
    posix_fd_ctx_get() failed with ENOENT, convert it into EBADF because
    ENOENT is not a valid error for an fd operation.

    Change-Id: Ie43c0789d5040ec73b7cf885d015a183b8c64d70
    BUG: 1456582
    Signed-off-by: Ravishankar N <ravishankar at redhat.com>
    Reviewed-on: https://review.gluster.org/17414
    Smoke: Gluster Build System <jenkins at build.gluster.org>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu at redhat.com>
    Tested-by: Pranith Kumar Karampuri <pkarampu at redhat.com>
    NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
    Reviewed-by: Amar Tumballi <amarts at redhat.com>
    CentOS-regression: Gluster Build System <jenkins at build.gluster.org>


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1454689
[Bug 1454689] "split-brain observed [Input/output error]" error messages in
samba logs during parallel rm -rf
https://bugzilla.redhat.com/show_bug.cgi?id=1456582
[Bug 1456582] "split-brain observed [Input/output error]" error messages in
samba logs during parallel rm -rf
-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.


More information about the Bugs mailing list