[Bugs] [Bug 1480591] [Scale] : I/O errors on multiple gNFS mounts with " Stale file handle" during rebalance of an erasure coded volume.

bugzilla at redhat.com bugzilla at redhat.com
Fri Aug 11 12:47:26 UTC 2017


https://bugzilla.redhat.com/show_bug.cgi?id=1480591

Mohammed Rafi KC <rkavunga at redhat.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|bugs at gluster.org            |rkavunga at redhat.com



--- Comment #1 from Mohammed Rafi KC <rkavunga at redhat.com> ---
Expanded the volume from 10 x (4+2) to 12 x (4+2) while Linux kernel untars and
lookups were running from 6 gNFS mounts.

Triggered rebalance.


I/O errored out on 3/3 clients with ESTALE:


On gqac007 (mounted via gqas008.sbu.lab.eng.bos.redhat.com):

<snip>

tar: linux-4.13-rc3/arch/arm64/boot/dts: Cannot stat: Stale file handle
tar: linux-4.13-rc3/arch/arm64/boot: Cannot stat: Stale file handle
tar: linux-4.13-rc3/arch/arm64: Cannot stat: Stale file handle
tar: linux-4.13-rc3/arch: Cannot stat: Stale file handle
tar: linux-4.13-rc3: Cannot stat: Stale file handle
tar: Exiting with failure status due to previous errors
</snip>



On gqac024 (mounted via gqas013.sbu.lab.eng.bos.redhat.com):

<snip>

tar: linux-4.13-rc3/arch/arm64/boot: Cannot stat: Stale file handle
tar: linux-4.13-rc3/arch/arm64: Cannot stat: Stale file handle
tar: linux-4.13-rc3/arch: Cannot stat: Stale file handle
tar: linux-4.13-rc3: Cannot stat: Stale file handle
tar: Exiting with failure status due to previous errors

</snip>

On gqac008 (mounted via gqas007.sbu.lab.eng.bos.redhat.com):

<snip>

tar: linux-4.13-rc3/arch/arm64/boot: Cannot stat: Stale file handle
tar: linux-4.13-rc3/arch/arm64: Cannot stat: Stale file handle
tar: linux-4.13-rc3/arch: Cannot stat: Stale file handle
tar: linux-4.13-rc3: Cannot stat: Stale file handle
tar: Exiting with failure status due to previous errors
</snip>

Version-Release number of selected component (if applicable):
-------------------------------------------------------------

3.8.4-38

How reproducible:
-----------------

1/1


Actual results:
---------------

The application was affected; I/O errored out with ESTALE.


Expected results:
-----------------

Successful I/O completion.

Additional info:
----------------

Volume Name: butcher
Type: Distributed-Disperse
Volume ID: dbd4e36f-28de-445e-a5cd-c78d0a467fb6
Status: Started
Snapshot Count: 0
Number of Bricks: 12 x (4 + 2) = 72
Transport-type: tcp
Bricks:
Brick1: gqas013.sbu.lab.eng.bos.redhat.com:/bricks1/brick
Brick2: gqas005.sbu.lab.eng.bos.redhat.com:/bricks1/brick
Brick3: gqas006.sbu.lab.eng.bos.redhat.com:/bricks1/brick
Brick4: gqas008.sbu.lab.eng.bos.redhat.com:/bricks1/brick
Brick5: gqas003.sbu.lab.eng.bos.redhat.com:/bricks1/brick
Brick6: gqas007.sbu.lab.eng.bos.redhat.com:/bricks1/brick
Brick7: gqas013.sbu.lab.eng.bos.redhat.com:/bricks4/brick
Brick8: gqas005.sbu.lab.eng.bos.redhat.com:/bricks4/brick
Brick9: gqas006.sbu.lab.eng.bos.redhat.com:/bricks4/brick
Brick10: gqas008.sbu.lab.eng.bos.redhat.com:/bricks4/brick
Brick11: gqas003.sbu.lab.eng.bos.redhat.com:/bricks4/brick
Brick12: gqas007.sbu.lab.eng.bos.redhat.com:/bricks4/brick
Brick13: gqas013.sbu.lab.eng.bos.redhat.com:/bricks5/brick
Brick14: gqas005.sbu.lab.eng.bos.redhat.com:/bricks5/brick
Brick15: gqas006.sbu.lab.eng.bos.redhat.com:/bricks5/brick
Brick16: gqas008.sbu.lab.eng.bos.redhat.com:/bricks5/brick
Brick17: gqas003.sbu.lab.eng.bos.redhat.com:/bricks5/brick
Brick18: gqas007.sbu.lab.eng.bos.redhat.com:/bricks5/brick
Brick19: gqas013.sbu.lab.eng.bos.redhat.com:/bricks6/brick
Brick20: gqas005.sbu.lab.eng.bos.redhat.com:/bricks6/brick
Brick21: gqas006.sbu.lab.eng.bos.redhat.com:/bricks6/brick
Brick22: gqas008.sbu.lab.eng.bos.redhat.com:/bricks6/brick
Brick23: gqas003.sbu.lab.eng.bos.redhat.com:/bricks6/brick
Brick24: gqas007.sbu.lab.eng.bos.redhat.com:/bricks6/brick
Brick25: gqas013.sbu.lab.eng.bos.redhat.com:/bricks7/brick
Brick26: gqas005.sbu.lab.eng.bos.redhat.com:/bricks7/brick
Brick27: gqas006.sbu.lab.eng.bos.redhat.com:/bricks7/brick
Brick28: gqas008.sbu.lab.eng.bos.redhat.com:/bricks7/brick
Brick29: gqas003.sbu.lab.eng.bos.redhat.com:/bricks7/brick
Brick30: gqas007.sbu.lab.eng.bos.redhat.com:/bricks7/brick
Brick31: gqas013.sbu.lab.eng.bos.redhat.com:/bricks8/brick
Brick32: gqas005.sbu.lab.eng.bos.redhat.com:/bricks8/brick
Brick33: gqas006.sbu.lab.eng.bos.redhat.com:/bricks8/brick
Brick34: gqas008.sbu.lab.eng.bos.redhat.com:/bricks8/brick
Brick35: gqas003.sbu.lab.eng.bos.redhat.com:/bricks8/brick
Brick36: gqas007.sbu.lab.eng.bos.redhat.com:/bricks8/brick
Brick37: gqas013.sbu.lab.eng.bos.redhat.com:/bricks9/brick
Brick38: gqas005.sbu.lab.eng.bos.redhat.com:/bricks9/brick
Brick39: gqas006.sbu.lab.eng.bos.redhat.com:/bricks9/brick
Brick40: gqas008.sbu.lab.eng.bos.redhat.com:/bricks9/brick
Brick41: gqas003.sbu.lab.eng.bos.redhat.com:/bricks9/brick
Brick42: gqas007.sbu.lab.eng.bos.redhat.com:/bricks9/brick
Brick43: gqas013.sbu.lab.eng.bos.redhat.com:/bricks10/brick
Brick44: gqas005.sbu.lab.eng.bos.redhat.com:/bricks10/brick
Brick45: gqas006.sbu.lab.eng.bos.redhat.com:/bricks10/brick
Brick46: gqas008.sbu.lab.eng.bos.redhat.com:/bricks10/brick
Brick47: gqas003.sbu.lab.eng.bos.redhat.com:/bricks10/brick
Brick48: gqas007.sbu.lab.eng.bos.redhat.com:/bricks10/brick
Brick49: gqas013.sbu.lab.eng.bos.redhat.com:/bricks2/brick
Brick50: gqas005.sbu.lab.eng.bos.redhat.com:/bricks2/brick
Brick51: gqas006.sbu.lab.eng.bos.redhat.com:/bricks2/brick
Brick52: gqas008.sbu.lab.eng.bos.redhat.com:/bricks2/brick
Brick53: gqas003.sbu.lab.eng.bos.redhat.com:/bricks2/brick
Brick54: gqas007.sbu.lab.eng.bos.redhat.com:/bricks2/brick
Brick55: gqas013.sbu.lab.eng.bos.redhat.com:/bricks3/brick
Brick56: gqas005.sbu.lab.eng.bos.redhat.com:/bricks3/brick
Brick57: gqas006.sbu.lab.eng.bos.redhat.com:/bricks3/brick
Brick58: gqas008.sbu.lab.eng.bos.redhat.com:/bricks3/brick
Brick59: gqas003.sbu.lab.eng.bos.redhat.com:/bricks3/brick
Brick60: gqas007.sbu.lab.eng.bos.redhat.com:/bricks3/brick
Brick61: gqas013.sbu.lab.eng.bos.redhat.com:/bricks11/brick
Brick62: gqas005.sbu.lab.eng.bos.redhat.com:/bricks11/brick
Brick63: gqas006.sbu.lab.eng.bos.redhat.com:/bricks11/brick
Brick64: gqas008.sbu.lab.eng.bos.redhat.com:/bricks11/brick
Brick65: gqas003.sbu.lab.eng.bos.redhat.com:/bricks11/brick
Brick66: gqas007.sbu.lab.eng.bos.redhat.com:/bricks11/brick
Brick67: gqas013.sbu.lab.eng.bos.redhat.com:/bricks12/brick
Brick68: gqas005.sbu.lab.eng.bos.redhat.com:/bricks12/brick
Brick69: gqas006.sbu.lab.eng.bos.redhat.com:/bricks12/brick
Brick70: gqas008.sbu.lab.eng.bos.redhat.com:/bricks12/brick
Brick71: gqas003.sbu.lab.eng.bos.redhat.com:/bricks12/brick
Brick72: gqas007.sbu.lab.eng.bos.redhat.com:/bricks12/brick
Options Reconfigured:
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
features.uss: enable
client.event-threads: 4
server.event-threads: 4
network.inode-lru-limit: 50000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
transport.address-family: inet
nfs.disable: off


Below are a few filters which I used to analyze the packet trace:
(tcp.dstport != 49162) && (glusterfs.bname == "dir2")
(glusterfs) && (glusterfs.proc == 27) 
((glusterfs) && (glusterfs.proc == 27)) && (gluster.op_errno != 0)

Since I couldn't find any LOOKUP failures (for 'dir2') returned from the brick
servers, I suspect the issue is in gf_svc_lookup: there may be cases in which it
forwards the LOOKUP to the virtual (snapview) namespace instead of first trying
the regular graph.

Request Rafi to comment. 

From code inspection so far, the checks in place may not be sufficient; for example:

static int32_t
gf_svc_lookup (call_frame_t *frame, xlator_t *this, loc_t *loc, dict_t *xdata)
{

...
...
        ret = svc_inode_ctx_get (this, loc->inode, &inode_type);
        if (!__is_root_gfid (loc->gfid)) {
                if (loc->parent) {
                        parent = inode_ref (loc->parent);
                        ret = svc_inode_ctx_get (this, loc->parent,
                                                 &parent_type);
                } else {
                        parent = inode_parent (loc->inode, loc->pargfid, NULL);
                        if (parent)
                                ret = svc_inode_ctx_get (this, parent,
                                                         &parent_type);

>>> We do not seem to be checking the 'ret' value here. What if parent_type is not yet set in the inode ctx? (See the sketch after this code excerpt.)
                }
        }

...
...
        if (!loc->name) {
                if (gf_uuid_is_null (loc->inode->gfid)) {
                        subvolume = FIRST_CHILD (this);
                        local->subvolume = subvolume;
                        wind = _gf_true;
                        goto out;
                } else {
                        if (inode_type >= 0)
                                subvolume = svc_get_subvolume (this,
                                                               inode_type);
                        else
                                subvolume = FIRST_CHILD (this);
                        local->subvolume = subvolume;
                        wind = _gf_true;
                        goto out;
                }
        }

        if (strcmp (loc->name, priv->path)) {
                if (parent_type == NORMAL_INODE) {
                        subvolume = FIRST_CHILD (this);
                        local->subvolume = subvolume;
                } else {
>>> Do we need to check if parent_type is actually set to VIRTUAL_INODE?

                        subvolume = SECOND_CHILD (this);
                        local->subvolume = subvolume;
                }
        } else {
                subvolume = SECOND_CHILD (this);
                local->subvolume = subvolume;
                if (parent_type == NORMAL_INODE) {
                        /* Indication of whether the lookup is happening on the
                           entry point or not, to the snapview-server.
                        */
                        SVC_ENTRY_POINT_SET (this, xdata, op_ret, op_errno,
                                             new_xdata, priv, ret, out);
                }
        }
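
To illustrate the two checks flagged above with ">>>", here is a minimal sketch
of how gf_svc_lookup could handle an unset parent_type, assuming
NORMAL_INODE/VIRTUAL_INODE are the only valid ctx types and that
svc_inode_ctx_get returns a negative value when the ctx was never set; this is
only an illustration of the idea, not a tested patch:

        int ret         = -1;
        int parent_type = -1;   /* stays -1 if the inode ctx was never set */

        if (!__is_root_gfid (loc->gfid)) {
                if (loc->parent) {
                        parent = inode_ref (loc->parent);
                        ret = svc_inode_ctx_get (this, loc->parent,
                                                 &parent_type);
                } else {
                        parent = inode_parent (loc->inode, loc->pargfid, NULL);
                        if (parent)
                                ret = svc_inode_ctx_get (this, parent,
                                                         &parent_type);
                }

                if (ret < 0) {
                        /* The parent's type was never set in the inode ctx.
                           Default to the normal graph instead of silently
                           treating the parent as a virtual inode. */
                        parent_type = NORMAL_INODE;
                }
        }

        ...

        if (strcmp (loc->name, priv->path)) {
                if (parent_type == VIRTUAL_INODE)
                        subvolume = SECOND_CHILD (this);  /* snapview graph */
                else
                        subvolume = FIRST_CHILD (this);   /* regular graph */
                local->subvolume = subvolume;
        }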

--- Additional comment from Soumya Koduri on 2017-08-10 05:03:37 EDT ---

From the nfs.log provided in comment#14 -
http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1478716/Aug09_Debug/


Around the time when the nfs xlator logged the ESTALE errors:

[2017-08-09 14:32:22.109572] D [MSGID: 0]
[client-rpc-fops.c:2948:client3_3_lookup_cbk] 0-stack-trace: stack-address:
0x7f88d007dd00, butcher-snapd-client returned -1 error: Stale file handle
[Stale file handle]
[2017-08-09 14:32:22.109605] D [snapview-client.c:289:gf_svc_lookup_cbk]
0-butcher-snapview-client: Lookup failed on snapview graph with error Stale
file handle

>>>> If the lookup had been sent to the regular graph first, we should see this error message recorded as well: "Lookup failed on normal graph with error ...".

>>>> But that does not seem to be the case (see the sketch after the log excerpt below).


[2017-08-09 14:32:22.109634] D [MSGID: 0]
[snapview-client.c:329:gf_svc_lookup_cbk] 0-stack-trace: stack-address:
0x7f88d007dd00, butcher-snapview-client returned -1 error: Stale file handle
[Stale file handle]
[2017-08-09 14:32:22.109664] D [MSGID: 0] [io-stats.c:2191:io_stats_lookup_cbk]
0-stack-trace: stack-address: 0x7f88d007dd00, butcher returned -1 error: Stale
file handle [Stale file handle]
[2017-08-09 14:32:22.109685] E [MSGID: 112198]
[nfs3-helpers.c:3691:nfs3_fh_resolve_inode_lookup_cbk] 0-nfs-nfsv3: Lookup
failed: /d4/dir1: Stale file handle [Stale file handle]
[2017-08-09 14:32:22.109719] E [MSGID: 112069] [nfs3.c:1692:nfs3_access_resume]
0-nfs-nfsv3: Stale file handle: (192.168.79.149:992) butcher :
8694b0d6-8fba-40e3-83b9-c45d0f1ca515
[2017-08-09 14:32:22.109747] W [MSGID: 112199]
[nfs3-helpers.c:3414:nfs3_log_common_res] 0-nfs-nfsv3: /d4/dir1 => (XID: 6be86,
ACCESS: NFS: 70(Invalid file handle), POSIX: 14(Bad address))
[2017-08-09 14:32:22.110010] D [MSGID: 0]
[nfs3-helpers.c:1646:nfs3_log_common_call] 0-nfs-nfsv3: XID: 106be86, ACCESS:
args: FH: exportid 1ff111d1-32be-4590-9b9b-b1b756e29ba9, gfid
8694b0d6-8fba-40e3-83b9-c45d0f1ca515, mountid
e4706d5b-0000-0000-0000-000000000000
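
As referenced above, a simplified, hypothetical sketch of the graph selection
that the two debug messages imply in gf_svc_lookup_cbk (a paraphrase based only
on the messages quoted in this log, not a verbatim copy of snapview-client.c):
the "normal graph" message is expected when the lookup was first wound to
FIRST_CHILD, so seeing only the "snapview graph" message suggests the lookup was
wound directly to SECOND_CHILD.

        /* Hypothetical, simplified sketch -- not the actual callback body. */
        if (op_ret < 0) {
                if (local->subvolume == FIRST_CHILD (this)) {
                        /* Lookup was first tried on the regular graph. */
                        gf_log (this->name, GF_LOG_DEBUG,
                                "Lookup failed on normal graph with error %s",
                                strerror (op_errno));
                        /* ... may then be retried on the snapview graph ... */
                } else {
                        /* Lookup was wound to the snapview (snapd) graph. */
                        gf_log (this->name, GF_LOG_DEBUG,
                                "Lookup failed on snapview graph with error %s",
                                strerror (op_errno));
                }
        }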


