[Bugs] [Bug 1396780] New: Make debugging EACCES errors easier to debug

bugzilla at redhat.com bugzilla at redhat.com
Sun Nov 20 05:40:42 UTC 2016


https://bugzilla.redhat.com/show_bug.cgi?id=1396780

            Bug ID: 1396780
           Summary: Make debugging EACCES errors easier to debug
           Product: GlusterFS
           Version: 3.9
         Component: posix-acl
          Keywords: Triaged
          Severity: medium
          Priority: medium
          Assignee: bugs at gluster.org
          Reporter: pkarampu at redhat.com
                CC: bugs at gluster.org, hgowtham at redhat.com
        Depends On: 1394548



+++ This bug was initially created as a clone of Bug #1394548 +++

Description of problem:
Whenever we see some error on the brick it is difficult to find which xlator is
giving this error and why. I am raising this bug to track a recent issue where
I have to deal with figuring out why posix-acl is giving EACCES. Not completely
sure if posix-acl is indeed the xlator that is giving this error either.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

--- Additional comment from Pranith Kumar K on 2016-11-13 06:44:04 EST ---

I will improve/add enough logs to make debugging such issues easier.

--- Additional comment from Worker Ant on 2016-11-13 06:45:39 EST ---

REVIEW: http://review.gluster.org/15836 (protocol/server: Print error-xlator
name) posted (#1) for review on master by Pranith Kumar Karampuri
(pkarampu at redhat.com)

--- Additional comment from Worker Ant on 2016-11-13 06:45:42 EST ---

REVIEW: http://review.gluster.org/15837 (system/posix: Log reason for EACCES)
posted (#1) for review on master by Pranith Kumar Karampuri
(pkarampu at redhat.com)

--- Additional comment from Pranith Kumar K on 2016-11-13 06:48:14 EST ---

This is how it will look-like:

Server xlator prints the xlator which gave the error:
[2016-11-13 11:37:04.240437] E [MSGID: 115050]
[server-rpc-fops.c:155:server_lookup_cbk] 0-d1-server: 20: LOOKUP /.Trash-1000
(00000000-0000-0000-0000-000000000001/.Trash-1000), error-xlator:
d1-access-control [Permission denied]

Posix-acl logs the following error:
[2016-11-13 11:37:04.240723] E [MSGID: 139001]
[posix-acl.c:245:posix_acl_log_permit_denied] 0-d1-access-control: client:
dhcp35-190.lab.eng.blr.redhat.com-20476-2016/11/13-11:37:00:218336-d1-client-0-0-0,
gfid: 00000000-0000-0000-0000-000000000001,
req(uid:1000,gid:1000,perm:1,ngrps:3),
ctx(uid:0,gid:0,in-groups:0,perm:755,updated-fop:LOOKUP, acl:-) [Permission
denied]

--- Additional comment from Worker Ant on 2016-11-15 13:50:41 EST ---

REVIEW: http://review.gluster.org/15836 (protocol/server: Print error-xlator
name) posted (#2) for review on master by Pranith Kumar Karampuri
(pkarampu at redhat.com)

--- Additional comment from Pranith Kumar K on 2016-11-15 13:51:15 EST ---

Tests done:
[2016-11-15 16:15:55.276972] W [MSGID: 115049]
[server-rpc-fops.c:61:server_statfs_cbk] 0-d1-server: 30: STATFS, client:
dhcp35-190.lab.eng.blr.redhat.com-15882-2016/11/15-16:14:37:688661-d1-client-0-0-0,
error-xlator: d1-posix [Operation not permitted]

[2016-11-15 16:19:32.702160] E [MSGID: 115050]
[server-rpc-fops.c:168:server_lookup_cbk] 0-d1-server: 33: LOOKUP /
(00000000-0000-0000-0000-000000000001), client:
dhcp35-190.lab.eng.blr.redhat.com-15882-2016/11/15-16:14:37:688661-d1-client-0-0-0,
error-xlator: d1-posix [Resource temporarily unavailable]

[2016-11-15 16:22:14.103250] E [MSGID: 115056]
[server-rpc-fops.c:512:server_mkdir_cbk] 0-d1-server: 82: MKDIR /d1
(00000000-0000-0000-0000-000000000001/d1) client:
dhcp35-190.lab.eng.blr.redhat.com-15882-2016/11/15-16:14:37:688661-d1-client-0-0-0,
error-xlator: d1-posix [Permission denied]

[2016-11-15 16:24:07.672258] I [MSGID: 115056]
[server-rpc-fops.c:467:server_rmdir_cbk] 0-d1-server: 415: RMDIR /d1
(00000000-0000-0000-0000-000000000001/d1), client:
dhcp35-190.lab.eng.blr.redhat.com-15882-2016/11/15-16:14:37:688661-d1-client-0-0-0,
error-xlator: d1-posix [Operation not permitted]

[2016-11-15 16:26:22.560947] E [MSGID: 115056]
[server-rpc-fops.c:1860:server_readdirp_cbk] 0-d1-server: 459: READDIRP 0
(00000000-0000-0000-0000-000000000001), client:
dhcp35-190.lab.eng.blr.redhat.com-15882-2016/11/15-16:14:37:688661-d1-client-0-0-0,
error-xlator: d1-posix [Transport endpoint is not connected]

[2016-11-15 16:27:22.081672] E [MSGID: 115056]
[server-rpc-fops.c:679:server_opendir_cbk] 0-d1-server: 501: OPENDIR /
(00000000-0000-0000-0000-000000000001), client:
dhcp35-190.lab.eng.blr.redhat.com-15882-2016/11/15-16:14:37:688661-d1-client-0-0-0,
error-xlator: d1-posix [Resource temporarily unavailable]

[2016-11-15 16:29:53.057811] I [MSGID: 115060]
[server-rpc-fops.c:914:server_setxattr_cbk] 0-d1-server: client:
dhcp35-190.lab.eng.blr.redhat.com-15882-2016/11/15-16:14:37:688661-d1-client-0-0-0,
error-xlator: d1-posix [Numerical result out of range]

[2016-11-15 17:28:32.951020] I [MSGID: 115061]
[server-rpc-fops.c:1024:server_rename_cbk] 0-d1-server: 867: RENAME /a
(00000000-0000-0000-0000-000000000000/a) -> /b
(00000000-0000-0000-0000-000000000000/b), client:
dhcp35-190.lab.eng.blr.redhat.com-15882-2016/11/15-16:14:37:688661-d1-client-0-0-0,
error-xlator: d1-posix [Operation not permitted]

[2016-11-15 17:31:16.111967] E [MSGID: 115062]
[server-rpc-fops.c:1066:server_unlink_cbk] 0-d1-server: 954: UNLINK /b
(00000000-0000-0000-0000-000000000001/b), client:
dhcp35-190.lab.eng.blr.redhat.com-15882-2016/11/15-16:14:37:688661-d1-client-0-0-0,
error-xlator: d1-posix [Resource temporarily unavailable]

[2016-11-15 17:32:22.888055] I [MSGID: 115062]
[server-rpc-fops.c:1111:server_symlink_cbk] 0-d1-server: 1122: SYMLINK /c
(00000000-0000-0000-0000-000000000001/c), client:
dhcp35-190.lab.eng.blr.redhat.com-15882-2016/11/15-16:14:37:688661-d1-client-0-0-0,
error-xlator:d1-posix [Permission denied]

[2016-11-15 17:33:36.232374] I [MSGID: 115062]
[server-rpc-fops.c:1158:server_link_cbk] 0-d1-server: 1224: LINK /b
(7d426df8-c507-441c-b886-0f2944e02434) ->
00000000-0000-0000-0000-000000000001/c, client:
dhcp35-190.lab.eng.blr.redhat.com-15882-2016/11/15-16:14:37:688661-d1-client-0-0-0,
error-xlator: d1-posix [Resource temporarily unavailable]

[2016-11-15 17:35:02.425027] E [MSGID: 115067]
[server-rpc-fops.c:1391:server_writev_cbk] 0-d1-server: 1275: WRITEV 0
(e844b0ce-d9fc-47de-a28a-a2c49055e469), client:
dhcp35-190.lab.eng.blr.redhat.com-15882-2016/11/15-16:14:37:688661-d1-client-0-0-0,
error-xlator: d1-posix [Resource temporarily unavailable]

[2016-11-15 17:35:02.425027] E [MSGID: 115067]
[server-rpc-fops.c:1391:server_writev_cbk] 0-d1-server: 1275: WRITEV 0
(e844b0ce-d9fc-47de-a28a-a2c49055e469), client:
dhcp35-190.lab.eng.blr.redhat.com-15882-2016/11/15-16:14:37:688661-d1-client-0-0-0,
error-xlator: d1-posix [Resource temporarily unavailable]

[2016-11-15 17:45:02.357029] I [MSGID: 115063]
[server-rpc-fops.c:1198:server_truncate_cbk] 0-d1-server: 1403: TRUNCATE /a
(e844b0ce-d9fc-47de-a28a-a2c49055e469), client:
dhcp35-190.lab.eng.blr.redhat.com-15882-2016/11/15-16:14:37:688661-d1-client-0-0-0,
error-xlator: d1-posix [Transport endpoint is not connected]

[2016-11-15 17:45:59.293328] E [MSGID: 115063]
[server-rpc-fops.c:1276:server_ftruncate_cbk] 0-d1-server: 1446: FTRUNCATE 0
(e844b0ce-d9fc-47de-a28a-a2c49055e469), client:
dhcp35-190.lab.eng.blr.redhat.com-15882-2016/11/15-16:14:37:688661-d1-client-0-0-0,
error-xlator: d1-posix [Resource temporarily unavailable]

[2016-11-15 17:47:05.029708] E [MSGID: 115065]
[server-rpc-fops.c:1315:server_flush_cbk] 0-d1-server: 1493: FLUSH 0
(e844b0ce-d9fc-47de-a28a-a2c49055e469), client:
dhcp35-190.lab.eng.blr.redhat.com-15882-2016/11/15-16:14:37:688661-d1-client-0-0-0,
error-xlator: d1-posix [Transport endpoint is not connected]

[2016-11-15 17:49:49.998398] E [MSGID: 115066]
[server-rpc-fops.c:1352:server_fsync_cbk] 0-d1-server: 1649: FSYNC 0
(8a7a8ccc-c7ad-467b-a31a-bd0624ebe184), client:
dhcp35-190.lab.eng.blr.redhat.com-15882-2016/11/15-16:14:37:688661-d1-client-0-0-0,
error-xlator: d1-posix [Too many open files]

[2016-11-15 17:52:50.151625] E [MSGID: 115068]
[server-rpc-fops.c:1442:server_readv_cbk] 0-d1-server: 1922: READV -2
(8a7a8ccc-c7ad-467b-a31a-bd0624ebe184), client:
dhcp35-190.lab.eng.blr.redhat.com-15882-2016/11/15-16:14:37:688661-d1-client-0-0-0,
error-xlator: d1-posix [Level 3 reset]

[2016-11-15 17:54:32.968047] E [MSGID: 115070]
[server-rpc-fops.c:1520:server_open_cbk] 0-d1-server: 2051: OPEN /b.txt
(4fae6044-d912-462e-9945-2118ffb8e5fb), client:
dhcp35-190.lab.eng.blr.redhat.com-15882-2016/11/15-16:14:37:688661-d1-client-0-0-0,
error-xlator: d1-posix [Transport endpoint is not connected]

[2016-11-15 18:00:54.058103] I [MSGID: 115071]
[server-rpc-fops.c:1562:server_create_cbk] 0-d1-server: 2146: CREATE /c
(00000000-0000-0000-0000-000000000001/c), client:
dhcp35-190.lab.eng.blr.redhat.com-15882-2016/11/15-16:14:37:688661-d1-client-0-0-0,
error-xlator: d1-posix [Stale file handle]

[2016-11-15 18:01:46.593861] I [MSGID: 115062]
[server-rpc-fops.c:1614:server_readlink_cbk] 0-d1-server: 2192: READLINK /b
(7d426df8-c507-441c-b886-0f2944e02434), client:
dhcp35-190.lab.eng.blr.redhat.com-15882-2016/11/15-16:14:37:688661-d1-client-0-0-0,
error-xlator: d1-posix [No such file or directory]

[2016-11-15 18:02:45.994104] I [MSGID: 115072]
[server-rpc-fops.c:1692:server_setattr_cbk] 0-d1-server: 2249: SETATTR /a
(e844b0ce-d9fc-47de-a28a-a2c49055e469), client:
dhcp35-190.lab.eng.blr.redhat.com-15882-2016/11/15-16:14:37:688661-d1-client-0-0-0,
error-xlator: d1-posix [Input/output error]

[2016-11-15 18:28:56.729535] E [MSGID: 115053]
[server-rpc-fops.c:316:server_finodelk_cbk] 0-r2-server: 82: FINODELK 0
(03da68c1-d612-4315-bce6-33be333835f3), client:
dhcp35-190.lab.eng.blr.redhat.com-28126-2016/11/15-18:27:35:326952-r2-client-0-0-0,
error-xlator: r2-locks [Cannot allocate memory]

[2016-11-15 18:30:22.018119] E [MSGID: 115053]
[server-rpc-fops.c:278:server_inodelk_cbk] 0-r2-server: 166: INODELK /a.txt
(03da68c1-d612-4315-bce6-33be333835f3), client:
dhcp35-190.lab.eng.blr.redhat.com-28126-2016/11/15-18:27:35:326952-r2-client-0-0-0,
error-xlator: r2-locks [Cannot allocate memory]

[2016-11-15 18:31:42.018708] E [MSGID: 115054]
[server-rpc-fops.c:353:server_entrylk_cbk] 0-r2-server: 228: ENTRYLK /
(00000000-0000-0000-0000-000000000001), client:
dhcp35-190.lab.eng.blr.redhat.com-28126-2016/11/15-18:27:35:326952-r2-client-0-0-0,
error-xlator: r2-locks [Cannot allocate memory]

[2016-11-15 18:33:55.268172] E [MSGID: 115073]
[server-rpc-fops.c:1771:server_xattrop_cbk] 0-r2-server: 128: XATTROP /a
(ce82cf09-8869-4762-b34c-59fa0822373b), client:
dhcp35-190.lab.eng.blr.redhat.com-28689-2016/11/15-18:32:59:379301-r2-client-0-0-0,
error-xlator: r2-posix [Transport endpoint is not connected]

[2016-11-15 18:35:26.500727] E [MSGID: 115073]
[server-rpc-fops.c:1814:server_fxattrop_cbk] 0-r2-server: 283: FXATTROP 1
(ce82cf09-8869-4762-b34c-59fa0822373b), client:
dhcp35-190.lab.eng.blr.redhat.com-28689-2016/11/15-18:32:59:379301-r2-client-0-0-0,
error-xlator: r2-posix [Too many open files in system]

[2016-11-15 18:41:05.990114] E [MSGID: 115074]
[server-rpc-fops.c:1911:server_fallocate_cbk] 0-r2-server: 639: FALLOCATE 0
(7e532b5e-9175-4b00-a360-5283bd3d4e81), client:
dhcp35-190.lab.eng.blr.redhat.com-29224-2016/11/15-18:37:58:466209-r2-client-0-0-0,
error-xlator: - [Level 2 not synchronized]

[2016-11-15 18:44:47.815369] I [MSGID: 115058]
[server-rpc-fops.c:728:server_removexattr_cbk] 0-r2-server: 1165: REMOVEXATTR
/a.txt (7e532b5e-9175-4b00-a360-5283bd3d4e81) of key trusted.abc, client:
dhcp35-190.lab.eng.blr.redhat.com-29224-2016/11/15-18:37:58:466209-r2-client-0-0-0,
error-xlator: r2-posix [Invalid request code]

[2016-11-15 18:49:33.433095] E [MSGID: 115059]
[server-rpc-fops.c:802:server_getxattr_cbk] 0-r2-server: 1628: GETXATTR /a.txt
(7e532b5e-9175-4b00-a360-5283bd3d4e81) (trusted.abc), client:
dhcp35-190.lab.eng.blr.redhat.com-29224-2016/11/15-18:37:58:466209-r2-client-1-0-0,
error-xlator: r2-posix [File exists]

--- Additional comment from Worker Ant on 2016-11-16 04:37:03 EST ---

REVIEW: http://review.gluster.org/15837 (system/posix-acl: Log reason for
EACCES) posted (#2) for review on master by Pranith Kumar Karampuri
(pkarampu at redhat.com)

--- Additional comment from Worker Ant on 2016-11-16 05:56:09 EST ---

COMMIT: http://review.gluster.org/15836 committed in master by Pranith Kumar
Karampuri (pkarampu at redhat.com) 
------
commit 83311c5bb735508274bb25cc610e929481577f9b
Author: Pranith Kumar K <pkarampu at redhat.com>
Date:   Sat Nov 12 16:47:00 2016 +0530

    protocol/server: Print error-xlator name

    Problem:
    At the moment from which xlator the errors are stemming from is a mystery.

    Fix:
    With this patch we can find on the server side which xlator first gave
    the errno received by server xlator. I am not yet sure how to get this
    done for client side which has lot of copy_frame()s. May be another
    patch.

    Change-Id: Ie13307b965facf2a496123e81ce0bd6756f98ac9
    BUG: 1394548
    Signed-off-by: Pranith Kumar K <pkarampu at redhat.com>
    Reviewed-on: http://review.gluster.org/15836
    Smoke: Gluster Build System <jenkins at build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
    Reviewed-by: Vijay Bellur <vbellur at redhat.com>
    CentOS-regression: Gluster Build System <jenkins at build.gluster.org>

--- Additional comment from Worker Ant on 2016-11-16 11:27:55 EST ---

REVIEW: http://review.gluster.org/15837 (system/posix-acl: Log reason for
EACCES) posted (#3) for review on master by Pranith Kumar Karampuri
(pkarampu at redhat.com)

--- Additional comment from Worker Ant on 2016-11-17 04:56:17 EST ---

COMMIT: http://review.gluster.org/15837 committed in master by Pranith Kumar
Karampuri (pkarampu at redhat.com) 
------
commit 35e8ecabd7db36431ea93f523095bc21078ecef8
Author: Pranith Kumar K <pkarampu at redhat.com>
Date:   Sun Nov 13 16:43:36 2016 +0530

    system/posix-acl: Log reason for EACCES

    It is becoming increasingly difficult to debug the reason why posix-acl
decides
    to fail a fop with EACCES. This patch prints a big log everytime such
    a condition occurs giving out the details that may help in finding why the
fop
    is errored out.

    Change-Id: I2505baaafb5d77ef6c187554ff027df9b20468db
    BUG: 1394548
    Signed-off-by: Pranith Kumar K <pkarampu at redhat.com>
    Reviewed-on: http://review.gluster.org/15837
    Smoke: Gluster Build System <jenkins at build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
    Reviewed-by: Raghavendra Talur <rtalur at redhat.com>

--- Additional comment from Worker Ant on 2016-11-17 10:33:13 EST ---

REVIEW: http://review.gluster.org/15872 (protocol/server: Print pargfid in logs
for rename error) posted (#1) for review on master by Pranith Kumar Karampuri
(pkarampu at redhat.com)


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1394548
[Bug 1394548] Make debugging EACCES errors easier to debug
-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.


More information about the Bugs mailing list