[Bugs] [Bug 1394219] Better logging when reporting failures of the kind "< file-path> Failing MKNOD as quorum is not met"

bugzilla at redhat.com bugzilla at redhat.com
Wed Jan 11 08:40:16 UTC 2017


https://bugzilla.redhat.com/show_bug.cgi?id=1394219

nchilaka <nchilaka at redhat.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ON_QA                       |VERIFIED



--- Comment #7 from nchilaka <nchilaka at redhat.com> ---
qa_validation:
This is what i see when quorum is lost and the fuse client discovers
(confirmed with dev on the expected o/p, which is as below hence moving to
verified

BEFORE:

[2017-01-10 12:17:28.684316] W [MSGID: 108001]
[afr-transaction.c:804:afr_handle_quorum] 0-rep3-replicate-0:
/dir1/linux-4.9.2/Documentation/devicetree/bindings/display/arm,pl11x.txt:
Failing SETATTR as quorum is not met

AFTER:

[2017-01-10 14:21:07.187875] W [MSGID: 108001]
[afr-transaction.c:806:afr_handle_quorum] 0-rep3-replicate-0:
0621b9e8-f9c2-435e-b679-267dfba99ef2: Failing WRITE as quorum is not met
[Transport endpoint is not connected]


test version:3.8.4-11




===>test version:glusterfs-libs-3.8.4-3.el7rhgs.x86_64

[2017-01-11 06:21:14.112038] E [rpc-clnt.c:365:saved_frames_unwind] (-->
/lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7f381bb905b2] (-->
/lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f381b95675e] (-->
/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f381b95686e] (-->
/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x84)[0x7f381b957fc4] (-->
/lib64/libgfrpc.so.0(rpc_clnt_notify+0x120)[0x7f381b9588a0] )))))
0-rep3-client-0: forced unwinding frame type(GlusterFS 3.3) op(WRITE(13))
called at 2017-01-11 06:21:14.102905 (xid=0x2b87ea)
[2017-01-11 06:21:14.116366] W [MSGID: 114031]
[client-rpc-fops.c:855:client3_3_writev_cbk] 0-rep3-client-0: remote operation
failed [Transport endpoint is not connected]
[2017-01-11 06:21:14.119704] I [socket.c:3401:socket_submit_request]
0-rep3-client-0: not connected (priv->connected = 0)
[2017-01-11 06:21:14.119751] W [rpc-clnt.c:1640:rpc_clnt_submit]
0-rep3-client-0: failed to submit rpc-request (XID: 0x2b87eb Program: GlusterFS
3.3, ProgVers: 330, Proc: 16) to rpc-transport (rep3-client-0)
[2017-01-11 06:21:14.119770] W [MSGID: 114031]
[client-rpc-fops.c:974:client3_3_fsync_cbk] 0-rep3-client-0: remote operation
failed [Transport endpoint is not connected]
[2017-01-11 06:21:14.119794] W [MSGID: 108035]
[afr-transaction.c:2150:afr_changelog_fsync_cbk] 0-rep3-replicate-0:
fsync(83c34e00-0e26-4089-ba4a-1f54bcb14213) failed on subvolume rep3-client-0.
Transaction was WRITE [Transport endpoint is not connected]
[2017-01-11 06:21:14.119948] I [MSGID: 114018]
[client.c:2280:client_rpc_notify] 0-rep3-client-0: disconnected from
rep3-client-0. Client process will keep trying to connect to glusterd until
brick's port is available
[2017-01-11 06:21:14.120191] E [MSGID: 114058]
[client-handshake.c:1533:client_query_portmap_cbk] 0-rep3-client-0: failed to
get the port number for remote subvolume. Please run 'gluster volume status' on
server to see if brick process is running.
[2017-01-11 06:21:14.409300] W [MSGID: 114031]
[client-rpc-fops.c:1830:client3_3_fxattrop_cbk] 0-rep3-client-0: remote
operation failed
[2017-01-11 06:21:14.410305] E [MSGID: 114031]
[client-rpc-fops.c:1601:client3_3_finodelk_cbk] 0-rep3-client-0: remote
operation failed [Transport endpoint is not connected]
[2017-01-11 06:21:24.704657] I [MSGID: 114018]
[client.c:2280:client_rpc_notify] 0-rep3-client-0: disconnected from
rep3-client-0. Client process will keep trying to connect to glusterd until
brick's port is available


=====================>
test version:glusterfs-api-3.8.4-11.el7rhgs.x86_64




[2017-01-11 06:51:28.108980] W [socket.c:590:__socket_rwv] 0-rep3-client-1:
readv on 10.70.47.105:49154 failed (No data available)
[2017-01-11 06:51:28.111944] E [rpc-clnt.c:365:saved_frames_unwind] (-->
/lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7ff70a0b2602] (-->
/lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7ff709e7876e] (-->
/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7ff709e7887e] (-->
/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x84)[0x7ff709e79fe4] (-->
/lib64/libgfrpc.so.0(rpc_clnt_notify+0x94)[0x7ff709e7a874] )))))
0-rep3-client-1: forced unwinding frame type(GlusterFS 3.3) op(WRITE(13))
called at 2017-01-11 06:51:28.108777 (xid=0xd20c8)
[2017-01-11 06:51:28.111991] W [MSGID: 114031]
[client-rpc-fops.c:855:client3_3_writev_cbk] 0-rep3-client-1: remote operation
failed [Transport endpoint is not connected]
[2017-01-11 06:51:28.112031] I [MSGID: 114018]
[client.c:2280:client_rpc_notify] 0-rep3-client-1: disconnected from
rep3-client-1. Client process will keep trying to connect to glusterd until
brick's port is available
[2017-01-11 06:51:28.112217] W [MSGID: 114061]
[client-common.c:460:client_pre_fsync] 0-rep3-client-1: 
(dce618ba-01fa-453d-b8ed-67af2f77d1aa) remote_fd is -1. EBADFD [File descriptor
in bad state]
[2017-01-11 06:51:28.112267] W [MSGID: 108035]
[afr-transaction.c:2221:afr_changelog_fsync_cbk] 0-rep3-replicate-0:
fsync(dce618ba-01fa-453d-b8ed-67af2f77d1aa) failed on subvolume rep3-client-1.
Transaction was WRITE [File descriptor in bad state]
[2017-01-11 06:51:33.425876] W [MSGID: 114031]
[client-rpc-fops.c:1830:client3_3_fxattrop_cbk] 0-rep3-client-1: remote
operation failed
[2017-01-11 06:51:33.430676] E [MSGID: 114031]
[client-rpc-fops.c:1601:client3_3_finodelk_cbk] 0-rep3-client-1: remote
operation failed [Transport endpoint is not connected]
[2017-01-11 06:51:38.485138] E [MSGID: 114058]
[client-handshake.c:1533:client_query_portmap_cbk] 0-rep3-client-1: failed to
get the port number for remote subvolume. Please run 'gluster volume status' on
server to see if brick process is running.
[2017-01-11 06:51:38.485228] I [MSGID: 114018]
[client.c:2280:client_rpc_notify] 0-rep3-client-1: disconnected from
rep3-client-1. Client process will keep trying to connect to glusterd until
brick's port is available





steps:
1)create a 3 node cluster
2)create a 1x3 volume
3)enable server quorum by setting cluster.server-quorum-type to server
4) mount vol on a client a do a dd of 5 gb
5)while file write is happening bring down glusterd of n2,n3
the above logs will be observed on fuse

and below is the difference before and after fix


BEFORE:

[2017-01-10 12:17:28.684316] W [MSGID: 108001]
[afr-transaction.c:804:afr_handle_quorum] 0-rep3-replicate-0:
/dir1/linux-4.9.2/Documentation/devicetree/bindings/display/arm,pl11x.txt:
Failing SETATTR as quorum is not met

AFTER:

[2017-01-10 14:21:07.187875] W [MSGID: 108001]
[afr-transaction.c:806:afr_handle_quorum] 0-rep3-replicate-0:
0621b9e8-f9c2-435e-b679-267dfba99ef2: Failing WRITE as quorum is not met
[Transport endpoint is not connected]

-- 
You are receiving this mail because:
You are on the CC list for the bug.
Unsubscribe from this bug https://bugzilla.redhat.com/token.cgi?t=RZzsuUGiyA&a=cc_unsubscribe


More information about the Bugs mailing list