[Bugs] [Bug 1656000] New: file access problem with encrypt xlator in case one brick is down

bugzilla at redhat.com bugzilla at redhat.com
Tue Dec 4 12:09:41 UTC 2018


https://bugzilla.redhat.com/show_bug.cgi?id=1656000

            Bug ID: 1656000
           Summary: file access problem with encrypt xlator in case one
                    brick is down
           Product: GlusterFS
           Version: 5
         Component: encryption-xlator
          Assignee: bugs at gluster.org
          Reporter: vpolakis at gmail.com
                CC: bugs at gluster.org



Description of problem:
setup: replica x3 no distribution using encrypt xlator.
if 3 bricks are up all clients are able to access all the files.
if one brick goes down, every client is able to access the files that created.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:
setup: replica x3 no distribution using encrypt xlator.
if 3 bricks are up all clients are able to access all the files.
if one brick goes down, every client is able to access the files that created.

# ls -la /client_a_v10_1/filecreatedfromclient_a_2
ls: cannot access /client_a_v10_1/filecreatedfromclient_a_2: Invalid argument

[2018-11-28 16:48:19.095228] D [crypt.c:4059:crypt_lookup] 0-oam_enc-crypt:
Lookup /
[2018-11-28 16:48:19.096445] D [MSGID: 0]
[dht-common.c:1046:dht_revalidate_cbk] 0-oam_enc-dht: revalidate lookup of /
returned with op_ret 0 [Structure needs cleaning]
[2018-11-28 16:48:19.096631] D [crypt.c:4059:crypt_lookup] 0-oam_enc-crypt:
Lookup /filecreatedfromclient_a_2
[2018-11-28 16:48:19.096663] D [MSGID: 0] [dht-common.c:2743:dht_lookup]
0-oam_enc-dht: Calling fresh lookup for /filecreatedfromclient_a_2 on
oam_enc-replicate-0
[2018-11-28 16:48:19.097397] D [MSGID: 0] [dht-common.c:2328:dht_lookup_cbk]
0-oam_enc-dht: fresh_lookup returned for /filecreatedfromclient_a_2 with op_ret
0
[2018-11-28 16:48:19.097429] D [MSGID: 0] [dht-layout.c:870:dht_layout_preset]
0-oam_enc-dht: file = 00000000-0000-0000-0000-000000000000, subvol =
oam_enc-replicate-0
[2018-11-28 16:48:19.097482] D [MSGID: 0] [afr-read-txn.c:220:afr_read_txn]
0-oam_enc-replicate-0: 00000000-0000-0000-0000-000000000000: generation now vs
cached: 2, 0
[2018-11-28 16:48:19.097992] W [MSGID: 114031]
[client-rpc-fops.c:2928:client3_3_lookup_cbk] 0-oam_enc-client-0: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid
argument]
[2018-11-28 16:48:19.098017] D [MSGID: 0]
[client-rpc-fops.c:2936:client3_3_lookup_cbk] 0-stack-trace: stack-address:
0x7ff89800f900, oam_enc-client-0 returned -1 error: Invalid argument [Invalid
argument]
[2018-11-28 16:48:19.098188] W [MSGID: 114031]
[client-rpc-fops.c:2928:client3_3_lookup_cbk] 0-oam_enc-client-1: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid
argument]
[2018-11-28 16:48:19.098206] D [MSGID: 0]
[client-rpc-fops.c:2936:client3_3_lookup_cbk] 0-stack-trace: stack-address:
0x7ff89800f900, oam_enc-client-1 returned -1 error: Invalid argument [Invalid
argument]
[2018-11-28 16:48:19.098231] D [MSGID: 0]
[afr-inode-read.c:522:afr_getxattr_wind] 0-stack-trace: stack-address:
0x7ff89800f900, oam_enc-replicate-0 returned -1 error: Invalid argument
[Invalid argument]
[2018-11-28 16:48:19.098246] D [MSGID: 0] [dht-common.c:3392:dht_getxattr_cbk]
0-stack-trace: stack-address: 0x7ff89800f900, oam_enc-dht returned -1 error:
Invalid argument [Invalid argument]
[2018-11-28 16:48:19.098266] D [MSGID: 0] [crypt.c:3835:load_file_size]
0-stack-trace: stack-address: 0x7ff89800f900, oam_enc-crypt returned -1 error:
Invalid argument [Invalid argument]
[2018-11-28 16:48:19.098279] D [MSGID: 0] [io-cache.c:267:ioc_lookup_cbk]
0-stack-trace: stack-address: 0x7ff89800f900, oam_enc-io-cache returned -1
error: Invalid argument [Invalid argument]
[2018-11-28 16:48:19.098287] D [MSGID: 0] [md-cache.c:1048:mdc_lookup_cbk]
0-stack-trace: stack-address: 0x7ff89800f900, oam_enc-md-cache returned -1
error: Invalid argument [Invalid argument]
[2018-11-28 16:48:19.098299] D [MSGID: 0] [io-stats.c:2186:io_stats_lookup_cbk]
0-stack-trace: stack-address: 0x7ff89800f900, oam_enc returned -1 error:
Invalid argument [Invalid argument]
[2018-11-28 16:48:19.098309] W [fuse-resolve.c:61:fuse_resolve_entry_cbk]
0-fuse: 00000000-0000-0000-0000-000000000001/filecreatedfromclient_a_2: failed
to resolve (Invalid argument)
[2018-11-28 16:48:19.098373] D [crypt.c:4059:crypt_lookup] 0-oam_enc-crypt:
Lookup /filecreatedfromclient_a_2
[2018-11-28 16:48:19.098389] D [MSGID: 0] [dht-common.c:2743:dht_lookup]
0-oam_enc-dht: Calling fresh lookup for /filecreatedfromclient_a_2 on
oam_enc-replicate-0
[2018-11-28 16:48:19.099080] D [MSGID: 0] [dht-common.c:2328:dht_lookup_cbk]
0-oam_enc-dht: fresh_lookup returned for /filecreatedfromclient_a_2 with op_ret
0
[2018-11-28 16:48:19.099099] D [MSGID: 0] [dht-layout.c:870:dht_layout_preset]
0-oam_enc-dht: file = 00000000-0000-0000-0000-000000000000, subvol =
oam_enc-replicate-0
[2018-11-28 16:48:19.099126] D [MSGID: 0] [afr-read-txn.c:220:afr_read_txn]
0-oam_enc-replicate-0: 00000000-0000-0000-0000-000000000000: generation now vs
cached: 2, 0
[2018-11-28 16:48:19.099690] W [MSGID: 114031]
[client-rpc-fops.c:2928:client3_3_lookup_cbk] 0-oam_enc-client-0: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid
argument]
[2018-11-28 16:48:19.099714] D [MSGID: 0]
[client-rpc-fops.c:2936:client3_3_lookup_cbk] 0-stack-trace: stack-address:
0x7ff894003450, oam_enc-client-0 returned -1 error: Invalid argument [Invalid
argument]
[2018-11-28 16:48:19.099735] W [MSGID: 114031]
[client-rpc-fops.c:2928:client3_3_lookup_cbk] 0-oam_enc-client-1: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [Invalid
argument]
[2018-11-28 16:48:19.099744] D [MSGID: 0]
[client-rpc-fops.c:2936:client3_3_lookup_cbk] 0-stack-trace: stack-address:
0x7ff894003450, oam_enc-client-1 returned -1 error: Invalid argument [Invalid
argument]
[2018-11-28 16:48:19.099780] D [MSGID: 0]
[afr-inode-read.c:522:afr_getxattr_wind] 0-stack-trace: stack-address:
0x7ff894003450, oam_enc-replicate-0 returned -1 error: Invalid argument
[Invalid argument]
[2018-11-28 16:48:19.099788] D [MSGID: 0] [dht-common.c:3392:dht_getxattr_cbk]
0-stack-trace: stack-address: 0x7ff894003450, oam_enc-dht returned -1 error:
Invalid argument [Invalid argument]
[2018-11-28 16:48:19.099797] D [MSGID: 0] [crypt.c:3835:load_file_size]
0-stack-trace: stack-address: 0x7ff894003450, oam_enc-crypt returned -1 error:
Invalid argument [Invalid argument]
[2018-11-28 16:48:19.099805] D [MSGID: 0] [io-cache.c:267:ioc_lookup_cbk]
0-stack-trace: stack-address: 0x7ff894003450, oam_enc-io-cache returned -1
error: Invalid argument [Invalid argument]
[2018-11-28 16:48:19.099812] D [MSGID: 0] [md-cache.c:1048:mdc_lookup_cbk]
0-stack-trace: stack-address: 0x7ff894003450, oam_enc-md-cache returned -1
error: Invalid argument [Invalid argument]
[2018-11-28 16:48:19.099820] D [MSGID: 0] [io-stats.c:2186:io_stats_lookup_cbk]
0-stack-trace: stack-address: 0x7ff894003450, oam_enc returned -1 error:
Invalid argument [Invalid argument]
[2018-11-28 16:48:19.099827] W [fuse-bridge.c:490:fuse_entry_cbk]
0-glusterfs-fuse: 104: LOOKUP() /filecreatedfromclient_a_2 => -1 (Invalid
argument)



# ls -la /client_a_v10_2/filecreatedfromclient_a_2
-rw-r--r--. 1 root root 0 Nov 28 17:46
/client_a_v10_2/filecreatedfromclient_a_2


[2018-11-28 16:49:55.635933] D [crypt.c:4059:crypt_lookup] 0-oam_enc-crypt:
Lookup /
[2018-11-28 16:49:55.637194] D [MSGID: 0]
[dht-common.c:1046:dht_revalidate_cbk] 0-oam_enc-dht: revalidate lookup of /
returned with op_ret 0 [Structure needs cleaning]
[2018-11-28 16:49:55.637349] D [crypt.c:4059:crypt_lookup] 0-oam_enc-crypt:
Lookup /filecreatedfromclient_a_2
[2018-11-28 16:49:55.637394] D [MSGID: 0] [dht-common.c:2668:dht_lookup]
0-oam_enc-dht: calling revalidate lookup for /filecreatedfromclient_a_2 at
oam_enc-replicate-0
[2018-11-28 16:49:55.638229] D [logging.c:1952:_gf_msg_internal]
0-logging-infra: Buffer overflow of a buffer whose size limit is 5. About to
flush least recently used log message to disk
[2018-11-28 16:49:55.638167] D [MSGID: 0]
[dht-common.c:1046:dht_revalidate_cbk] 0-oam_enc-dht: revalidate lookup of
/filecreatedfromclient_a_2 returned with op_ret 0
[2018-11-28 16:49:55.638228] D [MSGID: 0] [afr-read-txn.c:220:afr_read_txn]
0-oam_enc-replicate-0: e94dc9b7-f25b-4dbf-ba4b-ec20ec3e03b9: generation now vs
cached: 2, 0
[2018-11-28 16:49:55.639554] D [crypt.c:3802:load_file_size] 0-oam_enc-crypt:
FOP 27: Translate regular file to 0
[2018-11-28 16:49:57.246874] D [name.c:168:client_fill_address_family]
0-oam_enc-client-2: address-family not specified, marking it as unspec for
getaddrinfo to resolve from (remote-host:
host9314567f944f432ca6f842fd107-storage-node-1.storage-server.net)
[2018-11-28 16:49:57.251435] D [MSGID: 0] [common-utils.c:334:gf_resolve_ip6]
0-resolver: returning ip-172.24.26.104 (port-24007) for hostname:
host9314567f944f432ca6f842fd107-storage-node-1.storage-server.net and port:
24007
[2018-11-28 16:49:57.251463] D [socket.c:2994:socket_fix_ssl_opts]
0-oam_enc-client-2: disabling SSL for portmapper connection
[2018-11-28 16:49:57.251633] D [socket.c:686:__socket_shutdown]
0-oam_enc-client-2: shutdown() returned -1. Transport endpoint is not connected
[2018-11-28 16:49:57.251646] D [socket.c:733:__socket_disconnect]
0-oam_enc-client-2: __socket_teardown_connection () failed: Transport endpoint
is not connected
[2018-11-28 16:49:57.251651] D [socket.c:2426:socket_event_handler]
0-transport: EPOLLERR - disconnecting now
[2018-11-28 16:49:57.251660] D [MSGID: 0] [client.c:2260:client_rpc_notify]
0-oam_enc-client-2: got RPC_CLNT_DISCONNECT
[2018-11-28 16:49:57.252006] D
[rpc-clnt-ping.c:93:rpc_clnt_remove_ping_timer_locked] (-->
/lib64/libglusterfs.so.0(_gf_log_callingfn+0x13b)[0x7fcce6b7cfab] (-->
/lib64/libgfrpc.so.0(rpc_clnt_remove_ping_timer_locked+0x8b)[0x7fcce6948d0b]
(--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x5f)[0x7fcce694513f]
(--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x2a0)[0x7fcce6945c20] (-->
/lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7fcce6941a23] ))))) 0-:
172.24.26.104:24007: ping timer event already removed
[2018-11-28 16:49:57.252028] D [MSGID: 0]
[event-epoll.c:588:event_dispatch_epoll_handler] 0-epoll: generation bumped on
idx=2 from gen=292 to slot->gen=293, fd=6, slot->fd=6


>From what i understood the problem is that first lookup is triggered with
filename. Then nameless lookup is triggered with gfid and then, finaly,
getxattr for trusted.glusterfs.crypt.att.size.
In case the file is created from own client last lookup includes the gfid but
if the file is created from different client, although gfid is returned from
the first lookup from server, it is not included in the nameless lookup.
This results error from server which of course causes failure in the whole
operation. 

>From what i have seen, the problem appears to be in the crypt xlator during
client3_3_lookup/crypt_lookup_cbk.
There valid gfid i included in the buf->ia_gfid but not in inode. 

   local->inode = inode_ref(inode);
   local->buf = *buf;
   local->postbuf = *postparent;
   if (xdata)
       local->xdata = dict_ref(xdata);
   gf_uuid_copy(local->loc->gfid, buf->ia_gfid);

logs are from 3.7 but the same issue happens at version 5

-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.


More information about the Bugs mailing list