[Gluster-users] libgfapi failover problem on replica bricks

Paul Penev ppquant at gmail.com
Mon Apr 21 20:38:11 UTC 2014


2014-04-21 21:12 GMT+02:00 Joe Julian <joe at julianfamily.org>:
> qemu outputs the glusterfs log to stdout:
> https://github.com/qemu/qemu/blob/stable-1.7/block/gluster.c#L211

I modified that line so that it would log to a file instead (as stdout
was not available in my case).

At the end of the logs, there is a pointer for afr trying to heal an
already healed vm image (?). That is not possible because the brick
was killed.

Start KVM

[2014-04-21 20:18:09.219715] I [socket.c:3480:socket_init] 0-gfapi:
SSL support is NOT enabled
[2014-04-21 20:18:09.219758] I [socket.c:3495:socket_init] 0-gfapi:
using system polling thread
[2014-04-21 20:18:09.227508] I [socket.c:3480:socket_init]
0-gtest-client-1: SSL support is NOT enabled
[2014-04-21 20:18:09.227530] I [socket.c:3495:socket_init]
0-gtest-client-1: using system polling thread
[2014-04-21 20:18:09.228142] I [socket.c:3480:socket_init]
0-gtest-client-0: SSL support is NOT enabled
[2014-04-21 20:18:09.228155] I [socket.c:3495:socket_init]
0-gtest-client-0: using system polling thread
[2014-04-21 20:18:09.228178] I [glfs-master.c:92:notify] 0-gfapi: New
graph 7631342d-3334-3636-3131-2d323031342f (0) coming up
[2014-04-21 20:18:09.228193] I [client.c:2155:notify]
0-gtest-client-0: parent translators are ready, attempting connect on
transport
[2014-04-21 20:18:09.228646] I [client.c:2155:notify]
0-gtest-client-1: parent translators are ready, attempting connect on
transport
[2014-04-21 20:18:09.229356] I [rpc-clnt.c:1676:rpc_clnt_reconfig]
0-gtest-client-0: changing port to 49153 (from 0)
[2014-04-21 20:18:09.229416] W [socket.c:514:__socket_rwv]
0-gtest-client-0: readv failed (No data available)
[2014-04-21 20:18:09.229960] I
[client-handshake.c:1659:select_server_supported_programs]
0-gtest-client-0: Using Program GlusterFS 3.3, Num (1298437), Version
(330)
[2014-04-21 20:18:09.230060] I [rpc-clnt.c:1676:rpc_clnt_reconfig]
0-gtest-client-1: changing port to 49153 (from 0)
[2014-04-21 20:18:09.230099] W [socket.c:514:__socket_rwv]
0-gtest-client-1: readv failed (No data available)
[2014-04-21 20:18:09.230380] I
[client-handshake.c:1456:client_setvolume_cbk] 0-gtest-client-0:
Connected to 10.0.0.14:49153, attached to remote volume
'/var/gtest/brick'.
[2014-04-21 20:18:09.230395] I
[client-handshake.c:1468:client_setvolume_cbk] 0-gtest-client-0:
Server and Client lk-version numbers are not same, reopening the fds
[2014-04-21 20:18:09.230443] I [afr-common.c:3698:afr_notify]
0-gtest-replicate-0: Subvolume 'gtest-client-0' came back up; going
online.
[2014-04-21 20:18:09.230598] I
[client-handshake.c:450:client_set_lk_version_cbk] 0-gtest-client-0:
Server lk version = 1
[2014-04-21 20:18:09.230922] I
[client-handshake.c:1659:select_server_supported_programs]
0-gtest-client-1: Using Program GlusterFS 3.3, Num (1298437), Version
(330)
[2014-04-21 20:18:09.231290] I
[client-handshake.c:1456:client_setvolume_cbk] 0-gtest-client-1:
Connected to 10.0.0.15:49153, attached to remote volume
'/var/gtest/brick'.
[2014-04-21 20:18:09.231311] I
[client-handshake.c:1468:client_setvolume_cbk] 0-gtest-client-1:
Server and Client lk-version numbers are not same, reopening the fds
[2014-04-21 20:18:09.247979] I
[client-handshake.c:450:client_set_lk_version_cbk] 0-gtest-client-1:
Server lk version = 1
[2014-04-21 20:18:09.248451] I
[afr-common.c:2057:afr_set_root_inode_on_first_lookup]
0-gtest-replicate-0: added root inode
[2014-04-21 20:18:09.249045] I [afr-common.c:2120:afr_discovery_cbk]
0-gtest-replicate-0: selecting local read_child gtest-client-0
[2014-04-21 20:18:09.249125] I
[glfs-resolve.c:788:__glfs_active_subvol] 0-gtest: switched to graph
7631342d-3334-3636-3131-2d323031342f (0)


Kill glusterfsd:

[2014-04-21 20:21:35.291871] W [socket.c:514:__socket_rwv]
0-gtest-client-0: readv failed (No data available)
[2014-04-21 20:21:35.291914] W
[socket.c:1962:__socket_proto_state_machine] 0-gtest-client-0: reading
from socket failed. Error (No data available), peer (10.0.0.14:49153)
[2014-04-21 20:21:35.291972] I [client.c:2098:client_rpc_notify]
0-gtest-client-0: disconnected
[2014-04-21 20:21:46.242636] I [rpc-clnt.c:1676:rpc_clnt_reconfig]
0-gtest-client-0: changing port to 49153 (from 0)
[2014-04-21 20:21:46.242721] W [socket.c:514:__socket_rwv]
0-gtest-client-0: readv failed (No data available)
[2014-04-21 20:21:46.243047] E [socket.c:2157:socket_connect_finish]
0-gtest-client-0: connection to 10.0.0.14:49153 failed (Connection
refused)
[2014-04-21 20:21:46.243073] W [socket.c:514:__socket_rwv]
0-gtest-client-0: readv failed (No data available)
[2014-04-21 20:21:50.243299] W [socket.c:514:__socket_rwv]
0-gtest-client-0: readv failed (No data available)

On restart of gluster server:

[2014-04-21 20:23:26.266552] W [socket.c:514:__socket_rwv]
0-gtest-client-0: readv failed (No data available)
[2014-04-21 20:23:27.566449] W [socket.c:514:__socket_rwv] 0-gfapi:
readv failed (No data available)
[2014-04-21 20:23:27.566483] W
[socket.c:1962:__socket_proto_state_machine] 0-gfapi: reading from
socket failed. Error (No data available), peer (127.0.0.1:24007)
[2014-04-21 20:23:29.267245] W [socket.c:514:__socket_rwv]
0-gtest-client-0: readv failed (No data available)
[2014-04-21 20:23:32.268085] W [socket.c:514:__socket_rwv]
0-gtest-client-0: readv failed (No data available)
[2014-04-21 20:23:35.269082] I
[client-handshake.c:1659:select_server_supported_programs]
0-gtest-client-0: Using Program GlusterFS 3.3, Num (1298437), Version
(330)
[2014-04-21 20:23:35.269343] I
[client-handshake.c:1456:client_setvolume_cbk] 0-gtest-client-0:
Connected to 10.0.0.14:49153, attached to remote volume
'/var/gtest/brick'.
[2014-04-21 20:23:35.269373] I
[client-handshake.c:1468:client_setvolume_cbk] 0-gtest-client-0:
Server and Client lk-version numbers are not same, reopening the fds
[2014-04-21 20:23:35.269391] I
[client-handshake.c:1308:client_post_handshake] 0-gtest-client-0: 1
fds open - Delaying child_up until they are re-opened
[2014-04-21 20:23:35.269854] I
[client-handshake.c:930:client_child_up_reopen_done] 0-gtest-client-0:
last fd open'd/lock-self-heal'd - notifying CHILD-UP
[2014-04-21 20:23:35.269997] I
[client-handshake.c:450:client_set_lk_version_cbk] 0-gtest-client-0:
Server lk version = 1
[2014-04-21 20:23:38.269808] I [glfs-mgmt.c:429:mgmt_getspec_cbk]
0-gfapi: No change in volfile, continuing

On killing the second replica (after heal completed). Causes KVM to
throw disk errors.

[2014-04-21 20:25:51.542494] W [socket.c:514:__socket_rwv]
0-gtest-client-1: readv failed (No data available)
[2014-04-21 20:25:51.542546] W
[socket.c:1962:__socket_proto_state_machine] 0-gtest-client-1: reading
from socket failed. Error (No data available), peer (10.0.0.15:49153)
[2014-04-21 20:25:51.542591] I [client.c:2098:client_rpc_notify]
0-gtest-client-1: disconnected
[2014-04-21 20:26:02.283871] E
[client-handshake.c:1742:client_query_portmap_cbk] 0-gtest-client-1:
failed to get the port number for remote subvolume. Please run
'gluster volume status' on server to see if brick process is running.
[2014-04-21 20:26:02.283925] W [socket.c:514:__socket_rwv]
0-gtest-client-1: readv failed (No data available)
[2014-04-21 20:26:02.283945] I [client.c:2098:client_rpc_notify]
0-gtest-client-1: disconnected
[2014-04-21 20:26:05.284560] W [socket.c:514:__socket_rwv]
0-gtest-client-1: readv failed (No data available)
[2014-04-21 20:26:05.284607] I [client.c:2098:client_rpc_notify]
0-gtest-client-1: disconnected
[2014-04-21 20:26:08.285137] W [socket.c:514:__socket_rwv]
0-gtest-client-1: readv failed (No data available)
[2014-04-21 20:26:08.285185] I [client.c:2098:client_rpc_notify]
0-gtest-client-1: disconnected
[2014-04-21 20:26:11.285524] W [socket.c:514:__socket_rwv]
0-gtest-client-1: readv failed (No data available)
[2014-04-21 20:26:11.285591] I [client.c:2098:client_rpc_notify]
0-gtest-client-1: disconnected
[2014-04-21 20:26:14.286501] W [socket.c:514:__socket_rwv]
0-gtest-client-1: readv failed (No data available)
[2014-04-21 20:26:14.286552] I [client.c:2098:client_rpc_notify]
0-gtest-client-1: disconnected
[2014-04-21 20:26:17.287190] W [socket.c:514:__socket_rwv]
0-gtest-client-1: readv failed (No data available)
[2014-04-21 20:26:17.287238] I [client.c:2098:client_rpc_notify]
0-gtest-client-1: disconnected
[2014-04-21 20:26:20.287539] W [socket.c:514:__socket_rwv]
0-gtest-client-1: readv failed (No data available)
[2014-04-21 20:26:20.287589] I [client.c:2098:client_rpc_notify]
0-gtest-client-1: disconnected
[2014-04-21 20:26:22.735155] E
[afr-self-heal-common.c:2260:afr_self_heal_completion_cbk]
0-gtest-replicate-0: background  meta-data data self-heal failed on
/images/1002/vm-1002-disk-1.qcow2
[2014-04-21 20:26:23.288242] W [socket.c:514:__socket_rwv]
0-gtest-client-1: readv failed (No data available)
[2014-04-21 20:26:23.288281] I [client.c:2098:client_rpc_notify]
0-gtest-client-1: disconnected

I'm afraid that my knowledge on gluster internals is not enought for
interpreting these logs.

Paul



More information about the Gluster-users mailing list