[Gluster-devel] Any review is appreciated. Reason about gluster server_connection_cleanup uncleanly, file flocks leaks in frequently network disconnection

Wed Sep 17 09:15:13 UTC 2014

Hi all,

By several days tracking, we finally pinpointed the reason of glusterfs
uncleanly
detach file flocks in frequently network disconnection. We are now working
on
a patch to submit. And here is this issue details. Any suggestions will be
appreciated!

First of all, as I mentioned in
http://supercolony.gluster.org/pipermail/gluster-devel/2014-September/042233.html
This issue happens in a frequently network disconnection.

According to the sources, the server cleanup jobs is in
server_connection_cleanup.
When the RPCSVC_EVENT_DISCONNECT happens, it will come here:

int
server_rpc_notify ()
{
......
        case RPCSVC_EVENT_DISCONNECT:
......
                if (!conf->lk_heal) {
                        server_conn_ref (conn);
                        server_connection_put (this, conn, &detached);
                        if (detached)
                                server_connection_cleanup (this, conn,
                                                           INTERNAL_LOCKS |
                                                           POSIX_LOCKS);
                        server_conn_unref (conn);
......
}

The server_connection_cleanup() will be called while variable 'detached' is
true.
And the 'detached' is set by server_connection_put():
server_connection_t*
server_connection_put (xlator_t *this, server_connection_t *conn,
                       gf_boolean_t *detached)
{
        server_conf_t       *conf = NULL;
        gf_boolean_t        unref = _gf_false;

        if (detached)
                *detached = _gf_false;
        conf = this->private;
        pthread_mutex_lock (&conf->mutex);
        {
                conn->bind_ref--;
                if (!conn->bind_ref) {
                        list_del_init (&conn->list);
                        unref = _gf_true;
                }
        }
        pthread_mutex_unlock (&conf->mutex);
        if (unref) {
                gf_log (this->name, GF_LOG_INFO, "Shutting down connection
%s",
                        conn->id);
                if (detached)
                        *detached = _gf_true;
                server_conn_unref (conn);
                conn = NULL;
        }
        return conn;
}

The 'detached' is only set _gf_true when 'conn->bind_ref' decrease to 0.
This 'conn->bind_ref' is set in server_connection_get(), increase or set to
1.

server_connection_t *
server_connection_get (xlator_t *this, const char *id)
{
......
                list_for_each_entry (trav, &conf->conns, list) {
                        if (!strcmp (trav->id, id)) {
                                conn = trav;
                                conn->bind_ref++;
                                goto unlock;
                        }
                }
......
}

When the connection id is same, then the 'conn->bind_ref' will be increased.
Therefore, the problem should be a reference mismatch increase or decrease.
Then
we add some logs to verify our guess.

// 1st connection comes in. and there is no id
'host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0'
in the connection table. The 'conn->bind_ref' is set to 1.
[2014-09-17 04:42:28.950693] D [server-helpers.c:712:server_connection_get]
0-vs_vol_rep2-server: server connection id:
host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0,
conn->bind_ref:1, found:0
[2014-09-17 04:42:28.950717] D [server-handshake.c:430:server_setvolume]
0-vs_vol_rep2-server: Connected to
host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0
[2014-09-17 04:42:28.950758] I [server-handshake.c:567:server_setvolume]
0-vs_vol_rep2-server: accepted client from
host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0
(version: 3.4.5) (peer: host-000c29e93d20:1015)
......
// Keep running several minutes.......
......
// Network disconnected here. The TCP socket of client side is disconnected
by
time-out, by the server-side socket still keep connected. AT THIS MOMENT,
network restore. Client side reconnect a new TCP connection JUST BEFORE the
last socket on server-side is reset. Note that at this point, there is 2
valid
sockets on server side. The later new connection use the same conn id
'host-000
c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0' look up
in the
connection table and increase the 'conn->bind_ref' to 2.

[2014-09-17 04:46:16.135066] D [server-helpers.c:712:server_connection_get]
0-vs_vol_rep2-server: server connection id:
host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0,
conn->bind_ref:2, found:1 // HERE IT IS, ref increase to 2!!!
[2014-09-17 04:46:16.135113] D [server-handshake.c:430:server_setvolume]
0-vs_vol_rep2-server: Connected to
host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0
[2014-09-17 04:46:16.135157] I [server-handshake.c:567:server_setvolume]
0-vs_vol_rep2-server: accepted client from
host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0
(version: 3.4.5) (peer: host-000c29e93d20:1018)

// After 13 seconds, the old connection is reset, decrease the
'conn->bind_ref' to 1.

[2014-09-17 04:46:28.688780] W [socket.c:2121:__socket_proto_state_machine]
0-tcp.vs_vol_rep2-server: ret = -1, error: Connection reset by peer, peer
(host-000c29e93d20:1015)
[2014-09-17 04:46:28.688790] I [socket.c:2274:socket_event_handler]
0-transport: socket_event_poll_in failed, ret=-1.
[2014-09-17 04:46:28.688797] D [socket.c:2281:socket_event_handler]
0-transport: disconnecting now
[2014-09-17 04:46:28.688831] I [server.c:762:server_rpc_notify]
0-vs_vol_rep2-server: disconnecting connectionfrom
host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0(host-000c29e93d20:1015)
[2014-09-17 04:46:28.688861] D [server-helpers.c:744:server_connection_put]
0-vs_vol_rep2-server: conn->bind_ref:1

In our production environment, there is some flocks in the 1st connection.
According to the logs, there is no way to cleanup the flocks in the 1st
connection.
And the 2nd new connection, the client-side can't flock again.

Therefore, we think the major reason is different connections using the
same conn id.
The conn id is assembled in client_setvolume()

ret = gf_asprintf (&process_uuid_xl, "%s-%s-%d",
                           this->ctx->process_uuid, this->name,
                           this->graph->id);

The conn id contains 3 parts:
this->ctx->process_uuid: hostname + pid + startup timestamp
this->name: tranlator name
this->graph->id: graph id

It is apparently that the conn id is same unless the client side restart.
So when
network disconnects, there is some chance that socket on client side
timeout and
the one on server side is alive. At this moment, network restore, client
reconnect
before server old socket reset, that will cause the file flocks of old
connection
unclean.

That is our total analysis of this flock leak issue. Now we are working on
the patch.
Hope someone could review it when it is finished.

Any other comment is grateful, Thank you!
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://supercolony.gluster.org/pipermail/gluster-devel/attachments/20140917/2cb6adfc/attachment.html>