[Gluster-devel] Any review is appreciated: reasoning about unclean gluster server_connection_cleanup and file flock leaks under frequent network disconnection
Jaden Liang
jaden1q84 at gmail.com
Fri Sep 19 07:14:13 UTC 2014
Hi all,
Here is a patch for the file-flock leak on unclean disconnection in
gluster-3.4.5.

I am totally new to the gluster development workflow and am still trying to
understand how to submit this patch through Gerrit. So I want to paste the
patch here first to let the devel team know, and will submit it after I
figure out Gerrit :-).

The major modification is adding an id for each distinct TCP connection
between a given client and server pair, so that two connection sockets that
do not close at the same time can be told apart.
diff --git a/rpc/rpc-lib/src/rpc-clnt.h b/rpc/rpc-lib/src/rpc-clnt.h
index 263d5f7..718308d 100644
--- a/rpc/rpc-lib/src/rpc-clnt.h
+++ b/rpc/rpc-lib/src/rpc-clnt.h
@@ -143,6 +143,7 @@ struct rpc_clnt_connection {
struct timeval last_sent;
struct timeval last_received;
int32_t ping_started;
+ uint32_t clnt_conn_id;
};
typedef struct rpc_clnt_connection rpc_clnt_connection_t;
diff --git a/xlators/protocol/client/src/client-handshake.c b/xlators/protocol/client/src/client-handshake.c
index d2083e6..1c2fc2f 100644
--- a/xlators/protocol/client/src/client-handshake.c
+++ b/xlators/protocol/client/src/client-handshake.c
@@ -471,9 +471,10 @@ client_set_lk_version (xlator_t *this)
conf = (clnt_conf_t *) this->private;
req.lk_ver = client_get_lk_ver (conf);
- ret = gf_asprintf (&req.uid, "%s-%s-%d",
+ ret = gf_asprintf (&req.uid, "%s-%s-%d-%u",
this->ctx->process_uuid, this->name,
- this->graph->id);
+ this->graph->id,
+ (conf->rpc) ? conf->rpc->conn.clnt_conn_id : 0);
if (ret == -1)
goto err;
@@ -1549,13 +1550,22 @@ client_setvolume (xlator_t *this, struct rpc_clnt *rpc)
}
}
+ /* For different connections between the same client and server pair, we
+ * use a different clnt_conn_id to identify each one. Otherwise, there is
+ * some chance that flocks are not released after an unclean disconnection.
+ * */
+ if (conf->rpc) {
+ conf->rpc->conn.clnt_conn_id = conf->clnt_conn_id++;
+ }
+
/* With multiple graphs possible in the same process, we need a
field to bring the uniqueness. Graph-ID should be enough to get the
job done
*/
- ret = gf_asprintf (&process_uuid_xl, "%s-%s-%d",
+ ret = gf_asprintf (&process_uuid_xl, "%s-%s-%d-%u",
this->ctx->process_uuid, this->name,
- this->graph->id);
+ this->graph->id,
+ (conf->rpc) ? conf->rpc->conn.clnt_conn_id : 0);
if (-1 == ret) {
gf_log (this->name, GF_LOG_ERROR,
"asprintf failed while setting process_uuid");
diff --git a/xlators/protocol/client/src/client.c b/xlators/protocol/client/src/client.c
index ad95574..35fef49 100644
--- a/xlators/protocol/client/src/client.c
+++ b/xlators/protocol/client/src/client.c
@@ -2437,6 +2437,7 @@ init (xlator_t *this)
conf->lk_version = 1;
conf->grace_timer = NULL;
conf->grace_timer_needed = _gf_true;
+ conf->clnt_conn_id = 0;
ret = client_init_grace_timer (this, this->options, conf);
if (ret)
diff --git a/xlators/protocol/client/src/client.h b/xlators/protocol/client/src/client.h
index 0a27c09..dea90d1 100644
--- a/xlators/protocol/client/src/client.h
+++ b/xlators/protocol/client/src/client.h
@@ -116,6 +116,9 @@ typedef struct clnt_conf {
*/
gf_boolean_t filter_o_direct; /* if set, filter O_DIRECT
from
the flags list of
open() */
+ uint32_t clnt_conn_id; /* connection id for each connection
+ in process_uuid; starts at 0 and
+ is increased for each new connection */
} clnt_conf_t;
typedef struct _client_fd_ctx {
On Wednesday, September 17, 2014, Jaden Liang <jaden1q84 at gmail.com> wrote:
> Hi all,
>
> After several days of tracking, we finally pinpointed why glusterfs
> uncleanly detaches file flocks under frequent network disconnection. We
> are now working on a patch to submit. Here are the details of the issue.
> Any suggestions will be appreciated!
>
> First of all, as I mentioned in
>
> http://supercolony.gluster.org/pipermail/gluster-devel/2014-September/042233.html
> This issue happens under frequent network disconnection.
>
> According to the sources, the server cleanup job is done in
> server_connection_cleanup.
> When RPCSVC_EVENT_DISCONNECT happens, execution comes here:
>
> int
> server_rpc_notify ()
> {
> ......
> case RPCSVC_EVENT_DISCONNECT:
> ......
> if (!conf->lk_heal) {
> server_conn_ref (conn);
> server_connection_put (this, conn, &detached);
> if (detached)
> server_connection_cleanup (this, conn,
> INTERNAL_LOCKS |
> POSIX_LOCKS);
> server_conn_unref (conn);
> ......
> }
>
> server_connection_cleanup() is called only when the variable 'detached'
> is true, and 'detached' is set by server_connection_put():
> server_connection_t*
> server_connection_put (xlator_t *this, server_connection_t *conn,
> gf_boolean_t *detached)
> {
> server_conf_t *conf = NULL;
> gf_boolean_t unref = _gf_false;
>
> if (detached)
> *detached = _gf_false;
> conf = this->private;
> pthread_mutex_lock (&conf->mutex);
> {
> conn->bind_ref--;
> if (!conn->bind_ref) {
> list_del_init (&conn->list);
> unref = _gf_true;
> }
> }
> pthread_mutex_unlock (&conf->mutex);
> if (unref) {
> gf_log (this->name, GF_LOG_INFO, "Shutting down connection
> %s",
> conn->id);
> if (detached)
> *detached = _gf_true;
> server_conn_unref (conn);
> conn = NULL;
> }
> return conn;
> }
>
> 'detached' is only set to _gf_true when 'conn->bind_ref' decreases to 0.
> 'conn->bind_ref' is managed in server_connection_get(): it is either
> increased, or set to 1 for a new connection.
>
> server_connection_t *
> server_connection_get (xlator_t *this, const char *id)
> {
> ......
> list_for_each_entry (trav, &conf->conns, list) {
> if (!strcmp (trav->id, id)) {
> conn = trav;
> conn->bind_ref++;
> goto unlock;
> }
> }
> ......
> }
>
> When the connection id is the same, 'conn->bind_ref' is increased.
> Therefore, the problem should be a mismatched reference increase or
> decrease. We then added some logs to verify our guess.
>
> // 1st connection comes in. and there is no id
> 'host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0'
> in the connection table. The 'conn->bind_ref' is set to 1.
> [2014-09-17 04:42:28.950693] D
> [server-helpers.c:712:server_connection_get] 0-vs_vol_rep2-server: server
> connection id:
> host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0,
> conn->bind_ref:1, found:0
> [2014-09-17 04:42:28.950717] D [server-handshake.c:430:server_setvolume]
> 0-vs_vol_rep2-server: Connected to
> host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0
> [2014-09-17 04:42:28.950758] I [server-handshake.c:567:server_setvolume]
> 0-vs_vol_rep2-server: accepted client from
> host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0
> (version: 3.4.5) (peer: host-000c29e93d20:1015)
> ......
> // Keep running several minutes.......
> ......
> // Network disconnected here. The client-side TCP socket is disconnected
> by time-out, but the server-side socket stays connected. AT THIS MOMENT,
> the network restores. The client side opens a new TCP connection JUST
> BEFORE the stale socket on the server side is reset. Note that at this
> point there are 2 valid sockets on the server side. The new connection
> looks up the same conn id
> 'host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0'
> in the connection table and increases 'conn->bind_ref' to 2.
>
> [2014-09-17 04:46:16.135066] D
> [server-helpers.c:712:server_connection_get] 0-vs_vol_rep2-server: server
> connection id:
> host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0,
> conn->bind_ref:2, found:1 // HERE IT IS, ref increase to 2!!!
> [2014-09-17 04:46:16.135113] D [server-handshake.c:430:server_setvolume]
> 0-vs_vol_rep2-server: Connected to
> host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0
> [2014-09-17 04:46:16.135157] I [server-handshake.c:567:server_setvolume]
> 0-vs_vol_rep2-server: accepted client from
> host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0
> (version: 3.4.5) (peer: host-000c29e93d20:1018)
>
> // After 13 seconds, the old connection is reset, decreasing
> 'conn->bind_ref' to 1.
>
> [2014-09-17 04:46:28.688780] W
> [socket.c:2121:__socket_proto_state_machine] 0-tcp.vs_vol_rep2-server: ret
> = -1, error: Connection reset by peer, peer (host-000c29e93d20:1015)
> [2014-09-17 04:46:28.688790] I [socket.c:2274:socket_event_handler]
> 0-transport: socket_event_poll_in failed, ret=-1.
> [2014-09-17 04:46:28.688797] D [socket.c:2281:socket_event_handler]
> 0-transport: disconnecting now
> [2014-09-17 04:46:28.688831] I [server.c:762:server_rpc_notify]
> 0-vs_vol_rep2-server: disconnecting connectionfrom
> host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0(host-000c29e93d20:1015)
> [2014-09-17 04:46:28.688861] D
> [server-helpers.c:744:server_connection_put] 0-vs_vol_rep2-server:
> conn->bind_ref:1
>
> In our production environment, there are some flocks held by the 1st
> connection. According to the logs, there is no way to clean up the flocks
> of the 1st connection, and on the 2nd, new connection the client side
> cannot take the flocks again.
>
> Therefore, we think the root cause is that different connections use the
> same conn id. The conn id is assembled in client_setvolume():
>
> ret = gf_asprintf (&process_uuid_xl, "%s-%s-%d",
> this->ctx->process_uuid, this->name,
> this->graph->id);
>
> The conn id contains 3 parts:
> this->ctx->process_uuid: hostname + pid + startup timestamp
> this->name: translator name
> this->graph->id: graph id
>
> Apparently the conn id stays the same unless the client side restarts. So
> when the network disconnects, there is some chance that the client-side
> socket times out while the server-side one stays alive. If the network
> then restores and the client reconnects before the server's old socket is
> reset, the file flocks of the old connection are left uncleaned.
>
> That is our full analysis of this flock leak issue. We are now working on
> the patch, and hope someone can review it when it is finished.
>
> Any other comments are welcome. Thank you!
>
>