[Gluster-devel] Any review is appreciated: reasoning about unclean gluster server_connection_cleanup and file flock leaks under frequent network disconnection
Jaden Liang
jaden1q84 at gmail.com
Fri Sep 19 07:14:13 UTC 2014
Hi all,
Here is a patch for the file-flock leak on unclean disconnection in
gluster-3.4.5.

I am totally new to the gluster development workflow and am still trying to
understand how to submit this patch through Gerrit. So I want to paste the
patch here first to let the devel team know, and will submit it after I
figure out Gerrit :-).

The major modification is adding an id for each distinct TCP connection
between a given client and server pair, so that two connection sockets that
do not close at the same time can be told apart.
diff --git a/rpc/rpc-lib/src/rpc-clnt.h b/rpc/rpc-lib/src/rpc-clnt.h
index 263d5f7..718308d 100644
--- a/rpc/rpc-lib/src/rpc-clnt.h
+++ b/rpc/rpc-lib/src/rpc-clnt.h
@@ -143,6 +143,7 @@ struct rpc_clnt_connection {
struct timeval last_sent;
struct timeval last_received;
int32_t ping_started;
+ uint32_t clnt_conn_id;
};
typedef struct rpc_clnt_connection rpc_clnt_connection_t;
diff --git a/xlators/protocol/client/src/client-handshake.c b/xlators/protocol/client/src/client-handshake.c
index d2083e6..1c2fc2f 100644
--- a/xlators/protocol/client/src/client-handshake.c
+++ b/xlators/protocol/client/src/client-handshake.c
@@ -471,9 +471,10 @@ client_set_lk_version (xlator_t *this)
conf = (clnt_conf_t *) this->private;
req.lk_ver = client_get_lk_ver (conf);
- ret = gf_asprintf (&req.uid, "%s-%s-%d",
+ ret = gf_asprintf (&req.uid, "%s-%s-%d-%u",
this->ctx->process_uuid, this->name,
- this->graph->id);
+ this->graph->id,
+ (conf->rpc) ? conf->rpc->conn.clnt_conn_id : 0);
if (ret == -1)
goto err;
@@ -1549,13 +1550,22 @@ client_setvolume (xlator_t *this, struct rpc_clnt *rpc)
}
}
+ /* For different connections between the same client and server pair, we
+ * use a different clnt_conn_id to identify each one. Otherwise, there is
+ * some chance that flocks are not released after an unclean disconnection.
+ * */
+ if (conf->rpc) {
+ conf->rpc->conn.clnt_conn_id = conf->clnt_conn_id++;
+ }
+
/* With multiple graphs possible in the same process, we need a
field to bring the uniqueness. Graph-ID should be enough to get the
job done
*/
- ret = gf_asprintf (&process_uuid_xl, "%s-%s-%d",
+ ret = gf_asprintf (&process_uuid_xl, "%s-%s-%d-%u",
this->ctx->process_uuid, this->name,
- this->graph->id);
+ this->graph->id,
+ (conf->rpc) ? conf->rpc->conn.clnt_conn_id : 0);
if (-1 == ret) {
gf_log (this->name, GF_LOG_ERROR,
"asprintf failed while setting process_uuid");
diff --git a/xlators/protocol/client/src/client.c b/xlators/protocol/client/src/client.c
index ad95574..35fef49 100644
--- a/xlators/protocol/client/src/client.c
+++ b/xlators/protocol/client/src/client.c
@@ -2437,6 +2437,7 @@ init (xlator_t *this)
conf->lk_version = 1;
conf->grace_timer = NULL;
conf->grace_timer_needed = _gf_true;
+ conf->clnt_conn_id = 0;
ret = client_init_grace_timer (this, this->options, conf);
if (ret)
diff --git a/xlators/protocol/client/src/client.h b/xlators/protocol/client/src/client.h
index 0a27c09..dea90d1 100644
--- a/xlators/protocol/client/src/client.h
+++ b/xlators/protocol/client/src/client.h
@@ -116,6 +116,9 @@ typedef struct clnt_conf {
*/
gf_boolean_t filter_o_direct; /* if set, filter O_DIRECT
from
the flags list of
open() */
+ uint32_t clnt_conn_id; /* connection id for each connection
+ in process_uuid; starts at 0 and
+ is increased for each new connection */
} clnt_conf_t;
typedef struct _client_fd_ctx {
On Wednesday, September 17, 2014, Jaden Liang <jaden1q84 at gmail.com> wrote:
> Hi all,
>
> After several days of tracking, we finally pinpointed why glusterfs
> uncleanly detaches file flocks under frequent network disconnection. We
> are now working on a patch to submit. Here are the details of the issue.
> Any suggestions will be appreciated!
>
> First of all, as I mentioned in
>
> http://supercolony.gluster.org/pipermail/gluster-devel/2014-September/042233.html
> This issue happens under frequent network disconnection.
>
> According to the sources, the server cleanup job is done in
> server_connection_cleanup.
> When RPCSVC_EVENT_DISCONNECT happens, execution comes here:
>
> int
> server_rpc_notify ()
> {
> ......
> case RPCSVC_EVENT_DISCONNECT:
> ......
> if (!conf->lk_heal) {
> server_conn_ref (conn);
> server_connection_put (this, conn, &detached);
> if (detached)
> server_connection_cleanup (this, conn,
> INTERNAL_LOCKS |
> POSIX_LOCKS);
> server_conn_unref (conn);
> ......
> }
>
> server_connection_cleanup() is called only when the variable 'detached'
> is true, and 'detached' is set by server_connection_put():
> server_connection_t*
> server_connection_put (xlator_t *this, server_connection_t *conn,
> gf_boolean_t *detached)
> {
> server_conf_t *conf = NULL;
> gf_boolean_t unref = _gf_false;
>
> if (detached)
> *detached = _gf_false;
> conf = this->private;
> pthread_mutex_lock (&conf->mutex);
> {
> conn->bind_ref--;
> if (!conn->bind_ref) {
> list_del_init (&conn->list);
> unref = _gf_true;
> }
> }
> pthread_mutex_unlock (&conf->mutex);
> if (unref) {
> gf_log (this->name, GF_LOG_INFO, "Shutting down connection
> %s",
> conn->id);
> if (detached)
> *detached = _gf_true;
> server_conn_unref (conn);
> conn = NULL;
> }
> return conn;
> }
>
> 'detached' is only set to _gf_true when 'conn->bind_ref' decreases to 0.
> 'conn->bind_ref' is managed in server_connection_get(): it is either
> increased, or set to 1 for a new connection.
>
> server_connection_t *
> server_connection_get (xlator_t *this, const char *id)
> {
> ......
> list_for_each_entry (trav, &conf->conns, list) {
> if (!strcmp (trav->id, id)) {
> conn = trav;
> conn->bind_ref++;
> goto unlock;
> }
> }
> ......
> }
>
> When the connection id is the same, 'conn->bind_ref' is increased.
> Therefore, the problem should be a mismatched reference increase or
> decrease. We then added some logs to verify our guess.
>
> // 1st connection comes in. and there is no id
> 'host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0'
> in the connection table. The 'conn->bind_ref' is set to 1.
> [2014-09-17 04:42:28.950693] D
> [server-helpers.c:712:server_connection_get] 0-vs_vol_rep2-server: server
> connection id:
> host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0,
> conn->bind_ref:1, found:0
> [2014-09-17 04:42:28.950717] D [server-handshake.c:430:server_setvolume]
> 0-vs_vol_rep2-server: Connected to
> host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0
> [2014-09-17 04:42:28.950758] I [server-handshake.c:567:server_setvolume]
> 0-vs_vol_rep2-server: accepted client from
> host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0
> (version: 3.4.5) (peer: host-000c29e93d20:1015)
> ......
> // Keep running several minutes.......
> ......
> // Network disconnected here. The client-side TCP socket is disconnected
> by time-out, but the server-side socket stays connected. AT THIS MOMENT,
> the network restores. The client side opens a new TCP connection JUST
> BEFORE the stale socket on the server side is reset. Note that at this
> point there are 2 valid sockets on the server side. The new connection
> looks up the same conn id
> 'host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0'
> in the connection table and increases 'conn->bind_ref' to 2.
>
> [2014-09-17 04:46:16.135066] D
> [server-helpers.c:712:server_connection_get] 0-vs_vol_rep2-server: server
> connection id:
> host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0,
> conn->bind_ref:2, found:1 // HERE IT IS, ref increase to 2!!!
> [2014-09-17 04:46:16.135113] D [server-handshake.c:430:server_setvolume]
> 0-vs_vol_rep2-server: Connected to
> host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0
> [2014-09-17 04:46:16.135157] I [server-handshake.c:567:server_setvolume]
> 0-vs_vol_rep2-server: accepted client from
> host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0
> (version: 3.4.5) (peer: host-000c29e93d20:1018)
>
> // After 13 seconds, the old connection is reset, decreasing
> 'conn->bind_ref' to 1.
>
> [2014-09-17 04:46:28.688780] W
> [socket.c:2121:__socket_proto_state_machine] 0-tcp.vs_vol_rep2-server: ret
> = -1, error: Connection reset by peer, peer (host-000c29e93d20:1015)
> [2014-09-17 04:46:28.688790] I [socket.c:2274:socket_event_handler]
> 0-transport: socket_event_poll_in failed, ret=-1.
> [2014-09-17 04:46:28.688797] D [socket.c:2281:socket_event_handler]
> 0-transport: disconnecting now
> [2014-09-17 04:46:28.688831] I [server.c:762:server_rpc_notify]
> 0-vs_vol_rep2-server: disconnecting connectionfrom
> host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0(host-000c29e93d20:1015)
> [2014-09-17 04:46:28.688861] D
> [server-helpers.c:744:server_connection_put] 0-vs_vol_rep2-server:
> conn->bind_ref:1
>
> In our production environment, there are some flocks held by the 1st
> connection. According to the logs, there is no way to clean up the flocks
> of the 1st connection, and on the 2nd, new connection the client side
> cannot take the flocks again.
>
> Therefore, we think the root cause is that different connections use the
> same conn id. The conn id is assembled in client_setvolume():
>
> ret = gf_asprintf (&process_uuid_xl, "%s-%s-%d",
> this->ctx->process_uuid, this->name,
> this->graph->id);
>
> The conn id contains 3 parts:
> this->ctx->process_uuid: hostname + pid + startup timestamp
> this->name: translator name
> this->graph->id: graph id
>
> Apparently the conn id stays the same unless the client side restarts. So
> when the network disconnects, there is some chance that the client-side
> socket times out while the server-side one stays alive. If the network
> then restores and the client reconnects before the server's old socket is
> reset, the file flocks of the old connection are left uncleaned.
>
> That is our full analysis of this flock leak issue. We are now working on
> the patch, and hope someone can review it when it is finished.
>
> Any other comments are welcome. Thank you!
>
>