[Gluster-devel] Any review is appreciated. Reasoning about gluster server_connection_cleanup cleaning up uncleanly: file flock leaks under frequent network disconnection
Jaden Liang
jaden1q84 at gmail.com
Sat Sep 20 10:03:54 UTC 2014
Hi all,
Finally, I figured out Gerrit and submitted a bug and a patch.
I hope I have followed the correct procedure. :-)
Bug description:
https://bugzilla.redhat.com/show_bug.cgi?id=1144672
Patch review request:
http://review.gluster.org/#/c/8787/
Please review this patch when you are free.
Thanks very much!
On Friday, September 19, 2014, Vijay Bellur <vbellur at redhat.com> wrote:
> Hi Jaden,
>
> Thanks for your effort on this one.
>
> A lot of GlusterFS developers including me are busy with various things
> over the next 2 weeks. Please do expect some delays in our responses and
> reviews till then.
>
> Cheers,
> Vijay
>
> On 09/19/2014 12:44 PM, Jaden Liang wrote:
>
>> Hi all,
>>
>> Here is a patch for the file flock leak on unclean disconnect in
>> gluster-3.4.5.
>> I am totally new to the gluster development workflow and am still trying
>> to understand how to submit this patch to Gerrit. So I want to paste the
>> patch here first to let the devel team know, and submit it after I
>> figure out Gerrit. :-)
>>
>> The major modification is adding an id that distinguishes the different
>> TCP connections between a given client/server pair, so that an old
>> connection whose socket has not yet closed cannot be confused with the
>> new one.
>>
>> diff --git a/rpc/rpc-lib/src/rpc-clnt.h b/rpc/rpc-lib/src/rpc-clnt.h
>> index 263d5f7..718308d 100644
>> --- a/rpc/rpc-lib/src/rpc-clnt.h
>> +++ b/rpc/rpc-lib/src/rpc-clnt.h
>> @@ -143,6 +143,7 @@ struct rpc_clnt_connection {
>>          struct timeval           last_sent;
>>          struct timeval           last_received;
>>          int32_t                  ping_started;
>> +        uint32_t                 clnt_conn_id;
>>  };
>>  typedef struct rpc_clnt_connection rpc_clnt_connection_t;
>>
>> diff --git a/xlators/protocol/client/src/client-handshake.c b/xlators/protocol/client/src/client-handshake.c
>> index d2083e6..1c2fc2f 100644
>> --- a/xlators/protocol/client/src/client-handshake.c
>> +++ b/xlators/protocol/client/src/client-handshake.c
>> @@ -471,9 +471,10 @@ client_set_lk_version (xlator_t *this)
>>          conf = (clnt_conf_t *) this->private;
>>
>>          req.lk_ver = client_get_lk_ver (conf);
>> -        ret = gf_asprintf (&req.uid, "%s-%s-%d",
>> +        ret = gf_asprintf (&req.uid, "%s-%s-%d-%u",
>>                             this->ctx->process_uuid, this->name,
>> -                           this->graph->id);
>> +                           this->graph->id,
>> +                           (conf->rpc) ? conf->rpc->conn.clnt_conn_id : 0);
>>          if (ret == -1)
>>                  goto err;
>> @@ -1549,13 +1550,22 @@ client_setvolume (xlator_t *this, struct rpc_clnt *rpc)
>>                  }
>>          }
>>
>> +        /* For the different connections between a pair of client and
>> +         * server, we use a different clnt_conn_id to identify them.
>> +         * Otherwise, there is a chance that file locks are not released
>> +         * after an unclean disconnection. */
>> +        if (conf->rpc) {
>> +                conf->rpc->conn.clnt_conn_id = conf->clnt_conn_id++;
>> +        }
>> +
>>          /* With multiple graphs possible in the same process, we need a
>>             field to bring the uniqueness. Graph-ID should be enough to get the
>>             job done
>>          */
>> -        ret = gf_asprintf (&process_uuid_xl, "%s-%s-%d",
>> +        ret = gf_asprintf (&process_uuid_xl, "%s-%s-%d-%u",
>>                             this->ctx->process_uuid, this->name,
>> -                           this->graph->id);
>> +                           this->graph->id,
>> +                           (conf->rpc) ? conf->rpc->conn.clnt_conn_id : 0);
>>          if (-1 == ret) {
>>                  gf_log (this->name, GF_LOG_ERROR,
>>                          "asprintf failed while setting process_uuid");
>> diff --git a/xlators/protocol/client/src/client.c b/xlators/protocol/client/src/client.c
>> index ad95574..35fef49 100644
>> --- a/xlators/protocol/client/src/client.c
>> +++ b/xlators/protocol/client/src/client.c
>> @@ -2437,6 +2437,7 @@ init (xlator_t *this)
>>          conf->lk_version         = 1;
>>          conf->grace_timer        = NULL;
>>          conf->grace_timer_needed = _gf_true;
>> +        conf->clnt_conn_id       = 0;
>>
>>          ret = client_init_grace_timer (this, this->options, conf);
>>          if (ret)
>> diff --git a/xlators/protocol/client/src/client.h b/xlators/protocol/client/src/client.h
>> index 0a27c09..dea90d1 100644
>> --- a/xlators/protocol/client/src/client.h
>> +++ b/xlators/protocol/client/src/client.h
>> @@ -116,6 +116,9 @@ typedef struct clnt_conf {
>>                                                  */
>>          gf_boolean_t           filter_o_direct; /* if set, filter O_DIRECT
>>                                                     from the flags list of
>>                                                     open() */
>> +        uint32_t               clnt_conn_id;    /* connection id for each
>> +                                                   connection in process_uuid,
>> +                                                   start with 0, increase once a new connection */
>>  } clnt_conf_t;
>>
>>  typedef struct _client_fd_ctx {
>>
>>
>> On Wednesday, September 17, 2014, Jaden Liang <jaden1q84 at gmail.com> wrote:
>>
>> Hi all,
>>
>> After several days of tracking, we finally pinpointed why glusterfs
>> uncleanly detaches file flocks under frequent network disconnection. We
>> are now working on a patch to submit. The details of the issue follow.
>> Any suggestions will be appreciated!
>>
>> First of all, as I mentioned in
>> http://supercolony.gluster.org/pipermail/gluster-devel/2014-September/042233.html,
>> this issue happens under frequent network disconnection.
>>
>> According to the sources, the server-side cleanup job is done in
>> server_connection_cleanup().
>> When RPCSVC_EVENT_DISCONNECT happens, the code comes here:
>>
>> int
>> server_rpc_notify ()
>> {
>>         ......
>>         case RPCSVC_EVENT_DISCONNECT:
>>                 ......
>>                 if (!conf->lk_heal) {
>>                         server_conn_ref (conn);
>>                         server_connection_put (this, conn, &detached);
>>                         if (detached)
>>                                 server_connection_cleanup (this, conn,
>>                                                            INTERNAL_LOCKS |
>>                                                            POSIX_LOCKS);
>>                         server_conn_unref (conn);
>>                 ......
>> }
>>
>> server_connection_cleanup() is called when the variable 'detached' is
>> true, and 'detached' is set by server_connection_put():
>> server_connection_t*
>> server_connection_put (xlator_t *this, server_connection_t *conn,
>>                        gf_boolean_t *detached)
>> {
>>         server_conf_t *conf  = NULL;
>>         gf_boolean_t   unref = _gf_false;
>>
>>         if (detached)
>>                 *detached = _gf_false;
>>         conf = this->private;
>>         pthread_mutex_lock (&conf->mutex);
>>         {
>>                 conn->bind_ref--;
>>                 if (!conn->bind_ref) {
>>                         list_del_init (&conn->list);
>>                         unref = _gf_true;
>>                 }
>>         }
>>         pthread_mutex_unlock (&conf->mutex);
>>         if (unref) {
>>                 gf_log (this->name, GF_LOG_INFO,
>>                         "Shutting down connection %s", conn->id);
>>                 if (detached)
>>                         *detached = _gf_true;
>>                 server_conn_unref (conn);
>>                 conn = NULL;
>>         }
>>         return conn;
>> }
>>
>> 'detached' is only set to _gf_true when 'conn->bind_ref' drops to 0.
>> 'conn->bind_ref' is managed in server_connection_get(): it is set to 1
>> for a new connection and incremented for an existing one.
>>
>> server_connection_t *
>> server_connection_get (xlator_t *this, const char *id)
>> {
>>         ......
>>         list_for_each_entry (trav, &conf->conns, list) {
>>                 if (!strcmp (trav->id, id)) {
>>                         conn = trav;
>>                         conn->bind_ref++;
>>                         goto unlock;
>>                 }
>>         }
>>         ......
>> }
>>
>> When the connection id is the same, 'conn->bind_ref' is incremented.
>> Therefore, the problem should be a mismatched reference
>> increment/decrement. We then added some logs to verify our guess.
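>>
>> (The debug log we added in server_connection_get() was roughly like the
>> sketch below; 'found' is a local flag of ours marking whether the id was
>> already in the connection table. The exact statement may differ, but its
>> output format matches the log lines that follow.)
>>
>>         gf_log (this->name, GF_LOG_DEBUG,
>>                 "server connection id: %s, conn->bind_ref:%d, found:%d",
>>                 id, conn->bind_ref, found);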
>>
>> // 1st connection comes in, and there is no id
>> // 'host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0'
>> // in the connection table. 'conn->bind_ref' is set to 1.
>> [2014-09-17 04:42:28.950693] D [server-helpers.c:712:server_connection_get] 0-vs_vol_rep2-server: server connection id: host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0, conn->bind_ref:1, found:0
>> [2014-09-17 04:42:28.950717] D [server-handshake.c:430:server_setvolume] 0-vs_vol_rep2-server: Connected to host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0
>> [2014-09-17 04:42:28.950758] I [server-handshake.c:567:server_setvolume] 0-vs_vol_rep2-server: accepted client from host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0 (version: 3.4.5) (peer: host-000c29e93d20:1015)
>> ......
>> // Keeps running for several minutes .......
>> ......
>> // Network disconnects here. The client-side TCP socket is closed by a
>> // timeout, but the server-side socket stays connected. AT THIS MOMENT
>> // the network recovers, and the client reconnects with a new TCP
>> // connection JUST BEFORE the old server-side socket is reset. Note
>> // that at this point there are 2 valid sockets on the server side.
>> // The new connection looks up the same conn id
>> // 'host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0'
>> // in the connection table and increases 'conn->bind_ref' to 2.
>>
>> [2014-09-17 04:46:16.135066] D [server-helpers.c:712:server_connection_get] 0-vs_vol_rep2-server: server connection id: host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0, conn->bind_ref:2, found:1 // HERE IT IS, ref increased to 2!!!
>> [2014-09-17 04:46:16.135113] D [server-handshake.c:430:server_setvolume] 0-vs_vol_rep2-server: Connected to host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0
>> [2014-09-17 04:46:16.135157] I [server-handshake.c:567:server_setvolume] 0-vs_vol_rep2-server: accepted client from host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0 (version: 3.4.5) (peer: host-000c29e93d20:1018)
>>
>> // After 13 seconds, the old connection is reset, decreasing
>> // 'conn->bind_ref' to 1.
>>
>> [2014-09-17 04:46:28.688780] W [socket.c:2121:__socket_proto_state_machine] 0-tcp.vs_vol_rep2-server: ret = -1, error: Connection reset by peer, peer (host-000c29e93d20:1015)
>> [2014-09-17 04:46:28.688790] I [socket.c:2274:socket_event_handler] 0-transport: socket_event_poll_in failed, ret=-1.
>> [2014-09-17 04:46:28.688797] D [socket.c:2281:socket_event_handler] 0-transport: disconnecting now
>> [2014-09-17 04:46:28.688831] I [server.c:762:server_rpc_notify] 0-vs_vol_rep2-server: disconnecting connectionfrom host-000c29e93d20-8661-2014/09/13-11:02:26:995090-vs_vol_rep2-client-2-0(host-000c29e93d20:1015)
>> [2014-09-17 04:46:28.688861] D [server-helpers.c:744:server_connection_put] 0-vs_vol_rep2-server: conn->bind_ref:1
>>
>> In our production environment, some flocks are held by the 1st
>> connection. According to the logs, there is no way for the flocks of
>> the 1st connection to be cleaned up, and on the 2nd, new connection the
>> client side cannot take the flocks again.
>>
>> Therefore, we think the major reason is that different connections use
>> the same conn id. The conn id is assembled in client_setvolume():
>>
>>         ret = gf_asprintf (&process_uuid_xl, "%s-%s-%d",
>>                            this->ctx->process_uuid, this->name,
>>                            this->graph->id);
>>
>> The conn id contains 3 parts:
>> this->ctx->process_uuid: hostname + pid + startup timestamp
>> this->name: translator name
>> this->graph->id: graph id
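>>
>> For illustration, here is a minimal standalone sketch (not gluster code;
>> the literal values are taken from the logs above) showing why a
>> reconnect from the same client process produces the identical id:
>>
>> #include <assert.h>
>> #include <stdio.h>
>> #include <string.h>
>>
>> int main (void)
>> {
>>         /* The 3 parts of the conn id: none of them changes across a
>>          * reconnect from the same client process. */
>>         const char *process_uuid =
>>                 "host-000c29e93d20-8661-2014/09/13-11:02:26:995090";
>>         const char *name     = "vs_vol_rep2-client-2";
>>         int         graph_id = 0;
>>
>>         char id1[128], id2[128];
>>         snprintf (id1, sizeof (id1), "%s-%s-%d",
>>                   process_uuid, name, graph_id);
>>         /* ... network drops, client reconnects: same process, same
>>          * translator, same graph ... */
>>         snprintf (id2, sizeof (id2), "%s-%s-%d",
>>                   process_uuid, name, graph_id);
>>
>>         /* Identical ids: server_connection_get() will find the old
>>          * conn and bump bind_ref instead of creating a new entry. */
>>         assert (strcmp (id1, id2) == 0);
>>         printf ("reconnect reuses id: %s\n", id1);
>>         return 0;
>> }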
>>
>> Apparently the conn id stays the same unless the client side restarts.
>> So when the network disconnects, there is a chance that the socket on
>> the client side times out while the one on the server side stays alive.
>> If the network recovers at that moment and the client reconnects before
>> the server's old socket is reset, the file flocks of the old connection
>> are left uncleaned.
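>>
>> To make the race concrete, here is a minimal standalone model (all
>> names are hypothetical, not gluster code) of a refcounted connection
>> table keyed only by a reusable id; the cleanup at bind_ref == 0 never
>> fires:
>>
>> #include <stdio.h>
>> #include <string.h>
>>
>> /* Single-slot model of the server connection table. */
>> struct conn {
>>         char id[64];
>>         int  bind_ref;
>>         int  in_use;
>> };
>>
>> static struct conn table;
>>
>> /* Like server_connection_get(): reuse the entry when the id matches,
>>  * otherwise create it with bind_ref = 1. */
>> static struct conn *get (const char *id)
>> {
>>         if (table.in_use && strcmp (table.id, id) == 0) {
>>                 table.bind_ref++;          /* 2nd connection, same id */
>>                 return &table;
>>         }
>>         snprintf (table.id, sizeof (table.id), "%s", id);
>>         table.bind_ref = 1;
>>         table.in_use   = 1;
>>         return &table;
>> }
>>
>> /* Like server_connection_put(): locks would only be cleaned up when
>>  * the refcount drops to 0. */
>> static void put (struct conn *c)
>> {
>>         if (--c->bind_ref == 0)
>>                 printf ("bind_ref 0: cleanup runs, locks released\n");
>>         else
>>                 printf ("bind_ref %d: no cleanup, locks leak\n",
>>                         c->bind_ref);
>> }
>>
>> int main (void)
>> {
>>         const char *id = "host-pid-timestamp-client-0"; /* reused id */
>>
>>         get (id);                  /* 1st connection: bind_ref = 1   */
>>         struct conn *c = get (id); /* reconnect before the old socket
>>                                     * is reset: bind_ref = 2         */
>>
>>         put (c); /* old socket finally reset: 2 -> 1, cleanup skipped */
>>         return 0;
>> }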
>>
>> That is our full analysis of this flock leak issue. We are now working
>> on the patch, and we hope someone can review it when it is finished.
>>
>> Any other comments are appreciated. Thank you!
>>
>>
>>
>> _______________________________________________
>> Gluster-devel mailing list
>> Gluster-devel at gluster.org
>> http://supercolony.gluster.org/mailman/listinfo/gluster-devel
>>
>>
>