[Gluster-devel] Question on rpc_transport_unref
Emmanuel Dreyfus
manu at netbsd.org
Sun Apr 28 13:24:38 UTC 2013
Emmanuel Dreyfus <manu at netbsd.org> wrote:
> Is it an event-poll story? When the backtrace hits even-pool.c,
> glusterfs has already decided to destroy the transport because its
> refcount droped to zero. It looks more like a refcount tracking issue,
> which is always done outside of event-poll code.
Here is rpc_transport_(ref|unref) tracking prior the bug:
1367153818.290418 rpcsvc_request_init 0x7f7ff73b6000 1 -> 2
1367153818.290426 server_alloc_frame 0x7f7ff73b6000 2 -> 3
1367153818.290449 rpcsvc_request_init 0x7f7ff73b6000 3 -> 4
1367153818.290455 server_alloc_frame 0x7f7ff73b6000 4 -> 5
1367153818.291074 rpcsvc_request_init 0x7f7ff73b6000 5 -> 6
1367153818.291076 server_alloc_frame 0x7f7ff73b6000 6 -> 7
1367153818.291100 rpcsvc_request_init 0x7f7ff73b6000 7 -> 8
1367153818.291103 server_alloc_frame 0x7f7ff73b6000 8 -> 9
1367153818.291104 rpcsvc_request_destroy 0x7f7ff73b6000 9 -> 8
1367153818.291104 free_state 0x7f7ff73b6000 8 -> 7
1367153818.291112 rpcsvc_request_init 0x7f7ff73b6000 7 -> 8
1367153818.291114 rpcsvc_request_destroy 0x7f7ff73b6000 8 -> 7
1367153818.291115 server_alloc_frame 0x7f7ff73b6000 7 -> 6
1367153818.291127 rpcsvc_request_destroy 0x7f7ff73b6000 6 -> 5
1367153818.291127 free_state 0x7f7ff73b6000 5 -> 4
1367153818.291987 rpcsvc_request_destroy 0x7f7ff73b6000 4 -> 3
1367153818.291987 free_state 0x7f7ff73b6000 3 -> 2
1367153818.293152 rpcsvc_request_destroy 0x7f7ff73b6000 2 -> 1
1367153818.293152 free_state 0x7f7ff73b6000 1 -> 0
This explains the reference dropping to 0, but it is odd, I just don't get
it, it makes no sense:
1367153818.291115 server_alloc_frame 0x7f7ff73b6000 7 -> 6
I double-checked my debug code and it seems correct. server_alloc_frame()
contains that call where I added the __func__ argument for debugging:
state->xprt = rpc_transport_ref (req->trans, __func__);
Here is my modified rpc_transport_ref():
rpc_transport_t *
rpc_transport_ref (rpc_transport_t *this, char *func)
{
rpc_transport_t *return_this = NULL;
GF_VALIDATE_OR_GOTO("rpc_transport", this, fail);
pthread_mutex_lock (&this->lock);
{
this->debug[this->debugcount].before = this->refcount;
this->refcount ++;
this->debug[this->debugcount].after = this->refcount;
gettimeofday(&this->debug[this->debugcount].tv, NULL);
this->debug[this->debugcount].func = func;
this->debug[this->debugcount].trans = this;
this->debugcount = (this->debugcount + 1) % DEBUG_SIZE;
}
pthread_mutex_unlock (&this->lock);
return_this = this;
fail:
return return_this;
}
And rpc_transport_destroy():
int32_t
rpc_transport_destroy (rpc_transport_t *this)
{
int32_t ret = -1;
GF_VALIDATE_OR_GOTO("rpc_transport", this, fail);
{
int i, c;
for (i = 0; i < DEBUG_SIZE; i++) {
c = (i + this->debugcount + 1) % DEBUG_SIZE;
gf_log("XXXref", GF_LOG_ERROR, "%d.%06d %s %p %d -> %d",
this->debug[c].tv.tv_sec, this->debug[c].tv.tv_usec,
this->debug[c].func, this->debug[c].trans,
this->debug[c].before, this->debug[c].after);
}
}
/* ... */
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu at netbsd.org
More information about the Gluster-devel
mailing list