[Gluster-devel] glusterfsd memory leak issue found after enable ssl

Milind Changire mchangir at redhat.com
Mon Apr 22 06:21:19 UTC 2019


Looks like using BIO_free() is not right. Here's what the SSL_set_bio() man
page says ...

SSL_set_bio() is similar to SSL_set0_rbio() and SSL_set0_wbio() except that
it connects both the rbio and the wbio at the same time, and *transfers the
ownership of rbio and wbio to ssl* according to the following set of rules:

So, I think you were right about SSL_free() doing the job for the bio.
However, SSL_free() has no reason to set the priv->ssl_sbio pointer to
NULL. I think priv->ssl_sbio should be set to NULL immediately after the
call to SSL_set_bio() is successful. And we need to add a comment while
setting priv->ssl_sbio to NULL that the ownership of the bio has now been
transferred to SSL and SSL will free the related memory appropriately.



On Mon, Apr 22, 2019 at 11:37 AM Zhou, Cynthia (NSB - CN/Hangzhou) <
cynthia.zhou at nokia-sbell.com> wrote:

> I tried to print priv->ssl_sbio after SSL_free() find this pointer is not
> null, so I add free ssl_sbio with BIO_free, however this cause glusterfd
> coredump
>
> (gdb) bt
>
> #0  0x00007f3047867f9b in raise () from /lib64/libc.so.6
>
> #1  0x00007f3047869351 in abort () from /lib64/libc.so.6
>
> #2  0x00007f30478aa8c7 in __libc_message () from /lib64/libc.so.6
>
> #3  0x00007f30478b0e6a in malloc_printerr () from /lib64/libc.so.6
>
> #4  0x00007f30478b2835 in _int_free () from /lib64/libc.so.6
>
> #5  0x00007f3047c5bbbd in CRYPTO_free () from /lib64/libcrypto.so.10
>
> #6  0x00007f3047d07582 in BIO_free () from /lib64/libcrypto.so.10
>
> #7  0x00007f3043f9ba4b in __socket_reset (this=0x7f303c1ae710) at
> socket.c:1032
>
> #8  0x00007f3043f9c4aa in socket_event_poll_err (this=0x7f303c1ae710,
> gen=1, idx=17) at socket.c:1232
>
> #9  0x00007f3043fa1b7d in socket_event_handler (fd=26, idx=17, gen=1,
> data=0x7f303c1ae710, poll_in=1, poll_out=0, poll_err=0) at socket.c:2669
>
> #10 0x00007f3049307984 in event_dispatch_epoll_handler
> (event_pool=0x1035610, event=0x7f3043b14e84) at event-epoll.c:587
>
> #11 0x00007f3049307c5b in event_dispatch_epoll_worker (data=0x107e3e0) at
> event-epoll.c:663
>
> #12 0x00007f30480535da in start_thread () from /lib64/libpthread.so.0
>
> #13 0x00007f3047929eaf in clone () from /lib64/libc.so.6
>
>
>
> @@ -1019,7 +1019,20 @@ static void __socket_reset(rpc_transport_t *this) {
>
>    memset(&priv->incoming, 0, sizeof(priv->incoming));
>
>    event_unregister_close(this->ctx->event_pool, priv->sock, priv->idx);
>
> -
>
> +  if(priv->use_ssl&& priv->ssl_ssl)
>
> +  {
>
> +    gf_log(this->name, GF_LOG_INFO,
>
> +           "clear and reset for socket(%d), free ssl ",
>
> +           priv->sock);
>
> +    SSL_shutdown(priv->ssl_ssl);
>
> +    SSL_clear(priv->ssl_ssl);
>
> +    SSL_free(priv->ssl_ssl);
>
> +    gf_log(this->name, GF_LOG_INFO,"priv->ssl_sbio of socket(%d)is %p
> ",priv->sock,priv->ssl_sbio);
>
> +     if(priv->ssl_sbio != NULL)
>
> +         BIO_free(priv->ssl_sbio);
>
> +    priv->ssl_ssl = NULL;
>
> +     priv->ssl_sbio = NULL;
>
> +  }
>
>    priv->sock = -1;
>
>    priv->idx = -1;
>
>    priv->connected = -1;
>
> @@ -4238,6 +4251,20 @@ void fini(rpc_transport_t *this) {
>
>      pthread_mutex_destroy(&priv->out_lock);
>
>      pthread_mutex_destroy(&priv->cond_lock);
>
>      pthread_cond_destroy(&priv->cond);
>
> +     if(priv->use_ssl&& priv->ssl_ssl)
>
> +  {
>
> +    gf_log(this->name, GF_LOG_INFO,
>
> +           "clear and reset for socket(%d), free ssl ",
>
> +           priv->sock);
>
> +    SSL_shutdown(priv->ssl_ssl);
>
> +    SSL_clear(priv->ssl_ssl);
>
> +    SSL_free(priv->ssl_ssl);
>
> +     gf_log(this->name, GF_LOG_INFO,"priv->ssl_sbio of socket(%d)is %p
> ",priv->sock,priv->ssl_sbio);
>
> +    if(priv->ssl_sbio != NULL)
>
> +        BIO_free(priv->ssl_sbio);
>
> +    priv->ssl_ssl = NULL;
>
> +     priv->ssl_sbio = NULL;
>
> +  }
>
>      if (priv->ssl_private_key) {
>
>        GF_FREE(priv->ssl_private_key);
>
>
>
>
>
> *From:* Milind Changire <mchangir at redhat.com>
> *Sent:* Monday, April 22, 2019 1:35 PM
> *To:* Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com>
> *Cc:* Atin Mukherjee <amukherj at redhat.com>; gluster-devel at gluster.org
> *Subject:* Re: [Gluster-devel] glusterfsd memory leak issue found after
> enable ssl
>
>
>
> This probably went unnoticed until now.
>
>
>
>
>
>
>
> On Mon, Apr 22, 2019 at 10:45 AM Zhou, Cynthia (NSB - CN/Hangzhou) <
> cynthia.zhou at nokia-sbell.com> wrote:
>
> Why there is no bio_free called in ssl_teardown_connection then?
>
>
>
> cynthia
>
>
>
> *From:* Milind Changire <mchangir at redhat.com>
> *Sent:* Monday, April 22, 2019 10:21 AM
> *To:* Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com>
> *Cc:* Atin Mukherjee <amukherj at redhat.com>; gluster-devel at gluster.org
> *Subject:* Re: [Gluster-devel] glusterfsd memory leak issue found after
> enable ssl
>
>
>
> After patch 22334 <https://review.gluster.org/c/glusterfs/+/22334>, the
> priv->ssl_ctx is now maintained per socket connection and is no longer
> shared.
>
> So you might want to SSL_free(priv->ssl_ctx) as well and set priv->ssl_ctx
> to NULL.
>
>
>
> There might be some strings that are duplicated (gf_strdup()) via the
> socket_init() code path. Please take a look at those as well.
>
>
>
> Sorry about that. I missed it.
>
>
>
>
>
> On Mon, Apr 22, 2019 at 7:25 AM Zhou, Cynthia (NSB - CN/Hangzhou) <
> cynthia.zhou at nokia-sbell.com> wrote:
>
>
>
> Hi,
>
> From my code study it seems priv->ssl_ssl is not properly released, I made
> a patch and the glusterfsd memory leak is alleviated with my patch, but
> some otherwhere is still leaking, I have no clue about the other leak
> points.
>
>
>
> --- a/rpc/rpc-transport/socket/src/socket.c
>
> +++ b/rpc/rpc-transport/socket/src/socket.c
>
> @@ -1019,7 +1019,16 @@ static void __socket_reset(rpc_transport_t *this) {
>
>    memset(&priv->incoming, 0, sizeof(priv->incoming));
>
>    event_unregister_close(this->ctx->event_pool, priv->sock, priv->idx);
>
> -
>
> +  if(priv->use_ssl&& priv->ssl_ssl)
>
> +  {
>
> +    gf_log(this->name, GF_LOG_TRACE,
>
> +           "clear and reset for socket(%d), free ssl ",
>
> +           priv->sock);
>
> +    SSL_shutdown(priv->ssl_ssl);
>
> +    SSL_clear(priv->ssl_ssl);
>
> +    SSL_free(priv->ssl_ssl);
>
> +    priv->ssl_ssl = NULL;
>
> +  }
>
>    priv->sock = -1;
>
>    priv->idx = -1;
>
>    priv->connected = -1;
>
> @@ -4238,6 +4250,16 @@ void fini(rpc_transport_t *this) {
>
>      pthread_mutex_destroy(&priv->out_lock);
>
>      pthread_mutex_destroy(&priv->cond_lock);
>
>      pthread_cond_destroy(&priv->cond);
>
> +     if(priv->use_ssl&& priv->ssl_ssl)
>
> +  {
>
> +    gf_log(this->name, GF_LOG_TRACE,
>
> +           "clear and reset for socket(%d), free ssl ",
>
> +           priv->sock);
>
> +    SSL_shutdown(priv->ssl_ssl);
>
> +    SSL_clear(priv->ssl_ssl);
>
> +    SSL_free(priv->ssl_ssl);
>
> +    priv->ssl_ssl = NULL;
>
> +  }
>
>      if (priv->ssl_private_key) {
>
>        GF_FREE(priv->ssl_private_key);
>
>      }
>
>
>
>
>
> *From:* Zhou, Cynthia (NSB - CN/Hangzhou)
> *Sent:* Thursday, April 18, 2019 5:31 PM
> *To:* 'Atin Mukherjee' <amukherj at redhat.com>
> *Cc:* 'Raghavendra Gowdappa' <rgowdapp at redhat.com>; '
> gluster-devel at gluster.org' <gluster-devel at gluster.org>
> *Subject:* RE: [Gluster-devel] glusterfsd memory leak issue found after
> enable ssl
>
>
>
> We scan it use memory-leak tool, there are following prints. We doubt some
> open ssl lib malloc is is not properly freed by glusterfs code.
>
> er+0x2af [*libglusterfs.so.0.0.1]\n\t\tstart_thread+0xda* [
> libpthread-2.27.so]'
>         *13580* bytes in 175 allocations from stack
>                 b'CRYPTO_malloc+0x58 [*libcrypto.so.1.0.2p*]'
>         *232904* bytes in 14 allocations from stack
>                 b'CRYPTO_malloc+0x58 [*libcrypto.so.1.0.2p]\n\t\t[unknown*
> ]'
> [15:41:56] Top 10 stacks with outstanding allocations:
>         *8792* bytes in 14 allocations from stack
>                 b'CRYPTO_malloc+0x58 [*libcrypto.so.1.0.2p]\n\t\t[unknown*
> ]'
>         *9408* bytes in 42 allocations from stack
>                 b'CRYPTO_realloc+0x4d [*libcrypto.so.1.0.2p*]'
>         *9723* bytes in 14 allocations from stack
>                 b'CRYPTO_malloc+0x58 [*libcrypto.so.1.0.2p]\n\t\t[unknown*
> ]'
>         *10696* bytes in 21 allocations from stack
>                 b'CRYPTO_malloc+0x58 [*libcrypto.so.1.0.2p]\n\t\t[unknown*
> ]'
>         *11319* bytes in 602 allocations from stack
>                 b'CRYPTO_malloc+0x58 [*libcrypto.so.1.0.2p]\n\t\t[unknown*
> ]'
>         *11431* bytes in 518 allocations from stack
>                 b'CRYPTO_malloc+0x58 [*libcrypto.so.1.0.2p]\n\t\t[unknown*
> ]'
>         *11704* bytes in 371 allocations from stack
>                 b'CRYPTO_malloc+0x58 [*libcrypto.so.1.0.2p]\n\t\t[unknown*
> ]'
>
>
>
>
>
> cynthia
>
> *From:* Zhou, Cynthia (NSB - CN/Hangzhou)
> *Sent:* Thursday, April 18, 2019 5:25 PM
> *To:* 'Atin Mukherjee' <amukherj at redhat.com>
> *Cc:* Raghavendra Gowdappa <rgowdapp at redhat.com>;
> gluster-devel at gluster.org
> *Subject:* RE: [Gluster-devel] glusterfsd memory leak issue found after
> enable ssl
>
>
>
> I’ve test on glusterfs3.12.15 and glusterfs5.5 all have this issue, after
> enable tls ssl socket, when execute gluster v heal <vol-name> info, will
> trigger glfshel to connect glusterfsd process, and cause glusterfsd process
> memory leak. Could you please try in your env?
>
>
>
> cynthia
>
>
>
>
>
> *From:* Atin Mukherjee <amukherj at redhat.com>
> *Sent:* Thursday, April 18, 2019 1:19 PM
> *To:* Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com>
> *Cc:* Raghavendra Gowdappa <rgowdapp at redhat.com>;
> gluster-devel at gluster.org
> *Subject:* Re: [Gluster-devel] glusterfsd memory leak issue found after
> enable ssl
>
>
>
>
>
>
>
> On Wed, 17 Apr 2019 at 10:53, Zhou, Cynthia (NSB - CN/Hangzhou) <
> cynthia.zhou at nokia-sbell.com> wrote:
>
> Hi,
>
> In my recent test, I found that there are very severe glusterfsd memory
> leak when enable socket ssl option
>
>
>
> What gluster version are you testing? Would you be able to continue your
> investigation and share the root cause?
>
>
>
> --
>
> - Atin (atinm)
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
>
>
>
> --
>
> Milind
>
>
>
> --
>
> Milind
>


-- 
Milind
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gluster.org/pipermail/gluster-devel/attachments/20190422/8cc65874/attachment-0001.html>


More information about the Gluster-devel mailing list