[Gluster-devel] glusterfsd memory leak issue found after enable ssl
Raghavendra Gowdappa
rgowdapp at redhat.com
Wed May 8 11:35:19 UTC 2019
On Wed, May 8, 2019 at 1:29 PM Zhou, Cynthia (NSB - CN/Hangzhou) <
cynthia.zhou at nokia-sbell.com> wrote:
> Hi 'Milind Changire' ,
>
> The leak is getting clearer to me now. The remaining memory leak exists
> because in glusterfs version 3.12.15 (in my environment) the SSL context is
> shared: when we call SSL_accept, OpenSSL allocates read/write buffers for
> the per-connection SSL object, but when SSL_free is called in __socket_reset
> or in the fini function of socket.c, those buffers are returned to the SSL
> context's free list instead of being completely freed.
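>
> To make that lifecycle concrete, here is a minimal sketch (generic OpenSSL
> code, not glusterfs code, assuming OpenSSL 1.0.2 with its per-context buffer
> free lists): the roughly 16 KB read and write buffers that SSL_accept
> allocates are parked on the shared context when the per-connection SSL
> object is freed, and only freeing the context itself releases them.
>
>     #include <openssl/ssl.h>
>
>     static SSL_CTX *shared_ctx;            /* one context shared by every accept */
>
>     static void serve_one_connection(int fd)
>     {
>         SSL *ssl = SSL_new(shared_ctx);    /* per-connection SSL object */
>         SSL_set_fd(ssl, fd);
>         SSL_accept(ssl);                   /* allocates ~16 KB read/write buffers;
>                                               error handling omitted in this sketch */
>         /* ... handle traffic ... */
>         SSL_shutdown(ssl);
>         SSL_free(ssl);                     /* buffers go back to shared_ctx's free
>                                               list, not to the allocator */
>     }
>
>     /* Only SSL_CTX_free(shared_ctx) releases the cached buffers, which is why
>      * the patch below also tears down the per-transport SSL_CTX. */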
>
Thanks Cynthia for your efforts in identifying and fixing the leak. If you
post a patch to gerrit, I'll be happy to merge it and get the fix into the
codebase.
>
> So the following patch (created against the gluster master branch) fixes the
> memory leak completely.
>
>
>
> --- a/rpc/rpc-transport/socket/src/socket.c
> +++ b/rpc/rpc-transport/socket/src/socket.c
> @@ -446,6 +446,7 @@ ssl_setup_connection_postfix(rpc_transport_t *this)
> gf_log(this->name, GF_LOG_DEBUG,
> "SSL verification succeeded (client: %s) (server: %s)",
> this->peerinfo.identifier, this->myinfo.identifier);
> + X509_free(peer);
> return gf_strdup(peer_CN);
>
> /* Error paths. */
> @@ -1157,7 +1158,21 @@ __socket_reset(rpc_transport_t *this)
> memset(&priv->incoming, 0, sizeof(priv->incoming));
>
> event_unregister_close(this->ctx->event_pool, priv->sock, priv->idx);
> -
> +    if (priv->use_ssl && priv->ssl_ssl)
> +    {
> +        gf_log(this->name, GF_LOG_TRACE,
> +               "clear and reset for socket(%d), free ssl ",
> +               priv->sock);
> +        if (priv->ssl_ctx)
> +        {
> +            SSL_CTX_free(priv->ssl_ctx);
> +            priv->ssl_ctx = NULL;
> +        }
> +        SSL_shutdown(priv->ssl_ssl);
> +        SSL_clear(priv->ssl_ssl);
> +        SSL_free(priv->ssl_ssl);
> +        priv->ssl_ssl = NULL;
> +    }
> priv->sock = -1;
> priv->idx = -1;
> priv->connected = -1;
> @@ -4675,6 +4690,21 @@ fini(rpc_transport_t *this)
> pthread_mutex_destroy(&priv->out_lock);
> pthread_mutex_destroy(&priv->cond_lock);
> pthread_cond_destroy(&priv->cond);
> +    if (priv->use_ssl && priv->ssl_ssl)
> +    {
> +        gf_log(this->name, GF_LOG_TRACE,
> +               "clear and reset for socket(%d), free ssl ",
> +               priv->sock);
> +        if (priv->ssl_ctx)
> +        {
> +            SSL_CTX_free(priv->ssl_ctx);
> +            priv->ssl_ctx = NULL;
> +        }
> +        SSL_shutdown(priv->ssl_ssl);
> +        SSL_clear(priv->ssl_ssl);
> +        SSL_free(priv->ssl_ssl);
> +        priv->ssl_ssl = NULL;
> +    }
>
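> A note on the first hunk: SSL_get_peer_certificate(), which
> ssl_setup_connection_postfix presumably uses to obtain 'peer', returns a new
> reference to the peer certificate, so the caller has to drop it with
> X509_free() once the CN has been copied. A generic sketch of that pattern
> (not the glusterfs code; the helper name is made up):
>
>     #include <openssl/ssl.h>
>     #include <openssl/x509.h>
>
>     /* Copy the peer CN into cn and release the certificate reference. */
>     static int copy_peer_cn(SSL *ssl, char *cn, int len)
>     {
>         X509 *peer = SSL_get_peer_certificate(ssl);   /* bumps the refcount */
>         int ret = -1;
>
>         if (peer) {
>             ret = X509_NAME_get_text_by_NID(X509_get_subject_name(peer),
>                                             NID_commonName, cn, len);
>             X509_free(peer);                          /* drop our reference */
>         }
>         return ret;
>     }
>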
> From: Zhou, Cynthia (NSB - CN/Hangzhou)
> Sent: Monday, May 06, 2019 2:12 PM
> To: 'Amar Tumballi Suryanarayan' <atumball at redhat.com>
> Cc: 'Milind Changire' <mchangir at redhat.com>; 'gluster-devel at gluster.org'
> <gluster-devel at gluster.org>
> Subject: RE: [Gluster-devel] glusterfsd memory leak issue found after
> enable ssl
>
>
>
> Hi,
>
> From our tests, both valgrind and libleak blame ssl3_accept.
>
> ///////////////////////////from valgrind attached to glusterfsd///////////////////////////////////////////
>
> ==16673== 198,720 bytes in 12 blocks are definitely lost in loss record 1,114 of 1,123
> ==16673==    at 0x4C2EB7B: malloc (vg_replace_malloc.c:299)
> ==16673==    by 0x63E1977: CRYPTO_malloc (in /usr/lib64/libcrypto.so.1.0.2p)
> ==16673==    by 0xA855E0C: ssl3_setup_write_buffer (in /usr/lib64/libssl.so.1.0.2p)
> ==16673==    by 0xA855E77: ssl3_setup_buffers (in /usr/lib64/libssl.so.1.0.2p)
> ==16673==    by 0xA8485D9: ssl3_accept (in /usr/lib64/libssl.so.1.0.2p)
> ==16673==    by 0xA610DDF: ssl_complete_connection (socket.c:400)
> ==16673==    by 0xA617F38: ssl_handle_server_connection_attempt (socket.c:2409)
> ==16673==    by 0xA618420: socket_complete_connection (socket.c:2554)
> ==16673==    by 0xA618788: socket_event_handler (socket.c:2613)
> ==16673==    by 0x4ED6983: event_dispatch_epoll_handler (event-epoll.c:587)
> ==16673==    by 0x4ED6C5A: event_dispatch_epoll_worker (event-epoll.c:663)
> ==16673==    by 0x615C5D9: start_thread (in /usr/lib64/libpthread-2.27.so)
> ==16673==
> ==16673== 200,544 bytes in 12 blocks are definitely lost in loss record 1,115 of 1,123
> ==16673==    at 0x4C2EB7B: malloc (vg_replace_malloc.c:299)
> ==16673==    by 0x63E1977: CRYPTO_malloc (in /usr/lib64/libcrypto.so.1.0.2p)
> ==16673==    by 0xA855D12: ssl3_setup_read_buffer (in /usr/lib64/libssl.so.1.0.2p)
> ==16673==    by 0xA855E68: ssl3_setup_buffers (in /usr/lib64/libssl.so.1.0.2p)
> ==16673==    by 0xA8485D9: ssl3_accept (in /usr/lib64/libssl.so.1.0.2p)
> ==16673==    by 0xA610DDF: ssl_complete_connection (socket.c:400)
> ==16673==    by 0xA617F38: ssl_handle_server_connection_attempt (socket.c:2409)
> ==16673==    by 0xA618420: socket_complete_connection (socket.c:2554)
> ==16673==    by 0xA618788: socket_event_handler (socket.c:2613)
> ==16673==    by 0x4ED6983: event_dispatch_epoll_handler (event-epoll.c:587)
> ==16673==    by 0x4ED6C5A: event_dispatch_epoll_worker (event-epoll.c:663)
> ==16673==    by 0x615C5D9: start_thread (in /usr/lib64/libpthread-2.27.so)
> ==16673==
> valgrind --leak-check=full
>
>
>
>
>
> ////////////////////////////////////with libleak attached to glusterfsd/////////////////////////////////////////
>
> callstack[2419] expires. count=1 size=224/224 alloc=362 free=350
>     /home/robot/libleak/libleak.so(malloc+0x25) [0x7f1460604065]
>     /lib64/libcrypto.so.10(CRYPTO_malloc+0x58) [0x7f145ecd9978]
>     /lib64/libcrypto.so.10(EVP_DigestInit_ex+0x2a9) [0x7f145ed95749]
>     /lib64/libssl.so.10(ssl3_digest_cached_records+0x11d) [0x7f145abb6ced]
>     /lib64/libssl.so.10(ssl3_accept+0xc8f) [0x7f145abadc4f]
>     /usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(ssl_complete_connection+0x5e) [0x7f145ae00f3a]
>     /usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(+0xc16d) [0x7f145ae0816d]
>     /usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(+0xc68a) [0x7f145ae0868a]
>     /usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(+0xc9f2) [0x7f145ae089f2]
>     /lib64/libglusterfs.so.0(+0x9b96f) [0x7f146038596f]
>     /lib64/libglusterfs.so.0(+0x9bc46) [0x7f1460385c46]
>     /lib64/libpthread.so.0(+0x75da) [0x7f145f0d15da]
>     /lib64/libc.so.6(clone+0x3f) [0x7f145e9a7eaf]
>
> callstack[2432] expires. count=1 size=104/104 alloc=362 free=0
>     /home/robot/libleak/libleak.so(malloc+0x25) [0x7f1460604065]
>     /lib64/libcrypto.so.10(CRYPTO_malloc+0x58) [0x7f145ecd9978]
>     /lib64/libcrypto.so.10(BN_MONT_CTX_new+0x17) [0x7f145ed48627]
>     /lib64/libcrypto.so.10(BN_MONT_CTX_set_locked+0x6d) [0x7f145ed489fd]
>     /lib64/libcrypto.so.10(+0xff4d9) [0x7f145ed6a4d9]
>     /lib64/libcrypto.so.10(int_rsa_verify+0x1cd) [0x7f145ed6d41d]
>     /lib64/libcrypto.so.10(RSA_verify+0x32) [0x7f145ed6d972]
>     /lib64/libcrypto.so.10(+0x107ff5) [0x7f145ed72ff5]
>     /lib64/libcrypto.so.10(EVP_VerifyFinal+0x211) [0x7f145ed9dd51]
>     /lib64/libssl.so.10(ssl3_get_cert_verify+0x5bb) [0x7f145abac06b]
>     /lib64/libssl.so.10(ssl3_accept+0x988) [0x7f145abad948]
>     /usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(ssl_complete_connection+0x5e) [0x7f145ae00f3a]
>     /usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(+0xc16d) [0x7f145ae0816d]
>     /usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(+0xc68a) [0x7f145ae0868a]
>     /usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(+0xc9f2) [0x7f145ae089f2]
>     /lib64/libglusterfs.so.0(+0x9b96f) [0x7f146038596f]
>     /lib64/libglusterfs.so.0(+0x9bc46) [0x7f1460385c46]
>     /lib64/libpthread.so.0(+0x75da) [0x7f145f0d15da]
>     /lib64/libc.so.6(clone+0x3f) [0x7f145e9a7eaf]
>
>
>
> One interesting thing is that the memory usage climbs to about 300 MB and
> then stops increasing.
>
> I am wondering whether this is caused by the OpenSSL library, but when I
> searched the OpenSSL community I did not find any such issue reported.
>
> Is glusterfs using SSL_accept correctly?
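>
> For comparison, the usual pattern for a non-blocking SSL_accept looks
> roughly like this (a generic OpenSSL sketch, not the glusterfs code; it
> assumes an SSL object already bound to a non-blocking socket): retry on
> WANT_READ/WANT_WRITE and tear the object down on a hard failure so its
> buffers are released.
>
>     int rc = SSL_accept(ssl);
>     if (rc <= 0) {
>         switch (SSL_get_error(ssl, rc)) {
>         case SSL_ERROR_WANT_READ:
>         case SSL_ERROR_WANT_WRITE:
>             /* handshake not finished yet: retry when the fd is ready again */
>             break;
>         default:
>             /* hard failure: shut down and free the SSL object */
>             SSL_free(ssl);
>             ssl = NULL;
>         }
>     }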
>
>
>
> cynthia
>
> From: Zhou, Cynthia (NSB - CN/Hangzhou)
> Sent: Monday, May 06, 2019 10:34 AM
> To: 'Amar Tumballi Suryanarayan' <atumball at redhat.com>
> Cc: Milind Changire <mchangir at redhat.com>; gluster-devel at gluster.org
> Subject: RE: [Gluster-devel] glusterfsd memory leak issue found after
> enable ssl
>
>
>
> Hi,
>
> Sorry, I have been busy with other issues these days; could you help me
> submit my patch for review? It is based on the glusterfs 3.12.15 code. Even
> with this patch the memory leak still exists; the memory-leak tools point at
> ssl_accept, and I am not sure whether that is due to the OpenSSL library or
> to improper use of the SSL interfaces.
>
> --- a/rpc/rpc-transport/socket/src/socket.c
> +++ b/rpc/rpc-transport/socket/src/socket.c
> @@ -1019,7 +1019,16 @@ static void __socket_reset(rpc_transport_t *this) {
>    memset(&priv->incoming, 0, sizeof(priv->incoming));
>
>    event_unregister_close(this->ctx->event_pool, priv->sock, priv->idx);
> -
> +  if (priv->use_ssl && priv->ssl_ssl)
> +  {
> +    gf_log(this->name, GF_LOG_INFO,
> +           "clear and reset for socket(%d), free ssl ",
> +           priv->sock);
> +    SSL_shutdown(priv->ssl_ssl);
> +    SSL_clear(priv->ssl_ssl);
> +    SSL_free(priv->ssl_ssl);
> +    priv->ssl_ssl = NULL;
> +  }
>    priv->sock = -1;
>    priv->idx = -1;
>    priv->connected = -1;
> @@ -4238,6 +4250,16 @@ void fini(rpc_transport_t *this) {
>    pthread_mutex_destroy(&priv->out_lock);
>    pthread_mutex_destroy(&priv->cond_lock);
>    pthread_cond_destroy(&priv->cond);
> +  if (priv->use_ssl && priv->ssl_ssl)
> +  {
> +    gf_log(this->name, GF_LOG_INFO,
> +           "clear and reset for socket(%d), free ssl ",
> +           priv->sock);
> +    SSL_shutdown(priv->ssl_ssl);
> +    SSL_clear(priv->ssl_ssl);
> +    SSL_free(priv->ssl_ssl);
> +    priv->ssl_ssl = NULL;
> +  }
>    if (priv->ssl_private_key) {
>      GF_FREE(priv->ssl_private_key);
>    }
>
>
>
>
>
> From: Amar Tumballi Suryanarayan <atumball at redhat.com>
> Sent: Wednesday, May 01, 2019 8:43 PM
> To: Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com>
> Cc: Milind Changire <mchangir at redhat.com>; gluster-devel at gluster.org
> Subject: Re: [Gluster-devel] glusterfsd memory leak issue found after
> enable ssl
>
>
>
> Hi Cynthia Zhou,
>
>
>
> Can you post the patch which fixes the issue of the missing free? We will
> continue to investigate the leak further, but would really appreciate
> getting the patch you have already worked on landed into upstream master.
>
>
>
> -Amar
>
>
>
> On Mon, Apr 22, 2019 at 1:38 PM Zhou, Cynthia (NSB - CN/Hangzhou) <
> cynthia.zhou at nokia-sbell.com> wrote:
>
> Ok, I am clear now.
>
> I've added SSL_free in the socket reset and socket fini functions. The
> glusterfsd memory leak is much smaller now, but it is still leaking, and
> from the source code I cannot find anything else.
>
> Could you help check whether this issue exists in your environment? If it
> does not, I may try to merge your patch.
>
> Steps:
>
> 1> while true; do gluster v heal <vol-name> info; done
>
> 2> check the glusterfsd memory usage for that volume; it is obviously
> increasing.
>
> cynthia
>
>
>
> From: Milind Changire <mchangir at redhat.com>
> Sent: Monday, April 22, 2019 2:36 PM
> To: Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com>
> Cc: Atin Mukherjee <amukherj at redhat.com>; gluster-devel at gluster.org
> Subject: Re: [Gluster-devel] glusterfsd memory leak issue found after
> enable ssl
>
>
>
> According to the BIO_new_socket() man page ...
>
>
>
> "If the close flag is set then the socket is shut down and closed when the
> BIO is freed."
>
>
>
> For Gluster to have more control over the socket shutdown, the BIO_NOCLOSE
> flag is set. Otherwise, SSL takes control of socket shutdown whenever BIO
> is freed.
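>
> In code that looks roughly like the following (a generic sketch, assuming an
> SSL object and a connected fd already exist): with BIO_NOCLOSE the BIO never
> owns the descriptor, so the transport decides when to close it.
>
>     BIO *bio = BIO_new_socket(fd, BIO_NOCLOSE);  /* BIO does not own the fd     */
>     SSL_set_bio(ssl, bio, bio);                  /* the SSL object owns the BIO */
>     /* ... handshake and I/O ... */
>     SSL_shutdown(ssl);
>     SSL_free(ssl);                               /* frees the BIO; fd stays open */
>     close(fd);                                   /* the caller closes the socket */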
>
>
>
>
>
>
>
> --
>
> Amar Tumballi (amarts)
>
>