[Gluster-devel] glusterfsd memory leak issue found after enable ssl
Zhou, Cynthia (NSB - CN/Hangzhou)
cynthia.zhou at nokia-sbell.com
Mon May 6 06:11:58 UTC 2019
Hi,
From our test valgrind and libleak all blame ssl3_accept
///////////////////////////from valgrind attached to glusterfds///////////////////////////////////////////
==16673== 198,720 bytes in 12 blocks are definitely lost in loss record 1,114 of 1,123
==16673== at 0x4C2EB7B: malloc (vg_replace_malloc.c:299)
==16673== by 0x63E1977: CRYPTO_malloc (in /usr/lib64/libcrypto.so.1.0.2p)
==16673== by 0xA855E0C: ssl3_setup_write_buffer (in /usr/lib64/libssl.so.1.0.2p)
==16673== by 0xA855E77: ssl3_setup_buffers (in /usr/lib64/libssl.so.1.0.2p)
==16673== by 0xA8485D9: ssl3_accept (in /usr/lib64/libssl.so.1.0.2p)
==16673== by 0xA610DDF: ssl_complete_connection (socket.c:400)
==16673== by 0xA617F38: ssl_handle_server_connection_attempt (socket.c:2409)
==16673== by 0xA618420: socket_complete_connection (socket.c:2554)
==16673== by 0xA618788: socket_event_handler (socket.c:2613)
==16673== by 0x4ED6983: event_dispatch_epoll_handler (event-epoll.c:587)
==16673== by 0x4ED6C5A: event_dispatch_epoll_worker (event-epoll.c:663)
==16673== by 0x615C5D9: start_thread (in /usr/lib64/libpthread-2.27.so)
==16673==
==16673== 200,544 bytes in 12 blocks are definitely lost in loss record 1,115 of 1,123
==16673== at 0x4C2EB7B: malloc (vg_replace_malloc.c:299)
==16673== by 0x63E1977: CRYPTO_malloc (in /usr/lib64/libcrypto.so.1.0.2p)
==16673== by 0xA855D12: ssl3_setup_read_buffer (in /usr/lib64/libssl.so.1.0.2p)
==16673== by 0xA855E68: ssl3_setup_buffers (in /usr/lib64/libssl.so.1.0.2p)
==16673== by 0xA8485D9: ssl3_accept (in /usr/lib64/libssl.so.1.0.2p)
==16673== by 0xA610DDF: ssl_complete_connection (socket.c:400)
==16673== by 0xA617F38: ssl_handle_server_connection_attempt (socket.c:2409)
==16673== by 0xA618420: socket_complete_connection (socket.c:2554)
==16673== by 0xA618788: socket_event_handler (socket.c:2613)
==16673== by 0x4ED6983: event_dispatch_epoll_handler (event-epoll.c:587)
==16673== by 0x4ED6C5A: event_dispatch_epoll_worker (event-epoll.c:663)
==16673== by 0x615C5D9: start_thread (in /usr/lib64/libpthread-2.27.so)
==16673==
valgrind --leak-check=f
////////////////////////////////////with libleak attached to glusterfsd/////////////////////////////////////////
callstack[2419] expires. count=1 size=224/224 alloc=362 free=350
/home/robot/libleak/libleak.so(malloc+0x25) [0x7f1460604065]
/lib64/libcrypto.so.10(CRYPTO_malloc+0x58) [0x7f145ecd9978]
/lib64/libcrypto.so.10(EVP_DigestInit_ex+0x2a9) [0x7f145ed95749]
/lib64/libssl.so.10(ssl3_digest_cached_records+0x11d) [0x7f145abb6ced]
/lib64/libssl.so.10(ssl3_accept+0xc8f) [0x7f145abadc4f]
/usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(ssl_complete_connection+0x5e) [0x7f145ae00f3a]
/usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(+0xc16d) [0x7f145ae0816d]
/usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(+0xc68a) [0x7f145ae0868a]
/usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(+0xc9f2) [0x7f145ae089f2]
/lib64/libglusterfs.so.0(+0x9b96f) [0x7f146038596f]
/lib64/libglusterfs.so.0(+0x9bc46) [0x7f1460385c46]
/lib64/libpthread.so.0(+0x75da) [0x7f145f0d15da]
/lib64/libc.so.6(clone+0x3f) [0x7f145e9a7eaf]
callstack[2432] expires. count=1 size=104/104 alloc=362 free=0
/home/robot/libleak/libleak.so(malloc+0x25) [0x7f1460604065]
/lib64/libcrypto.so.10(CRYPTO_malloc+0x58) [0x7f145ecd9978]
/lib64/libcrypto.so.10(BN_MONT_CTX_new+0x17) [0x7f145ed48627]
/lib64/libcrypto.so.10(BN_MONT_CTX_set_locked+0x6d) [0x7f145ed489fd]
/lib64/libcrypto.so.10(+0xff4d9) [0x7f145ed6a4d9]
/lib64/libcrypto.so.10(int_rsa_verify+0x1cd) [0x7f145ed6d41d]
/lib64/libcrypto.so.10(RSA_verify+0x32) [0x7f145ed6d972]
/lib64/libcrypto.so.10(+0x107ff5) [0x7f145ed72ff5]
/lib64/libcrypto.so.10(EVP_VerifyFinal+0x211) [0x7f145ed9dd51]
/lib64/libssl.so.10(ssl3_get_cert_verify+0x5bb) [0x7f145abac06b]
/lib64/libssl.so.10(ssl3_accept+0x988) [0x7f145abad948]
/usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(ssl_complete_connection+0x5e) [0x7f145ae00f3a]
/usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(+0xc16d) [0x7f145ae0816d]
/usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(+0xc68a) [0x7f145ae0868a]
/usr/lib64/glusterfs/3.12.15/rpc-transport/socket.so(+0xc9f2) [0x7f145ae089f2]
/lib64/libglusterfs.so.0(+0x9b96f) [0x7f146038596f]
/lib64/libglusterfs.so.0(+0x9bc46) [0x7f1460385c46]
/lib64/libpthread.so.0(+0x75da) [0x7f145f0d15da]
/lib64/libc.so.6(clone+0x3f) [0x7f145e9a7eaf]
one interesting thing is that the memory goes up to about 300m then it stopped increasing !!!
I am wondering if this is caused by open-ssl library? But when I search from openssl community, there is no such issue reported before.
Is glusterfs using ssl_accept correctly?
cynthia
From: Zhou, Cynthia (NSB - CN/Hangzhou)
Sent: Monday, May 06, 2019 10:34 AM
To: 'Amar Tumballi Suryanarayan' <atumball at redhat.com>
Cc: Milind Changire <mchangir at redhat.com>; gluster-devel at gluster.org
Subject: RE: [Gluster-devel] glusterfsd memory leak issue found after enable ssl
Hi,
Sorry, I am so busy with other issues these days, could you help me to submit my patch for review? It is based on glusterfs3.12.15 code. But even with this patch , memory leak still exists, from memory leak tool it should be related with ssl_accept, not sure if it is because of openssl library or because improper use of ssl interfaces.
--- a/rpc/rpc-transport/socket/src/socket.c
+++ b/rpc/rpc-transport/socket/src/socket.c
@@ -1019,7 +1019,16 @@ static void __socket_reset(rpc_transport_t *this) {
memset(&priv->incoming, 0, sizeof(priv->incoming));
event_unregister_close(this->ctx->event_pool, priv->sock, priv->idx);
-
+ if(priv->use_ssl&& priv->ssl_ssl)
+ {
+ gf_log(this->name, GF_LOG_INFO,
+ "clear and reset for socket(%d), free ssl ",
+ priv->sock);
+ SSL_shutdown(priv->ssl_ssl);
+ SSL_clear(priv->ssl_ssl);
+ SSL_free(priv->ssl_ssl);
+ priv->ssl_ssl = NULL;
+ }
priv->sock = -1;
priv->idx = -1;
priv->connected = -1;
@@ -4238,6 +4250,16 @@ void fini(rpc_transport_t *this) {
pthread_mutex_destroy(&priv->out_lock);
pthread_mutex_destroy(&priv->cond_lock);
pthread_cond_destroy(&priv->cond);
+ if(priv->use_ssl&& priv->ssl_ssl)
+ {
+ gf_log(this->name, GF_LOG_INFO,
+ "clear and reset for socket(%d), free ssl ",
+ priv->sock);
+ SSL_shutdown(priv->ssl_ssl);
+ SSL_clear(priv->ssl_ssl);
+ SSL_free(priv->ssl_ssl);
+ priv->ssl_ssl = NULL;
+ }
if (priv->ssl_private_key) {
GF_FREE(priv->ssl_private_key);
}
From: Amar Tumballi Suryanarayan <atumball at redhat.com<mailto:atumball at redhat.com>>
Sent: Wednesday, May 01, 2019 8:43 PM
To: Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com<mailto:cynthia.zhou at nokia-sbell.com>>
Cc: Milind Changire <mchangir at redhat.com<mailto:mchangir at redhat.com>>; gluster-devel at gluster.org<mailto:gluster-devel at gluster.org>
Subject: Re: [Gluster-devel] glusterfsd memory leak issue found after enable ssl
Hi Cynthia Zhou,
Can you post the patch which fixes the issue of missing free? We will continue to investigate the leak further, but would really appreciate getting the patch which is already worked on land into upstream master.
-Amar
On Mon, Apr 22, 2019 at 1:38 PM Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com<mailto:cynthia.zhou at nokia-sbell.com>> wrote:
Ok, I am clear now.
I’ve added ssl_free in socket reset and socket finish function, though glusterfsd memory leak is not that much, still it is leaking, from source code I can not find anything else,
Could you help to check if this issue exists in your env? If not I may have a try to merge your patch .
Step
1> while true;do gluster v heal <vol-name> info,
2> check the vol-name glusterfsd memory usage, it is obviously increasing.
cynthia
From: Milind Changire <mchangir at redhat.com<mailto:mchangir at redhat.com>>
Sent: Monday, April 22, 2019 2:36 PM
To: Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com<mailto:cynthia.zhou at nokia-sbell.com>>
Cc: Atin Mukherjee <amukherj at redhat.com<mailto:amukherj at redhat.com>>; gluster-devel at gluster.org<mailto:gluster-devel at gluster.org>
Subject: Re: [Gluster-devel] glusterfsd memory leak issue found after enable ssl
According to BIO_new_socket() man page ...
If the close flag is set then the socket is shut down and closed when the BIO is freed.
For Gluster to have more control over the socket shutdown, the BIO_NOCLOSE flag is set. Otherwise, SSL takes control of socket shutdown whenever BIO is freed.
_______________________________________________
Gluster-devel mailing list
Gluster-devel at gluster.org<mailto:Gluster-devel at gluster.org>
https://lists.gluster.org/mailman/listinfo/gluster-devel
--
Amar Tumballi (amarts)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gluster.org/pipermail/gluster-devel/attachments/20190506/98421e48/attachment-0001.html>
More information about the Gluster-devel
mailing list