[Bugs] [Bug 1359364] changelog/rpc: Memory leak- rpc_clnt_t object is never freed
bugzilla at redhat.com
bugzilla at redhat.com
Mon Jul 25 06:40:33 UTC 2016
https://bugzilla.redhat.com/show_bug.cgi?id=1359364
--- Comment #2 from Vijay Bellur <vbellur at redhat.com> ---
COMMIT: http://review.gluster.org/14994 committed in release-3.8 by Raghavendra
G (rgowdapp at redhat.com)
------
commit f55e3ac84db9f5da058a66d0a58e1c462d9d1df5
Author: Kotresh HR <khiremat at redhat.com>
Date: Mon Mar 7 11:45:07 2016 +0530
changelog/rpc: Fix rpc_clnt_t mem leaks
Backport of: http://review.gluster.org/13658
PROBLEM:
1. Freeing up rpc_clnt object might lead to crashes. Well,
it was not a necessity to free rpc-clnt object till now
because all the existing use cases needs to reconnect
back on disconnects. Hence timer code was not taking
ref on rpc-clnt object.
Glusterd had some use-cases that led to crash due to
ping-timer and they fixed only those code paths that
involve ping-timer.
Now, since changelog has an use-case where rpc-clnt
need to be freed up, we need to fix timer code to take
refs
2. In changelog, because of issue 1, only mydata was being
freed which is incorrect. And there are races where
rpc-clnt object would access the freed mydata which
would lead to crashes.
Since changelog xlator resides on brick side and is long
living process, if multiple libgfchangelog consumers
register to changelog and disconnect/reconnect mulitple
times, it would result in leak of 'rpc-clnt' object
for every connect/disconnect.
SOLUTION:
1. Handle ref/unref of 'rpc_clnt' structure in timer
functions properly.
2. In changelog, unref 'rpc_clnt' in RPC_CLNT_DISCONNECT
after disabling timers and free mydata on RPC_CLNT_DESTROY.
RPC SETUP IN CHANGELOG:
1. changelog xlator initiates rpc server say 'changelog_rpc_server'
2. libgfchangelog initiates one rpc server say
'libgfchangelog_rpc_server'
3. libgfchangelog initiates rpc client and connects to
'changelog_rpc_server'
4. In return changelog_rpc_server initiates a rpc client and connects
back
to 'libgfchangelog_rpc_server'
REF/UNREF HANDLING IN TIMER FUNCTIONS:
Let's say rpc clnt refcount = 1
1. Take the ref before reigstering callback to timer queue
>>>> rpc_clnt_ref (say ref count becomes = 2)
2. Register a callback to timer say 'callback1'
3. If register fails:
>>>> rpc_clnt_unref (ref count = 1)
4. On timer expiration, 'callback1' gets called. So unref rpc clnt at
the end
in 'callback1'. This is corresponding to ref taken in step 1
>>>> rpc_clnt_unref (ref count = 1)
5. The cycle from step-1 to step-4 continues....until timer cancel event
happens
6. timer cancel of say 'callback1'
If timer cancel fails:
Do nothing, Step-4 would have unrefd
If timer cancel succeeds:
>>>> rpc_clnt_unref (ref count = 1)
Change-Id: I91389bc511b8b1a17824941970ee8d2c29a74a09
BUG: 1359364
Signed-off-by: Kotresh HR <khiremat at redhat.com>
(cherry picked from commit 637ce9e2e27e9f598a4a6c5a04cd339efaa62076)
Reviewed-on: http://review.gluster.org/14994
Smoke: Gluster Build System <jenkins at build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins at build.gluster.org>
CentOS-regression: Gluster Build System <jenkins at build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp at redhat.com>
--
You are receiving this mail because:
You are on the CC list for the bug.
Unsubscribe from this bug https://bugzilla.redhat.com/token.cgi?t=ze7kAt3aoD&a=cc_unsubscribe
More information about the Bugs
mailing list