[Gluster-users] Memory leak with a replica 3 arbiter 1 configuration

Ravishankar N ravishankar at redhat.com
Wed Aug 24 01:03:07 UTC 2016


On 08/24/2016 02:12 AM, Benjamin Edgar wrote:
> My test servers have been running for about 3 hours now (with the
> while loop constantly writing and deleting files), and the memory
> usage of the arbiter brick process has not increased in the past
> hour. Before the patch it was constantly increasing, so it looks
> like adding the "GF_FREE (ctx->iattbuf);" line in arbiter.c fixed
> the issue. If anything changes overnight I will post an update, but
> I believe the fix worked!
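>
> For what it's worth, this is roughly how I am watching the brick's
> memory; a minimal sketch, and the pgrep pattern is just a placeholder
> for whatever uniquely matches the arbiter brick's command line:
>
>     # Sample the arbiter brick's resident memory once a minute.
>     pid=$(pgrep -f 'glusterfsd.*arbiter' | head -n 1)
>     while true ; do
>       date
>       ps -o pid,rss,vsz,comm -p "$pid"
>       sleep 60
>     done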
>
> Once this patch makes it into the master branch, how long does it
> usually take to be backported to a 3.8.x release?
>
Hi Ben,
Thanks for testing. Per the minor release schedule [1], 3.8.x releases
go out on the 10th of every month, but an out-of-order 3.8.3 release
was just made, so 3.8.4 may take a bit longer.

Thanks,
Ravi

[1] https://www.gluster.org/community/release-schedule/
> Thanks!
> Ben
>
> On Tue, Aug 23, 2016 at 2:18 PM, Benjamin Edgar <benedgar8 at gmail.com> wrote:
>
>     Hi Ravi,
>
>     I saw that you updated the patch today
>     (http://review.gluster.org/#/c/15289/). I built an RPM of your
>     first iteration of the patch (just the one-line change in
>     arbiter.c, "GF_FREE (ctx->iattbuf);") and am now running it on
>     some test servers to see whether the arbiter brick's memory
>     still grows out of control.
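>
>     For anyone who wants to try the same build, this is roughly how
>     I checked the patch out before packaging; the refspec is an
>     assumption based on Gerrit's standard refs/changes layout and
>     patchset 1:
>
>         git clone https://github.com/gluster/glusterfs.git
>         cd glusterfs
>         # Patchset 1 of change 15289; Gerrit refspecs follow
>         # refs/changes/<last 2 digits>/<change number>/<patchset>.
>         git fetch http://review.gluster.org/glusterfs refs/changes/89/15289/1
>         git checkout FETCH_HEAD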
>
>     Ben
>
>     On Tue, Aug 23, 2016 at 3:38 AM, Ravishankar N
>     <ravishankar at redhat.com> wrote:
>
>         Hi Benjamin,
>
>         On 08/23/2016 06:41 AM, Benjamin Edgar wrote:
>>         I've attached a statedump of the problem brick process.  Let
>>         me know if there are any other logs you need.
>
>         Thanks for the report! I've sent a fix at
>         http://review.gluster.org/#/c/15289/ . It would be great if
>         you could verify that the patch fixes the issue for you.
>
>         Thanks,
>         Ravi
>
>>
>>         Thanks a lot,
>>         Ben
>>
>>         On Mon, Aug 22, 2016 at 5:03 PM, Pranith Kumar Karampuri
>>         <pkarampu at redhat.com> wrote:
>>
>>             Could you collect a statedump of the brick process by
>>             following
>>             https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump ?
>>
>>             That should help us identify which data type is leaking
>>             so we can fix it.
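>>
>>             In short, something like this; "myvol" is a placeholder
>>             for the volume name, and the dump location can vary by
>>             install:
>>
>>                 # Trigger a statedump for the volume's brick processes:
>>                 gluster volume statedump myvol
>>
>>                 # The dumps are written on each brick's server,
>>                 # typically under /var/run/gluster, and are named
>>                 # after the brick path and the brick's PID:
>>                 ls /var/run/gluster/*.dump.*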
>>
>>             Thanks!
>>
>>             On Tue, Aug 23, 2016 at 2:22 AM, Benjamin Edgar
>>             <benedgar8 at gmail.com> wrote:
>>
>>                 Hi,
>>
>>                 I appear to have a memory leak with a replica 3
>>                 arbiter 1 configuration of gluster. I have a data
>>                 brick and the arbiter brick on one server, and the
>>                 other data brick on a second server. The more files
>>                 I write to gluster in this configuration, the more
>>                 memory the arbiter brick process consumes.
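>>
>>                 For reference, the volume was created with something
>>                 like this (host names and brick paths are
>>                 placeholders; the third brick listed becomes the
>>                 arbiter, and "force" is needed because two bricks
>>                 share a server):
>>
>>                     gluster volume create myvol replica 3 arbiter 1 \
>>                         host1:/bricks/data1 host2:/bricks/data2 \
>>                         host1:/bricks/arbiter force
>>                     gluster volume start myvol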
>>
>>                 I am able to reproduce this issue by first setting
>>                 up a replica 3 arbiter 1 configuration and then
>>                 using the following bash script to create 10,000
>>                 200kB files, delete them, and repeat forever:
>>
>>                 while true ; do
>>                   # Write 10,000 files of 200 kB of random data ...
>>                   for i in {1..10000} ; do
>>                     dd if=/dev/urandom bs=200K count=1 of=$TEST_FILES_DIR/file$i
>>                   done
>>                   # ... then delete them all and start over.
>>                   rm -rf $TEST_FILES_DIR/*
>>                 done
>>
>>                 $TEST_FILES_DIR is a location on my gluster mount.
>>
>>                 After about 3 days of this script running on one of
>>                 my clusters, this is what the output of "top" looks like:
>>                   PID USER PR NI    VIRT    RES  SHR S %CPU %MEM     TIME+ COMMAND
>>                 16039 root 20  0 1397220  77720 3948 S 20.6  1.0 860:01.53 glusterfsd
>>                 13174 root 20  0 1395824 112728 3692 S 19.6  1.5 806:07.17 glusterfs
>>                 19961 root 20  0 2967204 2.145g 3896 S 17.3 29.0 752:10.70 glusterfsd
>>
>>                 As you can see one of the brick processes is using
>>                 over 2 gigabytes of memory.
>>
>>                 One workaround is to kill the arbiter brick process
>>                 and restart the gluster daemon; the arbiter brick is
>>                 respawned and its memory usage drops back to a
>>                 reasonable level. However, I would rather not have
>>                 to kill the arbiter brick every week in production
>>                 environments.
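>>
>>                 Concretely, the workaround looks roughly like this
>>                 ("myvol" is a placeholder for the volume name):
>>
>>                     # Find the arbiter brick's PID in the status table:
>>                     gluster volume status myvol
>>
>>                     # Kill that brick process, then restart the
>>                     # management daemon so it respawns the brick:
>>                     kill <arbiter-brick-pid>
>>                     systemctl restart glusterd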
>>
>>                 Has anyone seen this issue before, and is there a
>>                 known workaround or fix?
>>
>>                 Thanks,
>>                 Ben
>>
>>                 _______________________________________________
>>                 Gluster-users mailing list
>>                 Gluster-users at gluster.org
>>                 http://www.gluster.org/mailman/listinfo/gluster-users
>>
>>
>>
>>
>>             -- 
>>             Pranith
>>
>>
>>
>>
>>         -- 
>>         Benjamin Edgar
>>         Computer Science
>>         University of Virginia 2015
>>         (571) 338-0878
>>
>>
>>         _______________________________________________
>>         Gluster-users mailing list
>>         Gluster-users at gluster.org
>>         http://www.gluster.org/mailman/listinfo/gluster-users
>
>     -- 
>     Benjamin Edgar
>     Computer Science
>     University of Virginia 2015
>     (571) 338-0878
>
> -- 
> Benjamin Edgar
> Computer Science
> University of Virginia 2015
> (571) 338-0878
