[Gluster-devel] how do you debug ref leaks?

Krishnan Parthasarathi kparthas at redhat.com
Thu Sep 18 16:33:49 UTC 2014


I am going to be bold and throw in a suggestion inspired by
what I read today. I was reading briefly about how the kernel
manages its objects using the kobject data structure and
establishes a 'liveness' (in the garbage collection sense)
relationship across objects. It allows one to group objects
into 'sets'. This would help us categorise objects into
'groups', derive from statedumps which group the leaking
object belongs to, and infer some context from that. This is
different from the other approaches that have been discussed. I am not
familiar with what more could be done with that sort of infrastructure. I hope
this opens up the discussion and doesn't distract from it.
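
To make the grouping idea a little more concrete, here is a rough sketch in
plain C. It is not the kernel's kobject/kset API and it is not existing
GlusterFS code; the group names, struct tracked_obj and the tracked_*()
functions are all hypothetical. It only shows a refcounted object that
remembers which group took each ref, so that a statedump-style dump can name
the groups that still hold refs. The counters are plain ints, so a real
version would need locking or atomics.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical groups; illustrative names only, not GlusterFS identifiers. */
enum ref_group {
    REF_GROUP_LOOKUP,
    REF_GROUP_WRITE,
    REF_GROUP_LOCK,
    REF_GROUP_MAX
};

static const char *ref_group_names[REF_GROUP_MAX] = {
    "lookup", "write", "lock"
};

/* A refcounted object that also records how many refs each group holds. */
struct tracked_obj {
    int refcount;
    int group_refs[REF_GROUP_MAX];
};

/* Take a ref on behalf of a group. Not thread safe in this sketch. */
static void tracked_ref(struct tracked_obj *obj, enum ref_group grp)
{
    assert((int)grp >= 0 && (int)grp < REF_GROUP_MAX);
    obj->refcount++;
    obj->group_refs[grp]++;
}

/* Release a ref previously taken on behalf of the same group. */
static void tracked_unref(struct tracked_obj *obj, enum ref_group grp)
{
    assert((int)grp >= 0 && (int)grp < REF_GROUP_MAX);
    assert(obj->group_refs[grp] > 0);
    obj->refcount--;
    obj->group_refs[grp]--;
}

/*
 * Print every group that still holds refs, one line per group, and
 * return the number of such groups.
 */
static int tracked_dump(const struct tracked_obj *obj, FILE *out)
{
    int leaking = 0;

    for (int i = 0; i < REF_GROUP_MAX; i++) {
        if (obj->group_refs[i] != 0) {
            fprintf(out, "group=%s outstanding_refs=%d\n",
                    ref_group_names[i], obj->group_refs[i]);
            leaking++;
        }
    }
    return leaking;
}
```

With something like this, an object whose refcount never drops to zero would
at least tell us which group forgot its unref, which narrows down the code
that has to be read.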

~KP

----- Original Message -----
> 
> On 09/18/2014 09:35 PM, Pranith Kumar Karampuri wrote:
> >
> > On 09/18/2014 09:31 PM, Kaleb KEITHLEY wrote:
> >> As a wishlist item, I think it'd be nice if debug builds (or some
> >> other build-time option) would disable the pools. Then valgrind might
> >> be more useful for finding leaks.
> Actually there seems to be some issue with running bricks under
> valgrind. Operations on the mount hang when we start the bricks (Ravi
> confirmed this situation even today). That still needs to be solved; it
> used to work. Not sure what happened.
> 
> Pranith.
> >>
> >> Maybe for GlusterFS-4.0?
> > This is already available: http://review.gluster.org/7835
> >
> > Pranith
> >>
> >>
> >> On 09/18/2014 11:40 AM, Dan Lambright wrote:
> >>> If we could disable/enable ref tracking dynamically, it may only be
> >>> "heavy weight" temporarily while the customer is being observed.
> >>> You could get a statedump, or another idea is to take a core of
> >>> the live process: gcore $(pidof processname)
> >>>
> >>> ----- Original Message -----
> >>>> From: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
> >>>> To: "Shyam" <srangana at redhat.com>, gluster-devel at gluster.org
> >>>> Sent: Thursday, September 18, 2014 11:34:28 AM
> >>>> Subject: Re: [Gluster-devel] how do you debug ref leaks?
> >>>>
> >>>>
> >>>> On 09/18/2014 07:48 PM, Shyam wrote:
> >>>>> On 09/17/2014 10:13 PM, Pranith Kumar Karampuri wrote:
> >>>>>> hi,
> >>>>>>       Till now the only method I have used to find ref leaks
> >>>>>> effectively is to find which operation is causing them and read
> >>>>>> the code to find if there is a ref leak somewhere. Valgrind doesn't
> >>>>>> solve this problem because the memory is still reachable from the
> >>>>>> inode-table etc. I am just wondering if there is an effective way
> >>>>>> anyone else knows of. Do you guys think we need a better mechanism
> >>>>>> for finding ref leaks? At least one which decreases the search space
> >>>>>> significantly, i.e. xlator y, fop f etc? It would be better if we can
> >>>>>> come up with ways to integrate statedump and this infra just like we
> >>>>>> did for mem-accounting.
> >>>>>>
> >>>>>> One way I thought of was to introduce new APIs called
> >>>>>> xl_fop_dict/inode/fd_ref/unref (). Each xl keeps an array of num_fops
> >>>>>> per inode/dict/fd and increments/decrements accordingly. Dump this
> >>>>>> info on statedump.
> >>>>>>
> >>>>>> I myself am not completely sure about this idea. It requires all
> >>>>>> xlators to change.
> >>>>>>
> >>>>>> Any ideas?
> >>>>>
> >>>>> On a debug build we can use backtrace information stashed per ref and
> >>>>> unref; this will give us a history of refs taken and released, which
> >>>>> will also give the code path where each ref was taken and released.
> >>>>>
> >>>>> It is heavy weight, so not for non-debug setups, but if a problem is
> >>>>> reproducible this could be a quick way to check who is not releasing
> >>>>> the refs, or to have a history of the refs and unrefs to dig better
> >>>>> into the code.
> >>>>>
> >>>> Do you have any ideas for final builds also? Basically when users
> >>>> report leaks it should not take us too long to figure out the problem
> >>>> area. We should just ask them for a statedump and should be able to
> >>>> figure out the problem.
> >>>>
> >>>> Pranith
> >>>>> Shyam
> >>>>> _______________________________________________
> >>>>> Gluster-devel mailing list
> >>>>> Gluster-devel at gluster.org
> >>>>> http://supercolony.gluster.org/mailman/listinfo/gluster-devel

