[Gluster-devel] finding and fixing memory leaks in xlators

Amar Tumballi atumball at redhat.com
Tue Apr 25 09:25:30 UTC 2017


Thanks for this detailed email with clear instructions, Niels.

Can we have this as GitHub issues as well? I don't want this information to
get lost in the ML archive as part of an email thread. I would also like to
see if I can get some interns to work on these leaks, per component, as part
of their college projects.

-Amar

On Tue, Apr 25, 2017 at 2:29 PM, Niels de Vos <ndevos at redhat.com> wrote:

> Hi,
>
> with the use of gfapi it has become clear that Gluster was never really
> designed to be loaded into applications. There are many different memory
> leaks that get exposed through the usage of gfapi. Before, memory leaks
> were mostly cleaned up automatically because processes would exit. Now,
> there are applications that initialize a GlusterFS client to access a
> volume, and de-init that client once it is not needed anymore.
> Unfortunately, upon de-init not all allocated memory is free'd again.
> These are often just a few bytes, but for long-running processes this can
> become a real problem.
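>
> To illustrate, a minimal gfapi client lifecycle looks roughly like this
> (volume name and server are placeholders, error handling is trimmed,
> link with `-lgfapi`):
>
> ```
> #include <glusterfs/api/glfs.h>
>
> int
> main (void)
> {
>         /* "testvol" and "localhost" are placeholders */
>         glfs_t *fs = glfs_new ("testvol");
>         if (!fs)
>                 return 1;
>
>         glfs_set_volfile_server (fs, "tcp", "localhost", 24007);
>
>         if (glfs_init (fs) != 0) {
>                 glfs_fini (fs);
>                 return 1;
>         }
>
>         /* ... application I/O through gfapi ... */
>
>         /* de-init: any memory still held after this is a leak */
>         glfs_fini (fs);
>         return 0;
> }
> ```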
>
> Finding the memory leaks in xlators has been a tricky thing. Valgrind
> would often not know which function or source file did the allocation,
> and fixing the leak would become a real hunt. There have been some
> patches merged that make Valgrind work more easily, and a few patches
> still need a little more review before everything is available. The
> document below describes how to use a new "sink" xlator to detect memory
> leaks. This xlator can only be merged when a change for the graph
> initialization is included too:
>
>  - https://review.gluster.org/16796 - glusterfs_graph_prepare
>  - https://review.gluster.org/16806 - sink xlator and developer doc
>
> It would be most welcome if other developers could review the linked
> changes, so that everyone can easily debug memory leaks from their
> favorite xlators.
>
> There is a "run-xlator.sh" script in my (for now) personal
> "gluster-debug" repository that can be used to load an arbitrary xlator
> along with "sink". See
> https://github.com/nixpanic/gluster-debug/tree/master/gfapi-load-volfile
> for more details.
>
> Thanks,
> Niels
>
>
> From doc/developer-guide/identifying-resource-leaks.md:
>
> # Identifying Resource Leaks
>
> Like most other pieces of software, GlusterFS is not perfect in how it
> manages its resources like memory, threads and the like. Gluster
> developers try hard to prevent leaking resources by releasing and
> deallocating the structures that are no longer used. Unfortunately, every
> now and then some resource leaks are unintentionally added.
>
> This document tries to explain a few helpful tricks to identify resource
> leaks so that they can be addressed.
>
>
> ## Debug Builds
>
> There are certain techniques used in GlusterFS that make it difficult to
> use tools like Valgrind for memory leak detection. There are some build
> options that make it more practical to use Valgrind and other tools. When
> running Valgrind, it is important to have GlusterFS builds that contain
> the debuginfo/symbols. Some distributions (try to) strip the debuginfo to
> get smaller executables. Fedora and RHEL based distributions have
> sub-packages called ...-debuginfo that need to be installed for symbol
> resolving.
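>
> On Fedora, installing the debuginfo packages usually looks like this
> (package names can differ per distribution and release):
>
> ```shell
> dnf debuginfo-install glusterfs glusterfs-libs
> # on older releases the command comes from yum-utils:
> # debuginfo-install glusterfs glusterfs-libs
> ```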
>
>
> ### Memory Pools
>
> By using memory pools, no allocation or freeing of single structures is
> needed. This improves performance, but it also makes it impossible to
> track the allocation and freeing of structures.
>
> It is possible to disable the use of memory pools, and use the standard
> `malloc()` and `free()` functions provided by the C library. Valgrind is
> then able to track the allocated areas and verify whether they have been
> free'd. In order to disable memory pools, the Gluster sources need to be
> configured with the `--enable-debug` option:
>
> ```shell
> ./configure --enable-debug
> ```
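>
> For context, the following sketch contrasts the two allocation paths. It
> assumes the in-tree libglusterfs mem-pool API of that era
> (`mem_pool_new()`, `mem_get0()`, `mem_put()`, `mem_pool_destroy()`):
>
> ```
> #include "mem-pool.h"  /* in-tree header from libglusterfs/src */
>
> struct demo { int a; };
>
> static void
> pool_vs_malloc_demo (void)
> {
>         /* pooled: objects are carved out of one pre-allocated slab, so
>          * Valgrind only sees the slab, never the individual objects */
>         struct mem_pool *pool = mem_pool_new (struct demo, 1024);
>         struct demo *d = mem_get0 (pool);  /* no per-object malloc() */
>
>         mem_put (d);             /* returned to the pool, not free'd */
>         mem_pool_destroy (pool);
>
>         /* with --enable-debug the pool calls fall back to the C
>          * library's calloc()/free(), so Valgrind tracks each object */
> }
> ```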
>
> When building RPMs, the `.spec` handles the `--with=debug` option too:
>
> ```shell
> make dist
> rpmbuild -ta --with=debug glusterfs-....tar.gz
> ```
>
> ### Dynamically Loaded xlators
>
> Valgrind tracks the call chain of functions that do memory allocations.
> The addresses of the functions are stored, and before Valgrind exits the
> addresses are resolved into human-readable function names and offsets
> (line numbers in source files). Because Gluster loads xlators dynamically,
> and unloads them before exiting, Valgrind is not able to resolve the
> function addresses into symbols anymore. Whenever this happens, Valgrind
> shows `???` in the output, like
>
> ```
>   ==25170== 344 bytes in 1 blocks are definitely lost in loss record 233 of 324
>   ==25170==    at 0x4C29975: calloc (vg_replace_malloc.c:711)
>   ==25170==    by 0x52C7C0B: __gf_calloc (mem-pool.c:117)
>   ==25170==    by 0x12B0638A: ???
>   ==25170==    by 0x528FCE6: __xlator_init (xlator.c:472)
>   ==25170==    by 0x528FE16: xlator_init (xlator.c:498)
>   ...
> ```
>
> These `???` entries can be prevented by not calling `dlclose()` when
> unloading the xlator. This causes a small leak of the handle that was
> returned by `dlopen()`, but for improved debugging this can be acceptable
> (a sketch of the pattern follows at the end of this section). For this
> and other Valgrind features, a `--enable-valgrind` option is available to
> `./configure`. When GlusterFS is built with this option, Valgrind will be
> able to resolve the symbol names of the functions that do memory
> allocations inside xlators.
>
> ```shell
> ./configure --enable-valgrind
> ```
>
> When building RPMs, the `.spec` handles the `--with=valgrind` option too:
>
> ```shell
> make dist
> rpmbuild -ta --with=valgrind glusterfs-....tar.gz
> ```
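>
> Conceptually, the change amounts to skipping `dlclose()` when debugging.
> A minimal sketch of that pattern (the `RUN_WITH_VALGRIND` guard below is
> hypothetical, not the actual GlusterFS macro):
>
> ```
> #include <dlfcn.h>
>
> static void
> unload_xlator (void *dl_handle)
> {
> #ifndef RUN_WITH_VALGRIND
>         /* normal operation: unload the shared object */
>         dlclose (dl_handle);
> #else
>         /* debugging: keep the handle open so Valgrind can still resolve
>          * function addresses inside the xlator at exit; the handle is
>          * intentionally leaked */
>         (void) dl_handle;
> #endif
> }
> ```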
>
> ## Running Valgrind against a single xlator
>
> Debugging a single xlator is not trivial, but there are some tools to
> make it easier. The `sink` xlator does not do any memory allocations
> itself, but contains just enough functionality to mount a volume with
> only the `sink` xlator. There is a little gfapi application under
> `tests/basic/gfapi/` in the GlusterFS sources that can be used to run
> only gfapi and the core GlusterFS infrastructure with the `sink` xlator.
> By extending the `.vol` file to load more xlators, each xlator can be
> debugged pretty much separately (as long as the xlators have no
> dependencies on each other). A basic Valgrind run with the suitable
> configure options looks like this:
>
> ```shell
> ./autogen.sh
> ./configure --enable-debug --enable-valgrind
> make && make install
> cd tests/basic/gfapi/
> make gfapi-load-volfile
> valgrind ./gfapi-load-volfile sink.vol
> ```
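>
> For reference, a minimal `sink.vol` could look like the one below. The
> exact `type` string depends on where the sink xlator ends up in the
> source tree; `testing/sink` is only an assumption here:
>
> ```
> volume test-sink
>     type testing/sink
> end-volume
> ```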
>
> Combined with a few other very useful Valgrind options, the following
> invocation shows many more details:
>
> ```shell
> valgrind \
>         --fullpath-after= --leak-check=full --show-leak-kinds=all \
>         ./gfapi-load-volfile sink.vol
> ```
>
> Note that the `--fullpath-after=` option is left empty; this makes
> Valgrind print the full path of the file that contains each function:
>
> ```
> ==2450== 80 bytes in 1 blocks are definitely lost in loss record 8 of 60
> ==2450==    at 0x4C29975: calloc (/builddir/build/BUILD/valgrind-3.11.0/coregrind/m_replacemalloc/vg_replace_malloc.c:711)
> ==2450==    by 0x52C6F73: __gf_calloc (/usr/src/debug/glusterfs-3.11dev/libglusterfs/src/mem-pool.c:117)
> ==2450==    by 0x12F10CDA: init (/usr/src/debug/glusterfs-3.11dev/xlators/meta/src/meta.c:231)
> ==2450==    by 0x528EFD5: __xlator_init (/usr/src/debug/glusterfs-3.11dev/libglusterfs/src/xlator.c:472)
> ==2450==    by 0x528F105: xlator_init (/usr/src/debug/glusterfs-3.11dev/libglusterfs/src/xlator.c:498)
> ==2450==    by 0x52D9D8B: glusterfs_graph_init (/usr/src/debug/glusterfs-3.11dev/libglusterfs/src/graph.c:321)
> ...
> ```
>
> In the above example, the `init` function in `xlators/meta/src/meta.c`
> does a memory allocation on line 231. This memory is never free'd again,
> and hence Valgrind logs this call stack. When looking at the code, it
> seems that the allocated `priv` is assigned to the `this->private` member
> of the `xlator_t` structure. Because the allocation is done in `init()`,
> free'ing is expected to happen in `fini()`. Both functions are shown
> below, including the empty `fini()`:
>
>
> ```
> 226 int
> 227 init (xlator_t *this)
> 228 {
> 229         meta_priv_t *priv = NULL;
> 230
> 231         priv = GF_CALLOC (sizeof(*priv), 1, gf_meta_mt_priv_t);
> 232         if (!priv)
> 233                 return -1;
> 234
> 235         GF_OPTION_INIT ("meta-dir-name", priv->meta_dir_name, str,
> out);
> 236
> 237         this->private = priv;
> 238 out:
> 239         return 0;
> 240 }
> 241
> 242
> 243 int
> 244 fini (xlator_t *this)
> 245 {
> 246         return 0;
> 247 }
> ```
>
> In this case, the resource leak can be addressed by adding a single line
> to the `fini()` function:
>
> ```
> 243 int
> 244 fini (xlator_t *this)
> 245 {
> 246         GF_FREE (this->private);
> 247         return 0;
> 248 }
> ```
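>
> A slightly more defensive variant (a style suggestion, not part of the
> original fix) also resets the pointer so it cannot be free'd twice:
>
> ```
> int
> fini (xlator_t *this)
> {
>         GF_FREE (this->private);
>         this->private = NULL;
>         return 0;
> }
> ```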
>
> Running the same Valgrind command and comparing the output will show
> that the memory leak in `xlators/meta/src/meta.c:init` is not reported
> anymore.
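>
> For example, a quick before/after check could look like this:
>
> ```shell
> valgrind --leak-check=full --fullpath-after= \
>         ./gfapi-load-volfile sink.vol 2>&1 | grep -A5 'definitely lost'
> ```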
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-devel
>



-- 
Amar Tumballi (amarts)