[Gluster-Maintainers] GF_CALLOC to GF_MALLOC conversion - is it safe?

Atin Mukherjee amukherj at redhat.com
Thu Mar 21 10:45:33 UTC 2019


All,

In the last few releases of glusterfs, with stability as a primary theme of
the releases, there has been lots of changes done on the code optimization
with an expectation that such changes will have gluster to provide better
performance. While many of these changes do help, but off late we have
started seeing some diverse effects of them, one especially being the
calloc to malloc conversions. While I do understand that malloc syscall
will eliminate the extra memset bottleneck which calloc bears, but with
recent kernels having in-built strong compiler optimizations I am not sure
whether that makes any significant difference, but as I mentioned earlier
certainly if this isn't done carefully it can potentially introduce lot of
bugs and I'm writing this email to share one of such experiences.

Sanju & I were having troubles for last two days to figure out why
https://review.gluster.org/#/c/glusterfs/+/22388/ wasn't working in Sanju's
system but it had no problems running the same fix in my gluster
containers. After spending a significant amount of time, what we now
figured out is that a malloc call [1] (which was a calloc earlier) is the
culprit here. As you all can see, in this function we allocate txn_id and
copy the event->txn_id into it through gf_uuid_copy () . But when we were
debugging this step wise through gdb, txn_id wasn't exactly copied with the
exact event->txn_id and it had some junk values which made the
glusterd_clear_txn_opinfo to be invoked with a wrong txn_id later on
resulting the leaks to remain the same which was the original intention of
the fix.

This was quite painful to debug and we had to spend some time to figure
this out. Considering we have converted many such calls in past, I'd urge
that we review all such conversions and see if there're any side effects to
it. Otherwise we might end up running into many potential memory related
bugs later on. OTOH, going forward I'd request every patch
owners/maintainers to pay some special attention to these conversions and
see they are really beneficial and error free. IMO, general guideline
should be - for bigger buffers, malloc would make better sense but has to
be done carefully, for smaller size, we stick to calloc.

What do others think about it?

[1]
https://github.com/gluster/glusterfs/blob/master/xlators/mgmt/glusterd/src/glusterd-op-sm.c#L5681
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gluster.org/pipermail/maintainers/attachments/20190321/36cc1735/attachment.html>


More information about the maintainers mailing list