[Gluster-devel] too many failures on mpx-restart-crash.t on master branch

Amar Tumballi atumball at redhat.com
Thu Dec 20 09:09:37 UTC 2018


Considering that the effort to reduce the number of threads is already in
progress, should we mark this as a known issue until that thread-reduction
patch is merged?

-Amar

On Thu, Dec 20, 2018 at 2:38 PM Poornima Gurusiddaiah <pgurusid at redhat.com>
wrote:

> So, this failure is related to the iobuf patch [1]; thanks to Pranith for
> identifying this. The patch increases memory consumption in the brick-mux
> use case and causes an OOM kill, but the problem is not with the patch
> itself. The only way to fix it properly is to fix issue [2]. That said, we
> cannot wait until that issue is fixed, so the possible workarounds are:
> - Reduce the volume creation count in the test case mpx-restart-crash.t
> (temporarily, until [2] is fixed)
> - Increase the resources (RAM to 4 GB?) on the regression system
> - Revert the patch until [2] is completely fixed
>
> Root Cause:
> Without the iobuf patch [1], we had a pre-allocated pool with a minimum
> size of 12.5 MB (which can grow); in many cases this entire size is not
> actually used. Hence we moved iobuf to a per-thread mem pool as well. With
> this we expect the memory consumption of the processes to go down, and it
> did go down. After creating 20 volumes on the system, the free -m output is:
> With this patch:
>                total        used        free      shared  buff/cache   available
> Mem:            3789        2198         290         249        1300         968
> Swap:           3071           0        3071
>
> Without this patch:
>                total        used        free      shared  buff/cache   available
> Mem:            3789        2280         115         488        1393         647
> Swap:           3071           0        3071
> This output can vary based on system state, workload, etc.; it is not
> indicative of the exact amount of memory reduction, but it does show that
> the memory usage is reduced.
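>
> For illustration, here is a minimal C sketch of the idea behind a
> per-thread buffer cache. This is not the GlusterFS iobuf/mem-pool code;
> the names and structures are made up for the example. Instead of serving
> all threads from one shared, pre-allocated arena, each thread keeps its
> own free list, and "freeing" a buffer only parks it there:
>
>     /* Hypothetical sketch: one buffer cache per thread. */
>     #include <stdlib.h>
>
>     struct buf {
>         struct buf *next;
>         size_t      size;
>     };
>
>     /* __thread (GCC/Clang) gives every thread its own copy of this pointer. */
>     static __thread struct buf *thread_cache;
>
>     static void *thread_buf_get(size_t size)
>     {
>         if (thread_cache) {                 /* reuse a cached buffer */
>             struct buf *b = thread_cache;
>             thread_cache = b->next;
>             return b + 1;
>         }
>         struct buf *b = malloc(sizeof(*b) + size);  /* cache miss */
>         b->size = size;
>         return b + 1;
>     }
>
>     static void thread_buf_put(void *ptr)
>     {
>         /* The buffer is not returned to the allocator here; it only goes
>          * back to this thread's cache and is freed later by a periodic
>          * sweep (see the second sketch further down). */
>         struct buf *b = (struct buf *)ptr - 1;
>         b->next = thread_cache;
>         thread_cache = b;
>     }
>
>     int main(void)
>     {
>         void *p = thread_buf_get(128 * 1024);
>         thread_buf_put(p);
>         return 0;
>     }
>
> This keeps allocation cheap and lock-free per thread, but it also means
> every live thread can pin some cached buffers of its own.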
>
> But with brick mux the scenario is different. Since patch [1] uses a
> per-thread mem pool for iobuf, the memory consumption due to iobuf grows
> as the number of threads grows. In the current brick-mux implementation,
> for 20 volumes (as in the mpx-restart-crash test), the number of threads
> is 1439. Also, allocated iobufs (or any other per-thread mem-pool memory)
> are not freed until 30 s (the garbage-collection interval) after they are
> released (e.g. via iobuf_put). As a result, the memory consumption of the
> process appears to increase under brick mux. Reducing the number of
> threads to <100 [2] will solve this issue. To test this theory, if we add
> a 30-second delay between each volume create in mpx-restart-crash, the
> memory consumption is:
>
> With this patch, after adding a 30s delay between each volume create:
>                total        used        free      shared  buff/cache   available
> Mem:            3789        1344         947         488        1497        1606
> Swap:           3071           0        3071
>
> With this patch:
>                total        used        free      shared  buff/cache   available
> Mem:            3789        1710         840         235        1238        1494
> Swap:           3071           0        3071
>
> Without this patch:
>                total        used        free      shared  buff/cache   available
> Mem:            3789        1413         969         355        1406        1668
> Swap:           3071           0        3071
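>
> And a matching sketch of a time-based collection pass that only releases
> buffers which have sat unused for 30 seconds (again illustrative only, not
> the actual sweeper in the GlusterFS mem-pool code). With ~1439 threads each
> holding such a cache, a lot of memory can stay allocated between passes,
> which is why spacing the volume creates 30 s apart lets usage drain back
> down:
>
>     #include <stdlib.h>
>     #include <time.h>
>
>     #define GC_INTERVAL 30             /* seconds, matching the 30s above */
>
>     struct cached_buf {
>         struct cached_buf *next;
>         time_t             freed_at;   /* when the owner released it */
>     };
>
>     static __thread struct cached_buf *cache_head;
>
>     /* Free everything that has been idle for GC_INTERVAL or longer. */
>     static void thread_cache_sweep(void)
>     {
>         time_t now = time(NULL);
>         struct cached_buf **pp = &cache_head;
>
>         while (*pp) {
>             if (now - (*pp)->freed_at >= GC_INTERVAL) {
>                 struct cached_buf *dead = *pp;
>                 *pp = dead->next;      /* unlink from the cache */
>                 free(dead);            /* memory finally returned */
>             } else {
>                 pp = &(*pp)->next;
>             }
>         }
>     }
>
>     int main(void)
>     {
>         thread_cache_sweep();          /* nothing cached yet: a no-op */
>         return 0;
>     }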
>
> Regards,
> Poornima
>
> [1] https://review.gluster.org/#/c/glusterfs/+/20362/
> [2] https://github.com/gluster/glusterfs/issues/475
>
> On Thu, Dec 20, 2018 at 10:28 AM Amar Tumballi <atumball at redhat.com>
> wrote:
>
>> Since yesterday, at least 10 patches have failed regression on
>> ./tests/bugs/core/bug-1432542-mpx-restart-crash.t
>>
>>
>> Help debugging these failures soon would be appreciated.
>>
>>
>> Regards,
>>
>> Amar
>>
>>
>> --
>> Amar Tumballi (amarts)
>> _______________________________________________
>> Gluster-devel mailing list
>> Gluster-devel at gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-devel
>
>

-- 
Amar Tumballi (amarts)