[Gluster-devel] Broken code alert - translator fini functions

Kaushal M kshlmster at gmail.com
Thu Nov 26 09:28:56 UTC 2015


On Thu, Nov 26, 2015 at 10:03 AM, Atin Mukherjee <amukherj at redhat.com> wrote:
>
>
> On 11/26/2015 05:49 AM, Jeff Darcy wrote:
>>
>> In the process of debugging a test that relies on my translator’s “fini” function being called, I discovered that these functions are not being called for translators in glusterfsd.  The offending code seems to be at glusterfsd.c:1274 (cleanup_and_exit).
>>
>> 1274         if (ctx->process_mode == GF_GLUSTERD_PROCESS) {
> The GlusterD team enabled this code conditionally to fix its URCU-related
> crashes. Prior to that the entire code was commented out, and it seems
> like a pretty old commit (13c4f8d0) had disabled fini () for all the
> translators with a commit message saying "calling 'fini()' of each
> xlator needs more synchronization work to be done. We will be doing a
> direct 'exit()' as of now."
>
> Does anyone have background on this?
>
> Thanks,
> Atin
>
>> 1275
>> 1276                 trav = NULL;
>> 1277                 if (ctx->active)
>> 1278                         trav = ctx->active->top;
>> 1279                 while (trav) {
>> 1280                         if (trav->fini) {
>> 1281                                 THIS = trav;
>> 1282                                 trav->fini (trav);
>> 1283                         }
>> 1284                         trav = trav->next;
>> 1285                 }
>> 1286
>> 1287         }
>>
>> This might have been a simple fix, except that when I changed the code to call fini in glusterfsd as well as glusterd, I started getting segfaults.  It seems that, since they haven’t been tested in a while, some of these functions are also broken and nobody has known since May when this code was changed.  (Actually it might be longer, since the commit message notes “clean up issues” as a reason for the check on line 1274, but apparently addressing this problem never got on anyone’s TODO list.)
>>
>> The reason I’m mentioning this is not just to complain or assign blame.  Stuff happens.  I just want people to know so that, as this gets cleaned up, it’s in people’s minds as a potential factor in other unexpected new behavior.

The xlator cleanup part of cleanup_and_exit() had been disabled
(almost forever), specifically to avoid the segfaults you've observed.
None of the xlators had their fini() functions called on shutdown. As
Atin mentioned, this started causing problems with URCU, so we enabled
the cleanup only for GlusterD.
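
To make the current behaviour concrete, here is a minimal standalone
sketch of the gated traversal quoted above. The types, demo_fini() and
count_fini_calls() are simplified stand-ins I made up for illustration,
not the real glusterfs definitions, and the sketch leaves out the
THIS = trav assignment from the original loop.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for illustration; NOT the real glusterfs types. */
typedef enum {
        GF_SERVER_PROCESS,
        GF_GLUSTERD_PROCESS,
        GF_CLIENT_PROCESS
} process_mode_t;

typedef struct xlator {
        void          (*fini) (struct xlator *this);
        struct xlator  *next;
        int             finalized; /* set by demo_fini, for inspection only */
} xlator_t;

typedef struct {
        xlator_t *top;
} graph_t;

typedef struct {
        process_mode_t  process_mode;
        graph_t        *active;
} ctx_t;

/* Hypothetical fini: just records that it ran. */
static void
demo_fini (xlator_t *this)
{
        this->finalized = 1;
}

/*
 * Mirrors the loop in cleanup_and_exit(): walk the active graph from
 * the top and call fini where one is set, but only when the process
 * is glusterd. Returns the number of fini calls made.
 * (The real loop also sets THIS = trav before each call; omitted here.)
 */
static int
run_fini_chain (ctx_t *ctx)
{
        xlator_t *trav   = NULL;
        int       called = 0;

        assert (ctx != NULL);

        if (ctx->process_mode != GF_GLUSTERD_PROCESS)
                return 0;

        if (ctx->active)
                trav = ctx->active->top;
        while (trav) {
                if (trav->fini) {
                        trav->fini (trav);
                        called++;
                }
                trav = trav->next;
        }
        return called;
}

/*
 * Hypothetical helper: builds a three-xlator graph (the middle one has
 * no fini) and runs the chain for the given process mode.
 */
static int
count_fini_calls (process_mode_t mode)
{
        xlator_t bottom = { demo_fini, NULL,    0 };
        xlator_t middle = { NULL,      &bottom, 0 };
        xlator_t top    = { demo_fini, &middle, 0 };
        graph_t  graph  = { &top };
        ctx_t    ctx    = { mode, &graph };

        return run_fini_chain (&ctx);
}
```

With these stand-ins, count_fini_calls (GF_GLUSTERD_PROCESS) returns 2,
while count_fini_calls (GF_SERVER_PROCESS) returns 0 because the
process-mode check skips the whole loop. That second case is the
glusterfsd behaviour Jeff ran into.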

>> _______________________________________________
>> Gluster-devel mailing list
>> Gluster-devel at gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>

