[Gluster-devel] What do extra_free and extrastd_free params do in the dictionary object?

Xavi Hernandez jahernan at redhat.com
Mon Jan 13 09:07:53 UTC 2020


On Mon, Jan 13, 2020 at 9:41 AM Yaniv Kaul <ykaul at redhat.com> wrote:

> To conclude and ensure I understood both:
> - I've sent a patch to remove extra_free parameter (and the associated
> gf_memdup() in 2 places) -
> https://review.gluster.org/#/c/glusterfs/+/23999/ (waiting for CI to
> finish working on it).
> - I'll send a patch to remove extra_stdfree - but this one is slightly
> bigger:
> 1. There are more places it's set (18).
> 2. Somehow, we need to release that memory. I kinda assume the sooner the
> better?
> So perhaps I can just replace:
> dict->extra_stdfree = cli_req.dict.dict_val;
> with:
> free(cli_req.dict.dict_val) ?
>

Yes. I think this is the best and easiest approach.


> 3. In some places, there was actually a call to GF_FREE, which is kind of
> confusing. For example, in __server_get_snap_info() :
>     if (snap_info_rsp.dict.dict_val) {
>         GF_FREE(snap_info_rsp.dict.dict_val);
>     }
>

This seems like a bug. Additionally, this memory should be released using
free() instead of GF_FREE().


>
> I think I should remove that and stick to freeing right after
> unserialization?
>

Yes. I agree.

Regards,

Xavi


>
> On Thu, Jan 9, 2020 at 12:42 PM Xavi Hernandez <jahernan at redhat.com>
> wrote:
>
>> On Thu, Jan 9, 2020 at 11:11 AM Yaniv Kaul <ykaul at redhat.com> wrote:
>>
>>>
>>>
>>> On Thu, Jan 9, 2020 at 11:35 AM Xavi Hernandez <jahernan at redhat.com>
>>> wrote:
>>>
>>>> On Thu, Jan 9, 2020 at 10:22 AM Amar Tumballi <amarts at gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Thu, Jan 9, 2020 at 2:33 PM Xavi Hernandez <jahernan at redhat.com>
>>>>> wrote:
>>>>>
>>>>>> On Thu, Jan 9, 2020 at 9:44 AM Amar Tumballi <amarts at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Jan 9, 2020 at 1:38 PM Xavi Hernandez <jahernan at redhat.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> On Sun, Dec 22, 2019 at 4:56 PM Yaniv Kaul <ykaul at redhat.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I could not find a relevant use for them. Can anyone enlighten me?
>>>>>>>>>
>>>>>>>>
>>>>>>>> I'm not sure why they are needed. They seem to be used to keep the
>>>>>>>> unserialized version of a dict around until the dict is destroyed. I
>>>>>>>> thought this could be because we were using pointers to the unserialized
>>>>>>>> data inside dict, but that's not the case currently. However, checking very
>>>>>>>> old versions (pre 3.2), I see that dict values were not allocated, but a
>>>>>>>> pointer to the unserialized data was used.
>>>>>>>>
>>>>>>>
>>>>>>> Xavi,
>>>>>>>
>>>>>>> While you are right about the intent, it is used still, at least
>>>>>>> when I grepped latest repo to keep a reference in protocol layer.
>>>>>>>
>>>>>>> This is done to reduce a copy after the dictionary's binary content
>>>>>>> is received from RPC. The 'extra_free' flag is used when we use a
>>>>>>> GF_*ALLOC()'d buffer in protocol to receive dictionary, and extra_stdfree
>>>>>>> is used when RPC itself allocates the buffer and hence uses 'free()' to
>>>>>>> free the buffer.
>>>>>>>
>>>>>>
>>>>>> I don't see it. When dict_unserialize() is called, key and value are
>>>>>> allocated and copied, so  why do we need to keep the raw data after that ?
>>>>>>
>>>>>> In 3.1 the value was simply a pointer to the unserialized data, but
>>>>>> starting with 3.2, value is memdup'ed. Key is always copied. I don't see
>>>>>> any other reference to the unserialized data right now. I think that
>>>>>> instead of assigning the raw data to extra_(std)free, we should simply
>>>>>> release that memory and remove those fields.
>>>>>>
>>>>>> Am I missing something else ?
>>>>>>
>>>>>
>>>>> I did grep on 'extra_stdfree' and 'extra_free' and saw that many
>>>>> handshake/ and protocol code seemed to use it. Haven't gone deeper to check
>>>>> which part.
>>>>>
>>>>> [amar at kadalu glusterfs]$ git grep extra_stdfree | wc -l
>>>>> 40
>>>>> [amar at kadalu glusterfs]$ git grep extra_free | wc -l
>>>>> 5
>>>>>
>>>>
>>>> Yes, they call dict_unserialize() and then store the unserialized data
>>>> into those variables. That's what I'm saying it's not necessary.
>>>>
>>>
>>> In at least 2 cases, there's even something stranger I could not
>>> understand (see in bold - from server_setvolume() function) :
>>> *    params = dict_new();*
>>>     reply = dict_new();
>>>     ret = xdr_to_generic(req->msg[0], &args,
>>> (xdrproc_t)xdr_gf_setvolume_req);
>>>     if (ret < 0) {
>>>         // failed to decode msg;
>>>         req->rpc_err = GARBAGE_ARGS;
>>>         goto fail;
>>>     }
>>>     ctx = THIS->ctx;
>>>
>>>     this = req->svc->xl;
>>>     /* this is to ensure config_params is populated with the first brick
>>>      * details at first place if brick multiplexing is enabled
>>>      */
>>>     config_params = dict_copy_with_ref(this->options, NULL);
>>>
>>> *    buf = gf_memdup(args.dict.dict_val, args.dict.dict_len);*
>>>
>>
>> This is probably unnecessary if we can really remove extra_free. Probably
>> it's here because args.dict.dict_val will be destroyed later.
>>
>>
>>>     if (buf == NULL) {
>>>         op_ret = -1;
>>>         op_errno = ENOMEM;
>>>         goto fail;
>>>     }
>>>
>>> *    ret = dict_unserialize(buf, args.dict.dict_len, &params);*
>>>     if (ret < 0) {
>>>         ret = dict_set_str(reply, "ERROR",
>>>                            "Internal error: failed to unserialize "
>>>                            "request dictionary");
>>>         if (ret < 0)
>>>             gf_msg_debug(this->name, 0,
>>>                          "failed to set error "
>>>                          "msg \"%s\"",
>>>                          "Internal error: failed "
>>>                          "to unserialize request dictionary");
>>>
>>>         op_ret = -1;
>>>         op_errno = EINVAL;
>>>         goto fail;
>>>     }
>>>
>>>
>>>
>>> *    params->extra_free = buf;    buf = NULL;*
>>>
>>
>> This is needed to prevent memory pointed by buf from being destroyed once
>> if has been assigned to extra_free. Also unnecessary if extra_free can be
>> removed.
>>
>> So we could eliminate one memory copy here if those fields are not really
>> necessary.
>>
>>
>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>> I think this is not needed anymore. Probably we could remove these
>>>>>>>> fields if that's the only reason.
>>>>>>>>
>>>>>>>
>>>>>>> If keeping them is hard to maintain, we can add few allocation to
>>>>>>> remove those elements, that shouldn't matter much IMO too. We are not using
>>>>>>> dictionary itself as protocol now (which we did in 1.x series though).
>>>>>>>
>>>>>>> Regards,
>>>>>>> Amar
>>>>>>> ---
>>>>>>> https://kadalu.io
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> TIA,
>>>>>>>>> Y.
>>>>>>>>> _______________________________________________
>>>>>>>>>
>>>>>>>>> Community Meeting Calendar:
>>>>>>>>>
>>>>>>>>> APAC Schedule -
>>>>>>>>> Every 2nd and 4th Tuesday at 11:30 AM IST
>>>>>>>>> Bridge: https://bluejeans.com/441850968
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> NA/EMEA Schedule -
>>>>>>>>> Every 1st and 3rd Tuesday at 01:00 PM EDT
>>>>>>>>> Bridge: https://bluejeans.com/441850968
>>>>>>>>>
>>>>>>>>> Gluster-devel mailing list
>>>>>>>>> Gluster-devel at gluster.org
>>>>>>>>> https://lists.gluster.org/mailman/listinfo/gluster-devel
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>
>>>>>>>> Community Meeting Calendar:
>>>>>>>>
>>>>>>>> APAC Schedule -
>>>>>>>> Every 2nd and 4th Tuesday at 11:30 AM IST
>>>>>>>> Bridge: https://bluejeans.com/441850968
>>>>>>>>
>>>>>>>>
>>>>>>>> NA/EMEA Schedule -
>>>>>>>> Every 1st and 3rd Tuesday at 01:00 PM EDT
>>>>>>>> Bridge: https://bluejeans.com/441850968
>>>>>>>>
>>>>>>>> Gluster-devel mailing list
>>>>>>>> Gluster-devel at gluster.org
>>>>>>>> https://lists.gluster.org/mailman/listinfo/gluster-devel
>>>>>>>>
>>>>>>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gluster.org/pipermail/gluster-devel/attachments/20200113/54337d59/attachment.html>


More information about the Gluster-devel mailing list