[Gluster-devel] What do extra_free and extrastd_free params do in the dictionary object?

Yaniv Kaul ykaul at redhat.com
Mon Jan 13 08:40:33 UTC 2020


To conclude and ensure I understood both:
- I've sent a patch to remove extra_free parameter (and the associated
gf_memdup() in 2 places) -
https://review.gluster.org/#/c/glusterfs/+/23999/ (waiting
for CI to finish working on it).
- I'll send a patch to remove extra_stdfree - but this one is slightly
bigger:
1. There are more places it's set (18).
2. Somehow, we need to release that memory. I kinda assume the sooner the
better?
So perhaps I can just replace:
dict->extra_stdfree = cli_req.dict.dict_val;
with:
free(cli_req.dict.dict_val) ?
3. In some places, there was actually a call to GF_FREE, which is kind of
confusing. For example, in __server_get_snap_info() :
    if (snap_info_rsp.dict.dict_val) {
        GF_FREE(snap_info_rsp.dict.dict_val);
    }

I think I should remove that and stick to freeing right after
unserialization?
Y.



On Thu, Jan 9, 2020 at 12:42 PM Xavi Hernandez <jahernan at redhat.com> wrote:

> On Thu, Jan 9, 2020 at 11:11 AM Yaniv Kaul <ykaul at redhat.com> wrote:
>
>>
>>
>> On Thu, Jan 9, 2020 at 11:35 AM Xavi Hernandez <jahernan at redhat.com>
>> wrote:
>>
>>> On Thu, Jan 9, 2020 at 10:22 AM Amar Tumballi <amarts at gmail.com> wrote:
>>>
>>>>
>>>>
>>>> On Thu, Jan 9, 2020 at 2:33 PM Xavi Hernandez <jahernan at redhat.com>
>>>> wrote:
>>>>
>>>>> On Thu, Jan 9, 2020 at 9:44 AM Amar Tumballi <amarts at gmail.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, Jan 9, 2020 at 1:38 PM Xavi Hernandez <jahernan at redhat.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On Sun, Dec 22, 2019 at 4:56 PM Yaniv Kaul <ykaul at redhat.com> wrote:
>>>>>>>
>>>>>>>> I could not find a relevant use for them. Can anyone enlighten me?
>>>>>>>>
>>>>>>>
>>>>>>> I'm not sure why they are needed. They seem to be used to keep the
>>>>>>> unserialized version of a dict around until the dict is destroyed. I
>>>>>>> thought this could be because we were using pointers to the unserialized
>>>>>>> data inside dict, but that's not the case currently. However, checking very
>>>>>>> old versions (pre 3.2), I see that dict values were not allocated, but a
>>>>>>> pointer to the unserialized data was used.
>>>>>>>
>>>>>>
>>>>>> Xavi,
>>>>>>
>>>>>> While you are right about the intent, it is used still, at least when
>>>>>> I grepped latest repo to keep a reference in protocol layer.
>>>>>>
>>>>>> This is done to reduce a copy after the dictionary's binary content
>>>>>> is received from RPC. The 'extra_free' flag is used when we use a
>>>>>> GF_*ALLOC()'d buffer in protocol to receive dictionary, and extra_stdfree
>>>>>> is used when RPC itself allocates the buffer and hence uses 'free()' to
>>>>>> free the buffer.
>>>>>>
>>>>>
>>>>> I don't see it. When dict_unserialize() is called, key and value are
>>>>> allocated and copied, so  why do we need to keep the raw data after that ?
>>>>>
>>>>> In 3.1 the value was simply a pointer to the unserialized data, but
>>>>> starting with 3.2, value is memdup'ed. Key is always copied. I don't see
>>>>> any other reference to the unserialized data right now. I think that
>>>>> instead of assigning the raw data to extra_(std)free, we should simply
>>>>> release that memory and remove those fields.
>>>>>
>>>>> Am I missing something else ?
>>>>>
>>>>
>>>> I did grep on 'extra_stdfree' and 'extra_free' and saw that many
>>>> handshake/ and protocol code seemed to use it. Haven't gone deeper to check
>>>> which part.
>>>>
>>>> [amar at kadalu glusterfs]$ git grep extra_stdfree | wc -l
>>>> 40
>>>> [amar at kadalu glusterfs]$ git grep extra_free | wc -l
>>>> 5
>>>>
>>>
>>> Yes, they call dict_unserialize() and then store the unserialized data
>>> into those variables. That's what I'm saying it's not necessary.
>>>
>>
>> In at least 2 cases, there's even something stranger I could not
>> understand (see in bold - from server_setvolume() function) :
>> *    params = dict_new();*
>>     reply = dict_new();
>>     ret = xdr_to_generic(req->msg[0], &args,
>> (xdrproc_t)xdr_gf_setvolume_req);
>>     if (ret < 0) {
>>         // failed to decode msg;
>>         req->rpc_err = GARBAGE_ARGS;
>>         goto fail;
>>     }
>>     ctx = THIS->ctx;
>>
>>     this = req->svc->xl;
>>     /* this is to ensure config_params is populated with the first brick
>>      * details at first place if brick multiplexing is enabled
>>      */
>>     config_params = dict_copy_with_ref(this->options, NULL);
>>
>> *    buf = gf_memdup(args.dict.dict_val, args.dict.dict_len);*
>>
>
> This is probably unnecessary if we can really remove extra_free. Probably
> it's here because args.dict.dict_val will be destroyed later.
>
>
>>     if (buf == NULL) {
>>         op_ret = -1;
>>         op_errno = ENOMEM;
>>         goto fail;
>>     }
>>
>> *    ret = dict_unserialize(buf, args.dict.dict_len, &params);*
>>     if (ret < 0) {
>>         ret = dict_set_str(reply, "ERROR",
>>                            "Internal error: failed to unserialize "
>>                            "request dictionary");
>>         if (ret < 0)
>>             gf_msg_debug(this->name, 0,
>>                          "failed to set error "
>>                          "msg \"%s\"",
>>                          "Internal error: failed "
>>                          "to unserialize request dictionary");
>>
>>         op_ret = -1;
>>         op_errno = EINVAL;
>>         goto fail;
>>     }
>>
>>
>>
>> *    params->extra_free = buf;    buf = NULL;*
>>
>
> This is needed to prevent memory pointed by buf from being destroyed once
> if has been assigned to extra_free. Also unnecessary if extra_free can be
> removed.
>
> So we could eliminate one memory copy here if those fields are not really
> necessary.
>
>
>>>
>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>> I think this is not needed anymore. Probably we could remove these
>>>>>>> fields if that's the only reason.
>>>>>>>
>>>>>>
>>>>>> If keeping them is hard to maintain, we can add few allocation to
>>>>>> remove those elements, that shouldn't matter much IMO too. We are not using
>>>>>> dictionary itself as protocol now (which we did in 1.x series though).
>>>>>>
>>>>>> Regards,
>>>>>> Amar
>>>>>> ---
>>>>>> https://kadalu.io
>>>>>>
>>>>>>
>>>>>>
>>>>>>> TIA,
>>>>>>>> Y.
>>>>>>>> _______________________________________________
>>>>>>>>
>>>>>>>> Community Meeting Calendar:
>>>>>>>>
>>>>>>>> APAC Schedule -
>>>>>>>> Every 2nd and 4th Tuesday at 11:30 AM IST
>>>>>>>> Bridge: https://bluejeans.com/441850968
>>>>>>>>
>>>>>>>>
>>>>>>>> NA/EMEA Schedule -
>>>>>>>> Every 1st and 3rd Tuesday at 01:00 PM EDT
>>>>>>>> Bridge: https://bluejeans.com/441850968
>>>>>>>>
>>>>>>>> Gluster-devel mailing list
>>>>>>>> Gluster-devel at gluster.org
>>>>>>>> https://lists.gluster.org/mailman/listinfo/gluster-devel
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>
>>>>>>> Community Meeting Calendar:
>>>>>>>
>>>>>>> APAC Schedule -
>>>>>>> Every 2nd and 4th Tuesday at 11:30 AM IST
>>>>>>> Bridge: https://bluejeans.com/441850968
>>>>>>>
>>>>>>>
>>>>>>> NA/EMEA Schedule -
>>>>>>> Every 1st and 3rd Tuesday at 01:00 PM EDT
>>>>>>> Bridge: https://bluejeans.com/441850968
>>>>>>>
>>>>>>> Gluster-devel mailing list
>>>>>>> Gluster-devel at gluster.org
>>>>>>> https://lists.gluster.org/mailman/listinfo/gluster-devel
>>>>>>>
>>>>>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gluster.org/pipermail/gluster-devel/attachments/20200113/acb3db5f/attachment-0001.html>


More information about the Gluster-devel mailing list