[Gluster-devel] Query regarding dictionary logic

Mohit Agrawal moagrawa at redhat.com
Thu May 2 06:45:09 UTC 2019


Hi Vijay,

I ran the smallfile tool on a 12x3 volume and did not find any significant
performance improvement for smallfile operations. I configured 4 clients
and 8 threads to run the operations.

I generated a statedump and found the data below for the dictionaries in
each gluster process:

brick
max-pairs-per-dict=50
total-pairs-used=192212171
total-dicts-used=24794349
average-pairs-per-dict=7


glusterd
max-pairs-per-dict=301
total-pairs-used=156677
total-dicts-used=30719
average-pairs-per-dict=5


fuse process
[dict]
max-pairs-per-dict=50
total-pairs-used=88669561
total-dicts-used=12360543
average-pairs-per-dict=7
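
As a quick sanity check on these counters, here is a minimal sketch that
recomputes the averages. It assumes average-pairs-per-dict is simply
total-pairs-used divided by total-dicts-used, truncated to an integer; the
function and variable names are illustrative and do not come from the
gluster source.

```python
# Counters copied from the statedump output above.
STATS = {
    "brick":    {"total_pairs": 192212171, "total_dicts": 24794349},
    "glusterd": {"total_pairs": 156677,    "total_dicts": 30719},
    "fuse":     {"total_pairs": 88669561,  "total_dicts": 12360543},
}


def average_pairs(total_pairs, total_dicts):
    """Return (truncated average, exact average) of pairs per dictionary."""
    return total_pairs // total_dicts, total_pairs / total_dicts


for name, s in STATS.items():
    truncated, exact = average_pairs(s["total_pairs"], s["total_dicts"])
    print(f"{name}: truncated={truncated} exact={exact:.2f}")
```

Under that assumption the truncated values match the reported 7, 5 and 7,
while the exact averages are about 7.75, 5.10 and 7.17 pairs per dictionary.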

The largest max-pairs-per-dict value is in glusterd (301), and that number
can grow further when the number of volumes is high.
I think there is no performance regression for brick and fuse. I used a
hash_size of 20 for the dictionary.
Let me know if you can suggest some other test to validate this.
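
To illustrate why the gain may be small at these dictionary sizes, here is a
rough sketch of a chained hash table with a configurable number of buckets.
It is a toy model, not the dict.c implementation: the BucketDict class, the
use of zlib.crc32 as the hash function, and the generated key names are all
assumptions made for illustration.

```python
import zlib


class BucketDict:
    """Toy chained hash table; hash_size is the number of buckets."""

    def __init__(self, hash_size):
        self.hash_size = hash_size
        self.buckets = [[] for _ in range(hash_size)]

    def _index(self, key):
        # crc32 is a stand-in hash, chosen only because it is deterministic.
        return zlib.crc32(key.encode()) % self.hash_size

    def set(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))

    def get_with_cost(self, key):
        """Return (value, number of key comparisons performed)."""
        comparisons = 0
        for k, v in self.buckets[self._index(key)]:
            comparisons += 1
            if k == key:
                return v, comparisons
        return None, comparisons


def average_lookup_cost(hash_size, num_pairs):
    """Average key comparisons to find each of num_pairs stored keys."""
    d = BucketDict(hash_size)
    keys = [f"key-{i}" for i in range(num_pairs)]
    for k in keys:
        d.set(k, k)
    return sum(d.get_with_cost(k)[1] for k in keys) / num_pairs


for pairs in (7, 50, 301):
    print(pairs, average_lookup_cost(1, pairs), average_lookup_cost(20, pairs))
```

With hash_size 1 every key lands in the same bucket, so a successful lookup
costs (n + 1) / 2 comparisons on average: 4 for a 7-pair dictionary and 151
for a 301-pair one. With more buckets the chains are shorter, but for the
typical 5 to 7 pairs seen in the statedump only a few comparisons are saved
per lookup, which is consistent with not seeing a measurable difference in
smallfile.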

Thanks,
Mohit Agrawal

On Tue, Apr 30, 2019 at 2:29 PM Mohit Agrawal <moagrawa at redhat.com> wrote:

> Thanks, Amar, for sharing the patch. I will test it and share the results.
>
> On Tue, Apr 30, 2019 at 2:23 PM Amar Tumballi Suryanarayan <
> atumball at redhat.com> wrote:
>
>> Shreyas/Kevin tried to address it some time back using
>> https://bugzilla.redhat.com/show_bug.cgi?id=1428049 (
>> https://review.gluster.org/16830)
>>
>> I vaguely remember that the hash size was kept at 1 from the time when
>> the dictionary itself was sent as the on-wire protocol, and in most other
>> places the number of entries in a dictionary was, on average, 3. So we
>> felt that saving a bit of memory was the better optimization at that time.
>>
>> -Amar
>>
>> On Tue, Apr 30, 2019 at 12:02 PM Mohit Agrawal <moagrawa at redhat.com>
>> wrote:
>>
>>> Sure, Vijay, I will try it and update.
>>>
>>> Regards,
>>> Mohit Agrawal
>>>
>>> On Tue, Apr 30, 2019 at 11:44 AM Vijay Bellur <vbellur at redhat.com>
>>> wrote:
>>>
>>>> Hi Mohit,
>>>>
>>>> On Mon, Apr 29, 2019 at 7:15 AM Mohit Agrawal <moagrawa at redhat.com>
>>>> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>>   I was just looking at the dict code, and I have one query about the
>>>>> current dictionary logic.
>>>>>   I am not able to understand why we use a hash_size of 1 for a
>>>>> dictionary. IMO, with a hash_size of 1 a dictionary always works like
>>>>> a list, not a hash, so every lookup in the dictionary has O(n)
>>>>> complexity.
>>>>>
>>>>>   Before optimizing the code, I just want to know the exact reason
>>>>> hash_size was defined as 1.
>>>>>
>>>>
>>>> This is a good question. I looked up the source in gluster's historic
>>>> repo [1] and hash_size is 1 even there. So, this could have been the case
>>>> since the first version of the dictionary code.
>>>>
>>>> Would you be able to run some tests with a larger hash_size and share
>>>> your observations?
>>>>
>>>> Thanks,
>>>> Vijay
>>>>
>>>> [1]
>>>> https://github.com/gluster/historic/blob/master/libglusterfs/src/dict.c
>>>>
>>>>
>>>>
>>>>>
>>>>>   Please share your view on the same.
>>>>>
>>>>> Thanks,
>>>>> Mohit Agrawal
>>>>> _______________________________________________
>>>>> Gluster-devel mailing list
>>>>> Gluster-devel at gluster.org
>>>>> https://lists.gluster.org/mailman/listinfo/gluster-devel
>>>>
>>
>>
>>
>> --
>> Amar Tumballi (amarts)
>>
>