[Gluster-users] Arbiter brick size estimation
Ravishankar N
ravishankar at redhat.com
Fri Mar 18 01:18:09 UTC 2016
Thanks Oleksandr! I'll update
http://gluster.readthedocs.org/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/
with a link to your gist.
On 03/18/2016 04:24 AM, Oleksandr Natalenko wrote:
> Ravi,
>
> here is the summary: [1]
>
> Regards,
> Oleksandr.
>
> [1] https://gist.github.com/e8265ca07f7b19f30bb3
>
> On Thursday, 17 March 2016, 09:58:14 EET Ravishankar N wrote:
>> On 03/16/2016 10:57 PM, Oleksandr Natalenko wrote:
>>> OK, I've repeated the test with the following hierarchy:
>>>
>>> * 10 top-level folders with 10 second-level folders each;
>>> * 10 000 files in each second-level folder.
>>>
>>> So, in total that is 10×10×10000 = 1M files across the 100 second-level folders.
>>>
>>> Initial brick used space: 33 M
>>> Initial inodes count: 24
>>>
>>> After test:
>>>
>>> * each brick in replica took 18G, and the arbiter brick took 836M;
>>> * inodes count: 1066036
>>>
>>> So:
>>>
>>> (836 MiB - 33 MiB) / (1066036 - 24) inodes == ~790 bytes per inode.
>>>
>>> So, yes, this is a slightly bigger value than in the previous test, due,
>>> I guess, to the large number of files in one folder, but it is still far
>>> from 4k. Given that a good engineer should plan a 30% reserve, the ratio
>>> is about 1k per stored inode.
>>>
>>> Correct me if I'm missing something (regarding average workload and not
>>> corner cases).
>> Looks okay to me, Oleksandr. You might want to make a GitHub gist of your
>> tests+results as a reference for others.
>> Regards,
>> Ravi
>>
>>> Test script is here: [1]
>>>
>>> Regards,
>>>
>>> Oleksandr.
>>>
>>> [1] http://termbin.com/qlvz
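Since termbin pastes expire, here is a minimal, scaled-down sketch of what such a test script might look like (all paths and names are assumed; the original run used 10 top-level dirs × 10 subdirs × 10000 files each):

```shell
#!/bin/bash
# Scaled-down sketch of the test layout; the original termbin script has
# expired, so this is a reconstruction, not the author's script.
set -e
root=$(mktemp -d)        # stand-in for the Gluster mount point
top=2; sub=2; files=5    # original run: top=10 sub=10 files=10000
for i in $(seq 1 "$top"); do
  for j in $(seq 1 "$sub"); do
    d="$root/dir$i/sub$j"
    mkdir -p "$d"
    for k in $(seq 1 "$files"); do
      # each file gets a random size between 1 and 32768 bytes
      head -c $((RANDOM % 32768 + 1)) /dev/urandom > "$d/file$k"
    done
  done
done
echo "created $(find "$root" -type f | wc -l) files"
```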
>>>
>>> On Tuesday, 8 March 2016, 19:13:05 EET Ravishankar N wrote:
>>>> On 03/05/2016 03:45 PM, Oleksandr Natalenko wrote:
>>>>> In order to estimate GlusterFS arbiter brick size, I've deployed a test
>>>>> setup with a replica 3 arbiter 1 volume within one node. Each brick is
>>>>> located on a separate HDD (XFS with inode size == 512). Using GlusterFS
>>>>> v3.7.6 + memleak patches. Volume options are kept at defaults.
>>>>>
>>>>> Here is the script that creates files and folders in mounted volume: [1]
>>>>>
>>>>> The script creates 1M files of random size (between 1 and 32768
>>>>> bytes) and some number of folders. After running it I ended up with
>>>>> 1036637 folders. So, in total it is 2036637 files and folders.
>>>>>
>>>>> The initial used space on each brick is 42M. After running the script
>>>>> I've got:
>>>>>
>>>>> replica brick 1 and 2: 19867168 kbytes == 19G
>>>>> arbiter brick: 1872308 kbytes == 1.8G
>>>>>
>>>>> The amount of inodes on each brick is 3139091. So here goes estimation.
>>>>>
>>>>> Dividing arbiter used space by files+folders we get:
>>>>>
>>>>> (1872308 - 42000) kbytes / 2036637 entries == ~899 bytes per file or folder
>>>>>
>>>>> Dividing arbiter used space by inodes we get:
>>>>>
>>>>> (1872308 - 42000) kbytes / 3139091 inodes == ~583 bytes per inode
>>>>>
>>>>> Not sure which calculation is correct.
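Both divisions can be reproduced with shell arithmetic. Note that the figures above treat 1 kbyte as 1000 bytes when converting the kbyte totals to per-entry byte counts:

```shell
# Arbiter usage in kbytes, before and after the run (figures quoted above).
used_k=1872308; initial_k=42000
entries=2036637   # files + folders created
inodes=3139091    # inodes consumed on the brick
# 1 kbyte treated as 1000 bytes, matching the arithmetic above
per_entry=$(( (used_k - initial_k) * 1000 / entries ))
per_inode=$(( (used_k - initial_k) * 1000 / inodes ))
echo "${per_entry} bytes per file/folder"   # ~899 (integer division: 898)
echo "${per_inode} bytes per inode"         # 583
```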
>>>> I think the first one is right, because you still haven't used up all the
>>>> inodes (2036637 used vs. the max. permissible 3139091). But again, this
>>>> is an approximation because not all files would be 899 bytes. For
>>>> example if there are a thousand files present in a directory, then du
>>>> <dirname> would be more than du <file> because the directory will take
>>>> some disk space to store the dentries.
>>>>
>>>>> I guess we should consider the one that accounts for inodes, because
>>>>> of the .glusterfs/ folder data.
>>>>>
>>>>> Nevertheless, in contrast, documentation [2] says it should be 4096
>>>>> bytes per file. Am I wrong in my calculations?
>>>> The 4KB is a conservative estimate considering the fact that though the
>>>> arbiter brick does not store data, it still keeps a copy of both user
>>>> and gluster xattrs. For example, if the application sets a lot of
>>>> xattrs, it can consume a data block if they cannot be accommodated on
>>>> the inode itself. Also there is the .glusterfs folder like you said
>>>> which would take up some space. Here is what I tried on an XFS brick:
>>>> [root@ravi4 brick]# touch file
>>>>
>>>> [root@ravi4 brick]# ls -l file
>>>> -rw-r--r-- 1 root root 0 Mar 8 12:54 file
>>>>
>>>> [root@ravi4 brick]# du file
>>>> 0       file
>>>>
>>>> [root@ravi4 brick]# for i in {1..100}
>>>> > do
>>>> >   setfattr -n user.value$i -v value$i file
>>>> > done
>>>>
>>>> [root@ravi4 brick]# ls -l file
>>>> -rw-r--r-- 1 root root 0 Mar 8 12:54 file
>>>>
>>>> [root@ravi4 brick]# du -h file
>>>> 4.0K    file
>>>>
>>>> Hope this helps,
>>>> Ravi
>>>>
>>>>> Pranith?
>>>>>
>>>>> [1] http://termbin.com/ka9x
>>>>> [2]
>>>>> http://gluster.readthedocs.org/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/
>>>>> _______________________________________________
>>>>> Gluster-devel mailing list
>>>>> Gluster-devel at gluster.org
>>>>> http://www.gluster.org/mailman/listinfo/gluster-devel
>