[Gluster-devel] Regarding the issues gluster DHT and Layouts of bricks

Thu May 21 10:56:05 UTC 2015

Hi All,

Could you please help us solve the following DHT and brick layout problem? Questions are marked bold.

Problem statement :

1.      We have a requirement to achieve maximum write and read performance, and we have to meet some committed performance metrics.

        Our goal is to place each file on a different brick to get optimal performance and also to observe the nature of the throughput. We therefore need a mechanism to generate different hashes with gluster's glusterfs.gf_dm_hashfn (assuming number of files: N, number of bricks: N) so that the files land on separate bricks.

-        How can we make sure each file gets a different hash and falls on a different brick?

-        To put the question another way: if I know the range of a brick's layout, or more precisely the hex value of the desired hash (so that the file is placed on the desired brick) that the Davies-Meyer algorithm used in gluster has to produce, can we create a file name that yields it? That would also solve our problem to some extent.

2.      We experimented to see how gluster decides to place a file on a particular brick using glusterfs.gf_dm_hashfn, and took some ideas from articles such as http://gluster.readthedocs.org/en/latest/Features/dht/ and https://joejulian.name/blog/dht-misses-are-expensive/, which describe the layout of a brick and how to calculate the hash for a file.

        The aim is to minimize collisions, i.e. to generate hashes such that each file is placed on a different brick (file 1 => brick A, file 2 => brick B, file 3 => brick C, file 4 => brick D).

        We use a script similar to the following to get the hash value for a file:

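import ctypes
import sys

# Assumed library name; adjust to the libglusterfs soname on your system.
glusterfs = ctypes.cdll.LoadLibrary("libglusterfs.so.0")

def gf_dm_hashfn(filename):
    # Pass the name and its length; wrap the result as an unsigned 32-bit value.
    return ctypes.c_uint32(glusterfs.gf_dm_hashfn(
        filename,
        len(filename)))

if __name__ == "__main__":
    print hex(gf_dm_hashfn(sys.argv[1]).value)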

We can then calculate the hash for a filename:
# python gf_dm_hash.py file1
0x99d1b6fL

We then fetch the extended attribute to check the range and try to match it against the hash value generated above.

getfattr -n trusted.glusterfs.dht -e hex file1

However, this is the point where we can no longer follow: how is the hash value matched to one of the layout assignments to yield what is called the hashed location?
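
For reference, the matching step can be sketched as a plain range lookup: each brick owns a slice of the 32-bit hash space, and the hashed location is the brick whose slice contains the file's hash. The sketch below is illustrative only. The four-brick layout and the brick names are hypothetical, and parse_dht_xattr assumes that the trusted.glusterfs.dht value is 16 bytes whose last two big-endian 32-bit words are the start and end of the range; that assumption should be checked against the gluster version in use.

```python
import struct

def parse_dht_xattr(hex_value):
    """Return (start, stop) from a hex-encoded trusted.glusterfs.dht value.

    Assumption: the value is 16 bytes and its last two big-endian
    32-bit words hold the start and end of the brick's hash range.
    """
    if hex_value.lower().startswith("0x"):
        hex_value = hex_value[2:]
    raw = bytes.fromhex(hex_value)
    if len(raw) != 16:
        raise ValueError("expected 16 bytes, got %d" % len(raw))
    _word0, _word1, start, stop = struct.unpack(">IIII", raw)
    return start, stop

def hashed_brick(hash_value, layout):
    """Return the brick whose inclusive [start, stop] range holds hash_value."""
    for brick, (start, stop) in layout.items():
        if start <= hash_value <= stop:
            return brick
    return None  # no range covers this hash (a hole in the layout)

# Hypothetical layout: the 32-bit space split evenly across four bricks.
layout = {
    "brick-A": (0x00000000, 0x3fffffff),
    "brick-B": (0x40000000, 0x7fffffff),
    "brick-C": (0x80000000, 0xbfffffff),
    "brick-D": (0xc0000000, 0xffffffff),
}

print(hashed_brick(0x099d1b6f, layout))  # brick-A
print(hashed_brick(0xc0070000, layout))  # brick-D
```

With this hypothetical layout, the hash 0x99d1b6f computed for file1 above would fall in the first range, and 0xc0070000 in the last one.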

-        My question is: if I know the range of a brick's layout (say 0xc0000000 to 0xffffffff) and select a hash within that range (say 0xc0070000) for where the next file should be placed, can we generate a name that hashes to it (a kind of reverse of glusterfs.gf_dm_hashfn)?
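
One way to approach this without inverting the hash is a forward search: generate candidate names, hash each one, and keep the first whose hash lands in the target range. Hitting one exact value such as 0xc0070000 would take on the order of 2^32 attempts, but hitting a range that covers a quarter of the hash space takes only a few. The sketch below shows the idea under stated assumptions. stand_in_hash is a CRC32 placeholder and is NOT gluster's hash; it is there only so the example runs without libglusterfs, and a wrapper around glusterfs.gf_dm_hashfn would be passed as hash_fn in practice. The function names, the naming pattern and the layout are hypothetical.

```python
import zlib

def stand_in_hash(name):
    """Placeholder 32-bit hash (CRC32). NOT gluster's gf_dm_hashfn."""
    return zlib.crc32(name.encode("utf-8")) & 0xffffffff

def find_name_in_range(start, stop, hash_fn, prefix="file", limit=1000000):
    """Try prefix0, prefix1, ... and return the first name whose hash
    falls in the inclusive range [start, stop], or None if none is found."""
    for i in range(limit):
        name = "%s%d" % (prefix, i)
        if start <= hash_fn(name) <= stop:
            return name
    return None

def names_per_brick(layout, hash_fn):
    """Return one candidate file name for each brick range in layout."""
    return {brick: find_name_in_range(start, stop, hash_fn)
            for brick, (start, stop) in layout.items()}

# Hypothetical layout: four bricks, each owning a quarter of the hash space.
layout = {
    "brick-A": (0x00000000, 0x3fffffff),
    "brick-B": (0x40000000, 0x7fffffff),
    "brick-C": (0x80000000, 0xbfffffff),
    "brick-D": (0xc0000000, 0xffffffff),
}

for brick, name in sorted(names_per_brick(layout, stand_in_hash).items()):
    print(brick, name, hex(stand_in_hash(name)))
```

Names found this way are only as good as the hash function and layout supplied, so the result is worth verifying on the volume itself, since the layout is stored per directory and can change after a rebalance.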

PS: Susant, can you throw some light on this or suggest a method for what we are trying to solve?

Thanks for your time.

Best Regards,
Subrata Ghosh
