[Gluster-devel] GFID2 - Proposal to add extra byte to existing GFID

Mon May 15 19:08:17 UTC 2017

On 05/15/2017 12:48 PM, Xavier Hernandez wrote:
>> [ snip ]
>> Also, I have a question: what are the chances of UUID collision if we
>> take just 3 bits from the first byte?
>>
>> 000 - Unspecified (can be anything).
>> 001 - Directory
>> 010 - Regular File
>> 011 - Special files (symlink, Block and Char devices, socket files etc).
>> {100 - 111} - Reserved.
> 
> This cannot be done. Since we are currently using random UUIDs, on
> average, one of every eight randomly generated ids will start with each
> one of the combinations.
> 
> Already existing GFIDs will be a problem when updating. The only thing
> that can avoid the problem is to create new GFIDs in a format that won't
> collide with existing ones, and this can only be done safely if we use
> the special fields of the UUID itself.
> 
>>
>> As a side effect, it reduces the number of directories created as
>> metadata inside the .glusterfs directory (to 50% of the current
>> load).
> 
> Maybe we can find a better way to store the GFIDs using the standard
> fields instead of relying on the first bits, which is not a valid solution.
> 
> We can think more about this.
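
Xavier's one-in-eight point is easy to check. A rough Python sketch (the
helper names are mine, not anything in the Gluster tree) that generates
random Version 4 UUIDs and counts the top 3 bits of the first byte:

```python
import uuid
from collections import Counter

SAMPLES = 80000  # arbitrary sample size, chosen only for illustration


def top3_bits(u: uuid.UUID) -> int:
    """Return the three most significant bits of the first byte."""
    return u.bytes[0] >> 5


def prefix_histogram(n: int) -> Counter:
    """Count how often each 3-bit prefix occurs in n random (v4) UUIDs."""
    return Counter(top3_bits(uuid.uuid4()) for _ in range(n))


hist = prefix_histogram(SAMPLES)
for prefix in range(8):
    # Each line shows a prefix and the fraction of UUIDs that start with it.
    print(f"{prefix:03b}: {hist[prefix] / SAMPLES:.3f}")
```

Every prefix should come out near 1/8, so roughly an eighth of the GFIDs
already on disk would look like "Directory", another eighth like
"Regular File", and so on.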

How about using a variation of Version 5 UUIDs? Or define our own Version 6?

Strictly speaking, Version 5 hashes a NamespaceUUID + Name. That won't
work as we'd have too many collisions in the Name part. Instead we could
hash NamespaceUUID + Time + Name; or we could just use Time, like a
Version 1 UUID; or random bits, like a Version 4 UUID.
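
To make the first option concrete, here is a minimal sketch in Python. It
assumes the namespace is some per-volume UUID and that the timestamp is
simply prepended to the name before hashing; both choices, and the
function name, are hypothetical. uuid.uuid5() is the standard library's
SHA-1 based Version 5 generator.

```python
import time
import uuid
from typing import Optional


def gfid_v5_like(namespace: uuid.UUID, name: str,
                 now_ns: Optional[int] = None) -> uuid.UUID:
    """Hash NamespaceUUID + Time + Name into a Version 5 UUID.

    Mixing the timestamp into the hashed name keeps two files with the
    same name from getting the same id. The "time:name" encoding is an
    assumption made for this sketch.
    """
    if now_ns is None:
        now_ns = time.time_ns()
    return uuid.uuid5(namespace, f"{now_ns}:{name}")


# Hypothetical namespace; a real one might be the volume id.
volume_ns = uuid.uuid5(uuid.NAMESPACE_DNS, "volume.example.org")
print(gfid_v5_like(volume_ns, "file.txt"))
```

The result is only as unique as the (time, name) pair, so two creates of
the same name within one clock tick would still collide.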

We would then store the type bits described above in the clock-seq-low
part of the GFID.

E.g.:
74738ff5-5367-5958-91ee-98fffdcd1876
              ^ 5 indicates Version 5
                   ^ variant: first two bits set to 1 and 0, as required for Version 5
                    ^ 0001 for directory
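
A small Python sketch of how those fields could be read back and set.
Note that in the RFC 4122 layout octet 8 is clock-seq-hi-and-reserved and
octet 9 is clock-seq-low; the last caret above points at the low nibble
of octet 8, so the sketch follows the example. The helper names are
hypothetical.

```python
import uuid

TYPE_DIRECTORY = 0b0001  # value taken from the example above


def describe(gfid: str) -> dict:
    """Pull the version, variant bits and type nibble out of a GFID string."""
    u = uuid.UUID(gfid)
    octet8 = u.bytes[8]
    return {
        "version": u.version,
        "variant_bits": octet8 >> 6,   # 0b10 for RFC 4122 UUIDs
        "type_nibble": octet8 & 0x0F,  # low nibble of octet 8
    }


def with_type(u: uuid.UUID, type_nibble: int) -> uuid.UUID:
    """Return a copy of u with the low nibble of octet 8 replaced."""
    raw = bytearray(u.bytes)
    raw[8] = (raw[8] & 0xF0) | (type_nibble & 0x0F)
    return uuid.UUID(bytes=bytes(raw))


print(describe("74738ff5-5367-5958-91ee-98fffdcd1876"))
print(with_type(uuid.uuid4(), TYPE_DIRECTORY))
```

Because only the low nibble of octet 8 is rewritten, the version and
variant fields stay intact and the result is still a well-formed UUID.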
-- 

Kaleb