[Gluster-devel] GFID2 - Proposal to add extra byte to existing GFID

Mon Dec 19 06:57:17 UTC 2016

regards
Aravinda

On 12/16/2016 05:47 PM, Xavier Hernandez wrote:
> On 12/16/2016 08:31 AM, Aravinda wrote:
>> Proposal to add one more byte to the GFID to store "Type" information.
>> The extra byte will represent the type (directory: 00, file: 01,
>> symlink: 02, etc.)
>>
>> For example, if a directory GFID is f4f18c02-0360-4cdc-8c00-0164e49a7afd
>> then, GFID2 will be 00f4f18c02-0360-4cdc-8c00-0164e49a7afd.
>>
>> Changes to Backend store
>> ------------------------
>> Existing: .glusterfs/gfid[0:2]/gfid[2:4]/gfid
>> Proposed: .glusterfs/gfid2[0:2]/gfid2[2:4]/gfid2[4:6]/gfid2
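As a rough illustration of the two layouts above, the sketch below builds the backend path for a GFID and for a GFID2. The helper names and the `TYPE_CODES` table are hypothetical; only the path patterns, the 00/01/02 type values and the example GFID come from the proposal.

```python
# Hypothetical helpers; only the path patterns and type codes come from the proposal.
TYPE_CODES = {"directory": "00", "file": "01", "symlink": "02"}

def gfid_path(gfid):
    """Existing layout: .glusterfs/gfid[0:2]/gfid[2:4]/gfid"""
    return "/".join([".glusterfs", gfid[0:2], gfid[2:4], gfid])

def make_gfid2(gfid, file_type):
    """Prefix the canonical GFID string with the two hex digits of the type byte."""
    return TYPE_CODES[file_type] + gfid

def gfid2_path(gfid2):
    """Proposed layout: .glusterfs/gfid2[0:2]/gfid2[2:4]/gfid2[4:6]/gfid2"""
    return "/".join([".glusterfs", gfid2[0:2], gfid2[2:4], gfid2[4:6], gfid2])

gfid = "f4f18c02-0360-4cdc-8c00-0164e49a7afd"
gfid2 = make_gfid2(gfid, "directory")
print(gfid_path(gfid))    # .glusterfs/f4/f1/f4f18c02-0360-4cdc-8c00-0164e49a7afd
print(gfid2_path(gfid2))  # .glusterfs/00/f4/f1/00f4f18c02-0360-4cdc-8c00-0164e49a7afd
```

With this layout every directory GFID2 ends up under `.glusterfs/00/`, which is the grouping the advantages list relies on.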
>>
>> Advantages:
>> -----------
>> - Automatic grouping in .glusterfs directory based on file Type.
>> - Easy identification of Type by looking at GFID in logs/status output
>>   etc.
>> - Crawling (Quota/AFR): The list of directories can be easily fetched
>>   by crawling the `.glusterfs/gfid2[0:2]/` directory. This enables
>>   easy parallel crawling.
>> - Quota - Marker: The marker translator can mark the xtime of the
>>   current file and its parent directory. No need to update the xtime
>>   xattr of all directories up to the root.
>> - Geo-replication: The crawl can be multithreaded during initial sync.
>>   With the marker changes above, crawling will be more effective.
>>
>> Please add any further advantages.
>>
>> Disadvantages:
>> --------------
>> Functionality is not changed with the above change except the length
>> of the ID. I can't think of any disadvantages except the code changes
>> to accommodate this change. Let me know if I missed anything here.
>
> One disadvantage is that 17 bytes is a very ugly number for
> structures. Compilers will add padding that will make any structure
> containing a GFID noticeably bigger. This will also cause trouble in
> all binary formats where a GFID is used, making them incompatible. One
> clear case of this is the XDR encoding of the gluster protocol.
> Currently a GFID is defined this way in many places:
>
>         opaque gfid[16]
>
> This seems to make it quite complex to allow a mix of gluster versions
> in the same cluster (for example in the middle of an upgrade).
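To make the padding concern concrete, here is a small sketch using Python's `ctypes` to mimic a C structure that holds a GFID followed by a 64-bit field. The `Entry16` and `Entry17` records are hypothetical and not taken from the gluster sources; the sizes printed for them depend on the platform ABI. The XDR helper reflects the rule that fixed-length opaque data is padded to a multiple of 4 bytes on the wire.

```python
import ctypes

class Entry16(ctypes.Structure):
    # Hypothetical record: a 16-byte GFID followed by a 64-bit field.
    _fields_ = [("gfid", ctypes.c_ubyte * 16), ("size", ctypes.c_uint64)]

class Entry17(ctypes.Structure):
    # Same record with a 17-byte GFID; the compiler-style layout must
    # insert padding before the aligned 64-bit field.
    _fields_ = [("gfid", ctypes.c_ubyte * 17), ("size", ctypes.c_uint64)]

def xdr_opaque_wire_size(n):
    """XDR pads fixed-length opaque data to a multiple of 4 bytes."""
    return (n + 3) // 4 * 4

# Structure sizes are platform dependent, so no expected values are given.
print(ctypes.sizeof(Entry16), ctypes.sizeof(Entry17))
# opaque gfid[16] takes 16 bytes on the wire, opaque gfid[17] takes 20.
print(xdr_opaque_wire_size(16), xdr_opaque_wire_size(17))
```

On common 64-bit platforms the 17-byte variant typically grows by a full alignment unit rather than by one byte, and an old peer decoding `opaque gfid[16]` would misread everything after the GFID.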
>
> What about this alternative approach:
>
> Based on RFC 4122 [1], which describes the format of a UUID, we can
> define a new structure for new GFIDs using the same length.
>
> Currently all GFIDs are generated using the "random" method. This
> means that all GFIDs have this structure:
>
>         xxxxxxxx-xxxx-Mxxx-Nxxx-xxxxxxxxxxxx
>
> Where N can be 8, 9, A or B, and M is 4.
>
> There are some special GFIDs that have M=0 and N=0, for example the
> root GFID.
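The M and N digits can be checked with Python's standard `uuid` module, as in the sketch below. The `m_and_n` helper is hypothetical, and the root GFID value used here (`00000000-0000-0000-0000-000000000001`) is an assumption that does not appear in this thread.

```python
import uuid

def m_and_n(gfid):
    """Return the M and N hex digits of a canonical GFID string.

    Layout: xxxxxxxx-xxxx-Mxxx-Nxxx-xxxxxxxxxxxx
    M is at string index 14, N at string index 19.
    """
    return gfid[14], gfid[19]

random_gfid = uuid.uuid4()            # the "random" method (version 4)
m, n = m_and_n(str(random_gfid))
print(m, n)                           # M is always 4, N is one of 8, 9, a, b

# Assumed value of the root GFID; it has M=0 and N=0.
root_gfid = uuid.UUID("00000000-0000-0000-0000-000000000001")
print(m_and_n(str(root_gfid)))        # ('0', '0')
print(root_gfid.variant == uuid.RFC_4122)  # False: not an RFC 4122 variant
```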
>
> What I propose is to use a new variant of GFID, for example E or F
> (officially marked as reserved for future definition) or even 0 to 7.
> We could use M as an internal version for the GFID structure (defined
> by ourselves when needed). Then we could use the first 4 or 8 bits of
> each GFID as you propose, without needing to extend the current GFID
> length or risking collisions with existing GFIDs.
>
> If we are concerned about the collision probability (quite small but
> still bigger than with the current version) because we lose some random
> bits, we could use N = 0..7 and leave M random. This way we get 5 more
> random bits, from which we could use 4 to represent the inode type.
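A minimal sketch of that last variant follows, assuming the type is stored in the first 4 bits of the GFID and reusing the type values from the original proposal. The function names and the exact bit placement are assumptions; the thread does not fix them.

```python
import os
import uuid

# Hypothetical type codes, reusing the values from the original proposal.
TYPE_DIR, TYPE_FILE, TYPE_SYMLINK = 0x0, 0x1, 0x2

def new_typed_gfid(inode_type):
    """Generate a 16-byte GFID with N in 0..7 and the type in the first 4 bits."""
    if not 0 <= inode_type <= 0xF:
        raise ValueError("inode type must fit in 4 bits")
    raw = bytearray(os.urandom(16))
    raw[8] &= 0x7F                                  # clear top bit of N: N in 0..7
    raw[0] = (inode_type << 4) | (raw[0] & 0x0F)    # first nibble carries the type
    return uuid.UUID(bytes=bytes(raw))

def gfid_type(gfid):
    """Read the type back from the first 4 bits."""
    return gfid.bytes[0] >> 4

def is_typed_gfid(gfid):
    """Old random GFIDs have N in 8..B, typed ones have N in 0..7.

    Special GFIDs with M=0 and N=0 also pass this check and would need
    separate handling.
    """
    return (gfid.bytes[8] & 0x80) == 0

g = new_typed_gfid(TYPE_SYMLINK)
print(g, gfid_type(g), is_typed_gfid(g))
```

In this sketch 1 bit is fixed by the variant and 4 bits hold the type, leaving 123 random bits, one more than the 122 of a version 4 UUID, while the GFID stays 16 bytes long.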
>
> I think this way everything will work smoothly with older versions 
> with minimal effort.
>
> What do you think?
That is a really nice suggestion.

To get the crawling advantages mentioned above, we need to change the
backend store layout to .glusterfs/N/gfid[0:2]/gfid[2:4]/gfid
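A small sketch of how such a layout could be crawled per type. It assumes that N in the path stands for the type value written as a single hex digit, which is one reading of the line above; the function names are hypothetical.

```python
import os

def typed_backend_path(gfid, type_digit):
    """Layout from this reply: .glusterfs/N/gfid[0:2]/gfid[2:4]/gfid"""
    return os.path.join(".glusterfs", type_digit, gfid[0:2], gfid[2:4], gfid)

def list_gfids_of_type(brick_root, type_digit):
    """Collect every GFID entry stored under one type directory.

    Walks exactly two levels with os.listdir so that entries are found
    whether they are regular files, hard links or symlinks.
    """
    top = os.path.join(brick_root, ".glusterfs", type_digit)
    found = []
    for first in sorted(os.listdir(top)):
        for second in sorted(os.listdir(os.path.join(top, first))):
            found.extend(sorted(os.listdir(os.path.join(top, first, second))))
    return found
```

Each first-level subdirectory under `.glusterfs/N/` can be handed to a separate worker, which is what makes the parallel crawl straightforward.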
>
> Xavi
>
> [1] https://www.ietf.org/rfc/rfc4122.txt
>
>>
>> Changes:
>> ---------
>> - Code changes to accommodate 17-byte GFIDs instead of 16-byte ones
>>   (read and write)
>> - Migration Tool to upgrade GFIDs in Volume/Cluster
>>
>> Let me know your thoughts.
>>
>