[Gluster-devel] GFID2 - Proposal to add extra byte to existing GFID

Fri Dec 16 12:17:20 UTC 2016

On 12/16/2016 08:31 AM, Aravinda wrote:
> Proposal to add one more byte to GFID to store "Type" information.
> The extra byte will represent the type (directory: 00, file: 01,
> symlink: 02, etc.)
>
> For example, if a directory GFID is f4f18c02-0360-4cdc-8c00-0164e49a7afd
> then, GFID2 will be 00f4f18c02-0360-4cdc-8c00-0164e49a7afd.
>
> Changes to Backend store
> ------------------------
> Existing: .glusterfs/gfid[0:2]/gfid[2:4]/gfid
> Proposed: .glusterfs/gfid2[0:2]/gfid2[2:4]/gfid2[4:6]/gfid2
>
> Advantages:
> -----------
> - Automatic grouping in .glusterfs directory based on file Type.
> - Easy identification of Type by looking at GFID in logs/status output
>   etc.
> - Crawling(Quota/AFR): List of directories can be easily fetched by
>   crawling `.glusterfs/gfid2[0:2]/` directory. This enables easy
>   parallel Crawling.
> - Quota - Marker: Marker translator can mark xtime of current file and
>   parent directory. No need to update xtime xattr of all directories
>   till root.
> - Geo-replication: - Crawl can be multithreaded during initial sync.
>   With marker changes above it will be more effective in crawling.
>
> Please add any further advantages.
>
> Disadvantages:
> --------------
> Functionality is not changed with the above change except the length
> of the ID. I can't think of any disadvantages except the code changes
> to accommodate this change. Let me know if I missed anything here.

One disadvantage is that 17 bytes is a very ugly size for structures.
Compilers will add padding that makes any structure containing a GFID
noticeably bigger. It will also cause trouble for every binary format
where a GFID is used, making them incompatible. One clear case of this
is the XDR encoding of the gluster protocol. Currently a GFID is
defined this way in many places:

         opaque gfid[16]

A different length seems to make it quite complex to allow a mix of
gluster versions in the same cluster (for example in the middle of an
upgrade).
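
To illustrate both points, here is a minimal sketch in Python. It
assumes that ctypes follows the platform C ABI for structure layout and
that a fixed-length XDR opaque is padded to a multiple of 4 bytes. The
structure names are hypothetical and are not taken from the gluster
sources.

```python
import ctypes


class Gfid16Entry(ctypes.Structure):
    # Hypothetical structure: a 16-byte GFID followed by a 64-bit field.
    _fields_ = [("gfid", ctypes.c_ubyte * 16), ("value", ctypes.c_uint64)]


class Gfid17Entry(ctypes.Structure):
    # Same hypothetical structure with the proposed 17-byte GFID.
    _fields_ = [("gfid", ctypes.c_ubyte * 17), ("value", ctypes.c_uint64)]


def xdr_opaque_size(length):
    """Bytes used on the wire by a fixed-length XDR opaque of `length` bytes."""
    return (length + 3) // 4 * 4


if __name__ == "__main__":
    # The one extra byte costs more than one byte once alignment is applied.
    print("struct sizes:", ctypes.sizeof(Gfid16Entry), ctypes.sizeof(Gfid17Entry))
    print("XDR sizes:   ", xdr_opaque_size(16), xdr_opaque_size(17))
```

The exact structure sizes depend on the platform, but the 17-byte
version is always the larger one, and the XDR encoding grows from 16 to
20 bytes.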

What about this alternative approach:

Based on RFC 4122 [1], which describes the format of a UUID, we can
define a new structure for new GFIDs using the same length.

Currently all GFIDs are generated using the "random" method. This means
that all GFIDs have this structure:

         xxxxxxxx-xxxx-Mxxx-Nxxx-xxxxxxxxxxxx

Where N can be 8, 9, A or B, and M is 4.

There are some special GFIDs that have M=0 and N=0, for example the
root GFID.
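
As a quick check of this layout, the following sketch reads the M and N
digits with Python's standard uuid module. The helper name is
hypothetical, and the root GFID value used in the example
(00000000-0000-0000-0000-000000000001) is an assumption that is not
stated elsewhere in this mail.

```python
import uuid


def version_and_variant_digits(gfid):
    """Return the (M, N) hex digits of xxxxxxxx-xxxx-Mxxx-Nxxx-xxxxxxxxxxxx."""
    text = str(uuid.UUID(gfid))  # canonical lowercase form, 36 characters
    return text[14], text[19]


if __name__ == "__main__":
    # A freshly generated random UUID: M is 4 and N is one of 8, 9, a, b.
    print(version_and_variant_digits(str(uuid.uuid4())))
    # The directory GFID quoted in the original proposal.
    print(version_and_variant_digits("f4f18c02-0360-4cdc-8c00-0164e49a7afd"))
    # Assumed root GFID: both digits are 0.
    print(version_and_variant_digits("00000000-0000-0000-0000-000000000001"))
```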

What I propose is to use a new variant for GFIDs, for example N = E or
F (officially marked as reserved for future definition) or even 0 to 7.
We could use M as an internal version for the GFID structure (defined by
ourselves when needed). Then we could use the first 4 or 8 bits of each
GFID as you propose, without extending the current GFID length or
risking collisions with existing GFIDs.
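
A rough sketch of what such a GFID could look like, in Python. This is
only one possible reading of the idea: it assumes N is fixed to E, M
holds an internal layout version, and the first 4 bits hold the type
codes from the original proposal. All function and constant names are
hypothetical.

```python
import os
import uuid

# Type codes taken from the original proposal.
TYPE_DIRECTORY, TYPE_FILE, TYPE_SYMLINK = 0, 1, 2


def make_typed_gfid(inode_type, layout_version=1):
    """Build a 16-byte GFID with the type in the first 4 bits (assumed layout)."""
    if not 0 <= inode_type <= 15 or not 0 <= layout_version <= 15:
        raise ValueError("inode_type and layout_version must fit in 4 bits")
    raw = bytearray(os.urandom(16))
    raw[0] = (inode_type << 4) | (raw[0] & 0x0F)      # first 4 bits: inode type
    raw[6] = (layout_version << 4) | (raw[6] & 0x0F)  # M: internal version
    raw[8] = 0xE0 | (raw[8] & 0x0F)                   # N: reserved variant E
    return uuid.UUID(bytes=bytes(raw))


def is_typed_gfid(gfid):
    """True if the GFID uses the assumed new variant (N = E)."""
    return gfid.bytes[8] >> 4 == 0xE


def inode_type_of(gfid):
    """Return the type stored in the first 4 bits of a typed GFID."""
    return gfid.bytes[0] >> 4


if __name__ == "__main__":
    gfid = make_typed_gfid(TYPE_SYMLINK)
    print(gfid, is_typed_gfid(gfid), inode_type_of(gfid))
    # Existing random GFIDs have N in 8..B, so they are never mistaken
    # for the new format.
    print(is_typed_gfid(uuid.uuid4()))
```

Since the result is still 16 bytes, it fits in the existing
opaque gfid[16] fields unchanged.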

If we are concerned about the collision probability (quite small but
still bigger than with the current version) because we lose some random
bits, we could use N = 0..7 and leave M random. This way we get 5 more
random bits, from which we could use 4 to represent the inode type.

I think this way everything will work smoothly with older versions with 
minimal effort.

What do you think?

Xavi

[1] https://www.ietf.org/rfc/rfc4122.txt

>
> Changes:
> ---------
> - Code changes to accommodate a 17-byte GFID instead of 16 bytes (Read
>   and Write)
> - Migration Tool to upgrade GFIDs in Volume/Cluster
>
> Let me know your thoughts.
>