[Gluster-devel] Change in itransform/deitransform logic - dht/afr

Soumya Koduri skoduri at redhat.com
Mon Jun 30 06:02:20 UTC 2014


Hi,

With Change-Id: Ieba9a7071829d51860b7c131982f12e0136b9855, the dht
itransform/deitransform logic was improved to encode the 64-bit brick
offset along with the brick-id in the global d_off.

More details regarding this change are at:
http://review.gluster.org/#/c/4711/

But now that afr uses the same itransform/deitransform logic, the
brick-id stored in the afr_global_d_off gets zeroed out when it is
re-encoded in dht. This happens only when the offsets are huge (e.g. with
the ext4 filesystem): in such cases the low n bits are replaced with the
brick-id, which in turn gets replaced with the afr_subvol_id when
re-encoded in dht, where
       n = log2(N)
       N = no. of DHT/AFR subvolumes.
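
To illustrate the problem, here is a minimal Python sketch of the low-bit
scheme. It is a simplified model and not the GlusterFS code: it covers only
the huge-offset case, leaves out TOP_BIT handling, and the names bits_for,
encode_low and decode_low as well as the sample offset and ids are made up
for this example.

```python
def bits_for(n_subvols):
    """Bits needed for ids 0..n_subvols-1 (log2(N) when N is a power of two)."""
    return max(1, (n_subvols - 1).bit_length())


def encode_low(offset, subvol_id, n_subvols):
    """Replace the low n bits of the offset with the subvolume id."""
    id_mask = (1 << bits_for(n_subvols)) - 1
    return (offset & ~id_mask) | subvol_id


def decode_low(d_off, n_subvols):
    """Return (offset with the id bits cleared, subvolume id)."""
    id_mask = (1 << bits_for(n_subvols)) - 1
    return d_off & ~id_mask, d_off & id_mask


# Hypothetical huge (ext4-like) brick offset.
brick_off = 0x3A5C0F12778899AB

# AFR with 2 children stores brick-id 1 in the low bit.
afr_off = encode_low(brick_off, 1, 2)

# DHT with 4 children stores afr_subvol_id 3 in the low 2 bits,
# overwriting the bit that AFR had just used.
dht_off = encode_low(afr_off, 3, 4)

# On the way back down, DHT decodes its own id correctly ...
afr_seen, afr_subvol_id = decode_low(dht_off, 4)
# ... but AFR now reads 0 where it had stored 1.
_, brick_id = decode_low(afr_seen, 2)

print(afr_subvol_id, brick_id)
```

Both layers compete for the same low bits, so whichever layer encodes last
wins and the id written by the lower layer is gone.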

To avoid this, instead of using the last 'n' bits, we have decided to
store the brick/subvol-id in the first (most significant) 'n' bits,
leaving the TOP_BIT untouched; i.e., the brick/subvol offset is first
shifted right by 'n' bits, and the brick-id is then copied into those
first 'n' bits.
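
The idea can be sketched in Python as below. This is only an illustration
under stated assumptions, not the algorithm posted for review: it models the
field below TOP_BIT as WIDTH = 62 bits, ignores TOP_BIT itself, and the
names encode_high and decode_high and the sample values are hypothetical.

```python
WIDTH = 62  # assumed number of bits available below TOP_BIT in this sketch


def bits_for(n_subvols):
    """Bits needed for ids 0..n_subvols-1 (log2(N) when N is a power of two)."""
    return max(1, (n_subvols - 1).bit_length())


def encode_high(offset, subvol_id, n_subvols):
    """Shift the offset right by n bits and put the id in the top n bits."""
    n = bits_for(n_subvols)
    return (subvol_id << (WIDTH - n)) | (offset >> n)


def decode_high(d_off, n_subvols):
    """Return (offset with its low n bits lost, subvolume id)."""
    n = bits_for(n_subvols)
    subvol_id = d_off >> (WIDTH - n)
    offset = (d_off & ((1 << (WIDTH - n)) - 1)) << n
    return offset, subvol_id


# Hypothetical huge brick offset that fits in WIDTH bits.
brick_off = 0x3A5C0F12778899AB

afr_off = encode_high(brick_off, 1, 2)   # AFR: 2 children, brick-id 1
dht_off = encode_high(afr_off, 2, 4)     # DHT: 4 children, afr_subvol_id 2

afr_seen, afr_subvol_id = decode_high(dht_off, 4)
brick_seen, brick_id = decode_high(afr_seen, 2)

print(afr_subvol_id, brick_id, brick_off - brick_seen)
```

In this model both ids survive the round trip, because each re-encoding
shifts the lower layer's id further down instead of overwriting it. The
cost is that the low bits of the brick offset are dropped (three bits here:
one for AFR and two for DHT).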

This approach may also result in the loss of a few lower bits of the
brick_offset. But as mentioned in review #4711, "both EXT4/XFS
are tolerant in terms of the accuracy of the value presented back in
seekdir(). i.e, a seekdir(val) actually seeks to the entry which has the
"closest" true offset.", so this approach seems to work as well.

Also, this brick_id encoding change is necessary even in the case of small
offsets, since a small offset in the AFR translator could result in a
large offset when passed to DHT, which may again result in the loss of
the brick_id.

The changes and the modified algorithm have been posted for review at:
review.gluster.org/#/c/8201

Kindly review the changes and give us your feedback.

Thanks,
Soumya



