[Gluster-devel] Change in itransform/deitransform logic - dht/afr
Soumya Koduri
skoduri at redhat.com
Mon Jun 30 06:02:20 UTC 2014
Hi,
With Change-Id: Ieba9a7071829d51860b7c131982f12e0136b9855, the dht
itransform/deitransform logic was improved to encode the 64-bit brick
offset along with the brick-id in the global d_off.
More details regarding this change are at:
http://review.gluster.org/#/c/4711/
However, now that afr uses the same itransform/deitransform logic, the
brick-id stored in the afr_global_d_off gets zeroed out when it is
re-encoded in dht. This happens only when the offsets are huge (i.e., with
the ext4 filesystem): in such cases the low n bits are replaced with the
brick-id, which in turn gets replaced with the afr_subvol_id when the value
is re-encoded in dht, where
n = log2(N)
N = no. of DHT/AFR subvolumes.
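
To make the failure concrete, here is a minimal sketch of the low-bits
case in Python. It is a simplified model, not the GlusterFS code: the
function names (id_bits, encode_low, decode_low, roundtrip_low) are
hypothetical, it covers only the huge-offset path, and it assumes n is
rounded up when N is not a power of two.

```python
# Simplified model of the low-bits scheme for huge offsets.
# All names here are hypothetical; this is not the GlusterFS implementation.

def id_bits(num_subvols):
    """n = log2(N), rounded up (assumption) for N that is not a power of two."""
    return max(num_subvols - 1, 0).bit_length()

def encode_low(offset, subvol_id, n):
    """Overwrite the low n bits of offset with subvol_id."""
    mask = (1 << n) - 1
    return (offset & ~mask) | (subvol_id & mask)

def decode_low(d_off, n):
    """Return (offset with its low n bits cleared, subvol_id)."""
    mask = (1 << n) - 1
    return d_off & ~mask, d_off & mask

def roundtrip_low(brick_off, afr_id, dht_id, n_afr, n_dht):
    """Encode in afr, re-encode in dht, then decode on the way back down."""
    afr_d_off = encode_low(brick_off, afr_id, n_afr)
    dht_d_off = encode_low(afr_d_off, dht_id, n_dht)   # clobbers afr's bits
    back_to_afr, got_dht = decode_low(dht_d_off, n_dht)
    _, got_afr = decode_low(back_to_afr, n_afr)
    return got_dht, got_afr

# Two dht subvolumes, two afr bricks, brick 1 in afr subvolume 1.
n = id_bits(2)
print(roundtrip_low(0x7A3F19C455E01230, 1, 1, n, n))
```

In this model dht recovers its own id, but afr reads back brick 0 instead
of brick 1, because both layers write into the same low bits.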
To avoid this, instead of using the last 'n' bits, we have decided to
store the brick/subvol-id in the first 'n' bits after the TOP_BIT, i.e.,
the brick/subvol offset is first shifted right by 'n' bits, and the
brick-id is then copied into those first 'n' bits.
This approach may also result in the loss of a few lower bits of the
brick_offset. But as mentioned in review #4711, since "both EXT4/XFS
are tolerant in terms of the accuracy of the value presented back in
seekdir(). i.e, a seekdir(val) actually seeks to the entry which has the
'closest' true offset.", this approach too seems to work.
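
The high-bits scheme and its precision loss can be sketched the same way.
Again this is only an illustrative model under stated assumptions, not the
posted patch: the names (encode_high, decode_high, roundtrip_high) are
hypothetical, d_off is assumed to be 64 bits wide with TOP_BIT as bit 63,
and TOP_BIT is simply passed through unchanged.

```python
# Simplified model of the high-bits scheme.
# Names, the 64-bit width and the TOP_BIT handling are assumptions.

WIDTH = 64
TOP_BIT = 1 << (WIDTH - 1)      # assumed position of the reserved top bit
PAYLOAD_BITS = WIDTH - 1        # bits available below TOP_BIT
PAYLOAD_MASK = TOP_BIT - 1

def encode_high(offset, subvol_id, n):
    """Shift the offset right by n and store subvol_id just below TOP_BIT."""
    top = offset & TOP_BIT
    shifted = (offset & PAYLOAD_MASK) >> n          # low n bits are dropped
    id_field = (subvol_id & ((1 << n) - 1)) << (PAYLOAD_BITS - n)
    return top | id_field | shifted

def decode_high(d_off, n):
    """Return (offset with its low n bits zeroed, subvol_id)."""
    top = d_off & TOP_BIT
    subvol_id = (d_off & PAYLOAD_MASK) >> (PAYLOAD_BITS - n)
    low_mask = (1 << (PAYLOAD_BITS - n)) - 1
    return top | ((d_off & low_mask) << n), subvol_id

def roundtrip_high(brick_off, afr_id, dht_id, n_afr, n_dht):
    """Encode in afr, re-encode in dht, then decode on the way back down."""
    afr_d_off = encode_high(brick_off, afr_id, n_afr)
    dht_d_off = encode_high(afr_d_off, dht_id, n_dht)  # afr id shifts down, survives
    back_to_afr, got_dht = decode_high(dht_d_off, n_dht)
    got_off, got_afr = decode_high(back_to_afr, n_afr)
    return got_dht, got_afr, got_off

dht_id, afr_id, off = roundtrip_high(0x7A3F19C455E01237, 1, 1, 1, 1)
print(dht_id, afr_id, hex(off))
```

Here both ids survive the round trip, and the brick offset comes back with
only its low n_afr + n_dht bits cleared, which is the loss of accuracy
that seekdir() is expected to tolerate.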
Also, this brick_id encoding change is necessary even in the case of small
offsets, as a small offset in the AFR translator could result in a
large offset when passed to DHT, which may again result in the loss of
the brick_id.
Changes and the modified algorithm have been posted to the review URL -
review.gluster.org/#/c/8201
Kindly review the changes and give us your feedback.
Thanks,
Soumya