[Gluster-devel] Readdir d_off encoding
Shyam
srangana at redhat.com
Mon Dec 15 20:46:37 UTC 2014
With the changes present in [1] and [2], a short explanation of the change
would be: we encode the subvol ID in the d_off, losing 'n + 1' bits in case
the high order n+1 bits of the d_off returned by the underlying xlator are
not free. (Best to read the commit message for [1] :) )
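To make the bit accounting concrete, here is a minimal sketch of one way such
an encoding could look. It is an illustration only and not necessarily the
layout used in [1]: the function names, the choice of the top bit as a "lossy"
flag, and the placement of the subvol ID in the low n bits are all assumptions
made for this example.

```c
#include <stdint.h>

#define TOP_BIT (UINT64_C(1) << 63)   /* assumed "lossy encoding" flag */

/* Number of bits needed to represent subvol IDs 0..count-1. */
static unsigned bits_for(unsigned count)
{
    unsigned n = 0;
    while ((UINT64_C(1) << n) < count)
        n++;
    return n;
}

/* Hypothetical encoder: child_off is the d_off returned by the child
 * xlator, subvol is the child's index, n is bits_for(subvol count). */
static uint64_t encode_d_off(uint64_t child_off, unsigned subvol, unsigned n)
{
    uint64_t id_mask   = (UINT64_C(1) << n) - 1;
    uint64_t high_mask = ~(UINT64_MAX >> (n + 1));   /* top n+1 bits */

    if ((child_off & high_mask) == 0) {
        /* High n+1 bits are free: shift up, nothing is lost. */
        return (child_off << n) | (subvol & id_mask);
    }
    /* High bits are in use: drop the low n+1 bits and set the flag. */
    return TOP_BIT | ((child_off >> (n + 1)) << n) | (subvol & id_mask);
}

/* Hypothetical decoder: recovers the subvol ID and the (possibly
 * truncated) child d_off. */
static uint64_t decode_d_off(uint64_t off, unsigned n, unsigned *subvol)
{
    uint64_t id_mask = (UINT64_C(1) << n) - 1;

    *subvol = (unsigned)(off & id_mask);
    if (off & TOP_BIT)
        return ((off & ~TOP_BIT) >> n) << (n + 1);   /* low n+1 bits are zero */
    return off >> n;
}
```

In this sketch a small child d_off round-trips exactly, while a d_off that
already uses its high bits comes back with its low n+1 bits zeroed, which is
where the reliance on seekdir() tolerance comes in.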
Although not related to the latest patch, here is something to consider
for the future:
We now have DHT, AFR, EC(?), DHT over DHT (Tier) which need subvol
encoding in the returned readdir offset. Due to this, the loss in bits
_may_ cause unwanted offset behavior when used in the current scheme,
as we would end up eating more bits than we do at present.
Or IOW, we could be invalidating the assumption "both EXT4/XFS are
tolerant in terms of the accuracy of the value presented
back in seekdir(). i.e, a seekdir(val) actually seeks to the entry which
has the "closest" true offset."
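As a rough back-of-the-envelope illustration of the concern, the sketch below
adds up the worst-case number of d_off bits dropped when several xlators are
stacked. It assumes each layer independently applies an 'n + 1' bit scheme and
that every layer hits the lossy case; the function names and the example
subvol counts are made up for illustration.

```c
#include <stddef.h>

/* Number of bits needed to represent subvol IDs 0..count-1. */
static unsigned bits_for_subvols(unsigned count)
{
    unsigned n = 0;
    while ((1ULL << n) < count)
        n++;
    return n;
}

/* Worst-case bits dropped from the brick's d_off, assuming every
 * layer in the stack loses its own n + 1 bits. */
static unsigned worst_case_bits_lost(const unsigned *subvol_counts,
                                     size_t layers)
{
    unsigned lost = 0;
    for (size_t i = 0; i < layers; i++)
        lost += bits_for_subvols(subvol_counts[i]) + 1;
    return lost;
}
```

For example, a hypothetical stack of Tier (2 subvols) over DHT (8 subvols)
over AFR (2 subvols) would give (1 + 1) + (3 + 1) + (1 + 1) = 8 bits under
these assumptions, versus 4 bits for the DHT layer alone.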
Should we reconsider an in memory _cookie_ like approach that can help
in this case?
It would invalidate (some or all, depending on the implementation) the
following constraints that the current design satisfies (from [1]):
- Nothing to "remember in memory" or evict "old entries".
- Works fine across NFS server reboots and also NFS head failover.
- Tolerant to seekdir() to arbitrary locations.
But it would provide a more reliable readdir offset (as long as the cookie
is valid and not evicted, say).
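To make the trade-off a bit more tangible, here is a very small sketch of what
an in-memory cookie table might look like. This is purely hypothetical (the
struct names, the fixed slot count and the overwrite-oldest eviction are all
assumptions for the example, not an existing Gluster interface): the client
is handed an opaque cookie, and the exact (subvol, child d_off) pair stays in
memory.

```c
#include <stdbool.h>
#include <stdint.h>

#define COOKIE_SLOTS 4   /* deliberately tiny so eviction is easy to see */

struct cookie_entry {
    uint64_t cookie;     /* value handed back as the readdir offset */
    unsigned subvol;     /* which child the entry came from */
    uint64_t child_off;  /* exact, untruncated d_off from that child */
    bool     used;
};

struct cookie_table {
    struct cookie_entry slot[COOKIE_SLOTS];
    uint64_t next;       /* last cookie issued; 0 means none yet */
};

/* Remember (subvol, child_off) and return a cookie for it. The slot is
 * reused round-robin, so the oldest entry is silently evicted. */
static uint64_t cookie_put(struct cookie_table *t, unsigned subvol,
                           uint64_t child_off)
{
    uint64_t c = ++t->next;
    struct cookie_entry *e = &t->slot[c % COOKIE_SLOTS];

    e->cookie = c;
    e->subvol = subvol;
    e->child_off = child_off;
    e->used = true;
    return c;
}

/* Look a cookie up; returns false if it was never issued or has
 * been evicted. */
static bool cookie_get(const struct cookie_table *t, uint64_t c,
                       unsigned *subvol, uint64_t *child_off)
{
    const struct cookie_entry *e = &t->slot[c % COOKIE_SLOTS];

    if (!e->used || e->cookie != c)
        return false;
    *subvol = e->subvol;
    *child_off = e->child_off;
    return true;
}
```

While the entry is resident, the full 64-bit child d_off is recovered with no
loss of bits. Once it is evicted (or the process restarts, or another NFS head
takes over with an empty table), the lookup fails, which is exactly the set of
constraints listed above that this approach gives up.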
How would NFS adapt to this? Does Ganesha need a better scheme when
doing multi-head NFS fail over?
Thoughts?
Shyam
[1] http://review.gluster.org/#/c/4711/
[2] http://review.gluster.org/#/c/8201/