[Gluster-devel] ext4 64bit offset related fix for dht

Anand Avati avati at redhat.com
Thu Jul 12 15:06:23 UTC 2012


This looks mostly good, except we can introduce this extra conditional:
if the offsets presented by the backend filesystem already fit within
the lower 64 - log2(servers) bits [which happens to be true with XFS,
for example], then we continue the current approach of transformation,
as it permits arbitrary seekdir(). The "FUSE" approach here will only
permit seekdir(0), and seekdir(random) will be broken. This would show
up on the day we encounter the first application that performs
seekdir(random) on a directory FD.
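
To make the conditional concrete, here is a minimal sketch of the check and
of one possible transformation. It assumes the subvolume index is packed into
the top bits of an unsigned 64-bit d_off; this is an illustration only and
not necessarily the transformation DHT uses. All function names are
hypothetical.

```python
def subvol_bits(num_subvols):
    """Bits needed to number the subvolumes (0 bits for a single one)."""
    return (num_subvols - 1).bit_length()


def fits(backend_off, num_subvols):
    """True if the backend offset leaves the top bits free."""
    return backend_off < (1 << (64 - subvol_bits(num_subvols)))


def encode(backend_off, subvol, num_subvols):
    """Pack the subvolume index into the top bits, or return None if
    the backend offset is too large (the ext4 64-bit case)."""
    if not fits(backend_off, num_subvols):
        return None
    shift = 64 - subvol_bits(num_subvols)
    return (subvol << shift) | backend_off


def decode(d_off, num_subvols):
    """Recover (subvol, backend_off) from a packed d_off."""
    shift = 64 - subvol_bits(num_subvols)
    return d_off >> shift, d_off & ((1 << shift) - 1)
```

When encode() returns None, the caller would fall back to the approach that
leaves the offset unmodified and only supports seekdir(0).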

As Ric was mentioning, we can check this with a systemtap script on a
busy system to trap any seekdir call with a non-0 parameter and confirm.
However, if we do find an application which relies upon that behavior,
we can implement a cookie cache that returns virtual d_offs (short-lived,
for the life of the fd) and maps them to server+physical_d_off in the cache.
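
A rough sketch of what such a per-fd cookie cache could look like. The class
and method names are hypothetical, and eviction and locking are left out;
the cache is assumed to be freed when the fd is released.

```python
class CookieCache:
    """Per-fd map from virtual d_off to (subvol, physical d_off)."""

    def __init__(self):
        self._next = 1   # 0 is reserved for "start of directory"
        self._map = {}

    def present(self, subvol, physical_off):
        """Allocate a virtual d_off to hand to the application."""
        cookie = self._next
        self._next += 1
        self._map[cookie] = (subvol, physical_off)
        return cookie

    def resolve(self, cookie):
        """Translate a virtual d_off back; None if it was never issued."""
        if cookie == 0:
            return (0, 0)
        return self._map.get(cookie)
```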

I suggest we not invest time building this cache until we encounter that
first application. In DHT, since we will now be caching the subvolume
count in the fd_ctx, we could also store the last cookie presented to
the app and double-check that the next requested offset is either 0 or
matches the last presented value. If it does not, then return EOF, and
declare we have encountered that first app :-)
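
The check could look roughly like this. It is a sketch with hypothetical
names; fetch stands in for the real readdir path and returns a list of
(name, d_off) pairs.

```python
class DirFdCtx:
    """Per-fd state: subvolume count plus the last cookie handed out."""

    def __init__(self, subvol_count):
        self.subvol_count = subvol_count
        self.last_cookie = 0

    def offset_ok(self, requested):
        # Only a rewind (0) or a continuation from the last entry is valid.
        return requested == 0 or requested == self.last_cookie


def readdir(ctx, requested, fetch):
    """Return entries, or an empty list (EOF) on an unexpected offset."""
    if not ctx.offset_ok(requested):
        return []   # the "first app" doing seekdir(random) was found
    entries = fetch(requested)
    if entries:
        ctx.last_cookie = entries[-1][1]
    return entries
```

A real implementation would presumably also log the mismatch so that the
offending application can be identified.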

Avati


On Thu, Jul 12, 2012 at 2:50 AM, Shishir Gowda <sgowda at redhat.com> wrote:

     Hi All,

     Starting a thread to discuss the fixes for handling offsets being
     64bit in ext4.

     Please feel free to comment.

     DHT:

     1. Set the subvol count number in the xdata as part of the readdirp
     response
     2. Do not modify the offset while sending a response
     3. If the request xdata has the subvol count number, use it to
     identify the subvol to which the call should go
     4. If the xdata does not have the subvol count number, then start
     with subvol - 0.

     Remaining dht - readdir/p behaviour need not be changed.
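
The four DHT steps above can be sketched as follows. This is an illustration
under assumptions: the xdata key name is hypothetical, xdata is modeled as a
plain dict, and each subvolume is modeled as a callable that takes an offset
and returns a list of (name, d_off) pairs.

```python
SUBVOL_KEY = "dht.readdir.subvol"   # hypothetical xdata key name


def dht_readdirp(subvols, offset, xdata=None):
    """Return (entries, response_xdata); d_offs are passed through as-is."""
    # Steps 3 and 4: take the subvolume from the request, else start at 0.
    idx = (xdata or {}).get(SUBVOL_KEY, 0)
    while idx < len(subvols):
        entries = subvols[idx](offset)
        if entries:
            # Steps 1 and 2: tag the response, leave the offsets untouched.
            return entries, {SUBVOL_KEY: idx}
        # This subvolume is exhausted; move to the next one from its start.
        idx += 1
        offset = 0
    # Past the last subvolume: end of directory.
    return [], {SUBVOL_KEY: idx}
```

The caller is expected to hand the returned xdata back on its next call for
the same fd, which is what the Fuse steps below describe.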


     Fuse:
     1. save the subvolume count if available in the xdata in the fd_ctx
     2. When a readdir/p call is received on the same fd, send down the
     subvolume count as part of xdata

     NFS:
     1. Save the subvolume count in the verifier cookie, and pass it to
     the client
     2. Pass the verifier cookie down the graph if received.

     With regards,
     Shishir

     _______________________________________________
     Gluster-devel mailing list
     Gluster-devel at nongnu.org
     https://lists.nongnu.org/mailman/listinfo/gluster-devel