[Gluster-devel] compound fop design first cut
Shyam
srangana at redhat.com
Wed Dec 9 14:38:21 UTC 2015
On 12/09/2015 12:52 AM, Pranith Kumar Karampuri wrote:
>
>
> On 12/09/2015 10:39 AM, Prashanth Pai wrote:
>>> However, I’d be even more comfortable with an even simpler approach that
>>> avoids the need to solve what the database folks (who have dealt with
>>> complex transactions for years) would tell us is a really hard problem.
>>> Instead of designing for every case we can imagine, let’s design for the
>>> cases that we know would be useful for improving performance. Open plus
>>> read/write plus close is an obvious one. Raghavendra mentions
>>> create+inodelk as well.
>> From object interface (Swift/S3) perspective, this is the fop order
>> and flow for object operations:
>>
>> GET: open(), fstat(), fgetxattr()s, read()s, close()
> Krutika implemented fstat+fgetxattr (http://review.gluster.org/10180). In
> posix there is an implementation of GF_CONTENT_KEY which is used to read
> a file in lookup by quick-read. This needs to be exposed for fds as well
> I think. So you can do all this using fstat on anon-fd.
>> HEAD: stat(), getxattr()s
> Krutika already implemented this for sharding
> http://review.gluster.org/10158. You can do this using stat fop.
I believe we need to fork this part of the conversation, i.e. the
clubbing of stat and xattr information.
My view of a stat for gluster is that it should return the POSIX stat
plus gluster's extended information. Put another way: when a file system
stats its inode, it should get all the information about that inode, not
just the POSIX fields. In other local file systems, the inode structure
has more fields than POSIX needs, so when the inode is *read* the FS can
populate all its internal inode information and return to the
application/syscall only the fields that it needs.
I believe gluster should do the same. In the cases above, we should
extend our stat information (not elaborating how) to include all
information from the brick, i.e. the stat from POSIX and all the extended
attrs for the inode (file or dir). Any layer can then consume this as
needed.
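To make the idea concrete, here is a minimal sketch of what such an
"extended stat" could look like. It is not existing gluster code: the
names ext_stat, ext_xattr, ext_stat_add and ext_stat_get, the fixed-size
array, and the example xattr keys are all hypothetical, chosen only to
illustrate returning the POSIX stat and the inode's xattrs together.

```c
#include <stddef.h>
#include <string.h>
#include <sys/stat.h>

#define EXT_STAT_MAX_XATTRS 16   /* arbitrary cap for this sketch */

/* Hypothetical: one extended attribute returned together with the stat. */
struct ext_xattr {
    const char *name;    /* not copied; the caller keeps it alive */
    const void *value;   /* not copied; the caller keeps it alive */
    size_t      len;
};

/* Hypothetical "extended stat": POSIX fields plus all xattrs of the inode. */
struct ext_stat {
    struct stat      posix;
    struct ext_xattr xattrs[EXT_STAT_MAX_XATTRS];
    size_t           count;
};

/* Record one xattr as read from the brick; returns 0, or -1 when full. */
static int
ext_stat_add(struct ext_stat *es, const char *name,
             const void *value, size_t len)
{
    if (es->count == EXT_STAT_MAX_XATTRS)
        return -1;
    es->xattrs[es->count].name = name;
    es->xattrs[es->count].value = value;
    es->xattrs[es->count].len = len;
    es->count++;
    return 0;
}

/* Any layer picks out what it needs; NULL means the inode lacks that xattr. */
static const struct ext_xattr *
ext_stat_get(const struct ext_stat *es, const char *name)
{
    for (size_t i = 0; i < es->count; i++) {
        if (strcmp(es->xattrs[i].name, name) == 0)
            return &es->xattrs[i];
    }
    return NULL;
}
```

With something of this shape, a layer would call ext_stat_get() on the
reply it already has instead of asking for a specific key on the way
down. A real design would also need to bound or filter what is returned,
since an inode can carry many or large xattrs.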
Currently, each layer adds what it needs, on top of the stat
information, to the xdata as an xattr request. This can continue, or it
can go away if the relevant FOPs return the whole inode information upward.
This also has useful outcomes in readdirp calls, where we get the
extended stat information for each entry.
With the patches referred to, and older patches, this seems to have been
the direction sought (around 2013). Are there any reasons why this was
not made prevalent across the stack? Or am I mistaken?
>> PUT: creat(), write()s, setxattr(), fsync(), close(), rename()
> This I think should be a new compound fop. Nothing similar exists.
>> DELETE: getxattr(), unlink()
> This can also be clubbed in unlink already because xdata exists on the
> wire already.
>>
>> Compounding some of these ops and exposing them as consumable libgfapi
>> APIs like glfs_get() and glfs_put() similar to librados compound
>> APIs[1] would greatly improve performance for object based access.
>>
>> [1]:
>> https://github.com/ceph/ceph/blob/master/src/include/rados/librados.h#L2219
>>
>>
>> Thanks.
>>
>> - Prashanth Pai
>