[Gluster-devel] [RFC] Zerofill FOP support for GlusterFS
aakash at linux.vnet.ibm.com
Tue Jul 16 05:11:46 UTC 2013
Hi,
There is a correction to my previous mail. The correct links for the test
programs are:
* For offloaded zeroing:
https://docs.google.com/file/d/0B4jeWncLrfS3MzdqcEx3Nm5jSTQ/edit?usp=sharing
* For manually filling zeroes:
https://docs.google.com/file/d/0B4jeWncLrfS3LVNybW9lR2dPZkk/edit?usp=sharing
Sorry for the inconvenience.
Thanks,
Aakash Lal Das
Quoting aakash at linux.vnet.ibm.com:
> Add support for a new ZEROFILL fop. Zerofill writes zeroes to a file in the
> specified range. This fop is useful when a whole file needs to be
> initialized with zeroes, for example when provisioning zero-filled VM disk
> images or when scrubbing VM disk images.
>
> A client or application can issue this fop to zero out a range. The Gluster
> server then zeroes out the required range of bytes, i.e. the zeroing is
> offloaded to the server. In the absence of this fop, the client or
> application has to issue write (zero) fops to the server repeatedly, which
> is very inefficient because of the overhead of the RPC calls and
> acknowledgements.
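
For reference, the client-side fallback described above (zeroing by repeated
writes) could look roughly like the sketch below. It is only an illustration:
it uses pwrite() on a plain POSIX file descriptor rather than GlusterFS write
fops, and the helper name zero_range_by_writes and the 128 KiB chunk size are
assumptions, not taken from the test programs.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: zero `len` bytes starting at `offset` by issuing one
 * write per chunk. Over a network file system every chunk would cost a full
 * request/acknowledgement round trip, which is the overhead the ZEROFILL fop
 * is meant to avoid. Returns 0 on success, -1 on error. */
int zero_range_by_writes(int fd, off_t offset, size_t len)
{
    static char zeroes[128 * 1024];   /* static storage is zero-initialized */

    while (len > 0) {
        size_t chunk = len < sizeof(zeroes) ? len : sizeof(zeroes);
        ssize_t n = pwrite(fd, zeroes, chunk, offset);

        if (n < 0) {
            if (errno == EINTR)
                continue;             /* interrupted: retry the same chunk */
            return -1;
        }
        if (n == 0)
            return -1;                /* no progress: give up */

        offset += n;                  /* also copes with short writes */
        len -= (size_t)n;
    }
    return 0;
}
```

Zeroing 20 GB this way takes 163840 separate writes at this chunk size; with
the fop the client sends a single request carrying only offset and size.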
>
> WRITESAME is a SCSI T10 command that takes a block of data as input and
> writes the same data to other blocks. The write is handled completely
> within the storage device and is hence known as an offload. Linux now
> supports the SCSI WRITESAME command, which is exposed to the user in the
> form of the BLKZEROOUT ioctl. The BD Xlator can exploit the BLKZEROOUT ioctl
> to implement this fop. Zeroing-out operations can thus be completely
> offloaded to the storage device, making them highly efficient.
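
As a rough illustration of the BLKZEROOUT ioctl mentioned above, here is a
minimal Linux-only sketch. The ioctl takes a pointer to two 64-bit values,
the start offset and the length in bytes. The wrapper name zero_block_range
is an assumption for this example and is not part of the patch or of the BD
Xlator.

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/fs.h>   /* BLKZEROOUT */

/* Hypothetical wrapper: ask the kernel to zero `len` bytes of the block
 * device open on `fd`, starting at byte offset `start`. Both values are
 * expected to be multiples of 512. Returns 0 on success, or -1 with errno
 * set (for example when `fd` does not refer to a block device). */
int zero_block_range(int fd, uint64_t start, uint64_t len)
{
    uint64_t range[2] = { start, len };

    return ioctl(fd, BLKZEROOUT, range);
}
```

The descriptor has to be a block device opened for writing, which normally
requires root. Be careful when trying this out: it destroys the data in the
given range.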
>
> The fop takes two arguments, offset and size. It zeroes out 'size' bytes
> in an opened file, starting at position 'offset'.
>
> This patch adds zerofill support to the following areas:
>
> - libglusterfs
> - io-stats
> - performance/md-cache,open-behind
> - quota
> - cluster/afr,dht,stripe
> - rpc/xdr
> - protocol/client,server
> - io-threads
> - marker
> - storage/posix
> - libgfapi
>
> Client applications can exploit this fop by using glfs_zerofill, introduced in
> libgfapi. FUSE support for this fop has not been added, as there is no
> system call for this fop.
>
> TODO:
> * Add zerofill support to trace xlator
> * Expose zerofill capability as part of gluster volume info
>
> Here is a performance comparison of server-offloaded zerofill vs. zeroing out
> using repeated writes.
>
> [root at llmvm02 remote]# time ./offloaded aakash-test log 20
>
> real 3m34.155s
> user 0m0.018s
> sys 0m0.040s
> [root at llmvm02 remote]# time ./manually aakash-test log 20
>
> real 4m23.043s
> user 0m2.197s
> sys 0m14.457s
> [root at llmvm02 remote]# time ./offloaded aakash-test log 25;
>
> real 4m28.363s
> user 0m0.021s
> sys 0m0.025s
> [root at llmvm02 remote]# time ./manually aakash-test log 25
>
> real 5m34.278s
> user 0m2.957s
> sys 0m18.808s
>
> The argument 'log' is the file used for logging, and the third argument is
> the size in GB.
>
> As we can see, the elapsed (real) time improves by around 20% with this
> fop, and the client's user and sys time drop to almost nothing. For block
> devices, using the BLKZEROOUT ioctl should improve the performance even
> more.
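
To make the "around 20%" figure concrete, the saving can be computed from the
'real' times in the runs above. The helper name percent_time_saved is just an
assumption for this sketch.

```c
/* Hypothetical helper: percentage of elapsed time saved by the offloaded
 * run relative to the manual (repeated-writes) run. Times are in seconds. */
double percent_time_saved(double manual_s, double offloaded_s)
{
    return 100.0 * (manual_s - offloaded_s) / manual_s;
}
```

For the 20 GB run, percent_time_saved(263.043, 214.155) gives about 18.6; for
the 25 GB run, percent_time_saved(334.278, 268.363) gives about 19.7.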
>
> The applications used for performance comparison can be found here:
>
> For manually writing zeros:
> https://docs.google.com/file/d/0B4jeWncLrfS3LVNybW9lR2dPZkk/edit?usp=sharing
>
> For offloaded zeroing:
> https://docs.google.com/file/d/0B4jeWncLrfS3LVNybW9lR2dPZkk/edit?usp=sharing
>
> Change-Id: I081159f5f7edde0ddb78169fb4c21c776ec91a18
> Signed-off-by: Aakash Lal Das <aakash at linux.vnet.ibm.com>