[Gluster-devel] [RFC] Zerofill FOP support for GlusterFS

aakash at linux.vnet.ibm.com aakash at linux.vnet.ibm.com
Tue Jul 16 05:11:46 UTC 2013


Hi,
There is a correction. The correct links for the test programs are:
* Offloaded zeroing:
https://docs.google.com/file/d/0B4jeWncLrfS3MzdqcEx3Nm5jSTQ/edit?usp=sharing

* Manual zero filling:
https://docs.google.com/file/d/0B4jeWncLrfS3LVNybW9lR2dPZkk/edit?usp=sharing
Sorry for the inconvenience.


Thanks,
Aakash Lal Das

Quoting aakash at linux.vnet.ibm.com:

> Add support for a new ZEROFILL fop. Zerofill writes zeroes to a file in the
> specified range. This fop is useful when a whole file needs to be
> initialized with zeroes, for example when provisioning zero-filled VM disk
> images or when scrubbing VM disk images.
>
> A client/application can issue this fop to zero out a range. The Gluster
> server then zeroes out the required range of bytes, i.e. the zeroing is
> offloaded to the server. In the absence of this fop, the client/application
> has to repeatedly issue write (zero) fops to the server, which is very
> inefficient because of the overheads involved in RPC calls and
> acknowledgements.
>
> WRITESAME is a SCSI T10 command that takes a block of data as input and
> writes the same data to other blocks. The write is handled completely within
> the storage and hence is known as an offload. Linux now has support for the
> SCSI WRITESAME command, which is exposed to the user in the form of the
> BLKZEROOUT ioctl. The BD Xlator can exploit the BLKZEROOUT ioctl to
> implement this fop. Thus zeroing out operations can be completely offloaded
> to the storage device, making them highly efficient.
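
To make the BLKZEROOUT path above a little more concrete, here is a minimal sketch of the ioctl call on Linux. It is not the BD xlator code from the patch: the helper name blkdev_zero_range is hypothetical, and it assumes the caller already holds a writable fd on the block device and passes a 512-byte-aligned start and length.

```c
#include <errno.h>      /* callers inspect errno on failure */
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/fs.h>   /* BLKZEROOUT */

/* Hypothetical helper (not from the patch): ask the kernel to zero
 * 'len' bytes starting at byte 'start' on an open block device.
 * Assumption: start and len are multiples of 512.
 * Returns 0 on success, -1 with errno set on failure. */
static int blkdev_zero_range(int fd, uint64_t start, uint64_t len)
{
    /* BLKZEROOUT takes a pointer to two 64-bit values: {start, length}. */
    uint64_t range[2] = { start, len };

    return ioctl(fd, BLKZEROOUT, range);
}
```

Whether the device services this with WRITE SAME or the kernel falls back to writing zero pages depends on the device and the kernel, so treat this only as an illustration of the interface.
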
>
> The fop takes two arguments, offset and size. It zeroes out 'size' bytes
> in an opened file, starting at position 'offset'.
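
To illustrate the (offset, size) semantics, here is a small local sketch that zeroes a byte range of an ordinary file with pwrite(). It only mirrors what the fop is meant to do to the file contents; zero_range and the 64 KB chunk size are illustrative choices of mine and are not taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of GlusterFS): zero 'size' bytes of an
 * open file starting at 'offset' by writing from a zero buffer.
 * Returns 0 on success, -1 on error (errno is left as set by pwrite). */
static int zero_range(int fd, off_t offset, size_t size)
{
    /* Static storage is zero-initialized, so this is a buffer of zeroes. */
    static const char zeroes[64 * 1024];

    assert(offset >= 0);

    while (size > 0) {
        size_t chunk = size < sizeof(zeroes) ? size : sizeof(zeroes);
        ssize_t n = pwrite(fd, zeroes, chunk, offset);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing: retry */
            return -1;
        }
        if (n == 0)
            return -1;          /* no progress: give up rather than spin */

        offset += n;            /* handle short writes */
        size -= (size_t)n;
    }
    return 0;
}
```

Bytes outside [offset, offset + size) are left untouched, which is the behaviour described above for the fop.
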
>
> This patch adds zerofill support to the following areas:
>
>         - libglusterfs
>         - io-stats
>         - performance/md-cache,open-behind
>         - quota
>         - cluster/afr,dht,stripe
>         - rpc/xdr
>         - protocol/client,server
>         - io-threads
>         - marker
>         - storage/posix
>         - libgfapi
>
> Client applications can exploit this fop by using glfs_zerofill, introduced
> in libgfapi. FUSE support for this fop has not been added as there is no
> system call for this fop.
>
> TODO :
>      * Add zerofill support to trace xlator
>      * Expose zerofill capability as part of gluster volume info
>
> Here is a performance comparison of server offloaded zerofill vs zeroing out
> using repeated writes.
>
> [root at llmvm02 remote]# time ./offloaded aakash-test log 20
>
> real        3m34.155s
> user        0m0.018s
> sys        0m0.040s
> [root at llmvm02 remote]# time ./manually aakash-test log 20
>
> real        4m23.043s
> user        0m2.197s
> sys        0m14.457s
> [root at llmvm02 remote]# time ./offloaded aakash-test log 25;
>
> real        4m28.363s
> user        0m0.021s
> sys        0m0.025s
> [root at llmvm02 remote]# time ./manually aakash-test log 25
>
> real        5m34.278s
> user        0m2.957s
> sys        0m18.808s
>
> The argument 'log' is the file used for logging, and the third argument is
> the size in GB.
>
> As we can see, there is a performance improvement of around 20% with this
> fop (roughly 19% less wall-clock time for 20 GB and 20% less for 25 GB).
> For block devices, using the BLKZEROOUT ioctl, we can improve the
> performance even more.
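
For anyone who wants to check the figure, the arithmetic behind it can be sketched as below. The helper names are hypothetical; the inputs are the 'real' times from the runs quoted above.

```c
#include <assert.h>

/* Convert a time(1) style "XmY.YYYs" reading to seconds. */
static double to_seconds(int minutes, double seconds)
{
    return 60.0 * minutes + seconds;
}

/* Percentage of wall-clock time saved by the fast run relative to the
 * slow run: 100 * (slow - fast) / slow. */
static double percent_saved(double slow_s, double fast_s)
{
    assert(slow_s > 0.0);
    return 100.0 * (slow_s - fast_s) / slow_s;
}
```

With the 20 GB runs, percent_saved(to_seconds(4, 23.043), to_seconds(3, 34.155)) comes to about 18.6, and with the 25 GB runs, percent_saved(to_seconds(5, 34.278), to_seconds(4, 28.363)) comes to about 19.7.
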
>
> The applications used for performance comparison can be found here:
>
> For manually writing zeros:  
> https://docs.google.com/file/d/0B4jeWncLrfS3LVNybW9lR2dPZkk/edit?usp=sharing
>
> For offloaded zeroing :  
> https://docs.google.com/file/d/0B4jeWncLrfS3LVNybW9lR2dPZkk/edit?usp=sharing
>
> Change-Id: I081159f5f7edde0ddb78169fb4c21c776ec91a18
> Signed-off-by: Aakash Lal Das <aakash at linux.vnet.ibm.com>





