[Gluster-devel] [RFC PATCH v0 0/1] Zero copy readv

Bharata B Rao bharata.rao@gmail.com
Tue Mar 5 14:39:31 UTC 2013


Here is a highly experimental patch to support zero-copy readv in
GlusterFS. The patch reads the data arriving on the socket directly into
the client-supplied buffers (iovecs), eliminating one memory copy per
readv request. Currently I support zero-copy readv only in
glfs_preadv_async(), which is what QEMU uses.
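
For context, here is a minimal sketch of how a libgfapi caller drives
glfs_preadv_async() with its own iovecs, roughly what QEMU's gluster
driver does. The server, volume and file names are placeholders, the
callback uses the glfs_io_cbk signature from this era's api/glfs.h, and
error checking is omitted for brevity:

#include <stdio.h>
#include <fcntl.h>
#include <sys/uio.h>
#include <glusterfs/api/glfs.h>

/* Completion callback: by the time this runs, the zero-copy path has
 * already filled the iovec we passed in, with no intermediate memcpy. */
static void read_done (glfs_fd_t *fd, ssize_t ret, void *data)
{
        printf ("readv returned %zd bytes\n", ret);
        *(volatile int *) data = 1;
}

int main (void)
{
        char         buf[4096];
        struct iovec iov  = { .iov_base = buf, .iov_len = sizeof (buf) };
        volatile int done = 0;

        glfs_t *fs = glfs_new ("test");          /* volume name */
        glfs_set_volfile_server (fs, "tcp", "bharata", 24007);
        glfs_init (fs);

        glfs_fd_t *fd = glfs_open (fs, "/F17-test", O_RDONLY);

        /* This iovec is what the zero-copy patch reads socket data
         * into directly, instead of an internal rpc buffer. */
        glfs_preadv_async (fd, &iov, 1, 0, 0, read_done, (void *) &done);

        while (!done)
                ;  /* crude wait; a real caller uses its own event loop */

        glfs_close (fd);
        glfs_fini (fs);
        return 0;
}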

The approach is basically a hack to quickly measure whatever performance
gains zero-copy readv might offer. If we decide the kind of gains I am
seeing are worth pursuing, I will work on a proper approach and send it
via Gerrit. I have taken the path of least changes in the core rpc/socket
code, and the path of least resistance in the xlator code, to get the
implementation working, and I don't claim that this is the right or
optimal approach. The volume is also pruned down so that I don't have to
touch xlators that aren't strictly needed.
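
To make the eliminated copy concrete, here is a rough C sketch (not the
patch itself) of the two read paths at the socket layer; rpc_buf,
user_iov and the function names are hypothetical:

#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

/* Conventional path: the payload lands in an RPC-owned buffer first and
 * is memcpy'd into the caller's iovec higher up the stack. */
static ssize_t read_with_copy (int sock, char *rpc_buf, size_t len,
                               struct iovec *user_iov)
{
        ssize_t ret = read (sock, rpc_buf, len);
        if (ret > 0)
                memcpy (user_iov->iov_base, rpc_buf, ret); /* the copy */
        return ret;
}

/* Zero-copy path: hand the caller's iovecs straight to readv(2), so the
 * kernel copies socket data directly into the client-supplied buffers. */
static ssize_t read_zero_copy (int sock, struct iovec *user_iov, int count)
{
        return readv (sock, user_iov, count);
}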

The volume configuration looks like this:

[root@bharata ~]# gluster volume info

Volume Name: test
Type: Distribute
Volume ID: 62461295-c265-4d15-916c-f283f28f6cbd
Status: Started
Number of Bricks: 1
Transport-type: tcp
Brick1: bharata:/test
Options Reconfigured:
server.allow-insecure: on
performance.read-ahead: off
performance.write-behind: off
performance.io-cache: off
performance.quick-read: off

I used FIO from a QEMU guest to compare the performance. The setup
is as follows:
Host: 2-core x86_64 system, 3.6.11-5.fc17.x86_64 kernel
Guest: 4-CPU Fedora 17, 3.3.4-5.fc17.x86_64 kernel
QEMU commit: 7ce4106c2125
GlusterFS commit: 4e15a0b4189fe5
QEMU cmdline: qemu-system-x86_64 --enable-kvm -nographic -m 2048 -smp
4 -drive file=gluster://bharata/test/F17-test,if=virtio,cache=none
FIO configuration:
; Read 4 files with aio at different depths
Average of 10 FIO runs from the guest, with and without zero-copy readv:

Without zero-copy: 43817 kB/s
With zero-copy:    46194 kB/s ==> (46194 - 43817) / 43817, around a 5.4% gain.

I would appreciate any feedback on the patch, and suggestions on the
right approach to pursue further.
