[Bugs] [Bug 1664934] New: glusterfs-fuse client not benefiting from page cache on read after write
bugzilla at redhat.com
Thu Jan 10 05:17:17 UTC 2019
https://bugzilla.redhat.com/show_bug.cgi?id=1664934
Bug ID: 1664934
Summary: glusterfs-fuse client not benefiting from page
cache on read after write
Product: GlusterFS
Version: 5
Hardware: x86_64
OS: Linux
Status: NEW
Component: fuse
Severity: high
Assignee: bugs at gluster.org
Reporter: mpillai at redhat.com
CC: bugs at gluster.org
Target Milestone: ---
Classification: Community
Description of problem:
On a simple single-brick distribute volume, I'm running tests to validate the
glusterfs-fuse client's use of the page cache. The tests indicate that a read
following a write is served from the brick, not from the client cache. In
contrast, a second read does get its data from the client cache.
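The same pattern is visible with a plain dd round trip (a minimal sketch, not
from the original runs; the mount point and file name are just examples):
# write ~1 GiB through the fuse mount, fsync at the end
dd if=/dev/zero of=/mnt/glustervol/probe bs=128k count=8192 conv=fsync
# first read: slow if the bug reproduces, served from the brick over the network
dd if=/mnt/glustervol/probe of=/dev/null bs=128k
# second read: fast, served from the client page cache
dd if=/mnt/glustervol/probe of=/dev/null bs=128k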
Version-Release number of selected component (if applicable):
glusterfs-*5.2-1.el7.x86_64
kernel-3.10.0-957.el7.x86_64 (RHEL 7.6)
How reproducible:
Consistently
Steps to Reproduce:
1. Use fio to create a data set that fits easily in the page cache. My client
has 128 GB RAM, so I'll create a 64 GB data set:
fio --name=initialwrite --ioengine=sync --rw=write \
    --direct=0 --create_on_open=1 --end_fsync=1 --bs=128k \
    --directory=/mnt/glustervol/ --filename_format=f.\$jobnum.\$filenum \
    --filesize=16g --size=16g --numjobs=4
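To check that the written data is actually resident in the client's page cache
before the read test, the Cached counter in /proc/meminfo can be sampled before
and after this step (a diagnostic suggestion, not part of the original report):
grep ^Cached: /proc/meminfo   # expect roughly a 64 GB increase after step 1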
2. Run an fio read test that reads the data set from step 1 without
invalidating the page cache (--invalidate=0):
fio --name=readtest --ioengine=sync --rw=read --invalidate=0 \
    --direct=0 --bs=128k --directory=/mnt/glustervol/ \
    --filename_format=f.\$jobnum.\$filenum --filesize=16g \
    --size=16g --numjobs=4
Read throughput is much lower than it would be if reading from page cache:
READ: bw=573MiB/s (601MB/s), 143MiB/s-144MiB/s (150MB/s-150MB/s), io=64.0GiB
(68.7GB), run=114171-114419msec
Reads are going over the 10GbE network, as shown in this (edited) sar output:
05:01:04 AM IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s
05:01:06 AM em1 755946.26 40546.26 1116287.75 3987.24 0.00
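The counters above are from sar's per-interface network report, captured with
an interval matching the timestamps (2 seconds); em1 is the 10GbE NIC:
sar -n DEV 2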
[There is some read amplification here: the application is getting lower
throughput than what the client is reading over the network. More on that
later.]
3. Run the read test from step 2 again. This time read throughput is much
higher, indicating reads from the client cache rather than over the network:
READ: bw=14.8GiB/s (15.9GB/s), 3783MiB/s-4270MiB/s (3967MB/s-4477MB/s),
io=64.0GiB (68.7GB), run=3837-4331msec
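As a sanity check (not part of the original runs), dropping the client's page
cache between step 2 and step 3 should push the repeat read back down to
step 2 throughput, confirming the step 3 speedup comes from local caching:
sync; echo 3 > /proc/sys/vm/drop_caches   # run as root on the client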
Expected results:
The read test in step 2 should be served from the page cache, giving throughput
close to what we get in step 3.
Additional Info:
gluster volume info:
Volume Name: perfvol
Type: Distribute
Volume ID: 7033539b-0331-44b1-96cf-46ddc6ee2255
Status: Started
Snapshot Count: 0
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: 172.16.70.128:/mnt/rhs_brick1
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
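For reference, the volume configuration above is the output of the standard
CLI query:
gluster volume info perfvol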