[Bugs] [Bug 1753843] [Disperse volume]: Regression in IO performance seen in sequential read for large file

bugzilla at redhat.com bugzilla at redhat.com
Fri Sep 20 04:15:25 UTC 2019


https://bugzilla.redhat.com/show_bug.cgi?id=1753843



--- Comment #4 from Raghavendra G <rgowdapp at redhat.com> ---
I've done some tests and the reason for the increased amount of data read seems
to be the read-ahead xlator.

I've checked the amount of data read when an 8 GiB file is read with dd. With
the read-ahead xlator disabled, profile info reports 8 GiB read; with
read-ahead enabled it reports more than 12 GiB read.

Looking at the trace logs of the read-ahead xlator I see the following
sequence:

[2019-09-19 11:32:23.347873] T [MSGID: 0] [read-ahead.c:459:ra_readv]
0-ec-read-ahead: NEW REQ at offset=39321600 for size=131072
[2019-09-19 11:32:23.347877] T [MSGID: 0] [read-ahead.c:479:ra_readv]
0-ec-read-ahead: expected offset (39321600) when page_count=4
[2019-09-19 11:32:23.347881] T [MSGID: 0] [read-ahead.c:382:dispatch_requests]
0-ec-read-ahead: HIT at offset=39321600.
[2019-09-19 11:32:23.347886] T [MSGID: 0] [read-ahead.c:317:read_ahead]
0-ec-read-ahead: RA at offset=39714816
[2019-09-19 11:32:23.348248] T [MSGID: 0] [read-ahead.c:317:read_ahead]
0-ec-read-ahead: RA at offset=39845888
[2019-09-19 11:32:23.348608] T [MSGID: 0] [read-ahead.c:317:read_ahead]
0-ec-read-ahead: RA at offset=39976960
[2019-09-19 11:32:23.349153] T [MSGID: 0] [read-ahead.c:317:read_ahead]
0-ec-read-ahead: RA at offset=40108032
[2019-09-19 11:32:23.350774] T [MSGID: 0] [read-ahead.c:459:ra_readv]
0-ec-read-ahead: NEW REQ at offset=39452672 for size=131072
[2019-09-19 11:32:23.350783] T [MSGID: 0] [read-ahead.c:479:ra_readv]
0-ec-read-ahead: expected offset (39452672) when page_count=4
[2019-09-19 11:32:23.350788] T [MSGID: 0] [read-ahead.c:382:dispatch_requests]
0-ec-read-ahead: HIT at offset=39452672.
[2019-09-19 11:32:23.351186] T [MSGID: 0] [read-ahead.c:459:ra_readv]
0-ec-read-ahead: NEW REQ at offset=39583744 for size=131072
[2019-09-19 11:32:23.351192] T [MSGID: 0] [read-ahead.c:479:ra_readv]
0-ec-read-ahead: expected offset (39583744) when page_count=4
[2019-09-19 11:32:23.351197] T [MSGID: 0] [read-ahead.c:382:dispatch_requests]
0-ec-read-ahead: HIT at offset=39583744.
[2019-09-19 11:32:23.351573] T [MSGID: 0] [read-ahead.c:459:ra_readv]
0-ec-read-ahead: NEW REQ at offset=39714816 for size=131072
[2019-09-19 11:32:23.351578] T [MSGID: 0] [read-ahead.c:479:ra_readv]
0-ec-read-ahead: expected offset (39714816) when page_count=4
[2019-09-19 11:32:23.351582] T [MSGID: 0] [read-ahead.c:382:dispatch_requests]
0-ec-read-ahead: HIT at offset=39714816.
[2019-09-19 11:32:23.351900] T [MSGID: 0] [read-ahead.c:459:ra_readv]
0-ec-read-ahead: NEW REQ at offset=39845888 for size=131072
[2019-09-19 11:32:23.351904] T [MSGID: 0] [read-ahead.c:479:ra_readv]
0-ec-read-ahead: expected offset (39845888) when page_count=4
[2019-09-19 11:32:23.351908] T [MSGID: 0] [read-ahead.c:382:dispatch_requests]
0-ec-read-ahead: HIT at offset=39845888.
[2019-09-19 11:32:23.351913] T [MSGID: 0] [read-ahead.c:317:read_ahead]
0-ec-read-ahead: RA at offset=40239104
[2019-09-19 11:32:23.352229] T [MSGID: 0] [read-ahead.c:317:read_ahead]
0-ec-read-ahead: RA at offset=40370176
[2019-09-19 11:32:23.352797] T [MSGID: 0] [read-ahead.c:317:read_ahead]
0-ec-read-ahead: RA at offset=40501248
[2019-09-19 11:32:23.353324] T [MSGID: 0] [read-ahead.c:317:read_ahead]
0-ec-read-ahead: RA at offset=40632320
[2019-09-19 11:32:23.354954] T [MSGID: 0] [read-ahead.c:459:ra_readv]
0-ec-read-ahead: NEW REQ at offset=39976960 for size=131072
[2019-09-19 11:32:23.354969] T [MSGID: 0] [read-ahead.c:479:ra_readv]
0-ec-read-ahead: expected offset (39976960) when page_count=4
[2019-09-19 11:32:23.354983] T [MSGID: 0] [read-ahead.c:382:dispatch_requests]
0-ec-read-ahead: HIT at offset=39976960.
[2019-09-19 11:32:23.355003] T [MSGID: 0] [read-ahead.c:459:ra_readv]
0-ec-read-ahead: NEW REQ at offset=40108032 for size=131072
[2019-09-19 11:32:23.355019] T [MSGID: 0] [read-ahead.c:473:ra_readv]
0-ec-read-ahead: unexpected offset (39976960 != 40108032) resetting
[2019-09-19 11:32:23.355035] T [MSGID: 0] [read-ahead.c:388:dispatch_requests]
0-ec-read-ahead: IN-TRANSIT at offset=40108032.
[2019-09-19 11:32:23.355047] T [MSGID: 0] [read-ahead.c:402:dispatch_requests]
0-ec-read-ahead: MISS at offset=40108032.
[2019-09-19 11:32:23.355839] T [MSGID: 0] [read-ahead.c:459:ra_readv]
0-ec-read-ahead: NEW REQ at offset=40239104 for size=131072
[2019-09-19 11:32:23.355843] T [MSGID: 0] [read-ahead.c:479:ra_readv]
0-ec-read-ahead: expected offset (40239104) when page_count=0
[2019-09-19 11:32:23.355850] T [MSGID: 0] [read-ahead.c:388:dispatch_requests]
0-ec-read-ahead: IN-TRANSIT at offset=40239104.
[2019-09-19 11:32:23.355859] T [MSGID: 0] [read-ahead.c:402:dispatch_requests]
0-ec-read-ahead: MISS at offset=40239104.
[2019-09-19 11:32:23.357077] T [MSGID: 0] [read-ahead.c:459:ra_readv]
0-ec-read-ahead: NEW REQ at offset=40370176 for size=131072
[2019-09-19 11:32:23.357088] T [MSGID: 0] [read-ahead.c:479:ra_readv]
0-ec-read-ahead: expected offset (40370176) when page_count=1
[2019-09-19 11:32:23.357101] T [MSGID: 0] [read-ahead.c:388:dispatch_requests]
0-ec-read-ahead: IN-TRANSIT at offset=40370176.
[2019-09-19 11:32:23.357110] T [MSGID: 0] [read-ahead.c:402:dispatch_requests]
0-ec-read-ahead: MISS at offset=40370176.
[2019-09-19 11:32:23.357562] T [MSGID: 0] [read-ahead.c:317:read_ahead]
0-ec-read-ahead: RA at offset=40501248
[2019-09-19 11:32:23.358147] T [MSGID: 0] [read-ahead.c:459:ra_readv]
0-ec-read-ahead: NEW REQ at offset=40501248 for size=131072
[2019-09-19 11:32:23.358155] T [MSGID: 0] [read-ahead.c:473:ra_readv]
0-ec-read-ahead: unexpected offset (40370176 != 40501248) resetting
[2019-09-19 11:32:23.358171] T [MSGID: 0] [read-ahead.c:388:dispatch_requests]
0-ec-read-ahead: IN-TRANSIT at offset=40501248.
[2019-09-19 11:32:23.358178] T [MSGID: 0] [read-ahead.c:402:dispatch_requests]
0-ec-read-ahead: MISS at offset=40501248.

My understanding of what happens is that 'dd' sends its requests one by one,
sequentially. The read-ahead xlator reads more pages than requested, so future
requests can be served directly from the cached data. At some point the kernel
also detects the sequential access pattern and starts sending multiple read
requests in parallel. Many of these requests are served very quickly by the
read-ahead xlator, but after a few of them the cache is empty.

Apparently there's a race when this happens:

At 2019-09-19 11:32:23.354954 a new request at offset 39976960 arrives. It's
the expected offset and the cache contains that page, so it's served
immediately. However, at 2019-09-19 11:32:23.355003 a new request arrives for
the next offset (40108032), but read-ahead has not yet updated the expected
offset, so it incorrectly concludes that the access is not sequential, drops
the cached data and starts sending requests to the bricks again. This seems to
cause the same offsets to be read more than once, increasing the read volume
on the bricks.

Raghavendra, can you take a look to see if it's an easy change?

-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.
