[Bugs] [Bug 1593199] New: Stack overflow in readdirp with parallel-readdir enabled
bugzilla at redhat.com
Wed Jun 20 09:33:05 UTC 2018
https://bugzilla.redhat.com/show_bug.cgi?id=1593199
Bug ID: 1593199
Summary: Stack overflow in readdirp with parallel-readdir enabled
Product: GlusterFS
Version: 3.12
Component: distribute
Assignee: bugs at gluster.org
Reporter: nbalacha at redhat.com
CC: bugs at gluster.org
Description of problem:
Wind/unwind in readdirp causes the stack to grow if parallel-readdir is
enabled.
Commit b9406e210717621bc672a63c1cbd1b0183834056 changed DHT to continue to wind
readdirp to its child xlator as long as there is space in the buffer.
DHT also strips out certain entries returned in the readdirp response from the
child xlator (linkto files, directories whose hashed subvol is not the child
that the call was wound to, etc).
If the buffer has just enough space left to hold only one or two entries
and the rda cache has lots of entries which dht would strip out, this can cause
the stack to grow at an alarming rate and eventually overflow, killing the
client process.
Assume that the buffer is almost full in dht_readdirp_cbk (for example,
local->size is 4096 and local->filled is 3800):
1. dht will wind the readdirp call to its rda child xlator.
2. rda sees that there is only enough space to return one entry, so it returns
one entry from its cache and unwinds to dht_readdirp_cbk.
3. dht_readdirp_cbk processes the single entry returned, which in this case is a
linkto file, and skips it (count == 0). As the buffer is still not full, it
winds to the same rda xlator again.
4. rda, in its turn, returns one more entry (also a linkto file) from its cache
to dht.
This process (steps 3 and 4) continues with rda returning 1 linkto file entry
each time and dht winding to rda again. Eventually the stack overflows and the
process crashes.
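
The loop described above can be illustrated with a small toy model. This is
only a sketch in Python, not GlusterFS code: the names Entry, RdaCache and
dht_readdirp are hypothetical, the 200-byte entry size is an assumption, and
the re-wind condition (wind again while the child returned something and the
buffer is not full) is a simplification of what dht_readdirp_cbk does. The
point is only to show that each stripped entry adds one more nested level
when the reply is handled in the same stack as the wind.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    name: str
    size: int         # bytes the entry would occupy in the readdirp buffer (assumed)
    is_linkto: bool   # True if DHT would strip this entry from the response


class RdaCache:
    """Toy stand-in for the rda child xlator's cache (hypothetical)."""

    def __init__(self, entries):
        self.entries = list(entries)
        self.offset = 0

    def readdirp(self, space):
        """Return as many cached entries as fit in `space` bytes."""
        out = []
        while self.offset < len(self.entries):
            entry = self.entries[self.offset]
            if entry.size > space:
                break
            out.append(entry)
            space -= entry.size
            self.offset += 1
        return out


def dht_readdirp(rda, size, filled, depth=1):
    """Wind readdirp to rda and process the reply in the same stack.

    Returns (filled, depth), where depth is the nesting level reached.
    """
    entries = rda.readdirp(size - filled)  # wind; the reply is handled inline
    if not entries:
        return filled, depth               # nothing returned: unwind to the caller
    for entry in entries:
        if entry.is_linkto:
            continue                       # stripped: consumes no buffer space
        filled += entry.size
    if filled < size:
        # Buffer still has room, so wind again from inside the callback.
        return dht_readdirp(rda, size, filled, depth + 1)
    return filled, depth
```

With size 4096, filled 3800 and a cache of 200 linkto entries, this model
nests 201 levels deep and returns with filled still at 3800. With 200 regular
entries it stops at depth 2, because the first returned entry uses up the
remaining space. In the model the depth grows linearly with the number of
stripped entries, which is consistent with thousands of linkto files being
enough to exhaust the client's stack.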
Version-Release number of selected component (if applicable):
How reproducible:
Tried once
Steps to Reproduce:
I was able to reproduce the crash with a 2-brick distribute volume with
thousands of entries and thousands of linkto files on one of the bricks.
FUSE-mount the volume and run
ls -l <mountpoint>
Actual results:
Client mount process crashes
Expected results:
Client should not crash
Additional info: