[Gluster-devel] performance issues Manoj found in EC testing
mpillai at redhat.com
Sat Jun 25 17:42:57 UTC 2016
----- Original Message -----
> From: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
> To: "Xavier Hernandez" <xhernandez at datalab.es>
> Cc: "Manoj Pillai" <mpillai at redhat.com>, "Gluster Devel" <gluster-devel at gluster.org>
> Sent: Thursday, June 23, 2016 8:50:44 PM
> Subject: performance issues Manoj found in EC testing
> hi Xavi,
> Meet Manoj from the performance team at Red Hat. He has been testing EC
> performance in his stretch clusters. He found some interesting things we
> would like to share with you.
> 1) When we perform multiple streams of big file writes (12 parallel dds, I
> think), he found one thread to be always hot (99% CPU). He was asking me
> whether the fuse_reader thread does any extra processing in EC compared to
> replicate. Initially I thought it would just take the lock and the epoll
> threads would perform the encoding, but I later realized that once we have
> the lock and version details, subsequent writes on the file are encoded in
> the same thread that enters EC. write-behind could play a role and make the
> writes come to EC in an epoll thread, but we consistently saw just one hot
> thread, not multiple threads. We will be able to confirm this in
> tomorrow's testing.
> 2) This is one more thing Raghavendra G found: our current implementation
> of epoll doesn't let other epoll threads pick up messages from a socket
> while one thread is processing a message from that socket. In EC's case
> that processing can be the encoding of a write or the decoding of a read.
> This prevents replies to operations on different files from being
> processed in parallel. He thinks this can be fixed for 3.9.
> Manoj will be raising a bug to gather all his findings. I just wanted to
> introduce him and let you know the interesting things he is finding before
> you see the bug :-).
Thanks, Pranith :).
Here's the bug: https://bugzilla.redhat.com/show_bug.cgi?id=1349953
Comparing EC and replica-2 runs, the hot thread is seen in both cases, so
I have not opened this as an EC bug. But my initial impression is that the
performance impact for EC is particularly bad (details in the bug).
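
To illustrate point 2 in the quoted message, here is a toy timing model
of one socket served by several threads. It is only a sketch under
stated assumptions and not GlusterFS code: the function name, its
parameters, and the cost figures are all hypothetical. It compares
holding the socket until a message is fully processed against releasing
the socket as soon as the message has been read.

```python
import heapq


def makespan(n_messages, read_cost, process_cost, n_threads, rearm_early):
    """Time to finish all messages arriving on ONE socket (toy model).

    Hypothetical model, not GlusterFS code. The thread that reads a
    message also processes it (e.g. EC encode/decode). If rearm_early
    is False, the socket stays unavailable to other threads until
    processing is done; if True, it is released right after the read.
    """
    socket_free_at = 0
    threads = [0] * n_threads  # time at which each thread becomes idle
    heapq.heapify(threads)
    finish = 0
    for _ in range(n_messages):
        thread_free_at = heapq.heappop(threads)  # earliest idle thread
        start = max(socket_free_at, thread_free_at)
        read_done = start + read_cost
        done = read_done + process_cost
        # The only difference between the two policies is when the
        # socket becomes available to the next thread.
        socket_free_at = read_done if rearm_early else done
        heapq.heappush(threads, done)
        finish = max(finish, done)
    return finish


if __name__ == "__main__":
    # 4 messages, read cost 1, processing cost 9, 4 threads.
    print(makespan(4, 1, 9, 4, rearm_early=False))
    print(makespan(4, 1, 9, 4, rearm_early=True))
```

With these made-up costs the first call gives 40 time units (processing
is fully serialized despite the 4 threads) and the second gives 13. The
numbers only show the shape of the effect; real costs would have to come
from measurements such as those in the bug.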