[Gluster-devel] performance issues Manoj found in EC testing

Pranith Kumar Karampuri pkarampu at redhat.com
Mon Jun 27 07:12:35 UTC 2016


On Mon, Jun 27, 2016 at 11:52 AM, Xavier Hernandez <xhernandez at datalab.es>
wrote:

> Hi Manoj,
>
> I always enable the client-io-threads option for disperse volumes. It
> improves performance noticeably, most probably because of the problem you
> have detected.
>
> I don't see any other way to solve that problem.
>

I agree. I have updated the bug with the same information.


>
> I think it would be a lot better to have a true thread pool (and maybe an
> I/O thread pool shared by fuse, client and server xlators) in libglusterfs
> instead of the io-threads xlator. This would allow each xlator to decide
> when and what should be parallelized in a more intelligent way, since
> basing the decision solely on the fop type seems too simplistic to me.
>
> In the specific case of EC, there are a lot of operations to perform for a
> single high level fop, and not all of them require the same priority. Also
> some of them could be executed in parallel instead of sequentially.
>

I think it is high time we actually scheduled this (and decided for which
release) to get it into Gluster. Maybe you could send out a doc where we can
work out the details? I will be happy to explore options for integrating
io-threads and syncop/barrier with this infrastructure, based on the design.
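
To make the idea a bit more concrete, here is a minimal sketch of a shared
pool in which the caller, not the fop type, picks the priority of each piece
of work. It is Python for brevity and is not libglusterfs code; PriorityPool,
encode_fragment and the priority values are all hypothetical names invented
for illustration, and the XOR loop is only a stand-in for real EC encoding.

```python
import queue
import threading


class PriorityPool:
    """Hypothetical shared worker pool: a lower priority value runs first."""

    def __init__(self, workers=4):
        self._tasks = queue.PriorityQueue()
        self._seq = 0                      # tie-breaker: FIFO within a priority
        self._seq_lock = threading.Lock()
        self.errors = []                   # exceptions raised by tasks
        for _ in range(workers):
            threading.Thread(target=self._run, daemon=True).start()

    def submit(self, priority, fn, *args):
        """Queue fn(*args); the caller decides how urgent the work is."""
        with self._seq_lock:
            seq = self._seq
            self._seq += 1
        self._tasks.put((priority, seq, fn, args))

    def _run(self):
        while True:
            _priority, _seq, fn, args = self._tasks.get()
            try:
                fn(*args)
            except Exception as exc:       # keep the worker alive on failure
                self.errors.append(exc)
            finally:
                self._tasks.task_done()

    def wait(self):
        """Block until every queued task has finished."""
        self._tasks.join()


def encode_fragment(results, index, chunk):
    """Stand-in for per-fragment encoding work: XOR of all bytes."""
    acc = 0
    for byte in chunk:
        acc ^= byte
    results[index] = acc


# One "write" split into independent pieces that may run in parallel.
data = bytes((i * 7) % 251 for i in range(1024))
chunks = [data[i:i + 256] for i in range(0, len(data), 256)]
results = [None] * len(chunks)

pool = PriorityPool(workers=4)
for i, chunk in enumerate(chunks):
    pool.submit(0, encode_fragment, results, i, chunk)   # urgent work
pool.submit(10, lambda: None)                            # background work
pool.wait()
```

The point of the sketch is only that a single high-level operation can be
broken into several tasks with different priorities and run by common
workers; how this would map onto fuse, client and server xlators is exactly
what the design doc would need to settle.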


>
> Xavi
>
>
> On 25/06/16 19:42, Manoj Pillai wrote:
>
>>
>> ----- Original Message -----
>>
>>> From: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
>>> To: "Xavier Hernandez" <xhernandez at datalab.es>
>>> Cc: "Manoj Pillai" <mpillai at redhat.com>, "Gluster Devel" <gluster-devel at gluster.org>
>>> Sent: Thursday, June 23, 2016 8:50:44 PM
>>> Subject: performance issues Manoj found in EC testing
>>>
>>> Hi Xavi,
>>> Meet Manoj from the performance team at Red Hat. He has been testing EC
>>> performance in his stretch clusters. He found some interesting things we
>>> would like to share with you.
>>>
>>> 1) When we perform multiple streams of big-file writes (12 parallel dds, I
>>> think), he found one thread to be always hot (99% CPU). He was asking me
>>> whether the fuse_reader thread does any extra processing in EC compared to
>>> replicate. Initially I thought it would just take the lock and the epoll
>>> threads would perform the encoding, but later I realized that once we have
>>> the lock and version details, subsequent writes on the file are encoded in
>>> the same thread that enters EC. write-behind could play a role and make
>>> the writes reach EC in an epoll thread, but we consistently saw just one
>>> hot thread, not several. We will be able to confirm this in tomorrow's
>>> testing.
>>>
>>> 2) One more thing, which Raghavendra G found: our current implementation
>>> of epoll doesn't let other epoll threads pick up messages from a socket
>>> while one thread is processing a message from that socket. In EC's case
>>> that processing can be the encoding of a write or the decoding of a read.
>>> This prevents replies to operations on different files from being
>>> processed in parallel. He thinks this can be fixed for 3.9.
>>>
>>> Manoj will be raising a bug to gather all his findings. I just wanted to
>>> introduce him and let you know the interesting things he is finding
>>> before you see the bug :-).
>>> --
>>> Pranith
>>>
>>
>> Thanks, Pranith :).
>>
>> Here's the bug: https://bugzilla.redhat.com/show_bug.cgi?id=1349953
>>
>> Comparing EC and replica-2 runs, the hot thread is seen in both cases, so
>> I have not opened this as an EC bug. But my initial impression is that the
>> performance impact for EC is particularly bad (details in the bug).
>>
>> -- Manoj
>>
>>


-- 
Pranith