[Gluster-devel] performance issues Manoj found in EC testing

Mon Jun 27 12:18:24 UTC 2016

On Mon, Jun 27, 2016 at 12:42 PM, Pranith Kumar Karampuri <
pkarampu at redhat.com> wrote:

>
>
> On Mon, Jun 27, 2016 at 11:52 AM, Xavier Hernandez <xhernandez at datalab.es>
> wrote:
>
>> Hi Manoj,
>>
>> I always enable the client-io-threads option for disperse volumes. It
>> improves performance noticeably, most probably because of the problem you
>> have detected.
>>
>> I don't see any other way to solve that problem.
>>
>
> I agree. I have updated the bug with the same information.
>
>
>>
>> I think it would be a lot better to have a true thread pool (and maybe an
>> I/O thread pool shared by fuse, client and server xlators) in libglusterfs
>> instead of the io-threads xlator. This would allow each xlator to decide
>> when and what should be parallelized in a more intelligent way, since
>> basing the decision solely on the fop type seems too simplistic to me.
>>
>> In the specific case of EC, there are a lot of operations to perform for
>> a single high level fop, and not all of them require the same priority.
>> Also some of them could be executed in parallel instead of sequentially.
>>
>
> I think it is high time we actually scheduled this (and decided for which
> release) to get it into gluster. Maybe you could send out a doc where we
> can work out the details? I will be happy to explore options for
> integrating io-threads and syncop/barrier with this infrastructure,
> depending on the design.
>

I was just wondering why we can't reuse the synctask framework. It already
scales up and down based on the number of tasks, and it uses at most 16
threads. Whatever we want executed in parallel, we can wrap in a synctask
and run. Would that be good enough?
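
To make the idea concrete, here is a rough sketch of "wrap each parallelizable
piece of work in a task and hand it to a bounded pool". It is written in
Python rather than C and does not use the real synctask API; encode_fragment
and encode_in_parallel are made-up names, and the XOR is only a placeholder
for real encoding work. The only detail taken from this thread is the ceiling
of 16 threads. Note that Python's ThreadPoolExecutor starts workers on demand
up to the limit but does not shrink the pool again while it is open, so it
only approximates the scale up/down behaviour described above.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 16  # same ceiling as the synctask limit mentioned in this mail


def encode_fragment(index, data):
    """Hypothetical stand-in for one unit of EC work (not a gluster function)."""
    # XOR every byte with the fragment index; a placeholder for real encoding.
    return bytes(b ^ index for b in data)


def encode_in_parallel(data, fragments):
    """Submit one task per fragment to a bounded pool; return results in order."""
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        futures = [pool.submit(encode_fragment, i, data) for i in range(fragments)]
        # Waiting on each future in submission order keeps the output ordered.
        return [f.result() for f in futures]
```

The caller blocks until every task has finished, which is roughly what a
barrier around a group of synctasks would give us. Whether that is good enough
for EC depends on the priority question Xavi raised, which this sketch ignores.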

>
>
>>
>> Xavi
>>
>>
>> On 25/06/16 19:42, Manoj Pillai wrote:
>>
>>>
>>> ----- Original Message -----
>>>
>>>> From: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
>>>> To: "Xavier Hernandez" <xhernandez at datalab.es>
>>>> Cc: "Manoj Pillai" <mpillai at redhat.com>, "Gluster Devel" <
>>>> gluster-devel at gluster.org>
>>>> Sent: Thursday, June 23, 2016 8:50:44 PM
>>>> Subject: performance issues Manoj found in EC testing
>>>>
>>>> Hi Xavi,
>>>>           Meet Manoj from the performance team at Red Hat. He has been
>>>> testing EC performance in his stretch clusters. He found some
>>>> interesting things we would like to share with you.
>>>>
>>>> 1) When we run multiple streams of big-file writes (12 parallel dds, I
>>>> think), he found one thread to be always hot (99% CPU). He asked me
>>>> whether the fuse_reader thread does any extra processing in EC compared
>>>> to replicate. Initially I thought it would just take the lock and the
>>>> epoll threads would perform the encoding, but I later realized that once
>>>> we have the lock and version details, subsequent writes on the file are
>>>> encoded in the same thread that enters EC. write-behind could play a
>>>> role and make the writes reach EC in an epoll thread, but we
>>>> consistently saw just one hot thread, not multiple. We will be able to
>>>> confirm this in tomorrow's testing.
>>>>
>>>> 2) Raghavendra G found one more thing: our current implementation of
>>>> epoll doesn't let other epoll threads pick up messages from a socket
>>>> while one thread is processing a message from that socket. In EC's case
>>>> that processing can be the encoding of a write or the decoding of a
>>>> read. This prevents replies to operations on different files from being
>>>> processed in parallel. He thinks this can be fixed for 3.9.
>>>>
>>>> Manoj will be raising a bug to gather all his findings. I just wanted to
>>>> introduce him and let you know the interesting things he is finding
>>>> before you see the bug :-).
>>>> --
>>>> Pranith
>>>>
>>>
>>> Thanks, Pranith :).
>>>
>>> Here's the bug: https://bugzilla.redhat.com/show_bug.cgi?id=1349953
>>>
>>> Comparing EC and replica-2 runs, the hot thread is seen in both cases, so
>>> I have not opened this as an EC bug. But my initial impression is that
>>> the performance impact for EC is particularly bad (details in the bug).
>>>
>>> -- Manoj
>>>
>>>
>
>
> --
> Pranith
>

-- 
Pranith