[Gluster-devel] Latency analysis of GlusterFS' network layer for pgbench

Raghavendra Gowdappa rgowdapp at redhat.com
Wed Dec 26 03:05:25 UTC 2018


On Mon, Dec 24, 2018 at 6:05 PM Raghavendra Gowdappa <rgowdapp at redhat.com>
wrote:

>
>
> On Mon, Dec 24, 2018 at 3:40 PM Sankarshan Mukhopadhyay <
> sankarshan.mukhopadhyay at gmail.com> wrote:
>
>> [pulling the conclusions up to enable better in-line]
>>
>> > Conclusions:
>> >
>> > We should never have a volume with caching-related xlators disabled.
>> The price we pay for it is too high. We need to make them work consistently
>> and aggressively to avoid as many requests as we can.
>>
>> Are there current issues in terms of behavior which are known/observed
>> when these are enabled?
>>
>
> We did have issues with pgbench in the past, but they've been fixed.
> Please refer to bz [1] for details. On 5.1, it runs successfully with all
> caching-related xlators enabled. Having said that, the only performance
> xlators which gave improved performance were open-behind and write-behind
> [2] (write-behind had some issues, which will be fixed by [3], and we'll
> have to measure performance again once that fix is in). For some reason,
> read-side caching didn't improve transactions per second.
>

One possible reason why read caching in glusterfs didn't show increased
performance could be that VFS already provides read-ahead (of 128KB) and a
page-cache. Whatever performance boost caching can provide may already be
delivered by the VFS page-cache itself, making glusterfs' own read caching
redundant. I'll run some tests to gather evidence to (dis)prove this
hypothesis.
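
One quick way to gather that evidence is to compare a cold read (with the
kernel's cached pages for the file dropped) against a warm read of the same
file on the mount. Below is a minimal sketch, assuming a Linux client and an
arbitrary test file; it is not gluster code, and the 128KB buffer is simply
sized to match the read-ahead window mentioned above.

/*
 * Minimal sketch (not part of gluster): compare cold vs warm reads of a file
 * to see how much of the benefit comes from the kernel page-cache alone.
 * The file path and buffer size are arbitrary choices for illustration.
 *
 * Build: gcc -O2 -o pagecache_test pagecache_test.c
 * Usage: ./pagecache_test /mnt/glusterfs/somefile
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

static double read_all(const char *path)
{
        char buf[128 * 1024];   /* matches the 128KB read-ahead mentioned above */
        struct timespec t0, t1;
        ssize_t n;
        int fd = open(path, O_RDONLY);

        if (fd < 0) {
                perror("open");
                exit(1);
        }

        clock_gettime(CLOCK_MONOTONIC, &t0);
        while ((n = read(fd, buf, sizeof(buf))) > 0)
                ;
        clock_gettime(CLOCK_MONOTONIC, &t1);

        close(fd);
        return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(int argc, char *argv[])
{
        if (argc != 2) {
                fprintf(stderr, "usage: %s <file>\n", argv[0]);
                return 1;
        }

        /* First pass: ask the kernel to drop its cached pages for this file,
         * so the read has to go down to glusterfs (and its caches, if any). */
        int fd = open(argv[1], O_RDONLY);
        if (fd >= 0) {
                (void) posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
                close(fd);
        }
        printf("cold read: %.6f s\n", read_all(argv[1]));

        /* Second pass: the data should now sit in the VFS page-cache, so any
         * caching below the kernel has little left to contribute. */
        printf("warm read: %.6f s\n", read_all(argv[1]));
        return 0;
}

If the warm read is already close to memory speed, the read-side caching
xlators have little left to add for this access pattern, which would support
the hypothesis above.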

I am working on this problem currently.

> Note that these bugs measure the
> transaction phase of pgbench, but what xavi measured in his mail is the init
> phase. Nevertheless, evaluation of read caching (metadata/data) will still
> be relevant for the init phase too.
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1512691
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1629589#c4
> [3] https://bugzilla.redhat.com/show_bug.cgi?id=1648781
>
>
>> > We need to analyze client/server xlators deeper to see if we can avoid
>> some delays. However, optimizing something that is already at the
>> microsecond level can be very hard.
>>
>> That is true - are there any significant gains which can be accrued by
>> putting effort here, or should this be a lower priority?
>>
>
> The problem identified by xavi is also the one we (Manoj, Krutika, Milind
> and I) had encountered in the past [4]. The solution we used was to have
> multiple rpc connections between a single brick and client, and it
> indeed fixed the bottleneck. So, there is definitely work involved here -
> either fix the single-connection model or go with a multiple-connection
> model. It's preferable to improve the single connection and resort to multiple
> connections only if the bottlenecks in a single connection are not fixable.
> Personally, I think this is high priority, along with having appropriate
> client-side caching.
>
> [4] https://bugzilla.redhat.com/show_bug.cgi?id=1467614#c52
>
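
To make the multiple-connection model concrete, here is a minimal, purely
illustrative sketch of round-robin dispatch over a small pool of connections.
The names (conn_pool, pick_conn, NUM_CONNS) are invented for this example and
do not exist in gluster's transport code; the point is only that independent
requests are no longer serialized behind a single socket and its polling
thread.

/*
 * Illustrative only: a tiny round-robin pool standing in for "multiple rpc
 * connections between a single brick and client". None of these names come
 * from gluster; this just shows why the model relieves a single-socket
 * bottleneck.
 */
#include <stdatomic.h>
#include <stdio.h>

#define NUM_CONNS 4

struct conn {
        int id;                 /* stand-in for a socket/transport object */
};

struct conn_pool {
        struct conn conns[NUM_CONNS];
        atomic_uint next;
};

/* Each request picks the next connection, so requests that used to queue
 * behind one socket get spread across NUM_CONNS sockets (and their polling
 * threads on the brick side). */
static struct conn *pick_conn(struct conn_pool *pool)
{
        unsigned idx = atomic_fetch_add(&pool->next, 1) % NUM_CONNS;
        return &pool->conns[idx];
}

int main(void)
{
        struct conn_pool pool;

        atomic_init(&pool.next, 0);
        for (int i = 0; i < NUM_CONNS; i++)
                pool.conns[i].id = i;

        /* Simulate dispatching 8 READV requests. */
        for (int req = 0; req < 8; req++)
                printf("request %d -> connection %d\n", req,
                       pick_conn(&pool)->id);

        return 0;
}

Whether we end up fixing the single-connection path or adopting something
along these lines, the first step is the same: measure where requests actually
queue up on that one connection.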
>
>> > We need to determine what causes the fluctuations on the brick side and
>> avoid them.
>> > This scenario is very similar to a smallfile/metadata workload, so this
>> is probably one important cause of its bad performance.
>>
>> What kind of instrumentation is required to enable the determination?
>>
>> On Fri, Dec 21, 2018 at 1:48 PM Xavi Hernandez <xhernandez at redhat.com>
>> wrote:
>> >
>> > Hi,
>> >
>> > I've done some tracing of the latency that network layer introduces in
>> gluster. I've made the analysis as part of the pgbench performance issue
>> (in particular the initialization and scaling phase), so I decided to look
>> at READV for this particular workload, but I think the results can be
>> extrapolated to other operations that also have small latency (cached data
>> from FS for example).
>> >
>> > Note that measuring latencies introduces some latency. It consists of a
>> call to clock_get_time() for each probe point, so the real latency will be
>> a bit lower, but still proportional to these numbers.
>> >
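
To put that probe overhead into perspective, the cost of the clock call itself
can be measured directly. A minimal sketch follows, using clock_gettime()
(assuming that is what each probe point ultimately calls); it is not part of
the gluster instrumentation.

/*
 * Rough estimate of per-probe overhead: time a large number of back-to-back
 * clock_gettime() calls and report the average cost of one call. This is a
 * sketch to put the probe overhead mentioned above into perspective, not part
 * of the gluster instrumentation itself.
 */
#include <stdio.h>
#include <time.h>

#define ITERATIONS 10000000UL

int main(void)
{
        struct timespec start, end, tmp;

        clock_gettime(CLOCK_MONOTONIC, &start);
        for (unsigned long i = 0; i < ITERATIONS; i++)
                clock_gettime(CLOCK_MONOTONIC, &tmp);
        clock_gettime(CLOCK_MONOTONIC, &end);

        double total = (end.tv_sec - start.tv_sec) +
                       (end.tv_nsec - start.tv_nsec) / 1e9;
        printf("%.1f ns per clock_gettime() call\n",
               total / ITERATIONS * 1e9);
        return 0;
}

On recent Linux systems this typically comes out at a few tens of nanoseconds
per call thanks to the vDSO - small, but not negligible when the latencies
being measured are single-digit microseconds.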
>>
>> [snip]
>