[Gluster-devel] Performance experiments with io-stats translator

Xavier Hernandez xhernandez at datalab.es
Wed Jun 7 06:29:46 UTC 2017


Hi Krutika,

On 06/06/17 13:35, Krutika Dhananjay wrote:
> Hi,
>
> As part of identifying performance bottlenecks within the gluster stack
> for the VM image store use-case, I loaded io-stats at multiple points on
> the client and brick stacks and ran a randrd test using fio from within
> the hosted VMs in parallel.
>
> Before I get to the results, a little bit about the configuration ...
>
> 3 node cluster; 1x3 plain replicate volume with group virt settings,
> direct-io.
> 3 FUSE clients, one per node in the cluster (which implies reads are
> served from the replica that is local to the client).
>
> io-stats was loaded at the following places:
> On the client stack: Above client-io-threads and above protocol/client-0
> (the first child of AFR).
> On the brick stack: Below protocol/server, above and below io-threads
> and just above storage/posix.
>
> Based on a 60-second run of the randrd test and subsequent analysis of
> the stats dumped by the individual io-stats instances, the following is
> what I found:
>
> Translator Position                     Avg Latency of READ fop
>                                         as seen by this translator
>
> 1. parent of client-io-threads          1666us
>
> ∆(1,2) = 50us
>
> 2. parent of protocol/client-0          1616us
>
> ∆(2,3) = 1453us
>
> ----------------- end of client stack ---------------------
> ----------------- beginning of brick stack ----------------
>
> 3. child of protocol/server             163us
>
> ∆(3,4) = 7us
>
> 4. parent of io-threads                 156us
>
> ∆(4,5) = 20us
>
> 5. child of io-threads                  136us
>
> ∆(5,6) = 11us
>
> 6. parent of storage/posix              125us
> ...
> ---------------- end of brick stack ------------------------
>
> So it seems like the biggest bottleneck here is a combination of the
> network, epoll and the rpc layer?
> I must admit I am no expert on networks, but I'm assuming that if the
> client is reading from the local brick, then the latency contribution
> from the actual network won't be much, in which case the bulk of the
> latency is coming from epoll, the rpc layer, etc. at both the client and
> brick ends? Please correct me if I'm wrong.
>
> I will, of course, do some more runs and confirm if the pattern is
> consistent.

Very interesting. These results are similar to what I observed when
doing some ec tests.
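
For anyone who wants to redo the arithmetic on later runs, here is a
minimal sketch in Python that recomputes the per-hop deltas from the
averages quoted above and shows each hop's share of the latency seen at
the top of the client stack. The variable and function names are only
illustrative (they are not part of io-stats or any gluster tool), and
the numbers are simply copied from the table in the quoted mail.

```python
# Average READ latency in microseconds reported by each io-stats instance,
# copied from the quoted table, ordered from the top of the client stack down.
latencies_us = [
    ("parent of client-io-threads", 1666),
    ("parent of protocol/client-0", 1616),
    ("child of protocol/server", 163),
    ("parent of io-threads", 156),
    ("child of io-threads", 136),
    ("parent of storage/posix", 125),
]


def hop_deltas(points):
    """Return (upper, lower, delta_us) for each adjacent pair of points."""
    return [
        (upper, lower, upper_lat - lower_lat)
        for (upper, upper_lat), (lower, lower_lat) in zip(points, points[1:])
    ]


deltas = hop_deltas(latencies_us)
total_us = latencies_us[0][1]  # latency seen at the topmost measurement point

for upper, lower, delta in deltas:
    share = delta / total_us
    print(f"{upper} -> {lower}: {delta} us ({share:.1%} of total)")
```

With these numbers, the hop between protocol/client-0 and the child of
protocol/server accounts for 1453us of the 1666us total.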

My personal feeling is that there's high serialization and/or contention 
in the network layer caused by mutexes, but I don't have data to support 
that.

Xavi

>
> -Krutika
>
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-devel
>


