<div dir="ltr"><div><div><div>Hi,<br><br></div>This is what my job file contains:<br><br>[global]<br>ioengine=libaio<br>#unified_rw_reporting=1<br>randrepeat=1<br>norandommap=1<br>group_reporting<br>direct=1<br>runtime=60<br>thread<br>size=16g<br><br><br>[workload]<br>bs=4k<br>rw=randread<br>iodepth=8<br>numjobs=1<br>file_service_type=random<br>filename=/perf5/iotest/fio_5<br>filename=/perf6/iotest/fio_6<br>filename=/perf7/iotest/fio_7<br>filename=/perf8/iotest/fio_8<br><br></div>I have 3 vms reading from one mount, and each of these vms is running the above job in parallel.<br><br></div>-Krutika<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Jun 6, 2017 at 9:14 PM, Manoj Pillai <span dir="ltr"><<a href="mailto:mpillai@redhat.com" target="_blank">mpillai@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><div><div class="h5">On Tue, Jun 6, 2017 at 5:05 PM, Krutika Dhananjay <span dir="ltr"><<a href="mailto:kdhananj@redhat.com" target="_blank">kdhananj@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div><div><div><div><div><div><div><div><div><div><div><div><div>Hi,<br><br></div>As part of identifying performance bottlenecks within gluster stack for VM image store use-case, I loaded io-stats at multiple points on the client and brick stack and ran randrd test using fio from within the hosted vms in parallel.<br><br></div>Before I get to the results, a little bit about the configuration ...<br><br></div>3 node cluster; 1x3 plain replicate volume with group virt settings, direct-io.<br></div><div>3 FUSE clients, one per node in the cluster (which implies reads are served from the replica that is local to the client).<br><br></div>io-stats was loaded at the following places:<br></div>On the client stack: Above client-io-threads and above protocol/client-0 (the first child of AFR).<br></div>On the brick stack: Below protocol/server, above and below io-threads and just above storage/posix.<br><br></div>Based on a 60-second run of randrd test and subsequent analysis of the stats dumped by the individual io-stats instances, the following is what I found:<br><br></div><div><u><b>Translator Position</b></u><b> </b><u><b>Avg Latency of READ fop as seen by this translator</b></u><br></div><div><br></div><div>1. parent of client-io-threads <wbr> 1666us<br><br></div><div><span style="color:rgb(34,34,34);font-family:verdana,arial,helvetica,sans-serif;font-size:14px;font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial;display:inline;float:none">∆ (1,2) = 50us<br><br></span></div><div>2. parent of protocol/client-0 1616us<br><br></div><div><span style="color:rgb(34,34,34);font-family:verdana,arial,helvetica,sans-serif;font-size:14px;font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial;display:inline;float:none">∆<span class="m_-9121270971748928698m_7978759376296577183m_-5265424521932273188m_5596340466479769558gmail-"> (2,3) = 1453us<br><br></span></span></div><div>----------------- end of client stack ---------------------<br></div><div>----------------- beginning of brick stack -----------<br></div><div><br></div><div>3. child of protocol/server <wbr> 163us<br><br></div><div><span style="color:rgb(34,34,34);font-family:verdana,arial,helvetica,sans-serif;font-size:14px;font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial;display:inline;float:none">∆<span class="m_-9121270971748928698m_7978759376296577183m_-5265424521932273188m_5596340466479769558gmail-"> </span>(3,4) = 7us</span><br></div><div><br></div><div>4. parent of io-threads <wbr> 156us<br><br></div><div><span style="color:rgb(34,34,34);font-family:verdana,arial,helvetica,sans-serif;font-size:14px;font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial;display:inline;float:none">∆<span class="m_-9121270971748928698m_7978759376296577183m_-5265424521932273188m_5596340466479769558gmail-"> (4,5) </span>= 20us</span><br></div><div><br></div><div>5. child-of-io-threads <wbr> 136us<br><br></div><div><span style="color:rgb(34,34,34);font-family:verdana,arial,helvetica,sans-serif;font-size:14px;font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial;display:inline;float:none">∆ (5,6) = 11us</span><br></div><div><br></div><div>6. parent of storage/posix <wbr> 125us<br>...<br></div><div>---------------- end of brick stack ------------------------<br></div><div><br></div><div>So it seems like the biggest bottleneck here is a combination of the network + epoll, rpc layer?<br></div><div>I must admit I am no expert with networks, but I'm assuming if the client is reading from the local brick, then<br></div><div>even latency contribution from the actual network won't be much, in which case bulk of the latency is coming from epoll, rpc layer, etc at both client and brick end? Please correct me if I'm wrong.<br><br></div><div>I will, of course, do some more runs and confirm if the pattern is consistent.<span class="m_-9121270971748928698m_7978759376296577183HOEnZb"><font color="#888888"><br><br></font></span></div><span class="m_-9121270971748928698m_7978759376296577183HOEnZb"><font color="#888888"><div>-Krutika<br></div></font></span></div></div></div></div></div></div></div>
<br></blockquote><div><br></div></div></div><div>Really interesting numbers! How many concurrent requests are in flight in this test? Could you post the fio job? I'm wondering if/how these latency numbers change if you reduce the number of concurrent requests.</div><span class="HOEnZb"><font color="#888888"><div><br></div><div>-- Manoj</div><div><br></div></font></span></div></div></div>
</blockquote></div><br></div>