<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Feb 5, 2019 at 10:57 PM Xavi Hernandez &lt;<a href="mailto:xhernandez@redhat.com">xhernandez@redhat.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr">On Wed, Feb 6, 2019 at 7:00 AM Poornima Gurusiddaiah &lt;<a href="mailto:pgurusid@redhat.com" target="_blank">pgurusid@redhat.com</a>&gt; wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="auto"><div><br><br><div class="gmail_quote"><div dir="ltr">On Tue, Feb 5, 2019, 10:53 PM Xavi Hernandez &lt;<a href="mailto:xhernandez@redhat.com" rel="noreferrer" target="_blank">xhernandez@redhat.com</a> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr">On Fri, Feb 1, 2019 at 1:51 PM Xavi Hernandez &lt;<a href="mailto:xhernandez@redhat.com" rel="noreferrer noreferrer" target="_blank">xhernandez@redhat.com</a>&gt; wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr">On Fri, Feb 1, 2019 at 1:25 PM Poornima Gurusiddaiah &lt;<a href="mailto:pgurusid@redhat.com" rel="noreferrer noreferrer" target="_blank">pgurusid@redhat.com</a>&gt; wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="auto">Can the threads be categorised to do certain kinds of fops?</div></blockquote><div><br></div><div>Could be, but creating multiple thread groups for different tasks is generally bad because many times you end up 
with lots of idle threads which waste resources and could increase contention. I think we should only differentiate threads if it&#39;s absolutely necessary.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="auto">Affinitise read/write to a certain set of threads, and the other metadata fops to another set of threads. So we limit the read/write threads and not the metadata threads? Also, if AIO is enabled in the backend, the threads will not be blocked on disk I/O, right? </div></blockquote><div><br></div><div>If we don&#39;t block the thread but we don&#39;t prevent more requests from going to the disk, then we&#39;ll probably have the same problem. Anyway, I&#39;ll try to run some tests with AIO to see if anything changes.</div></div></div></blockquote><div><br></div><div>I&#39;ve run some simple tests with AIO enabled and the results are not good. A simple dd takes &gt;25% more time. Multiple parallel dd take 35% more time to complete.</div></div></div></blockquote></div></div><div dir="auto"><br></div><div dir="auto"><br></div><div>Thank you. That is strange! I had a few questions: what tests are you running to measure the io-threads performance (not particularly AIO)? Is it dd from multiple clients?</div></div></div></blockquote><div><br></div><div>Yes, it&#39;s a bit strange. What I see is that many threads from the thread pool are active but using very little CPU. I also see an AIO thread for each brick, but its CPU usage is not big either. Wait time is always 0 (I think this is a side effect of AIO activity). However, system load grows very high: I&#39;ve seen around 50, while in the normal test without AIO it stays around 20-25.</div><div><br></div><div>Right now I&#39;m running the tests on a single machine (no real network communication) using an NVMe disk as storage. I use a single mount point. 
The tests I&#39;m running are these:</div><div><ul><li>Single dd, 128 GiB, blocks of 1 MiB</li><li>16 parallel dd, 8 GiB per dd, blocks of 1 MiB</li><li>fio in sequential write mode, direct I/O, blocks of 128k, 16 threads, 8 GiB per file</li><li>fio in sequential read mode, direct I/O, blocks of 128k, 16 threads, 8 GiB per file</li><li>fio in random write mode, direct I/O, blocks of 128k, 16 threads, 8 GiB per file</li><li>fio in random read mode, direct I/O, blocks of 128k, 16 threads, 8 GiB per file</li><li>smallfile create, 16 threads, 256 files per thread, 32 MiB per file (with one brick down, for the following test)</li><li>self-heal of an entire brick (from the previous smallfile test)</li><li>pgbench init phase with scale 100</li></ul><div>I run all these tests for a replica 3 volume and a disperse 4+2 volume.</div></div></div></div></blockquote><div><br></div><div><br></div><div>Are these performance results available somewhere? I am quite curious to understand the performance gains on NVMe!</div><div><br></div><div>Thanks,</div><div>Vijay </div></div></div>