<p dir="ltr">As your issue is Network, consider changing the MTU if the infrastructure is allowing it.<br>
The tuned profiles are very important, as they control ratios for dumping data in memory to disk (this case gluster over network). You want to avoid keeping a lot of data in client's memory(in this case the gluster server), just to unleash it over network.</p>
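<p dir="ltr">A rough sketch for verifying both changes, assuming the interface is eth0, the peer host is gluster2, and a gluster tuned profile such as rhgs-sequential-io is available (all placeholder names, adjust to your setup):</p>
<p dir="ltr"># jumbo frames: both NICs and every switch in between must allow MTU 9000<br>
ip link set dev eth0 mtu 9000<br>
ping -M do -s 8972 gluster2    # 9000 minus 28 bytes of IP/ICMP headers, DF bit set<br>
tracepath gluster2<br>
# tuned: pick one of the gluster profiles you have installed<br>
tuned-adm list<br>
tuned-adm profile rhgs-sequential-io</p>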
<p dir="ltr">These 2 can be implemented online and I do not expect any issues.</p>
<p dir="ltr">Filesystem of bricks is important because the faster they soak data, the faster gluster can take more.<br></p>
<p dir="ltr">Of course, you need to reproduce it in test.</p>
<p dir="ltr">Also consider checking if there is any kind of backup running on the bricks. I have seen too many 'miracles' :D</p>
<p dir="ltr">Best Regards,<br>
Strahil Nikolov</p>
<div class="quote">On Jan 8, 2020 01:03, David Cunningham <dcunningham@voisonics.com> wrote:<br type='attribution'><blockquote class="quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>Hi Strahil,</div><div><br /></div><div>Thanks for that. The queue/scheduler file for the relevant disk reports "noop [deadline] cfq", so deadline is being used. It is using ext4, and I've verified that the MTU is 1500.</div><div><br /></div><div>We could change the filesystem from ext4 to xfs, but in this case we're not looking to tinker around the edges and get a small performance improvement - we need a very large improvement on the 114MBps of network traffic to make it usable.</div><div><br /></div><div>I think what we really need to do first is to reproduce the problem in testing, and then come back to possible solutions.</div><div><br /></div></div><br /><div class="elided-text"><div dir="ltr">On Tue, 7 Jan 2020 at 22:15, Strahil Nikolov <<a href="mailto:hunter86_bg@yahoo.com">hunter86_bg@yahoo.com</a>> wrote:<br /></div><blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb( 204 , 204 , 204 );padding-left:1ex"><div><div style="font-family:'courier new' , 'courier' , 'monaco' , monospace , sans-serif;font-size:16px"><div></div>
<div dir="ltr">To find the scheduler , find all pvs of the LV is providing your storage</div><div dir="ltr"><br /></div><div dir="ltr"><div>[root@ovirt1 ~]# df -Th /gluster_bricks/data_fast<br />Filesystem Type Size Used Avail Use% Mounted on<br />/dev/mapper/gluster_vg_nvme-gluster_lv_data_fast xfs 100G 39G 62G 39% /gluster_bricks/data_fast<br /><br /></div><div dir="ltr"><br /><div>[root@ovirt1 ~]# pvs | grep gluster_vg_nvme<br /> /dev/mapper/vdo_nvme gluster_vg_nvme lvm2 a-- <1024.00g 0<br /><br /></div></div><div dir="ltr"><div>[root@ovirt1 ~]# cat /etc/vdoconf.yml<br />####################################################################<br /># THIS FILE IS MACHINE GENERATED. DO NOT EDIT THIS FILE BY HAND.<br />####################################################################<br />config: !Configuration<br /> vdos:</div><div dir="ltr"> vdo_nvme: !VDOService<br /><div dir="ltr"><div> device: /dev/disk/by-id/nvme-ADATA_SX8200PNP_2J1120011596<br /><br /></div><div><br /></div><div dir="ltr"><div>[root@ovirt1 ~]# ll /dev/disk/by-id/nvme-ADATA_SX8200PNP_2J1120011596<br />lrwxrwxrwx. 1 root root 13 Dec 17 20:21 /dev/disk/by-id/nvme-ADATA_SX8200PNP_2J1120011596 -> ../../nvme0n1<br />[root@ovirt1 ~]# cat /sys/block/nvme0n1/queue/scheduler<br />[none] mq-deadline kyber</div><div><br /></div><div dir="ltr">Note: If device is under multipath , you need to check all paths (you can get them from 'multipath -ll' command).</div><div dir="ltr">The only scheduler you should avoid is "cfq" which was default for RHEL 6 & SLES 11.</div><div dir="ltr"><br /></div><div dir="ltr">XFS has better performance that ext-based systems.</div><div dir="ltr"><br /></div><div dir="ltr">Another tuning is to use Red hat's tuned profiles for gluster. You can extract them from (or newer if you find) <a href="ftp://ftp.redhat.com/redhat/linux/enterprise/7Server/en/RHS/SRPMS/redhat-storage-server-3.4.2.0-1.el7rhgs.src.rpm">ftp://ftp.redhat.com/redhat/linux/enterprise/7Server/en/RHS/SRPMS/redhat-storage-server-3.4.2.0-1.el7rhgs.src.rpm</a></div><div dir="ltr"><br /></div><div dir="ltr"><br /></div><div dir="ltr">About MTU - it's reducing the ammount of packages that the kernel has to process - but requires infrastructure to support that too. You can test by setting MTU on both sides to 9000 and then run 'tracepath remote-ip'. Also run a ping with large size without do not fragment flag -> 'ping -M do -s 8900 <</div></div></div></div></div></div></div></div></blockquote></div></blockquote></div>