<div dir="ltr">Thank you Darrell, now I have clear steps on what to do. The data is very valuable, so 2x mirror + arbiter or 3 replica nodes would be the setup.<div>Just for clarification: we currently have LustreFS, which is nice but has no redundancy. I am not using it for VMs. The workload is as follows: gluster should be mounted on multiple nodes, and the connection is InfiniBand or 10Gbit. The clients pull the data and run data analysis. The IO pattern varies a lot, from 26MB blocks to random 1k IO, with different codes and different projects. I am thinking of putting all &lt;128K files on the special device (yes, I am on the zfs 2.0.6 branch). On gluster I have seen that the .glusterfs folder has a lot of small folders and files. Would it improve performance if I move them to NVMe as well, or is it better to increase the RAM (I can't now, but in the future)?</div><div>Unfortunately I cannot add more RAM, but your tuning considerations are an important note.</div><div>   a.</div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Dec 14, 2021 at 12:25 AM Darrell Budic &lt;<a href="mailto:budic@onholyground.com">budic@onholyground.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div style="overflow-wrap: break-word;"><div>A few thoughts from another ZFS backend user:</div><div><br></div>ZFS:<div>Use arcstats to look at your cache use over time and consider:<br><div><span style="white-space:pre-wrap">        </span>Don’t mirror your cache drives; use them as 2x cache volumes to increase available cache.<div><span style="white-space:pre-wrap">  </span><span style="color:rgb(0,0,0)">Add more RAM. 
Lots more RAM (if I’m reading that right and you have 32GB RAM per ZFS server).</span></div><div><span style="color:rgb(0,0,0);white-space:pre-wrap">     </span><font color="#000000"><span>Adjust ZFS’s max ARC size upwards if you have lots of RAM.</span></font></div><div><font color="#000000"><span style="white-space:pre-wrap">        </span>Try more metadata caching &amp; less content caching if your workload is find-heavy.</font></div><div><font color="#000000">Compression on these volumes could help improve IO on the RAIDZ2s, but you’ll have to copy the data on again with compression enabled if you didn’t already have it enabled. Different zstd levels are worth evaluating here.</font></div><div><font color="#000000">Read up on recordsize and consider whether you would get any performance benefit from 64K, or maybe something larger for your large data; it depends on where the reads are being done. </font></div><div>Use relatime or no atime tracking.</div><div>Upgrade to ZFS 2.0.6 if you aren’t already at 2 or 2.1.</div><div><br></div><div>For gluster, it sounds like gluster 10 would be good for your use case. Without knowing what your workload is (VMs, gluster mounts, NFS mounts?), I don’t have much else on that level, but you can probably play with cluster.read-hash-mode (try 3) to spread the read load out amongst your servers. Search the list archives for general performance hints too; server.event-threads &amp; client.event-threads are probably good targets, and the various performance.*threads options may or may not help depending on how the volumes are being used.</div><div><br></div><div>More details (zfs version, gluster version, volume options currently applied, more details on the workload) may help if others use similar setups. 
You may be getting into the area where you just need to get your environment set up to try some A/B testing with different options, though.</div><div><br></div><div>Good luck!</div><div><br></div><div>  -Darrell</div><div><br><div><br><blockquote type="cite"><div>On Dec 11, 2021, at 5:27 PM, Arman Khalatyan &lt;<a href="mailto:arm2arm@gmail.com" target="_blank">arm2arm@gmail.com</a>&gt; wrote:</div><br><div><div dir="auto">Hello everybody,<div dir="auto">I am looking for some performance considerations for glusterfs with zfs.</div><div dir="auto">The data distribution is as follows: 90% &lt;50kb and 10% &gt;10GB-100GB; in total over 100 million files, about 100TB.</div><div dir="auto">3 replicated JBODs, each one with:</div><div dir="auto">2x 8-disk RAIDZ2 + special device mirror 2x 1TB NVMe + cache mirror 2x SSD + 32GB RAM.</div><div dir="auto"><br></div><div dir="auto">Most operations are reads and "find file".</div><div dir="auto">I set some parameters on zfs like: xattr=sa, primarycache=all, secondarycache=all</div><div dir="auto">What else could be tuned?</div><div dir="auto">Thank you in advance.</div><div dir="auto">Greetings from Potsdam,</div><div dir="auto">Arman.</div><div dir="auto"><br></div></div>

________<br>Gluster-users mailing list<br><a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a><br><a href="https://lists.gluster.org/mailman/listinfo/gluster-users" target="_blank">https://lists.gluster.org/mailman/listinfo/gluster-users</a><br></div></blockquote></div><br></div></div></div></div></blockquote></div>