<div dir="ltr"><div>Hi Erik,</div><div>Reading through your email, I saw that you run gluster on 16 servers. What version of gluster are you using?</div><div><br></div><div>Thanks,</div><div><br></div><div>Adrian Quintero<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Jun 17, 2020 at 9:53 AM Erik Jacobson &lt;<a href="mailto:erik.jacobson@hpe.com">erik.jacobson@hpe.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">We never ran tests with Ceph mostly due to time constraints in<br>

engineering. We also liked that, at least when I started as a novice,<br>

gluster seemed easier to set up. We use the solution in automated<br>

setup scripts for maintaining very large clusters. Simplicity in<br>

automated setup is critical here for us including automated installation<br>

of supercomputers in QE and near-automation at customer sites.<br>

<br>

We have been happy with our performance using gluster and gluster NFS<br>

for root filesystems when using squashfs object files for the NFS roots<br>

as opposed to expanded files (on a sharded volume). For writable NFS, we<br>

use XFS filesystem images on gluster NFS instead of expanded trees (in<br>

this case, not on sharded volume).<br>

<br>

We have systems running as large as 3072 nodes with 16 gluster servers<br>

(subvolumes of 3, distributed/replicate).<br>

<br>

We will have 5k nodes in production soon and will need to support 10k<br>

nodes in a year or so. So far we use CTDB for &quot;ha-like&quot; functionality as<br>

pacemaker is scary to us.<br>

<br>

<br>

We also have designed a second solution around gluster for<br>

high-availability head nodes (aka admin nodes). The old solution used two<br>

admin nodes, pacemaker, external shared storage, to host a VM that would<br>

start on the 2nd server if the first server died. As we know, 2-node HA<br>

is not optimal. We designed a new 3-server HA solution that eliminates<br>

the external shared storage (which was expensive) and instead uses<br>

gluster, sharded volume, and a qemu raw image hosted in the shared<br>

storage to host the virtual admin node. We use 4-disk RAID10 per<br>

server for gluster use in this. We have been happy with the performance<br>

of this. It&#39;s only a little slower than the external shared filesystem<br>

solution (we tended to use GFS2 or OCFS or whatever it is called in the<br>

past solution). We did need to use pacemaker for this one as virtual<br>

machine availability isn&#39;t suitable for CTDB (or less natural anyway).<br>

One highlight of this solution is it allows a customer to put each of<br>

the 3 servers in a separate firewalled vault or room to keep the head <br>

alive even if there were a fire that destroyed one server.<br>

<br>

A key to our use of gluster and not suffering from poor performance in<br>

our root-filesystem-workloads is encapsulating filesystems in image<br>

files instead of using expanded trees of small files.<br>

<br>

So far we have relied on gluster NFS for the boot servers as Ganesha<br>

would crash. We haven&#39;t re-tried in several months though and owe<br>

debugging on that front. We have not had resources to put into<br>

debugging Ganesha just yet.<br>

<br>

I sure hope Gluster stays healthy and active. It is good to have<br>

multiple solutions with various strengths out there. I like choice.<br>

Plus, choice lets us learn from each other. I hope project sponsors see<br>

that too.<br>

<br>

Erik<br>

<br>

&gt; 17.06.2020 08:59, Artem Russakovskii wrote:<br>

&gt; &gt; It may be stable, but it still suffers from performance issues, which<br>

&gt; &gt; the team is working on. But nevertheless, I&#39;m curious if maybe Ceph has<br>

&gt; &gt; those problems sorted by now.<br>

&gt; <br>

&gt; <br>

&gt; Dunno, we run gluster on small clusters, kvm and gluster on the same hosts.<br>

&gt; <br>

&gt; There were plans to use ceph on dedicated servers next year, but the budget was cut<br>

&gt; because you don&#39;t want to buy our oil for $120 ;-)<br>

&gt; <br>

&gt; Anyway, in our tests ceph is faster, which is why we wanted to use it, but<br>

&gt; not migrate from gluster.<br>

&gt; <br>

&gt; <br>

&gt; ________<br>

&gt; <br>

&gt; <br>

&gt; <br>

&gt; Community Meeting Calendar:<br>

&gt; <br>

&gt; Schedule -<br>

&gt; Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC<br>

&gt; Bridge: <a href="https://bluejeans.com/441850968" rel="noreferrer" target="_blank">https://bluejeans.com/441850968</a><br>

&gt; <br>

&gt; Gluster-users mailing list<br>

&gt; <a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a><br>

&gt; <a href="https://lists.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">https://lists.gluster.org/mailman/listinfo/gluster-users</a><br>


</blockquote></div><br clear="all"><br>-- <br><div dir="ltr" class="gmail_signature">Adrian Quintero<br></div>