<div dir="ltr"><div>Hi,</div><div><br></div><div>Stefan, sorry to hear that things are breaking a lot in your cluster. Please file a bug (or bugs) with the necessary information so we can take a look.</div><div>If one is already filed, share it here so we are reminded of it. Fixing broken cluster state should be easy with Gluster. <br></div><div>There are a few older threads on this that you should be able to find. <br></div><div><br></div><div>Do consider that the devs are limited in bandwidth. We do look at the issues and are actively fixing them.</div><div>We may take some time to respond, as we expect the community to help each other as well. If they can&#39;t resolve it, we step in and try to sort it out.<br></div><div>FYI: you can see dozens of bugs being worked on even in the past 2 days: <a href="https://review.gluster.org/#/q/status:open+project:glusterfs">https://review.gluster.org/#/q/status:open+project:glusterfs</a></div><div>There are other activities under way as well to make the Gluster project healthier, like Glusto. We are working on this testing framework <br></div><div>to cover as many cases as possible. If you can send us a test case, it will benefit you as well as the community.<br></div><div><br></div><div>We don&#39;t see many people sending mails saying that their cluster is healthy and they are happy (perhaps they think they would be spamming, <br></div><div>which they wouldn&#39;t be; it helps us understand how well things are going).</div><div>Thanks, Erik and Strahil, for sharing your experience. 
It means a lot to us :)<br></div><div>People usually prefer to send a mail when something breaks, and that&#39;s one main reason the threads you read come across as negative.</div><div><br></div><div>Do let us know what the issue is and we will try our best to help you out.</div><div><br></div><div>Regards,</div><div>Hari.<br></div><div><br></div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Feb 12, 2020 at 11:58 AM Strahil Nikolov &lt;<a href="mailto:hunter86_bg@yahoo.com">hunter86_bg@yahoo.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On February 12, 2020 12:28:14 AM GMT+02:00, Erik Jacobson &lt;<a href="mailto:erik.jacobson@hpe.com" target="_blank">erik.jacobson@hpe.com</a>&gt; wrote:<br>

&gt;&gt; looking through the last couple of weeks on this mailing list and<br>

&gt;reflecting our own experiences, I have to ask: what is the status of<br>

&gt;GlusterFS? So many people here reporting bugs and no solutions are in<br>

&gt;sight. GlusterFS clusters break left and right, reboots of a node have<br>

&gt;become a warrant for instability and broken clusters, no way to fix<br>

&gt;broken clusters. And all of that with recommended settings, and in our<br>

&gt;case, enterprise hardware underneath.<br>

&gt;<br>

&gt;<br>

&gt;I have been one of the people asking questions. I sometimes get an<br>

&gt;answer, which I appreciate. Other times not. But I&#39;m not paying for<br>

&gt;support in this forum so I appreciate what I can get. My questions<br>

&gt;are sometimes very hard to summarize and I can&#39;t say I&#39;ve been offering<br>

&gt;help as much as I ask. I think I will try to do better.<br>

&gt;<br>

&gt;<br>

&gt;Just to counter with something cool....<br>

&gt;As we speak now, I&#39;m working on a 2,000 node cluster that will soon be<br>

&gt;a<br>

&gt;5120 node cluster. We&#39;re validating it with the newest version of our<br>

&gt;cluster manager.<br>

&gt;<br>

&gt;It has 12 leader nodes (soon to have 24) that are gluster servers and<br>

&gt;gnfs servers.<br>

&gt;<br>

&gt;I am validating Gluster7.2 (updating from 4.6). Things are looking very<br>

&gt;good. 5120 nodes using RO NFS root with RW NFS overmounts (for things<br>

&gt;like /var, /etc, ...)...<br>

&gt;- boot 1 (where each node creates a RW XFS image on top of NFS for its<br>

&gt;  writable area then syncs /var, /etc, etc) -- full boot is 15-16<br>

&gt;  minutes for 2007 nodes.<br>

&gt;- boot 2 (where the writable area pre-exists and is reused, just<br>

&gt;  re-rsynced) -- 8-9 minutes to boot 2007 nodes.<br>

&gt;<br>

&gt;This is similar to gluster 4, but I think it&#39;s saying something to not<br>

&gt;have had any failures in this setup on the bleeding edge release level.<br>

&gt;<br>

&gt;We also use a different volume shared between the leaders and the head<br>

&gt;node for shared-storage consoles and system logs. It&#39;s working great.<br>

&gt;<br>

&gt;I haven&#39;t had time to test other solutions. Our old solution from SGI<br>

&gt;days (ICE, ICE X, etc) was a different model where each leader served<br>

&gt;a set of nodes and NFS-booted 288 or so. No shared storage.<br>

&gt;<br>

&gt;Like you, I&#39;ve wondered if something else matches this solution. We<br>

&gt;like<br>

&gt;the shared storage and the ability for a leader to drop and not take<br>

&gt;288 nodes with it.<br>

&gt;<br>

&gt;(All nodes running RHEL8.0, Glusterfs 7.2, CTDB 4.9.1)<br>

&gt;<br>

&gt;<br>

&gt;<br>

&gt;So we can say gluster is providing the network boot solution for now<br>

&gt;two<br>

&gt;supercomputers.<br>

&gt;<br>

&gt;<br>

&gt;<br>

&gt;Erik<br>

&gt;________<br>

&gt;<br>

&gt;Community Meeting Calendar:<br>

&gt;<br>

&gt;APAC Schedule -<br>

&gt;Every 2nd and 4th Tuesday at 11:30 AM IST<br>

&gt;Bridge: <a href="https://bluejeans.com/441850968" rel="noreferrer" target="_blank">https://bluejeans.com/441850968</a><br>

&gt;<br>

&gt;NA/EMEA Schedule -<br>

&gt;Every 1st and 3rd Tuesday at 01:00 PM EDT<br>

&gt;Bridge: <a href="https://bluejeans.com/441850968" rel="noreferrer" target="_blank">https://bluejeans.com/441850968</a><br>

&gt;<br>

&gt;Gluster-users mailing list<br>

&gt;<a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a><br>

&gt;<a href="https://lists.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">https://lists.gluster.org/mailman/listinfo/gluster-users</a><br>

<br>

Hi Stefan,<br>

<br>

It seems that the devs are not so active on the mailing lists, but in my experience bugs are fixed in a reasonable timeframe. I admit that I was quite frustrated when my Gluster v6.5 to v6.6 upgrade made my lab useless for 2 weeks and the only help came from oVirt Dev, while gluster-users/devel were semi-silent.<br>

Yet, I&#39;m not paying for any support and I know that any help here is just goodwill.<br>

I hope this has nothing to do with the recent acquisition by IBM, but we will see.<br>

<br>

<br>

There is a reason why Red Hat clients are still using Gluster v3 (even with backports) - it is the most tested version of Gluster.<br>

For me, Gluster v4+ compared to v3 is like Fedora compared to RHEL. After all, upstream is not so well tested, and the Gluster community takes over here - reporting bugs, sharing workarounds, giving advice.<br>

<br>

Of course, if you need a rock-solid Gluster environment, you definitely need the enterprise solution with its 24/7 support.<br>

<br>

Keep in mind that even the most expensive storage arrays break after an upgrade (it happened 3 times in less than 2 weeks, with 2k+ machines left read-only, before the vendor provided a new patch), so the issues in Gluster are nothing new, and we should not forget that Gluster is free (and doesn&#39;t cost millions like some arrays).<br>

The only mitigation is to thoroughly test each patch on a cluster that provides storage for your dev/test clients.<br>

<br>

I hope you don&#39;t misunderstand me - just lower your expectations -&gt; even arrays costing millions break, so Gluster is no exception, but at least it&#39;s open source and free.<br>

<br>

Best Regards,<br>

Strahil Nikolov<br>

<br>


<br>

</blockquote></div><br clear="all"><br>-- <br><div dir="ltr" class="gmail_signature">Regards,<br>Hari Gowtham.</div>