[Gluster-users] GlusterFS problems & alternatives

Sunny Kumar sunkumar at redhat.com
Wed Feb 12 10:54:55 UTC 2020


Hi Stefan,

Adding to what Hari said, I want to share a link [1] that talks about
the future releases.
Apart from that, I would suggest you join the community meeting and
share your thoughts.

Hi Strahil,
We have a separate versioning system for "Red Hat's clients" and the
community, so it's not Gluster v3.
We encourage everyone to use the latest release, which we believe is
more stable and comes with the latest fixes.

[1] https://lists.gluster.org/pipermail/gluster-devel/2020-January/056779.html


/sunny

On Wed, Feb 12, 2020 at 7:23 AM Hari Gowtham <hgowtham at redhat.com> wrote:
>
> Hi,
>
> Stefan, sorry to hear that things are breaking a lot in your cluster. Please file a bug (or bugs) with the necessary information so we can take a look.
> If you have already filed one, share it here so we are reminded of it. Fixing a broken cluster state should be easy with Gluster.
> There are a few older threads on this list that you should be able to find covering the same.
>
> Do consider that the devs are limited in bandwidth. We do look at the issues and are actively fixing them.
> We may take some time, as we expect the community to help each other as well; if they can't resolve an issue, we step in and try to sort it out.
> FYI: you can see dozens of bugs being worked on in just the past 2 days: https://review.gluster.org/#/q/status:open+project:glusterfs
> And there are other activities happening as well to make the Gluster project healthier, like Glusto, the testing framework we are working on
> to cover as many cases as possible. If you can send us a test case, it will be beneficial for you as well as the community.
>
> We don't see many people sending mails saying that their cluster is healthy and they are happy (perhaps they think they would be spamming,
> which they wouldn't be; it helps us understand how well things are going).
> Thanks Erik and Strahil for sharing your experience. It means a lot to us :)
> People usually prefer to send a mail only when something breaks, and that's one main reason all the threads you read create negativity.
>
> Do let us know what the issue is and we will try our best to help you out.
>
> Regards,
> Hari.
>
>
>
> On Wed, Feb 12, 2020 at 11:58 AM Strahil Nikolov <hunter86_bg at yahoo.com> wrote:
>>
>> On February 12, 2020 12:28:14 AM GMT+02:00, Erik Jacobson <erik.jacobson at hpe.com> wrote:
>> >> Looking through the last couple of weeks on this mailing list and
>> >> reflecting on our own experiences, I have to ask: what is the status of
>> >> GlusterFS? So many people here are reporting bugs and no solutions are
>> >> in sight. GlusterFS clusters break left and right, rebooting a node has
>> >> become a guarantee of instability and broken clusters, and there is no
>> >> way to fix broken clusters. And all of that with recommended settings
>> >> and, in our case, enterprise hardware underneath.
>> >
>> >
>> >I have been one of the people asking questions. I sometimes get an
>> >answer, which I appreciate. Other times not. But I'm not paying for
>> >support in this forum, so I appreciate what I can get. My questions
>> >are sometimes very hard to summarize, and I can't say I've been offering
>> >help as much as I've been asking. I think I will try to do better.
>> >
>> >
>> >Just to counter with something cool...
>> >As we speak, I'm working on a 2,000-node cluster that will soon be a
>> >5,120-node cluster. We're validating it with the newest version of our
>> >cluster manager.
>> >
>> >It has 12 leader nodes (soon to have 24) that are gluster servers and
>> >gnfs servers.
>> >
>> >I am validating Gluster 7.2 (updating from 4.6). Things are looking very
>> >good: 5,120 nodes using a RO NFS root with RW NFS overmounts (for things
>> >like /var, /etc, ...):
>> >- boot 1 (where each node creates a RW XFS image on top of NFS for its
>> >  writable area, then syncs /var, /etc, etc.; a rough sketch of this step
>> >  follows below) -- a full boot is 15-16 minutes for 2007 nodes.
>> >- boot 2 (where the writable area pre-exists and is reused, just
>> >  re-rsynced) -- 8-9 minutes to boot 2007 nodes.
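>> >For illustration only (this is not our actual tooling; the image path,
>> >size, and mount point below are placeholders), the per-node writable-area
>> >setup in boot 1 is roughly the following:
>> >
>> >    # Sketch only: path, image size and mount point are placeholders.
>> >    import subprocess
>> >
>> >    IMG = "/nfs-rw/node.img"   # backing file created on the NFS overmount
>> >    MNT = "/rw"                # local mount point for the writable area
>> >
>> >    def run(*cmd):
>> >        subprocess.run(cmd, check=True)
>> >
>> >    # Create a sparse file on NFS and format it as XFS.
>> >    run("truncate", "-s", "8G", IMG)
>> >    run("mkfs.xfs", IMG)
>> >
>> >    # Loop-mount it so writes land in this node's own image,
>> >    # not in the shared read-only root.
>> >    run("mkdir", "-p", MNT)
>> >    run("mount", "-o", "loop", IMG, MNT)
>> >
>> >    # Seed the writable directories from the read-only NFS root.
>> >    for d in ("var", "etc"):
>> >        run("rsync", "-a", f"/{d}/", f"{MNT}/{d}/")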
>> >
>> >This is similar to Gluster 4, but I think it says something that we have
>> >not had any failures in this setup on the bleeding-edge release.
>> >
>> >We also use a different volume shared between the leaders and the head
>> >node for shared-storage consoles and system logs. It's working great.
>> >
>> >I haven't had time to test other solutions. Our old solution from SGI
>> >days (ICE, ICE X, etc) was a different model where each leader served
>> >a set of nodes and NFS-booted 288 or so. No shared storage.
>> >
>> >Like you, I've wondered if something else matches this solution. We like
>> >the shared storage and the ability for a leader to drop without taking
>> >288 nodes with it.
>> >
>> >(All nodes running RHEL 8.0, GlusterFS 7.2, CTDB 4.9.1)
>> >
>> >
>> >
>> >So we can say Gluster is providing the network boot solution for what
>> >are now two supercomputers.
>> >
>> >
>> >
>> >Erik
>>
>> Hi Stefan,
>>
>> It seems that the devs are not so active on the mailing lists, but based on my experience the bugs do get fixed in a reasonable timeframe. I admit that I was quite frustrated when my Gluster v6.5 to v6.6 upgrade made my lab useless for 2 weeks and the only help came from the oVirt devs, while gluster-users/devel were semi-silent.
>> Yet I'm not paying for any support, and I know that any help here is just goodwill.
>> I hope this has nothing to do with the recent acquisition by IBM, but we will see.
>>
>>
>> There is a reason why Red Hat customers are still using Gluster v3 (even with backports): it is the most tested version of Gluster.
>> For me, Gluster v4+ compared to v3 is like Fedora compared to RHEL. After all, upstream is not as well tested, and the Gluster community takes over here: reporting bugs, sharing workarounds, giving advice.
>>
>> Of course, if you need a rock-solid Gluster environment, you definitely need the enterprise solution with its 24/7 support.
>>
>> Keep in mind that even the most expensive storage arrays break after an upgrade (it happened 3 times in less than 2 weeks, with 2k+ machines read-only, before the vendor provided a new patch), so the issues in Gluster are nothing new, and we should not forget that Gluster is free (and doesn't cost millions like some arrays).
>> The only mitigation is to thoroughly test each patch on a cluster that provides storage for your dev/test clients.
>>
>> I hope you don't take me the wrong way - just lower your expectations: even arrays costing millions break, so Gluster is no exception, but at least it's open source and free.
>>
>> Best Regards,
>> Strahil Nikolov
>>
>
>
> --
> Regards,
> Hari Gowtham.
> ________
>
> Community Meeting Calendar:
>
> APAC Schedule -
> Every 2nd and 4th Tuesday at 11:30 AM IST
> Bridge: https://bluejeans.com/441850968
>
> NA/EMEA Schedule -
> Every 1st and 3rd Tuesday at 01:00 PM EDT
> Bridge: https://bluejeans.com/441850968
>
> Gluster-users mailing list
> Gluster-users at gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users


