[Gluster-users] GlusterFS problems & alternatives

Sun Feb 16 15:28:28 UTC 2020

Dear Amar and others who have answered, 

many thanks for the answers. Sorry for my late reply; my internet access here currently varies between 16 and 500 kbit/s and I'm working most days, so it's not so easy to find the time to write a full response :-)
I did not mean to come across as condescending, and from your responses it seems that you weren't offended, thanks.

As I said, my main concern right now is to stabilise our production systems. I work for several non-profit orgs and money is an issue, so most of the IT "budget" consists of my work time. One of the issues we're having is similar to this report: https://lists.gluster.org/pipermail/gluster-users/2020-February/037657.html
We have tried the advice given but it didn't help. I am also working with Ceph a lot, and the tooling for Ceph seems better than for Gluster, especially in terms of checking the status and analyzing and repairing fault situations. So it's good to hear that there are plans to improve GlusterFS in this area.

It might be that our setup is so small that it's not well tested. We mostly run 2+1 setups, with the data bricks on large HDDs and the arbiter brick on SSD. Could it be that the HDDs are too slow and there are timeouts in Gluster that "stop" or hinder the healing process? I don't know, but I can try to find out.

I absolutely acknowledge that GlusterFS is an open source project (although the backing by Red Hat makes it look like a "big"/professional project). I have contributed to a lot of open source projects in the past, and being a developer myself, I can certainly relate to the frustration of "bug reports" that do not come with sufficient information and are not reproducible. When I return from this work trip I will try to create a test case for our problems and report it as a bug.

Thanks so far, 

Stefan 

> From: "Amar Tumballi" <amar at kadalu.io>
> To: "Stefan" <gluster at stefanseidel.info>
> Cc: "gluster-users" <gluster-users at gluster.org>
> Sent: Saturday, 15 February, 2020 16:11:03
> Subject: Re: [Gluster-users] GlusterFS problems & alternatives

> Hi Stefan,

> Some responses inline (adding to the opinions already expressed by a few users
> and some developers).

>> From: Stefan <gluster at stefanseidel.info>
>> Date: Wed, Feb 12, 2020 at 3:34 AM

>> Hi,

>> looking through the last couple of weeks on this mailing list and reflecting on
>> our own experiences, I have to ask: what is the status of GlusterFS?

> Status is surely 'Active'. Considering the project is 13+ years old, treat the
> current mode as a 'low tide' in the activity wave. But I want to assure you, as
> a maintainer, that I would like to keep the project active and will continue to
> solve use cases where it fits well.

> Also, if you notice, we just recently had the Release-8 planning meeting, and
> the scope and future of the project leading to GlusterX (GlusterFS 10 release)
> are being discussed.

>> So many people here reporting bugs and no solutions are in sight.

> Sorry about that. I have to accept the statement with a pinch of salt. There is
> no 'good' answer when a user is unhappy. But I want to write a long answer for
> this particular point.

> As with any 'open source' project, there are a few enterprises commercially
> backing Gluster, who still have many developers engaged in making sure the
> project is active. But the fact is, a developer picks issues to fix based on
> their employer's priorities. Sometimes it so happens that the use cases where
> there are problems in GlusterFS are not a priority for most of the developers,
> because of their employer's priorities.

> As an individual contributor/maintainer for the project, the biggest challenges
> in picking and fixing complicated issues can be listed as below:

> * The problem may not happen at lower load or scale. This means the fix would
> still be based on 'speculation' or 'knowing the code'. Validating our fixes at
> scale is a challenge without company support.
> * We may not be able to reproduce the issue, which again would delay the fix.
> * The project itself deals with 'data', so many users may not be able to provide
> more information, which again causes delay.

> All these issues aside, I want to take time to appreciate those users who
> actively answer queries of fellow users, and keep the lights on in the
> community. It is one of the major contributions anyone can give back to an
> open source community.

>> GlusterFS clusters break left and right, reboots of a node have become a warrant
>> for instability and broken clusters, no way to fix broken clusters. And all of
>> that with recommended settings, and in our case, enterprise hardware
>> underneath.

> I should acknowledge that the 'documentation', which would contain the
> 'recommended' setup, is possibly out of date! We are working on fixing the
> documentation right now.

>> Is it time to abandon this for production systems?

> This is a very critical question. As a person responsible for running infra for
> my company, I surely would be asking this question, especially if things are
> not working out. Nothing wrong in you asking this. I would say if you have
> production systems, consider talking to companies who support the project etc.,
> and have the alternative options properly thought through.

> If interested, I am happy to make time so we can talk about particular issues
> and what you expect to see, and you can give feedback. After that, you can take
> a decision on this.

>> What are the alternatives?

> I am not a good person to answer this. But it would be great to know the
> alternatives, so we also learn what more we can do here.
> Regards,
> -Amar Tumballi

>> Looking through the solutions, CephFS might be one, but it doesn't seem to be
>> very fast. MooseFS would be one, but it only supports RAID-1 style replication,
>> with no arbiter, and Erasure Coding only in the Pro version.
>> Tahoe-LAFS is geared towards a different use case.
>> Any other suggestions?

>> Thanks,

>> Stefan