[Gluster-devel] Performance Translators' Stability and Usefulness

Sun Jul 5 05:16:38 UTC 2009

Hi Gordan,

> What is production unready (more than Gluster) about PeerFS or SeznamFS?

Well, I'm mostly going by your email of a few months ago comparing these. Your
needs are not that dissimilar to mine.

I see on the SeznamFS project page that it now apparently supports
master-master replication 'MySQL' style - with the limitations of MySQL's
master-master replication.

However, I can't seem to find out exactly what those limitations entail - or 
how to set it up in this mode. (And I am looking for a system that would 
allow more than two masters/peers, which is why I passed over DRBD for 
GlusterFS originally.)

I can't even get the PeerFS web page to load. That's a disturbing sign to me.

> You can fail over NFS servers. If the servers themselves are mirrored
> (DRBD) and/or have a shared file system NFS should be able to handle the
> IP being migrated between servers. I've found this tends to work
> better with NFS over UDP provided you have a network that doesn't
> normally suffer packet loss.

Sorry, thought you were talking about NFS exports from just one local 
drive/RAID array.

My leading fallback option for when I give up on Gluster is pretty much 
exactly what you've just described. However - I have the same (potential) 
issue as you with DRBD and WANs looming over my project, i.e. the eventual
need to run masters/peers in geographically distributed sites.

> How do you mean? GFS1 has been in the vanilla kernel for a while.

I don't use a vanilla kernel. I use a 'hardened' kernel patched with PaX and a 
few other security systems, to protect against stack smashing attacks and 
other nasties. (Just a little bit of extra, relative security, to make 
would-be attackers go after softer targets.)

PaX is especially intolerant of memory faults in general, which is where my 
efforts in patching GlusterFS were focused. (And yes, I have disabled PaX 
features for Gluster. No, it didn't improve anything.)

When I was looking into GFS, I found that the GFS patches (perhaps I was 
looking at v2) didn't work with the hardened patchset. GlusterFS had more 
promise than GFS anyway, so I went with GlusterFS.

> > An older version of GlusterFS - as buggy as it is for me - is
> > unfortunately still the best option.
>
> Out of interest, what was the last version of Gluster you deemed
> completely stable?

What works for me with only (only!) a few crashes a day, and no apparent data 
corruption is 1.4.0tla849. TLA 636 worked a little better for me - only 
random crashes once in a while. (But again - backwards incompatible changes 
had crept in between the two versions, so I couldn't go back.)

I had much better stability with the earlier 1.3 releases. I can't remember 
exactly which ones now. (I suspect it was 1.3.3, but I'm no longer sure.) 
It's been quite a while.

> I don't agree on that particular point, since the last outstanding bug
> I'm seeing with any significant frequency in my use case is the one of
> having to wait for a few seconds for the FS to settle after mounting
> before doing anything or the operation fails. And to top it off, I've
> just had it succeed without the wait. That seems quite heisenbuggy/racy
> to me. :)

Sorry, I was talking about the data corruption bugs. Not your first-access 
issue.

> That doesn't help - the first-access-settle-time bug has been around for
> a very long time. ;)

Indeed.
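
For anyone else hitting the settle-time problem, the usual workaround (wait a
little after mounting, then retry the first access) can be sketched roughly as
below. This is only an illustration, not anything shipped with GlusterFS: the
function name wait_for_mount, its parameters and the choice of a directory
listing as the probe are all my own assumptions.

```python
import os
import time


def wait_for_mount(path, attempts=5, delay=1.0, probe=os.listdir, sleep=time.sleep):
    """Retry the first access to `path` until it succeeds.

    Hypothetical helper: `probe` is any callable that raises OSError while
    the mount has not settled. Returns the number of attempts used.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            probe(path)          # first access; may fail right after mounting
            return attempt
        except OSError as exc:
            last_error = exc
            if attempt < attempts:
                sleep(delay)     # give the file system time to settle
    raise last_error             # still failing after the final attempt
```

Something like wait_for_mount("/mnt/gluster") would then go in front of
whatever touches the mount first (the path here is just an example). It only
papers over the symptom, of course; it doesn't fix the underlying race.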

It's my hope that once testing frameworks (and syslog logging, in your case) 
are made available to the community, people like us can attempt to debug our 
systems with some degree of confidence that we're not causing other subtle 
issues with our patches.

That's got to be better for the project as a whole.

Geoff.

On Sun, 5 Jul 2009, Gordan Bobic wrote:
> Geoff Kassel wrote:
> >> Sounds like a lot of effort and micro-downtime compared to a migration
> >> to something else. Have you explored other options like PeerFS, GFS and
> >> SeznamFS? Or NFS exports with failover rather than Gluster clients, with
> >> Gluster only server-to-server?
> >
> > These options are not production ready (as I believe has been pointed out
> > already to the list) for what I need;
>
> What is production unready (more than Gluster) about PeerFS or SeznamFS?
>
> > or in the case of NFS, defeating the
> > point of redundancy in the first place.
>
> You can fail over NFS servers. If the servers themselves are mirrored
> (DRBD) and/or have a shared file system NFS should be able to handle the
> IP being migrated between servers. I've found this tends to work
> better with NFS over UDP provided you have a network that doesn't
> normally suffer packet loss.
>
> > (Also, GFS is not compatible
> > with the kernel patchset I need to use.)
>
> How do you mean? GFS1 has been in the vanilla kernel for a while.
>
> > I have tried AFR on the server side and the client side. Both display
> > similar issues.
> >
> > An older version of GlusterFS - as buggy as it is for me - is
> > unfortunately still the best option.
>
> Out of interest, what was the last version of Gluster you deemed
> completely stable?
>
> > (That doesn't mean I can't complain about the lack of progress towards
> > stability and reliability, though :)
>
> Heh - and would you believe I just rebooted one of my root-on-glusterfs
> nodes and it came up OK without the bail-out requiring manual
> intervention caused by the bug that causes first access after mounting
> to fail before things have settled.
>
> >> One of the problems is that some tests in this case are impossible to
> >> carry out without having multiple nodes up and running, as a number of
> >> bugs have been arising in cases where nodes join/leave or cause race
> >> conditions. It would require a distributed test harness which would be
> >> difficult to implement so that they run on any client that builds the
> >> binaries. Just because the test harness doesn't ship with the sources
> >> doesn't mean it doesn't exist on a test rig the developers use.
> >
> > Okay, so what about the volume of test cases that can be tested without a
> > distributed test harness? I don't see any sign of testing mechanisms for
> > that.
>
> That point is hard to argue against. :)
>
> > And wouldn't it be prudent anyway - given how often the GlusterFS devs
> > do not have access to the platform with the reported problem - to provide
> > this harness so that people can generate the appropriate test results the
> > devs need for themselves? (Giving a complete stranger from overseas root
> > access is a legal minefield to those who have to work with data held
> > in-confidence.)
>
> Indeed. And shifting test-case VM images tends to be impractical (even
> though I have provided both to the gluster developers in the past for
> specific error-case analysis).
>
> > It's been my impression, though, that the relevant bugs are not
> > heisenbugs or race conditions.
>
> I don't agree on that particular point, since the last outstanding bug
> I'm seeing with any significant frequency in my use case is the one of
> having to wait for a few seconds for the FS to settle after mounting
> before doing anything or the operation fails. And to top it off, I've
> just had it succeed without the wait. That seems quite heisenbuggy/racy
> to me. :)
>
> > (I'm judging that on the speed of the follow up patch, by the way - race
> > conditions notoriously can take a long time to track down.)
>
> That doesn't help - the first-access-settle-time bug has been around for
> a very long time. ;)
>
> Gordan