[Gluster-devel] Local vs unify

Mon Apr 28 10:54:24 UTC 2008

Mmm,

One of the issues I've noticed is a degree of "well, it works for me" syndrome.
Another is "that would involve a lot of changes".

Neither is a particularly attractive attribute from a [potential] user's point of view.

A lot of this comes down to the documentation. There is an implication that Gluster can be used to replace things like NFS because it's "better". Indeed, design-wise (IMHO) it is, and it could be made to do the same things, and it COULD be made to do them better.

Yet I hear;
"In short - comparing NFS to GlusterFS isn't really meaningful."

Yet on the web pages, the documentation [FAQ] states;

---snip---
Q. What is the problem with DAS / RAID / JBOD / NFS / SAN? 
A. They don't scale both in terms of size and performance. SAN is better than the rest, but it is exorbitantly expensive. But even SAN cannot scale to hundreds of TBs for a large number of clients. 
---snip---

Now, I read this as "but Gluster can, hence it is better than the aforementioned and can be used in their stead".
Maybe that's just me ...

Then we have;
---snip---
Q. Is GlusterFS like parallel NFS? 
A. From user's point of view YES. But a very different design internally. NFS protocol is brain damaged. It is hard to fix or improve it. Parallel NFS client is anyway incompatible with existing NFS protocol. Thats why we moved ahead with a new GlusterFS file system implementation. GlusterFS has superior features and performance over NFS. See this link for GlusterFS vs NFS. 
---snip---

"GlusterFS has superior features and performance over NFS." .. even my poor English is fairly confident of what this says.
"See this link for GlusterFS vs NFS" .. I know the target page is missing, but the meaning is fairly clear.

From my point of view, clusters currently DO use NFS in some instances, and they are being enticed into trying out GlusterFS as a replacement. Saying they don't compare in the theoretical world is fine, but in the real world that sort of statement just doesn't wash.

They are comparable, people will compare them, and even the Gluster documentation compares them.

In that vein, would it not be worth considering improving in areas where NFS kicks Gluster's ass, rather than trying to pretend the issues either don't exist or are not important?

Gareth.

----- Original Message -----
From: gordan at bobich.net
To: gluster-devel at nongnu.org
Sent: Monday, April 28, 2008 11:20:33 AM GMT +00:00 GMT Britain, Ireland, Portugal
Subject: RE: [Gluster-devel] Local vs unify

On Mon, 28 Apr 2008, Paul Arch wrote:

>>> Thanks for supporting our design. We are working towards fixing those
>>> few glitches!
>>> But true that things are not changing as fast as we have wished. Each
>>> new idea needs time to get converted into code, get tested. Hence bit
>>> delay in things.
>>
>> No problem and thank you for this email, it has answered a major issue
>> for me .. I am of course going to ask;
>> a. Any timescale on the metadata changes?
>> b. How much of a difference will it make.. will we be approaching
>> local(ish) speeds .. or are we just talking x2 of current?
>
>> I imagine that would depend on the metadata expiry timeouts. If it's set
>> to 100ms, the chances are that you won't see much improvement. If it's set
>> for 100 seconds, it'll go as fast as local FS for cached data but you'll
>> be working on FS state that might as well be imaginary in some cases. No
>> doubt someone will then complain about the fact that posix semantics no
>> longer work.
>
> <snip>
>
> I have been following this thread and the metadata stuff does interest me -
> we have millions and millions of small files.
>
> In the above situation though, I would have thought knowing all of the
> inputs into the system ( ie - gluster knows what state everything is in, as
> long as no-one enters and changes things from outside of the mechanism in
> the back-ground ) could see some fair potential for caching the meta data.
> If the system is in a degraded state sure you wouldn't and shouldn't trust
> this cache, but all things being equal and happy, why can't we trust a good
> sized cache of metadata if AFR/unify/whatever is reporting the system is
> happy and operational ?

This relates to the point I made a few days ago on the other thread. You 
_could_ do this, but in order to do that, you'd have to change the 
sync-on-read paradigm and couple the systems much more tightly. This would 
likely involve things like mandatory fencing requirements which are 
currently avoided.

If you have a read-lock on a file, you cannot get a write-lock on it, so 
you could potentially sacrifice write-lock performance for read-locking in 
that case, by making read-locks always available without external checking 
against other nodes unless a write lock is in place (which needs to be 
broadcast and acknowledged by _all_ nodes in the cluster).
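
To make the trade-off above concrete, here is a rough sketch in Python. It is not GlusterFS code: the Node and Cluster classes and all of their method names are hypothetical, and it assumes every node is reachable and never fails (which is exactly where fencing would come in). Read-locks are granted from local state only, while a write-lock has to be acknowledged by every node before it is granted.

```python
class Node:
    """One cluster member (hypothetical, for illustration only)."""

    def __init__(self, name):
        self.name = name
        self.read_locks = {}       # path -> number of local readers
        self.write_locked = set()  # paths with a cluster-wide write lock

    def acquire_read(self, path):
        # Cheap path: no network round trip, only a local lookup.
        if path in self.write_locked:
            return False
        self.read_locks[path] = self.read_locks.get(path, 0) + 1
        return True

    def release_read(self, path):
        count = self.read_locks.get(path, 0)
        if count <= 1:
            self.read_locks.pop(path, None)
        else:
            self.read_locks[path] = count - 1

    def ack_write(self, path):
        # Refuse while local readers exist or another writer holds the path.
        if self.read_locks.get(path, 0) > 0 or path in self.write_locked:
            return False
        self.write_locked.add(path)
        return True

    def clear_write(self, path):
        self.write_locked.discard(path)


class Cluster:
    """Broadcasts write-lock requests to all nodes (assumes no failures)."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def acquire_write(self, path):
        acked = []
        for node in self.nodes:
            if node.ack_write(path):
                acked.append(node)
            else:
                # One refusal is enough: roll back the nodes that agreed.
                for done in acked:
                    done.clear_write(path)
                return False
        return True

    def release_write(self, path):
        for node in self.nodes:
            node.clear_write(path)
```

The cost shows up in acquire_write: it touches every node and a single refusal forces a rollback, whereas acquire_read never leaves the local node.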

This is also made more difficult with unify or striping because the data 
is remote in the first place, so you have to retrieve the metadata at 
least from the server - unless you want to cache it locally, which would 
again break POSIX semantics.

Note - NFS is not posix. You can set metadata cache expiry on NFS. NFS 
also has the advantage that the data is on _one_ server, so even if 
there was some form of locking that reliably works over NFS available 
(there isn't, but for the sake of the argument, if there was) there would 
still be no concept of chasing locks across the cluster to make sure the 
mirrors are consistent before granting them.
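
As an illustration of what a metadata cache expiry buys and costs, here is a minimal Python sketch. It is not how NFS or GlusterFS implement it: MetadataCache, fetch, ttl_seconds and clock are all made-up names, and fetch stands in for whatever round trip to the server returns the metadata.

```python
import time


class MetadataCache:
    """Client-side metadata cache with a fixed expiry (hypothetical sketch)."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self.fetch = fetch      # callable(path) -> metadata, i.e. the slow path
        self.ttl = ttl_seconds  # how long a cached entry is trusted
        self.clock = clock      # injectable so the expiry can be tested
        self.entries = {}       # path -> (time fetched, metadata)

    def stat(self, path):
        now = self.clock()
        hit = self.entries.get(path)
        if hit is not None and now - hit[0] < self.ttl:
            # Fast, but possibly stale: the server may have changed since.
            return hit[1]
        meta = self.fetch(path)
        self.entries[path] = (now, meta)
        return meta
```

With a short expiry nearly every stat() goes to the server, so little is gained. With a long one the cache answers at local speed, but it can keep returning metadata that another client has since changed, and that is where the POSIX semantics go.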

In short - comparing NFS to GlusterFS isn't really meaningful.

Gordan
