[Gluster-users] The continuing story ...
Mark Mielke
mark at mark.mielke.cc
Thu Sep 10 13:27:12 UTC 2009
On 09/10/2009 06:25 AM, Stephan von Krawczynski wrote:
> On Wed, 09 Sep 2009 19:43:15 -0400
> Mark Mielke<mark at mark.mielke.cc> wrote:
>
>
>> In this case, there are too many unknowns - but I agree with Anand's
>> logic 100%. Gluster should not be able to cause a CPU lockup. It should
>> be impossible. If it is not impossible - it means a kernel bug, and the
>> best place to have this addressed is the kernel devel list, or, if you
>> have purchased a subscription from a company such as RedHat, then this
>> belongs as a ticket opened with RedHat.
>>
> You know, I am really bothered about the way the maintainers are acting since
> I read this list. There is really a lot of ideology going on ("can't be", "is
> impossible for userspace" etc) and very little real debugging.
>
In general, if one didn't understand the architecture (treating it as a
black box), you would be right. I would not want to hear these things
either. However, Anand did not *just* say "can't be" or "is impossible
for userspace". He provided a significant explanation which exactly
matches my own prior understanding of the separation between user space
and kernel space. In particular, if you read about the intent of FUSE -
the technology being used to create the file system - I think you will
find that what Anand is saying is the *exact* purpose of that project.
Why have a file system in user space in the first place? It introduces
performance limitations. The "why do it" is precisely to provide this
level of isolation and separation from the kernel. Take into account
that FUSE *encourages* users to use or write their own file system
layers. That is, you do not need to be admin/root in order to call
fusermount and mount a FUSE file system. If any user of your system can
create one - don't you think it is reasonable to expect the kernel and
FUSE developers to prevent a complete system lockup from occurring?
If FUSE cannot provide separation from kernel space, then FUSE is
garbage and should be thrown away.
The GlusterFS folk chose FUSE to obtain this separation. If this
separation is not being provided - then the value of FUSE in the first
place is brought into question.
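To make that separation concrete - a hedged sketch, where sshfs,
"example-host" and the mount point are hypothetical placeholders and
not anything from this thread - an ordinary user can mount and unmount
a FUSE file system without root, so the kernel side of FUSE has to
assume the user-space side may misbehave arbitrarily:

```shell
# Sketch of an unprivileged FUSE session. fusermount is setuid only
# for the narrow mount(2)/umount(2) step; all of the file system
# logic runs as the ordinary user who started it.
mnt="$HOME/fuse-demo" && mkdir -p "$mnt"
if command -v sshfs >/dev/null 2>&1; then
    sshfs example-host:/export "$mnt" 2>/dev/null \
        && ls "$mnt" \
        && fusermount -u "$mnt" \
        || echo "could not mount (no example-host reachable here)"
else
    echo "sshfs/FUSE not installed on this box"
fi
```

If an unprivileged user can wedge the CPU through that path, the bug
is by definition below the user/kernel boundary.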
> This application is not the only one in the world. People use heavily file-
> and net-acting applications like firefox, apache, shell-scripts, name-one on
> their boxes. None leads to effects seen if you play with glusterfs. If you
> really think it is a logical way of debugging to go out and simply tell
> "userspace can't do that" while the rest of the application-world does not
> show up with dead-ends like seen on this list, how can I change your mind?
> I hardly believe I can. I can only tell you what I would do: I would try to
> document _first_ that my piece of code really does behave well. But as you may
> have noticed there is no real way to provide this information. And that is
> indeed part of the problem.
Other applications fail all of the time as well - the question, in this
case, is how critical the failure is. But let us say that the problem is
"every time you send too many bytes to the network device, it dies" -
whose responsibility is it to fix? Is it GlusterFS's, for being too
efficient? Is it the Linux kernel driver's, for not loading the data
onto the device properly? Is it the device's, for locking up under a
certain type of load? Do you think the best and only place to look for
an answer is the user space program? Let's say every time you used
Firefox, your CPU + network device locked up - would you go to the
Firefox devel mailing list and demand they fix this? Because, you know
you would get the same response. They would laugh you out - whether you
were right or wrong. The onus is on you, as it is your hardware which is
dying. Now, if you have paid a subscription price, the responsibility
might shift - but if it's free / open source / community-based
support, and the general consensus on the technology is that FUSE is a
user space capability, and user space should not be able to lock up the
CPU or the network device - which is most definitely true, at least as
an ideal - then yes, the onus is on the community member to prove that
others should reconsider their long-held beliefs.
I disagree that other user space programs do not trigger kernel bugs.
Read the kernel devel list for a while, or follow the kernel release
notes. I think you will find that most bugs - of which there are
thousands or more - are triggered by user space programs. The kernel
developers fix the problems, because they are kernel problems. Unless
you tell them about the problem, they won't know to look into it. In
this particular case - what do you want Anand or gluster.com to do?
Let's say every time they send one particular packet very quickly after
another particular packet, it locks up. Do you want Anand or gluster.com
to stop sending those packets? Or do you want the Linux developers to
fix the driver so this never happens again? Which is the better
solution to the problem? Which helps the most people, and prevents the
problem from recurring in the future?
> Wouldn't it be a nice step if you could debug the ongoings of a
> glusterfs-server on the client by simply reading an exported file (something
> like a server-dependant meta-debug-file) that outputs something like strace
> does? Something that enables you to say: "Ok, here you can see what the
> application did, and there you can see what the kernel made of it". As we
> noticed a server-logfile is not sufficient.
>
Sure, that would be awesome - and I think it's provided by such things
as DTrace or strace. Providing that sort of visibility is the kernel's
job. If every user space application must re-invent this wheel, that's
surely a lot of wasted effort on the part of application developers.
Should Firefox provide the same functionality?
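For what it's worth, strace already gives exactly that "what did the
application ask the kernel to do" view from the outside, with no
cooperation from the application. A sketch - the process name
glusterfsd and the output paths are illustrative:

```shell
# Trace every syscall a program makes, with timestamps; -f follows
# forks and threads. The same works against a running server with
# -p <pid>, e.g.:  strace -f -tt -p "$(pidof glusterfsd)"
if command -v strace >/dev/null 2>&1; then
    strace -f -tt -o /tmp/demo.trace true 2>/dev/null \
        && echo "syscalls logged: $(wc -l < /tmp/demo.trace)" \
        || echo "ptrace not permitted in this environment"
else
    echo "strace not installed"
fi
```

That log is the "here is what the application did" half; the kernel's
own logs are the "here is what the kernel made of it" half.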
> Is ideology really a prove for anything in todays' world? Do you really think
> it is possible to understand the complete world by seeing half of it and the
> other half painted by ideology? What is wrong about _proving_ being not
> guilty? About acting defensive ?
>
Ideology? It depends. Some basic principles and a basic understanding of
how the systems are designed are required to help us triage the problems
we face every day. Knowing the system has failed is insufficient. The
first question is: where did the failure occur, and whom should I ask
for help? In the case of a CPU lockup and/or a network device lockup,
most people would start with the Linux kernel. Now, if this were an NFS
problem - NFS is part of the kernel, so the NFS code in the kernel
would be open for consideration. But we know that GlusterFS does *not*
introduce any code into the kernel. This little bit of information is
important, and should not be ignored.
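In practice that triage usually starts at the kernel's own log: a CPU
lockup that the kernel notices is reported there, with a stack trace
naming the code that was spinning. A sketch - the exact message wording
varies by kernel version:

```shell
# The kernel's lockup detectors report to the kernel ring buffer,
# including a stack trace of whatever was stuck -- the first real
# clue about where the bug actually lives.
dmesg 2>/dev/null | grep -iE 'soft lockup|hard lockup|hung task|Oops' \
    || echo "no lockup recorded by the kernel"
```

Note what will never appear in that stack trace: a user space process
like glusterfsd, because user space code does not run in those frames.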
Now, maybe it's easier to start with GlusterFS, and to lean on the
community and the GlusterFS support people for *help*, because we can
argue "it's a failure that, from a superficial level, seems to be
triggered by your software, so you should be concerned" - but there is a
limit to how effective that is as a means of addressing the problem.
First, the GlusterFS people probably have little or no experience
diagnosing a CPU lockup or a network device lockup. Since their software
is entirely in user space, that capability would not be part of their
job requirements. If somebody on the staff *happens* to have the right
experience, and happens to know the answer, then you would be lucky.
This is not the sort of thing I would expect, however, even if I had a
subscription for support. For them to say "we have looked into this, and
this is not our problem because of such and such, but please come back
if you can prove otherwise so that we can do something about it" is a
fair answer.
> It is important to understand that this application is a kind of core
> technology for data storage. This means people want to be sure that their
> setup does not explode just because they made a kernel update or some other
> change where their experience tells them it should have no influence on the
> glusterfs service. You want to be sure, just like you are when using nfs. It
> just does work (even being in kernel-space!).
> Now, answer for yourself if you think glusterfs is as stable as nfs on the
> same box.
One could say the same thing about every layer between user space and
the hardware. If your disk dies, is this a GlusterFS problem? If your
disk controller dies, is this a GlusterFS problem? If your network
device dies, is this a GlusterFS problem? If your CPU dies, is this a
GlusterFS problem?
I don't agree that NFS just works. NFS has gone through a lot of
evolution and maturity. At our company, I've been aware of numerous
problems with NFS. Coincidentally, I was in a call with one of the
owners of the Linux NFS code from RedHat a few weeks ago to discuss the
subject of file system caching, and whether or not NFS would be a
solution to a problem we were having with another network file system
(ClearCase MVFS). During the call, the subject of NFS development did
come up, and as I recall, he humbly acknowledged that NFS has had a lot
of problems and they have been working hard on it. I told him I thought
they were doing a great job and I meant it. Everything is relative. Yes,
we rely on NFS every day at work - but, for the most part, it works
great. We have problems, but RedHat has been responsive to our problems
*once we have identified them as NFS problems*, and we work together
towards a solution. But then, we also pay RedHat money to support us.
Cheers,
mark
--
Mark Mielke<mark at mielke.cc>