[Gluster-users] The continuing story ...

Stephan von Krawczynski skraw at ithnet.com
Tue Sep 8 14:51:54 UTC 2009


On Tue, 8 Sep 2009 05:37:09 -0700
Anand Avati <avati at gluster.com> wrote:

> >> > I doubt that this can be a real solution. My guess is that glusterfsd runs
> >> > into some race condition where it locks itself up completely.
> >> > It is not funny to debug something the like on a production setup. Best would
> >> > be to have debugging output sent from the servers' glusterfsd directly to a
> >> > client to save the logs. I would not count on syslog in this case, if it
> >> > survives one could use a serial console for syslog output though.
> 
> I'm going to iterate through this yet again at the risk of frustrating
> you. glusterfsd (on the server side) is yet another process running
> only system calls. If glusterfsd has a race condition and locks itself
> up, then it locks _only its own process_ up. What you are having is a
> frozen system. There is no way glusterfsd can lock up your system
> through just VFS system calls, even if it wanted to, intentionally. It
> is a pure user space process and has no power to lock up the system.
> The worst glusterfsd can do to your system is deadlock its own process
> resulting in a glusterfs fuse mountpoint hang, or segfault and result
> in a core dump.
> 
> Please consult system/kernel programmers you trust. Or ask on the
> kernel-devel mailing list. The system freeze you are facing is not
> something which can be caused by _any_ user space application.

Please read carefully what I wrote about the system condition. The fact that I
can ping the box means that the kernel is not messed up, i.e. this is not a
freeze. But the fact that I can neither log in nor use any other user-space
software to get my hands on the box can only mean that an application is able
to mess up user space to such an extent that every other application gets few
to no timeslices, or that some system resource is eaten up to such an extent
that others are simply locked out. That does not sound impossible to me, as it
is just like a local DoS attack, which is possible. Maybe one only needs some
messed-up pointers to create such a situation. What bothers me more is the fact
that you continually refuse to acknowledge what several people on the list have
described. It is not our intention to waste anyone's time; we try to give as
much information as possible so that the problem can be found. Unfortunately we
cannot do that job ourselves, because we don't have the background knowledge
about your code.
Since it is all user space, maybe it would be helpful to have a version that
just outputs its logs to a serial console, so that we can trace where it went
before things blew up. Maybe we can watch it cycling somewhere...

Do you really deny that a local DoS attack is generally possible? 
-- 
Regards,
Stephan



