[Gluster-users] Replication not working on server hang
Jeff Evans
jeffe at tricab.com
Sun Aug 30 10:21:52 UTC 2009
Hi Avati,
I'm experiencing complete system-wide hangs exactly as David has
mentioned.
> The discussion in this thread is about those situations where the
> server (machine hosting the storage/posix volume) hangs the backend
> filesystem (verified by kernel console logs) and that in turn
> results in the mountpoint hang.
That seems to be the case in Stephan's situation, yes, as we have
evidence from reiserFS. What evidence have we in the ext3 cases?
> While your symptoms are similar on the client side hanging,
In the case of 144, my systems didn't hang. Maybe I was just lucky.
Now that I have disabled read-ahead to work around 144, I am seeing
total system hangs. I also saw these hangs back before I used
read-ahead (with 1.3).
As I have said, it is as if new FDs cannot be allocated, while those
already open continue normally. I'm talking about regular ext3 mounts
here, not glusterfs ones.
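
A minimal sketch of how the symptom above (new FDs failing while open ones
keep working) might be probed from a shell that is still responsive. It is
only an illustration: the function name probe_open and the 5-second timeout
are assumptions, not anything from glusterfs or from this thread. The open is
attempted in a daemon thread so that a blocked open() does not also block the
caller.

```python
import os
import threading


def probe_open(path, timeout=5.0):
    """Try to allocate a new file descriptor for `path`.

    Returns "ok" if the open completed, "error" if it failed with an
    OSError, and "hung" if it did not return within `timeout` seconds.
    """
    outcome = {"status": "hung"}  # stays "hung" unless the worker finishes

    def attempt():
        try:
            fd = os.open(path, os.O_RDONLY)
            os.close(fd)
            outcome["status"] = "ok"
        except OSError:
            outcome["status"] = "error"

    # Daemon thread: a worker stuck in the kernel will not keep the
    # interpreter alive at exit.
    worker = threading.Thread(target=attempt, daemon=True)
    worker.start()
    worker.join(timeout)
    return outcome["status"]


if __name__ == "__main__":
    # "/" is just an example; pass the ext3 mount point of interest.
    print(probe_open("/"))
```

Run against a regular ext3 mount point during a hang, a "hung" result would
be consistent with new descriptors not being allocated, although it cannot by
itself say where in the kernel the open is blocked.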
> The discussion thread is about the situation where the server side
> kernel misbehaves and results in glusterfs hanging. The two
> actual problems are quite different.
Perhaps, as I said, it may be coincidence, but when I ran with
read-ahead, I didn't get any system hangs, just the core dumps.
Now, I don't get core dumps any more. I get system-wide hangs.
Thanks, Jeff.