[Gluster-users] The continuing story ...

Daniel Jordan Bambach dan at lateral.net
Mon Sep 7 19:53:37 UTC 2009


Yep, I experience this exact lock-up state on the 2.x train of
GlusterFS with two servers, each with a local client, and have so far
given up testing :( I run 1.3 in production, which still has problems
when one of the servers goes down, and I was hoping to move up to 2.x
quickly, but can't at the moment.

Every time a new version comes out I update hoping it will be solved.

The machine that hangs does so completely: one can't ssh in or get a
proper dump from the process, and the DEBUG log, when enabled, has no
information in it either. So I haven't been able to provide anything
useful to the team to work from :(



On 7 Sep 2009, at 15:46, Stephan von Krawczynski wrote:

> Hello all,
>
> last week we saw our first try to enable something like a real-world
> environment on glusterfs fail.
> Nevertheless we managed to get a working combination of _one_ server and
> _one_ client (using a replicate setup with a missing second server).
> This setup worked for about 4 days, so yesterday we tried to enable the
> second server. Within minutes the first one crashed. Well, really we do
> not know if it crashed in its true meaning, the situation looked like this:
> - server was ping'able
> - glusterfsd was disconnected by the client because of missing ping-pong
> - no login possible
> - no fs action (no lights on the hd-stack)
> - no screen (was blank, stayed blank)
>
> This could also be a user-space hang or cpu busy/looping. We don't know.
> The really interesting part is that the server worked for days being
> single, but as soon as dual server fs action (obviously in combination
> with self healing) started it did not survive 10 minutes.
> Of course the second server went on, but we had to stop the whole thing
> because the data was not completely healed, so it made no sense to go on
> with old copies.
> This was glusterfs 2.0.6 with a minimal server setup (storage/posix,
> features/locks, performance/io-threads) on a linux kernel 2.6.25.2.
> Has anyone out there experienced something like this?
> Any ideas?
>
> -- 
> Regards,
> Stephan
>



