[Gluster-devel] Crash on HA config when restoring a server

Anand Avati avati at zresearch.com
Wed Aug 8 18:14:46 UTC 2007


Kevan,
Is it possible to get a backtrace from the core dump using gdb?
Thanks,
avati
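
For reference, a minimal sketch of one way to pull such a backtrace out of a
core file non-interactively. The binary path and core file name below are
assumptions (neither is given in this thread), and core_backtrace is only a
hypothetical helper name; adjust both paths to match the crashing server.

```shell
# Assumed locations; neither path comes from this thread.
GLUSTERFSD_BIN="${GLUSTERFSD_BIN:-/usr/sbin/glusterfsd}"
CORE_FILE="${CORE_FILE:-./core}"

# Hypothetical helper: print a full backtrace for every thread in a core
# file. $1 is the binary that crashed, $2 is the core file it left behind.
core_backtrace() {
    gdb -batch -ex "thread apply all bt full" "$1" "$2"
}

# Usage (run on the server that crashed):
#   core_backtrace "$GLUSTERFSD_BIN" "$CORE_FILE" > backtrace.txt
```

If no core file is being written, core dumps may be disabled in the shell that
starts glusterfsd; running "ulimit -c unlimited" there before starting the
daemon is the usual way to enable them.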

2007/8/8, Kevan Benson <kbenson at a-1networks.com>:
>
>
> When running an HA config with 2 servers and 2 clients, I can consistently
> crash the active server after failing the other.  This is on TLA version
> patched to 440.
>
> System configs at http://glusterfs.pastebin.com/m52564c56
> Server A: 172.16.1.81
> Server B: 172.16.1.82
> Client A: 172.16.1.85
> Client B: 172.16.1.86
> Note: Client transport-timeout (on clients and servers) was set to 10 in
> the first two crashes, and set to 30 on Client A and B in the last one
> (servers still had it set to 10).
>
> For the first crash, I fail server B (ifdown eth1), and then try to ls the
> mount point (time ls -l /mnt/glusterfs) from both clients.  I generally get
> a "ls: /mnt/glusterfs/: Transport endpoint is not connected" error once or
> twice, and then the active server's (A) glusterfsd will either start
> responding or crash (about 50% chance).  In this case, I had restored
> network connectivity to server B and run a few more ls's from the clients.
>
> The glusterfsd.log (including backtrace) is at
> http://glusterfs.pastebin.com/m15d7f914
>
> Upon restarting glusterfs on server A and restoring the network connection
> to server B, I initiated the above ls from the clients and crashed server
> A's glusterfsd again.  Glusterfsd on Server B was never restarted; it was
> only failed through lack of connectivity.
>
> The glusterfsd.log (including backtrace) for THIS crash is at
> http://glusterfs.pastebin.com/m28ee8e5a
>
> Here's a crash from doing an ls with one server failed, after restarting
> one of the servers a few times.
>
> The glusterfsd.log (including backtrace):
> http://glusterfs.pastebin.com/m2ee6c471
>
> All logs shown are from the crashing server, Server A.  I can just as
> easily crash server B by failing A.  Let me know if you need more logs from
> other hosts and I'll re-run whichever scenarios you like.
>
> --
> - Kevan Benson
> - A-1 Networks



-- 
It always takes longer than you expect, even when you take into account
Hofstadter's Law.

-- Hofstadter's Law


