[Gluster-users] One node goes offline, the other node can't see the replicated volume anymore

Joe Julian joe at julianfamily.org
Sat Jul 13 16:22:32 UTC 2013

No, they're equal peers. Each client connects to both servers after retrieving the configuration from the server specified in the mount command.

When a server shuts down, the TCP connection is properly closed and the clients continue to operate with the remaining servers. In a replicated volume, that means no data is missing.

When the TCP connection is not closed, the client will keep trying to reach the missing server for network.ping-timeout seconds (42 by default). The filesystem appears frozen during that timeout. Once it has timed out, the client should continue as above.
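
To make the two cases concrete, here is a rough toy model in Python. It is not GlusterFS code and doesn't talk to a real volume; the function name and its parameters are made up for illustration, and the only number taken from above is the 42 second ping timeout. It just encodes the rule: a clean close costs no stall, an unclosed connection costs one ping timeout, and the volume stays usable as long as at least one server is still connected.

```python
PING_TIMEOUT_SECONDS = 42  # the network.ping-timeout value quoted above


def simulate_server_loss(replicas_up, clean_close,
                         ping_timeout=PING_TIMEOUT_SECONDS):
    """Toy model: return (stall_seconds, volume_usable) after one server is lost.

    replicas_up: servers the client is connected to before the loss.
    clean_close: True if the server closed its TCP connection (normal shutdown).
    """
    if replicas_up < 1:
        raise ValueError("client must start with at least one connected server")
    # A clean shutdown is noticed immediately; otherwise the client waits
    # out the ping timeout before giving up on the missing server.
    stall = 0 if clean_close else ping_timeout
    remaining = replicas_up - 1
    return stall, remaining >= 1


print(simulate_server_loss(2, clean_close=True))   # (0, True)
print(simulate_server_loss(2, clean_close=False))  # (42, True)
print(simulate_server_loss(1, clean_close=False))  # (42, False)
```

The last line is the situation your logs describe: no servers left, so the mount is unusable no matter how the connection ended.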

Your logs, however, say that the client has lost connection with ALL the servers. What I've seen in your logs so far doesn't show both disconnects; I've only seen the last one. If you'll follow my instructions, I can get a clearer picture of what's going wrong.

This is one of the reasons I hate mailing lists and do most of my support via IRC. On IRC there aren't these hours- or days-long delays between messages. We're generally able to solve the worst problems in a few hours, so I feel I am making a difference.

Anyway, follow my complete instructions and I'll help you further. I'm sure we can figure this out.

