[Gluster-users] Gluster-users Digest, Vol 16, Issue 16

Simon Liang simonl at bigair.net.au
Thu Aug 13 05:59:02 UTC 2009


Just had one of the servers (gs2) die on me. This resulted in the whole
cluster becoming inaccessible.

When I tried to execute "df -h" on the client, it just froze and nothing
was happening.

I tried to access gs2 but it was not responding (however I was still
able to ping it). I had to restart gs2 in order for everything to be
accessible.

Please tell me how I can fix this issue...

On 08/12/2009 10:33 AM, Simon Liang wrote:
> I have a 2 client (gc1, gc2) and 2 server (gs1, gs2) cluster setup.
> Both the servers have 2 x 1TB HDD in them, gs1 and gs2 are replicated.
>
> With my configuration below, if gs2 goes offline... should I still be
> able to have access to the cluster?
>    

Yes? :-)

I'd suggest you try it and see. I'm a proponent of the "pull the plug
and see" model of testing before deploying. Too many people trust
marketing material, and/or trust their own understanding and choice of
configuration. It can be a big surprise for people doing database
backups, for instance, when the database actually does require a
restore and the restore process does not work. Oops.

AFR puts the data on each of the boxes, AFR has code in it to detect and
deal with volumes being unavailable, and AFR has "self-heal"
capabilities to try to fix the data once the broken nodes are brought
back into service. The theory is yes. Try it out and see for yourself as
to whether it works for you in practice. :-)
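
For the "pull the plug" test itself, here is a minimal sketch (not
something from this thread) of a probe that reports a hung mount instead
of freezing along with it, the way a bare "df -h" does. The function
names, the 10 second limit and the /mnt/glusterfs mount point in the
usage note are all assumptions; substitute your own.

```python
import subprocess


def probe(cmd, timeout_s=10.0):
    """Return True if cmd exits with status 0 within timeout_s seconds."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        return proc.wait(timeout=timeout_s) == 0
    except subprocess.TimeoutExpired:
        # Best effort only: a process blocked on a dead filesystem may not
        # exit promptly, so we send the kill and do not wait for it.
        proc.kill()
        return False


def mount_responds(path, timeout_s=10.0):
    """Run `df -h` against path and report whether it answered in time."""
    return probe(["df", "-h", path], timeout_s)
```

Call mount_responds("/mnt/glusterfs") on a client before and after
powering off one server. If it comes back False while the other server
is still up, failover is not working for that configuration.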

Cheers,
mark



