<div dir="ltr"><div>Hi Stefan,</div><div><br></div>I think what you propose will work, though you should test it thoroughly.<div><br></div><div>More generally, I think "the GlusterFS way" would be to use 2-way replication instead of a distributed volume; then you can lose one of your servers without an outage and re-synchronize when it comes back up.</div><div><br></div><div>Chances are that if you weren't using the SAN volumes, you could have purchased two servers with enough disk to hold two copies of the data, all for less money...</div><div><br></div><div>Regards,</div><div>Alex</div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Dec 11, 2017 at 12:52 PM, Stefan Solbrig <span dir="ltr"><<a href="mailto:stefan.solbrig@ur.de" target="_blank">stefan.solbrig@ur.de</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Dear all,<br>
<br>
I'm rather new to GlusterFS but have some experience running larger Lustre and BeeGFS installations. These filesystems provide active/active failover. I have now discovered that I can also do this in GlusterFS, although I didn't find detailed documentation about it. (I'm using GlusterFS 3.10.8.)<br>
<br>
So my question is: can I really use glusterfs to do failover in the way described below, or am I misusing glusterfs? (and potentially corrupting my data?)<br>
<br>
My setup is: I have two servers (qlogin and gluster2) that access shared SAN storage. Both servers connect to the same SAN (SAS multipath) and I implement locking via lvm2 and sanlock, so I can mount the same storage on either server.<br>
The idea is that normally each server serves one brick, but in case one server fails, the other server can serve both bricks. (I'm not interested in automatic failover; I'll always do this manually. I could also use this to do maintenance on one server, with only minimal downtime.)<br>
<br>
<br>
#normal setup:<br>
[root@qlogin ~]# gluster volume info g2<br>
#...<br>
# Volume Name: g2<br>
# Type: Distribute<br>
# Brick1: qlogin:/glust/castor/brick<br>
# Brick2: gluster2:/glust/pollux/brick<br>
<br>
# failover: let's artificially fail one server by killing one glusterfsd:<br>
[root@qlogin] systemctl status glusterd<br>
[root@qlogin] kill -9 <pid/of/glusterfsd/running/brick/castor><br>
<br>
# unmount brick<br>
[root@qlogin] umount /glust/castor/<br>
<br>
# deactivate LV<br>
[root@qlogin] lvchange -a n vgosb06vd05/castor<br>
<br>
<br>
### now do the failover:<br>
<br>
# activate the same storage on the other server:<br>
[root@gluster2] lvchange -a y vgosb06vd05/castor<br>
<br>
# mount on other server<br>
[root@gluster2] mount /dev/mapper/vgosb06vd05-castor /glust/castor<br>
<br>
# now move the "failed" brick to the other server<br>
[root@gluster2] gluster volume replace-brick g2 qlogin:/glust/castor/brick gluster2:/glust/castor/brick commit force<br>
### The last line is the one I have doubts about<br>
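<br>
The failover steps above could be wrapped in a small script. This is only a sketch: the run helper, the DRY_RUN switch and the function name failover_brick are made up for illustration, and it assumes the brick directory is always named "brick" below the mount point, as in the listing. It does not answer whether "replace-brick ... commit force" is safe here; it only repeats the commands shown above.<br>
<br>
```shell
#!/bin/sh
# Sketch of the manual failover steps, meant to run on the surviving server.
# DRY_RUN=1 (the default) only prints the commands; DRY_RUN=0 executes them.
# The helper names and the DRY_RUN variable are assumptions for illustration.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Arguments: volume, LV (vg/lv), device path, mount point, old host, new host
failover_brick() {
    vol="$1"; lv="$2"; dev="$3"; mnt="$4"; old_host="$5"; new_host="$6"
    # activate the shared LV on this server
    run lvchange -a y "$lv" || return 1
    # mount the brick filesystem
    run mount "$dev" "$mnt" || return 1
    # point the volume at the brick's new location
    run gluster volume replace-brick "$vol" \
        "$old_host:$mnt/brick" "$new_host:$mnt/brick" commit force
}

failover_brick g2 vgosb06vd05/castor /dev/mapper/vgosb06vd05-castor \
    /glust/castor qlogin gluster2
```
<br>
With the default DRY_RUN=1 the script only echoes the three commands, so the sequence can be reviewed before anything touches the storage.<br>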
<br>
#now I'm in failover state:<br>
#Both bricks on one server:<br>
[root@qlogin ~]# gluster volume info g2<br>
#...<br>
# Volume Name: g2<br>
# Type: Distribute<br>
# Brick1: gluster2:/glust/castor/brick<br>
# Brick2: gluster2:/glust/pollux/brick<br>
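<br>
To confirm which server each brick ended up on, the output of "gluster volume info" can be filtered. A small sketch, assuming the "BrickN: host:/path" line format shown above; the helper names brick_hosts and all_bricks_on are hypothetical.<br>
<br>
```shell
#!/bin/sh
# Print the host of every brick found in `gluster volume info` output on stdin.
# Accepts lines with or without the leading "# " used in the listing above.
brick_hosts() {
    sed -n 's/^#\{0,1\} *Brick[0-9][0-9]*: *\([^:]*\):.*/\1/p'
}

# Succeeds only if every brick in the output on stdin is served by host $1.
all_bricks_on() {
    hosts="$(brick_hosts | sort -u)"
    [ "$hosts" = "$1" ]
}
```
<br>
For example, "gluster volume info g2 | all_bricks_on gluster2" would succeed in the failover state and fail in the normal setup, where the bricks are spread over qlogin and gluster2.<br>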
<br>
<br>
Is it intended to work this way?<br>
<br>
Thanks a lot!<br>
<br>
best wishes,<br>
Stefan<br>
<br>
______________________________<wbr>_________________<br>
Gluster-users mailing list<br>
<a href="mailto:Gluster-users@gluster.org">Gluster-users@gluster.org</a><br>
<a href="http://lists.gluster.org/mailman/listinfo/gluster-users" rel="noreferrer" target="_blank">http://lists.gluster.org/<wbr>mailman/listinfo/gluster-users</a><br>
</blockquote></div><br></div>