[Gluster-users] GlusterFS on a two-node setup
John Jolet
jjolet at drillinginfo.com
Mon May 21 13:47:52 UTC 2012
I suspect that an rsync with the proper arguments will be fine before starting glusterd on the recovered node.
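Something along these lines, run on the recovered node against its brick
directory before glusterd is started (hostname and brick paths below are only
placeholders; test on scratch data first). The --delete is what removes the
files that were deleted while the node was down, and -A/-X preserve the ACLs
and extended attributes that gluster keeps on the brick:

    rsync -aHAX --delete survivingnode:/export/brick/ /export/brick/
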
On May 21, 2012, at 7:58 AM, Ramon Diaz-Uriarte wrote:
>
>
>
> On Mon, 21 May 2012 00:46:33 +0000, John Jolet <jjolet at drillinginfo.com> wrote:
>
>> On May 20, 2012, at 4:55 PM, Ramon Diaz-Uriarte wrote:
>
>>>
>>>
>>>
>>> On Sun, 20 May 2012 20:38:02 +0100, Brian Candler <B.Candler at pobox.com> wrote:
>>>> On Sun, May 20, 2012 at 01:26:51AM +0200, Ramon Diaz-Uriarte wrote:
>>>>> Questions:
>>>>> ==========
>>>>>
>>>>> 1. Is using GlusterFS overkill? (I guess the alternative would be to use
>>>>> NFS from one of the nodes to the other)
>>>
>>>> In my opinion, the other main option you should be looking at is DRBD
>>>> (www.drbd.org). This works at the block level, unlike GlusterFS, which
>>>> works at the file level. Using this you can mirror your disk remotely.
>>>
>>>
>>> Brian, thanks for your reply.
>>>
>>>
>>> I might have to look at DRBD more carefully, but I do not think it fits my
>>> needs: I need both nodes to be working (and thus doing I/O) at the same
>>> time. These are basically number crunching nodes and data needs to be
>>> accessible from both nodes (e.g., some jobs will use MPI over the
>>> CPUs/cores of both nodes ---assuming both nodes are up, of course ;-).
>>>
>>>
>>>
>>>
>>>> If you are doing virtualisation then look at Ganeti: this is an environment
>>>> which combines LVM plus DRBD and allows you to run VMs on either node and
>>>> live-migrate them from one to the other.
>>>> http://docs.ganeti.org/ganeti/current/html/
>>>
>>> I am not doing virtualisation. I should have said that explicitly.
>>>
>>>
>>>> If a node fails, you just restart the VMs on the other node and away you go.
>>>
>>>>> 2. I plan on using a dedicated partition from each node as a brick. Should
>>>>> I use replicated or distributed volumes?
>>>
>>>> A distributed volume will only increase the size of storage available (e.g.
>>>> combining two 600GB drives into one 1.2TB volume - as long as no single file
>>>> is too large). If this is all you need, you'd probably be better off buying
>>>> bigger disks in the first place.
>>>
>>>> A replicated volume allows you to have a copy of every file on both nodes
>>>> simultaneously, kept in sync in real time, and gives you resilience against
>>>> one of the nodes failing.
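>>>>
>>>> For example (volume name, hostnames and brick paths below are just
>>>> placeholders), a two-way replicated volume would be created roughly like
>>>> this; leave out "replica 2" and the same command gives you a distributed
>>>> volume instead:
>>>>
>>>>    gluster volume create myvol replica 2 transport tcp \
>>>>        node1:/export/brick1 node2:/export/brick1
>>>>    gluster volume start myvol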
>>>
>>>
>>> But from the docs and the mailing list I get the impression that
>>> replication has severe performance penalties when writing and some
>>> penalties when reading. And with a two-node setup, it is unclear to me
>>> whether, even with replication, gluster will continue to work if one node
>>> fails (i.e., whether the other node will keep working). I've not been able
>>> to find the recommended procedure for continuing to work, with replicated
>>> volumes, when one of the two nodes fails. So that is why I am wondering
>>> what replication would really give me in this case.
>>>
>>>
>> Replicated volumes have a performance penalty on the client. For
>> instance, I have a replicated volume, with one replica on each of two
>> nodes. I'm fronting this with an Ubuntu box running Samba for CIFS
>> sharing. If my Windows client sends 100MB to the CIFS server, the CIFS
>> server will send 100MB to each node in the replica set. As for what you
>> have to do to continue working if a node goes down, I have tested this
>> (not on purpose: one of my nodes was accidentally downed) and my client
>> saw no difference. However, running 3.2.x, in order to get the client
>> to use the downed node after it was brought back up, I had to remount
>> the share on the CIFS server. This is supposedly fixed in 3.3.
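>> The remount itself was nothing special, just unmounting and re-mounting
>> the glusterfs client mount on the CIFS server, roughly like this (mount
>> point, hostname and volume name are placeholders):
>>
>>    umount /mnt/gluster
>>    mount -t glusterfs node1:/myvol /mnt/gluster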
>
> OK, great. Thanks for the info. It is clear, then, that several of you
> report that this will work just fine.
>
>
>> It's important to note that self-healing will bring over files that were
>> created while the node was offline, but it does NOT delete files that were
>> deleted while the node was offline. I'm not sure what the official line is
>> there, but my use is archival, so it doesn't matter enough to me to run
>> down (if files got deleted, I wouldn't need gluster..)
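>>
>> If you want to kick off the self-heal yourself after the node comes back,
>> my understanding is that on 3.2.x you do it by walking the mount so every
>> file gets stat'ed, something like (mount point is a placeholder):
>>
>>    find /mnt/gluster -noleaf -print0 | xargs --null stat > /dev/null
>>
>> (3.3 is supposed to add a proactive self-heal daemon.)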
>
>
> That is good to know, but it is not something I'd want. Is there any way
> to get files to be deleted? Maybe rsync'ing or something similar before
> self-healing starts? Or will that lead to chaos?
>
>
>
> Best,
>
> R.
>
>
>>> Best,
>>>
>>> R.
>>>
>>>
>>>
>>>
>>>> Regards,
>>>
>>>> Brian.
>
> --
> Ramon Diaz-Uriarte
> Department of Biochemistry, Lab B-25
> Facultad de Medicina
> Universidad Autónoma de Madrid
> Arzobispo Morcillo, 4
> 28029 Madrid
> Spain
>
> Phone: +34-91-497-2412
>
> Email: rdiaz02 at gmail.com
> ramon.diaz at iib.uam.es
>
> http://ligarto.org/rdiaz
>