[Gluster-devel] Spurious disconnections / connectivity loss
Gordan Bobic
gordan at bobich.net
Mon Feb 1 13:45:51 UTC 2010
Daniel Maher wrote:
> Gordan Bobic wrote:
>
>> That's hardly unexpected. If you are using client-side replicate, I'd
>> expect to see the bandwidth requirements multiply with the number of
>> replicas. For all clustered configurations (not limited to glfs) I use
>> a separate LAN for cluster communication to ensure best possible
>> throughput/latencies, and specifically in case of glfs, I do server
>> side replicate so that the replicate traffic gets offloaded to that
>> private cluster LAN, so the bandwidth requirements to the clients can
>> be kept down to sane levels.
>
> If you're willing to describe your setup further, i'd love to hear about
> it. I'm currently using client-side replication, and for the reasons of
> scalability you described, i'd like to move to a server-side replication
> setup.
>
> In particular, i'm interested in how you handle the connectivity between
> the clients and the servers vis-à-vis load balancing (if any) and
> availability. For example, in your configuration, how does a given
> client « know » which server to speak to, and what happens if that
> server becomes inaccessible?
Since client connections persist, you could use something as naive as
DNS round-robin load balancing - it'll do a good enough job in most
cases if you have lots of clients (see the client volfile sketch
below).
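If the clients talk the glfs protocol, the client volfile only needs to
point at a single name that round-robins across the servers - something
along these lines (a sketch in the old-style volfile syntax, from
memory; the hostname and volume names are just examples):

  # client.vol - point the client at one round-robin DNS name
  volume storage
    type protocol/client
    option transport-type tcp
    # storage.example.com has one A record per server
    option remote-host storage.example.com
    # name of the volume the servers export
    option remote-subvolume replicate
  end-volume

Each client sticks with whichever server it resolved first for as long
as the connection lasts, which is why round-robin is good enough.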
Fail-over is trickier. NFS over UDP (with unfsd) handles it relatively
gracefully: you just fail the IP address over to one of the surviving
servers (use something like Red Hat Cluster to handle the IP resource
fail-over). Unfortunately, the glfs protocol itself doesn't handle
disconnects gracefully - you just end up with "transport endpoint not
connected" errors and have to umount and remount to get the volume
back, which is messy and most definitely not transparent. The obvious
disadvantage of unfsd is performance (and it's pretty dire, no two ways
about it), although as I mentioned in a thread here a while back, the
glfs protocol for client connections doesn't seem to yield noticeable
benefits over unfsd, due to its own FUSE overheads.
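To make the server side more concrete: each server runs replicate over
its own local brick plus a protocol/client pointing at the other
server's brick across the private cluster LAN, and exports the result.
Roughly like this for server1 (again a sketch from memory, not a
drop-in config - server2 runs the mirror image, and the hostnames,
paths and addresses are examples):

  # server1.vol - server-side replicate
  volume posix
    type storage/posix
    option directory /data/export
  end-volume

  volume locks
    type features/locks
    subvolumes posix
  end-volume

  # the other server's brick, reached over the private cluster LAN
  volume peer
    type protocol/client
    option transport-type tcp
    option remote-host server2-cluster
    option remote-subvolume locks
  end-volume

  # replication happens here, on the server, over the cluster LAN
  volume replicate
    type cluster/replicate
    subvolumes locks peer
  end-volume

  # export the local brick to the peer and the replicated volume
  # to the clients
  volume server
    type protocol/server
    option transport-type tcp
    option auth.addr.locks.allow 192.168.10.*
    option auth.addr.replicate.allow *
    subvolumes locks replicate
  end-volume

Clients then either mount "replicate" over the glfs protocol, or the
server mounts it locally and unfsd exports that mount, which is the
NFS route I was talking about above.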
Performance translators help a lot, but unfortunately, last time I
tested, they destabilized things too much and I had to remove them.
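For reference, this is the sort of stack I mean, layered on top of the
client-side "storage" volume from the earlier sketch (option names from
memory, so double-check them against the docs):

  volume iothreads
    type performance/io-threads
    option thread-count 8
    subvolumes storage
  end-volume

  volume writebehind
    type performance/write-behind
    option flush-behind on
    subvolumes iothreads
  end-volume

  volume iocache
    type performance/io-cache
    option cache-size 64MB
    subvolumes writebehind
  end-volume

That's the kind of stack I ended up pulling out again.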
> I realise that this may not necessarily be appropriate discussion for
> the devel mailing list, but iirc, you're not on the user list, hence the
> reply here.
Yeah, I should probably sign up to the users list at some point. Most
of my posts are possible bug reports, which wouldn't be particularly
useful on the users list, so I never bothered signing up to it.
Gordan