[Gluster-devel] Suggestions

Hans K. Rosbach hk at isphuset.no
Wed Jun 8 14:02:27 UTC 2011


On Wed, 2011-06-08 at 12:34 +0100, Gordan Bobic wrote:
> Hans K. Rosbach wrote:
> 
> > -SCTP support, this might not be a silver bullet but it feels
> [...]
> 
> >  Features that might need glusterfs code changes:
> [...]
> >   -Multihoming (failover when one nic dies)
> 
> How is this different to what can be achieved (probably much more 
> cleanly) with NIC bonding?

NIC bonding works well on a small network, but it generally requires the
bonded links to sit on the same layer-2 segment, while multihoming can
fail over between paths on separate, routed networks. It is not something
I need myself, but I am sure it would be an advantage for some other
users; it could help in geo-replication setups, for example.

> [...]
> > -Ability to have the storage nodes autosync themselves.
> >  In our setup the normal nodes have 2x1Gbit connections while the
> >  storage boxes have 2x10Gbit connections, so having the storage
> >  boxes use their own bandwidth and resources to sync would be nice.
> 
> Sounds like you want server-side rather than client-side replication. 
> You could do this by using afr/replicate on the servers, and export via 
> NFS to the clients. Have failover handled as for any normal NFS server.

We have considered this and might go down that route eventually;
however, it seems strange that the same thing cannot also be done
using the native client.

Having each client write to both servers is fine, but having the
clients do the re-sync work whenever the storage nodes get out of
sync (because one of them rebooted, for example) seems strange and
feels unreliable, especially since it is a manual operation.

> > -An ability for the clients to subscribe to metadata updates for
> >  a specific directory would also be nice, so that it can cache that
> >  folders stats while working there and still know that it will not
> >  miss any changes. This would perhaps increase overhead in large
> >  clusters but could improve performance by a lot in clusters where
> >  several nodes work in the same folder (mail spool folder for example).
> 
> You have a shared mail spool on your nodes? How do you avoid race 
> conditions on deferred mail?

Several nodes can deliver mail to the spool folder, and dedicated queue
runners pick the messages up and deliver them to local and/or remote
hosts. I am not certain which race conditions you are referring to, but
locking should ensure that no more than one queue runner touches a given
file at a time. Am I missing something?
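To be concrete, the discipline I have in mind is roughly the sketch
below: each queue runner takes an exclusive POSIX lock on a spool file
before touching it and skips the file if another runner already holds
the lock (the file name is a placeholder, and it assumes fcntl()
locking works on the shared spool):

/* sketch: one queue runner at a time per spool file via fcntl locks */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    const char *msg = "/var/spool/mail-queue/msg.0001";  /* placeholder */
    int fd = open(msg, O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    struct flock fl = { 0 };
    fl.l_type   = F_WRLCK;          /* exclusive lock   */
    fl.l_whence = SEEK_SET;
    fl.l_start  = 0;
    fl.l_len    = 0;                /* 0 = whole file   */

    if (fcntl(fd, F_SETLK, &fl) < 0) {
        /* another runner owns this message: skip it, don't block */
        close(fd);
        return 0;
    }

    /* ... deliver the message, then remove it from the spool ... */

    fl.l_type = F_UNLCK;
    fcntl(fd, F_SETLK, &fl);
    close(fd);
    return 0;
}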

-HK




