[Gluster-devel] Custom Transport layers
Jeff Darcy
jdarcy at redhat.com
Fri Oct 28 14:46:28 UTC 2016
> Is it possible to write custom transport layers for gluster (for data
> transfer, not the management protocols)? Pointers to the existing code
> and/or docs :) would be helpful
Is it *possible*? Yes. Is it easy or well documented? Definitely no.
The two transports we have - TCP/UNIX-domain sockets and IB RDMA - are
both in rpc/rpc-transport in the source tree. They need to interact
with several other pieces:
 * generic RPC layer (rpc/rpc-lib)
 * event polling (event_register and friends)
 * server and client translators (xlators/protocol)
 * authentication pseudo-translators (xlators/protocol/auth)
Unfortunately, neither of the examples we have is well documented,
internally or externally, so a certain amount of reverse engineering
will be necessary to understand these interfaces.
> I'd like to experiment with broadcast UDP to see if it's feasible in
> local networks. It would be amazing if we could write at 1GB speeds
> simultaneously to all nodes.
That particular idea involves some extra complexity. Our current
communications model is all point-to-point request/response. Any
kind of broadcast or multicast would therefore involve changing
how we "think" about addressing. How does the user specify a
multicast group? How do we generate a client volfile with one
multicast client instead of several unicast ones? How do we track
multiple acknowledgements to a single outbound message, so that we
can enforce quorum and consistency? That's going to affect AFR as
well as the other components mentioned (neither EC nor JBR could
take advantage of this). How do we track which file descriptors
are still valid on the servers, and which need to be recovered?
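To make the acknowledgement-tracking question concrete, here is a minimal
sketch of the bookkeeping a multicast client would need for each outbound
message. It is not Gluster code: PendingRequest, its fields, and the status
strings are all hypothetical names, and it assumes every server in the group
sends back its own unicast reply tagged with the request's transaction ID.

```python
from dataclasses import dataclass, field


@dataclass
class PendingRequest:
    """One outbound multicast message awaiting per-server replies.

    Hypothetical structure for illustration; nothing here is Gluster API.
    """
    xid: int                 # transaction ID carried by request and replies
    expected: frozenset      # servers assumed to be in the multicast group
    quorum: int              # number of successful replies required
    acked: set = field(default_factory=set)
    failed: set = field(default_factory=set)

    def record(self, server, ok):
        """Record one reply, ignoring unknown servers and duplicates."""
        if server not in self.expected or server in self.acked | self.failed:
            return
        (self.acked if ok else self.failed).add(server)

    def status(self):
        """Return 'quorum-met', 'quorum-lost', or 'pending'."""
        if len(self.acked) >= self.quorum:
            return "quorum-met"
        outstanding = len(self.expected) - len(self.acked) - len(self.failed)
        if len(self.acked) + outstanding < self.quorum:
            # Even if every remaining server succeeds, quorum is unreachable.
            return "quorum-lost"
        return "pending"


req = PendingRequest(xid=1, expected=frozenset({"s1", "s2", "s3"}), quorum=2)
req.record("s1", True)
req.record("s2", False)
print(req.status())   # still waiting on s3
req.record("s3", True)
print(req.status())   # two successes out of three
```

With point-to-point connections this state lives implicitly in each
connection's own request/response pairing. With one multicast "connection"
it has to be kept explicitly per message, and something still has to handle
timeouts for servers that never answer, which this sketch leaves out.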
> Alternatively let me know if this has been tried and discarded as a bad
> idea ...
I'm not saying it's a bad idea, but it's quite a departure from the
communications model we have now. In a modern switched network, the
savings are only on the sender side; the switch has to copy the
packet to N receiver ports anyway. Server-side replication has that
same advantage, plus it can use a separate (often faster) network
for all but that first hop. If you want to help us improve traffic
flows, that's where I'd suggest most effort should be spent.
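A rough way to see the comparison is to count how many copies of a write
cross each part of the network. The sketch below is only arithmetic under
stated assumptions: link_loads and its key names are made up for
illustration, the server-side case assumes the first server fans out to the
remaining replicas itself over a separate back-end network, and protocol
overhead and acknowledgements are ignored.

```python
def link_loads(payload_bytes, replicas):
    """Bytes transmitted per write, by network segment, for three models.

    client_tx:  bytes the client puts on its own link
    switch_tx:  bytes the front-end switch sends out to server ports
    backend_tx: bytes sent between servers on a separate back-end network
    """
    return {
        "client-side unicast": {
            "client_tx": payload_bytes * replicas,
            "switch_tx": payload_bytes * replicas,
            "backend_tx": 0,
        },
        "multicast": {
            # Client sends once; the switch still copies to every receiver.
            "client_tx": payload_bytes,
            "switch_tx": payload_bytes * replicas,
            "backend_tx": 0,
        },
        "server-side replication": {
            # Only the first hop uses the front-end network.
            "client_tx": payload_bytes,
            "switch_tx": payload_bytes,
            "backend_tx": payload_bytes * (replicas - 1),
        },
    }


for model, load in link_loads(payload_bytes=1_000_000, replicas=3).items():
    print(f"{model:24s} {load}")
```

In this model multicast and server-side replication give the client the
same saving (one copy sent instead of three), and the difference is only in
where the remaining copies travel.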