[Gluster-devel] Brick multiplexing approaches
Jeff Darcy
jdarcy at redhat.com
Tue Jun 14 15:49:39 UTC 2016
> There is *currently* no graph switch on the bricks (as I understand it).
> Configuration changes, yes, but no graph switch, as the xlator pipeline
> is fixed; if that changes, the bricks need to be restarted. Others can
> correct me if I am wrong.
>
> Noting the above here, as it may not be that big a deal. Also noting the
> 'currently' in bold, as the future could mean something different.
I'd say it certainly will. Adding bricks to JBR or DHT2, which have
server-side components sensitive to such changes, practically requires it.
> The sub-graph model seems best for certain other things, like letting
> the master xlator preserve the inode and other tables as is. It does
> introduce the onus of keeping the inodes (and fds) current on the
> xlators, though (limiting this to a sub-graph at the sub-brick level is
> possible, but that would water the concept down). This needs some more
> thought, but I do like the direction.
The inode table's kind of a weird beast. In a multiplexing model it
doesn't really belong on the server translator, but it belongs even
less on the next translator down because the server translator's the
one that uses it. Really, the server translator is such a special
case in so many ways that I'm not even sure it's a translator at
all. What we kind of need is a first-class abstraction of a
sub-graph or translator stack that can be attached to a server
translator. Such an object would "contain" not just the graph but
also the inode table, auth info, and probably other stuff (e.g. QoS
info) as well. All this refactoring is inconvenient, but I don't
see any way around it if we want to support much larger numbers of
volumes on relatively small numbers of nodes (as e.g. container
workloads require).
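
To make that concrete, here's a rough sketch (in C, since that's what
we'd write it in) of what such a first-class sub-graph object might
look like. To be clear, nothing like this exists in the tree today;
the names (glusterfs_subgraph_t, sg_attach, sg_detach) and fields are
all hypothetical, just to illustrate the shape of the abstraction:

    #include "xlator.h"   /* xlator_t */
    #include "inode.h"    /* inode_table_t */
    #include "list.h"     /* struct list_head */

    /* Hypothetical per-volume sub-graph object.  Everything that is
     * really per-volume state moves out of the server translator and
     * into here, so one server xlator can host many of these. */
    typedef struct {
            char             *volname;    /* volume this sub-graph serves */
            xlator_t         *top;        /* root of the translator stack */
            inode_table_t    *itable;     /* per-volume inode table */
            void             *auth_info;  /* per-volume auth configuration */
            void             *qos_info;   /* per-volume QoS information */
            struct list_head  list;       /* siblings on the same server */
    } glusterfs_subgraph_t;

    /* Attach a sub-graph to a multiplexed server translator.  The
     * server routes each incoming RPC to the right sub-graph (e.g. by
     * volume name) and uses that sub-graph's own inode table. */
    int sg_attach (xlator_t *server, glusterfs_subgraph_t *sg);

    /* Detach on a graph switch or volume stop.  The inode and fd
     * tables travel with the sub-graph instead of being torn down
     * and rebuilt. */
    int sg_detach (xlator_t *server, glusterfs_subgraph_t *sg);

The important property is that the inode table (and eventually fd
tables, auth, QoS) lives with the sub-graph rather than with the
server translator, so stopping or switching one volume's graph
doesn't disturb the others sharing the same process.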