[Gluster-devel] Data classification proposal

Tue Jun 24 11:57:37 UTC 2014

> Am I right in understanding that the value for media-type is not
> interpreted beyond the scope of matching rules? That is to say, we
> don't need/have any notion of media-types that type check internally
> for forming (sub)volumes using the rules specified.

Exactly.  To us it's just an opaque ID.

> Should the no. of bricks or lower-level subvolumes that match the rule
> be an exact multiple of group-size?

Good question.  I think users see the current requirement to add bricks
in multiples of the replica/stripe size as an annoyance.  This will only
get worse with erasure coding where the group size is larger.  On the
other hand, we do need to make sure that members of a group are on
different machines.  This is why I think we need to be able to split
bricks, so that we can use overlapping replica/erasure sets.  For
example, if we have five bricks and two-way replication, we can split
bricks to get a multiple of two and life's good again.  So *long term* I
think we can/should remove any restriction on users, but there are a
whole bunch of unsolved issues around brick splitting.  I'm not sure
what to do in the short term.
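
To make the five-brick example concrete, here is a rough sketch in
Python (not GlusterFS code) of one way overlapping replica sets could be
laid out.  It assumes one brick per machine and splits every brick into
as many pieces as the replica count; the function name and the
"/partN" piece naming are made up for illustration.

```python
def overlapping_replica_sets(bricks, replica=2):
    """Split each brick into `replica` pieces and chain them in a ring.

    Set i takes one piece from bricks i, i+1, ..., i+replica-1 (mod n),
    so every set spans `replica` distinct bricks and every brick
    contributes exactly `replica` pieces.  Hypothetical layout only.
    """
    n = len(bricks)
    if replica > n:
        raise ValueError("need at least as many bricks as replicas")
    sets = []
    for i in range(n):
        members = []
        for j in range(replica):
            brick = bricks[(i + j) % n]
            # "/partN" is an invented name for one piece of a split brick
            members.append(f"{brick}/part{j}")
        sets.append(members)
    return sets


if __name__ == "__main__":
    for s in overlapping_replica_sets(["b0", "b1", "b2", "b3", "b4"]):
        print(s)
```

With five bricks and two-way replication this yields five replica sets
over ten pieces, and no set has both members on the same brick.  None of
the unsolved brick-splitting issues are addressed here.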

> > Here's a more complex example that adds replication and erasure
> > coding to the mix.
> >
> >     # Assume 20 hosts, four fast and sixteen slow (named
> >     # appropriately).
> >
> >     rule tier-1
> >             select *fast*
> >             group-size 2
> >             type cluster/afr
> >
> >     rule tier-2
> >             # special pattern matching otherwise-unused bricks
> >             select %{unclaimed}
> >             group-size 8
> >             type cluster/ec parity=2
> >             # i.e. two groups, each six data plus two parity
> >
> >     rule all
> >             select tier-1
> >             select tier-2
> >             type features/tiering
> >
>
> In the above example we would have 2 subvolumes each containing 2
> bricks that would be aggregated by rule tier-1. Let's call those
> subvolumes tier-1-fast-0 and tier-1-fast-1.  Both of these subvolumes
> are afr based two-way replicated subvolumes.  Are these instances of
> tier-1-* composed using cluster/dht by the default semantics?

Yes.  Any time we have multiple subvolumes and no other specified way to
combine them into one, we just slap DHT on top.  We do this already at
the top level; with data classification we might do it at lower levels
too.
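
As an illustration of that default, here is a small Python sketch of how
a rule might be evaluated.  It is only a model of the proposal, not
glusterd code: apply_rule, the dict layout, the host names, and the
"<rule>-<n>" subvolume names are all assumptions.  It also assumes the
strict short-term behaviour (matches must be an exact multiple of
group-size), which is still an open question.

```python
from fnmatch import fnmatchcase


def apply_rule(name, pattern, group_size, xlator, bricks, claimed):
    """Group matching bricks; put DHT on top if more than one group results."""
    if pattern == "%{unclaimed}":
        # special pattern: everything no earlier rule has selected
        matched = [b for b in bricks if b not in claimed]
    else:
        # brick names are treated as opaque strings, matched by glob only
        matched = [b for b in bricks if fnmatchcase(b, pattern)]
    if len(matched) % group_size:
        # assumption: reject partial groups rather than split bricks
        raise ValueError(f"{name}: {len(matched)} bricks, group-size {group_size}")
    claimed.update(matched)
    groups = [
        {"name": f"{name}-{i // group_size}", "type": xlator,
         "subvolumes": matched[i:i + group_size]}
        for i in range(0, len(matched), group_size)
    ]
    if len(groups) == 1:
        return groups[0]
    # no other way specified to combine the groups, so default to DHT
    return {"name": name, "type": "cluster/dht", "subvolumes": groups}


# 20 hypothetical hosts: four fast, sixteen slow
bricks = [f"host-fast-{i}" for i in range(4)] + [f"host-slow-{i}" for i in range(16)]
claimed = set()
tier1 = apply_rule("tier-1", "*fast*", 2, "cluster/afr", bricks, claimed)
tier2 = apply_rule("tier-2", "%{unclaimed}", 8, "cluster/ec", bricks, claimed)
```

Here tier1 comes out as a cluster/dht node over two cluster/afr pairs,
and tier2 as a cluster/dht node over two eight-brick cluster/ec groups,
which is the shape the example rules describe.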