[Gluster-devel] Data classification proposal

Jeff Darcy jdarcy at redhat.com
Mon Jun 23 23:11:44 UTC 2014


> Rather than using the keyword "unclaimed", my instinct was to
> explicitly list which bricks have not been "claimed".  Perhaps you
> have something more subtle in mind, it is not apparent to me from your
> response. Can you provide an example of why it is necessary and a list
> could not be provided in its place? If the list is somehow "difficult
> to figure out", due to a particularly complex setup or some such, I'd
> prefer a CLI/GUI build that list rather than having sysadmins
> hand-edit this file.

It's not *difficult* to make sure every brick has been enumerated by
some rule, and that there are no overlaps, but it's certainly tedious
and error prone.  Imagine that a user has four has bricks in four
machines, using names like serv1-b1, serv1-b2, ..., serv4-b6.
Accordingly, they've set up rules to put serv1* into one set and
serv[234]* into another set (which is already more flexibility than I
think your proposal gave them).  Now when they add serv5 they need an
extra step to add it to the tiering config, which wouldn't have been
necessary if we supported defaults.  What percentage of users would
forget that step at least once?  I don't know for sure, but I'd guess
it's pretty high.

Having a CLI or GUI create configs just means that we have to add
support for defaults there instead.  We'd still have to implement the
same logic, they'd still have to specify the same thing.  That just
seems like moving the problem around instead of solving it.

> The key-value piece seems like syntactic sugar - an "alias". If so,
> let the name itself be the alias. No notions of SSD or physical
> location need be inserted. Unless I am missing that it *is* necessary,
> I stand by that value judgement as a philosophy of not putting
> anything into the configuration file that you don't require. Can you
> provide an example of where it is necessary?

OK...

  * some files should go on ssds in rack 4

  * some files should go on ssds anywhere

  * some files should go anywhere in rack 4

  * some files we just don't care

Notice how the rules *overlap*.  We can't support that if our syntax
only allows the user to express a list (or list of lists).  If the list
is ordered by type, we can't also support location-based rules.  If the
list is ordered by location, we lose type-based rules instead.   Brick
properties create a matrix, with an unknown number of dimensions (e.g.
security level, tenant ID, and so on as well as type and location).  The
logical way to represent such a space for rule-matching purposes is to
let users define however many dimensions (keys) as they want and as many
values for each dimension as they want.

Whether the exact string "type" or "unclaimed" appears anywhere isn't
the issue.  What matters is that the *semantics* of assigning properties
to a brick have to be more sophisticated than just assigning each a
position in a list, and we need a syntax that supports those semantics.
Otherwise we'll end up solving the same UX problems again and again each
time we add a feature that involves treating bricks or data differently.
Each time we'll probably do it a little differently and confuse users a
little more, if history is any guide.  That's what I'd rather avoid.


More information about the Gluster-devel mailing list