[Gluster-devel] Data classification proposal
Xavier Hernandez
xhernandez at datalab.es
Wed Jun 25 14:19:55 UTC 2014
On Wednesday 25 June 2014 08:35:05 Jeff Darcy wrote:
> > For the short-term, wouldn't it be OK to disallow adding bricks that
> > is not a multiple of group-size?
>
> In the *very* short term, yes. However, I think that will quickly
> become an issue for users who try to deploy erasure coding because those
> group sizes will be quite large. As soon as we implement tiering, our
> very next task - perhaps even before tiering gets into a release -
> should be to implement automatic brick splitting. That will bring other
> benefits as well, such as variable replication levels to handle the
> sanlock case, or overlapping replica sets to spread a failed brick's
> load over more peers.
If I understand the proposed data-classification architecture correctly, each
server will have a number of bricks that will be dynamically modified as
needed: as more data-classifying conditions are defined, a new layer of
translators will be added (a new DHT or AFR, or something else) and some or
all existing bricks will be split to accommodate the new, and possibly
overlapping, condition.
How will space be allocated to each new sub-brick? Through some sort of thin
provisioning, or will it be distributed evenly on each split?
With thin provisioning, it will be hard to determine the real available space.
With a fixed amount, we can get to scenarios where a file cannot be
written even if there seems to be enough free space. This can already happen
today when writing very big files to almost full bricks. I think brick
splitting can accentuate this.
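
A minimal sketch of the fixed-allocation scenario, in Python. The sizes and the `can_place` helper are made up for illustration and are not part of GlusterFS; the only assumption is that a file must fit entirely within a single sub-brick.

```python
def can_place(file_size, free_per_subbrick):
    """Return True if at least one sub-brick can hold the whole file."""
    return any(free >= file_size for free in free_per_subbrick)

# Hypothetical brick with 40 GB free, before and after an even 4-way split.
unsplit = [40]
split = [10, 10, 10, 10]
file_size = 15  # GB

# Aggregate free space is identical in both layouts...
total_free_unsplit = sum(unsplit)
total_free_split = sum(split)

# ...but only the unsplit brick can actually accept the file.
fits_unsplit = can_place(file_size, unsplit)
fits_split = can_place(file_size, split)
print(total_free_unsplit, total_free_split, fits_unsplit, fits_split)
```

Both layouts report the same total free space, yet under this assumption the 15 GB file is only writable before the split.
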
Also, the addition of multiple layered DHT translators, as DHT is implemented
today, could add a lot more latency, especially on directory listings.
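
As a rough illustration of the listing concern, the sketch below counts how many bottom-level subvolumes one directory listing would have to visit if every DHT layer queries all of its children. The uniform tree model and the `leaf_readdirs` function are assumptions for illustration only, not a description of the actual translator code.

```python
def leaf_readdirs(fanouts):
    """Bottom-level subvolumes visited by one listing, assuming each
    layer queries all of its children (hypothetical model)."""
    total = 1
    for fanout in fanouts:
        total *= fanout
    return total

# One DHT layer over 4 bricks.
single_layer = leaf_readdirs([4])

# Same 4 bricks, each split into 3 sub-bricks under a second DHT layer.
two_layers = leaf_readdirs([4, 3])

print(single_layer, two_layers)
```

In this model the number of subvolumes touched is the product of the fan-outs of all layers, so every added layer multiplies the work of a listing.
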
Another problem I see is that splitting bricks will require a rebalance, which
is a costly operation. It doesn't seem right to require such an expensive
operation every time you add a new condition to an already created volume.
Maybe I've missed something important?
Thanks,
Xavi