[Gluster-users] PLEASE READ ! We need your opinion. GSOC-2014 and the Gluster community

Carlos Capriotti capriotti.carlos at gmail.com
Thu Mar 13 12:58:22 UTC 2014


Now, to the tech talk.

1) Striped, as far as I could see, already works on a block level and it
uses disk space in a more sensible way. Auto-balance tends to be simpler,
or quicker, and resilience is embedded on each node. Other solutions in the
market do similar things and that works REALLY well. We have a fear of it
because we have a "split-brain" trauma, I am afraid. But RAIDZ-like
solutions are everywhere, as far as I can see, with excellent results. This
is a complete makeover, which (potentially) radically changes how stripes
and redundancy work. In my opinion, much more attractive to the market.
Can you imagine a structure like AWS with higher data protection, using 25%
less power and hardware ? Or, using that hardware to power other
applications ?
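
To make the RAID5-like idea a bit more concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions: two data fragments plus one XOR parity fragment spread over three bricks, which is NOT necessarily the algorithm the disperse project uses. The function names are made up for the example, and the 4-node figure is what I assume the 25% is compared against (a 2x2 stripe+replica layout).

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, data_bricks: int = 2) -> List[bytes]:
    """Split data into equal fragments and append one XOR parity fragment."""
    size = -(-len(data) // data_bricks)  # ceiling division
    padded = data.ljust(size * data_bricks, b"\0")
    fragments = [padded[i * size:(i + 1) * size] for i in range(data_bricks)]
    return fragments + [reduce(xor_bytes, fragments)]


def decode(bricks: List[Optional[bytes]], length: int) -> bytes:
    """Rebuild the data even if one brick (marked None) is lost."""
    missing = [i for i, b in enumerate(bricks) if b is None]
    if len(missing) > 1:
        raise ValueError("single parity survives only one lost brick")
    if missing:
        survivors = [b for b in bricks if b is not None]
        bricks = list(bricks)
        bricks[missing[0]] = reduce(xor_bytes, survivors)
    # The last brick is parity; the others hold the (padded) data.
    return b"".join(bricks[:-1])[:length]


payload = b"hello gluster!"
bricks = encode(payload)      # three bricks: two data, one parity
bricks[0] = None              # simulate losing one node
restored = decode(bricks, len(payload))

# Assumed comparison: 4 nodes for stripe+replica vs. 3 nodes here.
replica_nodes, disperse_nodes = 4, 3
saving = 1 - disperse_nodes / replica_nodes
```

Any single brick can disappear and the data is still recoverable, which is the whole point: one fewer node, same tolerance to one failure.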

2) "My" number 2 (which is not mine at all. It is just a suggestion), is
only valid for distributed/replicated volumes, because they rely on copies
of the files. With the proposed change on the stripe algorithm, split-brain
arbitration may become redundant.
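
Just to illustrate what automatic arbitration could look like on replicated volumes, here is a rough Python sketch of a majority vote over checksums of the copies. Everything in it is an assumption for the sake of discussion: the function names are invented, and comparing content checksums is not how Gluster's replication actually tracks pending changes. It only shows why at least three copies (or a third voter) are needed to break a tie.

```python
import hashlib
from collections import Counter
from typing import Dict, List, Optional


def checksum(content: bytes) -> str:
    """Fingerprint one copy of a file."""
    return hashlib.sha256(content).hexdigest()


def arbitrate(copies: Dict[str, bytes]) -> Optional[str]:
    """Return the checksum held by a strict majority of bricks, or None."""
    if not copies:
        return None
    votes = Counter(checksum(c) for c in copies.values())
    winner, count = votes.most_common(1)[0]
    return winner if count > len(copies) / 2 else None


def bricks_to_heal(copies: Dict[str, bytes]) -> Optional[List[str]]:
    """List bricks whose copy disagrees with the majority.

    Returns None when there is no majority, i.e. a real split-brain
    that still needs a human (or some other tie-breaker).
    """
    good = arbitrate(copies)
    if good is None:
        return None
    return sorted(name for name, c in copies.items() if checksum(c) != good)


# Three copies, one stale: the odd one out gets healed.
three_way = {"node1": b"v2", "node2": b"v2", "node3": b"v1"}
heal_list = bricks_to_heal(three_way)

# Two copies that disagree: nobody can decide automatically.
two_way = {"node1": b"v1", "node2": b"v2"}
undecided = bricks_to_heal(two_way)
```

With two replicas the vote is always a tie when the copies differ, which is exactly the split-brain we keep running into.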

3) We have a problem with small files, or small chunks of files.
Performance may be slow. For the VM universe this makes a LOT of
difference. It may also be VERY attractive to the database/big data world.
But, since this is just a figment of my imagination so far, let's leave it
like this. That is why it is in the last position anyway.
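
For whoever wants to picture it, the "keep the active chunks on a fast node" part could be sketched in Python as a small least-recently-used cache sitting in front of a slow volume. This is purely hypothetical: the class name, the `backend_read` callback and the chunk addressing by (path, index) are all made up here, and a real accelerator would also have to deal with writes, logs and cache coherency.

```python
from collections import OrderedDict
from typing import Callable, Tuple


class HotChunkCache:
    """Keep the most recently used chunks in RAM, evict the oldest."""

    def __init__(self, backend_read: Callable[[str, int], bytes],
                 capacity: int = 2) -> None:
        self._read = backend_read        # slow path: the regular volume
        self._capacity = capacity        # how many chunks fit in RAM
        self._chunks: "OrderedDict[Tuple[str, int], bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, path: str, index: int) -> bytes:
        key = (path, index)
        if key in self._chunks:
            self._chunks.move_to_end(key)    # mark as most recently used
            self.hits += 1
            return self._chunks[key]
        self.misses += 1
        data = self._read(path, index)
        self._chunks[key] = data
        if len(self._chunks) > self._capacity:
            self._chunks.popitem(last=False)  # drop least recently used
        return data


def slow_volume(path: str, index: int) -> bytes:
    """Stand-in for a read from the regular (slow) bricks."""
    return f"{path}#{index}".encode()


cache = HotChunkCache(slow_volume, capacity=2)
for chunk in (0, 0, 1, 2, 0):
    cache.read("vm.img", chunk)
```

Only the second read of chunk 0 is served from RAM in this run; by the time chunk 0 is requested again it has already been evicted, which shows how much the cache size matters for small random reads.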

But, THANKS for joining the discussion ! THAT is the community
participation we have to have.




On Thu, Mar 13, 2014 at 12:29 PM, Bernhard Glomm <bernhard.glomm at ecologic.eu> wrote:

> Thanks, Carlos, for your ambition.
> I'm not much of a developer, but for what it's worth, here are my thoughts:
>
> Your #1) sounds great, but does that mean object store or will it still be
> whole files that are handled?
> Up to now I loved the feeling of having at least my files on the bricks if
> something went really wrong, and
> not ending up with a huge number of xMB-sized snippets splattered around
> (well, I know there are pros and cons
> depending on scenario and size, but still...)
>
> Your #2) could be incorporated in #1) somehow? "minimum 3 bricks, 1 per
> node" with kind of quorum mechanism???
>
> As #3) I would vote for reliable encryption to be able to use third party
> storage.
> Just speed??? That is affected by too many other things: bandwidth, speed of
> underlying storage, different speeds of different bricks...
> So speed I would vote down to rank #4) ;-)
>
> Bernhard
>
> On 13.03.2014 12:10:17, Carlos Capriotti wrote:
>
> Hello, all.
>
> I am a little bit impressed by the lack of action on this topic. I hate to
> be "that guy", especially being new here, but it has to be done.
>
> If I've got this right, we have here a chance of developing Gluster even
> further, sponsored by Google, with a dedicated programmer for the summer.
>
> In other words, if we play our cards right, we can get a free programmer
> and at least a good start/advance on this fantastic.
>
> Well, I've checked the trello board, and there is a fair amount of things
> there.
>
> There are a couple of things that are not there as well.
>
> I think it would be nice to listen to the COMMUNITY (yes, that means YOU),
> for either suggestions, or at least a vote.
>
> My opinion, being also my vote, in order of PERSONAL preference:
>
> 1) There is a project going on (https://forge.gluster.org/disperse), that
> consists of rewriting the stripe module in gluster. This is especially
> important because it has a HUGE impact on Total Cost of Implementation
> (customer side), Total Cost of Ownership, and also on matching what the
> competition has to offer. Among other things, it would allow gluster to
> implement a RAIDZ/RAID5 type of fault tolerance, much more efficient, and
> would, as far as I understand, allow you to use 3 nodes as a minimum for
> stripe+replication. This means 25% less money in computer hardware, with
> increased data safety/resilience.
>
> 2) We have a recurring issue with split-brain resolution. There is an entry
> on trello asking/suggesting a mechanism that arbitrates this resolution
> automatically. I pretty much think this could come together with another
> solution, which is a file replication consistency check.
>
> 3) Accelerator node project. Some storage solutions out there offer an
> "accelerator node", which is, in short, an extra node with a lot of RAM,
> possibly fast disks (SSD), and that works like a proxy to the regular
> volumes. Active chunks of files are moved there, logs (ZIL style) are
> recorded on fast media, among other things. There is NO active project for
> this, or trello entry, because it is something I started discussing with a
> few fellows just a couple of days ago. I thought of starting to play with
> RAM disks (tmpfs) as scratch disks, but, since we have an opportunity to do
> something more efficient, or at the very least start it, why not ?
>
> Now, c'mon ! Time is running out. We need hands on deck here, for a simple
> vote !
>
> Can you share 3 lines with your thoughts ?
>
> Thanks
>
>
>