[Gluster-users] Distributed-replicated vs striped-replicated.. some basic questions
    Aronesty, Erik 
    earonesty at expressionanalysis.com
       
    Tue Aug 26 14:46:21 UTC 2014
    
    
  
We have a situation where 
- write performance is important
- median file size is more than 1GB
1. Is striping the way to go for this data, to get better write speed?
Currently I have a 2x2 distributed replicated volume, which, for a single file, has 2x read performance, but around 1x performance for writes.
2. In this scenario, would a "distributed, 2x replicated, 4x striped" volume give the fastest performance while still tolerating a single-node failure and making it easy to add new nodes?   What would striping without distributing do?
3. Is striping still "beta" code?   Should I avoid it until the feature is stable?
4. Is erasure coding ever going to be on the table for a production release (eliminating the need for replicas in striped storage)?
5. Is there such a thing as a gluster "meta-volume", which combines multiple gluster volumes into a single one and automatically moves directories and/or files between faster and slower volumes based on frequency and type of use?
I can imagine a nice, easily programmable ruleset:
Directories with lots of small files? Mark them as distributed/replicated.  Giant files that get read very frequently?  Striped for sure.  A known file name or regex that will always be a huge write?  Mark it as striped.   Files that are hardly ever touched?   Move them to the slow storage, and so on.   I imagine something like this has already been done by someone.
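
To make the idea concrete, here is a rough sketch of what such a ruleset might look like in Python. This is purely hypothetical: it is not an existing gluster feature or API, and the names (FileStats, Rule, choose_volume), the volume labels, the file-name pattern and all thresholds are assumptions made up for illustration. The first matching rule wins.

```python
import re
from dataclasses import dataclass
from typing import Callable, List

GIB = 1024 ** 3
MIB = 1024 ** 2


@dataclass
class FileStats:
    """Hypothetical per-file usage statistics a placement engine might collect."""
    path: str
    size_bytes: int
    reads_per_day: float
    days_since_access: int
    files_in_dir: int


@dataclass
class Rule:
    name: str
    matches: Callable[[FileStats], bool]
    target: str  # label of the (hypothetical) backing volume


# Illustrative pattern for files known to always be huge writes (assumption).
HUGE_WRITE_PATTERN = re.compile(r"\.bigdata$")

# Rules are checked in order; thresholds are arbitrary examples.
RULES: List[Rule] = [
    Rule("known huge write",
         lambda f: HUGE_WRITE_PATTERN.search(f.path) is not None,
         "striped"),
    Rule("hardly ever touched",
         lambda f: f.days_since_access > 180,
         "slow"),
    Rule("giant and read frequently",
         lambda f: f.size_bytes >= GIB and f.reads_per_day >= 10,
         "striped"),
    Rule("directory of many small files",
         lambda f: f.files_in_dir >= 1000 and f.size_bytes < MIB,
         "distributed-replicated"),
]

DEFAULT_TARGET = "distributed-replicated"


def choose_volume(stats: FileStats) -> str:
    """Return the target volume label for a file, using the first matching rule."""
    for rule in RULES:
        if rule.matches(stats):
            return rule.target
    return DEFAULT_TARGET
```

For example, choose_volume(FileStats("/data/a.bin", 2 * GIB, 50, 1, 3)) falls through to the "giant and read frequently" rule and returns "striped". Actually moving the data between volumes is the hard part and is not covered here.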
    
    