[Gluster-users] Max recommended brick size of 100 TB

Mon Feb 23 13:21:13 UTC 2015

Hi Frank,

Thanks for your answer. On the ZFS side I've got everything pretty much sorted out (which RAIDZ level to use, ZIL or not, compression, etc.). To answer your questions: I will start small with only two nodes in replicated mode and add more nodes as needed, switching to distributed-replicated mode at that point. I plan to use a dedicated private 10 Gbit/s fiber network just between the nodes for GlusterFS.

So what I am really wondering now is: should I have two bricks per node or just one, knowing that I only have one HBA controller per node? If I have two bricks on one node, they would still be located on exactly the same controller, just on two different ZFS datasets.
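
To make the two layouts concrete, here is a minimal Python sketch that only builds the `gluster volume create` command line for each case. The volume name, host names and brick paths are made up for illustration, and the helper names are my own. The one thing it relies on is that with `replica 2` each consecutive pair of bricks on the command line forms one replica set, so the two copies must be listed on different nodes.

```python
def brick_order(nodes, brick_paths, replica=2):
    """Order bricks so that every consecutive group of `replica` bricks
    (one replica set) is spread over different nodes."""
    if len(nodes) % replica != 0:
        raise ValueError("number of nodes must be a multiple of the replica count")
    ordered = []
    # Walk the nodes in groups of `replica`; each group mirrors each other.
    for start in range(0, len(nodes), replica):
        group = nodes[start:start + replica]
        for path in brick_paths:
            # One replica set: the same brick path on every node of the group.
            ordered.extend(f"{node}:{path}" for node in group)
    return ordered


def create_command(volume, nodes, brick_paths, replica=2):
    """Build (not run) the gluster CLI command for the given layout."""
    bricks = brick_order(nodes, brick_paths, replica)
    return " ".join(["gluster", "volume", "create", volume,
                     "replica", str(replica), *bricks])


# Hypothetical names, for illustration only.
nodes = ["node1", "node2"]
one_brick = create_command("gv0", nodes, ["/tank/brick1/data"])
two_bricks = create_command("gv0", nodes,
                            ["/tank/brick1/data", "/tank/brick2/data"])
print(one_brick)
print(two_bricks)
```

With two bricks per node the volume becomes distributed-replicated (two replica sets) even though there are still only two nodes.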

On Monday, February 23, 2015 2:14 PM, Frank Rothenstein <f.rothenstein at bodden-kliniken.de> wrote:
Hi,

I read your mails but I'm not sure if I fully understand your planned
setup. How many Gluster nodes have you planned? How fast will the
interconnects between them be?
I think at first you should test one node by itself with all the
options that ZFS has to offer (like compression or even deduplication
and also the RAIDZ1/2/3 option). Then of course consider your needs:
more capacity vs. higher throughput (meaning one or more bricks per
node). And also don't forget the L2ARC and log device options offered
by ZFS to speed up the backend.
And then test the GlusterFS. Adding another dataset/brick is always
possible, so you can easily benchmark your system.
If you stay at the same replica count there should be no loss in size
whether you set up 2 or 4 or 8 bricks (for rep 2), afaik.
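
As a rough sanity check of that, here is a small Python sketch. The
function names are made up for illustration, the 3 x 12 x 6 TB RAIDZ-2
numbers are taken from the setup described further down in this thread,
and ZFS metadata overhead and TB/TiB differences are ignored, so treat
the results as approximate.

```python
def raidz_usable_tb(vdevs, disks_per_vdev, disk_tb, parity):
    """Approximate usable capacity of a pool made of identical RAIDZ vdevs."""
    return vdevs * (disks_per_vdev - parity) * disk_tb


def volume_usable_tb(brick_sizes_tb, replica):
    """Usable capacity of a replicated Gluster volume: every file is
    stored `replica` times, so divide the total brick space by it."""
    if len(brick_sizes_tb) % replica != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    return sum(brick_sizes_tb) / replica


# One node: 3 RAIDZ-2 vdevs of 12 disks, 6 TB per disk.
per_node = raidz_usable_tb(vdevs=3, disks_per_vdev=12, disk_tb=6, parity=2)

# Two nodes, replica 2: one big brick per node vs. two half-size bricks.
one_brick_per_node = volume_usable_tb([per_node] * 2, replica=2)
two_bricks_per_node = volume_usable_tb([per_node / 2] * 4, replica=2)

print(per_node, one_brick_per_node, two_bricks_per_node)
```

Both layouts end up with the same usable size, because the total brick
space and the replica count are unchanged.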

I think there is no real advice for your setup; ZFS on Linux is not that
common, and to "gluster" it even less so...

I hope my English is good enough to let you understand what I mean ;)

Greetings, Frank

On Monday, 23.02.2015, at 08:47 +0000, ML mail wrote:
> Just saw that my post below never got a reply and would be very glad if someone, maybe Niels?, could comment on this. Cheers!
> 
> On Saturday, February 7, 2015 10:13 PM, ML mail <mlnospam at yahoo.com> wrote:
> Thank you Niels for your input, that definitely makes me more curious... Now let me tell you a bit more about my intended setup. The first major difference is that I will not be using XFS but ZFS. The second major difference is that I will not be using any hardware RAID card but one single HBA (LSI 3008 chip). My intended ZFS setup would consist of one ZFS pool per node. This pool will have 3 virtual devices of 12 disks each (6 TB per disk), each using RAIDZ-2 (equivalent to RAID 6) for integrity. This gives me a total of 36 disks, i.e. 216 TB raw and about 180 TB usable after RAIDZ-2 parity.
> 
> 
> 
> I will then create one big 180 TB ZFS data set (virtual device, file system or whatever you want to call it) for my GlusterFS brick. Now as mentioned I could also have two bricks by creating two ZFS data sets of around 90 TB each. But as everything is behind the same HBA and in the same ZFS pool, there will not be any gain in performance or availability from the ZFS side.
> 
> On the other hand, you mention in your mail that having two bricks per node means having two glusterfsd processes running, which allows me to handle more clients. Can you tell me more about that? Will I also see any general performance gain, for example in terms of MB/s throughput? Also, are there any disadvantages of running two bricks on the same node, especially in my case?
> 
> On Saturday, February 7, 2015 10:24 AM, Niels de Vos <ndevos at redhat.com> wrote:
> On Fri, Feb 06, 2015 at 05:06:38PM +0000, ML mail wrote:
> 
> > Hello,
> > 
> > I read in the Gluster Getting Started leaflet
> > (https://lists.gnu.org/archive/html/gluster-devel/2014-01/pdf3IS0tQgBE0.pdf)
> > that the max recommended brick size should be 100 TB.
> > 
> > Once my storage server nodes are filled up with disks they will have in
> > total 192 TB of storage space, does this mean I should create two
> > bricks per storage server node?
> > 
> > Note here that these two bricks would still be on the same controller
> > so I don't really see the point or advantage of having two 100 TB
> > bricks instead of one single brick of 200 TB per node. But maybe
> > someone can explain the rationale here?
> 
> This is based on the recommendation that RHEL has for maximum size of
> XFS filesystems. They might have adjusted the size with more recent
> releases, though.
> 
> However, having multiple bricks per server can help with other things
> too. Multiple processes (one per brick) could handle more clients at the
> same time. Depending on how you configure your RAID for the bricks, you
> could possibly reduce the performance loss while a RAID-set gets rebuilt
> after a disk loss.
> 
> Best practice seems to be to use 12 disks per RAID-set, mostly RAID10 or
> RAID6 is advised.
> 
> HTH,
> Niels 
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
> 
