[Gluster-users] JBOD / ZFS / Flash backed

Alex Crow acrow at integrafin.co.uk
Thu Apr 12 19:25:15 UTC 2018


On 09/04/18 22:15, Vincent Royer wrote:
> Thanks,
>
> The 3 servers are new Lenovo units with redundant PSUs backed by two 
> huge UPS units (one for each bank of power supplies).  I think the 
> chances of losing two nodes are incredibly slim, and in that case a 
> Disaster Recovery from offsite backups would be reasonable.
>
> My requirements are about 2TB, highly available (so that I can reboot 
> one of the 3 servers without taking down services).
>
> Beyond that my focus is high performance for small I/O.
This can be a difficult case for GlusterFS if you mean "small files", 
as metadata lookups are relatively costly (there is no separate MDS 
with an in-memory or memory-cached database). Gluster is ideally 
suited to large files, and small I/O within those files should be OK. 
Just speaking from experience: it should be fine for VMs with such 
loads, especially if you shard.
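
For the VM case, the usual knob is the shard translator. A minimal 
sketch, assuming a fresh replica volume named "gv0" (the volume name 
and the 64MB block size are placeholders; check the defaults on your 
release):

  # the "virt" option group applies the usual VM tunings
  gluster volume set gv0 group virt

  # or turn on sharding explicitly; do this before writing any data,
  # as toggling it on a volume that already holds files is unsafe
  gluster volume set gv0 features.shard on
  gluster volume set gv0 features.shard-block-size 64MB

Sharding splits each image into fixed-size pieces, so a heal after a 
reboot only touches the changed shards rather than whole multi-GB 
files.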
>
> So I could do a single 2TB SSD per server, or two, or many more if 
> that is "what is required".  But I don't want to waste money...

Resilience is never a waste. Skimping may well prove to be a waste of 
*your time* when you get woken up at 3am and have to fix a downed 
system. Your call entirely. I'm too old for that kind of thing, so I 
tend to push for both per-server and per-cluster redundancy. It usually 
gets approved after something "unexpected" happens the first time.

Gluster and ZFS will be fine with onboard controllers; if you have 
enough ports you'll be just fine. If you need more, buy HBAs for your 
PCIe slots: M1015s and M1115s on eBay perform very well and are still 
dirt cheap.

So are you using ZFS to get compression and checksumming down to the 
disk platter level? ZFS will give some performance gain with 
compressible data, plus corruption protection, but don't bother with 
dedup: I've tried it on three distributed filesystems and it bought 
less than 3% capacity while slamming performance. If you don't need 
either feature, just stick with XFS on single disks or software-RAIDed 
mirrors per brick. My personal opinion would be a ZFS mirror of two 
SSDs per server, per brick, i.e. in your initial case, 2x 2TB SSDs per 
box in a ZFS mirror. You can add more mirror sets later to create 
additional bricks.
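
A minimal sketch of that layout (device paths and pool names are 
placeholders):

  # mirrored pool per brick; by-id paths survive controller reshuffles
  zpool create -o ashift=12 brick1 mirror \
      /dev/disk/by-id/ata-SSD_SERIAL_A /dev/disk/by-id/ata-SSD_SERIAL_B

  # cheap wins for Gluster on ZFS; dedup deliberately left off
  zfs set compression=lz4 brick1
  zfs set atime=off brick1
  zfs set xattr=sa brick1      # Gluster leans heavily on xattrs

  # dataset that will hold the brick
  zfs create brick1/gv0

A second brick later is just another "zpool create brick2 mirror ..." 
with a fresh pair of SSDs.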

>
> I like the idea of forgoing the RAID cards as they are quite 
> expensive, especially the capacitor backed ones.  The onboard 
> controller can handle JBOD just fine, if Gluster is OK with it!

As I also said, if said expensive card dies, and you don't have another 
one in stock, you will effectively have lost everything on that server 
until you can source a new one (or even /if/ you can).

Use the power of the software to get where you need to be; the tools 
are there...
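
To tie it together, a sketch of the replica-3 volume that lets you 
reboot any one box (hostnames are placeholders; the brick sits in a 
subdirectory so a missed mount can't land data on the root disk):

  mkdir -p /brick1/gv0/brick    # on each server, inside the ZFS dataset
  gluster volume create gv0 replica 3 \
      server1:/brick1/gv0/brick \
      server2:/brick1/gv0/brick \
      server3:/brick1/gv0/brick
  gluster volume start gv0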

Alex
