[Gluster-users] best practices? 500+ win/mac computers, gluster, samba, new SAN, new hardware

D. Dante Lorenso dante at lorenso.com
Wed Feb 1 18:21:17 UTC 2012


On 1/28/12 6:02 PM, Brian Candler wrote:
> On Sat, Jan 28, 2012 at 05:31:28PM -0600, D. Dante Lorenso wrote:
>> Thinking about buying 8 servers with 4 x 2TB 7200 rpm SATA drives
>> (expandable to 8 drives).  Each server will have 8 network ports and
>> will be connected to a SAN switch using 4 ports link aggregated and
>> connected to a LAN switch using the other 4 ports aggregated.  The
>> servers will run CentOS 6.2 Linux.  The LAN side will run Samba and
>> export the network shares, and the SAN side will run Gluster daemon.
>
> Just a terminology issue, but Gluster isn't really a SAN, it's a distributed
> NAS.
>
> A SAN uses a block-level protocol (e.g. iSCSI), into which the client runs a
> regular filesystem like ext4 or xfs or whatever.  A NAS is a file-sharing
> protocol (e.g.  NFS).  Gluster is the latter.

I need a word to describe the switch that I'll plug all my storage 
machines into.  Distributed NAS sounds good.  I might have a few iSCSI 
devices on there too, however.

>> With 8 machines and 4 ports for SAN each, I need 32 ports total.
>> I'm thinking a 48 port switch would work well as a SAN back-end
>> switch giving me left over space to add iSCSI devices and backup
>> servers which need to hook into the SAN.
>
> Out of interest, why are you considering two different network fabrics? Are
> there one set of clients which are talking CIFS and a different set of
> clients using the Gluster native client?

Most of my clients (95%) are Windows 7 workstations.  The only way I 
think I can get GlusterFS to work with Win7 is through Samba.  I was 
planning to use SMB/CIFS on the Win7 side of the network (using 2 bonded 
ports) and the Gluster native client on the storage side (using another 
2 bonded ports).
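
Concretely, what I had in mind on each storage server is to mount the 
volume locally with the native client and re-export that mount through 
Samba.  A rough sketch (the volume name "gv0", mount point, and share 
name are just placeholders I made up):

    # mount the Gluster volume via the native client
    mount -t glusterfs localhost:/gv0 /mnt/gv0

    # smb.conf excerpt exporting that mount to the Win7 boxes
    [shares]
        path = /mnt/gv0
        read only = no
        browseable = yes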

>> 4) Performance tuning.  So far, I've discovered using dd and iperf
>> to debug my transfer rates.  I use dd to test raw speed of the
>> underlying disks (should I use RAID 0, RAID 1, RAID 5 ?)
>
> Try some dd measurements onto a RAID 5 volume, especially for writing, and
> you'll find it sucks.
>
> I also suggest something like bonnie++ to get a more realistic performance
> measurement than just the dd throughput, as it will include seeks and
> filesystem operations (e.g.  file creations/deletions)

Good advice, I'll check into bonnie++.
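
For what it's worth, the tests I've run so far look roughly like this 
(paths, hostnames, and sizes are just examples):

    # raw sequential write speed of the underlying array,
    # bypassing the page cache
    dd if=/dev/zero of=/data/ddtest bs=1M count=4096 oflag=direct

    # raw network throughput between two boxes
    iperf -s                 # on one server
    iperf -c storage01       # on the other (hostname made up)

    # plus bonnie++ once I try it, for seeks and file
    # create/delete as well
    bonnie++ -d /data/bonnie -u nobody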

>> Perhaps if my drives on each of the 8
>> servers are RAID 0, then I can use "replicate 2" through gluster and
>> get the "RAID 1" equivalent.  I think using replicate 2 in gluster
>> will 1/2 my network write/read speed, though.
>
> In theory Gluster replication ought to improve your read speed, since some
> clients can access one copy spindle while other clients access the other.
> But I'm not sure how much it will impact the write speed.
>
> I would however suggest that building a local RAID 0 array is probably a bad
> idea, because if one disk of the set fails, that whole filesystem is toast.
>
> Gluster does give you the option of a "distributed replicated" volume, so
> you can get both the "RAID 0" and "RAID 1" functionality.

If you have 8 drives connected to a single machine, how do you introduce 
those drives to Gluster?  I was thinking I'd combine them into a single 
RAID 0 volume, mount it on the box, and turn that mount point into a 
brick.  Otherwise you have to add 8 separate bricks, right?  That's not 
better, is it?
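
To make the question concrete, I'm comparing adding every disk as its 
own brick, something like this (hostnames, paths, and volume name are 
made up; with replica 2 the bricks pair up in the order listed):

    gluster volume create gv0 replica 2 transport tcp \
        server1:/bricks/disk1 server2:/bricks/disk1 \
        server1:/bricks/disk2 server2:/bricks/disk2
    # ...and so on for the remaining disks on each pair of servers

versus building one RAID 0 array per server and handing Gluster a 
single brick per box:

    gluster volume create gv0 replica 2 transport tcp \
        server1:/bricks/raid0 server2:/bricks/raid0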

-- Dante

D. Dante Lorenso
dante at lorenso.com
972-333-4139


