[Gluster-devel] Architecture advice

Thu Jan 8 10:43:11 UTC 2009

Dan,

On Thu, Jan 8, 2009 at 1:39 PM, Dan Parsons <dparsons at nyip.net> wrote:
> Now that I'm upgrading to gluster 1.4/2.0, I'm going to take the time and rearchitect things.
>
> Hardware:
> Gluster servers:
> 4 blades connected via 4gbit fc to fast, dedicated storage. Each server has two bonded Gig-E links to the rest of my network, for 8gbit/s theoretical throughput.
>
> Gluster clients:
> 33 blades each with one, gig-e connection. They use local storage for OS and gluster for input/output files.
>
> Specific questions:
> (1) There are many times, in our workflow, when more than a few nodes will want the same file at the same time. This made me want to use the stripe xlator. In this way, when a client node saturates its gig-e link reading the file, each gluster server is using only 250mbit/s, leaving room for more clients. If I wasn't using stripe, this hypothetical file would be on just one server node, and it would get slammed if more than two client nodes talked to it. Is there a better way of doing this? Did I make the correct decision in using stripe xlator for this purpose? Can I achieve the same thing using just afr?

Given what you are trying to achieve, your decision was correct.
Stripe splits ("stripes") each file into chunks spread across the
servers, so reads by multiple clients on different stripe chunks of
the same file are served by different servers and perform better.
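
As a rough illustration of the point above, here is a small Python
sketch of how a byte offset could map to a server under striping, and
of the 250mbit/s per-server figure from your question. It assumes
chunks are placed round-robin; the function names and the 128 KB block
size are made up for the example and are not GlusterFS code.

```python
# Sketch only: round-robin striping is assumed, and these names are
# hypothetical, not GlusterFS functions.

BLOCK_SIZE = 128 * 1024  # illustrative stripe block size, in bytes
NUM_SERVERS = 4          # the four server blades from the question


def stripe_server(offset, block_size=BLOCK_SIZE, num_servers=NUM_SERVERS):
    """Return the index of the server that holds the byte at `offset`."""
    return (offset // block_size) % num_servers


def per_server_rate(client_rate_mbit, num_servers=NUM_SERVERS):
    """Average rate each server supplies when one client reads sequentially."""
    return client_rate_mbit / num_servers
```

Two clients reading at offsets in different chunks land on different
servers, and a client saturating a 1000 mbit/s link draws about
per_server_rate(1000) from each server.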

AFR in 1.3 had read load balancing, in the sense that each file was
read from one of the subvolumes, chosen by its inode number. This
feature will also be implemented in 1.4 (or rather, 2.0). Even so, it
will not help in your case, where multiple clients read the same file
at the same time: the choice depends only on the file, so they would
all read it from the same subvolume.
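
To make that concrete, here is a minimal Python sketch. The modulo rule
is only an assumption standing in for "based on its inode number"; the
real selection logic in AFR may differ, and the names are hypothetical.

```python
# Sketch only: "inode number modulo subvolume count" is an assumed
# stand-in for AFR's inode-based read selection.

def afr_read_subvolume(inode_number, num_subvolumes):
    """Pick the subvolume a file is read from, based only on its inode."""
    return inode_number % num_subvolumes


def subvolumes_used(inode_number, num_clients, num_subvolumes):
    """Set of subvolumes hit when `num_clients` all read the same file."""
    return {afr_read_subvolume(inode_number, num_subvolumes)
            for _ in range(num_clients)}
```

Because the client plays no part in the choice, 33 clients reading one
file all hit a single subvolume, while different files do get spread
across the subvolumes.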

>
> (2) I would like to architect the system such that if one node goes down, the others can keep serving the data, even if overall throughput is less. This means that all data would need to be accessible from all clients. Is this something I would use afr xlator for? If so, do I even need stripe anymore, to handle my need to have multiple servers capable of sending different chunks of the same file? And how does the HA xlator play into this?

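Stripe over AFR will help in your case: AFR keeps the data available
when a node goes down, and stripe still spreads the chunks of a single
file across servers.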
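
Here is a small Python sketch of what such a layout could look like
with your 4 servers. The pairing into two replica sets, the server
names, the 128 KB block size and the round-robin placement are all
assumptions made for the example, not something GlusterFS prescribes.

```python
# Sketch only: hypothetical server names and pairing; round-robin
# striping across two mirrored (AFR) pairs is assumed.

REPLICA_PAIRS = [("server1", "server2"), ("server3", "server4")]
BLOCK_SIZE = 128 * 1024  # illustrative stripe block size, in bytes


def servers_for_offset(offset, down=frozenset(),
                       pairs=REPLICA_PAIRS, block_size=BLOCK_SIZE):
    """Return the live servers that hold the chunk containing `offset`."""
    pair = pairs[(offset // block_size) % len(pairs)]
    alive = [server for server in pair if server not in down]
    if not alive:
        raise IOError("both replicas of this chunk are down")
    return alive
```

With one server down, every chunk is still readable from its mirror;
data becomes unavailable only if both servers of the same pair fail.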

HA provides seamless failover across 2 or more physical connections.
Suppose you have 2 eth ports on both the server and the client, and
hence 2 connections between the machines. HA makes sure that when one
connection fails, the system calls are retried on the other
connection. The application will not know about the failed connection.
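
The retry behaviour can be sketched in Python roughly as follows. This
only illustrates the idea and is not how the HA translator is written;
the function name and the use of plain callables for connections are
assumptions made for the example.

```python
# Sketch only: each "connection" is modelled as a callable that either
# returns a reply or raises ConnectionError. Names are hypothetical.

def call_with_failover(connections, request):
    """Try the request on each connection in turn; return the first reply."""
    last_error = None
    for send in connections:
        try:
            return send(request)
        except ConnectionError as err:
            last_error = err  # remember the failure, try the next link
    raise ConnectionError("all connections failed") from last_error
```

The caller sees an error only when every connection has failed, which
is the sense in which the application does not notice a single failed
link.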

HA is also useful when we use server-side AFR.

>
> We have a mix of (small quantity of gigantic files) and (extremely gigantic quantity of small files), so I'm sure there will need to be some parameter tuning.
>
> Thanks in advance. If this question would be better addressed under some sort of support agreement, please let me know.

ZResearch will get in touch with you.

Regards
Krishna

>
> Dan Parsons