<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>Hi everyone!</p>
<p>We have a Gluster cluster of three servers backing a large mail
server with about 10,000 e-mail accounts stored in Maildir
format. This means lots of small, random file reads and writes.
Gluster's performance hasn't been great since we switched to it
from a local disk on a single server, but we're aiming for high
availability here, since simply restoring that mail from backups
(or even backing it up in the first place) takes a day or two.
Clearly, some kind of network storage is what we need, and Gluster
does the job better than every other solution we've looked at so
far.
</p>
<p>The problem is that when I set out on this project, I had never
done any performance tuning before; we had never needed it. Each
of our three Gluster servers stores its data on a RAID5 array
with a hot spare. I'm starting to think that our performance woes
all stem from this. One of my colleagues suggested that, since
Gluster can handle redundancy just fine on its own, we could
switch to the fastest possible RAID level, RAID0, and give up
redundancy on each individual node in the cluster entirely.</p>
<p>While this sounds good in theory, I'd like to know how well it
works in practice before subjecting our 10,000 e-mail clients to
this experiment. The other possibility is to switch our Gluster
nodes to RAID1 or RAID10, which might be faster than RAID5 while
still keeping some redundancy on each node.
</p>
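<p>To make the comparison concrete, here is a rough sketch of the
usual rule-of-thumb arithmetic for small random writes: each
logical write costs about 1 back-end I/O on RAID0, 2 on RAID1 or
RAID10, and 4 on RAID5 (read data, read parity, write data, write
parity). The disk count and per-disk IOPS below are hypothetical
placeholders, not our actual hardware, and the model ignores
controller caches, Gluster overhead, and the network entirely.</p>
<pre>
```python
# Rule-of-thumb back-end I/O operations per small random write.
WRITE_PENALTY = {"RAID0": 1, "RAID1": 2, "RAID10": 2, "RAID5": 4}


def random_write_iops(level, disks, iops_per_disk):
    """Estimate small random write IOPS for a single array.

    This is a back-of-the-envelope model only: raw IOPS of all
    active disks divided by the write penalty of the RAID level.
    """
    return disks * iops_per_disk / WRITE_PENALTY[level]


if __name__ == "__main__":
    DISKS = 4            # hypothetical number of active disks per node
    IOPS_PER_DISK = 150  # hypothetical figure for one spinning disk
    for level in ("RAID0", "RAID10", "RAID5"):
        print(level, random_write_iops(level, DISKS, IOPS_PER_DISK))
```
</pre>
<p>Under these assumed numbers, RAID10 comes out at roughly twice
the random write rate of RAID5, and RAID0 at roughly four times,
which is the gap I'm hoping to confirm or refute with real-world
experience.</p>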
</body>
</html>