[Gluster-users] Inviting comments on my plans
Shawn Heisey
gluster at elyograg.org
Mon Nov 19 16:36:11 UTC 2012
On 11/19/2012 3:18 AM, Fernando Frediani (Qube) wrote:
> Hi,
>
> I agree with the comment about Fedora and wouldn't choose it as a distribution, but if you are comfortable with it, go ahead, as I don't think this will be the major pain.
>
> RAID: I see where you are coming from in choosing not to have any RAID, and I have thought about doing the same myself, mainly for performance reasons. But as mentioned, how are you going to handle the drive swap? If you think you can somehow automate it, please share with us, as I believe running the disks independently is a major performance gain.
There will be no automation. I'll have to do everything myself --
telling the RAID controller to make the disk available to the OS,
putting a filesystem on it, re-adding it to gluster, etc. Although
drive failure is inevitable, I do not expect it to be a common occurrence.
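For the curious, the manual procedure would look roughly like this. It is only a sketch: the device name (/dev/sdf), mount point (/bricks/d5), old brick path, and volume name (myvol) are all hypothetical, and the exact replace-brick syntax varies between gluster versions:

    # Controller has been told to present the new disk as /dev/sdf.
    mkfs.xfs -i size=512 /dev/sdf   # bigger inodes leave room for xattrs
    mount /dev/sdf /bricks/d5
    mkdir -p /bricks/d5/myvol
    # Swap the dead brick for the new one; self-heal then repopulates
    # it from the replica partner.
    gluster volume replace-brick myvol \
        server1:/bricks/d5-failed/myvol server1:/bricks/d5/myvol \
        commit force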
> What you are planning to do with XFS+BTRFS, I am not quite sure will work as you expect. Ideally you need to take snapshots from the distributed filesystem; otherwise you might think you are getting a consistent copy of the data when you are not, as you are not supposed to be reading or writing anywhere other than on the Gluster mount.
The filesystem will be mounted on /bricks/fsname, but gluster will be
pointed at /bricks/fsname/volname. I would put snapshots in
/bricks/fsname/snapshots. Gluster would never see the snapshot data.
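To make that concrete, the layout on each brick filesystem would look
something like this (paths illustrative):

    /bricks/d1            <- the brick filesystem's mount point
    /bricks/d1/myvol      <- the directory handed to gluster as the brick
    /bricks/d1/snapshots  <- snapshots kept beside the brick directory

Since gluster only walks the tree under /bricks/d1/myvol, anything that
lives beside that directory on the same filesystem stays invisible to
clients.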
> Performance: Simple and short - if you can sacrifice one disk per host AND choose not to go with independent disks (no RAID), go with RAID 5.
> As your system grows, reads and writes should (in theory) be distributed across all bricks. If a disk fails you can easily replace it, and even in the unlikely event that you lose two disks in a server and lose its data entirely, you still have a copy of it in another place and can rebuild it with a bit of patience, so no data loss.
> Also, we have had more than enough reports of bad performance in Gluster for all kinds of configurations (including RAID 10), so I don't think anyone should expect Gluster to perform that well; using RAID 5, 6 or 10 underneath shouldn't make much difference, and RAID 10 would only waste space. If you are storing bulk data (multimedia, images, big files), great: it will be streamed, sequential I/O and should be acceptable. But if you are storing things that do a lot of small I/O, or virtual machines, I'm not sure Gluster is the best choice for you, and you should think carefully about it.
A big problem I would face if I went with RAID5 is that I won't
initially have all drive bays populated. The server has 12 drive
bays. If I populate 8 bays per server to start out, what happens when I
need to fill in the other 4 bays?
If I make a new RAID5 array out of the added drives, I lose the
capacity of another disk to parity, and I have no option other than
adding at least three drives at a time; growing one disk at a time
would be off the table. I could probably expand the existing RAID
array instead, but that is a process that will literally take days,
during which the entire array is in a fragile state with horrible
performance. If others have experience doing this on Dell hardware
and have had consistently good luck with it, then my objection may be
unfounded.
With individual disks instead of RAID, I can add one disk at a time to a
server pair.
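With a replica 2 volume, adding capacity is then just a matter of
adding one matching brick on each server of the pair and rebalancing.
A sketch, again with hypothetical names:

    # Add one new disk's brick on each server of the replica pair.
    gluster volume add-brick myvol \
        server1:/bricks/d9/myvol server2:/bricks/d9/myvol
    # Spread existing files across the new bricks.
    gluster volume rebalance myvol start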
We will be storing photo, text, and video assets, currently about 80
million of them, with most of them being photos. Each asset consists of
a main file and a handful of very small metadata files. If it's a video
asset, then we actually have several "main" files - different formats
and bitrates. We have a website that acts as a front end to all this data.
Because of other systems (MySQL and Solr), we normally do not need to
access the storage until someone wishes to see a detail page for an
individual asset, or download the asset. We have plans to migrate the
primary metadata access to another system with better performance,
possibly a NoSQL database. We will keep the metadata files around so we
have the ability to rebuild the primary system, but the goal is to only
access the storage when we are retrieving the asset for a user
download. The systems that process incoming data would obviously need
to access the storage often.
Thanks,
Shawn