[Gluster-users] What will happen if one file size exceeds the available node's harddrive capacity?

Yueyu Lin yueyu.lin at me.com
Tue May 3 09:26:03 UTC 2011


Thanks for your explanation. 
Actually, I'm now designing a system along the lines of a "Gluster Amazon EC2 appliance", in which every Gluster instance is a virtual machine. I start each virtual machine with a 10GB hard drive and expand the capacity automatically according to usage. Small Gluster instances therefore become one of my concerns; of course, I can use large VM instances or split the huge files (like VM images :-)) into small stripes.
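For the striping option, I would probably set it at volume-creation time. A rough sketch with the 3.x-era CLI (the volume name "vmstore" and the bricks are placeholders, not my real setup):

    # create a striped volume so a single large VM image is spread across bricks
    gluster volume create vmstore stripe 2 transport tcp \
        192.168.1.150:/home/export/dfsStore 192.168.1.151:/home/export/dfsStore
    gluster volume start vmstore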

Certainly I will repost the problem and my thoughts to community.gluster.org later to share my experience with the whole community.



On May 3, 2011, at 5:20 PM, Anand Babu Periasamy wrote:

> Hi Yueyu,
> Thanks for answering your own question. Let me give a little background on this.
> 
> You will get an error as if the disk were full. If you simply copy the file to a different name and rename it back, it will get rescheduled to a different node.
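> For example, on the client mount (assuming the volume is mounted at /mnt/gluster; the paths are placeholders):
> 
>     # copying under a new name makes the scheduler pick a brick again
>     # (one with room); renaming it back restores the original name
>     # while the data stays on the brick that has space
>     cp /mnt/gluster/huge.img /mnt/gluster/huge.img.tmp
>     mv /mnt/gluster/huge.img.tmp /mnt/gluster/huge.img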
> 
> Most disks are in the TB range, so it doesn't make sense to optimize at that level. Block-layer striping is often not scalable and requires a complicated backend disk structure.
> 
> If you set a minimum-free-disk limit, GlusterFS will stop scheduling new files to any brick that falls below this limit, and the remaining free space can be used to grow existing files. You can also use volume rebalance to physically move files across bricks and balance capacity utilization.
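> Roughly, with the gluster CLI (the volume name test-volume is a placeholder):
> 
>     # stop scheduling new files to bricks with less than 10% free space
>     gluster volume set test-volume cluster.min-free-disk 10%
>     # physically move files across bricks to even out capacity utilization
>     gluster volume rebalance test-volume start
>     gluster volume rebalance test-volume status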
> 
> Think of choosing a 128KB block size and wasting disk space on 4KB files. Not even disk filesystems optimize capacity utilization to fill every remaining sector; GlusterFS has to cope with the same problem at a much larger scale. That's where the trade-off is.
> 
> BTW, it would be great if you could post this question on http://community.gluster.org as well, so it becomes part of the Gluster knowledge base.
> 
> -AB
> 
> 
> On Tue, May 3, 2011 at 2:39 PM, Yueyu Lin <yueyu.lin at me.com> wrote:
> I just ran the experiment. The answer is no: Distributed-Replicate mode won't split files for the application; the application has to split the huge file manually.
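> A quick sketch of doing that split with standard tools (the file names are placeholders):
> 
>     # split a 20GB image into 5GB pieces so each piece fits on a brick
>     split -b 5G huge-vm.img huge-vm.img.part.
>     # reassemble on the way back
>     cat huge-vm.img.part.* > huge-vm.img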
> On May 3, 2011, at 4:48 PM, Yueyu Lin wrote:
> 
> > Hi, all
> >    I have a question about capacity in a GlusterFS cluster system.
> >    Suppose we have a cluster configuration like this:
> >
> >    Type: Distributed-Replicate
> >    Number of Bricks: 2 x 1 = 2
> >    Brick1: 192.168.1.150:/home/export
> >    Brick2: 192.168.1.151:/home/export
> >
> >    If there are only 15 gigabytes available on each of these two servers and I need to copy a 20GB file to the mounted directory, the space is obviously not enough.
> >    Then I add two 15GB bricks to the cluster (see the command sketch after the listing), and the structure becomes:
> >
> >    Type: Distributed-Replicate
> >    Number of Bricks: 2 x 2 = 4
> >    Bricks:
> >    Brick1: 192.168.1.152:/home/export/dfsStore
> >    Brick2: 192.168.1.153:/home/export/dfsStore
> >    Brick3: 192.168.1.150:/home/export/dfsStore
> >    Brick4: 192.168.1.151:/home/export/dfsStore
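> >
> >    (The expansion above was done roughly like this with the gluster CLI; the volume name test-volume is a placeholder:
> >
> >        # add two more bricks; with replica 2 this gives the 2 x 2 layout
> >        gluster volume add-brick test-volume \
> >            192.168.1.152:/home/export/dfsStore 192.168.1.153:/home/export/dfsStore
> >    )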
> >
> >    Now I copy the file to the mounted directory again. On the client, it shows more than 20GB of space available. But what will happen when I copy the huge file, since no single brick has enough space to hold it?
> >
> >    Thanks a lot.
> 
> -- 
> Anand Babu Periasamy
> Blog [http://www.unlocksmith.org]
> 
> Imagination is more important than knowledge --Albert Einstein
