[Gluster-users] Perfomance issue on a 90+% full file system
Dan Mons
dmons at cuttingedge.com.au
Tue Oct 7 21:59:30 UTC 2014
I've not used ZFS in production. Outside of GlusterFS, how does it
handle running at 90%+ allocated?
Again, speaking outside of GlusterFS, ext3, ext4 and XFS all
classically slow down when they reach high usage rates. Not
noticeable at all on my home media server in a family of 5, but hugely
noticeable at work when hundreds of artists and render nodes on a mix
of 1GbE and 10GbE are smashing a NAS.
GlusterFS alleviates the problem somewhat (partly due to its
distribution, and the fact that not all bricks hit ~90% at the same
time). But it's still felt.
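
A quick way to see which bricks are closest to the cliff is to watch
per-brick fill levels rather than the volume total. Rough sketch below
(the brick mount paths are placeholders for whatever your layout uses):

    #!/usr/bin/env python
    # Report fill percentage per brick mount, so you can spot which
    # bricks are approaching the ~90% mark ahead of the others.
    import os

    BRICKS = ["/data/brick1", "/data/brick2"]  # hypothetical brick paths

    for brick in BRICKS:
        st = os.statvfs(brick)
        total = st.f_blocks * st.f_frsize
        free = st.f_bavail * st.f_frsize
        used_pct = 100.0 * (total - free) / total
        print("%-20s %5.1f%% used" % (brick, used_pct))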
-Dan
----------------
Dan Mons
Unbreaker of broken things
Cutting Edge
http://cuttingedge.com.au
On 7 October 2014 14:56, Franco Broi <franco.broi at iongeo.com> wrote:
> Our bricks are 50TB, running ZOL, 16 disks raidz2. Works OK with Gluster
> now that they fixed xattrs.
>
> 8k writes with fsync 170MB/Sec, reads 335MB/Sec.
>
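(Side note: if anyone wants to reproduce a rough figure like the
8k-with-fsync number above, a minimal sketch along these lines should
do. The test file path and block count are placeholders, and a proper
tool like fio will give more rigorous numbers.)

    #!/usr/bin/env python
    # Write 8 KiB blocks, fsync after every write, and report MB/sec.
    import os, time

    PATH = "/mnt/gluster/fsync_test.dat"  # hypothetical path on the volume
    BLOCK = b"\0" * 8192                  # 8 KiB per write
    COUNT = 10000

    fd = os.open(PATH, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    start = time.time()
    for _ in range(COUNT):
        os.write(fd, BLOCK)
        os.fsync(fd)                      # flush to disk after each block
    elapsed = time.time() - start
    os.close(fd)
    os.unlink(PATH)
    print("%.1f MB/sec" % (COUNT * len(BLOCK) / elapsed / 1e6))
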
> On Tue, 2014-10-07 at 14:24 +1000, Dan Mons wrote:
>> We have 6 nodes with one brick per node (2x3 replicate-distribute).
>> 35TB per brick, for 107TB total usable.
>>
>> Not sure if our low brick count (or maybe large brick per node?)
>> contributes to the slowdown when full.
>>
>> We're looking to add more nodes by the end of the year. After that,
>> I'll look this thread up and comment on what that's changed,
>> performance-wise.
>>
>> -Dan
>>
>> ----------------
>> Dan Mons
>> Unbreaker of broken things
>> Cutting Edge
>> http://cuttingedge.com.au
>>
>>
>> On 7 October 2014 14:16, Franco Broi <franco.broi at iongeo.com> wrote:
>> >
>> > Not an issue for us, we're at 92% on an 800TB distributed volume, 16
>> > bricks spread across 4 servers. Lookups can be a bit slow but raw IO
>> > hasn't changed.
>> >
>> > On Tue, 2014-10-07 at 09:16 +1000, Dan Mons wrote:
>> >> On 7 October 2014 08:56, Jeff Darcy <jdarcy at redhat.com> wrote:
>> >> > I can't think of a good reason for such a steep drop-off in GlusterFS.
>> >> > Sure, performance should degrade somewhat due to fragmenting, but not
>> >> > suddenly. It's not like Lustre, which would do massive preallocation
>> >> > and fall apart when there was no longer enough space to do that. It
>> >> > might be worth measuring average latency at the local-FS level, to see
>> >> > if the problem is above or below that line.
>> >>
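(One way to act on that suggestion: time the same small operation
against the brick's local filesystem and against the Gluster mount, and
compare. A minimal sketch, with placeholder paths; run the brick-side
test on the server itself.)

    #!/usr/bin/env python
    # Compare average stat() latency on a brick's local FS vs the Gluster
    # mount, to see whether the slowdown is above or below Gluster.
    import os, time

    def avg_stat_latency(path, iterations=1000):
        # stat up to 100 entries repeatedly; return mean seconds per stat()
        names = os.listdir(path)[:100] or ["."]
        start = time.time()
        for _ in range(iterations):
            for name in names:
                os.stat(os.path.join(path, name))
        return (time.time() - start) / (iterations * len(names))

    for label, path in [("local brick", "/data/brick1/vol"),    # hypothetical
                        ("gluster mount", "/mnt/gluster/vol")]:  # hypothetical
        print("%-14s %.3f ms per stat" % (label, avg_stat_latency(path) * 1000))
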
>> >> Happens like clockwork for us. The moment we get alerts saying the
>> >> file system has hit 90%, we get a flood of support tickets about
>> >> performance.
>> >>
>> >> It happens to a lesser degree on the standard CentOS NAS units running
>> >> XFS that we have around the place. But again, I see the same sort of
>> >> thing on any file system (vendor-supplied or self-built, regardless of
>> >> OS and FS).
>> >> And yes, it's measurable (Munin graphs show it off nicely).
>> >>
>> >> -Dan
>> >> _______________________________________________
>> >> Gluster-users mailing list
>> >> Gluster-users at gluster.org
>> >> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>> >
>> >
>
>