[Gluster-users] NFS replacement

Shehjar Tikoo shehjart at gluster.com
Tue Sep 1 06:03:38 UTC 2009

Stephan von Krawczynski wrote:
> On Mon, 31 Aug 2009 19:48:46 +0530 Shehjar Tikoo 
> <shehjart at gluster.com> wrote:
>> Stephan von Krawczynski wrote:
>>> Hello all,
>>> after playing around for some weeks we decided to make some real
>>>  world tests with glusterfs. Therefore we took a nfs-client and 
>>> mounted the very same data with glusterfs. The client does some 
>>> logfile processing every 5 minutes and needs around 3,5 mins 
>>> runtime in a nfs setup. We found out that it makes no sense to 
>>> try this setup with gluster replicate as long as we do not have 
>>> the same performance in a single server setup with glusterfs. So
>>>  now we have one server mounted (halfway replicate) and would
>>> like to tune performance. Does anyone have experience with some 
>>> simple replacement like that? We had to find out that almost all
>>>  performance options have exactly zero effect. The only thing
>>> that seems to make at least some difference is read-ahead on the
>>>  server. We end up with around 4.5 - 5.5 minutes runtime of the 
>>> scripts, which is on the edge as we need something quite below 5
>>>  minutes (just like nfs was). Our goal is to maximise performance
>>>  in this setup and then try a real replication setup with two 
>>> servers. The load itself looks like around 100 scripts starting
>>>  at one time and processing their data.
>>> Any ideas?
>> What nfs server are you using? The in-kernel one?
> Yes.
>> You could try the unfs3booster server, which is the original unfs3 
>> with our modifications for bug fixes and slight performance 
>> improvements. It should give better performance in certain cases 
>> since it avoids the FUSE bottleneck on the server.
>> For more info, do take a look at this page: 
>> http://www.gluster.org/docs/index.php/Unfs3boosterConfiguration
>> When using unfs3booster, please use GlusterFS release 2.0.6 since 
>> that has the required changes to make booster work with NFS.
> I read the docs, but I don't understand the advantage. Why should we
>  use nfs as kind of a transport layer to an underlying glusterfs 
> server, when we can easily export the service (i.e. glusterfs) 
> itself. Remember, we don't want nfs on the client any longer, but a 
> replicate setup with two servers (though we do not use it right now,
>  but nevertheless it stays our primary goal).

Ok. My answer was written under the impression that moving to NFS
was the motive. unfs3booster-over-gluster is a better solution than
kernel-nfs-over-gluster because it avoids the FUSE layer completely.

> It sounds obvious to me
> that a nfs-over-gluster must be slower than a pure kernel-nfs. On the
>  other hand glusterfs per se may even have some advantages on the 
> network side, iff performance tuning (and of course the options 
> themselves) is well designed. The first thing we noticed is that load
>  dropped dramatically both on server and client when not using 
> kernel-nfs. Client dropped from around 20 to around 4. Server dropped
>  from around 10 to around 5. Since all boxes are pretty much 
> dedicated to their respective jobs a lot of caching is going on 
> anyways.
Thanks, that is useful information.

> So I
> would not expect nfs to have advantages only because it is 
> kernel-driven. And the current numbers (loss of around 30% in 
> performance) show that nfs performance is not completely out of 
> reach.
That is true; we do have setups performing as well as, and in some
cases better than, kernel NFS despite the replication overhead. It
is a matter of testing and arriving at a config that works for your
workload.
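For the two-server replicate goal, a 2.0-series client volume file would
look roughly like the following sketch. The hostnames, volume names and
remote subvolume name are placeholders, not a tested recommendation:

```
volume srv1
  type protocol/client
  option transport-type tcp
  option remote-host 10.0.0.1        # placeholder address of server 1
  option remote-subvolume brick
end-volume

volume srv2
  type protocol/client
  option transport-type tcp
  option remote-host 10.0.0.2        # placeholder address of server 2
  option remote-subvolume brick
end-volume

volume replicate
  type cluster/replicate             # writes go to both servers
  subvolumes srv1 srv2
end-volume
```

Performance translators would then be stacked on top of the replicate
volume on the client side.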

> What advantages would you expect from using unfs3booster at all?
To begin with, unfs3booster must be compared against kernel nfsd, not
against a GlusterFS-only config. When comparing with kernel nfsd, one
should understand that knfsd involves the FUSE layer, the kernel's VFS
and the network layer, all of which have their advantages and
disadvantages, especially FUSE when used with the kernel nfsd. The
bottlenecks in the FUSE+knfsd interaction are well documented elsewhere.

unfs3booster enables you to avoid the FUSE layer, the VFS, etc., and
talk directly to the network and, through that, to the GlusterFS server.
In our measurements, we found that we could perform better than kernel
nfs-over-gluster by avoiding FUSE and using our own caching (io-cache),
buffering (write-behind, read-ahead) and request scheduling (io-threads).
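As an illustration, those performance translators could be layered on the
client side like this. Option names follow the 2.0-series translators, but
the values shown are only examples, not tuned recommendations:

```
volume remote
  type protocol/client
  option transport-type tcp
  option remote-host server1.example.com   # placeholder hostname
  option remote-subvolume brick
end-volume

volume wb
  type performance/write-behind
  option cache-size 4MB          # example: aggregate small writes
  subvolumes remote
end-volume

volume ra
  type performance/read-ahead
  option page-count 4            # example: pages read ahead per file
  subvolumes wb
end-volume

volume ioc
  type performance/io-cache
  option cache-size 64MB         # example: read cache size
  subvolumes ra
end-volume

volume iot
  type performance/io-threads
  option thread-count 8          # thread counts are discussed below
  subvolumes ioc
end-volume
```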

> Another thing we really did not understand is the _negative_ effect 
> of adding iothreads on client or server. Our nfs setup needs around 
> 90 nfs kernel threads to run smoothly. Every number greater than 8 
> iothreads reduces the performance of glusterfs measurably.

The main reason why knfsd needs a higher number of threads is simply
that knfsd threads are highly I/O-bound, that is, they wait for the
disk I/O to complete in order to serve each NFS request.

On the other hand, with io-threads, the right number actually depends on
the point in the stack at which io-threads are used. For example, if
you're using io-threads just above storage/posix or features/locks, the
scenario is much like kernel nfsd threads, where each io-thread blocks
till the disk I/O is complete. Is this where you've observed the
drop-off beyond 8 io-threads? If so, it is something we'll need to
investigate.
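To make that placement concrete, "just above storage/posix or
features/locks" on the server would look roughly like this sketch; the
export path and thread count are placeholders:

```
volume posix
  type storage/posix
  option directory /export/data    # placeholder export path
end-volume

volume locks
  type features/locks
  subvolumes posix
end-volume

volume iot
  type performance/io-threads
  option thread-count 8            # each thread can block on disk I/O here
  subvolumes locks
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option auth.addr.iot.allow *     # open access; tighten in production
  subvolumes iot
end-volume
```

In this placement each io-thread behaves much like a knfsd thread,
blocking until the disk I/O for its request completes.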

The other place where you can use io-threads is on the GlusterFS client
side. It is here that the 8-thread drop-off seems possible, since the
client side in GlusterFS is more CPU-hungry than the server, and it is
possible that 8 io-threads consume as much CPU as is available for
GlusterFS. Have you observed what the CPU usage figures are as you
increase the number of io-threads?

How many CPUs did the machine have when you observed the drop-off beyond
8 threads?

>> -Shehjar
