[Gluster-devel] Multiple NFS Servers (Gluster NFS in 3.x, unfsd, knfsd, etc.)

Gordan Bobic gordan at bobich.net
Thu Jan 7 10:22:53 UTC 2010


Shehjar Tikoo wrote:
> Gordan Bobic wrote:
>> Gordan Bobic wrote:
>>
>>>> With native NFS there'll be no need to first mount a GlusterFS
>>>> FUSE-based volume and then export it as NFS. The way it has been
>>>> developed is that any glusterfs volume in the volfile can be exported
>>>> using NFS by adding an NFS volume over it in the volfile. This is
>>>> something that will become clearer from the sample vol files when
>>>> 3.0.1 comes out.
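
For anyone following along, my guess is that the sample volfile will
look something like this - purely a sketch on my part (the translator
name and layout are assumed, not taken from any released sample), so
treat it as illustrative only:

  volume posix
    type storage/posix
    option directory /data/export
  end-volume

  volume nfs
    type nfs/server
    subvolumes posix
  end-volume

i.e. the NFS export is just another translator stacked on top of
whatever volume you want to export.
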
>>>
>>> It may be worth checking the performance of that solution vs the 
>>> performance of the standalone unfsd unbound to portmap/mountd over 
>>> mounted glfs volumes, as I discovered today that the performance 
>>> feels very similar to native knfsd and server-side AFR, but without 
>>> the fuse.ko complications of the former and the bugginess of the 
>>> latter (e.g. see bug 186: 
>>> http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=186 - that 
>>> bug has been driving me nuts since before 2.0.0 was released)
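
(For reference, the unfsd setup I'm comparing against is nothing more
than the FUSE-mounted volume re-exported by a standalone unfsd, roughly
like this - the paths and the exports line are just illustrative:

  glusterfs -f /etc/glusterfs/glusterfs.vol /mnt/glusterfs
  echo "/mnt/glusterfs 192.168.0.0/24(rw,no_root_squash)" >> /etc/exports
  unfsd

plus whatever options you use to keep unfsd off the portmapper.)
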
>>>
>>> I'd hate to see this be another wasted effort like booster when there 
>>> is a solution that already works.
>>>
>>>> The answer to your question is, yes, it will be possible to export
>>>> your local file system with knfsd and glusterfs distributed-replicated
>>>> volumes with the Gluster NFS translator, BUT not in the first release.
>>>
>>> See comment above. Isn't that all the more reason to double check 
>>> performance figures before even bothering?
>>>
>>> In fact, I may have just convinced myself to acquire some iozone 
>>> performance figures. Will report later.
>>
>> OK, I couldn't get iozone to report sane results. glfs was reporting
>> figures in the ballpark I'd expect on gigabit ethernet (between 7MB/s
>> and 110MB/s). NFS was reporting figures that looked more like memory
>> bandwidth, so I'd guess that FS-Cache was taking over. With O_DIRECT
>> and O_SYNC the NFS figures dropped to the 700KB/s range, which is
>> clearly not sane either, because in actual use the two seem fairly
>> equivalent.
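
(The O_DIRECT/O_SYNC runs were of this general shape - file sizes and
record lengths varied, so take this as a sketch of the method rather
than the exact invocation:

  iozone -I -o -i 0 -i 1 -s 64m -r 1m -f /mnt/nfs/iozone.tmp

where -I requests O_DIRECT and -o requests O_SYNC writes.)
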
>>
>> So - I did a redneck test instead - dd 64MB of /dev/zero to a file on 
>> the mounted partition.
>>
>> On writes, NFS gets 4.4MB/s, GlusterFS (server-side AFR) gets 4.6MB/s.
>> Pretty even.
>> On reads, GlusterFS gets 117MB/s and NFS gets 119MB/s (on the first
>> read after flushing the caches; after that it goes up to 600MB/s).
>> The difference in the unbuffered readings seems to be in a sane
>> ballpark, and the difference on the reads is roughly what I'd expect
>> considering NFS is running over UDP and GLFS over TCP.
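
(For the record, the "redneck test" is nothing more exotic than this,
with mountpoints obviously varying per setup:

  dd if=/dev/zero of=/mnt/test/ddfile bs=1M count=64
  echo 3 > /proc/sys/vm/drop_caches
  dd if=/mnt/test/ddfile of=/dev/null bs=1M

i.e. a 64MB sequential write followed by a sequential read with the
page cache flushed in between.)
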
>>
>> In conclusion - there is no performance difference between them
>> worth speaking of. So what is the point of implementing a user-space
>> NFS handler in glusterfsd when unfsd seems to do the job as well as
>> glusterfsd could reasonably hope to?
> 
> A single dd, which is basically sequential IO, is something even
> an undergrad OS 101 project can optimize for. We, on the other hand,
> are aiming higher.

Aiming higher is admirable, but the more complex test patterns that
iozone produced caused (server) crashes and disconnects on GlusterFS,
while FS-Cache + NFS with unfsd produced results so high that they were
obviously bogus (much nearer RAM bandwidth than network bandwidth).
Whether those bogus numbers actually reflect a real-world performance
benefit is an interesting question, though. So if we're going to talk
about tests being gamed by caching and optimization, that seems to
happen far more on the complex test than on a single dd.

What test do you propose for measuring the performance? Can you come up
with something relatively nasty that favours a GlusterFS client
connection over an NFS one, which I can test on my network? I'm
assuming you have at least one specific test (and presumably several)
in mind where the results will be (spectacularly) better than unfsd's.

Or maybe I can apply another redneck solution such as doing a kernel
build; that ought to be a pretty valid simulation of heavy disk I/O.
Will report back with figures on this.
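
In case anyone wants to reproduce it, what I have in mind is just the
obvious thing, e.g. (kernel version and -j count picked arbitrarily):

  cd /mnt/glusterfs
  tar xjf ~/linux-2.6.32.tar.bz2
  cd linux-2.6.32
  make defconfig
  time make -j4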

> We'll be providing much better meta-data performance, something unfsd
> sucks at (not without reason - I appreciate the measures it takes to
> ensure correctness) due to the large number of system calls it
> performs, much better support for concurrency in order to exploit the
> proliferating multi-cores, and much better parallelism when multiple
> NFS clients are all hammering away at the server, again something
> unfsd does not do.

So, the main (only?) case for using this new, as-yet-unreleased NFS
translator instead of a server-side-assembled GLFS export is to avoid
the client machines having to install the glusterfs-client and
glusterfs-common packages (bearing in mind that the fuse dependency is
gone in 3.0.x, as long as the kernel supports fuse)?
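
To put it another way, from the client's point of view the difference
boils down to something like this (server and volume/volfile names are
just placeholders), with the NFS variant needing nothing beyond the
stock NFS client:

  # native GlusterFS client
  glusterfs -f /etc/glusterfs/client.vol /mnt/glusterfs

  # NFS - knfsd, unfsd or the new translator alike
  mount -t nfs -o vers=3 server:/export /mnt/nfs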

Gordan
