[Gluster-users] Gluster native mount is really slow compared to nfs

Joe Julian joe at julianfamily.org
Tue Jul 11 15:16:41 UTC 2017


On 07/11/2017 08:14 AM, Jo Goossens wrote:
> RE: [Gluster-users] Gluster native mount is really slow compared to nfs
>
> Hello Joe,
>
> I really appreciate your feedback, but I already tried the opcache 
> stuff (set to not validate at all). It improves then, of course, but 
> somehow not completely. It's still quite slow.
>
> I did not try the mount options yet, but I will now!
>
> With nfs (it doesn't matter much whether it's the built-in version 3 
> or ganesha version 4) I can even host the site perfectly fast without 
> these extreme opcache settings.
>
> I still can't understand why the nfs mount is easily 80 times faster, 
> seemingly no matter what options I set. It's almost as if there is 
> something really wrong somehow...
>

Because the Linux nfs client doesn't touch the network for many 
operations; they are, instead, served from the kernel's FSCache.
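
As an illustration only (the numbers are just example values, not a
recommendation): the nfs client's attribute caching is what you tune
with mount options such as actimeo, e.g.

  mount -t nfs -o tcp,actimeo=600 192.168.140.41:/www /var/www

lets the client keep attributes cached for up to 600 seconds, so
repeated stat() calls within that window can be answered from the
cache instead of going over the wire.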

> I tried the ceph mount now, and out of the box it's comparable to 
> gluster with the nfs mount.
>
> Regards
>
> Jo
>
> BE: +32 53 599 000
>
> NL: +31 85 888 4 555
>
> https://www.hosted-power.com/
>
>
>     -----Original message-----
>     *From:* Joe Julian <joe at julianfamily.org>
>     *Sent:* Tue 11-07-2017 17:04
>     *Subject:* Re: [Gluster-users] Gluster native mount is really slow
>     compared to nfs
>     *To:* gluster-users at gluster.org;
>
>     My standard response to someone needing filesystem performance for
>     www traffic is generally, "you're doing it wrong".
>     https://joejulian.name/blog/optimizing-web-performance-with-glusterfs/
>
>     That said, you might also look at these mount options:
>     attribute-timeout, entry-timeout, negative-timeout (set to some
>     large amount of time), and fopen-keep-cache.
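>
>     For example, something along these lines (the 600-second values
>     here are only illustrative; pick whatever metadata staleness your
>     application can tolerate):
>
>     mount -t glusterfs \
>         -o attribute-timeout=600,entry-timeout=600,negative-timeout=600,fopen-keep-cache \
>         192.168.140.41:/www /var/www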
>
>
>     On 07/11/2017 07:48 AM, Jo Goossens wrote:
>>
>>     Hello,
>>
>>     Here is the volume info as requested by soumya:
>>
>>     #gluster volume info www
>>     Volume Name: www
>>     Type: Replicate
>>     Volume ID: 5d64ee36-828a-41fa-adbf-75718b954aff
>>     Status: Started
>>     Snapshot Count: 0
>>     Number of Bricks: 1 x 3 = 3
>>     Transport-type: tcp
>>     Bricks:
>>     Brick1: 192.168.140.41:/gluster/www
>>     Brick2: 192.168.140.42:/gluster/www
>>     Brick3: 192.168.140.43:/gluster/www
>>     Options Reconfigured:
>>     cluster.read-hash-mode: 0
>>     performance.quick-read: on
>>     performance.write-behind-window-size: 4MB
>>     server.allow-insecure: on
>>     performance.read-ahead: disable
>>     performance.readdir-ahead: on
>>     performance.io-thread-count: 64
>>     performance.io-cache: on
>>     performance.client-io-threads: on
>>     server.outstanding-rpc-limit: 128
>>     server.event-threads: 3
>>     client.event-threads: 3
>>     performance.cache-size: 32MB
>>     transport.address-family: inet
>>     nfs.disable: on
>>     nfs.addr-namelookup: off
>>     nfs.export-volumes: on
>>     nfs.rpc-auth-allow: 192.168.140.*
>>     features.cache-invalidation: on
>>     features.cache-invalidation-timeout: 600
>>     performance.stat-prefetch: on
>>     performance.cache-samba-metadata: on
>>     performance.cache-invalidation: on
>>     performance.md-cache-timeout: 600
>>     network.inode-lru-limit: 100000
>>     performance.parallel-readdir: on
>>     performance.cache-refresh-timeout: 60
>>     performance.rda-cache-limit: 50MB
>>     cluster.nufa: on
>>     network.ping-timeout: 5
>>     cluster.lookup-optimize: on
>>     cluster.quorum-type: auto
>>
>>     I started with none of them set and added/changed them while
>>     testing. But it was always slow; by tuning some kernel parameters
>>     it improved slightly (just a few percent, nothing significant).
>>     I also tried ceph just to compare; I got this with default
>>     settings and no tweaks:
>>      ./smallfile_cli.py  --top /var/www/test --host-set
>>     192.168.140.41 --threads 8 --files 5000 --file-size 64
>>     --record-size 64
>>     smallfile version 3.0
>>                                hosts in test : ['192.168.140.41']
>>                        top test directory(s) : ['/var/www/test']
>>                                    operation : cleanup
>>                                 files/thread : 5000
>>                                      threads : 8
>>                record size (KB, 0 = maximum) : 64
>>                               file size (KB) : 64
>>                       file size distribution : fixed
>>                                files per dir : 100
>>                                 dirs per dir : 10
>>                   threads share directories? : N
>>                              filename prefix :
>>                              filename suffix :
>>                  hash file number into dir.? : N
>>                          fsync after modify? : N
>>               pause between files (microsec) : 0
>>                         finish all requests? : Y
>>                                   stonewall? : Y
>>                      measure response times? : N
>>                                 verify read? : Y
>>                                     verbose? : False
>>                               log to stderr? : False
>>                                ext.attr.size : 0
>>                               ext.attr.count : 0
>>                    permute host directories? : N
>>                     remote program directory : /root/smallfile-master
>>                    network thread sync. dir. :
>>     /var/www/test/network_shared
>>     starting all threads by creating starting gate file
>>     /var/www/test/network_shared/starting_gate.tmp
>>     host = 192.168.140.41,thr = 00,elapsed = 1.339621,files =
>>     5000,records = 0,status = ok
>>     host = 192.168.140.41,thr = 01,elapsed = 1.436776,files =
>>     5000,records = 0,status = ok
>>     host = 192.168.140.41,thr = 02,elapsed = 1.498681,files =
>>     5000,records = 0,status = ok
>>     host = 192.168.140.41,thr = 03,elapsed = 1.483886,files =
>>     5000,records = 0,status = ok
>>     host = 192.168.140.41,thr = 04,elapsed = 1.454833,files =
>>     5000,records = 0,status = ok
>>     host = 192.168.140.41,thr = 05,elapsed = 1.469340,files =
>>     5000,records = 0,status = ok
>>     host = 192.168.140.41,thr = 06,elapsed = 1.439060,files =
>>     5000,records = 0,status = ok
>>     host = 192.168.140.41,thr = 07,elapsed = 1.375074,files =
>>     5000,records = 0,status = ok
>>     total threads = 8
>>     total files = 40000
>>     100.00% of requested files processed, minimum is  70.00
>>     1.498681 sec elapsed time
>>     26690.134975 files/sec
>>
>>
>>     Regards
>>
>>     Jo
>>
>>         -----Original message-----
>>         *From:* Jo Goossens <jo.goossens at hosted-power.com>
>>         <mailto:jo.goossens at hosted-power.com>
>>         *Sent:* Tue 11-07-2017 12:15
>>         *Subject:* Re: [Gluster-users] Gluster native mount is really
>>         slow compared to nfs
>>         *To:* Soumya Koduri <skoduri at redhat.com>
>>         <mailto:skoduri at redhat.com>; gluster-users at gluster.org
>>         <mailto:gluster-users at gluster.org>;
>>         *CC:* Ambarish Soman <asoman at redhat.com>
>>         <mailto:asoman at redhat.com>;
>>
>>         Hello,
>>
>>         Here is a speed test with a new setup we just made with
>>         gluster 3.10; there are no other differences except
>>         glusterfs versus nfs. The nfs mount is about 80 times faster:
>>
>>         root at app1:~/smallfile-master# mount -t glusterfs -o
>>         use-readdirp=no,log-level=WARNING,log-file=/var/log/glusterxxx.log
>>         192.168.140.41:/www /var/www
>>         root at app1:~/smallfile-master# ./smallfile_cli.py  --top
>>         /var/www/test --host-set 192.168.140.41 --threads 8 --files
>>         500 --file-size 64 --record-size 64
>>         smallfile version 3.0
>>                                    hosts in test : ['192.168.140.41']
>>                            top test directory(s) : ['/var/www/test']
>>                                        operation : cleanup
>>                                     files/thread : 500
>>                                          threads : 8
>>                    record size (KB, 0 = maximum) : 64
>>                                   file size (KB) : 64
>>                           file size distribution : fixed
>>                                    files per dir : 100
>>                                     dirs per dir : 10
>>                       threads share directories? : N
>>                                  filename prefix :
>>                                  filename suffix :
>>                      hash file number into dir.? : N
>>                              fsync after modify? : N
>>                   pause between files (microsec) : 0
>>                             finish all requests? : Y
>>                                       stonewall? : Y
>>                          measure response times? : N
>>                                     verify read? : Y
>>                                         verbose? : False
>>                                   log to stderr? : False
>>                                    ext.attr.size : 0
>>                                   ext.attr.count : 0
>>                        permute host directories? : N
>>                         remote program directory : /root/smallfile-master
>>                        network thread sync. dir. :
>>         /var/www/test/network_shared
>>         starting all threads by creating starting gate file
>>         /var/www/test/network_shared/starting_gate.tmp
>>         host = 192.168.140.41,thr = 00,elapsed = 68.845450,files =
>>         500,records = 0,status = ok
>>         host = 192.168.140.41,thr = 01,elapsed = 67.601088,files =
>>         500,records = 0,status = ok
>>         host = 192.168.140.41,thr = 02,elapsed = 58.677994,files =
>>         500,records = 0,status = ok
>>         host = 192.168.140.41,thr = 03,elapsed = 65.901922,files =
>>         500,records = 0,status = ok
>>         host = 192.168.140.41,thr = 04,elapsed = 66.971720,files =
>>         500,records = 0,status = ok
>>         host = 192.168.140.41,thr = 05,elapsed = 71.245102,files =
>>         500,records = 0,status = ok
>>         host = 192.168.140.41,thr = 06,elapsed = 67.574845,files =
>>         500,records = 0,status = ok
>>         host = 192.168.140.41,thr = 07,elapsed = 54.263242,files =
>>         500,records = 0,status = ok
>>         total threads = 8
>>         total files = 4000
>>         100.00% of requested files processed, minimum is  70.00
>>         71.245102 sec elapsed time
>>         56.144211 files/sec
>>         umount /var/www
>>         root at app1:~/smallfile-master# mount -t nfs -o tcp
>>         192.168.140.41:/www /var/www
>>         root at app1:~/smallfile-master# ./smallfile_cli.py  --top
>>         /var/www/test --host-set 192.168.140.41 --threads 8 --files
>>         500 --file-size 64 --record-size 64
>>         smallfile version 3.0
>>                                    hosts in test : ['192.168.140.41']
>>                            top test directory(s) : ['/var/www/test']
>>                                        operation : cleanup
>>                                     files/thread : 500
>>                                          threads : 8
>>                    record size (KB, 0 = maximum) : 64
>>                                   file size (KB) : 64
>>                           file size distribution : fixed
>>                                    files per dir : 100
>>                                     dirs per dir : 10
>>                       threads share directories? : N
>>                                  filename prefix :
>>                                  filename suffix :
>>                      hash file number into dir.? : N
>>                              fsync after modify? : N
>>                   pause between files (microsec) : 0
>>                             finish all requests? : Y
>>                                       stonewall? : Y
>>                          measure response times? : N
>>                                     verify read? : Y
>>                                         verbose? : False
>>                                   log to stderr? : False
>>                                    ext.attr.size : 0
>>                                   ext.attr.count : 0
>>                        permute host directories? : N
>>                         remote program directory : /root/smallfile-master
>>                        network thread sync. dir. :
>>         /var/www/test/network_shared
>>         starting all threads by creating starting gate file
>>         /var/www/test/network_shared/starting_gate.tmp
>>         host = 192.168.140.41,thr = 00,elapsed = 0.962424,files =
>>         500,records = 0,status = ok
>>         host = 192.168.140.41,thr = 01,elapsed = 0.942673,files =
>>         500,records = 0,status = ok
>>         host = 192.168.140.41,thr = 02,elapsed = 0.940622,files =
>>         500,records = 0,status = ok
>>         host = 192.168.140.41,thr = 03,elapsed = 0.915218,files =
>>         500,records = 0,status = ok
>>         host = 192.168.140.41,thr = 04,elapsed = 0.934349,files =
>>         500,records = 0,status = ok
>>         host = 192.168.140.41,thr = 05,elapsed = 0.922466,files =
>>         500,records = 0,status = ok
>>         host = 192.168.140.41,thr = 06,elapsed = 0.954381,files =
>>         500,records = 0,status = ok
>>         host = 192.168.140.41,thr = 07,elapsed = 0.946127,files =
>>         500,records = 0,status = ok
>>         total threads = 8
>>         total files = 4000
>>         100.00% of requested files processed, minimum is  70.00
>>         0.962424 sec elapsed time
>>         4156.173189 files/sec
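>>         (That works out to 4156.17 / 56.14, i.e. roughly a 74x
>>         difference between the two mounts in this particular run.)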
>>
>>             -----Original message-----
>>             *From:* Jo Goossens <jo.goossens at hosted-power.com>
>>             <mailto:jo.goossens at hosted-power.com>
>>             *Sent:* Tue 11-07-2017 11:26
>>             *Subject:* Re: [Gluster-users] Gluster native mount is
>>             really slow compared to nfs
>>             *To:* gluster-users at gluster.org
>>             <mailto:gluster-users at gluster.org>; Soumya Koduri
>>             <skoduri at redhat.com> <mailto:skoduri at redhat.com>;
>>             *CC:* Ambarish Soman <asoman at redhat.com>
>>             <mailto:asoman at redhat.com>;
>>
>>             Hi all,
>>
>>             One more thing: we have 3 app servers with gluster on
>>             them, replicated on 3 different gluster nodes. (So the
>>             gluster nodes are app servers at the same time.) We could
>>             actually almost work locally if we didn't need to have
>>             the same files on the 3 nodes, plus redundancy :)
>>
>>             Initial cluster was created like this:
>>
>>             gluster volume create www replica 3 transport tcp
>>             192.168.140.41:/gluster/www 192.168.140.42:/gluster/www
>>             192.168.140.43:/gluster/www force
>>             gluster volume set www network.ping-timeout 5
>>             gluster volume set www performance.cache-size 1024MB
>>             gluster volume set www nfs.disable on # No need for NFS
>>             currently
>>             gluster volume start www
>>             To my understanding it still wouldn't explain why nfs has
>>             such great performance compared to native ...
>>
>>             Regards
>>
>>             Jo
>>
>>
>>                 -----Original message-----
>>                 *From:* Soumya Koduri <skoduri at redhat.com>
>>                 <mailto:skoduri at redhat.com>
>>                 *Sent:* Tue 11-07-2017 11:16
>>                 *Subject:* Re: [Gluster-users] Gluster native mount
>>                 is really slow compared to nfs
>>                 *To:* Jo Goossens <jo.goossens at hosted-power.com>
>>                 <mailto:jo.goossens at hosted-power.com>;
>>                 gluster-users at gluster.org
>>                 <mailto:gluster-users at gluster.org>;
>>                 *CC:* Ambarish Soman <asoman at redhat.com>
>>                 <mailto:asoman at redhat.com>; Karan Sandha
>>                 <ksandha at redhat.com> <mailto:ksandha at redhat.com>;
>>                 + Ambarish
>>
>>                 On 07/11/2017 02:31 PM, Jo Goossens wrote:
>>                 > Hello,
>>                 >
>>                 > We tried tons of settings to get a php app running
>>                 on a native gluster
>>                 > mount:
>>                 >
>>                 >
>>                 >
>>                 > e.g.: 192.168.140.41:/www /var/www glusterfs
>>                 >
>>                 defaults,_netdev,backup-volfile-servers=192.168.140.42:192.168.140.43,direct-io-mode=disable
>>                 > 0 0
>>                 >
>>                 >
>>                 >
>>                 > I tried some mount variants in order to speed up
>>                 things without luck.
>>                 >
>>                 > After that I tried nfs (native gluster nfs 3 and
>>                 > ganesha nfs 4), and it was a crazy performance
>>                 > difference.
>>                 >
>>                 >
>>                 >
>>                 > e.g.: 192.168.140.41:/www /var/www nfs4
>>                 defaults,_netdev 0 0
>>                 >
>>                 >
>>                 >
>>                 > I tried a test like this to confirm the slowness:
>>                 >
>>                 >
>>                 >
>>                 > ./smallfile_cli.py  --top /var/www/test --host-set
>>                 192.168.140.41
>>                 > --threads 8 --files 5000 --file-size 64
>>                 --record-size 64
>>                 >
>>                 > This test finished in around 1.5 seconds with NFS
>>                 and in more than 250
>>                 > seconds without nfs (can't remember exact numbers,
>>                 but I reproduced it
>>                 > several times for both).
>>                 >
>>                 > With the native gluster mount the php app had
>>                 > loading times of over 10 seconds; with the nfs
>>                 > mount the php app loaded in around 1 second at
>>                 > most, often even less. (reproduced several times)
>>                 >
>>                 >
>>                 >
>>                 > I tried all kinds of performance settings and
>>                 > variants of this, but nothing helped; the
>>                 > difference stayed huge. Here are some of the
>>                 > settings I played with, in random order:
>>                 >
>>
>>                 Requesting Ambarish & Karan (cc'ed; they have been
>>                 working on evaluating the performance of the various
>>                 access protocols gluster supports) to look at the
>>                 settings below and provide inputs.
>>
>>                 Thanks,
>>                 Soumya
>>
>>                 >
>>                 >
>>                 > gluster volume set www features.cache-invalidation on
>>                 > gluster volume set www
>>                 features.cache-invalidation-timeout 600
>>                 > gluster volume set www performance.stat-prefetch on
>>                 > gluster volume set www
>>                 performance.cache-samba-metadata on
>>                 > gluster volume set www
>>                 performance.cache-invalidation on
>>                 > gluster volume set www performance.md-cache-timeout 600
>>                 > gluster volume set www network.inode-lru-limit 250000
>>                 >
>>                 > gluster volume set www
>>                 performance.cache-refresh-timeout 60
>>                 > gluster volume set www performance.read-ahead disable
>>                 > gluster volume set www performance.readdir-ahead on
>>                 > gluster volume set www performance.parallel-readdir on
>>                 > gluster volume set www
>>                 performance.write-behind-window-size 4MB
>>                 > gluster volume set www performance.io-thread-count 64
>>                 >
>>                 > gluster volume set www performance.client-io-threads on
>>                 >
>>                 > gluster volume set www performance.cache-size 1GB
>>                 > gluster volume set www performance.quick-read on
>>                 > gluster volume set www performance.flush-behind on
>>                 > gluster volume set www performance.write-behind on
>>                 > gluster volume set www nfs.disable on
>>                 >
>>                 > gluster volume set www client.event-threads 3
>>                 > gluster volume set www server.event-threads 3
>>                 >
>>                 > The NFS HA adds a lot of complexity which we
>>                 > wouldn't need at all in our setup. Could you please
>>                 > explain what is going on here? Is NFS the only
>>                 > solution to get acceptable performance? Did I miss
>>                 > one crucial setting perhaps?
>>                 >
>>                 >
>>                 >
>>                 > We're really desperate, thanks a lot for your help!
>>                 >
>>                 > PS: We tried with gluster 3.11 and 3.8 on Debian;
>>                 > both had terrible performance when not used with
>>                 > nfs.
>>                 >
>>                 > Kind regards
>>                 >
>>                 > Jo Goossens
>>                 >
>>                 > _______________________________________________
>>                 > Gluster-users mailing list
>>                 > Gluster-users at gluster.org
>>                 <mailto:Gluster-users at gluster.org>
>>                 > http://lists.gluster.org/mailman/listinfo/gluster-users
>>                 >
>>
>
>     _______________________________________________
>       Gluster-users mailing list
>       Gluster-users at gluster.org
>       http://lists.gluster.org/mailman/listinfo/gluster-users
>
