[Gluster-users] Gluster native mount is really slow compared to nfs

Joe Julian joe at julianfamily.org
Tue Jul 11 15:04:24 UTC 2017


My standard response to someone needing filesystem performance for www 
traffic is generally, "you're doing it wrong". 
https://joejulian.name/blog/optimizing-web-performance-with-glusterfs/

That said, you might also look at these mount options: 
attribute-timeout, entry-timeout, negative-timeout (set to some large 
amount of time), and fopen-keep-cache.
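
For example, something along these lines (the timeout values here are purely
illustrative; pick whatever staleness window your site can tolerate, and note
that the exact option syntax can vary a bit between gluster releases):

   # let the FUSE client cache attributes/dentries for 10 minutes
   mount -t glusterfs \
       -o attribute-timeout=600,entry-timeout=600,negative-timeout=600,fopen-keep-cache \
       192.168.140.41:/www /var/www

   # or the rough /etc/fstab equivalent:
   192.168.140.41:/www /var/www glusterfs defaults,_netdev,attribute-timeout=600,entry-timeout=600,negative-timeout=600,fopen-keep-cache 0 0

The idea is to let the FUSE client answer lookups and stats from its own cache
instead of going back over the wire for every one, which is roughly what the
kernel NFS client already does for you and is where most of the small-file gap
tends to come from.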


On 07/11/2017 07:48 AM, Jo Goossens wrote:
> RE: [Gluster-users] Gluster native mount is really slow compared to nfs
>
> Hello,
>
> Here is the volume info as requested by soumya:
>
> #gluster volume info www
> Volume Name: www
> Type: Replicate
> Volume ID: 5d64ee36-828a-41fa-adbf-75718b954aff
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: 192.168.140.41:/gluster/www
> Brick2: 192.168.140.42:/gluster/www
> Brick3: 192.168.140.43:/gluster/www
> Options Reconfigured:
> cluster.read-hash-mode: 0
> performance.quick-read: on
> performance.write-behind-window-size: 4MB
> server.allow-insecure: on
> performance.read-ahead: disable
> performance.readdir-ahead: on
> performance.io-thread-count: 64
> performance.io-cache: on
> performance.client-io-threads: on
> server.outstanding-rpc-limit: 128
> server.event-threads: 3
> client.event-threads: 3
> performance.cache-size: 32MB
> transport.address-family: inet
> nfs.disable: on
> nfs.addr-namelookup: off
> nfs.export-volumes: on
> nfs.rpc-auth-allow: 192.168.140.*
> features.cache-invalidation: on
> features.cache-invalidation-timeout: 600
> performance.stat-prefetch: on
> performance.cache-samba-metadata: on
> performance.cache-invalidation: on
> performance.md-cache-timeout: 600
> network.inode-lru-limit: 100000
> performance.parallel-readdir: on
> performance.cache-refresh-timeout: 60
> performance.rda-cache-limit: 50MB
> cluster.nufa: on
> network.ping-timeout: 5
> cluster.lookup-optimize: on
> cluster.quorum-type: auto
> I started with none of them set and added/changed them while testing,
> but it was always slow. Tuning some kernel parameters improved things 
> slightly (just a few percent, nothing substantial).
> I also tried Ceph just to compare; I got this with default settings 
> and no tweaks:
>  ./smallfile_cli.py  --top /var/www/test --host-set 192.168.140.41 
> --threads 8 --files 5000 --file-size 64 --record-size 64
> smallfile version 3.0
>                            hosts in test : ['192.168.140.41']
>                    top test directory(s) : ['/var/www/test']
>                                operation : cleanup
>                             files/thread : 5000
>                                  threads : 8
>            record size (KB, 0 = maximum) : 64
>                           file size (KB) : 64
>                   file size distribution : fixed
>                            files per dir : 100
>                             dirs per dir : 10
>               threads share directories? : N
>                          filename prefix :
>                          filename suffix :
>              hash file number into dir.? : N
>                      fsync after modify? : N
>           pause between files (microsec) : 0
>                     finish all requests? : Y
>                               stonewall? : Y
>                  measure response times? : N
>                             verify read? : Y
>                                 verbose? : False
>                           log to stderr? : False
>                            ext.attr.size : 0
>                           ext.attr.count : 0
>                permute host directories? : N
>                 remote program directory : /root/smallfile-master
>                network thread sync. dir. : /var/www/test/network_shared
> starting all threads by creating starting gate file 
> /var/www/test/network_shared/starting_gate.tmp
> host = 192.168.140.41,thr = 00,elapsed = 1.339621,files = 5000,records 
> = 0,status = ok
> host = 192.168.140.41,thr = 01,elapsed = 1.436776,files = 5000,records 
> = 0,status = ok
> host = 192.168.140.41,thr = 02,elapsed = 1.498681,files = 5000,records 
> = 0,status = ok
> host = 192.168.140.41,thr = 03,elapsed = 1.483886,files = 5000,records 
> = 0,status = ok
> host = 192.168.140.41,thr = 04,elapsed = 1.454833,files = 5000,records 
> = 0,status = ok
> host = 192.168.140.41,thr = 05,elapsed = 1.469340,files = 5000,records 
> = 0,status = ok
> host = 192.168.140.41,thr = 06,elapsed = 1.439060,files = 5000,records 
> = 0,status = ok
> host = 192.168.140.41,thr = 07,elapsed = 1.375074,files = 5000,records 
> = 0,status = ok
> total threads = 8
> total files = 40000
> 100.00% of requested files processed, minimum is  70.00
> 1.498681 sec elapsed time
> 26690.134975 files/sec
>
>
> Regards
>
> Jo
>
>     -----Original message-----
>     *From:* Jo Goossens <jo.goossens at hosted-power.com>
>     *Sent:* Tue 11-07-2017 12:15
>     *Subject:* Re: [Gluster-users] Gluster native mount is really slow
>     compared to nfs
>     *To:* Soumya Koduri <skoduri at redhat.com>; gluster-users at gluster.org;
>     *CC:* Ambarish Soman <asoman at redhat.com>;
>
>     Hello,
>
>     Here is a speed test with a new setup we just made with Gluster
>     3.10; there are no other differences except glusterfs versus NFS.
>     NFS is about 75 times faster here (4156 vs. 56 files/sec):
>
>     root@app1:~/smallfile-master# mount -t glusterfs -o
>     use-readdirp=no,log-level=WARNING,log-file=/var/log/glusterxxx.log
>     192.168.140.41:/www /var/www
>     root@app1:~/smallfile-master# ./smallfile_cli.py  --top
>     /var/www/test --host-set 192.168.140.41 --threads 8 --files 500
>     --file-size 64 --record-size 64
>     smallfile version 3.0
>                                hosts in test : ['192.168.140.41']
>                        top test directory(s) : ['/var/www/test']
>                                    operation : cleanup
>                                 files/thread : 500
>                                      threads : 8
>                record size (KB, 0 = maximum) : 64
>                               file size (KB) : 64
>                       file size distribution : fixed
>                                files per dir : 100
>                                 dirs per dir : 10
>                   threads share directories? : N
>                              filename prefix :
>                              filename suffix :
>                  hash file number into dir.? : N
>                          fsync after modify? : N
>               pause between files (microsec) : 0
>                         finish all requests? : Y
>                                   stonewall? : Y
>                      measure response times? : N
>                                 verify read? : Y
>                                     verbose? : False
>                               log to stderr? : False
>                                ext.attr.size : 0
>                               ext.attr.count : 0
>                    permute host directories? : N
>                     remote program directory : /root/smallfile-master
>                    network thread sync. dir. :
>     /var/www/test/network_shared
>     starting all threads by creating starting gate file
>     /var/www/test/network_shared/starting_gate.tmp
>     host = 192.168.140.41,thr = 00,elapsed = 68.845450,files =
>     500,records = 0,status = ok
>     host = 192.168.140.41,thr = 01,elapsed = 67.601088,files =
>     500,records = 0,status = ok
>     host = 192.168.140.41,thr = 02,elapsed = 58.677994,files =
>     500,records = 0,status = ok
>     host = 192.168.140.41,thr = 03,elapsed = 65.901922,files =
>     500,records = 0,status = ok
>     host = 192.168.140.41,thr = 04,elapsed = 66.971720,files =
>     500,records = 0,status = ok
>     host = 192.168.140.41,thr = 05,elapsed = 71.245102,files =
>     500,records = 0,status = ok
>     host = 192.168.140.41,thr = 06,elapsed = 67.574845,files =
>     500,records = 0,status = ok
>     host = 192.168.140.41,thr = 07,elapsed = 54.263242,files =
>     500,records = 0,status = ok
>     total threads = 8
>     total files = 4000
>     100.00% of requested files processed, minimum is  70.00
>     71.245102 sec elapsed time
>     56.144211 files/sec
>     umount /var/www
>     root@app1:~/smallfile-master# mount -t nfs -o tcp
>     192.168.140.41:/www /var/www
>     root@app1:~/smallfile-master# ./smallfile_cli.py  --top
>     /var/www/test --host-set 192.168.140.41 --threads 8 --files 500
>     --file-size 64 --record-size 64
>     smallfile version 3.0
>                                hosts in test : ['192.168.140.41']
>                        top test directory(s) : ['/var/www/test']
>                                    operation : cleanup
>                                 files/thread : 500
>                                      threads : 8
>                record size (KB, 0 = maximum) : 64
>                               file size (KB) : 64
>                       file size distribution : fixed
>                                files per dir : 100
>                                 dirs per dir : 10
>                   threads share directories? : N
>                              filename prefix :
>                              filename suffix :
>                  hash file number into dir.? : N
>                          fsync after modify? : N
>               pause between files (microsec) : 0
>                         finish all requests? : Y
>                                   stonewall? : Y
>                      measure response times? : N
>                                 verify read? : Y
>                                     verbose? : False
>                               log to stderr? : False
>                                ext.attr.size : 0
>                               ext.attr.count : 0
>                    permute host directories? : N
>                     remote program directory : /root/smallfile-master
>                    network thread sync. dir. :
>     /var/www/test/network_shared
>     starting all threads by creating starting gate file
>     /var/www/test/network_shared/starting_gate.tmp
>     host = 192.168.140.41,thr = 00,elapsed = 0.962424,files =
>     500,records = 0,status = ok
>     host = 192.168.140.41,thr = 01,elapsed = 0.942673,files =
>     500,records = 0,status = ok
>     host = 192.168.140.41,thr = 02,elapsed = 0.940622,files =
>     500,records = 0,status = ok
>     host = 192.168.140.41,thr = 03,elapsed = 0.915218,files =
>     500,records = 0,status = ok
>     host = 192.168.140.41,thr = 04,elapsed = 0.934349,files =
>     500,records = 0,status = ok
>     host = 192.168.140.41,thr = 05,elapsed = 0.922466,files =
>     500,records = 0,status = ok
>     host = 192.168.140.41,thr = 06,elapsed = 0.954381,files =
>     500,records = 0,status = ok
>     host = 192.168.140.41,thr = 07,elapsed = 0.946127,files =
>     500,records = 0,status = ok
>     total threads = 8
>     total files = 4000
>     100.00% of requested files processed, minimum is  70.00
>     0.962424 sec elapsed time
>     4156.173189 files/sec
>
>         -----Original message-----
>         *From:* Jo Goossens <jo.goossens at hosted-power.com>
>         *Sent:* Tue 11-07-2017 11:26
>         *Subject:* Re: [Gluster-users] Gluster native mount is really
>         slow compared to nfs
>         *To:* gluster-users at gluster.org; Soumya Koduri
>         <skoduri at redhat.com>;
>         *CC:* Ambarish Soman <asoman at redhat.com>;
>
>         Hi all,
>
>         One more thing: we have 3 app servers with Gluster on them,
>         replicated across 3 different Gluster nodes (so the Gluster
>         nodes are app servers at the same time). We could almost work
>         locally if we didn't need the same files on all 3 nodes plus
>         the redundancy :)
>
>         Initial cluster was created like this:
>
>         gluster volume create www replica 3 transport tcp
>         192.168.140.41:/gluster/www 192.168.140.42:/gluster/www
>         192.168.140.43:/gluster/www force
>         gluster volume set www network.ping-timeout 5
>         gluster volume set www performance.cache-size 1024MB
>         gluster volume set www nfs.disable on # No need for NFS currently
>         gluster volume start www
>         To my understanding that still wouldn't explain why NFS has such
>         great performance compared to the native mount ...
>
>         Regards
>
>         Jo
>
>
>             -----Original message-----
>             *From:* Soumya Koduri <skoduri at redhat.com>
>             *Sent:* Tue 11-07-2017 11:16
>             *Subject:* Re: [Gluster-users] Gluster native mount is
>             really slow compared to nfs
>             *To:* Jo Goossens <jo.goossens at hosted-power.com>;
>             gluster-users at gluster.org;
>             *CC:* Ambarish Soman <asoman at redhat.com>; Karan Sandha
>             <ksandha at redhat.com>;
>             + Ambarish
>
>             On 07/11/2017 02:31 PM, Jo Goossens wrote:
>             > Hello,
>             >
>             > We tried tons of settings to get a PHP app running on a
>             > native gluster mount:
>             >
>             > e.g.: 192.168.140.41:/www /var/www glusterfs defaults,_netdev,backup-volfile-servers=192.168.140.42:192.168.140.43,direct-io-mode=disable 0 0
>             >
>             > I tried some mount variants in order to speed things up,
>             > without luck.
>             >
>             > After that I tried NFS (native Gluster NFS 3 and Ganesha
>             > NFS 4); the performance difference was crazy.
>             >
>             > e.g.: 192.168.140.41:/www /var/www nfs4 defaults,_netdev 0 0
>             >
>             > I tried a test like this to confirm the slowness:
>             >
>             > ./smallfile_cli.py --top /var/www/test --host-set 192.168.140.41 --threads 8 --files 5000 --file-size 64 --record-size 64
>             >
>             > This test finished in around 1.5 seconds with NFS and in more
>             > than 250 seconds without NFS (I can't remember the exact numbers,
>             > but I reproduced it several times for both).
>             >
>             > With the native gluster mount the PHP app had loading times of
>             > over 10 seconds; with the NFS mount it loaded in around 1 second
>             > or less (reproduced several times).
>             >
>             > I tried all kinds of performance settings and variants of this,
>             > but nothing helped; the difference stayed huge. Here are some of
>             > the settings I played with, in random order:
>             >
>
>             Requesting Ambarish & Karan (cc'ed; they have been evaluating the
>             performance of the various access protocols Gluster supports) to
>             look at the settings below and provide input.
>
>             Thanks,
>             Soumya
>
>             >
>             >
>             > gluster volume set www features.cache-invalidation on
>             > gluster volume set www features.cache-invalidation-timeout 600
>             > gluster volume set www performance.stat-prefetch on
>             > gluster volume set www performance.cache-samba-metadata on
>             > gluster volume set www performance.cache-invalidation on
>             > gluster volume set www performance.md-cache-timeout 600
>             > gluster volume set www network.inode-lru-limit 250000
>             >
>             > gluster volume set www performance.cache-refresh-timeout 60
>             > gluster volume set www performance.read-ahead disable
>             > gluster volume set www performance.readdir-ahead on
>             > gluster volume set www performance.parallel-readdir on
>             > gluster volume set www performance.write-behind-window-size 4MB
>             > gluster volume set www performance.io-thread-count 64
>             >
>             > gluster volume set www performance.client-io-threads on
>             >
>             > gluster volume set www performance.cache-size 1GB
>             > gluster volume set www performance.quick-read on
>             > gluster volume set www performance.flush-behind on
>             > gluster volume set www performance.write-behind on
>             > gluster volume set www nfs.disable on
>             >
>             > gluster volume set www client.event-threads 3
>             > gluster volume set www server.event-threads 3
>             >
>             > The NFS HA adds a lot of complexity which we wouldn't need at
>             > all in our setup. Could you please explain what is going on here?
>             > Is NFS the only way to get acceptable performance? Did I miss one
>             > crucial setting, perhaps?
>             >
>             > We're really desperate, thanks a lot for your help!
>             >
>             > PS: We tried with Gluster 3.11 and 3.8 on Debian; both had
>             > terrible performance when not used with NFS.
>             >
>             > Kind regards
>             >
>             > Jo Goossens
>             >
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
