[Gluster-users] Gluster very poor performance when copying small files (1x (2+1) = 3, SSD)

Rik Theys Rik.Theys@esat.kuleuven.be
Mon Mar 19 09:37:57 UTC 2018


Hi,

I've done similar tests and see similar performance issues
(see my 'gluster for home directories?' thread on the list).

If I read your mail correctly, you are comparing an NFS mount of the
brick disk against a gluster mount (using the fuse client)?

Which options do you have set on the NFS export (sync or async)?
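
If it's 'async', the comparison is somewhat unfair, since an async NFS
server acknowledges writes before they reach disk. Purely as an
illustration (the path and client name here are placeholders, not your
setup), the difference is a single word in /etc/exports:

/mnt/brick  client.fqdn(rw,sync,no_subtree_check)    # reply only once data is stable
/mnt/brick  client.fqdn(rw,async,no_subtree_check)   # may reply before data is on disk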

From my tests, I concluded that the issue was not bandwidth but latency.
Gluster only returns an I/O operation once all bricks have confirmed
that the data is on disk. If you are using a fuse mount, it might be
worth comparing against a mount that uses the 'direct-io-mode=disable'
client option (I have no experience with whether it helps).
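
For reference, a fuse mount with that option would look something like
this (hostname, volume name and mount point are taken from your volume
info below, so adjust as needed):

mount -t glusterfs -o direct-io-mode=disable gluster-host-01.fqdn:/uat_storage /mnt/uat_storage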

In our tests, I've used NFS-ganesha to serve the gluster volume over
NFS. This made things even worse, as NFS-ganesha has no "async" mode,
and performance was terrible.
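
A minimal ganesha.conf export block for a gluster volume looks roughly
like the sketch below (Export_Id, pseudo path and hostname are
illustrative); note that nothing in it can make writes asynchronous:

EXPORT {
    Export_Id = 1;
    Path = "/uat_storage";
    Pseudo = "/uat_storage";
    Access_Type = RW;
    FSAL {
        Name = GLUSTER;
        Hostname = "localhost";
        Volume = "uat_storage";
    }
}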

If you find a magic knob to make glusterfs fast on small-file workloads,
do let me know!
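
For completeness: the knobs I see suggested most often on the list for
small-file workloads are the readdir and negative-lookup caches. I
can't vouch for them (and parallel-readdir needs readdir-ahead
enabled), but they may be worth a try on a non-production volume:

gluster volume set uat_storage performance.readdir-ahead on
gluster volume set uat_storage performance.parallel-readdir on
gluster volume set uat_storage performance.nl-cache on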

Regards,

Rik

On 03/18/2018 11:13 PM, Sam McLeod wrote:
> Howdy all,
> 
> We're experiencing terrible small file performance when copying or
> moving files on gluster clients.
> 
> In the example below, Gluster takes roughly 6 minutes to copy 128 MB /
> 21,000 files sideways on a client; doing the same thing on NFS (which I
> know is a totally different solution etc. etc.) takes approximately
> 10-15 seconds(!).
> 
> Any advice for tuning the volume or XFS settings would be greatly
> appreciated.
> 
> Hopefully I've included enough relevant information below.
> 
> 
> ## Gluster Client
> 
> root@gluster-client:/mnt/gluster_perf_test/  # du -sh .
> 127M    .
> root@gluster-client:/mnt/gluster_perf_test/  # find . -type f | wc -l
> 21791
> root@gluster-client:/mnt/gluster_perf_test/  # du 9584toto9584.txt
> 4    9584toto9584.txt
> 
> 
> root@gluster-client:/mnt/gluster_perf_test/  # time cp -a private private_perf_test
> 
> real    5m51.862s
> user    0m0.862s
> sys    0m8.334s
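> 
> (For scale: 351.9 s for 21,791 files works out to roughly 16 ms per
> file, versus well under 1 ms per file for the 10-15 second NFS run,
> so this looks latency-bound rather than bandwidth-bound.)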
> 
> root@gluster-client:/mnt/gluster_perf_test/ # time rm -rf private_perf_test/
> 
> real    0m49.702s
> user    0m0.087s
> sys    0m0.958s
> 
> 
> ## Hosts
> 
> - 16x Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz per Gluster host / client
> - Storage: iSCSI provisioned (via 10Gbit DAC/Fibre), SSD disk, 50k random
> R/W 4K IOPS, 400 MB/s per Gluster host
> - Volumes are replicated across two hosts plus one arbiter-only host
> - Networking is 10Gbit DAC/Fibre between Gluster hosts and clients
> - 18GB DDR4 ECC memory
> 
> ## Volume Info
> 
> root@gluster-host-01:~ # gluster pool list
> UUID                                  Hostname              State
> ad02970b-e2aa-4ca8-998c-bd10d5970faa  gluster-host-02.fqdn  Connected
> ea116a94-c19e-48db-b108-0be3ae622e2e  gluster-host-03.fqdn  Connected
> 2e855c25-e7ac-4ff6-be85-e8bcc6f45ee4  localhost             Connected
> 
> root@gluster-host-01:~ # gluster volume info uat_storage
> 
> Volume Name: uat_storage
> Type: Replicate
> Volume ID: 7918f1c5-5031-47b8-b054-56f6f0c569a2
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: gluster-host-01.fqdn:/mnt/gluster-storage/uat_storage
> Brick2: gluster-host-02.fqdn:/mnt/gluster-storage/uat_storage
> Brick3: gluster-host-03.fqdn:/mnt/gluster-storage/uat_storage (arbiter)
> Options Reconfigured:
> performance.rda-cache-limit: 256MB
> network.inode-lru-limit: 50000
> server.outstanding-rpc-limit: 256
> performance.client-io-threads: true
> nfs.disable: on
> transport.address-family: inet
> client.event-threads: 8
> cluster.eager-lock: true
> cluster.favorite-child-policy: size
> cluster.lookup-optimize: true
> cluster.readdir-optimize: true
> cluster.use-compound-fops: true
> diagnostics.brick-log-level: ERROR
> diagnostics.client-log-level: ERROR
> features.cache-invalidation-timeout: 600
> features.cache-invalidation: true
> network.ping-timeout: 15
> performance.cache-invalidation: true
> performance.cache-max-file-size: 6MB
> performance.cache-refresh-timeout: 60
> performance.cache-size: 1024MB
> performance.io-thread-count: 16
> performance.md-cache-timeout: 600
> performance.stat-prefetch: true
> performance.write-behind-window-size: 256MB
> server.event-threads: 8
> transport.listen-backlog: 2048
> 
> root@gluster-host-01:~ # xfs_info /dev/mapper/gluster-storage-unlocked
> meta-data=/dev/mapper/gluster-storage-unlocked isize=512    agcount=4, agsize=196607360 blks
>          =                       sectsz=512   attr=2, projid32bit=1
>          =                       crc=1        finobt=0 spinodes=0
> data     =                       bsize=4096   blocks=786429440, imaxpct=5
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=8192   ascii-ci=0 ftype=1
> log      =internal               bsize=4096   blocks=383998, version=2
>          =                       sectsz=512   sunit=0 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0
> 
> 
> --
> Sam McLeod (protoporpoise on IRC)
> https://smcleod.net
> https://twitter.com/s_mcleod
> 
> Words are my own opinions and do not necessarily represent those of
> my employer or partners.
> 


-- 
Rik Theys
System Engineer
KU Leuven - Dept. Elektrotechniek (ESAT)
Kasteelpark Arenberg 10 bus 2440  - B-3001 Leuven-Heverlee
+32(0)16/32.11.07
----------------------------------------------------------------
<<Any errors in spelling, tact or fact are transmission errors>>

