[Gluster-users] Gluster not saturating 10gb network

Alex Crow acrow at integrafin.co.uk
Tue Aug 9 18:53:21 UTC 2016


Your replica 2 result is pretty damn good IMHO. With client-side
replication every write goes to both bricks, so you should expect at the
very most half the speed of a local write to brick storage. Not sure why
a 1-brick volume doesn't approach your native speed, though; it could be
that FUSE overhead caps you at <1GB/s in your setup.
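
To put rough numbers on that, here is a minimal sketch in Python using the
fio figures quoted below. The variable names are mine, and the "half of raw
brick" ceiling is the assumption stated above, not a measured limit.

```python
# Throughput in MB/sec, taken from the fio results quoted below.
raw_brick = 1400
fuse = {"1 brick": 877, "replica 2": 613}
gfapi = {"1 brick": 960, "replica 2": 680}

# Assumption: with client-side replication each write is sent to both
# bricks, so replica 2 can reach at most half the raw brick speed.
replica2_ceiling = raw_brick / 2

for label, res in (("fuse", fuse), ("gfapi", gfapi)):
    share_of_ceiling = res["replica 2"] / replica2_ceiling
    share_of_raw = res["1 brick"] / raw_brick
    print(f"{label}: replica 2 at {share_of_ceiling:.0%} of the ceiling, "
          f"1 brick at {share_of_raw:.0%} of raw")
```

On these numbers the FUSE replica 2 run reaches roughly 88% of that ceiling,
while the 1-brick volume gets only about 63% of raw brick speed.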

AFAIK there is work being done (AFR2?) to offload the replication from
the client to the server. I could just be dreaming though, so I'll leave
others to chip in.

Alex


On 09/08/16 18:24, Дмитрий Глушенок wrote:
> Hi,
>
> Same problem on 3.8.1. Even on the loopback interface (traffic does not leave the Gluster node):
>
> Writing locally to replica 2 volume (each brick is separate local RAID6): 613 MB/sec
> Writing locally to 1-brick volume: 877 MB/sec
> Writing locally to the brick itself (directly to XFS): 1400 MB/sec
>
> Tests were performed using fio with the following settings:
>
> bs=4096k
> ioengine=libaio
> iodepth=32
> direct=0
> runtime=600
> directory=/R1
> numjobs=1
> rw=write
> size=40g
>
> Even with direct=1 the brick itself gives 1400 MB/sec.
>
> 1-brick volume profiling below:
>
> # gluster volume profile test-data-03 info
> Brick: gluster-01:/R1/test-data-03
> -----------------------------------------------
> Cumulative Stats:
>    Block Size:             131072b+              262144b+ 
>  No. of Reads:                    0                     0 
> No. of Writes:               889072                    20 
>  %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
>  ---------   -----------   -----------   -----------   ------------        ----
>       0.00       0.00 us       0.00 us       0.00 us              3     RELEASE
>     100.00     122.96 us      67.00 us   42493.00 us         208598       WRITE
>  
>     Duration: 1605 seconds
>    Data Read: 0 bytes
> Data Written: 116537688064 bytes
>  
> Interval 0 Stats:
>    Block Size:             131072b+              262144b+ 
>  No. of Reads:                    0                     0 
> No. of Writes:               889072                    20 
>  %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
>  ---------   -----------   -----------   -----------   ------------        ----
>       0.00       0.00 us       0.00 us       0.00 us              3     RELEASE
>     100.00     122.96 us      67.00 us   42493.00 us         208598       WRITE
>  
>     Duration: 1605 seconds
>    Data Read: 0 bytes
> Data Written: 116537688064 bytes
>  
> #
>
> As you can see, all writes are performed using a 128 KB block size, and that looks like the bottleneck. This was discussed previously, btw: http://www.gluster.org/pipermail/gluster-devel/2013-March/038821.html
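
The 128 KB observation can be cross-checked against the byte total in the
same profile. A minimal sketch in Python, assuming every write is exactly
the size of its bucket's lower bound (the names are illustrative only):

```python
KIB = 1024

# Write counts per block-size bucket, from the profile output above.
writes_128k = 889072          # bucket "131072b+"
writes_256k = 20              # bucket "262144b+"
data_written = 116537688064   # "Data Written", in bytes

# Total bytes if each write were exactly its bucket's lower bound.
expected = writes_128k * 128 * KIB + writes_256k * 256 * KIB

# Average size of a single write as seen by the brick.
avg_write = data_written / (writes_128k + writes_256k)

print("bucket sizes account for all bytes:", expected == data_written)
print(f"average write size: {avg_write / KIB:.2f} KiB")
```

The bucket lower bounds account for the reported byte total exactly, so the
4096k requests issued by fio do arrive at the brick as 128 KiB writes.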
>
> Using GFAPI to access the volume gives better speed, but still far from the raw brick. fio tests with ioengine=gfapi give the following:
>
> Writing locally to replica 2 volume (each brick is separate local RAID6): 680 MB/sec
> Writing locally to 1-brick volume: 960 MB/sec
>
>
> According to the 1-brick volume profile, 128 KB blocks are no longer used:
>
> # gluster volume profile tzk-data-03 info
> Brick: j-gluster-01.vcod.jet.su:/R1/tzk-data-03
> -----------------------------------------------
> Cumulative Stats:
>    Block Size:            4194304b+ 
>  No. of Reads:                    0 
> No. of Writes:                 9211 
>  %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
>  ---------   -----------   -----------   -----------   ------------        ----
>     100.00    2237.67 us    1880.00 us    5785.00 us           8701       WRITE
>  
>     Duration: 49 seconds
>    Data Read: 0 bytes
> Data Written: 38633734144 bytes
>  
> Interval 0 Stats:
>    Block Size:            4194304b+ 
>  No. of Reads:                    0 
> No. of Writes:                 9211 
>  %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop
>  ---------   -----------   -----------   -----------   ------------        ----
>     100.00    2237.67 us    1880.00 us    5785.00 us           8701       WRITE
>  
>     Duration: 49 seconds
>    Data Read: 0 bytes
> Data Written: 38633734144 bytes
>  
> [root at j-gluster-01 ~]# 
>
> So it may be worth trying NFS Ganesha with the GFAPI plugin.
>
>
>> On 3 Aug 2016, at 9:40, Kaamesh Kamalaaharan <kaamesh at novocraft.com> wrote:
>>
>> Hi,
>> I have Gluster 3.6.2 installed on my server network. Due to internal issues we are not allowed to upgrade the Gluster version. All the clients are on the same version of Gluster. When transferring files to/from the clients or between my nodes over the 10gb network, the transfer rate is capped at 450Mb/s. Is there any way to increase the transfer speeds for Gluster mounts?
>>
>> Our server setup is as follows:
>>
>> 2 gluster servers - gfs1 and gfs2
>> volume name: gfsvolume
>> 3 clients - hpc1, hpc2, hpc3
>> gluster volume mounted on /export/gfsmount/
>>
>>
>>
>> The following are the average results of what I have tried so far:
>>
>> 1) test bandwidth with iperf between all machines - 9.4 Gbit/s
>> 2) test write speed with dd 
>> dd if=/dev/zero of=/export/gfsmount/testfile bs=1G count=1
>>
>> result=399MB/s
>>
>> 3) test read speed with dd
>> dd if=/export/gfsmount/testfile of=/dev/zero bs=1G count=1
>>
>> result=284MB/s
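
For scale, the dd results can be set against what the link can carry. This
is a rough sketch in Python under two assumptions that the post does not
confirm: the iperf figure means 9.4 Gbit/s, and both dd results are in
MB/s as dd normally reports them. The helper name is hypothetical.

```python
def gbit_to_mbyte(gbit_per_s: float) -> float:
    """Convert Gbit/s to MB/s in decimal units (1 Gbit = 1000 Mbit, 8 bits per byte)."""
    return gbit_per_s * 1000 / 8

line_rate = gbit_to_mbyte(10)     # nominal 10 Gbit/s link
iperf_rate = gbit_to_mbyte(9.4)   # assumed meaning of the iperf result

# dd results from the tests above, assumed to be in MB/s.
dd_write = 399
dd_read = 284

print(f"line rate:  {line_rate:.0f} MB/s")
print(f"iperf rate: {iperf_rate:.0f} MB/s")
print(f"dd write uses {dd_write / iperf_rate:.0%} of the iperf rate")
print(f"dd read uses  {dd_read / iperf_rate:.0%} of the iperf rate")
```

Read this way, the Gluster mount is using roughly a third of the measured
network bandwidth for writes and about a quarter for reads.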
>>
>> My gluster volume configuration:
>>  
>> Volume Name: gfsvolume
>> Type: Replicate
>> Volume ID: a29bd2fb-b1ef-4481-be10-c2f4faf4059b
>> Status: Started
>> Number of Bricks: 1 x 2 = 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: gfs1:/export/sda/brick
>> Brick2: gfs2:/export/sda/brick
>> Options Reconfigured:
>> performance.quick-read: off
>> network.ping-timeout: 30
>> network.frame-timeout: 90
>> performance.cache-max-file-size: 2MB
>> cluster.server-quorum-type: none
>> nfs.addr-namelookup: off
>> nfs.trusted-write: off
>> performance.write-behind-window-size: 4MB
>> cluster.data-self-heal-algorithm: diff
>> performance.cache-refresh-timeout: 60
>> performance.cache-size: 1GB
>> cluster.quorum-type: fixed
>> auth.allow: 172.*
>> cluster.quorum-count: 1
>> diagnostics.latency-measurement: on
>> diagnostics.count-fop-hits: on
>> cluster.server-quorum-ratio: 50%
>>
>> Any help would be appreciated. 
>> Thanks,
>> Kaamesh
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users
> --
> Dmitry Glushenok
> Jet Infosystems


