[Gluster-users] slow write perf for disperse volume
Ingard Mevåg
ingard at jotta.no
Tue Apr 25 08:49:06 UTC 2017
2017-04-25 9:03 GMT+02:00 Xavier Hernandez <xhernandez at datalab.es>:
> Hi Ingard,
>
> On 24/04/17 14:43, Ingard Mevåg wrote:
>
>> I've done some more testing with tc and introduced latency on one of my
>> test servers. With 9ms latency artificially introduced using tc (sudo tc
>> qdisc add dev bond0 root netem delay 9ms) on a test server in the same
>> DC as the disperse volume servers, I get more or less the same throughput
>> as I do when testing DC1 <-> DC2 (which has ~9ms ping).
>>
>> I know distribute volumes used to be more sensitive to latency. At
>> least I can now max out a 1gig link with 9-10ms latency when using
>> distribute. Disperse seems to max out at 12-14MB/s with 8-10ms latency.
>>
>
> A pure distributed volume is a simple configuration that just forwards each
> request to one of the bricks, so no additional overhead is needed.
>
Well, we've still got gluster 3.0 running on an old cluster, and that
cluster is also dead slow when mounted from the other DC - about the same
performance as we get with disperse on 3.10. So some work has evidently been
done since then to make distribute volumes handle increased latency better.
>
> However a dispersed 4+2 volume needs to talk simultaneously to 6 bricks,
> meaning 6 network round-trips for every request. Additionally, it needs to
> maintain integrity, so one or more additional requests are needed.
>
The number of bricks doesn't appear to affect the throughput. I've tried
different combinations of data and redundancy bricks, but the throughput
stays the same. For instance, 8+4 has double the number of connections
compared to 4+2, but half the throughput per connection.
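
As a rough sanity check (assuming roughly one full network round trip per
write request, and a write size on the order of the 128KiB FUSE default -
both assumptions on my part), the per-stream throughput would be capped at
about:

    128 KiB / 0.009 s  ≈  14 MB/s

which lines up with the 12-14MB/s we see regardless of the brick count.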
> If network latency is high, all these requests increase the overall
> request latency, limiting the throughput.
>
> Have you tried a replica 2 or 3? It uses very similar integrity
> mechanisms, so it'll also add some latency. Maybe not as much as a dispersed
> 4+2, but it should be perceptible.
>
We're after capacity with this setup.
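
(For reference, if we do end up running that comparison, I'd expect a
replica 3 test volume on the same hosts to look something like this - volume
name and brick paths are made up:

    gluster volume create DFS-REP-TEST replica 3 \
        dna-001:/mnt/data01/brick-rep \
        dna-002:/mnt/data01/brick-rep \
        dna-003:/mnt/data01/brick-rep
    gluster volume start DFS-REP-TEST

but as said, capacity is the priority here.)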
> Another test to confirm that the limitation is caused by latency is to do
> multiple writes in parallel. Each write will be limited by the latency, but
> the aggregated throughput should saturate the bandwidth, especially on a 1Gb
> ethernet.
>
That has been confirmed.
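To sketch what that test looked like (mount point, file names and the number
of streams here are illustrative, not the exact ones used):

    # launch 8 concurrent 1GB writes against the fuse mount
    for i in $(seq 1 8); do
        dd if=/dev/zero of=/mnt/dfs/parallel.$i bs=1M count=1000 &
    done
    wait

Each individual stream stays latency-bound, but the aggregate gets close to
saturating the 1Gb link.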
>
> Even better performance can be achieved if you distribute the writes to
> multiple clients or mount points (assuming they are not writing to the same
> file).
>
> Xavi
>
>
>> ingard
>>
>> 2017-04-24 14:03 GMT+02:00 Ingard Mevåg <ingard at jotta.no>:
>>
>> I can confirm that mounting the disperse volume locally on one of the
>> three servers, I got 211 MB/s with dd if=/dev/zero of=./local.dd.test
>> bs=1M count=10000.
>>
>> It's not very good considering the 10gig network, but at least it's
>> 20x better than 10-12MB/s.
>>
>> 2017-04-24 13:53 GMT+02:00 Pranith Kumar Karampuri <pkarampu at redhat.com>:
>>
>> +Ashish
>>
>> Ashish,
>> Could you help Ingard? Do let me know what you find.
>>
>> On Mon, Apr 24, 2017 at 4:50 PM, Ingard Mevåg <ingard at jotta.no> wrote:
>>
>> Hi. I can't see a fuse thread at all. Please see the attached
>> screenshot of the top process with threads. Keep in mind this is
>> from inside the container.
>>
>> 2017-04-24 12:17 GMT+02:00 Pranith Kumar Karampuri <pkarampu at redhat.com>:
>>
>> We were able to saturate the hardware with EC as well. Could
>> you check 'top' in threaded mode to see if the fuse thread
>> is saturated when you run dd?
>>
>> On Mon, Apr 24, 2017 at 3:27 PM, Ingard Mevåg <ingard at jotta.no> wrote:
>>
>> Hi
>> I've been playing with disperse volumes the past
>> week, and so far I cannot get more than 12MB/s
>> when I do a write test. I've tried a distributed
>> volume on the same bricks and gotten close to
>> gigabit speeds. iperf confirms gigabit speeds to
>> all three servers in the storage pool.
>>
>> The three storage servers have 10gig NICs
>> (connected to the same switch). The client is for
>> now a docker container in a 2nd DC (latency
>> roughly 8-9 ms).
>>
>> dpkg -l | grep -i gluster
>> ii  glusterfs-client  3.10.1-ubuntu1~xenial1  amd64  clustered file-system (client package)
>> ii  glusterfs-common  3.10.1-ubuntu1~xenial1  amd64  GlusterFS common libraries and translator modules
>> ii  glusterfs-server  3.10.1-ubuntu1~xenial1  amd64  clustered file-system (server package)
>>
>> $ gluster volume info
>>
>> Volume Name: DFS-ARCHIVE-001
>> Type: Disperse
>> Volume ID: 1497bc85-cb47-4123-8f91-a07f55c11dcc
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x (4 + 2) = 6
>> Transport-type: tcp
>> Bricks:
>> Brick1: dna-001:/mnt/data01/brick
>> Brick2: dna-001:/mnt/data02/brick
>> Brick3: dna-002:/mnt/data01/brick
>> Brick4: dna-002:/mnt/data02/brick
>> Brick5: dna-003:/mnt/data01/brick
>> Brick6: dna-003:/mnt/data02/brick
>> Options Reconfigured:
>> transport.address-family: inet
>> nfs.disable: on
>>
>> Anyone know the reason for the slow speeds on
>> disperse vs distribute?
>>
>> kind regards
>> ingard
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>
>>
>>
>>
>> --
>> Pranith
>>
>>
>>
>>
>> --
>> Ingard Mevåg
>> Operations Manager
>> Jottacloud
>>
>> Mobile: +47 450 22 834
>> E-mail: ingard at jottacloud.com
>> Website: www.jottacloud.com
>>
>>
>>
>>
>> --
>> Pranith
>>
>>
>>
>>
>> --
>> Ingard Mevåg
>> Operations Manager
>> Jottacloud
>>
>> Mobile: +47 450 22 834
>> E-mail: ingard at jottacloud.com
>> Website: www.jottacloud.com
>>
>>
>>
>>
>> --
>> Ingard Mevåg
>> Operations Manager
>> Jottacloud
>>
>> Mobile: +47 450 22 834
>> E-mail: ingard at jottacloud.com
>> Website: www.jottacloud.com
>>
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>
>>
>