[Gluster-users] Slow performance of gluster volume

Krutika Dhananjay kdhananj at redhat.com
Wed Sep 6 08:15:15 UTC 2017


Do you see any improvement with 3.11.1? That release has a patch that
improves performance for this kind of workload.

Also, could you disable eager-lock and check whether that helps? I see that
most of the time is being spent acquiring locks.
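
In case it helps, that would be something like the following (volume name
assumed to be vms):

    gluster volume set vms cluster.eager-lock off

and then re-run the same dd test.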

-Krutika

On Wed, Sep 6, 2017 at 1:38 PM, Abi Askushi <rightkicktech at gmail.com> wrote:

> Hi Krutika,
>
> Is there anything in the profile indicating what is causing this bottleneck?
> In case I can collect any other info, let me know.
>
> Thanx
>
> On Sep 5, 2017 13:27, "Abi Askushi" <rightkicktech at gmail.com> wrote:
>
> Hi Krutika,
>
> Attached are the profile stats. I enabled profiling and then ran some dd
> tests. Three Windows VMs are also running on top of this volume, but I did
> not do any stress testing on them. I have left profiling enabled in case
> more time is needed for useful stats.
>
> Thanx
>
> On Tue, Sep 5, 2017 at 12:48 PM, Krutika Dhananjay <kdhananj at redhat.com>
> wrote:
>
>> OK, my understanding is that with preallocated disks the performance with
>> and without shard will be the same.
>>
>> In any case, please attach the volume profile[1], so we can see what else
>> is slowing things down.
>>
>> -Krutika
>>
>> [1] - https://gluster.readthedocs.io/en/latest/Administrator%20Gui
>> de/Monitoring%20Workload/#running-glusterfs-volume-profile-command
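>>
>> For quick reference, the steps behind [1] are roughly the following
>> (volume name assumed to be vms; adjust as needed):
>>
>>     gluster volume profile vms start
>>     # ...run the dd tests / normal VM workload...
>>     gluster volume profile vms info > profile.txt
>>     gluster volume profile vms stop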
>>
>> On Tue, Sep 5, 2017 at 2:32 PM, Abi Askushi <rightkicktech at gmail.com>
>> wrote:
>>
>>> Hi Krutika,
>>>
>>> I already have a preallocated disk on the VM.
>>> Now I am checking performance with dd on the hypervisors that have the
>>> gluster volume configured.
>>>
>>> I also tried several values of shard-block-size and I keep getting the
>>> same low write performance.
>>> Enabling client-io-threads also did not have any effect.
>>>
>>> The version of gluster I am using is glusterfs 3.8.12, built on May 11
>>> 2017 18:46:20.
>>> The setup is a set of 3 CentOS 7.3 servers and oVirt 4.1, using gluster
>>> as storage.
>>>
>>> Below are the current settings:
>>>
>>>
>>> Volume Name: vms
>>> Type: Replicate
>>> Volume ID: 4513340d-7919-498b-bfe0-d836b5cea40b
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 1 x (2 + 1) = 3
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: gluster0:/gluster/vms/brick
>>> Brick2: gluster1:/gluster/vms/brick
>>> Brick3: gluster2:/gluster/vms/brick (arbiter)
>>> Options Reconfigured:
>>> server.event-threads: 4
>>> client.event-threads: 4
>>> performance.client-io-threads: on
>>> features.shard-block-size: 512MB
>>> cluster.granular-entry-heal: enable
>>> performance.strict-o-direct: on
>>> network.ping-timeout: 30
>>> storage.owner-gid: 36
>>> storage.owner-uid: 36
>>> user.cifs: off
>>> features.shard: on
>>> cluster.shd-wait-qlength: 10000
>>> cluster.shd-max-threads: 8
>>> cluster.locking-scheme: granular
>>> cluster.data-self-heal-algorithm: full
>>> cluster.server-quorum-type: server
>>> cluster.quorum-type: auto
>>> cluster.eager-lock: enable
>>> network.remote-dio: off
>>> performance.low-prio-threads: 32
>>> performance.stat-prefetch: on
>>> performance.io-cache: off
>>> performance.read-ahead: off
>>> performance.quick-read: off
>>> transport.address-family: inet
>>> performance.readdir-ahead: on
>>> nfs.disable: on
>>> nfs.export-volumes: on
>>>
>>>
>>> I observed that when testing with dd if=/dev/zero of=testfile bs=1G
>>> count=1 I get 65MB/s on the vms gluster volume (and the network traffic
>>> between the servers reaches ~500Mbps), while when testing with dd
>>> if=/dev/zero of=testfile bs=1G count=1 oflag=direct I get a consistent
>>> 10MB/s and the network traffic hardly reaches 100Mbps.
>>>
>>> Is there anything else one can do?
>>>
>>> On Tue, Sep 5, 2017 at 5:57 AM, Krutika Dhananjay <kdhananj at redhat.com>
>>> wrote:
>>>
>>>> I'm assuming you are using this volume to store VM images, because I
>>>> see shard in the options list.
>>>>
>>>> Speaking from the shard translator's POV, one thing you can do to improve
>>>> performance is to use preallocated images.
>>>> This will at least eliminate the need for shard to perform multiple
>>>> steps as part of each write - such as creating the shard, then writing
>>>> to it, then updating the aggregated file size - all of which require one
>>>> network call each, and which get further amplified into many more
>>>> network calls once they reach AFR (replicate).
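>>>>
>>>> For example (just a sketch - the exact method depends on how the images
>>>> are created; oVirt can also do this when you choose preallocated disks):
>>>>
>>>>     # fully preallocate a raw image (hypothetical path and size)
>>>>     qemu-img create -f raw -o preallocation=full /path/to/vm-disk.img 100G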
>>>>
>>>> Second, I'm assuming you're using the default shard block size of 4MB
>>>> (you can confirm this using `gluster volume get <VOL> shard-block-size`).
>>>> In our tests, we've found that larger shard sizes perform better. So maybe
>>>> change the shard-block-size to 64MB (`gluster volume set <VOL>
>>>> shard-block-size 64MB`).
>>>>
>>>> Third, keep stat-prefetch enabled. We've found that qemu sends quite a
>>>> lot of [f]stat calls, which can be served from the (md)cache to improve
>>>> performance.
>>>>
>>>> Also, could you enable client-io-threads and see if that improves
>>>> performance?
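>>>>
>>>> Both of the above can be set with something like (volume name assumed to
>>>> be vms):
>>>>
>>>>     gluster volume set vms performance.stat-prefetch on
>>>>     gluster volume set vms performance.client-io-threads on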
>>>>
>>>> Which version of gluster are you using BTW?
>>>>
>>>> -Krutika
>>>>
>>>>
>>>> On Tue, Sep 5, 2017 at 4:32 AM, Abi Askushi <rightkicktech at gmail.com>
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I have a gluster volume used to host several VMs (managed through
>>>>> oVirt).
>>>>> The volume is a replica 3 with arbiter and the 3 servers use 1 Gbit
>>>>> network for the storage.
>>>>>
>>>>> When testing with dd (dd if=/dev/zero of=testfile bs=1G count=1
>>>>> oflag=direct) outside of the volume (e.g. writing to /root/), the
>>>>> performance reported by dd is ~700MB/s, which is quite decent. When
>>>>> testing the dd on the gluster volume I get ~43 MB/s, which is way lower
>>>>> than the previous. When testing the gluster volume with dd, the network
>>>>> traffic did not exceed 450 Mbps on the network interface. I would expect
>>>>> to reach near 900 Mbps considering that there is 1 Gbit of bandwidth
>>>>> available. This results in VMs with very slow performance (especially on
>>>>> their write operations).
>>>>>
>>>>> The full details of the volume are below. Any advice on what can be
>>>>> tweaked will be highly appreciated.
>>>>>
>>>>> Volume Name: vms
>>>>> Type: Replicate
>>>>> Volume ID: 4513340d-7919-498b-bfe0-d836b5cea40b
>>>>> Status: Started
>>>>> Snapshot Count: 0
>>>>> Number of Bricks: 1 x (2 + 1) = 3
>>>>> Transport-type: tcp
>>>>> Bricks:
>>>>> Brick1: gluster0:/gluster/vms/brick
>>>>> Brick2: gluster1:/gluster/vms/brick
>>>>> Brick3: gluster2:/gluster/vms/brick (arbiter)
>>>>> Options Reconfigured:
>>>>> cluster.granular-entry-heal: enable
>>>>> performance.strict-o-direct: on
>>>>> network.ping-timeout: 30
>>>>> storage.owner-gid: 36
>>>>> storage.owner-uid: 36
>>>>> user.cifs: off
>>>>> features.shard: on
>>>>> cluster.shd-wait-qlength: 10000
>>>>> cluster.shd-max-threads: 8
>>>>> cluster.locking-scheme: granular
>>>>> cluster.data-self-heal-algorithm: full
>>>>> cluster.server-quorum-type: server
>>>>> cluster.quorum-type: auto
>>>>> cluster.eager-lock: enable
>>>>> network.remote-dio: off
>>>>> performance.low-prio-threads: 32
>>>>> performance.stat-prefetch: off
>>>>> performance.io-cache: off
>>>>> performance.read-ahead: off
>>>>> performance.quick-read: off
>>>>> transport.address-family: inet
>>>>> performance.readdir-ahead: on
>>>>> nfs.disable: on
>>>>> nfs.export-volumes: on
>>>>>
>>>>>
>>>>> Thanx,
>>>>> Alex
>>>>>
>>>>> _______________________________________________
>>>>> Gluster-users mailing list
>>>>> Gluster-users at gluster.org
>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>>>
>>>>
>>>>
>>>
>>
>
>