[Gluster-users] Slow performance of gluster volume

Abi Askushi rightkicktech at gmail.com
Wed Sep 6 11:20:35 UTC 2017


I tried to follow the steps from
https://wiki.centos.org/SpecialInterestGroup/Storage to install the latest
gluster on the first node.
It installed 3.10 and not 3.11. I am not sure how to install 3.11 without
compiling it.
Then, when I tried to start gluster on the node, the bricks were reported
down (the other 2 nodes are still on 3.8). Not sure why. The logs were showing
the below (even after rebooting the server):

[2017-09-06 10:56:09.023777] E [rpcsvc.c:557:rpcsvc_check_and_reply_error]
0-rpcsvc: rpc actor failed to complete successfully
[2017-09-06 10:56:09.024122] E [server-helpers.c:395:server_alloc_frame]
(-->/lib64/libgfrpc.so.0(rpcsvc_handle_rpc_call+0x325) [0x7f2d0ec20905]
-->/usr/lib64/glusterfs/3.10.5/xlator/protocol/server.so(+0x3006b)
[0x7f2cfa4bf06b]
-->/usr/lib64/glusterfs/3.10.5/xlator/protocol/server.so(+0xdb34)
[0x7f2cfa49cb34] ) 0-server: invalid argument: client [Invalid argument]

Do I need to upgrade all nodes before I attempt to start the gluster
services?
I have reverted the first node back to 3.8 for the moment and everything is
restored.
Also, tests with eager lock disabled did not make any difference.
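
For anyone who wants to repeat the eager-lock test, it can be sketched roughly as below. This is only an outline under some assumptions: the volume name vms and the dd command come from this thread, while MNT (the mount point), DRY_RUN and the run helper are hypothetical names introduced for illustration. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch: disable eager-lock, profile one direct-I/O dd write, restore.
# VOL is the volume name from this thread. MNT is a HYPOTHETICAL mount
# point and DRY_RUN is a HYPOTHETICAL helper variable (not a gluster
# feature); adjust both to the actual setup.
VOL="${VOL:-vms}"
MNT="${MNT:-/mnt/vms}"
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

eager_lock_test() {
    run gluster volume set "$VOL" cluster.eager-lock off
    run gluster volume profile "$VOL" start
    # Same dd test as used elsewhere in this thread (writes 1G, O_DIRECT).
    run dd if=/dev/zero of="$MNT/testfile" bs=1G count=1 oflag=direct
    run gluster volume profile "$VOL" info
    run gluster volume profile "$VOL" stop
    # Put the option back to the value shown in the volume info.
    run gluster volume set "$VOL" cluster.eager-lock enable
}

eager_lock_test
```

Running it with DRY_RUN=0 on one of the hypervisors would actually execute the commands; if profiling is already enabled, the start step will just report that.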




On Wed, Sep 6, 2017 at 11:15 AM, Krutika Dhananjay <kdhananj at redhat.com>
wrote:

> Do you see any improvement with 3.11.1? It has a patch that improves
> perf for this kind of workload.
>
> Also, could you disable eager-lock and check if that helps? I see that max
> time is being spent in acquiring locks.
>
> -Krutika
>
> On Wed, Sep 6, 2017 at 1:38 PM, Abi Askushi <rightkicktech at gmail.com>
> wrote:
>
>> Hi Krutika,
>>
>> Is there anything in the profile indicating what is causing this bottleneck?
>> In case I can collect any other info, let me know.
>>
>> Thanx
>>
>> On Sep 5, 2017 13:27, "Abi Askushi" <rightkicktech at gmail.com> wrote:
>>
>> Hi Krutika,
>>
>> Attached are the profile stats. I enabled profiling, then ran some dd tests.
>> Also, 3 Windows VMs are running on top of this volume, but I did not do any
>> stress testing on the VMs. I have left profiling enabled in case more time
>> is needed for useful stats.
>>
>> Thanx
>>
>> On Tue, Sep 5, 2017 at 12:48 PM, Krutika Dhananjay <kdhananj at redhat.com>
>> wrote:
>>
>>> OK my understanding is that with preallocated disks the performance with
>>> and without shard will be the same.
>>>
>>> In any case, please attach the volume profile[1], so we can see what
>>> else is slowing things down.
>>>
>>> -Krutika
>>>
>>> [1] - https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Monitoring%20Workload/#running-glusterfs-volume-profile-command
>>>
>>> On Tue, Sep 5, 2017 at 2:32 PM, Abi Askushi <rightkicktech at gmail.com>
>>> wrote:
>>>
>>>> Hi Krutika,
>>>>
>>>> I already have a preallocated disk on VM.
>>>> Now I am checking performance with dd on the hypervisors which have the
>>>> gluster volume configured.
>>>>
>>>> I also tried several values of shard-block-size and I keep getting the
>>>> same low write performance.
>>>> Enabling client-io-threads also did not have any effect.
>>>>
>>>> The version of gluster I am using is glusterfs 3.8.12 built on May 11
>>>> 2017 18:46:20.
>>>> The setup is a set of 3 CentOS 7.3 servers and oVirt 4.1, using gluster
>>>> as storage.
>>>>
>>>> Below are the current settings:
>>>>
>>>>
>>>> Volume Name: vms
>>>> Type: Replicate
>>>> Volume ID: 4513340d-7919-498b-bfe0-d836b5cea40b
>>>> Status: Started
>>>> Snapshot Count: 0
>>>> Number of Bricks: 1 x (2 + 1) = 3
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: gluster0:/gluster/vms/brick
>>>> Brick2: gluster1:/gluster/vms/brick
>>>> Brick3: gluster2:/gluster/vms/brick (arbiter)
>>>> Options Reconfigured:
>>>> server.event-threads: 4
>>>> client.event-threads: 4
>>>> performance.client-io-threads: on
>>>> features.shard-block-size: 512MB
>>>> cluster.granular-entry-heal: enable
>>>> performance.strict-o-direct: on
>>>> network.ping-timeout: 30
>>>> storage.owner-gid: 36
>>>> storage.owner-uid: 36
>>>> user.cifs: off
>>>> features.shard: on
>>>> cluster.shd-wait-qlength: 10000
>>>> cluster.shd-max-threads: 8
>>>> cluster.locking-scheme: granular
>>>> cluster.data-self-heal-algorithm: full
>>>> cluster.server-quorum-type: server
>>>> cluster.quorum-type: auto
>>>> cluster.eager-lock: enable
>>>> network.remote-dio: off
>>>> performance.low-prio-threads: 32
>>>> performance.stat-prefetch: on
>>>> performance.io-cache: off
>>>> performance.read-ahead: off
>>>> performance.quick-read: off
>>>> transport.address-family: inet
>>>> performance.readdir-ahead: on
>>>> nfs.disable: on
>>>> nfs.export-volumes: on
>>>>
>>>>
>>>> I observed that when testing with dd if=/dev/zero of=testfile bs=1G
>>>> count=1 I get 65MB/s on the vms gluster volume (and the network traffic
>>>> between the servers reaches ~ 500Mbps), while when testing with dd
>>>> if=/dev/zero of=testfile bs=1G count=1 oflag=direct I get a
>>>> consistent 10MB/s, with the network traffic hardly reaching 100Mbps.
>>>>
>>>> Any other things one can do?
>>>>
>>>> On Tue, Sep 5, 2017 at 5:57 AM, Krutika Dhananjay <kdhananj at redhat.com>
>>>> wrote:
>>>>
>>>>> I'm assuming you are using this volume to store vm images, because I
>>>>> see shard in the options list.
>>>>>
>>>>> Speaking from shard translator's POV, one thing you can do to improve
>>>>> performance is to use preallocated images.
>>>>> This will at least eliminate the need for shard to perform multiple
>>>>> steps as part of the writes - such as creating the shard and then writing
>>>>> to it and then updating the aggregated file size - all of which require one
>>>>> network call each, which further get blown up once they reach AFR
>>>>> (replicate) into many more network calls.
>>>>>
>>>>> Second, I'm assuming you're using the default shard block size of 4MB
>>>>> (you can confirm this using `gluster volume get <VOL> shard-block-size`).
>>>>> In our tests, we've found that larger shard sizes perform better. So maybe
>>>>> change the shard-block-size to 64MB (`gluster volume set <VOL>
>>>>> shard-block-size 64MB`).
>>>>>
>>>>> Third, keep stat-prefetch enabled. We've found that qemu sends quite a
>>>>> lot of [f]stats which can be served from the (md)cache to improve
>>>>> performance. So enable that.
>>>>>
>>>>> Also, could you enable client-io-threads and see if that improves
>>>>> performance?
>>>>>
>>>>> Which version of gluster are you using BTW?
>>>>>
>>>>> -Krutika
>>>>>
>>>>>
>>>>> On Tue, Sep 5, 2017 at 4:32 AM, Abi Askushi <rightkicktech at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I have a gluster volume used to host several VMs (managed through
>>>>>> oVirt).
>>>>>> The volume is a replica 3 with arbiter and the 3 servers use 1 Gbit
>>>>>> network for the storage.
>>>>>>
>>>>>> When testing with dd (dd if=/dev/zero of=testfile bs=1G count=1
>>>>>> oflag=direct) outside the volume (e.g. writing to /root/), dd reports
>>>>>> ~ 700MB/s, which is quite decent. When running the same dd on the
>>>>>> gluster volume I get ~ 43 MB/s, which is way lower. During the dd test
>>>>>> on the gluster volume, the network traffic did not exceed 450 Mbps on
>>>>>> the network interface. I would expect it to reach nearly 900 Mbps,
>>>>>> considering that there is 1 Gbit of bandwidth available. This results
>>>>>> in VMs with very slow performance (especially on their write
>>>>>> operations).
>>>>>>
>>>>>> The full details of the volume are below. Any advice on what can be
>>>>>> tweaked will be highly appreciated.
>>>>>>
>>>>>> Volume Name: vms
>>>>>> Type: Replicate
>>>>>> Volume ID: 4513340d-7919-498b-bfe0-d836b5cea40b
>>>>>> Status: Started
>>>>>> Snapshot Count: 0
>>>>>> Number of Bricks: 1 x (2 + 1) = 3
>>>>>> Transport-type: tcp
>>>>>> Bricks:
>>>>>> Brick1: gluster0:/gluster/vms/brick
>>>>>> Brick2: gluster1:/gluster/vms/brick
>>>>>> Brick3: gluster2:/gluster/vms/brick (arbiter)
>>>>>> Options Reconfigured:
>>>>>> cluster.granular-entry-heal: enable
>>>>>> performance.strict-o-direct: on
>>>>>> network.ping-timeout: 30
>>>>>> storage.owner-gid: 36
>>>>>> storage.owner-uid: 36
>>>>>> user.cifs: off
>>>>>> features.shard: on
>>>>>> cluster.shd-wait-qlength: 10000
>>>>>> cluster.shd-max-threads: 8
>>>>>> cluster.locking-scheme: granular
>>>>>> cluster.data-self-heal-algorithm: full
>>>>>> cluster.server-quorum-type: server
>>>>>> cluster.quorum-type: auto
>>>>>> cluster.eager-lock: enable
>>>>>> network.remote-dio: off
>>>>>> performance.low-prio-threads: 32
>>>>>> performance.stat-prefetch: off
>>>>>> performance.io-cache: off
>>>>>> performance.read-ahead: off
>>>>>> performance.quick-read: off
>>>>>> transport.address-family: inet
>>>>>> performance.readdir-ahead: on
>>>>>> nfs.disable: on
>>>>>> nfs.export-volumes: on
>>>>>>
>>>>>>
>>>>>> Thanx,
>>>>>> Alex
>>>>>>