[Gluster-users] [ovirt-users] Re: Announcing Gluster release 5.5

Krutika Dhananjay kdhananj at redhat.com
Mon Apr 1 05:56:11 UTC 2019


Adding back gluster-users
Comments inline ...

On Fri, Mar 29, 2019 at 8:11 PM Olaf Buitelaar <olaf.buitelaar at gmail.com>
wrote:

> Dear Krutika,
>
>
>
> 1. I’ve made 2 profile runs of around 10 minutes (see files
> profile_data.txt and profile_data2.txt). Looking at them, most time seems
> to be spent in the fsync and readdirp FOPs.
>
> Unfortunately I don’t have the profile info for the 3.12.15 version, so it’s
> a bit hard to compare.
>
> One additional thing I do notice: on 1 machine (10.32.9.5) the iowait time
> increased a lot, from an average below 1% to around 12% after
> the upgrade.
>
> So my first suspicion was that lightning strikes twice and I now also have
> a bad disk, but that doesn’t appear to be the case, since all SMART
> statuses report OK.
>
> Also, dd shows performance I would more or less expect:
>
> dd if=/dev/zero of=/data/test_file  bs=100M count=1  oflag=dsync
>
> 1+0 records in
>
> 1+0 records out
>
> 104857600 bytes (105 MB) copied, 0.686088 s, 153 MB/s
>
> dd if=/dev/zero of=/data/test_file  bs=1G count=1  oflag=dsync
>
> 1+0 records in
>
> 1+0 records out
>
> 1073741824 bytes (1.1 GB) copied, 7.61138 s, 141 MB/s
>
> dd if=/dev/urandom of=/data/test_file  bs=1024 count=1000000
>
> 1000000+0 records in
>
> 1000000+0 records out
>
> 1024000000 bytes (1.0 GB) copied, 6.35051 s, 161 MB/s
>
> dd if=/dev/zero of=/data/test_file  bs=1024 count=1000000
>
> 1000000+0 records in
>
> 1000000+0 records out
>
> 1024000000 bytes (1.0 GB) copied, 1.6899 s, 606 MB/s
>
> When I disable this brick (service glusterd stop; pkill glusterfsd),
> performance in gluster is better, but not on par with what it was. Also, the
> CPU usage on the “neighbor” nodes which host the other bricks in the same
> subvolume increases quite a lot in this case, which I wouldn’t expect,
> since they shouldn't have to handle much more work, except flagging shards
> to heal. Iowait also goes to idle once gluster is stopped, so it’s for
> sure gluster which is waiting for I/O.
>
>
>

So I see that FSYNC %-latency is on the higher side. And I also noticed you
don't have direct-io options enabled on the volume.
Could you set the following options on the volume -
# gluster volume set <VOLNAME> network.remote-dio off
# gluster volume set <VOLNAME> performance.strict-o-direct on
and also disable choose-local
# gluster volume set <VOLNAME> cluster.choose-local off
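For reference, a quick way to confirm the new values actually took effect
(just a sanity check; <VOLNAME> as above):
# gluster volume get <VOLNAME> network.remote-dio
# gluster volume get <VOLNAME> performance.strict-o-direct
# gluster volume get <VOLNAME> cluster.choose-local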

Let me know if this helps.
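Also, since the dd runs above were against the brick filesystem, once
strict-o-direct is enabled it may be worth repeating one through the FUSE
mount of the volume with O_DIRECT. The VMs go through libgfapi, so this is
only an approximation of their write path, and the mount path below is just
a placeholder:
# dd if=/dev/zero of=/path/to/gluster/mount/ddtest bs=1M count=1024 oflag=direct
If the mount refuses O_DIRECT, oflag=dsync is the nearest fallback.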

> 2. I’ve attached the mnt log and volume info, but I couldn’t find anything
> relevant in those logs. I think this is because we run the VMs with
> libgfapi:
>
> [root@ovirt-host-01 ~]# engine-config  -g LibgfApiSupported
>
> LibgfApiSupported: true version: 4.2
>
> LibgfApiSupported: true version: 4.1
>
> LibgfApiSupported: true version: 4.3
>
> And I can confirm the qemu process is invoked with the gluster:// address
> for the images.
>
> The message is logged in the /var/lib/libvirt/qemu/<machine>  file, which
> I’ve also included. For a sample case, see around 2019-03-28 20:20:07.
>
> That entry has the error: E [MSGID: 133010]
> [shard.c:2294:shard_common_lookup_shards_cbk] 0-ovirt-kube-shard: Lookup on
> shard 109886 failed. Base file gfid = a38d64bc-a28b-4ee1-a0bb-f919e7a1022c
> [Stale file handle]
>

Could you also attach the brick logs for this volume?
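They should be under /var/log/glusterfs/bricks/ on each brick host (the -l
argument in the glusterfsd command line quoted below points at the exact
file). Grepping them for the base file gfid from that error, around the same
timestamp, is a quick way to pull out the relevant entries, e.g.:
# grep -n 'a38d64bc-a28b-4ee1-a0bb-f919e7a1022c' /var/log/glusterfs/bricks/*.log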


>
> 3. Yes, I see multiple instances for the same brick directory, like:
>
> /usr/sbin/glusterfsd -s 10.32.9.6 --volfile-id
> ovirt-core.10.32.9.6.data-gfs-bricks-brick1-ovirt-core -p
> /var/run/gluster/vols/ovirt-core/10.32.9.6-data-gfs-bricks-brick1-ovirt-core.pid
> -S /var/run/gluster/452591c9165945d9.socket --brick-name
> /data/gfs/bricks/brick1/ovirt-core -l
> /var/log/glusterfs/bricks/data-gfs-bricks-brick1-ovirt-core.log
> --xlator-option *-posix.glusterd-uuid=fb513da6-f3bd-4571-b8a2-db5efaf60cc1
> --process-name brick --brick-port 49154 --xlator-option
> ovirt-core-server.listen-port=49154
>
>
>
> I’ve made an export of the output of ps from the time I observed these
> multiple processes.
>
> In addition to the brick_mux bug noted by Atin, I might also have another
> possible cause: as oVirt moves nodes from non-operational or
> maintenance state to active/activating, it also seems to restart gluster.
> However, I don’t have direct proof for this theory.
>
>
>

+Atin Mukherjee <amukherj at redhat.com> ^^
+Mohit Agrawal <moagrawa at redhat.com>  ^^
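As a quick check in the meantime, counting glusterfsd processes per
--brick-name on each node should make any duplicates obvious (just a sketch):
# pgrep -af glusterfsd | grep -o -- '--brick-name [^ ]*' | sort | uniq -c
Any brick path showing a count above 1 has more than one process serving it.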

-Krutika

> Thanks, Olaf
>
> On Fri, Mar 29, 2019 at 10:03, Sandro Bonazzola <sbonazzo at redhat.com>
> wrote:
>
>>
>>
>> On Thu, Mar 28, 2019 at 17:48, <olaf.buitelaar at gmail.com>
>> wrote:
>>
>>> Dear All,
>>>
>>> I wanted to share my experience upgrading from 4.2.8 to 4.3.1. While
>>> previous upgrades from 4.1 to 4.2 etc. went rather smoothly, this one was
>>> a different experience. After first trying a test upgrade on a 3-node
>>> setup, which went fine, I headed to upgrade the 9-node production
>>> platform, unaware of the backward-compatibility issues between gluster
>>> 3.12.15 and 5.3. After upgrading 2 nodes, the HA engine stopped and
>>> wouldn't start. Vdsm wasn't able to mount the engine storage domain,
>>> since /dom_md/metadata was missing or couldn't be accessed. I restored
>>> this file by getting a good copy from the underlying bricks, removing the
>>> file (and the corresponding gfid's) from the underlying bricks where it
>>> was 0 bytes and marked with the sticky bit, removing the file from the
>>> mount point, and copying the good file back onto the mount point. After
>>> manually mounting the engine domain, manually creating the corresponding
>>> symbolic links in /rhev/data-center and /var/run/vdsm/storage, and fixing
>>> the ownership back to vdsm.kvm (it was root.root), I was able to start
>>> the HA engine again. Since the engine was up again but things seemed
>>> rather unstable, and suspecting an incompatibility between gluster
>>> versions, I decided to continue the upgrade on the other nodes; I thought
>>> it would be best to have them all on the same version rather soon.
>>> However, things went from bad to worse: the engine stopped again, and all
>>> VMs stopped working as well. So on a machine outside the setup I restored
>>> a backup of the engine taken from version 4.2.8 just before the upgrade.
>>> With this engine I was at least able to start some VMs again and finalize
>>> the upgrade. Once upgraded, things didn't stabilize, and I also lost 2
>>> VMs during the process due to image corruption. After figuring out that
>>> gluster 5.3 had quite some issues, I was lucky to see gluster 5.5 was
>>> about to be released; the moment the RPMs were available I installed
>>> them. This helped a lot in terms of stability, for which I'm very
>>> grateful! However, the performance is unfortunately terrible: it's about
>>> 15% of what it was running gluster 3.12.15. It's strange, since a simple
>>> dd shows OK performance, but our actual workload doesn't. I would expect
>>> the performance to be better, given all the improvements made since
>>> gluster 3.12. Does anybody share the same experience?
>>> I really hope gluster 6 will soon be tested with oVirt and released, and
>>> things start to perform and stabilize again... like the good old days. Of
>>> course, if I can do anything, I'm happy to help.
>>>
>>
>> Opened https://bugzilla.redhat.com/show_bug.cgi?id=1693998 to track the
>> rebase on Gluster 6.
>>
>>
>>
>>>
>>> I think the following is a short list of the issues we have after the
>>> migration:
>>> Gluster 5.5:
>>> -       Poor performance for our workload (mostly write-dependent)
>>> -       VMs randomly pause on unknown storage errors, which are “stale
>>> file handle” errors. Corresponding log: Lookup on shard 797 failed. Base
>>> file gfid = 8a27b91a-ff02-42dc-bd4c-caa019424de8 [Stale file handle]
>>> -       Some files are listed twice in a directory (probably related to
>>> the stale file issue?)
>>> Example:
>>> ls -la
>>> /rhev/data-center/59cd53a9-0003-02d7-00eb-0000000001e3/313f5d25-76af-4ecd-9a20-82a2fe815a3c/images/4add6751-3731-4bbd-ae94-aaeed12ea450/
>>> total 3081
>>> drwxr-x---.  2 vdsm kvm    4096 Mar 18 11:34 .
>>> drwxr-xr-x. 13 vdsm kvm    4096 Mar 19 09:42 ..
>>> -rw-rw----.  1 vdsm kvm 1048576 Mar 28 12:55
>>> 1a7cf259-6b29-421d-9688-b25dfaafb13c
>>> -rw-rw----.  1 vdsm kvm 1048576 Mar 28 12:55
>>> 1a7cf259-6b29-421d-9688-b25dfaafb13c
>>> -rw-rw----.  1 vdsm kvm 1048576 Jan 27  2018
>>> 1a7cf259-6b29-421d-9688-b25dfaafb13c.lease
>>> -rw-r--r--.  1 vdsm kvm     290 Jan 27  2018
>>> 1a7cf259-6b29-421d-9688-b25dfaafb13c.meta
>>> -rw-r--r--.  1 vdsm kvm     290 Jan 27  2018
>>> 1a7cf259-6b29-421d-9688-b25dfaafb13c.meta
>>>
>>> - Brick processes sometimes start multiple times. Sometimes I have 5
>>> brick processes for a single volume. Killing all glusterfsd's for the
>>> volume on the machine and running gluster v start <vol> force usually
>>> starts just one again; from then on things look all right.
>>>
>>>
>> May I kindly ask you to open bugs on Gluster for the above issues at
>> https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS ?
>> Sahina?
>>
>>
>>> oVirt 4.3.2.1-1.el7
>>> -       All VM image ownerships are changed to root.root after the VM
>>> is shut down, probably related to
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1666795 but not only scoped
>>> to the HA engine. I’m still in compatibility mode 4.2 for the cluster and
>>> for the VMs, but upgraded to oVirt 4.3.2
>>>
>>
>> Ryan?
>>
>>
>>> -       The network provider is set to OVN, which is fine... actually
>>> cool, only “ovs-vswitchd” is a CPU hog and utilizes 100% CPU
>>>
>>
>> Miguel? Dominik?
>>
>>
>>> -       It seems that on all nodes vdsm tries to get the stats for the HA
>>> engine, which is filling the logs with (not sure if this is new):
>>> [api.virt] FINISH getStats return={'status': {'message': "Virtual
>>> machine does not exist: {'vmId': u'20d69acd-edfd-4aeb-a2ae-49e9c121b7e9'}",
>>> 'code': 1}} from=::1,59290, vmId=20d69acd-edfd-4aeb-a2ae-49e9c121b7e9
>>> (api:54)
>>>
>>
>> Simone?
>>
>>
>>> -       It seems the os_brick message ([root] managedvolume not
>>> supported: Managed Volume Not Supported. Missing package os-brick.:
>>> ('Cannot import os_brick',) (caps:149)) fills the vdsm.log, but for
>>> this I also saw another message, so I suspect this will already be
>>> resolved shortly
>>> -       The machine I used to run the backup HA engine doesn’t want to
>>> be removed from the hosted-engine --vm-status output, not even after
>>> running hosted-engine --clean-metadata --host-id=10 --force-clean or
>>> hosted-engine --clean-metadata --force-clean from the machine itself.
>>>
>>
>> Simone?
>>
>>
>>>
>>> Think that's about it.
>>>
>>> Don’t get me wrong, I don’t want to rant, I just wanted to share my
>>> experience and see where things can be made better.
>>>
>>
>> If not already done, can you please open bugs for the above issues at
>> https://bugzilla.redhat.com/enter_bug.cgi?classification=oVirt ?
>>
>>
>>>
>>>
>>> Best Olaf
>>
>>
>> --
>>
>> SANDRO BONAZZOLA
>>
>> MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
>>
>> Red Hat EMEA <https://www.redhat.com/>
>>
>> sbonazzo at redhat.com
>> <https://red.ht/sig>
>>