From sankarshan.mukhopadhyay at gmail.com Mon Apr 1 00:23:29 2019 From: sankarshan.mukhopadhyay at gmail.com (Sankarshan Mukhopadhyay) Date: Mon, 1 Apr 2019 05:53:29 +0530 Subject: [Gluster-users] [Gluster-devel] Upgrade testing to gluster 6 In-Reply-To: References: Message-ID: Quite a considerable amount of detail here. Thank you! On Fri, Mar 29, 2019 at 11:42 AM Hari Gowtham wrote: > > Hello Gluster users, > > As you all aware that glusterfs-6 is out, we would like to inform you > that, we have spent a significant amount of time in testing > glusterfs-6 in upgrade scenarios. We have done upgrade testing to > glusterfs-6 from various releases like 3.12, 4.1 and 5.3. > > As glusterfs-6 has got in a lot of changes, we wanted to test those portions. > There were xlators (and respective options to enable/disable them) > added and deprecated in glusterfs-6 from various versions [1]. > > We had to check the following upgrade scenarios for all such options > Identified in [1]: > 1) option never enabled and upgraded > 2) option enabled and then upgraded > 3) option enabled and then disabled and then upgraded > > We weren't manually able to check all the combinations for all the options. > So the options involving enabling and disabling xlators were prioritized. > The below are the result of the ones tested. > > Never enabled and upgraded: > checked from 3.12, 4.1, 5.3 to 6 the upgrade works. > > Enabled and upgraded: > Tested for tier which is deprecated, It is not a recommended upgrade. > As expected the volume won't be consumable and will have a few more > issues as well. > Tested with 3.12, 4.1 and 5.3 to 6 upgrade. > > Enabled, disabled before upgrade. > Tested for tier with 3.12 and the upgrade went fine. > > There is one common issue to note in every upgrade. The node being > upgraded is going into disconnected state. You have to flush the iptables > and the restart glusterd on all nodes to fix this. > Is this something that is written in the upgrade notes? I do not seem to recall, if not, I'll send a PR > The testing for enabling new options is still pending. The new options > won't cause as much issues as the deprecated ones so this was put at > the end of the priority list. It would be nice to get contributions > for this. > Did the range of tests lead to any new issues? > For the disable testing, tier was used as it covers most of the xlator > that was removed. And all of these tests were done on a replica 3 volume. > I'm not sure if the Glusto team is reading this, but it would be pertinent to understand if the approach you have taken can be converted into a form of automated testing pre-release. > Note: This is only for upgrade testing of the newly added and removed > xlators. Does not involve the normal tests for the xlator. > > If you have any questions, please feel free to reach us. > > [1] https://docs.google.com/spreadsheets/d/1nh7T5AXaV6kc5KgILOy2pEqjzC3t_R47f1XUXSVFetI/edit?usp=sharing > > Regards, > Hari and Sanju. From hgowtham at redhat.com Mon Apr 1 04:58:21 2019 From: hgowtham at redhat.com (Hari Gowtham) Date: Mon, 1 Apr 2019 10:28:21 +0530 Subject: [Gluster-users] [Gluster-devel] Upgrade testing to gluster 6 In-Reply-To: References: Message-ID: Comments inline. On Mon, Apr 1, 2019 at 5:55 AM Sankarshan Mukhopadhyay wrote: > > Quite a considerable amount of detail here. Thank you! 
> > On Fri, Mar 29, 2019 at 11:42 AM Hari Gowtham wrote: > > > > Hello Gluster users, > > > > As you all aware that glusterfs-6 is out, we would like to inform you > > that, we have spent a significant amount of time in testing > > glusterfs-6 in upgrade scenarios. We have done upgrade testing to > > glusterfs-6 from various releases like 3.12, 4.1 and 5.3. > > > > As glusterfs-6 has got in a lot of changes, we wanted to test those portions. > > There were xlators (and respective options to enable/disable them) > > added and deprecated in glusterfs-6 from various versions [1]. > > > > We had to check the following upgrade scenarios for all such options > > Identified in [1]: > > 1) option never enabled and upgraded > > 2) option enabled and then upgraded > > 3) option enabled and then disabled and then upgraded > > > > We weren't manually able to check all the combinations for all the options. > > So the options involving enabling and disabling xlators were prioritized. > > The below are the result of the ones tested. > > > > Never enabled and upgraded: > > checked from 3.12, 4.1, 5.3 to 6 the upgrade works. > > > > Enabled and upgraded: > > Tested for tier which is deprecated, It is not a recommended upgrade. > > As expected the volume won't be consumable and will have a few more > > issues as well. > > Tested with 3.12, 4.1 and 5.3 to 6 upgrade. > > > > Enabled, disabled before upgrade. > > Tested for tier with 3.12 and the upgrade went fine. > > > > There is one common issue to note in every upgrade. The node being > > upgraded is going into disconnected state. You have to flush the iptables > > and the restart glusterd on all nodes to fix this. > > > > Is this something that is written in the upgrade notes? I do not seem > to recall, if not, I'll send a PR No this wasn't mentioned in the release notes. PRs are welcome. > > > The testing for enabling new options is still pending. The new options > > won't cause as much issues as the deprecated ones so this was put at > > the end of the priority list. It would be nice to get contributions > > for this. > > > > Did the range of tests lead to any new issues? Yes. In the first round of testing we found an issue and had to postpone the release of 6 until the fix was made available. https://bugzilla.redhat.com/show_bug.cgi?id=1684029 And then we tested it again after this patch was made available. and came across this: https://bugzilla.redhat.com/show_bug.cgi?id=1694010 Have mentioned this in the second mail as to how to over this situation for now until the fix is available. > > > For the disable testing, tier was used as it covers most of the xlator > > that was removed. And all of these tests were done on a replica 3 volume. > > > > I'm not sure if the Glusto team is reading this, but it would be > pertinent to understand if the approach you have taken can be > converted into a form of automated testing pre-release. I don't have an answer for this, have CCed Vijay. He might have an idea. > > > Note: This is only for upgrade testing of the newly added and removed > > xlators. Does not involve the normal tests for the xlator. > > > > If you have any questions, please feel free to reach us. > > > > [1] https://docs.google.com/spreadsheets/d/1nh7T5AXaV6kc5KgILOy2pEqjzC3t_R47f1XUXSVFetI/edit?usp=sharing > > > > Regards, > > Hari and Sanju. > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -- Regards, Hari Gowtham. 
From hgowtham at redhat.com Mon Apr 1 05:15:37 2019 From: hgowtham at redhat.com (Hari Gowtham) Date: Mon, 1 Apr 2019 10:45:37 +0530 Subject: [Gluster-users] upgrade best practices In-Reply-To: <9c792d30-0e79-98f7-6b76-9d168c947078@redhat.com> References: <629338fe8720f63420d43fa72cc7b080ba213a4c.camel@gmail.com> <9c792d30-0e79-98f7-6b76-9d168c947078@redhat.com> Message-ID: Hi, As mentioned above you need not stop the whole cluster and then upgrade and restart the gluster processes. We did do the basic rolling upgrade test with replica volume. And things turned out fine. There was this minor issue: https://bugzilla.redhat.com/show_bug.cgi?id=1694010 To overcome this, you will have to check if your upgraded node is getting disconnect. if it does, then you will have to 1) stop glusterd service on all the nodes (only glusterd) 2) flush the iptables (iptables -F) 3) start glusterd If you are fine with stopping your service and upgrading all nodes at the same time, You can go ahead with that as well. On Sun, Mar 31, 2019 at 11:02 PM Soumya Koduri wrote: > > > > On 3/29/19 10:39 PM, Poornima Gurusiddaiah wrote: > > > > > > On Fri, Mar 29, 2019, 10:03 PM Jim Kinney > > wrote: > > > > Currently running 3.12 on Centos 7.6. Doing cleanups on split-brain > > and out of sync, need heal files. > > > > We need to migrate the three replica servers to gluster v. 5 or 6. > > Also will need to upgrade about 80 clients as well. Given that a > > complete removal of gluster will not touch the 200+TB of data on 12 > > volumes, we are looking at doing that process, Stop all clients, > > stop all glusterd services, remove all of it, install new version, > > setup new volumes from old bricks, install new clients, mount > > everything. > > > > We would like to get some better performance from nfs-ganesha mounts > > but that doesn't look like an option (not done any parameter tweaks > > in testing yet). At a bare minimum, we would like to minimize the > > total downtime of all systems. > > Could you please be more specific here? As in are you looking for better > performance during upgrade process or in general? Compared to 3.12, > there are lot of perf improvements done in both glusterfs and esp., > nfs-ganesha (latest stable - V2.7.x) stack. If you could provide more > information about your workloads (for eg., large-file,small-files, > metadata-intensive) , we can make some recommendations wrt to configuration. > > Thanks, > Soumya > > > > > Does this process make more sense than a version upgrade process to > > 4.1, then 5, then 6? What "gotcha's" do I need to be ready for? I > > have until late May to prep and test on old, slow hardware with a > > small amount of files and volumes. > > > > > > You can directly upgrade from 3.12 to 6.x. I would suggest that rather > > than deleting and creating Gluster volume. +Hari and +Sanju for further > > guidelines on upgrade, as they recently did upgrade tests. +Soumya to > > add to the nfs-ganesha aspect. > > > > Regards, > > Poornima > > > > -- > > > > James P. Kinney III > > > > Every time you stop a school, you will have to build a jail. What you > > gain at one end you lose at the other. It's like feeding a dog on his > > own tail. It won't fatten the dog. > > - Speech 11/23/1900 Mark Twain > > > > http://heretothereideas.blogspot.com/ > > > > _______________________________________________ > > Gluster-users mailing list > > Gluster-users at gluster.org > > https://lists.gluster.org/mailman/listinfo/gluster-users > > -- Regards, Hari Gowtham. 
From kdhananj at redhat.com Mon Apr 1 05:56:11 2019 From: kdhananj at redhat.com (Krutika Dhananjay) Date: Mon, 1 Apr 2019 11:26:11 +0530 Subject: [Gluster-users] [ovirt-users] Re: Announcing Gluster release 5.5 In-Reply-To: References: <20190328164716.27693.35887@mail.ovirt.org> Message-ID: Adding back gluster-users Comments inline ... On Fri, Mar 29, 2019 at 8:11 PM Olaf Buitelaar wrote: > Dear Krutika, > > > > 1. I?ve made 2 profile runs of around 10 minutes (see files > profile_data.txt and profile_data2.txt). Looking at it, most time seems be > spent at the fop?s fsync and readdirp. > > Unfortunate I don?t have the profile info for the 3.12.15 version so it?s > a bit hard to compare. > > One additional thing I do notice on 1 machine (10.32.9.5) the iowait time > increased a lot, from an average below the 1% it?s now around the 12% after > the upgrade. > > So first suspicion with be lighting strikes twice, and I?ve also just now > a bad disk, but that doesn?t appear to be the case, since all smart status > report ok. > > Also dd shows performance I would more or less expect; > > dd if=/dev/zero of=/data/test_file bs=100M count=1 oflag=dsync > > 1+0 records in > > 1+0 records out > > 104857600 bytes (105 MB) copied, 0.686088 s, 153 MB/s > > dd if=/dev/zero of=/data/test_file bs=1G count=1 oflag=dsync > > 1+0 records in > > 1+0 records out > > 1073741824 bytes (1.1 GB) copied, 7.61138 s, 141 MB/s > > if=/dev/urandom of=/data/test_file bs=1024 count=1000000 > > 1000000+0 records in > > 1000000+0 records out > > 1024000000 bytes (1.0 GB) copied, 6.35051 s, 161 MB/s > > dd if=/dev/zero of=/data/test_file bs=1024 count=1000000 > > 1000000+0 records in > > 1000000+0 records out > > 1024000000 bytes (1.0 GB) copied, 1.6899 s, 606 MB/s > > When I disable this brick (service glusterd stop; pkill glusterfsd) > performance in gluster is better, but not on par with what it was. Also the > cpu usages on the ?neighbor? nodes which hosts the other bricks in the same > subvolume increases quite a lot in this case, which I wouldn?t expect > actually since they shouldn't handle much more work, except flagging shards > to heal. Iowait also goes to idle once gluster is stopped, so it?s for > sure gluster which waits for io. > > > So I see that FSYNC %-latency is on the higher side. And I also noticed you don't have direct-io options enabled on the volume. Could you set the following options on the volume - # gluster volume set network.remote-dio off # gluster volume set performance.strict-o-direct on and also disable choose-local # gluster volume set cluster.choose-local off let me know if this helps. 2. I?ve attached the mnt log and volume info, but I couldn?t find anything > relevant in in those logs. I think this is because we run the VM?s with > libgfapi; > > [root at ovirt-host-01 ~]# engine-config -g LibgfApiSupported > > LibgfApiSupported: true version: 4.2 > > LibgfApiSupported: true version: 4.1 > > LibgfApiSupported: true version: 4.3 > > And I can confirm the qemu process is invoked with the gluster:// address > for the images. > > The message is logged in the /var/lib/libvert/qemu/ file, which > I?ve also included. For a sample case see around; 2019-03-28 20:20:07 > > Which has the error; E [MSGID: 133010] > [shard.c:2294:shard_common_lookup_shards_cbk] 0-ovirt-kube-shard: Lookup on > shard 109886 failed. Base file gfid = a38d64bc-a28b-4ee1-a0bb-f919e7a1022c > [Stale file handle] > Could you also attach the brick logs for this volume? > > 3. 
yes I see multiple instances for the same brick directory, like; > > /usr/sbin/glusterfsd -s 10.32.9.6 --volfile-id > ovirt-core.10.32.9.6.data-gfs-bricks-brick1-ovirt-core -p > /var/run/gluster/vols/ovirt-core/10.32.9.6-data-gfs-bricks-brick1-ovirt-core.pid > -S /var/run/gluster/452591c9165945d9.socket --brick-name > /data/gfs/bricks/brick1/ovirt-core -l > /var/log/glusterfs/bricks/data-gfs-bricks-brick1-ovirt-core.log > --xlator-option *-posix.glusterd-uuid=fb513da6-f3bd-4571-b8a2-db5efaf60cc1 > --process-name brick --brick-port 49154 --xlator-option > ovirt-core-server.listen-port=49154 > > > > I?ve made an export of the output of ps from the time I observed these > multiple processes. > > In addition the brick_mux bug as noted by Atin. I might also have another > possible cause, as ovirt moves nodes from none-operational state or > maintenance state to active/activating, it also seems to restart gluster, > however I don?t have direct proof for this theory. > > > +Atin Mukherjee ^^ +Mohit Agrawal ^^ -Krutika Thanks Olaf > > Op vr 29 mrt. 2019 om 10:03 schreef Sandro Bonazzola >: > >> >> >> Il giorno gio 28 mar 2019 alle ore 17:48 ha >> scritto: >> >>> Dear All, >>> >>> I wanted to share my experience upgrading from 4.2.8 to 4.3.1. While >>> previous upgrades from 4.1 to 4.2 etc. went rather smooth, this one was a >>> different experience. After first trying a test upgrade on a 3 node setup, >>> which went fine. i headed to upgrade the 9 node production platform, >>> unaware of the backward compatibility issues between gluster 3.12.15 -> >>> 5.3. After upgrading 2 nodes, the HA engine stopped and wouldn't start. >>> Vdsm wasn't able to mount the engine storage domain, since /dom_md/metadata >>> was missing or couldn't be accessed. Restoring this file by getting a good >>> copy of the underlying bricks, removing the file from the underlying bricks >>> where the file was 0 bytes and mark with the stickybit, and the >>> corresponding gfid's. Removing the file from the mount point, and copying >>> back the file on the mount point. Manually mounting the engine domain, and >>> manually creating the corresponding symbolic links in /rhev/data-center and >>> /var/run/vdsm/storage and fixing the ownership back to vdsm.kvm (which was >>> root.root), i was able to start the HA engine again. Since the engine was >>> up again, and things seemed rather unstable i decided to continue the >>> upgrade on the other nodes suspecting an incompatibility in gluster >>> versions, i thought would be best to have them all on the same version >>> rather soonish. However things went from bad to worse, the engine stopped >>> again, and all vm?s stopped working as well. So on a machine outside the >>> setup and restored a backup of the engine taken from version 4.2.8 just >>> before the upgrade. With this engine I was at least able to start some vm?s >>> again, and finalize the upgrade. Once the upgraded, things didn?t stabilize >>> and also lose 2 vm?s during the process due to image corruption. After >>> figuring out gluster 5.3 had quite some issues I was as lucky to see >>> gluster 5.5 was about to be released, on the moment the RPM?s were >>> available I?ve installed those. This helped a lot in terms of stability, >>> for which I?m very grateful! However the performance is unfortunate >>> terrible, it?s about 15% of what the performance was running gluster >>> 3.12.15. It?s strange since a simple dd shows ok performance, but our >>> actual workload doesn?t. 
While I would expect the performance to be better, >>> due to all improvements made since gluster version 3.12. Does anybody share >>> the same experience? >>> I really hope gluster 6 will soon be tested with ovirt and released, and >>> things start to perform and stabilize again..like the good old days. Of >>> course when I can do anything, I?m happy to help. >>> >> >> Opened https://bugzilla.redhat.com/show_bug.cgi?id=1693998 to track the >> rebase on Gluster 6. >> >> >> >>> >>> I think the following short list of issues we have after the migration; >>> Gluster 5.5; >>> - Poor performance for our workload (mostly write dependent) >>> - VM?s randomly pause on unknown storage errors, which are ?stale >>> file?s?. corresponding log; Lookup on shard 797 failed. Base file gfid = >>> 8a27b91a-ff02-42dc-bd4c-caa019424de8 [Stale file handle] >>> - Some files are listed twice in a directory (probably related the >>> stale file issue?) >>> Example; >>> ls -la >>> /rhev/data-center/59cd53a9-0003-02d7-00eb-0000000001e3/313f5d25-76af-4ecd-9a20-82a2fe815a3c/images/4add6751-3731-4bbd-ae94-aaeed12ea450/ >>> total 3081 >>> drwxr-x---. 2 vdsm kvm 4096 Mar 18 11:34 . >>> drwxr-xr-x. 13 vdsm kvm 4096 Mar 19 09:42 .. >>> -rw-rw----. 1 vdsm kvm 1048576 Mar 28 12:55 >>> 1a7cf259-6b29-421d-9688-b25dfaafb13c >>> -rw-rw----. 1 vdsm kvm 1048576 Mar 28 12:55 >>> 1a7cf259-6b29-421d-9688-b25dfaafb13c >>> -rw-rw----. 1 vdsm kvm 1048576 Jan 27 2018 >>> 1a7cf259-6b29-421d-9688-b25dfaafb13c.lease >>> -rw-r--r--. 1 vdsm kvm 290 Jan 27 2018 >>> 1a7cf259-6b29-421d-9688-b25dfaafb13c.meta >>> -rw-r--r--. 1 vdsm kvm 290 Jan 27 2018 >>> 1a7cf259-6b29-421d-9688-b25dfaafb13c.meta >>> >>> - brick processes sometimes starts multiple times. Sometimes I?ve 5 >>> brick processes for a single volume. Killing all glusterfsd?s for the >>> volume on the machine and running gluster v start force usually just >>> starts one after the event, from then on things look all right. >>> >>> >> May I kindly ask to open bugs on Gluster for above issues at >> https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS ? >> Sahina? >> >> >>> Ovirt 4.3.2.1-1.el7 >>> - All vms images ownership are changed to root.root after the vm >>> is shutdown, probably related to; >>> https://bugzilla.redhat.com/show_bug.cgi?id=1666795 but not only scoped >>> to the HA engine. I?m still in compatibility mode 4.2 for the cluster and >>> for the vm?s, but upgraded to version ovirt 4.3.2 >>> >> >> Ryan? >> >> >>> - The network provider is set to ovn, which is fine..actually >>> cool, only the ?ovs-vswitchd? is a CPU hog, and utilizes 100% >>> >> >> Miguel? Dominik? >> >> >>> - It seems on all nodes vdsm tries to get the the stats for the HA >>> engine, which is filling the logs with (not sure if this is new); >>> [api.virt] FINISH getStats return={'status': {'message': "Virtual >>> machine does not exist: {'vmId': u'20d69acd-edfd-4aeb-a2ae-49e9c121b7e9'}", >>> 'code': 1}} from=::1,59290, vmId=20d69acd-edfd-4aeb-a2ae-49e9c121b7e9 >>> (api:54) >>> >> >> Simone? >> >> >>> - It seems the package os_brick [root] managedvolume not >>> supported: Managed Volume Not Supported. 
Missing package os-brick.: >>> ('Cannot import os_brick',) (caps:149) which fills the vdsm.log, but for >>> this I also saw another message, so I suspect this will already be resolved >>> shortly >>> - The machine I used to run the backup HA engine, doesn?t want to >>> get removed from the hosted-engine ?vm-status, not even after running; >>> hosted-engine --clean-metadata --host-id=10 --force-clean or hosted-engine >>> --clean-metadata --force-clean from the machine itself. >>> >> >> Simone? >> >> >>> >>> Think that's about it. >>> >>> Don?t get me wrong, I don?t want to rant, I just wanted to share my >>> experience and see where things can made better. >>> >> >> If not already done, can you please open bugs for above issues at >> https://bugzilla.redhat.com/enter_bug.cgi?classification=oVirt ? >> >> >>> >>> >>> Best Olaf >>> _______________________________________________ >>> Users mailing list -- users at ovirt.org >>> To unsubscribe send an email to users-leave at ovirt.org >>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/ >>> oVirt Code of Conduct: >>> https://www.ovirt.org/community/about/community-guidelines/ >>> List Archives: >>> https://lists.ovirt.org/archives/list/users at ovirt.org/message/3CO35Q7VZMWNHS4LPUJNO7S47MGLSKS5/ >>> >> >> >> -- >> >> SANDRO BONAZZOLA >> >> MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV >> >> Red Hat EMEA >> >> sbonazzo at redhat.com >> >> > _______________________________________________ > Users mailing list -- users at ovirt.org > To unsubscribe send an email to users-leave at ovirt.org > Privacy Statement: https://www.ovirt.org/site/privacy-policy/ > oVirt Code of Conduct: > https://www.ovirt.org/community/about/community-guidelines/ > List Archives: > https://lists.ovirt.org/archives/list/users at ovirt.org/message/HAGTA64LF7LLE6YMHQ6DLT26MD2GZ2PK/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From revirii at googlemail.com Mon Apr 1 07:41:51 2019 From: revirii at googlemail.com (Hu Bert) Date: Mon, 1 Apr 2019 09:41:51 +0200 Subject: [Gluster-users] Lots of connections on clients - appropriate values for various thread parameters In-Reply-To: References: Message-ID: Good morning, it seems like setting performance.quick-read to off (context: increased network traffic https://bugzilla.redhat.com/show_bug.cgi?id=1673058) solved the main problem. See those 2 munin graphs, especially network and iowait on March 24th and 31st (high traffic days); param was set to off on March 26th. network: https://abload.de/img/client-internal-netwoh3kh7.png cpu: https://abload.de/img/client-cpu-iowaitatkfc.png I'll keep watching this, but hopefully the problems have disappeared. Awaiting glusterfs v5.6 with the bugfix; then, after re-enabling quick-read, i'll check again. Regards, Hubert Am Fr., 29. M?rz 2019 um 07:47 Uhr schrieb Hu Bert : > > Hi Raghavendra, > > i'll try to gather the information you need, hopefully this weekend. > > One thing i've done this week: deactivate performance.quick-read > (https://bugzilla.redhat.com/show_bug.cgi?id=1673058), which > (according to munin) ended in a massive drop in network traffic and a > slightly lower iowait. Maybe that has helped already. We'll see. > > performance.nl-cache is deactivated due to unreadable > files/directories; we have a highly concurrent workload. 
There are > some nginx backend webservers that check if a requested file exists in > the glusterfs filesystem; i counted the log entries, this can be up to > 5 million entries a day; about 2/3 of the files are found in the > filesystem, they get delivered to the frontend; if not: the nginx's > send the request via round robin to 3 backend tomcats, and they have > to check whether a directory exists or not (and then create it and the > requested files). So it happens that tomcatA creates a directory and a > file in it, and within (milli)seconds tomcatB+C create additional > files in this dir. > > Deactivating nl-cache helped to solve this issue, after having > conversation with Nithya and Ravishankar. Just wanted to explain that. > > > Thx so far, > Hubert > > Am Fr., 29. M?rz 2019 um 06:29 Uhr schrieb Raghavendra Gowdappa > : > > > > +Gluster-users > > > > Sorry about the delay. There is nothing suspicious about per thread CPU utilization of glusterfs process. However looking at the volume profile attached I see huge number of lookups. I think if we cutdown the number of lookups probably we'll see improvements in performance. I need following information: > > > > * dump of fuse traffic under heavy load (use --dump-fuse option while mounting) > > * client volume profile for the duration of heavy load - https://docs.gluster.org/en/latest/Administrator%20Guide/Performance%20Testing/ > > * corresponding brick volume profile > > > > Basically I need to find out > > * whether these lookups are on existing files or non-existent files > > * whether they are on directories or files > > * why/whether md-cache or kernel attribute cache or nl-cache will help to cut down lookups. > > > > regards, > > Raghavendra > > > > On Mon, Mar 25, 2019 at 12:13 PM Hu Bert wrote: > >> > >> Hi Raghavendra, > >> > >> sorry, this took a while. The last weeks the weather was bad -> less > >> traffic, but this weekend there was a massive peak. I made 3 profiles > >> with top, but at first look there's nothing special here. > >> > >> I also made a gluster profile (on one of the servers) at a later > >> moment. Maybe that helps. I also added some munin graphics from 2 of > >> the clients and 1 graphic of server network, just to show how massive > >> the problem is. > >> > >> Just wondering if the high io wait is related to the high network > >> traffic bug (https://bugzilla.redhat.com/show_bug.cgi?id=1673058); if > >> so, i could deactivate performance.quick-read and check if there is > >> less iowait. If that helps: wonderful - and yearningly awaiting > >> updated packages (e.g. v5.6). If not: maybe we have to switch from our > >> normal 10TB hdds (raid10) to SSDs if the problem is based on slow > >> hardware in the use case of small files (images). > >> > >> > >> Thx, > >> Hubert > >> > >> Am Mo., 4. M?rz 2019 um 16:59 Uhr schrieb Raghavendra Gowdappa > >> : > >> > > >> > Were you seeing high Io-wait when you captured the top output? I guess not as you mentioned the load increases during weekend. Please note that this data has to be captured when you are experiencing problems. > >> > > >> > On Mon, Mar 4, 2019 at 8:02 PM Hu Bert wrote: > >> >> > >> >> Hi, > >> >> sending the link directly to you and not the list, you can distribute > >> >> if necessary. the command ran for about half a minute. Is that enough? > >> >> More? Less? > >> >> > >> >> https://download.outdooractive.com/top.output.tar.gz > >> >> > >> >> Am Mo., 4. 
M?rz 2019 um 15:21 Uhr schrieb Raghavendra Gowdappa > >> >> : > >> >> > > >> >> > > >> >> > > >> >> > On Mon, Mar 4, 2019 at 7:47 PM Raghavendra Gowdappa wrote: > >> >> >> > >> >> >> > >> >> >> > >> >> >> On Mon, Mar 4, 2019 at 4:26 PM Hu Bert wrote: > >> >> >>> > >> >> >>> Hi Raghavendra, > >> >> >>> > >> >> >>> at the moment iowait and cpu consumption is quite low, the main > >> >> >>> problems appear during the weekend (high traffic, especially on > >> >> >>> sunday), so either we have to wait until next sunday or use a time > >> >> >>> machine ;-) > >> >> >>> > >> >> >>> I made a screenshot of top (https://abload.de/img/top-hvvjt2.jpg) and > >> >> >>> a text output (https://pastebin.com/TkTWnqxt), maybe that helps. Seems > >> >> >>> like processes like glfs_fuseproc (>204h) and glfs_epoll (64h for each > >> >> >>> process) consume a lot of CPU (uptime 24 days). Is that already > >> >> >>> helpful? > >> >> >> > >> >> >> > >> >> >> Not much. The TIME field just says the amount of time the thread has been executing. Since its a long standing mount, we can expect such large values. But, the value itself doesn't indicate whether the thread itself was overloaded at any (some) interval(s). > >> >> >> > >> >> >> Can you please collect output of following command and send back the collected data? > >> >> >> > >> >> >> # top -bHd 3 > top.output > >> >> > > >> >> > > >> >> > Please collect this on problematic mounts and bricks. > >> >> > > >> >> >> > >> >> >>> > >> >> >>> > >> >> >>> Hubert > >> >> >>> > >> >> >>> Am Mo., 4. M?rz 2019 um 11:31 Uhr schrieb Raghavendra Gowdappa > >> >> >>> : > >> >> >>> > > >> >> >>> > what is the per thread CPU usage like on these clients? With highly concurrent workloads we've seen single thread that reads requests from /dev/fuse (fuse reader thread) becoming bottleneck. Would like to know what is the cpu usage of this thread looks like (you can use top -H). > >> >> >>> > > >> >> >>> > On Mon, Mar 4, 2019 at 3:39 PM Hu Bert wrote: > >> >> >>> >> > >> >> >>> >> Good morning, > >> >> >>> >> > >> >> >>> >> we use gluster v5.3 (replicate with 3 servers, 2 volumes, raid10 as > >> >> >>> >> brick) with at the moment 10 clients; 3 of them do heavy I/O > >> >> >>> >> operations (apache tomcats, read+write of (small) images). These 3 > >> >> >>> >> clients have a quite high I/O wait (stats from yesterday) as can be > >> >> >>> >> seen here: > >> >> >>> >> > >> >> >>> >> client: https://abload.de/img/client1-cpu-dayulkza.png > >> >> >>> >> server: https://abload.de/img/server1-cpu-dayayjdq.png > >> >> >>> >> > >> >> >>> >> The iowait in the graphics differ a lot. I checked netstat for the > >> >> >>> >> different clients; the other clients have 8 open connections: > >> >> >>> >> https://pastebin.com/bSN5fXwc > >> >> >>> >> > >> >> >>> >> 4 for each server and each volume. The 3 clients with the heavy I/O > >> >> >>> >> have (at the moment) according to netstat 170, 139 and 153 > >> >> >>> >> connections. An example for one client can be found here: > >> >> >>> >> https://pastebin.com/2zfWXASZ > >> >> >>> >> > >> >> >>> >> gluster volume info: https://pastebin.com/13LXPhmd > >> >> >>> >> gluster volume status: https://pastebin.com/cYFnWjUJ > >> >> >>> >> > >> >> >>> >> I just was wondering if the iowait is based on the clients and their > >> >> >>> >> workflow: requesting a lot of files (up to hundreds per second), > >> >> >>> >> opening a lot of connections and the servers aren't able to answer > >> >> >>> >> properly. Maybe something can be tuned here? 
> >> >> >>> >> > >> >> >>> >> Especially the server|client.event-threads (both set to 4) and > >> >> >>> >> performance.(high|normal|low|least)-prio-threads (all at default value > >> >> >>> >> 16) and performance.io-thread-count (32) options, maybe these aren't > >> >> >>> >> properly configured for up to 170 client connections. > >> >> >>> >> > >> >> >>> >> Both servers and clients have a Xeon CPU (6 cores, 12 threads), a 10 > >> >> >>> >> GBit connection and 128G (servers) respectively 256G (clients) RAM. > >> >> >>> >> Enough power :-) > >> >> >>> >> > >> >> >>> >> > >> >> >>> >> Thx for reading && best regards, > >> >> >>> >> > >> >> >>> >> Hubert > >> >> >>> >> _______________________________________________ > >> >> >>> >> Gluster-users mailing list > >> >> >>> >> Gluster-users at gluster.org > >> >> >>> >> https://lists.gluster.org/mailman/listinfo/gluster-users From jim.kinney at gmail.com Mon Apr 1 16:15:10 2019 From: jim.kinney at gmail.com (Jim Kinney) Date: Mon, 01 Apr 2019 12:15:10 -0400 Subject: [Gluster-users] upgrade best practices In-Reply-To: <9c792d30-0e79-98f7-6b76-9d168c947078@redhat.com> References: <629338fe8720f63420d43fa72cc7b080ba213a4c.camel@gmail.com> <9c792d30-0e79-98f7-6b76-9d168c947078@redhat.com> Message-ID: On Sun, 2019-03-31 at 23:01 +0530, Soumya Koduri wrote: > On 3/29/19 10:39 PM, Poornima Gurusiddaiah wrote: > > On Fri, Mar 29, 2019, 10:03 PM Jim Kinney > > wrote: > > Currently running 3.12 on Centos 7.6. Doing cleanups on split- > > brain and out of sync, need heal files. > > We need to migrate the three replica servers to gluster v. 5 or > > 6. Also will need to upgrade about 80 clients as well. Given > > that a complete removal of gluster will not touch the 200+TB of > > data on 12 volumes, we are looking at doing that process, Stop > > all clients, stop all glusterd services, remove all of it, > > install new version, setup new volumes from old bricks, install > > new clients, mount everything. > > We would like to get some better performance from nfs-ganesha > > mounts but that doesn't look like an option (not done any > > parameter tweaks in testing yet). At a bare minimum, we would > > like to minimize the total downtime of all systems. > > Could you please be more specific here? As in are you looking for > better performance during upgrade process or in general? Compared to > 3.12, there are lot of perf improvements done in both glusterfs and > esp., nfs-ganesha (latest stable - V2.7.x) stack. If you could > provide more information about your workloads (for eg., large- > file,small-files, metadata-intensive) , we can make some > recommendations wrt to configuration. Sure. More details: We are (soon to be) running a three-node replica only gluster service (2 nodes now, third is racked and ready for sync and being added to gluster cluster). Each node has 2 external drive arrays plus one internal. Each node has 40G IB plus 40G IP connections (plans to upgrade to 100G). We currently have 9 volumes and each is 7TB up to 50TB of space. Each volume is a mix of thousands of large (>1GB) and tens of thousands of small (~100KB) plus thousands inbetween. Currently we have a 13-node computational cluster with varying GPU abilities that mounts all of these volumes using gluster-fuse. Writes are slow and reads are also as if from a single server. I have data from a test setup (not anywhere near the capacity of the production system - just for testing commands and recoveries) that indicates raw NFS is much faster but no gluster, gluster-fuse is much slower. 
We have mmap issues with python and fuse-mounted locations. Converting to NFS solves this. We have tinkered with kernel settings to handle oom-killer so it will no longer drop glusterfs when an errant job eat all the ram (set oom_score_adj - -1000 for all glusterfs pids). We would like to transition (smoothly!!) to gluster 5 or 6 with nfs- ganesha 2.7 and see some performance improvements. We will be using corosync and pacemaker for NFS failover. It would be fantastic be able to saturate a 10G IPoIB (or 40G IB !) connection to each compute node in the current computational cluster. Right now we absolutely can't get much write speed ( copy a 6.2GB file from host to gluster storage took 1m 21s. cp from disk to /dev/null is 7s). cp from gluster to /dev/null is 1.0m (same 6.2GB file). That's a 10Gbps IPoIB connection at only 800Mbps. We would like to do things like enable SSL encryption of all data flows (we deal with PHI data in a HIPAA-regulated setting) but are concerned about performance. We are running dual Intel Xeon E5-2630L (12 physical cores each @ 2.4GHz) and 128GB RAM in each server node. We have 170 users. About 20 are active at any time. The current setting on /home (others are similar if not identical, maybe nfs-disable is true for others): gluster volume get home allOption Value ------ -- --- cluster.lookup- unhashed on cluste r.lookup- optimize off cluste r.min-free- disk 10% cluster. min-free- inodes 5% cluster. rebalance- stats off cluster.s ubvols-per- directory (null) cluster.rea ddir- optimize off cluster .rsync-hash- regex (null) cluster.ex tra-hash- regex (null) cluster.dh t-xattr- name trusted.glusterfs.dht cluster.r andomize-hash-range-by- gfid off cluster.rebal- throttle normal clust er.lock- migration off clus ter.local-volume- name (null) cluster.weig hted- rebalance on cluster. 
switch- pattern (null) cluste r.entry-change- log on cluster.read -subvolume (null) clu ster.read-subvolume-index - 1 cluster.read-hash- mode 1 cluster.b ackground-self-heal- count 8 cluster.metadata- self- heal on cluster.data- self- heal on cluster.e ntry-self- heal on cluster.se lf-heal- daemon enable cluster.h eal- timeout 600 clus ter.self-heal-window- size 1 cluster.data- change- log on cluster.met adata-change- log on cluster.data- self-heal- algorithm (null) cluster.eager- lock on dispe rse.eager- lock on cluste r.quorum- type none cluste r.quorum- count (null) cluste r.choose- local true cluste r.self-heal-readdir- size 1KB cluster.post-op- delay- secs 1 cluster.ensur e- durability on cluste r.consistent- metadata no cluster.he al-wait-queue- length 128 cluster.favorit e-child- policy none cluster.stripe -block- size 128KB cluster.stri pe- coalesce true diagno stics.latency- measurement off diagnostics .dump-fd- stats off diagnostics .count-fop- hits off diagnostics.b rick-log- level INFO diagnostics.c lient-log- level INFO diagnostics.br ick-sys-log- level CRITICAL diagnostics.clien t-sys-log- level CRITICAL diagnostics.brick- logger (null) diagnosti cs.client- logger (null) diagnostic s.brick-log- format (null) diagnostics.c lient-log- format (null) diagnostics.br ick-log-buf- size 5 diagnostics.clien t-log-buf- size 5 diagnostics.brick- log-flush- timeout 120 diagnostics.client- log-flush- timeout 120 diagnostics.stats- dump- interval 0 diagnostics.fo p-sample- interval 0 diagnostics.st ats-dump- format json diagnostics.fo p-sample-buf- size 65535 diagnostics.stats- dnscache-ttl- sec 86400 performance.cache-max- file- size 0 performance.cache- min-file- size 0 performance.cache- refresh- timeout 1 performance.cache -priority performa nce.cache- size 32MB performan ce.io-thread- count 16 performance.h igh-prio- threads 16 performance.n ormal-prio- threads 16 performance.low -prio- threads 16 performance. least-prio- threads 1 performance.en able-least- priority on performance.cach e- size 128MB performan ce.flush- behind on performan ce.nfs.flush- behind on performance.w rite-behind-window- size 1MB performance.resync- failed-syncs-after- fsyncoff performance.nfs.write- behind-window- size1MB performance.strict-o- direct off performance. nfs.strict-o- direct off performance.stri ct-write- ordering off performance.nfs. strict-write- ordering off performance.lazy- open yes performa nce.read-after- open no performance.re ad-ahead-page- count 4 performance.md- cache- timeout 1 performance. cache-swift- metadata true performance.cac he-samba- metadata false performance.cac he-capability- xattrs true performance.cache- ima- xattrs true features.encr yption off encr yption.master- key (null) encryptio n.data-key- size 256 encryption. block- size 4096 network. frame- timeout 1800 netwo rk.ping- timeout 42 netw ork.tcp-window- size (null) features.l ock- heal off featu res.grace- timeout 10 networ k.remote- dio disable client .event- threads 2 clie nt.tcp-user- timeout 0 client. keepalive- time 20 client.k eepalive- interval 2 client.k eepalive- count 9 network. 
tcp-window- size (null) network.in ode-lru- limit 16384 auth.allo w * auth.reject (null) transport.keepalive 1 server.allow- insecure (null) serv er.root- squash off ser ver.anonuid 65534 server.anongid 65534 server.statedump- path /var/run/gluster server.o utstanding-rpc- limit 64 features.lock- heal off featu res.grace- timeout 10 server .ssl (null) auth.ssl- allow * server.manage- gids off serve r.dynamic- auth on client .send- gids on ser ver.gid- timeout 300 se rver.own- thread (null) se rver.event- threads 1 serv er.tcp-user- timeout 0 server. keepalive- time 20 server.k eepalive- interval 2 server.k eepalive- count 9 transpor t.listen- backlog 10 ssl.own- cert (null) ssl.private- key (null) ssl .ca- list (null) ssl.crl- path (null) ssl.certificate- depth (null) ssl.cip her- list (null) ss l.dh- param (null) ssl.ec- curve (null) performance.write- behind on performan ce.read- ahead on performa nce.readdir- ahead off performance .io- cache on perfor mance.quick- read on performan ce.open- behind on performa nce.nl- cache off perfor mance.stat- prefetch on performa nce.client-io- threads off performance.n fs.write- behind on performance.n fs.read- ahead off performance. nfs.io- cache off performanc e.nfs.quick- read off performance.n fs.stat- prefetch off performance. nfs.io- threads off performanc e.force- readdirp true performan ce.cache- invalidation false features. uss off features.snapshot- directory .snaps features. show-snapshot- directory off network.compre ssion off netwo rk.compression.window-size - 15 network.compression.mem- level 8 network.compres sion.min- size 0 network.compres sion.compression-level - 1 network.compression.debug false features.limit- usage (null) featur es.default-soft- limit 80% features.soft -timeout 60 feat ures.hard- timeout 5 featu res.alert- time 86400 featur es.quota-deem- statfs off geo- replication.indexing off geo- replication.indexing off geo-replication.ignore-pid- check off geo- replication.ignore-pid- check off features.quota off features. inode- quota off featur es.bitrot disable debug.trace off debug.log- history no d ebug.log- file no d ebug.exclude- ops (null) debug .include- ops (null) debug .error- gen off deb ug.error- failure (null) deb ug.error- number (null) deb ug.random- failure off debu g.error- fops (null) nfs .enable- ino32 no nf s.mem- factor 15 nfs.export- dirs on nf s.export- volumes on nf s.addr- namelookup off nfs.dynamic- volumes off nfs .register-with- portmap on nfs.outst anding-rpc- limit 16 nfs.port 2049 nf s.rpc-auth- unix on nfs. rpc-auth- null on nfs. rpc-auth- allow all nfs. rpc-auth- reject none nfs. ports- insecure off n fs.trusted- sync off nfs .trusted- write off nfs .volume-access read- write nfs.export- dir nf s.disable off nfs.nlm on nfs.acl on nfs.mount- udp off n fs.mount- rmtab /var/lib/glusterd/nfs/rmtab n fs.rpc- statd /sbin/rpc.statd nfs.server-aux- gids off nfs.dr c off nfs.drc- size 0x20000 nfs.read-size (1 * 1048576ULL) nfs.write- size (1 * 1048576ULL) nfs.readdir- size (1 * 1048576ULL) nfs.rdirplus on nfs.exports-auth- enable (null) nfs.auth -refresh-interval- sec (null) nfs.auth-cache- ttl- sec (null) features.r ead- only off featu res.worm off features.worm-file- level off features.d efault-retention- period 120 features.retention -mode relax features. 
auto-commit- period 180 storage.linu x- aio off stora ge.batch-fsync-mode reverse- fsync storage.batch-fsync-delay- usec 0 storage.owner- uid - 1 storage.owner- gid - 1 storage.node-uuid- pathinfo off storage.h ealth-check- interval 30 storage.buil d- pgfid on stora ge.gfid2path on storage.gfid2path- separator : storage.b d- aio off cl uster.server-quorum- type off cluster.serve r-quorum- ratio 0 changelog.cha ngelog off chan gelog.changelog- dir (null) changelog.e ncoding ascii ch angelog.rollover- time 15 changelog. fsync- interval 5 changel og.changelog-barrier- timeout 120 changelog.capture- del- path off features.barr ier disable feat ures.barrier- timeout 120 features .trash off features.trash- dir .trashcan featur es.trash-eliminate- path (null) features.trash- max- filesize 5MB features.t rash-internal- op off cluster.enable- shared- storage disable cluster.write -freq- threshold 0 cluster.re ad-freq- threshold 0 cluster.t ier- pause off clus ter.tier-promote- frequency 120 cluster.tier -demote- frequency 3600 cluster.wat ermark- hi 90 cluster.w atermark- low 75 cluster.t ier- mode cache clus ter.tier-max-promote-file- size 0 cluster.tier-max- mb 4000 cluster. tier-max- files 10000 cluster. tier-query- limit 100 cluster.ti er- compact on clus ter.tier-hot-compact- frequency 604800 cluster.tier- cold-compact- frequency 604800 features.ctr- enabled off feat ures.record- counters off feature s.ctr-record-metadata- heat off features.ctr_link_co nsistency off features.ct r_lookupheal_link_timeout 300 fe atures.ctr_lookupheal_inode_timeout 300 features.ctr-sql-db- cachesize 12500 features.ct r-sql-db-wal- autocheckpoint 25000 features.selinu x on locks. trace off locks.mandatory- locking off cluster .disperse-self-heal- daemon enable cluster.quorum- reads no client .bind- insecure (null) fea tures.shard off features.shard-block- size 64MB features.scr ub- throttle lazy featur es.scrub- freq biweekly featur es.scrub false features.expiry- time 120 feature s.cache- invalidation off featur es.cache-invalidation- timeout 60 features.leases off features.l ease-lock-recall- timeout 60 disperse.backgroun d- heals 8 disperse.he al-wait- qlength 128 cluster.he al- timeout 600 dht. force- readdirp on d isperse.read-policy gfid- hash cluster.shd-max- threads 1 cluster .shd-wait- qlength 1024 cluster. locking- scheme full cluster .granular-entry- heal no features.locks -revocation- secs 0 features.locks- revocation-clear- all false features.locks- revocation-max- blocked 0 features.locks- monkey- unlocking false disperse.shd- max- threads 1 disperse .shd-wait- qlength 1024 disperse. cpu- extensions auto disp erse.self-heal-window- size 1 cluster.use- compound- fops off performance. parallel- readdir off performance. rda-request- size 131072 performance.rda -low- wmark 4096 performance .rda-high- wmark 128KB performance. rda-cache- limit 10MB performance.n l-cache-positive- entry false performance.nl-cache- limit 10MB performance. nl-cache- timeout 60 cluster.bric k- multiplex off clust er.max-bricks-per- process 0 disperse.optim istic-change- log on cluster.halo- enabled False clus ter.halo-shd-max- latency 99999 cluster.halo -nfsd-max- latency 5 cluster.halo- max- latency 5 cluster. halo-max-replicas > Thanks,Soumya > > Does this process make more sense than a version upgrade > > process to 4.1, then 5, then 6? What "gotcha's" do I need to be > > ready for? I have until late May to prep and test on old, slow > > hardware with a small amount of files and volumes. 
> > > > You can directly upgrade from 3.12 to 6.x. I would suggest that > > rather than deleting and creating Gluster volume. +Hari and +Sanju > > for further guidelines on upgrade, as they recently did upgrade > > tests. +Soumya to add to the nfs-ganesha aspect. > > Regards,Poornima > > -- > > James P. Kinney III > > Every time you stop a school, you will have to build a jail. > > What you gain at one end you lose at the other. It's like > > feeding a dog on his own tail. It won't fatten the dog. - > > Speech 11/23/1900 Mark Twain > > http://heretothereideas.blogspot.com/ > > > > _______________________________________________ Gluster- > > users mailing list Gluster-users at gluster.org > users at gluster.org> > > https://lists.gluster.org/mailman/listinfo/gluster-users > > -- James P. Kinney III Every time you stop a school, you will have to build a jail. What you gain at one end you lose at the other. It's like feeding a dog on his own tail. It won't fatten the dog. - Speech 11/23/1900 Mark Twain http://heretothereideas.blogspot.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From tomfite at gmail.com Mon Apr 1 18:27:17 2019 From: tomfite at gmail.com (Tom Fite) Date: Mon, 1 Apr 2019 14:27:17 -0400 Subject: [Gluster-users] Rsync in place of heal after brick failure Message-ID: Hi all, I have a very large (65 TB) brick in a replica 2 volume that needs to be re-copied from scratch. A heal will take a very long time with performance degradation on the volume so I investigated using rsync to do the brunt of the work. The command: rsync -av -H -X --numeric-ids --progress server1:/data/brick1/gv0 /data/brick1/ Running with -H assures that the hard links in .glusterfs are preserved, and -X preserves all of gluster's extended attributes. I've tested this on my test environment as follows: 1. Stop glusterd and kill procs 2. Move brick volume to backup dir 3. Run rsync 4. Start glusterd 5. Observe gluster status All appears to be working correctly. Gluster status reports all bricks online, all data is accessible in the volume, and I don't see any errors in the logs. Anybody else have experience trying this? Thanks -Tom -------------- next part -------------- An HTML attachment was scrubbed... URL: From skoduri at redhat.com Mon Apr 1 18:37:18 2019 From: skoduri at redhat.com (Soumya Koduri) Date: Tue, 2 Apr 2019 00:07:18 +0530 Subject: [Gluster-users] upgrade best practices In-Reply-To: References: <629338fe8720f63420d43fa72cc7b080ba213a4c.camel@gmail.com> <9c792d30-0e79-98f7-6b76-9d168c947078@redhat.com> Message-ID: <3ca551bb-cfbf-d079-e3a2-6e3226a07618@redhat.com> Thanks for the details. Response inline - On 4/1/19 9:45 PM, Jim Kinney wrote: > On Sun, 2019-03-31 at 23:01 +0530, Soumya Koduri wrote: >> >> On 3/29/19 10:39 PM, Poornima Gurusiddaiah wrote: >>> >>> >>> On Fri, Mar 29, 2019, 10:03 PM Jim Kinney < >>> jim.kinney at gmail.com >>> >>> >>> >> jim.kinney at gmail.com >>> >>> >> wrote: >>> >>> Currently running 3.12 on Centos 7.6. Doing cleanups on split-brain >>> and out of sync, need heal files. >>> >>> We need to migrate the three replica servers to gluster v. 5 or 6. >>> Also will need to upgrade about 80 clients as well. Given that a >>> complete removal of gluster will not touch the 200+TB of data on 12 >>> volumes, we are looking at doing that process, Stop all clients, >>> stop all glusterd services, remove all of it, install new version, >>> setup new volumes from old bricks, install new clients, mount >>> everything. 
>>> >>> We would like to get some better performance from nfs-ganesha mounts >>> but that doesn't look like an option (not done any parameter tweaks >>> in testing yet). At a bare minimum, we would like to minimize the >>> total downtime of all systems. >> >> Could you please be more specific here? As in are you looking for better >> performance during upgrade process or in general? Compared to 3.12, >> there are lot of perf improvements done in both glusterfs and esp., >> nfs-ganesha (latest stable - V2.7.x) stack. If you could provide more >> information about your workloads (for eg., large-file,small-files, >> metadata-intensive) , we can make some recommendations wrt to configuration. > > Sure. More details: > > We are (soon to be) running a three-node replica only gluster service (2 > nodes now, third is racked and ready for sync and being added to gluster > cluster). Each node has 2 external drive arrays plus one internal. Each > node has 40G IB plus 40G IP connections (plans to upgrade to 100G). We > currently have 9 volumes and each is 7TB up to 50TB of space. Each > volume is a mix of thousands of large (>1GB) and tens of thousands of > small (~100KB) plus thousands inbetween. > > Currently we have a 13-node computational cluster with varying GPU > abilities that mounts all of these volumes using gluster-fuse. Writes > are slow and reads are also as if from a single server. I have data from > a test setup (not anywhere near the capacity of the production system - > just for testing commands and recoveries) that indicates raw NFS is much > faster but no gluster, gluster-fuse is much slower. We have mmap issues > with python and fuse-mounted locations. Converting to NFS solves this. > We have tinkered with kernel settings to handle oom-killer so it will no > longer drop glusterfs when an errant job eat all the ram (set > oom_score_adj - -1000 for all glusterfs pids). Have you tried tuning any perf parameters? From the volume options you have shared below, I see that there is scope to improve performance (for eg., by enabling md-cache parameters and parallel-readdir, metadata related operations latency can be improved). Request Poornima, Xavi or Du to comment on recommended values for better I/O throughput for your workload. > > We would like to transition (smoothly!!) to gluster 5 or 6 with > nfs-ganesha 2.7 and see some performance improvements. We will be using > corosync and pacemaker for NFS failover. It would be fantastic be able > to saturate a 10G IPoIB (or 40G IB !) connection to each compute node in > the current computational cluster. Right now we absolutely can't get > much write speed ( copy a 6.2GB file from host to gluster storage took > 1m 21s. cp from disk to /dev/null is 7s). cp from gluster to /dev/null > is 1.0m (same 6.2GB file). That's a 10Gbps IPoIB connection at only 800Mbps. Few things to note here - * The volume option "nfs.disable" command refers to GluscterNFS service which is being deprecated and not enabled by default in the latest gluster versions available (like in gluster 5 & 6). We recommend NFS-Ganesha and hence this option needs to be turned on (to disable GlusterNFS) * Starting from Gluster 3.11 , HA configuration bits for NFS-Ganesha have been removed from gluster codebase. So you would need to either manually configure any HA service on top of NFS-Ganesha servers or use storhaug [1] to configure the same. 
* Coming to technical aspects, by switching to 'NFS', you could benefit from heavy caching done by NFS client and few other optimizations it does. Even NFS-Ganesha server does metadata caching and resides on the same nodes as the glusterfs servers. Apart from these, NFS-Ganesha acts like any other glusterfs client (but by making use of libgfapi and not fuse mount). It would be interesting to check if and how much improvement you get with 'NFS' when compared to fuse protocol for your workload. Please let us know when you have the test environment ready. Will make recommendations wrt to few settings for NFS-Ganesha server and client. Thanks, Soumya [1] https://github.com/linux-ha-storage/storhaug > > We would like to do things like enable SSL encryption of all data flows > (we deal with PHI data in a HIPAA-regulated setting) but are concerned > about performance. We are running dual Intel Xeon ?E5-2630L?(12 physical > cores each @ 2.4GHz) and 128GB RAM in each server node. We have 170 > users. About 20 are active at any time. > > The current setting on /home (others are similar if not identical, maybe > nfs-disable is true for others): > > gluster volume get home all > Option??????????????????????????????????Value > ------??????????????????????????????????----- > cluster.lookup-unhashed?????????????????on > cluster.lookup-optimize?????????????????off > cluster.min-free-disk???????????????????10% > cluster.min-free-inodes?????????????????5% > cluster.rebalance-stats?????????????????off > cluster.subvols-per-directory???????????(null) > cluster.readdir-optimize????????????????off > cluster.rsync-hash-regex????????????????(null) > cluster.extra-hash-regex????????????????(null) > cluster.dht-xattr-name??????????????????trusted.glusterfs.dht > cluster.randomize-hash-range-by-gfid????off > cluster.rebal-throttle??????????????????normal > cluster.lock-migration??????????????????off > cluster.local-volume-name???????????????(null) > cluster.weighted-rebalance??????????????on > cluster.switch-pattern??????????????????(null) > cluster.entry-change-log????????????????on > cluster.read-subvolume??????????????????(null) > cluster.read-subvolume-index????????????-1 > cluster.read-hash-mode??????????????????1 > cluster.background-self-heal-count??????8 > cluster.metadata-self-heal??????????????on > cluster.data-self-heal??????????????????on > cluster.entry-self-heal?????????????????on > cluster.self-heal-daemon????????????????enable > cluster.heal-timeout????????????????????600 > cluster.self-heal-window-size???????????1 > cluster.data-change-log?????????????????on > cluster.metadata-change-log?????????????on > cluster.data-self-heal-algorithm????????(null) > cluster.eager-lock??????????????????????on > disperse.eager-lock?????????????????????on > cluster.quorum-type?????????????????????none > cluster.quorum-count????????????????????(null) > cluster.choose-local????????????????????true > cluster.self-heal-readdir-size??????????1KB > cluster.post-op-delay-secs??????????????1 > cluster.ensure-durability???????????????on > cluster.consistent-metadata?????????????no > cluster.heal-wait-queue-length??????????128 > cluster.favorite-child-policy???????????none > cluster.stripe-block-size???????????????128KB > cluster.stripe-coalesce?????????????????true > diagnostics.latency-measurement?????????off > diagnostics.dump-fd-stats???????????????off > diagnostics.count-fop-hits??????????????off > diagnostics.brick-log-level?????????????INFO > diagnostics.client-log-level????????????INFO > 
diagnostics.brick-sys-log-level  CRITICAL
> diagnostics.client-sys-log-level  CRITICAL
> diagnostics.brick-logger  (null)
> diagnostics.client-logger  (null)
> diagnostics.brick-log-format  (null)
> diagnostics.client-log-format  (null)
> diagnostics.brick-log-buf-size  5
> diagnostics.client-log-buf-size  5
> diagnostics.brick-log-flush-timeout  120
> diagnostics.client-log-flush-timeout  120
> diagnostics.stats-dump-interval  0
> diagnostics.fop-sample-interval  0
> diagnostics.stats-dump-format  json
> diagnostics.fop-sample-buf-size  65535
> diagnostics.stats-dnscache-ttl-sec  86400
> performance.cache-max-file-size  0
> performance.cache-min-file-size  0
> performance.cache-refresh-timeout  1
> performance.cache-priority
> performance.cache-size  32MB
> performance.io-thread-count  16
> performance.high-prio-threads  16
> performance.normal-prio-threads  16
> performance.low-prio-threads  16
> performance.least-prio-threads  1
> performance.enable-least-priority  on
> performance.cache-size  128MB
> performance.flush-behind  on
> performance.nfs.flush-behind  on
> performance.write-behind-window-size  1MB
> performance.resync-failed-syncs-after-fsync  off
> performance.nfs.write-behind-window-size  1MB
> performance.strict-o-direct  off
> performance.nfs.strict-o-direct  off
> performance.strict-write-ordering  off
> performance.nfs.strict-write-ordering  off
> performance.lazy-open  yes
> performance.read-after-open  no
> performance.read-ahead-page-count  4
> performance.md-cache-timeout  1
> performance.cache-swift-metadata  true
> performance.cache-samba-metadata  false
> performance.cache-capability-xattrs  true
> performance.cache-ima-xattrs  true
> features.encryption  off
> encryption.master-key  (null)
> encryption.data-key-size  256
> encryption.block-size  4096
> network.frame-timeout  1800
> network.ping-timeout  42
> network.tcp-window-size  (null)
> features.lock-heal  off
> features.grace-timeout  10
> network.remote-dio  disable
> client.event-threads  2
> client.tcp-user-timeout  0
> client.keepalive-time  20
> client.keepalive-interval  2
> client.keepalive-count  9
> network.tcp-window-size  (null)
> network.inode-lru-limit  16384
> auth.allow  *
> auth.reject  (null)
> transport.keepalive  1
> server.allow-insecure  (null)
> server.root-squash  off
> server.anonuid  65534
> server.anongid  65534
> server.statedump-path  /var/run/gluster
> server.outstanding-rpc-limit  64
> features.lock-heal  off
> features.grace-timeout  10
> server.ssl  (null)
> auth.ssl-allow  *
> server.manage-gids  off
> server.dynamic-auth  on
> client.send-gids  on
> server.gid-timeout  300
> server.own-thread  (null)
> server.event-threads  1
> server.tcp-user-timeout  0
> server.keepalive-time  20
> server.keepalive-interval  2
> server.keepalive-count  9
> transport.listen-backlog  10
> ssl.own-cert  (null)
> ssl.private-key  (null)
> ssl.ca-list  (null)
> ssl.crl-path  (null)
> ssl.certificate-depth  (null)
> ssl.cipher-list  (null)
> ssl.dh-param  (null)
> ssl.ec-curve  (null)
> performance.write-behind  on
> performance.read-ahead  on
> performance.readdir-ahead  off
> performance.io-cache  on
> performance.quick-read  on
> performance.open-behind  on
> performance.nl-cache  off
> performance.stat-prefetch  on
> performance.client-io-threads  off
> performance.nfs.write-behind  on
> performance.nfs.read-ahead  off
> performance.nfs.io-cache  off
> performance.nfs.quick-read  off
> performance.nfs.stat-prefetch  off
> performance.nfs.io-threads  off
> performance.force-readdirp  true
> performance.cache-invalidation  false
> features.uss  off
> features.snapshot-directory  .snaps
> features.show-snapshot-directory  off
> network.compression  off
> network.compression.window-size  -15
> network.compression.mem-level  8
> network.compression.min-size  0
> network.compression.compression-level  -1
> network.compression.debug  false
> features.limit-usage  (null)
> features.default-soft-limit  80%
> features.soft-timeout  60
> features.hard-timeout  5
> features.alert-time  86400
> features.quota-deem-statfs  off
> geo-replication.indexing  off
> geo-replication.indexing  off
> geo-replication.ignore-pid-check  off
> geo-replication.ignore-pid-check  off
> features.quota  off
> features.inode-quota  off
> features.bitrot  disable
> debug.trace  off
> debug.log-history  no
> debug.log-file  no
> debug.exclude-ops  (null)
> debug.include-ops  (null)
> debug.error-gen  off
> debug.error-failure  (null)
> debug.error-number  (null)
> debug.random-failure  off
> debug.error-fops  (null)
> nfs.enable-ino32  no
> nfs.mem-factor  15
> nfs.export-dirs  on
> nfs.export-volumes  on
> nfs.addr-namelookup  off
> nfs.dynamic-volumes  off
> nfs.register-with-portmap  on
> nfs.outstanding-rpc-limit  16
> nfs.port  2049
> nfs.rpc-auth-unix  on
> nfs.rpc-auth-null  on
> nfs.rpc-auth-allow  all
> nfs.rpc-auth-reject  none
> nfs.ports-insecure  off
> nfs.trusted-sync  off
> nfs.trusted-write  off
> nfs.volume-access  read-write
> nfs.export-dir
> nfs.disable  off
> nfs.nlm  on
> nfs.acl  on
> nfs.mount-udp  off
> nfs.mount-rmtab  /var/lib/glusterd/nfs/rmtab
> nfs.rpc-statd  /sbin/rpc.statd
> nfs.server-aux-gids  off
> nfs.drc  off
> nfs.drc-size  0x20000
> nfs.read-size  (1 * 1048576ULL)
> nfs.write-size  (1 * 1048576ULL)
> nfs.readdir-size  (1 * 1048576ULL)
> nfs.rdirplus  on
> nfs.exports-auth-enable  (null)
> nfs.auth-refresh-interval-sec  (null)
> nfs.auth-cache-ttl-sec  (null)
> features.read-only  off
> features.worm  off
> features.worm-file-level  off
> features.default-retention-period  120
> features.retention-mode  relax
> features.auto-commit-period  180
> storage.linux-aio  off
> storage.batch-fsync-mode  reverse-fsync
> storage.batch-fsync-delay-usec  0
> storage.owner-uid  -1
> storage.owner-gid  -1
> storage.node-uuid-pathinfo  off
> storage.health-check-interval  30
> storage.build-pgfid  on
> storage.gfid2path  on
> storage.gfid2path-separator  :
> storage.bd-aio  off
> cluster.server-quorum-type  off
> cluster.server-quorum-ratio  0
> changelog.changelog  off
> changelog.changelog-dir  (null)
> changelog.encoding  ascii
> changelog.rollover-time  15
> changelog.fsync-interval  5
> changelog.changelog-barrier-timeout  120
> changelog.capture-del-path  off
> features.barrier  disable
> features.barrier-timeout  120
> features.trash  off
> features.trash-dir  .trashcan
> features.trash-eliminate-path  (null)
> features.trash-max-filesize  5MB
> features.trash-internal-op  off
> cluster.enable-shared-storage  disable
> cluster.write-freq-threshold  0
> cluster.read-freq-threshold  0
> cluster.tier-pause  off
> cluster.tier-promote-frequency  120
> cluster.tier-demote-frequency  3600
> cluster.watermark-hi  90
> cluster.watermark-low  75
> cluster.tier-mode  cache
> cluster.tier-max-promote-file-size  0
> cluster.tier-max-mb  4000
> cluster.tier-max-files  10000
> cluster.tier-query-limit  100
> cluster.tier-compact  on
> cluster.tier-hot-compact-frequency  604800
> cluster.tier-cold-compact-frequency  604800
> features.ctr-enabled  off
> features.record-counters  off
> features.ctr-record-metadata-heat  off
> features.ctr_link_consistency  off
> features.ctr_lookupheal_link_timeout  300
> features.ctr_lookupheal_inode_timeout  300
> features.ctr-sql-db-cachesize  12500
> features.ctr-sql-db-wal-autocheckpoint  25000
> features.selinux  on
> locks.trace  off
> locks.mandatory-locking  off
> cluster.disperse-self-heal-daemon  enable
> cluster.quorum-reads  no
> client.bind-insecure  (null)
> features.shard  off
> features.shard-block-size  64MB
> features.scrub-throttle  lazy
> features.scrub-freq  biweekly
> features.scrub  false
> features.expiry-time  120
> features.cache-invalidation  off
> features.cache-invalidation-timeout  60
> features.leases  off
> features.lease-lock-recall-timeout  60
> disperse.background-heals  8
> disperse.heal-wait-qlength  128
> cluster.heal-timeout  600
> dht.force-readdirp  on
> disperse.read-policy  gfid-hash
> cluster.shd-max-threads  1
> cluster.shd-wait-qlength  1024
> cluster.locking-scheme  full
> cluster.granular-entry-heal  no
> features.locks-revocation-secs  0
> features.locks-revocation-clear-all  false
> features.locks-revocation-max-blocked  0
> features.locks-monkey-unlocking  false
> disperse.shd-max-threads  1
> disperse.shd-wait-qlength  1024
> disperse.cpu-extensions  auto
> disperse.self-heal-window-size  1
> cluster.use-compound-fops  off
> performance.parallel-readdir  off
> performance.rda-request-size  131072
> performance.rda-low-wmark  4096
> performance.rda-high-wmark  128KB
> performance.rda-cache-limit  10MB
> performance.nl-cache-positive-entry  false
> performance.nl-cache-limit  10MB
> performance.nl-cache-timeout  60
> cluster.brick-multiplex  off
> cluster.max-bricks-per-process  0
> disperse.optimistic-change-log  on
> cluster.halo-enabled  False
> cluster.halo-shd-max-latency  99999
> cluster.halo-nfsd-max-latency  5
> cluster.halo-max-latency  5
> cluster.halo-max-replicas
>>
>> Thanks,
>> Soumya
>>
>>>
>>> Does this process make more sense than a version upgrade process to
>>> 4.1, then 5, then 6? What "gotcha's" do I need to be ready for? I
>>> have until late May to prep and test on old, slow hardware with a
>>> small amount of files and volumes.
>>>
>>>
>>> You can directly upgrade from 3.12 to 6.x. I would suggest that rather
>>> than deleting and creating Gluster volume. +Hari and +Sanju for further
>>> guidelines on upgrade, as they recently did upgrade tests. +Soumya to
>>> add to the nfs-ganesha aspect.
>>>
>>> Regards,
>>> Poornima
>>>
>>> --
>>>
>>> James P. Kinney III
>>>
>>> Every time you stop a school, you will have to build a jail. What you
>>> gain at one end you lose at the other. It's like feeding a dog on his
>>> own tail. It won't fatten the dog.
>>> - Speech 11/23/1900 Mark Twain >>> >>> >>> http://heretothereideas.blogspot.com/ >>> >>> >>> _______________________________________________ >>> Gluster-users mailing list >>> >>> Gluster-users at gluster.org >>> >>> >> Gluster-users at gluster.org >>> >>> > >>> >>> https://lists.gluster.org/mailman/listinfo/gluster-users >>> >>> > -- > > James P. Kinney III > > Every time you stop a school, you will have to build a jail. What you > gain at one end you lose at the other. It's like feeding a dog on his > own tail. It won't fatten the dog. > - Speech 11/23/1900 Mark Twain > > http://heretothereideas.blogspot.com/ > From jim.kinney at gmail.com Mon Apr 1 20:23:22 2019 From: jim.kinney at gmail.com (Jim Kinney) Date: Mon, 01 Apr 2019 16:23:22 -0400 Subject: [Gluster-users] Rsync in place of heal after brick failure In-Reply-To: References: Message-ID: <2a200cc0272ad0a89763f1ff5646e1772eae205e.camel@gmail.com> Nice! I didn't use -H -X and the system had to do some clean up. I'll add this in my next migration progress as I move 120TB to new hard drives. On Mon, 2019-04-01 at 14:27 -0400, Tom Fite wrote: > Hi all, > I have a very large (65 TB) brick in a replica 2 volume that needs to > be re-copied from scratch. A heal will take a very long time with > performance degradation on the volume so I investigated using rsync > to do the brunt of the work. > > The command: > > rsync -av -H -X --numeric-ids --progress server1:/data/brick1/gv0 > /data/brick1/ > > Running with -H assures that the hard links in .glusterfs are > preserved, and -X preserves all of gluster's extended attributes. > > I've tested this on my test environment as follows: > > 1. Stop glusterd and kill procs > 2. Move brick volume to backup dir > 3. Run rsync > 4. Start glusterd > 5. Observe gluster status > > All appears to be working correctly. Gluster status reports all > bricks online, all data is accessible in the volume, and I don't see > any errors in the logs. > > Anybody else have experience trying this? > > Thanks > -Tom > > _______________________________________________Gluster-users mailing > listGluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -- James P. Kinney III Every time you stop a school, you will have to build a jail. What you gain at one end you lose at the other. It's like feeding a dog on his own tail. It won't fatten the dog. - Speech 11/23/1900 Mark Twain http://heretothereideas.blogspot.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgurusid at redhat.com Tue Apr 2 01:55:53 2019 From: pgurusid at redhat.com (Poornima Gurusiddaiah) Date: Tue, 2 Apr 2019 07:25:53 +0530 Subject: [Gluster-users] Rsync in place of heal after brick failure In-Reply-To: References: Message-ID: You could also try xfsdump and xfsrestore if you brick filesystem is xfs and the destination disk can be attached locally? This will be much faster. Regards, Poornima On Tue, Apr 2, 2019, 12:05 AM Tom Fite wrote: > Hi all, > > I have a very large (65 TB) brick in a replica 2 volume that needs to be > re-copied from scratch. A heal will take a very long time with performance > degradation on the volume so I investigated using rsync to do the brunt of > the work. > > The command: > > rsync -av -H -X --numeric-ids --progress server1:/data/brick1/gv0 > /data/brick1/ > > Running with -H assures that the hard links in .glusterfs are preserved, > and -X preserves all of gluster's extended attributes. > > I've tested this on my test environment as follows: > > 1. 
Stop glusterd and kill procs
> 2. Move brick volume to backup dir
> 3. Run rsync
> 4. Start glusterd
> 5. Observe gluster status
>
> All appears to be working correctly. Gluster status reports all bricks
> online, all data is accessible in the volume, and I don't see any errors in
> the logs.
>
> Anybody else have experience trying this?
>
> Thanks
> -Tom
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From francois.duport at hotmail.fr Mon Apr 1 07:50:41 2019 From: francois.duport at hotmail.fr (François Duport) Date: Mon, 1 Apr 2019 07:50:41 +0000 Subject: [Gluster-users] Cross-compiling GlusterFS Message-ID: 
Hi,
I am trying to cross-compile GlusterFS because I don't want my embedded client to do it, and my client is reset each time. So I want the compiled application to be in my ROM image.
That said, the cross-compilation itself appears to have succeeded, but when I do a 'make Destdir='pwd'/out install', the out directory only contains a 'lib' and an 'include' folder. I can't find the associated bin folder.
Can you help me with that?
Thanks
Best regards
François
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From ravishankar at redhat.com Tue Apr 2 08:50:42 2019 From: ravishankar at redhat.com (Ravishankar N) Date: Tue, 2 Apr 2019 14:20:42 +0530 Subject: [Gluster-users] Cross-compiling GlusterFS In-Reply-To: References: Message-ID: <8c934a87-2653-20e8-0252-fcbd7e37b0f4@redhat.com> 
On 01/04/19 1:20 PM, François Duport wrote:
> Hi,
>
> I am trying to cross-compile GlusterFS because I don't want my embedded
> client to do it, and my client is reset each time. So I want the compiled
> application to be in my ROM image.
>
> That said, the cross-compilation itself appears to have succeeded, but
> when I do a 'make Destdir='pwd'/out install', the out directory only contains
> a 'lib' and an 'include' folder. I can't find the associated bin folder.
I did not attempt a cross compile but `make install DESTDIR=/tmp/DELETE/` did put everything including the binaries inside /tmp/DELETE on a local install. Perhaps you could search the verbose output during the install for names of binaries to see if and where they are getting installed. For example, scrolling through the output and searching for glfsheal, I see
/usr/bin/mkdir -p '/tmp/DELETE//usr/local/sbin'
/bin/sh ../../libtool --mode=install /usr/bin/install -c glfsheal '/tmp/DELETE//usr/local/sbin'
libtool: install: /usr/bin/install -c .libs/glfsheal /tmp/DELETE//usr/local/sbin/glfsheal
Hope that helps.
Ravi
>
> Can you help me with that?
> Thanks
>
> Best regards
> François
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From atin.mukherjee83 at gmail.com Tue Apr 2 09:53:54 2019 From: atin.mukherjee83 at gmail.com (Atin Mukherjee) Date: Tue, 2 Apr 2019 15:23:54 +0530 Subject: [Gluster-users] [Gluster-devel] Upgrade testing to gluster 6 In-Reply-To: References: Message-ID: 
On Mon, 1 Apr 2019 at 10:28, Hari Gowtham wrote: > Comments inline. > > On Mon, Apr 1, 2019 at 5:55 AM Sankarshan Mukhopadhyay > wrote: > > > > Quite a considerable amount of detail here. Thank you!
> > > > On Fri, Mar 29, 2019 at 11:42 AM Hari Gowtham > wrote: > > > > > > Hello Gluster users, > > > > > > As you all aware that glusterfs-6 is out, we would like to inform you > > > that, we have spent a significant amount of time in testing > > > glusterfs-6 in upgrade scenarios. We have done upgrade testing to > > > glusterfs-6 from various releases like 3.12, 4.1 and 5.3. > > > > > > As glusterfs-6 has got in a lot of changes, we wanted to test those > portions. > > > There were xlators (and respective options to enable/disable them) > > > added and deprecated in glusterfs-6 from various versions [1]. > > > > > > We had to check the following upgrade scenarios for all such options > > > Identified in [1]: > > > 1) option never enabled and upgraded > > > 2) option enabled and then upgraded > > > 3) option enabled and then disabled and then upgraded > > > > > > We weren't manually able to check all the combinations for all the > options. > > > So the options involving enabling and disabling xlators were > prioritized. > > > The below are the result of the ones tested. > > > > > > Never enabled and upgraded: > > > checked from 3.12, 4.1, 5.3 to 6 the upgrade works. > > > > > > Enabled and upgraded: > > > Tested for tier which is deprecated, It is not a recommended upgrade. > > > As expected the volume won't be consumable and will have a few more > > > issues as well. > > > Tested with 3.12, 4.1 and 5.3 to 6 upgrade. > > > > > > Enabled, disabled before upgrade. > > > Tested for tier with 3.12 and the upgrade went fine. > > > > > > There is one common issue to note in every upgrade. The node being > > > upgraded is going into disconnected state. You have to flush the > iptables > > > and the restart glusterd on all nodes to fix this. > > > > > > > Is this something that is written in the upgrade notes? I do not seem > > to recall, if not, I'll send a PR > > No this wasn't mentioned in the release notes. PRs are welcome. > > > > > > The testing for enabling new options is still pending. The new options > > > won't cause as much issues as the deprecated ones so this was put at > > > the end of the priority list. It would be nice to get contributions > > > for this. > > > > > > > Did the range of tests lead to any new issues? > > Yes. In the first round of testing we found an issue and had to postpone > the > release of 6 until the fix was made available. > https://bugzilla.redhat.com/show_bug.cgi?id=1684029 > > And then we tested it again after this patch was made available. > and came across this: > https://bugzilla.redhat.com/show_bug.cgi?id=1694010 This isn?t a bug as we found that upgrade worked seamelessly in two different setup. So we have no issues in the upgrade path to glusterfs-6 release. > > Have mentioned this in the second mail as to how to over this situation > for now until the fix is available. > > > > > > For the disable testing, tier was used as it covers most of the xlator > > > that was removed. And all of these tests were done on a replica 3 > volume. > > > > > > > I'm not sure if the Glusto team is reading this, but it would be > > pertinent to understand if the approach you have taken can be > > converted into a form of automated testing pre-release. > > I don't have an answer for this, have CCed Vijay. > He might have an idea. > > > > > > Note: This is only for upgrade testing of the newly added and removed > > > xlators. Does not involve the normal tests for the xlator. > > > > > > If you have any questions, please feel free to reach us. 
> > > > > > [1] > https://docs.google.com/spreadsheets/d/1nh7T5AXaV6kc5KgILOy2pEqjzC3t_R47f1XUXSVFetI/edit?usp=sharing > > > > > > Regards, > > > Hari and Sanju. > > _______________________________________________ > > Gluster-users mailing list > > Gluster-users at gluster.org > > https://lists.gluster.org/mailman/listinfo/gluster-users > > > > -- > Regards, > Hari Gowtham. > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users > -- --Atin -------------- next part -------------- An HTML attachment was scrubbed... URL: From nux at li.nux.ro Tue Apr 2 15:37:17 2019 From: nux at li.nux.ro (Nux!) Date: Tue, 2 Apr 2019 16:37:17 +0100 (BST) Subject: [Gluster-users] Prioritise local bricks for IO? In-Reply-To: References: <29221907.583.1553599314586.JavaMail.zimbra@li.nux.ro> Message-ID: <383369409.4472.1554219437440.JavaMail.zimbra@li.nux.ro> Ok, cool, thanks. So.. no go. Any other ideas on how to accomplish task then? -- Sent from the Delta quadrant using Borg technology! Nux! www.nux.ro ----- Original Message ----- > From: "Nithya Balachandran" > To: "Poornima Gurusiddaiah" > Cc: "Nux!" , "gluster-users" , "Gluster Devel" > Sent: Thursday, 28 March, 2019 09:38:16 > Subject: Re: [Gluster-users] Prioritise local bricks for IO? > On Wed, 27 Mar 2019 at 20:27, Poornima Gurusiddaiah > wrote: > >> This feature is not under active development as it was not used widely. >> AFAIK its not supported feature. >> +Nithya +Raghavendra for further clarifications. >> > > This is not actively supported - there has been no work done on this > feature for a long time. > > Regards, > Nithya > >> >> Regards, >> Poornima >> >> On Wed, Mar 27, 2019 at 12:33 PM Lucian wrote: >> >>> Oh, that's just what the doctor ordered! >>> Hope it works, thanks >>> >>> On 27 March 2019 03:15:57 GMT, Vlad Kopylov wrote: >>>> >>>> I don't remember if it still in works >>>> NUFA >>>> >>>> https://github.com/gluster/glusterfs-specs/blob/master/done/Features/nufa.md >>>> >>>> v >>>> >>>> On Tue, Mar 26, 2019 at 7:27 AM Nux! wrote: >>>> >>>>> Hello, >>>>> >>>>> I'm trying to set up a distributed backup storage (no replicas), but >>>>> I'd like to prioritise the local bricks for any IO done on the volume. >>>>> This will be a backup stor, so in other words, I'd like the files to be >>>>> written locally if there is space, so as to save the NICs for other traffic. >>>>> >>>>> Anyone knows how this might be achievable, if at all? >>>>> >>>>> -- >>>>> Sent from the Delta quadrant using Borg technology! >>>>> >>>>> Nux! >>>>> www.nux.ro >>>>> _______________________________________________ >>>>> Gluster-users mailing list >>>>> Gluster-users at gluster.org >>>>> https://lists.gluster.org/mailman/listinfo/gluster-users >>>>> >>>> >>> -- >>> Sent from my Android device with K-9 Mail. Please excuse my brevity. >>> _______________________________________________ >>> Gluster-users mailing list >>> Gluster-users at gluster.org >>> https://lists.gluster.org/mailman/listinfo/gluster-users >> >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org > > https://lists.gluster.org/mailman/listinfo/gluster-users From ykaul at redhat.com Tue Apr 2 18:16:02 2019 From: ykaul at redhat.com (Yaniv Kaul) Date: Tue, 2 Apr 2019 21:16:02 +0300 Subject: [Gluster-users] [Gluster-devel] Prioritise local bricks for IO? 
In-Reply-To: <383369409.4472.1554219437440.JavaMail.zimbra@li.nux.ro> References: <29221907.583.1553599314586.JavaMail.zimbra@li.nux.ro> <383369409.4472.1554219437440.JavaMail.zimbra@li.nux.ro> Message-ID: On Tue, Apr 2, 2019 at 6:37 PM Nux! wrote: > Ok, cool, thanks. So.. no go. > > Any other ideas on how to accomplish task then? > While not a solution, I believe https://review.gluster.org/#/c/glusterfs/+/21333/ - read selection based on latency, is an interesting path towards this. (Of course, you'd need later also add write...) Y. > -- > Sent from the Delta quadrant using Borg technology! > > Nux! > www.nux.ro > > ----- Original Message ----- > > From: "Nithya Balachandran" > > To: "Poornima Gurusiddaiah" > > Cc: "Nux!" , "gluster-users" , > "Gluster Devel" > > Sent: Thursday, 28 March, 2019 09:38:16 > > Subject: Re: [Gluster-users] Prioritise local bricks for IO? > > > On Wed, 27 Mar 2019 at 20:27, Poornima Gurusiddaiah > > > wrote: > > > >> This feature is not under active development as it was not used widely. > >> AFAIK its not supported feature. > >> +Nithya +Raghavendra for further clarifications. > >> > > > > This is not actively supported - there has been no work done on this > > feature for a long time. > > > > Regards, > > Nithya > > > >> > >> Regards, > >> Poornima > >> > >> On Wed, Mar 27, 2019 at 12:33 PM Lucian wrote: > >> > >>> Oh, that's just what the doctor ordered! > >>> Hope it works, thanks > >>> > >>> On 27 March 2019 03:15:57 GMT, Vlad Kopylov > wrote: > >>>> > >>>> I don't remember if it still in works > >>>> NUFA > >>>> > >>>> > https://github.com/gluster/glusterfs-specs/blob/master/done/Features/nufa.md > >>>> > >>>> v > >>>> > >>>> On Tue, Mar 26, 2019 at 7:27 AM Nux! wrote: > >>>> > >>>>> Hello, > >>>>> > >>>>> I'm trying to set up a distributed backup storage (no replicas), but > >>>>> I'd like to prioritise the local bricks for any IO done on the > volume. > >>>>> This will be a backup stor, so in other words, I'd like the files to > be > >>>>> written locally if there is space, so as to save the NICs for other > traffic. > >>>>> > >>>>> Anyone knows how this might be achievable, if at all? > >>>>> > >>>>> -- > >>>>> Sent from the Delta quadrant using Borg technology! > >>>>> > >>>>> Nux! > >>>>> www.nux.ro > >>>>> _______________________________________________ > >>>>> Gluster-users mailing list > >>>>> Gluster-users at gluster.org > >>>>> https://lists.gluster.org/mailman/listinfo/gluster-users > >>>>> > >>>> > >>> -- > >>> Sent from my Android device with K-9 Mail. Please excuse my brevity. > >>> _______________________________________________ > >>> Gluster-users mailing list > >>> Gluster-users at gluster.org > >>> https://lists.gluster.org/mailman/listinfo/gluster-users > >> > >> _______________________________________________ > >> Gluster-users mailing list > >> Gluster-users at gluster.org > > > https://lists.gluster.org/mailman/listinfo/gluster-users > _______________________________________________ > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 
From hunter86_bg at yahoo.com Wed Apr 3 00:26:09 2019 From: hunter86_bg at yahoo.com (Strahil Nikolov) Date: Wed, 3 Apr 2019 00:26:09 +0000 (UTC) Subject: [Gluster-users] Gluster 5.5 slower than 3.12.15 References: <408118771.15560336.1554251169489.ref@mail.yahoo.com> Message-ID: <408118771.15560336.1554251169489@mail.yahoo.com> 
Hi Community,
I have the feeling that with gluster v5.5 I have poorer performance than it used to be on 3.12.15. Did you observe something like that?
I have a 3 node Hyperconverged Cluster (ovirt + glusterfs with replica 3 arbiter 1 volumes) with NFS Ganesha, and the issues came up after I upgraded to v5. First it was the notorious 5.3 experience, and now with 5.5 my sanlock is having problems and higher latency than it used to. I have switched from NFS-Ganesha to pure FUSE, but the latency problems do not go away.
Of course, this is partially due to the consumer hardware, but as the hardware has not changed I was hoping that the performance would remain as is.
So, do you expect 5.5 to perform worse than 3.12?
Some info:
Volume Name: engine
Type: Replicate
Volume ID: 30ca1cc2-f2f7-4749-9e2e-cee9d7099ded
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: ovirt1:/gluster_bricks/engine/engine
Brick2: ovirt2:/gluster_bricks/engine/engine
Brick3: ovirt3:/gluster_bricks/engine/engine (arbiter)
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.low-prio-threads: 32
network.remote-dio: off
cluster.eager-lock: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
features.shard: on
user.cifs: off
storage.owner-uid: 36
storage.owner-gid: 36
network.ping-timeout: 30
performance.strict-o-direct: on
cluster.granular-entry-heal: enable
cluster.enable-shared-storage: enable
Network: 1 gbit/s
Filesystem: XFS
Best Regards,
Strahil Nikolov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From moagrawa at redhat.com Wed Apr 3 02:55:59 2019 From: moagrawa at redhat.com (Mohit Agrawal) Date: Wed, 3 Apr 2019 08:25:59 +0530 Subject: [Gluster-users] [ovirt-users] Re: Announcing Gluster release 5.5 In-Reply-To: References: <20190328164716.27693.35887@mail.ovirt.org> Message-ID: 
Hi Olaf,
As per the currently attached "multi-glusterfsd-vol3.txt | multi-glusterfsd-vol4.txt", multiple processes are running for the "ovirt-core ovirt-engine" brick names, but there are no logs available in bricklogs.zip specific to these bricks; bricklogs.zip has a dump of the ovirt-kube logs only.
Kindly share the brick logs specific to the bricks "ovirt-core ovirt-engine", and share the glusterd logs as well.
Regards
Mohit Agrawal
On Tue, Apr 2, 2019 at 9:18 PM Olaf Buitelaar wrote:
> Dear Krutika,
>
> 1.
> I've changed the volume settings, write performance seems to increased
> somewhat, however the profile doesn't really support that since latencies
> increased. However read performance has diminished, which does seem to be
> supported by the profile runs (attached).
> Also the IO does seem to behave more consistent than before.
> I don't really understand the idea behind them, maybe you can explain why
> these suggestions are good?
> These settings seems to avoid as much local caching and access as possible
> and push everything to the gluster processes.
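> (As a quick check, the effective values of these options can be read back per
> volume, e.g. for the ovirt-kube volume mentioned earlier; the volume name here
> is only an example:
> # gluster volume get ovirt-kube all | egrep 'remote-dio|strict-o-direct|choose-local' )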
While i would expect local > access and local caches are a good thing, since it would lead to having > less network access or disk access. > I tried to investigate these settings a bit more, and this is what i > understood of them; > - network.remote-dio; when on it seems to ignore the O_DIRECT flag in the > client, thus causing the files to be cached and buffered in the page cache > on the client, i would expect this to be a good thing especially if the > server process would access the same page cache? > At least that is what grasp from this commit; > https://review.gluster.org/#/c/glusterfs/+/4206/2/xlators/protocol/client/src/client.c line > 867 > Also found this commit; > https://github.com/gluster/glusterfs/commit/06c4ba589102bf92c58cd9fba5c60064bc7a504e#diff-938709e499b4383c3ed33c3979b9080c suggesting > remote-dio actually improves performance, not sure it's a write or read > benchmark > When a file is opened with O_DIRECT it will also disable the write-behind > functionality > > - performance.strict-o-direct: when on, the AFR, will not ignore the > O_DIRECT flag. and will invoke: fop_writev_stub with the wb_writev_helper, > which seems to stack the operation, no idea why that is. But generally i > suppose not ignoring the O_DIRECT flag in the AFR is a good thing, when a > processes requests to have O_DIRECT. So this makes sense to me. > > - cluster.choose-local: when off, it doesn't prefer the local node, but > would always choose a brick. Since it's a 9 node cluster, with 3 > subvolumes, only a 1/3 could end-up local, and the other 2/3 should be > pushed to external nodes anyway. Or am I making the total wrong assumption > here? > > It seems to this config is moving to the gluster-block config side of > things, which does make sense. > Since we're running quite some mysql instances, which opens the files with > O_DIRECt i believe, it would mean the only layer of cache is within mysql > it self. Which you could argue is a good thing. But i would expect a little > of write-behind buffer, and maybe some of the data cached within gluster > would alleviate things a bit on gluster's side. But i wouldn't know if > that's the correct mind set, and so might be totally off here. > Also i would expect these gluster v set command to be online > operations, but somehow the bricks went down, after applying these changes. > What appears to have happened is that after the update the brick process > was restarted, but due to multiple brick process start issue, multiple > processes were started, and the brick didn't came online again. > However i'll try to reproduce this, since i would like to test with > cluster.choose-local: on, and see how performance compares. And hopefully > when it occurs collect some useful info. > Question; are network.remote-dio and performance.strict-o-direct mutually > exclusive settings, or can they both be on? > > 2. 
I've attached all brick logs, the only thing relevant i found was; > [2019-03-28 20:20:07.170452] I [MSGID: 113030] > [posix-entry-ops.c:1146:posix_unlink] 0-ovirt-kube-posix: > open-fd-key-status: 0 for > /data/gfs/bricks/brick1/ovirt-kube/.shard/a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.109886 > [2019-03-28 20:20:07.170491] I [MSGID: 113031] > [posix-entry-ops.c:1053:posix_skip_non_linkto_unlink] 0-posix: linkto_xattr > status: 0 for > /data/gfs/bricks/brick1/ovirt-kube/.shard/a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.109886 > [2019-03-28 20:20:07.248480] I [MSGID: 113030] > [posix-entry-ops.c:1146:posix_unlink] 0-ovirt-kube-posix: > open-fd-key-status: 0 for > /data/gfs/bricks/brick1/ovirt-kube/.shard/a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.109886 > [2019-03-28 20:20:07.248491] I [MSGID: 113031] > [posix-entry-ops.c:1053:posix_skip_non_linkto_unlink] 0-posix: linkto_xattr > status: 0 for > /data/gfs/bricks/brick1/ovirt-kube/.shard/a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.109886 > > Thanks Olaf > > ps. sorry needed to resend since it exceed the file limit > > Op ma 1 apr. 2019 om 07:56 schreef Krutika Dhananjay >: > >> Adding back gluster-users >> Comments inline ... >> >> On Fri, Mar 29, 2019 at 8:11 PM Olaf Buitelaar >> wrote: >> >>> Dear Krutika, >>> >>> >>> >>> 1. I?ve made 2 profile runs of around 10 minutes (see files >>> profile_data.txt and profile_data2.txt). Looking at it, most time seems be >>> spent at the fop?s fsync and readdirp. >>> >>> Unfortunate I don?t have the profile info for the 3.12.15 version so >>> it?s a bit hard to compare. >>> >>> One additional thing I do notice on 1 machine (10.32.9.5) the iowait >>> time increased a lot, from an average below the 1% it?s now around the 12% >>> after the upgrade. >>> >>> So first suspicion with be lighting strikes twice, and I?ve also just >>> now a bad disk, but that doesn?t appear to be the case, since all smart >>> status report ok. >>> >>> Also dd shows performance I would more or less expect; >>> >>> dd if=/dev/zero of=/data/test_file bs=100M count=1 oflag=dsync >>> >>> 1+0 records in >>> >>> 1+0 records out >>> >>> 104857600 bytes (105 MB) copied, 0.686088 s, 153 MB/s >>> >>> dd if=/dev/zero of=/data/test_file bs=1G count=1 oflag=dsync >>> >>> 1+0 records in >>> >>> 1+0 records out >>> >>> 1073741824 bytes (1.1 GB) copied, 7.61138 s, 141 MB/s >>> >>> if=/dev/urandom of=/data/test_file bs=1024 count=1000000 >>> >>> 1000000+0 records in >>> >>> 1000000+0 records out >>> >>> 1024000000 bytes (1.0 GB) copied, 6.35051 s, 161 MB/s >>> >>> dd if=/dev/zero of=/data/test_file bs=1024 count=1000000 >>> >>> 1000000+0 records in >>> >>> 1000000+0 records out >>> >>> 1024000000 bytes (1.0 GB) copied, 1.6899 s, 606 MB/s >>> >>> When I disable this brick (service glusterd stop; pkill glusterfsd) >>> performance in gluster is better, but not on par with what it was. Also the >>> cpu usages on the ?neighbor? nodes which hosts the other bricks in the same >>> subvolume increases quite a lot in this case, which I wouldn?t expect >>> actually since they shouldn't handle much more work, except flagging shards >>> to heal. Iowait also goes to idle once gluster is stopped, so it?s for >>> sure gluster which waits for io. >>> >>> >>> >> >> So I see that FSYNC %-latency is on the higher side. And I also noticed >> you don't have direct-io options enabled on the volume. 
>> Could you set the following options on the volume - >> # gluster volume set network.remote-dio off >> # gluster volume set performance.strict-o-direct on >> and also disable choose-local >> # gluster volume set cluster.choose-local off >> >> let me know if this helps. >> >> 2. I?ve attached the mnt log and volume info, but I couldn?t find >>> anything relevant in in those logs. I think this is because we run the VM?s >>> with libgfapi; >>> >>> [root at ovirt-host-01 ~]# engine-config -g LibgfApiSupported >>> >>> LibgfApiSupported: true version: 4.2 >>> >>> LibgfApiSupported: true version: 4.1 >>> >>> LibgfApiSupported: true version: 4.3 >>> >>> And I can confirm the qemu process is invoked with the gluster:// >>> address for the images. >>> >>> The message is logged in the /var/lib/libvert/qemu/ file, >>> which I?ve also included. For a sample case see around; 2019-03-28 20:20:07 >>> >>> Which has the error; E [MSGID: 133010] >>> [shard.c:2294:shard_common_lookup_shards_cbk] 0-ovirt-kube-shard: Lookup on >>> shard 109886 failed. Base file gfid = a38d64bc-a28b-4ee1-a0bb-f919e7a1022c >>> [Stale file handle] >>> >> >> Could you also attach the brick logs for this volume? >> >> >>> >>> 3. yes I see multiple instances for the same brick directory, like; >>> >>> /usr/sbin/glusterfsd -s 10.32.9.6 --volfile-id >>> ovirt-core.10.32.9.6.data-gfs-bricks-brick1-ovirt-core -p >>> /var/run/gluster/vols/ovirt-core/10.32.9.6-data-gfs-bricks-brick1-ovirt-core.pid >>> -S /var/run/gluster/452591c9165945d9.socket --brick-name >>> /data/gfs/bricks/brick1/ovirt-core -l >>> /var/log/glusterfs/bricks/data-gfs-bricks-brick1-ovirt-core.log >>> --xlator-option *-posix.glusterd-uuid=fb513da6-f3bd-4571-b8a2-db5efaf60cc1 >>> --process-name brick --brick-port 49154 --xlator-option >>> ovirt-core-server.listen-port=49154 >>> >>> >>> >>> I?ve made an export of the output of ps from the time I observed these >>> multiple processes. >>> >>> In addition the brick_mux bug as noted by Atin. I might also have >>> another possible cause, as ovirt moves nodes from none-operational state or >>> maintenance state to active/activating, it also seems to restart gluster, >>> however I don?t have direct proof for this theory. >>> >>> >>> >> >> +Atin Mukherjee ^^ >> +Mohit Agrawal ^^ >> >> -Krutika >> >> Thanks Olaf >>> >>> Op vr 29 mrt. 2019 om 10:03 schreef Sandro Bonazzola < >>> sbonazzo at redhat.com>: >>> >>>> >>>> >>>> Il giorno gio 28 mar 2019 alle ore 17:48 ha >>>> scritto: >>>> >>>>> Dear All, >>>>> >>>>> I wanted to share my experience upgrading from 4.2.8 to 4.3.1. While >>>>> previous upgrades from 4.1 to 4.2 etc. went rather smooth, this one was a >>>>> different experience. After first trying a test upgrade on a 3 node setup, >>>>> which went fine. i headed to upgrade the 9 node production platform, >>>>> unaware of the backward compatibility issues between gluster 3.12.15 -> >>>>> 5.3. After upgrading 2 nodes, the HA engine stopped and wouldn't start. >>>>> Vdsm wasn't able to mount the engine storage domain, since /dom_md/metadata >>>>> was missing or couldn't be accessed. Restoring this file by getting a good >>>>> copy of the underlying bricks, removing the file from the underlying bricks >>>>> where the file was 0 bytes and mark with the stickybit, and the >>>>> corresponding gfid's. Removing the file from the mount point, and copying >>>>> back the file on the mount point. 
Manually mounting the engine domain, and >>>>> manually creating the corresponding symbolic links in /rhev/data-center and >>>>> /var/run/vdsm/storage and fixing the ownership back to vdsm.kvm (which was >>>>> root.root), i was able to start the HA engine again. Since the engine was >>>>> up again, and things seemed rather unstable i decided to continue the >>>>> upgrade on the other nodes suspecting an incompatibility in gluster >>>>> versions, i thought would be best to have them all on the same version >>>>> rather soonish. However things went from bad to worse, the engine stopped >>>>> again, and all vm?s stopped working as well. So on a machine outside the >>>>> setup and restored a backup of the engine taken from version 4.2.8 just >>>>> before the upgrade. With this engine I was at least able to start some vm?s >>>>> again, and finalize the upgrade. Once the upgraded, things didn?t stabilize >>>>> and also lose 2 vm?s during the process due to image corruption. After >>>>> figuring out gluster 5.3 had quite some issues I was as lucky to see >>>>> gluster 5.5 was about to be released, on the moment the RPM?s were >>>>> available I?ve installed those. This helped a lot in terms of stability, >>>>> for which I?m very grateful! However the performance is unfortunate >>>>> terrible, it?s about 15% of what the performance was running gluster >>>>> 3.12.15. It?s strange since a simple dd shows ok performance, but our >>>>> actual workload doesn?t. While I would expect the performance to be better, >>>>> due to all improvements made since gluster version 3.12. Does anybody share >>>>> the same experience? >>>>> I really hope gluster 6 will soon be tested with ovirt and released, >>>>> and things start to perform and stabilize again..like the good old days. Of >>>>> course when I can do anything, I?m happy to help. >>>>> >>>> >>>> Opened https://bugzilla.redhat.com/show_bug.cgi?id=1693998 to track >>>> the rebase on Gluster 6. >>>> >>>> >>>> >>>>> >>>>> I think the following short list of issues we have after the migration; >>>>> Gluster 5.5; >>>>> - Poor performance for our workload (mostly write dependent) >>>>> - VM?s randomly pause on unknown storage errors, which are >>>>> ?stale file?s?. corresponding log; Lookup on shard 797 failed. Base file >>>>> gfid = 8a27b91a-ff02-42dc-bd4c-caa019424de8 [Stale file handle] >>>>> - Some files are listed twice in a directory (probably related >>>>> the stale file issue?) >>>>> Example; >>>>> ls -la >>>>> /rhev/data-center/59cd53a9-0003-02d7-00eb-0000000001e3/313f5d25-76af-4ecd-9a20-82a2fe815a3c/images/4add6751-3731-4bbd-ae94-aaeed12ea450/ >>>>> total 3081 >>>>> drwxr-x---. 2 vdsm kvm 4096 Mar 18 11:34 . >>>>> drwxr-xr-x. 13 vdsm kvm 4096 Mar 19 09:42 .. >>>>> -rw-rw----. 1 vdsm kvm 1048576 Mar 28 12:55 >>>>> 1a7cf259-6b29-421d-9688-b25dfaafb13c >>>>> -rw-rw----. 1 vdsm kvm 1048576 Mar 28 12:55 >>>>> 1a7cf259-6b29-421d-9688-b25dfaafb13c >>>>> -rw-rw----. 1 vdsm kvm 1048576 Jan 27 2018 >>>>> 1a7cf259-6b29-421d-9688-b25dfaafb13c.lease >>>>> -rw-r--r--. 1 vdsm kvm 290 Jan 27 2018 >>>>> 1a7cf259-6b29-421d-9688-b25dfaafb13c.meta >>>>> -rw-r--r--. 1 vdsm kvm 290 Jan 27 2018 >>>>> 1a7cf259-6b29-421d-9688-b25dfaafb13c.meta >>>>> >>>>> - brick processes sometimes starts multiple times. Sometimes I?ve 5 >>>>> brick processes for a single volume. Killing all glusterfsd?s for the >>>>> volume on the machine and running gluster v start force usually just >>>>> starts one after the event, from then on things look all right. 
>>>>> >>>>> >>>> May I kindly ask to open bugs on Gluster for above issues at >>>> https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS ? >>>> Sahina? >>>> >>>> >>>>> Ovirt 4.3.2.1-1.el7 >>>>> - All vms images ownership are changed to root.root after the vm >>>>> is shutdown, probably related to; >>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1666795 but not only >>>>> scoped to the HA engine. I?m still in compatibility mode 4.2 for the >>>>> cluster and for the vm?s, but upgraded to version ovirt 4.3.2 >>>>> >>>> >>>> Ryan? >>>> >>>> >>>>> - The network provider is set to ovn, which is fine..actually >>>>> cool, only the ?ovs-vswitchd? is a CPU hog, and utilizes 100% >>>>> >>>> >>>> Miguel? Dominik? >>>> >>>> >>>>> - It seems on all nodes vdsm tries to get the the stats for the >>>>> HA engine, which is filling the logs with (not sure if this is new); >>>>> [api.virt] FINISH getStats return={'status': {'message': "Virtual >>>>> machine does not exist: {'vmId': u'20d69acd-edfd-4aeb-a2ae-49e9c121b7e9'}", >>>>> 'code': 1}} from=::1,59290, vmId=20d69acd-edfd-4aeb-a2ae-49e9c121b7e9 >>>>> (api:54) >>>>> >>>> >>>> Simone? >>>> >>>> >>>>> - It seems the package os_brick [root] managedvolume not >>>>> supported: Managed Volume Not Supported. Missing package os-brick.: >>>>> ('Cannot import os_brick',) (caps:149) which fills the vdsm.log, but for >>>>> this I also saw another message, so I suspect this will already be resolved >>>>> shortly >>>>> - The machine I used to run the backup HA engine, doesn?t want >>>>> to get removed from the hosted-engine ?vm-status, not even after running; >>>>> hosted-engine --clean-metadata --host-id=10 --force-clean or hosted-engine >>>>> --clean-metadata --force-clean from the machine itself. >>>>> >>>> >>>> Simone? >>>> >>>> >>>>> >>>>> Think that's about it. >>>>> >>>>> Don?t get me wrong, I don?t want to rant, I just wanted to share my >>>>> experience and see where things can made better. >>>>> >>>> >>>> If not already done, can you please open bugs for above issues at >>>> https://bugzilla.redhat.com/enter_bug.cgi?classification=oVirt ? >>>> >>>> >>>>> >>>>> >>>>> Best Olaf >>>>> _______________________________________________ >>>>> Users mailing list -- users at ovirt.org >>>>> To unsubscribe send an email to users-leave at ovirt.org >>>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/ >>>>> oVirt Code of Conduct: >>>>> https://www.ovirt.org/community/about/community-guidelines/ >>>>> List Archives: >>>>> https://lists.ovirt.org/archives/list/users at ovirt.org/message/3CO35Q7VZMWNHS4LPUJNO7S47MGLSKS5/ >>>>> >>>> >>>> >>>> -- >>>> >>>> SANDRO BONAZZOLA >>>> >>>> MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV >>>> >>>> Red Hat EMEA >>>> >>>> sbonazzo at redhat.com >>>> >>>> >>> _______________________________________________ >>> Users mailing list -- users at ovirt.org >>> To unsubscribe send an email to users-leave at ovirt.org >>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/ >>> oVirt Code of Conduct: >>> https://www.ovirt.org/community/about/community-guidelines/ >>> List Archives: >>> https://lists.ovirt.org/archives/list/users at ovirt.org/message/HAGTA64LF7LLE6YMHQ6DLT26MD2GZ2PK/ >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From nl at fischer-ka.de Wed Apr 3 06:48:55 2019 From: nl at fischer-ka.de (Ingo Fischer) Date: Wed, 3 Apr 2019 08:48:55 +0200 Subject: [Gluster-users] Is "replica 4 arbiter 1" allowed to tweak client-quorum? 
Message-ID: <22ed4f3d-d8f1-71ec-1422-241413f93a08@fischer-ka.de> Hi All, I had a replica 2 cluster to host my VM images from my Proxmox cluster. I got a bit around split brain scenarios by using "nufa" to make sure the files are located on the host where the machine also runs normally. So in fact one replica could fail and I still had the VM working. But then I thought about doing better and decided to add a node to increase replica and I decided against arbiter approach. During this I also decided to go away from nufa to make it a more normal approach. But in fact by adding the third replica and removing nufa I'm not really better on availability - only split-brain-chance. I'm still at the point that only one node is allowed to fail because else the now active client quorum is no longer met and FS goes read only (which in fact is not really better then failing completely as it was before). So I thought about adding arbiter bricks as "kind of 4th replica (but without space needs) ... but then I read in docs that only "replica 3 arbiter 1" is allowed as combination. Is this still true? If docs are true: Why arbiter is not allowed for higher replica counts? It would allow to improve on client quorum in my understanding. Thank you for your opinion and/or facts :-) Ingo -- Ingo Fischer Technical Director of Platform Gameforge 4D GmbH Albert-Nestler-Stra?e 8 76131 Karlsruhe Germany Tel. +49 721 354 808-2269 ingo.fischer at gameforge.com http://www.gameforge.com Amtsgericht Mannheim, Handelsregisternummer 718029 USt-IdNr.: DE814330106 Gesch?ftsf?hrer Alexander R?sner, Jeffrey Brown From ravishankar at redhat.com Wed Apr 3 07:38:27 2019 From: ravishankar at redhat.com (Ravishankar N) Date: Wed, 3 Apr 2019 13:08:27 +0530 Subject: [Gluster-users] Is "replica 4 arbiter 1" allowed to tweak client-quorum? In-Reply-To: <22ed4f3d-d8f1-71ec-1422-241413f93a08@fischer-ka.de> References: <22ed4f3d-d8f1-71ec-1422-241413f93a08@fischer-ka.de> Message-ID: <21e01090-8ddc-5d3f-d58c-f673dad5a78a@redhat.com> On 03/04/19 12:18 PM, Ingo Fischer wrote: > Hi All, > > I had a replica 2 cluster to host my VM images from my Proxmox cluster. > I got a bit around split brain scenarios by using "nufa" to make sure > the files are located on the host where the machine also runs normally. > So in fact one replica could fail and I still had the VM working. > > But then I thought about doing better and decided to add a node to > increase replica and I decided against arbiter approach. During this I > also decided to go away from nufa to make it a more normal approach. > > But in fact by adding the third replica and removing nufa I'm not really > better on availability - only split-brain-chance. I'm still at the point > that only one node is allowed to fail because else the now active client > quorum is no longer met and FS goes read only (which in fact is not > really better then failing completely as it was before). > > So I thought about adding arbiter bricks as "kind of 4th replica (but > without space needs) ... but then I read in docs that only "replica 3 > arbiter 1" is allowed as combination. Is this still true? Yes, this is still true. Slightly off-topic, the 'replica 3 arbiter 1' was supposed to mean there are 3 bricks out of which 1 is an arbiter. This supposedly caused some confusion where people thought there were 4 bricks involved. The CLI syntax was changed in the newer releases to 'replica 2 arbiter 1` to mean there are 2 data bricks and 1 arbiter brick. 
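For example, the same 2 + 1 configuration could be created with either form of the command (the volume name, host names and brick paths below are only placeholders):

# gluster volume create myvol replica 3 arbiter 1 host1:/bricks/data1 host2:/bricks/data1 host3:/bricks/arbiter1
# gluster volume create myvol replica 2 arbiter 1 host1:/bricks/data1 host2:/bricks/data1 host3:/bricks/arbiter1
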
For backward compatibility, the older syntax still works though. The documentation needs to be updated. :-) > If docs are true: Why arbiter is not allowed for higher replica counts? The main motivation for the arbiter feature was to solve a specific case: people who wanted to avoid split-brains associated with replica 2 but did not want to add another full blown data brick to make it replica 3 for cost reasons. > It would allow to improve on client quorum in my understanding. Agreed but the current implementation is only for a 2+1 configuration. Perhaps it is something we could work on in the future to make it generic like you say. > > Thank you for your opinion and/or facts :-) I don't think NUFA is being worked on/tested actively. If you can afford a 3rd data brick, making it replica 3 is definitely better than a 2+1 arbiter since there is more availability by virtue of the 3rd brick also storing data. Both of them prevent split-brains and are used successfully by OVirt/ VM storage/ hyperconvergance use cases. Even without NUFA, for reads, AFR anyway serves it from the local copy (writes still need to go to all bricks). Regards, Ravi > > Ingo > From atumball at redhat.com Wed Apr 3 08:35:07 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Wed, 3 Apr 2019 14:05:07 +0530 Subject: [Gluster-users] Gluster 5.5 slower than 3.12.15 In-Reply-To: <408118771.15560336.1554251169489@mail.yahoo.com> References: <408118771.15560336.1554251169489.ref@mail.yahoo.com> <408118771.15560336.1554251169489@mail.yahoo.com> Message-ID: Strahil, With some basic testing, we are noticing the similar behavior too. One of the issue we identified was increased n/w usage in 5.x series (being addressed by https://review.gluster.org/#/c/glusterfs/+/22404/), and there are few other features which write extended attributes which caused some delay. We are in the process of publishing some numbers with release-3.12.x, release-5 and release-6 comparison soon. With some numbers we are already seeing release-6 currently is giving really good performance in many configurations, specially for 1x3 replicate volume type. While we continue to identify and fix issues in 5.x series, one of the request is to validate release-6.x (6.0 or 6.1 which would happen on April 10th), so you can see the difference in your workload. Regards, Amar On Wed, Apr 3, 2019 at 5:57 AM Strahil Nikolov wrote: > Hi Community, > > I have the feeling that with gluster v5.5 I have poorer performance than > it used to be on 3.12.15. Did you observe something like that? > > I have a 3 node Hyperconverged Cluster (ovirt + glusterfs with replica 3 > arbiter1 volumes) with NFS Ganesha and since I have upgraded to v5 - the > issues came up. > First it was 5.3 notorious experience and now with 5.5 - my sanlock is > having problems and higher latency than it used to be. I have switched from > NFS-Ganesha to pure FUSE , but the latency problems do not go away. > > Of course , this is partially due to the consumer hardware, but as the > hardware has not changed I was hoping that the performance will remain as > is. > > So, do you expect 5.5 to perform less than 3.12 ? 
> > Some info: > Volume Name: engine > Type: Replicate > Volume ID: 30ca1cc2-f2f7-4749-9e2e-cee9d7099ded > Status: Started > Snapshot Count: 0 > Number of Bricks: 1 x (2 + 1) = 3 > Transport-type: tcp > Bricks: > Brick1: ovirt1:/gluster_bricks/engine/engine > Brick2: ovirt2:/gluster_bricks/engine/engine > Brick3: ovirt3:/gluster_bricks/engine/engine (arbiter) > Options Reconfigured: > performance.client-io-threads: off > nfs.disable: on > transport.address-family: inet > performance.quick-read: off > performance.read-ahead: off > performance.io-cache: off > performance.low-prio-threads: 32 > network.remote-dio: off > cluster.eager-lock: enable > cluster.quorum-type: auto > cluster.server-quorum-type: server > cluster.data-self-heal-algorithm: full > cluster.locking-scheme: granular > cluster.shd-max-threads: 8 > cluster.shd-wait-qlength: 10000 > features.shard: on > user.cifs: off > storage.owner-uid: 36 > storage.owner-gid: 36 > network.ping-timeout: 30 > performance.strict-o-direct: on > cluster.granular-entry-heal: enable > cluster.enable-shared-storage: enable > > Network: 1 gbit/s > > Filesystem:XFS > > Best Regards, > Strahil Nikolov > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -- Amar Tumballi (amarts) -------------- next part -------------- An HTML attachment was scrubbed... URL: From olaf.buitelaar at gmail.com Wed Apr 3 10:28:04 2019 From: olaf.buitelaar at gmail.com (Olaf Buitelaar) Date: Wed, 3 Apr 2019 12:28:04 +0200 Subject: [Gluster-users] [ovirt-users] Re: Announcing Gluster release 5.5 In-Reply-To: References: <20190328164716.27693.35887@mail.ovirt.org> Message-ID: Dear Mohit, Sorry i thought Krutika was referring to the ovirt-kube brick logs. due the large size (18MB compressed), i've placed the files here; https://edgecastcdn.net/0004FA/files/bricklogs.tar.bz2 Also i see i've attached the wrong files, i intended to attach profile_data4.txt | profile_data3.txt Sorry for the confusion. Thanks Olaf Op wo 3 apr. 2019 om 04:56 schreef Mohit Agrawal : > Hi Olaf, > > As per current attached "multi-glusterfsd-vol3.txt | > multi-glusterfsd-vol4.txt" it is showing multiple processes are running > for "ovirt-core ovirt-engine" brick names but there are no logs > available in bricklogs.zip specific to this bricks, bricklogs.zip > has a dump of ovirt-kube logs only > > Kindly share brick logs specific to the bricks "ovirt-core > ovirt-engine" and share glusterd logs also. > > Regards > Mohit Agrawal > > On Tue, Apr 2, 2019 at 9:18 PM Olaf Buitelaar > wrote: > >> Dear Krutika, >> >> 1. >> I've changed the volume settings, write performance seems to increased >> somewhat, however the profile doesn't really support that since latencies >> increased. However read performance has diminished, which does seem to be >> supported by the profile runs (attached). >> Also the IO does seem to behave more consistent than before. >> I don't really understand the idea behind them, maybe you can explain why >> these suggestions are good? >> These settings seems to avoid as much local caching and access as >> possible and push everything to the gluster processes. While i would expect >> local access and local caches are a good thing, since it would lead to >> having less network access or disk access. 
>> I tried to investigate these settings a bit more, and this is what I
>> understood of them;
>> - network.remote-dio: when on, it seems to ignore the O_DIRECT flag in
>> the client, thus causing the files to be cached and buffered in the page
>> cache on the client. I would expect this to be a good thing, especially
>> if the server process would access the same page cache?
>> At least that is what I grasp from this commit;
>> https://review.gluster.org/#/c/glusterfs/+/4206/2/xlators/protocol/client/src/client.c line
>> 867
>> Also found this commit;
>> https://github.com/gluster/glusterfs/commit/06c4ba589102bf92c58cd9fba5c60064bc7a504e#diff-938709e499b4383c3ed33c3979b9080c
>> suggesting remote-dio actually improves performance; not sure whether
>> it's a write or read benchmark.
>> When a file is opened with O_DIRECT it will also disable the write-behind
>> functionality.
>>
>> - performance.strict-o-direct: when on, the AFR will not ignore the
>> O_DIRECT flag, and will invoke fop_writev_stub with the wb_writev_helper,
>> which seems to stack the operation; no idea why that is. But generally I
>> suppose not ignoring the O_DIRECT flag in the AFR is a good thing when a
>> process requests O_DIRECT. So this makes sense to me.
>>
>> - cluster.choose-local: when off, it doesn't prefer the local node, but
>> would always choose a brick. Since it's a 9-node cluster with 3
>> subvolumes, only 1/3 could end up local, and the other 2/3 should be
>> pushed to external nodes anyway. Or am I making a totally wrong
>> assumption here?
>>
>> It seems to me this config is moving towards the gluster-block config
>> side of things, which does make sense.
>> Since we're running quite some mysql instances, which open their files
>> with O_DIRECT I believe, it would mean the only layer of cache is within
>> mysql itself. Which you could argue is a good thing. But I would expect
>> that a little write-behind buffering, and maybe some of the data cached
>> within gluster, would alleviate things a bit on gluster's side. But I
>> wouldn't know if that's the correct mindset, and so might be totally off
>> here.
>> Also I would expect these gluster v set commands to be online operations,
>> but somehow the bricks went down after applying these changes. What
>> appears to have happened is that after the update the brick process was
>> restarted, but due to the multiple-brick-process start issue, multiple
>> processes were started, and the brick didn't come online again.
>> However I'll try to reproduce this, since I would like to test with
>> cluster.choose-local: on, and see how performance compares. And
>> hopefully, when it occurs, collect some useful info.
>> Question; are network.remote-dio and performance.strict-o-direct mutually
>> exclusive settings, or can they both be on?
>>
>> 2. I've attached all brick logs; the only thing relevant I found was;
>> [2019-03-28 20:20:07.170452] I [MSGID: 113030]
>> [posix-entry-ops.c:1146:posix_unlink] 0-ovirt-kube-posix:
>> open-fd-key-status: 0 for
>> /data/gfs/bricks/brick1/ovirt-kube/.shard/a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.109886
>> [2019-03-28 20:20:07.170491] I [MSGID: 113031]
>> [posix-entry-ops.c:1053:posix_skip_non_linkto_unlink] 0-posix: linkto_xattr
>> status: 0 for
>> /data/gfs/bricks/brick1/ovirt-kube/.shard/a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.109886
>> [2019-03-28 20:20:07.248480] I [MSGID: 113030]
>> [posix-entry-ops.c:1146:posix_unlink] 0-ovirt-kube-posix:
>> open-fd-key-status: 0 for
>> /data/gfs/bricks/brick1/ovirt-kube/.shard/a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.109886
>> [2019-03-28 20:20:07.248491] I [MSGID: 113031]
>> [posix-entry-ops.c:1053:posix_skip_non_linkto_unlink] 0-posix: linkto_xattr
>> status: 0 for
>> /data/gfs/bricks/brick1/ovirt-kube/.shard/a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.109886
>>
>> Thanks Olaf
>>
>> ps. sorry, I needed to resend since it exceeded the file limit
>>
>> On Mon, Apr 1, 2019 at 07:56, Krutika Dhananjay wrote:
>>
>>> Adding back gluster-users
>>> Comments inline ...
>>>
>>> On Fri, Mar 29, 2019 at 8:11 PM Olaf Buitelaar
>>> wrote:
>>>
>>>> Dear Krutika,
>>>>
>>>> 1. I've made 2 profile runs of around 10 minutes (see files
>>>> profile_data.txt and profile_data2.txt). Looking at it, most time seems
>>>> to be spent on the fops fsync and readdirp.
>>>>
>>>> Unfortunately I don't have the profile info for the 3.12.15 version, so
>>>> it's a bit hard to compare.
>>>>
>>>> One additional thing I do notice: on 1 machine (10.32.9.5) the iowait
>>>> time increased a lot, from an average below 1% it's now around 12%
>>>> after the upgrade.
>>>>
>>>> So the first suspicion would be that lightning strikes twice and I also
>>>> just now have a bad disk, but that doesn't appear to be the case, since
>>>> all SMART statuses report ok.
>>>>
>>>> Also dd shows performance I would more or less expect;
>>>>
>>>> dd if=/dev/zero of=/data/test_file bs=100M count=1 oflag=dsync
>>>> 1+0 records in
>>>> 1+0 records out
>>>> 104857600 bytes (105 MB) copied, 0.686088 s, 153 MB/s
>>>>
>>>> dd if=/dev/zero of=/data/test_file bs=1G count=1 oflag=dsync
>>>> 1+0 records in
>>>> 1+0 records out
>>>> 1073741824 bytes (1.1 GB) copied, 7.61138 s, 141 MB/s
>>>>
>>>> dd if=/dev/urandom of=/data/test_file bs=1024 count=1000000
>>>> 1000000+0 records in
>>>> 1000000+0 records out
>>>> 1024000000 bytes (1.0 GB) copied, 6.35051 s, 161 MB/s
>>>>
>>>> dd if=/dev/zero of=/data/test_file bs=1024 count=1000000
>>>> 1000000+0 records in
>>>> 1000000+0 records out
>>>> 1024000000 bytes (1.0 GB) copied, 1.6899 s, 606 MB/s
>>>>
>>>> When I disable this brick (service glusterd stop; pkill glusterfsd),
>>>> performance in gluster is better, but not on par with what it was. Also
>>>> the cpu usage on the 'neighbor' nodes, which host the other bricks in
>>>> the same subvolume, increases quite a lot in this case, which I wouldn't
>>>> expect actually, since they shouldn't handle much more work, except
>>>> flagging shards to heal. Iowait also goes to idle once gluster is
>>>> stopped, so it's for sure gluster which waits for io.
>>>>
>>>
>>> So I see that FSYNC %-latency is on the higher side. And I also noticed
>>> you don't have direct-io options enabled on the volume.
>>> Could you set the following options on the volume -
>>> # gluster volume set  network.remote-dio off
>>> # gluster volume set  performance.strict-o-direct on
>>> and also disable choose-local
>>> # gluster volume set  cluster.choose-local off
>>>
>>> let me know if this helps.
>>>
>>>> 2. I've attached the mnt log and volume info, but I couldn't find
>>>> anything relevant in those logs. I think this is because we run the
>>>> VMs with libgfapi;
>>>>
>>>> [root at ovirt-host-01 ~]# engine-config -g LibgfApiSupported
>>>> LibgfApiSupported: true version: 4.2
>>>> LibgfApiSupported: true version: 4.1
>>>> LibgfApiSupported: true version: 4.3
>>>>
>>>> And I can confirm the qemu process is invoked with the gluster://
>>>> address for the images.
>>>>
>>>> The message is logged in the /var/lib/libvirt/qemu/ file,
>>>> which I've also included. For a sample case see around 2019-03-28
>>>> 20:20:07, which has the error; E [MSGID: 133010]
>>>> [shard.c:2294:shard_common_lookup_shards_cbk] 0-ovirt-kube-shard: Lookup on
>>>> shard 109886 failed. Base file gfid = a38d64bc-a28b-4ee1-a0bb-f919e7a1022c
>>>> [Stale file handle]
>>>>
>>> Could you also attach the brick logs for this volume?
>>>
>>>> 3. Yes, I see multiple instances for the same brick directory, like;
>>>>
>>>> /usr/sbin/glusterfsd -s 10.32.9.6 --volfile-id
>>>> ovirt-core.10.32.9.6.data-gfs-bricks-brick1-ovirt-core -p
>>>> /var/run/gluster/vols/ovirt-core/10.32.9.6-data-gfs-bricks-brick1-ovirt-core.pid
>>>> -S /var/run/gluster/452591c9165945d9.socket --brick-name
>>>> /data/gfs/bricks/brick1/ovirt-core -l
>>>> /var/log/glusterfs/bricks/data-gfs-bricks-brick1-ovirt-core.log
>>>> --xlator-option *-posix.glusterd-uuid=fb513da6-f3bd-4571-b8a2-db5efaf60cc1
>>>> --process-name brick --brick-port 49154 --xlator-option
>>>> ovirt-core-server.listen-port=49154
>>>>
>>>> I've made an export of the output of ps from the time I observed these
>>>> multiple processes.
>>>>
>>>> In addition to the brick_mux bug as noted by Atin, I might also have
>>>> another possible cause: as ovirt moves nodes from non-operational state
>>>> or maintenance state to active/activating, it also seems to restart
>>>> gluster. However, I don't have direct proof for this theory.
>>>>
>>> +Atin Mukherjee ^^
>>> +Mohit Agrawal ^^
>>>
>>> -Krutika
>>>
>>>> Thanks Olaf
>>>>
>>>> On Fri, Mar 29, 2019 at 10:03, Sandro Bonazzola <
>>>> sbonazzo at redhat.com> wrote:
>>>>>
>>>>> On Thu, Mar 28, 2019 at 17:48, wrote:
>>>>>
>>>>>> Dear All,
>>>>>>
>>>>>> I wanted to share my experience upgrading from 4.2.8 to 4.3.1. While
>>>>>> previous upgrades from 4.1 to 4.2 etc. went rather smoothly, this one
>>>>>> was a different experience. After first trying a test upgrade on a
>>>>>> 3-node setup, which went fine, I headed to upgrade the 9-node
>>>>>> production platform, unaware of the backward-compatibility issues
>>>>>> between gluster 3.12.15 -> 5.3. After upgrading 2 nodes, the HA engine
>>>>>> stopped and wouldn't start. Vdsm wasn't able to mount the engine
>>>>>> storage domain, since /dom_md/metadata was missing or couldn't be
>>>>>> accessed. By restoring this file from a good copy on the underlying
>>>>>> bricks, removing the file from the underlying bricks where it was 0
>>>>>> bytes and marked with the sticky bit (and the corresponding gfid's),
>>>>>> removing the file from the mount point and copying the file back onto
>>>>>> the mount point,
>>>>>> manually mounting the engine domain, manually creating the
>>>>>> corresponding symbolic links in /rhev/data-center and
>>>>>> /var/run/vdsm/storage, and fixing the ownership back to vdsm.kvm
>>>>>> (which was root.root), I was able to start the HA engine again. Since
>>>>>> the engine was up again but things seemed rather unstable, I decided
>>>>>> to continue the upgrade on the other nodes; suspecting an
>>>>>> incompatibility in gluster versions, I thought it would be best to
>>>>>> have them all on the same version rather soonish. However, things went
>>>>>> from bad to worse: the engine stopped again, and all VMs stopped
>>>>>> working as well. So on a machine outside the setup I restored a backup
>>>>>> of the engine taken from version 4.2.8 just before the upgrade. With
>>>>>> this engine I was at least able to start some VMs again, and finalize
>>>>>> the upgrade. Once upgraded, things didn't stabilize, and I also lost 2
>>>>>> VMs during the process due to image corruption. After figuring out
>>>>>> that gluster 5.3 had quite some issues, I was lucky to see gluster 5.5
>>>>>> was about to be released; the moment the RPMs were available I
>>>>>> installed those. This helped a lot in terms of stability, for which
>>>>>> I'm very grateful! However the performance is unfortunately terrible;
>>>>>> it's about 15% of what the performance was running gluster 3.12.15.
>>>>>> It's strange, since a simple dd shows ok performance, but our actual
>>>>>> workload doesn't, while I would expect the performance to be better
>>>>>> due to all the improvements made since gluster version 3.12. Does
>>>>>> anybody share the same experience?
>>>>>> I really hope gluster 6 will soon be tested with ovirt and released,
>>>>>> and things start to perform and stabilize again, like in the good old
>>>>>> days. Of course, if I can do anything, I'm happy to help.
>>>>>
>>>>> Opened https://bugzilla.redhat.com/show_bug.cgi?id=1693998 to track
>>>>> the rebase on Gluster 6.
>>>>>
>>>>>> I think this is the short list of issues we have after the migration;
>>>>>> Gluster 5.5;
>>>>>> - Poor performance for our workload (mostly write dependent)
>>>>>> - VMs randomly pause on unknown storage errors, which are
>>>>>> 'stale files'. Corresponding log; Lookup on shard 797 failed. Base file
>>>>>> gfid = 8a27b91a-ff02-42dc-bd4c-caa019424de8 [Stale file handle]
>>>>>> - Some files are listed twice in a directory (probably related to
>>>>>> the stale file issue?)
>>>>>> Example;
>>>>>> ls -la
>>>>>> /rhev/data-center/59cd53a9-0003-02d7-00eb-0000000001e3/313f5d25-76af-4ecd-9a20-82a2fe815a3c/images/4add6751-3731-4bbd-ae94-aaeed12ea450/
>>>>>> total 3081
>>>>>> drwxr-x---.  2 vdsm kvm    4096 Mar 18 11:34 .
>>>>>> drwxr-xr-x. 13 vdsm kvm    4096 Mar 19 09:42 ..
>>>>>> -rw-rw----.  1 vdsm kvm 1048576 Mar 28 12:55 1a7cf259-6b29-421d-9688-b25dfaafb13c
>>>>>> -rw-rw----.  1 vdsm kvm 1048576 Mar 28 12:55 1a7cf259-6b29-421d-9688-b25dfaafb13c
>>>>>> -rw-rw----.  1 vdsm kvm 1048576 Jan 27  2018 1a7cf259-6b29-421d-9688-b25dfaafb13c.lease
>>>>>> -rw-r--r--.  1 vdsm kvm     290 Jan 27  2018 1a7cf259-6b29-421d-9688-b25dfaafb13c.meta
>>>>>> -rw-r--r--.  1 vdsm kvm     290 Jan 27  2018 1a7cf259-6b29-421d-9688-b25dfaafb13c.meta
>>>>>>
>>>>>> - Brick processes sometimes start multiple times. Sometimes I have 5
>>>>>> brick processes for a single volume.
>>>>>> Killing all glusterfsd's for the volume on the machine and running
>>>>>> gluster v start force usually just starts one, and from then on things
>>>>>> look all right.
>>>>>>
>>>>> May I kindly ask you to open bugs on Gluster for the above issues at
>>>>> https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS ?
>>>>> Sahina?
>>>>>
>>>>>> Ovirt 4.3.2.1-1.el7
>>>>>> - All VM images' ownership is changed to root.root after the VM is
>>>>>> shut down, probably related to
>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1666795, but not only
>>>>>> scoped to the HA engine. I'm still in compatibility mode 4.2 for the
>>>>>> cluster and for the VMs, but upgraded to ovirt version 4.3.2.
>>>>>
>>>>> Ryan?
>>>>>
>>>>>> - The network provider is set to OVN, which is fine, actually cool,
>>>>>> only the 'ovs-vswitchd' is a CPU hog and utilizes 100%.
>>>>>
>>>>> Miguel? Dominik?
>>>>>
>>>>>> - It seems on all nodes vdsm tries to get the stats for the HA
>>>>>> engine, which is filling the logs with (not sure if this is new);
>>>>>> [api.virt] FINISH getStats return={'status': {'message': "Virtual
>>>>>> machine does not exist: {'vmId': u'20d69acd-edfd-4aeb-a2ae-49e9c121b7e9'}",
>>>>>> 'code': 1}} from=::1,59290, vmId=20d69acd-edfd-4aeb-a2ae-49e9c121b7e9
>>>>>> (api:54)
>>>>>
>>>>> Simone?
>>>>>
>>>>>> - It seems the package os_brick is missing; "[root] managedvolume not
>>>>>> supported: Managed Volume Not Supported. Missing package os-brick.:
>>>>>> ('Cannot import os_brick',) (caps:149)" fills the vdsm.log, but for
>>>>>> this I also saw another message, so I suspect this will already be
>>>>>> resolved shortly.
>>>>>> - The machine I used to run the backup HA engine doesn't want to get
>>>>>> removed from the hosted-engine --vm-status, not even after running
>>>>>> hosted-engine --clean-metadata --host-id=10 --force-clean or
>>>>>> hosted-engine --clean-metadata --force-clean from the machine itself.
>>>>>
>>>>> Simone?
>>>>>
>>>>>> Think that's about it.
>>>>>>
>>>>>> Don't get me wrong, I don't want to rant, I just wanted to share my
>>>>>> experience and see where things can be made better.
>>>>>
>>>>> If not already done, can you please open bugs for the above issues at
>>>>> https://bugzilla.redhat.com/enter_bug.cgi?classification=oVirt ?
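On the multiple-brick-process issue above, a rough way to spot such
duplicates is to group the running glusterfsd processes by their
--brick-name argument, for example:

# ps -C glusterfsd -o args= | grep -o -- '--brick-name [^ ]*' | sort | uniq -c

A count higher than 1 for any brick path means that brick is being served
by duplicate processes.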
>>>>> >>>>> >>>>>> >>>>>> >>>>>> Best Olaf >>>>>> _______________________________________________ >>>>>> Users mailing list -- users at ovirt.org >>>>>> To unsubscribe send an email to users-leave at ovirt.org >>>>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/ >>>>>> oVirt Code of Conduct: >>>>>> https://www.ovirt.org/community/about/community-guidelines/ >>>>>> List Archives: >>>>>> https://lists.ovirt.org/archives/list/users at ovirt.org/message/3CO35Q7VZMWNHS4LPUJNO7S47MGLSKS5/ >>>>>> >>>>> >>>>> >>>>> -- >>>>> >>>>> SANDRO BONAZZOLA >>>>> >>>>> MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV >>>>> >>>>> Red Hat EMEA >>>>> >>>>> sbonazzo at redhat.com >>>>> >>>>> >>>> _______________________________________________ >>>> Users mailing list -- users at ovirt.org >>>> To unsubscribe send an email to users-leave at ovirt.org >>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/ >>>> oVirt Code of Conduct: >>>> https://www.ovirt.org/community/about/community-guidelines/ >>>> List Archives: >>>> https://lists.ovirt.org/archives/list/users at ovirt.org/message/HAGTA64LF7LLE6YMHQ6DLT26MD2GZ2PK/ >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- Brick: 10.32.9.9:/data0/gfs/bricks/brick1/ovirt-data ---------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 11 200 9 No. of Writes: 2 31538 326701 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 22 319528 527228 No. of Writes: 53880 1409021 1140345 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 27747 3229 120201 No. of Writes: 479690 114939 144204 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 209766 7725 43 No. of Writes: 105320 165416 8915 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 2 FORGET 0.00 0.00 us 0.00 us 0.00 us 6728 RELEASE 0.00 0.00 us 0.00 us 0.00 us 42179 RELEASEDIR 0.01 44.17 us 1.07 us 1288.76 us 2914 OPENDIR 0.02 697.13 us 42.15 us 5689.21 us 322 OPEN 0.02 411.08 us 8.60 us 5405.25 us 606 GETXATTR 0.02 1209.66 us 147.78 us 3219.56 us 234 READDIRP 0.03 38.80 us 19.08 us 7544.91 us 7757 STATFS 0.04 826.28 us 13.79 us 3583.18 us 616 READDIR 0.07 61.83 us 15.94 us 131142.59 us 13989 FSTAT 2.03 137.78 us 48.36 us 235353.97 us 172712 FXATTROP 2.16 983.89 us 10.19 us 660025.30 us 25674 LOOKUP 2.90 406.99 us 36.68 us 756289.17 us 83397 FSYNC 4.63 67941.30 us 13.93 us 1840271.15 us 798 INODELK 7.81 576.74 us 75.16 us 422586.52 us 158680 WRITE 40.09 2713.33 us 11.70 us 1850709.72 us 173111 FINODELK 40.16 3587.78 us 72.64 us 729965.74 us 131143 READ Duration: 58768 seconds Data Read: 45226370705 bytes Data Written: 133611506006 bytes Interval 9 Stats: Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 0 0 0 No. of Writes: 394 387 86 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 141 1093 13 No. of Writes: 5905 10055 2308 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 15 515 2595 No. of Writes: 763 1465 1637 Block Size: 262144b+ 524288b+ No. of Reads: 2 0 No. of Writes: 2759 73 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 70 RELEASE 0.00 0.00 us 0.00 us 0.00 us 503 RELEASEDIR 0.00 172.94 us 46.56 us 620.23 us 70 OPEN 0.01 153.42 us 11.38 us 855.47 us 111 GETXATTR 0.01 49.63 us 1.23 us 1288.76 us 503 OPENDIR 0.01 434.10 us 27.72 us 2015.25 us 88 READDIR 0.02 1208.37 us 152.54 us 2434.77 us 46 READDIRP 0.02 43.13 us 20.02 us 2030.66 us 1361 STATFS 0.04 45.66 us 18.57 us 284.28 us 2431 FSTAT 1.20 154.41 us 75.97 us 84525.06 us 23005 FXATTROP 2.86 1865.08 us 14.26 us 212498.60 us 4518 LOOKUP 3.78 1006.27 us 38.86 us 756289.17 us 11072 FSYNC 4.27 60261.87 us 17.32 us 1437527.90 us 209 INODELK 8.19 935.38 us 76.82 us 422586.52 us 25832 WRITE 20.67 13949.32 us 89.67 us 707765.19 us 4374 READ 58.93 7494.13 us 12.88 us 1607033.18 us 23206 FINODELK Duration: 740 seconds Data Read: 385507328 bytes Data Written: 1776420864 bytes Brick: 10.32.9.5:/data/gfs/bricks/brick1/ovirt-data --------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 2 458 87 No. of Writes: 3 4507 33740 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 54 34013 110867 No. of Writes: 6056 341153 234627 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 7430 587 28255 No. of Writes: 70451 12767 34177 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 2417 6164 15 No. of Writes: 40925 27615 4342 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 2 FORGET 0.00 0.00 us 0.00 us 0.00 us 49432 RELEASE 0.00 0.00 us 0.00 us 0.00 us 40899 RELEASEDIR 0.00 158.97 us 158.97 us 158.97 us 1 MKNOD 0.00 393.70 us 9.69 us 2344.17 us 8 ENTRYLK 0.00 129.60 us 10.54 us 296.61 us 112 READDIR 0.00 1299.61 us 6.43 us 155911.98 us 125 GETXATTR 0.00 3928.24 us 139.26 us 240788.91 us 236 READDIRP 0.03 3784.28 us 15.61 us 469284.63 us 1686 FSTAT 0.04 2368.24 us 28.06 us 242169.67 us 3623 OPEN 0.05 2811.93 us 8.13 us 1250845.84 us 3381 FLUSH 0.06 4385.28 us 0.80 us 527903.92 us 2653 OPENDIR 0.09 2315.69 us 11.48 us 816339.95 us 7750 STATFS 0.18 55337.88 us 8.34 us 1543417.83 us 648 INODELK 0.37 1462.23 us 6.84 us 1127299.99 us 49902 FINODELK 0.57 3924.78 us 11.60 us 968588.21 us 28256 LOOKUP 1.91 7500.40 us 53.88 us 2738720.92 us 49870 FXATTROP 2.21 30153.49 us 63.31 us 3473303.89 us 14319 READ 14.57 110289.45 us 122.19 us 3055911.44 us 25864 FSYNC 79.91 262383.20 us 98.78 us 4500846.60 us 59632 WRITE Duration: 60363 seconds Data Read: 6417030998 bytes Data Written: 27570997546 bytes Interval 9 Stats: Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 0 0 0 No. of Writes: 59 2334 441 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 13 331 7 No. of Writes: 4519 1752 790 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 0 145 0 No. of Writes: 84 399 151 Block Size: 262144b+ 524288b+ No. of Reads: 0 0 No. of Writes: 214 31 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 615 RELEASE 0.00 0.00 us 0.00 us 0.00 us 467 RELEASEDIR 0.00 45.52 us 8.46 us 78.57 us 25 GETXATTR 0.00 144.98 us 13.50 us 296.61 us 16 READDIR 0.01 7894.70 us 180.57 us 240788.91 us 46 READDIRP 0.03 1404.57 us 1.00 us 94678.96 us 467 OPENDIR 0.06 4985.90 us 17.46 us 403210.36 us 294 FSTAT 0.06 2453.17 us 35.05 us 242169.67 us 615 OPEN 0.10 3976.83 us 9.70 us 1250845.84 us 591 FLUSH 0.10 33579.24 us 10.59 us 937670.52 us 73 INODELK 0.12 2132.57 us 14.22 us 816339.95 us 1361 STATFS 0.29 617.29 us 8.19 us 164742.40 us 11477 FINODELK 0.69 3379.94 us 17.79 us 622513.08 us 5053 LOOKUP 0.84 42003.14 us 160.61 us 1495939.39 us 496 READ 1.66 3575.85 us 68.64 us 1688509.25 us 11476 FXATTROP 22.52 95429.52 us 126.22 us 3055911.44 us 5823 FSYNC 73.52 168379.58 us 110.01 us 4058537.96 us 10773 WRITE Duration: 740 seconds Data Read: 12386304 bytes Data Written: 217700864 bytes Brick: 10.32.9.6:/data/gfs/bricks/bricka/ovirt-data --------------------------------------------------- Cumulative Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 789986 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 2 FORGET 0.00 0.00 us 0.00 us 0.00 us 49432 RELEASE 0.00 0.00 us 0.00 us 0.00 us 40938 RELEASEDIR 0.00 21.03 us 16.15 us 34.88 us 8 ENTRYLK 0.00 270.94 us 270.94 us 270.94 us 1 MKNOD 0.01 261.74 us 11.55 us 9174.66 us 116 GETXATTR 0.01 297.48 us 13.73 us 2466.13 us 112 READDIR 0.07 64.73 us 15.29 us 4946.30 us 3382 FLUSH 0.07 82.72 us 1.51 us 4642.85 us 2661 OPENDIR 0.22 193.05 us 39.92 us 64374.98 us 3624 OPEN 0.25 1255.82 us 14.35 us 63381.45 us 648 INODELK 0.89 57.44 us 10.33 us 8940.33 us 50009 FINODELK 1.44 77.62 us 15.84 us 31914.28 us 59679 WRITE 2.59 294.62 us 15.52 us 115626.36 us 28267 LOOKUP 3.49 224.71 us 77.62 us 98174.30 us 49948 FXATTROP 90.95 11273.35 us 78.67 us 453079.55 us 25908 FSYNC Duration: 60366 seconds Data Read: 0 bytes Data Written: 789986 bytes Interval 9 Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 10774 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 615 RELEASE 0.00 0.00 us 0.00 us 0.00 us 467 RELEASEDIR 0.00 85.11 us 12.25 us 434.77 us 14 GETXATTR 0.00 42.29 us 17.37 us 500.84 us 73 INODELK 0.00 205.10 us 15.46 us 509.24 us 16 READDIR 0.04 57.10 us 15.29 us 1829.07 us 591 FLUSH 0.05 79.87 us 1.78 us 1854.43 us 467 OPENDIR 0.11 144.84 us 45.31 us 17419.60 us 615 OPEN 0.79 55.64 us 13.17 us 8940.33 us 11478 FINODELK 0.93 69.84 us 16.64 us 6779.39 us 10774 WRITE 1.79 286.71 us 16.91 us 24721.64 us 5053 LOOKUP 3.09 218.15 us 81.40 us 54774.50 us 11476 FXATTROP 93.19 12944.68 us 111.22 us 453079.55 us 5825 FSYNC Duration: 740 seconds Data Read: 0 bytes Data Written: 10774 bytes Brick: 10.32.9.4:/data/gfs/bricks/brick1/ovirt-data --------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 52412 6 0 No. of Writes: 3 4504 33731 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 0 66342 53000 No. of Writes: 6056 340041 234374 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 2686 356 12946 No. of Writes: 70264 12678 34177 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 602 3108 3 No. of Writes: 20547 27615 4342 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 2 FORGET 0.00 0.00 us 0.00 us 0.00 us 49333 RELEASE 0.00 0.00 us 0.00 us 0.00 us 40379 RELEASEDIR 0.00 22.96 us 15.97 us 51.06 us 8 ENTRYLK 0.00 260.50 us 260.50 us 260.50 us 1 MKNOD 0.00 77.42 us 16.00 us 2701.10 us 648 INODELK 0.00 144.50 us 7.61 us 1702.83 us 428 GETXATTR 0.01 285.11 us 12.41 us 3140.45 us 406 READDIR 0.01 51.02 us 13.08 us 46431.20 us 3384 FLUSH 0.01 65.58 us 0.94 us 17715.27 us 2808 OPENDIR 0.01 80.40 us 11.70 us 19019.90 us 2445 STAT 0.03 118.76 us 40.23 us 33323.48 us 3626 OPEN 0.03 57.81 us 15.15 us 27740.94 us 7757 STATFS 0.04 197.89 us 119.38 us 17249.34 us 2481 READDIRP 0.36 99.03 us 11.12 us 301165.99 us 49989 FINODELK 1.08 526.62 us 13.29 us 263413.97 us 28422 LOOKUP 1.23 341.69 us 71.48 us 563688.45 us 49950 FXATTROP 7.47 3998.57 us 82.10 us 469183.97 us 25947 READ 35.02 18777.53 us 92.85 us 483169.32 us 25908 FSYNC 54.69 12727.00 us 149.97 us 759284.50 us 59684 WRITE Duration: 58519 seconds Data Read: 3261956842 bytes Data Written: 24886890282 bytes Interval 9 Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 665 0 0 No. of Writes: 0 59 2334 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 0 7 103 No. of Writes: 441 4519 1752 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 2 3 17 No. of Writes: 790 84 399 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 0 0 0 No. of Writes: 151 214 31 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 615 RELEASE 0.00 0.00 us 0.00 us 0.00 us 488 RELEASEDIR 0.00 39.06 us 18.17 us 225.32 us 73 INODELK 0.00 76.36 us 9.68 us 297.33 us 46 GETXATTR 0.01 222.65 us 21.02 us 593.11 us 58 READDIR 0.01 45.45 us 12.62 us 289.83 us 417 STAT 0.01 36.42 us 13.83 us 918.21 us 591 FLUSH 0.01 53.38 us 0.99 us 474.56 us 488 OPENDIR 0.02 87.38 us 40.23 us 5527.50 us 615 OPEN 0.03 49.03 us 18.83 us 3866.60 us 1361 STATFS 0.04 189.02 us 122.54 us 990.42 us 435 READDIRP 0.33 63.39 us 13.25 us 128981.30 us 11477 FINODELK 0.74 321.28 us 13.43 us 37963.21 us 5074 LOOKUP 0.98 186.47 us 80.20 us 43834.36 us 11476 FXATTROP 2.30 6321.31 us 154.25 us 110020.47 us 797 READ 39.43 8011.27 us 168.50 us 404368.45 us 10774 WRITE 56.08 21071.37 us 152.03 us 325318.37 us 5826 FSYNC Duration: 740 seconds Data Read: 2502580 bytes Data Written: 217700864 bytes Brick: 10.32.9.8:/data/gfs/bricks/bricka/ovirt-data --------------------------------------------------- Cumulative Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 2836841 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 3501 RELEASE 0.00 0.00 us 0.00 us 0.00 us 39501 RELEASEDIR 0.00 23.26 us 15.42 us 47.19 us 110 FLUSH 0.01 85.78 us 57.99 us 148.82 us 124 REMOVEXATTR 0.01 90.83 us 67.34 us 158.33 us 124 SETATTR 0.01 67.66 us 10.50 us 368.03 us 194 GETXATTR 0.02 84.14 us 41.98 us 499.27 us 250 OPEN 0.04 36.99 us 12.62 us 398.57 us 944 INODELK 0.06 280.66 us 14.10 us 1296.89 us 197 READDIR 0.14 49.60 us 1.25 us 911.95 us 2704 OPENDIR 8.05 27.51 us 11.45 us 86619.65 us 270887 FINODELK 8.60 50.28 us 14.57 us 117405.17 us 158241 WRITE 22.34 810.95 us 15.51 us 136924.46 us 25499 LOOKUP 26.84 184.15 us 32.55 us 187376.40 us 134874 FSYNC 33.87 115.65 us 48.10 us 68557.92 us 271003 FXATTROP Duration: 59079 seconds Data Read: 0 bytes Data Written: 2836841 bytes Interval 9 Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 25110 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 84 RELEASE 0.00 0.00 us 0.00 us 0.00 us 482 RELEASEDIR 0.00 22.68 us 15.95 us 30.51 us 22 FLUSH 0.02 81.94 us 66.17 us 121.51 us 24 REMOVEXATTR 0.02 92.75 us 73.22 us 158.33 us 24 SETATTR 0.02 50.76 us 10.50 us 198.71 us 52 GETXATTR 0.06 88.22 us 47.47 us 347.92 us 84 OPEN 0.07 200.12 us 17.43 us 366.16 us 46 READDIR 0.10 43.88 us 12.88 us 398.57 us 298 INODELK 0.17 46.60 us 1.30 us 95.78 us 482 OPENDIR 6.89 35.71 us 14.98 us 8325.56 us 25110 WRITE 8.62 243.97 us 17.27 us 13438.88 us 4599 LOOKUP 9.62 26.97 us 12.23 us 10471.02 us 46438 FINODELK 32.58 183.27 us 33.33 us 182520.02 us 23144 FSYNC 41.83 117.30 us 57.85 us 1991.12 us 46424 FXATTROP Duration: 740 seconds Data Read: 0 bytes Data Written: 25110 bytes Brick: 10.32.9.8:/data0/gfs/bricks/brick1/ovirt-data ---------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 8 1097 109 No. of Writes: 0 2901 273197 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 115 238693 440909 No. of Writes: 36872 1361504 875644 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 37900 3346 141710 No. of Writes: 293109 93776 162079 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 3889 7281 33 No. of Writes: 161749 236364 7941 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 1 FORGET 0.00 0.00 us 0.00 us 0.00 us 3720 RELEASE 0.00 0.00 us 0.00 us 0.00 us 39522 RELEASEDIR 0.00 46.19 us 10.83 us 328.71 us 167 GETXATTR 0.00 107.36 us 42.28 us 762.18 us 195 OPEN 0.01 203.17 us 12.21 us 864.86 us 197 READDIR 0.03 43.02 us 1.32 us 452.74 us 2704 OPENDIR 0.06 2113.84 us 1920.14 us 2569.20 us 124 READDIRP 0.06 36.11 us 17.79 us 347.13 us 7757 STATFS 0.09 35.61 us 16.14 us 340.33 us 11844 FSTAT 0.73 27.99 us 11.02 us 73986.88 us 118371 FINODELK 1.77 136.85 us 37.39 us 121066.77 us 58862 FSYNC 1.88 346.99 us 15.01 us 77684.23 us 24658 LOOKUP 3.34 128.87 us 55.07 us 45501.15 us 118386 FXATTROP 5.55 52717.08 us 16.10 us 2004661.60 us 480 INODELK 9.40 234.45 us 75.18 us 172924.48 us 182886 WRITE 77.09 3911.50 us 74.71 us 427304.61 us 89909 READ Duration: 59079 seconds Data Read: 18550783716 bytes Data Written: 169056832000 bytes Interval 9 Stats: Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 0 0 0 No. of Writes: 28 1201 202 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 88 1012 13 No. of Writes: 11370 7637 1887 Block Size: 32768b+ 65536b+ 131072b+ No. 
of Reads: 13 723 0 No. of Writes: 518 690 1562 Block Size: 262144b+ 524288b+ No. of Reads: 0 0 No. of Writes: 2221 50 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 21 RELEASE 0.00 0.00 us 0.00 us 0.00 us 473 RELEASEDIR 0.00 63.17 us 11.55 us 328.71 us 24 GETXATTR 0.00 85.37 us 47.47 us 218.02 us 21 OPEN 0.01 191.44 us 20.33 us 506.91 us 28 READDIR 0.05 42.27 us 1.32 us 103.43 us 473 OPENDIR 0.12 35.99 us 17.79 us 312.67 us 1361 STATFS 0.13 2137.12 us 1993.74 us 2312.10 us 24 READDIRP 0.19 36.56 us 16.68 us 182.58 us 2058 FSTAT 1.46 28.28 us 11.78 us 4920.38 us 20656 FINODELK 3.09 283.99 us 16.10 us 77684.23 us 4368 LOOKUP 3.43 134.79 us 38.66 us 46317.56 us 10211 FSYNC 6.69 129.92 us 55.07 us 1519.44 us 20670 FXATTROP 15.38 225.45 us 75.18 us 166890.53 us 27366 WRITE 18.50 114198.06 us 16.67 us 2004661.60 us 65 INODELK 50.94 11055.19 us 133.17 us 355082.08 us 1849 READ Duration: 740 seconds Data Read: 57180160 bytes Data Written: 1466518016 bytes Brick: 10.32.9.7:/data/gfs/bricks/brick1/ovirt-data --------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 8 146 10 No. of Writes: 0 5640 191078 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 12 275838 263894 No. of Writes: 29947 1275560 712585 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 17182 2303 69829 No. of Writes: 286032 45424 94648 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 2287 4034 20 No. of Writes: 88659 100478 6790 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 3501 RELEASE 0.00 0.00 us 0.00 us 0.00 us 40057 RELEASEDIR 0.00 20.98 us 14.68 us 31.79 us 110 FLUSH 0.00 82.04 us 58.17 us 510.84 us 124 REMOVEXATTR 0.00 86.21 us 62.72 us 156.44 us 124 SETATTR 0.00 268.84 us 6.67 us 2674.39 us 259 GETXATTR 0.00 321.60 us 41.89 us 3004.61 us 250 OPEN 0.01 475.40 us 14.32 us 1787.62 us 281 READDIR 0.01 1249.13 us 26.42 us 3832.88 us 234 READDIRP 0.05 178.77 us 16.13 us 351764.07 us 7757 STATFS 0.09 822.57 us 1.17 us 1068559.38 us 2746 OPENDIR 0.12 292.98 us 25.31 us 1365160.22 us 10719 FSTAT 0.90 25010.58 us 13.33 us 887933.44 us 941 INODELK 1.38 133.53 us 11.10 us 3938189.55 us 270885 FINODELK 1.68 162.21 us 45.86 us 2503412.21 us 271003 FXATTROP 2.03 394.43 us 31.20 us 756176.42 us 134874 FSYNC 12.66 2092.21 us 72.21 us 4245933.36 us 158241 WRITE 14.29 14633.42 us 10.75 us 4031333.55 us 25543 LOOKUP 66.78 18236.84 us 69.72 us 6429153.50 us 95797 READ Duration: 59031 seconds Data Read: 10396155504 bytes Data Written: 84404067328 bytes Interval 9 Stats: Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 0 0 0 No. of Writes: 64 279 45 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 172 529 10 No. of Writes: 14494 5264 1377 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 5 351 0 No. of Writes: 316 548 1064 Block Size: 262144b+ 524288b+ No. of Reads: 0 0 No. of Writes: 1620 39 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 84 RELEASE 0.00 0.00 us 0.00 us 0.00 us 494 RELEASEDIR 0.00 22.20 us 17.97 us 31.79 us 22 FLUSH 0.00 81.91 us 68.21 us 101.65 us 24 REMOVEXATTR 0.00 89.38 us 72.10 us 128.30 us 24 SETATTR 0.01 43.26 us 1.29 us 153.44 us 494 OPENDIR 0.01 427.33 us 15.79 us 1319.22 us 70 READDIR 0.02 33.36 us 19.57 us 258.67 us 1361 STATFS 0.02 561.73 us 9.55 us 2674.39 us 86 GETXATTR 0.02 1283.75 us 28.25 us 3720.17 us 46 READDIRP 0.03 790.46 us 41.89 us 3004.61 us 84 OPEN 0.03 42.33 us 25.31 us 308.87 us 1862 FSTAT 0.53 27.26 us 11.92 us 22788.88 us 46436 FINODELK 0.61 316.49 us 10.75 us 100131.38 us 4611 LOOKUP 2.25 231.96 us 37.46 us 540421.26 us 23144 FSYNC 2.97 152.85 us 51.13 us 541193.18 us 46424 FXATTROP 4.89 39428.07 us 14.19 us 881646.99 us 296 INODELK 18.93 1799.80 us 72.85 us 3118991.86 us 25110 WRITE 69.67 155845.21 us 130.17 us 4885484.01 us 1067 READ Duration: 740 seconds Data Read: 28553216 bytes Data Written: 1032566784 bytes Brick: 10.32.9.21:/data/gfs/bricks/bricka/ovirt-data ---------------------------------------------------- Cumulative Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 3513729 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 1 FORGET 0.00 0.00 us 0.00 us 0.00 us 4089 RELEASE 0.00 0.00 us 0.00 us 0.00 us 44272 RELEASEDIR 0.18 42.97 us 1.09 us 1138.82 us 2893 OPENDIR 0.42 611.92 us 15.33 us 7388.53 us 479 INODELK 0.56 679.34 us 15.23 us 2483.07 us 574 READDIR 0.61 2170.81 us 48.25 us 13138.61 us 195 OPEN 0.82 1066.87 us 8.75 us 13801.35 us 535 GETXATTR 4.89 28.84 us 10.98 us 51214.69 us 118373 FINODELK 9.18 35.04 us 14.64 us 81798.39 us 182886 WRITE 18.04 506.78 us 12.31 us 165781.70 us 24847 LOOKUP 22.07 130.09 us 53.80 us 40959.22 us 118386 FXATTROP 43.23 512.45 us 38.18 us 285202.84 us 58862 FSYNC Duration: 60363 seconds Data Read: 0 bytes Data Written: 3513729 bytes Interval 9 Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 27366 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 21 RELEASE 0.00 0.00 us 0.00 us 0.00 us 500 RELEASEDIR 0.02 82.18 us 48.25 us 161.40 us 21 OPEN 0.02 41.30 us 16.13 us 156.19 us 65 INODELK 0.08 136.77 us 13.02 us 909.03 us 66 GETXATTR 0.20 43.34 us 1.27 us 277.72 us 500 OPENDIR 0.33 428.51 us 15.42 us 1215.07 us 82 READDIR 5.70 29.84 us 11.88 us 2949.68 us 20656 FINODELK 9.15 36.14 us 14.64 us 4606.43 us 27366 WRITE 11.51 283.22 us 12.31 us 53047.02 us 4395 LOOKUP 26.02 136.12 us 53.80 us 40959.22 us 20670 FXATTROP 46.96 497.27 us 40.09 us 274185.32 us 10211 FSYNC Duration: 740 seconds Data Read: 0 bytes Data Written: 27366 bytes Brick: 10.32.9.21:/data0/gfs/bricks/brick1/ovirt-data ----------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 11 66 2 No. of Writes: 2 31826 326701 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 4 116933 302613 No. of Writes: 53880 1410548 1140566 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 11546 1749 76356 No. of Writes: 479855 114971 144225 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 1906 4093 24 No. of Writes: 105312 165416 8915 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 2 FORGET 0.00 0.00 us 0.00 us 0.00 us 6820 RELEASE 0.00 0.00 us 0.00 us 0.00 us 44299 RELEASEDIR 0.03 42.29 us 13.86 us 9356.79 us 2425 STAT 0.04 46.94 us 1.04 us 772.13 us 2893 OPENDIR 0.04 314.86 us 10.29 us 2391.56 us 479 GETXATTR 0.05 207.42 us 14.29 us 5815.42 us 799 INODELK 0.05 51.40 us 28.18 us 540.78 us 3666 FSTAT 0.09 39.82 us 18.62 us 9889.16 us 7757 STATFS 0.12 1358.64 us 43.37 us 233429.90 us 322 OPEN 0.14 851.72 us 15.78 us 4414.06 us 574 READDIR 0.16 224.28 us 143.72 us 3249.69 us 2482 READDIRP 1.46 30.22 us 10.80 us 110711.59 us 173110 FINODELK 4.32 601.98 us 15.19 us 91847.23 us 25659 LOOKUP 6.30 130.39 us 49.28 us 232146.00 us 172711 FXATTROP 8.84 378.85 us 35.58 us 430356.59 us 83395 FSYNC 23.35 525.84 us 73.80 us 494782.91 us 158694 WRITE 55.01 4360.84 us 78.00 us 503162.29 us 45075 READ Duration: 60363 seconds Data Read: 10404548654 bytes Data Written: 133624068438 bytes Interval 9 Stats: Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 0 0 0 No. of Writes: 394 387 86 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 31 563 12 No. of Writes: 5905 10055 2308 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 7 368 1 No. of Writes: 763 1465 1637 Block Size: 262144b+ 524288b+ No. of Reads: 1 0 No. of Writes: 2759 73 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 70 RELEASE 0.00 0.00 us 0.00 us 0.00 us 500 RELEASEDIR 0.02 42.46 us 14.29 us 589.73 us 210 INODELK 0.03 34.11 us 20.98 us 177.92 us 413 STAT 0.04 298.66 us 10.97 us 1498.98 us 63 GETXATTR 0.06 48.48 us 1.17 us 120.46 us 500 OPENDIR 0.08 52.55 us 31.44 us 108.17 us 637 FSTAT 0.12 37.70 us 18.62 us 406.46 us 1361 STATFS 0.17 871.79 us 16.50 us 2028.77 us 82 READDIR 0.23 227.43 us 145.28 us 3249.69 us 435 READDIRP 0.56 3427.01 us 43.37 us 233429.90 us 70 OPEN 1.62 30.00 us 11.97 us 3260.32 us 23206 FINODELK 8.55 159.50 us 72.57 us 232146.00 us 23005 FXATTROP 8.94 849.78 us 16.81 us 91847.23 us 4515 LOOKUP 24.77 959.75 us 38.69 us 430356.59 us 11072 FSYNC 24.89 10859.72 us 153.99 us 250511.54 us 983 READ 29.91 496.71 us 73.80 us 453768.34 us 25832 WRITE Duration: 740 seconds Data Read: 30142464 bytes Data Written: 1776420864 bytes Brick: 10.32.9.20:/data/gfs/bricks/bricka/ovirt-data ---------------------------------------------------- Cumulative Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 3979583 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 2 FORGET 0.00 0.00 us 0.00 us 0.00 us 6630 RELEASE 0.00 0.00 us 0.00 us 0.00 us 36900 RELEASEDIR 0.01 57.48 us 12.65 us 329.66 us 84 GETXATTR 0.02 193.66 us 14.90 us 878.82 us 70 READDIR 0.05 105.94 us 41.17 us 683.04 us 322 OPEN 0.06 54.19 us 15.72 us 5135.95 us 800 INODELK 0.19 54.37 us 1.60 us 1035.48 us 2641 OPENDIR 7.68 32.94 us 11.38 us 68417.92 us 173114 FINODELK 9.52 44.57 us 14.70 us 55440.51 us 158694 WRITE 24.39 712.95 us 16.53 us 280142.79 us 25407 LOOKUP 27.40 243.98 us 34.94 us 251521.50 us 83395 FSYNC 30.68 131.93 us 50.81 us 55731.00 us 172711 FXATTROP Duration: 57920 seconds Data Read: 0 bytes Data Written: 3979583 bytes Interval 9 Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 25832 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 70 RELEASE 0.00 0.00 us 0.00 us 0.00 us 464 RELEASEDIR 0.01 62.34 us 15.20 us 116.82 us 17 GETXATTR 0.02 199.99 us 16.79 us 778.14 us 10 READDIR 0.10 114.77 us 46.50 us 683.04 us 70 OPEN 0.22 86.24 us 16.71 us 5135.95 us 211 INODELK 0.32 56.51 us 1.98 us 1035.48 us 464 OPENDIR 8.88 31.82 us 12.28 us 7988.05 us 23206 FINODELK 11.60 37.32 us 15.06 us 2981.61 us 25832 WRITE 12.08 224.23 us 20.07 us 39256.75 us 4479 LOOKUP 28.45 213.58 us 40.22 us 94343.80 us 11072 FSYNC 38.31 138.39 us 69.94 us 3069.85 us 23005 FXATTROP Duration: 740 seconds Data Read: 0 bytes Data Written: 25832 bytes Brick: 10.32.9.20:/data0/gfs/bricks/brick1/ovirt-data ----------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 4 109 21 No. of Writes: 0 2901 273197 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 42 112256 166124 No. of Writes: 36872 1361504 875644 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 12144 1370 69995 No. of Writes: 293109 93776 162079 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 1466 2942 9 No. of Writes: 161749 236364 7941 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 3097 RELEASE 0.00 0.00 us 0.00 us 0.00 us 36900 RELEASEDIR 0.00 53.86 us 12.74 us 546.30 us 60 GETXATTR 0.00 211.97 us 16.62 us 988.45 us 70 READDIR 0.00 101.31 us 39.98 us 497.56 us 195 OPEN 0.00 46.92 us 14.96 us 3469.09 us 481 INODELK 0.01 52.86 us 1.65 us 1573.96 us 2641 OPENDIR 0.03 216.49 us 152.23 us 2562.55 us 2482 READDIRP 0.03 73.86 us 18.77 us 125905.96 us 7757 STATFS 0.06 111.91 us 16.04 us 655589.61 us 10152 FSTAT 0.07 542.53 us 12.89 us 523421.21 us 2425 STAT 1.10 803.43 us 18.50 us 1534952.31 us 24595 LOOKUP 1.16 177.03 us 11.27 us 1749236.34 us 118375 FINODELK 1.44 218.66 us 58.80 us 1784231.76 us 118390 FXATTROP 13.72 4194.48 us 39.91 us 2743546.94 us 58865 FSYNC 36.03 14004.46 us 79.14 us 2966713.52 us 46303 READ 46.33 4558.98 us 77.68 us 2638579.30 us 182887 WRITE Duration: 57920 seconds Data Read: 8237195368 bytes Data Written: 169056832000 bytes Interval 9 Stats: Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 0 0 0 No. of Writes: 28 1201 202 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 131 256 14 No. of Writes: 11370 7637 1887 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 2 277 4 No. of Writes: 518 690 1562 Block Size: 262144b+ 524288b+ No. of Reads: 2 0 No. of Writes: 2221 50 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 21 RELEASE 0.00 0.00 us 0.00 us 0.00 us 464 RELEASEDIR 0.00 47.44 us 13.63 us 73.33 us 12 GETXATTR 0.00 212.66 us 19.17 us 803.07 us 10 READDIR 0.00 118.97 us 59.03 us 352.73 us 21 OPEN 0.00 45.59 us 20.23 us 248.98 us 65 INODELK 0.01 56.79 us 1.86 us 1008.29 us 464 OPENDIR 0.03 48.42 us 20.11 us 1484.22 us 1764 FSTAT 0.04 214.84 us 152.23 us 558.38 us 435 READDIRP 0.06 113.06 us 19.98 us 96371.03 us 1361 STATFS 0.07 470.41 us 15.68 us 177048.08 us 413 STAT 0.61 369.50 us 21.75 us 98568.83 us 4359 LOOKUP 0.84 107.63 us 11.27 us 422960.68 us 20656 FINODELK 1.50 191.62 us 58.80 us 669097.82 us 20670 FXATTROP 15.43 59246.36 us 108.59 us 2819788.24 us 686 READ 15.75 4062.95 us 40.23 us 1993844.42 us 10211 FSYNC 65.65 6319.81 us 80.06 us 2441596.43 us 27366 WRITE Duration: 740 seconds Data Read: 22843392 bytes Data Written: 1466518016 bytes Brick: 10.32.9.3:/data/gfs/bricks/brick3/ovirt-data --------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 4 131 8 No. of Writes: 0 5640 191078 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 16 81419 161095 No. of Writes: 29947 1275560 712585 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 17291 1342 103864 No. of Writes: 286032 45424 94648 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 1022 1870 19 No. of Writes: 88659 100478 6790 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 3665 RELEASE 0.00 0.00 us 0.00 us 0.00 us 40654 RELEASEDIR 0.00 77.86 us 14.39 us 255.33 us 58 GETXATTR 0.00 55.93 us 18.71 us 185.55 us 110 FLUSH 0.00 148.39 us 76.71 us 306.67 us 124 REMOVEXATTR 0.00 156.41 us 94.69 us 237.19 us 124 SETATTR 0.00 282.94 us 26.64 us 1991.99 us 72 READDIR 0.01 249.61 us 62.86 us 1608.99 us 250 OPEN 0.02 108.18 us 23.05 us 1656.32 us 942 INODELK 0.03 73.13 us 23.73 us 337.63 us 2425 STAT 0.04 93.83 us 1.99 us 17101.96 us 2641 OPENDIR 0.10 78.77 us 24.79 us 1132.37 us 7757 STATFS 0.14 108.18 us 41.73 us 4078.40 us 7332 FSTAT 0.18 262.35 us 91.17 us 5148.11 us 3890 READDIRP 2.78 59.76 us 13.90 us 60015.05 us 270884 FINODELK 3.23 739.71 us 25.36 us 119501.01 us 25436 LOOKUP 7.72 333.10 us 45.00 us 283828.60 us 134874 FSYNC 9.10 195.46 us 67.03 us 157955.41 us 271003 FXATTROP 19.48 716.64 us 94.11 us 340140.18 us 158241 WRITE 57.15 8361.71 us 110.31 us 596087.45 us 39783 READ Duration: 60363 seconds Data Read: 9818510392 bytes Data Written: 84404067328 bytes Interval 9 Stats: Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 0 0 0 No. of Writes: 64 279 45 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 83 367 8 No. of Writes: 14494 5264 1377 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 4 586 0 No. of Writes: 316 548 1064 Block Size: 262144b+ 524288b+ No. of Reads: 0 0 No. of Writes: 1620 39 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 84 RELEASE 0.00 0.00 us 0.00 us 0.00 us 464 RELEASEDIR 0.00 84.14 us 26.99 us 150.82 us 9 GETXATTR 0.00 52.66 us 18.71 us 108.64 us 22 FLUSH 0.00 187.96 us 35.03 us 565.44 us 11 READDIR 0.01 139.45 us 77.67 us 239.13 us 24 REMOVEXATTR 0.01 152.96 us 94.69 us 236.75 us 24 SETATTR 0.05 67.18 us 24.77 us 302.92 us 413 STAT 0.06 395.71 us 67.77 us 1608.99 us 84 OPEN 0.06 82.30 us 2.19 us 210.85 us 464 OPENDIR 0.09 186.01 us 24.60 us 1656.32 us 297 INODELK 0.18 78.19 us 27.95 us 1050.83 us 1361 STATFS 0.25 117.05 us 45.04 us 4078.40 us 1274 FSTAT 0.30 264.20 us 102.02 us 5148.11 us 682 READDIRP 2.07 271.71 us 35.32 us 22720.97 us 4581 LOOKUP 4.81 62.33 us 16.26 us 6962.45 us 46436 FINODELK 12.97 337.03 us 54.20 us 221094.08 us 23144 FSYNC 15.31 198.37 us 91.04 us 5197.06 us 46424 FXATTROP 24.57 588.49 us 97.75 us 228091.00 us 25110 WRITE 39.26 22524.25 us 112.00 us 551619.18 us 1048 READ Duration: 740 seconds Data Read: 42180608 bytes Data Written: 1032566784 bytes -------------- next part -------------- Brick: 10.32.9.9:/data0/gfs/bricks/brick1/ovirt-data ---------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 11 197 9 No. of Writes: 2 5298 50158 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 22 152814 155742 No. of Writes: 10281 141128 229969 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 19592 2856 34807 No. of Writes: 46540 15874 16618 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 41083 7619 43 No. of Writes: 12325 19939 1278 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 2 FORGET 0.00 0.00 us 0.00 us 0.00 us 1163 RELEASE 0.00 0.00 us 0.00 us 0.00 us 8997 RELEASEDIR 0.01 42.50 us 1.07 us 200.99 us 2015 OPENDIR 0.03 1205.44 us 147.78 us 3219.56 us 160 READDIRP 0.03 38.21 us 19.08 us 7544.91 us 5346 STATFS 0.03 947.57 us 42.15 us 5689.21 us 221 OPEN 0.03 536.56 us 11.61 us 5405.25 us 416 GETXATTR 0.06 42.40 us 15.94 us 3408.97 us 9626 FSTAT 0.06 992.07 us 13.79 us 3583.18 us 440 READDIR 2.07 784.10 us 10.19 us 100292.79 us 17781 LOOKUP 2.58 128.10 us 48.36 us 73127.27 us 135547 FXATTROP 2.75 282.45 us 36.68 us 403763.55 us 65559 FSYNC 5.30 72558.78 us 13.93 us 1840271.15 us 491 INODELK 7.97 450.94 us 75.16 us 368078.53 us 118790 WRITE 24.45 1212.26 us 11.70 us 1850709.72 us 135580 FINODELK 54.60 3082.33 us 72.64 us 387813.52 us 119063 READ Duration: 10748 seconds Data Read: 13585666193 bytes Data Written: 16779903830 bytes Interval 6 Stats: Block Size: 512b+ 1024b+ 4096b+ No. of Reads: 0 0 1151 No. of Writes: 13 2 405 Block Size: 8192b+ 16384b+ 32768b+ No. of Reads: 580 10 0 No. of Writes: 357 115 16 Block Size: 65536b+ 131072b+ 262144b+ No. of Reads: 314 96 0 No. of Writes: 25 19 62 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 9 RELEASE 0.00 0.00 us 0.00 us 0.00 us 25 RELEASEDIR 0.00 42.88 us 11.61 us 65.94 us 6 GETXATTR 0.01 95.97 us 52.28 us 174.48 us 9 OPEN 0.01 43.61 us 1.83 us 68.31 us 25 OPENDIR 0.01 30.47 us 22.02 us 42.78 us 53 STATFS 0.01 291.00 us 148.50 us 772.43 us 6 READDIR 0.01 2255.97 us 2255.97 us 2255.97 us 1 READDIRP 0.03 42.10 us 18.38 us 114.84 us 98 FSTAT 0.10 30.02 us 14.10 us 662.25 us 520 FINODELK 0.18 165.79 us 20.56 us 2137.59 us 173 LOOKUP 0.21 137.80 us 53.21 us 366.07 us 243 FSYNC 0.43 133.43 us 81.66 us 384.37 us 520 FXATTROP 4.53 713.51 us 75.16 us 101590.43 us 1014 WRITE 28.81 102208.45 us 16.11 us 931987.84 us 45 INODELK 65.66 4873.26 us 79.54 us 293295.82 us 2151 READ Duration: 26 seconds Data Read: 42790912 bytes Data Written: 40480256 bytes Brick: 10.32.9.6:/data/gfs/bricks/bricka/ovirt-data --------------------------------------------------- Cumulative Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 140764 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 2 FORGET 0.00 0.00 us 0.00 us 0.00 us 8678 RELEASE 0.00 0.00 us 0.00 us 0.00 us 10636 RELEASEDIR 0.00 21.03 us 16.15 us 34.88 us 8 ENTRYLK 0.00 270.94 us 270.94 us 270.94 us 1 MKNOD 0.01 323.61 us 13.73 us 2466.13 us 80 READDIR 0.02 326.98 us 11.55 us 9174.66 us 83 GETXATTR 0.09 65.78 us 16.75 us 4946.30 us 2332 FLUSH 0.09 85.68 us 1.51 us 4642.85 us 1834 OPENDIR 0.32 227.62 us 41.15 us 64374.98 us 2476 OPEN 0.45 2256.77 us 15.05 us 63381.45 us 347 INODELK 0.97 56.53 us 10.33 us 5526.47 us 29676 FINODELK 1.78 74.73 us 15.84 us 5393.43 us 41433 WRITE 3.57 320.06 us 15.52 us 115626.36 us 19340 LOOKUP 3.89 227.62 us 77.62 us 68580.08 us 29634 FXATTROP 88.81 9879.97 us 149.50 us 367307.05 us 15606 FSYNC Duration: 12346 seconds Data Read: 0 bytes Data Written: 140764 bytes Interval 6 Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 174 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 25 RELEASE 0.00 0.00 us 0.00 us 0.00 us 25 RELEASEDIR 0.04 73.34 us 20.62 us 134.34 us 5 GETXATTR 0.24 82.36 us 46.64 us 190.61 us 25 OPEN 0.31 437.89 us 272.99 us 1086.34 us 6 READDIR 0.43 184.47 us 19.85 us 2374.95 us 20 FLUSH 0.47 235.68 us 19.14 us 1331.71 us 17 INODELK 0.58 197.88 us 2.13 us 1981.61 us 25 OPENDIR 1.50 115.21 us 18.93 us 1851.08 us 112 FINODELK 2.46 194.41 us 96.77 us 1656.53 us 109 FXATTROP 5.75 284.33 us 23.78 us 1087.73 us 174 WRITE 6.46 295.65 us 21.14 us 19073.84 us 188 LOOKUP 81.76 12553.12 us 218.24 us 73123.96 us 56 FSYNC Duration: 26 seconds Data Read: 0 bytes Data Written: 174 bytes Brick: 10.32.9.4:/data/gfs/bricks/brick1/ovirt-data --------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 9195 6 0 No. of Writes: 3 700 2043 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 0 32431 16795 No. of Writes: 623 67451 42803 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 1749 317 6006 No. of Writes: 12604 1347 5850 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 565 3076 3 No. of Writes: 2324 2469 893 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 2 FORGET 0.00 0.00 us 0.00 us 0.00 us 8579 RELEASE 0.00 0.00 us 0.00 us 0.00 us 8397 RELEASEDIR 0.00 22.96 us 15.97 us 51.06 us 8 ENTRYLK 0.00 260.50 us 260.50 us 260.50 us 1 MKNOD 0.00 93.03 us 16.42 us 2701.10 us 347 INODELK 0.00 128.90 us 7.61 us 1129.71 us 284 GETXATTR 0.01 300.85 us 12.41 us 3140.45 us 290 READDIR 0.01 57.69 us 13.08 us 46431.20 us 2334 FLUSH 0.01 70.20 us 0.94 us 17715.27 us 1939 OPENDIR 0.02 89.64 us 11.70 us 19019.90 us 1691 STAT 0.03 132.06 us 42.23 us 33323.48 us 2478 OPEN 0.03 62.90 us 16.20 us 27740.94 us 5346 STATFS 0.03 202.21 us 119.54 us 17249.34 us 1709 READDIRP 0.36 120.14 us 11.12 us 301165.99 us 29660 FINODELK 1.20 614.44 us 13.29 us 263413.97 us 19453 LOOKUP 1.27 427.97 us 71.48 us 563688.45 us 29634 FXATTROP 9.41 3840.22 us 82.10 us 469183.97 us 24493 READ 27.25 17452.64 us 92.85 us 483169.32 us 15606 FSYNC 60.36 14557.60 us 149.97 us 759284.50 us 41437 WRITE Duration: 10499 seconds Data Read: 2314290390 bytes Data Written: 3195953450 bytes Interval 6 Stats: Block Size: 256b+ 512b+ 4096b+ No. of Reads: 23 0 571 No. of Writes: 0 3 13 Block Size: 8192b+ 16384b+ 32768b+ No. of Reads: 52 0 0 No. of Writes: 128 13 1 Block Size: 65536b+ 131072b+ 262144b+ No. of Reads: 46 0 0 No. of Writes: 7 7 2 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 25 RELEASE 0.00 0.00 us 0.00 us 0.00 us 25 RELEASEDIR 0.01 68.94 us 10.11 us 221.91 us 6 GETXATTR 0.01 30.00 us 15.66 us 54.39 us 20 FLUSH 0.02 44.60 us 20.24 us 95.68 us 15 STAT 0.02 41.06 us 19.29 us 103.84 us 17 INODELK 0.02 162.01 us 110.86 us 228.45 us 6 READDIR 0.03 45.60 us 1.45 us 89.10 us 25 OPENDIR 0.04 75.73 us 44.53 us 190.60 us 25 OPEN 0.04 36.62 us 16.52 us 66.38 us 53 STATFS 0.09 185.21 us 136.70 us 275.71 us 21 READDIRP 0.11 44.70 us 19.04 us 91.96 us 111 FINODELK 0.37 148.06 us 86.69 us 324.91 us 109 FXATTROP 1.53 355.20 us 16.63 us 41956.49 us 188 LOOKUP 25.17 19663.37 us 213.60 us 89492.03 us 56 FSYNC 28.20 7089.90 us 177.90 us 39757.75 us 174 WRITE 44.33 2802.26 us 104.46 us 49721.81 us 692 READ Duration: 26 seconds Data Read: 5790220 bytes Data Written: 4044288 bytes Brick: 10.32.9.5:/data/gfs/bricks/brick1/ovirt-data --------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 2 458 87 No. of Writes: 3 703 2052 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 54 15030 33939 No. of Writes: 623 68563 43056 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 3940 540 7475 No. of Writes: 12791 1436 5850 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 2395 6141 15 No. of Writes: 22702 2469 893 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 2 FORGET 0.00 0.00 us 0.00 us 0.00 us 8678 RELEASE 0.00 0.00 us 0.00 us 0.00 us 10597 RELEASEDIR 0.00 158.97 us 158.97 us 158.97 us 1 MKNOD 0.00 393.70 us 9.69 us 2344.17 us 8 ENTRYLK 0.00 126.39 us 10.54 us 281.56 us 80 READDIR 0.00 1956.79 us 6.43 us 155911.98 us 82 GETXATTR 0.00 3315.24 us 139.26 us 170764.63 us 160 READDIRP 0.03 4020.12 us 15.61 us 469284.63 us 1164 FSTAT 0.04 2583.09 us 28.06 us 204742.02 us 2476 OPEN 0.04 2825.94 us 8.33 us 590785.10 us 2332 FLUSH 0.07 5673.25 us 0.80 us 527903.92 us 1832 OPENDIR 0.09 2595.01 us 11.48 us 382537.47 us 5342 STATFS 0.14 59496.60 us 8.34 us 1543417.83 us 347 INODELK 0.40 2031.08 us 6.84 us 1127299.99 us 29607 FINODELK 0.57 4484.44 us 11.60 us 968588.21 us 19334 LOOKUP 2.00 10241.03 us 53.88 us 1880631.52 us 29585 FXATTROP 2.57 29123.91 us 63.31 us 3473303.89 us 13399 READ 11.80 115055.07 us 122.19 us 2279735.88 us 15581 FSYNC 82.25 301692.99 us 98.78 us 4500846.60 us 41400 WRITE Duration: 12343 seconds Data Read: 4270911318 bytes Data Written: 5880060714 bytes Interval 6 Stats: Block Size: 512b+ 4096b+ 8192b+ No. of Reads: 0 29 109 No. of Writes: 3 13 128 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 15 0 47 No. of Writes: 13 1 7 Block Size: 131072b+ 262144b+ No. of Reads: 0 0 No. of Writes: 7 2 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 25 RELEASE 0.00 0.00 us 0.00 us 0.00 us 25 RELEASEDIR 0.00 40.48 us 6.94 us 92.02 us 6 GETXATTR 0.00 28.47 us 10.19 us 50.39 us 20 FLUSH 0.00 53.78 us 31.59 us 76.06 us 12 FSTAT 0.00 129.00 us 88.83 us 161.83 us 6 READDIR 0.00 47.94 us 0.89 us 88.84 us 25 OPENDIR 0.00 1499.83 us 1499.83 us 1499.83 us 1 READDIRP 0.04 1333.68 us 30.78 us 31125.60 us 25 OPEN 0.05 836.76 us 12.08 us 23439.58 us 53 STATFS 0.11 872.63 us 8.21 us 56495.95 us 111 FINODELK 0.88 6920.74 us 68.16 us 625281.67 us 109 FXATTROP 1.02 4614.95 us 13.66 us 348629.63 us 188 LOOKUP 1.65 83057.19 us 12.45 us 658978.30 us 17 INODELK 6.49 27709.93 us 98.14 us 471332.27 us 200 READ 8.08 123206.43 us 267.50 us 979136.22 us 56 FSYNC 81.66 396159.13 us 232.40 us 1353202.04 us 176 WRITE Duration: 26 seconds Data Read: 4341760 bytes Data Written: 4044288 bytes Brick: 10.32.9.7:/data/gfs/bricks/brick1/ovirt-data --------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 8 146 10 No. of Writes: 0 822 17574 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 12 147605 81020 No. of Writes: 3335 177490 110247 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 12618 1795 18321 No. of Writes: 30013 5366 10235 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 2188 4008 20 No. of Writes: 8375 8875 585 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 608 RELEASE 0.00 0.00 us 0.00 us 0.00 us 8795 RELEASEDIR 0.00 20.59 us 14.68 us 25.91 us 75 FLUSH 0.00 77.89 us 58.31 us 94.39 us 85 REMOVEXATTR 0.00 85.37 us 64.65 us 156.44 us 85 SETATTR 0.00 86.81 us 42.56 us 643.87 us 146 OPEN 0.00 127.62 us 6.67 us 943.71 us 165 GETXATTR 0.00 507.62 us 14.32 us 1787.62 us 201 READDIR 0.01 1235.24 us 26.42 us 3832.88 us 160 READDIRP 0.06 244.53 us 16.13 us 351764.07 us 5346 STATFS 0.10 1171.75 us 1.17 us 1068559.38 us 1895 OPENDIR 0.13 406.39 us 25.42 us 1365160.22 us 7375 FSTAT 0.53 20755.68 us 13.33 us 887933.44 us 564 INODELK 1.46 170.00 us 45.86 us 2503412.21 us 190741 FXATTROP 1.53 178.24 us 11.10 us 3938189.55 us 190604 FINODELK 1.82 425.12 us 32.12 us 663207.46 us 94843 FSYNC 11.43 2187.65 us 72.21 us 4245933.36 us 116005 WRITE 16.73 21132.03 us 14.12 us 4031333.55 us 17576 LOOKUP 66.20 15737.71 us 69.72 us 6429153.50 us 93418 READ Duration: 11011 seconds Data Read: 4863622768 bytes Data Written: 8447544320 bytes Interval 6 Stats: Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 0 0 0 No. of Writes: 3 109 5 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 6829 259 5 No. of Writes: 228 230 13 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 0 175 0 No. of Writes: 4 2 12 Block Size: 262144b+ 524288b+ No. of Reads: 0 0 No. of Writes: 12 2 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 7 RELEASE 0.00 0.00 us 0.00 us 0.00 us 25 RELEASEDIR 0.00 74.17 us 74.17 us 74.17 us 1 REMOVEXATTR 0.00 89.45 us 89.45 us 89.45 us 1 SETATTR 0.00 33.25 us 16.33 us 64.78 us 4 GETXATTR 0.00 70.61 us 52.42 us 153.99 us 7 OPEN 0.00 37.55 us 1.95 us 64.59 us 25 OPENDIR 0.00 30.14 us 18.44 us 44.04 us 53 STATFS 0.00 275.53 us 155.69 us 765.06 us 6 READDIR 0.00 2352.74 us 2352.74 us 2352.74 us 1 READDIRP 0.01 43.65 us 29.55 us 73.65 us 76 FSTAT 0.05 149.19 us 25.96 us 227.32 us 171 LOOKUP 0.06 25.33 us 11.55 us 59.71 us 1236 FINODELK 0.15 130.28 us 50.48 us 244.99 us 609 FSYNC 0.26 113.48 us 78.80 us 565.69 us 1237 FXATTROP 0.27 237.31 us 80.85 us 2140.21 us 620 WRITE 10.35 142923.57 us 15.02 us 887933.44 us 39 INODELK 88.84 6582.39 us 75.54 us 3820683.07 us 7268 READ Duration: 26 seconds Data Read: 41644032 bytes Data Written: 11333120 bytes Brick: 10.32.9.21:/data/gfs/bricks/bricka/ovirt-data ---------------------------------------------------- Cumulative Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 583084 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 1 FORGET 0.00 0.00 us 0.00 us 0.00 us 1506 RELEASE 0.00 0.00 us 0.00 us 0.00 us 11330 RELEASEDIR 0.16 41.96 us 1.09 us 174.18 us 2000 OPENDIR 0.53 882.55 us 15.33 us 7388.53 us 313 INODELK 0.61 785.26 us 15.23 us 2483.07 us 410 READDIR 0.78 2944.84 us 50.45 us 13138.61 us 139 OPEN 1.04 1439.59 us 8.75 us 13801.35 us 379 GETXATTR 4.69 28.29 us 10.98 us 51214.69 us 87008 FINODELK 9.29 34.56 us 14.65 us 81798.39 us 141069 WRITE 19.62 601.79 us 13.34 us 82349.56 us 17113 LOOKUP 21.24 128.13 us 74.02 us 5590.59 us 87026 FXATTROP 42.05 509.22 us 38.18 us 285202.84 us 43355 FSYNC Duration: 12343 seconds Data Read: 0 bytes Data Written: 583084 bytes Interval 6 Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 820 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 14 RELEASE 0.00 0.00 us 0.00 us 0.00 us 25 RELEASEDIR 0.26 54.89 us 17.77 us 76.95 us 9 GETXATTR 0.54 73.00 us 50.45 us 123.40 us 14 OPEN 0.55 41.66 us 1.43 us 58.76 us 25 OPENDIR 0.88 278.14 us 157.59 us 394.86 us 6 READDIR 0.97 40.13 us 17.39 us 342.49 us 46 INODELK 9.81 34.88 us 16.20 us 3058.74 us 534 FINODELK 14.96 34.64 us 16.51 us 177.38 us 820 WRITE 17.10 168.21 us 24.51 us 323.55 us 193 LOOKUP 17.52 127.46 us 46.97 us 299.68 us 261 FSYNC 37.41 133.00 us 80.38 us 341.49 us 534 FXATTROP Duration: 26 seconds Data Read: 0 bytes Data Written: 820 bytes Brick: 10.32.9.21:/data0/gfs/bricks/brick1/ovirt-data ----------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 11 66 2 No. of Writes: 2 5586 50158 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 4 52513 93336 No. of Writes: 10281 142655 230190 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 7241 1565 19639 No. of Writes: 46705 15906 16639 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 1874 4071 24 No. of Writes: 12317 19939 1278 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 2 FORGET 0.00 0.00 us 0.00 us 0.00 us 1255 RELEASE 0.00 0.00 us 0.00 us 0.00 us 11357 RELEASEDIR 0.03 45.76 us 13.86 us 9356.79 us 1679 STAT 0.03 46.01 us 1.04 us 772.13 us 2000 OPENDIR 0.04 326.97 us 10.29 us 2391.56 us 353 GETXATTR 0.05 51.07 us 29.86 us 540.78 us 2522 FSTAT 0.05 313.16 us 16.18 us 5815.42 us 491 INODELK 0.07 883.53 us 45.88 us 6148.85 us 221 OPEN 0.08 41.10 us 18.98 us 9889.16 us 5346 STATFS 0.12 856.40 us 15.78 us 4414.06 us 410 READDIR 0.14 225.48 us 143.72 us 926.43 us 1710 READDIRP 1.41 29.62 us 10.80 us 110711.59 us 135572 FINODELK 3.66 587.34 us 15.19 us 61302.68 us 17766 LOOKUP 5.95 125.02 us 49.28 us 17686.93 us 135539 FXATTROP 6.44 279.71 us 35.58 us 407061.84 us 65554 FSYNC 19.89 476.93 us 75.41 us 440395.70 us 118787 WRITE 62.04 4088.32 us 78.00 us 503162.29 us 43217 READ Duration: 12343 seconds Data Read: 4610277422 bytes Data Written: 16792466262 bytes Interval 6 Stats: Block Size: 512b+ 1024b+ 4096b+ No. of Reads: 0 0 70 No. of Writes: 13 2 405 Block Size: 8192b+ 16384b+ 32768b+ No. of Reads: 342 5 0 No. of Writes: 357 115 16 Block Size: 65536b+ 131072b+ 262144b+ No. of Reads: 168 0 0 No. of Writes: 25 19 62 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 9 RELEASE 0.00 0.00 us 0.00 us 0.00 us 25 RELEASEDIR 0.01 35.71 us 24.38 us 65.77 us 15 STAT 0.02 44.49 us 1.68 us 67.87 us 25 OPENDIR 0.02 126.43 us 52.25 us 398.25 us 9 OPEN 0.02 51.99 us 32.93 us 109.37 us 26 FSTAT 0.03 256.51 us 145.04 us 360.84 us 6 READDIR 0.03 33.20 us 20.54 us 47.63 us 53 STATFS 0.03 185.61 us 12.72 us 1214.94 us 10 GETXATTR 0.07 94.06 us 17.46 us 2439.30 us 45 INODELK 0.08 213.94 us 177.04 us 383.69 us 21 READDIRP 0.26 29.48 us 15.33 us 71.02 us 521 FINODELK 0.48 159.48 us 21.99 us 327.94 us 173 LOOKUP 1.22 136.16 us 84.01 us 461.36 us 521 FXATTROP 4.60 1098.76 us 46.07 us 205950.17 us 243 FSYNC 8.03 459.44 us 85.85 us 21485.20 us 1014 WRITE 85.10 8442.50 us 119.17 us 309111.15 us 585 READ Duration: 26 seconds Data Read: 14180352 bytes Data Written: 40480256 bytes Brick: 10.32.9.20:/data0/gfs/bricks/brick1/ovirt-data ----------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 4 109 21 No. of Writes: 0 460 54715 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 42 45163 54896 No. of Writes: 8883 212986 162092 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 7020 1175 20888 No. of Writes: 48288 16340 24443 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 1408 2923 9 No. of Writes: 19159 26333 792 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 514 RELEASE 0.00 0.00 us 0.00 us 0.00 us 6838 RELEASEDIR 0.00 59.50 us 14.59 us 546.30 us 41 GETXATTR 0.00 220.10 us 16.62 us 988.45 us 50 READDIR 0.00 104.05 us 39.98 us 497.56 us 139 OPEN 0.00 48.49 us 15.80 us 3469.09 us 315 INODELK 0.01 50.94 us 1.65 us 1573.96 us 1820 OPENDIR 0.03 215.71 us 153.61 us 2562.55 us 1710 READDIRP 0.03 69.86 us 18.77 us 125905.96 us 5346 STATFS 0.07 141.42 us 16.04 us 655589.61 us 6984 FSTAT 0.08 659.41 us 12.89 us 523421.21 us 1679 STAT 1.24 1008.30 us 19.19 us 1534952.31 us 16933 LOOKUP 1.30 204.65 us 11.71 us 1749236.34 us 87007 FINODELK 1.48 233.96 us 73.33 us 1784231.76 us 87026 FXATTROP 12.57 3983.58 us 39.91 us 2743546.94 us 43355 FSYNC 41.38 12734.21 us 79.14 us 2966713.52 us 44635 READ 41.80 4069.99 us 77.68 us 2638579.30 us 141069 WRITE Duration: 9900 seconds Data Read: 3717300328 bytes Data Written: 21265203712 bytes Interval 6 Stats: Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 0 0 0 No. of Writes: 3 39 2 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 381 294 20 No. of Writes: 83 467 47 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 0 264 0 No. of Writes: 32 35 41 Block Size: 262144b+ 524288b+ No. of Reads: 0 0 No. of Writes: 65 6 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 14 RELEASE 0.00 0.00 us 0.00 us 0.00 us 25 RELEASEDIR 0.00 55.58 us 15.93 us 93.98 us 8 GETXATTR 0.00 37.21 us 25.04 us 59.45 us 15 STAT 0.01 69.47 us 46.35 us 141.23 us 14 OPEN 0.01 26.67 us 18.22 us 49.45 us 46 INODELK 0.01 49.56 us 2.58 us 143.98 us 25 OPENDIR 0.01 34.60 us 18.77 us 126.90 us 53 STATFS 0.01 355.56 us 149.60 us 988.45 us 6 READDIR 0.02 41.77 us 21.14 us 73.50 us 72 FSTAT 0.02 215.13 us 169.65 us 274.27 us 21 READDIRP 0.08 29.04 us 13.69 us 66.67 us 534 FINODELK 0.17 160.73 us 23.54 us 321.76 us 193 LOOKUP 0.39 135.86 us 77.40 us 1627.16 us 534 FXATTROP 4.78 1072.78 us 86.42 us 197615.21 us 820 WRITE 19.40 13686.01 us 47.75 us 1796049.15 us 261 FSYNC 75.09 14420.80 us 94.10 us 2221701.71 us 959 READ Duration: 26 seconds Data Read: 21602304 bytes Data Written: 49301504 bytes Brick: 10.32.9.20:/data/gfs/bricks/bricka/ovirt-data ---------------------------------------------------- Cumulative Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 549022 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 2 FORGET 0.00 0.00 us 0.00 us 0.00 us 1065 RELEASE 0.00 0.00 us 0.00 us 0.00 us 6838 RELEASEDIR 0.01 53.35 us 12.82 us 228.91 us 57 GETXATTR 0.02 194.83 us 14.90 us 878.82 us 50 READDIR 0.04 43.27 us 15.72 us 322.38 us 491 INODELK 0.04 97.28 us 41.17 us 432.99 us 221 OPEN 0.16 53.03 us 1.74 us 501.22 us 1820 OPENDIR 7.40 33.07 us 11.38 us 68417.92 us 135575 FINODELK 9.15 46.66 us 14.70 us 55440.51 us 118787 WRITE 26.84 925.03 us 16.53 us 280142.79 us 17586 LOOKUP 27.29 252.34 us 34.94 us 251521.50 us 65555 FSYNC 29.08 130.03 us 50.81 us 55731.00 us 135539 FXATTROP Duration: 9900 seconds Data Read: 0 bytes Data Written: 549022 bytes Interval 6 Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 1014 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 9 RELEASE 0.00 0.00 us 0.00 us 0.00 us 25 RELEASEDIR 0.11 43.98 us 15.40 us 84.40 us 5 GETXATTR 0.44 95.58 us 51.84 us 164.84 us 9 OPEN 0.62 48.41 us 2.33 us 70.35 us 25 OPENDIR 0.72 31.10 us 17.28 us 117.76 us 45 INODELK 0.98 318.59 us 154.88 us 878.82 us 6 READDIR 9.45 35.24 us 14.72 us 2246.13 us 521 FINODELK 14.32 160.73 us 22.70 us 316.55 us 173 LOOKUP 17.04 135.61 us 56.40 us 1874.03 us 244 FSYNC 19.01 36.40 us 16.78 us 2947.68 us 1014 WRITE 37.30 139.06 us 85.48 us 1078.96 us 521 FXATTROP Duration: 26 seconds Data Read: 0 bytes Data Written: 1014 bytes Brick: 10.32.9.8:/data/gfs/bricks/bricka/ovirt-data --------------------------------------------------- Cumulative Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 372917 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 608 RELEASE 0.00 0.00 us 0.00 us 0.00 us 8719 RELEASEDIR 0.00 23.65 us 15.42 us 47.19 us 75 FLUSH 0.01 86.28 us 57.99 us 148.82 us 85 REMOVEXATTR 0.01 90.74 us 67.34 us 130.94 us 85 SETATTR 0.01 73.83 us 11.11 us 236.42 us 133 GETXATTR 0.02 83.84 us 41.98 us 499.27 us 146 OPEN 0.03 33.97 us 12.62 us 315.20 us 564 INODELK 0.06 316.19 us 16.57 us 1296.89 us 141 READDIR 0.13 49.96 us 1.25 us 911.95 us 1865 OPENDIR 7.61 27.61 us 11.45 us 86619.65 us 190604 FINODELK 9.35 55.71 us 14.57 us 117405.17 us 116005 WRITE 25.18 183.50 us 32.55 us 187376.40 us 94843 FSYNC 25.95 1022.20 us 15.51 us 136924.46 us 17546 LOOKUP 31.63 114.64 us 48.10 us 68557.92 us 190742 FXATTROP Duration: 11059 seconds Data Read: 0 bytes Data Written: 372917 bytes Interval 6 Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 620 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 7 RELEASE 0.00 0.00 us 0.00 us 0.00 us 25 RELEASEDIR 0.02 96.14 us 96.14 us 96.14 us 1 SETATTR 0.02 102.92 us 102.92 us 102.92 us 1 REMOVEXATTR 0.13 84.08 us 51.75 us 221.40 us 7 OPEN 0.13 61.82 us 16.08 us 159.35 us 10 GETXATTR 0.25 46.62 us 1.88 us 77.92 us 25 OPENDIR 0.27 32.45 us 17.59 us 152.92 us 39 INODELK 0.34 261.02 us 153.06 us 363.57 us 6 READDIR 4.63 34.65 us 16.37 us 80.81 us 620 WRITE 5.77 156.66 us 22.46 us 304.56 us 171 LOOKUP 7.34 27.58 us 14.15 us 781.04 us 1236 FINODELK 31.07 116.50 us 78.86 us 261.81 us 1238 FXATTROP 50.02 381.21 us 39.71 us 149969.67 us 609 FSYNC Duration: 26 seconds Data Read: 0 bytes Data Written: 620 bytes Brick: 10.32.9.8:/data0/gfs/bricks/brick1/ovirt-data ---------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 8 1096 109 No. of Writes: 0 460 54715 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 115 111912 132663 No. of Writes: 8883 212986 162092 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 27012 2879 35570 No. of Writes: 48288 16340 24443 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 3743 7207 32 No. of Writes: 19159 26333 792 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 1 FORGET 0.00 0.00 us 0.00 us 0.00 us 1137 RELEASE 0.00 0.00 us 0.00 us 0.00 us 8740 RELEASEDIR 0.00 43.88 us 10.83 us 250.84 us 124 GETXATTR 0.00 120.62 us 44.65 us 762.18 us 139 OPEN 0.01 206.40 us 12.21 us 864.86 us 141 READDIR 0.02 43.05 us 1.44 us 452.74 us 1865 OPENDIR 0.05 2104.16 us 1920.14 us 2569.20 us 85 READDIRP 0.05 36.26 us 19.25 us 347.13 us 5346 STATFS 0.08 35.13 us 16.14 us 340.33 us 8148 FSTAT 0.63 27.73 us 11.02 us 73986.88 us 87007 FINODELK 1.53 134.48 us 38.31 us 90956.10 us 43355 FSYNC 1.73 388.39 us 15.69 us 62037.95 us 16978 LOOKUP 2.63 31993.13 us 16.10 us 888210.80 us 314 INODELK 2.93 128.32 us 73.56 us 45501.15 us 87026 FXATTROP 8.70 235.34 us 76.95 us 172924.48 us 141069 WRITE 81.65 3612.62 us 74.71 us 427304.61 us 86246 READ Duration: 11059 seconds Data Read: 8285158628 bytes Data Written: 21265203712 bytes Interval 6 Stats: Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 0 0 0 No. of Writes: 3 39 2 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 4976 418 10 No. of Writes: 83 467 47 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 0 264 0 No. of Writes: 32 35 41 Block Size: 262144b+ 524288b+ No. of Reads: 0 0 No. 
of Writes: 65 6 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 14 RELEASE 0.00 0.00 us 0.00 us 0.00 us 25 RELEASEDIR 0.00 51.92 us 14.43 us 92.40 us 9 GETXATTR 0.01 40.96 us 1.87 us 66.77 us 25 OPENDIR 0.01 75.90 us 55.90 us 118.42 us 14 OPEN 0.01 236.50 us 152.98 us 361.08 us 6 READDIR 0.01 32.60 us 22.90 us 46.88 us 53 STATFS 0.01 2244.32 us 2244.32 us 2244.32 us 1 READDIRP 0.02 38.95 us 19.17 us 88.64 us 84 FSTAT 0.10 28.63 us 16.14 us 161.63 us 534 FINODELK 0.20 160.39 us 23.30 us 348.09 us 193 LOOKUP 0.38 226.64 us 49.98 us 20602.07 us 261 FSYNC 0.47 137.09 us 84.61 us 5477.48 us 534 FXATTROP 0.97 181.82 us 86.92 us 637.73 us 820 WRITE 28.68 96252.17 us 18.55 us 888210.80 us 46 INODELK 69.12 1882.74 us 79.08 us 157169.15 us 5668 READ Duration: 26 seconds Data Read: 41271296 bytes Data Written: 49301504 bytes Brick: 10.32.9.3:/data/gfs/bricks/brick3/ovirt-data --------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 4 131 8 No. of Writes: 0 822 17574 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 16 35286 49422 No. of Writes: 3335 177490 110247 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 12220 1028 23348 No. of Writes: 30013 5366 10235 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 974 1858 19 No. of Writes: 8375 8875 585 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 772 RELEASE 0.00 0.00 us 0.00 us 0.00 us 10592 RELEASEDIR 0.00 76.27 us 14.39 us 255.33 us 38 GETXATTR 0.00 56.50 us 19.21 us 185.55 us 75 FLUSH 0.00 150.43 us 76.71 us 292.39 us 85 REMOVEXATTR 0.00 157.91 us 97.76 us 237.19 us 85 SETATTR 0.00 308.86 us 26.64 us 1991.99 us 51 READDIR 0.01 183.93 us 62.86 us 1362.63 us 146 OPEN 0.01 73.43 us 23.05 us 777.89 us 564 INODELK 0.03 75.13 us 23.73 us 337.63 us 1679 STAT 0.03 89.35 us 1.99 us 1982.01 us 1820 OPENDIR 0.09 79.36 us 24.79 us 805.05 us 5346 STATFS 0.11 106.63 us 41.73 us 1740.16 us 5044 FSTAT 0.15 262.43 us 91.17 us 4453.28 us 2680 READDIRP 2.38 58.64 us 13.90 us 26031.44 us 190605 FINODELK 3.35 898.17 us 25.36 us 119501.01 us 17501 LOOKUP 6.60 326.75 us 45.00 us 283828.60 us 94843 FSYNC 7.89 194.27 us 67.03 us 157955.41 us 190743 FXATTROP 17.21 696.94 us 97.07 us 340140.18 us 116005 WRITE 62.13 7751.59 us 110.31 us 596087.45 us 37647 READ Duration: 12343 seconds Data Read: 3321340984 bytes Data Written: 8447544320 bytes Interval 6 Stats: Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 0 0 0 No. of Writes: 3 109 5 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 187 225 13 No. of Writes: 228 230 13 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 0 185 0 No. of Writes: 4 2 12 Block Size: 262144b+ 524288b+ No. of Reads: 0 0 No. of Writes: 12 2 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 7 RELEASE 0.00 0.00 us 0.00 us 0.00 us 25 RELEASEDIR 0.00 173.56 us 173.56 us 173.56 us 1 REMOVEXATTR 0.00 190.79 us 190.79 us 190.79 us 1 SETATTR 0.01 80.71 us 20.92 us 143.06 us 5 GETXATTR 0.01 163.10 us 113.71 us 277.38 us 7 OPEN 0.01 76.13 us 26.98 us 125.17 us 15 STAT 0.02 77.17 us 1.99 us 133.67 us 25 OPENDIR 0.03 61.12 us 30.85 us 227.60 us 39 INODELK 0.05 656.71 us 271.17 us 1991.99 us 6 READDIR 0.05 75.47 us 32.20 us 361.18 us 53 STATFS 0.07 110.87 us 59.06 us 160.47 us 52 FSTAT 0.11 249.69 us 108.39 us 447.56 us 34 READDIRP 0.50 231.83 us 29.43 us 677.06 us 171 LOOKUP 0.91 58.52 us 20.49 us 999.15 us 1238 FINODELK 2.45 319.84 us 67.30 us 4499.21 us 609 FSYNC 2.79 356.93 us 114.39 us 4840.74 us 620 WRITE 3.02 193.78 us 110.23 us 2600.12 us 1239 FXATTROP 89.95 11709.07 us 151.63 us 198748.33 us 610 READ Duration: 26 seconds Data Read: 14946304 bytes Data Written: 11333120 bytes

From pascal.suter at dalco.ch Wed Apr 3 10:28:40 2019
From: pascal.suter at dalco.ch (Pascal Suter)
Date: Wed, 3 Apr 2019 12:28:40 +0200
Subject: [Gluster-users] performance - what can I expect
Message-ID: 

Hi all

I am currently testing gluster on a single server. I have three bricks, each a hardware RAID6 volume with thin-provisioned LVM that was aligned to the RAID and then formatted with xfs. I've created a distributed volume so that entire files get distributed across my three bricks.

First I ran an iozone benchmark across each brick, testing the read and write performance of a single large file per brick.

I then mounted my gluster volume locally and ran another iozone run with the same parameters, writing a single file. The file went to brick 1, which, when used directly, would write at 2.3GB/s and read at 1.5GB/s. Through gluster, however, I got only 800MB/s read and 750MB/s write throughput.

Another run with two processes each writing a file, where one file went to the first brick and the other to the second brick (which by itself, when accessed directly, wrote at 2.8GB/s and read at 2.7GB/s), resulted in 1.2GB/s of aggregated write and also aggregated read throughput.

Is this the performance I can expect out of glusterfs, or is it worth tuning in order to get closer to the actual brick filesystem performance?

Here are the iozone commands I use for writing and reading.. note that I am using directIO in order to make sure I don't get fooled by cache :)

./iozone -i 0 -t 1 -F /mnt/brick${b}/thread1 -+n -c -C -e -I -w -+S 0 -s $filesize -r $recordsize > iozone-brick${b}-write.txt

./iozone -i 1 -t 1 -F /mnt/brick${b}/thread1 -+n -c -C -e -I -w -+S 0 -s $filesize -r $recordsize > iozone-brick${b}-read.txt

cheers

Pascal

From jthottan at redhat.com Wed Apr 3 11:16:08 2019
From: jthottan at redhat.com (Jiffin Tony Thottan)
Date: Wed, 3 Apr 2019 16:46:08 +0530
Subject: [Gluster-users] Gluster GEO replication fault after write over nfs-ganesha
In-Reply-To: <1a5fb44e-fc3b-4edb-28ee-baa4ed077251@redhat.com>
References: <1a5fb44e-fc3b-4edb-28ee-baa4ed077251@redhat.com>
Message-ID: <050e80fd-2904-f69c-dd91-8bf0dfb96c3f@redhat.com>

CCing sunn as well.
On 28/03/19 4:05 PM, Soumya Koduri wrote:
>
>
> On 3/27/19 7:39 PM, Alexey Talikov wrote:
>> I have two clusters with dispersed volumes (2+1) with GEO replication.
>> It works fine as long as I use glusterfs-fuse, but as soon as even one file is
>> written over nfs-ganesha, replication goes to Faulty and recovers after I
>> remove this file (sometimes after stop/start).
>> I think nfs-ganesha writes the file in some way that produces a problem
>> with replication.
>>
>
> I am not much familiar with geo-rep and not sure what/why exactly
> failed here. Request Kotresh (cc'ed) to take a look and provide his
> insights on the issue.
>
> Thanks,
> Soumya
>
>> |OSError: [Errno 61] No data available:
>> '.gfid/9c9514ce-a310-4a1c-a87b-a800a32a99f8' |
>>
>> but if I check over glusterfs mounted with aux-gfid-mount
>>
>> |getfattr -n trusted.glusterfs.pathinfo -e text
>> /mnt/TEST/.gfid/9c9514ce-a310-4a1c-a87b-a800a32a99f8
>> getfattr: Removing leading '/' from absolute path names
>> # file: mnt/TEST/.gfid/9c9514ce-a310-4a1c-a87b-a800a32a99f8
>> trusted.glusterfs.pathinfo="( ( ))" |
>>
>> File exists
>> Details available here
>> https://github.com/nfs-ganesha/nfs-ganesha/issues/408
>>
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users

From kbh-admin at mpa-ifw.tu-darmstadt.de Wed Apr 3 14:20:32 2019
From: kbh-admin at mpa-ifw.tu-darmstadt.de (kbh-admin)
Date: Wed, 3 Apr 2019 16:20:32 +0200
Subject: [Gluster-users] Gluster and LVM
Message-ID: 

Hello Gluster-Community,

we are considering building several gluster servers and have a question
regarding LVM and glusterfs.

Scenario 1: Snapshots

Of course, taking snapshots is a good capability and we want to use
LVM for that.

Scenario 2: Increase gluster volume

We want to increase the gluster volume by adding HDDs and/or by adding
Dell PowerVaults later. We got the recommendation to set up a new
gluster volume for the PowerVaults and not use LVM in that case (lvresize ...).

What would you suggest, and how do you manage both LVM and glusterfs
together?

Thanks in advance.

Felix

From dm at belkam.com Wed Apr 3 15:26:38 2019
From: dm at belkam.com (Dmitry Melekhov)
Date: Wed, 3 Apr 2019 19:26:38 +0400
Subject: [Gluster-users] Gluster and LVM
In-Reply-To: 
References: 
Message-ID: 

03.04.2019 18:20, kbh-admin wrote:
> Hello Gluster-Community,
>
>
> we are considering building several gluster servers and have a question
> regarding LVM and glusterfs.
>
>
> Scenario 1: Snapshots
>
> Of course, taking snapshots is a good capability and we want to use
> LVM for that.
>
>
> Scenario 2: Increase gluster volume
>
> We want to increase the gluster volume by adding HDDs and/or by adding
> Dell PowerVaults later. We got the recommendation to set up a new
> gluster volume for the PowerVaults and not use LVM in that case (lvresize ...).
>
>
> What would you suggest, and how do you manage both LVM and glusterfs
> together?

If you already have storage, why do you need gluster? Just use it :-)

>
> Thanks in advance.
> > > Felix > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users From pkalever at redhat.com Wed Apr 3 15:41:23 2019 From: pkalever at redhat.com (Prasanna Kalever) Date: Wed, 3 Apr 2019 21:11:23 +0530 Subject: [Gluster-users] Help: gluster-block In-Reply-To: References: Message-ID: On Tue, Apr 2, 2019 at 1:34 AM Karim Roumani wrote: > Actually we have a question. > > We did two tests as follows. > > Test 1 - iSCSI target on the glusterFS server > Test 2 - iSCSI target on a separate server with gluster client > > Test 2 performed a read speed of <1GB/second while Test 1 about > 300MB/second > > Any reason you see to why this may be the case? > For Test 1 case, 1. ops b/w * iscsi initiator <-> iscsi target and * tcmu-runner <-> gluster server are all using the same NIC resource. 2. Also, it might be possible that, the node might be facing high resource usage like cpu is high and/or memory is low, as everything is on the same node. You can check also check gluster profile info, to corner down some of these. Thanks! -- Prasanna > ? > > On Mon, Apr 1, 2019 at 1:00 PM Karim Roumani > wrote: > >> Thank you Prasanna for your quick response very much appreaciated we will >> review and get back to you. >> ? >> >> On Mon, Mar 25, 2019 at 9:00 AM Prasanna Kalever >> wrote: >> >>> [ adding +gluster-users for archive purpose ] >>> >>> On Sat, Mar 23, 2019 at 1:51 AM Jeffrey Chin >>> wrote: >>> > >>> > Hello Mr. Kalever, >>> >>> Hello Jeffrey, >>> >>> > >>> > I am currently working on a project to utilize GlusterFS for VMWare >>> VMs. In our research, we found that utilizing block devices with GlusterFS >>> would be the best approach for our use case (correct me if I am wrong). I >>> saw the gluster utility that you are a contributor for called gluster-block >>> (https://github.com/gluster/gluster-block), and I had a question about >>> the configuration. From what I understand, gluster-block only works on the >>> servers that are serving the gluster volume. Would it be possible to run >>> the gluster-block utility on a client machine that has a gluster volume >>> mounted to it? >>> >>> Yes, that is right! At the moment gluster-block is coupled with >>> glusterd for simplicity. >>> But we have made some changes here [1] to provide a way to specify >>> server address (volfile-server) which is outside the gluster-blockd >>> node, please take a look. >>> >>> Although it is not complete solution, but it should at-least help for >>> some usecases. Feel free to raise an issue [2] with the details about >>> your usecase and etc or submit a PR by your self :-) >>> We never picked it, as we never have a usecase needing separation of >>> gluster-blockd and glusterd. >>> >>> > >>> > I also have another question: how do I make the iSCSI targets persist >>> if all of the gluster nodes were rebooted? It seems like once all of the >>> nodes reboot, I am unable to reconnect to the iSCSI targets created by the >>> gluster-block utility. >>> >>> do you mean rebooting iscsi initiator ? or gluster-block/gluster >>> target/server nodes ? >>> >>> 1. for initiator to automatically connect to block devices post >>> reboot, we need to make below changes in /etc/iscsi/iscsid.conf: >>> node.startup = automatic >>> >>> 2. 
if you mean, just in case if all the gluster nodes goes down, on >>> the initiator all the available HA path's will be down, but we still >>> want the IO to be queued on the initiator, until one of the path >>> (gluster node) is availabe: >>> >>> for this in gluster-block sepcific section of multipath.conf you need >>> to replace 'no_path_retry 120' as 'no_path_retry queue' >>> Note: refer README for current multipath.conf setting recommendations. >>> >>> [1] https://github.com/gluster/gluster-block/pull/161 >>> [2] https://github.com/gluster/gluster-block/issues/new >>> >>> BRs, >>> -- >>> Prasanna >>> >> >> >> -- >> >> Thank you, >> >> *Karim Roumani* >> Director of Technology Solutions >> >> TekReach Solutions / Albatross Cloud >> 714-916-5677 >> Karim.Roumani at tekreach.com >> Albatross.cloud - One Stop Cloud Solutions >> Portalfronthosting.com - Complete >> SharePoint Solutions >> > > > -- > > Thank you, > > *Karim Roumani* > Director of Technology Solutions > > TekReach Solutions / Albatross Cloud > 714-916-5677 > Karim.Roumani at tekreach.com > Albatross.cloud - One Stop Cloud Solutions > Portalfronthosting.com - Complete > SharePoint Solutions > -------------- next part -------------- An HTML attachment was scrubbed... URL: From moagrawa at redhat.com Wed Apr 3 15:56:15 2019 From: moagrawa at redhat.com (Mohit Agrawal) Date: Wed, 3 Apr 2019 21:26:15 +0530 Subject: [Gluster-users] [ovirt-users] Re: Announcing Gluster release 5.5 In-Reply-To: References: <20190328164716.27693.35887@mail.ovirt.org> Message-ID: Hi, Thanks Olaf for sharing the relevant logs. @Atin, You are right patch https://review.gluster.org/#/c/glusterfs/+/22344/ will resolve the issue running multiple brick instance for same brick. As we can see in below logs glusterd is trying to start the same brick instance twice at the same time [2019-04-01 10:23:21.752401] I [glusterd-utils.c:6301:glusterd_brick_start] 0-management: starting a fresh brick process for brick /data/gfs/bricks/brick1/ovirt-engine [2019-04-01 10:23:30.348091] I [glusterd-utils.c:6301:glusterd_brick_start] 0-management: starting a fresh brick process for brick /data/gfs/bricks/brick1/ovirt-engine [2019-04-01 10:24:13.353396] I [glusterd-utils.c:6301:glusterd_brick_start] 0-management: starting a fresh brick process for brick /data/gfs/bricks/brick1/ovirt-engine [2019-04-01 10:24:24.253764] I [glusterd-utils.c:6301:glusterd_brick_start] 0-management: starting a fresh brick process for brick /data/gfs/bricks/brick1/ovirt-engine We are seeing below message between starting of two instances The message "E [MSGID: 101012] [common-utils.c:4075:gf_is_service_running] 0-: Unable to read pidfile: /var/run/gluster/vols/ovirt-engine/10.32.9.5-data-gfs-bricks-brick1-ovirt-engine.pid" repeated 2 times between [2019-04-01 10:23:21.748492] and [2019-04-01 10:23:21.752432] I will backport the same. Thanks, Mohit Agrawal On Wed, Apr 3, 2019 at 3:58 PM Olaf Buitelaar wrote: > Dear Mohit, > > Sorry i thought Krutika was referring to the ovirt-kube brick logs. due > the large size (18MB compressed), i've placed the files here; > https://edgecastcdn.net/0004FA/files/bricklogs.tar.bz2 > Also i see i've attached the wrong files, i intended to > attach profile_data4.txt | profile_data3.txt > Sorry for the confusion. > > Thanks Olaf > > Op wo 3 apr. 
2019 om 04:56 schreef Mohit Agrawal : > >> Hi Olaf, >> >> As per current attached "multi-glusterfsd-vol3.txt | >> multi-glusterfsd-vol4.txt" it is showing multiple processes are running >> for "ovirt-core ovirt-engine" brick names but there are no logs >> available in bricklogs.zip specific to this bricks, bricklogs.zip >> has a dump of ovirt-kube logs only >> >> Kindly share brick logs specific to the bricks "ovirt-core >> ovirt-engine" and share glusterd logs also. >> >> Regards >> Mohit Agrawal >> >> On Tue, Apr 2, 2019 at 9:18 PM Olaf Buitelaar >> wrote: >> >>> Dear Krutika, >>> >>> 1. >>> I've changed the volume settings, write performance seems to increased >>> somewhat, however the profile doesn't really support that since latencies >>> increased. However read performance has diminished, which does seem to be >>> supported by the profile runs (attached). >>> Also the IO does seem to behave more consistent than before. >>> I don't really understand the idea behind them, maybe you can explain >>> why these suggestions are good? >>> These settings seems to avoid as much local caching and access as >>> possible and push everything to the gluster processes. While i would expect >>> local access and local caches are a good thing, since it would lead to >>> having less network access or disk access. >>> I tried to investigate these settings a bit more, and this is what i >>> understood of them; >>> - network.remote-dio; when on it seems to ignore the O_DIRECT flag in >>> the client, thus causing the files to be cached and buffered in the page >>> cache on the client, i would expect this to be a good thing especially if >>> the server process would access the same page cache? >>> At least that is what grasp from this commit; >>> https://review.gluster.org/#/c/glusterfs/+/4206/2/xlators/protocol/client/src/client.c line >>> 867 >>> Also found this commit; >>> https://github.com/gluster/glusterfs/commit/06c4ba589102bf92c58cd9fba5c60064bc7a504e#diff-938709e499b4383c3ed33c3979b9080c suggesting >>> remote-dio actually improves performance, not sure it's a write or read >>> benchmark >>> When a file is opened with O_DIRECT it will also disable the >>> write-behind functionality >>> >>> - performance.strict-o-direct: when on, the AFR, will not ignore the >>> O_DIRECT flag. and will invoke: fop_writev_stub with the wb_writev_helper, >>> which seems to stack the operation, no idea why that is. But generally i >>> suppose not ignoring the O_DIRECT flag in the AFR is a good thing, when a >>> processes requests to have O_DIRECT. So this makes sense to me. >>> >>> - cluster.choose-local: when off, it doesn't prefer the local node, but >>> would always choose a brick. Since it's a 9 node cluster, with 3 >>> subvolumes, only a 1/3 could end-up local, and the other 2/3 should be >>> pushed to external nodes anyway. Or am I making the total wrong assumption >>> here? >>> >>> It seems to this config is moving to the gluster-block config side of >>> things, which does make sense. >>> Since we're running quite some mysql instances, which opens the files >>> with O_DIRECt i believe, it would mean the only layer of cache is within >>> mysql it self. Which you could argue is a good thing. But i would expect a >>> little of write-behind buffer, and maybe some of the data cached within >>> gluster would alleviate things a bit on gluster's side. But i wouldn't know >>> if that's the correct mind set, and so might be totally off here. 
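
(A side note on applying the three options discussed above: gluster volume set and gluster volume get always take the volume name, which the set commands quoted elsewhere in this thread omit. A minimal sketch, assuming a volume named ovirt-kube as used in this thread:

  gluster volume set ovirt-kube network.remote-dio off
  gluster volume set ovirt-kube performance.strict-o-direct on
  gluster volume set ovirt-kube cluster.choose-local off

  # read back the values the volume is actually using
  gluster volume get ovirt-kube network.remote-dio
  gluster volume get ovirt-kube performance.strict-o-direct
  gluster volume get ovirt-kube cluster.choose-local

The get form is a convenient way to confirm that a change really took effect, independent of how performance behaves afterwards.)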
>>> Also i would expect these gluster v set command to be online >>> operations, but somehow the bricks went down, after applying these changes. >>> What appears to have happened is that after the update the brick process >>> was restarted, but due to multiple brick process start issue, multiple >>> processes were started, and the brick didn't came online again. >>> However i'll try to reproduce this, since i would like to test with >>> cluster.choose-local: on, and see how performance compares. And hopefully >>> when it occurs collect some useful info. >>> Question; are network.remote-dio and performance.strict-o-direct >>> mutually exclusive settings, or can they both be on? >>> >>> 2. I've attached all brick logs, the only thing relevant i found was; >>> [2019-03-28 20:20:07.170452] I [MSGID: 113030] >>> [posix-entry-ops.c:1146:posix_unlink] 0-ovirt-kube-posix: >>> open-fd-key-status: 0 for >>> /data/gfs/bricks/brick1/ovirt-kube/.shard/a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.109886 >>> [2019-03-28 20:20:07.170491] I [MSGID: 113031] >>> [posix-entry-ops.c:1053:posix_skip_non_linkto_unlink] 0-posix: linkto_xattr >>> status: 0 for >>> /data/gfs/bricks/brick1/ovirt-kube/.shard/a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.109886 >>> [2019-03-28 20:20:07.248480] I [MSGID: 113030] >>> [posix-entry-ops.c:1146:posix_unlink] 0-ovirt-kube-posix: >>> open-fd-key-status: 0 for >>> /data/gfs/bricks/brick1/ovirt-kube/.shard/a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.109886 >>> [2019-03-28 20:20:07.248491] I [MSGID: 113031] >>> [posix-entry-ops.c:1053:posix_skip_non_linkto_unlink] 0-posix: linkto_xattr >>> status: 0 for >>> /data/gfs/bricks/brick1/ovirt-kube/.shard/a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.109886 >>> >>> Thanks Olaf >>> >>> ps. sorry needed to resend since it exceed the file limit >>> >>> Op ma 1 apr. 2019 om 07:56 schreef Krutika Dhananjay < >>> kdhananj at redhat.com>: >>> >>>> Adding back gluster-users >>>> Comments inline ... >>>> >>>> On Fri, Mar 29, 2019 at 8:11 PM Olaf Buitelaar < >>>> olaf.buitelaar at gmail.com> wrote: >>>> >>>>> Dear Krutika, >>>>> >>>>> >>>>> >>>>> 1. I?ve made 2 profile runs of around 10 minutes (see files >>>>> profile_data.txt and profile_data2.txt). Looking at it, most time seems be >>>>> spent at the fop?s fsync and readdirp. >>>>> >>>>> Unfortunate I don?t have the profile info for the 3.12.15 version so >>>>> it?s a bit hard to compare. >>>>> >>>>> One additional thing I do notice on 1 machine (10.32.9.5) the iowait >>>>> time increased a lot, from an average below the 1% it?s now around the 12% >>>>> after the upgrade. >>>>> >>>>> So first suspicion with be lighting strikes twice, and I?ve also just >>>>> now a bad disk, but that doesn?t appear to be the case, since all smart >>>>> status report ok. 
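
(For readers following along: the profile runs mentioned above, and the long per-brick dumps earlier in this digest, come from gluster's volume profiling facility. A rough sketch of how such a run is captured, assuming a volume named ovirt-kube:

  gluster volume profile ovirt-kube start
  # leave the normal workload running for a representative window, e.g. ~10 minutes
  gluster volume profile ovirt-kube info > profile_data.txt
  gluster volume profile ovirt-kube stop

The %-latency column of the per-fop table in that output is what points at FSYNC and READDIRP dominating here.)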
>>>>> >>>>> Also dd shows performance I would more or less expect; >>>>> >>>>> dd if=/dev/zero of=/data/test_file bs=100M count=1 oflag=dsync >>>>> >>>>> 1+0 records in >>>>> >>>>> 1+0 records out >>>>> >>>>> 104857600 bytes (105 MB) copied, 0.686088 s, 153 MB/s >>>>> >>>>> dd if=/dev/zero of=/data/test_file bs=1G count=1 oflag=dsync >>>>> >>>>> 1+0 records in >>>>> >>>>> 1+0 records out >>>>> >>>>> 1073741824 bytes (1.1 GB) copied, 7.61138 s, 141 MB/s >>>>> >>>>> if=/dev/urandom of=/data/test_file bs=1024 count=1000000 >>>>> >>>>> 1000000+0 records in >>>>> >>>>> 1000000+0 records out >>>>> >>>>> 1024000000 bytes (1.0 GB) copied, 6.35051 s, 161 MB/s >>>>> >>>>> dd if=/dev/zero of=/data/test_file bs=1024 count=1000000 >>>>> >>>>> 1000000+0 records in >>>>> >>>>> 1000000+0 records out >>>>> >>>>> 1024000000 bytes (1.0 GB) copied, 1.6899 s, 606 MB/s >>>>> >>>>> When I disable this brick (service glusterd stop; pkill glusterfsd) >>>>> performance in gluster is better, but not on par with what it was. Also the >>>>> cpu usages on the ?neighbor? nodes which hosts the other bricks in the same >>>>> subvolume increases quite a lot in this case, which I wouldn?t expect >>>>> actually since they shouldn't handle much more work, except flagging shards >>>>> to heal. Iowait also goes to idle once gluster is stopped, so it?s for >>>>> sure gluster which waits for io. >>>>> >>>>> >>>>> >>>> >>>> So I see that FSYNC %-latency is on the higher side. And I also noticed >>>> you don't have direct-io options enabled on the volume. >>>> Could you set the following options on the volume - >>>> # gluster volume set network.remote-dio off >>>> # gluster volume set performance.strict-o-direct on >>>> and also disable choose-local >>>> # gluster volume set cluster.choose-local off >>>> >>>> let me know if this helps. >>>> >>>> 2. I?ve attached the mnt log and volume info, but I couldn?t find >>>>> anything relevant in in those logs. I think this is because we run the VM?s >>>>> with libgfapi; >>>>> >>>>> [root at ovirt-host-01 ~]# engine-config -g LibgfApiSupported >>>>> >>>>> LibgfApiSupported: true version: 4.2 >>>>> >>>>> LibgfApiSupported: true version: 4.1 >>>>> >>>>> LibgfApiSupported: true version: 4.3 >>>>> >>>>> And I can confirm the qemu process is invoked with the gluster:// >>>>> address for the images. >>>>> >>>>> The message is logged in the /var/lib/libvert/qemu/ file, >>>>> which I?ve also included. For a sample case see around; 2019-03-28 20:20:07 >>>>> >>>>> Which has the error; E [MSGID: 133010] >>>>> [shard.c:2294:shard_common_lookup_shards_cbk] 0-ovirt-kube-shard: Lookup on >>>>> shard 109886 failed. Base file gfid = a38d64bc-a28b-4ee1-a0bb-f919e7a1022c >>>>> [Stale file handle] >>>>> >>>> >>>> Could you also attach the brick logs for this volume? >>>> >>>> >>>>> >>>>> 3. 
yes I see multiple instances for the same brick directory, like; >>>>> >>>>> /usr/sbin/glusterfsd -s 10.32.9.6 --volfile-id >>>>> ovirt-core.10.32.9.6.data-gfs-bricks-brick1-ovirt-core -p >>>>> /var/run/gluster/vols/ovirt-core/10.32.9.6-data-gfs-bricks-brick1-ovirt-core.pid >>>>> -S /var/run/gluster/452591c9165945d9.socket --brick-name >>>>> /data/gfs/bricks/brick1/ovirt-core -l >>>>> /var/log/glusterfs/bricks/data-gfs-bricks-brick1-ovirt-core.log >>>>> --xlator-option *-posix.glusterd-uuid=fb513da6-f3bd-4571-b8a2-db5efaf60cc1 >>>>> --process-name brick --brick-port 49154 --xlator-option >>>>> ovirt-core-server.listen-port=49154 >>>>> >>>>> >>>>> >>>>> I?ve made an export of the output of ps from the time I observed these >>>>> multiple processes. >>>>> >>>>> In addition the brick_mux bug as noted by Atin. I might also have >>>>> another possible cause, as ovirt moves nodes from none-operational state or >>>>> maintenance state to active/activating, it also seems to restart gluster, >>>>> however I don?t have direct proof for this theory. >>>>> >>>>> >>>>> >>>> >>>> +Atin Mukherjee ^^ >>>> +Mohit Agrawal ^^ >>>> >>>> -Krutika >>>> >>>> Thanks Olaf >>>>> >>>>> Op vr 29 mrt. 2019 om 10:03 schreef Sandro Bonazzola < >>>>> sbonazzo at redhat.com>: >>>>> >>>>>> >>>>>> >>>>>> Il giorno gio 28 mar 2019 alle ore 17:48 >>>>>> ha scritto: >>>>>> >>>>>>> Dear All, >>>>>>> >>>>>>> I wanted to share my experience upgrading from 4.2.8 to 4.3.1. While >>>>>>> previous upgrades from 4.1 to 4.2 etc. went rather smooth, this one was a >>>>>>> different experience. After first trying a test upgrade on a 3 node setup, >>>>>>> which went fine. i headed to upgrade the 9 node production platform, >>>>>>> unaware of the backward compatibility issues between gluster 3.12.15 -> >>>>>>> 5.3. After upgrading 2 nodes, the HA engine stopped and wouldn't start. >>>>>>> Vdsm wasn't able to mount the engine storage domain, since /dom_md/metadata >>>>>>> was missing or couldn't be accessed. Restoring this file by getting a good >>>>>>> copy of the underlying bricks, removing the file from the underlying bricks >>>>>>> where the file was 0 bytes and mark with the stickybit, and the >>>>>>> corresponding gfid's. Removing the file from the mount point, and copying >>>>>>> back the file on the mount point. Manually mounting the engine domain, and >>>>>>> manually creating the corresponding symbolic links in /rhev/data-center and >>>>>>> /var/run/vdsm/storage and fixing the ownership back to vdsm.kvm (which was >>>>>>> root.root), i was able to start the HA engine again. Since the engine was >>>>>>> up again, and things seemed rather unstable i decided to continue the >>>>>>> upgrade on the other nodes suspecting an incompatibility in gluster >>>>>>> versions, i thought would be best to have them all on the same version >>>>>>> rather soonish. However things went from bad to worse, the engine stopped >>>>>>> again, and all vm?s stopped working as well. So on a machine outside the >>>>>>> setup and restored a backup of the engine taken from version 4.2.8 just >>>>>>> before the upgrade. With this engine I was at least able to start some vm?s >>>>>>> again, and finalize the upgrade. Once the upgraded, things didn?t stabilize >>>>>>> and also lose 2 vm?s during the process due to image corruption. After >>>>>>> figuring out gluster 5.3 had quite some issues I was as lucky to see >>>>>>> gluster 5.5 was about to be released, on the moment the RPM?s were >>>>>>> available I?ve installed those. 
This helped a lot in terms of stability, >>>>>>> for which I?m very grateful! However the performance is unfortunate >>>>>>> terrible, it?s about 15% of what the performance was running gluster >>>>>>> 3.12.15. It?s strange since a simple dd shows ok performance, but our >>>>>>> actual workload doesn?t. While I would expect the performance to be better, >>>>>>> due to all improvements made since gluster version 3.12. Does anybody share >>>>>>> the same experience? >>>>>>> I really hope gluster 6 will soon be tested with ovirt and released, >>>>>>> and things start to perform and stabilize again..like the good old days. Of >>>>>>> course when I can do anything, I?m happy to help. >>>>>>> >>>>>> >>>>>> Opened https://bugzilla.redhat.com/show_bug.cgi?id=1693998 to track >>>>>> the rebase on Gluster 6. >>>>>> >>>>>> >>>>>> >>>>>>> >>>>>>> I think the following short list of issues we have after the >>>>>>> migration; >>>>>>> Gluster 5.5; >>>>>>> - Poor performance for our workload (mostly write dependent) >>>>>>> - VM?s randomly pause on unknown storage errors, which are >>>>>>> ?stale file?s?. corresponding log; Lookup on shard 797 failed. Base file >>>>>>> gfid = 8a27b91a-ff02-42dc-bd4c-caa019424de8 [Stale file handle] >>>>>>> - Some files are listed twice in a directory (probably related >>>>>>> the stale file issue?) >>>>>>> Example; >>>>>>> ls -la >>>>>>> /rhev/data-center/59cd53a9-0003-02d7-00eb-0000000001e3/313f5d25-76af-4ecd-9a20-82a2fe815a3c/images/4add6751-3731-4bbd-ae94-aaeed12ea450/ >>>>>>> total 3081 >>>>>>> drwxr-x---. 2 vdsm kvm 4096 Mar 18 11:34 . >>>>>>> drwxr-xr-x. 13 vdsm kvm 4096 Mar 19 09:42 .. >>>>>>> -rw-rw----. 1 vdsm kvm 1048576 Mar 28 12:55 >>>>>>> 1a7cf259-6b29-421d-9688-b25dfaafb13c >>>>>>> -rw-rw----. 1 vdsm kvm 1048576 Mar 28 12:55 >>>>>>> 1a7cf259-6b29-421d-9688-b25dfaafb13c >>>>>>> -rw-rw----. 1 vdsm kvm 1048576 Jan 27 2018 >>>>>>> 1a7cf259-6b29-421d-9688-b25dfaafb13c.lease >>>>>>> -rw-r--r--. 1 vdsm kvm 290 Jan 27 2018 >>>>>>> 1a7cf259-6b29-421d-9688-b25dfaafb13c.meta >>>>>>> -rw-r--r--. 1 vdsm kvm 290 Jan 27 2018 >>>>>>> 1a7cf259-6b29-421d-9688-b25dfaafb13c.meta >>>>>>> >>>>>>> - brick processes sometimes starts multiple times. Sometimes I?ve 5 >>>>>>> brick processes for a single volume. Killing all glusterfsd?s for the >>>>>>> volume on the machine and running gluster v start force usually just >>>>>>> starts one after the event, from then on things look all right. >>>>>>> >>>>>>> >>>>>> May I kindly ask to open bugs on Gluster for above issues at >>>>>> https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS ? >>>>>> Sahina? >>>>>> >>>>>> >>>>>>> Ovirt 4.3.2.1-1.el7 >>>>>>> - All vms images ownership are changed to root.root after the >>>>>>> vm is shutdown, probably related to; >>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1666795 but not only >>>>>>> scoped to the HA engine. I?m still in compatibility mode 4.2 for the >>>>>>> cluster and for the vm?s, but upgraded to version ovirt 4.3.2 >>>>>>> >>>>>> >>>>>> Ryan? >>>>>> >>>>>> >>>>>>> - The network provider is set to ovn, which is fine..actually >>>>>>> cool, only the ?ovs-vswitchd? is a CPU hog, and utilizes 100% >>>>>>> >>>>>> >>>>>> Miguel? Dominik? 
>>>>>> >>>>>> >>>>>>> - It seems on all nodes vdsm tries to get the the stats for >>>>>>> the HA engine, which is filling the logs with (not sure if this is new); >>>>>>> [api.virt] FINISH getStats return={'status': {'message': "Virtual >>>>>>> machine does not exist: {'vmId': u'20d69acd-edfd-4aeb-a2ae-49e9c121b7e9'}", >>>>>>> 'code': 1}} from=::1,59290, vmId=20d69acd-edfd-4aeb-a2ae-49e9c121b7e9 >>>>>>> (api:54) >>>>>>> >>>>>> >>>>>> Simone? >>>>>> >>>>>> >>>>>>> - It seems the package os_brick [root] managedvolume not >>>>>>> supported: Managed Volume Not Supported. Missing package os-brick.: >>>>>>> ('Cannot import os_brick',) (caps:149) which fills the vdsm.log, but for >>>>>>> this I also saw another message, so I suspect this will already be resolved >>>>>>> shortly >>>>>>> - The machine I used to run the backup HA engine, doesn?t want >>>>>>> to get removed from the hosted-engine ?vm-status, not even after running; >>>>>>> hosted-engine --clean-metadata --host-id=10 --force-clean or hosted-engine >>>>>>> --clean-metadata --force-clean from the machine itself. >>>>>>> >>>>>> >>>>>> Simone? >>>>>> >>>>>> >>>>>>> >>>>>>> Think that's about it. >>>>>>> >>>>>>> Don?t get me wrong, I don?t want to rant, I just wanted to share my >>>>>>> experience and see where things can made better. >>>>>>> >>>>>> >>>>>> If not already done, can you please open bugs for above issues at >>>>>> https://bugzilla.redhat.com/enter_bug.cgi?classification=oVirt ? >>>>>> >>>>>> >>>>>>> >>>>>>> >>>>>>> Best Olaf >>>>>>> _______________________________________________ >>>>>>> Users mailing list -- users at ovirt.org >>>>>>> To unsubscribe send an email to users-leave at ovirt.org >>>>>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/ >>>>>>> oVirt Code of Conduct: >>>>>>> https://www.ovirt.org/community/about/community-guidelines/ >>>>>>> List Archives: >>>>>>> https://lists.ovirt.org/archives/list/users at ovirt.org/message/3CO35Q7VZMWNHS4LPUJNO7S47MGLSKS5/ >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> >>>>>> SANDRO BONAZZOLA >>>>>> >>>>>> MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV >>>>>> >>>>>> Red Hat EMEA >>>>>> >>>>>> sbonazzo at redhat.com >>>>>> >>>>>> >>>>> _______________________________________________ >>>>> Users mailing list -- users at ovirt.org >>>>> To unsubscribe send an email to users-leave at ovirt.org >>>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/ >>>>> oVirt Code of Conduct: >>>>> https://www.ovirt.org/community/about/community-guidelines/ >>>>> List Archives: >>>>> https://lists.ovirt.org/archives/list/users at ovirt.org/message/HAGTA64LF7LLE6YMHQ6DLT26MD2GZ2PK/ >>>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From olaf.buitelaar at gmail.com Wed Apr 3 17:05:49 2019 From: olaf.buitelaar at gmail.com (Olaf Buitelaar) Date: Wed, 3 Apr 2019 19:05:49 +0200 Subject: [Gluster-users] [ovirt-users] Re: Announcing Gluster release 5.5 In-Reply-To: References: <20190328164716.27693.35887@mail.ovirt.org> Message-ID: Dear Mohit, Thanks for backporting this issue. Hopefully we can address the others as well, if i can do anything let me know. On my side i've tested with: gluster volume reset cluster.choose-local, but haven't noticed really a change in performance. On the good side, the brick processes didn't crash with updating this config. I'll experiment with the other changes as well, and see how the combinations affect performance. 
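
One way to keep that experiment comparable across option combinations is to script it. A rough sketch only, where the volume name (ovirt-kube), the mount point and the dd workload are placeholders:

  for val in on off; do
      gluster volume set ovirt-kube cluster.choose-local $val
      gluster volume profile ovirt-kube start
      # fixed, direct-I/O workload so the runs are comparable
      dd if=/dev/zero of=/mnt/ovirt-kube/ddtest bs=1M count=1024 oflag=direct conv=fsync
      gluster volume profile ovirt-kube info > profile-choose-local-$val.txt
      gluster volume profile ovirt-kube stop
      rm -f /mnt/ovirt-kube/ddtest
  done

Comparing the resulting profile files (FSYNC and WRITE latencies in particular) gives a more objective signal than overall feel, especially when several options are being toggled at once.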
I also saw this commit; https://review.gluster.org/#/c/glusterfs/+/21333/ which looks very useful, will this be an recommended option for VM/block workloads? Thanks Olaf Op wo 3 apr. 2019 om 17:56 schreef Mohit Agrawal : > > Hi, > > Thanks Olaf for sharing the relevant logs. > > @Atin, > You are right patch https://review.gluster.org/#/c/glusterfs/+/22344/ > will resolve the issue running multiple brick instance for same brick. > > As we can see in below logs glusterd is trying to start the same brick > instance twice at the same time > > [2019-04-01 10:23:21.752401] I > [glusterd-utils.c:6301:glusterd_brick_start] 0-management: starting a fresh > brick process for brick /data/gfs/bricks/brick1/ovirt-engine > [2019-04-01 10:23:30.348091] I > [glusterd-utils.c:6301:glusterd_brick_start] 0-management: starting a fresh > brick process for brick /data/gfs/bricks/brick1/ovirt-engine > [2019-04-01 10:24:13.353396] I > [glusterd-utils.c:6301:glusterd_brick_start] 0-management: starting a fresh > brick process for brick /data/gfs/bricks/brick1/ovirt-engine > [2019-04-01 10:24:24.253764] I > [glusterd-utils.c:6301:glusterd_brick_start] 0-management: starting a fresh > brick process for brick /data/gfs/bricks/brick1/ovirt-engine > > We are seeing below message between starting of two instances > The message "E [MSGID: 101012] [common-utils.c:4075:gf_is_service_running] > 0-: Unable to read pidfile: > /var/run/gluster/vols/ovirt-engine/10.32.9.5-data-gfs-bricks-brick1-ovirt-engine.pid" > repeated 2 times between [2019-04-01 10:23:21.748492] and [2019-04-01 > 10:23:21.752432] > > I will backport the same. > Thanks, > Mohit Agrawal > > On Wed, Apr 3, 2019 at 3:58 PM Olaf Buitelaar > wrote: > >> Dear Mohit, >> >> Sorry i thought Krutika was referring to the ovirt-kube brick logs. due >> the large size (18MB compressed), i've placed the files here; >> https://edgecastcdn.net/0004FA/files/bricklogs.tar.bz2 >> Also i see i've attached the wrong files, i intended to >> attach profile_data4.txt | profile_data3.txt >> Sorry for the confusion. >> >> Thanks Olaf >> >> Op wo 3 apr. 2019 om 04:56 schreef Mohit Agrawal : >> >>> Hi Olaf, >>> >>> As per current attached "multi-glusterfsd-vol3.txt | >>> multi-glusterfsd-vol4.txt" it is showing multiple processes are running >>> for "ovirt-core ovirt-engine" brick names but there are no logs >>> available in bricklogs.zip specific to this bricks, bricklogs.zip >>> has a dump of ovirt-kube logs only >>> >>> Kindly share brick logs specific to the bricks "ovirt-core >>> ovirt-engine" and share glusterd logs also. >>> >>> Regards >>> Mohit Agrawal >>> >>> On Tue, Apr 2, 2019 at 9:18 PM Olaf Buitelaar >>> wrote: >>> >>>> Dear Krutika, >>>> >>>> 1. >>>> I've changed the volume settings, write performance seems to increased >>>> somewhat, however the profile doesn't really support that since latencies >>>> increased. However read performance has diminished, which does seem to be >>>> supported by the profile runs (attached). >>>> Also the IO does seem to behave more consistent than before. >>>> I don't really understand the idea behind them, maybe you can explain >>>> why these suggestions are good? >>>> These settings seems to avoid as much local caching and access as >>>> possible and push everything to the gluster processes. While i would expect >>>> local access and local caches are a good thing, since it would lead to >>>> having less network access or disk access. 
>>>> I tried to investigate these settings a bit more, and this is what i >>>> understood of them; >>>> - network.remote-dio; when on it seems to ignore the O_DIRECT flag in >>>> the client, thus causing the files to be cached and buffered in the page >>>> cache on the client, i would expect this to be a good thing especially if >>>> the server process would access the same page cache? >>>> At least that is what grasp from this commit; >>>> https://review.gluster.org/#/c/glusterfs/+/4206/2/xlators/protocol/client/src/client.c line >>>> 867 >>>> Also found this commit; >>>> https://github.com/gluster/glusterfs/commit/06c4ba589102bf92c58cd9fba5c60064bc7a504e#diff-938709e499b4383c3ed33c3979b9080c suggesting >>>> remote-dio actually improves performance, not sure it's a write or read >>>> benchmark >>>> When a file is opened with O_DIRECT it will also disable the >>>> write-behind functionality >>>> >>>> - performance.strict-o-direct: when on, the AFR, will not ignore the >>>> O_DIRECT flag. and will invoke: fop_writev_stub with the wb_writev_helper, >>>> which seems to stack the operation, no idea why that is. But generally i >>>> suppose not ignoring the O_DIRECT flag in the AFR is a good thing, when a >>>> processes requests to have O_DIRECT. So this makes sense to me. >>>> >>>> - cluster.choose-local: when off, it doesn't prefer the local node, but >>>> would always choose a brick. Since it's a 9 node cluster, with 3 >>>> subvolumes, only a 1/3 could end-up local, and the other 2/3 should be >>>> pushed to external nodes anyway. Or am I making the total wrong assumption >>>> here? >>>> >>>> It seems to this config is moving to the gluster-block config side of >>>> things, which does make sense. >>>> Since we're running quite some mysql instances, which opens the files >>>> with O_DIRECt i believe, it would mean the only layer of cache is within >>>> mysql it self. Which you could argue is a good thing. But i would expect a >>>> little of write-behind buffer, and maybe some of the data cached within >>>> gluster would alleviate things a bit on gluster's side. But i wouldn't know >>>> if that's the correct mind set, and so might be totally off here. >>>> Also i would expect these gluster v set command to be online >>>> operations, but somehow the bricks went down, after applying these changes. >>>> What appears to have happened is that after the update the brick process >>>> was restarted, but due to multiple brick process start issue, multiple >>>> processes were started, and the brick didn't came online again. >>>> However i'll try to reproduce this, since i would like to test with >>>> cluster.choose-local: on, and see how performance compares. And hopefully >>>> when it occurs collect some useful info. >>>> Question; are network.remote-dio and performance.strict-o-direct >>>> mutually exclusive settings, or can they both be on? >>>> >>>> 2. 
I've attached all brick logs, the only thing relevant i found was; >>>> [2019-03-28 20:20:07.170452] I [MSGID: 113030] >>>> [posix-entry-ops.c:1146:posix_unlink] 0-ovirt-kube-posix: >>>> open-fd-key-status: 0 for >>>> /data/gfs/bricks/brick1/ovirt-kube/.shard/a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.109886 >>>> [2019-03-28 20:20:07.170491] I [MSGID: 113031] >>>> [posix-entry-ops.c:1053:posix_skip_non_linkto_unlink] 0-posix: linkto_xattr >>>> status: 0 for >>>> /data/gfs/bricks/brick1/ovirt-kube/.shard/a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.109886 >>>> [2019-03-28 20:20:07.248480] I [MSGID: 113030] >>>> [posix-entry-ops.c:1146:posix_unlink] 0-ovirt-kube-posix: >>>> open-fd-key-status: 0 for >>>> /data/gfs/bricks/brick1/ovirt-kube/.shard/a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.109886 >>>> [2019-03-28 20:20:07.248491] I [MSGID: 113031] >>>> [posix-entry-ops.c:1053:posix_skip_non_linkto_unlink] 0-posix: linkto_xattr >>>> status: 0 for >>>> /data/gfs/bricks/brick1/ovirt-kube/.shard/a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.109886 >>>> >>>> Thanks Olaf >>>> >>>> ps. sorry needed to resend since it exceed the file limit >>>> >>>> Op ma 1 apr. 2019 om 07:56 schreef Krutika Dhananjay < >>>> kdhananj at redhat.com>: >>>> >>>>> Adding back gluster-users >>>>> Comments inline ... >>>>> >>>>> On Fri, Mar 29, 2019 at 8:11 PM Olaf Buitelaar < >>>>> olaf.buitelaar at gmail.com> wrote: >>>>> >>>>>> Dear Krutika, >>>>>> >>>>>> >>>>>> >>>>>> 1. I?ve made 2 profile runs of around 10 minutes (see files >>>>>> profile_data.txt and profile_data2.txt). Looking at it, most time seems be >>>>>> spent at the fop?s fsync and readdirp. >>>>>> >>>>>> Unfortunate I don?t have the profile info for the 3.12.15 version so >>>>>> it?s a bit hard to compare. >>>>>> >>>>>> One additional thing I do notice on 1 machine (10.32.9.5) the iowait >>>>>> time increased a lot, from an average below the 1% it?s now around the 12% >>>>>> after the upgrade. >>>>>> >>>>>> So first suspicion with be lighting strikes twice, and I?ve also just >>>>>> now a bad disk, but that doesn?t appear to be the case, since all smart >>>>>> status report ok. >>>>>> >>>>>> Also dd shows performance I would more or less expect; >>>>>> >>>>>> dd if=/dev/zero of=/data/test_file bs=100M count=1 oflag=dsync >>>>>> >>>>>> 1+0 records in >>>>>> >>>>>> 1+0 records out >>>>>> >>>>>> 104857600 bytes (105 MB) copied, 0.686088 s, 153 MB/s >>>>>> >>>>>> dd if=/dev/zero of=/data/test_file bs=1G count=1 oflag=dsync >>>>>> >>>>>> 1+0 records in >>>>>> >>>>>> 1+0 records out >>>>>> >>>>>> 1073741824 bytes (1.1 GB) copied, 7.61138 s, 141 MB/s >>>>>> >>>>>> if=/dev/urandom of=/data/test_file bs=1024 count=1000000 >>>>>> >>>>>> 1000000+0 records in >>>>>> >>>>>> 1000000+0 records out >>>>>> >>>>>> 1024000000 bytes (1.0 GB) copied, 6.35051 s, 161 MB/s >>>>>> >>>>>> dd if=/dev/zero of=/data/test_file bs=1024 count=1000000 >>>>>> >>>>>> 1000000+0 records in >>>>>> >>>>>> 1000000+0 records out >>>>>> >>>>>> 1024000000 bytes (1.0 GB) copied, 1.6899 s, 606 MB/s >>>>>> >>>>>> When I disable this brick (service glusterd stop; pkill glusterfsd) >>>>>> performance in gluster is better, but not on par with what it was. Also the >>>>>> cpu usages on the ?neighbor? nodes which hosts the other bricks in the same >>>>>> subvolume increases quite a lot in this case, which I wouldn?t expect >>>>>> actually since they shouldn't handle much more work, except flagging shards >>>>>> to heal. Iowait also goes to idle once gluster is stopped, so it?s for >>>>>> sure gluster which waits for io. 
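>>>>>> For completeness, such a 10 minute profile run can be captured
>>>>>> roughly like this (a sketch only, with ovirt-kube again standing in
>>>>>> for the real volume name):
>>>>>>
>>>>>> gluster volume profile ovirt-kube start
>>>>>> # let the normal workload run for ~10 minutes while stats accumulate
>>>>>> sleep 600
>>>>>> # dump the cumulative per-fop latencies to a file and stop profiling
>>>>>> gluster volume profile ovirt-kube info > profile_data.txt
>>>>>> gluster volume profile ovirt-kube stop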
>>>>>> >>>>>> >>>>>> >>>>> >>>>> So I see that FSYNC %-latency is on the higher side. And I also >>>>> noticed you don't have direct-io options enabled on the volume. >>>>> Could you set the following options on the volume - >>>>> # gluster volume set network.remote-dio off >>>>> # gluster volume set performance.strict-o-direct on >>>>> and also disable choose-local >>>>> # gluster volume set cluster.choose-local off >>>>> >>>>> let me know if this helps. >>>>> >>>>> 2. I?ve attached the mnt log and volume info, but I couldn?t find >>>>>> anything relevant in in those logs. I think this is because we run the VM?s >>>>>> with libgfapi; >>>>>> >>>>>> [root at ovirt-host-01 ~]# engine-config -g LibgfApiSupported >>>>>> >>>>>> LibgfApiSupported: true version: 4.2 >>>>>> >>>>>> LibgfApiSupported: true version: 4.1 >>>>>> >>>>>> LibgfApiSupported: true version: 4.3 >>>>>> >>>>>> And I can confirm the qemu process is invoked with the gluster:// >>>>>> address for the images. >>>>>> >>>>>> The message is logged in the /var/lib/libvert/qemu/ file, >>>>>> which I?ve also included. For a sample case see around; 2019-03-28 20:20:07 >>>>>> >>>>>> Which has the error; E [MSGID: 133010] >>>>>> [shard.c:2294:shard_common_lookup_shards_cbk] 0-ovirt-kube-shard: Lookup on >>>>>> shard 109886 failed. Base file gfid = a38d64bc-a28b-4ee1-a0bb-f919e7a1022c >>>>>> [Stale file handle] >>>>>> >>>>> >>>>> Could you also attach the brick logs for this volume? >>>>> >>>>> >>>>>> >>>>>> 3. yes I see multiple instances for the same brick directory, like; >>>>>> >>>>>> /usr/sbin/glusterfsd -s 10.32.9.6 --volfile-id >>>>>> ovirt-core.10.32.9.6.data-gfs-bricks-brick1-ovirt-core -p >>>>>> /var/run/gluster/vols/ovirt-core/10.32.9.6-data-gfs-bricks-brick1-ovirt-core.pid >>>>>> -S /var/run/gluster/452591c9165945d9.socket --brick-name >>>>>> /data/gfs/bricks/brick1/ovirt-core -l >>>>>> /var/log/glusterfs/bricks/data-gfs-bricks-brick1-ovirt-core.log >>>>>> --xlator-option *-posix.glusterd-uuid=fb513da6-f3bd-4571-b8a2-db5efaf60cc1 >>>>>> --process-name brick --brick-port 49154 --xlator-option >>>>>> ovirt-core-server.listen-port=49154 >>>>>> >>>>>> >>>>>> >>>>>> I?ve made an export of the output of ps from the time I observed >>>>>> these multiple processes. >>>>>> >>>>>> In addition the brick_mux bug as noted by Atin. I might also have >>>>>> another possible cause, as ovirt moves nodes from none-operational state or >>>>>> maintenance state to active/activating, it also seems to restart gluster, >>>>>> however I don?t have direct proof for this theory. >>>>>> >>>>>> >>>>>> >>>>> >>>>> +Atin Mukherjee ^^ >>>>> +Mohit Agrawal ^^ >>>>> >>>>> -Krutika >>>>> >>>>> Thanks Olaf >>>>>> >>>>>> Op vr 29 mrt. 2019 om 10:03 schreef Sandro Bonazzola < >>>>>> sbonazzo at redhat.com>: >>>>>> >>>>>>> >>>>>>> >>>>>>> Il giorno gio 28 mar 2019 alle ore 17:48 >>>>>>> ha scritto: >>>>>>> >>>>>>>> Dear All, >>>>>>>> >>>>>>>> I wanted to share my experience upgrading from 4.2.8 to 4.3.1. >>>>>>>> While previous upgrades from 4.1 to 4.2 etc. went rather smooth, this one >>>>>>>> was a different experience. After first trying a test upgrade on a 3 node >>>>>>>> setup, which went fine. i headed to upgrade the 9 node production platform, >>>>>>>> unaware of the backward compatibility issues between gluster 3.12.15 -> >>>>>>>> 5.3. After upgrading 2 nodes, the HA engine stopped and wouldn't start. >>>>>>>> Vdsm wasn't able to mount the engine storage domain, since /dom_md/metadata >>>>>>>> was missing or couldn't be accessed. 
Restoring this file by getting a good >>>>>>>> copy of the underlying bricks, removing the file from the underlying bricks >>>>>>>> where the file was 0 bytes and mark with the stickybit, and the >>>>>>>> corresponding gfid's. Removing the file from the mount point, and copying >>>>>>>> back the file on the mount point. Manually mounting the engine domain, and >>>>>>>> manually creating the corresponding symbolic links in /rhev/data-center and >>>>>>>> /var/run/vdsm/storage and fixing the ownership back to vdsm.kvm (which was >>>>>>>> root.root), i was able to start the HA engine again. Since the engine was >>>>>>>> up again, and things seemed rather unstable i decided to continue the >>>>>>>> upgrade on the other nodes suspecting an incompatibility in gluster >>>>>>>> versions, i thought would be best to have them all on the same version >>>>>>>> rather soonish. However things went from bad to worse, the engine stopped >>>>>>>> again, and all vm?s stopped working as well. So on a machine outside the >>>>>>>> setup and restored a backup of the engine taken from version 4.2.8 just >>>>>>>> before the upgrade. With this engine I was at least able to start some vm?s >>>>>>>> again, and finalize the upgrade. Once the upgraded, things didn?t stabilize >>>>>>>> and also lose 2 vm?s during the process due to image corruption. After >>>>>>>> figuring out gluster 5.3 had quite some issues I was as lucky to see >>>>>>>> gluster 5.5 was about to be released, on the moment the RPM?s were >>>>>>>> available I?ve installed those. This helped a lot in terms of stability, >>>>>>>> for which I?m very grateful! However the performance is unfortunate >>>>>>>> terrible, it?s about 15% of what the performance was running gluster >>>>>>>> 3.12.15. It?s strange since a simple dd shows ok performance, but our >>>>>>>> actual workload doesn?t. While I would expect the performance to be better, >>>>>>>> due to all improvements made since gluster version 3.12. Does anybody share >>>>>>>> the same experience? >>>>>>>> I really hope gluster 6 will soon be tested with ovirt and >>>>>>>> released, and things start to perform and stabilize again..like the good >>>>>>>> old days. Of course when I can do anything, I?m happy to help. >>>>>>>> >>>>>>> >>>>>>> Opened https://bugzilla.redhat.com/show_bug.cgi?id=1693998 to track >>>>>>> the rebase on Gluster 6. >>>>>>> >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> I think the following short list of issues we have after the >>>>>>>> migration; >>>>>>>> Gluster 5.5; >>>>>>>> - Poor performance for our workload (mostly write dependent) >>>>>>>> - VM?s randomly pause on unknown storage errors, which are >>>>>>>> ?stale file?s?. corresponding log; Lookup on shard 797 failed. Base file >>>>>>>> gfid = 8a27b91a-ff02-42dc-bd4c-caa019424de8 [Stale file handle] >>>>>>>> - Some files are listed twice in a directory (probably >>>>>>>> related the stale file issue?) >>>>>>>> Example; >>>>>>>> ls -la >>>>>>>> /rhev/data-center/59cd53a9-0003-02d7-00eb-0000000001e3/313f5d25-76af-4ecd-9a20-82a2fe815a3c/images/4add6751-3731-4bbd-ae94-aaeed12ea450/ >>>>>>>> total 3081 >>>>>>>> drwxr-x---. 2 vdsm kvm 4096 Mar 18 11:34 . >>>>>>>> drwxr-xr-x. 13 vdsm kvm 4096 Mar 19 09:42 .. >>>>>>>> -rw-rw----. 1 vdsm kvm 1048576 Mar 28 12:55 >>>>>>>> 1a7cf259-6b29-421d-9688-b25dfaafb13c >>>>>>>> -rw-rw----. 1 vdsm kvm 1048576 Mar 28 12:55 >>>>>>>> 1a7cf259-6b29-421d-9688-b25dfaafb13c >>>>>>>> -rw-rw----. 1 vdsm kvm 1048576 Jan 27 2018 >>>>>>>> 1a7cf259-6b29-421d-9688-b25dfaafb13c.lease >>>>>>>> -rw-r--r--. 
1 vdsm kvm 290 Jan 27 2018 >>>>>>>> 1a7cf259-6b29-421d-9688-b25dfaafb13c.meta >>>>>>>> -rw-r--r--. 1 vdsm kvm 290 Jan 27 2018 >>>>>>>> 1a7cf259-6b29-421d-9688-b25dfaafb13c.meta >>>>>>>> >>>>>>>> - brick processes sometimes starts multiple times. Sometimes I?ve 5 >>>>>>>> brick processes for a single volume. Killing all glusterfsd?s for the >>>>>>>> volume on the machine and running gluster v start force usually just >>>>>>>> starts one after the event, from then on things look all right. >>>>>>>> >>>>>>>> >>>>>>> May I kindly ask to open bugs on Gluster for above issues at >>>>>>> https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS ? >>>>>>> Sahina? >>>>>>> >>>>>>> >>>>>>>> Ovirt 4.3.2.1-1.el7 >>>>>>>> - All vms images ownership are changed to root.root after the >>>>>>>> vm is shutdown, probably related to; >>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1666795 but not only >>>>>>>> scoped to the HA engine. I?m still in compatibility mode 4.2 for the >>>>>>>> cluster and for the vm?s, but upgraded to version ovirt 4.3.2 >>>>>>>> >>>>>>> >>>>>>> Ryan? >>>>>>> >>>>>>> >>>>>>>> - The network provider is set to ovn, which is fine..actually >>>>>>>> cool, only the ?ovs-vswitchd? is a CPU hog, and utilizes 100% >>>>>>>> >>>>>>> >>>>>>> Miguel? Dominik? >>>>>>> >>>>>>> >>>>>>>> - It seems on all nodes vdsm tries to get the the stats for >>>>>>>> the HA engine, which is filling the logs with (not sure if this is new); >>>>>>>> [api.virt] FINISH getStats return={'status': {'message': "Virtual >>>>>>>> machine does not exist: {'vmId': u'20d69acd-edfd-4aeb-a2ae-49e9c121b7e9'}", >>>>>>>> 'code': 1}} from=::1,59290, vmId=20d69acd-edfd-4aeb-a2ae-49e9c121b7e9 >>>>>>>> (api:54) >>>>>>>> >>>>>>> >>>>>>> Simone? >>>>>>> >>>>>>> >>>>>>>> - It seems the package os_brick [root] managedvolume not >>>>>>>> supported: Managed Volume Not Supported. Missing package os-brick.: >>>>>>>> ('Cannot import os_brick',) (caps:149) which fills the vdsm.log, but for >>>>>>>> this I also saw another message, so I suspect this will already be resolved >>>>>>>> shortly >>>>>>>> - The machine I used to run the backup HA engine, doesn?t >>>>>>>> want to get removed from the hosted-engine ?vm-status, not even after >>>>>>>> running; hosted-engine --clean-metadata --host-id=10 --force-clean or >>>>>>>> hosted-engine --clean-metadata --force-clean from the machine itself. >>>>>>>> >>>>>>> >>>>>>> Simone? >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> Think that's about it. >>>>>>>> >>>>>>>> Don?t get me wrong, I don?t want to rant, I just wanted to share my >>>>>>>> experience and see where things can made better. >>>>>>>> >>>>>>> >>>>>>> If not already done, can you please open bugs for above issues at >>>>>>> https://bugzilla.redhat.com/enter_bug.cgi?classification=oVirt ? 
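>>>>>>> As a side note on the multiple brick process issue above: a rough
>>>>>>> manual clean-up along the lines already described (kill the duplicate
>>>>>>> glusterfsd's, then force-start the volume) could look like this, with
>>>>>>> ovirt-core only as an example volume name:
>>>>>>>
>>>>>>> # list glusterfsd processes serving bricks of this volume
>>>>>>> pgrep -af 'glusterfsd.*ovirt-core'
>>>>>>> # if the same brick shows up more than once, kill them all on that node
>>>>>>> pkill -f 'glusterfsd.*ovirt-core'
>>>>>>> # then let glusterd bring up a single clean brick process again
>>>>>>> gluster volume start ovirt-core force
>>>>>>> gluster volume status ovirt-core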
>>>>>>> >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Best Olaf >>>>>>>> _______________________________________________ >>>>>>>> Users mailing list -- users at ovirt.org >>>>>>>> To unsubscribe send an email to users-leave at ovirt.org >>>>>>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/ >>>>>>>> oVirt Code of Conduct: >>>>>>>> https://www.ovirt.org/community/about/community-guidelines/ >>>>>>>> List Archives: >>>>>>>> https://lists.ovirt.org/archives/list/users at ovirt.org/message/3CO35Q7VZMWNHS4LPUJNO7S47MGLSKS5/ >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> >>>>>>> SANDRO BONAZZOLA >>>>>>> >>>>>>> MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV >>>>>>> >>>>>>> Red Hat EMEA >>>>>>> >>>>>>> sbonazzo at redhat.com >>>>>>> >>>>>>> >>>>>> _______________________________________________ >>>>>> Users mailing list -- users at ovirt.org >>>>>> To unsubscribe send an email to users-leave at ovirt.org >>>>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/ >>>>>> oVirt Code of Conduct: >>>>>> https://www.ovirt.org/community/about/community-guidelines/ >>>>>> List Archives: >>>>>> https://lists.ovirt.org/archives/list/users at ovirt.org/message/HAGTA64LF7LLE6YMHQ6DLT26MD2GZ2PK/ >>>>>> >>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From budic at onholyground.com Wed Apr 3 20:37:48 2019 From: budic at onholyground.com (Darrell Budic) Date: Wed, 3 Apr 2019 15:37:48 -0500 Subject: [Gluster-users] [Gluster-devel] Upgrade testing to gluster 6 In-Reply-To: References: Message-ID: <13CA0DC5-C248-40B8-B2D4-E6664812303A@onholyground.com> Hari- I was upgrading my test cluster from 5.5 to 6 and I hit this bug (https://bugzilla.redhat.com/show_bug.cgi?id=1694010 ) or something similar. In my case, the workaround did not work, and I was left with a gluster that had gone into no-quorum mode and stopped all the bricks. Wasn?t much in the logs either, but I noticed my /etc/glusterfs/glusterd.vol files were not the same as the newer versions, so I updated them, restarted glusterd, and suddenly the updated node showed as peer-in-cluster again. Once I updated other notes the same way, things started working again. Maybe a place to look? My old config (all nodes): volume management type mgmt/glusterd option working-directory /var/lib/glusterd option transport-type socket option transport.socket.keepalive-time 10 option transport.socket.keepalive-interval 2 option transport.socket.read-fail-log off option ping-timeout 10 option event-threads 1 option rpc-auth-allow-insecure on # option transport.address-family inet6 # option base-port 49152 end-volume changed to: volume management type mgmt/glusterd option working-directory /var/lib/glusterd option transport-type socket,rdma option transport.socket.keepalive-time 10 option transport.socket.keepalive-interval 2 option transport.socket.read-fail-log off option transport.socket.listen-port 24007 option transport.rdma.listen-port 24008 option ping-timeout 0 option event-threads 1 option rpc-auth-allow-insecure on # option lock-timer 180 # option transport.address-family inet6 # option base-port 49152 option max-port 60999 end-volume the only thing I found in the glusterd logs that looks relevant was (repeated for both of the other nodes in this cluster), so no clue why it happened: [2019-04-03 20:19:16.802638] I [MSGID: 106004] [glusterd-handler.c:6427:__glusterd_peer_rpc_notify] 0-management: Peer (<0ecbf953-681b-448f-9746-d1c1fe7a0978>), in state , has disconnected from glusterd. 
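For reference, the per-node sequence after editing /etc/glusterfs/glusterd.vol boils down to something like the following (just a sketch, adjust as needed):

systemctl restart glusterd
# every peer should report "State: Peer in Cluster (Connected)"
gluster peer status
# bricks should come back online once quorum is regained
gluster volume status
# glusterd should be listening on 24007
ss -tlnp | grep glusterd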
> On Apr 2, 2019, at 4:53 AM, Atin Mukherjee wrote: > > > > On Mon, 1 Apr 2019 at 10:28, Hari Gowtham > wrote: > Comments inline. > > On Mon, Apr 1, 2019 at 5:55 AM Sankarshan Mukhopadhyay > > wrote: > > > > Quite a considerable amount of detail here. Thank you! > > > > On Fri, Mar 29, 2019 at 11:42 AM Hari Gowtham > wrote: > > > > > > Hello Gluster users, > > > > > > As you all aware that glusterfs-6 is out, we would like to inform you > > > that, we have spent a significant amount of time in testing > > > glusterfs-6 in upgrade scenarios. We have done upgrade testing to > > > glusterfs-6 from various releases like 3.12, 4.1 and 5.3. > > > > > > As glusterfs-6 has got in a lot of changes, we wanted to test those portions. > > > There were xlators (and respective options to enable/disable them) > > > added and deprecated in glusterfs-6 from various versions [1]. > > > > > > We had to check the following upgrade scenarios for all such options > > > Identified in [1]: > > > 1) option never enabled and upgraded > > > 2) option enabled and then upgraded > > > 3) option enabled and then disabled and then upgraded > > > > > > We weren't manually able to check all the combinations for all the options. > > > So the options involving enabling and disabling xlators were prioritized. > > > The below are the result of the ones tested. > > > > > > Never enabled and upgraded: > > > checked from 3.12, 4.1, 5.3 to 6 the upgrade works. > > > > > > Enabled and upgraded: > > > Tested for tier which is deprecated, It is not a recommended upgrade. > > > As expected the volume won't be consumable and will have a few more > > > issues as well. > > > Tested with 3.12, 4.1 and 5.3 to 6 upgrade. > > > > > > Enabled, disabled before upgrade. > > > Tested for tier with 3.12 and the upgrade went fine. > > > > > > There is one common issue to note in every upgrade. The node being > > > upgraded is going into disconnected state. You have to flush the iptables > > > and the restart glusterd on all nodes to fix this. > > > > > > > Is this something that is written in the upgrade notes? I do not seem > > to recall, if not, I'll send a PR > > No this wasn't mentioned in the release notes. PRs are welcome. > > > > > > The testing for enabling new options is still pending. The new options > > > won't cause as much issues as the deprecated ones so this was put at > > > the end of the priority list. It would be nice to get contributions > > > for this. > > > > > > > Did the range of tests lead to any new issues? > > Yes. In the first round of testing we found an issue and had to postpone the > release of 6 until the fix was made available. > https://bugzilla.redhat.com/show_bug.cgi?id=1684029 > > And then we tested it again after this patch was made available. > and came across this: > https://bugzilla.redhat.com/show_bug.cgi?id=1694010 > > This isn?t a bug as we found that upgrade worked seamelessly in two different setup. So we have no issues in the upgrade path to glusterfs-6 release. > > > > Have mentioned this in the second mail as to how to over this situation > for now until the fix is available. > > > > > > For the disable testing, tier was used as it covers most of the xlator > > > that was removed. And all of these tests were done on a replica 3 volume. > > > > > > > I'm not sure if the Glusto team is reading this, but it would be > > pertinent to understand if the approach you have taken can be > > converted into a form of automated testing pre-release. > > I don't have an answer for this, have CCed Vijay. 
> He might have an idea. > > > > > > Note: This is only for upgrade testing of the newly added and removed > > > xlators. Does not involve the normal tests for the xlator. > > > > > > If you have any questions, please feel free to reach us. > > > > > > [1] https://docs.google.com/spreadsheets/d/1nh7T5AXaV6kc5KgILOy2pEqjzC3t_R47f1XUXSVFetI/edit?usp=sharing > > > > > > Regards, > > > Hari and Sanju. > > _______________________________________________ > > Gluster-users mailing list > > Gluster-users at gluster.org > > https://lists.gluster.org/mailman/listinfo/gluster-users > > > > -- > Regards, > Hari Gowtham. > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users > -- > --Atin > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From meira at cesup.ufrgs.br Wed Apr 3 22:06:47 2019 From: meira at cesup.ufrgs.br (Lindolfo Meira) Date: Wed, 3 Apr 2019 19:06:47 -0300 (-03) Subject: [Gluster-users] Enabling quotas on gluster Message-ID: Hi folks. Does anyone know how significant is the performance penalty for enabling directory level quotas on a gluster fs, compared to the case with no quotas at all? Lindolfo Meira, MSc Diretor Geral, Centro Nacional de Supercomputa??o Universidade Federal do Rio Grande do Sul +55 (51) 3308-3139 From hunter86_bg at yahoo.com Thu Apr 4 07:11:09 2019 From: hunter86_bg at yahoo.com (Strahil) Date: Thu, 04 Apr 2019 10:11:09 +0300 Subject: [Gluster-users] Gluster 5.5 slower than 3.12.15 Message-ID: Hi Amar, I would like to test Cluster v6 , but as I'm quite new to oVirt - I'm not sure if oVirt <-> Gluster will communicate properly Did anyone test rollback from v6 to v5.5 ? If rollback is possible - I would be happy to give it a try. Best Regards, Strahil NikolovOn Apr 3, 2019 11:35, Amar Tumballi Suryanarayan wrote: > > Strahil, > > With some basic testing, we are noticing the similar behavior too. > > One of the issue we identified was increased n/w usage in 5.x series (being addressed by?https://review.gluster.org/#/c/glusterfs/+/22404/), and there are few other features which write extended attributes which caused some delay. > > We are in the process of publishing some numbers with release-3.12.x, release-5 and release-6 comparison soon. With some numbers we are already seeing release-6 currently is giving really good performance in many configurations, specially for 1x3 replicate volume type. > > While we continue to identify and fix issues in 5.x series, one of the request is to validate release-6.x (6.0 or 6.1 which would happen on April 10th), so you can see the difference in your workload. > > Regards, > Amar > > > > On Wed, Apr 3, 2019 at 5:57 AM Strahil Nikolov wrote: >> >> Hi Community, >> >> I have the feeling that with gluster v5.5 I have poorer performance than it used to be on 3.12.15. Did you observe something like that? >> >> I have a 3 node Hyperconverged Cluster (ovirt + glusterfs with replica 3 arbiter1 volumes) with NFS Ganesha and since I have upgraded to v5 - the issues came up. >> First it was 5.3 notorious experience and now with 5.5 - my sanlock is having problems and higher latency than it used to be. I have switched from NFS-Ganesha to pure FUSE , but the latency problems do not go away. 
>> >> Of course , this is partially due to the consumer hardware, but as the hardware has not changed I was hoping that the performance will remain as is. >> >> So, do you expect 5.5 to perform less than 3.12 ? >> >> Some info: >> Volume Name: engine >> Type: Replicate >> Volume ID: 30ca1cc2-f2f7-4749-9e2e-cee9d7099ded >> Status: Started >> Snapshot Count: 0 >> Number of Bricks: 1 x (2 + 1) = 3 >> Transport-type: tcp >> Bricks: >> Brick1: ovirt1:/gluster_bricks/engine/engine >> Brick2: ovirt2:/gluster_bricks/engine/engine >> Brick3: ovirt3:/gluster_bricks/engine/engine (arbiter) >> Options Reconfigured: >> performance.client-io-threads: off >> nfs.disable: on >> transport.address-family: inet >> performance.quick-read: off >> performance.read-ahead: off >> performance.io-cache: off >> performance.low-prio-threads: 32 >> network.remote-dio: off >> cluster.eager-lock: enable >> cluster.quorum-type: auto >> cluster.server-quorum-type: server >> cluster.data-self-heal-algorithm: full >> cluster.locking-scheme: granular >> cluster.shd-max-threads: 8 >> cluster.shd-wait-qlength: 10000 >> features.shard: on >> user.cifs: off >> storage.owner-uid: 36 >> storage.owner-gid: 36 >> network.ping-timeout: 30 >> performance.strict-o-direct: on >> cluster.granular-entry-heal: enable >> cluster.enable-shared-storage: enable >> >> Network: 1 gbit/s >> >> Filesystem:XFS >> >> Best Regards, >> Strahil Nikolov >> >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users > > > > -- > Amar Tumballi (amarts) -------------- next part -------------- An HTML attachment was scrubbed... URL: From srakonde at redhat.com Thu Apr 4 07:54:53 2019 From: srakonde at redhat.com (Sanju Rakonde) Date: Thu, 4 Apr 2019 13:24:53 +0530 Subject: [Gluster-users] [Gluster-devel] Upgrade testing to gluster 6 In-Reply-To: <13CA0DC5-C248-40B8-B2D4-E6664812303A@onholyground.com> References: <13CA0DC5-C248-40B8-B2D4-E6664812303A@onholyground.com> Message-ID: We don't hit https://bugzilla.redhat.com/show_bug.cgi?id=1694010 while upgrading to glusterfs-6. We tested it in different setups and understood that this issue is seen because of some issue in setup. regarding the issue you have faced, can you please let us know which documentation you have followed for the upgrade? During our testing, we didn't hit any such issue. we would like to understand what went wrong. On Thu, Apr 4, 2019 at 2:08 AM Darrell Budic wrote: > Hari- > > I was upgrading my test cluster from 5.5 to 6 and I hit this bug ( > https://bugzilla.redhat.com/show_bug.cgi?id=1694010) or something > similar. In my case, the workaround did not work, and I was left with a > gluster that had gone into no-quorum mode and stopped all the bricks. > Wasn?t much in the logs either, but I noticed my > /etc/glusterfs/glusterd.vol files were not the same as the newer versions, > so I updated them, restarted glusterd, and suddenly the updated node showed > as peer-in-cluster again. Once I updated other notes the same way, things > started working again. Maybe a place to look? 
> > My old config (all nodes): > volume management > type mgmt/glusterd > option working-directory /var/lib/glusterd > option transport-type socket > option transport.socket.keepalive-time 10 > option transport.socket.keepalive-interval 2 > option transport.socket.read-fail-log off > option ping-timeout 10 > option event-threads 1 > option rpc-auth-allow-insecure on > # option transport.address-family inet6 > # option base-port 49152 > end-volume > > changed to: > volume management > type mgmt/glusterd > option working-directory /var/lib/glusterd > option transport-type socket,rdma > option transport.socket.keepalive-time 10 > option transport.socket.keepalive-interval 2 > option transport.socket.read-fail-log off > option transport.socket.listen-port 24007 > option transport.rdma.listen-port 24008 > option ping-timeout 0 > option event-threads 1 > option rpc-auth-allow-insecure on > # option lock-timer 180 > # option transport.address-family inet6 > # option base-port 49152 > option max-port 60999 > end-volume > > the only thing I found in the glusterd logs that looks relevant was > (repeated for both of the other nodes in this cluster), so no clue why it > happened: > [2019-04-03 20:19:16.802638] I [MSGID: 106004] > [glusterd-handler.c:6427:__glusterd_peer_rpc_notify] 0-management: Peer > (<0ecbf953-681b-448f-9746-d1c1fe7a0978>), in state Cluster>, has disconnected from glusterd. > > > On Apr 2, 2019, at 4:53 AM, Atin Mukherjee > wrote: > > > > On Mon, 1 Apr 2019 at 10:28, Hari Gowtham wrote: > >> Comments inline. >> >> On Mon, Apr 1, 2019 at 5:55 AM Sankarshan Mukhopadhyay >> wrote: >> > >> > Quite a considerable amount of detail here. Thank you! >> > >> > On Fri, Mar 29, 2019 at 11:42 AM Hari Gowtham >> wrote: >> > > >> > > Hello Gluster users, >> > > >> > > As you all aware that glusterfs-6 is out, we would like to inform you >> > > that, we have spent a significant amount of time in testing >> > > glusterfs-6 in upgrade scenarios. We have done upgrade testing to >> > > glusterfs-6 from various releases like 3.12, 4.1 and 5.3. >> > > >> > > As glusterfs-6 has got in a lot of changes, we wanted to test those >> portions. >> > > There were xlators (and respective options to enable/disable them) >> > > added and deprecated in glusterfs-6 from various versions [1]. >> > > >> > > We had to check the following upgrade scenarios for all such options >> > > Identified in [1]: >> > > 1) option never enabled and upgraded >> > > 2) option enabled and then upgraded >> > > 3) option enabled and then disabled and then upgraded >> > > >> > > We weren't manually able to check all the combinations for all the >> options. >> > > So the options involving enabling and disabling xlators were >> prioritized. >> > > The below are the result of the ones tested. >> > > >> > > Never enabled and upgraded: >> > > checked from 3.12, 4.1, 5.3 to 6 the upgrade works. >> > > >> > > Enabled and upgraded: >> > > Tested for tier which is deprecated, It is not a recommended upgrade. >> > > As expected the volume won't be consumable and will have a few more >> > > issues as well. >> > > Tested with 3.12, 4.1 and 5.3 to 6 upgrade. >> > > >> > > Enabled, disabled before upgrade. >> > > Tested for tier with 3.12 and the upgrade went fine. >> > > >> > > There is one common issue to note in every upgrade. The node being >> > > upgraded is going into disconnected state. You have to flush the >> iptables >> > > and the restart glusterd on all nodes to fix this. 
>> > > >> > >> > Is this something that is written in the upgrade notes? I do not seem >> > to recall, if not, I'll send a PR >> >> No this wasn't mentioned in the release notes. PRs are welcome. >> >> > >> > > The testing for enabling new options is still pending. The new options >> > > won't cause as much issues as the deprecated ones so this was put at >> > > the end of the priority list. It would be nice to get contributions >> > > for this. >> > > >> > >> > Did the range of tests lead to any new issues? >> >> Yes. In the first round of testing we found an issue and had to postpone >> the >> release of 6 until the fix was made available. >> https://bugzilla.redhat.com/show_bug.cgi?id=1684029 >> >> And then we tested it again after this patch was made available. >> and came across this: >> https://bugzilla.redhat.com/show_bug.cgi?id=1694010 > > > This isn?t a bug as we found that upgrade worked seamelessly in two > different setup. So we have no issues in the upgrade path to glusterfs-6 > release. > > >> >> Have mentioned this in the second mail as to how to over this situation >> for now until the fix is available. >> >> > >> > > For the disable testing, tier was used as it covers most of the xlator >> > > that was removed. And all of these tests were done on a replica 3 >> volume. >> > > >> > >> > I'm not sure if the Glusto team is reading this, but it would be >> > pertinent to understand if the approach you have taken can be >> > converted into a form of automated testing pre-release. >> >> I don't have an answer for this, have CCed Vijay. >> He might have an idea. >> >> > >> > > Note: This is only for upgrade testing of the newly added and removed >> > > xlators. Does not involve the normal tests for the xlator. >> > > >> > > If you have any questions, please feel free to reach us. >> > > >> > > [1] >> https://docs.google.com/spreadsheets/d/1nh7T5AXaV6kc5KgILOy2pEqjzC3t_R47f1XUXSVFetI/edit?usp=sharing >> > > >> > > Regards, >> > > Hari and Sanju. >> > _______________________________________________ >> > Gluster-users mailing list >> > Gluster-users at gluster.org >> > https://lists.gluster.org/mailman/listinfo/gluster-users >> >> >> >> -- >> Regards, >> Hari Gowtham. >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users >> > -- > --Atin > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -- Thanks, Sanju -------------- next part -------------- An HTML attachment was scrubbed... URL: From hgowtham at redhat.com Thu Apr 4 09:41:19 2019 From: hgowtham at redhat.com (Hari Gowtham) Date: Thu, 4 Apr 2019 15:11:19 +0530 Subject: [Gluster-users] Enabling quotas on gluster In-Reply-To: References: Message-ID: Hi, The performance hit that quota causes depended on a number of factors like: 1) the number of files, 2) the depth of the directories in the FS 3) the breadth of the directories in the FS 4) the number of bricks. These are the main contributions to the performance hit. If the volume is of lesser size then quota should work fine. Let us know more about your use case to help you better. Note: gluster quota is not being actively worked on. 
On Thu, Apr 4, 2019 at 3:45 AM Lindolfo Meira wrote: > > Hi folks. > > Does anyone know how significant is the performance penalty for enabling > directory level quotas on a gluster fs, compared to the case with no > quotas at all? > > > Lindolfo Meira, MSc > Diretor Geral, Centro Nacional de Supercomputa??o > Universidade Federal do Rio Grande do Sul > +55 (51) 3308-3139_______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -- Regards, Hari Gowtham. From pascal.suter at dalco.ch Thu Apr 4 10:03:19 2019 From: pascal.suter at dalco.ch (Pascal Suter) Date: Thu, 4 Apr 2019 12:03:19 +0200 Subject: [Gluster-users] performance - what can I expect In-Reply-To: References: Message-ID: <381efa03-78b3-e244-9f52-054b357b5d57@dalco.ch> I just noticed i left the most important parameters out :) here's the write command with filesize and recordsize in it as well :) ./iozone -i 0 -t 1 -F /mnt/gluster/storage/thread1 -+n -c -C -e -I -w -+S 0 -s 200G -r 16384k also i ran the benchmark without direct_io which resulted in an even worse performance. i also tried to mount the gluster volume via nfs-ganesha which further reduced throughput down to about 450MB/s if i run the iozone benchmark with 3 threads writing to all three bricks directly (from the xfs filesystem) i get throughputs of around 6GB/s .. if I run the same benchmark through gluster mounted locally using the fuse client and with enough threads so that each brick gets at least one file written to it, i end up seing throughputs around 1.5GB/s .. that's a 4x decrease in performance. at it actually is the same if i run the benchmark with less threads and files only get written to two out of three bricks. cpu load on the server is around 25% by the way, nicely distributed across all available cores. i can't believe that gluster should really be so slow and everybody is just happily using it. any hints on what i'm doing wrong are very welcome. i'm using gluster 6.0 by the way. regards Pascal On 03.04.19 12:28, Pascal Suter wrote: > Hi all > > I am currently testing gluster on a single server. I have three > bricks, each a hardware RAID6 volume with thin provisioned LVM that > was aligned to the RAID and then formatted with xfs. > > i've created a distributed volume so that entire files get distributed > across my three bricks. > > first I ran a iozone benchmark across each brick testing the read and > write perofrmance of a single large file per brick > > i then mounted my gluster volume locally and ran another iozone run > with the same parameters writing a single file. the file went to brick > 1 which, when used driectly, would write with 2.3GB/s and read with > 1.5GB/s. however, through gluster i got only 800MB/s read and 750MB/s > write throughput > > another run with two processes each writing a file, where one file > went to the first brick and the other file to the second brick (which > by itself when directly accessed wrote at 2.8GB/s and read at 2.7GB/s) > resulted in 1.2GB/s of aggregated write and also aggregated read > throughput. > > Is this a normal performance i can expect out of a glusterfs or is it > worth tuning in order to really get closer to the actual brick > filesystem performance? > > here are the iozone commands i use for writing and reading.. 
note that > i am using directIO in order to make sure i don't get fooled by cache :) > > ./iozone -i 0 -t 1 -F /mnt/brick${b}/thread1 -+n -c -C -e -I -w -+S 0 > -s $filesize -r $recordsize > iozone-brick${b}-write.txt > > ./iozone -i 1 -t 1 -F /mnt/brick${b}/thread1 -+n -c -C -e -I -w -+S 0 > -s $filesize -r $recordsize > iozone-brick${b}-read.txt > > cheers > > Pascal > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users From bandabasotti at gmail.com Thu Apr 4 10:15:01 2019 From: bandabasotti at gmail.com (banda bassotti) Date: Thu, 4 Apr 2019 12:15:01 +0200 Subject: [Gluster-users] thin arbiter setup Message-ID: Hi all, is there a detailed guide on how to configure a two-node cluster with a thin arbiter? I tried to follow the guide: https://docs.gluster.org/en/latest/Administrator%20Guide/Thin-Arbiter-Volumes/#setting-up-thin-arbiter-volume but it doesn't work. I'm using debian stretch and gluster 6 repository. thnx a lot. banda. -------------- next part -------------- An HTML attachment was scrubbed... URL: From aspandey at redhat.com Thu Apr 4 11:37:50 2019 From: aspandey at redhat.com (Ashish Pandey) Date: Thu, 4 Apr 2019 07:37:50 -0400 (EDT) Subject: [Gluster-users] thin arbiter setup In-Reply-To: References: Message-ID: <1634249416.10967979.1554377870313.JavaMail.zimbra@redhat.com> Hi, Currently, thin-arbiter can be setup using GD2. glustercli command is provided by GD2 only. Have you installed and started GD2 first? Could you please mention in which step you faced issue? --- Ashish ----- Original Message ----- From: "banda bassotti" To: gluster-users at gluster.org Sent: Thursday, April 4, 2019 3:45:01 PM Subject: [Gluster-users] thin arbiter setup Hi all, is there a detailed guide on how to configure a two-node cluster with a thin arbiter? I tried to follow the guide: https://docs.gluster.org/en/latest/Administrator%20Guide/Thin-Arbiter-Volumes/#setting-up-thin-arbiter-volume but it doesn't work. I'm using debian stretch and gluster 6 repository. thnx a lot. banda. _______________________________________________ Gluster-users mailing list Gluster-users at gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From hunter86_bg at yahoo.com Thu Apr 4 13:13:01 2019 From: hunter86_bg at yahoo.com (Strahil Nikolov) Date: Thu, 4 Apr 2019 13:13:01 +0000 (UTC) Subject: [Gluster-users] thin arbiter setup In-Reply-To: References: Message-ID: <1610020491.16495722.1554383581498@mail.yahoo.com> Hi Banda, As far as I know (mentioned here in the mail list) , you need to use GlusterD2 and not the standard one . Best Regards,Strahil Nikolov ? ?????????, 4 ????? 2019 ?., 13:19:51 ?. ???????+3, banda bassotti ??????: Hi all, is there a detailed guide on how to configure a two-node cluster with a thin arbiter? I tried to follow the guide:? https://docs.gluster.org/en/latest/Administrator%20Guide/Thin-Arbiter-Volumes/#setting-up-thin-arbiter-volume?? but it doesn't work.? I'm using debian stretch and gluster 6 repository. thnx a lot. banda._______________________________________________ Gluster-users mailing list Gluster-users at gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From hunter86_bg at yahoo.com Thu Apr 4 13:26:56 2019 From: hunter86_bg at yahoo.com (Strahil Nikolov) Date: Thu, 4 Apr 2019 13:26:56 +0000 (UTC) Subject: [Gluster-users] thin arbiter setup In-Reply-To: <1634249416.10967979.1554377870313.JavaMail.zimbra@redhat.com> References: <1634249416.10967979.1554377870313.JavaMail.zimbra@redhat.com> Message-ID: <276872486.16505307.1554384416533@mail.yahoo.com> I have proposed a change in the Docs about thin arbiters, as it is quite deceptive. Best Regards,Strahil Nikolov ? ?????????, 4 ????? 2019 ?., 14:38:16 ?. ???????+3, Ashish Pandey ??????: Hi, Currently, thin-arbiter can be setup using GD2. glustercli command is provided by GD2 only. Have you installed and started GD2 first? Could you please mention in which step you faced issue? --- Ashish From: "banda bassotti" To: gluster-users at gluster.org Sent: Thursday, April 4, 2019 3:45:01 PM Subject: [Gluster-users] thin arbiter setup Hi all, is there a detailed guide on how to configure a two-node cluster with a thin arbiter? I tried to follow the guide:? https://docs.gluster.org/en/latest/Administrator%20Guide/Thin-Arbiter-Volumes/#setting-up-thin-arbiter-volume?? but it doesn't work.? I'm using debian stretch and gluster 6 repository. thnx a lot. banda. _______________________________________________ Gluster-users mailing list Gluster-users at gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users _______________________________________________ Gluster-users mailing list Gluster-users at gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From budic at onholyground.com Thu Apr 4 15:56:33 2019 From: budic at onholyground.com (Darrell Budic) Date: Thu, 4 Apr 2019 10:56:33 -0500 Subject: [Gluster-users] [Gluster-devel] Upgrade testing to gluster 6 In-Reply-To: References: <13CA0DC5-C248-40B8-B2D4-E6664812303A@onholyground.com> Message-ID: <493E624F-FC40-4242-BF9D-BAD0385B2DA5@onholyground.com> I didn?t follow any specific documents, just a generic rolling upgrade one node at a time. Once the first node didn?t reconnect, I tried to follow the workaround in the bug during the upgrade. Basic procedure was: - take 3 nodes that were initially installed with 3.12.x (forget which, but low number) and had been upgraded directly to 5.5 from 3.12.15 - op-version was 50400 - on node A: - yum install centos-release-gluster6 - yum upgrade (was some ovirt cockpit components, gluster, and a lib or two this time), hit yes - discover glusterd was dead - systemctl restart glusterd - no peer connections, try iptables -F; systemctl restart glusterd, no change - following the workaround in the bug, try iptables -F & restart glusterd on other 2 nodes, no effect - nodes B & C were still connected to each other and all bricks were fine at this point - try upgrading other 2 nodes and restarting gluster, no effect (iptables still empty) - lost quota here, so all bricks went offline - read logs, not finding much, but looked at glusterd.vol and compared to new versions - updated glusterd.vol on A and restarted glusterd - A doesn?t show any connected peers, but both other nodes show A as connected - update glusterd.vol on B & C, restart glusterd - all nodes show connected and volumes are active and healing The only odd thing in my process was that node A did not have any active bricks on it at the time of the upgrade. 
It doesn?t seem like this mattered since B & C showed the same symptoms between themselves while being upgraded, but I don?t know. The only log entry that referenced anything about peer connections is included below already. Looks like it was related to my glusterd settings, since that?s what fixed it for me. Unfortunately, I don?t have the bandwidth or the systems to test different versions of that specifically, but maybe you guys can on some test resources? Otherwise, I?ve got another cluster (my production one!) that?s midway through the upgrade from 3.12.15 -> 5.5. I paused when I started getting multiple brick processes on the two nodes that had gone to 5.5 already. I think I?m going to jump the last node right to 6 to try and avoid that mess, and it has the same glusterd.vol settings. I?ll try and capture it?s logs during the upgrade and see if there?s any new info, or if it has the same issues as this group did. -Darrell > On Apr 4, 2019, at 2:54 AM, Sanju Rakonde wrote: > > We don't hit https://bugzilla.redhat.com/show_bug.cgi?id=1694010 while upgrading to glusterfs-6. We tested it in different setups and understood that this issue is seen because of some issue in setup. > > regarding the issue you have faced, can you please let us know which documentation you have followed for the upgrade? During our testing, we didn't hit any such issue. we would like to understand what went wrong. > > On Thu, Apr 4, 2019 at 2:08 AM Darrell Budic > wrote: > Hari- > > I was upgrading my test cluster from 5.5 to 6 and I hit this bug (https://bugzilla.redhat.com/show_bug.cgi?id=1694010 ) or something similar. In my case, the workaround did not work, and I was left with a gluster that had gone into no-quorum mode and stopped all the bricks. Wasn?t much in the logs either, but I noticed my /etc/glusterfs/glusterd.vol files were not the same as the newer versions, so I updated them, restarted glusterd, and suddenly the updated node showed as peer-in-cluster again. Once I updated other notes the same way, things started working again. Maybe a place to look? > > My old config (all nodes): > volume management > type mgmt/glusterd > option working-directory /var/lib/glusterd > option transport-type socket > option transport.socket.keepalive-time 10 > option transport.socket.keepalive-interval 2 > option transport.socket.read-fail-log off > option ping-timeout 10 > option event-threads 1 > option rpc-auth-allow-insecure on > # option transport.address-family inet6 > # option base-port 49152 > end-volume > > changed to: > volume management > type mgmt/glusterd > option working-directory /var/lib/glusterd > option transport-type socket,rdma > option transport.socket.keepalive-time 10 > option transport.socket.keepalive-interval 2 > option transport.socket.read-fail-log off > option transport.socket.listen-port 24007 > option transport.rdma.listen-port 24008 > option ping-timeout 0 > option event-threads 1 > option rpc-auth-allow-insecure on > # option lock-timer 180 > # option transport.address-family inet6 > # option base-port 49152 > option max-port 60999 > end-volume > > the only thing I found in the glusterd logs that looks relevant was (repeated for both of the other nodes in this cluster), so no clue why it happened: > [2019-04-03 20:19:16.802638] I [MSGID: 106004] [glusterd-handler.c:6427:__glusterd_peer_rpc_notify] 0-management: Peer (<0ecbf953-681b-448f-9746-d1c1fe7a0978>), in state , has disconnected from glusterd. 
> > >> On Apr 2, 2019, at 4:53 AM, Atin Mukherjee > wrote: >> >> >> >> On Mon, 1 Apr 2019 at 10:28, Hari Gowtham > wrote: >> Comments inline. >> >> On Mon, Apr 1, 2019 at 5:55 AM Sankarshan Mukhopadhyay >> > wrote: >> > >> > Quite a considerable amount of detail here. Thank you! >> > >> > On Fri, Mar 29, 2019 at 11:42 AM Hari Gowtham > wrote: >> > > >> > > Hello Gluster users, >> > > >> > > As you all aware that glusterfs-6 is out, we would like to inform you >> > > that, we have spent a significant amount of time in testing >> > > glusterfs-6 in upgrade scenarios. We have done upgrade testing to >> > > glusterfs-6 from various releases like 3.12, 4.1 and 5.3. >> > > >> > > As glusterfs-6 has got in a lot of changes, we wanted to test those portions. >> > > There were xlators (and respective options to enable/disable them) >> > > added and deprecated in glusterfs-6 from various versions [1]. >> > > >> > > We had to check the following upgrade scenarios for all such options >> > > Identified in [1]: >> > > 1) option never enabled and upgraded >> > > 2) option enabled and then upgraded >> > > 3) option enabled and then disabled and then upgraded >> > > >> > > We weren't manually able to check all the combinations for all the options. >> > > So the options involving enabling and disabling xlators were prioritized. >> > > The below are the result of the ones tested. >> > > >> > > Never enabled and upgraded: >> > > checked from 3.12, 4.1, 5.3 to 6 the upgrade works. >> > > >> > > Enabled and upgraded: >> > > Tested for tier which is deprecated, It is not a recommended upgrade. >> > > As expected the volume won't be consumable and will have a few more >> > > issues as well. >> > > Tested with 3.12, 4.1 and 5.3 to 6 upgrade. >> > > >> > > Enabled, disabled before upgrade. >> > > Tested for tier with 3.12 and the upgrade went fine. >> > > >> > > There is one common issue to note in every upgrade. The node being >> > > upgraded is going into disconnected state. You have to flush the iptables >> > > and the restart glusterd on all nodes to fix this. >> > > >> > >> > Is this something that is written in the upgrade notes? I do not seem >> > to recall, if not, I'll send a PR >> >> No this wasn't mentioned in the release notes. PRs are welcome. >> >> > >> > > The testing for enabling new options is still pending. The new options >> > > won't cause as much issues as the deprecated ones so this was put at >> > > the end of the priority list. It would be nice to get contributions >> > > for this. >> > > >> > >> > Did the range of tests lead to any new issues? >> >> Yes. In the first round of testing we found an issue and had to postpone the >> release of 6 until the fix was made available. >> https://bugzilla.redhat.com/show_bug.cgi?id=1684029 >> >> And then we tested it again after this patch was made available. >> and came across this: >> https://bugzilla.redhat.com/show_bug.cgi?id=1694010 >> >> This isn?t a bug as we found that upgrade worked seamelessly in two different setup. So we have no issues in the upgrade path to glusterfs-6 release. >> >> >> >> Have mentioned this in the second mail as to how to over this situation >> for now until the fix is available. >> >> > >> > > For the disable testing, tier was used as it covers most of the xlator >> > > that was removed. And all of these tests were done on a replica 3 volume. 
>> > > >> > >> > I'm not sure if the Glusto team is reading this, but it would be >> > pertinent to understand if the approach you have taken can be >> > converted into a form of automated testing pre-release. >> >> I don't have an answer for this, have CCed Vijay. >> He might have an idea. >> >> > >> > > Note: This is only for upgrade testing of the newly added and removed >> > > xlators. Does not involve the normal tests for the xlator. >> > > >> > > If you have any questions, please feel free to reach us. >> > > >> > > [1] https://docs.google.com/spreadsheets/d/1nh7T5AXaV6kc5KgILOy2pEqjzC3t_R47f1XUXSVFetI/edit?usp=sharing >> > > >> > > Regards, >> > > Hari and Sanju. >> > _______________________________________________ >> > Gluster-users mailing list >> > Gluster-users at gluster.org >> > https://lists.gluster.org/mailman/listinfo/gluster-users >> >> >> >> -- >> Regards, >> Hari Gowtham. >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users >> -- >> --Atin >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users > > -- > Thanks, > Sanju -------------- next part -------------- An HTML attachment was scrubbed... URL: From amukherj at redhat.com Thu Apr 4 16:25:05 2019 From: amukherj at redhat.com (Atin Mukherjee) Date: Thu, 4 Apr 2019 21:55:05 +0530 Subject: [Gluster-users] [Gluster-devel] Upgrade testing to gluster 6 In-Reply-To: <493E624F-FC40-4242-BF9D-BAD0385B2DA5@onholyground.com> References: <13CA0DC5-C248-40B8-B2D4-E6664812303A@onholyground.com> <493E624F-FC40-4242-BF9D-BAD0385B2DA5@onholyground.com> Message-ID: Darell, I fully understand that you can't reproduce it and you don't have bandwidth to test it again, but would you be able to send us the glusterd log from all the nodes when this happened. We would like to go through the logs and get back. I would particularly like to see if something has gone wrong with transport.socket.listen-port option. But with out the log files we can't find out anything. Hope you understand it. On Thu, Apr 4, 2019 at 9:27 PM Darrell Budic wrote: > I didn?t follow any specific documents, just a generic rolling upgrade one > node at a time. Once the first node didn?t reconnect, I tried to follow the > workaround in the bug during the upgrade. 
Basic procedure was: > > - take 3 nodes that were initially installed with 3.12.x (forget which, > but low number) and had been upgraded directly to 5.5 from 3.12.15 > - op-version was 50400 > - on node A: > - yum install centos-release-gluster6 > - yum upgrade (was some ovirt cockpit components, gluster, and a lib or > two this time), hit yes > - discover glusterd was dead > - systemctl restart glusterd > - no peer connections, try iptables -F; systemctl restart glusterd, no > change > - following the workaround in the bug, try iptables -F & restart glusterd > on other 2 nodes, no effect > - nodes B & C were still connected to each other and all bricks were > fine at this point > - try upgrading other 2 nodes and restarting gluster, no effect (iptables > still empty) > - lost quota here, so all bricks went offline > - read logs, not finding much, but looked at glusterd.vol and compared to > new versions > - updated glusterd.vol on A and restarted glusterd > - A doesn?t show any connected peers, but both other nodes show A as > connected > - update glusterd.vol on B & C, restart glusterd > - all nodes show connected and volumes are active and healing > > The only odd thing in my process was that node A did not have any active > bricks on it at the time of the upgrade. It doesn?t seem like this mattered > since B & C showed the same symptoms between themselves while being > upgraded, but I don?t know. The only log entry that referenced anything > about peer connections is included below already. > > Looks like it was related to my glusterd settings, since that?s what fixed > it for me. Unfortunately, I don?t have the bandwidth or the systems to test > different versions of that specifically, but maybe you guys can on some > test resources? Otherwise, I?ve got another cluster (my production one!) > that?s midway through the upgrade from 3.12.15 -> 5.5. I paused when I > started getting multiple brick processes on the two nodes that had gone to > 5.5 already. I think I?m going to jump the last node right to 6 to try and > avoid that mess, and it has the same glusterd.vol settings. I?ll try and > capture it?s logs during the upgrade and see if there?s any new info, or if > it has the same issues as this group did. > > -Darrell > > On Apr 4, 2019, at 2:54 AM, Sanju Rakonde wrote: > > We don't hit https://bugzilla.redhat.com/show_bug.cgi?id=1694010 while > upgrading to glusterfs-6. We tested it in different setups and understood > that this issue is seen because of some issue in setup. > > regarding the issue you have faced, can you please let us know which > documentation you have followed for the upgrade? During our testing, we > didn't hit any such issue. we would like to understand what went wrong. > > On Thu, Apr 4, 2019 at 2:08 AM Darrell Budic > wrote: > >> Hari- >> >> I was upgrading my test cluster from 5.5 to 6 and I hit this bug ( >> https://bugzilla.redhat.com/show_bug.cgi?id=1694010) or something >> similar. In my case, the workaround did not work, and I was left with a >> gluster that had gone into no-quorum mode and stopped all the bricks. >> Wasn?t much in the logs either, but I noticed my >> /etc/glusterfs/glusterd.vol files were not the same as the newer versions, >> so I updated them, restarted glusterd, and suddenly the updated node showed >> as peer-in-cluster again. Once I updated other notes the same way, things >> started working again. Maybe a place to look? 
>> >> My old config (all nodes): >> volume management >> type mgmt/glusterd >> option working-directory /var/lib/glusterd >> option transport-type socket >> option transport.socket.keepalive-time 10 >> option transport.socket.keepalive-interval 2 >> option transport.socket.read-fail-log off >> option ping-timeout 10 >> option event-threads 1 >> option rpc-auth-allow-insecure on >> # option transport.address-family inet6 >> # option base-port 49152 >> end-volume >> >> changed to: >> volume management >> type mgmt/glusterd >> option working-directory /var/lib/glusterd >> option transport-type socket,rdma >> option transport.socket.keepalive-time 10 >> option transport.socket.keepalive-interval 2 >> option transport.socket.read-fail-log off >> option transport.socket.listen-port 24007 >> option transport.rdma.listen-port 24008 >> option ping-timeout 0 >> option event-threads 1 >> option rpc-auth-allow-insecure on >> # option lock-timer 180 >> # option transport.address-family inet6 >> # option base-port 49152 >> option max-port 60999 >> end-volume >> >> the only thing I found in the glusterd logs that looks relevant was >> (repeated for both of the other nodes in this cluster), so no clue why it >> happened: >> [2019-04-03 20:19:16.802638] I [MSGID: 106004] >> [glusterd-handler.c:6427:__glusterd_peer_rpc_notify] 0-management: Peer >> (<0ecbf953-681b-448f-9746-d1c1fe7a0978>), in state > Cluster>, has disconnected from glusterd. >> >> >> On Apr 2, 2019, at 4:53 AM, Atin Mukherjee >> wrote: >> >> >> >> On Mon, 1 Apr 2019 at 10:28, Hari Gowtham wrote: >> >>> Comments inline. >>> >>> On Mon, Apr 1, 2019 at 5:55 AM Sankarshan Mukhopadhyay >>> wrote: >>> > >>> > Quite a considerable amount of detail here. Thank you! >>> > >>> > On Fri, Mar 29, 2019 at 11:42 AM Hari Gowtham >>> wrote: >>> > > >>> > > Hello Gluster users, >>> > > >>> > > As you all aware that glusterfs-6 is out, we would like to inform you >>> > > that, we have spent a significant amount of time in testing >>> > > glusterfs-6 in upgrade scenarios. We have done upgrade testing to >>> > > glusterfs-6 from various releases like 3.12, 4.1 and 5.3. >>> > > >>> > > As glusterfs-6 has got in a lot of changes, we wanted to test those >>> portions. >>> > > There were xlators (and respective options to enable/disable them) >>> > > added and deprecated in glusterfs-6 from various versions [1]. >>> > > >>> > > We had to check the following upgrade scenarios for all such options >>> > > Identified in [1]: >>> > > 1) option never enabled and upgraded >>> > > 2) option enabled and then upgraded >>> > > 3) option enabled and then disabled and then upgraded >>> > > >>> > > We weren't manually able to check all the combinations for all the >>> options. >>> > > So the options involving enabling and disabling xlators were >>> prioritized. >>> > > The below are the result of the ones tested. >>> > > >>> > > Never enabled and upgraded: >>> > > checked from 3.12, 4.1, 5.3 to 6 the upgrade works. >>> > > >>> > > Enabled and upgraded: >>> > > Tested for tier which is deprecated, It is not a recommended upgrade. >>> > > As expected the volume won't be consumable and will have a few more >>> > > issues as well. >>> > > Tested with 3.12, 4.1 and 5.3 to 6 upgrade. >>> > > >>> > > Enabled, disabled before upgrade. >>> > > Tested for tier with 3.12 and the upgrade went fine. >>> > > >>> > > There is one common issue to note in every upgrade. The node being >>> > > upgraded is going into disconnected state. 
You have to flush the >>> iptables >>> > > and the restart glusterd on all nodes to fix this. >>> > > >>> > >>> > Is this something that is written in the upgrade notes? I do not seem >>> > to recall, if not, I'll send a PR >>> >>> No this wasn't mentioned in the release notes. PRs are welcome. >>> >>> > >>> > > The testing for enabling new options is still pending. The new >>> options >>> > > won't cause as much issues as the deprecated ones so this was put at >>> > > the end of the priority list. It would be nice to get contributions >>> > > for this. >>> > > >>> > >>> > Did the range of tests lead to any new issues? >>> >>> Yes. In the first round of testing we found an issue and had to postpone >>> the >>> release of 6 until the fix was made available. >>> https://bugzilla.redhat.com/show_bug.cgi?id=1684029 >>> >>> And then we tested it again after this patch was made available. >>> and came across this: >>> https://bugzilla.redhat.com/show_bug.cgi?id=1694010 >> >> >> This isn?t a bug as we found that upgrade worked seamelessly in two >> different setup. So we have no issues in the upgrade path to glusterfs-6 >> release. >> >> >>> >>> Have mentioned this in the second mail as to how to over this situation >>> for now until the fix is available. >>> >>> > >>> > > For the disable testing, tier was used as it covers most of the >>> xlator >>> > > that was removed. And all of these tests were done on a replica 3 >>> volume. >>> > > >>> > >>> > I'm not sure if the Glusto team is reading this, but it would be >>> > pertinent to understand if the approach you have taken can be >>> > converted into a form of automated testing pre-release. >>> >>> I don't have an answer for this, have CCed Vijay. >>> He might have an idea. >>> >>> > >>> > > Note: This is only for upgrade testing of the newly added and removed >>> > > xlators. Does not involve the normal tests for the xlator. >>> > > >>> > > If you have any questions, please feel free to reach us. >>> > > >>> > > [1] >>> https://docs.google.com/spreadsheets/d/1nh7T5AXaV6kc5KgILOy2pEqjzC3t_R47f1XUXSVFetI/edit?usp=sharing >>> > > >>> > > Regards, >>> > > Hari and Sanju. >>> > _______________________________________________ >>> > Gluster-users mailing list >>> > Gluster-users at gluster.org >>> > https://lists.gluster.org/mailman/listinfo/gluster-users >>> >>> >>> >>> -- >>> Regards, >>> Hari Gowtham. >>> _______________________________________________ >>> Gluster-users mailing list >>> Gluster-users at gluster.org >>> https://lists.gluster.org/mailman/listinfo/gluster-users >>> >> -- >> --Atin >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users >> >> >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users > > > > -- > Thanks, > Sanju > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From budic at onholyground.com Thu Apr 4 16:40:38 2019 From: budic at onholyground.com (Darrell Budic) Date: Thu, 4 Apr 2019 11:40:38 -0500 Subject: [Gluster-users] [Gluster-devel] Upgrade testing to gluster 6 In-Reply-To: References: <13CA0DC5-C248-40B8-B2D4-E6664812303A@onholyground.com> <493E624F-FC40-4242-BF9D-BAD0385B2DA5@onholyground.com> Message-ID: <0F2C3FE9-E190-47D4-A043-319F086447E9@onholyground.com> Just the glusterd.log from each node, right? > On Apr 4, 2019, at 11:25 AM, Atin Mukherjee wrote: > > Darell, > > I fully understand that you can't reproduce it and you don't have bandwidth to test it again, but would you be able to send us the glusterd log from all the nodes when this happened. We would like to go through the logs and get back. I would particularly like to see if something has gone wrong with transport.socket.listen-port option. But with out the log files we can't find out anything. Hope you understand it. > > On Thu, Apr 4, 2019 at 9:27 PM Darrell Budic > wrote: > I didn?t follow any specific documents, just a generic rolling upgrade one node at a time. Once the first node didn?t reconnect, I tried to follow the workaround in the bug during the upgrade. Basic procedure was: > > - take 3 nodes that were initially installed with 3.12.x (forget which, but low number) and had been upgraded directly to 5.5 from 3.12.15 > - op-version was 50400 > - on node A: > - yum install centos-release-gluster6 > - yum upgrade (was some ovirt cockpit components, gluster, and a lib or two this time), hit yes > - discover glusterd was dead > - systemctl restart glusterd > - no peer connections, try iptables -F; systemctl restart glusterd, no change > - following the workaround in the bug, try iptables -F & restart glusterd on other 2 nodes, no effect > - nodes B & C were still connected to each other and all bricks were fine at this point > - try upgrading other 2 nodes and restarting gluster, no effect (iptables still empty) > - lost quota here, so all bricks went offline > - read logs, not finding much, but looked at glusterd.vol and compared to new versions > - updated glusterd.vol on A and restarted glusterd > - A doesn?t show any connected peers, but both other nodes show A as connected > - update glusterd.vol on B & C, restart glusterd > - all nodes show connected and volumes are active and healing > > The only odd thing in my process was that node A did not have any active bricks on it at the time of the upgrade. It doesn?t seem like this mattered since B & C showed the same symptoms between themselves while being upgraded, but I don?t know. The only log entry that referenced anything about peer connections is included below already. > > Looks like it was related to my glusterd settings, since that?s what fixed it for me. Unfortunately, I don?t have the bandwidth or the systems to test different versions of that specifically, but maybe you guys can on some test resources? Otherwise, I?ve got another cluster (my production one!) that?s midway through the upgrade from 3.12.15 -> 5.5. I paused when I started getting multiple brick processes on the two nodes that had gone to 5.5 already. I think I?m going to jump the last node right to 6 to try and avoid that mess, and it has the same glusterd.vol settings. I?ll try and capture it?s logs during the upgrade and see if there?s any new info, or if it has the same issues as this group did. 
> > -Darrell > >> On Apr 4, 2019, at 2:54 AM, Sanju Rakonde > wrote: >> >> We don't hit https://bugzilla.redhat.com/show_bug.cgi?id=1694010 while upgrading to glusterfs-6. We tested it in different setups and understood that this issue is seen because of some issue in setup. >> >> regarding the issue you have faced, can you please let us know which documentation you have followed for the upgrade? During our testing, we didn't hit any such issue. we would like to understand what went wrong. >> >> On Thu, Apr 4, 2019 at 2:08 AM Darrell Budic > wrote: >> Hari- >> >> I was upgrading my test cluster from 5.5 to 6 and I hit this bug (https://bugzilla.redhat.com/show_bug.cgi?id=1694010 ) or something similar. In my case, the workaround did not work, and I was left with a gluster that had gone into no-quorum mode and stopped all the bricks. Wasn?t much in the logs either, but I noticed my /etc/glusterfs/glusterd.vol files were not the same as the newer versions, so I updated them, restarted glusterd, and suddenly the updated node showed as peer-in-cluster again. Once I updated other notes the same way, things started working again. Maybe a place to look? >> >> My old config (all nodes): >> volume management >> type mgmt/glusterd >> option working-directory /var/lib/glusterd >> option transport-type socket >> option transport.socket.keepalive-time 10 >> option transport.socket.keepalive-interval 2 >> option transport.socket.read-fail-log off >> option ping-timeout 10 >> option event-threads 1 >> option rpc-auth-allow-insecure on >> # option transport.address-family inet6 >> # option base-port 49152 >> end-volume >> >> changed to: >> volume management >> type mgmt/glusterd >> option working-directory /var/lib/glusterd >> option transport-type socket,rdma >> option transport.socket.keepalive-time 10 >> option transport.socket.keepalive-interval 2 >> option transport.socket.read-fail-log off >> option transport.socket.listen-port 24007 >> option transport.rdma.listen-port 24008 >> option ping-timeout 0 >> option event-threads 1 >> option rpc-auth-allow-insecure on >> # option lock-timer 180 >> # option transport.address-family inet6 >> # option base-port 49152 >> option max-port 60999 >> end-volume >> >> the only thing I found in the glusterd logs that looks relevant was (repeated for both of the other nodes in this cluster), so no clue why it happened: >> [2019-04-03 20:19:16.802638] I [MSGID: 106004] [glusterd-handler.c:6427:__glusterd_peer_rpc_notify] 0-management: Peer (<0ecbf953-681b-448f-9746-d1c1fe7a0978>), in state , has disconnected from glusterd. >> >> >>> On Apr 2, 2019, at 4:53 AM, Atin Mukherjee > wrote: >>> >>> >>> >>> On Mon, 1 Apr 2019 at 10:28, Hari Gowtham > wrote: >>> Comments inline. >>> >>> On Mon, Apr 1, 2019 at 5:55 AM Sankarshan Mukhopadhyay >>> > wrote: >>> > >>> > Quite a considerable amount of detail here. Thank you! >>> > >>> > On Fri, Mar 29, 2019 at 11:42 AM Hari Gowtham > wrote: >>> > > >>> > > Hello Gluster users, >>> > > >>> > > As you all aware that glusterfs-6 is out, we would like to inform you >>> > > that, we have spent a significant amount of time in testing >>> > > glusterfs-6 in upgrade scenarios. We have done upgrade testing to >>> > > glusterfs-6 from various releases like 3.12, 4.1 and 5.3. >>> > > >>> > > As glusterfs-6 has got in a lot of changes, we wanted to test those portions. >>> > > There were xlators (and respective options to enable/disable them) >>> > > added and deprecated in glusterfs-6 from various versions [1]. 
>>> > > >>> > > We had to check the following upgrade scenarios for all such options >>> > > Identified in [1]: >>> > > 1) option never enabled and upgraded >>> > > 2) option enabled and then upgraded >>> > > 3) option enabled and then disabled and then upgraded >>> > > >>> > > We weren't manually able to check all the combinations for all the options. >>> > > So the options involving enabling and disabling xlators were prioritized. >>> > > The below are the result of the ones tested. >>> > > >>> > > Never enabled and upgraded: >>> > > checked from 3.12, 4.1, 5.3 to 6 the upgrade works. >>> > > >>> > > Enabled and upgraded: >>> > > Tested for tier which is deprecated, It is not a recommended upgrade. >>> > > As expected the volume won't be consumable and will have a few more >>> > > issues as well. >>> > > Tested with 3.12, 4.1 and 5.3 to 6 upgrade. >>> > > >>> > > Enabled, disabled before upgrade. >>> > > Tested for tier with 3.12 and the upgrade went fine. >>> > > >>> > > There is one common issue to note in every upgrade. The node being >>> > > upgraded is going into disconnected state. You have to flush the iptables >>> > > and the restart glusterd on all nodes to fix this. >>> > > >>> > >>> > Is this something that is written in the upgrade notes? I do not seem >>> > to recall, if not, I'll send a PR >>> >>> No this wasn't mentioned in the release notes. PRs are welcome. >>> >>> > >>> > > The testing for enabling new options is still pending. The new options >>> > > won't cause as much issues as the deprecated ones so this was put at >>> > > the end of the priority list. It would be nice to get contributions >>> > > for this. >>> > > >>> > >>> > Did the range of tests lead to any new issues? >>> >>> Yes. In the first round of testing we found an issue and had to postpone the >>> release of 6 until the fix was made available. >>> https://bugzilla.redhat.com/show_bug.cgi?id=1684029 >>> >>> And then we tested it again after this patch was made available. >>> and came across this: >>> https://bugzilla.redhat.com/show_bug.cgi?id=1694010 >>> >>> This isn?t a bug as we found that upgrade worked seamelessly in two different setup. So we have no issues in the upgrade path to glusterfs-6 release. >>> >>> >>> >>> Have mentioned this in the second mail as to how to over this situation >>> for now until the fix is available. >>> >>> > >>> > > For the disable testing, tier was used as it covers most of the xlator >>> > > that was removed. And all of these tests were done on a replica 3 volume. >>> > > >>> > >>> > I'm not sure if the Glusto team is reading this, but it would be >>> > pertinent to understand if the approach you have taken can be >>> > converted into a form of automated testing pre-release. >>> >>> I don't have an answer for this, have CCed Vijay. >>> He might have an idea. >>> >>> > >>> > > Note: This is only for upgrade testing of the newly added and removed >>> > > xlators. Does not involve the normal tests for the xlator. >>> > > >>> > > If you have any questions, please feel free to reach us. >>> > > >>> > > [1] https://docs.google.com/spreadsheets/d/1nh7T5AXaV6kc5KgILOy2pEqjzC3t_R47f1XUXSVFetI/edit?usp=sharing >>> > > >>> > > Regards, >>> > > Hari and Sanju. >>> > _______________________________________________ >>> > Gluster-users mailing list >>> > Gluster-users at gluster.org >>> > https://lists.gluster.org/mailman/listinfo/gluster-users >>> >>> >>> >>> -- >>> Regards, >>> Hari Gowtham. 
>>> _______________________________________________ >>> Gluster-users mailing list >>> Gluster-users at gluster.org >>> https://lists.gluster.org/mailman/listinfo/gluster-users >>> -- >>> --Atin >>> _______________________________________________ >>> Gluster-users mailing list >>> Gluster-users at gluster.org >>> https://lists.gluster.org/mailman/listinfo/gluster-users >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users >> >> -- >> Thanks, >> Sanju > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From joel_patterson at verizon.net Thu Apr 4 16:48:44 2019 From: joel_patterson at verizon.net (Joel Patterson) Date: Thu, 4 Apr 2019 12:48:44 -0400 Subject: [Gluster-users] backupvolfile-server (servers) not working for new mounts? Message-ID: <6e5457c0-ed98-d41a-fe58-ce77705f892d@verizon.net> I have a gluster 4.1 system with three servers running Docker/Kubernetes.??? The pods mount filesystems using gluster. 10.13.112.31 is the primary server [A] and all mounts specify it with two other servers [10.13.113.116 [B] and 10.13.114.16 [C]] specified in backup-volfile-servers. I'm testing what happens when a server goes down. If I bring down [B] or [C], no problem, everything restages and works. But if I bring down [A], any *existing* mount continues to work, but any new mounts fail.? I'm seeing messages about all subvolumes being down in the pod. But I've mounted this exact same volume on the same system (before I bring down the server) and I can access all the data fine. Why the failure for new mounts???? I'm on AWS and all servers are in different availability zones, but I don't see how that would be an issue. I tried using just backupvolfile-server and that didn't work either. --- This email has been checked for viruses by Avast antivirus software. https://www.avast.com/antivirus From jorge.crespo at avature.net Thu Apr 4 17:17:08 2019 From: jorge.crespo at avature.net (Jorge Crespo) Date: Thu, 4 Apr 2019 19:17:08 +0200 Subject: [Gluster-users] Gluster with KVM - VM migration Message-ID: Hi everyone, First message in this list, hope I can help out as much as I can. I was wondering if someone could point out any solution already working or this would be a matter of scripting. We are using Gluster for a kind of strange infrastructure , where we have let's say 2 NODES , 2 bricks each , 2 volumes total. And both servers are mounting as clients both volumes. We exclusively use these volumes to run VM's. And the reason of the infrastructure is to be able to LiveMigrate VM's from one node to the other. VM's defined and running in NODE 1, MV files in /gluster_gv1 , this GlusterFS is also mounted in NODE2 ,but NODE2 doesn't make any real use of it. VM's defined and running in NODE 2, MV files in /gluster_gv2 , as before, this GlusterFS is also mounted in NODE1, but it doesn't make any real use of it. So the question comes now: - Let's say we come to an scenario where NODE 1 comes down. I have the VM's files copied to NODE2, I define them in NODE2 and start them , no problem with that. - Now the NODE 1 comes back UP , I guess the safest solution should be to have the VM's without Autostart so things don't go messy. 
But let's imagine I want my system to know which VM's are started in NODE 2, and start the ones that haven't been started in NODE 2. Is there any "official" way to achieve this? Basically achieve something like vCenter where the cluster keeps track of where the VM's are running at any given time, and also being able to start them in a different node if their node goes down. If there is no "official" answer, I'd like to hear your opinions. Cheers! -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 4012 bytes Desc: Firma criptogr??fica S/MIME URL: From amukherj at redhat.com Thu Apr 4 17:43:43 2019 From: amukherj at redhat.com (Atin Mukherjee) Date: Thu, 4 Apr 2019 23:13:43 +0530 Subject: [Gluster-users] [Gluster-devel] Upgrade testing to gluster 6 In-Reply-To: <0F2C3FE9-E190-47D4-A043-319F086447E9@onholyground.com> References: <13CA0DC5-C248-40B8-B2D4-E6664812303A@onholyground.com> <493E624F-FC40-4242-BF9D-BAD0385B2DA5@onholyground.com> <0F2C3FE9-E190-47D4-A043-319F086447E9@onholyground.com> Message-ID: On Thu, 4 Apr 2019 at 22:10, Darrell Budic wrote: > Just the glusterd.log from each node, right? > Yes. > > On Apr 4, 2019, at 11:25 AM, Atin Mukherjee wrote: > > Darell, > > I fully understand that you can't reproduce it and you don't have > bandwidth to test it again, but would you be able to send us the glusterd > log from all the nodes when this happened. We would like to go through the > logs and get back. I would particularly like to see if something has gone > wrong with transport.socket.listen-port option. But with out the log files > we can't find out anything. Hope you understand it. > > On Thu, Apr 4, 2019 at 9:27 PM Darrell Budic > wrote: > >> I didn?t follow any specific documents, just a generic rolling upgrade >> one node at a time. Once the first node didn?t reconnect, I tried to follow >> the workaround in the bug during the upgrade. Basic procedure was: >> >> - take 3 nodes that were initially installed with 3.12.x (forget which, >> but low number) and had been upgraded directly to 5.5 from 3.12.15 >> - op-version was 50400 >> - on node A: >> - yum install centos-release-gluster6 >> - yum upgrade (was some ovirt cockpit components, gluster, and a lib or >> two this time), hit yes >> - discover glusterd was dead >> - systemctl restart glusterd >> - no peer connections, try iptables -F; systemctl restart glusterd, no >> change >> - following the workaround in the bug, try iptables -F & restart glusterd >> on other 2 nodes, no effect >> - nodes B & C were still connected to each other and all bricks were >> fine at this point >> - try upgrading other 2 nodes and restarting gluster, no effect (iptables >> still empty) >> - lost quota here, so all bricks went offline >> - read logs, not finding much, but looked at glusterd.vol and compared to >> new versions >> - updated glusterd.vol on A and restarted glusterd >> - A doesn?t show any connected peers, but both other nodes show A as >> connected >> - update glusterd.vol on B & C, restart glusterd >> - all nodes show connected and volumes are active and healing >> >> The only odd thing in my process was that node A did not have any active >> bricks on it at the time of the upgrade. It doesn?t seem like this mattered >> since B & C showed the same symptoms between themselves while being >> upgraded, but I don?t know. The only log entry that referenced anything >> about peer connections is included below already. 
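A rough consolidation of the per-node sequence and the workaround discussed in this thread, as a sketch only; the exact package set, the timing, and whether the iptables flush is needed at all will differ per setup:

# run on one node at a time; wait for heals to finish before the next node
yum install centos-release-gluster6
yum upgrade
systemctl restart glusterd

# verify the node rejoined the pool
gluster peer status
gluster volume status

# workaround mentioned in the thread if the node stays disconnected:
# flush iptables and restart glusterd on all nodes
iptables -F
systemctl restart glusterd

# as discussed above, also compare /etc/glusterfs/glusterd.vol against
# the default shipped with the new packages
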
>> >> Looks like it was related to my glusterd settings, since that?s what >> fixed it for me. Unfortunately, I don?t have the bandwidth or the systems >> to test different versions of that specifically, but maybe you guys can on >> some test resources? Otherwise, I?ve got another cluster (my production >> one!) that?s midway through the upgrade from 3.12.15 -> 5.5. I paused when >> I started getting multiple brick processes on the two nodes that had gone >> to 5.5 already. I think I?m going to jump the last node right to 6 to try >> and avoid that mess, and it has the same glusterd.vol settings. I?ll try >> and capture it?s logs during the upgrade and see if there?s any new info, >> or if it has the same issues as this group did. >> >> -Darrell >> >> On Apr 4, 2019, at 2:54 AM, Sanju Rakonde wrote: >> >> We don't hit https://bugzilla.redhat.com/show_bug.cgi?id=1694010 while >> upgrading to glusterfs-6. We tested it in different setups and understood >> that this issue is seen because of some issue in setup. >> >> regarding the issue you have faced, can you please let us know which >> documentation you have followed for the upgrade? During our testing, we >> didn't hit any such issue. we would like to understand what went wrong. >> >> On Thu, Apr 4, 2019 at 2:08 AM Darrell Budic >> wrote: >> >>> Hari- >>> >>> I was upgrading my test cluster from 5.5 to 6 and I hit this bug ( >>> https://bugzilla.redhat.com/show_bug.cgi?id=1694010) or something >>> similar. In my case, the workaround did not work, and I was left with a >>> gluster that had gone into no-quorum mode and stopped all the bricks. >>> Wasn?t much in the logs either, but I noticed my >>> /etc/glusterfs/glusterd.vol files were not the same as the newer versions, >>> so I updated them, restarted glusterd, and suddenly the updated node showed >>> as peer-in-cluster again. Once I updated other notes the same way, things >>> started working again. Maybe a place to look? >>> >>> My old config (all nodes): >>> volume management >>> type mgmt/glusterd >>> option working-directory /var/lib/glusterd >>> option transport-type socket >>> option transport.socket.keepalive-time 10 >>> option transport.socket.keepalive-interval 2 >>> option transport.socket.read-fail-log off >>> option ping-timeout 10 >>> option event-threads 1 >>> option rpc-auth-allow-insecure on >>> # option transport.address-family inet6 >>> # option base-port 49152 >>> end-volume >>> >>> changed to: >>> volume management >>> type mgmt/glusterd >>> option working-directory /var/lib/glusterd >>> option transport-type socket,rdma >>> option transport.socket.keepalive-time 10 >>> option transport.socket.keepalive-interval 2 >>> option transport.socket.read-fail-log off >>> option transport.socket.listen-port 24007 >>> option transport.rdma.listen-port 24008 >>> option ping-timeout 0 >>> option event-threads 1 >>> option rpc-auth-allow-insecure on >>> # option lock-timer 180 >>> # option transport.address-family inet6 >>> # option base-port 49152 >>> option max-port 60999 >>> end-volume >>> >>> the only thing I found in the glusterd logs that looks relevant was >>> (repeated for both of the other nodes in this cluster), so no clue why it >>> happened: >>> [2019-04-03 20:19:16.802638] I [MSGID: 106004] >>> [glusterd-handler.c:6427:__glusterd_peer_rpc_notify] 0-management: Peer >>> (<0ecbf953-681b-448f-9746-d1c1fe7a0978>), in state >> Cluster>, has disconnected from glusterd. 
>>> >>> >>> On Apr 2, 2019, at 4:53 AM, Atin Mukherjee >>> wrote: >>> >>> >>> >>> On Mon, 1 Apr 2019 at 10:28, Hari Gowtham wrote: >>> >>>> Comments inline. >>>> >>>> On Mon, Apr 1, 2019 at 5:55 AM Sankarshan Mukhopadhyay >>>> wrote: >>>> > >>>> > Quite a considerable amount of detail here. Thank you! >>>> > >>>> > On Fri, Mar 29, 2019 at 11:42 AM Hari Gowtham >>>> wrote: >>>> > > >>>> > > Hello Gluster users, >>>> > > >>>> > > As you all aware that glusterfs-6 is out, we would like to inform >>>> you >>>> > > that, we have spent a significant amount of time in testing >>>> > > glusterfs-6 in upgrade scenarios. We have done upgrade testing to >>>> > > glusterfs-6 from various releases like 3.12, 4.1 and 5.3. >>>> > > >>>> > > As glusterfs-6 has got in a lot of changes, we wanted to test those >>>> portions. >>>> > > There were xlators (and respective options to enable/disable them) >>>> > > added and deprecated in glusterfs-6 from various versions [1]. >>>> > > >>>> > > We had to check the following upgrade scenarios for all such options >>>> > > Identified in [1]: >>>> > > 1) option never enabled and upgraded >>>> > > 2) option enabled and then upgraded >>>> > > 3) option enabled and then disabled and then upgraded >>>> > > >>>> > > We weren't manually able to check all the combinations for all the >>>> options. >>>> > > So the options involving enabling and disabling xlators were >>>> prioritized. >>>> > > The below are the result of the ones tested. >>>> > > >>>> > > Never enabled and upgraded: >>>> > > checked from 3.12, 4.1, 5.3 to 6 the upgrade works. >>>> > > >>>> > > Enabled and upgraded: >>>> > > Tested for tier which is deprecated, It is not a recommended >>>> upgrade. >>>> > > As expected the volume won't be consumable and will have a few more >>>> > > issues as well. >>>> > > Tested with 3.12, 4.1 and 5.3 to 6 upgrade. >>>> > > >>>> > > Enabled, disabled before upgrade. >>>> > > Tested for tier with 3.12 and the upgrade went fine. >>>> > > >>>> > > There is one common issue to note in every upgrade. The node being >>>> > > upgraded is going into disconnected state. You have to flush the >>>> iptables >>>> > > and the restart glusterd on all nodes to fix this. >>>> > > >>>> > >>>> > Is this something that is written in the upgrade notes? I do not seem >>>> > to recall, if not, I'll send a PR >>>> >>>> No this wasn't mentioned in the release notes. PRs are welcome. >>>> >>>> > >>>> > > The testing for enabling new options is still pending. The new >>>> options >>>> > > won't cause as much issues as the deprecated ones so this was put at >>>> > > the end of the priority list. It would be nice to get contributions >>>> > > for this. >>>> > > >>>> > >>>> > Did the range of tests lead to any new issues? >>>> >>>> Yes. In the first round of testing we found an issue and had to >>>> postpone the >>>> release of 6 until the fix was made available. >>>> https://bugzilla.redhat.com/show_bug.cgi?id=1684029 >>>> >>>> And then we tested it again after this patch was made available. >>>> and came across this: >>>> https://bugzilla.redhat.com/show_bug.cgi?id=1694010 >>> >>> >>> This isn?t a bug as we found that upgrade worked seamelessly in two >>> different setup. So we have no issues in the upgrade path to glusterfs-6 >>> release. >>> >>> >>>> >>>> Have mentioned this in the second mail as to how to over this situation >>>> for now until the fix is available. >>>> >>>> > >>>> > > For the disable testing, tier was used as it covers most of the >>>> xlator >>>> > > that was removed. 
And all of these tests were done on a replica 3 >>>> volume. >>>> > > >>>> > >>>> > I'm not sure if the Glusto team is reading this, but it would be >>>> > pertinent to understand if the approach you have taken can be >>>> > converted into a form of automated testing pre-release. >>>> >>>> I don't have an answer for this, have CCed Vijay. >>>> He might have an idea. >>>> >>>> > >>>> > > Note: This is only for upgrade testing of the newly added and >>>> removed >>>> > > xlators. Does not involve the normal tests for the xlator. >>>> > > >>>> > > If you have any questions, please feel free to reach us. >>>> > > >>>> > > [1] >>>> https://docs.google.com/spreadsheets/d/1nh7T5AXaV6kc5KgILOy2pEqjzC3t_R47f1XUXSVFetI/edit?usp=sharing >>>> > > >>>> > > Regards, >>>> > > Hari and Sanju. >>>> > _______________________________________________ >>>> > Gluster-users mailing list >>>> > Gluster-users at gluster.org >>>> > https://lists.gluster.org/mailman/listinfo/gluster-users >>>> >>>> >>>> >>>> -- >>>> Regards, >>>> Hari Gowtham. >>>> _______________________________________________ >>>> Gluster-users mailing list >>>> Gluster-users at gluster.org >>>> https://lists.gluster.org/mailman/listinfo/gluster-users >>>> >>> -- >>> --Atin >>> _______________________________________________ >>> Gluster-users mailing list >>> Gluster-users at gluster.org >>> https://lists.gluster.org/mailman/listinfo/gluster-users >>> >>> >>> _______________________________________________ >>> Gluster-users mailing list >>> Gluster-users at gluster.org >>> https://lists.gluster.org/mailman/listinfo/gluster-users >> >> >> >> -- >> Thanks, >> Sanju >> >> >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users > > > -- - Atin (atinm) -------------- next part -------------- An HTML attachment was scrubbed... URL: From hunter86_bg at yahoo.com Fri Apr 5 03:44:58 2019 From: hunter86_bg at yahoo.com (Strahil) Date: Fri, 05 Apr 2019 06:44:58 +0300 Subject: [Gluster-users] [ovirt-users] Re: Controller recomandation - LSI2008/9265 Message-ID: Adding Gluster users' mail list.On Apr 5, 2019 06:02, Leo David wrote: > > Hi Everyone, > Any thoughts on this ? > > > On Wed, Apr 3, 2019, 17:02 Leo David wrote: >> >> Hi Everyone, >> For a hyperconverged setup started with 3 nodes and going up in time up to 12 nodes, I have to choose between LSI2008 ( jbod ) and LSI9265 (raid). >> Perc h710 ( raid )?might be an option too, but on a different chassis. >> There will not be many disk installed on each node, so the replication will be replica 3 replicated-distribute volumes across the nodes as: >> node1/disk1? node2/disk1? node3/disk1 >> node1/disk2? node2/disk2? node3/disk2 >> and so on... >> As i will add nodes to the cluster ,? I?intend expand the volumes using the same rule. >> What?would it?be a better way,? to used jbod cards ( no cache ) or raid card and create raid0 arrays ( one for each disk ) and therefore have a bit of raid cache ( 512Mb ) ? >> Is raid caching a benefit to have it underneath ovirt/gluster as long as I go for "Jbod"? installation anyway ? >> Thank you very much ! >> -- >> Best regards, Leo David -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mauryam at gmail.com Fri Apr 5 06:30:00 2019 From: mauryam at gmail.com (Maurya M) Date: Fri, 5 Apr 2019 12:00:00 +0530 Subject: [Gluster-users] Geo-replication benchmarking & DR - failover simulation Message-ID: Hi All, As i have now a geo-replication established between my 3 sites , wanted to test the flow throughput , is there any tools , documentation to measure the data flow b/w the various slave volumes setup. Also i understand the replication is unidirectional ( master to slave) - so incase of DR , are there tested methods to achieve the failover and reverse the replication and once the primary if restored - go back on the initial setup. Appreciate your thoughts & suggestions. Will look forward to the comments here. Thanks all, Maurya -------------- next part -------------- An HTML attachment was scrubbed... URL: From benedikt.kaless at forumZFD.de Fri Apr 5 07:28:52 2019 From: benedikt.kaless at forumZFD.de (=?UTF-8?Q?Benedikt_Kale=c3=9f?=) Date: Fri, 5 Apr 2019 09:28:52 +0200 Subject: [Gluster-users] Samba performance Message-ID: Hi everyone, I'm running gluster version 5.5-1 and I'm dealing with a slow performance together with samba and start to analyze the issue. I want to set the options for samba. But when I try ??? gluster volume set ? group samba I get the following error message: ??? Unable to open file '/var/lib/glusterd/groups/samba'. Do you have any hints for me? Thank you in advance Benedikt -- ?forumZFD Entschieden f?r Frieden|Committed to Peace Benedikt Kale? Leiter Team IT|Head team IT Forum Ziviler Friedensdienst e.V.|Forum Civil Peace Service Am K?lner Brett 8 | 50825 K?ln | Germany Tel 0221 91273233 | Fax 0221 91273299 | http://www.forumZFD.de Vorstand nach ? 26 BGB, einzelvertretungsberechtigt|Executive Board: Oliver Knabe (Vorsitz|Chair), Sonja Wiekenberg-Mlalandle, Alexander Mauz VR 17651 Amtsgericht K?ln Spenden|Donations: IBAN DE37 3702 0500 0008 2401 01 BIC BFSWDE33XXX From hunter86_bg at yahoo.com Fri Apr 5 08:48:13 2019 From: hunter86_bg at yahoo.com (Strahil Nikolov) Date: Fri, 5 Apr 2019 08:48:13 +0000 (UTC) Subject: [Gluster-users] [ovirt-users] Re: Hosted-Engine constantly dies In-Reply-To: <2065983264.15082423.1554204966673@mail.yahoo.com> References: <756008012.14131704.1554068838744.ref@mail.yahoo.com> <756008012.14131704.1554068838744@mail.yahoo.com> <1494883209.14382966.1554121892714@mail.yahoo.com> <608254695.14449614.1554126155785@mail.yahoo.com> <2065983264.15082423.1554204966673@mail.yahoo.com> Message-ID: <806868289.140719.1554454093999@mail.yahoo.com> Hi Simone, a short mail chain in gluster-users Amar confirmed my suspicion that Gluster v5.5 is performing a little bit slower than 3.12.15 .In result the sanlock reservations take too much time. I have updated my setup and uncached (used lvm caching in writeback mode) my data bricks and used the SSD for the engine volume.Now the engine is running quite well and no more issues were observed. Can you share any thoughts about oVirt being updated to Gluster v6.x ? I know that there are any hooks between vdsm and gluster and I'm not sure how vdsm will react on the new version. Best Regards,Strahil Nikolov -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From hunter86_bg at yahoo.com Fri Apr 5 08:52:01 2019 From: hunter86_bg at yahoo.com (Strahil Nikolov) Date: Fri, 5 Apr 2019 08:52:01 +0000 (UTC) Subject: [Gluster-users] Geo-replication benchmarking & DR - failover simulation In-Reply-To: References: Message-ID: <2140013531.17058815.1554454321827@mail.yahoo.com> If you want to measure the traffic - I guess gtop can help you with that , but I have never checked it on geo-rep. At least this is what I'm checking when a full replication is needed (for example storage layout change on the brick). Best Regards,Strahil Nikolov ? ?????, 5 ????? 2019 ?., 9:30:22 ?. ???????+3, Maurya M ??????: Hi All,?As i have now a geo-replication established between my 3 sites , wanted to test the flow throughput , is there any tools , documentation to measure the data flow b/w the various slave volumes setup. Also i understand the replication is unidirectional ( master to slave) - so incase of DR , are there tested methods to achieve the failover and reverse the replication and once the primary if restored - go back on the initial setup. Appreciate your thoughts & suggestions. Will look forward to the comments here. Thanks all,Maurya_______________________________________________ Gluster-users mailing list Gluster-users at gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From anoopcs at cryptolab.net Fri Apr 5 09:03:14 2019 From: anoopcs at cryptolab.net (Anoop C S) Date: Fri, 05 Apr 2019 14:33:14 +0530 Subject: [Gluster-users] Samba performance In-Reply-To: References: Message-ID: <011cbf9c8a3795a504af36192c7432f35e4f6ca9.camel@cryptolab.net> On Fri, 2019-04-05 at 09:28 +0200, Benedikt Kale? wrote: > Hi everyone, > > I'm running gluster version 5.5-1 and I'm dealing with a slow > performance together with samba and start to analyze the issue. > > I want to set the options for samba. But when I try > > gluster volume set group samba > > I get the following error message: > > Unable to open file '/var/lib/glusterd/groups/samba'. > > Do you have any hints for me? This particular group command is only available from v6.0 (as of now). https://bugzilla.redhat.com/show_bug.cgi?id=1656771 > Thank you in advance > > Benedikt > From spisla80 at gmail.com Fri Apr 5 11:57:57 2019 From: spisla80 at gmail.com (David Spisla) Date: Fri, 5 Apr 2019 13:57:57 +0200 Subject: [Gluster-users] Samba performance In-Reply-To: <011cbf9c8a3795a504af36192c7432f35e4f6ca9.camel@cryptolab.net> References: <011cbf9c8a3795a504af36192c7432f35e4f6ca9.camel@cryptolab.net> Message-ID: Hello Anoop, it is a known issue that Gluster+Samba has a poor performance especially for small files. May this manual can give you more inspiration how to setup your Gluster and Samba environment. Regards David Spisla Am Fr., 5. Apr. 2019 um 11:28 Uhr schrieb Anoop C S : > On Fri, 2019-04-05 at 09:28 +0200, Benedikt Kale? wrote: > > Hi everyone, > > > > I'm running gluster version 5.5-1 and I'm dealing with a slow > > performance together with samba and start to analyze the issue. > > > > I want to set the options for samba. But when I try > > > > gluster volume set group samba > > > > I get the following error message: > > > > Unable to open file '/var/lib/glusterd/groups/samba'. > > > > Do you have any hints for me? > > This particular group command is only available from v6.0 (as of now). 
> > https://bugzilla.redhat.com/show_bug.cgi?id=1656771 > > > Thank you in advance > > > > Benedikt > > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From spisla80 at gmail.com Fri Apr 5 11:58:50 2019 From: spisla80 at gmail.com (David Spisla) Date: Fri, 5 Apr 2019 13:58:50 +0200 Subject: [Gluster-users] Samba performance In-Reply-To: References: <011cbf9c8a3795a504af36192c7432f35e4f6ca9.camel@cryptolab.net> Message-ID: I forgot the link. Here it is: https://github.com/gluster/glusterdocs/blob/9f2c979b50517b7cb7bf9fbcfccb033bc7bf5082/docs/Administrator%20Guide/Accessing%20Gluster%20from%20Windows.md Am Fr., 5. Apr. 2019 um 13:57 Uhr schrieb David Spisla : > Hello Anoop, > > it is a known issue that Gluster+Samba has a poor performance especially > for small files. > May this manual can give you more inspiration how to setup your Gluster > and Samba environment. > > Regards > David Spisla > > Am Fr., 5. Apr. 2019 um 11:28 Uhr schrieb Anoop C S >: > >> On Fri, 2019-04-05 at 09:28 +0200, Benedikt Kale? wrote: >> > Hi everyone, >> > >> > I'm running gluster version 5.5-1 and I'm dealing with a slow >> > performance together with samba and start to analyze the issue. >> > >> > I want to set the options for samba. But when I try >> > >> > gluster volume set group samba >> > >> > I get the following error message: >> > >> > Unable to open file '/var/lib/glusterd/groups/samba'. >> > >> > Do you have any hints for me? >> >> This particular group command is only available from v6.0 (as of now). >> >> https://bugzilla.redhat.com/show_bug.cgi?id=1656771 >> >> > Thank you in advance >> > >> > Benedikt >> > >> >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hunter86_bg at yahoo.com Fri Apr 5 12:17:52 2019 From: hunter86_bg at yahoo.com (Strahil Nikolov) Date: Fri, 5 Apr 2019 12:17:52 +0000 (UTC) Subject: [Gluster-users] [ovirt-users] Re: Hosted-Engine constantly dies In-Reply-To: References: <756008012.14131704.1554068838744.ref@mail.yahoo.com> <756008012.14131704.1554068838744@mail.yahoo.com> <1494883209.14382966.1554121892714@mail.yahoo.com> <608254695.14449614.1554126155785@mail.yahoo.com> <2065983264.15082423.1554204966673@mail.yahoo.com> <806868289.140719.1554454093999@mail.yahoo.com> Message-ID: <1792865462.17113753.1554466672917@mail.yahoo.com> >This definitively helps, but for my experience the network speed is really determinant here.Can you describe your network >configuration? >A 10 Gbps net is definitively fine here. >A few bonded 1 Gbps nics could work. >A single 1 Gbps nic could be an issue. I have a gigabit interface on my workstations and sadly I have no option for upgrade without switching the hardware. I have observed my network traffic for days with iftop and gtop and I have never reached my Gbit interface's maximum bandwidth, not even the half of it. Even when reseting my bricks (gluster volume reset-brick) and running a full heal - I do not observe more than 50GiB/s utilization. I am not sure if FUSE is using network for accessing the local brick - but I? hope that it is not true. Checking disk performance - everything is in the expected ranges. 
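To put numbers on the link utilisation during a heal, a minimal sketch assuming the sysstat and iftop packages are installed; "eth0" below is only a placeholder for the storage interface:

# sample per-interface throughput every 5 seconds for one minute
sar -n DEV 5 12 | grep -E 'IFACE|eth0'

# or watch it live, in bytes per second rather than bits
iftop -i eth0 -B
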
I suspect that the Gluster v5 enhancements are increasing both network and IOPS requirements and my setup was not dealing with it properly. >It's definitively planned, see:?https://bugzilla.redhat.com/1693998>I'm not really sure about its time plan. I will try to get involved and provide feedback both to oVirt and Gluster dev teams. Best Regards,Strahil Nikolov -------------- next part -------------- An HTML attachment was scrubbed... URL: From srangana at redhat.com Fri Apr 5 13:36:49 2019 From: srangana at redhat.com (Shyam Ranganathan) Date: Fri, 5 Apr 2019 09:36:49 -0400 Subject: [Gluster-users] Announcing Gluster release 4.1.8 Message-ID: The Gluster community is pleased to announce the release of Gluster 4.1.8 (packages available at [1]). Release notes for the release can be found at [2]. Major changes, features and limitations addressed in this release: None Thanks, Gluster community [1] Packages for 4.1.8: https://download.gluster.org/pub/gluster/glusterfs/4.1/4.1.8/ [2] Release notes for 4.1.8: https://docs.gluster.org/en/latest/release-notes/4.1.8/ From pascal.suter at dalco.ch Fri Apr 5 14:04:07 2019 From: pascal.suter at dalco.ch (Pascal Suter) Date: Fri, 5 Apr 2019 16:04:07 +0200 Subject: [Gluster-users] Samba performance In-Reply-To: References: <011cbf9c8a3795a504af36192c7432f35e4f6ca9.camel@cryptolab.net> Message-ID: <605fe6be-609a-3e3b-3c1c-f75a88beb026@dalco.ch> also i think you should be able to get better performance if you use vfs_glusterfs [1]. From what I understand it's similar to samba what nfs-ganesha does for nfs, it uses the libgfapi directly rather than sharing a fuse-mount [1] https://www.samba.org/samba/docs/current/man-html/vfs_glusterfs.8.html on my test machine i did this: (centos 7) first we need to install |samba-vfs-glusterfs| and of course the samba server which comes as a dependency: yum install samba-vfs-glusterfs gluster volume set vol1 server.allow-insecure on now stop and start the gluster volume, this will auto-create a (non-working, how funny is that) configuration in |/etc/samba/smb.conf| gluster volume stop vol1 gluster volume start vol1 now edit |/etc/samba/smb.conf| and add |kernel share modes = no| and also |guest ok=yes| to the newly created |[gluster-vol1]| section so that it finally looks about like this: [gluster-vol1] comment = For samba share of volume vol1 vfs objects = glusterfs glusterfs:volume = vol1 glusterfs:logfile = /var/log/samba/glusterfs-vol1.%M.log glusterfs:loglevel = 7 path = / read only = no guest ok = yes kernel share modes = no now restart samba systemctl restart smb and now you should be able to mount the samba share mount.cifs -o guest //gluster1/gluster-vol1 /mnt/cifs On 05.04.19 13:58, David Spisla wrote: > I forgot the link. Here it is: > https://github.com/gluster/glusterdocs/blob/9f2c979b50517b7cb7bf9fbcfccb033bc7bf5082/docs/Administrator%20Guide/Accessing%20Gluster%20from%20Windows.md > > Am Fr., 5. Apr. 2019 um 13:57?Uhr schrieb David Spisla > >: > > Hello Anoop, > > it is a known issue that Gluster+Samba has a poor performance > especially for small files. > May this manual can give you more inspiration how to setup your > Gluster and Samba environment. > > Regards > David Spisla > > Am Fr., 5. Apr. 2019 um 11:28?Uhr schrieb Anoop C S > >: > > On Fri, 2019-04-05 at 09:28 +0200, Benedikt Kale? wrote: > > Hi everyone, > > > > I'm running gluster version 5.5-1 and I'm dealing with a slow > > performance together with samba and start to analyze the issue. > > > > I want to set the options for samba. 
But when I try > > > >? ? ?gluster volume set ? group samba > > > > I get the following error message: > > > >? ? ?Unable to open file '/var/lib/glusterd/groups/samba'. > > > > Do you have any hints for me? > > This particular group command is only available from v6.0 (as > of now). > > https://bugzilla.redhat.com/show_bug.cgi?id=1656771 > > > Thank you in advance > > > > Benedikt > > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From hunter86_bg at yahoo.com Fri Apr 5 17:22:45 2019 From: hunter86_bg at yahoo.com (Strahil) Date: Fri, 05 Apr 2019 20:22:45 +0300 Subject: [Gluster-users] [ovirt-users] Re: Hosted-Engine constantly dies Message-ID: Hi Simone, > According to gluster administration guide: > https://docs.gluster.org/en/latest/Administrator%20Guide/Network%20Configurations%20Techniques/ > ? > in the "when to bond" section we can read: > network throughput limit of client/server \<\< storage throughput limit > > 1 GbE (almost always) > 10-Gbps links or faster -- for writes, replication doubles the load on the network and replicas are usually on different peers to which the client can transmit in parallel. > > So if you are using oVirt hyper-converged in replica 3 you have to transmit everything two times over the storage network to sync it with other peers. > > I'm not really in that details, but if?https://bugzilla.redhat.com/1673058 is really like it's described, we even have an 5x overhead with current gluster 5.x. > > This means that with a 1000 Mbps nic we cannot expect more than: > 1000 Mbps / 2 (other replicas) / 5 (overhead in Gluster 5.x ???) / 8 (bit per bytes) = 12.5 MByte per seconds and this is definitively enough to have sanlock failing especially because we don't have just the sanlock load as you can imagine. > > I'd strongly advice to move to 10 Gigabit Ethernet (nowadays with a few hundred dollars you can buy a 4/5 ports 10GBASE-T copper switch plus 3 nics and the cables just for the gluster network) or to bond a few 1 Gigabit Ethernet?links. I didn't know that. So , with 1 Gbit network everyone should use replica 3 arbiter 1 volumes to minimize replication traffic. Best Regards, Strahil Nikolov -------------- next part -------------- An HTML attachment was scrubbed... URL: From phlogistonjohn at asynchrono.us Fri Apr 5 19:30:29 2019 From: phlogistonjohn at asynchrono.us (John Mulligan) Date: Fri, 05 Apr 2019 15:30:29 -0400 Subject: [Gluster-users] Heketi v9.0.0 available for download Message-ID: <2297373.DkY5oybNEL@abydos> Heketi v9.0.0 is now available [1]. This is the new stable version of Heketi. Major additions in this release: * Limit volumes per Gluster cluster * Prevent server from starting if db has unknown dbattributes * Support a default admin mode option * Add an option to enable strict zone checking on volume creation * Add automatic pending operation clean-up functionality * Configurable device formatting parameters * Add consistency check feature and state examiner debugging tools * The faulty and non-functional "db delete-pending-entries" command has been removed This release contains numerous stability and bug fixes. 
A more detailed changelog is available at the release page [1]. -- John M. on behalf of the Heketi team [1] https://github.com/heketi/heketi/releases/tag/v9.0.0 From hunter86_bg at yahoo.com Sun Apr 7 13:48:23 2019 From: hunter86_bg at yahoo.com (Strahil) Date: Sun, 07 Apr 2019 16:48:23 +0300 Subject: [Gluster-users] Cluster 5.5 brick constantly dies Message-ID: Hi, After a hardware maintenance (GPU removed) I have powered my oVirt node running gluster 5.5 and noticed that one volume has no running brick locally. After forcefully starting the volume, the brick is up but almost instantly I got the following on my CentOS 7 terminal. ================================ [root at ovirt2 ~]# gluster volume heal isos full Broadcast message from systemd-journald at ovirt2.localdomain (Sun 2019-04-07 16:41:30 EEST): gluster_bricks-isos-isos[6884]: [2019-04-07 13:41:30.148365] M [MSGID: 113075] [posix-helpers.c:1957:posix_health_check_thread_proc] 0-isos-posix: health-check failed, going down Broadcast message from systemd-journald at ovirt2.localdomain (Sun 2019-04-07 16:41:30 EEST): gluster_bricks-isos-isos[6884]: [2019-04-07 13:41:30.148934] M [MSGID: 113075] [posix-helpers.c:1975:posix_health_check_thread_proc] 0-isos-posix: still alive! -> SIGTERM Message from syslogd at ovirt2 at Apr 7 16:41:30 ... gluster_bricks-isos-isos[6884]:[2019-04-07 13:41:30.148365] M [MSGID: 113075] [posix-helpers.c:1957:posix_health_check_thread_proc] 0-isos-posix: health-check failed, going down Message from syslogd at ovirt2 at Apr 7 16:41:30 ... gluster_bricks-isos-isos[6884]:[2019-04-07 13:41:30.148934] M [MSGID: 113075] [posix-helpers.c:1975:posix_health_check_thread_proc] 0-isos-posix: still alive! -> SIGTERM ================================ Restarting glusterd.service didn't help. How should I debug it ? Best Regards, Strahil Nikolov -------------- next part -------------- An HTML attachment was scrubbed... URL: From budic at onholyground.com Sun Apr 7 14:06:51 2019 From: budic at onholyground.com (Darrell Budic) Date: Sun, 7 Apr 2019 09:06:51 -0500 Subject: [Gluster-users] Cluster 5.5 brick constantly dies In-Reply-To: References: Message-ID: You?ve probably got multiple glusterfsd brick processes running. It?s possible to track them down and kill them from a shell, do a gluster vol status to see which one got registered last with glusterd, then ps -ax | grep glusterd | grep "< volume name>" and kill any extra one that are not the PID reported from vol status. And upgrade to gluster6, I?m not all the way through that process, but so far it seems to resolve that problem for me. > On Apr 7, 2019, at 8:48 AM, Strahil wrote: > > Hi, > > After a hardware maintenance (GPU removed) I have powered my oVirt node running gluster 5.5 and noticed that one volume has no running brick locally. > > After forcefully starting the volume, the brick is up but almost instantly I got the following on my CentOS 7 terminal. 
> ================================ > > [root at ovirt2 ~]# gluster volume heal isos full > Broadcast message from systemd-journald at ovirt2.localdomain (Sun 2019-04-07 16:41:30 EEST): > > gluster_bricks-isos-isos[6884]: [2019-04-07 13:41:30.148365] M [MSGID: 113075] [posix-helpers.c:1957:posix_health_check_thread_proc] 0-isos-posix: health-check failed, going down > > Broadcast message from systemd-journald at ovirt2.localdomain (Sun 2019-04-07 16:41:30 EEST): > > gluster_bricks-isos-isos[6884]: [2019-04-07 13:41:30.148934] M [MSGID: 113075] [posix-helpers.c:1975:posix_health_check_thread_proc] 0-isos-posix: still alive! -> SIGTERM > > Message from syslogd at ovirt2 at Apr 7 16:41:30 ... > gluster_bricks-isos-isos[6884]:[2019-04-07 13:41:30.148365] M [MSGID: 113075] [posix-helpers.c:1957:posix_health_check_thread_proc] 0-isos-posix: health-check failed, going down > > Message from syslogd at ovirt2 at Apr 7 16:41:30 ... > gluster_bricks-isos-isos[6884]:[2019-04-07 13:41:30.148934] M [MSGID: 113075] [posix-helpers.c:1975:posix_health_check_thread_proc] 0-isos-posix: still alive! -> SIGTERM > > ================================ > > Restarting glusterd.service didn't help. > How should I debug it ? > > Best Regards, > Strahil Nikolov > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From hunter86_bg at yahoo.com Sun Apr 7 16:33:20 2019 From: hunter86_bg at yahoo.com (Strahil) Date: Sun, 07 Apr 2019 19:33:20 +0300 Subject: [Gluster-users] Cluster 5.5 brick constantly dies Message-ID: Thanks Darrell, I just let it idle for 15 min and on the next start force didn't have the issue. I guess the process died and on the force start there were no more processes for this brick. I will have your words in mind next time. I'm planing to move to Gluster v6 , but as far as I know oVirt has some kind of integration with gluster and I'm not sure what will happen. For now 5.5 is way more stable than 5.3 and I will keep with the officially provided by oVirt. Best Regards, Strahil Nikolov On Apr 7, 2019 17:06, Darrell Budic wrote: > > You?ve probably got multiple glusterfsd brick processes running. It?s possible to track them down and kill them from a shell, do a gluster vol status to see which one got registered last with glusterd, then ps -ax | grep glusterd | grep "< volume name>" and kill any extra one that are not the PID reported from vol status.? > > And upgrade to gluster6, I?m not all the way through that process, but so far it seems to resolve that problem for me. > >> On Apr 7, 2019, at 8:48 AM, Strahil wrote: >> >> Hi, >> >> After a hardware maintenance (GPU removed)? I have powered my oVirt node running gluster 5.5 and noticed that one volume has no running brick locally. >> >> After forcefully starting the volume, the brick is up but almost instantly I got the following on my CentOS 7 terminal. 
>> ================================ >> >> [root at ovirt2 ~]# gluster volume heal isos full >> Broadcast message from systemd-journald at ovirt2.localdomain (Sun 2019-04-07 16:41:30 EEST): >> >> gluster_bricks-isos-isos[6884]: [2019-04-07 13:41:30.148365] M [MSGID: 113075] [posix-helpers.c:1957:posix_health_check_thread_proc] 0-isos-posix: health-check failed, going down >> >> Broadcast message from systemd-journald at ovirt2.localdomain (Sun 2019-04-07 16:41:30 EEST): >> >> gluster_bricks-isos-isos[6884]: [2019-04-07 13:41:30.148934] M [MSGID: 113075] [posix-helpers.c:1975:posix_health_check_thread_proc] 0-isos-posix: still alive! -> SIGTERM >> >> Message from syslogd at ovirt2 at Apr? 7 16:41:30 ... >> gluster_bricks-isos-isos[6884]:[2019-04-07 13:41:30.148365] M [MSGID: 113075] [posix-helpers.c:1957:posix_health_check_thread_proc] 0-isos-posix: health-check failed, going down >> >> Message from syslogd at ovirt2 at Apr? 7 16:41:30 ... >> gluster_bricks-isos-isos[6884]:[2019-04-07 13:41:30.148934] M [MSGID: 113075] [posix-helpers.c:1975:posix_health_check_thread_proc] 0-isos-posix: still alive! -> SIGTERM >> >> ================================ >> >> Restarting glusterd.service didn't help. >> How should I debug it ? >> >> Best Regards, >> Strahil Nikolov >> >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sabose at redhat.com Mon Apr 8 05:04:17 2019 From: sabose at redhat.com (Sahina Bose) Date: Mon, 8 Apr 2019 10:34:17 +0530 Subject: [Gluster-users] Gluster with KVM - VM migration In-Reply-To: References: Message-ID: On Thu, Apr 4, 2019 at 10:53 PM Jorge Crespo wrote: > > > Hi everyone, > > First message in this list, hope I can help out as much as I can. > > I was wondering if someone could point out any solution already working > or this would be a matter of scripting. > > We are using Gluster for a kind of strange infrastructure , where we > have let's say 2 NODES , 2 bricks each , 2 volumes total. And both > servers are mounting as clients both volumes. > > We exclusively use these volumes to run VM's. And the reason of the > infrastructure is to be able to LiveMigrate VM's from one node to the other. > > VM's defined and running in NODE 1, MV files in /gluster_gv1 , this > GlusterFS is also mounted in NODE2 ,but NODE2 doesn't make any real use > of it. > > VM's defined and running in NODE 2, MV files in /gluster_gv2 , as > before, this GlusterFS is also mounted in NODE1, but it doesn't make any > real use of it. > > So the question comes now: > > - Let's say we come to an scenario where NODE 1 comes down. I have the > VM's files copied to NODE2, I define them in NODE2 and start them , no > problem with that. > > - Now the NODE 1 comes back UP , I guess the safest solution should be > to have the VM's without Autostart so things don't go messy. But let's > imagine I want my system to know which VM's are started in NODE 2, and > start the ones that haven't been started in NODE 2. > > Is there any "official" way to achieve this? Basically achieve something > like vCenter where the cluster keeps track of where the VM's are running > at any given time, and also being able to start them in a different node > if their node goes down. > > If there is no "official" answer, I'd like to hear your opinions. 
Have you considered oVirt + Gluster hyperconverged solution (https://www.ovirt.org/documentation/gluster-hyperconverged/Gluster_Hyperconverged_Guide.html)? It addresses the HA problem that you are after. Let us know. > Cheers! > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users From felix.koelzow at gmx.de Mon Apr 8 07:37:50 2019 From: felix.koelzow at gmx.de (=?UTF-8?Q?Felix_K=c3=b6lzow?=) Date: Mon, 8 Apr 2019 09:37:50 +0200 Subject: [Gluster-users] Gluster and LVM In-Reply-To: <9f4c439f-7893-4169-8d91-ebc41d22277d@netvel.net> References: <9f4c439f-7893-4169-8d91-ebc41d22277d@netvel.net> Message-ID: Thank you very much for your response. I fully agree that using LVM has great advantages. Maybe there is a misunderstanding, but I really got the recommendation to not use (normal) LVM in combination with gluster to increase the volume. *Maybe someone in the community has some good or bad experience* *using LVM and gluster in combination.* So please let me know :) > One of the arguments for things like Gluster and Ceph is that you can > many storage nodes that operate in parallel so that the ideal is a > very large number of small drive arrays over a small number of very > large drive arrays. I also agree we that. In our case, we actually plan to get Redhat Gluster Storage Support and an increase of storage nodes would mean an increase of support costs while the same amount of storage volume is available. So we are looking for a reasonable compromise. Felix On 03.04.19 17:12, Alvin Starr wrote: > As a general rule I always suggest using LVM. > I have had LVM save my career a few times. > I believe that if you wish to use Gluster snapshots then the > underlying system needs to be a thinly provisioned LVM volume. > > Adding storage space to an LVM is easy and all modern file-systems > support online growing so it is easy to grow a file-system. > > If you have directory trees that are very deep and wide then you may > want to put a bit of thought into how you configure your Gluster > installation. > We have a volume with about 50M files and something like an xfs dump > or rsync of the underlying filesystem will take close to a day but > copying the data over Gluster takes weeks. > This is a problem with all clustered file systems because there is > extra locking and co-ordination required for file operations. > > Also you need to realize that the performance of something like the > powervault is limited to the speed of the connection to your server. > So that a single SAS link is limited to 6Gb(for example) and so is > your disk array but most internal raid controllers will support the > number of ports * 6Gb. > This means that a computer with 12 drives in the front will access > disk faster than a system with a 12 drive disk array attached by a few > SAS links. > > One of the arguments for things like Gluster and Ceph is that you can > many storage nodes that operate in parallel so that the ideal is a > very large number of small drive arrays over a small number of very > large drive arrays. > > > On 4/3/19 10:20 AM, kbh-admin wrote: >> Hello Gluster-Community, >> >> >> we consider to build several Gluster-servers and have a question >> regarding? lvm and Glusterfs. >> >> >> Scenario 1: Snapshots >> >> Of course, taking snapshots is a good capability and we want to use >> lvm for that. 
>> >> >> Scenaraio 2: Increase Gluster volume >> >> We want to increase the Gluster volume by adding hdd's and/or by adding >> >> dell powervaults later. We got the recommendation to set up a new >> Gluster volume >> >> for the powervaults and don't use lvm in that case (lvresize ....) . >> >> >> What would you suggest and how do you manage both lvm and Glusterfs >> together? >> >> >> Thanks in advance. >> >> >> Felix >> >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at Gluster.org >> https://lists.Gluster.org/mailman/listinfo/Gluster-users > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bb at kernelpanic.ru Mon Apr 8 10:49:02 2019 From: bb at kernelpanic.ru (Boris Zhmurov) Date: Mon, 8 Apr 2019 11:49:02 +0100 Subject: [Gluster-users] Gluster and LVM In-Reply-To: References: Message-ID: Hi Felix, On 03/04/2019 15:20, kbh-admin wrote: > Hello Gluster-Community, > > > we consider to build several gluster-servers and have a question > regarding? lvm and glusterfs. > > > Scenario 1: Snapshots > > Of course, taking snapshots is a good capability and we want to use > lvm for that. Please keep in mind, not just LVM, but LVM's "thin volumes" > Scenaraio 2: Increase gluster volume > > We want to increase the gluster volume by adding hdd's and/or by adding > > dell powervaults later. We got the recommendation to set up a new > gluster volume > > for the powervaults and don't use lvm in that case (lvresize ....) . > > > What would you suggest and how do you manage both lvm and glusterfs > together? > > > Thanks in advance. > > If you use LVM, it is quite simple just to increase the volume. When you add new HDDs, create new logical volumes on them, make the file system, and add it as another brick to your gluster volume (gluster add-brick volumename replica N ip.add.re.ss:/new-brick ... etc...). -- Kind regards, Boris Zhmurov mailto: bb at kernelpanic.ru From tomfite at gmail.com Mon Apr 8 13:01:08 2019 From: tomfite at gmail.com (Tom Fite) Date: Mon, 8 Apr 2019 09:01:08 -0400 Subject: [Gluster-users] Rsync in place of heal after brick failure In-Reply-To: References: Message-ID: Thanks for the idea, Poornima. Testing shows that xfsdump and xfsrestore is much faster than rsync since it handles small files much better. I don't have extra space to store the dumps but I was able to figure out how to pipe the xfsdump and restore via ssh. For anyone else that's interested: On source machine, run: xfsdump -J - /dev/mapper/[vg]-[brick] | ssh root@[destination fqdn] xfsrestore -J - [/path/to/brick] -Tom On Mon, Apr 1, 2019 at 9:56 PM Poornima Gurusiddaiah wrote: > You could also try xfsdump and xfsrestore if you brick filesystem is xfs > and the destination disk can be attached locally? This will be much faster. > > Regards, > Poornima > > On Tue, Apr 2, 2019, 12:05 AM Tom Fite wrote: > >> Hi all, >> >> I have a very large (65 TB) brick in a replica 2 volume that needs to be >> re-copied from scratch. A heal will take a very long time with performance >> degradation on the volume so I investigated using rsync to do the brunt of >> the work. >> >> The command: >> >> rsync -av -H -X --numeric-ids --progress server1:/data/brick1/gv0 >> /data/brick1/ >> >> Running with -H assures that the hard links in .glusterfs are preserved, >> and -X preserves all of gluster's extended attributes. >> >> I've tested this on my test environment as follows: >> >> 1. Stop glusterd and kill procs >> 2. Move brick volume to backup dir >> 3. 
Run rsync >> 4. Start glusterd >> 5. Observe gluster status >> >> All appears to be working correctly. Gluster status reports all bricks >> online, all data is accessible in the volume, and I don't see any errors in >> the logs. >> >> Anybody else have experience trying this? >> >> Thanks >> -Tom >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jorge.crespo at avature.net Mon Apr 8 13:08:46 2019 From: jorge.crespo at avature.net (Jorge Crespo) Date: Mon, 8 Apr 2019 15:08:46 +0200 Subject: [Gluster-users] Gluster with KVM - VM migration In-Reply-To: References: Message-ID: Thanks a lot for the suggestion. I will check it out and let you know. Cheers! El 8/4/19 a las 7:04, Sahina Bose escribi?: > On Thu, Apr 4, 2019 at 10:53 PM Jorge Crespo wrote: >> >> Hi everyone, >> >> First message in this list, hope I can help out as much as I can. >> >> I was wondering if someone could point out any solution already working >> or this would be a matter of scripting. >> >> We are using Gluster for a kind of strange infrastructure , where we >> have let's say 2 NODES , 2 bricks each , 2 volumes total. And both >> servers are mounting as clients both volumes. >> >> We exclusively use these volumes to run VM's. And the reason of the >> infrastructure is to be able to LiveMigrate VM's from one node to the other. >> >> VM's defined and running in NODE 1, MV files in /gluster_gv1 , this >> GlusterFS is also mounted in NODE2 ,but NODE2 doesn't make any real use >> of it. >> >> VM's defined and running in NODE 2, MV files in /gluster_gv2 , as >> before, this GlusterFS is also mounted in NODE1, but it doesn't make any >> real use of it. >> >> So the question comes now: >> >> - Let's say we come to an scenario where NODE 1 comes down. I have the >> VM's files copied to NODE2, I define them in NODE2 and start them , no >> problem with that. >> >> - Now the NODE 1 comes back UP , I guess the safest solution should be >> to have the VM's without Autostart so things don't go messy. But let's >> imagine I want my system to know which VM's are started in NODE 2, and >> start the ones that haven't been started in NODE 2. >> >> Is there any "official" way to achieve this? Basically achieve something >> like vCenter where the cluster keeps track of where the VM's are running >> at any given time, and also being able to start them in a different node >> if their node goes down. >> >> If there is no "official" answer, I'd like to hear your opinions. > Have you considered oVirt + Gluster hyperconverged solution > (https://www.ovirt.org/documentation/gluster-hyperconverged/Gluster_Hyperconverged_Guide.html)? > It addresses the HA problem that you are after. Let us know. > >> Cheers! >> >> >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- A non-text attachment was scrubbed... 
Name: smime.p7s Type: application/pkcs7-signature Size: 4012 bytes Desc: Firma criptogr??fica S/MIME URL: From rightkicktech at gmail.com Mon Apr 8 18:15:56 2019 From: rightkicktech at gmail.com (Alex K) Date: Mon, 8 Apr 2019 21:15:56 +0300 Subject: [Gluster-users] Gluster and LVM In-Reply-To: References: <9f4c439f-7893-4169-8d91-ebc41d22277d@netvel.net> Message-ID: I use gluster on top of lvm for several years without any issues. On Mon, Apr 8, 2019, 10:43 Felix K?lzow wrote: > Thank you very much for your response. > > I fully agree that using LVM has great advantages. Maybe there is a > misunderstanding, > > but I really got the recommendation to not use (normal) LVM in combination > with gluster to > > increase the volume. *Maybe someone in the community has some good or bad > experience* > > *using LVM and gluster in combination.* So please let me know :) > > > One of the arguments for things like Gluster and Ceph is that you can many > storage nodes that operate in parallel so that the ideal is a very large > number of small drive arrays over a small number of very large drive > arrays. > > I also agree we that. In our case, we actually plan to get Redhat Gluster > Storage Support and an increase of > > storage nodes would mean an increase of support costs while the same > amount of storage volume is available. > > So we are looking for a reasonable compromise. > > Felix > On 03.04.19 17:12, Alvin Starr wrote: > > As a general rule I always suggest using LVM. > I have had LVM save my career a few times. > I believe that if you wish to use Gluster snapshots then the underlying > system needs to be a thinly provisioned LVM volume. > > Adding storage space to an LVM is easy and all modern file-systems support > online growing so it is easy to grow a file-system. > > If you have directory trees that are very deep and wide then you may want > to put a bit of thought into how you configure your Gluster installation. > We have a volume with about 50M files and something like an xfs dump or > rsync of the underlying filesystem will take close to a day but copying the > data over Gluster takes weeks. > This is a problem with all clustered file systems because there is extra > locking and co-ordination required for file operations. > > Also you need to realize that the performance of something like the > powervault is limited to the speed of the connection to your server. > So that a single SAS link is limited to 6Gb(for example) and so is your > disk array but most internal raid controllers will support the number of > ports * 6Gb. > This means that a computer with 12 drives in the front will access disk > faster than a system with a 12 drive disk array attached by a few SAS > links. > > One of the arguments for things like Gluster and Ceph is that you can many > storage nodes that operate in parallel so that the ideal is a very large > number of small drive arrays over a small number of very large drive > arrays. > > > On 4/3/19 10:20 AM, kbh-admin wrote: > > Hello Gluster-Community, > > > we consider to build several Gluster-servers and have a question > regarding lvm and Glusterfs. > > > Scenario 1: Snapshots > > Of course, taking snapshots is a good capability and we want to use lvm > for that. > > > Scenaraio 2: Increase Gluster volume > > We want to increase the Gluster volume by adding hdd's and/or by adding > > dell powervaults later. We got the recommendation to set up a new Gluster > volume > > for the powervaults and don't use lvm in that case (lvresize ....) . 
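For the question above about growing an existing volume versus adding new storage as separate bricks, a minimal sketch; device, LV, path and volume names are all placeholders:

# Option 1: grow the LV under an existing brick and the XFS on top of it, online
vgextend gluster_vg /dev/sdX
lvextend -L +10T /dev/gluster_vg/brick1_lv
xfs_growfs /bricks/brick1

# Option 2: leave the existing bricks alone, add new ones and rebalance
# (replicated volumes need bricks added in multiples of the replica count)
gluster volume add-brick myvol server5:/bricks/brick2/data
gluster volume rebalance myvol start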
> > > What would you suggest and how do you manage both lvm and Glusterfs > together? > > > Thanks in advance. > > > Felix > > _______________________________________________ > Gluster-users mailing list > Gluster-users at Gluster.org > https://lists.Gluster.org/mailman/listinfo/Gluster-users > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From hunter86_bg at yahoo.com Mon Apr 8 18:47:06 2019 From: hunter86_bg at yahoo.com (Strahil) Date: Mon, 08 Apr 2019 21:47:06 +0300 Subject: [Gluster-users] Gluster and LVM Message-ID: Correct me if I'm wrong but thin LVM is needed for creation of snapshots. I am a new gluster user , but I don't see any LVM issues so far. Best Regards, Strahil NikolovOn Apr 8, 2019 21:15, Alex K wrote: > > I use gluster on top of lvm for several years without any issues.? > > On Mon, Apr 8, 2019, 10:43 Felix K?lzow wrote: >> >> Thank you very much for your response. >> >> I fully agree that using LVM has great advantages. Maybe there is a misunderstanding, >> >> but I really got the recommendation to not use (normal) LVM in combination with gluster to >> >> increase the volume. Maybe someone in the community has some good or bad experience >> >> using LVM and gluster in combination. So please let me know :) >> >> >>> One of the arguments for things like Gluster and Ceph is that you can many storage nodes that operate in parallel so that the ideal is a very large number of small drive arrays over a small number of very large drive arrays. >> >> I also agree we that. In our case, we actually plan to get Redhat Gluster Storage Support and an increase of >> >> storage nodes would mean an increase of support costs while the same amount of storage volume is available. >> >> So we are looking for a reasonable compromise. >> >> Felix >> >> On 03.04.19 17:12, Alvin Starr wrote: >>> >>> As a general rule I always suggest using LVM. >>> I have had LVM save my career a few times. >>> I believe that if you wish to use Gluster snapshots then the underlying system needs to be a thinly provisioned LVM volume. >>> >>> Adding storage space to an LVM is easy and all modern file-systems support online growing so it is easy to grow a file-system. >>> >>> If you have directory trees that are very deep and wide then you may want to put a bit of thought into how you configure your Gluster installation. >>> We have a volume with about 50M files and something like an xfs dump or rsync of the underlying filesystem will take close to a day but copying the data over Gluster takes weeks. >>> This is a problem with all clustered file systems because there is extra locking and co-ordination required for file operations. >>> >>> Also you need to realize that the performance of something like the powervault is limited to the speed of the connection to your server. >>> So that a single SAS link is limited to 6Gb(for example) and so is your disk array but most internal raid controllers will support the number of ports * 6Gb. >>> This means that a computer with 12 drives in the front will access disk faster than a system with a 12 drive disk array attached by a few SAS links. >>> >>> One of the arguments for things like Gluster and Ceph is that you can many storage nodes that operate in parallel so that the ideal is a very large number of small drive arrays over a small number of very large drive arrays. 
>>> >>> >>> On 4/3/19 10:20 AM, kbh-admin wrote: >>>> >>>> Hello Gluster-Community, >>>> >>>> >>>> we consider to build several Gluster-servers and have a question regarding? lvm and Glusterfs. >>>> >>>> >>>> Scenario 1: Snapshots >>>> >>>> Of course, taking snapshots is a good capability and we want to use lvm for that. >>>> >>>> >>>> Scenaraio 2: Increase Gluster volume >>>> >>>> We want to increase the Gluster volume by adding hdd's and/or by adding >>>> >>>> dell powervaults later. We got the recommendation to set up a new Gluster volume >>>> >>>> for the powervaults and don't use lvm in that case (lvresize ....) . >>>> >>>> >>>> What would you suggest and how do you manage both lvm and Glusterfs together? >>>> >>>> >>>> Thanks in advance. >>>> >>>> >>>> Felix >>>> >>>> _______________________________________________ >>>> Gluster-users mailing list -------------- next part -------------- An HTML attachment was scrubbed... URL: From avishwan at redhat.com Tue Apr 9 04:10:59 2019 From: avishwan at redhat.com (Aravinda) Date: Tue, 09 Apr 2019 09:40:59 +0530 Subject: [Gluster-users] Rsync in place of heal after brick failure In-Reply-To: References: Message-ID: On Mon, 2019-04-08 at 09:01 -0400, Tom Fite wrote: > Thanks for the idea, Poornima. Testing shows that xfsdump and > xfsrestore is much faster than rsync since it handles small files > much better. I don't have extra space to store the dumps but I was > able to figure out how to pipe the xfsdump and restore via ssh. For > anyone else that's interested: > > On source machine, run: > > xfsdump -J - /dev/mapper/[vg]-[brick] | ssh root@[destination fqdn] > xfsrestore -J - [/path/to/brick] Nice. Thanks for sharing > > -Tom > > On Mon, Apr 1, 2019 at 9:56 PM Poornima Gurusiddaiah < > pgurusid at redhat.com> wrote: > > You could also try xfsdump and xfsrestore if you brick filesystem > > is xfs and the destination disk can be attached locally? This will > > be much faster. > > > > Regards, > > Poornima > > > > On Tue, Apr 2, 2019, 12:05 AM Tom Fite wrote: > > > Hi all, > > > > > > I have a very large (65 TB) brick in a replica 2 volume that > > > needs to be re-copied from scratch. A heal will take a very long > > > time with performance degradation on the volume so I investigated > > > using rsync to do the brunt of the work. > > > > > > The command: > > > > > > rsync -av -H -X --numeric-ids --progress server1:/data/brick1/gv0 > > > /data/brick1/ > > > > > > Running with -H assures that the hard links in .glusterfs are > > > preserved, and -X preserves all of gluster's extended attributes. > > > > > > I've tested this on my test environment as follows: > > > > > > 1. Stop glusterd and kill procs > > > 2. Move brick volume to backup dir > > > 3. Run rsync > > > 4. Start glusterd > > > 5. Observe gluster status > > > > > > All appears to be working correctly. Gluster status reports all > > > bricks online, all data is accessible in the volume, and I don't > > > see any errors in the logs. > > > > > > Anybody else have experience trying this? 
> > > > > > Thanks > > > -Tom > > > _______________________________________________ > > > Gluster-users mailing list > > > Gluster-users at gluster.org > > > https://lists.gluster.org/mailman/listinfo/gluster-users > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -- regards Aravinda From pgurusid at redhat.com Tue Apr 9 04:23:02 2019 From: pgurusid at redhat.com (Poornima Gurusiddaiah) Date: Tue, 9 Apr 2019 09:53:02 +0530 Subject: [Gluster-users] Rsync in place of heal after brick failure In-Reply-To: References: Message-ID: On Mon, Apr 8, 2019, 6:31 PM Tom Fite wrote: > Thanks for the idea, Poornima. Testing shows that xfsdump and xfsrestore > is much faster than rsync since it handles small files much better. I don't > have extra space to store the dumps but I was able to figure out how to > pipe the xfsdump and restore via ssh. For anyone else that's interested: > > On source machine, run: > > xfsdump -J - /dev/mapper/[vg]-[brick] | ssh root@[destination fqdn] > xfsrestore -J - [/path/to/brick] > That's great. Is it possible for you to write a short summary on this in your blog or in the Gluster/blogs [1]? The summary would be very helpful for other users as well. If you could also include details on the approaches you explored and the time each would take for the 65 TB data. Thanks in advance. We will also see how we could incorporate this in replace brick/offline migration. [1] https://gluster.github.io/devblog/write-for-gluster Thanks, Poornima > -Tom > > On Mon, Apr 1, 2019 at 9:56 PM Poornima Gurusiddaiah > wrote: > >> You could also try xfsdump and xfsrestore if you brick filesystem is xfs >> and the destination disk can be attached locally? This will be much faster. >> >> Regards, >> Poornima >> >> On Tue, Apr 2, 2019, 12:05 AM Tom Fite wrote: >> >>> Hi all, >>> >>> I have a very large (65 TB) brick in a replica 2 volume that needs to be >>> re-copied from scratch. A heal will take a very long time with performance >>> degradation on the volume so I investigated using rsync to do the brunt of >>> the work. >>> >>> The command: >>> >>> rsync -av -H -X --numeric-ids --progress server1:/data/brick1/gv0 >>> /data/brick1/ >>> >>> Running with -H assures that the hard links in .glusterfs are preserved, >>> and -X preserves all of gluster's extended attributes. >>> >>> I've tested this on my test environment as follows: >>> >>> 1. Stop glusterd and kill procs >>> 2. Move brick volume to backup dir >>> 3. Run rsync >>> 4. Start glusterd >>> 5. Observe gluster status >>> >>> All appears to be working correctly. Gluster status reports all bricks >>> online, all data is accessible in the volume, and I don't see any errors in >>> the logs. >>> >>> Anybody else have experience trying this? >>> >>> Thanks >>> -Tom >>> _______________________________________________ >>> Gluster-users mailing list >>> Gluster-users at gluster.org >>> https://lists.gluster.org/mailman/listinfo/gluster-users >> >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From aspandey at redhat.com Tue Apr 9 05:23:24 2019 From: aspandey at redhat.com (Ashish Pandey) Date: Tue, 9 Apr 2019 01:23:24 -0400 (EDT) Subject: [Gluster-users] Rsync in place of heal after brick failure In-Reply-To: References: Message-ID: <815457965.12666287.1554787404195.JavaMail.zimbra@redhat.com> ----- Original Message ----- From: "Poornima Gurusiddaiah" To: "Tom Fite" Cc: "Gluster-users" Sent: Tuesday, April 9, 2019 9:53:02 AM Subject: Re: [Gluster-users] Rsync in place of heal after brick failure On Mon, Apr 8, 2019, 6:31 PM Tom Fite < tomfite at gmail.com > wrote: Thanks for the idea, Poornima. Testing shows that xfsdump and xfsrestore is much faster than rsync since it handles small files much better. I don't have extra space to store the dumps but I was able to figure out how to pipe the xfsdump and restore via ssh. For anyone else that's interested: On source machine, run: xfsdump -J - /dev/mapper/[vg]-[brick] | ssh root@[destination fqdn] xfsrestore -J - [/path/to/brick] That's great. Is it possible for you to write a short summary on this in your blog or in the Gluster/blogs [1]? The summary would be very helpful for other users as well. If you could also include details on the approaches you explored and the time each would take for the 65 TB data. Thanks in advance. We will also see how we could incorporate this in replace brick/offline migration. [1] https://gluster.github.io/devblog/write-for-gluster Thanks, Poornima
-Tom On Mon, Apr 1, 2019 at 9:56 PM Poornima Gurusiddaiah < pgurusid at redhat.com > wrote:
You could also try xfsdump and xfsrestore if your brick filesystem is xfs and the destination disk can be attached locally? This will be much faster. Regards, Poornima On Tue, Apr 2, 2019, 12:05 AM Tom Fite < tomfite at gmail.com > wrote:
Hi all, I have a very large (65 TB) brick in a replica 2 volume that needs to be re-copied from scratch. A heal will take a very long time with performance degradation on the volume so I investigated using rsync to do the brunt of the work. The command: rsync -av -H -X --numeric-ids --progress server1:/data/brick1/gv0 /data/brick1/ Running with -H assures that the hard links in .glusterfs are preserved, and -X preserves all of gluster's extended attributes. I've tested this on my test environment as follows: 1. Stop glusterd and kill procs 2. Move brick volume to backup dir 3. Run rsync 4. Start glusterd 5. Observe gluster status
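A rough command-level version of the five steps above; the paths come from the mail, the volume name gv0 is assumed from the brick path, and this is a sketch rather than a tested procedure:

systemctl stop glusterd                  # 1. stop the management daemon
pkill glusterfsd; pkill glusterfs        #    ...and any leftover brick/client processes
mv /data/brick1/gv0 /data/brick1/gv0.bak # 2. move the old brick out of the way
rsync -av -H -X --numeric-ids --progress server1:/data/brick1/gv0 /data/brick1/   # 3. copy from the healthy replica
systemctl start glusterd                 # 4. bring gluster back up
gluster volume status                    # 5. check that all bricks are online
gluster volume heal gv0 info             #    and that nothing is pending heal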
Just want to add one step to quickly test this. You can kill the other brick (the one you did not touch) and then try to access your volume. This ensures that all file operations land on the new brick, so you can see whether everything is accessible.
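A sketch of that extra check, assuming the same volume (gv0, inferred from the brick path) and a FUSE mount on /mnt/gv0 (placeholder):

gluster volume status gv0        # find the PID of the brick that was NOT rebuilt
kill <pid-of-untouched-brick>    # stop it temporarily so all I/O hits the rebuilt brick
ls -lR /mnt/gv0 | head           # spot-check that data is still readable through the mount
md5sum /mnt/gv0/some/known/file
gluster volume start gv0 force   # bring the stopped brick back
gluster volume heal gv0 info     # and let self-heal settle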
All appears to be working correctly. Gluster status reports all bricks online, all data is accessible in the volume, and I don't see any errors in the logs. Anybody else have experience trying this? Thanks -Tom _______________________________________________ Gluster-users mailing list Gluster-users at gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
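Since the point of -H and -X above is to keep the .glusterfs hard links and the gluster extended attributes intact, a quick spot check on the copied brick could look like this (the file path is a placeholder):

getfattr -d -m . -e hex /data/brick1/gv0/path/to/file   # gluster xattrs (trusted.gfid etc.) should match the source brick
stat -c '%h %n' /data/brick1/gv0/path/to/file           # a link count >= 2 means the .glusterfs hard link survived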
_______________________________________________ Gluster-users mailing list Gluster-users at gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From rightkicktech at gmail.com Tue Apr 9 05:26:18 2019 From: rightkicktech at gmail.com (Alex K) Date: Tue, 9 Apr 2019 08:26:18 +0300 Subject: [Gluster-users] Gluster and LVM In-Reply-To: References: Message-ID: On Mon, Apr 8, 2019, 21:47 Strahil wrote: > Correct me if I'm wrong but thin LVM is needed for creation of snapshots. > Yes, you need thin provisioned logical volumes for gluster snapshots. Actually, gluster snapshots are lvm snapshots under the hood. > I am a new gluster user , but I don't see any LVM issues so far. > Neither me > Best Regards, > Strahil Nikolov > On Apr 8, 2019 21:15, Alex K wrote: > > I use gluster on top of lvm for several years without any issues. > > On Mon, Apr 8, 2019, 10:43 Felix K?lzow wrote: > > Thank you very much for your response. > > I fully agree that using LVM has great advantages. Maybe there is a > misunderstanding, > > but I really got the recommendation to not use (normal) LVM in combination > with gluster to > > increase the volume. *Maybe someone in the community has some good or bad > experience* > > *using LVM and gluster in combination.* So please let me know :) > > > One of the arguments for things like Gluster and Ceph is that you can many > storage nodes that operate in parallel so that the ideal is a very large > number of small drive arrays over a small number of very large drive > arrays. > > I also agree we that. In our case, we actually plan to get Redhat Gluster > Storage Support and an increase of > > storage nodes would mean an increase of support costs while the same > amount of storage volume is available. > > So we are looking for a reasonable compromise. > > Felix > On 03.04.19 17:12, Alvin Starr wrote: > > As a general rule I always suggest using LVM. > I have had LVM save my career a few times. > I believe that if you wish to use Gluster snapshots then the underlying > system needs to be a thinly provisioned LVM volume. > > Adding storage space to an LVM is easy and all modern file-systems support > online growing so it is easy to grow a file-system. > > If you have directory trees that are very deep and wide then you may want > to put a bit of thought into how you configure your Gluster installation. > We have a volume with about 50M files and something like an xfs dump or > rsync of the underlying filesystem will take close to a day but copying the > data over Gluster takes weeks. > This is a problem with all clustered file systems because there is extra > locking and co-ordination required for file operations. > > Also you need to realize that the performance of something like the > powervault is limited to the speed of the connection to your server. > So that a single SAS link is limited to 6Gb(for example) and so is your > disk array but most internal raid controllers will support the number of > ports * 6Gb. > This means that a computer with 12 drives in the front will access disk > faster than a system with a 12 drive disk array attached by a few SAS > links. > > One of the arguments for things like Gluster and Ceph is that you can many > storage nodes that operate in parallel so that the ideal is a very large > number of small drive arrays over a small number of very large drive > arrays. 
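As noted above, gluster snapshots are LVM thin snapshots underneath, so a brick that should be snapshottable is normally created on a thin LV. A minimal sketch with made-up device, names and sizes:

pvcreate /dev/sdX
vgcreate gluster_vg /dev/sdX
lvcreate -L 1T -T gluster_vg/brick_pool                 # thin pool
lvcreate -V 1T -T gluster_vg/brick_pool -n brick1_lv    # thin LV for the brick
mkfs.xfs -i size=512 /dev/gluster_vg/brick1_lv
mkdir -p /bricks/brick1
mount /dev/gluster_vg/brick1_lv /bricks/brick1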
> > > On 4/3/19 10:20 AM, kbh-admin wrote: > > Hello Gluster-Community, > > > we consider to build several Gluster-servers and have a question > regarding lvm and Glusterfs. > > > Scenario 1: Snapshots > > Of course, taking snapshots is a good capability and we want to use lvm > for that. > > > Scenaraio 2: Increase Gluster volume > > We want to increase the Gluster volume by adding hdd's and/or by adding > > dell powervaults later. We got the recommendation to set up a new Gluster > volume > > for the powervaults and don't use lvm in that case (lvresize ....) . > > > What would you suggest and how do you manage both lvm and Glusterfs > together? > > > Thanks in advance. > > > Felix > > _______________________________________________ > Gluster-users mailing list > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hunter86_bg at yahoo.com Tue Apr 9 15:34:53 2019 From: hunter86_bg at yahoo.com (Strahil) Date: Tue, 09 Apr 2019 18:34:53 +0300 Subject: [Gluster-users] Rsync in place of heal after brick failure Message-ID: Correct me if I'm wrong but I have been left with the impression that cluster heal is multi-process , multi-connection event and would benefit from a bonding like balance-alb. I don't have much experience with xfsdump, but it looks like a single process that uses single connection and thus only LACP can be beneficial. Am I wrong? Best Regards, Strahil NikolovOn Apr 9, 2019 07:10, Aravinda wrote: > > On Mon, 2019-04-08 at 09:01 -0400, Tom Fite wrote: > > Thanks for the idea, Poornima. Testing shows that xfsdump and > > xfsrestore is much faster than rsync since it handles small files > > much better. I don't have extra space to store the dumps but I was > > able to figure out how to pipe the xfsdump and restore via ssh. For > > anyone else that's interested: > > > > On source machine, run: > > > > xfsdump -J - /dev/mapper/[vg]-[brick] | ssh root@[destination fqdn] > > xfsrestore -J - [/path/to/brick] > > Nice. Thanks for sharing > > > > > -Tom > > > > On Mon, Apr 1, 2019 at 9:56 PM Poornima Gurusiddaiah < > > pgurusid at redhat.com> wrote: > > > You could also try xfsdump and xfsrestore if you brick filesystem > > > is xfs and the destination disk can be attached locally? This will > > > be much faster. > > > > > > Regards, > > > Poornima > > > > > > On Tue, Apr 2, 2019, 12:05 AM Tom Fite wrote: > > > > Hi all, > > > > > > > > I have a very large (65 TB) brick in a replica 2 volume that > > > > needs to be re-copied from scratch. A heal will take a very long > > > > time with performance degradation on the volume so I investigated > > > > using rsync to do the brunt of the work. > > > > > > > > The command: > > > > > > > > rsync -av -H -X --numeric-ids --progress server1:/data/brick1/gv0 > > > > /data/brick1/ > > > > > > > > Running with -H assures that the hard links in .glusterfs are > > > > preserved, and -X preserves all of gluster's extended attributes. > > > > > > > > I've tested this on my test environment as follows: > > > > > > > > 1. Stop glusterd and kill procs > > > > 2. Move brick volume to backup dir > > > > 3. Run rsync > > > > 4. Start glusterd > > > > 5. Observe gluster status > > > > > > > > All appears to be working correctly. Gluster status reports all > > > > bricks online, all data is accessible in the volume, and I don't > > > > see any errors in the logs. > > > > > > > > Anybody else have experience trying this? 
> > > > > > > > Thanks > > > > -Tom > > > > _______________________________________________ > > > > Gluster-users mailing list > > > > Gluster-users at gluster.org > > > > https://lists.gluster.org/mailman/listinfo/gluster-users > > > > _______________________________________________ > > Gluster-users mailing list > > Gluster-users at gluster.org > > https://lists.gluster.org/mailman/listinfo/gluster-users > -- > regards > Aravinda > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users From alvin at netvel.net Tue Apr 9 17:32:41 2019 From: alvin at netvel.net (Alvin Starr) Date: Tue, 9 Apr 2019 13:32:41 -0400 Subject: [Gluster-users] Rsync in place of heal after brick failure In-Reply-To: References: Message-ID: The performance needs to be compared between the two in a real environment. For example I have a system where xfsdump takes something like 4 hours for a complete dump to /dev/null but a "find . -type f > /dev/null" takes well over a day. So it seems that xfsdump is very disk read efficient. Another thing to take into consideration is the latency. If the hosts are on the same lan then life is good but if the systems are milliseconds or more away from each other then you start getting side effects from BDP(bandwidth delay product) and this can quickly take a multi-gigabit link and turn it into a multi-megabit link. BBCP supports piping data into and out of the program allowing for better use of the available bandwidth. So that may be another way to get better performance out of multiple links or links with latency issues. On 4/9/19 11:34 AM, Strahil wrote: > Correct me if I'm wrong but I have been left with the impression that cluster heal is multi-process , multi-connection event and would benefit from a bonding like balance-alb. > > I don't have much experience with xfsdump, but it looks like a single process that uses single connection and thus only LACP can be beneficial. > > Am I wrong? > > Best Regards, > Strahil NikolovOn Apr 9, 2019 07:10, Aravinda wrote: >> On Mon, 2019-04-08 at 09:01 -0400, Tom Fite wrote: >>> Thanks for the idea, Poornima. Testing shows that xfsdump and >>> xfsrestore is much faster than rsync since it handles small files >>> much better. I don't have extra space to store the dumps but I was >>> able to figure out how to pipe the xfsdump and restore via ssh. For >>> anyone else that's interested: >>> >>> On source machine, run: >>> >>> xfsdump -J - /dev/mapper/[vg]-[brick] | ssh root@[destination fqdn] >>> xfsrestore -J - [/path/to/brick] >> Nice. Thanks for sharing >> >>> -Tom >>> >>> On Mon, Apr 1, 2019 at 9:56 PM Poornima Gurusiddaiah < >>> pgurusid at redhat.com> wrote: >>>> You could also try xfsdump and xfsrestore if you brick filesystem >>>> is xfs and the destination disk can be attached locally? This will >>>> be much faster. >>>> >>>> Regards, >>>> Poornima >>>> >>>> On Tue, Apr 2, 2019, 12:05 AM Tom Fite wrote: >>>>> Hi all, >>>>> >>>>> I have a very large (65 TB) brick in a replica 2 volume that >>>>> needs to be re-copied from scratch. A heal will take a very long >>>>> time with performance degradation on the volume so I investigated >>>>> using rsync to do the brunt of the work. 
>>>>>
>>>>> The command:
>>>>>
>>>>> rsync -av -H -X --numeric-ids --progress server1:/data/brick1/gv0 /data/brick1/
>>>>>
>>>>> Running with -H assures that the hard links in .glusterfs are preserved, and -X preserves all of gluster's extended attributes.
>>>>>
>>>>> I've tested this on my test environment as follows:
>>>>>
>>>>> 1. Stop glusterd and kill procs
>>>>> 2. Move brick volume to backup dir
>>>>> 3. Run rsync
>>>>> 4. Start glusterd
>>>>> 5. Observe gluster status
>>>>>
>>>>> All appears to be working correctly. Gluster status reports all bricks online, all data is accessible in the volume, and I don't see any errors in the logs.
>>>>>
>>>>> Anybody else have experience trying this?
>>>>>
>>>>> Thanks
>>>>> -Tom
>>>>> _______________________________________________
>>>>> Gluster-users mailing list
>>>>> Gluster-users at gluster.org
>>>>> https://lists.gluster.org/mailman/listinfo/gluster-users
>> --
>> regards
>> Aravinda
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users

--
Alvin Starr           ||   land: (905)513-7688
Netvel Inc.           ||   Cell: (416)806-0133
alvin at netvel.net      ||

From hunter86_bg at yahoo.com Tue Apr 9 21:06:39 2019
From: hunter86_bg at yahoo.com (Strahil Nikolov)
Date: Tue, 9 Apr 2019 21:06:39 +0000 (UTC)
Subject: [Gluster-users] Gluster snapshot fails
References: <1800297079.797563.1554843999336.ref@mail.yahoo.com>
Message-ID: <1800297079.797563.1554843999336@mail.yahoo.com>

Hello Community,

I have a problem running a snapshot of a replica 3 arbiter 1 volume.

Error:
[root at ovirt2 ~]# gluster snapshot create before-423 engine description "Before upgrade of engine from 4.2.2 to 4.2.3"
snapshot create: failed: Snapshot is supported only for thin provisioned LV. Ensure that all bricks of engine are thinly provisioned LV.
Snapshot command failed

Volume info:

Volume Name: engine
Type: Replicate
Volume ID: 30ca1cc2-f2f7-4749-9e2e-cee9d7099ded
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: ovirt1:/gluster_bricks/engine/engine
Brick2: ovirt2:/gluster_bricks/engine/engine
Brick3: ovirt3:/gluster_bricks/engine/engine (arbiter)
Options Reconfigured:
cluster.granular-entry-heal: enable
performance.strict-o-direct: on
network.ping-timeout: 30
storage.owner-gid: 36
storage.owner-uid: 36
user.cifs: off
features.shard: on
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.quorum-type: auto
cluster.eager-lock: enable
network.remote-dio: off
performance.low-prio-threads: 32
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
cluster.enable-shared-storage: enable

All bricks are on thin LVM with plenty of space. The only thing that could be causing it is that ovirt1 & ovirt2 are on /dev/gluster_vg_ssd/gluster_lv_engine, while the arbiter is on /dev/gluster_vg_sda3/gluster_lv_engine. Is that the issue ?
Should I rename my brick's VG ? If so, why is there no mention of this in the documentation ?

Best Regards,
Strahil Nikolov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
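One way to double-check the thin-LV requirement behind the error above is to look at the LVs on each node; the LV name comes from the message, and the brick mount point is assumed from the brick path:

lvs -o lv_name,vg_name,pool_lv,lv_attr | grep gluster_lv_engine   # a thin LV shows an attr starting with 'V' and names its pool
df -h /gluster_bricks/engine                                      # cross-check which device the brick is really mounted from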
From snowmailer at gmail.com Wed Apr 10 09:42:32 2019
From: snowmailer at gmail.com (Martin Toth)
Date: Wed, 10 Apr 2019 11:42:32 +0200
Subject: [Gluster-users] Replica 3 - how to replace failed node (peer)
Message-ID: <0917AF4A-76EC-4A9E-820F-E0ADA2DA899A@gmail.com>

Hi all,

I am running replica 3 gluster with 3 bricks. One of my servers failed - all disks are showing errors and raid is in fault state.

Type: Replicate
Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 is down
Brick3: node3.san:/tank/gluster/gv0imagestore/brick1

So one of my bricks is totally failed (node2). It went down and all data are lost (failed raid on node2). Now I am running only two bricks on 2 servers out from 3.
This is a really critical problem for us, we can lose all data. I want to add new disks to node2, create a new raid array on them and try to replace the failed brick on this node.

What is the procedure of replacing Brick2 on node2, can someone advise? I can't find anything relevant in the documentation.

Thanks in advance,
Martin

References: <0917AF4A-76EC-4A9E-820F-E0ADA2DA899A@gmail.com>
Message-ID: 

Hello Martin,

look here: https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.4/pdf/administration_guide/Red_Hat_Gluster_Storage-3.4-Administration_Guide-en-US.pdf on page 324. There is a manual on how to replace a brick in case of a hardware failure.

Regards
David Spisla

On Wed, 10 Apr 2019 at 11:42, Martin Toth wrote:

> Hi all,
>
> I am running replica 3 gluster with 3 bricks. One of my servers failed - all disks are showing errors and raid is in fault state.
>
> Type: Replicate
> Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a
> Status: Started
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: node1.san:/tank/gluster/gv0imagestore/brick1
> Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 is down
> Brick3: node3.san:/tank/gluster/gv0imagestore/brick1
>
> So one of my bricks is totally failed (node2). It went down and all data are lost (failed raid on node2). Now I am running only two bricks on 2 servers out from 3.
> This is really critical problem for us, we can lost all data. I want to add new disks to node2, create new raid array on them and try to replace failed brick on this node.
>
> What is the procedure of replacing Brick2 on node2, can someone advice? I can't find anything relevant in documentation.
>
> Thanks in advance,
> Martin
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From pascal.suter at dalco.ch Wed Apr 10 10:14:38 2019
From: pascal.suter at dalco.ch (Pascal Suter)
Date: Wed, 10 Apr 2019 12:14:38 +0200
Subject: [Gluster-users] performance - what can I expect
In-Reply-To: <381efa03-78b3-e244-9f52-054b357b5d57@dalco.ch>
References: <381efa03-78b3-e244-9f52-054b357b5d57@dalco.ch>
Message-ID: <8f150899-321b-f184-978c-9b7b01e6fb39@dalco.ch>

i continued my testing with 5 clients, all attached over 100Gbit/s omni-path via IP over IB. when i run the same iozone benchmark across all 5 clients where gluster is mounted using the glusterfs client, i get an aggregated write throughput of only about 400MB/s and an aggregated read throughput of 1.5GB/s. Each node was writing a single 200GB file in 16MB chunks and the files were distributed across all three bricks on the server. the connection was established over Omnipath for sure, as there is no other link between the nodes and the server.

i have no clue what i'm doing wrong here. i can't believe that this is a normal performance people would expect to see from gluster. i guess nobody would be using it if it was this slow.
again, when written dreictly to the xfs filesystem on the bricks, i get over 6GB/s read and write throughput using the same benchmark. any advise is appreciated cheers Pascal On 04.04.19 12:03, Pascal Suter wrote: > I just noticed i left the most important parameters out :) > > here's the write command with filesize and recordsize in it as well :) > > ./iozone -i 0 -t 1 -F /mnt/gluster/storage/thread1 -+n -c -C -e -I -w > -+S 0 -s 200G -r 16384k > > also i ran the benchmark without direct_io which resulted in an even > worse performance. > > i also tried to mount the gluster volume via nfs-ganesha which further > reduced throughput down to about 450MB/s > > if i run the iozone benchmark with 3 threads writing to all three > bricks directly (from the xfs filesystem) i get throughputs of around > 6GB/s .. if I run the same benchmark through gluster mounted locally > using the fuse client and with enough threads so that each brick gets > at least one file written to it, i end up seing throughputs around > 1.5GB/s .. that's a 4x decrease in performance. at it actually is the > same if i run the benchmark with less threads and files only get > written to two out of three bricks. > > cpu load on the server is around 25% by the way, nicely distributed > across all available cores. > > i can't believe that gluster should really be so slow and everybody is > just happily using it. any hints on what i'm doing wrong are very > welcome. > > i'm using gluster 6.0 by the way. > > regards > > Pascal > > On 03.04.19 12:28, Pascal Suter wrote: >> Hi all >> >> I am currently testing gluster on a single server. I have three >> bricks, each a hardware RAID6 volume with thin provisioned LVM that >> was aligned to the RAID and then formatted with xfs. >> >> i've created a distributed volume so that entire files get >> distributed across my three bricks. >> >> first I ran a iozone benchmark across each brick testing the read and >> write perofrmance of a single large file per brick >> >> i then mounted my gluster volume locally and ran another iozone run >> with the same parameters writing a single file. the file went to >> brick 1 which, when used driectly, would write with 2.3GB/s and read >> with 1.5GB/s. however, through gluster i got only 800MB/s read and >> 750MB/s write throughput >> >> another run with two processes each writing a file, where one file >> went to the first brick and the other file to the second brick (which >> by itself when directly accessed wrote at 2.8GB/s and read at >> 2.7GB/s) resulted in 1.2GB/s of aggregated write and also aggregated >> read throughput. >> >> Is this a normal performance i can expect out of a glusterfs or is it >> worth tuning in order to really get closer to the actual brick >> filesystem performance? >> >> here are the iozone commands i use for writing and reading.. 
note >> that i am using directIO in order to make sure i don't get fooled by >> cache :) >> >> ./iozone -i 0 -t 1 -F /mnt/brick${b}/thread1 -+n -c -C -e -I -w -+S 0 >> -s $filesize -r $recordsize > iozone-brick${b}-write.txt >> >> ./iozone -i 1 -t 1 -F /mnt/brick${b}/thread1 -+n -c -C -e -I -w -+S 0 >> -s $filesize -r $recordsize > iozone-brick${b}-read.txt >> >> cheers >> >> Pascal >> >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users From rkavunga at redhat.com Wed Apr 10 10:16:25 2019 From: rkavunga at redhat.com (RAFI KC) Date: Wed, 10 Apr 2019 15:46:25 +0530 Subject: [Gluster-users] [Gluster-devel] Replica 3 - how to replace failed node (peer) In-Reply-To: References: <0917AF4A-76EC-4A9E-820F-E0ADA2DA899A@gmail.com> Message-ID: <3292fe0e-f164-43c0-f922-fa2176158749@redhat.com> reset brick is another way of replacing a brick. this usually helpful, when you want to replace the brick with same name. You can find the documentation here https://docs.gluster.org/en/latest/release-notes/3.9.0/#introducing-reset-brick-command. In your case, I think you can use replace brick. So you can initiate a reset-brick start, then you have to replace your failed disk and create new brick with same name . Once you have healthy disk and brick, you can commit the reset-brick. Let's know if you have any question, Rafi KC On 4/10/19 3:39 PM, David Spisla wrote: > Hello Martin, > > look here: > https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.4/pdf/administration_guide/Red_Hat_Gluster_Storage-3.4-Administration_Guide-en-US.pdf > on page 324. There is a manual how to replace a brick in case of a > hardware failure > > Regards > David Spisla > > Am Mi., 10. Apr. 2019 um 11:42?Uhr schrieb Martin Toth > >: > > Hi all, > > I am running replica 3 gluster with 3 bricks. One of my servers > failed - all disks are showing errors and raid is in fault state. > > Type: Replicate > Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a > Status: Started > Number of Bricks: 1 x 3 = 3 > Transport-type: tcp > Bricks: > Brick1: node1.san:/tank/gluster/gv0imagestore/brick1 > Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 is down > Brick3: node3.san:/tank/gluster/gv0imagestore/brick1 > > So one of my bricks is totally failed (node2). It went down and > all data are lost (failed raid on node2). Now I am running only > two bricks on 2 servers out from 3. > This is really critical problem for us, we can lost all data. I > want to add new disks to node2, create new raid array on them and > try to replace failed brick on this node. > > What is the procedure of replacing Brick2 on node2, can someone > advice? I can?t find anything relevant in documentation. > > Thanks in advance, > Martin > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users > > > _______________________________________________ > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ksubrahm at redhat.com Wed Apr 10 10:20:40 2019 From: ksubrahm at redhat.com (Karthik Subrahmanya) Date: Wed, 10 Apr 2019 15:50:40 +0530 Subject: [Gluster-users] Replica 3 - how to replace failed node (peer) In-Reply-To: <0917AF4A-76EC-4A9E-820F-E0ADA2DA899A@gmail.com> References: <0917AF4A-76EC-4A9E-820F-E0ADA2DA899A@gmail.com> Message-ID: Hi Martin, After you add the new disks and creating raid array, you can run the following command to replace the old brick with new one: - If you are going to use a different name to the new brick you can run gluster volume replace-brick commit force - If you are planning to use the same name for the new brick as well then you can use gluster volume reset-brick commit force Here old-brick & new-brick's hostname & path should be same. After replacing the brick, make sure the brick comes online using volume status. Heal should automatically start, you can check the heal status to see all the files gets replicated to the newly added brick. If it does not start automatically, you can manually start that by running gluster volume heal . HTH, Karthik On Wed, Apr 10, 2019 at 3:13 PM Martin Toth wrote: > Hi all, > > I am running replica 3 gluster with 3 bricks. One of my servers failed - > all disks are showing errors and raid is in fault state. > > Type: Replicate > Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a > Status: Started > Number of Bricks: 1 x 3 = 3 > Transport-type: tcp > Bricks: > Brick1: node1.san:/tank/gluster/gv0imagestore/brick1 > Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 Brick3: node3.san:/tank/gluster/gv0imagestore/brick1 > > So one of my bricks is totally failed (node2). It went down and all data > are lost (failed raid on node2). Now I am running only two bricks on 2 > servers out from 3. > This is really critical problem for us, we can lost all data. I want to > add new disks to node2, create new raid array on them and try to replace > failed brick on this node. > > What is the procedure of replacing Brick2 on node2, can someone advice? I > can?t find anything relevant in documentation. > > Thanks in advance, > Martin > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From snowmailer at gmail.com Wed Apr 10 10:34:37 2019 From: snowmailer at gmail.com (Martin Toth) Date: Wed, 10 Apr 2019 12:34:37 +0200 Subject: [Gluster-users] [External] Replica 3 - how to replace failed node (peer) In-Reply-To: References: <0917AF4A-76EC-4A9E-820F-E0ADA2DA899A@gmail.com> Message-ID: <804C3826-0173-431C-A286-085E7E582212@gmail.com> I?ve read this documentation but step 4 is really unclear to me. I don?t understand related mkdir/rmdir/setfattr and so on. Step 4: Using the gluster volume fuse mount (In this example: /mnt/r2) set up metadata so that data will be synced to new brick (In this case it is from Server1:/home/gfs/r2_1 to Server1:/home/gfs/r2_5) Why should I change trusted.non-existent-key on this volume? It is even more confusing because other mentioned howtos does not contain this step at all. BR, Martin > On 10 Apr 2019, at 11:54, Davide Obbi wrote: > > https://docs.gluster.org/en/v3/Administrator%20Guide/Managing%20Volumes/#replace-faulty-brick > > On Wed, Apr 10, 2019 at 11:42 AM Martin Toth > wrote: > Hi all, > > I am running replica 3 gluster with 3 bricks. 
One of my servers failed - all disks are showing errors and raid is in fault state. > > Type: Replicate > Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a > Status: Started > Number of Bricks: 1 x 3 = 3 > Transport-type: tcp > Bricks: > Brick1: node1.san:/tank/gluster/gv0imagestore/brick1 > Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 Brick3: node3.san:/tank/gluster/gv0imagestore/brick1 > > So one of my bricks is totally failed (node2). It went down and all data are lost (failed raid on node2). Now I am running only two bricks on 2 servers out from 3. > This is really critical problem for us, we can lost all data. I want to add new disks to node2, create new raid array on them and try to replace failed brick on this node. > > What is the procedure of replacing Brick2 on node2, can someone advice? I can?t find anything relevant in documentation. > > Thanks in advance, > Martin > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users > > -- > Davide Obbi > Senior System Administrator > > Booking.com B.V. > Vijzelstraat 66-80 Amsterdam 1017HL Netherlands > Direct +31207031558 > > Empowering people to experience the world since 1996 > 43 languages, 214+ offices worldwide, 141,000+ global destinations, 29 million reported listings > Subsidiary of Booking Holdings Inc. (NASDAQ: BKNG) -------------- next part -------------- An HTML attachment was scrubbed... URL: From snowmailer at gmail.com Wed Apr 10 10:38:13 2019 From: snowmailer at gmail.com (Martin Toth) Date: Wed, 10 Apr 2019 12:38:13 +0200 Subject: [Gluster-users] Replica 3 - how to replace failed node (peer) In-Reply-To: References: <0917AF4A-76EC-4A9E-820F-E0ADA2DA899A@gmail.com> Message-ID: <1634978A-E849-48DB-A160-B1AC3DB56D38@gmail.com> Thanks, this looks ok to me, I will reset brick because I don't have any data anymore on failed node so I can use same path / brick name. Is reseting brick dangerous command? Should I be worried about some possible failure that will impact remaining two nodes? I am running really old 3.7.6 but stable version. Thanks, BR! Martin > On 10 Apr 2019, at 12:20, Karthik Subrahmanya wrote: > > Hi Martin, > > After you add the new disks and creating raid array, you can run the following command to replace the old brick with new one: > > - If you are going to use a different name to the new brick you can run > gluster volume replace-brick commit force > > - If you are planning to use the same name for the new brick as well then you can use > gluster volume reset-brick commit force > Here old-brick & new-brick's hostname & path should be same. > > After replacing the brick, make sure the brick comes online using volume status. > Heal should automatically start, you can check the heal status to see all the files gets replicated to the newly added brick. If it does not start automatically, you can manually start that by running gluster volume heal . > > HTH, > Karthik > > On Wed, Apr 10, 2019 at 3:13 PM Martin Toth > wrote: > Hi all, > > I am running replica 3 gluster with 3 bricks. One of my servers failed - all disks are showing errors and raid is in fault state. 
> > Type: Replicate > Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a > Status: Started > Number of Bricks: 1 x 3 = 3 > Transport-type: tcp > Bricks: > Brick1: node1.san:/tank/gluster/gv0imagestore/brick1 > Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 Brick3: node3.san:/tank/gluster/gv0imagestore/brick1 > > So one of my bricks is totally failed (node2). It went down and all data are lost (failed raid on node2). Now I am running only two bricks on 2 servers out from 3. > This is really critical problem for us, we can lost all data. I want to add new disks to node2, create new raid array on them and try to replace failed brick on this node. > > What is the procedure of replacing Brick2 on node2, can someone advice? I can?t find anything relevant in documentation. > > Thanks in advance, > Martin > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksubrahm at redhat.com Wed Apr 10 12:26:36 2019 From: ksubrahm at redhat.com (Karthik Subrahmanya) Date: Wed, 10 Apr 2019 17:56:36 +0530 Subject: [Gluster-users] [External] Replica 3 - how to replace failed node (peer) In-Reply-To: <804C3826-0173-431C-A286-085E7E582212@gmail.com> References: <0917AF4A-76EC-4A9E-820F-E0ADA2DA899A@gmail.com> <804C3826-0173-431C-A286-085E7E582212@gmail.com> Message-ID: Hi Martin, The reset-brick command is introduced in 3.9.0 and not present in 3.7.6. You can try using the same replace-brick command with the force option even if you want to use the same name for the brick being replaced. 3.7.6 is EOLed long back and glusterfs-6 is the latest version with lots of improvements, bug fixes and new features. The release schedule can be found at [1]. Upgrading to one of the maintained branch is highly recommended. On Wed, Apr 10, 2019 at 4:14 PM Martin Toth wrote: > I?ve read this documentation but step 4 is really unclear to me. I don?t > understand related mkdir/rmdir/setfattr and so on. > > Step 4: > > *Using the gluster volume fuse mount (In this example: /mnt/r2) set up > metadata so that data will be synced to new brick (In this case it is > from Server1:/home/gfs/r2_1 to Server1:/home/gfs/r2_5)* > > Why should I change trusted.non-existent-key on this volume? > It is even more confusing because other mentioned howtos does not contain > this step at all. > Those steps were needed in the older releases to set some metadata on the good bricks so that heal should not happen from the replaced brick to good bricks, which can lead to data loss. Since you are on 3.7.6, we have automated all these steps for you in that branch. You just need to run the replace-brick command, which will take care of all those things. [1] https://www.gluster.org/release-schedule/ Regards, Karthik > > BR, > Martin > > On 10 Apr 2019, at 11:54, Davide Obbi wrote: > > > https://docs.gluster.org/en/v3/Administrator%20Guide/Managing%20Volumes/#replace-faulty-brick > > On Wed, Apr 10, 2019 at 11:42 AM Martin Toth wrote: > >> Hi all, >> >> I am running replica 3 gluster with 3 bricks. One of my servers failed - >> all disks are showing errors and raid is in fault state. 
>> >> Type: Replicate >> Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a >> Status: Started >> Number of Bricks: 1 x 3 = 3 >> Transport-type: tcp >> Bricks: >> Brick1: node1.san:/tank/gluster/gv0imagestore/brick1 >> Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 > Brick3: node3.san:/tank/gluster/gv0imagestore/brick1 >> >> So one of my bricks is totally failed (node2). It went down and all data >> are lost (failed raid on node2). Now I am running only two bricks on 2 >> servers out from 3. >> This is really critical problem for us, we can lost all data. I want to >> add new disks to node2, create new raid array on them and try to replace >> failed brick on this node. >> >> What is the procedure of replacing Brick2 on node2, can someone advice? I >> can?t find anything relevant in documentation. >> >> Thanks in advance, >> Martin >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users > > > > -- > Davide Obbi > Senior System Administrator > > Booking.com B.V. > Vijzelstraat 66-80 Amsterdam 1017HL Netherlands > Direct +31207031558 > [image: Booking.com] > Empowering people to experience the world since 1996 > 43 languages, 214+ offices worldwide, 141,000+ global destinations, 29 > million reported listings > Subsidiary of Booking Holdings Inc. (NASDAQ: BKNG) > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From rkavunga at redhat.com Wed Apr 10 13:05:01 2019 From: rkavunga at redhat.com (Rafi Kavungal Chundattu Parambil) Date: Wed, 10 Apr 2019 09:05:01 -0400 (EDT) Subject: [Gluster-users] Gluster snapshot fails In-Reply-To: <1800297079.797563.1554843999336@mail.yahoo.com> References: <1800297079.797563.1554843999336.ref@mail.yahoo.com> <1800297079.797563.1554843999336@mail.yahoo.com> Message-ID: <1066182693.15102103.1554901501174.JavaMail.zimbra@redhat.com> Hi Strahil, The name of device is not at all a problem here. Can you please check the log of glusterd, and see if there is any useful information about the failure. Also please provide the output of `lvscan` and `lvs --noheadings -o pool_lv` from all nodes Regards Rafi KC ----- Original Message ----- From: "Strahil Nikolov" To: gluster-users at gluster.org Sent: Wednesday, April 10, 2019 2:36:39 AM Subject: [Gluster-users] Gluster snapshot fails Hello Community, I have a problem running a snapshot of a replica 3 arbiter 1 volume. Error: [root at ovirt2 ~]# gluster snapshot create before-423 engine description "Before upgrade of engine from 4.2.2 to 4.2.3" snapshot create: failed: Snapshot is supported only for thin provisioned LV. Ensure that all bricks of engine are thinly provisioned LV. 
Snapshot command failed Volume info: Volume Name: engine Type: Replicate Volume ID: 30ca1cc2-f2f7-4749-9e2e-cee9d7099ded Status: Started Snapshot Count: 0 Number of Bricks: 1 x (2 + 1) = 3 Transport-type: tcp Bricks: Brick1: ovirt1:/gluster_bricks/engine/engine Brick2: ovirt2:/gluster_bricks/engine/engine Brick3: ovirt3:/gluster_bricks/engine/engine (arbiter) Options Reconfigured: cluster.granular-entry-heal: enable performance.strict-o-direct: on network.ping-timeout: 30 storage.owner-gid: 36 storage.owner-uid: 36 user.cifs: off features.shard: on cluster.shd-wait-qlength: 10000 cluster.shd-max-threads: 8 cluster.locking-scheme: granular cluster.data-self-heal-algorithm: full cluster.server-quorum-type: server cluster.quorum-type: auto cluster.eager-lock: enable network.remote-dio: off performance.low-prio-threads: 32 performance.io-cache: off performance.read-ahead: off performance.quick-read: off transport.address-family: inet nfs.disable: on performance.client-io-threads: off cluster.enable-shared-storage: enable All bricks are on thin lvm with plenty of space, the only thing that could be causing it is that ovirt1 & ovirt2 are on /dev/gluster_vg_ssd/gluster_lv_engine , while arbiter is on /dev/gluster_vg_sda3/gluster_lv_engine. Is that the issue ? Should I rename my brick's VG ? If so, why there is no mentioning in the documentation ? Best Regards, Strahil Nikolov _______________________________________________ Gluster-users mailing list Gluster-users at gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users From karim.roumani at tekreach.com Mon Apr 1 20:00:20 2019 From: karim.roumani at tekreach.com (Karim Roumani) Date: Mon, 1 Apr 2019 13:00:20 -0700 Subject: [Gluster-users] Help: gluster-block In-Reply-To: References: Message-ID: Thank you Prasanna for your quick response very much appreaciated we will review and get back to you. ? On Mon, Mar 25, 2019 at 9:00 AM Prasanna Kalever wrote: > [ adding +gluster-users for archive purpose ] > > On Sat, Mar 23, 2019 at 1:51 AM Jeffrey Chin > wrote: > > > > Hello Mr. Kalever, > > Hello Jeffrey, > > > > > I am currently working on a project to utilize GlusterFS for VMWare VMs. > In our research, we found that utilizing block devices with GlusterFS would > be the best approach for our use case (correct me if I am wrong). I saw the > gluster utility that you are a contributor for called gluster-block ( > https://github.com/gluster/gluster-block), and I had a question about the > configuration. From what I understand, gluster-block only works on the > servers that are serving the gluster volume. Would it be possible to run > the gluster-block utility on a client machine that has a gluster volume > mounted to it? > > Yes, that is right! At the moment gluster-block is coupled with > glusterd for simplicity. > But we have made some changes here [1] to provide a way to specify > server address (volfile-server) which is outside the gluster-blockd > node, please take a look. > > Although it is not complete solution, but it should at-least help for > some usecases. Feel free to raise an issue [2] with the details about > your usecase and etc or submit a PR by your self :-) > We never picked it, as we never have a usecase needing separation of > gluster-blockd and glusterd. > > > > > I also have another question: how do I make the iSCSI targets persist if > all of the gluster nodes were rebooted? It seems like once all of the nodes > reboot, I am unable to reconnect to the iSCSI targets created by the > gluster-block utility. 
> > do you mean rebooting iscsi initiator ? or gluster-block/gluster > target/server nodes ? > > 1. for initiator to automatically connect to block devices post > reboot, we need to make below changes in /etc/iscsi/iscsid.conf: > node.startup = automatic > > 2. if you mean, just in case if all the gluster nodes goes down, on > the initiator all the available HA path's will be down, but we still > want the IO to be queued on the initiator, until one of the path > (gluster node) is availabe: > > for this in gluster-block sepcific section of multipath.conf you need > to replace 'no_path_retry 120' as 'no_path_retry queue' > Note: refer README for current multipath.conf setting recommendations. > > [1] https://github.com/gluster/gluster-block/pull/161 > [2] https://github.com/gluster/gluster-block/issues/new > > BRs, > -- > Prasanna > -- Thank you, *Karim Roumani* Director of Technology Solutions TekReach Solutions / Albatross Cloud 714-916-5677 Karim.Roumani at tekreach.com Albatross.cloud - One Stop Cloud Solutions Portalfronthosting.com - Complete SharePoint Solutions -------------- next part -------------- An HTML attachment was scrubbed... URL: From karim.roumani at tekreach.com Mon Apr 1 20:03:54 2019 From: karim.roumani at tekreach.com (Karim Roumani) Date: Mon, 1 Apr 2019 13:03:54 -0700 Subject: [Gluster-users] Help: gluster-block In-Reply-To: References: Message-ID: Actually we have a question. We did two tests as follows. Test 1 - iSCSI target on the glusterFS server Test 2 - iSCSI target on a separate server with gluster client Test 2 performed a read speed of <1GB/second while Test 1 about 300MB/second Any reason you see to why this may be the case? ? On Mon, Apr 1, 2019 at 1:00 PM Karim Roumani wrote: > Thank you Prasanna for your quick response very much appreaciated we will > review and get back to you. > ? > > On Mon, Mar 25, 2019 at 9:00 AM Prasanna Kalever > wrote: > >> [ adding +gluster-users for archive purpose ] >> >> On Sat, Mar 23, 2019 at 1:51 AM Jeffrey Chin >> wrote: >> > >> > Hello Mr. Kalever, >> >> Hello Jeffrey, >> >> > >> > I am currently working on a project to utilize GlusterFS for VMWare >> VMs. In our research, we found that utilizing block devices with GlusterFS >> would be the best approach for our use case (correct me if I am wrong). I >> saw the gluster utility that you are a contributor for called gluster-block >> (https://github.com/gluster/gluster-block), and I had a question about >> the configuration. From what I understand, gluster-block only works on the >> servers that are serving the gluster volume. Would it be possible to run >> the gluster-block utility on a client machine that has a gluster volume >> mounted to it? >> >> Yes, that is right! At the moment gluster-block is coupled with >> glusterd for simplicity. >> But we have made some changes here [1] to provide a way to specify >> server address (volfile-server) which is outside the gluster-blockd >> node, please take a look. >> >> Although it is not complete solution, but it should at-least help for >> some usecases. Feel free to raise an issue [2] with the details about >> your usecase and etc or submit a PR by your self :-) >> We never picked it, as we never have a usecase needing separation of >> gluster-blockd and glusterd. >> >> > >> > I also have another question: how do I make the iSCSI targets persist >> if all of the gluster nodes were rebooted? It seems like once all of the >> nodes reboot, I am unable to reconnect to the iSCSI targets created by the >> gluster-block utility. 
>> >> do you mean rebooting iscsi initiator ? or gluster-block/gluster >> target/server nodes ? >> >> 1. for initiator to automatically connect to block devices post >> reboot, we need to make below changes in /etc/iscsi/iscsid.conf: >> node.startup = automatic >> >> 2. if you mean, just in case if all the gluster nodes goes down, on >> the initiator all the available HA path's will be down, but we still >> want the IO to be queued on the initiator, until one of the path >> (gluster node) is availabe: >> >> for this in gluster-block sepcific section of multipath.conf you need >> to replace 'no_path_retry 120' as 'no_path_retry queue' >> Note: refer README for current multipath.conf setting recommendations. >> >> [1] https://github.com/gluster/gluster-block/pull/161 >> [2] https://github.com/gluster/gluster-block/issues/new >> >> BRs, >> -- >> Prasanna >> > > > -- > > Thank you, > > *Karim Roumani* > Director of Technology Solutions > > TekReach Solutions / Albatross Cloud > 714-916-5677 > Karim.Roumani at tekreach.com > Albatross.cloud - One Stop Cloud Solutions > Portalfronthosting.com - Complete > SharePoint Solutions > -- Thank you, *Karim Roumani* Director of Technology Solutions TekReach Solutions / Albatross Cloud 714-916-5677 Karim.Roumani at tekreach.com Albatross.cloud - One Stop Cloud Solutions Portalfronthosting.com - Complete SharePoint Solutions -------------- next part -------------- An HTML attachment was scrubbed... URL: From ingo at fischer-ka.de Tue Apr 2 20:23:34 2019 From: ingo at fischer-ka.de (Ingo Fischer) Date: Tue, 2 Apr 2019 22:23:34 +0200 Subject: [Gluster-users] Is "replica 4 arbiter 1" allowed to tweak client-quorum? Message-ID: <39133cb5-f3fe-4fd6-2dba-45058cdb5f1a@fischer-ka.de> Hi All, I had a replica 2 cluster to host my VM images from my Proxmox cluster. I got a bit around split brain scenarios by using "nufa" to make sure the files are located on the host where the machine also runs normally. So in fact one replica could fail and I still had the VM working. But then I thought about doing better and decided to add a node to increase replica and I decided against arbiter approach. During this I also decided to go away from nufa to make it a more normal approach. But in fact by adding the third replica and removing nufa I'm not really better on availability - only split-brain-chance. I'm still at the point that only one node is allowed to fail because else the now active client quorum is no longer met and FS goes read only (which in fact is not really better then failing completely as it was before). So I thought about adding arbiter bricks as "kind of 4th replica (but without space needs) ... but then I read in docs that only "replica 3 arbiter 1" is allowed as combination. Is this still true? If docs are true: Why arbiter is not allowed for higher replica counts? It would allow to improve on client quorum in my understanding. Thank you for your opinion and/or facts :-) Ingo From olaf.buitelaar at gmail.com Tue Apr 2 15:48:07 2019 From: olaf.buitelaar at gmail.com (Olaf Buitelaar) Date: Tue, 2 Apr 2019 17:48:07 +0200 Subject: [Gluster-users] [ovirt-users] Re: Announcing Gluster release 5.5 In-Reply-To: References: <20190328164716.27693.35887@mail.ovirt.org> Message-ID: Dear Krutika, 1. I've changed the volume settings, write performance seems to increased somewhat, however the profile doesn't really support that since latencies increased. However read performance has diminished, which does seem to be supported by the profile runs (attached). 
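For completeness, a minimal sketch of how the applied values can be double-checked per volume after such a change; VOLNAME is a placeholder, and the options are the ones discussed below:

gluster volume get VOLNAME network.remote-dio
gluster volume get VOLNAME performance.strict-o-direct
gluster volume get VOLNAME cluster.choose-local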
Also the IO does seem to behave more consistent than before. I don't really understand the idea behind them, maybe you can explain why these suggestions are good? These settings seems to avoid as much local caching and access as possible and push everything to the gluster processes. While i would expect local access and local caches are a good thing, since it would lead to having less network access or disk access. I tried to investigate these settings a bit more, and this is what i understood of them; - network.remote-dio; when on it seems to ignore the O_DIRECT flag in the client, thus causing the files to be cached and buffered in the page cache on the client, i would expect this to be a good thing especially if the server process would access the same page cache? At least that is what grasp from this commit; https://review.gluster.org/#/c/glusterfs/+/4206/2/xlators/protocol/client/src/client.c line 867 Also found this commit; https://github.com/gluster/glusterfs/commit/06c4ba589102bf92c58cd9fba5c60064bc7a504e#diff-938709e499b4383c3ed33c3979b9080c suggesting remote-dio actually improves performance, not sure it's a write or read benchmark When a file is opened with O_DIRECT it will also disable the write-behind functionality - performance.strict-o-direct: when on, the AFR, will not ignore the O_DIRECT flag. and will invoke: fop_writev_stub with the wb_writev_helper, which seems to stack the operation, no idea why that is. But generally i suppose not ignoring the O_DIRECT flag in the AFR is a good thing, when a processes requests to have O_DIRECT. So this makes sense to me. - cluster.choose-local: when off, it doesn't prefer the local node, but would always choose a brick. Since it's a 9 node cluster, with 3 subvolumes, only a 1/3 could end-up local, and the other 2/3 should be pushed to external nodes anyway. Or am I making the total wrong assumption here? It seems to this config is moving to the gluster-block config side of things, which does make sense. Since we're running quite some mysql instances, which opens the files with O_DIRECt i believe, it would mean the only layer of cache is within mysql it self. Which you could argue is a good thing. But i would expect a little of write-behind buffer, and maybe some of the data cached within gluster would alleviate things a bit on gluster's side. But i wouldn't know if that's the correct mind set, and so might be totally off here. Also i would expect these gluster v set command to be online operations, but somehow the bricks went down, after applying these changes. What appears to have happened is that after the update the brick process was restarted, but due to multiple brick process start issue, multiple processes were started, and the brick didn't came online again. However i'll try to reproduce this, since i would like to test with cluster.choose-local: on, and see how performance compares. And hopefully when it occurs collect some useful info. Question; are network.remote-dio and performance.strict-o-direct mutually exclusive settings, or can they both be on? 2. 
I've attached all brick logs, the only thing relevant i found was; [2019-03-28 20:20:07.170452] I [MSGID: 113030] [posix-entry-ops.c:1146:posix_unlink] 0-ovirt-kube-posix: open-fd-key-status: 0 for /data/gfs/bricks/brick1/ovirt-kube/.shard/a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.109886 [2019-03-28 20:20:07.170491] I [MSGID: 113031] [posix-entry-ops.c:1053:posix_skip_non_linkto_unlink] 0-posix: linkto_xattr status: 0 for /data/gfs/bricks/brick1/ovirt-kube/.shard/a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.109886 [2019-03-28 20:20:07.248480] I [MSGID: 113030] [posix-entry-ops.c:1146:posix_unlink] 0-ovirt-kube-posix: open-fd-key-status: 0 for /data/gfs/bricks/brick1/ovirt-kube/.shard/a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.109886 [2019-03-28 20:20:07.248491] I [MSGID: 113031] [posix-entry-ops.c:1053:posix_skip_non_linkto_unlink] 0-posix: linkto_xattr status: 0 for /data/gfs/bricks/brick1/ovirt-kube/.shard/a38d64bc-a28b-4ee1-a0bb-f919e7a1022c.109886 Thanks Olaf ps. sorry needed to resend since it exceed the file limit Op ma 1 apr. 2019 om 07:56 schreef Krutika Dhananjay : > Adding back gluster-users > Comments inline ... > > On Fri, Mar 29, 2019 at 8:11 PM Olaf Buitelaar > wrote: > >> Dear Krutika, >> >> >> >> 1. I?ve made 2 profile runs of around 10 minutes (see files >> profile_data.txt and profile_data2.txt). Looking at it, most time seems be >> spent at the fop?s fsync and readdirp. >> >> Unfortunate I don?t have the profile info for the 3.12.15 version so it?s >> a bit hard to compare. >> >> One additional thing I do notice on 1 machine (10.32.9.5) the iowait time >> increased a lot, from an average below the 1% it?s now around the 12% after >> the upgrade. >> >> So first suspicion with be lighting strikes twice, and I?ve also just now >> a bad disk, but that doesn?t appear to be the case, since all smart status >> report ok. >> >> Also dd shows performance I would more or less expect; >> >> dd if=/dev/zero of=/data/test_file bs=100M count=1 oflag=dsync >> >> 1+0 records in >> >> 1+0 records out >> >> 104857600 bytes (105 MB) copied, 0.686088 s, 153 MB/s >> >> dd if=/dev/zero of=/data/test_file bs=1G count=1 oflag=dsync >> >> 1+0 records in >> >> 1+0 records out >> >> 1073741824 bytes (1.1 GB) copied, 7.61138 s, 141 MB/s >> >> if=/dev/urandom of=/data/test_file bs=1024 count=1000000 >> >> 1000000+0 records in >> >> 1000000+0 records out >> >> 1024000000 bytes (1.0 GB) copied, 6.35051 s, 161 MB/s >> >> dd if=/dev/zero of=/data/test_file bs=1024 count=1000000 >> >> 1000000+0 records in >> >> 1000000+0 records out >> >> 1024000000 bytes (1.0 GB) copied, 1.6899 s, 606 MB/s >> >> When I disable this brick (service glusterd stop; pkill glusterfsd) >> performance in gluster is better, but not on par with what it was. Also the >> cpu usages on the ?neighbor? nodes which hosts the other bricks in the same >> subvolume increases quite a lot in this case, which I wouldn?t expect >> actually since they shouldn't handle much more work, except flagging shards >> to heal. Iowait also goes to idle once gluster is stopped, so it?s for >> sure gluster which waits for io. >> >> >> > > So I see that FSYNC %-latency is on the higher side. And I also noticed > you don't have direct-io options enabled on the volume. > Could you set the following options on the volume - > # gluster volume set network.remote-dio off > # gluster volume set performance.strict-o-direct on > and also disable choose-local > # gluster volume set cluster.choose-local off > > let me know if this helps. > > 2. 
I?ve attached the mnt log and volume info, but I couldn?t find anything >> relevant in in those logs. I think this is because we run the VM?s with >> libgfapi; >> >> [root at ovirt-host-01 ~]# engine-config -g LibgfApiSupported >> >> LibgfApiSupported: true version: 4.2 >> >> LibgfApiSupported: true version: 4.1 >> >> LibgfApiSupported: true version: 4.3 >> >> And I can confirm the qemu process is invoked with the gluster:// address >> for the images. >> >> The message is logged in the /var/lib/libvert/qemu/ file, which >> I?ve also included. For a sample case see around; 2019-03-28 20:20:07 >> >> Which has the error; E [MSGID: 133010] >> [shard.c:2294:shard_common_lookup_shards_cbk] 0-ovirt-kube-shard: Lookup on >> shard 109886 failed. Base file gfid = a38d64bc-a28b-4ee1-a0bb-f919e7a1022c >> [Stale file handle] >> > > Could you also attach the brick logs for this volume? > > >> >> 3. yes I see multiple instances for the same brick directory, like; >> >> /usr/sbin/glusterfsd -s 10.32.9.6 --volfile-id >> ovirt-core.10.32.9.6.data-gfs-bricks-brick1-ovirt-core -p >> /var/run/gluster/vols/ovirt-core/10.32.9.6-data-gfs-bricks-brick1-ovirt-core.pid >> -S /var/run/gluster/452591c9165945d9.socket --brick-name >> /data/gfs/bricks/brick1/ovirt-core -l >> /var/log/glusterfs/bricks/data-gfs-bricks-brick1-ovirt-core.log >> --xlator-option *-posix.glusterd-uuid=fb513da6-f3bd-4571-b8a2-db5efaf60cc1 >> --process-name brick --brick-port 49154 --xlator-option >> ovirt-core-server.listen-port=49154 >> >> >> >> I?ve made an export of the output of ps from the time I observed these >> multiple processes. >> >> In addition the brick_mux bug as noted by Atin. I might also have another >> possible cause, as ovirt moves nodes from none-operational state or >> maintenance state to active/activating, it also seems to restart gluster, >> however I don?t have direct proof for this theory. >> >> >> > > +Atin Mukherjee ^^ > +Mohit Agrawal ^^ > > -Krutika > > Thanks Olaf >> >> Op vr 29 mrt. 2019 om 10:03 schreef Sandro Bonazzola > >: >> >>> >>> >>> Il giorno gio 28 mar 2019 alle ore 17:48 ha >>> scritto: >>> >>>> Dear All, >>>> >>>> I wanted to share my experience upgrading from 4.2.8 to 4.3.1. While >>>> previous upgrades from 4.1 to 4.2 etc. went rather smooth, this one was a >>>> different experience. After first trying a test upgrade on a 3 node setup, >>>> which went fine. i headed to upgrade the 9 node production platform, >>>> unaware of the backward compatibility issues between gluster 3.12.15 -> >>>> 5.3. After upgrading 2 nodes, the HA engine stopped and wouldn't start. >>>> Vdsm wasn't able to mount the engine storage domain, since /dom_md/metadata >>>> was missing or couldn't be accessed. Restoring this file by getting a good >>>> copy of the underlying bricks, removing the file from the underlying bricks >>>> where the file was 0 bytes and mark with the stickybit, and the >>>> corresponding gfid's. Removing the file from the mount point, and copying >>>> back the file on the mount point. Manually mounting the engine domain, and >>>> manually creating the corresponding symbolic links in /rhev/data-center and >>>> /var/run/vdsm/storage and fixing the ownership back to vdsm.kvm (which was >>>> root.root), i was able to start the HA engine again. Since the engine was >>>> up again, and things seemed rather unstable i decided to continue the >>>> upgrade on the other nodes suspecting an incompatibility in gluster >>>> versions, i thought would be best to have them all on the same version >>>> rather soonish. 
However things went from bad to worse, the engine stopped >>>> again, and all vm?s stopped working as well. So on a machine outside the >>>> setup and restored a backup of the engine taken from version 4.2.8 just >>>> before the upgrade. With this engine I was at least able to start some vm?s >>>> again, and finalize the upgrade. Once the upgraded, things didn?t stabilize >>>> and also lose 2 vm?s during the process due to image corruption. After >>>> figuring out gluster 5.3 had quite some issues I was as lucky to see >>>> gluster 5.5 was about to be released, on the moment the RPM?s were >>>> available I?ve installed those. This helped a lot in terms of stability, >>>> for which I?m very grateful! However the performance is unfortunate >>>> terrible, it?s about 15% of what the performance was running gluster >>>> 3.12.15. It?s strange since a simple dd shows ok performance, but our >>>> actual workload doesn?t. While I would expect the performance to be better, >>>> due to all improvements made since gluster version 3.12. Does anybody share >>>> the same experience? >>>> I really hope gluster 6 will soon be tested with ovirt and released, >>>> and things start to perform and stabilize again..like the good old days. Of >>>> course when I can do anything, I?m happy to help. >>>> >>> >>> Opened https://bugzilla.redhat.com/show_bug.cgi?id=1693998 to track the >>> rebase on Gluster 6. >>> >>> >>> >>>> >>>> I think the following short list of issues we have after the migration; >>>> Gluster 5.5; >>>> - Poor performance for our workload (mostly write dependent) >>>> - VM?s randomly pause on unknown storage errors, which are ?stale >>>> file?s?. corresponding log; Lookup on shard 797 failed. Base file gfid = >>>> 8a27b91a-ff02-42dc-bd4c-caa019424de8 [Stale file handle] >>>> - Some files are listed twice in a directory (probably related >>>> the stale file issue?) >>>> Example; >>>> ls -la >>>> /rhev/data-center/59cd53a9-0003-02d7-00eb-0000000001e3/313f5d25-76af-4ecd-9a20-82a2fe815a3c/images/4add6751-3731-4bbd-ae94-aaeed12ea450/ >>>> total 3081 >>>> drwxr-x---. 2 vdsm kvm 4096 Mar 18 11:34 . >>>> drwxr-xr-x. 13 vdsm kvm 4096 Mar 19 09:42 .. >>>> -rw-rw----. 1 vdsm kvm 1048576 Mar 28 12:55 >>>> 1a7cf259-6b29-421d-9688-b25dfaafb13c >>>> -rw-rw----. 1 vdsm kvm 1048576 Mar 28 12:55 >>>> 1a7cf259-6b29-421d-9688-b25dfaafb13c >>>> -rw-rw----. 1 vdsm kvm 1048576 Jan 27 2018 >>>> 1a7cf259-6b29-421d-9688-b25dfaafb13c.lease >>>> -rw-r--r--. 1 vdsm kvm 290 Jan 27 2018 >>>> 1a7cf259-6b29-421d-9688-b25dfaafb13c.meta >>>> -rw-r--r--. 1 vdsm kvm 290 Jan 27 2018 >>>> 1a7cf259-6b29-421d-9688-b25dfaafb13c.meta >>>> >>>> - brick processes sometimes starts multiple times. Sometimes I?ve 5 >>>> brick processes for a single volume. Killing all glusterfsd?s for the >>>> volume on the machine and running gluster v start force usually just >>>> starts one after the event, from then on things look all right. >>>> >>>> >>> May I kindly ask to open bugs on Gluster for above issues at >>> https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS ? >>> Sahina? >>> >>> >>>> Ovirt 4.3.2.1-1.el7 >>>> - All vms images ownership are changed to root.root after the vm >>>> is shutdown, probably related to; >>>> https://bugzilla.redhat.com/show_bug.cgi?id=1666795 but not only >>>> scoped to the HA engine. I?m still in compatibility mode 4.2 for the >>>> cluster and for the vm?s, but upgraded to version ovirt 4.3.2 >>>> >>> >>> Ryan? 
>>> >>> >>>> - The network provider is set to ovn, which is fine..actually >>>> cool, only the ?ovs-vswitchd? is a CPU hog, and utilizes 100% >>>> >>> >>> Miguel? Dominik? >>> >>> >>>> - It seems on all nodes vdsm tries to get the the stats for the >>>> HA engine, which is filling the logs with (not sure if this is new); >>>> [api.virt] FINISH getStats return={'status': {'message': "Virtual >>>> machine does not exist: {'vmId': u'20d69acd-edfd-4aeb-a2ae-49e9c121b7e9'}", >>>> 'code': 1}} from=::1,59290, vmId=20d69acd-edfd-4aeb-a2ae-49e9c121b7e9 >>>> (api:54) >>>> >>> >>> Simone? >>> >>> >>>> - It seems the package os_brick [root] managedvolume not >>>> supported: Managed Volume Not Supported. Missing package os-brick.: >>>> ('Cannot import os_brick',) (caps:149) which fills the vdsm.log, but for >>>> this I also saw another message, so I suspect this will already be resolved >>>> shortly >>>> - The machine I used to run the backup HA engine, doesn?t want to >>>> get removed from the hosted-engine ?vm-status, not even after running; >>>> hosted-engine --clean-metadata --host-id=10 --force-clean or hosted-engine >>>> --clean-metadata --force-clean from the machine itself. >>>> >>> >>> Simone? >>> >>> >>>> >>>> Think that's about it. >>>> >>>> Don?t get me wrong, I don?t want to rant, I just wanted to share my >>>> experience and see where things can made better. >>>> >>> >>> If not already done, can you please open bugs for above issues at >>> https://bugzilla.redhat.com/enter_bug.cgi?classification=oVirt ? >>> >>> >>>> >>>> >>>> Best Olaf >>>> _______________________________________________ >>>> Users mailing list -- users at ovirt.org >>>> To unsubscribe send an email to users-leave at ovirt.org >>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/ >>>> oVirt Code of Conduct: >>>> https://www.ovirt.org/community/about/community-guidelines/ >>>> List Archives: >>>> https://lists.ovirt.org/archives/list/users at ovirt.org/message/3CO35Q7VZMWNHS4LPUJNO7S47MGLSKS5/ >>>> >>> >>> >>> -- >>> >>> SANDRO BONAZZOLA >>> >>> MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV >>> >>> Red Hat EMEA >>> >>> sbonazzo at redhat.com >>> >>> >> _______________________________________________ >> Users mailing list -- users at ovirt.org >> To unsubscribe send an email to users-leave at ovirt.org >> Privacy Statement: https://www.ovirt.org/site/privacy-policy/ >> oVirt Code of Conduct: >> https://www.ovirt.org/community/about/community-guidelines/ >> List Archives: >> https://lists.ovirt.org/archives/list/users at ovirt.org/message/HAGTA64LF7LLE6YMHQ6DLT26MD2GZ2PK/ >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- Brick: 10.32.9.9:/data0/gfs/bricks/brick1/ovirt-data ---------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 11 200 9 No. of Writes: 2 31538 326701 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 22 319528 527228 No. of Writes: 53880 1409021 1140345 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 27747 3229 120201 No. of Writes: 479690 114939 144204 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 209766 7725 43 No. of Writes: 105320 165416 8915 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 2 FORGET 0.00 0.00 us 0.00 us 0.00 us 6728 RELEASE 0.00 0.00 us 0.00 us 0.00 us 42179 RELEASEDIR 0.01 44.17 us 1.07 us 1288.76 us 2914 OPENDIR 0.02 697.13 us 42.15 us 5689.21 us 322 OPEN 0.02 411.08 us 8.60 us 5405.25 us 606 GETXATTR 0.02 1209.66 us 147.78 us 3219.56 us 234 READDIRP 0.03 38.80 us 19.08 us 7544.91 us 7757 STATFS 0.04 826.28 us 13.79 us 3583.18 us 616 READDIR 0.07 61.83 us 15.94 us 131142.59 us 13989 FSTAT 2.03 137.78 us 48.36 us 235353.97 us 172712 FXATTROP 2.16 983.89 us 10.19 us 660025.30 us 25674 LOOKUP 2.90 406.99 us 36.68 us 756289.17 us 83397 FSYNC 4.63 67941.30 us 13.93 us 1840271.15 us 798 INODELK 7.81 576.74 us 75.16 us 422586.52 us 158680 WRITE 40.09 2713.33 us 11.70 us 1850709.72 us 173111 FINODELK 40.16 3587.78 us 72.64 us 729965.74 us 131143 READ Duration: 58768 seconds Data Read: 45226370705 bytes Data Written: 133611506006 bytes Interval 9 Stats: Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 0 0 0 No. of Writes: 394 387 86 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 141 1093 13 No. of Writes: 5905 10055 2308 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 15 515 2595 No. of Writes: 763 1465 1637 Block Size: 262144b+ 524288b+ No. of Reads: 2 0 No. of Writes: 2759 73 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 70 RELEASE 0.00 0.00 us 0.00 us 0.00 us 503 RELEASEDIR 0.00 172.94 us 46.56 us 620.23 us 70 OPEN 0.01 153.42 us 11.38 us 855.47 us 111 GETXATTR 0.01 49.63 us 1.23 us 1288.76 us 503 OPENDIR 0.01 434.10 us 27.72 us 2015.25 us 88 READDIR 0.02 1208.37 us 152.54 us 2434.77 us 46 READDIRP 0.02 43.13 us 20.02 us 2030.66 us 1361 STATFS 0.04 45.66 us 18.57 us 284.28 us 2431 FSTAT 1.20 154.41 us 75.97 us 84525.06 us 23005 FXATTROP 2.86 1865.08 us 14.26 us 212498.60 us 4518 LOOKUP 3.78 1006.27 us 38.86 us 756289.17 us 11072 FSYNC 4.27 60261.87 us 17.32 us 1437527.90 us 209 INODELK 8.19 935.38 us 76.82 us 422586.52 us 25832 WRITE 20.67 13949.32 us 89.67 us 707765.19 us 4374 READ 58.93 7494.13 us 12.88 us 1607033.18 us 23206 FINODELK Duration: 740 seconds Data Read: 385507328 bytes Data Written: 1776420864 bytes Brick: 10.32.9.5:/data/gfs/bricks/brick1/ovirt-data --------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 2 458 87 No. of Writes: 3 4507 33740 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 54 34013 110867 No. of Writes: 6056 341153 234627 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 7430 587 28255 No. of Writes: 70451 12767 34177 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 2417 6164 15 No. of Writes: 40925 27615 4342 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 2 FORGET 0.00 0.00 us 0.00 us 0.00 us 49432 RELEASE 0.00 0.00 us 0.00 us 0.00 us 40899 RELEASEDIR 0.00 158.97 us 158.97 us 158.97 us 1 MKNOD 0.00 393.70 us 9.69 us 2344.17 us 8 ENTRYLK 0.00 129.60 us 10.54 us 296.61 us 112 READDIR 0.00 1299.61 us 6.43 us 155911.98 us 125 GETXATTR 0.00 3928.24 us 139.26 us 240788.91 us 236 READDIRP 0.03 3784.28 us 15.61 us 469284.63 us 1686 FSTAT 0.04 2368.24 us 28.06 us 242169.67 us 3623 OPEN 0.05 2811.93 us 8.13 us 1250845.84 us 3381 FLUSH 0.06 4385.28 us 0.80 us 527903.92 us 2653 OPENDIR 0.09 2315.69 us 11.48 us 816339.95 us 7750 STATFS 0.18 55337.88 us 8.34 us 1543417.83 us 648 INODELK 0.37 1462.23 us 6.84 us 1127299.99 us 49902 FINODELK 0.57 3924.78 us 11.60 us 968588.21 us 28256 LOOKUP 1.91 7500.40 us 53.88 us 2738720.92 us 49870 FXATTROP 2.21 30153.49 us 63.31 us 3473303.89 us 14319 READ 14.57 110289.45 us 122.19 us 3055911.44 us 25864 FSYNC 79.91 262383.20 us 98.78 us 4500846.60 us 59632 WRITE Duration: 60363 seconds Data Read: 6417030998 bytes Data Written: 27570997546 bytes Interval 9 Stats: Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 0 0 0 No. of Writes: 59 2334 441 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 13 331 7 No. of Writes: 4519 1752 790 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 0 145 0 No. of Writes: 84 399 151 Block Size: 262144b+ 524288b+ No. of Reads: 0 0 No. of Writes: 214 31 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 615 RELEASE 0.00 0.00 us 0.00 us 0.00 us 467 RELEASEDIR 0.00 45.52 us 8.46 us 78.57 us 25 GETXATTR 0.00 144.98 us 13.50 us 296.61 us 16 READDIR 0.01 7894.70 us 180.57 us 240788.91 us 46 READDIRP 0.03 1404.57 us 1.00 us 94678.96 us 467 OPENDIR 0.06 4985.90 us 17.46 us 403210.36 us 294 FSTAT 0.06 2453.17 us 35.05 us 242169.67 us 615 OPEN 0.10 3976.83 us 9.70 us 1250845.84 us 591 FLUSH 0.10 33579.24 us 10.59 us 937670.52 us 73 INODELK 0.12 2132.57 us 14.22 us 816339.95 us 1361 STATFS 0.29 617.29 us 8.19 us 164742.40 us 11477 FINODELK 0.69 3379.94 us 17.79 us 622513.08 us 5053 LOOKUP 0.84 42003.14 us 160.61 us 1495939.39 us 496 READ 1.66 3575.85 us 68.64 us 1688509.25 us 11476 FXATTROP 22.52 95429.52 us 126.22 us 3055911.44 us 5823 FSYNC 73.52 168379.58 us 110.01 us 4058537.96 us 10773 WRITE Duration: 740 seconds Data Read: 12386304 bytes Data Written: 217700864 bytes Brick: 10.32.9.6:/data/gfs/bricks/bricka/ovirt-data --------------------------------------------------- Cumulative Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 789986 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 2 FORGET 0.00 0.00 us 0.00 us 0.00 us 49432 RELEASE 0.00 0.00 us 0.00 us 0.00 us 40938 RELEASEDIR 0.00 21.03 us 16.15 us 34.88 us 8 ENTRYLK 0.00 270.94 us 270.94 us 270.94 us 1 MKNOD 0.01 261.74 us 11.55 us 9174.66 us 116 GETXATTR 0.01 297.48 us 13.73 us 2466.13 us 112 READDIR 0.07 64.73 us 15.29 us 4946.30 us 3382 FLUSH 0.07 82.72 us 1.51 us 4642.85 us 2661 OPENDIR 0.22 193.05 us 39.92 us 64374.98 us 3624 OPEN 0.25 1255.82 us 14.35 us 63381.45 us 648 INODELK 0.89 57.44 us 10.33 us 8940.33 us 50009 FINODELK 1.44 77.62 us 15.84 us 31914.28 us 59679 WRITE 2.59 294.62 us 15.52 us 115626.36 us 28267 LOOKUP 3.49 224.71 us 77.62 us 98174.30 us 49948 FXATTROP 90.95 11273.35 us 78.67 us 453079.55 us 25908 FSYNC Duration: 60366 seconds Data Read: 0 bytes Data Written: 789986 bytes Interval 9 Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 10774 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 615 RELEASE 0.00 0.00 us 0.00 us 0.00 us 467 RELEASEDIR 0.00 85.11 us 12.25 us 434.77 us 14 GETXATTR 0.00 42.29 us 17.37 us 500.84 us 73 INODELK 0.00 205.10 us 15.46 us 509.24 us 16 READDIR 0.04 57.10 us 15.29 us 1829.07 us 591 FLUSH 0.05 79.87 us 1.78 us 1854.43 us 467 OPENDIR 0.11 144.84 us 45.31 us 17419.60 us 615 OPEN 0.79 55.64 us 13.17 us 8940.33 us 11478 FINODELK 0.93 69.84 us 16.64 us 6779.39 us 10774 WRITE 1.79 286.71 us 16.91 us 24721.64 us 5053 LOOKUP 3.09 218.15 us 81.40 us 54774.50 us 11476 FXATTROP 93.19 12944.68 us 111.22 us 453079.55 us 5825 FSYNC Duration: 740 seconds Data Read: 0 bytes Data Written: 10774 bytes Brick: 10.32.9.4:/data/gfs/bricks/brick1/ovirt-data --------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 52412 6 0 No. of Writes: 3 4504 33731 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 0 66342 53000 No. of Writes: 6056 340041 234374 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 2686 356 12946 No. of Writes: 70264 12678 34177 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 602 3108 3 No. of Writes: 20547 27615 4342 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 2 FORGET 0.00 0.00 us 0.00 us 0.00 us 49333 RELEASE 0.00 0.00 us 0.00 us 0.00 us 40379 RELEASEDIR 0.00 22.96 us 15.97 us 51.06 us 8 ENTRYLK 0.00 260.50 us 260.50 us 260.50 us 1 MKNOD 0.00 77.42 us 16.00 us 2701.10 us 648 INODELK 0.00 144.50 us 7.61 us 1702.83 us 428 GETXATTR 0.01 285.11 us 12.41 us 3140.45 us 406 READDIR 0.01 51.02 us 13.08 us 46431.20 us 3384 FLUSH 0.01 65.58 us 0.94 us 17715.27 us 2808 OPENDIR 0.01 80.40 us 11.70 us 19019.90 us 2445 STAT 0.03 118.76 us 40.23 us 33323.48 us 3626 OPEN 0.03 57.81 us 15.15 us 27740.94 us 7757 STATFS 0.04 197.89 us 119.38 us 17249.34 us 2481 READDIRP 0.36 99.03 us 11.12 us 301165.99 us 49989 FINODELK 1.08 526.62 us 13.29 us 263413.97 us 28422 LOOKUP 1.23 341.69 us 71.48 us 563688.45 us 49950 FXATTROP 7.47 3998.57 us 82.10 us 469183.97 us 25947 READ 35.02 18777.53 us 92.85 us 483169.32 us 25908 FSYNC 54.69 12727.00 us 149.97 us 759284.50 us 59684 WRITE Duration: 58519 seconds Data Read: 3261956842 bytes Data Written: 24886890282 bytes Interval 9 Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 665 0 0 No. of Writes: 0 59 2334 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 0 7 103 No. 
of Writes: 441 4519 1752 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 2 3 17 No. of Writes: 790 84 399 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 0 0 0 No. of Writes: 151 214 31 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 615 RELEASE 0.00 0.00 us 0.00 us 0.00 us 488 RELEASEDIR 0.00 39.06 us 18.17 us 225.32 us 73 INODELK 0.00 76.36 us 9.68 us 297.33 us 46 GETXATTR 0.01 222.65 us 21.02 us 593.11 us 58 READDIR 0.01 45.45 us 12.62 us 289.83 us 417 STAT 0.01 36.42 us 13.83 us 918.21 us 591 FLUSH 0.01 53.38 us 0.99 us 474.56 us 488 OPENDIR 0.02 87.38 us 40.23 us 5527.50 us 615 OPEN 0.03 49.03 us 18.83 us 3866.60 us 1361 STATFS 0.04 189.02 us 122.54 us 990.42 us 435 READDIRP 0.33 63.39 us 13.25 us 128981.30 us 11477 FINODELK 0.74 321.28 us 13.43 us 37963.21 us 5074 LOOKUP 0.98 186.47 us 80.20 us 43834.36 us 11476 FXATTROP 2.30 6321.31 us 154.25 us 110020.47 us 797 READ 39.43 8011.27 us 168.50 us 404368.45 us 10774 WRITE 56.08 21071.37 us 152.03 us 325318.37 us 5826 FSYNC Duration: 740 seconds Data Read: 2502580 bytes Data Written: 217700864 bytes Brick: 10.32.9.8:/data/gfs/bricks/bricka/ovirt-data --------------------------------------------------- Cumulative Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 2836841 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 3501 RELEASE 0.00 0.00 us 0.00 us 0.00 us 39501 RELEASEDIR 0.00 23.26 us 15.42 us 47.19 us 110 FLUSH 0.01 85.78 us 57.99 us 148.82 us 124 REMOVEXATTR 0.01 90.83 us 67.34 us 158.33 us 124 SETATTR 0.01 67.66 us 10.50 us 368.03 us 194 GETXATTR 0.02 84.14 us 41.98 us 499.27 us 250 OPEN 0.04 36.99 us 12.62 us 398.57 us 944 INODELK 0.06 280.66 us 14.10 us 1296.89 us 197 READDIR 0.14 49.60 us 1.25 us 911.95 us 2704 OPENDIR 8.05 27.51 us 11.45 us 86619.65 us 270887 FINODELK 8.60 50.28 us 14.57 us 117405.17 us 158241 WRITE 22.34 810.95 us 15.51 us 136924.46 us 25499 LOOKUP 26.84 184.15 us 32.55 us 187376.40 us 134874 FSYNC 33.87 115.65 us 48.10 us 68557.92 us 271003 FXATTROP Duration: 59079 seconds Data Read: 0 bytes Data Written: 2836841 bytes Interval 9 Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 25110 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 84 RELEASE 0.00 0.00 us 0.00 us 0.00 us 482 RELEASEDIR 0.00 22.68 us 15.95 us 30.51 us 22 FLUSH 0.02 81.94 us 66.17 us 121.51 us 24 REMOVEXATTR 0.02 92.75 us 73.22 us 158.33 us 24 SETATTR 0.02 50.76 us 10.50 us 198.71 us 52 GETXATTR 0.06 88.22 us 47.47 us 347.92 us 84 OPEN 0.07 200.12 us 17.43 us 366.16 us 46 READDIR 0.10 43.88 us 12.88 us 398.57 us 298 INODELK 0.17 46.60 us 1.30 us 95.78 us 482 OPENDIR 6.89 35.71 us 14.98 us 8325.56 us 25110 WRITE 8.62 243.97 us 17.27 us 13438.88 us 4599 LOOKUP 9.62 26.97 us 12.23 us 10471.02 us 46438 FINODELK 32.58 183.27 us 33.33 us 182520.02 us 23144 FSYNC 41.83 117.30 us 57.85 us 1991.12 us 46424 FXATTROP Duration: 740 seconds Data Read: 0 bytes Data Written: 25110 bytes Brick: 10.32.9.8:/data0/gfs/bricks/brick1/ovirt-data ---------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 8 1097 109 No. of Writes: 0 2901 273197 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 115 238693 440909 No. 
of Writes: 36872 1361504 875644 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 37900 3346 141710 No. of Writes: 293109 93776 162079 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 3889 7281 33 No. of Writes: 161749 236364 7941 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 1 FORGET 0.00 0.00 us 0.00 us 0.00 us 3720 RELEASE 0.00 0.00 us 0.00 us 0.00 us 39522 RELEASEDIR 0.00 46.19 us 10.83 us 328.71 us 167 GETXATTR 0.00 107.36 us 42.28 us 762.18 us 195 OPEN 0.01 203.17 us 12.21 us 864.86 us 197 READDIR 0.03 43.02 us 1.32 us 452.74 us 2704 OPENDIR 0.06 2113.84 us 1920.14 us 2569.20 us 124 READDIRP 0.06 36.11 us 17.79 us 347.13 us 7757 STATFS 0.09 35.61 us 16.14 us 340.33 us 11844 FSTAT 0.73 27.99 us 11.02 us 73986.88 us 118371 FINODELK 1.77 136.85 us 37.39 us 121066.77 us 58862 FSYNC 1.88 346.99 us 15.01 us 77684.23 us 24658 LOOKUP 3.34 128.87 us 55.07 us 45501.15 us 118386 FXATTROP 5.55 52717.08 us 16.10 us 2004661.60 us 480 INODELK 9.40 234.45 us 75.18 us 172924.48 us 182886 WRITE 77.09 3911.50 us 74.71 us 427304.61 us 89909 READ Duration: 59079 seconds Data Read: 18550783716 bytes Data Written: 169056832000 bytes Interval 9 Stats: Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 0 0 0 No. of Writes: 28 1201 202 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 88 1012 13 No. of Writes: 11370 7637 1887 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 13 723 0 No. of Writes: 518 690 1562 Block Size: 262144b+ 524288b+ No. of Reads: 0 0 No. of Writes: 2221 50 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 21 RELEASE 0.00 0.00 us 0.00 us 0.00 us 473 RELEASEDIR 0.00 63.17 us 11.55 us 328.71 us 24 GETXATTR 0.00 85.37 us 47.47 us 218.02 us 21 OPEN 0.01 191.44 us 20.33 us 506.91 us 28 READDIR 0.05 42.27 us 1.32 us 103.43 us 473 OPENDIR 0.12 35.99 us 17.79 us 312.67 us 1361 STATFS 0.13 2137.12 us 1993.74 us 2312.10 us 24 READDIRP 0.19 36.56 us 16.68 us 182.58 us 2058 FSTAT 1.46 28.28 us 11.78 us 4920.38 us 20656 FINODELK 3.09 283.99 us 16.10 us 77684.23 us 4368 LOOKUP 3.43 134.79 us 38.66 us 46317.56 us 10211 FSYNC 6.69 129.92 us 55.07 us 1519.44 us 20670 FXATTROP 15.38 225.45 us 75.18 us 166890.53 us 27366 WRITE 18.50 114198.06 us 16.67 us 2004661.60 us 65 INODELK 50.94 11055.19 us 133.17 us 355082.08 us 1849 READ Duration: 740 seconds Data Read: 57180160 bytes Data Written: 1466518016 bytes Brick: 10.32.9.7:/data/gfs/bricks/brick1/ovirt-data --------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 8 146 10 No. of Writes: 0 5640 191078 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 12 275838 263894 No. of Writes: 29947 1275560 712585 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 17182 2303 69829 No. of Writes: 286032 45424 94648 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 2287 4034 20 No. of Writes: 88659 100478 6790 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 3501 RELEASE 0.00 0.00 us 0.00 us 0.00 us 40057 RELEASEDIR 0.00 20.98 us 14.68 us 31.79 us 110 FLUSH 0.00 82.04 us 58.17 us 510.84 us 124 REMOVEXATTR 0.00 86.21 us 62.72 us 156.44 us 124 SETATTR 0.00 268.84 us 6.67 us 2674.39 us 259 GETXATTR 0.00 321.60 us 41.89 us 3004.61 us 250 OPEN 0.01 475.40 us 14.32 us 1787.62 us 281 READDIR 0.01 1249.13 us 26.42 us 3832.88 us 234 READDIRP 0.05 178.77 us 16.13 us 351764.07 us 7757 STATFS 0.09 822.57 us 1.17 us 1068559.38 us 2746 OPENDIR 0.12 292.98 us 25.31 us 1365160.22 us 10719 FSTAT 0.90 25010.58 us 13.33 us 887933.44 us 941 INODELK 1.38 133.53 us 11.10 us 3938189.55 us 270885 FINODELK 1.68 162.21 us 45.86 us 2503412.21 us 271003 FXATTROP 2.03 394.43 us 31.20 us 756176.42 us 134874 FSYNC 12.66 2092.21 us 72.21 us 4245933.36 us 158241 WRITE 14.29 14633.42 us 10.75 us 4031333.55 us 25543 LOOKUP 66.78 18236.84 us 69.72 us 6429153.50 us 95797 READ Duration: 59031 seconds Data Read: 10396155504 bytes Data Written: 84404067328 bytes Interval 9 Stats: Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 0 0 0 No. of Writes: 64 279 45 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 172 529 10 No. of Writes: 14494 5264 1377 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 5 351 0 No. of Writes: 316 548 1064 Block Size: 262144b+ 524288b+ No. of Reads: 0 0 No. of Writes: 1620 39 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 84 RELEASE 0.00 0.00 us 0.00 us 0.00 us 494 RELEASEDIR 0.00 22.20 us 17.97 us 31.79 us 22 FLUSH 0.00 81.91 us 68.21 us 101.65 us 24 REMOVEXATTR 0.00 89.38 us 72.10 us 128.30 us 24 SETATTR 0.01 43.26 us 1.29 us 153.44 us 494 OPENDIR 0.01 427.33 us 15.79 us 1319.22 us 70 READDIR 0.02 33.36 us 19.57 us 258.67 us 1361 STATFS 0.02 561.73 us 9.55 us 2674.39 us 86 GETXATTR 0.02 1283.75 us 28.25 us 3720.17 us 46 READDIRP 0.03 790.46 us 41.89 us 3004.61 us 84 OPEN 0.03 42.33 us 25.31 us 308.87 us 1862 FSTAT 0.53 27.26 us 11.92 us 22788.88 us 46436 FINODELK 0.61 316.49 us 10.75 us 100131.38 us 4611 LOOKUP 2.25 231.96 us 37.46 us 540421.26 us 23144 FSYNC 2.97 152.85 us 51.13 us 541193.18 us 46424 FXATTROP 4.89 39428.07 us 14.19 us 881646.99 us 296 INODELK 18.93 1799.80 us 72.85 us 3118991.86 us 25110 WRITE 69.67 155845.21 us 130.17 us 4885484.01 us 1067 READ Duration: 740 seconds Data Read: 28553216 bytes Data Written: 1032566784 bytes Brick: 10.32.9.21:/data/gfs/bricks/bricka/ovirt-data ---------------------------------------------------- Cumulative Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 3513729 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 1 FORGET 0.00 0.00 us 0.00 us 0.00 us 4089 RELEASE 0.00 0.00 us 0.00 us 0.00 us 44272 RELEASEDIR 0.18 42.97 us 1.09 us 1138.82 us 2893 OPENDIR 0.42 611.92 us 15.33 us 7388.53 us 479 INODELK 0.56 679.34 us 15.23 us 2483.07 us 574 READDIR 0.61 2170.81 us 48.25 us 13138.61 us 195 OPEN 0.82 1066.87 us 8.75 us 13801.35 us 535 GETXATTR 4.89 28.84 us 10.98 us 51214.69 us 118373 FINODELK 9.18 35.04 us 14.64 us 81798.39 us 182886 WRITE 18.04 506.78 us 12.31 us 165781.70 us 24847 LOOKUP 22.07 130.09 us 53.80 us 40959.22 us 118386 FXATTROP 43.23 512.45 us 38.18 us 285202.84 us 58862 FSYNC Duration: 60363 seconds Data Read: 0 bytes Data Written: 3513729 bytes Interval 9 Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 27366 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 21 RELEASE 0.00 0.00 us 0.00 us 0.00 us 500 RELEASEDIR 0.02 82.18 us 48.25 us 161.40 us 21 OPEN 0.02 41.30 us 16.13 us 156.19 us 65 INODELK 0.08 136.77 us 13.02 us 909.03 us 66 GETXATTR 0.20 43.34 us 1.27 us 277.72 us 500 OPENDIR 0.33 428.51 us 15.42 us 1215.07 us 82 READDIR 5.70 29.84 us 11.88 us 2949.68 us 20656 FINODELK 9.15 36.14 us 14.64 us 4606.43 us 27366 WRITE 11.51 283.22 us 12.31 us 53047.02 us 4395 LOOKUP 26.02 136.12 us 53.80 us 40959.22 us 20670 FXATTROP 46.96 497.27 us 40.09 us 274185.32 us 10211 FSYNC Duration: 740 seconds Data Read: 0 bytes Data Written: 27366 bytes Brick: 10.32.9.21:/data0/gfs/bricks/brick1/ovirt-data ----------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 11 66 2 No. of Writes: 2 31826 326701 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 4 116933 302613 No. of Writes: 53880 1410548 1140566 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 11546 1749 76356 No. of Writes: 479855 114971 144225 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 1906 4093 24 No. of Writes: 105312 165416 8915 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 2 FORGET 0.00 0.00 us 0.00 us 0.00 us 6820 RELEASE 0.00 0.00 us 0.00 us 0.00 us 44299 RELEASEDIR 0.03 42.29 us 13.86 us 9356.79 us 2425 STAT 0.04 46.94 us 1.04 us 772.13 us 2893 OPENDIR 0.04 314.86 us 10.29 us 2391.56 us 479 GETXATTR 0.05 207.42 us 14.29 us 5815.42 us 799 INODELK 0.05 51.40 us 28.18 us 540.78 us 3666 FSTAT 0.09 39.82 us 18.62 us 9889.16 us 7757 STATFS 0.12 1358.64 us 43.37 us 233429.90 us 322 OPEN 0.14 851.72 us 15.78 us 4414.06 us 574 READDIR 0.16 224.28 us 143.72 us 3249.69 us 2482 READDIRP 1.46 30.22 us 10.80 us 110711.59 us 173110 FINODELK 4.32 601.98 us 15.19 us 91847.23 us 25659 LOOKUP 6.30 130.39 us 49.28 us 232146.00 us 172711 FXATTROP 8.84 378.85 us 35.58 us 430356.59 us 83395 FSYNC 23.35 525.84 us 73.80 us 494782.91 us 158694 WRITE 55.01 4360.84 us 78.00 us 503162.29 us 45075 READ Duration: 60363 seconds Data Read: 10404548654 bytes Data Written: 133624068438 bytes Interval 9 Stats: Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 0 0 0 No. of Writes: 394 387 86 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 31 563 12 No. of Writes: 5905 10055 2308 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 7 368 1 No. of Writes: 763 1465 1637 Block Size: 262144b+ 524288b+ No. of Reads: 1 0 No. of Writes: 2759 73 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 70 RELEASE 0.00 0.00 us 0.00 us 0.00 us 500 RELEASEDIR 0.02 42.46 us 14.29 us 589.73 us 210 INODELK 0.03 34.11 us 20.98 us 177.92 us 413 STAT 0.04 298.66 us 10.97 us 1498.98 us 63 GETXATTR 0.06 48.48 us 1.17 us 120.46 us 500 OPENDIR 0.08 52.55 us 31.44 us 108.17 us 637 FSTAT 0.12 37.70 us 18.62 us 406.46 us 1361 STATFS 0.17 871.79 us 16.50 us 2028.77 us 82 READDIR 0.23 227.43 us 145.28 us 3249.69 us 435 READDIRP 0.56 3427.01 us 43.37 us 233429.90 us 70 OPEN 1.62 30.00 us 11.97 us 3260.32 us 23206 FINODELK 8.55 159.50 us 72.57 us 232146.00 us 23005 FXATTROP 8.94 849.78 us 16.81 us 91847.23 us 4515 LOOKUP 24.77 959.75 us 38.69 us 430356.59 us 11072 FSYNC 24.89 10859.72 us 153.99 us 250511.54 us 983 READ 29.91 496.71 us 73.80 us 453768.34 us 25832 WRITE Duration: 740 seconds Data Read: 30142464 bytes Data Written: 1776420864 bytes Brick: 10.32.9.20:/data/gfs/bricks/bricka/ovirt-data ---------------------------------------------------- Cumulative Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 3979583 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 2 FORGET 0.00 0.00 us 0.00 us 0.00 us 6630 RELEASE 0.00 0.00 us 0.00 us 0.00 us 36900 RELEASEDIR 0.01 57.48 us 12.65 us 329.66 us 84 GETXATTR 0.02 193.66 us 14.90 us 878.82 us 70 READDIR 0.05 105.94 us 41.17 us 683.04 us 322 OPEN 0.06 54.19 us 15.72 us 5135.95 us 800 INODELK 0.19 54.37 us 1.60 us 1035.48 us 2641 OPENDIR 7.68 32.94 us 11.38 us 68417.92 us 173114 FINODELK 9.52 44.57 us 14.70 us 55440.51 us 158694 WRITE 24.39 712.95 us 16.53 us 280142.79 us 25407 LOOKUP 27.40 243.98 us 34.94 us 251521.50 us 83395 FSYNC 30.68 131.93 us 50.81 us 55731.00 us 172711 FXATTROP Duration: 57920 seconds Data Read: 0 bytes Data Written: 3979583 bytes Interval 9 Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 25832 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 70 RELEASE 0.00 0.00 us 0.00 us 0.00 us 464 RELEASEDIR 0.01 62.34 us 15.20 us 116.82 us 17 GETXATTR 0.02 199.99 us 16.79 us 778.14 us 10 READDIR 0.10 114.77 us 46.50 us 683.04 us 70 OPEN 0.22 86.24 us 16.71 us 5135.95 us 211 INODELK 0.32 56.51 us 1.98 us 1035.48 us 464 OPENDIR 8.88 31.82 us 12.28 us 7988.05 us 23206 FINODELK 11.60 37.32 us 15.06 us 2981.61 us 25832 WRITE 12.08 224.23 us 20.07 us 39256.75 us 4479 LOOKUP 28.45 213.58 us 40.22 us 94343.80 us 11072 FSYNC 38.31 138.39 us 69.94 us 3069.85 us 23005 FXATTROP Duration: 740 seconds Data Read: 0 bytes Data Written: 25832 bytes Brick: 10.32.9.20:/data0/gfs/bricks/brick1/ovirt-data ----------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 4 109 21 No. of Writes: 0 2901 273197 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 42 112256 166124 No. of Writes: 36872 1361504 875644 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 12144 1370 69995 No. of Writes: 293109 93776 162079 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 1466 2942 9 No. of Writes: 161749 236364 7941 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 3097 RELEASE 0.00 0.00 us 0.00 us 0.00 us 36900 RELEASEDIR 0.00 53.86 us 12.74 us 546.30 us 60 GETXATTR 0.00 211.97 us 16.62 us 988.45 us 70 READDIR 0.00 101.31 us 39.98 us 497.56 us 195 OPEN 0.00 46.92 us 14.96 us 3469.09 us 481 INODELK 0.01 52.86 us 1.65 us 1573.96 us 2641 OPENDIR 0.03 216.49 us 152.23 us 2562.55 us 2482 READDIRP 0.03 73.86 us 18.77 us 125905.96 us 7757 STATFS 0.06 111.91 us 16.04 us 655589.61 us 10152 FSTAT 0.07 542.53 us 12.89 us 523421.21 us 2425 STAT 1.10 803.43 us 18.50 us 1534952.31 us 24595 LOOKUP 1.16 177.03 us 11.27 us 1749236.34 us 118375 FINODELK 1.44 218.66 us 58.80 us 1784231.76 us 118390 FXATTROP 13.72 4194.48 us 39.91 us 2743546.94 us 58865 FSYNC 36.03 14004.46 us 79.14 us 2966713.52 us 46303 READ 46.33 4558.98 us 77.68 us 2638579.30 us 182887 WRITE Duration: 57920 seconds Data Read: 8237195368 bytes Data Written: 169056832000 bytes Interval 9 Stats: Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 0 0 0 No. of Writes: 28 1201 202 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 131 256 14 No. of Writes: 11370 7637 1887 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 2 277 4 No. of Writes: 518 690 1562 Block Size: 262144b+ 524288b+ No. of Reads: 2 0 No. of Writes: 2221 50 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 21 RELEASE 0.00 0.00 us 0.00 us 0.00 us 464 RELEASEDIR 0.00 47.44 us 13.63 us 73.33 us 12 GETXATTR 0.00 212.66 us 19.17 us 803.07 us 10 READDIR 0.00 118.97 us 59.03 us 352.73 us 21 OPEN 0.00 45.59 us 20.23 us 248.98 us 65 INODELK 0.01 56.79 us 1.86 us 1008.29 us 464 OPENDIR 0.03 48.42 us 20.11 us 1484.22 us 1764 FSTAT 0.04 214.84 us 152.23 us 558.38 us 435 READDIRP 0.06 113.06 us 19.98 us 96371.03 us 1361 STATFS 0.07 470.41 us 15.68 us 177048.08 us 413 STAT 0.61 369.50 us 21.75 us 98568.83 us 4359 LOOKUP 0.84 107.63 us 11.27 us 422960.68 us 20656 FINODELK 1.50 191.62 us 58.80 us 669097.82 us 20670 FXATTROP 15.43 59246.36 us 108.59 us 2819788.24 us 686 READ 15.75 4062.95 us 40.23 us 1993844.42 us 10211 FSYNC 65.65 6319.81 us 80.06 us 2441596.43 us 27366 WRITE Duration: 740 seconds Data Read: 22843392 bytes Data Written: 1466518016 bytes Brick: 10.32.9.3:/data/gfs/bricks/brick3/ovirt-data --------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 4 131 8 No. of Writes: 0 5640 191078 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 16 81419 161095 No. of Writes: 29947 1275560 712585 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 17291 1342 103864 No. of Writes: 286032 45424 94648 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 1022 1870 19 No. of Writes: 88659 100478 6790 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 3665 RELEASE 0.00 0.00 us 0.00 us 0.00 us 40654 RELEASEDIR 0.00 77.86 us 14.39 us 255.33 us 58 GETXATTR 0.00 55.93 us 18.71 us 185.55 us 110 FLUSH 0.00 148.39 us 76.71 us 306.67 us 124 REMOVEXATTR 0.00 156.41 us 94.69 us 237.19 us 124 SETATTR 0.00 282.94 us 26.64 us 1991.99 us 72 READDIR 0.01 249.61 us 62.86 us 1608.99 us 250 OPEN 0.02 108.18 us 23.05 us 1656.32 us 942 INODELK 0.03 73.13 us 23.73 us 337.63 us 2425 STAT 0.04 93.83 us 1.99 us 17101.96 us 2641 OPENDIR 0.10 78.77 us 24.79 us 1132.37 us 7757 STATFS 0.14 108.18 us 41.73 us 4078.40 us 7332 FSTAT 0.18 262.35 us 91.17 us 5148.11 us 3890 READDIRP 2.78 59.76 us 13.90 us 60015.05 us 270884 FINODELK 3.23 739.71 us 25.36 us 119501.01 us 25436 LOOKUP 7.72 333.10 us 45.00 us 283828.60 us 134874 FSYNC 9.10 195.46 us 67.03 us 157955.41 us 271003 FXATTROP 19.48 716.64 us 94.11 us 340140.18 us 158241 WRITE 57.15 8361.71 us 110.31 us 596087.45 us 39783 READ Duration: 60363 seconds Data Read: 9818510392 bytes Data Written: 84404067328 bytes Interval 9 Stats: Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 0 0 0 No. of Writes: 64 279 45 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 83 367 8 No. of Writes: 14494 5264 1377 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 4 586 0 No. of Writes: 316 548 1064 Block Size: 262144b+ 524288b+ No. of Reads: 0 0 No. of Writes: 1620 39 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 84 RELEASE 0.00 0.00 us 0.00 us 0.00 us 464 RELEASEDIR 0.00 84.14 us 26.99 us 150.82 us 9 GETXATTR 0.00 52.66 us 18.71 us 108.64 us 22 FLUSH 0.00 187.96 us 35.03 us 565.44 us 11 READDIR 0.01 139.45 us 77.67 us 239.13 us 24 REMOVEXATTR 0.01 152.96 us 94.69 us 236.75 us 24 SETATTR 0.05 67.18 us 24.77 us 302.92 us 413 STAT 0.06 395.71 us 67.77 us 1608.99 us 84 OPEN 0.06 82.30 us 2.19 us 210.85 us 464 OPENDIR 0.09 186.01 us 24.60 us 1656.32 us 297 INODELK 0.18 78.19 us 27.95 us 1050.83 us 1361 STATFS 0.25 117.05 us 45.04 us 4078.40 us 1274 FSTAT 0.30 264.20 us 102.02 us 5148.11 us 682 READDIRP 2.07 271.71 us 35.32 us 22720.97 us 4581 LOOKUP 4.81 62.33 us 16.26 us 6962.45 us 46436 FINODELK 12.97 337.03 us 54.20 us 221094.08 us 23144 FSYNC 15.31 198.37 us 91.04 us 5197.06 us 46424 FXATTROP 24.57 588.49 us 97.75 us 228091.00 us 25110 WRITE 39.26 22524.25 us 112.00 us 551619.18 us 1048 READ Duration: 740 seconds Data Read: 42180608 bytes Data Written: 1032566784 bytes -------------- next part -------------- A non-text attachment was scrubbed... Name: bricklogs.7z Type: application/octet-stream Size: 4259488 bytes Desc: not available URL: -------------- next part -------------- Brick: 10.32.9.9:/data0/gfs/bricks/brick1/ovirt-data ---------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 11 197 9 No. of Writes: 2 5298 50158 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 22 152814 155742 No. of Writes: 10281 141128 229969 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 19592 2856 34807 No. of Writes: 46540 15874 16618 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 41083 7619 43 No. of Writes: 12325 19939 1278 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 2 FORGET 0.00 0.00 us 0.00 us 0.00 us 1163 RELEASE 0.00 0.00 us 0.00 us 0.00 us 8997 RELEASEDIR 0.01 42.50 us 1.07 us 200.99 us 2015 OPENDIR 0.03 1205.44 us 147.78 us 3219.56 us 160 READDIRP 0.03 38.21 us 19.08 us 7544.91 us 5346 STATFS 0.03 947.57 us 42.15 us 5689.21 us 221 OPEN 0.03 536.56 us 11.61 us 5405.25 us 416 GETXATTR 0.06 42.40 us 15.94 us 3408.97 us 9626 FSTAT 0.06 992.07 us 13.79 us 3583.18 us 440 READDIR 2.07 784.10 us 10.19 us 100292.79 us 17781 LOOKUP 2.58 128.10 us 48.36 us 73127.27 us 135547 FXATTROP 2.75 282.45 us 36.68 us 403763.55 us 65559 FSYNC 5.30 72558.78 us 13.93 us 1840271.15 us 491 INODELK 7.97 450.94 us 75.16 us 368078.53 us 118790 WRITE 24.45 1212.26 us 11.70 us 1850709.72 us 135580 FINODELK 54.60 3082.33 us 72.64 us 387813.52 us 119063 READ Duration: 10748 seconds Data Read: 13585666193 bytes Data Written: 16779903830 bytes Interval 6 Stats: Block Size: 512b+ 1024b+ 4096b+ No. of Reads: 0 0 1151 No. of Writes: 13 2 405 Block Size: 8192b+ 16384b+ 32768b+ No. of Reads: 580 10 0 No. of Writes: 357 115 16 Block Size: 65536b+ 131072b+ 262144b+ No. of Reads: 314 96 0 No. of Writes: 25 19 62 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 9 RELEASE 0.00 0.00 us 0.00 us 0.00 us 25 RELEASEDIR 0.00 42.88 us 11.61 us 65.94 us 6 GETXATTR 0.01 95.97 us 52.28 us 174.48 us 9 OPEN 0.01 43.61 us 1.83 us 68.31 us 25 OPENDIR 0.01 30.47 us 22.02 us 42.78 us 53 STATFS 0.01 291.00 us 148.50 us 772.43 us 6 READDIR 0.01 2255.97 us 2255.97 us 2255.97 us 1 READDIRP 0.03 42.10 us 18.38 us 114.84 us 98 FSTAT 0.10 30.02 us 14.10 us 662.25 us 520 FINODELK 0.18 165.79 us 20.56 us 2137.59 us 173 LOOKUP 0.21 137.80 us 53.21 us 366.07 us 243 FSYNC 0.43 133.43 us 81.66 us 384.37 us 520 FXATTROP 4.53 713.51 us 75.16 us 101590.43 us 1014 WRITE 28.81 102208.45 us 16.11 us 931987.84 us 45 INODELK 65.66 4873.26 us 79.54 us 293295.82 us 2151 READ Duration: 26 seconds Data Read: 42790912 bytes Data Written: 40480256 bytes Brick: 10.32.9.6:/data/gfs/bricks/bricka/ovirt-data --------------------------------------------------- Cumulative Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 140764 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 2 FORGET 0.00 0.00 us 0.00 us 0.00 us 8678 RELEASE 0.00 0.00 us 0.00 us 0.00 us 10636 RELEASEDIR 0.00 21.03 us 16.15 us 34.88 us 8 ENTRYLK 0.00 270.94 us 270.94 us 270.94 us 1 MKNOD 0.01 323.61 us 13.73 us 2466.13 us 80 READDIR 0.02 326.98 us 11.55 us 9174.66 us 83 GETXATTR 0.09 65.78 us 16.75 us 4946.30 us 2332 FLUSH 0.09 85.68 us 1.51 us 4642.85 us 1834 OPENDIR 0.32 227.62 us 41.15 us 64374.98 us 2476 OPEN 0.45 2256.77 us 15.05 us 63381.45 us 347 INODELK 0.97 56.53 us 10.33 us 5526.47 us 29676 FINODELK 1.78 74.73 us 15.84 us 5393.43 us 41433 WRITE 3.57 320.06 us 15.52 us 115626.36 us 19340 LOOKUP 3.89 227.62 us 77.62 us 68580.08 us 29634 FXATTROP 88.81 9879.97 us 149.50 us 367307.05 us 15606 FSYNC Duration: 12346 seconds Data Read: 0 bytes Data Written: 140764 bytes Interval 6 Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 174 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 25 RELEASE 0.00 0.00 us 0.00 us 0.00 us 25 RELEASEDIR 0.04 73.34 us 20.62 us 134.34 us 5 GETXATTR 0.24 82.36 us 46.64 us 190.61 us 25 OPEN 0.31 437.89 us 272.99 us 1086.34 us 6 READDIR 0.43 184.47 us 19.85 us 2374.95 us 20 FLUSH 0.47 235.68 us 19.14 us 1331.71 us 17 INODELK 0.58 197.88 us 2.13 us 1981.61 us 25 OPENDIR 1.50 115.21 us 18.93 us 1851.08 us 112 FINODELK 2.46 194.41 us 96.77 us 1656.53 us 109 FXATTROP 5.75 284.33 us 23.78 us 1087.73 us 174 WRITE 6.46 295.65 us 21.14 us 19073.84 us 188 LOOKUP 81.76 12553.12 us 218.24 us 73123.96 us 56 FSYNC Duration: 26 seconds Data Read: 0 bytes Data Written: 174 bytes Brick: 10.32.9.4:/data/gfs/bricks/brick1/ovirt-data --------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 9195 6 0 No. of Writes: 3 700 2043 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 0 32431 16795 No. of Writes: 623 67451 42803 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 1749 317 6006 No. of Writes: 12604 1347 5850 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 565 3076 3 No. of Writes: 2324 2469 893 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 2 FORGET 0.00 0.00 us 0.00 us 0.00 us 8579 RELEASE 0.00 0.00 us 0.00 us 0.00 us 8397 RELEASEDIR 0.00 22.96 us 15.97 us 51.06 us 8 ENTRYLK 0.00 260.50 us 260.50 us 260.50 us 1 MKNOD 0.00 93.03 us 16.42 us 2701.10 us 347 INODELK 0.00 128.90 us 7.61 us 1129.71 us 284 GETXATTR 0.01 300.85 us 12.41 us 3140.45 us 290 READDIR 0.01 57.69 us 13.08 us 46431.20 us 2334 FLUSH 0.01 70.20 us 0.94 us 17715.27 us 1939 OPENDIR 0.02 89.64 us 11.70 us 19019.90 us 1691 STAT 0.03 132.06 us 42.23 us 33323.48 us 2478 OPEN 0.03 62.90 us 16.20 us 27740.94 us 5346 STATFS 0.03 202.21 us 119.54 us 17249.34 us 1709 READDIRP 0.36 120.14 us 11.12 us 301165.99 us 29660 FINODELK 1.20 614.44 us 13.29 us 263413.97 us 19453 LOOKUP 1.27 427.97 us 71.48 us 563688.45 us 29634 FXATTROP 9.41 3840.22 us 82.10 us 469183.97 us 24493 READ 27.25 17452.64 us 92.85 us 483169.32 us 15606 FSYNC 60.36 14557.60 us 149.97 us 759284.50 us 41437 WRITE Duration: 10499 seconds Data Read: 2314290390 bytes Data Written: 3195953450 bytes Interval 6 Stats: Block Size: 256b+ 512b+ 4096b+ No. of Reads: 23 0 571 No. of Writes: 0 3 13 Block Size: 8192b+ 16384b+ 32768b+ No. of Reads: 52 0 0 No. of Writes: 128 13 1 Block Size: 65536b+ 131072b+ 262144b+ No. of Reads: 46 0 0 No. of Writes: 7 7 2 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 25 RELEASE 0.00 0.00 us 0.00 us 0.00 us 25 RELEASEDIR 0.01 68.94 us 10.11 us 221.91 us 6 GETXATTR 0.01 30.00 us 15.66 us 54.39 us 20 FLUSH 0.02 44.60 us 20.24 us 95.68 us 15 STAT 0.02 41.06 us 19.29 us 103.84 us 17 INODELK 0.02 162.01 us 110.86 us 228.45 us 6 READDIR 0.03 45.60 us 1.45 us 89.10 us 25 OPENDIR 0.04 75.73 us 44.53 us 190.60 us 25 OPEN 0.04 36.62 us 16.52 us 66.38 us 53 STATFS 0.09 185.21 us 136.70 us 275.71 us 21 READDIRP 0.11 44.70 us 19.04 us 91.96 us 111 FINODELK 0.37 148.06 us 86.69 us 324.91 us 109 FXATTROP 1.53 355.20 us 16.63 us 41956.49 us 188 LOOKUP 25.17 19663.37 us 213.60 us 89492.03 us 56 FSYNC 28.20 7089.90 us 177.90 us 39757.75 us 174 WRITE 44.33 2802.26 us 104.46 us 49721.81 us 692 READ Duration: 26 seconds Data Read: 5790220 bytes Data Written: 4044288 bytes Brick: 10.32.9.5:/data/gfs/bricks/brick1/ovirt-data --------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 2 458 87 No. of Writes: 3 703 2052 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 54 15030 33939 No. of Writes: 623 68563 43056 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 3940 540 7475 No. of Writes: 12791 1436 5850 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 2395 6141 15 No. of Writes: 22702 2469 893 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 2 FORGET 0.00 0.00 us 0.00 us 0.00 us 8678 RELEASE 0.00 0.00 us 0.00 us 0.00 us 10597 RELEASEDIR 0.00 158.97 us 158.97 us 158.97 us 1 MKNOD 0.00 393.70 us 9.69 us 2344.17 us 8 ENTRYLK 0.00 126.39 us 10.54 us 281.56 us 80 READDIR 0.00 1956.79 us 6.43 us 155911.98 us 82 GETXATTR 0.00 3315.24 us 139.26 us 170764.63 us 160 READDIRP 0.03 4020.12 us 15.61 us 469284.63 us 1164 FSTAT 0.04 2583.09 us 28.06 us 204742.02 us 2476 OPEN 0.04 2825.94 us 8.33 us 590785.10 us 2332 FLUSH 0.07 5673.25 us 0.80 us 527903.92 us 1832 OPENDIR 0.09 2595.01 us 11.48 us 382537.47 us 5342 STATFS 0.14 59496.60 us 8.34 us 1543417.83 us 347 INODELK 0.40 2031.08 us 6.84 us 1127299.99 us 29607 FINODELK 0.57 4484.44 us 11.60 us 968588.21 us 19334 LOOKUP 2.00 10241.03 us 53.88 us 1880631.52 us 29585 FXATTROP 2.57 29123.91 us 63.31 us 3473303.89 us 13399 READ 11.80 115055.07 us 122.19 us 2279735.88 us 15581 FSYNC 82.25 301692.99 us 98.78 us 4500846.60 us 41400 WRITE Duration: 12343 seconds Data Read: 4270911318 bytes Data Written: 5880060714 bytes Interval 6 Stats: Block Size: 512b+ 4096b+ 8192b+ No. of Reads: 0 29 109 No. of Writes: 3 13 128 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 15 0 47 No. of Writes: 13 1 7 Block Size: 131072b+ 262144b+ No. of Reads: 0 0 No. of Writes: 7 2 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 25 RELEASE 0.00 0.00 us 0.00 us 0.00 us 25 RELEASEDIR 0.00 40.48 us 6.94 us 92.02 us 6 GETXATTR 0.00 28.47 us 10.19 us 50.39 us 20 FLUSH 0.00 53.78 us 31.59 us 76.06 us 12 FSTAT 0.00 129.00 us 88.83 us 161.83 us 6 READDIR 0.00 47.94 us 0.89 us 88.84 us 25 OPENDIR 0.00 1499.83 us 1499.83 us 1499.83 us 1 READDIRP 0.04 1333.68 us 30.78 us 31125.60 us 25 OPEN 0.05 836.76 us 12.08 us 23439.58 us 53 STATFS 0.11 872.63 us 8.21 us 56495.95 us 111 FINODELK 0.88 6920.74 us 68.16 us 625281.67 us 109 FXATTROP 1.02 4614.95 us 13.66 us 348629.63 us 188 LOOKUP 1.65 83057.19 us 12.45 us 658978.30 us 17 INODELK 6.49 27709.93 us 98.14 us 471332.27 us 200 READ 8.08 123206.43 us 267.50 us 979136.22 us 56 FSYNC 81.66 396159.13 us 232.40 us 1353202.04 us 176 WRITE Duration: 26 seconds Data Read: 4341760 bytes Data Written: 4044288 bytes Brick: 10.32.9.7:/data/gfs/bricks/brick1/ovirt-data --------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 8 146 10 No. of Writes: 0 822 17574 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 12 147605 81020 No. of Writes: 3335 177490 110247 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 12618 1795 18321 No. of Writes: 30013 5366 10235 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 2188 4008 20 No. of Writes: 8375 8875 585 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 608 RELEASE 0.00 0.00 us 0.00 us 0.00 us 8795 RELEASEDIR 0.00 20.59 us 14.68 us 25.91 us 75 FLUSH 0.00 77.89 us 58.31 us 94.39 us 85 REMOVEXATTR 0.00 85.37 us 64.65 us 156.44 us 85 SETATTR 0.00 86.81 us 42.56 us 643.87 us 146 OPEN 0.00 127.62 us 6.67 us 943.71 us 165 GETXATTR 0.00 507.62 us 14.32 us 1787.62 us 201 READDIR 0.01 1235.24 us 26.42 us 3832.88 us 160 READDIRP 0.06 244.53 us 16.13 us 351764.07 us 5346 STATFS 0.10 1171.75 us 1.17 us 1068559.38 us 1895 OPENDIR 0.13 406.39 us 25.42 us 1365160.22 us 7375 FSTAT 0.53 20755.68 us 13.33 us 887933.44 us 564 INODELK 1.46 170.00 us 45.86 us 2503412.21 us 190741 FXATTROP 1.53 178.24 us 11.10 us 3938189.55 us 190604 FINODELK 1.82 425.12 us 32.12 us 663207.46 us 94843 FSYNC 11.43 2187.65 us 72.21 us 4245933.36 us 116005 WRITE 16.73 21132.03 us 14.12 us 4031333.55 us 17576 LOOKUP 66.20 15737.71 us 69.72 us 6429153.50 us 93418 READ Duration: 11011 seconds Data Read: 4863622768 bytes Data Written: 8447544320 bytes Interval 6 Stats: Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 0 0 0 No. of Writes: 3 109 5 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 6829 259 5 No. of Writes: 228 230 13 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 0 175 0 No. of Writes: 4 2 12 Block Size: 262144b+ 524288b+ No. of Reads: 0 0 No. of Writes: 12 2 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 7 RELEASE 0.00 0.00 us 0.00 us 0.00 us 25 RELEASEDIR 0.00 74.17 us 74.17 us 74.17 us 1 REMOVEXATTR 0.00 89.45 us 89.45 us 89.45 us 1 SETATTR 0.00 33.25 us 16.33 us 64.78 us 4 GETXATTR 0.00 70.61 us 52.42 us 153.99 us 7 OPEN 0.00 37.55 us 1.95 us 64.59 us 25 OPENDIR 0.00 30.14 us 18.44 us 44.04 us 53 STATFS 0.00 275.53 us 155.69 us 765.06 us 6 READDIR 0.00 2352.74 us 2352.74 us 2352.74 us 1 READDIRP 0.01 43.65 us 29.55 us 73.65 us 76 FSTAT 0.05 149.19 us 25.96 us 227.32 us 171 LOOKUP 0.06 25.33 us 11.55 us 59.71 us 1236 FINODELK 0.15 130.28 us 50.48 us 244.99 us 609 FSYNC 0.26 113.48 us 78.80 us 565.69 us 1237 FXATTROP 0.27 237.31 us 80.85 us 2140.21 us 620 WRITE 10.35 142923.57 us 15.02 us 887933.44 us 39 INODELK 88.84 6582.39 us 75.54 us 3820683.07 us 7268 READ Duration: 26 seconds Data Read: 41644032 bytes Data Written: 11333120 bytes Brick: 10.32.9.21:/data/gfs/bricks/bricka/ovirt-data ---------------------------------------------------- Cumulative Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 583084 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 1 FORGET 0.00 0.00 us 0.00 us 0.00 us 1506 RELEASE 0.00 0.00 us 0.00 us 0.00 us 11330 RELEASEDIR 0.16 41.96 us 1.09 us 174.18 us 2000 OPENDIR 0.53 882.55 us 15.33 us 7388.53 us 313 INODELK 0.61 785.26 us 15.23 us 2483.07 us 410 READDIR 0.78 2944.84 us 50.45 us 13138.61 us 139 OPEN 1.04 1439.59 us 8.75 us 13801.35 us 379 GETXATTR 4.69 28.29 us 10.98 us 51214.69 us 87008 FINODELK 9.29 34.56 us 14.65 us 81798.39 us 141069 WRITE 19.62 601.79 us 13.34 us 82349.56 us 17113 LOOKUP 21.24 128.13 us 74.02 us 5590.59 us 87026 FXATTROP 42.05 509.22 us 38.18 us 285202.84 us 43355 FSYNC Duration: 12343 seconds Data Read: 0 bytes Data Written: 583084 bytes Interval 6 Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 820 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 14 RELEASE 0.00 0.00 us 0.00 us 0.00 us 25 RELEASEDIR 0.26 54.89 us 17.77 us 76.95 us 9 GETXATTR 0.54 73.00 us 50.45 us 123.40 us 14 OPEN 0.55 41.66 us 1.43 us 58.76 us 25 OPENDIR 0.88 278.14 us 157.59 us 394.86 us 6 READDIR 0.97 40.13 us 17.39 us 342.49 us 46 INODELK 9.81 34.88 us 16.20 us 3058.74 us 534 FINODELK 14.96 34.64 us 16.51 us 177.38 us 820 WRITE 17.10 168.21 us 24.51 us 323.55 us 193 LOOKUP 17.52 127.46 us 46.97 us 299.68 us 261 FSYNC 37.41 133.00 us 80.38 us 341.49 us 534 FXATTROP Duration: 26 seconds Data Read: 0 bytes Data Written: 820 bytes Brick: 10.32.9.21:/data0/gfs/bricks/brick1/ovirt-data ----------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 11 66 2 No. of Writes: 2 5586 50158 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 4 52513 93336 No. of Writes: 10281 142655 230190 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 7241 1565 19639 No. of Writes: 46705 15906 16639 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 1874 4071 24 No. of Writes: 12317 19939 1278 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 2 FORGET 0.00 0.00 us 0.00 us 0.00 us 1255 RELEASE 0.00 0.00 us 0.00 us 0.00 us 11357 RELEASEDIR 0.03 45.76 us 13.86 us 9356.79 us 1679 STAT 0.03 46.01 us 1.04 us 772.13 us 2000 OPENDIR 0.04 326.97 us 10.29 us 2391.56 us 353 GETXATTR 0.05 51.07 us 29.86 us 540.78 us 2522 FSTAT 0.05 313.16 us 16.18 us 5815.42 us 491 INODELK 0.07 883.53 us 45.88 us 6148.85 us 221 OPEN 0.08 41.10 us 18.98 us 9889.16 us 5346 STATFS 0.12 856.40 us 15.78 us 4414.06 us 410 READDIR 0.14 225.48 us 143.72 us 926.43 us 1710 READDIRP 1.41 29.62 us 10.80 us 110711.59 us 135572 FINODELK 3.66 587.34 us 15.19 us 61302.68 us 17766 LOOKUP 5.95 125.02 us 49.28 us 17686.93 us 135539 FXATTROP 6.44 279.71 us 35.58 us 407061.84 us 65554 FSYNC 19.89 476.93 us 75.41 us 440395.70 us 118787 WRITE 62.04 4088.32 us 78.00 us 503162.29 us 43217 READ Duration: 12343 seconds Data Read: 4610277422 bytes Data Written: 16792466262 bytes Interval 6 Stats: Block Size: 512b+ 1024b+ 4096b+ No. of Reads: 0 0 70 No. of Writes: 13 2 405 Block Size: 8192b+ 16384b+ 32768b+ No. of Reads: 342 5 0 No. of Writes: 357 115 16 Block Size: 65536b+ 131072b+ 262144b+ No. of Reads: 168 0 0 No. of Writes: 25 19 62 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 9 RELEASE 0.00 0.00 us 0.00 us 0.00 us 25 RELEASEDIR 0.01 35.71 us 24.38 us 65.77 us 15 STAT 0.02 44.49 us 1.68 us 67.87 us 25 OPENDIR 0.02 126.43 us 52.25 us 398.25 us 9 OPEN 0.02 51.99 us 32.93 us 109.37 us 26 FSTAT 0.03 256.51 us 145.04 us 360.84 us 6 READDIR 0.03 33.20 us 20.54 us 47.63 us 53 STATFS 0.03 185.61 us 12.72 us 1214.94 us 10 GETXATTR 0.07 94.06 us 17.46 us 2439.30 us 45 INODELK 0.08 213.94 us 177.04 us 383.69 us 21 READDIRP 0.26 29.48 us 15.33 us 71.02 us 521 FINODELK 0.48 159.48 us 21.99 us 327.94 us 173 LOOKUP 1.22 136.16 us 84.01 us 461.36 us 521 FXATTROP 4.60 1098.76 us 46.07 us 205950.17 us 243 FSYNC 8.03 459.44 us 85.85 us 21485.20 us 1014 WRITE 85.10 8442.50 us 119.17 us 309111.15 us 585 READ Duration: 26 seconds Data Read: 14180352 bytes Data Written: 40480256 bytes Brick: 10.32.9.20:/data0/gfs/bricks/brick1/ovirt-data ----------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 4 109 21 No. of Writes: 0 460 54715 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 42 45163 54896 No. of Writes: 8883 212986 162092 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 7020 1175 20888 No. of Writes: 48288 16340 24443 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 1408 2923 9 No. of Writes: 19159 26333 792 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 514 RELEASE 0.00 0.00 us 0.00 us 0.00 us 6838 RELEASEDIR 0.00 59.50 us 14.59 us 546.30 us 41 GETXATTR 0.00 220.10 us 16.62 us 988.45 us 50 READDIR 0.00 104.05 us 39.98 us 497.56 us 139 OPEN 0.00 48.49 us 15.80 us 3469.09 us 315 INODELK 0.01 50.94 us 1.65 us 1573.96 us 1820 OPENDIR 0.03 215.71 us 153.61 us 2562.55 us 1710 READDIRP 0.03 69.86 us 18.77 us 125905.96 us 5346 STATFS 0.07 141.42 us 16.04 us 655589.61 us 6984 FSTAT 0.08 659.41 us 12.89 us 523421.21 us 1679 STAT 1.24 1008.30 us 19.19 us 1534952.31 us 16933 LOOKUP 1.30 204.65 us 11.71 us 1749236.34 us 87007 FINODELK 1.48 233.96 us 73.33 us 1784231.76 us 87026 FXATTROP 12.57 3983.58 us 39.91 us 2743546.94 us 43355 FSYNC 41.38 12734.21 us 79.14 us 2966713.52 us 44635 READ 41.80 4069.99 us 77.68 us 2638579.30 us 141069 WRITE Duration: 9900 seconds Data Read: 3717300328 bytes Data Written: 21265203712 bytes Interval 6 Stats: Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 0 0 0 No. of Writes: 3 39 2 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 381 294 20 No. of Writes: 83 467 47 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 0 264 0 No. of Writes: 32 35 41 Block Size: 262144b+ 524288b+ No. of Reads: 0 0 No. of Writes: 65 6 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 14 RELEASE 0.00 0.00 us 0.00 us 0.00 us 25 RELEASEDIR 0.00 55.58 us 15.93 us 93.98 us 8 GETXATTR 0.00 37.21 us 25.04 us 59.45 us 15 STAT 0.01 69.47 us 46.35 us 141.23 us 14 OPEN 0.01 26.67 us 18.22 us 49.45 us 46 INODELK 0.01 49.56 us 2.58 us 143.98 us 25 OPENDIR 0.01 34.60 us 18.77 us 126.90 us 53 STATFS 0.01 355.56 us 149.60 us 988.45 us 6 READDIR 0.02 41.77 us 21.14 us 73.50 us 72 FSTAT 0.02 215.13 us 169.65 us 274.27 us 21 READDIRP 0.08 29.04 us 13.69 us 66.67 us 534 FINODELK 0.17 160.73 us 23.54 us 321.76 us 193 LOOKUP 0.39 135.86 us 77.40 us 1627.16 us 534 FXATTROP 4.78 1072.78 us 86.42 us 197615.21 us 820 WRITE 19.40 13686.01 us 47.75 us 1796049.15 us 261 FSYNC 75.09 14420.80 us 94.10 us 2221701.71 us 959 READ Duration: 26 seconds Data Read: 21602304 bytes Data Written: 49301504 bytes Brick: 10.32.9.20:/data/gfs/bricks/bricka/ovirt-data ---------------------------------------------------- Cumulative Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 549022 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 2 FORGET 0.00 0.00 us 0.00 us 0.00 us 1065 RELEASE 0.00 0.00 us 0.00 us 0.00 us 6838 RELEASEDIR 0.01 53.35 us 12.82 us 228.91 us 57 GETXATTR 0.02 194.83 us 14.90 us 878.82 us 50 READDIR 0.04 43.27 us 15.72 us 322.38 us 491 INODELK 0.04 97.28 us 41.17 us 432.99 us 221 OPEN 0.16 53.03 us 1.74 us 501.22 us 1820 OPENDIR 7.40 33.07 us 11.38 us 68417.92 us 135575 FINODELK 9.15 46.66 us 14.70 us 55440.51 us 118787 WRITE 26.84 925.03 us 16.53 us 280142.79 us 17586 LOOKUP 27.29 252.34 us 34.94 us 251521.50 us 65555 FSYNC 29.08 130.03 us 50.81 us 55731.00 us 135539 FXATTROP Duration: 9900 seconds Data Read: 0 bytes Data Written: 549022 bytes Interval 6 Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 1014 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 9 RELEASE 0.00 0.00 us 0.00 us 0.00 us 25 RELEASEDIR 0.11 43.98 us 15.40 us 84.40 us 5 GETXATTR 0.44 95.58 us 51.84 us 164.84 us 9 OPEN 0.62 48.41 us 2.33 us 70.35 us 25 OPENDIR 0.72 31.10 us 17.28 us 117.76 us 45 INODELK 0.98 318.59 us 154.88 us 878.82 us 6 READDIR 9.45 35.24 us 14.72 us 2246.13 us 521 FINODELK 14.32 160.73 us 22.70 us 316.55 us 173 LOOKUP 17.04 135.61 us 56.40 us 1874.03 us 244 FSYNC 19.01 36.40 us 16.78 us 2947.68 us 1014 WRITE 37.30 139.06 us 85.48 us 1078.96 us 521 FXATTROP Duration: 26 seconds Data Read: 0 bytes Data Written: 1014 bytes Brick: 10.32.9.8:/data/gfs/bricks/bricka/ovirt-data --------------------------------------------------- Cumulative Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 372917 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 608 RELEASE 0.00 0.00 us 0.00 us 0.00 us 8719 RELEASEDIR 0.00 23.65 us 15.42 us 47.19 us 75 FLUSH 0.01 86.28 us 57.99 us 148.82 us 85 REMOVEXATTR 0.01 90.74 us 67.34 us 130.94 us 85 SETATTR 0.01 73.83 us 11.11 us 236.42 us 133 GETXATTR 0.02 83.84 us 41.98 us 499.27 us 146 OPEN 0.03 33.97 us 12.62 us 315.20 us 564 INODELK 0.06 316.19 us 16.57 us 1296.89 us 141 READDIR 0.13 49.96 us 1.25 us 911.95 us 1865 OPENDIR 7.61 27.61 us 11.45 us 86619.65 us 190604 FINODELK 9.35 55.71 us 14.57 us 117405.17 us 116005 WRITE 25.18 183.50 us 32.55 us 187376.40 us 94843 FSYNC 25.95 1022.20 us 15.51 us 136924.46 us 17546 LOOKUP 31.63 114.64 us 48.10 us 68557.92 us 190742 FXATTROP Duration: 11059 seconds Data Read: 0 bytes Data Written: 372917 bytes Interval 6 Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 620 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 7 RELEASE 0.00 0.00 us 0.00 us 0.00 us 25 RELEASEDIR 0.02 96.14 us 96.14 us 96.14 us 1 SETATTR 0.02 102.92 us 102.92 us 102.92 us 1 REMOVEXATTR 0.13 84.08 us 51.75 us 221.40 us 7 OPEN 0.13 61.82 us 16.08 us 159.35 us 10 GETXATTR 0.25 46.62 us 1.88 us 77.92 us 25 OPENDIR 0.27 32.45 us 17.59 us 152.92 us 39 INODELK 0.34 261.02 us 153.06 us 363.57 us 6 READDIR 4.63 34.65 us 16.37 us 80.81 us 620 WRITE 5.77 156.66 us 22.46 us 304.56 us 171 LOOKUP 7.34 27.58 us 14.15 us 781.04 us 1236 FINODELK 31.07 116.50 us 78.86 us 261.81 us 1238 FXATTROP 50.02 381.21 us 39.71 us 149969.67 us 609 FSYNC Duration: 26 seconds Data Read: 0 bytes Data Written: 620 bytes Brick: 10.32.9.8:/data0/gfs/bricks/brick1/ovirt-data ---------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 8 1096 109 No. of Writes: 0 460 54715 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 115 111912 132663 No. of Writes: 8883 212986 162092 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 27012 2879 35570 No. of Writes: 48288 16340 24443 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 3743 7207 32 No. of Writes: 19159 26333 792 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 1 FORGET 0.00 0.00 us 0.00 us 0.00 us 1137 RELEASE 0.00 0.00 us 0.00 us 0.00 us 8740 RELEASEDIR 0.00 43.88 us 10.83 us 250.84 us 124 GETXATTR 0.00 120.62 us 44.65 us 762.18 us 139 OPEN 0.01 206.40 us 12.21 us 864.86 us 141 READDIR 0.02 43.05 us 1.44 us 452.74 us 1865 OPENDIR 0.05 2104.16 us 1920.14 us 2569.20 us 85 READDIRP 0.05 36.26 us 19.25 us 347.13 us 5346 STATFS 0.08 35.13 us 16.14 us 340.33 us 8148 FSTAT 0.63 27.73 us 11.02 us 73986.88 us 87007 FINODELK 1.53 134.48 us 38.31 us 90956.10 us 43355 FSYNC 1.73 388.39 us 15.69 us 62037.95 us 16978 LOOKUP 2.63 31993.13 us 16.10 us 888210.80 us 314 INODELK 2.93 128.32 us 73.56 us 45501.15 us 87026 FXATTROP 8.70 235.34 us 76.95 us 172924.48 us 141069 WRITE 81.65 3612.62 us 74.71 us 427304.61 us 86246 READ Duration: 11059 seconds Data Read: 8285158628 bytes Data Written: 21265203712 bytes Interval 6 Stats: Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 0 0 0 No. of Writes: 3 39 2 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 4976 418 10 No. of Writes: 83 467 47 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 0 264 0 No. of Writes: 32 35 41 Block Size: 262144b+ 524288b+ No. of Reads: 0 0 No. of Writes: 65 6 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 14 RELEASE 0.00 0.00 us 0.00 us 0.00 us 25 RELEASEDIR 0.00 51.92 us 14.43 us 92.40 us 9 GETXATTR 0.01 40.96 us 1.87 us 66.77 us 25 OPENDIR 0.01 75.90 us 55.90 us 118.42 us 14 OPEN 0.01 236.50 us 152.98 us 361.08 us 6 READDIR 0.01 32.60 us 22.90 us 46.88 us 53 STATFS 0.01 2244.32 us 2244.32 us 2244.32 us 1 READDIRP 0.02 38.95 us 19.17 us 88.64 us 84 FSTAT 0.10 28.63 us 16.14 us 161.63 us 534 FINODELK 0.20 160.39 us 23.30 us 348.09 us 193 LOOKUP 0.38 226.64 us 49.98 us 20602.07 us 261 FSYNC 0.47 137.09 us 84.61 us 5477.48 us 534 FXATTROP 0.97 181.82 us 86.92 us 637.73 us 820 WRITE 28.68 96252.17 us 18.55 us 888210.80 us 46 INODELK 69.12 1882.74 us 79.08 us 157169.15 us 5668 READ Duration: 26 seconds Data Read: 41271296 bytes Data Written: 49301504 bytes Brick: 10.32.9.3:/data/gfs/bricks/brick3/ovirt-data --------------------------------------------------- Cumulative Stats: Block Size: 256b+ 512b+ 1024b+ No. of Reads: 4 131 8 No. of Writes: 0 822 17574 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 16 35286 49422 No. of Writes: 3335 177490 110247 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 12220 1028 23348 No. of Writes: 30013 5366 10235 Block Size: 131072b+ 262144b+ 524288b+ No. of Reads: 974 1858 19 No. of Writes: 8375 8875 585 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 772 RELEASE 0.00 0.00 us 0.00 us 0.00 us 10592 RELEASEDIR 0.00 76.27 us 14.39 us 255.33 us 38 GETXATTR 0.00 56.50 us 19.21 us 185.55 us 75 FLUSH 0.00 150.43 us 76.71 us 292.39 us 85 REMOVEXATTR 0.00 157.91 us 97.76 us 237.19 us 85 SETATTR 0.00 308.86 us 26.64 us 1991.99 us 51 READDIR 0.01 183.93 us 62.86 us 1362.63 us 146 OPEN 0.01 73.43 us 23.05 us 777.89 us 564 INODELK 0.03 75.13 us 23.73 us 337.63 us 1679 STAT 0.03 89.35 us 1.99 us 1982.01 us 1820 OPENDIR 0.09 79.36 us 24.79 us 805.05 us 5346 STATFS 0.11 106.63 us 41.73 us 1740.16 us 5044 FSTAT 0.15 262.43 us 91.17 us 4453.28 us 2680 READDIRP 2.38 58.64 us 13.90 us 26031.44 us 190605 FINODELK 3.35 898.17 us 25.36 us 119501.01 us 17501 LOOKUP 6.60 326.75 us 45.00 us 283828.60 us 94843 FSYNC 7.89 194.27 us 67.03 us 157955.41 us 190743 FXATTROP 17.21 696.94 us 97.07 us 340140.18 us 116005 WRITE 62.13 7751.59 us 110.31 us 596087.45 us 37647 READ Duration: 12343 seconds Data Read: 3321340984 bytes Data Written: 8447544320 bytes Interval 6 Stats: Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 0 0 0 No. of Writes: 3 109 5 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 187 225 13 No. of Writes: 228 230 13 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 0 185 0 No. of Writes: 4 2 12 Block Size: 262144b+ 524288b+ No. of Reads: 0 0 No. of Writes: 12 2 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 7 RELEASE 0.00 0.00 us 0.00 us 0.00 us 25 RELEASEDIR 0.00 173.56 us 173.56 us 173.56 us 1 REMOVEXATTR 0.00 190.79 us 190.79 us 190.79 us 1 SETATTR 0.01 80.71 us 20.92 us 143.06 us 5 GETXATTR 0.01 163.10 us 113.71 us 277.38 us 7 OPEN 0.01 76.13 us 26.98 us 125.17 us 15 STAT 0.02 77.17 us 1.99 us 133.67 us 25 OPENDIR 0.03 61.12 us 30.85 us 227.60 us 39 INODELK 0.05 656.71 us 271.17 us 1991.99 us 6 READDIR 0.05 75.47 us 32.20 us 361.18 us 53 STATFS 0.07 110.87 us 59.06 us 160.47 us 52 FSTAT 0.11 249.69 us 108.39 us 447.56 us 34 READDIRP 0.50 231.83 us 29.43 us 677.06 us 171 LOOKUP 0.91 58.52 us 20.49 us 999.15 us 1238 FINODELK 2.45 319.84 us 67.30 us 4499.21 us 609 FSYNC 2.79 356.93 us 114.39 us 4840.74 us 620 WRITE 3.02 193.78 us 110.23 us 2600.12 us 1239 FXATTROP 89.95 11709.07 us 151.63 us 198748.33 us 610 READ Duration: 26 seconds Data Read: 14946304 bytes Data Written: 11333120 bytes From gomathinayagam08 at gmail.com Mon Apr 1 04:20:38 2019 From: gomathinayagam08 at gmail.com (Gomathi Nayagam) Date: Mon, 1 Apr 2019 09:50:38 +0530 Subject: [Gluster-users] Reg: Gluster Message-ID: Hi User, We are testing geo-replication of gluster it is taking nearly 8 mins to transfer 16 GB size of data between the DCs while when transferred the same data over plain rsync it took only 2 mins. Can we know if we are missing something? Thanks & Regards, Gomathi Nayagam.D -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeff.forbes at mail.nacon.com Tue Apr 2 20:25:47 2019 From: jeff.forbes at mail.nacon.com (Jeff Forbes) Date: Tue, 2 Apr 2019 20:25:47 +0000 Subject: [Gluster-users] Write Speed unusually slow when both bricks are online Message-ID: <9C5C26E24872DB46B612A2AFF44DE6C31C467228@FAFNIR-DOMAIN.Nacon.com> I have two CentOS-6 servers running version 3.12.14 of gluster-server. Each server as one brick and they are configured to replicate between the two bricks. 
I also have two CentOS-6 servers running version 3.12.2-18 of glusterfs. These servers use a separate VLAN. Each server has two bonded 1 Gbps NICs to communicate the gluster traffic. File transfer speeds between these servers using rsync approaches 100 MBps. The client servers mount the gluster volume using this fstab entry: 192.168.40.30:gv0 /store glusterfs defaults, attribute-timeout=600,entry-timeout=600,negative-timeout=600,fopen-keep-cache,use-readdirp=no,log-level=WARNING 1 2 Reading data from the servers to the clients is similar to the rsync speed. The problem is that writing from the clients to the mounted gluster volume is less than 8 MB/s and fluctuates from less than 500 kB/s to 8 MB/s, as measured by the pv command. Using rsync, the speed fluctuates between 2 and 5 MBps. When the bonded nics on one of the gluster servers is shut down, the write speed to the remaining online brick is now similar to the read speed I can only assume that there is something wrong in my configuration, since a greater than 10-fold decrease in write speed when the bricks are replicating makes for an unusable system. Does anyone have any ideas what the problem may be? Server volume configuration: > sudo gluster volume info Volume Name: gv0 Type: Replicate Volume ID: d96bbb99-f264-4655-95ff-f9f05ca9ff55 Status: Started Snapshot Count: 0 Number of Bricks: 1 x 2 = 2 Transport-type: tcp Bricks: Brick1: 192.168.40.20:/export/scsi/brick Brick2: 192.168.40.30:/export/scsi/brick Options Reconfigured: performance.cache-size: 1GB performance.readdir-ahead: on features.cache-invalidation: on features.cache-invalidation-timeout: 600 performance.stat-prefetch: on performance.cache-samba-metadata: on performance.cache-invalidation: on performance.md-cache-timeout: 600 network.inode-lru-limit: 250000 performance.cache-refresh-timeout: 60 performance.read-ahead: disable performance.parallel-readdir: on performance.write-behind-window-size: 4MB performance.io-thread-count: 64 performance.client-io-threads: on performance.quick-read: on performance.flush-behind: on performance.write-behind: on nfs.disable: on client.event-threads: 3 server.event-threads: 3 server.allow-insecure: on From manschwetus at cs-software-gmbh.de Tue Apr 9 12:08:28 2019 From: manschwetus at cs-software-gmbh.de (Florian Manschwetus) Date: Tue, 9 Apr 2019 12:08:28 +0000 Subject: [Gluster-users] SEGFAULT in FUSE layer Message-ID: Hi All, I'd like to bring this bug report, I just opened, to your attention. https://bugzilla.redhat.com/show_bug.cgi?id=1697971 -- Mit freundlichen Gr??en / With kind regards Florian Manschwetus CS Software Concepts and Solutions GmbH Gesch?ftsf?hrer / Managing director: Dr. Werner Alexi Amtsgericht Wiesbaden HRB 10004 (Commercial registry) Schiersteiner Stra?e 31 D-65187 Wiesbaden Germany -------------- next part -------------- An HTML attachment was scrubbed... URL: From hunter86_bg at yahoo.com Wed Apr 10 17:00:20 2019 From: hunter86_bg at yahoo.com (Strahil) Date: Wed, 10 Apr 2019 20:00:20 +0300 Subject: [Gluster-users] Replica 3 - how to replace failed node (peer) Message-ID: I have used reset-brick - but I have just changed the brick layout. You may give it a try, but I guess you need your new brick to have same amount of space (or more). Maybe someone more experienced should share a more sound solution. Best Regards, Strahil NikolovOn Apr 10, 2019 12:42, Martin Toth wrote: > > Hi all, > > I am running replica 3 gluster with 3 bricks. 
One of my servers failed - all disks are showing errors and raid is in fault state. > > Type: Replicate > Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a > Status: Started > Number of Bricks: 1 x 3 = 3 > Transport-type: tcp > Bricks: > Brick1: node1.san:/tank/gluster/gv0imagestore/brick1 > Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 Brick3: node3.san:/tank/gluster/gv0imagestore/brick1 > > So one of my bricks is totally failed (node2). It went down and all data are lost (failed raid on node2). Now I am running only two bricks on 2 servers out from 3. > This is really critical problem for us, we can lost all data. I want to add new disks to node2, create new raid array on them and try to replace failed brick on this node. > > What is the procedure of replacing Brick2 on node2, can someone advice? I can?t find anything relevant in documentation. > > Thanks in advance, > Martin > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users From rgowdapp at redhat.com Thu Apr 11 01:15:18 2019 From: rgowdapp at redhat.com (Raghavendra Gowdappa) Date: Thu, 11 Apr 2019 06:45:18 +0530 Subject: [Gluster-users] Write Speed unusually slow when both bricks are online In-Reply-To: <9C5C26E24872DB46B612A2AFF44DE6C31C467228@FAFNIR-DOMAIN.Nacon.com> References: <9C5C26E24872DB46B612A2AFF44DE6C31C467228@FAFNIR-DOMAIN.Nacon.com> Message-ID: I would need following data: * client and brick volume profile - https://glusterdocs.readthedocs.io/en/latest/Administrator%20Guide/Performance%20Testing/ * cmdline of exact test you were running regards, On Wed, Apr 10, 2019 at 9:02 PM Jeff Forbes wrote: > I have two CentOS-6 servers running version 3.12.14 of gluster-server. > Each server as one brick and they are configured to replicate between > the two bricks. > > I also have two CentOS-6 servers running version 3.12.2-18 of > glusterfs. > > These servers use a separate VLAN. Each server has two bonded 1 Gbps > NICs to communicate the gluster traffic. File transfer speeds between > these servers using rsync approaches 100 MBps. > > The client servers mount the gluster volume using this fstab entry: > 192.168.40.30:gv0 /store glusterfs defaults, > attribute-timeout=600,entry-timeout=600,negative-timeout=600,fopen-keep-cache,use-readdirp=no,log-level=WARNING > 1 2 > > Reading data from the servers to the clients is similar to the rsync > speed. The problem is that writing from the clients to the mounted > gluster volume is less than 8 MB/s and fluctuates from less than 500 > kB/s to 8 MB/s, as measured by the pv command. Using rsync, the speed > fluctuates between 2 and 5 MBps. > > When the bonded nics on one of the gluster servers is shut down, the > write speed to the remaining online brick is now similar to the read > speed > > I can only assume that there is something wrong in my configuration, > since a greater than 10-fold decrease in write speed when the bricks > are replicating makes for an unusable system. > > > Does anyone have any ideas what the problem may be? 
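For reference, the brick-side profile data requested above is usually gathered with the gluster CLI along these lines (a sketch only; the volume name gv0 and the /store mount point are taken from the configuration quoted in this thread, and the output file names are just placeholders):

gluster volume profile gv0 start
# run the workload being investigated, e.g. a plain streaming write
# through the FUSE mount on the client:
dd if=/dev/zero of=/store/profile-test.bin bs=1M count=1024 conv=fsync
gluster volume profile gv0 info > /tmp/gv0-profile.txt
gluster volume profile gv0 stop

The client-side (FUSE) profile mentioned in the linked performance-testing guide is collected separately on the mount point; the exact steps are described in that document.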
> > > Server volume configuration: > > sudo gluster volume info > > Volume Name: gv0 > Type: Replicate > Volume ID: d96bbb99-f264-4655-95ff-f9f05ca9ff55 > Status: Started > Snapshot Count: 0 > Number of Bricks: 1 x 2 = 2 > Transport-type: tcp > Bricks: > Brick1: 192.168.40.20:/export/scsi/brick > Brick2: 192.168.40.30:/export/scsi/brick > Options Reconfigured: > performance.cache-size: 1GB > performance.readdir-ahead: on > features.cache-invalidation: on > features.cache-invalidation-timeout: 600 > performance.stat-prefetch: on > performance.cache-samba-metadata: on > performance.cache-invalidation: on > performance.md-cache-timeout: 600 > network.inode-lru-limit: 250000 > performance.cache-refresh-timeout: 60 > performance.read-ahead: disable > performance.parallel-readdir: on > performance.write-behind-window-size: 4MB > performance.io-thread-count: 64 > performance.client-io-threads: on > performance.quick-read: on > performance.flush-behind: on > performance.write-behind: on > nfs.disable: on > client.event-threads: 3 > server.event-threads: 3 > server.allow-insecure: on > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksubrahm at redhat.com Thu Apr 11 04:21:32 2019 From: ksubrahm at redhat.com (Karthik Subrahmanya) Date: Thu, 11 Apr 2019 09:51:32 +0530 Subject: [Gluster-users] Is "replica 4 arbiter 1" allowed to tweak client-quorum? In-Reply-To: <39133cb5-f3fe-4fd6-2dba-45058cdb5f1a@fischer-ka.de> References: <39133cb5-f3fe-4fd6-2dba-45058cdb5f1a@fischer-ka.de> Message-ID: Hi, I guess you missed Ravishankar's reply [1] for this query, on your previous thread. [1] https://lists.gluster.org/pipermail/gluster-users/2019-April/036247.html Regards, Karthik On Wed, Apr 10, 2019 at 8:59 PM Ingo Fischer wrote: > Hi All, > > I had a replica 2 cluster to host my VM images from my Proxmox cluster. > I got a bit around split brain scenarios by using "nufa" to make sure > the files are located on the host where the machine also runs normally. > So in fact one replica could fail and I still had the VM working. > > But then I thought about doing better and decided to add a node to > increase replica and I decided against arbiter approach. During this I > also decided to go away from nufa to make it a more normal approach. > > But in fact by adding the third replica and removing nufa I'm not really > better on availability - only split-brain-chance. I'm still at the point > that only one node is allowed to fail because else the now active client > quorum is no longer met and FS goes read only (which in fact is not > really better then failing completely as it was before). > > So I thought about adding arbiter bricks as "kind of 4th replica (but > without space needs) ... but then I read in docs that only "replica 3 > arbiter 1" is allowed as combination. Is this still true? > If docs are true: Why arbiter is not allowed for higher replica counts? > It would allow to improve on client quorum in my understanding. > > Thank you for your opinion and/or facts :-) > > Ingo > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ksubrahm at redhat.com Thu Apr 11 04:34:23 2019 From: ksubrahm at redhat.com (Karthik Subrahmanya) Date: Thu, 11 Apr 2019 10:04:23 +0530 Subject: [Gluster-users] Replica 3 - how to replace failed node (peer) In-Reply-To: References: Message-ID: Hi Strahil, Thank you for sharing your experience with reset-brick option. Since he is using the gluster version 3.7.6, we do not have the reset-brick [1] option implemented there. It is introduced in 3.9.0. He has to go with replace-brick with the force option if he wants to use the same path & name for the new brick. Yes, it is recommended to have the new brick to be of the same size as that of the other bricks. [1] https://docs.gluster.org/en/latest/release-notes/3.9.0/#introducing-reset-brick-command Regards, Karthik On Wed, Apr 10, 2019 at 10:31 PM Strahil wrote: > I have used reset-brick - but I have just changed the brick layout. > You may give it a try, but I guess you need your new brick to have same > amount of space (or more). > > Maybe someone more experienced should share a more sound solution. > > Best Regards, > Strahil NikolovOn Apr 10, 2019 12:42, Martin Toth > wrote: > > > > Hi all, > > > > I am running replica 3 gluster with 3 bricks. One of my servers failed - > all disks are showing errors and raid is in fault state. > > > > Type: Replicate > > Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a > > Status: Started > > Number of Bricks: 1 x 3 = 3 > > Transport-type: tcp > > Bricks: > > Brick1: node1.san:/tank/gluster/gv0imagestore/brick1 > > Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 down > > Brick3: node3.san:/tank/gluster/gv0imagestore/brick1 > > > > So one of my bricks is totally failed (node2). It went down and all data > are lost (failed raid on node2). Now I am running only two bricks on 2 > servers out from 3. > > This is really critical problem for us, we can lost all data. I want to > add new disks to node2, create new raid array on them and try to replace > failed brick on this node. > > > > What is the procedure of replacing Brick2 on node2, can someone advice? > I can?t find anything relevant in documentation. > > > > Thanks in advance, > > Martin > > _______________________________________________ > > Gluster-users mailing list > > Gluster-users at gluster.org > > https://lists.gluster.org/mailman/listinfo/gluster-users > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From hunter86_bg at yahoo.com Thu Apr 11 04:44:57 2019 From: hunter86_bg at yahoo.com (Strahil) Date: Thu, 11 Apr 2019 07:44:57 +0300 Subject: [Gluster-users] Replica 3 - how to replace failed node (peer) Message-ID: Hi Karthnik, I used only once the brick replace function when I wanted to change my Arbiter (v3.12.15 in oVirt 4.2.7) and it was a complete disaster. Most probably I should have stopped the source arbiter before doing that, but the docs didn't mention it. Thus I always use reset-brick, as it never let me down. Best Regards, Strahil NikolovOn Apr 11, 2019 07:34, Karthik Subrahmanya wrote: > > Hi Strahil, > > Thank you for sharing your experience with reset-brick option. > Since he is using the gluster version 3.7.6, we do not have the reset-brick [1] option implemented there. It is introduced in 3.9.0. He has to go with replace-brick with the force option if he wants to use the same path & name for the new brick.? 
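For reference, a sketch of the two command forms being compared above; the volume name is not given in this thread, so <VOLNAME> is a placeholder, and brick1_new is a hypothetical path for the rebuilt raid array:

# replace-brick (available on 3.7.x): move the volume to a new brick path on node2
gluster volume replace-brick <VOLNAME> node2.san:/tank/gluster/gv0imagestore/brick1 node2.san:/tank/gluster/gv0imagestore/brick1_new commit force

# reset-brick (3.9.0 and later): reuse the same hostname and brick path
gluster volume reset-brick <VOLNAME> node2.san:/tank/gluster/gv0imagestore/brick1 start
gluster volume reset-brick <VOLNAME> node2.san:/tank/gluster/gv0imagestore/brick1 node2.san:/tank/gluster/gv0imagestore/brick1 commit force

Either way the self-heal daemon then has to rebuild the brick contents, so check "gluster volume heal <VOLNAME> info" before taking any other node offline.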
> Yes, it is recommended to have the new brick to be of the same size as that of the other bricks. > > [1]?https://docs.gluster.org/en/latest/release-notes/3.9.0/#introducing-reset-brick-command > > Regards, > Karthik > > On Wed, Apr 10, 2019 at 10:31 PM Strahil wrote: >> >> I have used reset-brick - but I have just changed the brick layout. >> You may give it a try, but I guess you need your new brick to have same amount of space (or more). >> >> Maybe someone more experienced should share a more sound solution. >> >> Best Regards, >> Strahil NikolovOn Apr 10, 2019 12:42, Martin Toth wrote: >> > >> > Hi all, >> > >> > I am running replica 3 gluster with 3 bricks. One of my servers failed - all disks are showing errors and raid is in fault state. >> > >> > Type: Replicate >> > Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a >> > Status: Started >> > Number of Bricks: 1 x 3 = 3 >> > Transport-type: tcp >> > Bricks: >> > Brick1: node1.san:/tank/gluster/gv0imagestore/brick1 >> > Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 > > Brick3: node3.san:/tank/gluster/gv0imagestore/brick1 >> > >> > So one of my bricks is totally failed (node2). It went down and all data are lost (failed raid on node2). Now I am running only two bricks on 2 servers out from 3. >> > This is really critical problem for us, we can lost all data. I want to add new disks to node2, create new raid array on them and try to replace failed brick on this node. >> > >> > What is the procedure of replacing Brick2 on node2, can someone advice? I can?t find anything relevant in documentation. >> > >> > Thanks in advance, >> > Martin >> > _______________________________________________ >> > Gluster-users mailing list >> > Gluster-users at gluster.org >> > https://lists.gluster.org/mailman/listinfo/gluster-users >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksubrahm at redhat.com Thu Apr 11 04:53:37 2019 From: ksubrahm at redhat.com (Karthik Subrahmanya) Date: Thu, 11 Apr 2019 10:23:37 +0530 Subject: [Gluster-users] Replica 3 - how to replace failed node (peer) In-Reply-To: References: Message-ID: Hi Strahil, Can you give us some more insights on - the volume configuration you were using? - why you wanted to replace your brick? - which brick(s) you tried replacing? - what problem(s) did you face? Regards, Karthik On Thu, Apr 11, 2019 at 10:14 AM Strahil wrote: > Hi Karthnik, > I used only once the brick replace function when I wanted to change my > Arbiter (v3.12.15 in oVirt 4.2.7) and it was a complete disaster. > Most probably I should have stopped the source arbiter before doing that, > but the docs didn't mention it. > > Thus I always use reset-brick, as it never let me down. > > Best Regards, > Strahil Nikolov > On Apr 11, 2019 07:34, Karthik Subrahmanya wrote: > > Hi Strahil, > > Thank you for sharing your experience with reset-brick option. > Since he is using the gluster version 3.7.6, we do not have the > reset-brick [1] option implemented there. It is introduced in 3.9.0. He has > to go with replace-brick with the force option if he wants to use the same > path & name for the new brick. > Yes, it is recommended to have the new brick to be of the same size as > that of the other bricks. 
> > [1] > https://docs.gluster.org/en/latest/release-notes/3.9.0/#introducing-reset-brick-command > > Regards, > Karthik > > On Wed, Apr 10, 2019 at 10:31 PM Strahil wrote: > > I have used reset-brick - but I have just changed the brick layout. > You may give it a try, but I guess you need your new brick to have same > amount of space (or more). > > Maybe someone more experienced should share a more sound solution. > > Best Regards, > Strahil NikolovOn Apr 10, 2019 12:42, Martin Toth > wrote: > > > > Hi all, > > > > I am running replica 3 gluster with 3 bricks. One of my servers failed - > all disks are showing errors and raid is in fault state. > > > > Type: Replicate > > Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a > > Status: Started > > Number of Bricks: 1 x 3 = 3 > > Transport-type: tcp > > Bricks: > > Brick1: node1.san:/tank/gluster/gv0imagestore/brick1 > > Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 down > > Brick3: node3.san:/tank/gluster/gv0imagestore/brick1 > > > > So one of my bricks is totally failed (node2). It went down and all data > are lost (failed raid on node2). Now I am running only two bricks on 2 > servers out from 3. > > This is really critical problem for us, we can lost all data. I want to > add new disks to node2, create new raid array on them and try to replace > failed brick on this node. > > > > What is the procedure of replacing Brick2 on node2, can someone advice? > I can?t find anything relevant in documentation. > > > > Thanks in advance, > > Martin > > _______________________________________________ > > Gluster-users mailing list > > Gluster-users at gluster.org > > https://lists.gluster.org/mailman/listinfo/gluster-users > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksubrahm at redhat.com Thu Apr 11 04:55:37 2019 From: ksubrahm at redhat.com (Karthik Subrahmanya) Date: Thu, 11 Apr 2019 10:25:37 +0530 Subject: [Gluster-users] Replica 3 - how to replace failed node (peer) In-Reply-To: References: Message-ID: On Thu, Apr 11, 2019 at 10:23 AM Karthik Subrahmanya wrote: > Hi Strahil, > > Can you give us some more insights on > - the volume configuration you were using? > - why you wanted to replace your brick? > - which brick(s) you tried replacing? > - if you remember the commands/steps that you followed, please give that as well. > - what problem(s) did you face? > > Regards, > Karthik > > On Thu, Apr 11, 2019 at 10:14 AM Strahil wrote: > >> Hi Karthnik, >> I used only once the brick replace function when I wanted to change my >> Arbiter (v3.12.15 in oVirt 4.2.7) and it was a complete disaster. >> Most probably I should have stopped the source arbiter before doing that, >> but the docs didn't mention it. >> >> Thus I always use reset-brick, as it never let me down. >> >> Best Regards, >> Strahil Nikolov >> On Apr 11, 2019 07:34, Karthik Subrahmanya wrote: >> >> Hi Strahil, >> >> Thank you for sharing your experience with reset-brick option. >> Since he is using the gluster version 3.7.6, we do not have the >> reset-brick [1] option implemented there. It is introduced in 3.9.0. He has >> to go with replace-brick with the force option if he wants to use the same >> path & name for the new brick. >> Yes, it is recommended to have the new brick to be of the same size as >> that of the other bricks. 
>> >> [1] >> https://docs.gluster.org/en/latest/release-notes/3.9.0/#introducing-reset-brick-command >> >> Regards, >> Karthik >> >> On Wed, Apr 10, 2019 at 10:31 PM Strahil wrote: >> >> I have used reset-brick - but I have just changed the brick layout. >> You may give it a try, but I guess you need your new brick to have same >> amount of space (or more). >> >> Maybe someone more experienced should share a more sound solution. >> >> Best Regards, >> Strahil NikolovOn Apr 10, 2019 12:42, Martin Toth >> wrote: >> > >> > Hi all, >> > >> > I am running replica 3 gluster with 3 bricks. One of my servers failed >> - all disks are showing errors and raid is in fault state. >> > >> > Type: Replicate >> > Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a >> > Status: Started >> > Number of Bricks: 1 x 3 = 3 >> > Transport-type: tcp >> > Bricks: >> > Brick1: node1.san:/tank/gluster/gv0imagestore/brick1 >> > Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 > down >> > Brick3: node3.san:/tank/gluster/gv0imagestore/brick1 >> > >> > So one of my bricks is totally failed (node2). It went down and all >> data are lost (failed raid on node2). Now I am running only two bricks on 2 >> servers out from 3. >> > This is really critical problem for us, we can lost all data. I want to >> add new disks to node2, create new raid array on them and try to replace >> failed brick on this node. >> > >> > What is the procedure of replacing Brick2 on node2, can someone advice? >> I can?t find anything relevant in documentation. >> > >> > Thanks in advance, >> > Martin >> > _______________________________________________ >> > Gluster-users mailing list >> > Gluster-users at gluster.org >> > https://lists.gluster.org/mailman/listinfo/gluster-users >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From atumball at redhat.com Thu Apr 11 07:10:26 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Thu, 11 Apr 2019 12:40:26 +0530 Subject: [Gluster-users] SEGFAULT in FUSE layer In-Reply-To: References: Message-ID: Thanks for the report Florian. We will look into this. On Wed, Apr 10, 2019 at 9:03 PM Florian Manschwetus < manschwetus at cs-software-gmbh.de> wrote: > Hi All, > > I?d like to bring this bug report, I just opened, to your attention. > > https://bugzilla.redhat.com/show_bug.cgi?id=1697971 > > > > > > > > -- > > Mit freundlichen Gr??en / With kind regards > > Florian Manschwetus > > > > CS Software Concepts and Solutions GmbH > > Gesch?ftsf?hrer / Managing director: Dr. Werner Alexi > > Amtsgericht Wiesbaden HRB 10004 (Commercial registry) > > Schiersteiner Stra?e 31 > > D-65187 Wiesbaden > > Germany > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -- Amar Tumballi (amarts) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From snowmailer at gmail.com Thu Apr 11 07:12:58 2019 From: snowmailer at gmail.com (Martin Toth) Date: Thu, 11 Apr 2019 09:12:58 +0200 Subject: [Gluster-users] Replica 3 - how to replace failed node (peer) In-Reply-To: <1634978A-E849-48DB-A160-B1AC3DB56D38@gmail.com> References: <0917AF4A-76EC-4A9E-820F-E0ADA2DA899A@gmail.com> <1634978A-E849-48DB-A160-B1AC3DB56D38@gmail.com> Message-ID: <69E7C95F-8A81-46CB-8BD8-F66B582144EC@gmail.com> Hi Karthik, more over, I would like to ask if there are some recommended settings/parameters for SHD in order to achieve good or fair I/O while volume will be healed when I will replace Brick (this should trigger healing process). I had some problems in past when healing was triggered, VM disks became unresponsive because healing took most of I/O. My volume containing only big files with VM disks. Thanks for suggestions. BR, Martin > On 10 Apr 2019, at 12:38, Martin Toth wrote: > > Thanks, this looks ok to me, I will reset brick because I don't have any data anymore on failed node so I can use same path / brick name. > > Is reseting brick dangerous command? Should I be worried about some possible failure that will impact remaining two nodes? I am running really old 3.7.6 but stable version. > > Thanks, > BR! > > Martin > > >> On 10 Apr 2019, at 12:20, Karthik Subrahmanya > wrote: >> >> Hi Martin, >> >> After you add the new disks and creating raid array, you can run the following command to replace the old brick with new one: >> >> - If you are going to use a different name to the new brick you can run >> gluster volume replace-brick commit force >> >> - If you are planning to use the same name for the new brick as well then you can use >> gluster volume reset-brick commit force >> Here old-brick & new-brick's hostname & path should be same. >> >> After replacing the brick, make sure the brick comes online using volume status. >> Heal should automatically start, you can check the heal status to see all the files gets replicated to the newly added brick. If it does not start automatically, you can manually start that by running gluster volume heal . >> >> HTH, >> Karthik >> >> On Wed, Apr 10, 2019 at 3:13 PM Martin Toth > wrote: >> Hi all, >> >> I am running replica 3 gluster with 3 bricks. One of my servers failed - all disks are showing errors and raid is in fault state. >> >> Type: Replicate >> Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a >> Status: Started >> Number of Bricks: 1 x 3 = 3 >> Transport-type: tcp >> Bricks: >> Brick1: node1.san:/tank/gluster/gv0imagestore/brick1 >> Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 > Brick3: node3.san:/tank/gluster/gv0imagestore/brick1 >> >> So one of my bricks is totally failed (node2). It went down and all data are lost (failed raid on node2). Now I am running only two bricks on 2 servers out from 3. >> This is really critical problem for us, we can lost all data. I want to add new disks to node2, create new raid array on them and try to replace failed brick on this node. >> >> What is the procedure of replacing Brick2 on node2, can someone advice? I can?t find anything relevant in documentation. >> >> Thanks in advance, >> Martin >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From hunter86_bg at yahoo.com Thu Apr 11 08:00:31 2019
From: hunter86_bg at yahoo.com (Strahil Nikolov)
Date: Thu, 11 Apr 2019 08:00:31 +0000 (UTC)
Subject: [Gluster-users] Gluster snapshot fails
In-Reply-To: <1066182693.15102103.1554901501174.JavaMail.zimbra@redhat.com>
References: <1800297079.797563.1554843999336.ref@mail.yahoo.com> <1800297079.797563.1554843999336@mail.yahoo.com> <1066182693.15102103.1554901501174.JavaMail.zimbra@redhat.com>
Message-ID: <715831506.1767324.1554969631562@mail.yahoo.com>

Hi Rafi,

thanks for your update. I have tested again with another gluster volume.

[root at ovirt1 glusterfs]# gluster volume info isos

Volume Name: isos
Type: Replicate
Volume ID: 9b92b5bd-79f5-427b-bd8d-af28b038ed2a
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: ovirt1:/gluster_bricks/isos/isos
Brick2: ovirt2:/gluster_bricks/isos/isos
Brick3: ovirt3.localdomain:/gluster_bricks/isos/isos (arbiter)
Options Reconfigured:
cluster.granular-entry-heal: enable
performance.strict-o-direct: on
network.ping-timeout: 30
storage.owner-gid: 36
storage.owner-uid: 36
user.cifs: off
features.shard: on
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.quorum-type: auto
cluster.eager-lock: enable
network.remote-dio: off
performance.low-prio-threads: 32
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
cluster.enable-shared-storage: enable

Command run:
logrotate -f glusterfs ; logrotate -f glusterfs-georep; gluster snapshot create isos-snap-2019-04-11 isos description TEST

Logs:
[root at ovirt1 glusterfs]# cat cli.log
[2019-04-11 07:51:02.367453] I [cli.c:769:main] 0-cli: Started running gluster with version 5.5
[2019-04-11 07:51:02.486863] I [MSGID: 101190] [event-epoll.c:621:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2019-04-11 07:51:02.556813] E [cli-rpc-ops.c:11293:gf_cli_snapshot] 0-cli: cli_to_glusterd for snapshot failed
[2019-04-11 07:51:02.556880] I [input.c:31:cli_batch] 0-: Exiting with: -1

[root at ovirt1 glusterfs]# cat glusterd.log
[2019-04-11 07:51:02.553357] E [MSGID: 106024] [glusterd-snapshot.c:2547:glusterd_snapshot_create_prevalidate] 0-management: Snapshot is supported only for thin provisioned LV. Ensure that all bricks of isos are thinly provisioned LV.
[2019-04-11 07:51:02.553365] W [MSGID: 106029] [glusterd-snapshot.c:8613:glusterd_snapshot_prevalidate] 0-management: Snapshot create pre-validation failed
[2019-04-11 07:51:02.553703] W [MSGID: 106121] [glusterd-mgmt.c:147:gd_mgmt_v3_pre_validate_fn] 0-management: Snapshot Prevalidate Failed
[2019-04-11 07:51:02.553719] E [MSGID: 106121] [glusterd-mgmt.c:1015:glusterd_mgmt_v3_pre_validate] 0-management: Pre Validation failed for operation Snapshot on local node

My LVs hosting the bricks are:

[root at ovirt1 ~]# lvs gluster_vg_md0
  LV              VG             Attr       LSize   Pool             Origin  Data%  Meta%
  gluster_lv_data gluster_vg_md0 Vwi-aot--- 500.00g my_vdo_thinpool          35.97
  gluster_lv_isos gluster_vg_md0 Vwi-aot---  50.00g my_vdo_thinpool          52.11
  my_vdo_thinpool gluster_vg_md0 twi-aot---   9.86t                          2.04   11.45

[root at ovirt1 ~]# ssh ovirt2 "lvs gluster_vg_md0"
  LV              VG             Attr       LSize   Pool             Origin  Data%  Meta%
  gluster_lv_data gluster_vg_md0 Vwi-aot--- 500.00g my_vdo_thinpool          35.98
  gluster_lv_isos gluster_vg_md0 Vwi-aot---  50.00g my_vdo_thinpool          25.94
  my_vdo_thinpool gluster_vg_md0 twi-aot---  <9.77t                          1.93   11.39

[root at ovirt1 ~]# ssh ovirt3 "lvs gluster_vg_sda3"
  LV                    VG              Attr       LSize  Pool                   Origin  Data%  Meta%
  gluster_lv_data       gluster_vg_sda3 Vwi-aotz-- 15.00g gluster_thinpool_sda3          0.17
  gluster_lv_engine     gluster_vg_sda3 Vwi-aotz-- 15.00g gluster_thinpool_sda3          0.16
  gluster_lv_isos       gluster_vg_sda3 Vwi-aotz-- 15.00g gluster_thinpool_sda3          0.12
  gluster_thinpool_sda3 gluster_vg_sda3 twi-aotz-- 41.00g                                0.16   1.58

As you can see, all bricks are thin LVs and space is not the issue.
Can someone hint me how to enable debug logging, so the gluster logs can show the reason for that pre-check failure?

Best Regards,
Strahil Nikolov

On Wednesday, April 10, 2019, 9:05:15 AM GMT-4, Rafi Kavungal Chundattu Parambil wrote:

Hi Strahil,

The name of the device is not at all a problem here. Can you please check the log of glusterd, and see if there is any useful information about the failure. Also please provide the output of `lvscan` and `lvs --noheadings -o pool_lv` from all nodes.

Regards
Rafi KC

----- Original Message -----
From: "Strahil Nikolov"
To: gluster-users at gluster.org
Sent: Wednesday, April 10, 2019 2:36:39 AM
Subject: [Gluster-users] Gluster snapshot fails

Hello Community,

I have a problem running a snapshot of a replica 3 arbiter 1 volume.

Error:
[root at ovirt2 ~]# gluster snapshot create before-423 engine description "Before upgrade of engine from 4.2.2 to 4.2.3"
snapshot create: failed: Snapshot is supported only for thin provisioned LV. Ensure that all bricks of engine are thinly provisioned LV.
Snapshot command failed

Volume info:

Volume Name: engine
Type: Replicate
Volume ID: 30ca1cc2-f2f7-4749-9e2e-cee9d7099ded
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: ovirt1:/gluster_bricks/engine/engine
Brick2: ovirt2:/gluster_bricks/engine/engine
Brick3: ovirt3:/gluster_bricks/engine/engine (arbiter)
Options Reconfigured:
cluster.granular-entry-heal: enable
performance.strict-o-direct: on
network.ping-timeout: 30
storage.owner-gid: 36
storage.owner-uid: 36
user.cifs: off
features.shard: on
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.quorum-type: auto
cluster.eager-lock: enable
network.remote-dio: off
performance.low-prio-threads: 32
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
cluster.enable-shared-storage: enable

All bricks are on thin LVM with plenty of space. The only thing that could be causing it is that ovirt1 & ovirt2 are on /dev/gluster_vg_ssd/gluster_lv_engine, while the arbiter is on /dev/gluster_vg_sda3/gluster_lv_engine. Is that the issue? Should I rename my brick's VG? If so, why is there no mention of this in the documentation?
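A quick way to see what the prevalidation check is looking at, sketched with the engine brick from the volume info above (the device path is the one mentioned in this mail and may differ per node):

# which device actually backs the brick mount, and is it a thin LV?
df --output=source /gluster_bricks/engine/engine
lvs -o lv_name,vg_name,pool_lv,lv_attr /dev/gluster_vg_ssd/gluster_lv_engine
# a leading 'V' in lv_attr plus a non-empty pool_lv means the brick sits on a thin LV;
# for more verbose prevalidation logging, glusterd can be restarted with --log-level DEBUG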
Best Regards, Strahil Nikolov _______________________________________________ Gluster-users mailing list Gluster-users at gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From hunter86_bg at yahoo.com Thu Apr 11 08:10:22 2019 From: hunter86_bg at yahoo.com (Strahil Nikolov) Date: Thu, 11 Apr 2019 08:10:22 +0000 (UTC) Subject: [Gluster-users] Replica 3 - how to replace failed node (peer) In-Reply-To: References: Message-ID: <2030773662.1746728.1554970222668@mail.yahoo.com> Hi Karthik, - the volume configuration you were using?I used oVirt 4.2.6 Gluster Wizard, so I guess - we need to involve the oVirt devs here. - why you wanted to replace your brick?I have deployed the arbiter on another location as I thought I can deploy the Thin Arbiter (still waiting the docs to be updated), but once I realized that GlusterD doesn't support Thin Arbiter, I had to build another machine for a local arbiter - thus a replacement was needed.- which brick(s) you tried replacing?I was replacing the old arbiter with a new one- what problem(s) did you face?All oVirt VMs got paused due to I/O errors. At the end, I have rebuild the whole setup and I never tried to replace the brick this way (used only reset-brick which didn't cause any issues). As I mentioned that was on v3.12, which is not the default for oVirt 4.3.x - so my guess is that it is OK now (current is v5.5). Just sharing my experience. Best Regards,Strahil Nikolov ? ?????????, 11 ????? 2019 ?., 0:53:52 ?. ???????-4, Karthik Subrahmanya ??????: Hi Strahil, Can you give us some more insights on- the volume configuration you were using?- why you wanted to replace your brick?- which brick(s) you tried replacing?- what problem(s) did you face? Regards,Karthik On Thu, Apr 11, 2019 at 10:14 AM Strahil wrote: Hi Karthnik, I used only once the brick replace function when I wanted to change my Arbiter (v3.12.15 in oVirt 4.2.7)? and it was a complete disaster. Most probably I should have stopped the source arbiter before doing that, but the docs didn't mention it. Thus I always use reset-brick, as it never let me down. Best Regards, Strahil Nikolov On Apr 11, 2019 07:34, Karthik Subrahmanya wrote: Hi Strahil, Thank you for sharing your experience with reset-brick option.Since he is using the gluster version 3.7.6, we do not have the reset-brick [1] option implemented there. It is introduced in 3.9.0. He has to go with replace-brick with the force option if he wants to use the same path & name for the new brick.?Yes, it is recommended to have the new brick to be of the same size as that of the other bricks. [1]?https://docs.gluster.org/en/latest/release-notes/3.9.0/#introducing-reset-brick-command Regards,Karthik On Wed, Apr 10, 2019 at 10:31 PM Strahil wrote: I have used reset-brick - but I have just changed the brick layout. You may give it a try, but I guess you need your new brick to have same amount of space (or more). Maybe someone more experienced should share a more sound solution. Best Regards, Strahil NikolovOn Apr 10, 2019 12:42, Martin Toth wrote: > > Hi all, > > I am running replica 3 gluster with 3 bricks. One of my servers failed - all disks are showing errors and raid is in fault state. 
> > Type: Replicate > Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a > Status: Started > Number of Bricks: 1 x 3 = 3 > Transport-type: tcp > Bricks: > Brick1: node1.san:/tank/gluster/gv0imagestore/brick1 > Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 Brick3: node3.san:/tank/gluster/gv0imagestore/brick1 > > So one of my bricks is totally failed (node2). It went down and all data are lost (failed raid on node2). Now I am running only two bricks on 2 servers out from 3. > This is really critical problem for us, we can lost all data. I want to add new disks to node2, create new raid array on them and try to replace failed brick on this node. > > What is the procedure of replacing Brick2 on node2, can someone advice? I can?t find anything relevant in documentation. > > Thanks in advance, > Martin > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users _______________________________________________ Gluster-users mailing list Gluster-users at gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From hunter86_bg at yahoo.com Thu Apr 11 08:11:19 2019 From: hunter86_bg at yahoo.com (Strahil Nikolov) Date: Thu, 11 Apr 2019 08:11:19 +0000 (UTC) Subject: [Gluster-users] Replica 3 - how to replace failed node (peer) In-Reply-To: References: Message-ID: <245562760.1708418.1554970279439@mail.yahoo.com> The command should be copy & paste from the docs. Can't check it out - as the systems were wiped. Best Regards,Strahil Nikolov -------------- next part -------------- An HTML attachment was scrubbed... URL: From atumball at redhat.com Thu Apr 11 08:54:14 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Thu, 11 Apr 2019 14:24:14 +0530 Subject: [Gluster-users] Proposal: Changes in Gluster Community meetings In-Reply-To: References: <62104B6F-99CF-4C22-80FC-9C177F73E897@onholyground.com> Message-ID: Hi All, Below is the final details of our community meeting, and I will be sending invites to mailing list following this email. You can add Gluster Community Calendar so you can get notifications on the meetings. We are starting the meetings from next week. For the first meeting, we need 1 volunteer from users to discuss the use case / what went well, and what went bad, etc. preferrably in APAC region. NA/EMEA region, next week. Draft Content: https://hackmd.io/OqZbh7gfQe6uvVUXUVKJ5g ---- Gluster Community Meeting Previous Meeting minutes: - http://github.com/gluster/community Date/Time: Check the community calendar Bridge - APAC friendly hours - Bridge: https://bluejeans.com/836554017 - NA/EMEA - Bridge: https://bluejeans.com/486278655 ------------------------------ Attendance - Name, Company Host - Who will host next meeting? - Host will need to send out the agenda 24hr - 12hrs in advance to mailing list, and also make sure to send the meeting minutes. - Host will need to reach out to one user at least who can talk about their usecase, their experience, and their needs. - Host needs to send meeting minutes as PR to http://github.com/gluster/community User stories - Discuss 1 usecase from a user. - How was the architecture derived, what volume type used, options, etc? - What were the major issues faced ? How to improve them? - What worked good? - How can we all collaborate well, so it is win-win for the community and the user? How can we Community - Any release updates? 
- Blocker issues across the project? - Metrics - Number of new bugs since previous meeting. How many are not triaged? - Number of emails, anything unanswered? Conferences / Meetups - Any conference in next 1 month where gluster-developers are going? gluster-users are going? So we can meet and discuss. Developer focus - Any design specs to discuss? - Metrics of the week? - Coverity - Clang-Scan - Number of patches from new developers. - Did we increase test coverage? - [Atin] Also talk about most frequent test failures in the CI and carve out an AI to get them fixed. RoundTable - ---- Regards, Amar On Mon, Mar 25, 2019 at 8:53 PM Amar Tumballi Suryanarayan < atumball at redhat.com> wrote: > Thanks for the feedback Darrell, > > The new proposal is to have one in North America 'morning' time. (10AM > PST), And another in ASIA day time, which is evening 7pm/6pm in Australia, > 9pm Newzealand, 5pm Tokyo, 4pm Beijing. > > For example, if we choose Every other Tuesday for meeting, and 1st of the > month is Tuesday, we would have North America time for 1st, and on 15th it > would be ASIA/Pacific time. > > Hopefully, this way, we can cover all the timezones, and meeting minutes > would be committed to github repo, so that way, it will be easier for > everyone to be aware of what is happening. > > Regards, > Amar > > On Mon, Mar 25, 2019 at 8:40 PM Darrell Budic > wrote: > >> As a user, I?d like to visit more of these, but the time slot is my 3AM. >> Any possibility for a rolling schedule (move meeting +6 hours each week >> with rolling attendance from maintainers?) or an occasional regional >> meeting 12 hours opposed to the one you?re proposing? >> >> -Darrell >> >> On Mar 25, 2019, at 4:25 AM, Amar Tumballi Suryanarayan < >> atumball at redhat.com> wrote: >> >> All, >> >> We currently have 3 meetings which are public: >> >> 1. Maintainer's Meeting >> >> - Runs once in 2 weeks (on Mondays), and current attendance is around 3-5 >> on an avg, and not much is discussed. >> - Without majority attendance, we can't take any decisions too. >> >> 2. Community meeting >> >> - Supposed to happen on #gluster-meeting, every 2 weeks, and is the only >> meeting which is for 'Community/Users'. Others are for developers as of >> now. >> Sadly attendance is getting closer to 0 in recent times. >> >> 3. GCS meeting >> >> - We started it as an effort inside Red Hat gluster team, and opened it >> up for community from Jan 2019, but the attendance was always from RHT >> members, and haven't seen any traction from wider group. >> >> So, I have a proposal to call out for cancelling all these meeting, and >> keeping just 1 weekly 'Community' meeting, where even topics related to >> maintainers and GCS and other projects can be discussed. >> >> I have a template of a draft template @ >> https://hackmd.io/OqZbh7gfQe6uvVUXUVKJ5g >> >> Please feel free to suggest improvements, both in agenda and in timings. >> So, we can have more participation from members of community, which allows >> more user - developer interactions, and hence quality of project. >> >> Waiting for feedbacks, >> >> Regards, >> Amar >> >> >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users >> >> >> > > -- > Amar Tumballi (amarts) > -- Amar Tumballi (amarts) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From amarts at gmail.com Thu Apr 11 08:56:48 2019 From: amarts at gmail.com (amarts at gmail.com) Date: Thu, 11 Apr 2019 08:56:48 +0000 Subject: [Gluster-users] Invitation: Gluster Community Meeting (APAC friendly hours) @ Tue Apr 16, 2019 11:30am - 12:30pm (IST) (gluster-users@gluster.org) Message-ID: <000000000000cc604305863d5d35@google.com> You have been invited to the following event. Title: Gluster Community Meeting (APAC friendly hours) Bridge: https://bluejeans.com/836554017 Meeting minutes: https://hackmd.io/OqZbh7gfQe6uvVUXUVKJ5g?both Previous Meeting notes: http://github.com/gluster/community When: Tue Apr 16, 2019 11:30am ? 12:30pm India Standard Time - Kolkata Where: https://bluejeans.com/836554017 Calendar: gluster-users at gluster.org Who: * amarts at gmail.com - creator * gluster-users at gluster.org * maintainers at gluster.org * gluster-devel at gluster.org Event details: https://www.google.com/calendar/event?action=VIEW&eid=MjU2dWllNDQyM2tqaGs0ZjhidGl2YmdtM2YgZ2x1c3Rlci11c2Vyc0BnbHVzdGVyLm9yZw&tok=NTIjdmViajVibDBrbnNiOWQwY205ZWg5cGJsaTRAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbTZlODU1NTU1Mzk4NjllOTQ4NzUxODAxYTQ4M2E4Y2ExMDRhODg3YjY&ctz=Asia%2FKolkata&hl=en&es=0 Invitation from Google Calendar: https://www.google.com/calendar/ You are receiving this courtesy email at the account gluster-users at gluster.org because you are an attendee of this event. To stop receiving future updates for this event, decline this event. Alternatively you can sign up for a Google account at https://www.google.com/calendar/ and control your notification settings for your entire calendar. Forwarding this invitation could allow any recipient to send a response to the organizer and be added to the guest list, or invite others regardless of their own invitation status, or to modify your RSVP. Learn more at https://support.google.com/calendar/answer/37135#forwarding -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: text/calendar Size: 1822 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: invite.ics Type: application/ics Size: 1862 bytes Desc: not available URL: From amarts at gmail.com Thu Apr 11 08:57:51 2019 From: amarts at gmail.com (amarts at gmail.com) Date: Thu, 11 Apr 2019 08:57:51 +0000 Subject: [Gluster-users] Invitation: Gluster Community Meeting (NA/EMEA friendly hours) @ Tue Apr 23, 2019 10:30pm - 11:30pm (IST) (gluster-users@gluster.org) Message-ID: <000000000000910a0605863d61a0@google.com> You have been invited to the following event. Title: Gluster Community Meeting (NA/EMEA friendly hours) Bridge: https://bluejeans.com/486278655 Meeting minutes: https://hackmd.io/OqZbh7gfQe6uvVUXUVKJ5g?both Previous Meeting notes: http://github.com/gluster/community When: Tue Apr 23, 2019 10:30pm ? 
11:30pm India Standard Time - Kolkata Where: https://bluejeans.com/486278655 Calendar: gluster-users at gluster.org Who: * amarts at gmail.com - creator * gluster-users at gluster.org * maintainers at gluster.org * gluster-devel at gluster.org Event details: https://www.google.com/calendar/event?action=VIEW&eid=N3Y1NWZkZTkxNWQzc3QxcHR2OHJnNm4zNzYgZ2x1c3Rlci11c2Vyc0BnbHVzdGVyLm9yZw&tok=NTIjdmViajVibDBrbnNiOWQwY205ZWg5cGJsaTRAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbTc2MWYzNWEwZmFiMjk5YzFlYmM3NzkyNjNhOWY5MzExYTM4NGYwMWQ&ctz=Asia%2FKolkata&hl=en&es=0 Invitation from Google Calendar: https://www.google.com/calendar/ You are receiving this courtesy email at the account gluster-users at gluster.org because you are an attendee of this event. To stop receiving future updates for this event, decline this event. Alternatively you can sign up for a Google account at https://www.google.com/calendar/ and control your notification settings for your entire calendar. Forwarding this invitation could allow any recipient to send a response to the organizer and be added to the guest list, or invite others regardless of their own invitation status, or to modify your RSVP. Learn more at https://support.google.com/calendar/answer/37135#forwarding -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: text/calendar Size: 1827 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: invite.ics Type: application/ics Size: 1867 bytes Desc: not available URL: From ksubrahm at redhat.com Thu Apr 11 10:31:51 2019 From: ksubrahm at redhat.com (Karthik Subrahmanya) Date: Thu, 11 Apr 2019 16:01:51 +0530 Subject: [Gluster-users] Replica 3 - how to replace failed node (peer) In-Reply-To: <69E7C95F-8A81-46CB-8BD8-F66B582144EC@gmail.com> References: <0917AF4A-76EC-4A9E-820F-E0ADA2DA899A@gmail.com> <1634978A-E849-48DB-A160-B1AC3DB56D38@gmail.com> <69E7C95F-8A81-46CB-8BD8-F66B582144EC@gmail.com> Message-ID: On Thu, Apr 11, 2019 at 12:43 PM Martin Toth wrote: > Hi Karthik, > > more over, I would like to ask if there are some recommended > settings/parameters for SHD in order to achieve good or fair I/O while > volume will be healed when I will replace Brick (this should trigger > healing process). > If I understand you concern correctly, you need to get fair I/O performance for clients while healing takes place as part of the replace brick operation. For this you can turn off the "data-self-heal" and "metadata-self-heal" options until the heal completes on the new brick. Turning off client side healing doesn't compromise data integrity and consistency. During the read request from client, pending xattr is evaluated for replica copies and read is only served from correct copy. During writes, IO will continue on both the replicas, SHD will take care of healing files. After replacing the brick, we strongly recommend you to consider upgrading your gluster to one of the maintained versions. We have many stability related fixes there, which can handle some critical issues and corner cases which you could hit during these kind of scenarios. Regards, Karthik > I had some problems in past when healing was triggered, VM disks became > unresponsive because healing took most of I/O. My volume containing only > big files with VM disks. > > Thanks for suggestions. 
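Spelled out with a <volname> placeholder (the volume name is not shown in this thread), the options referred to above are set and later reverted like this:

# disable client-side healing while the replaced brick is being rebuilt
gluster volume set <volname> cluster.data-self-heal off
gluster volume set <volname> cluster.metadata-self-heal off
# once "gluster volume heal <volname> info" reports no pending entries, turn them back on
gluster volume set <volname> cluster.data-self-heal on
gluster volume set <volname> cluster.metadata-self-heal on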
> BR, > Martin > > On 10 Apr 2019, at 12:38, Martin Toth wrote: > > Thanks, this looks ok to me, I will reset brick because I don't have any > data anymore on failed node so I can use same path / brick name. > > Is reseting brick dangerous command? Should I be worried about some > possible failure that will impact remaining two nodes? I am running really > old 3.7.6 but stable version. > > Thanks, > BR! > > Martin > > > On 10 Apr 2019, at 12:20, Karthik Subrahmanya wrote: > > Hi Martin, > > After you add the new disks and creating raid array, you can run the > following command to replace the old brick with new one: > > - If you are going to use a different name to the new brick you can run > gluster volume replace-brick commit force > > - If you are planning to use the same name for the new brick as well then > you can use > gluster volume reset-brick commit force > Here old-brick & new-brick's hostname & path should be same. > > After replacing the brick, make sure the brick comes online using volume > status. > Heal should automatically start, you can check the heal status to see all > the files gets replicated to the newly added brick. If it does not start > automatically, you can manually start that by running gluster volume heal > . > > HTH, > Karthik > > On Wed, Apr 10, 2019 at 3:13 PM Martin Toth wrote: > >> Hi all, >> >> I am running replica 3 gluster with 3 bricks. One of my servers failed - >> all disks are showing errors and raid is in fault state. >> >> Type: Replicate >> Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a >> Status: Started >> Number of Bricks: 1 x 3 = 3 >> Transport-type: tcp >> Bricks: >> Brick1: node1.san:/tank/gluster/gv0imagestore/brick1 >> Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 > Brick3: node3.san:/tank/gluster/gv0imagestore/brick1 >> >> So one of my bricks is totally failed (node2). It went down and all data >> are lost (failed raid on node2). Now I am running only two bricks on 2 >> servers out from 3. >> This is really critical problem for us, we can lost all data. I want to >> add new disks to node2, create new raid array on them and try to replace >> failed brick on this node. >> >> What is the procedure of replacing Brick2 on node2, can someone advice? I >> can?t find anything relevant in documentation. >> >> Thanks in advance, >> Martin >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksubrahm at redhat.com Thu Apr 11 10:45:52 2019 From: ksubrahm at redhat.com (Karthik Subrahmanya) Date: Thu, 11 Apr 2019 16:15:52 +0530 Subject: [Gluster-users] Replica 3 - how to replace failed node (peer) In-Reply-To: <2030773662.1746728.1554970222668@mail.yahoo.com> References: <2030773662.1746728.1554970222668@mail.yahoo.com> Message-ID: On Thu, Apr 11, 2019 at 1:40 PM Strahil Nikolov wrote: > Hi Karthik, > > - the volume configuration you were using? > I used oVirt 4.2.6 Gluster Wizard, so I guess - we need to involve the > oVirt devs here. > - why you wanted to replace your brick? > I have deployed the arbiter on another location as I thought I can deploy > the Thin Arbiter (still waiting the docs to be updated), but once I > realized that GlusterD doesn't support Thin Arbiter, I had to build another > machine for a local arbiter - thus a replacement was needed. > We are working on supporting Thin-arbiter with GlusterD. 
Once done, we will update on the users list so that you can play with it and let us know your experience. > - which brick(s) you tried replacing? > I was replacing the old arbiter with a new one > - what problem(s) did you face? > All oVirt VMs got paused due to I/O errors. > There could be many reasons for this. Without knowing the exact state of the system at that time, I am afraid to make any comment on this. > > At the end, I have rebuild the whole setup and I never tried to replace > the brick this way (used only reset-brick which didn't cause any issues). > > As I mentioned that was on v3.12, which is not the default for oVirt > 4.3.x - so my guess is that it is OK now (current is v5.5). > I don't remember anyone complaining about this recently. This should work in the latest releases. > > Just sharing my experience. > Highly appreciated. Regards, Karthik > > Best Regards, > Strahil Nikolov > > ? ?????????, 11 ????? 2019 ?., 0:53:52 ?. ???????-4, Karthik Subrahmanya < > ksubrahm at redhat.com> ??????: > > > Hi Strahil, > > Can you give us some more insights on > - the volume configuration you were using? > - why you wanted to replace your brick? > - which brick(s) you tried replacing? > - what problem(s) did you face? > > Regards, > Karthik > > On Thu, Apr 11, 2019 at 10:14 AM Strahil wrote: > > Hi Karthnik, > I used only once the brick replace function when I wanted to change my > Arbiter (v3.12.15 in oVirt 4.2.7) and it was a complete disaster. > Most probably I should have stopped the source arbiter before doing that, > but the docs didn't mention it. > > Thus I always use reset-brick, as it never let me down. > > Best Regards, > Strahil Nikolov > On Apr 11, 2019 07:34, Karthik Subrahmanya wrote: > > Hi Strahil, > > Thank you for sharing your experience with reset-brick option. > Since he is using the gluster version 3.7.6, we do not have the > reset-brick [1] option implemented there. It is introduced in 3.9.0. He has > to go with replace-brick with the force option if he wants to use the same > path & name for the new brick. > Yes, it is recommended to have the new brick to be of the same size as > that of the other bricks. > > [1] > https://docs.gluster.org/en/latest/release-notes/3.9.0/#introducing-reset-brick-command > > Regards, > Karthik > > On Wed, Apr 10, 2019 at 10:31 PM Strahil wrote: > > I have used reset-brick - but I have just changed the brick layout. > You may give it a try, but I guess you need your new brick to have same > amount of space (or more). > > Maybe someone more experienced should share a more sound solution. > > Best Regards, > Strahil NikolovOn Apr 10, 2019 12:42, Martin Toth > wrote: > > > > Hi all, > > > > I am running replica 3 gluster with 3 bricks. One of my servers failed - > all disks are showing errors and raid is in fault state. > > > > Type: Replicate > > Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a > > Status: Started > > Number of Bricks: 1 x 3 = 3 > > Transport-type: tcp > > Bricks: > > Brick1: node1.san:/tank/gluster/gv0imagestore/brick1 > > Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 down > > Brick3: node3.san:/tank/gluster/gv0imagestore/brick1 > > > > So one of my bricks is totally failed (node2). It went down and all data > are lost (failed raid on node2). Now I am running only two bricks on 2 > servers out from 3. > > This is really critical problem for us, we can lost all data. I want to > add new disks to node2, create new raid array on them and try to replace > failed brick on this node. 
> > > > What is the procedure of replacing Brick2 on node2, can someone advice? > I can?t find anything relevant in documentation. > > > > Thanks in advance, > > Martin > > _______________________________________________ > > Gluster-users mailing list > > Gluster-users at gluster.org > > https://lists.gluster.org/mailman/listinfo/gluster-users > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From snowmailer at gmail.com Thu Apr 11 13:08:02 2019 From: snowmailer at gmail.com (Martin Toth) Date: Thu, 11 Apr 2019 15:08:02 +0200 Subject: [Gluster-users] Replica 3 - how to replace failed node (peer) In-Reply-To: References: <0917AF4A-76EC-4A9E-820F-E0ADA2DA899A@gmail.com> <1634978A-E849-48DB-A160-B1AC3DB56D38@gmail.com> <69E7C95F-8A81-46CB-8BD8-F66B582144EC@gmail.com> Message-ID: <00009213-6BF3-4A7F-AFA7-AC076B04496C@gmail.com> Hi Karthik, > On Thu, Apr 11, 2019 at 12:43 PM Martin Toth > wrote: > Hi Karthik, > > more over, I would like to ask if there are some recommended settings/parameters for SHD in order to achieve good or fair I/O while volume will be healed when I will replace Brick (this should trigger healing process). > If I understand you concern correctly, you need to get fair I/O performance for clients while healing takes place as part of the replace brick operation. For this you can turn off the "data-self-heal" and "metadata-self-heal" options until the heal completes on the new brick. This is exactly what I mean. I am running VM disks on remaining 2 (out of 3 - one failed as mentioned) nodes and I need to ensure there will be fair I/O performance available on these two nodes while replace brick operation will heal volume. I will not run any VMs on node where replace brick operation will be running. So if I understand correctly, when I will set : # gluster volume set cluster.data-self-heal off # gluster volume set cluster.metadata-self-heal off this will tell Gluster clients (libgfapi and FUSE mount) not to read from node ?where replace brick operation? is in place but from remaing two healthy nodes. Is this correct ? Thanks for clarification. > Turning off client side healing doesn't compromise data integrity and consistency. During the read request from client, pending xattr is evaluated for replica copies and read is only served from correct copy. During writes, IO will continue on both the replicas, SHD will take care of healing files. > After replacing the brick, we strongly recommend you to consider upgrading your gluster to one of the maintained versions. We have many stability related fixes there, which can handle some critical issues and corner cases which you could hit during these kind of scenarios. This will be first priority in infrastructure after fixing this cluster back to fully functional replica3. I will upgrade to 3.12.x and then to version 5 or 6. BR, Martin > Regards, > Karthik > I had some problems in past when healing was triggered, VM disks became unresponsive because healing took most of I/O. My volume containing only big files with VM disks. > > Thanks for suggestions. > BR, > Martin > >> On 10 Apr 2019, at 12:38, Martin Toth > wrote: >> >> Thanks, this looks ok to me, I will reset brick because I don't have any data anymore on failed node so I can use same path / brick name. >> >> Is reseting brick dangerous command? 
Should I be worried about some possible failure that will impact remaining two nodes? I am running really old 3.7.6 but stable version. >> >> Thanks, >> BR! >> >> Martin >> >> >>> On 10 Apr 2019, at 12:20, Karthik Subrahmanya > wrote: >>> >>> Hi Martin, >>> >>> After you add the new disks and creating raid array, you can run the following command to replace the old brick with new one: >>> >>> - If you are going to use a different name to the new brick you can run >>> gluster volume replace-brick commit force >>> >>> - If you are planning to use the same name for the new brick as well then you can use >>> gluster volume reset-brick commit force >>> Here old-brick & new-brick's hostname & path should be same. >>> >>> After replacing the brick, make sure the brick comes online using volume status. >>> Heal should automatically start, you can check the heal status to see all the files gets replicated to the newly added brick. If it does not start automatically, you can manually start that by running gluster volume heal . >>> >>> HTH, >>> Karthik >>> >>> On Wed, Apr 10, 2019 at 3:13 PM Martin Toth > wrote: >>> Hi all, >>> >>> I am running replica 3 gluster with 3 bricks. One of my servers failed - all disks are showing errors and raid is in fault state. >>> >>> Type: Replicate >>> Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a >>> Status: Started >>> Number of Bricks: 1 x 3 = 3 >>> Transport-type: tcp >>> Bricks: >>> Brick1: node1.san:/tank/gluster/gv0imagestore/brick1 >>> Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 >> Brick3: node3.san:/tank/gluster/gv0imagestore/brick1 >>> >>> So one of my bricks is totally failed (node2). It went down and all data are lost (failed raid on node2). Now I am running only two bricks on 2 servers out from 3. >>> This is really critical problem for us, we can lost all data. I want to add new disks to node2, create new raid array on them and try to replace failed brick on this node. >>> >>> What is the procedure of replacing Brick2 on node2, can someone advice? I can?t find anything relevant in documentation. >>> >>> Thanks in advance, >>> Martin >>> _______________________________________________ >>> Gluster-users mailing list >>> Gluster-users at gluster.org >>> https://lists.gluster.org/mailman/listinfo/gluster-users > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksubrahm at redhat.com Thu Apr 11 13:40:30 2019 From: ksubrahm at redhat.com (Karthik Subrahmanya) Date: Thu, 11 Apr 2019 19:10:30 +0530 Subject: [Gluster-users] Replica 3 - how to replace failed node (peer) In-Reply-To: <00009213-6BF3-4A7F-AFA7-AC076B04496C@gmail.com> References: <0917AF4A-76EC-4A9E-820F-E0ADA2DA899A@gmail.com> <1634978A-E849-48DB-A160-B1AC3DB56D38@gmail.com> <69E7C95F-8A81-46CB-8BD8-F66B582144EC@gmail.com> <00009213-6BF3-4A7F-AFA7-AC076B04496C@gmail.com> Message-ID: On Thu, Apr 11, 2019 at 6:38 PM Martin Toth wrote: > Hi Karthik, > > On Thu, Apr 11, 2019 at 12:43 PM Martin Toth wrote: > >> Hi Karthik, >> >> more over, I would like to ask if there are some recommended >> settings/parameters for SHD in order to achieve good or fair I/O while >> volume will be healed when I will replace Brick (this should trigger >> healing process). >> > If I understand you concern correctly, you need to get fair I/O > performance for clients while healing takes place as part of the replace > brick operation. For this you can turn off the "data-self-heal" and > "metadata-self-heal" options until the heal completes on the new brick. 
> > > This is exactly what I mean. I am running VM disks on remaining 2 (out of > 3 - one failed as mentioned) nodes and I need to ensure there will be fair > I/O performance available on these two nodes while replace brick operation > will heal volume. > I will not run any VMs on node where replace brick operation will be > running. So if I understand correctly, when I will set : > > # gluster volume set cluster.data-self-heal off > # gluster volume set cluster.metadata-self-heal off > > this will tell Gluster clients (libgfapi and FUSE mount) not to read from > node ?where replace brick operation? is in place but from remaing two > healthy nodes. Is this correct ? Thanks for clarification. > The reads will be served from one of the good bricks since the file will either be not present on the replaced brick at the time of read or it will be present but marked for heal if it is not already healed. If already healed by SHD, then it could be served from the new brick as well, but there won't be any problem in reading from there in that scenario. By setting these two options whenever a read comes from client it will not try to heal the file for data/metadata. Otherwise it would try to heal (if not already healed by SHD) when the read comes on this, hence slowing down the client. > > Turning off client side healing doesn't compromise data integrity and > consistency. During the read request from client, pending xattr is > evaluated for replica copies and read is only served from correct copy. > During writes, IO will continue on both the replicas, SHD will take care of > healing files. > After replacing the brick, we strongly recommend you to consider upgrading > your gluster to one of the maintained versions. We have many stability > related fixes there, which can handle some critical issues and corner cases > which you could hit during these kind of scenarios. > > > This will be first priority in infrastructure after fixing this cluster > back to fully functional replica3. I will upgrade to 3.12.x and then to > version 5 or 6. > Sounds good. If you are planning to have the same name for the new brick and if you get the error like "Brick may be containing or be contained by an existing brick" even after using the force option, try using a different name. That should work. Regards, Karthik > > BR, > Martin > > Regards, > Karthik > >> I had some problems in past when healing was triggered, VM disks became >> unresponsive because healing took most of I/O. My volume containing only >> big files with VM disks. >> >> Thanks for suggestions. >> BR, >> Martin >> >> On 10 Apr 2019, at 12:38, Martin Toth wrote: >> >> Thanks, this looks ok to me, I will reset brick because I don't have any >> data anymore on failed node so I can use same path / brick name. >> >> Is reseting brick dangerous command? Should I be worried about some >> possible failure that will impact remaining two nodes? I am running really >> old 3.7.6 but stable version. >> >> Thanks, >> BR! 
>> >> Martin >> >> >> On 10 Apr 2019, at 12:20, Karthik Subrahmanya >> wrote: >> >> Hi Martin, >> >> After you add the new disks and creating raid array, you can run the >> following command to replace the old brick with new one: >> >> - If you are going to use a different name to the new brick you can run >> gluster volume replace-brick commit >> force >> >> - If you are planning to use the same name for the new brick as well then >> you can use >> gluster volume reset-brick commit force >> Here old-brick & new-brick's hostname & path should be same. >> >> After replacing the brick, make sure the brick comes online using volume >> status. >> Heal should automatically start, you can check the heal status to see all >> the files gets replicated to the newly added brick. If it does not start >> automatically, you can manually start that by running gluster volume heal >> . >> >> HTH, >> Karthik >> >> On Wed, Apr 10, 2019 at 3:13 PM Martin Toth wrote: >> >>> Hi all, >>> >>> I am running replica 3 gluster with 3 bricks. One of my servers failed - >>> all disks are showing errors and raid is in fault state. >>> >>> Type: Replicate >>> Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a >>> Status: Started >>> Number of Bricks: 1 x 3 = 3 >>> Transport-type: tcp >>> Bricks: >>> Brick1: node1.san:/tank/gluster/gv0imagestore/brick1 >>> Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 >> down >>> Brick3: node3.san:/tank/gluster/gv0imagestore/brick1 >>> >>> So one of my bricks is totally failed (node2). It went down and all data >>> are lost (failed raid on node2). Now I am running only two bricks on 2 >>> servers out from 3. >>> This is really critical problem for us, we can lost all data. I want to >>> add new disks to node2, create new raid array on them and try to replace >>> failed brick on this node. >>> >>> What is the procedure of replacing Brick2 on node2, can someone advice? >>> I can?t find anything relevant in documentation. >>> >>> Thanks in advance, >>> Martin >>> _______________________________________________ >>> Gluster-users mailing list >>> Gluster-users at gluster.org >>> https://lists.gluster.org/mailman/listinfo/gluster-users >> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From benedikt.kaless at forumZFD.de Fri Apr 12 07:39:03 2019 From: benedikt.kaless at forumZFD.de (=?UTF-8?Q?Benedikt_Kale=c3=9f?=) Date: Fri, 12 Apr 2019 09:39:03 +0200 Subject: [Gluster-users] gluster mountbroker failed after upgrade to gluster 6 Message-ID: |Hi,| |I updated to gluster to gluster 6 and now the geo-replication remains in status "Faulty". | |If I run a "gluster-mountbroker status" I get: | |Traceback (most recent call last): ? File "/usr/sbin/gluster-mountbroker", line 396, in ??? runcli() ? File "/usr/lib/python3/dist-packages/gluster/cliutils/cliutils.py", line 225, in runcli ??? cls.run(args) ? File "/usr/sbin/gluster-mountbroker", line 275, in run ??? out = execute_in_peers("node-status") ? File "/usr/lib/python3/dist-packages/gluster/cliutils/cliutils.py", line 127, in execute_in_peers ??? raise GlusterCmdException((rc, out, err, " ".join(cmd))) gluster.cliutils.cliutils.GlusterCmdException: (1, '', 'Unable to end. Error : Success\n', 'gluster system:: execute mountbroker.py node-status') | |What can I do: set up the georeplication again?| |Best regards| |Benedikt | | | -- ?forumZFD Entschieden f?r Frieden|Committed to Peace Benedikt Kale? 
Leiter Team IT|Head team IT Forum Ziviler Friedensdienst e.V.|Forum Civil Peace Service Am K?lner Brett 8 | 50825 K?ln | Germany Tel 0221 91273233 | Fax 0221 91273299 | http://www.forumZFD.de Vorstand nach ? 26 BGB, einzelvertretungsberechtigt|Executive Board: Oliver Knabe (Vorsitz|Chair), Sonja Wiekenberg-Mlalandle, Alexander Mauz VR 17651 Amtsgericht K?ln Spenden|Donations: IBAN DE37 3702 0500 0008 2401 01 BIC BFSWDE33XXX From hunter86_bg at yahoo.com Fri Apr 12 08:12:31 2019 From: hunter86_bg at yahoo.com (Strahil Nikolov) Date: Fri, 12 Apr 2019 08:12:31 +0000 (UTC) Subject: [Gluster-users] Gluster snapshot fails References: <1941056132.2442645.1555056751876.ref@mail.yahoo.com> Message-ID: <1941056132.2442645.1555056751876@mail.yahoo.com> Hello All, I have tried to enable debug and see the reason for the issue. Here is the relevant glusterd.log: [2019-04-12 07:56:54.526508] E [MSGID: 106077] [glusterd-snapshot.c:1882:glusterd_is_thinp_brick] 0-management: Failed to get pool name for device systemd-1 [2019-04-12 07:56:54.527509] E [MSGID: 106121] [glusterd-snapshot.c:2523:glusterd_snapshot_create_prevalidate] 0-management: Failed to pre validate [2019-04-12 07:56:54.527525] E [MSGID: 106024] [glusterd-snapshot.c:2547:glusterd_snapshot_create_prevalidate] 0-management: Snapshot is supported only for thin provisioned LV. Ensure that all bricks of isos are thinly provisioned LV. [2019-04-12 07:56:54.527539] W [MSGID: 106029] [glusterd-snapshot.c:8613:glusterd_snapshot_prevalidate] 0-management: Snapshot create pre-validation failed [2019-04-12 07:56:54.527552] W [MSGID: 106121] [glusterd-mgmt.c:147:gd_mgmt_v3_pre_validate_fn] 0-management: Snapshot Prevalidate Failed [2019-04-12 07:56:54.527568] E [MSGID: 106121] [glusterd-mgmt.c:1015:glusterd_mgmt_v3_pre_validate] 0-management: Pre Validation failed for operation Snapshot on local node [2019-04-12 07:56:54.527583] E [MSGID: 106121] [glusterd-mgmt.c:2377:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Pre Validation Failed here is the output of lvscan & lvs: [root at ovirt1 ~]# lvscan ? ACTIVE??????????? '/dev/gluster_vg_md0/my_vdo_thinpool' [9.86 TiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_md0/gluster_lv_data' [500.00 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_md0/gluster_lv_isos' [50.00 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_ssd/my_ssd_thinpool' [168.59 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_ssd/gluster_lv_engine' [40.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt1/swap' [6.70 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt1/home' [1.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt1/root' [60.00 GiB] inherit [root at ovirt1 ~]# lvs --noheadings -o pool_lv ? my_vdo_thinpool ? my_vdo_thinpool ? my_ssd_thinpool [root at ovirt1 ~]# ssh ovirt2 "lvscan;lvs --noheadings -o pool_lv" ? ACTIVE??????????? '/dev/gluster_vg_md0/my_vdo_thinpool' [<9.77 TiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_md0/gluster_lv_data' [500.00 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_md0/gluster_lv_isos' [50.00 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_ssd/my_ssd_thinpool' [<161.40 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_ssd/gluster_lv_engine' [40.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt2/root' [15.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt2/home' [1.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt2/swap' [16.00 GiB] inherit ? my_vdo_thinpool ? my_vdo_thinpool ? my_ssd_thinpool [root at ovirt1 ~]# ssh ovirt3 "lvscan;lvs --noheadings -o pool_lv" ? 
ACTIVE??????????? '/dev/gluster_vg_sda3/gluster_thinpool_sda3' [41.00 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_sda3/gluster_lv_data' [15.00 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_sda3/gluster_lv_isos' [15.00 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_sda3/gluster_lv_engine' [15.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt3/root' [20.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt3/home' [1.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt3/swap' [8.00 GiB] inherit ? gluster_thinpool_sda3 ? gluster_thinpool_sda3 ? gluster_thinpool_sda3 I am mounting my bricks via systemd , as I have issues with bricks being started before VDO. [root at ovirt1 ~]# findmnt /gluster_bricks/isos TARGET?????????????? SOURCE???????????????????????????????????? FSTYPE OPTIONS /gluster_bricks/isos systemd-1????????????????????????????????? autofs rw,relatime,fd=31,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=21843 /gluster_bricks/isos /dev/mapper/gluster_vg_md0-gluster_lv_isos xfs??? rw,noatime,nodiratime,seclabel,attr2,inode64,noquota [root at ovirt1 ~]# ssh ovirt2 "findmnt /gluster_bricks/isos " TARGET?????????????? SOURCE???????????????????????????????????? FSTYPE OPTIONS /gluster_bricks/isos systemd-1????????????????????????????????? autofs rw,relatime,fd=26,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=14279 /gluster_bricks/isos /dev/mapper/gluster_vg_md0-gluster_lv_isos xfs??? rw,noatime,nodiratime,seclabel,attr2,inode64,noquota [root at ovirt1 ~]# ssh ovirt3 "findmnt /gluster_bricks/isos " TARGET?????????????? SOURCE????????????????????????????????????? FSTYPE OPTIONS /gluster_bricks/isos systemd-1?????????????????????????????????? autofs rw,relatime,fd=35,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=17770 /gluster_bricks/isos /dev/mapper/gluster_vg_sda3-gluster_lv_isos xfs??? rw,noatime,nodiratime,seclabel,attr2,inode64,logbsize=256k,sunit=512,swidth=1024,noquota [root at ovirt1 ~]# grep "gluster_bricks" /proc/mounts systemd-1 /gluster_bricks/data autofs rw,relatime,fd=22,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=21513 0 0 systemd-1 /gluster_bricks/engine autofs rw,relatime,fd=25,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=21735 0 0 systemd-1 /gluster_bricks/isos autofs rw,relatime,fd=31,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=21843 0 0 /dev/mapper/gluster_vg_ssd-gluster_lv_engine /gluster_bricks/engine xfs rw,seclabel,noatime,nodiratime,attr2,inode64,sunit=256,swidth=256,noquota 0 0 /dev/mapper/gluster_vg_md0-gluster_lv_isos /gluster_bricks/isos xfs rw,seclabel,noatime,nodiratime,attr2,inode64,noquota 0 0 /dev/mapper/gluster_vg_md0-gluster_lv_data /gluster_bricks/data xfs rw,seclabel,noatime,nodiratime,attr2,inode64,noquota 0 0 Obviously , gluster is catching "systemd-1" as a device and tries to check if it's a thin LV.Where should I open a bug for that ? P.S.: Adding oVirt User list. Best Regards,Strahil Nikolov ? ?????????, 11 ????? 2019 ?., 4:00:31 ?. ???????-4, Strahil Nikolov ??????: Hi Rafi, thanks for your update. 
I have tested again with another gluster volume.[root at ovirt1 glusterfs]# gluster volume info isos Volume Name: isos Type: Replicate Volume ID: 9b92b5bd-79f5-427b-bd8d-af28b038ed2a Status: Started Snapshot Count: 0 Number of Bricks: 1 x (2 + 1) = 3 Transport-type: tcp Bricks: Brick1: ovirt1:/gluster_bricks/isos/isos Brick2: ovirt2:/gluster_bricks/isos/isos Brick3: ovirt3.localdomain:/gluster_bricks/isos/isos (arbiter) Options Reconfigured: cluster.granular-entry-heal: enable performance.strict-o-direct: on network.ping-timeout: 30 storage.owner-gid: 36 storage.owner-uid: 36 user.cifs: off features.shard: on cluster.shd-wait-qlength: 10000 cluster.shd-max-threads: 8 cluster.locking-scheme: granular cluster.data-self-heal-algorithm: full cluster.server-quorum-type: server cluster.quorum-type: auto cluster.eager-lock: enable network.remote-dio: off performance.low-prio-threads: 32 performance.io-cache: off performance.read-ahead: off performance.quick-read: off transport.address-family: inet nfs.disable: on performance.client-io-threads: off cluster.enable-shared-storage: enable Command run: logrotate -f glusterfs ; logrotate -f glusterfs-georep;? gluster snapshot create isos-snap-2019-04-11 isos? description TEST Logs:[root at ovirt1 glusterfs]# cat cli.log [2019-04-11 07:51:02.367453] I [cli.c:769:main] 0-cli: Started running gluster with version 5.5 [2019-04-11 07:51:02.486863] I [MSGID: 101190] [event-epoll.c:621:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2019-04-11 07:51:02.556813] E [cli-rpc-ops.c:11293:gf_cli_snapshot] 0-cli: cli_to_glusterd for snapshot failed [2019-04-11 07:51:02.556880] I [input.c:31:cli_batch] 0-: Exiting with: -1 [root at ovirt1 glusterfs]# cat glusterd.log [2019-04-11 07:51:02.553357] E [MSGID: 106024] [glusterd-snapshot.c:2547:glusterd_snapshot_create_prevalidate] 0-management: Snapshot is supported only for thin provisioned LV. Ensure that all bricks of isos are thinly provisioned LV. [2019-04-11 07:51:02.553365] W [MSGID: 106029] [glusterd-snapshot.c:8613:glusterd_snapshot_prevalidate] 0-management: Snapshot create pre-validation failed [2019-04-11 07:51:02.553703] W [MSGID: 106121] [glusterd-mgmt.c:147:gd_mgmt_v3_pre_validate_fn] 0-management: Snapshot Prevalidate Failed [2019-04-11 07:51:02.553719] E [MSGID: 106121] [glusterd-mgmt.c:1015:glusterd_mgmt_v3_pre_validate] 0-management: Pre Validation failed for operation Snapshot on local node My LVs hosting the bricks are:[root at ovirt1 ~]# lvs gluster_vg_md0 ? LV????????????? VG???????????? Attr?????? LSize?? Pool??????????? Origin Data%? Meta%? Move Log Cpy%Sync Convert ? gluster_lv_data gluster_vg_md0 Vwi-aot--- 500.00g my_vdo_thinpool??????? 35.97 ? gluster_lv_isos gluster_vg_md0 Vwi-aot---? 50.00g my_vdo_thinpool??????? 52.11 ? my_vdo_thinpool gluster_vg_md0 twi-aot---?? 9.86t??????????????????????? 2.04?? 11.45 [root at ovirt1 ~]# ssh ovirt2 "lvs gluster_vg_md0" ? LV????????????? VG???????????? Attr?????? LSize?? Pool??????????? Origin Data%? Meta%? Move Log Cpy%Sync Convert ? gluster_lv_data gluster_vg_md0 Vwi-aot--- 500.00g my_vdo_thinpool??????? 35.98 ? gluster_lv_isos gluster_vg_md0 Vwi-aot---? 50.00g my_vdo_thinpool??????? 25.94 ? my_vdo_thinpool gluster_vg_md0 twi-aot---? <9.77t??????????????????????? 1.93?? 11.39 [root at ovirt1 ~]# ssh ovirt3 "lvs gluster_vg_sda3" ? LV??????????????????? VG????????????? Attr?????? LSize? Pool????????????????? Origin Data%? Meta%? Move Log Cpy%Sync Convert ? gluster_lv_data?????? 
gluster_vg_sda3 Vwi-aotz-- 15.00g gluster_thinpool_sda3??????? 0.17 ? gluster_lv_engine???? gluster_vg_sda3 Vwi-aotz-- 15.00g gluster_thinpool_sda3??????? 0.16 ? gluster_lv_isos?????? gluster_vg_sda3 Vwi-aotz-- 15.00g gluster_thinpool_sda3??????? 0.12 ? gluster_thinpool_sda3 gluster_vg_sda3 twi-aotz-- 41.00g????????????????????????????? 0.16?? 1.58 As you can see - all bricks are thin LV and space is not the issue. Can someone hint me how to enable debug , so gluster logs can show the reason for that pre-check failure ? Best Regards,Strahil Nikolov ? ?????, 10 ????? 2019 ?., 9:05:15 ?. ???????-4, Rafi Kavungal Chundattu Parambil ??????: Hi Strahil, The name of device is not at all a problem here. Can you please check the log of glusterd, and see if there is any useful information about the failure. Also please provide the output of `lvscan` and `lvs --noheadings -o pool_lv` from all nodes Regards Rafi KC ----- Original Message ----- From: "Strahil Nikolov" To: gluster-users at gluster.org Sent: Wednesday, April 10, 2019 2:36:39 AM Subject: [Gluster-users] Gluster snapshot fails Hello Community, I have a problem running a snapshot of a replica 3 arbiter 1 volume. Error: [root at ovirt2 ~]# gluster snapshot create before-423 engine description "Before upgrade of engine from 4.2.2 to 4.2.3" snapshot create: failed: Snapshot is supported only for thin provisioned LV. Ensure that all bricks of engine are thinly provisioned LV. Snapshot command failed Volume info: Volume Name: engine Type: Replicate Volume ID: 30ca1cc2-f2f7-4749-9e2e-cee9d7099ded Status: Started Snapshot Count: 0 Number of Bricks: 1 x (2 + 1) = 3 Transport-type: tcp Bricks: Brick1: ovirt1:/gluster_bricks/engine/engine Brick2: ovirt2:/gluster_bricks/engine/engine Brick3: ovirt3:/gluster_bricks/engine/engine (arbiter) Options Reconfigured: cluster.granular-entry-heal: enable performance.strict-o-direct: on network.ping-timeout: 30 storage.owner-gid: 36 storage.owner-uid: 36 user.cifs: off features.shard: on cluster.shd-wait-qlength: 10000 cluster.shd-max-threads: 8 cluster.locking-scheme: granular cluster.data-self-heal-algorithm: full cluster.server-quorum-type: server cluster.quorum-type: auto cluster.eager-lock: enable network.remote-dio: off performance.low-prio-threads: 32 performance.io-cache: off performance.read-ahead: off performance.quick-read: off transport.address-family: inet nfs.disable: on performance.client-io-threads: off cluster.enable-shared-storage: enable All bricks are on thin lvm with plenty of space, the only thing that could be causing it is that ovirt1 & ovirt2 are on /dev/gluster_vg_ssd/gluster_lv_engine , while arbiter is on /dev/gluster_vg_sda3/gluster_lv_engine. Is that the issue ? Should I rename my brick's VG ? If so, why there is no mentioning in the documentation ? Best Regards, Strahil Nikolov _______________________________________________ Gluster-users mailing list Gluster-users at gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From hunter86_bg at yahoo.com Fri Apr 12 08:32:18 2019 From: hunter86_bg at yahoo.com (Strahil Nikolov) Date: Fri, 12 Apr 2019 08:32:18 +0000 (UTC) Subject: [Gluster-users] Gluster snapshot fails In-Reply-To: <1941056132.2442645.1555056751876@mail.yahoo.com> References: <1941056132.2442645.1555056751876.ref@mail.yahoo.com> <1941056132.2442645.1555056751876@mail.yahoo.com> Message-ID: <96435198.542814.1555057938954@mail.yahoo.com> Hello All, it seems that "systemd-1" is from the automount unit , and not from the systemd unit. [root at ovirt1 system]# systemctl cat gluster_bricks-isos.automount # /etc/systemd/system/gluster_bricks-isos.automount [Unit] Description=automount for gluster brick ISOS [Automount] Where=/gluster_bricks/isos [Install] WantedBy=multi-user.target Best Regards,Strahil Nikolov ? ?????, 12 ????? 2019 ?., 4:12:31 ?. ???????-4, Strahil Nikolov ??????: Hello All, I have tried to enable debug and see the reason for the issue. Here is the relevant glusterd.log: [2019-04-12 07:56:54.526508] E [MSGID: 106077] [glusterd-snapshot.c:1882:glusterd_is_thinp_brick] 0-management: Failed to get pool name for device systemd-1 [2019-04-12 07:56:54.527509] E [MSGID: 106121] [glusterd-snapshot.c:2523:glusterd_snapshot_create_prevalidate] 0-management: Failed to pre validate [2019-04-12 07:56:54.527525] E [MSGID: 106024] [glusterd-snapshot.c:2547:glusterd_snapshot_create_prevalidate] 0-management: Snapshot is supported only for thin provisioned LV. Ensure that all bricks of isos are thinly provisioned LV. [2019-04-12 07:56:54.527539] W [MSGID: 106029] [glusterd-snapshot.c:8613:glusterd_snapshot_prevalidate] 0-management: Snapshot create pre-validation failed [2019-04-12 07:56:54.527552] W [MSGID: 106121] [glusterd-mgmt.c:147:gd_mgmt_v3_pre_validate_fn] 0-management: Snapshot Prevalidate Failed [2019-04-12 07:56:54.527568] E [MSGID: 106121] [glusterd-mgmt.c:1015:glusterd_mgmt_v3_pre_validate] 0-management: Pre Validation failed for operation Snapshot on local node [2019-04-12 07:56:54.527583] E [MSGID: 106121] [glusterd-mgmt.c:2377:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Pre Validation Failed here is the output of lvscan & lvs: [root at ovirt1 ~]# lvscan ? ACTIVE??????????? '/dev/gluster_vg_md0/my_vdo_thinpool' [9.86 TiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_md0/gluster_lv_data' [500.00 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_md0/gluster_lv_isos' [50.00 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_ssd/my_ssd_thinpool' [168.59 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_ssd/gluster_lv_engine' [40.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt1/swap' [6.70 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt1/home' [1.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt1/root' [60.00 GiB] inherit [root at ovirt1 ~]# lvs --noheadings -o pool_lv ? my_vdo_thinpool ? my_vdo_thinpool ? my_ssd_thinpool [root at ovirt1 ~]# ssh ovirt2 "lvscan;lvs --noheadings -o pool_lv" ? ACTIVE??????????? '/dev/gluster_vg_md0/my_vdo_thinpool' [<9.77 TiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_md0/gluster_lv_data' [500.00 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_md0/gluster_lv_isos' [50.00 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_ssd/my_ssd_thinpool' [<161.40 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_ssd/gluster_lv_engine' [40.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt2/root' [15.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt2/home' [1.00 GiB] inherit ? ACTIVE??????????? 
'/dev/centos_ovirt2/swap' [16.00 GiB] inherit ? my_vdo_thinpool ? my_vdo_thinpool ? my_ssd_thinpool [root at ovirt1 ~]# ssh ovirt3 "lvscan;lvs --noheadings -o pool_lv" ? ACTIVE??????????? '/dev/gluster_vg_sda3/gluster_thinpool_sda3' [41.00 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_sda3/gluster_lv_data' [15.00 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_sda3/gluster_lv_isos' [15.00 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_sda3/gluster_lv_engine' [15.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt3/root' [20.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt3/home' [1.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt3/swap' [8.00 GiB] inherit ? gluster_thinpool_sda3 ? gluster_thinpool_sda3 ? gluster_thinpool_sda3 I am mounting my bricks via systemd , as I have issues with bricks being started before VDO. [root at ovirt1 ~]# findmnt /gluster_bricks/isos TARGET?????????????? SOURCE???????????????????????????????????? FSTYPE OPTIONS /gluster_bricks/isos systemd-1????????????????????????????????? autofs rw,relatime,fd=31,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=21843 /gluster_bricks/isos /dev/mapper/gluster_vg_md0-gluster_lv_isos xfs??? rw,noatime,nodiratime,seclabel,attr2,inode64,noquota [root at ovirt1 ~]# ssh ovirt2 "findmnt /gluster_bricks/isos " TARGET?????????????? SOURCE???????????????????????????????????? FSTYPE OPTIONS /gluster_bricks/isos systemd-1????????????????????????????????? autofs rw,relatime,fd=26,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=14279 /gluster_bricks/isos /dev/mapper/gluster_vg_md0-gluster_lv_isos xfs??? rw,noatime,nodiratime,seclabel,attr2,inode64,noquota [root at ovirt1 ~]# ssh ovirt3 "findmnt /gluster_bricks/isos " TARGET?????????????? SOURCE????????????????????????????????????? FSTYPE OPTIONS /gluster_bricks/isos systemd-1?????????????????????????????????? autofs rw,relatime,fd=35,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=17770 /gluster_bricks/isos /dev/mapper/gluster_vg_sda3-gluster_lv_isos xfs??? rw,noatime,nodiratime,seclabel,attr2,inode64,logbsize=256k,sunit=512,swidth=1024,noquota [root at ovirt1 ~]# grep "gluster_bricks" /proc/mounts systemd-1 /gluster_bricks/data autofs rw,relatime,fd=22,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=21513 0 0 systemd-1 /gluster_bricks/engine autofs rw,relatime,fd=25,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=21735 0 0 systemd-1 /gluster_bricks/isos autofs rw,relatime,fd=31,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=21843 0 0 /dev/mapper/gluster_vg_ssd-gluster_lv_engine /gluster_bricks/engine xfs rw,seclabel,noatime,nodiratime,attr2,inode64,sunit=256,swidth=256,noquota 0 0 /dev/mapper/gluster_vg_md0-gluster_lv_isos /gluster_bricks/isos xfs rw,seclabel,noatime,nodiratime,attr2,inode64,noquota 0 0 /dev/mapper/gluster_vg_md0-gluster_lv_data /gluster_bricks/data xfs rw,seclabel,noatime,nodiratime,attr2,inode64,noquota 0 0 Obviously , gluster is catching "systemd-1" as a device and tries to check if it's a thin LV.Where should I open a bug for that ? P.S.: Adding oVirt User list. Best Regards,Strahil Nikolov ? ?????????, 11 ????? 2019 ?., 4:00:31 ?. ???????-4, Strahil Nikolov ??????: Hi Rafi, thanks for your update. 
I have tested again with another gluster volume.[root at ovirt1 glusterfs]# gluster volume info isos Volume Name: isos Type: Replicate Volume ID: 9b92b5bd-79f5-427b-bd8d-af28b038ed2a Status: Started Snapshot Count: 0 Number of Bricks: 1 x (2 + 1) = 3 Transport-type: tcp Bricks: Brick1: ovirt1:/gluster_bricks/isos/isos Brick2: ovirt2:/gluster_bricks/isos/isos Brick3: ovirt3.localdomain:/gluster_bricks/isos/isos (arbiter) Options Reconfigured: cluster.granular-entry-heal: enable performance.strict-o-direct: on network.ping-timeout: 30 storage.owner-gid: 36 storage.owner-uid: 36 user.cifs: off features.shard: on cluster.shd-wait-qlength: 10000 cluster.shd-max-threads: 8 cluster.locking-scheme: granular cluster.data-self-heal-algorithm: full cluster.server-quorum-type: server cluster.quorum-type: auto cluster.eager-lock: enable network.remote-dio: off performance.low-prio-threads: 32 performance.io-cache: off performance.read-ahead: off performance.quick-read: off transport.address-family: inet nfs.disable: on performance.client-io-threads: off cluster.enable-shared-storage: enable Command run: logrotate -f glusterfs ; logrotate -f glusterfs-georep;? gluster snapshot create isos-snap-2019-04-11 isos? description TEST Logs:[root at ovirt1 glusterfs]# cat cli.log [2019-04-11 07:51:02.367453] I [cli.c:769:main] 0-cli: Started running gluster with version 5.5 [2019-04-11 07:51:02.486863] I [MSGID: 101190] [event-epoll.c:621:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2019-04-11 07:51:02.556813] E [cli-rpc-ops.c:11293:gf_cli_snapshot] 0-cli: cli_to_glusterd for snapshot failed [2019-04-11 07:51:02.556880] I [input.c:31:cli_batch] 0-: Exiting with: -1 [root at ovirt1 glusterfs]# cat glusterd.log [2019-04-11 07:51:02.553357] E [MSGID: 106024] [glusterd-snapshot.c:2547:glusterd_snapshot_create_prevalidate] 0-management: Snapshot is supported only for thin provisioned LV. Ensure that all bricks of isos are thinly provisioned LV. [2019-04-11 07:51:02.553365] W [MSGID: 106029] [glusterd-snapshot.c:8613:glusterd_snapshot_prevalidate] 0-management: Snapshot create pre-validation failed [2019-04-11 07:51:02.553703] W [MSGID: 106121] [glusterd-mgmt.c:147:gd_mgmt_v3_pre_validate_fn] 0-management: Snapshot Prevalidate Failed [2019-04-11 07:51:02.553719] E [MSGID: 106121] [glusterd-mgmt.c:1015:glusterd_mgmt_v3_pre_validate] 0-management: Pre Validation failed for operation Snapshot on local node My LVs hosting the bricks are:[root at ovirt1 ~]# lvs gluster_vg_md0 ? LV????????????? VG???????????? Attr?????? LSize?? Pool??????????? Origin Data%? Meta%? Move Log Cpy%Sync Convert ? gluster_lv_data gluster_vg_md0 Vwi-aot--- 500.00g my_vdo_thinpool??????? 35.97 ? gluster_lv_isos gluster_vg_md0 Vwi-aot---? 50.00g my_vdo_thinpool??????? 52.11 ? my_vdo_thinpool gluster_vg_md0 twi-aot---?? 9.86t??????????????????????? 2.04?? 11.45 [root at ovirt1 ~]# ssh ovirt2 "lvs gluster_vg_md0" ? LV????????????? VG???????????? Attr?????? LSize?? Pool??????????? Origin Data%? Meta%? Move Log Cpy%Sync Convert ? gluster_lv_data gluster_vg_md0 Vwi-aot--- 500.00g my_vdo_thinpool??????? 35.98 ? gluster_lv_isos gluster_vg_md0 Vwi-aot---? 50.00g my_vdo_thinpool??????? 25.94 ? my_vdo_thinpool gluster_vg_md0 twi-aot---? <9.77t??????????????????????? 1.93?? 11.39 [root at ovirt1 ~]# ssh ovirt3 "lvs gluster_vg_sda3" ? LV??????????????????? VG????????????? Attr?????? LSize? Pool????????????????? Origin Data%? Meta%? Move Log Cpy%Sync Convert ? gluster_lv_data?????? 
gluster_vg_sda3 Vwi-aotz-- 15.00g gluster_thinpool_sda3??????? 0.17 ? gluster_lv_engine???? gluster_vg_sda3 Vwi-aotz-- 15.00g gluster_thinpool_sda3??????? 0.16 ? gluster_lv_isos?????? gluster_vg_sda3 Vwi-aotz-- 15.00g gluster_thinpool_sda3??????? 0.12 ? gluster_thinpool_sda3 gluster_vg_sda3 twi-aotz-- 41.00g????????????????????????????? 0.16?? 1.58 As you can see - all bricks are thin LV and space is not the issue. Can someone hint me how to enable debug , so gluster logs can show the reason for that pre-check failure ? Best Regards,Strahil Nikolov ? ?????, 10 ????? 2019 ?., 9:05:15 ?. ???????-4, Rafi Kavungal Chundattu Parambil ??????: Hi Strahil, The name of device is not at all a problem here. Can you please check the log of glusterd, and see if there is any useful information about the failure. Also please provide the output of `lvscan` and `lvs --noheadings -o pool_lv` from all nodes Regards Rafi KC ----- Original Message ----- From: "Strahil Nikolov" To: gluster-users at gluster.org Sent: Wednesday, April 10, 2019 2:36:39 AM Subject: [Gluster-users] Gluster snapshot fails Hello Community, I have a problem running a snapshot of a replica 3 arbiter 1 volume. Error: [root at ovirt2 ~]# gluster snapshot create before-423 engine description "Before upgrade of engine from 4.2.2 to 4.2.3" snapshot create: failed: Snapshot is supported only for thin provisioned LV. Ensure that all bricks of engine are thinly provisioned LV. Snapshot command failed Volume info: Volume Name: engine Type: Replicate Volume ID: 30ca1cc2-f2f7-4749-9e2e-cee9d7099ded Status: Started Snapshot Count: 0 Number of Bricks: 1 x (2 + 1) = 3 Transport-type: tcp Bricks: Brick1: ovirt1:/gluster_bricks/engine/engine Brick2: ovirt2:/gluster_bricks/engine/engine Brick3: ovirt3:/gluster_bricks/engine/engine (arbiter) Options Reconfigured: cluster.granular-entry-heal: enable performance.strict-o-direct: on network.ping-timeout: 30 storage.owner-gid: 36 storage.owner-uid: 36 user.cifs: off features.shard: on cluster.shd-wait-qlength: 10000 cluster.shd-max-threads: 8 cluster.locking-scheme: granular cluster.data-self-heal-algorithm: full cluster.server-quorum-type: server cluster.quorum-type: auto cluster.eager-lock: enable network.remote-dio: off performance.low-prio-threads: 32 performance.io-cache: off performance.read-ahead: off performance.quick-read: off transport.address-family: inet nfs.disable: on performance.client-io-threads: off cluster.enable-shared-storage: enable All bricks are on thin lvm with plenty of space, the only thing that could be causing it is that ovirt1 & ovirt2 are on /dev/gluster_vg_ssd/gluster_lv_engine , while arbiter is on /dev/gluster_vg_sda3/gluster_lv_engine. Is that the issue ? Should I rename my brick's VG ? If so, why there is no mentioning in the documentation ? Best Regards, Strahil Nikolov _______________________________________________ Gluster-users mailing list Gluster-users at gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From hunter86_bg at yahoo.com Fri Apr 12 11:32:41 2019 From: hunter86_bg at yahoo.com (Strahil Nikolov) Date: Fri, 12 Apr 2019 11:32:41 +0000 (UTC) Subject: [Gluster-users] Gluster snapshot fails In-Reply-To: <96435198.542814.1555057938954@mail.yahoo.com> References: <1941056132.2442645.1555056751876.ref@mail.yahoo.com> <1941056132.2442645.1555056751876@mail.yahoo.com> <96435198.542814.1555057938954@mail.yahoo.com> Message-ID: <1106680361.905173.1555068761758@mail.yahoo.com> Hi All, I have tested gluster snapshot without systemd.automount units and it works as follows: [root at ovirt1 system]# gluster snapshot create isos-snap-2019-04-11 isos? description TEST snapshot create: success: Snap isos-snap-2019-04-11_GMT-2019.04.12-11.18.24 created successfully [root at ovirt1 system]# gluster snapshot list isos-snap-2019-04-11_GMT-2019.04.12-11.18.24 [root at ovirt1 system]# gluster snapshot info isos-snap-2019-04-11_GMT-2019.04.12-11.18.24 Snapshot????????????????? : isos-snap-2019-04-11_GMT-2019.04.12-11.18.24 Snap UUID???????????????? : 70d5716e-4633-43d4-a562-8e29a96b0104 Description?????????????? : TEST Created?????????????????? : 2019-04-12 11:18:24 Snap Volumes: ??????? Snap Volume Name????????? : 584e88eab0374c0582cc544a2bc4b79e ??????? Origin Volume name??????? : isos ??????? Snaps taken for isos????? : 1 ??????? Snaps available for isos? : 255 ??????? Status??????????????????? : Stopped Best Regards,Strahil Nikolov ? ?????, 12 ????? 2019 ?., 4:32:18 ?. ???????-4, Strahil Nikolov ??????: Hello All, it seems that "systemd-1" is from the automount unit , and not from the systemd unit. [root at ovirt1 system]# systemctl cat gluster_bricks-isos.automount # /etc/systemd/system/gluster_bricks-isos.automount [Unit] Description=automount for gluster brick ISOS [Automount] Where=/gluster_bricks/isos [Install] WantedBy=multi-user.target Best Regards,Strahil Nikolov ? ?????, 12 ????? 2019 ?., 4:12:31 ?. ???????-4, Strahil Nikolov ??????: Hello All, I have tried to enable debug and see the reason for the issue. Here is the relevant glusterd.log: [2019-04-12 07:56:54.526508] E [MSGID: 106077] [glusterd-snapshot.c:1882:glusterd_is_thinp_brick] 0-management: Failed to get pool name for device systemd-1 [2019-04-12 07:56:54.527509] E [MSGID: 106121] [glusterd-snapshot.c:2523:glusterd_snapshot_create_prevalidate] 0-management: Failed to pre validate [2019-04-12 07:56:54.527525] E [MSGID: 106024] [glusterd-snapshot.c:2547:glusterd_snapshot_create_prevalidate] 0-management: Snapshot is supported only for thin provisioned LV. Ensure that all bricks of isos are thinly provisioned LV. [2019-04-12 07:56:54.527539] W [MSGID: 106029] [glusterd-snapshot.c:8613:glusterd_snapshot_prevalidate] 0-management: Snapshot create pre-validation failed [2019-04-12 07:56:54.527552] W [MSGID: 106121] [glusterd-mgmt.c:147:gd_mgmt_v3_pre_validate_fn] 0-management: Snapshot Prevalidate Failed [2019-04-12 07:56:54.527568] E [MSGID: 106121] [glusterd-mgmt.c:1015:glusterd_mgmt_v3_pre_validate] 0-management: Pre Validation failed for operation Snapshot on local node [2019-04-12 07:56:54.527583] E [MSGID: 106121] [glusterd-mgmt.c:2377:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Pre Validation Failed here is the output of lvscan & lvs: [root at ovirt1 ~]# lvscan ? ACTIVE??????????? '/dev/gluster_vg_md0/my_vdo_thinpool' [9.86 TiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_md0/gluster_lv_data' [500.00 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_md0/gluster_lv_isos' [50.00 GiB] inherit ? ACTIVE??????????? 
'/dev/gluster_vg_ssd/my_ssd_thinpool' [168.59 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_ssd/gluster_lv_engine' [40.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt1/swap' [6.70 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt1/home' [1.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt1/root' [60.00 GiB] inherit [root at ovirt1 ~]# lvs --noheadings -o pool_lv ? my_vdo_thinpool ? my_vdo_thinpool ? my_ssd_thinpool [root at ovirt1 ~]# ssh ovirt2 "lvscan;lvs --noheadings -o pool_lv" ? ACTIVE??????????? '/dev/gluster_vg_md0/my_vdo_thinpool' [<9.77 TiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_md0/gluster_lv_data' [500.00 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_md0/gluster_lv_isos' [50.00 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_ssd/my_ssd_thinpool' [<161.40 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_ssd/gluster_lv_engine' [40.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt2/root' [15.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt2/home' [1.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt2/swap' [16.00 GiB] inherit ? my_vdo_thinpool ? my_vdo_thinpool ? my_ssd_thinpool [root at ovirt1 ~]# ssh ovirt3 "lvscan;lvs --noheadings -o pool_lv" ? ACTIVE??????????? '/dev/gluster_vg_sda3/gluster_thinpool_sda3' [41.00 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_sda3/gluster_lv_data' [15.00 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_sda3/gluster_lv_isos' [15.00 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_sda3/gluster_lv_engine' [15.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt3/root' [20.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt3/home' [1.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt3/swap' [8.00 GiB] inherit ? gluster_thinpool_sda3 ? gluster_thinpool_sda3 ? gluster_thinpool_sda3 I am mounting my bricks via systemd , as I have issues with bricks being started before VDO. [root at ovirt1 ~]# findmnt /gluster_bricks/isos TARGET?????????????? SOURCE???????????????????????????????????? FSTYPE OPTIONS /gluster_bricks/isos systemd-1????????????????????????????????? autofs rw,relatime,fd=31,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=21843 /gluster_bricks/isos /dev/mapper/gluster_vg_md0-gluster_lv_isos xfs??? rw,noatime,nodiratime,seclabel,attr2,inode64,noquota [root at ovirt1 ~]# ssh ovirt2 "findmnt /gluster_bricks/isos " TARGET?????????????? SOURCE???????????????????????????????????? FSTYPE OPTIONS /gluster_bricks/isos systemd-1????????????????????????????????? autofs rw,relatime,fd=26,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=14279 /gluster_bricks/isos /dev/mapper/gluster_vg_md0-gluster_lv_isos xfs??? rw,noatime,nodiratime,seclabel,attr2,inode64,noquota [root at ovirt1 ~]# ssh ovirt3 "findmnt /gluster_bricks/isos " TARGET?????????????? SOURCE????????????????????????????????????? FSTYPE OPTIONS /gluster_bricks/isos systemd-1?????????????????????????????????? autofs rw,relatime,fd=35,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=17770 /gluster_bricks/isos /dev/mapper/gluster_vg_sda3-gluster_lv_isos xfs??? 
rw,noatime,nodiratime,seclabel,attr2,inode64,logbsize=256k,sunit=512,swidth=1024,noquota [root at ovirt1 ~]# grep "gluster_bricks" /proc/mounts systemd-1 /gluster_bricks/data autofs rw,relatime,fd=22,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=21513 0 0 systemd-1 /gluster_bricks/engine autofs rw,relatime,fd=25,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=21735 0 0 systemd-1 /gluster_bricks/isos autofs rw,relatime,fd=31,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=21843 0 0 /dev/mapper/gluster_vg_ssd-gluster_lv_engine /gluster_bricks/engine xfs rw,seclabel,noatime,nodiratime,attr2,inode64,sunit=256,swidth=256,noquota 0 0 /dev/mapper/gluster_vg_md0-gluster_lv_isos /gluster_bricks/isos xfs rw,seclabel,noatime,nodiratime,attr2,inode64,noquota 0 0 /dev/mapper/gluster_vg_md0-gluster_lv_data /gluster_bricks/data xfs rw,seclabel,noatime,nodiratime,attr2,inode64,noquota 0 0 Obviously , gluster is catching "systemd-1" as a device and tries to check if it's a thin LV.Where should I open a bug for that ? P.S.: Adding oVirt User list. Best Regards,Strahil Nikolov ? ?????????, 11 ????? 2019 ?., 4:00:31 ?. ???????-4, Strahil Nikolov ??????: Hi Rafi, thanks for your update. I have tested again with another gluster volume.[root at ovirt1 glusterfs]# gluster volume info isos Volume Name: isos Type: Replicate Volume ID: 9b92b5bd-79f5-427b-bd8d-af28b038ed2a Status: Started Snapshot Count: 0 Number of Bricks: 1 x (2 + 1) = 3 Transport-type: tcp Bricks: Brick1: ovirt1:/gluster_bricks/isos/isos Brick2: ovirt2:/gluster_bricks/isos/isos Brick3: ovirt3.localdomain:/gluster_bricks/isos/isos (arbiter) Options Reconfigured: cluster.granular-entry-heal: enable performance.strict-o-direct: on network.ping-timeout: 30 storage.owner-gid: 36 storage.owner-uid: 36 user.cifs: off features.shard: on cluster.shd-wait-qlength: 10000 cluster.shd-max-threads: 8 cluster.locking-scheme: granular cluster.data-self-heal-algorithm: full cluster.server-quorum-type: server cluster.quorum-type: auto cluster.eager-lock: enable network.remote-dio: off performance.low-prio-threads: 32 performance.io-cache: off performance.read-ahead: off performance.quick-read: off transport.address-family: inet nfs.disable: on performance.client-io-threads: off cluster.enable-shared-storage: enable Command run: logrotate -f glusterfs ; logrotate -f glusterfs-georep;? gluster snapshot create isos-snap-2019-04-11 isos? description TEST Logs:[root at ovirt1 glusterfs]# cat cli.log [2019-04-11 07:51:02.367453] I [cli.c:769:main] 0-cli: Started running gluster with version 5.5 [2019-04-11 07:51:02.486863] I [MSGID: 101190] [event-epoll.c:621:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2019-04-11 07:51:02.556813] E [cli-rpc-ops.c:11293:gf_cli_snapshot] 0-cli: cli_to_glusterd for snapshot failed [2019-04-11 07:51:02.556880] I [input.c:31:cli_batch] 0-: Exiting with: -1 [root at ovirt1 glusterfs]# cat glusterd.log [2019-04-11 07:51:02.553357] E [MSGID: 106024] [glusterd-snapshot.c:2547:glusterd_snapshot_create_prevalidate] 0-management: Snapshot is supported only for thin provisioned LV. Ensure that all bricks of isos are thinly provisioned LV. 
[2019-04-11 07:51:02.553365] W [MSGID: 106029] [glusterd-snapshot.c:8613:glusterd_snapshot_prevalidate] 0-management: Snapshot create pre-validation failed [2019-04-11 07:51:02.553703] W [MSGID: 106121] [glusterd-mgmt.c:147:gd_mgmt_v3_pre_validate_fn] 0-management: Snapshot Prevalidate Failed [2019-04-11 07:51:02.553719] E [MSGID: 106121] [glusterd-mgmt.c:1015:glusterd_mgmt_v3_pre_validate] 0-management: Pre Validation failed for operation Snapshot on local node My LVs hosting the bricks are:[root at ovirt1 ~]# lvs gluster_vg_md0 ? LV????????????? VG???????????? Attr?????? LSize?? Pool??????????? Origin Data%? Meta%? Move Log Cpy%Sync Convert ? gluster_lv_data gluster_vg_md0 Vwi-aot--- 500.00g my_vdo_thinpool??????? 35.97 ? gluster_lv_isos gluster_vg_md0 Vwi-aot---? 50.00g my_vdo_thinpool??????? 52.11 ? my_vdo_thinpool gluster_vg_md0 twi-aot---?? 9.86t??????????????????????? 2.04?? 11.45 [root at ovirt1 ~]# ssh ovirt2 "lvs gluster_vg_md0" ? LV????????????? VG???????????? Attr?????? LSize?? Pool??????????? Origin Data%? Meta%? Move Log Cpy%Sync Convert ? gluster_lv_data gluster_vg_md0 Vwi-aot--- 500.00g my_vdo_thinpool??????? 35.98 ? gluster_lv_isos gluster_vg_md0 Vwi-aot---? 50.00g my_vdo_thinpool??????? 25.94 ? my_vdo_thinpool gluster_vg_md0 twi-aot---? <9.77t??????????????????????? 1.93?? 11.39 [root at ovirt1 ~]# ssh ovirt3 "lvs gluster_vg_sda3" ? LV??????????????????? VG????????????? Attr?????? LSize? Pool????????????????? Origin Data%? Meta%? Move Log Cpy%Sync Convert ? gluster_lv_data?????? gluster_vg_sda3 Vwi-aotz-- 15.00g gluster_thinpool_sda3??????? 0.17 ? gluster_lv_engine???? gluster_vg_sda3 Vwi-aotz-- 15.00g gluster_thinpool_sda3??????? 0.16 ? gluster_lv_isos?????? gluster_vg_sda3 Vwi-aotz-- 15.00g gluster_thinpool_sda3??????? 0.12 ? gluster_thinpool_sda3 gluster_vg_sda3 twi-aotz-- 41.00g????????????????????????????? 0.16?? 1.58 As you can see - all bricks are thin LV and space is not the issue. Can someone hint me how to enable debug , so gluster logs can show the reason for that pre-check failure ? Best Regards,Strahil Nikolov ? ?????, 10 ????? 2019 ?., 9:05:15 ?. ???????-4, Rafi Kavungal Chundattu Parambil ??????: Hi Strahil, The name of device is not at all a problem here. Can you please check the log of glusterd, and see if there is any useful information about the failure. Also please provide the output of `lvscan` and `lvs --noheadings -o pool_lv` from all nodes Regards Rafi KC ----- Original Message ----- From: "Strahil Nikolov" To: gluster-users at gluster.org Sent: Wednesday, April 10, 2019 2:36:39 AM Subject: [Gluster-users] Gluster snapshot fails Hello Community, I have a problem running a snapshot of a replica 3 arbiter 1 volume. Error: [root at ovirt2 ~]# gluster snapshot create before-423 engine description "Before upgrade of engine from 4.2.2 to 4.2.3" snapshot create: failed: Snapshot is supported only for thin provisioned LV. Ensure that all bricks of engine are thinly provisioned LV. 
Snapshot command failed Volume info: Volume Name: engine Type: Replicate Volume ID: 30ca1cc2-f2f7-4749-9e2e-cee9d7099ded Status: Started Snapshot Count: 0 Number of Bricks: 1 x (2 + 1) = 3 Transport-type: tcp Bricks: Brick1: ovirt1:/gluster_bricks/engine/engine Brick2: ovirt2:/gluster_bricks/engine/engine Brick3: ovirt3:/gluster_bricks/engine/engine (arbiter) Options Reconfigured: cluster.granular-entry-heal: enable performance.strict-o-direct: on network.ping-timeout: 30 storage.owner-gid: 36 storage.owner-uid: 36 user.cifs: off features.shard: on cluster.shd-wait-qlength: 10000 cluster.shd-max-threads: 8 cluster.locking-scheme: granular cluster.data-self-heal-algorithm: full cluster.server-quorum-type: server cluster.quorum-type: auto cluster.eager-lock: enable network.remote-dio: off performance.low-prio-threads: 32 performance.io-cache: off performance.read-ahead: off performance.quick-read: off transport.address-family: inet nfs.disable: on performance.client-io-threads: off cluster.enable-shared-storage: enable All bricks are on thin lvm with plenty of space, the only thing that could be causing it is that ovirt1 & ovirt2 are on /dev/gluster_vg_ssd/gluster_lv_engine , while arbiter is on /dev/gluster_vg_sda3/gluster_lv_engine. Is that the issue ? Should I rename my brick's VG ? If so, why there is no mentioning in the documentation ? Best Regards, Strahil Nikolov _______________________________________________ Gluster-users mailing list Gluster-users at gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From hunter86_bg at yahoo.com Fri Apr 12 11:44:43 2019 From: hunter86_bg at yahoo.com (Strahil Nikolov) Date: Fri, 12 Apr 2019 11:44:43 +0000 (UTC) Subject: [Gluster-users] Gluster snapshot fails In-Reply-To: <1106680361.905173.1555068761758@mail.yahoo.com> References: <1941056132.2442645.1555056751876.ref@mail.yahoo.com> <1941056132.2442645.1555056751876@mail.yahoo.com> <96435198.542814.1555057938954@mail.yahoo.com> <1106680361.905173.1555068761758@mail.yahoo.com> Message-ID: <1298569445.1314916.1555069483300@mail.yahoo.com> I hope this is the last update on the issue -> opened a bug https://bugzilla.redhat.com/show_bug.cgi?id=1699309 Best regards,Strahil Nikolov ? ?????, 12 ????? 2019 ?., 7:32:41 ?. ???????-4, Strahil Nikolov ??????: Hi All, I have tested gluster snapshot without systemd.automount units and it works as follows: [root at ovirt1 system]# gluster snapshot create isos-snap-2019-04-11 isos? description TEST snapshot create: success: Snap isos-snap-2019-04-11_GMT-2019.04.12-11.18.24 created successfully [root at ovirt1 system]# gluster snapshot list isos-snap-2019-04-11_GMT-2019.04.12-11.18.24 [root at ovirt1 system]# gluster snapshot info isos-snap-2019-04-11_GMT-2019.04.12-11.18.24 Snapshot????????????????? : isos-snap-2019-04-11_GMT-2019.04.12-11.18.24 Snap UUID???????????????? : 70d5716e-4633-43d4-a562-8e29a96b0104 Description?????????????? : TEST Created?????????????????? : 2019-04-12 11:18:24 Snap Volumes: ??????? Snap Volume Name????????? : 584e88eab0374c0582cc544a2bc4b79e ??????? Origin Volume name??????? : isos ??????? Snaps taken for isos????? : 1 ??????? Snaps available for isos? : 255 ??????? Status??????????????????? : Stopped Best Regards,Strahil Nikolov ? ?????, 12 ????? 2019 ?., 4:32:18 ?. ???????-4, Strahil Nikolov ??????: Hello All, it seems that "systemd-1" is from the automount unit , and not from the systemd unit. 
[root at ovirt1 system]# systemctl cat gluster_bricks-isos.automount # /etc/systemd/system/gluster_bricks-isos.automount [Unit] Description=automount for gluster brick ISOS [Automount] Where=/gluster_bricks/isos [Install] WantedBy=multi-user.target Best Regards,Strahil Nikolov ? ?????, 12 ????? 2019 ?., 4:12:31 ?. ???????-4, Strahil Nikolov ??????: Hello All, I have tried to enable debug and see the reason for the issue. Here is the relevant glusterd.log: [2019-04-12 07:56:54.526508] E [MSGID: 106077] [glusterd-snapshot.c:1882:glusterd_is_thinp_brick] 0-management: Failed to get pool name for device systemd-1 [2019-04-12 07:56:54.527509] E [MSGID: 106121] [glusterd-snapshot.c:2523:glusterd_snapshot_create_prevalidate] 0-management: Failed to pre validate [2019-04-12 07:56:54.527525] E [MSGID: 106024] [glusterd-snapshot.c:2547:glusterd_snapshot_create_prevalidate] 0-management: Snapshot is supported only for thin provisioned LV. Ensure that all bricks of isos are thinly provisioned LV. [2019-04-12 07:56:54.527539] W [MSGID: 106029] [glusterd-snapshot.c:8613:glusterd_snapshot_prevalidate] 0-management: Snapshot create pre-validation failed [2019-04-12 07:56:54.527552] W [MSGID: 106121] [glusterd-mgmt.c:147:gd_mgmt_v3_pre_validate_fn] 0-management: Snapshot Prevalidate Failed [2019-04-12 07:56:54.527568] E [MSGID: 106121] [glusterd-mgmt.c:1015:glusterd_mgmt_v3_pre_validate] 0-management: Pre Validation failed for operation Snapshot on local node [2019-04-12 07:56:54.527583] E [MSGID: 106121] [glusterd-mgmt.c:2377:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Pre Validation Failed here is the output of lvscan & lvs: [root at ovirt1 ~]# lvscan ? ACTIVE??????????? '/dev/gluster_vg_md0/my_vdo_thinpool' [9.86 TiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_md0/gluster_lv_data' [500.00 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_md0/gluster_lv_isos' [50.00 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_ssd/my_ssd_thinpool' [168.59 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_ssd/gluster_lv_engine' [40.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt1/swap' [6.70 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt1/home' [1.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt1/root' [60.00 GiB] inherit [root at ovirt1 ~]# lvs --noheadings -o pool_lv ? my_vdo_thinpool ? my_vdo_thinpool ? my_ssd_thinpool [root at ovirt1 ~]# ssh ovirt2 "lvscan;lvs --noheadings -o pool_lv" ? ACTIVE??????????? '/dev/gluster_vg_md0/my_vdo_thinpool' [<9.77 TiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_md0/gluster_lv_data' [500.00 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_md0/gluster_lv_isos' [50.00 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_ssd/my_ssd_thinpool' [<161.40 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_ssd/gluster_lv_engine' [40.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt2/root' [15.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt2/home' [1.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt2/swap' [16.00 GiB] inherit ? my_vdo_thinpool ? my_vdo_thinpool ? my_ssd_thinpool [root at ovirt1 ~]# ssh ovirt3 "lvscan;lvs --noheadings -o pool_lv" ? ACTIVE??????????? '/dev/gluster_vg_sda3/gluster_thinpool_sda3' [41.00 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_sda3/gluster_lv_data' [15.00 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_sda3/gluster_lv_isos' [15.00 GiB] inherit ? ACTIVE??????????? '/dev/gluster_vg_sda3/gluster_lv_engine' [15.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt3/root' [20.00 GiB] inherit ? 
ACTIVE??????????? '/dev/centos_ovirt3/home' [1.00 GiB] inherit ? ACTIVE??????????? '/dev/centos_ovirt3/swap' [8.00 GiB] inherit ? gluster_thinpool_sda3 ? gluster_thinpool_sda3 ? gluster_thinpool_sda3 I am mounting my bricks via systemd , as I have issues with bricks being started before VDO. [root at ovirt1 ~]# findmnt /gluster_bricks/isos TARGET?????????????? SOURCE???????????????????????????????????? FSTYPE OPTIONS /gluster_bricks/isos systemd-1????????????????????????????????? autofs rw,relatime,fd=31,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=21843 /gluster_bricks/isos /dev/mapper/gluster_vg_md0-gluster_lv_isos xfs??? rw,noatime,nodiratime,seclabel,attr2,inode64,noquota [root at ovirt1 ~]# ssh ovirt2 "findmnt /gluster_bricks/isos " TARGET?????????????? SOURCE???????????????????????????????????? FSTYPE OPTIONS /gluster_bricks/isos systemd-1????????????????????????????????? autofs rw,relatime,fd=26,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=14279 /gluster_bricks/isos /dev/mapper/gluster_vg_md0-gluster_lv_isos xfs??? rw,noatime,nodiratime,seclabel,attr2,inode64,noquota [root at ovirt1 ~]# ssh ovirt3 "findmnt /gluster_bricks/isos " TARGET?????????????? SOURCE????????????????????????????????????? FSTYPE OPTIONS /gluster_bricks/isos systemd-1?????????????????????????????????? autofs rw,relatime,fd=35,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=17770 /gluster_bricks/isos /dev/mapper/gluster_vg_sda3-gluster_lv_isos xfs??? rw,noatime,nodiratime,seclabel,attr2,inode64,logbsize=256k,sunit=512,swidth=1024,noquota [root at ovirt1 ~]# grep "gluster_bricks" /proc/mounts systemd-1 /gluster_bricks/data autofs rw,relatime,fd=22,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=21513 0 0 systemd-1 /gluster_bricks/engine autofs rw,relatime,fd=25,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=21735 0 0 systemd-1 /gluster_bricks/isos autofs rw,relatime,fd=31,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=21843 0 0 /dev/mapper/gluster_vg_ssd-gluster_lv_engine /gluster_bricks/engine xfs rw,seclabel,noatime,nodiratime,attr2,inode64,sunit=256,swidth=256,noquota 0 0 /dev/mapper/gluster_vg_md0-gluster_lv_isos /gluster_bricks/isos xfs rw,seclabel,noatime,nodiratime,attr2,inode64,noquota 0 0 /dev/mapper/gluster_vg_md0-gluster_lv_data /gluster_bricks/data xfs rw,seclabel,noatime,nodiratime,attr2,inode64,noquota 0 0 Obviously , gluster is catching "systemd-1" as a device and tries to check if it's a thin LV.Where should I open a bug for that ? P.S.: Adding oVirt User list. Best Regards,Strahil Nikolov ? ?????????, 11 ????? 2019 ?., 4:00:31 ?. ???????-4, Strahil Nikolov ??????: Hi Rafi, thanks for your update. 
I have tested again with another gluster volume.[root at ovirt1 glusterfs]# gluster volume info isos Volume Name: isos Type: Replicate Volume ID: 9b92b5bd-79f5-427b-bd8d-af28b038ed2a Status: Started Snapshot Count: 0 Number of Bricks: 1 x (2 + 1) = 3 Transport-type: tcp Bricks: Brick1: ovirt1:/gluster_bricks/isos/isos Brick2: ovirt2:/gluster_bricks/isos/isos Brick3: ovirt3.localdomain:/gluster_bricks/isos/isos (arbiter) Options Reconfigured: cluster.granular-entry-heal: enable performance.strict-o-direct: on network.ping-timeout: 30 storage.owner-gid: 36 storage.owner-uid: 36 user.cifs: off features.shard: on cluster.shd-wait-qlength: 10000 cluster.shd-max-threads: 8 cluster.locking-scheme: granular cluster.data-self-heal-algorithm: full cluster.server-quorum-type: server cluster.quorum-type: auto cluster.eager-lock: enable network.remote-dio: off performance.low-prio-threads: 32 performance.io-cache: off performance.read-ahead: off performance.quick-read: off transport.address-family: inet nfs.disable: on performance.client-io-threads: off cluster.enable-shared-storage: enable Command run: logrotate -f glusterfs ; logrotate -f glusterfs-georep;? gluster snapshot create isos-snap-2019-04-11 isos? description TEST Logs:[root at ovirt1 glusterfs]# cat cli.log [2019-04-11 07:51:02.367453] I [cli.c:769:main] 0-cli: Started running gluster with version 5.5 [2019-04-11 07:51:02.486863] I [MSGID: 101190] [event-epoll.c:621:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2019-04-11 07:51:02.556813] E [cli-rpc-ops.c:11293:gf_cli_snapshot] 0-cli: cli_to_glusterd for snapshot failed [2019-04-11 07:51:02.556880] I [input.c:31:cli_batch] 0-: Exiting with: -1 [root at ovirt1 glusterfs]# cat glusterd.log [2019-04-11 07:51:02.553357] E [MSGID: 106024] [glusterd-snapshot.c:2547:glusterd_snapshot_create_prevalidate] 0-management: Snapshot is supported only for thin provisioned LV. Ensure that all bricks of isos are thinly provisioned LV. [2019-04-11 07:51:02.553365] W [MSGID: 106029] [glusterd-snapshot.c:8613:glusterd_snapshot_prevalidate] 0-management: Snapshot create pre-validation failed [2019-04-11 07:51:02.553703] W [MSGID: 106121] [glusterd-mgmt.c:147:gd_mgmt_v3_pre_validate_fn] 0-management: Snapshot Prevalidate Failed [2019-04-11 07:51:02.553719] E [MSGID: 106121] [glusterd-mgmt.c:1015:glusterd_mgmt_v3_pre_validate] 0-management: Pre Validation failed for operation Snapshot on local node My LVs hosting the bricks are:[root at ovirt1 ~]# lvs gluster_vg_md0 ? LV????????????? VG???????????? Attr?????? LSize?? Pool??????????? Origin Data%? Meta%? Move Log Cpy%Sync Convert ? gluster_lv_data gluster_vg_md0 Vwi-aot--- 500.00g my_vdo_thinpool??????? 35.97 ? gluster_lv_isos gluster_vg_md0 Vwi-aot---? 50.00g my_vdo_thinpool??????? 52.11 ? my_vdo_thinpool gluster_vg_md0 twi-aot---?? 9.86t??????????????????????? 2.04?? 11.45 [root at ovirt1 ~]# ssh ovirt2 "lvs gluster_vg_md0" ? LV????????????? VG???????????? Attr?????? LSize?? Pool??????????? Origin Data%? Meta%? Move Log Cpy%Sync Convert ? gluster_lv_data gluster_vg_md0 Vwi-aot--- 500.00g my_vdo_thinpool??????? 35.98 ? gluster_lv_isos gluster_vg_md0 Vwi-aot---? 50.00g my_vdo_thinpool??????? 25.94 ? my_vdo_thinpool gluster_vg_md0 twi-aot---? <9.77t??????????????????????? 1.93?? 11.39 [root at ovirt1 ~]# ssh ovirt3 "lvs gluster_vg_sda3" ? LV??????????????????? VG????????????? Attr?????? LSize? Pool????????????????? Origin Data%? Meta%? Move Log Cpy%Sync Convert ? gluster_lv_data?????? 
gluster_vg_sda3 Vwi-aotz-- 15.00g gluster_thinpool_sda3??????? 0.17 ? gluster_lv_engine???? gluster_vg_sda3 Vwi-aotz-- 15.00g gluster_thinpool_sda3??????? 0.16 ? gluster_lv_isos?????? gluster_vg_sda3 Vwi-aotz-- 15.00g gluster_thinpool_sda3??????? 0.12 ? gluster_thinpool_sda3 gluster_vg_sda3 twi-aotz-- 41.00g????????????????????????????? 0.16?? 1.58 As you can see - all bricks are thin LV and space is not the issue. Can someone hint me how to enable debug , so gluster logs can show the reason for that pre-check failure ? Best Regards,Strahil Nikolov ? ?????, 10 ????? 2019 ?., 9:05:15 ?. ???????-4, Rafi Kavungal Chundattu Parambil ??????: Hi Strahil, The name of device is not at all a problem here. Can you please check the log of glusterd, and see if there is any useful information about the failure. Also please provide the output of `lvscan` and `lvs --noheadings -o pool_lv` from all nodes Regards Rafi KC ----- Original Message ----- From: "Strahil Nikolov" To: gluster-users at gluster.org Sent: Wednesday, April 10, 2019 2:36:39 AM Subject: [Gluster-users] Gluster snapshot fails Hello Community, I have a problem running a snapshot of a replica 3 arbiter 1 volume. Error: [root at ovirt2 ~]# gluster snapshot create before-423 engine description "Before upgrade of engine from 4.2.2 to 4.2.3" snapshot create: failed: Snapshot is supported only for thin provisioned LV. Ensure that all bricks of engine are thinly provisioned LV. Snapshot command failed Volume info: Volume Name: engine Type: Replicate Volume ID: 30ca1cc2-f2f7-4749-9e2e-cee9d7099ded Status: Started Snapshot Count: 0 Number of Bricks: 1 x (2 + 1) = 3 Transport-type: tcp Bricks: Brick1: ovirt1:/gluster_bricks/engine/engine Brick2: ovirt2:/gluster_bricks/engine/engine Brick3: ovirt3:/gluster_bricks/engine/engine (arbiter) Options Reconfigured: cluster.granular-entry-heal: enable performance.strict-o-direct: on network.ping-timeout: 30 storage.owner-gid: 36 storage.owner-uid: 36 user.cifs: off features.shard: on cluster.shd-wait-qlength: 10000 cluster.shd-max-threads: 8 cluster.locking-scheme: granular cluster.data-self-heal-algorithm: full cluster.server-quorum-type: server cluster.quorum-type: auto cluster.eager-lock: enable network.remote-dio: off performance.low-prio-threads: 32 performance.io-cache: off performance.read-ahead: off performance.quick-read: off transport.address-family: inet nfs.disable: on performance.client-io-threads: off cluster.enable-shared-storage: enable All bricks are on thin lvm with plenty of space, the only thing that could be causing it is that ovirt1 & ovirt2 are on /dev/gluster_vg_ssd/gluster_lv_engine , while arbiter is on /dev/gluster_vg_sda3/gluster_lv_engine. Is that the issue ? Should I rename my brick's VG ? If so, why there is no mentioning in the documentation ? Best Regards, Strahil Nikolov _______________________________________________ Gluster-users mailing list Gluster-users at gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From felix.koelzow at gmx.de Fri Apr 12 15:04:02 2019 From: felix.koelzow at gmx.de (=?UTF-8?Q?Felix_K=c3=b6lzow?=) Date: Fri, 12 Apr 2019 17:04:02 +0200 Subject: [Gluster-users] Replica 3: Client access via FUSE failed if two bricks are down Message-ID: Dear Gluster-Community, I created a test-environment to test a gluster volume with replica 3. Afterwards, I am able to manually mount the gluster volume using FUSE. mount command: mount -t glusterfs? 
-o backup-volfile-servers=gluster01:gluster02 gluster00:/ifwFuse /mnt/glusterfs/ifwFuse Just for a testing purpose, I shutdown *two* (arbitrary) bricks and one brick keeps still online and accessible via ssh. If I poweroff the two machines, I immediately get the following error message: ls: cannot open directory .: Transport endpoint is not connected From my understanding of replica 3, even if two bricks are broken the client should be able to have access to the data. Actually, I don't know how to solve that issue. Any idea is welcome! If you need any log-file as further information, just give me a hint! Thanks in advance. Felix -------------- next part -------------- An HTML attachment was scrubbed... URL: From ravishankar at redhat.com Fri Apr 12 15:53:17 2019 From: ravishankar at redhat.com (Ravishankar N) Date: Fri, 12 Apr 2019 21:23:17 +0530 Subject: [Gluster-users] Replica 3: Client access via FUSE failed if two bricks are down In-Reply-To: References: Message-ID: On 12/04/19 8:34 PM, Felix K?lzow wrote: > > Dear Gluster-Community, > > > I created a test-environment to test a gluster volume with replica 3. > > Afterwards, I am able to manually mount the gluster volume using FUSE. > > > mount command: > > mount -t glusterfs? -o backup-volfile-servers=gluster01:gluster02 > gluster00:/ifwFuse /mnt/glusterfs/ifwFuse > > > Just for a testing purpose, I shutdown *two* (arbitrary) bricks and > one brick keeps still online > > and accessible via ssh. If I poweroff the two machines, I immediately > get the following error message: > > ls: cannot open directory .: Transport endpoint is not connected > > > From my understanding of replica 3, even if two bricks are broken the > client should be able to > > have access to the data. > In replica 3, 2 out of 3 bricks must be up for allowing access. IoW, client-quorum must be met. Otherwise you would get ENOTCONN like you observed. > > Actually, I don't know how to solve that issue. Any idea is welcome! > You could disable client-quorum but it is strongly advised not to do so in order to prevent split-brains (https://docs.gluster.org/en/v3/Administrator%20Guide/Split%20brain%20and%20ways%20to%20deal%20with%20it/). Hope that helps. -Ravi > > If you need any log-file as further information, just give me a hint! > > > Thanks in advance. > > Felix > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From bgoldowsky at cast.org Thu Apr 11 17:51:46 2019 From: bgoldowsky at cast.org (Boris Goldowsky) Date: Thu, 11 Apr 2019 17:51:46 +0000 Subject: [Gluster-users] Volume stuck unable to add a brick Message-ID: <02CC8632-9B90-43F6-89FA-1160CE529667@contoso.com> I?ve got a replicated volume with three bricks (?1x3=3?), the idea is to have a common set of files that are locally available on all the machines (Scientific Linux 7, which is essentially CentOS 7) in a cluster. I tried to add on a fourth machine, so used a command like this: sudo gluster volume add-brick dockervols replica 4 webserver8:/data/gluster/dockervols force but the result is: volume add-brick: failed: Commit failed on webserver1. Please check log file for details. Commit failed on webserver8. Please check log file for details. Commit failed on webserver11. Please check log file for details. Tried: removing the new brick (this also fails) and trying again. Tried: checking the logs. 
The log files are not enlightening to me ? I don?t know what?s normal and what?s not. Tried: deleting the brick directory from previous attempt, so that it?s not in the way. Tried: restarting gluster services Tried: rebooting Tried: setting up a new volume, replicated to all four machines. This works, so I?m assuming it?s not a networking issue. But still fails with this existing volume that has the critical data in it. Running out of ideas. Any suggestions? Thank you! Boris -------------- next part -------------- An HTML attachment was scrubbed... URL: From atin.mukherjee83 at gmail.com Fri Apr 12 17:09:45 2019 From: atin.mukherjee83 at gmail.com (Atin Mukherjee) Date: Fri, 12 Apr 2019 22:39:45 +0530 Subject: [Gluster-users] Volume stuck unable to add a brick In-Reply-To: <02CC8632-9B90-43F6-89FA-1160CE529667@contoso.com> References: <02CC8632-9B90-43F6-89FA-1160CE529667@contoso.com> Message-ID: On Fri, 12 Apr 2019 at 22:32, Boris Goldowsky wrote: > I?ve got a replicated volume with three bricks (?1x3=3?), the idea is to > have a common set of files that are locally available on all the machines > (Scientific Linux 7, which is essentially CentOS 7) in a cluster. > > > > I tried to add on a fourth machine, so used a command like this: > > > > sudo gluster volume add-brick dockervols replica 4 > webserver8:/data/gluster/dockervols force > > > > but the result is: > > volume add-brick: failed: Commit failed on webserver1. Please check log > file for details. > > Commit failed on webserver8. Please check log file for details. > > Commit failed on webserver11. Please check log file for details. > > > > Tried: removing the new brick (this also fails) and trying again. > > Tried: checking the logs. The log files are not enlightening to me ? I > don?t know what?s normal and what?s not. > >From webserver8 & webserver11 could you attach glusterd log files? Also please share following: - gluster version? (gluster ?version) - Output of ?gluster peer status? - Output of ?gluster v info? from all 4 nodes. Tried: deleting the brick directory from previous attempt, so that it?s not > in the way. > > Tried: restarting gluster services > > Tried: rebooting > > Tried: setting up a new volume, replicated to all four machines. This > works, so I?m assuming it?s not a networking issue. But still fails with > this existing volume that has the critical data in it. > > > > Running out of ideas. Any suggestions? Thank you! > > > > Boris > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -- --Atin -------------- next part -------------- An HTML attachment was scrubbed... URL: From sankarshan.mukhopadhyay at gmail.com Sat Apr 13 05:41:35 2019 From: sankarshan.mukhopadhyay at gmail.com (Sankarshan Mukhopadhyay) Date: Sat, 13 Apr 2019 11:11:35 +0530 Subject: [Gluster-users] [request for feedback] starting a small series about troubleshooting GlusterFS Message-ID: We've often seen threads on this list (and others) around troubleshooting a component within GlusterFS leading to a bit of back-and-forth while the minimum data sets are collected. In order to increase the understanding and familiarity, we started off a series with the first one being around a general approach to collecting data for troubleshooting. 
You can listen to the episode at The desired end outcome is that if we talk about something that is not yet documented or, incompletely/inaccurately documented, we'll fix the shortcomings with the documentation. I'm also looking forward to feedback on what can be covered and would be relevant. Please respond here or, on Twitter (@Gluster) From revirii at googlemail.com Mon Apr 15 06:33:22 2019 From: revirii at googlemail.com (Hu Bert) Date: Mon, 15 Apr 2019 08:33:22 +0200 Subject: [Gluster-users] Upgrade 5.5 -> 5.6: network traffic bug fixed? Message-ID: Good Morning, today i updated my replica 3 setup (debian stretch) from version 5.5 to 5.6, as i thought the network traffic bug (#1673058) was fixed and i could re-activate 'performance.quick-read' again. See release notes: https://review.gluster.org/#/c/glusterfs/+/22538/ http://git.gluster.org/cgit/glusterfs.git/commit/?id=34a2347780c2429284f57232f3aabb78547a9795 Upgrade went fine, and then i was watching iowait and network traffic. It seems that the network traffic went up after upgrade and reactivation of performance.quick-read. Here are some graphs: network client1: https://abload.de/img/network-clientfwj1m.png network client2: https://abload.de/img/network-client2trkow.png network server: https://abload.de/img/network-serverv3jjr.png gluster volume info: https://pastebin.com/ZMuJYXRZ Just wondering if the network traffic bug really got fixed or if this is a new problem. I'll wait a couple of minutes and then deactivate performance.quick-read again, just to see if network traffic goes down to normal levels. Best regards, Hubert From revirii at googlemail.com Mon Apr 15 06:54:12 2019 From: revirii at googlemail.com (Hu Bert) Date: Mon, 15 Apr 2019 08:54:12 +0200 Subject: [Gluster-users] Upgrade 5.5 -> 5.6: network traffic bug fixed? In-Reply-To: References: Message-ID: fyi: after setting performance.quick-read to off network traffic dropped to normal levels, client load/iowait back to normal as well. client: https://abload.de/img/network-client-afterihjqi.png server: https://abload.de/img/network-server-afterwdkrl.png Am Mo., 15. Apr. 2019 um 08:33 Uhr schrieb Hu Bert : > > Good Morning, > > today i updated my replica 3 setup (debian stretch) from version 5.5 > to 5.6, as i thought the network traffic bug (#1673058) was fixed and > i could re-activate 'performance.quick-read' again. See release notes: > > https://review.gluster.org/#/c/glusterfs/+/22538/ > http://git.gluster.org/cgit/glusterfs.git/commit/?id=34a2347780c2429284f57232f3aabb78547a9795 > > Upgrade went fine, and then i was watching iowait and network traffic. > It seems that the network traffic went up after upgrade and > reactivation of performance.quick-read. Here are some graphs: > > network client1: https://abload.de/img/network-clientfwj1m.png > network client2: https://abload.de/img/network-client2trkow.png > network server: https://abload.de/img/network-serverv3jjr.png > > gluster volume info: https://pastebin.com/ZMuJYXRZ > > Just wondering if the network traffic bug really got fixed or if this > is a new problem. I'll wait a couple of minutes and then deactivate > performance.quick-read again, just to see if network traffic goes down > to normal levels. > > > Best regards, > Hubert From spisla80 at gmail.com Mon Apr 15 09:09:11 2019 From: spisla80 at gmail.com (David Spisla) Date: Mon, 15 Apr 2019 11:09:11 +0200 Subject: [Gluster-users] XFS, WORM and the Year-2038 Problem Message-ID: Hi folks, I tried out default retention periods e.g. 
to set the Retention date to 2071. When I did the WORMing, everything seems to be OK. From FUSE and also at Brick-Level, the retention was set to 2071 on all nodes.Additionally I enabled the storage.ctime option, so that the timestamps are stored in the mdata xattr, too. But after a while I obeserved, that on Brick-Level the atime (which stores the retention) was switched to 1934: # stat /gluster/brick1/glusterbrick/data/file3.txt File: /gluster/brick1/glusterbrick/data/file3.txt Size: 5 Blocks: 16 IO Block: 4096 regular file Device: 830h/2096d Inode: 115 Links: 2 Access: (0544/-r-xr--r--) Uid: ( 2000/ gluster) Gid: ( 2000/ gluster) Access: 1934-12-13 20:45:51.000000000 +0000 Modify: 2019-04-10 09:50:09.000000000 +0000 Change: 2019-04-10 10:13:39.703623917 +0000 Birth: - >From FUSE I get the correct atime: # stat /gluster/volume1/data/file3.txt File: /gluster/volume1/data/file3.txt Size: 5 Blocks: 1 IO Block: 131072 regular file Device: 2eh/46d Inode: 10812026387234582248 Links: 1 Access: (0544/-r-xr--r--) Uid: ( 2000/ gluster) Gid: ( 2000/ gluster) Access: 2071-01-19 03:14:07.000000000 +0000 Modify: 2019-04-10 09:50:09.000000000 +0000 Change: 2019-04-10 10:13:39.705341476 +0000 Birth: - I find out that XFS supports only 32-Bit timestamp values. So in my expectation it should not be possible to set the atime to 2071. But at first it was 2071 and later it was switched to 1934 due to the YEAR-2038 problem. I am asking myself: 1. Why it is possible to set atime on XFS greater than 2038? 2. And why this atime switched to a time lower 1970 after a while? Regards David Spisla -------------- next part -------------- An HTML attachment was scrubbed... URL: From atumball at redhat.com Mon Apr 15 09:26:46 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Mon, 15 Apr 2019 14:56:46 +0530 Subject: [Gluster-users] XFS, WORM and the Year-2038 Problem In-Reply-To: References: Message-ID: On Mon, Apr 15, 2019 at 2:40 PM David Spisla wrote: > Hi folks, > I tried out default retention periods e.g. to set the Retention date to > 2071. When I did the WORMing, everything seems to be OK. From FUSE and also > at Brick-Level, the retention was set to 2071 on all nodes.Additionally I > enabled the storage.ctime option, so that the timestamps are stored in the > mdata xattr, too. But after a while I obeserved, that on Brick-Level the > atime (which stores the retention) was switched to 1934: > > # stat /gluster/brick1/glusterbrick/data/file3.txt > File: /gluster/brick1/glusterbrick/data/file3.txt > Size: 5 Blocks: 16 IO Block: 4096 regular file > Device: 830h/2096d Inode: 115 Links: 2 > Access: (0544/-r-xr--r--) Uid: ( 2000/ gluster) Gid: ( 2000/ > gluster) > Access: 1934-12-13 20:45:51.000000000 +0000 > Modify: 2019-04-10 09:50:09.000000000 +0000 > Change: 2019-04-10 10:13:39.703623917 +0000 > Birth: - > > From FUSE I get the correct atime: > # stat /gluster/volume1/data/file3.txt > File: /gluster/volume1/data/file3.txt > Size: 5 Blocks: 1 IO Block: 131072 regular file > Device: 2eh/46d Inode: 10812026387234582248 Links: 1 > Access: (0544/-r-xr--r--) Uid: ( 2000/ gluster) Gid: ( 2000/ > gluster) > Access: 2071-01-19 03:14:07.000000000 +0000 > Modify: 2019-04-10 09:50:09.000000000 +0000 > Change: 2019-04-10 10:13:39.705341476 +0000 > Birth: - > > >From FUSE you get the time of what the clients set, as we now store timestamp as extended attribute, not the 'stat->st_atime'. This is called 'ctime' feature which we introduced in glusterfs-5.0, It helps us to support statx() variables. 
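
If you want to see this on disk, one quick check (just a sketch; the exact xattr name can vary between releases, on recent ones it is usually trusted.glusterfs.mdata) is to dump the extended attributes of the file directly on the brick and compare them with the plain stat output:

# stat /gluster/brick1/glusterbrick/data/file3.txt
# getfattr -d -m . -e hex /gluster/brick1/glusterbrick/data/file3.txt

The timestamps the clients see are served from that xattr, while the Access/Modify/Change lines in stat only show what XFS can hold in its on-disk inode fields.
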
> I find out that XFS supports only 32-Bit timestamp values. So in my > expectation it should not be possible to set the atime to 2071. But at > first it was 2071 and later it was switched to 1934 due to the YEAR-2038 > problem. I am asking myself: > 1. Why it is possible to set atime on XFS greater than 2038? > 2. And why this atime switched to a time lower 1970 after a while? > > Regards > David Spisla > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -- Amar Tumballi (amarts) -------------- next part -------------- An HTML attachment was scrubbed... URL: From greenet at ornl.gov Mon Apr 15 13:09:12 2019 From: greenet at ornl.gov (Greene, Tami McFarlin) Date: Mon, 15 Apr 2019 13:09:12 +0000 Subject: [Gluster-users] Difference between processes: shrinking volume and replacing faulty brick Message-ID: We need to remove a server node from our configuration (distributed volume). There is more than enough space on the remaining bricks to accept the data attached to the failing server; we didn?t know if one process or the other would be significantly faster. We know shrinking the volume (remove-brick) rebalances as it moves the data; so moving 506G resuled in the rebalancing of 1.8T and took considerable time. Reading the documentation, it seems that replacing a brick is simplying introducing an empty brick to accept the displaced data, but it is the exact same process: remove-brick. Is there anyway to migrate the data without rebalancing at the same time and then rebalancing once all data has been moved? I know that is not ideal, but it would allow us to remove the problem server much quicker and resume production while rebalancing. Tami Tami McFarlin Greene Lab Technician RF, Communications, and Intelligent Systems Group Electrical and Electronics System Research Division Oak Ridge National Laboratory Bldg. 3500, Rm. A15 greenet at ornl.gov (865) 643-0401 -------------- next part -------------- An HTML attachment was scrubbed... URL: From atin.mukherjee83 at gmail.com Mon Apr 15 16:13:16 2019 From: atin.mukherjee83 at gmail.com (Atin Mukherjee) Date: Mon, 15 Apr 2019 21:43:16 +0530 Subject: [Gluster-users] Volume stuck unable to add a brick In-Reply-To: <52179D48-7CF0-405E-805F-C5DCDF5B12CB@cast.org> References: <02CC8632-9B90-43F6-89FA-1160CE529667@contoso.com> <52179D48-7CF0-405E-805F-C5DCDF5B12CB@cast.org> Message-ID: +Karthik Subrahmanya Didn't we we fix this problem recently? Failed to set extended attribute indicates that temp mount is failing and we don't have quorum number of bricks up. Boris - What's the gluster version are you using? On Mon, Apr 15, 2019 at 7:35 PM Boris Goldowsky wrote: > Atin, thank you for the reply. 
Here are all of those pieces of > information: > > > > [bgoldowsky at webserver9 ~]$ gluster --version > > glusterfs 3.12.2 > > (same on all nodes) > > > > [bgoldowsky at webserver9 ~]$ sudo gluster peer status > > Number of Peers: 3 > > > > Hostname: webserver11.cast.org > > Uuid: c2b147fd-cab4-4859-9922-db5730f8549d > > State: Peer in Cluster (Connected) > > > > Hostname: webserver1.cast.org > > Uuid: 4b918f65-2c9d-478e-8648-81d1d6526d4c > > State: Peer in Cluster (Connected) > > Other names: > > 192.168.200.131 > > webserver1 > > > > Hostname: webserver8.cast.org > > Uuid: be2f568b-61c5-4016-9264-083e4e6453a2 > > State: Peer in Cluster (Connected) > > Other names: > > webserver8 > > > > [bgoldowsky at webserver1 ~]$ sudo gluster v info > > Volume Name: dockervols > > Type: Replicate > > Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a > > Status: Started > > Snapshot Count: 0 > > Number of Bricks: 1 x 3 = 3 > > Transport-type: tcp > > Bricks: > > Brick1: webserver1:/data/gluster/dockervols > > Brick2: webserver11:/data/gluster/dockervols > > Brick3: webserver9:/data/gluster/dockervols > > Options Reconfigured: > > nfs.disable: on > > transport.address-family: inet > > auth.allow: 127.0.0.1 > > > > Volume Name: testvol > > Type: Replicate > > Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 > > Status: Started > > Snapshot Count: 0 > > Number of Bricks: 1 x 4 = 4 > > Transport-type: tcp > > Bricks: > > Brick1: webserver1:/data/gluster/testvol > > Brick2: webserver9:/data/gluster/testvol > > Brick3: webserver11:/data/gluster/testvol > > Brick4: webserver8:/data/gluster/testvol > > Options Reconfigured: > > transport.address-family: inet > > nfs.disable: on > > > > [bgoldowsky at webserver8 ~]$ sudo gluster v info > > Volume Name: dockervols > > Type: Replicate > > Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a > > Status: Started > > Snapshot Count: 0 > > Number of Bricks: 1 x 3 = 3 > > Transport-type: tcp > > Bricks: > > Brick1: webserver1:/data/gluster/dockervols > > Brick2: webserver11:/data/gluster/dockervols > > Brick3: webserver9:/data/gluster/dockervols > > Options Reconfigured: > > nfs.disable: on > > transport.address-family: inet > > auth.allow: 127.0.0.1 > > > > Volume Name: testvol > > Type: Replicate > > Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 > > Status: Started > > Snapshot Count: 0 > > Number of Bricks: 1 x 4 = 4 > > Transport-type: tcp > > Bricks: > > Brick1: webserver1:/data/gluster/testvol > > Brick2: webserver9:/data/gluster/testvol > > Brick3: webserver11:/data/gluster/testvol > > Brick4: webserver8:/data/gluster/testvol > > Options Reconfigured: > > nfs.disable: on > > transport.address-family: inet > > > > [bgoldowsky at webserver9 ~]$ sudo gluster v info > > Volume Name: dockervols > > Type: Replicate > > Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a > > Status: Started > > Snapshot Count: 0 > > Number of Bricks: 1 x 3 = 3 > > Transport-type: tcp > > Bricks: > > Brick1: webserver1:/data/gluster/dockervols > > Brick2: webserver11:/data/gluster/dockervols > > Brick3: webserver9:/data/gluster/dockervols > > Options Reconfigured: > > nfs.disable: on > > transport.address-family: inet > > auth.allow: 127.0.0.1 > > > > Volume Name: testvol > > Type: Replicate > > Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 > > Status: Started > > Snapshot Count: 0 > > Number of Bricks: 1 x 4 = 4 > > Transport-type: tcp > > Bricks: > > Brick1: webserver1:/data/gluster/testvol > > Brick2: webserver9:/data/gluster/testvol > > Brick3: webserver11:/data/gluster/testvol > > 
Brick4: webserver8:/data/gluster/testvol > > Options Reconfigured: > > nfs.disable: on > > transport.address-family: inet > > > > [bgoldowsky at webserver11 ~]$ sudo gluster v info > > Volume Name: dockervols > > Type: Replicate > > Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a > > Status: Started > > Snapshot Count: 0 > > Number of Bricks: 1 x 3 = 3 > > Transport-type: tcp > > Bricks: > > Brick1: webserver1:/data/gluster/dockervols > > Brick2: webserver11:/data/gluster/dockervols > > Brick3: webserver9:/data/gluster/dockervols > > Options Reconfigured: > > auth.allow: 127.0.0.1 > > transport.address-family: inet > > nfs.disable: on > > > > Volume Name: testvol > > Type: Replicate > > Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 > > Status: Started > > Snapshot Count: 0 > > Number of Bricks: 1 x 4 = 4 > > Transport-type: tcp > > Bricks: > > Brick1: webserver1:/data/gluster/testvol > > Brick2: webserver9:/data/gluster/testvol > > Brick3: webserver11:/data/gluster/testvol > > Brick4: webserver8:/data/gluster/testvol > > Options Reconfigured: > > transport.address-family: inet > > nfs.disable: on > > > > [bgoldowsky at webserver9 ~]$ sudo gluster volume add-brick dockervols > replica 4 webserver8:/data/gluster/dockervols force > > volume add-brick: failed: Commit failed on webserver8.cast.org. Please > check log file for details. > > > > Webserver8 glusterd.log: > > > > [2019-04-15 13:55:42.338197] I [MSGID: 106488] > [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: > Received get vol req > > The message "I [MSGID: 106488] > [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: > Received get vol req" repeated 2 times between [2019-04-15 13:55:42.338197] > and [2019-04-15 13:55:42.341618] > > [2019-04-15 14:00:20.445011] I [run.c:190:runner_log] > (-->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x3a215) > [0x7fe697764215] > -->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xe3e9d) > [0x7fe69780de9d] -->/lib64/libglusterfs.so.0(runner_log+0x115) > [0x7fe6a2d16ea5] ) 0-management: Ran script: > /var/lib/glusterd/hooks/1/add-brick/pre/S28Quota-enable-root-xattr-heal.sh > --volname=dockervols --version=1 --volume-op=add-brick > --gd-workdir=/var/lib/glusterd > > [2019-04-15 14:00:20.445148] I [MSGID: 106578] > [glusterd-brick-ops.c:1354:glusterd_op_perform_add_bricks] 0-management: > replica-count is set 4 > > [2019-04-15 14:00:20.445184] I [MSGID: 106578] > [glusterd-brick-ops.c:1364:glusterd_op_perform_add_bricks] 0-management: > type is set 0, need to change it > > [2019-04-15 14:00:20.672347] E [MSGID: 106054] > [glusterd-utils.c:13863:glusterd_handle_replicate_brick_ops] 0-management: > Failed to set extended attribute trusted.add-brick : Transport endpoint is > not connected [Transport endpoint is not connected] > > [2019-04-15 14:00:20.693491] E [MSGID: 101042] > [compat.c:569:gf_umount_lazy] 0-management: Lazy unmount of /tmp/mntmvdFGq > [Transport endpoint is not connected] > > [2019-04-15 14:00:20.693597] E [MSGID: 106074] > [glusterd-brick-ops.c:2590:glusterd_op_add_brick] 0-glusterd: Unable to add > bricks > > [2019-04-15 14:00:20.693637] E [MSGID: 106123] > [glusterd-mgmt.c:312:gd_mgmt_v3_commit_fn] 0-management: Add-brick commit > failed. 
> > [2019-04-15 14:00:20.693667] E [MSGID: 106123] > [glusterd-mgmt-handler.c:616:glusterd_handle_commit_fn] 0-management: > commit failed on operation Add brick > > > > Webserver11 log file: > > > > [2019-04-15 13:56:29.563270] I [MSGID: 106488] > [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: > Received get vol req > > The message "I [MSGID: 106488] > [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: > Received get vol req" repeated 2 times between [2019-04-15 13:56:29.563270] > and [2019-04-15 13:56:29.566209] > > [2019-04-15 14:00:33.996866] I [run.c:190:runner_log] > (-->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x3a215) > [0x7f36de924215] > -->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xe3e9d) > [0x7f36de9cde9d] -->/lib64/libglusterfs.so.0(runner_log+0x115) > [0x7f36e9ed6ea5] ) 0-management: Ran script: > /var/lib/glusterd/hooks/1/add-brick/pre/S28Quota-enable-root-xattr-heal.sh > --volname=dockervols --version=1 --volume-op=add-brick > --gd-workdir=/var/lib/glusterd > > [2019-04-15 14:00:33.996979] I [MSGID: 106578] > [glusterd-brick-ops.c:1354:glusterd_op_perform_add_bricks] 0-management: > replica-count is set 4 > > [2019-04-15 14:00:33.997004] I [MSGID: 106578] > [glusterd-brick-ops.c:1364:glusterd_op_perform_add_bricks] 0-management: > type is set 0, need to change it > > [2019-04-15 14:00:34.013789] I [MSGID: 106132] > [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: nfs already > stopped > > [2019-04-15 14:00:34.013849] I [MSGID: 106568] > [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: nfs service is > stopped > > [2019-04-15 14:00:34.017535] I [MSGID: 106568] > [glusterd-proc-mgmt.c:88:glusterd_proc_stop] 0-management: Stopping > glustershd daemon running in pid: 6087 > > [2019-04-15 14:00:35.018783] I [MSGID: 106568] > [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: glustershd > service is stopped > > [2019-04-15 14:00:35.018952] I [MSGID: 106567] > [glusterd-svc-mgmt.c:211:glusterd_svc_start] 0-management: Starting > glustershd service > > [2019-04-15 14:00:35.028306] I [MSGID: 106132] > [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: bitd already > stopped > > [2019-04-15 14:00:35.028408] I [MSGID: 106568] > [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: bitd service is > stopped > > [2019-04-15 14:00:35.028601] I [MSGID: 106132] > [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: scrub already > stopped > > [2019-04-15 14:00:35.028645] I [MSGID: 106568] > [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: scrub service is > stopped > > > > Thank you for taking a look! > > > > Boris > > > > > > *From: *Atin Mukherjee > *Date: *Friday, April 12, 2019 at 1:10 PM > *To: *Boris Goldowsky > *Cc: *Gluster-users > *Subject: *Re: [Gluster-users] Volume stuck unable to add a brick > > > > > > > > On Fri, 12 Apr 2019 at 22:32, Boris Goldowsky wrote: > > I?ve got a replicated volume with three bricks (?1x3=3?), the idea is to > have a common set of files that are locally available on all the machines > (Scientific Linux 7, which is essentially CentOS 7) in a cluster. > > > > I tried to add on a fourth machine, so used a command like this: > > > > sudo gluster volume add-brick dockervols replica 4 > webserver8:/data/gluster/dockervols force > > > > but the result is: > > volume add-brick: failed: Commit failed on webserver1. Please check log > file for details. > > Commit failed on webserver8. Please check log file for details. 
> > Commit failed on webserver11. Please check log file for details. > > > > Tried: removing the new brick (this also fails) and trying again. > > Tried: checking the logs. The log files are not enlightening to me ? I > don?t know what?s normal and what?s not. > > > > From webserver8 & webserver11 could you attach glusterd log files? > > > > Also please share following: > > - gluster version? (gluster ?version) > > - Output of ?gluster peer status? > > - Output of ?gluster v info? from all 4 nodes. > > > > Tried: deleting the brick directory from previous attempt, so that it?s > not in the way. > > Tried: restarting gluster services > > Tried: rebooting > > Tried: setting up a new volume, replicated to all four machines. This > works, so I?m assuming it?s not a networking issue. But still fails with > this existing volume that has the critical data in it. > > > > Running out of ideas. Any suggestions? Thank you! > > > > Boris > > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users > > -- > > --Atin > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bgoldowsky at cast.org Mon Apr 15 14:05:52 2019 From: bgoldowsky at cast.org (Boris Goldowsky) Date: Mon, 15 Apr 2019 14:05:52 +0000 Subject: [Gluster-users] Volume stuck unable to add a brick In-Reply-To: References: <02CC8632-9B90-43F6-89FA-1160CE529667@contoso.com> Message-ID: <52179D48-7CF0-405E-805F-C5DCDF5B12CB@cast.org> Atin, thank you for the reply. Here are all of those pieces of information: [bgoldowsky at webserver9 ~]$ gluster --version glusterfs 3.12.2 (same on all nodes) [bgoldowsky at webserver9 ~]$ sudo gluster peer status Number of Peers: 3 Hostname: webserver11.cast.org Uuid: c2b147fd-cab4-4859-9922-db5730f8549d State: Peer in Cluster (Connected) Hostname: webserver1.cast.org Uuid: 4b918f65-2c9d-478e-8648-81d1d6526d4c State: Peer in Cluster (Connected) Other names: 192.168.200.131 webserver1 Hostname: webserver8.cast.org Uuid: be2f568b-61c5-4016-9264-083e4e6453a2 State: Peer in Cluster (Connected) Other names: webserver8 [bgoldowsky at webserver1 ~]$ sudo gluster v info Volume Name: dockervols Type: Replicate Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a Status: Started Snapshot Count: 0 Number of Bricks: 1 x 3 = 3 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/dockervols Brick2: webserver11:/data/gluster/dockervols Brick3: webserver9:/data/gluster/dockervols Options Reconfigured: nfs.disable: on transport.address-family: inet auth.allow: 127.0.0.1 Volume Name: testvol Type: Replicate Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 Status: Started Snapshot Count: 0 Number of Bricks: 1 x 4 = 4 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/testvol Brick2: webserver9:/data/gluster/testvol Brick3: webserver11:/data/gluster/testvol Brick4: webserver8:/data/gluster/testvol Options Reconfigured: transport.address-family: inet nfs.disable: on [bgoldowsky at webserver8 ~]$ sudo gluster v info Volume Name: dockervols Type: Replicate Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a Status: Started Snapshot Count: 0 Number of Bricks: 1 x 3 = 3 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/dockervols Brick2: webserver11:/data/gluster/dockervols Brick3: webserver9:/data/gluster/dockervols Options Reconfigured: nfs.disable: on transport.address-family: inet auth.allow: 127.0.0.1 Volume Name: testvol Type: Replicate Volume ID: 
4d5f00f5-00ea-4dcf-babf-1a76eca55332 Status: Started Snapshot Count: 0 Number of Bricks: 1 x 4 = 4 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/testvol Brick2: webserver9:/data/gluster/testvol Brick3: webserver11:/data/gluster/testvol Brick4: webserver8:/data/gluster/testvol Options Reconfigured: nfs.disable: on transport.address-family: inet [bgoldowsky at webserver9 ~]$ sudo gluster v info Volume Name: dockervols Type: Replicate Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a Status: Started Snapshot Count: 0 Number of Bricks: 1 x 3 = 3 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/dockervols Brick2: webserver11:/data/gluster/dockervols Brick3: webserver9:/data/gluster/dockervols Options Reconfigured: nfs.disable: on transport.address-family: inet auth.allow: 127.0.0.1 Volume Name: testvol Type: Replicate Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 Status: Started Snapshot Count: 0 Number of Bricks: 1 x 4 = 4 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/testvol Brick2: webserver9:/data/gluster/testvol Brick3: webserver11:/data/gluster/testvol Brick4: webserver8:/data/gluster/testvol Options Reconfigured: nfs.disable: on transport.address-family: inet [bgoldowsky at webserver11 ~]$ sudo gluster v info Volume Name: dockervols Type: Replicate Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a Status: Started Snapshot Count: 0 Number of Bricks: 1 x 3 = 3 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/dockervols Brick2: webserver11:/data/gluster/dockervols Brick3: webserver9:/data/gluster/dockervols Options Reconfigured: auth.allow: 127.0.0.1 transport.address-family: inet nfs.disable: on Volume Name: testvol Type: Replicate Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 Status: Started Snapshot Count: 0 Number of Bricks: 1 x 4 = 4 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/testvol Brick2: webserver9:/data/gluster/testvol Brick3: webserver11:/data/gluster/testvol Brick4: webserver8:/data/gluster/testvol Options Reconfigured: transport.address-family: inet nfs.disable: on [bgoldowsky at webserver9 ~]$ sudo gluster volume add-brick dockervols replica 4 webserver8:/data/gluster/dockervols force volume add-brick: failed: Commit failed on webserver8.cast.org. Please check log file for details. 
Webserver8 glusterd.log: [2019-04-15 13:55:42.338197] I [MSGID: 106488] [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: Received get vol req The message "I [MSGID: 106488] [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: Received get vol req" repeated 2 times between [2019-04-15 13:55:42.338197] and [2019-04-15 13:55:42.341618] [2019-04-15 14:00:20.445011] I [run.c:190:runner_log] (-->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x3a215) [0x7fe697764215] -->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xe3e9d) [0x7fe69780de9d] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7fe6a2d16ea5] ) 0-management: Ran script: /var/lib/glusterd/hooks/1/add-brick/pre/S28Quota-enable-root-xattr-heal.sh --volname=dockervols --version=1 --volume-op=add-brick --gd-workdir=/var/lib/glusterd [2019-04-15 14:00:20.445148] I [MSGID: 106578] [glusterd-brick-ops.c:1354:glusterd_op_perform_add_bricks] 0-management: replica-count is set 4 [2019-04-15 14:00:20.445184] I [MSGID: 106578] [glusterd-brick-ops.c:1364:glusterd_op_perform_add_bricks] 0-management: type is set 0, need to change it [2019-04-15 14:00:20.672347] E [MSGID: 106054] [glusterd-utils.c:13863:glusterd_handle_replicate_brick_ops] 0-management: Failed to set extended attribute trusted.add-brick : Transport endpoint is not connected [Transport endpoint is not connected] [2019-04-15 14:00:20.693491] E [MSGID: 101042] [compat.c:569:gf_umount_lazy] 0-management: Lazy unmount of /tmp/mntmvdFGq [Transport endpoint is not connected] [2019-04-15 14:00:20.693597] E [MSGID: 106074] [glusterd-brick-ops.c:2590:glusterd_op_add_brick] 0-glusterd: Unable to add bricks [2019-04-15 14:00:20.693637] E [MSGID: 106123] [glusterd-mgmt.c:312:gd_mgmt_v3_commit_fn] 0-management: Add-brick commit failed. 
[2019-04-15 14:00:20.693667] E [MSGID: 106123] [glusterd-mgmt-handler.c:616:glusterd_handle_commit_fn] 0-management: commit failed on operation Add brick Webserver11 log file: [2019-04-15 13:56:29.563270] I [MSGID: 106488] [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: Received get vol req The message "I [MSGID: 106488] [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: Received get vol req" repeated 2 times between [2019-04-15 13:56:29.563270] and [2019-04-15 13:56:29.566209] [2019-04-15 14:00:33.996866] I [run.c:190:runner_log] (-->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x3a215) [0x7f36de924215] -->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xe3e9d) [0x7f36de9cde9d] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7f36e9ed6ea5] ) 0-management: Ran script: /var/lib/glusterd/hooks/1/add-brick/pre/S28Quota-enable-root-xattr-heal.sh --volname=dockervols --version=1 --volume-op=add-brick --gd-workdir=/var/lib/glusterd [2019-04-15 14:00:33.996979] I [MSGID: 106578] [glusterd-brick-ops.c:1354:glusterd_op_perform_add_bricks] 0-management: replica-count is set 4 [2019-04-15 14:00:33.997004] I [MSGID: 106578] [glusterd-brick-ops.c:1364:glusterd_op_perform_add_bricks] 0-management: type is set 0, need to change it [2019-04-15 14:00:34.013789] I [MSGID: 106132] [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: nfs already stopped [2019-04-15 14:00:34.013849] I [MSGID: 106568] [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: nfs service is stopped [2019-04-15 14:00:34.017535] I [MSGID: 106568] [glusterd-proc-mgmt.c:88:glusterd_proc_stop] 0-management: Stopping glustershd daemon running in pid: 6087 [2019-04-15 14:00:35.018783] I [MSGID: 106568] [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: glustershd service is stopped [2019-04-15 14:00:35.018952] I [MSGID: 106567] [glusterd-svc-mgmt.c:211:glusterd_svc_start] 0-management: Starting glustershd service [2019-04-15 14:00:35.028306] I [MSGID: 106132] [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: bitd already stopped [2019-04-15 14:00:35.028408] I [MSGID: 106568] [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: bitd service is stopped [2019-04-15 14:00:35.028601] I [MSGID: 106132] [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: scrub already stopped [2019-04-15 14:00:35.028645] I [MSGID: 106568] [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: scrub service is stopped Thank you for taking a look! Boris From: Atin Mukherjee Date: Friday, April 12, 2019 at 1:10 PM To: Boris Goldowsky Cc: Gluster-users Subject: Re: [Gluster-users] Volume stuck unable to add a brick On Fri, 12 Apr 2019 at 22:32, Boris Goldowsky > wrote: I?ve got a replicated volume with three bricks (?1x3=3?), the idea is to have a common set of files that are locally available on all the machines (Scientific Linux 7, which is essentially CentOS 7) in a cluster. I tried to add on a fourth machine, so used a command like this: sudo gluster volume add-brick dockervols replica 4 webserver8:/data/gluster/dockervols force but the result is: volume add-brick: failed: Commit failed on webserver1. Please check log file for details. Commit failed on webserver8. Please check log file for details. Commit failed on webserver11. Please check log file for details. Tried: removing the new brick (this also fails) and trying again. Tried: checking the logs. The log files are not enlightening to me ? I don?t know what?s normal and what?s not. 
From webserver8 & webserver11 could you attach glusterd log files? Also please share following: - gluster version? (gluster ?version) - Output of ?gluster peer status? - Output of ?gluster v info? from all 4 nodes. Tried: deleting the brick directory from previous attempt, so that it?s not in the way. Tried: restarting gluster services Tried: rebooting Tried: setting up a new volume, replicated to all four machines. This works, so I?m assuming it?s not a networking issue. But still fails with this existing volume that has the critical data in it. Running out of ideas. Any suggestions? Thank you! Boris _______________________________________________ Gluster-users mailing list Gluster-users at gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users -- --Atin -------------- next part -------------- An HTML attachment was scrubbed... URL: From bgoldowsky at cast.org Mon Apr 15 17:19:09 2019 From: bgoldowsky at cast.org (Boris Goldowsky) Date: Mon, 15 Apr 2019 17:19:09 +0000 Subject: [Gluster-users] Volume stuck unable to add a brick Message-ID: <2d745cdf-c14d-4f15-9482-2b1176c5ecde@email.android.com> 3.12.2 Boris On Apr 15, 2019 12:13 PM, Atin Mukherjee wrote: +Karthik Subrahmanya Didn't we we fix this problem recently? Failed to set extended attribute indicates that temp mount is failing and we don't have quorum number of bricks up. Boris - What's the gluster version are you using? On Mon, Apr 15, 2019 at 7:35 PM Boris Goldowsky > wrote: Atin, thank you for the reply. Here are all of those pieces of information: [bgoldowsky at webserver9 ~]$ gluster --version glusterfs 3.12.2 (same on all nodes) [bgoldowsky at webserver9 ~]$ sudo gluster peer status Number of Peers: 3 Hostname: webserver11.cast.org Uuid: c2b147fd-cab4-4859-9922-db5730f8549d State: Peer in Cluster (Connected) Hostname: webserver1.cast.org Uuid: 4b918f65-2c9d-478e-8648-81d1d6526d4c State: Peer in Cluster (Connected) Other names: 192.168.200.131 webserver1 Hostname: webserver8.cast.org Uuid: be2f568b-61c5-4016-9264-083e4e6453a2 State: Peer in Cluster (Connected) Other names: webserver8 [bgoldowsky at webserver1 ~]$ sudo gluster v info Volume Name: dockervols Type: Replicate Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a Status: Started Snapshot Count: 0 Number of Bricks: 1 x 3 = 3 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/dockervols Brick2: webserver11:/data/gluster/dockervols Brick3: webserver9:/data/gluster/dockervols Options Reconfigured: nfs.disable: on transport.address-family: inet auth.allow: 127.0.0.1 Volume Name: testvol Type: Replicate Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 Status: Started Snapshot Count: 0 Number of Bricks: 1 x 4 = 4 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/testvol Brick2: webserver9:/data/gluster/testvol Brick3: webserver11:/data/gluster/testvol Brick4: webserver8:/data/gluster/testvol Options Reconfigured: transport.address-family: inet nfs.disable: on [bgoldowsky at webserver8 ~]$ sudo gluster v info Volume Name: dockervols Type: Replicate Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a Status: Started Snapshot Count: 0 Number of Bricks: 1 x 3 = 3 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/dockervols Brick2: webserver11:/data/gluster/dockervols Brick3: webserver9:/data/gluster/dockervols Options Reconfigured: nfs.disable: on transport.address-family: inet auth.allow: 127.0.0.1 Volume Name: testvol Type: Replicate Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 Status: Started Snapshot Count: 0 Number of Bricks: 1 x 4 = 
4 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/testvol Brick2: webserver9:/data/gluster/testvol Brick3: webserver11:/data/gluster/testvol Brick4: webserver8:/data/gluster/testvol Options Reconfigured: nfs.disable: on transport.address-family: inet [bgoldowsky at webserver9 ~]$ sudo gluster v info Volume Name: dockervols Type: Replicate Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a Status: Started Snapshot Count: 0 Number of Bricks: 1 x 3 = 3 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/dockervols Brick2: webserver11:/data/gluster/dockervols Brick3: webserver9:/data/gluster/dockervols Options Reconfigured: nfs.disable: on transport.address-family: inet auth.allow: 127.0.0.1 Volume Name: testvol Type: Replicate Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 Status: Started Snapshot Count: 0 Number of Bricks: 1 x 4 = 4 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/testvol Brick2: webserver9:/data/gluster/testvol Brick3: webserver11:/data/gluster/testvol Brick4: webserver8:/data/gluster/testvol Options Reconfigured: nfs.disable: on transport.address-family: inet [bgoldowsky at webserver11 ~]$ sudo gluster v info Volume Name: dockervols Type: Replicate Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a Status: Started Snapshot Count: 0 Number of Bricks: 1 x 3 = 3 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/dockervols Brick2: webserver11:/data/gluster/dockervols Brick3: webserver9:/data/gluster/dockervols Options Reconfigured: auth.allow: 127.0.0.1 transport.address-family: inet nfs.disable: on Volume Name: testvol Type: Replicate Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 Status: Started Snapshot Count: 0 Number of Bricks: 1 x 4 = 4 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/testvol Brick2: webserver9:/data/gluster/testvol Brick3: webserver11:/data/gluster/testvol Brick4: webserver8:/data/gluster/testvol Options Reconfigured: transport.address-family: inet nfs.disable: on [bgoldowsky at webserver9 ~]$ sudo gluster volume add-brick dockervols replica 4 webserver8:/data/gluster/dockervols force volume add-brick: failed: Commit failed on webserver8.cast.org. Please check log file for details. 
Webserver8 glusterd.log: [2019-04-15 13:55:42.338197] I [MSGID: 106488] [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: Received get vol req The message "I [MSGID: 106488] [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: Received get vol req" repeated 2 times between [2019-04-15 13:55:42.338197] and [2019-04-15 13:55:42.341618] [2019-04-15 14:00:20.445011] I [run.c:190:runner_log] (-->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x3a215) [0x7fe697764215] -->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xe3e9d) [0x7fe69780de9d] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7fe6a2d16ea5] ) 0-management: Ran script: /var/lib/glusterd/hooks/1/add-brick/pre/S28Quota-enable-root-xattr-heal.sh --volname=dockervols --version=1 --volume-op=add-brick --gd-workdir=/var/lib/glusterd [2019-04-15 14:00:20.445148] I [MSGID: 106578] [glusterd-brick-ops.c:1354:glusterd_op_perform_add_bricks] 0-management: replica-count is set 4 [2019-04-15 14:00:20.445184] I [MSGID: 106578] [glusterd-brick-ops.c:1364:glusterd_op_perform_add_bricks] 0-management: type is set 0, need to change it [2019-04-15 14:00:20.672347] E [MSGID: 106054] [glusterd-utils.c:13863:glusterd_handle_replicate_brick_ops] 0-management: Failed to set extended attribute trusted.add-brick : Transport endpoint is not connected [Transport endpoint is not connected] [2019-04-15 14:00:20.693491] E [MSGID: 101042] [compat.c:569:gf_umount_lazy] 0-management: Lazy unmount of /tmp/mntmvdFGq [Transport endpoint is not connected] [2019-04-15 14:00:20.693597] E [MSGID: 106074] [glusterd-brick-ops.c:2590:glusterd_op_add_brick] 0-glusterd: Unable to add bricks [2019-04-15 14:00:20.693637] E [MSGID: 106123] [glusterd-mgmt.c:312:gd_mgmt_v3_commit_fn] 0-management: Add-brick commit failed. 
[2019-04-15 14:00:20.693667] E [MSGID: 106123] [glusterd-mgmt-handler.c:616:glusterd_handle_commit_fn] 0-management: commit failed on operation Add brick Webserver11 log file: [2019-04-15 13:56:29.563270] I [MSGID: 106488] [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: Received get vol req The message "I [MSGID: 106488] [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: Received get vol req" repeated 2 times between [2019-04-15 13:56:29.563270] and [2019-04-15 13:56:29.566209] [2019-04-15 14:00:33.996866] I [run.c:190:runner_log] (-->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x3a215) [0x7f36de924215] -->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xe3e9d) [0x7f36de9cde9d] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7f36e9ed6ea5] ) 0-management: Ran script: /var/lib/glusterd/hooks/1/add-brick/pre/S28Quota-enable-root-xattr-heal.sh --volname=dockervols --version=1 --volume-op=add-brick --gd-workdir=/var/lib/glusterd [2019-04-15 14:00:33.996979] I [MSGID: 106578] [glusterd-brick-ops.c:1354:glusterd_op_perform_add_bricks] 0-management: replica-count is set 4 [2019-04-15 14:00:33.997004] I [MSGID: 106578] [glusterd-brick-ops.c:1364:glusterd_op_perform_add_bricks] 0-management: type is set 0, need to change it [2019-04-15 14:00:34.013789] I [MSGID: 106132] [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: nfs already stopped [2019-04-15 14:00:34.013849] I [MSGID: 106568] [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: nfs service is stopped [2019-04-15 14:00:34.017535] I [MSGID: 106568] [glusterd-proc-mgmt.c:88:glusterd_proc_stop] 0-management: Stopping glustershd daemon running in pid: 6087 [2019-04-15 14:00:35.018783] I [MSGID: 106568] [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: glustershd service is stopped [2019-04-15 14:00:35.018952] I [MSGID: 106567] [glusterd-svc-mgmt.c:211:glusterd_svc_start] 0-management: Starting glustershd service [2019-04-15 14:00:35.028306] I [MSGID: 106132] [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: bitd already stopped [2019-04-15 14:00:35.028408] I [MSGID: 106568] [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: bitd service is stopped [2019-04-15 14:00:35.028601] I [MSGID: 106132] [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: scrub already stopped [2019-04-15 14:00:35.028645] I [MSGID: 106568] [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: scrub service is stopped Thank you for taking a look! Boris From: Atin Mukherjee > Date: Friday, April 12, 2019 at 1:10 PM To: Boris Goldowsky > Cc: Gluster-users > Subject: Re: [Gluster-users] Volume stuck unable to add a brick On Fri, 12 Apr 2019 at 22:32, Boris Goldowsky > wrote: I?ve got a replicated volume with three bricks (?1x3=3?), the idea is to have a common set of files that are locally available on all the machines (Scientific Linux 7, which is essentially CentOS 7) in a cluster. I tried to add on a fourth machine, so used a command like this: sudo gluster volume add-brick dockervols replica 4 webserver8:/data/gluster/dockervols force but the result is: volume add-brick: failed: Commit failed on webserver1. Please check log file for details. Commit failed on webserver8. Please check log file for details. Commit failed on webserver11. Please check log file for details. Tried: removing the new brick (this also fails) and trying again. Tried: checking the logs. The log files are not enlightening to me ? I don?t know what?s normal and what?s not. 
From webserver8 & webserver11 could you attach glusterd log files? Also please share following: - gluster version? (gluster ?version) - Output of ?gluster peer status? - Output of ?gluster v info? from all 4 nodes. Tried: deleting the brick directory from previous attempt, so that it?s not in the way. Tried: restarting gluster services Tried: rebooting Tried: setting up a new volume, replicated to all four machines. This works, so I?m assuming it?s not a networking issue. But still fails with this existing volume that has the critical data in it. Running out of ideas. Any suggestions? Thank you! Boris _______________________________________________ Gluster-users mailing list Gluster-users at gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users -- --Atin -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgurusid at redhat.com Tue Apr 16 05:57:01 2019 From: pgurusid at redhat.com (Poornima Gurusiddaiah) Date: Tue, 16 Apr 2019 11:27:01 +0530 Subject: [Gluster-users] Reg: Gluster In-Reply-To: References: Message-ID: +Sunny On Wed, Apr 10, 2019, 9:02 PM Gomathi Nayagam wrote: > Hi User, > > We are testing geo-replication of gluster it is taking nearly 8 mins to > transfer 16 GB size of data between the DCs while when transferred the same > data over plain rsync it took only 2 mins. Can we know if we are missing > something? > > > > Thanks & Regards, > Gomathi Nayagam.D > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksubrahm at redhat.com Tue Apr 16 06:51:44 2019 From: ksubrahm at redhat.com (Karthik Subrahmanya) Date: Tue, 16 Apr 2019 12:21:44 +0530 Subject: [Gluster-users] Volume stuck unable to add a brick In-Reply-To: References: <02CC8632-9B90-43F6-89FA-1160CE529667@contoso.com> <52179D48-7CF0-405E-805F-C5DCDF5B12CB@cast.org> Message-ID: On Mon, Apr 15, 2019 at 9:43 PM Atin Mukherjee wrote: > +Karthik Subrahmanya > > Didn't we we fix this problem recently? Failed to set extended attribute > indicates that temp mount is failing and we don't have quorum number of > bricks up. > We had two fixes which handles two kind of add-brick scenarios. [1] Fails add-brick when increasing the replica count if any of the brick is down to avoid data loss. This can be overridden by using the force option. [2] Allow add-brick to set the extended attributes by the temp mount if the volume is already mounted (has clients). They are in version 3.12.2 so, patch [1] is present there. But since they are using the force option it should not have any problem even if they have any brick down. The error message they are getting is also different, so it is not because of any brick being down I guess. Patch [2] is not present in 3.12.2 and it is not the conversion from plain distribute to replicate volume. So the scenario is different here. It seems like they are hitting some other issue. @Boris, Can you attach the add-brick's temp mount log. The file name should look something like "dockervols-add-brick-mount.log". Can you also provide all the brick logs of that volume during that time. [1] https://review.gluster.org/#/c/glusterfs/+/16330/ [2] https://review.gluster.org/#/c/glusterfs/+/21791/ Regards, Karthik > > Boris - What's the gluster version are you using? > > > > On Mon, Apr 15, 2019 at 7:35 PM Boris Goldowsky > wrote: > >> Atin, thank you for the reply. 
Here are all of those pieces of >> information: >> >> >> >> [bgoldowsky at webserver9 ~]$ gluster --version >> >> glusterfs 3.12.2 >> >> (same on all nodes) >> >> >> >> [bgoldowsky at webserver9 ~]$ sudo gluster peer status >> >> Number of Peers: 3 >> >> >> >> Hostname: webserver11.cast.org >> >> Uuid: c2b147fd-cab4-4859-9922-db5730f8549d >> >> State: Peer in Cluster (Connected) >> >> >> >> Hostname: webserver1.cast.org >> >> Uuid: 4b918f65-2c9d-478e-8648-81d1d6526d4c >> >> State: Peer in Cluster (Connected) >> >> Other names: >> >> 192.168.200.131 >> >> webserver1 >> >> >> >> Hostname: webserver8.cast.org >> >> Uuid: be2f568b-61c5-4016-9264-083e4e6453a2 >> >> State: Peer in Cluster (Connected) >> >> Other names: >> >> webserver8 >> >> >> >> [bgoldowsky at webserver1 ~]$ sudo gluster v info >> >> Volume Name: dockervols >> >> Type: Replicate >> >> Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a >> >> Status: Started >> >> Snapshot Count: 0 >> >> Number of Bricks: 1 x 3 = 3 >> >> Transport-type: tcp >> >> Bricks: >> >> Brick1: webserver1:/data/gluster/dockervols >> >> Brick2: webserver11:/data/gluster/dockervols >> >> Brick3: webserver9:/data/gluster/dockervols >> >> Options Reconfigured: >> >> nfs.disable: on >> >> transport.address-family: inet >> >> auth.allow: 127.0.0.1 >> >> >> >> Volume Name: testvol >> >> Type: Replicate >> >> Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 >> >> Status: Started >> >> Snapshot Count: 0 >> >> Number of Bricks: 1 x 4 = 4 >> >> Transport-type: tcp >> >> Bricks: >> >> Brick1: webserver1:/data/gluster/testvol >> >> Brick2: webserver9:/data/gluster/testvol >> >> Brick3: webserver11:/data/gluster/testvol >> >> Brick4: webserver8:/data/gluster/testvol >> >> Options Reconfigured: >> >> transport.address-family: inet >> >> nfs.disable: on >> >> >> >> [bgoldowsky at webserver8 ~]$ sudo gluster v info >> >> Volume Name: dockervols >> >> Type: Replicate >> >> Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a >> >> Status: Started >> >> Snapshot Count: 0 >> >> Number of Bricks: 1 x 3 = 3 >> >> Transport-type: tcp >> >> Bricks: >> >> Brick1: webserver1:/data/gluster/dockervols >> >> Brick2: webserver11:/data/gluster/dockervols >> >> Brick3: webserver9:/data/gluster/dockervols >> >> Options Reconfigured: >> >> nfs.disable: on >> >> transport.address-family: inet >> >> auth.allow: 127.0.0.1 >> >> >> >> Volume Name: testvol >> >> Type: Replicate >> >> Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 >> >> Status: Started >> >> Snapshot Count: 0 >> >> Number of Bricks: 1 x 4 = 4 >> >> Transport-type: tcp >> >> Bricks: >> >> Brick1: webserver1:/data/gluster/testvol >> >> Brick2: webserver9:/data/gluster/testvol >> >> Brick3: webserver11:/data/gluster/testvol >> >> Brick4: webserver8:/data/gluster/testvol >> >> Options Reconfigured: >> >> nfs.disable: on >> >> transport.address-family: inet >> >> >> >> [bgoldowsky at webserver9 ~]$ sudo gluster v info >> >> Volume Name: dockervols >> >> Type: Replicate >> >> Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a >> >> Status: Started >> >> Snapshot Count: 0 >> >> Number of Bricks: 1 x 3 = 3 >> >> Transport-type: tcp >> >> Bricks: >> >> Brick1: webserver1:/data/gluster/dockervols >> >> Brick2: webserver11:/data/gluster/dockervols >> >> Brick3: webserver9:/data/gluster/dockervols >> >> Options Reconfigured: >> >> nfs.disable: on >> >> transport.address-family: inet >> >> auth.allow: 127.0.0.1 >> >> >> >> Volume Name: testvol >> >> Type: Replicate >> >> Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 >> >> Status: Started >> >> 
Snapshot Count: 0 >> >> Number of Bricks: 1 x 4 = 4 >> >> Transport-type: tcp >> >> Bricks: >> >> Brick1: webserver1:/data/gluster/testvol >> >> Brick2: webserver9:/data/gluster/testvol >> >> Brick3: webserver11:/data/gluster/testvol >> >> Brick4: webserver8:/data/gluster/testvol >> >> Options Reconfigured: >> >> nfs.disable: on >> >> transport.address-family: inet >> >> >> >> [bgoldowsky at webserver11 ~]$ sudo gluster v info >> >> Volume Name: dockervols >> >> Type: Replicate >> >> Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a >> >> Status: Started >> >> Snapshot Count: 0 >> >> Number of Bricks: 1 x 3 = 3 >> >> Transport-type: tcp >> >> Bricks: >> >> Brick1: webserver1:/data/gluster/dockervols >> >> Brick2: webserver11:/data/gluster/dockervols >> >> Brick3: webserver9:/data/gluster/dockervols >> >> Options Reconfigured: >> >> auth.allow: 127.0.0.1 >> >> transport.address-family: inet >> >> nfs.disable: on >> >> >> >> Volume Name: testvol >> >> Type: Replicate >> >> Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 >> >> Status: Started >> >> Snapshot Count: 0 >> >> Number of Bricks: 1 x 4 = 4 >> >> Transport-type: tcp >> >> Bricks: >> >> Brick1: webserver1:/data/gluster/testvol >> >> Brick2: webserver9:/data/gluster/testvol >> >> Brick3: webserver11:/data/gluster/testvol >> >> Brick4: webserver8:/data/gluster/testvol >> >> Options Reconfigured: >> >> transport.address-family: inet >> >> nfs.disable: on >> >> >> >> [bgoldowsky at webserver9 ~]$ sudo gluster volume add-brick dockervols >> replica 4 webserver8:/data/gluster/dockervols force >> >> volume add-brick: failed: Commit failed on webserver8.cast.org. Please >> check log file for details. >> >> >> >> Webserver8 glusterd.log: >> >> >> >> [2019-04-15 13:55:42.338197] I [MSGID: 106488] >> [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: >> Received get vol req >> >> The message "I [MSGID: 106488] >> [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: >> Received get vol req" repeated 2 times between [2019-04-15 13:55:42.338197] >> and [2019-04-15 13:55:42.341618] >> >> [2019-04-15 14:00:20.445011] I [run.c:190:runner_log] >> (-->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x3a215) >> [0x7fe697764215] >> -->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xe3e9d) >> [0x7fe69780de9d] -->/lib64/libglusterfs.so.0(runner_log+0x115) >> [0x7fe6a2d16ea5] ) 0-management: Ran script: >> /var/lib/glusterd/hooks/1/add-brick/pre/S28Quota-enable-root-xattr-heal.sh >> --volname=dockervols --version=1 --volume-op=add-brick >> --gd-workdir=/var/lib/glusterd >> >> [2019-04-15 14:00:20.445148] I [MSGID: 106578] >> [glusterd-brick-ops.c:1354:glusterd_op_perform_add_bricks] 0-management: >> replica-count is set 4 >> >> [2019-04-15 14:00:20.445184] I [MSGID: 106578] >> [glusterd-brick-ops.c:1364:glusterd_op_perform_add_bricks] 0-management: >> type is set 0, need to change it >> >> [2019-04-15 14:00:20.672347] E [MSGID: 106054] >> [glusterd-utils.c:13863:glusterd_handle_replicate_brick_ops] 0-management: >> Failed to set extended attribute trusted.add-brick : Transport endpoint is >> not connected [Transport endpoint is not connected] >> >> [2019-04-15 14:00:20.693491] E [MSGID: 101042] >> [compat.c:569:gf_umount_lazy] 0-management: Lazy unmount of /tmp/mntmvdFGq >> [Transport endpoint is not connected] >> >> [2019-04-15 14:00:20.693597] E [MSGID: 106074] >> [glusterd-brick-ops.c:2590:glusterd_op_add_brick] 0-glusterd: Unable to add >> bricks >> >> [2019-04-15 14:00:20.693637] E [MSGID: 106123] >> 
[glusterd-mgmt.c:312:gd_mgmt_v3_commit_fn] 0-management: Add-brick commit >> failed. >> >> [2019-04-15 14:00:20.693667] E [MSGID: 106123] >> [glusterd-mgmt-handler.c:616:glusterd_handle_commit_fn] 0-management: >> commit failed on operation Add brick >> >> >> >> Webserver11 log file: >> >> >> >> [2019-04-15 13:56:29.563270] I [MSGID: 106488] >> [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: >> Received get vol req >> >> The message "I [MSGID: 106488] >> [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: >> Received get vol req" repeated 2 times between [2019-04-15 13:56:29.563270] >> and [2019-04-15 13:56:29.566209] >> >> [2019-04-15 14:00:33.996866] I [run.c:190:runner_log] >> (-->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x3a215) >> [0x7f36de924215] >> -->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xe3e9d) >> [0x7f36de9cde9d] -->/lib64/libglusterfs.so.0(runner_log+0x115) >> [0x7f36e9ed6ea5] ) 0-management: Ran script: >> /var/lib/glusterd/hooks/1/add-brick/pre/S28Quota-enable-root-xattr-heal.sh >> --volname=dockervols --version=1 --volume-op=add-brick >> --gd-workdir=/var/lib/glusterd >> >> [2019-04-15 14:00:33.996979] I [MSGID: 106578] >> [glusterd-brick-ops.c:1354:glusterd_op_perform_add_bricks] 0-management: >> replica-count is set 4 >> >> [2019-04-15 14:00:33.997004] I [MSGID: 106578] >> [glusterd-brick-ops.c:1364:glusterd_op_perform_add_bricks] 0-management: >> type is set 0, need to change it >> >> [2019-04-15 14:00:34.013789] I [MSGID: 106132] >> [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: nfs already >> stopped >> >> [2019-04-15 14:00:34.013849] I [MSGID: 106568] >> [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: nfs service is >> stopped >> >> [2019-04-15 14:00:34.017535] I [MSGID: 106568] >> [glusterd-proc-mgmt.c:88:glusterd_proc_stop] 0-management: Stopping >> glustershd daemon running in pid: 6087 >> >> [2019-04-15 14:00:35.018783] I [MSGID: 106568] >> [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: glustershd >> service is stopped >> >> [2019-04-15 14:00:35.018952] I [MSGID: 106567] >> [glusterd-svc-mgmt.c:211:glusterd_svc_start] 0-management: Starting >> glustershd service >> >> [2019-04-15 14:00:35.028306] I [MSGID: 106132] >> [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: bitd already >> stopped >> >> [2019-04-15 14:00:35.028408] I [MSGID: 106568] >> [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: bitd service is >> stopped >> >> [2019-04-15 14:00:35.028601] I [MSGID: 106132] >> [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: scrub already >> stopped >> >> [2019-04-15 14:00:35.028645] I [MSGID: 106568] >> [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: scrub service is >> stopped >> >> >> >> Thank you for taking a look! >> >> >> >> Boris >> >> >> >> >> >> *From: *Atin Mukherjee >> *Date: *Friday, April 12, 2019 at 1:10 PM >> *To: *Boris Goldowsky >> *Cc: *Gluster-users >> *Subject: *Re: [Gluster-users] Volume stuck unable to add a brick >> >> >> >> >> >> >> >> On Fri, 12 Apr 2019 at 22:32, Boris Goldowsky >> wrote: >> >> I?ve got a replicated volume with three bricks (?1x3=3?), the idea is to >> have a common set of files that are locally available on all the machines >> (Scientific Linux 7, which is essentially CentOS 7) in a cluster. 
>> >> >> >> I tried to add on a fourth machine, so used a command like this: >> >> >> >> sudo gluster volume add-brick dockervols replica 4 >> webserver8:/data/gluster/dockervols force >> >> >> >> but the result is: >> >> volume add-brick: failed: Commit failed on webserver1. Please check log >> file for details. >> >> Commit failed on webserver8. Please check log file for details. >> >> Commit failed on webserver11. Please check log file for details. >> >> >> >> Tried: removing the new brick (this also fails) and trying again. >> >> Tried: checking the logs. The log files are not enlightening to me ? I >> don?t know what?s normal and what?s not. >> >> >> >> From webserver8 & webserver11 could you attach glusterd log files? >> >> >> >> Also please share following: >> >> - gluster version? (gluster ?version) >> >> - Output of ?gluster peer status? >> >> - Output of ?gluster v info? from all 4 nodes. >> >> >> >> Tried: deleting the brick directory from previous attempt, so that it?s >> not in the way. >> >> Tried: restarting gluster services >> >> Tried: rebooting >> >> Tried: setting up a new volume, replicated to all four machines. This >> works, so I?m assuming it?s not a networking issue. But still fails with >> this existing volume that has the critical data in it. >> >> >> >> Running out of ideas. Any suggestions? Thank you! >> >> >> >> Boris >> >> >> >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users >> >> -- >> >> --Atin >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgurusid at redhat.com Tue Apr 16 06:54:43 2019 From: pgurusid at redhat.com (Poornima Gurusiddaiah) Date: Tue, 16 Apr 2019 12:24:43 +0530 Subject: [Gluster-users] Difference between processes: shrinking volume and replacing faulty brick In-Reply-To: References: Message-ID: Do you have plain distributed volume without any replication? If so replace brick should copy the data on the faulty brick to the new brick, unless there is some old data which also would need rebalance. Having, add brick followed by remove brick and doing a rebalance is inefficient, i think we should have just the old brick data copied to the new brick, and rebalance the whole volume when necessary. Adding the distribute experts to the thread. If you are ok with downtime, trying xfsdump and restore of the faulty brick and reforming the volume may be faster. Regards, Poornima On Mon, Apr 15, 2019, 6:40 PM Greene, Tami McFarlin wrote: > We need to remove a server node from our configuration (distributed > volume). There is more than enough space on the remaining bricks to > accept the data attached to the failing server; we didn?t know if one > process or the other would be significantly faster. We know shrinking the > volume (remove-brick) rebalances as it moves the data; so moving 506G > resuled in the rebalancing of 1.8T and took considerable time. > > > > Reading the documentation, it seems that replacing a brick is simplying > introducing an empty brick to accept the displaced data, but it is the > exact same process: remove-brick. > > > > Is there anyway to migrate the data without rebalancing at the same time > and then rebalancing once all data has been moved? I know that is not > ideal, but it would allow us to remove the problem server much quicker and > resume production while rebalancing. 
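(For reference, the shrink path weighed above is the standard remove-brick sequence; a rough sketch, with volume name, server and brick path as placeholders:

  gluster volume remove-brick <volname> <server>:<brick-path> start
  gluster volume remove-brick <volname> <server>:<brick-path> status
  gluster volume remove-brick <volname> <server>:<brick-path> commit

"start" migrates the data off the departing brick, "status" reports progress, and "commit" drops the brick once the migration shows completed. That data migration is the rebalancing described above.)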
> > > > Tami > > > > Tami McFarlin Greene > > Lab Technician > > RF, Communications, and Intelligent Systems Group > > Electrical and Electronics System Research Division > > Oak Ridge National Laboratory > > Bldg. 3500, Rm. A15 > > greenet at ornl.gov (865) > 643-0401 > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgurusid at redhat.com Tue Apr 16 06:57:32 2019 From: pgurusid at redhat.com (Poornima Gurusiddaiah) Date: Tue, 16 Apr 2019 12:27:32 +0530 Subject: [Gluster-users] Upgrade 5.5 -> 5.6: network traffic bug fixed? In-Reply-To: References: Message-ID: Thank you for reporting this. I had done testing on my local setup and the issue was resolved even with quick-read enabled. Let me test it again. Regards, Poornima On Mon, Apr 15, 2019 at 12:25 PM Hu Bert wrote: > fyi: after setting performance.quick-read to off network traffic > dropped to normal levels, client load/iowait back to normal as well. > > client: https://abload.de/img/network-client-afterihjqi.png > server: https://abload.de/img/network-server-afterwdkrl.png > > Am Mo., 15. Apr. 2019 um 08:33 Uhr schrieb Hu Bert >: > > > > Good Morning, > > > > today i updated my replica 3 setup (debian stretch) from version 5.5 > > to 5.6, as i thought the network traffic bug (#1673058) was fixed and > > i could re-activate 'performance.quick-read' again. See release notes: > > > > https://review.gluster.org/#/c/glusterfs/+/22538/ > > > http://git.gluster.org/cgit/glusterfs.git/commit/?id=34a2347780c2429284f57232f3aabb78547a9795 > > > > Upgrade went fine, and then i was watching iowait and network traffic. > > It seems that the network traffic went up after upgrade and > > reactivation of performance.quick-read. Here are some graphs: > > > > network client1: https://abload.de/img/network-clientfwj1m.png > > network client2: https://abload.de/img/network-client2trkow.png > > network server: https://abload.de/img/network-serverv3jjr.png > > > > gluster volume info: https://pastebin.com/ZMuJYXRZ > > > > Just wondering if the network traffic bug really got fixed or if this > > is a new problem. I'll wait a couple of minutes and then deactivate > > performance.quick-read again, just to see if network traffic goes down > > to normal levels. > > > > > > Best regards, > > Hubert > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users > -------------- next part -------------- An HTML attachment was scrubbed... URL: From revirii at googlemail.com Tue Apr 16 07:43:47 2019 From: revirii at googlemail.com (Hu Bert) Date: Tue, 16 Apr 2019 09:43:47 +0200 Subject: [Gluster-users] Upgrade 5.5 -> 5.6: network traffic bug fixed? In-Reply-To: References: Message-ID: In my first test on my testing setup the traffic was on a normal level, so i thought i was "safe". But on my live system the network traffic was a multiple of the traffic one would expect. performance.quick-read was enabled in both, the only difference in the volume options between live and testing are: performance.read-ahead: testing on, live off performance.io-cache: testing on, live off I ran another test on my testing setup, deactivated both and copied 9 GB of data. Now the traffic went up as well, from before ~9-10 MBit/s up to 100 MBit/s with both options off. 
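(For reference, the toggles used in these copy tests are plain volume option changes; something along these lines, with <volname> standing in for the test volume:

  gluster volume set <volname> performance.quick-read on
  gluster volume set <volname> performance.read-ahead off
  gluster volume set <volname> performance.io-cache off
  gluster volume get <volname> all | grep -E 'quick-read|read-ahead|io-cache'

The last line just confirms the effective values before each 9 GB copy run.)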
Does performance.quick-read require one of those options set to 'on'? I'll start another test shortly, and activate on of those 2 options, maybe there's a connection between those 3 options? Best Regards, Hubert Am Di., 16. Apr. 2019 um 08:57 Uhr schrieb Poornima Gurusiddaiah : > > Thank you for reporting this. I had done testing on my local setup and the issue was resolved even with quick-read enabled. Let me test it again. > > Regards, > Poornima > > On Mon, Apr 15, 2019 at 12:25 PM Hu Bert wrote: >> >> fyi: after setting performance.quick-read to off network traffic >> dropped to normal levels, client load/iowait back to normal as well. >> >> client: https://abload.de/img/network-client-afterihjqi.png >> server: https://abload.de/img/network-server-afterwdkrl.png >> >> Am Mo., 15. Apr. 2019 um 08:33 Uhr schrieb Hu Bert : >> > >> > Good Morning, >> > >> > today i updated my replica 3 setup (debian stretch) from version 5.5 >> > to 5.6, as i thought the network traffic bug (#1673058) was fixed and >> > i could re-activate 'performance.quick-read' again. See release notes: >> > >> > https://review.gluster.org/#/c/glusterfs/+/22538/ >> > http://git.gluster.org/cgit/glusterfs.git/commit/?id=34a2347780c2429284f57232f3aabb78547a9795 >> > >> > Upgrade went fine, and then i was watching iowait and network traffic. >> > It seems that the network traffic went up after upgrade and >> > reactivation of performance.quick-read. Here are some graphs: >> > >> > network client1: https://abload.de/img/network-clientfwj1m.png >> > network client2: https://abload.de/img/network-client2trkow.png >> > network server: https://abload.de/img/network-serverv3jjr.png >> > >> > gluster volume info: https://pastebin.com/ZMuJYXRZ >> > >> > Just wondering if the network traffic bug really got fixed or if this >> > is a new problem. I'll wait a couple of minutes and then deactivate >> > performance.quick-read again, just to see if network traffic goes down >> > to normal levels. >> > >> > >> > Best regards, >> > Hubert >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users From bgoldowsky at cast.org Tue Apr 16 11:50:41 2019 From: bgoldowsky at cast.org (Boris Goldowsky) Date: Tue, 16 Apr 2019 11:50:41 +0000 Subject: [Gluster-users] Volume stuck unable to add a brick In-Reply-To: References: <02CC8632-9B90-43F6-89FA-1160CE529667@contoso.com> <52179D48-7CF0-405E-805F-C5DCDF5B12CB@cast.org> Message-ID: OK, log files attached. Boris From: Karthik Subrahmanya Date: Tuesday, April 16, 2019 at 2:52 AM To: Atin Mukherjee , Boris Goldowsky Cc: Gluster-users Subject: Re: [Gluster-users] Volume stuck unable to add a brick On Mon, Apr 15, 2019 at 9:43 PM Atin Mukherjee > wrote: +Karthik Subrahmanya Didn't we we fix this problem recently? Failed to set extended attribute indicates that temp mount is failing and we don't have quorum number of bricks up. We had two fixes which handles two kind of add-brick scenarios. [1] Fails add-brick when increasing the replica count if any of the brick is down to avoid data loss. This can be overridden by using the force option. [2] Allow add-brick to set the extended attributes by the temp mount if the volume is already mounted (has clients). They are in version 3.12.2 so, patch [1] is present there. But since they are using the force option it should not have any problem even if they have any brick down. 
The error message they are getting is also different, so it is not because of any brick being down I guess. Patch [2] is not present in 3.12.2 and it is not the conversion from plain distribute to replicate volume. So the scenario is different here. It seems like they are hitting some other issue. @Boris, Can you attach the add-brick's temp mount log. The file name should look something like "dockervols-add-brick-mount.log". Can you also provide all the brick logs of that volume during that time. [1] https://review.gluster.org/#/c/glusterfs/+/16330/ [2] https://review.gluster.org/#/c/glusterfs/+/21791/ Regards, Karthik Boris - What's the gluster version are you using? On Mon, Apr 15, 2019 at 7:35 PM Boris Goldowsky > wrote: Atin, thank you for the reply. Here are all of those pieces of information: [bgoldowsky at webserver9 ~]$ gluster --version glusterfs 3.12.2 (same on all nodes) [bgoldowsky at webserver9 ~]$ sudo gluster peer status Number of Peers: 3 Hostname: webserver11.cast.org Uuid: c2b147fd-cab4-4859-9922-db5730f8549d State: Peer in Cluster (Connected) Hostname: webserver1.cast.org Uuid: 4b918f65-2c9d-478e-8648-81d1d6526d4c State: Peer in Cluster (Connected) Other names: 192.168.200.131 webserver1 Hostname: webserver8.cast.org Uuid: be2f568b-61c5-4016-9264-083e4e6453a2 State: Peer in Cluster (Connected) Other names: webserver8 [bgoldowsky at webserver1 ~]$ sudo gluster v info Volume Name: dockervols Type: Replicate Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a Status: Started Snapshot Count: 0 Number of Bricks: 1 x 3 = 3 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/dockervols Brick2: webserver11:/data/gluster/dockervols Brick3: webserver9:/data/gluster/dockervols Options Reconfigured: nfs.disable: on transport.address-family: inet auth.allow: 127.0.0.1 Volume Name: testvol Type: Replicate Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 Status: Started Snapshot Count: 0 Number of Bricks: 1 x 4 = 4 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/testvol Brick2: webserver9:/data/gluster/testvol Brick3: webserver11:/data/gluster/testvol Brick4: webserver8:/data/gluster/testvol Options Reconfigured: transport.address-family: inet nfs.disable: on [bgoldowsky at webserver8 ~]$ sudo gluster v info Volume Name: dockervols Type: Replicate Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a Status: Started Snapshot Count: 0 Number of Bricks: 1 x 3 = 3 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/dockervols Brick2: webserver11:/data/gluster/dockervols Brick3: webserver9:/data/gluster/dockervols Options Reconfigured: nfs.disable: on transport.address-family: inet auth.allow: 127.0.0.1 Volume Name: testvol Type: Replicate Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 Status: Started Snapshot Count: 0 Number of Bricks: 1 x 4 = 4 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/testvol Brick2: webserver9:/data/gluster/testvol Brick3: webserver11:/data/gluster/testvol Brick4: webserver8:/data/gluster/testvol Options Reconfigured: nfs.disable: on transport.address-family: inet [bgoldowsky at webserver9 ~]$ sudo gluster v info Volume Name: dockervols Type: Replicate Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a Status: Started Snapshot Count: 0 Number of Bricks: 1 x 3 = 3 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/dockervols Brick2: webserver11:/data/gluster/dockervols Brick3: webserver9:/data/gluster/dockervols Options Reconfigured: nfs.disable: on transport.address-family: inet auth.allow: 127.0.0.1 Volume Name: testvol 
Type: Replicate Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 Status: Started Snapshot Count: 0 Number of Bricks: 1 x 4 = 4 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/testvol Brick2: webserver9:/data/gluster/testvol Brick3: webserver11:/data/gluster/testvol Brick4: webserver8:/data/gluster/testvol Options Reconfigured: nfs.disable: on transport.address-family: inet [bgoldowsky at webserver11 ~]$ sudo gluster v info Volume Name: dockervols Type: Replicate Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a Status: Started Snapshot Count: 0 Number of Bricks: 1 x 3 = 3 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/dockervols Brick2: webserver11:/data/gluster/dockervols Brick3: webserver9:/data/gluster/dockervols Options Reconfigured: auth.allow: 127.0.0.1 transport.address-family: inet nfs.disable: on Volume Name: testvol Type: Replicate Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 Status: Started Snapshot Count: 0 Number of Bricks: 1 x 4 = 4 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/testvol Brick2: webserver9:/data/gluster/testvol Brick3: webserver11:/data/gluster/testvol Brick4: webserver8:/data/gluster/testvol Options Reconfigured: transport.address-family: inet nfs.disable: on [bgoldowsky at webserver9 ~]$ sudo gluster volume add-brick dockervols replica 4 webserver8:/data/gluster/dockervols force volume add-brick: failed: Commit failed on webserver8.cast.org. Please check log file for details. Webserver8 glusterd.log: [2019-04-15 13:55:42.338197] I [MSGID: 106488] [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: Received get vol req The message "I [MSGID: 106488] [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: Received get vol req" repeated 2 times between [2019-04-15 13:55:42.338197] and [2019-04-15 13:55:42.341618] [2019-04-15 14:00:20.445011] I [run.c:190:runner_log] (-->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x3a215) [0x7fe697764215] -->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xe3e9d) [0x7fe69780de9d] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7fe6a2d16ea5] ) 0-management: Ran script: /var/lib/glusterd/hooks/1/add-brick/pre/S28Quota-enable-root-xattr-heal.sh --volname=dockervols --version=1 --volume-op=add-brick --gd-workdir=/var/lib/glusterd [2019-04-15 14:00:20.445148] I [MSGID: 106578] [glusterd-brick-ops.c:1354:glusterd_op_perform_add_bricks] 0-management: replica-count is set 4 [2019-04-15 14:00:20.445184] I [MSGID: 106578] [glusterd-brick-ops.c:1364:glusterd_op_perform_add_bricks] 0-management: type is set 0, need to change it [2019-04-15 14:00:20.672347] E [MSGID: 106054] [glusterd-utils.c:13863:glusterd_handle_replicate_brick_ops] 0-management: Failed to set extended attribute trusted.add-brick : Transport endpoint is not connected [Transport endpoint is not connected] [2019-04-15 14:00:20.693491] E [MSGID: 101042] [compat.c:569:gf_umount_lazy] 0-management: Lazy unmount of /tmp/mntmvdFGq [Transport endpoint is not connected] [2019-04-15 14:00:20.693597] E [MSGID: 106074] [glusterd-brick-ops.c:2590:glusterd_op_add_brick] 0-glusterd: Unable to add bricks [2019-04-15 14:00:20.693637] E [MSGID: 106123] [glusterd-mgmt.c:312:gd_mgmt_v3_commit_fn] 0-management: Add-brick commit failed. 
[2019-04-15 14:00:20.693667] E [MSGID: 106123] [glusterd-mgmt-handler.c:616:glusterd_handle_commit_fn] 0-management: commit failed on operation Add brick Webserver11 log file: [2019-04-15 13:56:29.563270] I [MSGID: 106488] [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: Received get vol req The message "I [MSGID: 106488] [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: Received get vol req" repeated 2 times between [2019-04-15 13:56:29.563270] and [2019-04-15 13:56:29.566209] [2019-04-15 14:00:33.996866] I [run.c:190:runner_log] (-->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x3a215) [0x7f36de924215] -->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xe3e9d) [0x7f36de9cde9d] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7f36e9ed6ea5] ) 0-management: Ran script: /var/lib/glusterd/hooks/1/add-brick/pre/S28Quota-enable-root-xattr-heal.sh --volname=dockervols --version=1 --volume-op=add-brick --gd-workdir=/var/lib/glusterd [2019-04-15 14:00:33.996979] I [MSGID: 106578] [glusterd-brick-ops.c:1354:glusterd_op_perform_add_bricks] 0-management: replica-count is set 4 [2019-04-15 14:00:33.997004] I [MSGID: 106578] [glusterd-brick-ops.c:1364:glusterd_op_perform_add_bricks] 0-management: type is set 0, need to change it [2019-04-15 14:00:34.013789] I [MSGID: 106132] [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: nfs already stopped [2019-04-15 14:00:34.013849] I [MSGID: 106568] [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: nfs service is stopped [2019-04-15 14:00:34.017535] I [MSGID: 106568] [glusterd-proc-mgmt.c:88:glusterd_proc_stop] 0-management: Stopping glustershd daemon running in pid: 6087 [2019-04-15 14:00:35.018783] I [MSGID: 106568] [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: glustershd service is stopped [2019-04-15 14:00:35.018952] I [MSGID: 106567] [glusterd-svc-mgmt.c:211:glusterd_svc_start] 0-management: Starting glustershd service [2019-04-15 14:00:35.028306] I [MSGID: 106132] [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: bitd already stopped [2019-04-15 14:00:35.028408] I [MSGID: 106568] [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: bitd service is stopped [2019-04-15 14:00:35.028601] I [MSGID: 106132] [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: scrub already stopped [2019-04-15 14:00:35.028645] I [MSGID: 106568] [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: scrub service is stopped Thank you for taking a look! Boris From: Atin Mukherjee > Date: Friday, April 12, 2019 at 1:10 PM To: Boris Goldowsky > Cc: Gluster-users > Subject: Re: [Gluster-users] Volume stuck unable to add a brick On Fri, 12 Apr 2019 at 22:32, Boris Goldowsky > wrote: I?ve got a replicated volume with three bricks (?1x3=3?), the idea is to have a common set of files that are locally available on all the machines (Scientific Linux 7, which is essentially CentOS 7) in a cluster. I tried to add on a fourth machine, so used a command like this: sudo gluster volume add-brick dockervols replica 4 webserver8:/data/gluster/dockervols force but the result is: volume add-brick: failed: Commit failed on webserver1. Please check log file for details. Commit failed on webserver8. Please check log file for details. Commit failed on webserver11. Please check log file for details. Tried: removing the new brick (this also fails) and trying again. Tried: checking the logs. The log files are not enlightening to me ? I don?t know what?s normal and what?s not. 
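(One way to narrow the logs down, assuming a default install where the management daemon writes to /var/log/glusterfs/glusterd.log: error-level entries carry a standalone " E " severity flag, so a filter such as

  sudo grep ' E ' /var/log/glusterfs/glusterd.log | tail -n 20

surfaces just the most recent failures, for example the "Failed to set extended attribute" and "Add-brick commit failed" lines quoted earlier in this thread.)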
From webserver8 & webserver11 could you attach glusterd log files? Also please share following: - gluster version? (gluster ?version) - Output of ?gluster peer status? - Output of ?gluster v info? from all 4 nodes. Tried: deleting the brick directory from previous attempt, so that it?s not in the way. Tried: restarting gluster services Tried: rebooting Tried: setting up a new volume, replicated to all four machines. This works, so I?m assuming it?s not a networking issue. But still fails with this existing volume that has the critical data in it. Running out of ideas. Any suggestions? Thank you! Boris _______________________________________________ Gluster-users mailing list Gluster-users at gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users -- --Atin -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: data-gluster-dockervols.log-webserver1 Type: application/octet-stream Size: 2853 bytes Desc: data-gluster-dockervols.log-webserver1 URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: data-gluster-dockervols.log-webserver11 Type: application/octet-stream Size: 2853 bytes Desc: data-gluster-dockervols.log-webserver11 URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: data-gluster-dockervols.log-webserver9 Type: application/octet-stream Size: 3827 bytes Desc: data-gluster-dockervols.log-webserver9 URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: dockervols-add-brick-mount.log Type: application/octet-stream Size: 7389 bytes Desc: dockervols-add-brick-mount.log URL: From ksubrahm at redhat.com Tue Apr 16 12:19:49 2019 From: ksubrahm at redhat.com (Karthik Subrahmanya) Date: Tue, 16 Apr 2019 17:49:49 +0530 Subject: [Gluster-users] Volume stuck unable to add a brick In-Reply-To: References: <02CC8632-9B90-43F6-89FA-1160CE529667@contoso.com> <52179D48-7CF0-405E-805F-C5DCDF5B12CB@cast.org> Message-ID: Hi Boris, Thank you for providing the logs. The problem here is because of the "auth.allow: 127.0.0.1" setting on the volume. When you try to add a new brick to the volume internally replication module will try to set some metadata on the existing bricks to mark pending heal on the new brick, by creating a temporary mount. Because of the auth.allow setting that mount gets permission errors as seen in the below logs, leading to add-brick failure. 
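(In other words: auth.allow is limited to 127.0.0.1, while the temporary add-brick mount reaches the bricks over the peers' LAN addresses, which the "received addr" entries in the excerpts below make visible. The effective setting can be confirmed with the volume-get interface, using the volume name from this thread:

  sudo gluster volume get dockervols auth.allow

which should print auth.allow and its current value.)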
>From data-gluster-dockervols.log-webserver9 : [2019-04-15 14:00:34.226838] I [addr.c:55:compare_addr_and_update] 0-/data/gluster/dockervols: allowed = "127.0.0.1", received addr = "192.168.200.147" [2019-04-15 14:00:34.226895] E [MSGID: 115004] [authenticate.c:224:gf_authenticate] 0-auth: no authentication module is interested in accepting remote-client (null) [2019-04-15 14:00:34.227129] E [MSGID: 115001] [server-handshake.c:848:server_setvolume] 0-dockervols-server: Cannot authenticate client from webserver8.cast.org-55674-2019/04/15-14:00:20:495333-dockervols-client-2-0-0 3.12.2 [Permission denied] >From dockervols-add-brick-mount.log : [2019-04-15 14:00:20.672033] W [MSGID: 114043] [client-handshake.c:1109:client_setvolume_cbk] 0-dockervols-client-2: failed to set the volume [Permission denied] [2019-04-15 14:00:20.672102] W [MSGID: 114007] [client-handshake.c:1138:client_setvolume_cbk] 0-dockervols-client-2: failed to get 'process-uuid' from reply dict [Invalid argument] [2019-04-15 14:00:20.672129] E [MSGID: 114044] [client-handshake.c:1144:client_setvolume_cbk] 0-dockervols-client-2: SETVOLUME on remote-host failed: Authentication failed [Permission denied] [2019-04-15 14:00:20.672151] I [MSGID: 114049] [client-handshake.c:1258:client_setvolume_cbk] 0-dockervols-client-2: sending AUTH_FAILED event This is a known issue and we are planning to fix this. For the time being we have a workaround for this. - Before you try adding the brick set the auth.allow option to default i.e., "*" or you can do this by running "gluster v reset auth.allow" - Add the brick - After it succeeds set back the auth.allow option to the previous value. Regards, Karthik On Tue, Apr 16, 2019 at 5:20 PM Boris Goldowsky wrote: > OK, log files attached. > > > > Boris > > > > > > *From: *Karthik Subrahmanya > *Date: *Tuesday, April 16, 2019 at 2:52 AM > *To: *Atin Mukherjee , Boris Goldowsky < > bgoldowsky at cast.org> > *Cc: *Gluster-users > *Subject: *Re: [Gluster-users] Volume stuck unable to add a brick > > > > > > > > On Mon, Apr 15, 2019 at 9:43 PM Atin Mukherjee > wrote: > > +Karthik Subrahmanya > > > > Didn't we we fix this problem recently? Failed to set extended attribute > indicates that temp mount is failing and we don't have quorum number of > bricks up. > > > > We had two fixes which handles two kind of add-brick scenarios. > > [1] Fails add-brick when increasing the replica count if any of the brick > is down to avoid data loss. This can be overridden by using the force > option. > > [2] Allow add-brick to set the extended attributes by the temp mount if > the volume is already mounted (has clients). > > > > They are in version 3.12.2 so, patch [1] is present there. But since they > are using the force option it should not have any problem even if they have > any brick down. The error message they are getting is also different, so it > is not because of any brick being down I guess. > > Patch [2] is not present in 3.12.2 and it is not the conversion from plain > distribute to replicate volume. So the scenario is different here. > > It seems like they are hitting some other issue. > > > > @Boris, > > Can you attach the add-brick's temp mount log. The file name should look > something like "dockervols-add-brick-mount.log". Can you also provide all > the brick logs of that volume during that time. > > > > [1] https://review.gluster.org/#/c/glusterfs/+/16330/ > > [2] https://review.gluster.org/#/c/glusterfs/+/21791/ > > > > Regards, > > Karthik > > > > Boris - What's the gluster version are you using? 
> > > > > > > > On Mon, Apr 15, 2019 at 7:35 PM Boris Goldowsky > wrote: > > Atin, thank you for the reply. Here are all of those pieces of > information: > > > > [bgoldowsky at webserver9 ~]$ gluster --version > > glusterfs 3.12.2 > > (same on all nodes) > > > > [bgoldowsky at webserver9 ~]$ sudo gluster peer status > > Number of Peers: 3 > > > > Hostname: webserver11.cast.org > > Uuid: c2b147fd-cab4-4859-9922-db5730f8549d > > State: Peer in Cluster (Connected) > > > > Hostname: webserver1.cast.org > > Uuid: 4b918f65-2c9d-478e-8648-81d1d6526d4c > > State: Peer in Cluster (Connected) > > Other names: > > 192.168.200.131 > > webserver1 > > > > Hostname: webserver8.cast.org > > Uuid: be2f568b-61c5-4016-9264-083e4e6453a2 > > State: Peer in Cluster (Connected) > > Other names: > > webserver8 > > > > [bgoldowsky at webserver1 ~]$ sudo gluster v info > > Volume Name: dockervols > > Type: Replicate > > Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a > > Status: Started > > Snapshot Count: 0 > > Number of Bricks: 1 x 3 = 3 > > Transport-type: tcp > > Bricks: > > Brick1: webserver1:/data/gluster/dockervols > > Brick2: webserver11:/data/gluster/dockervols > > Brick3: webserver9:/data/gluster/dockervols > > Options Reconfigured: > > nfs.disable: on > > transport.address-family: inet > > auth.allow: 127.0.0.1 > > > > Volume Name: testvol > > Type: Replicate > > Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 > > Status: Started > > Snapshot Count: 0 > > Number of Bricks: 1 x 4 = 4 > > Transport-type: tcp > > Bricks: > > Brick1: webserver1:/data/gluster/testvol > > Brick2: webserver9:/data/gluster/testvol > > Brick3: webserver11:/data/gluster/testvol > > Brick4: webserver8:/data/gluster/testvol > > Options Reconfigured: > > transport.address-family: inet > > nfs.disable: on > > > > [bgoldowsky at webserver8 ~]$ sudo gluster v info > > Volume Name: dockervols > > Type: Replicate > > Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a > > Status: Started > > Snapshot Count: 0 > > Number of Bricks: 1 x 3 = 3 > > Transport-type: tcp > > Bricks: > > Brick1: webserver1:/data/gluster/dockervols > > Brick2: webserver11:/data/gluster/dockervols > > Brick3: webserver9:/data/gluster/dockervols > > Options Reconfigured: > > nfs.disable: on > > transport.address-family: inet > > auth.allow: 127.0.0.1 > > > > Volume Name: testvol > > Type: Replicate > > Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 > > Status: Started > > Snapshot Count: 0 > > Number of Bricks: 1 x 4 = 4 > > Transport-type: tcp > > Bricks: > > Brick1: webserver1:/data/gluster/testvol > > Brick2: webserver9:/data/gluster/testvol > > Brick3: webserver11:/data/gluster/testvol > > Brick4: webserver8:/data/gluster/testvol > > Options Reconfigured: > > nfs.disable: on > > transport.address-family: inet > > > > [bgoldowsky at webserver9 ~]$ sudo gluster v info > > Volume Name: dockervols > > Type: Replicate > > Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a > > Status: Started > > Snapshot Count: 0 > > Number of Bricks: 1 x 3 = 3 > > Transport-type: tcp > > Bricks: > > Brick1: webserver1:/data/gluster/dockervols > > Brick2: webserver11:/data/gluster/dockervols > > Brick3: webserver9:/data/gluster/dockervols > > Options Reconfigured: > > nfs.disable: on > > transport.address-family: inet > > auth.allow: 127.0.0.1 > > > > Volume Name: testvol > > Type: Replicate > > Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 > > Status: Started > > Snapshot Count: 0 > > Number of Bricks: 1 x 4 = 4 > > Transport-type: tcp > > Bricks: > > Brick1: 
webserver1:/data/gluster/testvol > > Brick2: webserver9:/data/gluster/testvol > > Brick3: webserver11:/data/gluster/testvol > > Brick4: webserver8:/data/gluster/testvol > > Options Reconfigured: > > nfs.disable: on > > transport.address-family: inet > > > > [bgoldowsky at webserver11 ~]$ sudo gluster v info > > Volume Name: dockervols > > Type: Replicate > > Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a > > Status: Started > > Snapshot Count: 0 > > Number of Bricks: 1 x 3 = 3 > > Transport-type: tcp > > Bricks: > > Brick1: webserver1:/data/gluster/dockervols > > Brick2: webserver11:/data/gluster/dockervols > > Brick3: webserver9:/data/gluster/dockervols > > Options Reconfigured: > > auth.allow: 127.0.0.1 > > transport.address-family: inet > > nfs.disable: on > > > > Volume Name: testvol > > Type: Replicate > > Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 > > Status: Started > > Snapshot Count: 0 > > Number of Bricks: 1 x 4 = 4 > > Transport-type: tcp > > Bricks: > > Brick1: webserver1:/data/gluster/testvol > > Brick2: webserver9:/data/gluster/testvol > > Brick3: webserver11:/data/gluster/testvol > > Brick4: webserver8:/data/gluster/testvol > > Options Reconfigured: > > transport.address-family: inet > > nfs.disable: on > > > > [bgoldowsky at webserver9 ~]$ sudo gluster volume add-brick dockervols > replica 4 webserver8:/data/gluster/dockervols force > > volume add-brick: failed: Commit failed on webserver8.cast.org. Please > check log file for details. > > > > Webserver8 glusterd.log: > > > > [2019-04-15 13:55:42.338197] I [MSGID: 106488] > [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: > Received get vol req > > The message "I [MSGID: 106488] > [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: > Received get vol req" repeated 2 times between [2019-04-15 13:55:42.338197] > and [2019-04-15 13:55:42.341618] > > [2019-04-15 14:00:20.445011] I [run.c:190:runner_log] > (-->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x3a215) > [0x7fe697764215] > -->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xe3e9d) > [0x7fe69780de9d] -->/lib64/libglusterfs.so.0(runner_log+0x115) > [0x7fe6a2d16ea5] ) 0-management: Ran script: > /var/lib/glusterd/hooks/1/add-brick/pre/S28Quota-enable-root-xattr-heal.sh > --volname=dockervols --version=1 --volume-op=add-brick > --gd-workdir=/var/lib/glusterd > > [2019-04-15 14:00:20.445148] I [MSGID: 106578] > [glusterd-brick-ops.c:1354:glusterd_op_perform_add_bricks] 0-management: > replica-count is set 4 > > [2019-04-15 14:00:20.445184] I [MSGID: 106578] > [glusterd-brick-ops.c:1364:glusterd_op_perform_add_bricks] 0-management: > type is set 0, need to change it > > [2019-04-15 14:00:20.672347] E [MSGID: 106054] > [glusterd-utils.c:13863:glusterd_handle_replicate_brick_ops] 0-management: > Failed to set extended attribute trusted.add-brick : Transport endpoint is > not connected [Transport endpoint is not connected] > > [2019-04-15 14:00:20.693491] E [MSGID: 101042] > [compat.c:569:gf_umount_lazy] 0-management: Lazy unmount of /tmp/mntmvdFGq > [Transport endpoint is not connected] > > [2019-04-15 14:00:20.693597] E [MSGID: 106074] > [glusterd-brick-ops.c:2590:glusterd_op_add_brick] 0-glusterd: Unable to add > bricks > > [2019-04-15 14:00:20.693637] E [MSGID: 106123] > [glusterd-mgmt.c:312:gd_mgmt_v3_commit_fn] 0-management: Add-brick commit > failed. 
> > [2019-04-15 14:00:20.693667] E [MSGID: 106123] > [glusterd-mgmt-handler.c:616:glusterd_handle_commit_fn] 0-management: > commit failed on operation Add brick > > > > Webserver11 log file: > > > > [2019-04-15 13:56:29.563270] I [MSGID: 106488] > [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: > Received get vol req > > The message "I [MSGID: 106488] > [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: > Received get vol req" repeated 2 times between [2019-04-15 13:56:29.563270] > and [2019-04-15 13:56:29.566209] > > [2019-04-15 14:00:33.996866] I [run.c:190:runner_log] > (-->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x3a215) > [0x7f36de924215] > -->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xe3e9d) > [0x7f36de9cde9d] -->/lib64/libglusterfs.so.0(runner_log+0x115) > [0x7f36e9ed6ea5] ) 0-management: Ran script: > /var/lib/glusterd/hooks/1/add-brick/pre/S28Quota-enable-root-xattr-heal.sh > --volname=dockervols --version=1 --volume-op=add-brick > --gd-workdir=/var/lib/glusterd > > [2019-04-15 14:00:33.996979] I [MSGID: 106578] > [glusterd-brick-ops.c:1354:glusterd_op_perform_add_bricks] 0-management: > replica-count is set 4 > > [2019-04-15 14:00:33.997004] I [MSGID: 106578] > [glusterd-brick-ops.c:1364:glusterd_op_perform_add_bricks] 0-management: > type is set 0, need to change it > > [2019-04-15 14:00:34.013789] I [MSGID: 106132] > [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: nfs already > stopped > > [2019-04-15 14:00:34.013849] I [MSGID: 106568] > [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: nfs service is > stopped > > [2019-04-15 14:00:34.017535] I [MSGID: 106568] > [glusterd-proc-mgmt.c:88:glusterd_proc_stop] 0-management: Stopping > glustershd daemon running in pid: 6087 > > [2019-04-15 14:00:35.018783] I [MSGID: 106568] > [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: glustershd > service is stopped > > [2019-04-15 14:00:35.018952] I [MSGID: 106567] > [glusterd-svc-mgmt.c:211:glusterd_svc_start] 0-management: Starting > glustershd service > > [2019-04-15 14:00:35.028306] I [MSGID: 106132] > [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: bitd already > stopped > > [2019-04-15 14:00:35.028408] I [MSGID: 106568] > [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: bitd service is > stopped > > [2019-04-15 14:00:35.028601] I [MSGID: 106132] > [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: scrub already > stopped > > [2019-04-15 14:00:35.028645] I [MSGID: 106568] > [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: scrub service is > stopped > > > > Thank you for taking a look! > > > > Boris > > > > > > *From: *Atin Mukherjee > *Date: *Friday, April 12, 2019 at 1:10 PM > *To: *Boris Goldowsky > *Cc: *Gluster-users > *Subject: *Re: [Gluster-users] Volume stuck unable to add a brick > > > > > > > > On Fri, 12 Apr 2019 at 22:32, Boris Goldowsky wrote: > > I?ve got a replicated volume with three bricks (?1x3=3?), the idea is to > have a common set of files that are locally available on all the machines > (Scientific Linux 7, which is essentially CentOS 7) in a cluster. > > > > I tried to add on a fourth machine, so used a command like this: > > > > sudo gluster volume add-brick dockervols replica 4 > webserver8:/data/gluster/dockervols force > > > > but the result is: > > volume add-brick: failed: Commit failed on webserver1. Please check log > file for details. > > Commit failed on webserver8. Please check log file for details. 
> > Commit failed on webserver11. Please check log file for details. > > > > Tried: removing the new brick (this also fails) and trying again. > > Tried: checking the logs. The log files are not enlightening to me ? I > don?t know what?s normal and what?s not. > > > > From webserver8 & webserver11 could you attach glusterd log files? > > > > Also please share following: > > - gluster version? (gluster ?version) > > - Output of ?gluster peer status? > > - Output of ?gluster v info? from all 4 nodes. > > > > Tried: deleting the brick directory from previous attempt, so that it?s > not in the way. > > Tried: restarting gluster services > > Tried: rebooting > > Tried: setting up a new volume, replicated to all four machines. This > works, so I?m assuming it?s not a networking issue. But still fails with > this existing volume that has the critical data in it. > > > > Running out of ideas. Any suggestions? Thank you! > > > > Boris > > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users > > -- > > --Atin > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From revirii at googlemail.com Tue Apr 16 12:54:49 2019 From: revirii at googlemail.com (Hu Bert) Date: Tue, 16 Apr 2019 14:54:49 +0200 Subject: [Gluster-users] Upgrade 5.5 -> 5.6: network traffic bug fixed? In-Reply-To: References: Message-ID: Hi Poornima, thx for your efforts. I made a couple of tests and the results are the same, so the options are not related. Anyway, i'm not able to reproduce the problem on my testing system, although the volume options are the same. About 1.5 hours ago i set performance.quick-read to on again and watched: load/iowait went up (not bad at the moment, little traffic), but network traffic went up - from <20 MBit/s up to 160 MBit/s. After deactivating quick-read traffic dropped to < 20 MBit/s again. munin graph: https://abload.de/img/network-client4s0kle.png The 2nd peak is from the last test. Thx, Hubert Am Di., 16. Apr. 2019 um 09:43 Uhr schrieb Hu Bert : > > In my first test on my testing setup the traffic was on a normal > level, so i thought i was "safe". But on my live system the network > traffic was a multiple of the traffic one would expect. > performance.quick-read was enabled in both, the only difference in the > volume options between live and testing are: > > performance.read-ahead: testing on, live off > performance.io-cache: testing on, live off > > I ran another test on my testing setup, deactivated both and copied 9 > GB of data. Now the traffic went up as well, from before ~9-10 MBit/s > up to 100 MBit/s with both options off. Does performance.quick-read > require one of those options set to 'on'? > > I'll start another test shortly, and activate on of those 2 options, > maybe there's a connection between those 3 options? > > > Best Regards, > Hubert > > Am Di., 16. Apr. 2019 um 08:57 Uhr schrieb Poornima Gurusiddaiah > : > > > > Thank you for reporting this. I had done testing on my local setup and the issue was resolved even with quick-read enabled. Let me test it again. > > > > Regards, > > Poornima > > > > On Mon, Apr 15, 2019 at 12:25 PM Hu Bert wrote: > >> > >> fyi: after setting performance.quick-read to off network traffic > >> dropped to normal levels, client load/iowait back to normal as well. 
> >> > >> client: https://abload.de/img/network-client-afterihjqi.png > >> server: https://abload.de/img/network-server-afterwdkrl.png > >> > >> Am Mo., 15. Apr. 2019 um 08:33 Uhr schrieb Hu Bert : > >> > > >> > Good Morning, > >> > > >> > today i updated my replica 3 setup (debian stretch) from version 5.5 > >> > to 5.6, as i thought the network traffic bug (#1673058) was fixed and > >> > i could re-activate 'performance.quick-read' again. See release notes: > >> > > >> > https://review.gluster.org/#/c/glusterfs/+/22538/ > >> > http://git.gluster.org/cgit/glusterfs.git/commit/?id=34a2347780c2429284f57232f3aabb78547a9795 > >> > > >> > Upgrade went fine, and then i was watching iowait and network traffic. > >> > It seems that the network traffic went up after upgrade and > >> > reactivation of performance.quick-read. Here are some graphs: > >> > > >> > network client1: https://abload.de/img/network-clientfwj1m.png > >> > network client2: https://abload.de/img/network-client2trkow.png > >> > network server: https://abload.de/img/network-serverv3jjr.png > >> > > >> > gluster volume info: https://pastebin.com/ZMuJYXRZ > >> > > >> > Just wondering if the network traffic bug really got fixed or if this > >> > is a new problem. I'll wait a couple of minutes and then deactivate > >> > performance.quick-read again, just to see if network traffic goes down > >> > to normal levels. > >> > > >> > > >> > Best regards, > >> > Hubert > >> _______________________________________________ > >> Gluster-users mailing list > >> Gluster-users at gluster.org > >> https://lists.gluster.org/mailman/listinfo/gluster-users From avishwan at redhat.com Tue Apr 16 13:09:22 2019 From: avishwan at redhat.com (Aravinda) Date: Tue, 16 Apr 2019 18:39:22 +0530 Subject: [Gluster-users] Reg: Gluster In-Reply-To: References: Message-ID: <2c44fb17b7ce644ede6a5f75a14398a79cfa742d.camel@redhat.com> On Tue, 2019-04-16 at 11:27 +0530, Poornima Gurusiddaiah wrote: > +Sunny > > On Wed, Apr 10, 2019, 9:02 PM Gomathi Nayagam < > gomathinayagam08 at gmail.com> wrote: > > Hi User, > > > > We are testing geo-replication of gluster it is > > taking nearly 8 mins to transfer 16 GB size of data between the DCs > > while when transferred the same data over plain rsync it took only > > 2 mins. Can we know if we are missing something? > > > > > > > > > > Thanks & Regards, > > Gomathi Nayagam.D > > _______________________________________________ > > Gluster-users mailing list > > Gluster-users at gluster.org > > https://lists.gluster.org/mailman/listinfo/gluster-users > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users Geo-replication does many things to keep information about synced data and track the new changes happened in the master Volume. Geo- replication shines better when doing incremental sync that is when new data is created or existing data is modified in Master volume. Are you observing slowness even during incremental sync? (Current time - Last Synced time in status output shows how much Slave Volume is lagging compared to Master Volume) -- regards Aravinda From ksubrahm at redhat.com Tue Apr 16 15:26:00 2019 From: ksubrahm at redhat.com (Karthik Subrahmanya) Date: Tue, 16 Apr 2019 20:56:00 +0530 Subject: [Gluster-users] Volume stuck unable to add a brick In-Reply-To: References: <02CC8632-9B90-43F6-89FA-1160CE529667@contoso.com> <52179D48-7CF0-405E-805F-C5DCDF5B12CB@cast.org> Message-ID: You're welcome! 
On Tue 16 Apr, 2019, 7:12 PM Boris Goldowsky, wrote: > That worked! Thank you SO much! > > > > Boris > > > > > > *From: *Karthik Subrahmanya > *Date: *Tuesday, April 16, 2019 at 8:20 AM > *To: *Boris Goldowsky > *Cc: *Atin Mukherjee , Gluster-users < > gluster-users at gluster.org> > *Subject: *Re: [Gluster-users] Volume stuck unable to add a brick > > > > Hi Boris, > > > > Thank you for providing the logs. > > The problem here is because of the "auth.allow: 127.0.0.1" setting on the > volume. > > When you try to add a new brick to the volume internally replication > module will try to set some metadata on the existing bricks to mark pending > heal on the new brick, by creating a temporary mount. Because of the > auth.allow setting that mount gets permission errors as seen in the below > logs, leading to add-brick failure. > > > > From data-gluster-dockervols.log-webserver9 : > > [2019-04-15 14:00:34.226838] I [addr.c:55:compare_addr_and_update] > 0-/data/gluster/dockervols: allowed = "127.0.0.1", received addr = > "192.168.200.147" > > [2019-04-15 14:00:34.226895] E [MSGID: 115004] > [authenticate.c:224:gf_authenticate] 0-auth: no authentication module is > interested in accepting remote-client (null) > > [2019-04-15 14:00:34.227129] E [MSGID: 115001] > [server-handshake.c:848:server_setvolume] 0-dockervols-server: Cannot > authenticate client from > webserver8.cast.org-55674-2019/04/15-14:00:20:495333-dockervols-client-2-0-0 > 3.12.2 [Permission denied] > > > > From dockervols-add-brick-mount.log : > > [2019-04-15 14:00:20.672033] W [MSGID: 114043] > [client-handshake.c:1109:client_setvolume_cbk] 0-dockervols-client-2: > failed to set the volume [Permission denied] > > [2019-04-15 14:00:20.672102] W [MSGID: 114007] > [client-handshake.c:1138:client_setvolume_cbk] 0-dockervols-client-2: > failed to get 'process-uuid' from reply dict [Invalid argument] > > [2019-04-15 14:00:20.672129] E [MSGID: 114044] > [client-handshake.c:1144:client_setvolume_cbk] 0-dockervols-client-2: > SETVOLUME on remote-host failed: Authentication failed [Permission denied] > > [2019-04-15 14:00:20.672151] I [MSGID: 114049] > [client-handshake.c:1258:client_setvolume_cbk] 0-dockervols-client-2: > sending AUTH_FAILED event > > > > This is a known issue and we are planning to fix this. For the time being > we have a workaround for this. > > - Before you try adding the brick set the auth.allow option to default > i.e., "*" or you can do this by running "gluster v reset > auth.allow" > > - Add the brick > > - After it succeeds set back the auth.allow option to the previous value. > > > > Regards, > > Karthik > > > > On Tue, Apr 16, 2019 at 5:20 PM Boris Goldowsky > wrote: > > OK, log files attached. > > > > Boris > > > > > > *From: *Karthik Subrahmanya > *Date: *Tuesday, April 16, 2019 at 2:52 AM > *To: *Atin Mukherjee , Boris Goldowsky < > bgoldowsky at cast.org> > *Cc: *Gluster-users > *Subject: *Re: [Gluster-users] Volume stuck unable to add a brick > > > > > > > > On Mon, Apr 15, 2019 at 9:43 PM Atin Mukherjee > wrote: > > +Karthik Subrahmanya > > > > Didn't we we fix this problem recently? Failed to set extended attribute > indicates that temp mount is failing and we don't have quorum number of > bricks up. > > > > We had two fixes which handles two kind of add-brick scenarios. > > [1] Fails add-brick when increasing the replica count if any of the brick > is down to avoid data loss. This can be overridden by using the force > option. 
> > [2] Allow add-brick to set the extended attributes by the temp mount if > the volume is already mounted (has clients). > > > > They are in version 3.12.2 so, patch [1] is present there. But since they > are using the force option it should not have any problem even if they have > any brick down. The error message they are getting is also different, so it > is not because of any brick being down I guess. > > Patch [2] is not present in 3.12.2 and it is not the conversion from plain > distribute to replicate volume. So the scenario is different here. > > It seems like they are hitting some other issue. > > > > @Boris, > > Can you attach the add-brick's temp mount log. The file name should look > something like "dockervols-add-brick-mount.log". Can you also provide all > the brick logs of that volume during that time. > > > > [1] https://review.gluster.org/#/c/glusterfs/+/16330/ > > [2] https://review.gluster.org/#/c/glusterfs/+/21791/ > > > > Regards, > > Karthik > > > > Boris - What's the gluster version are you using? > > > > > > > > On Mon, Apr 15, 2019 at 7:35 PM Boris Goldowsky > wrote: > > Atin, thank you for the reply. Here are all of those pieces of > information: > > > > [bgoldowsky at webserver9 ~]$ gluster --version > > glusterfs 3.12.2 > > (same on all nodes) > > > > [bgoldowsky at webserver9 ~]$ sudo gluster peer status > > Number of Peers: 3 > > > > Hostname: webserver11.cast.org > > Uuid: c2b147fd-cab4-4859-9922-db5730f8549d > > State: Peer in Cluster (Connected) > > > > Hostname: webserver1.cast.org > > Uuid: 4b918f65-2c9d-478e-8648-81d1d6526d4c > > State: Peer in Cluster (Connected) > > Other names: > > 192.168.200.131 > > webserver1 > > > > Hostname: webserver8.cast.org > > Uuid: be2f568b-61c5-4016-9264-083e4e6453a2 > > State: Peer in Cluster (Connected) > > Other names: > > webserver8 > > > > [bgoldowsky at webserver1 ~]$ sudo gluster v info > > Volume Name: dockervols > > Type: Replicate > > Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a > > Status: Started > > Snapshot Count: 0 > > Number of Bricks: 1 x 3 = 3 > > Transport-type: tcp > > Bricks: > > Brick1: webserver1:/data/gluster/dockervols > > Brick2: webserver11:/data/gluster/dockervols > > Brick3: webserver9:/data/gluster/dockervols > > Options Reconfigured: > > nfs.disable: on > > transport.address-family: inet > > auth.allow: 127.0.0.1 > > > > Volume Name: testvol > > Type: Replicate > > Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 > > Status: Started > > Snapshot Count: 0 > > Number of Bricks: 1 x 4 = 4 > > Transport-type: tcp > > Bricks: > > Brick1: webserver1:/data/gluster/testvol > > Brick2: webserver9:/data/gluster/testvol > > Brick3: webserver11:/data/gluster/testvol > > Brick4: webserver8:/data/gluster/testvol > > Options Reconfigured: > > transport.address-family: inet > > nfs.disable: on > > > > [bgoldowsky at webserver8 ~]$ sudo gluster v info > > Volume Name: dockervols > > Type: Replicate > > Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a > > Status: Started > > Snapshot Count: 0 > > Number of Bricks: 1 x 3 = 3 > > Transport-type: tcp > > Bricks: > > Brick1: webserver1:/data/gluster/dockervols > > Brick2: webserver11:/data/gluster/dockervols > > Brick3: webserver9:/data/gluster/dockervols > > Options Reconfigured: > > nfs.disable: on > > transport.address-family: inet > > auth.allow: 127.0.0.1 > > > > Volume Name: testvol > > Type: Replicate > > Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 > > Status: Started > > Snapshot Count: 0 > > Number of Bricks: 1 x 4 = 4 > > Transport-type: tcp > > 
Bricks: > > Brick1: webserver1:/data/gluster/testvol > > Brick2: webserver9:/data/gluster/testvol > > Brick3: webserver11:/data/gluster/testvol > > Brick4: webserver8:/data/gluster/testvol > > Options Reconfigured: > > nfs.disable: on > > transport.address-family: inet > > > > [bgoldowsky at webserver9 ~]$ sudo gluster v info > > Volume Name: dockervols > > Type: Replicate > > Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a > > Status: Started > > Snapshot Count: 0 > > Number of Bricks: 1 x 3 = 3 > > Transport-type: tcp > > Bricks: > > Brick1: webserver1:/data/gluster/dockervols > > Brick2: webserver11:/data/gluster/dockervols > > Brick3: webserver9:/data/gluster/dockervols > > Options Reconfigured: > > nfs.disable: on > > transport.address-family: inet > > auth.allow: 127.0.0.1 > > > > Volume Name: testvol > > Type: Replicate > > Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 > > Status: Started > > Snapshot Count: 0 > > Number of Bricks: 1 x 4 = 4 > > Transport-type: tcp > > Bricks: > > Brick1: webserver1:/data/gluster/testvol > > Brick2: webserver9:/data/gluster/testvol > > Brick3: webserver11:/data/gluster/testvol > > Brick4: webserver8:/data/gluster/testvol > > Options Reconfigured: > > nfs.disable: on > > transport.address-family: inet > > > > [bgoldowsky at webserver11 ~]$ sudo gluster v info > > Volume Name: dockervols > > Type: Replicate > > Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a > > Status: Started > > Snapshot Count: 0 > > Number of Bricks: 1 x 3 = 3 > > Transport-type: tcp > > Bricks: > > Brick1: webserver1:/data/gluster/dockervols > > Brick2: webserver11:/data/gluster/dockervols > > Brick3: webserver9:/data/gluster/dockervols > > Options Reconfigured: > > auth.allow: 127.0.0.1 > > transport.address-family: inet > > nfs.disable: on > > > > Volume Name: testvol > > Type: Replicate > > Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 > > Status: Started > > Snapshot Count: 0 > > Number of Bricks: 1 x 4 = 4 > > Transport-type: tcp > > Bricks: > > Brick1: webserver1:/data/gluster/testvol > > Brick2: webserver9:/data/gluster/testvol > > Brick3: webserver11:/data/gluster/testvol > > Brick4: webserver8:/data/gluster/testvol > > Options Reconfigured: > > transport.address-family: inet > > nfs.disable: on > > > > [bgoldowsky at webserver9 ~]$ sudo gluster volume add-brick dockervols > replica 4 webserver8:/data/gluster/dockervols force > > volume add-brick: failed: Commit failed on webserver8.cast.org. Please > check log file for details. 
> > > > Webserver8 glusterd.log: > > > > [2019-04-15 13:55:42.338197] I [MSGID: 106488] > [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: > Received get vol req > > The message "I [MSGID: 106488] > [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: > Received get vol req" repeated 2 times between [2019-04-15 13:55:42.338197] > and [2019-04-15 13:55:42.341618] > > [2019-04-15 14:00:20.445011] I [run.c:190:runner_log] > (-->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x3a215) > [0x7fe697764215] > -->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xe3e9d) > [0x7fe69780de9d] -->/lib64/libglusterfs.so.0(runner_log+0x115) > [0x7fe6a2d16ea5] ) 0-management: Ran script: > /var/lib/glusterd/hooks/1/add-brick/pre/S28Quota-enable-root-xattr-heal.sh > --volname=dockervols --version=1 --volume-op=add-brick > --gd-workdir=/var/lib/glusterd > > [2019-04-15 14:00:20.445148] I [MSGID: 106578] > [glusterd-brick-ops.c:1354:glusterd_op_perform_add_bricks] 0-management: > replica-count is set 4 > > [2019-04-15 14:00:20.445184] I [MSGID: 106578] > [glusterd-brick-ops.c:1364:glusterd_op_perform_add_bricks] 0-management: > type is set 0, need to change it > > [2019-04-15 14:00:20.672347] E [MSGID: 106054] > [glusterd-utils.c:13863:glusterd_handle_replicate_brick_ops] 0-management: > Failed to set extended attribute trusted.add-brick : Transport endpoint is > not connected [Transport endpoint is not connected] > > [2019-04-15 14:00:20.693491] E [MSGID: 101042] > [compat.c:569:gf_umount_lazy] 0-management: Lazy unmount of /tmp/mntmvdFGq > [Transport endpoint is not connected] > > [2019-04-15 14:00:20.693597] E [MSGID: 106074] > [glusterd-brick-ops.c:2590:glusterd_op_add_brick] 0-glusterd: Unable to add > bricks > > [2019-04-15 14:00:20.693637] E [MSGID: 106123] > [glusterd-mgmt.c:312:gd_mgmt_v3_commit_fn] 0-management: Add-brick commit > failed. 
> > [2019-04-15 14:00:20.693667] E [MSGID: 106123] > [glusterd-mgmt-handler.c:616:glusterd_handle_commit_fn] 0-management: > commit failed on operation Add brick > > > > Webserver11 log file: > > > > [2019-04-15 13:56:29.563270] I [MSGID: 106488] > [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: > Received get vol req > > The message "I [MSGID: 106488] > [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: > Received get vol req" repeated 2 times between [2019-04-15 13:56:29.563270] > and [2019-04-15 13:56:29.566209] > > [2019-04-15 14:00:33.996866] I [run.c:190:runner_log] > (-->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x3a215) > [0x7f36de924215] > -->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xe3e9d) > [0x7f36de9cde9d] -->/lib64/libglusterfs.so.0(runner_log+0x115) > [0x7f36e9ed6ea5] ) 0-management: Ran script: > /var/lib/glusterd/hooks/1/add-brick/pre/S28Quota-enable-root-xattr-heal.sh > --volname=dockervols --version=1 --volume-op=add-brick > --gd-workdir=/var/lib/glusterd > > [2019-04-15 14:00:33.996979] I [MSGID: 106578] > [glusterd-brick-ops.c:1354:glusterd_op_perform_add_bricks] 0-management: > replica-count is set 4 > > [2019-04-15 14:00:33.997004] I [MSGID: 106578] > [glusterd-brick-ops.c:1364:glusterd_op_perform_add_bricks] 0-management: > type is set 0, need to change it > > [2019-04-15 14:00:34.013789] I [MSGID: 106132] > [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: nfs already > stopped > > [2019-04-15 14:00:34.013849] I [MSGID: 106568] > [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: nfs service is > stopped > > [2019-04-15 14:00:34.017535] I [MSGID: 106568] > [glusterd-proc-mgmt.c:88:glusterd_proc_stop] 0-management: Stopping > glustershd daemon running in pid: 6087 > > [2019-04-15 14:00:35.018783] I [MSGID: 106568] > [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: glustershd > service is stopped > > [2019-04-15 14:00:35.018952] I [MSGID: 106567] > [glusterd-svc-mgmt.c:211:glusterd_svc_start] 0-management: Starting > glustershd service > > [2019-04-15 14:00:35.028306] I [MSGID: 106132] > [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: bitd already > stopped > > [2019-04-15 14:00:35.028408] I [MSGID: 106568] > [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: bitd service is > stopped > > [2019-04-15 14:00:35.028601] I [MSGID: 106132] > [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: scrub already > stopped > > [2019-04-15 14:00:35.028645] I [MSGID: 106568] > [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: scrub service is > stopped > > > > Thank you for taking a look! > > > > Boris > > > > > > *From: *Atin Mukherjee > *Date: *Friday, April 12, 2019 at 1:10 PM > *To: *Boris Goldowsky > *Cc: *Gluster-users > *Subject: *Re: [Gluster-users] Volume stuck unable to add a brick > > > > > > > > On Fri, 12 Apr 2019 at 22:32, Boris Goldowsky wrote: > > I?ve got a replicated volume with three bricks (?1x3=3?), the idea is to > have a common set of files that are locally available on all the machines > (Scientific Linux 7, which is essentially CentOS 7) in a cluster. > > > > I tried to add on a fourth machine, so used a command like this: > > > > sudo gluster volume add-brick dockervols replica 4 > webserver8:/data/gluster/dockervols force > > > > but the result is: > > volume add-brick: failed: Commit failed on webserver1. Please check log > file for details. > > Commit failed on webserver8. Please check log file for details. 
> > Commit failed on webserver11. Please check log file for details. > > > > Tried: removing the new brick (this also fails) and trying again. > > Tried: checking the logs. The log files are not enlightening to me ? I > don?t know what?s normal and what?s not. > > > > From webserver8 & webserver11 could you attach glusterd log files? > > > > Also please share following: > > - gluster version? (gluster ?version) > > - Output of ?gluster peer status? > > - Output of ?gluster v info? from all 4 nodes. > > > > Tried: deleting the brick directory from previous attempt, so that it?s > not in the way. > > Tried: restarting gluster services > > Tried: rebooting > > Tried: setting up a new volume, replicated to all four machines. This > works, so I?m assuming it?s not a networking issue. But still fails with > this existing volume that has the critical data in it. > > > > Running out of ideas. Any suggestions? Thank you! > > > > Boris > > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users > > -- > > --Atin > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From snowmailer at gmail.com Tue Apr 16 15:44:24 2019 From: snowmailer at gmail.com (Martin Toth) Date: Tue, 16 Apr 2019 17:44:24 +0200 Subject: [Gluster-users] Replica 3 - how to replace failed node (peer) In-Reply-To: References: <0917AF4A-76EC-4A9E-820F-E0ADA2DA899A@gmail.com> <1634978A-E849-48DB-A160-B1AC3DB56D38@gmail.com> <69E7C95F-8A81-46CB-8BD8-F66B582144EC@gmail.com> <00009213-6BF3-4A7F-AFA7-AC076B04496C@gmail.com> Message-ID: <7B2698DB-1897-4EA4-AA63-FFE8752C50F7@gmail.com> Thanks for clarification, one more question. When I will recover(boot) failed node back and this peer will be available again to remaining two nodes. How do I tell gluster to mark this brick as failed ? I mean, I?ve booted failed node back without networking. Disk partition (ZFS pool on another disks) where brick was before failure is lost. Now I can start gluster event when I don't have ZFS pool where failed brick was before ? This wont be a problem when I will connect this node back to cluster ? (before brick replace/reset command will be issued) Thanks. BR! Martin > On 11 Apr 2019, at 15:40, Karthik Subrahmanya wrote: > > > > On Thu, Apr 11, 2019 at 6:38 PM Martin Toth > wrote: > Hi Karthik, > >> On Thu, Apr 11, 2019 at 12:43 PM Martin Toth > wrote: >> Hi Karthik, >> >> more over, I would like to ask if there are some recommended settings/parameters for SHD in order to achieve good or fair I/O while volume will be healed when I will replace Brick (this should trigger healing process). >> If I understand you concern correctly, you need to get fair I/O performance for clients while healing takes place as part of the replace brick operation. For this you can turn off the "data-self-heal" and "metadata-self-heal" options until the heal completes on the new brick. > > This is exactly what I mean. I am running VM disks on remaining 2 (out of 3 - one failed as mentioned) nodes and I need to ensure there will be fair I/O performance available on these two nodes while replace brick operation will heal volume. > I will not run any VMs on node where replace brick operation will be running. 
So if I understand correctly, when I will set : > > # gluster volume set cluster.data-self-heal off > # gluster volume set cluster.metadata-self-heal off > > this will tell Gluster clients (libgfapi and FUSE mount) not to read from node ?where replace brick operation? is in place but from remaing two healthy nodes. Is this correct ? Thanks for clarification. > The reads will be served from one of the good bricks since the file will either be not present on the replaced brick at the time of read or it will be present but marked for heal if it is not already healed. If already healed by SHD, then it could be served from the new brick as well, but there won't be any problem in reading from there in that scenario. > By setting these two options whenever a read comes from client it will not try to heal the file for data/metadata. Otherwise it would try to heal (if not already healed by SHD) when the read comes on this, hence slowing down the client. > >> Turning off client side healing doesn't compromise data integrity and consistency. During the read request from client, pending xattr is evaluated for replica copies and read is only served from correct copy. During writes, IO will continue on both the replicas, SHD will take care of healing files. >> After replacing the brick, we strongly recommend you to consider upgrading your gluster to one of the maintained versions. We have many stability related fixes there, which can handle some critical issues and corner cases which you could hit during these kind of scenarios. > > This will be first priority in infrastructure after fixing this cluster back to fully functional replica3. I will upgrade to 3.12.x and then to version 5 or 6. > Sounds good. > > If you are planning to have the same name for the new brick and if you get the error like "Brick may be containing or be contained by an existing brick" even after using the force option, try using a different name. That should work. > > Regards, > Karthik > > BR, > Martin > >> Regards, >> Karthik >> I had some problems in past when healing was triggered, VM disks became unresponsive because healing took most of I/O. My volume containing only big files with VM disks. >> >> Thanks for suggestions. >> BR, >> Martin >> >>> On 10 Apr 2019, at 12:38, Martin Toth > wrote: >>> >>> Thanks, this looks ok to me, I will reset brick because I don't have any data anymore on failed node so I can use same path / brick name. >>> >>> Is reseting brick dangerous command? Should I be worried about some possible failure that will impact remaining two nodes? I am running really old 3.7.6 but stable version. >>> >>> Thanks, >>> BR! >>> >>> Martin >>> >>> >>>> On 10 Apr 2019, at 12:20, Karthik Subrahmanya > wrote: >>>> >>>> Hi Martin, >>>> >>>> After you add the new disks and creating raid array, you can run the following command to replace the old brick with new one: >>>> >>>> - If you are going to use a different name to the new brick you can run >>>> gluster volume replace-brick commit force >>>> >>>> - If you are planning to use the same name for the new brick as well then you can use >>>> gluster volume reset-brick commit force >>>> Here old-brick & new-brick's hostname & path should be same. >>>> >>>> After replacing the brick, make sure the brick comes online using volume status. >>>> Heal should automatically start, you can check the heal status to see all the files gets replicated to the newly added brick. If it does not start automatically, you can manually start that by running gluster volume heal . 
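The angle-bracket placeholders in the replace-brick and reset-brick commands quoted above appear to have been lost when the HTML mail was scrubbed. A minimal sketch of the whole procedure as described there, using the brick path quoted later in this thread and a hypothetical volume name; note that reset-brick may not exist on releases as old as the 3.7.6 mentioned in this thread, and the two self-heal options are the ones discussed above:

    VOL=gv0imagestore                                   # hypothetical volume name
    BRICK=node2.san:/tank/gluster/gv0imagestore/brick1  # failed brick from this thread

    # Optional: keep client-side healing from competing with VM I/O during the rebuild
    gluster volume set $VOL cluster.data-self-heal off
    gluster volume set $VOL cluster.metadata-self-heal off

    # Same hostname and path as the failed brick:
    gluster volume reset-brick $VOL $BRICK start
    gluster volume reset-brick $VOL $BRICK $BRICK commit force

    # Or, if the new brick gets a different path:
    # gluster volume replace-brick $VOL $BRICK node2.san:/tank/gluster/new-brick1 commit force

    # Verify the brick is online and monitor the heal
    gluster volume status $VOL
    gluster volume heal $VOL
    gluster volume heal $VOL info

    # Re-enable client-side healing once "heal info" reports zero entries
    gluster volume set $VOL cluster.data-self-heal on
    gluster volume set $VOL cluster.metadata-self-heal on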
>>>> >>>> HTH, >>>> Karthik >>>> >>>> On Wed, Apr 10, 2019 at 3:13 PM Martin Toth > wrote: >>>> Hi all, >>>> >>>> I am running a replica 3 gluster volume with 3 bricks. One of my servers failed - all disks are showing errors and the raid is in a fault state. >>>> >>>> Type: Replicate >>>> Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a >>>> Status: Started >>>> Number of Bricks: 1 x 3 = 3 >>>> Transport-type: tcp >>>> Bricks: >>>> Brick1: node1.san:/tank/gluster/gv0imagestore/brick1 >>>> Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 >>>> Brick3: node3.san:/tank/gluster/gv0imagestore/brick1 >>>> >>>> So one of my bricks (node2) has totally failed. It went down and all its data is lost (failed raid on node2). Now I am running only two bricks on 2 servers out of 3. >>>> This is a really critical problem for us; we could lose all data. I want to add new disks to node2, create a new raid array on them and try to replace the failed brick on this node. >>>> >>>> What is the procedure for replacing Brick2 on node2, can someone advise? I can't find anything relevant in the documentation. >>>> >>>> Thanks in advance, >>>> Martin >>>> _______________________________________________ >>>> Gluster-users mailing list >>>> Gluster-users at gluster.org >>>> https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From spisla80 at gmail.com Wed Apr 17 08:02:17 2019 From: spisla80 at gmail.com (David Spisla) Date: Wed, 17 Apr 2019 10:02:17 +0200 Subject: [Gluster-users] Hard Failover with Samba and Glusterfs Message-ID: Dear Gluster Community, I have this setup: a 4-node GlusterFS v5.5 cluster, using Samba/CTDB v4.8 to access the volumes (each node has a VIP). I was testing this failover scenario: 1. Start writing 940 GB of small files (64K-100K) from a Win10 client to node1. 2. During the write process I hard-shut down node1 (where the client is connected via the VIP) by turning off the power. My expectation is that the write process stops and after a while the Win10 client offers me a Retry, so I can continue the write on a different node (which now has the VIP of node1). In the past I observed exactly this, but now the system shows strange behaviour: The Win10 client does nothing and Explorer freezes; in the backend CTDB cannot perform the failover and throws errors.
The glusterd from node2 and node3 logs this messages: > [2019-04-16 14:47:31.828323] W [glusterd-locks.c:795:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0x24349) [0x7f1a62fcb349] -->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0x2d950) [0x7f1a62fd4950] -->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0xe0359) [0x7f1a63087359] ) 0-management: Lock for vol archive1 not held > [2019-04-16 14:47:31.828350] W [MSGID: 106117] [glusterd-handler.c:6451:__glusterd_peer_rpc_notify] 0-management: Lock not released for archive1 > [2019-04-16 14:47:31.828369] W [glusterd-locks.c:795:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0x24349) [0x7f1a62fcb349] -->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0x2d950) [0x7f1a62fd4950] -->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0xe0359) [0x7f1a63087359] ) 0-management: Lock for vol archive2 not held > [2019-04-16 14:47:31.828376] W [MSGID: 106117] [glusterd-handler.c:6451:__glusterd_peer_rpc_notify] 0-management: Lock not released for archive2 > [2019-04-16 14:47:31.828412] W [glusterd-locks.c:795:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0x24349) [0x7f1a62fcb349] -->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0x2d950) [0x7f1a62fd4950] -->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0xe0359) [0x7f1a63087359] ) 0-management: Lock for vol gluster_shared_storage not held > [2019-04-16 14:47:31.828423] W [MSGID: 106117] [glusterd-handler.c:6451:__glusterd_peer_rpc_notify] 0-management: Lock not released for gluster_shared_storage > > *In my oponion Samba/CTDB can not perform the failover correctly and continue the write process because glusterfs didn't released the lock.* What do you think? It seems to me like a bug because in past time the failover works correctly. Regards David Spisla -------------- next part -------------- An HTML attachment was scrubbed... URL: From bgoldowsky at cast.org Tue Apr 16 13:41:48 2019 From: bgoldowsky at cast.org (Boris Goldowsky) Date: Tue, 16 Apr 2019 13:41:48 +0000 Subject: [Gluster-users] Volume stuck unable to add a brick In-Reply-To: References: <02CC8632-9B90-43F6-89FA-1160CE529667@contoso.com> <52179D48-7CF0-405E-805F-C5DCDF5B12CB@cast.org> Message-ID: That worked! Thank you SO much! Boris From: Karthik Subrahmanya Date: Tuesday, April 16, 2019 at 8:20 AM To: Boris Goldowsky Cc: Atin Mukherjee , Gluster-users Subject: Re: [Gluster-users] Volume stuck unable to add a brick Hi Boris, Thank you for providing the logs. The problem here is because of the "auth.allow: 127.0.0.1" setting on the volume. When you try to add a new brick to the volume internally replication module will try to set some metadata on the existing bricks to mark pending heal on the new brick, by creating a temporary mount. Because of the auth.allow setting that mount gets permission errors as seen in the below logs, leading to add-brick failure. 
From data-gluster-dockervols.log-webserver9 : [2019-04-15 14:00:34.226838] I [addr.c:55:compare_addr_and_update] 0-/data/gluster/dockervols: allowed = "127.0.0.1", received addr = "192.168.200.147" [2019-04-15 14:00:34.226895] E [MSGID: 115004] [authenticate.c:224:gf_authenticate] 0-auth: no authentication module is interested in accepting remote-client (null) [2019-04-15 14:00:34.227129] E [MSGID: 115001] [server-handshake.c:848:server_setvolume] 0-dockervols-server: Cannot authenticate client from webserver8.cast.org-55674-2019/04/15-14:00:20:495333-dockervols-client-2-0-0 3.12.2 [Permission denied] From dockervols-add-brick-mount.log : [2019-04-15 14:00:20.672033] W [MSGID: 114043] [client-handshake.c:1109:client_setvolume_cbk] 0-dockervols-client-2: failed to set the volume [Permission denied] [2019-04-15 14:00:20.672102] W [MSGID: 114007] [client-handshake.c:1138:client_setvolume_cbk] 0-dockervols-client-2: failed to get 'process-uuid' from reply dict [Invalid argument] [2019-04-15 14:00:20.672129] E [MSGID: 114044] [client-handshake.c:1144:client_setvolume_cbk] 0-dockervols-client-2: SETVOLUME on remote-host failed: Authentication failed [Permission denied] [2019-04-15 14:00:20.672151] I [MSGID: 114049] [client-handshake.c:1258:client_setvolume_cbk] 0-dockervols-client-2: sending AUTH_FAILED event This is a known issue and we are planning to fix this. For the time being we have a workaround for this. - Before you try adding the brick set the auth.allow option to default i.e., "*" or you can do this by running "gluster v reset auth.allow" - Add the brick - After it succeeds set back the auth.allow option to the previous value. Regards, Karthik On Tue, Apr 16, 2019 at 5:20 PM Boris Goldowsky > wrote: OK, log files attached. Boris From: Karthik Subrahmanya > Date: Tuesday, April 16, 2019 at 2:52 AM To: Atin Mukherjee >, Boris Goldowsky > Cc: Gluster-users > Subject: Re: [Gluster-users] Volume stuck unable to add a brick On Mon, Apr 15, 2019 at 9:43 PM Atin Mukherjee > wrote: +Karthik Subrahmanya Didn't we we fix this problem recently? Failed to set extended attribute indicates that temp mount is failing and we don't have quorum number of bricks up. We had two fixes which handles two kind of add-brick scenarios. [1] Fails add-brick when increasing the replica count if any of the brick is down to avoid data loss. This can be overridden by using the force option. [2] Allow add-brick to set the extended attributes by the temp mount if the volume is already mounted (has clients). They are in version 3.12.2 so, patch [1] is present there. But since they are using the force option it should not have any problem even if they have any brick down. The error message they are getting is also different, so it is not because of any brick being down I guess. Patch [2] is not present in 3.12.2 and it is not the conversion from plain distribute to replicate volume. So the scenario is different here. It seems like they are hitting some other issue. @Boris, Can you attach the add-brick's temp mount log. The file name should look something like "dockervols-add-brick-mount.log". Can you also provide all the brick logs of that volume during that time. [1] https://review.gluster.org/#/c/glusterfs/+/16330/ [2] https://review.gluster.org/#/c/glusterfs/+/21791/ Regards, Karthik Boris - What's the gluster version are you using? On Mon, Apr 15, 2019 at 7:35 PM Boris Goldowsky > wrote: Atin, thank you for the reply. 
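A minimal sketch of the workaround Karthik describes above, using the volume and brick names from this thread and the auth.allow value (127.0.0.1) shown in the gluster v info output:

    # 1. Temporarily clear the auth.allow restriction (note its current value first)
    sudo gluster volume reset dockervols auth.allow

    # 2. Retry the add-brick
    sudo gluster volume add-brick dockervols replica 4 webserver8:/data/gluster/dockervols force

    # 3. Restore the restriction once the brick has been added
    sudo gluster volume set dockervols auth.allow 127.0.0.1

    # 4. Optionally confirm the result and watch the pending heals
    sudo gluster volume info dockervols
    sudo gluster volume heal dockervols info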
Here are all of those pieces of information: [bgoldowsky at webserver9 ~]$ gluster --version glusterfs 3.12.2 (same on all nodes) [bgoldowsky at webserver9 ~]$ sudo gluster peer status Number of Peers: 3 Hostname: webserver11.cast.org Uuid: c2b147fd-cab4-4859-9922-db5730f8549d State: Peer in Cluster (Connected) Hostname: webserver1.cast.org Uuid: 4b918f65-2c9d-478e-8648-81d1d6526d4c State: Peer in Cluster (Connected) Other names: 192.168.200.131 webserver1 Hostname: webserver8.cast.org Uuid: be2f568b-61c5-4016-9264-083e4e6453a2 State: Peer in Cluster (Connected) Other names: webserver8 [bgoldowsky at webserver1 ~]$ sudo gluster v info Volume Name: dockervols Type: Replicate Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a Status: Started Snapshot Count: 0 Number of Bricks: 1 x 3 = 3 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/dockervols Brick2: webserver11:/data/gluster/dockervols Brick3: webserver9:/data/gluster/dockervols Options Reconfigured: nfs.disable: on transport.address-family: inet auth.allow: 127.0.0.1 Volume Name: testvol Type: Replicate Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 Status: Started Snapshot Count: 0 Number of Bricks: 1 x 4 = 4 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/testvol Brick2: webserver9:/data/gluster/testvol Brick3: webserver11:/data/gluster/testvol Brick4: webserver8:/data/gluster/testvol Options Reconfigured: transport.address-family: inet nfs.disable: on [bgoldowsky at webserver8 ~]$ sudo gluster v info Volume Name: dockervols Type: Replicate Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a Status: Started Snapshot Count: 0 Number of Bricks: 1 x 3 = 3 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/dockervols Brick2: webserver11:/data/gluster/dockervols Brick3: webserver9:/data/gluster/dockervols Options Reconfigured: nfs.disable: on transport.address-family: inet auth.allow: 127.0.0.1 Volume Name: testvol Type: Replicate Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 Status: Started Snapshot Count: 0 Number of Bricks: 1 x 4 = 4 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/testvol Brick2: webserver9:/data/gluster/testvol Brick3: webserver11:/data/gluster/testvol Brick4: webserver8:/data/gluster/testvol Options Reconfigured: nfs.disable: on transport.address-family: inet [bgoldowsky at webserver9 ~]$ sudo gluster v info Volume Name: dockervols Type: Replicate Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a Status: Started Snapshot Count: 0 Number of Bricks: 1 x 3 = 3 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/dockervols Brick2: webserver11:/data/gluster/dockervols Brick3: webserver9:/data/gluster/dockervols Options Reconfigured: nfs.disable: on transport.address-family: inet auth.allow: 127.0.0.1 Volume Name: testvol Type: Replicate Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 Status: Started Snapshot Count: 0 Number of Bricks: 1 x 4 = 4 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/testvol Brick2: webserver9:/data/gluster/testvol Brick3: webserver11:/data/gluster/testvol Brick4: webserver8:/data/gluster/testvol Options Reconfigured: nfs.disable: on transport.address-family: inet [bgoldowsky at webserver11 ~]$ sudo gluster v info Volume Name: dockervols Type: Replicate Volume ID: 6093a9c6-ec6c-463a-ad25-8c3e3305b98a Status: Started Snapshot Count: 0 Number of Bricks: 1 x 3 = 3 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/dockervols Brick2: webserver11:/data/gluster/dockervols Brick3: webserver9:/data/gluster/dockervols Options 
Reconfigured: auth.allow: 127.0.0.1 transport.address-family: inet nfs.disable: on Volume Name: testvol Type: Replicate Volume ID: 4d5f00f5-00ea-4dcf-babf-1a76eca55332 Status: Started Snapshot Count: 0 Number of Bricks: 1 x 4 = 4 Transport-type: tcp Bricks: Brick1: webserver1:/data/gluster/testvol Brick2: webserver9:/data/gluster/testvol Brick3: webserver11:/data/gluster/testvol Brick4: webserver8:/data/gluster/testvol Options Reconfigured: transport.address-family: inet nfs.disable: on [bgoldowsky at webserver9 ~]$ sudo gluster volume add-brick dockervols replica 4 webserver8:/data/gluster/dockervols force volume add-brick: failed: Commit failed on webserver8.cast.org. Please check log file for details. Webserver8 glusterd.log: [2019-04-15 13:55:42.338197] I [MSGID: 106488] [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: Received get vol req The message "I [MSGID: 106488] [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: Received get vol req" repeated 2 times between [2019-04-15 13:55:42.338197] and [2019-04-15 13:55:42.341618] [2019-04-15 14:00:20.445011] I [run.c:190:runner_log] (-->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x3a215) [0x7fe697764215] -->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xe3e9d) [0x7fe69780de9d] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7fe6a2d16ea5] ) 0-management: Ran script: /var/lib/glusterd/hooks/1/add-brick/pre/S28Quota-enable-root-xattr-heal.sh --volname=dockervols --version=1 --volume-op=add-brick --gd-workdir=/var/lib/glusterd [2019-04-15 14:00:20.445148] I [MSGID: 106578] [glusterd-brick-ops.c:1354:glusterd_op_perform_add_bricks] 0-management: replica-count is set 4 [2019-04-15 14:00:20.445184] I [MSGID: 106578] [glusterd-brick-ops.c:1364:glusterd_op_perform_add_bricks] 0-management: type is set 0, need to change it [2019-04-15 14:00:20.672347] E [MSGID: 106054] [glusterd-utils.c:13863:glusterd_handle_replicate_brick_ops] 0-management: Failed to set extended attribute trusted.add-brick : Transport endpoint is not connected [Transport endpoint is not connected] [2019-04-15 14:00:20.693491] E [MSGID: 101042] [compat.c:569:gf_umount_lazy] 0-management: Lazy unmount of /tmp/mntmvdFGq [Transport endpoint is not connected] [2019-04-15 14:00:20.693597] E [MSGID: 106074] [glusterd-brick-ops.c:2590:glusterd_op_add_brick] 0-glusterd: Unable to add bricks [2019-04-15 14:00:20.693637] E [MSGID: 106123] [glusterd-mgmt.c:312:gd_mgmt_v3_commit_fn] 0-management: Add-brick commit failed. 
[2019-04-15 14:00:20.693667] E [MSGID: 106123] [glusterd-mgmt-handler.c:616:glusterd_handle_commit_fn] 0-management: commit failed on operation Add brick Webserver11 log file: [2019-04-15 13:56:29.563270] I [MSGID: 106488] [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: Received get vol req The message "I [MSGID: 106488] [glusterd-handler.c:1559:__glusterd_handle_cli_get_volume] 0-management: Received get vol req" repeated 2 times between [2019-04-15 13:56:29.563270] and [2019-04-15 13:56:29.566209] [2019-04-15 14:00:33.996866] I [run.c:190:runner_log] (-->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0x3a215) [0x7f36de924215] -->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xe3e9d) [0x7f36de9cde9d] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7f36e9ed6ea5] ) 0-management: Ran script: /var/lib/glusterd/hooks/1/add-brick/pre/S28Quota-enable-root-xattr-heal.sh --volname=dockervols --version=1 --volume-op=add-brick --gd-workdir=/var/lib/glusterd [2019-04-15 14:00:33.996979] I [MSGID: 106578] [glusterd-brick-ops.c:1354:glusterd_op_perform_add_bricks] 0-management: replica-count is set 4 [2019-04-15 14:00:33.997004] I [MSGID: 106578] [glusterd-brick-ops.c:1364:glusterd_op_perform_add_bricks] 0-management: type is set 0, need to change it [2019-04-15 14:00:34.013789] I [MSGID: 106132] [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: nfs already stopped [2019-04-15 14:00:34.013849] I [MSGID: 106568] [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: nfs service is stopped [2019-04-15 14:00:34.017535] I [MSGID: 106568] [glusterd-proc-mgmt.c:88:glusterd_proc_stop] 0-management: Stopping glustershd daemon running in pid: 6087 [2019-04-15 14:00:35.018783] I [MSGID: 106568] [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: glustershd service is stopped [2019-04-15 14:00:35.018952] I [MSGID: 106567] [glusterd-svc-mgmt.c:211:glusterd_svc_start] 0-management: Starting glustershd service [2019-04-15 14:00:35.028306] I [MSGID: 106132] [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: bitd already stopped [2019-04-15 14:00:35.028408] I [MSGID: 106568] [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: bitd service is stopped [2019-04-15 14:00:35.028601] I [MSGID: 106132] [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: scrub already stopped [2019-04-15 14:00:35.028645] I [MSGID: 106568] [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: scrub service is stopped Thank you for taking a look! Boris From: Atin Mukherjee > Date: Friday, April 12, 2019 at 1:10 PM To: Boris Goldowsky > Cc: Gluster-users > Subject: Re: [Gluster-users] Volume stuck unable to add a brick On Fri, 12 Apr 2019 at 22:32, Boris Goldowsky > wrote: I?ve got a replicated volume with three bricks (?1x3=3?), the idea is to have a common set of files that are locally available on all the machines (Scientific Linux 7, which is essentially CentOS 7) in a cluster. I tried to add on a fourth machine, so used a command like this: sudo gluster volume add-brick dockervols replica 4 webserver8:/data/gluster/dockervols force but the result is: volume add-brick: failed: Commit failed on webserver1. Please check log file for details. Commit failed on webserver8. Please check log file for details. Commit failed on webserver11. Please check log file for details. Tried: removing the new brick (this also fails) and trying again. Tried: checking the logs. The log files are not enlightening to me ? I don?t know what?s normal and what?s not. 
From webserver8 & webserver11 could you attach glusterd log files? Also please share following: - gluster version? (gluster ?version) - Output of ?gluster peer status? - Output of ?gluster v info? from all 4 nodes. Tried: deleting the brick directory from previous attempt, so that it?s not in the way. Tried: restarting gluster services Tried: rebooting Tried: setting up a new volume, replicated to all four machines. This works, so I?m assuming it?s not a networking issue. But still fails with this existing volume that has the critical data in it. Running out of ideas. Any suggestions? Thank you! Boris _______________________________________________ Gluster-users mailing list Gluster-users at gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users -- --Atin -------------- next part -------------- An HTML attachment was scrubbed... URL: From cody at platform9.com Tue Apr 16 22:09:28 2019 From: cody at platform9.com (Cody Hill) Date: Tue, 16 Apr 2019 17:09:28 -0500 Subject: [Gluster-users] GlusterFS on ZFS Message-ID: <5F070389-0E92-4277-927E-80B7C65FC5C0@platform9.com> Hey folks. I?m looking to deploy GlusterFS to host some VMs. I?ve done a lot of reading and would like to implement Deduplication and Compression in this setup. My thought would be to run ZFS to handle the Compression and Deduplication. ZFS would give me the following benefits: 1. If a single disk fails rebuilds happen locally instead of over the network 2. Zil & L2Arc should add a slight performance increase 3. Deduplication and Compression are inline and have pretty good performance with modern hardware (Intel Skylake) 4. Automated Snapshotting I can then layer GlusterFS on top to handle distribution to allow 3x Replicas of my storage. My question is? Why aren?t more people doing this? Is this a horrible idea for some reason that I?m missing? I?d be very interested to hear your thoughts. Additional thoughts: I?d like to use Ganesha pNFS to connect to this storage. (Any issues here?) I think I?d need KeepAliveD across these 3x nodes to store in the FSTAB (Is this correct?) I?m also thinking about creating a ?Gluster Tier? of 512GB of Intel Optane DIMM to really smooth out write latencies? Any issues here? Thank you, Cody Hill From pascal.suter at dalco.ch Wed Apr 17 15:34:38 2019 From: pascal.suter at dalco.ch (Pascal Suter) Date: Wed, 17 Apr 2019 17:34:38 +0200 Subject: [Gluster-users] GlusterFS on ZFS In-Reply-To: <5F070389-0E92-4277-927E-80B7C65FC5C0@platform9.com> References: <5F070389-0E92-4277-927E-80B7C65FC5C0@platform9.com> Message-ID: <91e1e030-16f5-0938-f320-9d773f5dc4ee@dalco.ch> Hi Cody i'm still new to Gluster myself, so take my input with the necessary skepticism: if you care about performance (and it looks like you do), use zfs mirror pairs and not raidz volumes. in my experience (outside of gluster), raidz pools perform significantly worse than a hardware raid5 or 6. if you combine a mirror on zfs with a 3x replication on gluster, you need 6x the amount of raw disk space to get your desired redundancy.. you could do with 3x the amount of diskspace, if you left the zfs mirror away and accept the rebuild of a lost disk over the network or you could end up somewhere beween 3x and 6x if you used hardware raid6 instead of zfs on the bricks. When using hardware raid6 make sure you align your lvm volumes properly, it makes a huge difference in performance. Okay, deduplication might give you some of it back, but benchmark the zfs deduplication process first before deciding on it. 
in theory it could add to your write perofrmance, but i'm not sure if that's going to happen in reality. snapshotting might be tricky.. afaik gluster natively supports snapshotting with thin provisioned lvm volumes only. this lets you create snapshots with the "gluster" cli tool. gluster will then handle consistency across all your bricks so that each snapshot (as a whole, across all bricks) is consistent in itself. this includes some challenges about handling open file sessions etc. I'm not familiar with what gluster actually does but by reading the documentation and some discussion about snapshots it seems that there is more to it than simply automate a couple of lvcreate statements. so i would expect some challenges when doing it yourself on zfs rather than letting gluster handle it. Restoring a single file from a snapshot also seems alot easier if you go with the lvm thin setup.. you can then mount a snapshot (of your entire gluster volume, not just of a brick) and simply copy the file.. while with zfs it seems you need to find out which bricks your file resided on, then copy the necessary raw data to your live bricks which is something i would not feel comfortable doing and it is a lot more work and prone to error. also, if things go wrong (for example when dealing with the snapshots), there are probably not so many people around to help you. again, i am no expert, that's just what i'd be concerned about with the little knowledge i have at the moment :) cheers Pascal On 17.04.19 00:09, Cody Hill wrote: > Hey folks. > > I?m looking to deploy GlusterFS to host some VMs. I?ve done a lot of reading and would like to implement Deduplication and Compression in this setup. My thought would be to run ZFS to handle the Compression and Deduplication. > > ZFS would give me the following benefits: > 1. If a single disk fails rebuilds happen locally instead of over the network > 2. Zil & L2Arc should add a slight performance increase > 3. Deduplication and Compression are inline and have pretty good performance with modern hardware (Intel Skylake) > 4. Automated Snapshotting > > I can then layer GlusterFS on top to handle distribution to allow 3x Replicas of my storage. > My question is? Why aren?t more people doing this? Is this a horrible idea for some reason that I?m missing? I?d be very interested to hear your thoughts. > > Additional thoughts: > I?d like to use Ganesha pNFS to connect to this storage. (Any issues here?) > I think I?d need KeepAliveD across these 3x nodes to store in the FSTAB (Is this correct?) > I?m also thinking about creating a ?Gluster Tier? of 512GB of Intel Optane DIMM to really smooth out write latencies? Any issues here? > > Thank you, > Cody Hill > > > > > > > > > > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users From hunter86_bg at yahoo.com Thu Apr 18 03:21:26 2019 From: hunter86_bg at yahoo.com (Strahil) Date: Thu, 18 Apr 2019 06:21:26 +0300 Subject: [Gluster-users] GlusterFS on ZFS Message-ID: Hi Code, Keep in mind that if you like the thin LVM approach, you can still use VDO (Red Hat-based systems) and get that deduplication/compression. VDO most probably will require some tuning to get the writes fast enough, but the reads can be way faster. 
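Whichever layer ends up providing compression (ZFS, or thin LVM with VDO as just suggested), the mirrored-pairs ZFS layout discussed above would look roughly like the sketch below; pool, device and dataset names are illustrative, and the dataset properties are the ones generally suggested for GlusterFS bricks on ZFS:

    # Illustrative devices; adjust ashift and the mirror pairs to the real hardware
    zpool create -o ashift=12 brickpool \
        mirror /dev/sda /dev/sdb \
        mirror /dev/sdc /dev/sdd
    zpool add brickpool log /dev/nvme0n1p1     # SLOG ("ZIL" device)
    zpool add brickpool cache /dev/nvme0n1p2   # L2ARC

    # Dataset used as the Gluster brick
    zfs create brickpool/brick1
    zfs set compression=lz4  brickpool/brick1
    zfs set xattr=sa         brickpool/brick1   # Gluster leans heavily on extended attributes
    zfs set acltype=posixacl brickpool/brick1
    zfs set atime=off        brickpool/brick1

Deduplication is deliberately left at its default (off) in this sketch, for the RAM and benchmarking reasons raised above.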
Best Regards, Strahil NikolovOn Apr 17, 2019 18:34, Pascal Suter wrote: > > Hi Cody > > i'm still new to Gluster myself, so take my input with the necessary > skepticism: > > if you care about performance (and it looks like you do), use zfs mirror > pairs and not raidz volumes. in my experience (outside of gluster), > raidz pools perform significantly worse than a hardware raid5 or 6. if > you combine a mirror on zfs with a 3x replication on gluster, you need > 6x the amount of raw disk space to get your desired redundancy.. you > could do with 3x the amount of diskspace, if you left the zfs mirror > away and accept the rebuild of a lost disk over the network or you could > end up somewhere beween 3x and 6x if you used hardware raid6 instead of > zfs on the bricks. When using hardware raid6 make sure you align your > lvm volumes properly, it makes a huge difference in performance. Okay, > deduplication might give you some of it back, but benchmark the zfs > deduplication process first before deciding on it. in theory it could > add to your write perofrmance, but i'm not sure if that's going to > happen in reality. > > snapshotting might be tricky.. afaik gluster natively supports > snapshotting with thin provisioned lvm volumes only. this lets you > create snapshots with the "gluster" cli tool. gluster will then handle > consistency across all your bricks so that each snapshot (as a whole, > across all bricks) is consistent in itself. this includes some > challenges about handling open file sessions etc. I'm not familiar with > what gluster actually does but by reading the documentation and some > discussion about snapshots it seems that there is more to it than simply > automate a couple of lvcreate statements. so i would expect some > challenges when doing it yourself on zfs rather than letting gluster > handle it. Restoring a single file from a snapshot also seems alot > easier if you go with the lvm thin setup.. you can then mount a snapshot > (of your entire gluster volume, not just of a brick) and simply copy the > file.. while with zfs it seems you need to find out which bricks your > file resided on, then copy the necessary raw data to your live bricks > which is something i would not feel comfortable doing and it is a lot > more work and prone to error. > > also, if things go wrong (for example when dealing with the snapshots), > there are probably not so many people around to help you. > > again, i am no expert, that's just what i'd be concerned about with the > little knowledge i have at the moment :) > > cheers > > Pascal > > On 17.04.19 00:09, Cody Hill wrote: > > Hey folks. > > > > I?m looking to deploy GlusterFS to host some VMs. I?ve done a lot of reading and would like to implement Deduplication and Compression in this setup. My thought would be to run ZFS to handle the Compression and Deduplication. > > > > ZFS would give me the following benefits: > > 1. If a single disk fails rebuilds happen locally instead of over the network > > 2. Zil & L2Arc should add a slight performance increase > > 3. Deduplication and Compression are inline and have pretty good performance with modern hardware (Intel Skylake) > > 4. Automated Snapshotting > > > > I can then layer GlusterFS on top to handle distribution to allow 3x Replicas of my storage. > > My question is? Why aren?t more people doing this? Is this a horrible idea for some reason that I?m missing? I?d be very interested to hear your thoughts. 
> > > > Additional thoughts: > > I?d like to use Ganesha pNFS to connect to this storage. (Any issues here?) > > I think I?d need KeepAliveD across these 3x nodes to store in the FSTAB (Is this correct?) > > I?m also thinking about creating a ?Gluster Tier? of 512GB of Intel Optane DIMM to really smooth out write latencies? Any issues here? > > > > Thank you, > > Cody Hill > > > > > > > > > > > > > > > > > > > > > > > > _____________________ From hgichon at gmail.com Thu Apr 18 07:20:47 2019 From: hgichon at gmail.com (hgichon) Date: Thu, 18 Apr 2019 16:20:47 +0900 Subject: [Gluster-users] Hard Failover with Samba and Glusterfs In-Reply-To: References: Message-ID: Hi. I have a some question about your testing. 1. What was the glusterfs version you used in past time? 2. How about a volume configuration? 3. Was CTDB vip failed over correctly? If so, Clould you attach /var/log/samba/glusterfs-volname.win10.ip.log ? Best Regards - kpkim 2019? 4? 17? (?) ?? 5:02, David Spisla ?? ??: > Dear Gluster Community, > > I have this setup: 4-Node Glusterfs v5.5 Cluster, using SAMBA/CTDB v4.8 to > access the volumes (each node has a VIP) > > I was testing this failover scenario: > > 1. Start Writing 940 GB with small files (64K-100K)from a Win10 Client to > node1 > 2. During the write process I hardly shutdown node1 (where the client is > connect via VIP) by turn off the power > > My expectation is, that the write process stops and after a while the > Win10 Client offers me a Retry, so I can continue the write on different > node (which has now the VIP of node1). > In past time I did this observation, but now the system shows a strange > bahaviour: > > The Win10 Client do nothing and the Explorer freezes, in the backend CTDB > can not perform the failover and throws errors. 
The glusterd from node2 and > node3 logs this messages: > >> [2019-04-16 14:47:31.828323] W [glusterd-locks.c:795:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0x24349) [0x7f1a62fcb349] -->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0x2d950) [0x7f1a62fd4950] -->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0xe0359) [0x7f1a63087359] ) 0-management: Lock for vol archive1 not held >> [2019-04-16 14:47:31.828350] W [MSGID: 106117] [glusterd-handler.c:6451:__glusterd_peer_rpc_notify] 0-management: Lock not released for archive1 >> [2019-04-16 14:47:31.828369] W [glusterd-locks.c:795:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0x24349) [0x7f1a62fcb349] -->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0x2d950) [0x7f1a62fd4950] -->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0xe0359) [0x7f1a63087359] ) 0-management: Lock for vol archive2 not held >> [2019-04-16 14:47:31.828376] W [MSGID: 106117] [glusterd-handler.c:6451:__glusterd_peer_rpc_notify] 0-management: Lock not released for archive2 >> [2019-04-16 14:47:31.828412] W [glusterd-locks.c:795:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0x24349) [0x7f1a62fcb349] -->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0x2d950) [0x7f1a62fd4950] -->/usr/lib64/glusterfs/5.5/xlator/mgmt/glusterd.so(+0xe0359) [0x7f1a63087359] ) 0-management: Lock for vol gluster_shared_storage not held >> [2019-04-16 14:47:31.828423] W [MSGID: 106117] [glusterd-handler.c:6451:__glusterd_peer_rpc_notify] 0-management: Lock not released for gluster_shared_storage >> >> > *In my oponion Samba/CTDB can not perform the failover correctly and > continue the write process because glusterfs didn't released the lock.* > What do you think? It seems to me like a bug because in past time the > failover works correctly. > > Regards > David Spisla > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From lemonnierk at ulrar.net Thu Apr 18 07:27:23 2019 From: lemonnierk at ulrar.net (lemonnierk at ulrar.net) Date: Thu, 18 Apr 2019 08:27:23 +0100 Subject: [Gluster-users] Settings for VM hosting Message-ID: <20190418072722.GF25080@althea.ulrar.net> Hi, We've been using the same settings, found in an old email here, since v3.7 of gluster for our VM hosting volumes. They've been working fine but since we've just installed a v6 for testing I figured there might be new settings I should be aware of. So for access through the libgfapi (qemu), for VM hard drives, is that still optimal and recommended ? 
Volume Name: glusterfs Type: Replicate Volume ID: b28347ff-2c27-44e0-bc7d-c1c017df7cd1 Status: Started Snapshot Count: 0 Number of Bricks: 1 x 3 = 3 Transport-type: tcp Bricks: Brick1: ips1adm.X:/mnt/glusterfs/brick Brick2: ips2adm.X:/mnt/glusterfs/brick Brick3: ips3adm.X:/mnt/glusterfs/brick Options Reconfigured: performance.readdir-ahead: on cluster.quorum-type: auto cluster.server-quorum-type: server network.remote-dio: enable cluster.eager-lock: enable performance.quick-read: off performance.read-ahead: off performance.io-cache: off performance.stat-prefetch: off features.shard: on features.shard-block-size: 64MB cluster.data-self-heal-algorithm: full network.ping-timeout: 30 diagnostics.count-fop-hits: on diagnostics.latency-measurement: on transport.address-family: inet nfs.disable: on performance.client-io-threads: off Thanks ! From karli at inparadise.se Thu Apr 18 07:32:30 2019 From: karli at inparadise.se (=?utf-8?B?S2FybGkgU2rDtmJlcmc=?=) Date: Thu, 18 Apr 2019 09:32:30 +0200 (CEST) Subject: [Gluster-users] GlusterFS on ZFS Message-ID: <085deed5-f048-4baa-84f8-1f6ef1436a5b@email.android.com> An HTML attachment was scrubbed... URL: From srangana at redhat.com Thu Apr 18 11:08:16 2019 From: srangana at redhat.com (Shyam Ranganathan) Date: Thu, 18 Apr 2019 07:08:16 -0400 Subject: [Gluster-users] Announcing Gluster release 5.6 Message-ID: The Gluster community is pleased to announce the release of Gluster 5.6 (packages available at [1]). Release notes for the release can be found at [2]. Major changes, features and limitations addressed in this release: - Release 5.x had a long standing issue where network bandwidth usage was much higher than in prior releases. This issue has been addressed in this release. Bug 1673058 has more details regarding the issue [3]. Thanks, Gluster community [1] Packages for 5.6: https://download.gluster.org/pub/gluster/glusterfs/5/5.6/ [2] Release notes for 5.6: https://docs.gluster.org/en/latest/release-notes/5.6/ [3] Bandwidth usage bug: https://bugzilla.redhat.com/show_bug.cgi?id=1673058 From snowmailer at gmail.com Thu Apr 18 13:13:25 2019 From: snowmailer at gmail.com (Martin Toth) Date: Thu, 18 Apr 2019 15:13:25 +0200 Subject: [Gluster-users] Settings for VM hosting In-Reply-To: <20190418072722.GF25080@althea.ulrar.net> References: <20190418072722.GF25080@althea.ulrar.net> Message-ID: <3FCC8050-52F9-4469-B714-E92FA440C146@gmail.com> Hi, I am curious about your setup and settings also. I have exactly same setup and use case. - why do you use sharding on replica3? Do you have various size of bricks(disks) pre node? Wonder if someone will share settings for this setup. BR! > On 18 Apr 2019, at 09:27, lemonnierk at ulrar.net wrote: > > Hi, > > We've been using the same settings, found in an old email here, since > v3.7 of gluster for our VM hosting volumes. They've been working fine > but since we've just installed a v6 for testing I figured there might > be new settings I should be aware of. > > So for access through the libgfapi (qemu), for VM hard drives, is that > still optimal and recommended ? 
> > Volume Name: glusterfs > Type: Replicate > Volume ID: b28347ff-2c27-44e0-bc7d-c1c017df7cd1 > Status: Started > Snapshot Count: 0 > Number of Bricks: 1 x 3 = 3 > Transport-type: tcp > Bricks: > Brick1: ips1adm.X:/mnt/glusterfs/brick > Brick2: ips2adm.X:/mnt/glusterfs/brick > Brick3: ips3adm.X:/mnt/glusterfs/brick > Options Reconfigured: > performance.readdir-ahead: on > cluster.quorum-type: auto > cluster.server-quorum-type: server > network.remote-dio: enable > cluster.eager-lock: enable > performance.quick-read: off > performance.read-ahead: off > performance.io-cache: off > performance.stat-prefetch: off > features.shard: on > features.shard-block-size: 64MB > cluster.data-self-heal-algorithm: full > network.ping-timeout: 30 > diagnostics.count-fop-hits: on > diagnostics.latency-measurement: on > transport.address-family: inet > nfs.disable: on > performance.client-io-threads: off > > Thanks ! > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users From nl at fischer-ka.de Thu Apr 18 13:44:28 2019 From: nl at fischer-ka.de (nl at fischer-ka.de) Date: Thu, 18 Apr 2019 15:44:28 +0200 Subject: [Gluster-users] Settings for VM hosting In-Reply-To: <3FCC8050-52F9-4469-B714-E92FA440C146@gmail.com> References: <20190418072722.GF25080@althea.ulrar.net> <3FCC8050-52F9-4469-B714-E92FA440C146@gmail.com> Message-ID: <5cf68e1a-6c8f-9f18-72be-85e30c6a00b3@fischer-ka.de> Hi, I have setup my storage for my nodes (also replica 3, but distributed replicated volume with some more nodes) just some weeks ago based on the "virt group" as recommended ... and here is mine: cluster.choose-local: off user.cifs: off features.shard: on cluster.shd-wait-qlength: 10000 cluster.shd-max-threads: 8 cluster.locking-scheme: granular cluster.data-self-heal-algorithm: diff cluster.server-quorum-type: server cluster.quorum-type: auto cluster.eager-lock: enable network.remote-dio: enable performance.low-prio-threads: 32 performance.io-cache: off performance.read-ahead: off performance.quick-read: off performance.client-io-threads: off nfs.disable: on transport.address-family: inet cluster.granular-entry-heal: enable I only changed the data-self-heal-algorithm because CPU is not limiting that much on my nodes so I chose that over bandwith (based on my understanding of the docs). I have some more nodes, so sharding will better distribute the data between my nodes Ingo Am 18.04.19 um 15:13 schrieb Martin Toth: > Hi, > > I am curious about your setup and settings also. I have exactly same setup and use case. > > - why do you use sharding on replica3? Do you have various size of bricks(disks) pre node? > > Wonder if someone will share settings for this setup. > > BR! > >> On 18 Apr 2019, at 09:27, lemonnierk at ulrar.net wrote: >> >> Hi, >> >> We've been using the same settings, found in an old email here, since >> v3.7 of gluster for our VM hosting volumes. They've been working fine >> but since we've just installed a v6 for testing I figured there might >> be new settings I should be aware of. >> >> So for access through the libgfapi (qemu), for VM hard drives, is that >> still optimal and recommended ? 
>> >> Volume Name: glusterfs >> Type: Replicate >> Volume ID: b28347ff-2c27-44e0-bc7d-c1c017df7cd1 >> Status: Started >> Snapshot Count: 0 >> Number of Bricks: 1 x 3 = 3 >> Transport-type: tcp >> Bricks: >> Brick1: ips1adm.X:/mnt/glusterfs/brick >> Brick2: ips2adm.X:/mnt/glusterfs/brick >> Brick3: ips3adm.X:/mnt/glusterfs/brick >> Options Reconfigured: >> performance.readdir-ahead: on >> cluster.quorum-type: auto >> cluster.server-quorum-type: server >> network.remote-dio: enable >> cluster.eager-lock: enable >> performance.quick-read: off >> performance.read-ahead: off >> performance.io-cache: off >> performance.stat-prefetch: off >> features.shard: on >> features.shard-block-size: 64MB >> cluster.data-self-heal-algorithm: full >> network.ping-timeout: 30 >> diagnostics.count-fop-hits: on >> diagnostics.latency-measurement: on >> transport.address-family: inet >> nfs.disable: on >> performance.client-io-threads: off >> >> Thanks ! >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users > From lemonnierk at ulrar.net Thu Apr 18 14:15:51 2019 From: lemonnierk at ulrar.net (lemonnierk at ulrar.net) Date: Thu, 18 Apr 2019 15:15:51 +0100 Subject: [Gluster-users] Settings for VM hosting In-Reply-To: <3FCC8050-52F9-4469-B714-E92FA440C146@gmail.com> References: <20190418072722.GF25080@althea.ulrar.net> <3FCC8050-52F9-4469-B714-E92FA440C146@gmail.com> Message-ID: <20190418141551.GG25080@althea.ulrar.net> On Thu, Apr 18, 2019 at 03:13:25PM +0200, Martin Toth wrote: > Hi, > > I am curious about your setup and settings also. I have exactly same setup and use case. > > - why do you use sharding on replica3? Do you have various size of bricks(disks) pre node? > Back in the 3.7 era there was a bug locking the files during heal. So without sharding the whole disk was locked for ~30 minutes (depending on the disk's size of course), it was briningh the whole service down during heals. We started using shards then because it locks only the shard being healed instead of the whole file, I believe the bug has been fixed since but we've kept it just in case. As a bonus combined to the heal algo full (just re-transmit the shard instead of trying to figure out what's changed) it's much, much faster heal times with very little cpu usage, so really there's no reason not to imho, sharding is great. Might be different if you have big dedicated servers for gluster and nothing else to do with your cpu, I don't know, but for us sharding is a big gain during heals, which unfortunatly is very common on OVH's shaky vRacks :( From budic at onholyground.com Thu Apr 18 16:19:06 2019 From: budic at onholyground.com (Darrell Budic) Date: Thu, 18 Apr 2019 11:19:06 -0500 Subject: [Gluster-users] GlusterFS on ZFS In-Reply-To: <5F070389-0E92-4277-927E-80B7C65FC5C0@platform9.com> References: <5F070389-0E92-4277-927E-80B7C65FC5C0@platform9.com> Message-ID: I use ZFS over VOD because I?m more familiar with it and it suites my use case better. I got similar results from performance tests, with VOD outperforming writes slight and ZFS outperforming reads. That was before I added some ZIL and cache to my ZFS disks, too. I also don?t like that you have to specify estimated sizes with VOD for compression, I prefer the ZFS approach. 
Don?t forget to set the appropriate zfs attributes, the parts of the Gluster doc with those are still valid. Few more comments inline: > On Apr 16, 2019, at 5:09 PM, Cody Hill wrote: > > Hey folks. > > I?m looking to deploy GlusterFS to host some VMs. I?ve done a lot of reading and would like to implement Deduplication and Compression in this setup. My thought would be to run ZFS to handle the Compression and Deduplication. > > ZFS would give me the following benefits: > 1. If a single disk fails rebuilds happen locally instead of over the network I actually run mine in a pure stripe for best performance, if a disk fails and smart warnings didn?t give me enough time to replace it inline first, I?ll rebuild over the network. I have 10G of course, and currently < 10TB of data so I consider it reasonable. I also decided I?d rather present one large brick over many smaller bricks, in some tests others have done, it has shown benefits for gluster healing. > 2. Zil & L2Arc should add a slight performance increase Yes. Get the absolute fasted ZIL you can, but any modern enterprise SSD will still give you some benefits. Over-provision these, you probably need 4-15Gb for the Zil (1G networking vs 10G), and I use 90% of the cache drive to allow the SSD to work it?s best. Cache effectiveness depends on your workload, so monitor and/or test with/without. > 3. Deduplication and Compression are inline and have pretty good performance with modern hardware (Intel Skylake) LZ4 compression is great. As others have said, I?d avoid deduplication altogether. Especially in a gluster environment, why waste the RAM and do the work multiple times? > 4. Automated Snapshotting Be careful doing this ?underneath? the gluster layer, you?re snapshotting only that replica and it?s not guaranteed to be in sync with the others. At best, you?re making a point in time backup of one node, maybe useful for off-system backups with zfs streaming, but I?d consider gluster geo-rep first. And won?t work at all if you are not running a pure replica. > I can then layer GlusterFS on top to handle distribution to allow 3x Replicas of my storage. > My question is? Why aren?t more people doing this? Is this a horrible idea for some reason that I?m missing? I?d be very interested to hear your thoughts. > > Additional thoughts: > I?d like to use Ganesha pNFS to connect to this storage. (Any issues here?) I?d just use glusterfs glfsapi mounts, but if you want to go NFS, sure. Make sure you?re ready to support Ganesha, it doesn?t seem to be as well integrated in the latest gluster releases. Caveat, I don?t use it myself. > I think I?d need KeepAliveD across these 3x nodes to store in the FSTAB (Is this correct?) There are easier ways. I use a simple DNS round robin to a name (that i can put in the host files for the servers/clients to avoid bootstrap issues when the local DNS is a vm ;)), and set the backup-server option so nodes can switch automatically if one fails. Or you can mount localhost: with a converged cluster, again with backup-server options for best results. > I?m also thinking about creating a ?Gluster Tier? of 512GB of Intel Optane DIMM to really smooth out write latencies? Any issues here? Gluster tiering is currently being dropped from support, until/unless it comes back, I?d use the optanes as cache/zil or just make a separate fast pool out of them. 
From hunter86_bg at yahoo.com Thu Apr 18 17:23:24 2019 From: hunter86_bg at yahoo.com (Strahil) Date: Thu, 18 Apr 2019 20:23:24 +0300 Subject: [Gluster-users] Settings for VM hosting Message-ID: Sharding has one benefit for me (oVirt) -> faster heal after maintenance. Otherwise imagine 150 GB VM disk - while you reboot recently patched node , all files on the running replica will be marked for replication. Either it will consume alot of CPU ( to find the neccessary ofsets for heal) or use full heal and replicate the whole file. With sharding - it's quite simple and fast. Best Regards, Strahil NikolovOn Apr 18, 2019 16:13, Martin Toth wrote: > > Hi, > > I am curious about your setup and settings also. I have exactly same setup and use case. > > - why do you use sharding on replica3? Do you have various size of bricks(disks) pre node? > > Wonder if someone will share settings for this setup. > > BR! > > > On 18 Apr 2019, at 09:27, lemonnierk at ulrar.net wrote: > > > > Hi, > > > > We've been using the same settings, found in an old email here, since > > v3.7 of gluster for our VM hosting volumes. They've been working fine > > but since we've just installed a v6 for testing I figured there might > > be new settings I should be aware of. > > > > So for access through the libgfapi (qemu), for VM hard drives, is that > > still optimal and recommended ? > > > > Volume Name: glusterfs > > Type: Replicate > > Volume ID: b28347ff-2c27-44e0-bc7d-c1c017df7cd1 > > Status: Started > > Snapshot Count: 0 > > Number of Bricks: 1 x 3 = 3 > > Transport-type: tcp > > Bricks: > > Brick1: ips1adm.X:/mnt/glusterfs/brick > > Brick2: ips2adm.X:/mnt/glusterfs/brick > > Brick3: ips3adm.X:/mnt/glusterfs/brick > > Options Reconfigured: > > performance.readdir-ahead: on > > cluster.quorum-type: auto > > cluster.server-quorum-type: server > > network.remote-dio: enable > > cluster.eager-lock: enable > > performance.quick-read: off > > performance.read-ahead: off > > performance.io-cache: off > > performance.stat-prefetch: off > > features.shard: on > > features.shard-block-size: 64MB > > cluster.data-self-heal-algorithm: full > > network.ping-timeout: 30 > > diagnostics.count-fop-hits: on > > diagnostics.latency-measurement: on > > transport.address-family: inet > > nfs.disable: on > > performance.client-io-threads: off > > > > Thanks ! > > _______________________________________________ > > Gluster-users mailing list > > Gluster-users at gluster.org > > https://lists.gluster.org/mailman/listinfo/gluster-users > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users From hunter86_bg at yahoo.com Thu Apr 18 17:30:32 2019 From: hunter86_bg at yahoo.com (Strahil) Date: Thu, 18 Apr 2019 20:30:32 +0300 Subject: [Gluster-users] GlusterFS on ZFS Message-ID: Aabout those Optanes -> if you decide to go LVM , you can use them as cache pools for your largest bricks. Cached LVs can be converted to cache pools and create bricks out of it. You have a lot of options... Best Regards, Strahil NikolovOn Apr 18, 2019 19:19, Darrell Budic wrote: > > I use ZFS over VOD because I?m more familiar with it and it suites my use case better. I got similar results from performance tests, with VOD outperforming writes slight and ZFS outperforming reads. That was before I added some ZIL and cache to my ZFS disks, too. I also don?t like that you have to specify estimated sizes with VOD for compression, I prefer the ZFS approach. 
Don?t forget to set the appropriate zfs attributes, the parts of the Gluster doc with those are still valid. > > Few more comments inline: > > > On Apr 16, 2019, at 5:09 PM, Cody Hill wrote: > > > > Hey folks. > > > > I?m looking to deploy GlusterFS to host some VMs. I?ve done a lot of reading and would like to implement Deduplication and Compression in this setup. My thought would be to run ZFS to handle the Compression and Deduplication. > > > > ZFS would give me the following benefits: > > 1. If a single disk fails rebuilds happen locally instead of over the network > > I actually run mine in a pure stripe for best performance, if a disk fails and smart warnings didn?t give me enough time to replace it inline first, I?ll rebuild over the network. I have 10G of course, and currently < 10TB of data so I consider it reasonable. I also decided I?d rather present one large brick over many smaller bricks, in some tests others have done, it has shown benefits for gluster healing. > > > 2. Zil & L2Arc should add a slight performance increase > > Yes. Get the absolute fasted ZIL you can, but any modern enterprise SSD will still give you some benefits. Over-provision these, you probably need 4-15Gb for the Zil (1G networking vs 10G), and I use 90% of the cache drive to allow the SSD to work it?s best. Cache effectiveness depends on your workload, so monitor and/or test with/without. > > > 3. Deduplication and Compression are inline and have pretty good performance with modern hardware (Intel Skylake) > > LZ4 compression is great. As others have said, I?d avoid deduplication altogether. Especially in a gluster environment, why waste the RAM and do the work multiple times? > > > 4. Automated Snapshotting > > Be careful doing this ?underneath? the gluster layer, you?re snapshotting only that replica and it?s not guaranteed to be in sync with the others. At best, you?re making a point in time backup of one node, maybe useful for off-system backups with zfs streaming, but I?d consider gluster geo-rep first. And won?t work at all if you are not running a pure replica. > > > I can then layer GlusterFS on top to handle distribution to allow 3x Replicas of my storage. > > My question is? Why aren?t more people doing this? Is this a horrible idea for some reason that I?m missing? I?d be very interested to hear your thoughts. > > > > Additional thoughts: > > I?d like to use Ganesha pNFS to connect to this storage. (Any issues here?) > > I?d just use glusterfs glfsapi mounts, but if you want to go NFS, sure. Make sure you?re ready to support Ganesha, it doesn?t seem to be as well integrated in the latest gluster releases. Caveat, I don?t use it myself. > > > I think I?d need KeepAliveD across these 3x nodes to store in the FSTAB (Is this correct?) > > There are easier ways. I use a simple DNS round robin to a name (that i can put in the host files for the servers/clients to avoid bootstrap issues when the local DNS is a vm ;)), and set the backup-server option so nodes can switch automatically if one fails. Or you can mount localhost: with a converged cluster, again with backup-server options for best results. > > > I?m also thinking about creating a ?Gluster Tier? of 512GB of Intel Optane DIMM to really smooth out write latencies? Any issues here? > > Gluster tiering is currently being dropped from support, until/unless it comes back, I?d use the optanes as cache/zil or just make a separate fast pool out of them. 
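A rough sketch of the lvmcache route mentioned at the top of this message, assuming a brick LV at vg_bricks/brick1 and the Optane showing up as /dev/pmem0 (illustrative names only):

# pvcreate /dev/pmem0
# vgextend vg_bricks /dev/pmem0
# lvcreate --type cache-pool -L 400G -n optane_cpool vg_bricks /dev/pmem0     # cache pool lives on the fast device
# lvconvert --type cache --cachepool vg_bricks/optane_cpool vg_bricks/brick1  # attach it to the brick LV
# lvconvert --splitcache vg_bricks/brick1                                     # detach again later if needed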
> > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users From glusterusers at davepedu.com Fri Apr 19 00:06:19 2019 From: glusterusers at davepedu.com (Dave Pedu) Date: Thu, 18 Apr 2019 17:06:19 -0700 Subject: [Gluster-users] GlusterFS on ZFS In-Reply-To: <5F070389-0E92-4277-927E-80B7C65FC5C0@platform9.com> References: <5F070389-0E92-4277-927E-80B7C65FC5C0@platform9.com> Message-ID: Do check this doc: https://docs.gluster.org/en/latest/Administrator%20Guide/Gluster%20On%20ZFS/#build-install-zfs In particular, the bit regarding xattr=sa. In the past, Gluster would cause extremely poor performance on zfs datasets without this option set. I'm not sure if this is still the case. - Dave On 2019-04-16 15:09, Cody Hill wrote: > Hey folks. > > I?m looking to deploy GlusterFS to host some VMs. I?ve done a lot of > reading and would like to implement Deduplication and Compression in > this setup. My thought would be to run ZFS to handle the Compression > and Deduplication. > > ZFS would give me the following benefits: > 1. If a single disk fails rebuilds happen locally instead of over the > network > 2. Zil & L2Arc should add a slight performance increase > 3. Deduplication and Compression are inline and have pretty good > performance with modern hardware (Intel Skylake) > 4. Automated Snapshotting > > I can then layer GlusterFS on top to handle distribution to allow 3x > Replicas of my storage. > My question is? Why aren?t more people doing this? Is this a horrible > idea for some reason that I?m missing? I?d be very interested to hear > your thoughts. > > Additional thoughts: > I?d like to use Ganesha pNFS to connect to this storage. (Any issues > here?) > I think I?d need KeepAliveD across these 3x nodes to store in the > FSTAB (Is this correct?) > I?m also thinking about creating a ?Gluster Tier? of 512GB of Intel > Optane DIMM to really smooth out write latencies? Any issues here? > > Thank you, > Cody Hill > > > > > > > > > > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users From kdhananj at redhat.com Fri Apr 19 01:17:49 2019 From: kdhananj at redhat.com (Krutika Dhananjay) Date: Fri, 19 Apr 2019 06:47:49 +0530 Subject: [Gluster-users] Settings for VM hosting In-Reply-To: <20190418072722.GF25080@althea.ulrar.net> References: <20190418072722.GF25080@althea.ulrar.net> Message-ID: Looks good mostly. You can also turn on performance.stat-prefetch, and also set client.event-threads and server.event-threads to 4. And if your bricks are on ssds, then you could also enable performance.client-io-threads. And if your bricks and hypervisors are on same set of machines (hyperconverged), then you can turn off cluster.choose-local and see if it helps read performance. Do let us know what helped and what didn't. -Krutika On Thu, Apr 18, 2019 at 1:05 PM wrote: > Hi, > > We've been using the same settings, found in an old email here, since > v3.7 of gluster for our VM hosting volumes. They've been working fine > but since we've just installed a v6 for testing I figured there might > be new settings I should be aware of. > > So for access through the libgfapi (qemu), for VM hard drives, is that > still optimal and recommended ? 
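Pulling the suggestions above together as commands, a sketch using the volume name from the config below (the last two are the conditional ones):

# gluster volume set glusterfs performance.stat-prefetch on
# gluster volume set glusterfs client.event-threads 4
# gluster volume set glusterfs server.event-threads 4
# gluster volume set glusterfs performance.client-io-threads on   # only if the bricks sit on SSD/NVMe
# gluster volume set glusterfs cluster.choose-local off           # only if hyperconverged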
> > Volume Name: glusterfs > Type: Replicate > Volume ID: b28347ff-2c27-44e0-bc7d-c1c017df7cd1 > Status: Started > Snapshot Count: 0 > Number of Bricks: 1 x 3 = 3 > Transport-type: tcp > Bricks: > Brick1: ips1adm.X:/mnt/glusterfs/brick > Brick2: ips2adm.X:/mnt/glusterfs/brick > Brick3: ips3adm.X:/mnt/glusterfs/brick > Options Reconfigured: > performance.readdir-ahead: on > cluster.quorum-type: auto > cluster.server-quorum-type: server > network.remote-dio: enable > cluster.eager-lock: enable > performance.quick-read: off > performance.read-ahead: off > performance.io-cache: off > performance.stat-prefetch: off > features.shard: on > features.shard-block-size: 64MB > cluster.data-self-heal-algorithm: full > network.ping-timeout: 30 > diagnostics.count-fop-hits: on > diagnostics.latency-measurement: on > transport.address-family: inet > nfs.disable: on > performance.client-io-threads: off > > Thanks ! > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users > -------------- next part -------------- An HTML attachment was scrubbed... URL: From archon810 at gmail.com Fri Apr 19 06:57:59 2019 From: archon810 at gmail.com (Artem Russakovskii) Date: Thu, 18 Apr 2019 23:57:59 -0700 Subject: [Gluster-users] v6.0 release notes fix request Message-ID: Hi, https://docs.gluster.org/en/latest/release-notes/6.0/ currently contains a list of fixed bugs that's run-on and should be fixed with proper line breaks: [image: image.png] Sincerely, Artem -- Founder, Android Police , APK Mirror , Illogical Robot LLC beerpla.net | +ArtemRussakovskii | @ArtemR -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 270318 bytes Desc: not available URL: From lemonnierk at ulrar.net Fri Apr 19 07:18:16 2019 From: lemonnierk at ulrar.net (lemonnierk at ulrar.net) Date: Fri, 19 Apr 2019 08:18:16 +0100 Subject: [Gluster-users] Settings for VM hosting In-Reply-To: References: <20190418072722.GF25080@althea.ulrar.net> Message-ID: <20190419071816.GH25080@althea.ulrar.net> On Fri, Apr 19, 2019 at 06:47:49AM +0530, Krutika Dhananjay wrote: > Looks good mostly. > You can also turn on performance.stat-prefetch, and also set Ah the corruption bug has been fixed, I missed that. Great ! > client.event-threads and server.event-threads to 4. I didn't realize that would also apply to libgfapi ? Good to know, thanks. > And if your bricks are on ssds, then you could also enable > performance.client-io-threads. I'm surprised by that, the doc says "This feature is not recommended for distributed, replicated or distributed-replicated volumes." Since this volume is just a replica 3, shouldn't this stay off ? The disks are all nvme, which I assume would count as ssd. > And if your bricks and hypervisors are on same set of machines > (hyperconverged), > then you can turn off cluster.choose-local and see if it helps read > performance. Thanks, we'll give those a try ! From xpk at headdesk.me Fri Apr 19 15:03:05 2019 From: xpk at headdesk.me (xpk at headdesk.me) Date: Fri, 19 Apr 2019 15:03:05 +0000 Subject: [Gluster-users] adding thin arbiter Message-ID: Hi guys, On an existing volume, I have a volume with 3 replica. One of them is an arbiter. Is there a way to change the arbiter to a thin-arbiter? 
I tried removing the arbiter brick and add it back, but the add-brick command does't take the --thin-arbiter option. xpk -------------- next part -------------- An HTML attachment was scrubbed... URL: From snowmailer at gmail.com Sat Apr 20 09:29:09 2019 From: snowmailer at gmail.com (Martin Toth) Date: Sat, 20 Apr 2019 11:29:09 +0200 Subject: [Gluster-users] Replica 3 - how to replace failed node (peer) In-Reply-To: <7B2698DB-1897-4EA4-AA63-FFE8752C50F7@gmail.com> References: <0917AF4A-76EC-4A9E-820F-E0ADA2DA899A@gmail.com> <1634978A-E849-48DB-A160-B1AC3DB56D38@gmail.com> <69E7C95F-8A81-46CB-8BD8-F66B582144EC@gmail.com> <00009213-6BF3-4A7F-AFA7-AC076B04496C@gmail.com> <7B2698DB-1897-4EA4-AA63-FFE8752C50F7@gmail.com> Message-ID: <960939F3-15F3-4283-B8AE-5CE11F82523E@gmail.com> Just for other users.. they may find this usefull. I finally started Gluster server process on failed node that lost brick and all went OK. Server is again available as a peer and failed brick is not running, so I can continue with replace brick/ reset brick operation. > On 16 Apr 2019, at 17:44, Martin Toth wrote: > > Thanks for clarification, one more question. > > When I will recover(boot) failed node back and this peer will be available again to remaining two nodes. How do I tell gluster to mark this brick as failed ? > > I mean, I?ve booted failed node back without networking. Disk partition (ZFS pool on another disks) where brick was before failure is lost. > Now I can start gluster event when I don't have ZFS pool where failed brick was before ? > > This wont be a problem when I will connect this node back to cluster ? (before brick replace/reset command will be issued) > > Thanks. BR! > Martin > >> On 11 Apr 2019, at 15:40, Karthik Subrahmanya > wrote: >> >> >> >> On Thu, Apr 11, 2019 at 6:38 PM Martin Toth > wrote: >> Hi Karthik, >> >>> On Thu, Apr 11, 2019 at 12:43 PM Martin Toth > wrote: >>> Hi Karthik, >>> >>> more over, I would like to ask if there are some recommended settings/parameters for SHD in order to achieve good or fair I/O while volume will be healed when I will replace Brick (this should trigger healing process). >>> If I understand you concern correctly, you need to get fair I/O performance for clients while healing takes place as part of the replace brick operation. For this you can turn off the "data-self-heal" and "metadata-self-heal" options until the heal completes on the new brick. >> >> This is exactly what I mean. I am running VM disks on remaining 2 (out of 3 - one failed as mentioned) nodes and I need to ensure there will be fair I/O performance available on these two nodes while replace brick operation will heal volume. >> I will not run any VMs on node where replace brick operation will be running. So if I understand correctly, when I will set : >> >> # gluster volume set cluster.data-self-heal off >> # gluster volume set cluster.metadata-self-heal off >> >> this will tell Gluster clients (libgfapi and FUSE mount) not to read from node ?where replace brick operation? is in place but from remaing two healthy nodes. Is this correct ? Thanks for clarification. >> The reads will be served from one of the good bricks since the file will either be not present on the replaced brick at the time of read or it will be present but marked for heal if it is not already healed. If already healed by SHD, then it could be served from the new brick as well, but there won't be any problem in reading from there in that scenario. 
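The volume set / replace-brick / reset-brick commands quoted in this thread lost their volume and brick arguments when archived; with a hypothetical volume name gv0 and node2's brick path used purely as an example, the full forms read roughly:

# gluster volume set gv0 cluster.data-self-heal off
# gluster volume set gv0 cluster.metadata-self-heal off
# gluster volume replace-brick gv0 node2.san:/tank/gluster/gv0imagestore/brick1 node2.san:/tank2/gluster/gv0imagestore/brick1 commit force   # new brick under a different path
# gluster volume reset-brick gv0 node2.san:/tank/gluster/gv0imagestore/brick1 node2.san:/tank/gluster/gv0imagestore/brick1 commit force      # same host and path reused
# gluster volume heal gv0          # kick off the heal manually if it does not start by itself
# gluster volume heal gv0 info     # watch progress
# gluster volume set gv0 cluster.data-self-heal on       # re-enable client-side heals once healing is done
# gluster volume set gv0 cluster.metadata-self-heal on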
>> By setting these two options whenever a read comes from client it will not try to heal the file for data/metadata. Otherwise it would try to heal (if not already healed by SHD) when the read comes on this, hence slowing down the client. >> >>> Turning off client side healing doesn't compromise data integrity and consistency. During the read request from client, pending xattr is evaluated for replica copies and read is only served from correct copy. During writes, IO will continue on both the replicas, SHD will take care of healing files. >>> After replacing the brick, we strongly recommend you to consider upgrading your gluster to one of the maintained versions. We have many stability related fixes there, which can handle some critical issues and corner cases which you could hit during these kind of scenarios. >> >> This will be first priority in infrastructure after fixing this cluster back to fully functional replica3. I will upgrade to 3.12.x and then to version 5 or 6. >> Sounds good. >> >> If you are planning to have the same name for the new brick and if you get the error like "Brick may be containing or be contained by an existing brick" even after using the force option, try using a different name. That should work. >> >> Regards, >> Karthik >> >> BR, >> Martin >> >>> Regards, >>> Karthik >>> I had some problems in past when healing was triggered, VM disks became unresponsive because healing took most of I/O. My volume containing only big files with VM disks. >>> >>> Thanks for suggestions. >>> BR, >>> Martin >>> >>>> On 10 Apr 2019, at 12:38, Martin Toth > wrote: >>>> >>>> Thanks, this looks ok to me, I will reset brick because I don't have any data anymore on failed node so I can use same path / brick name. >>>> >>>> Is reseting brick dangerous command? Should I be worried about some possible failure that will impact remaining two nodes? I am running really old 3.7.6 but stable version. >>>> >>>> Thanks, >>>> BR! >>>> >>>> Martin >>>> >>>> >>>>> On 10 Apr 2019, at 12:20, Karthik Subrahmanya > wrote: >>>>> >>>>> Hi Martin, >>>>> >>>>> After you add the new disks and creating raid array, you can run the following command to replace the old brick with new one: >>>>> >>>>> - If you are going to use a different name to the new brick you can run >>>>> gluster volume replace-brick commit force >>>>> >>>>> - If you are planning to use the same name for the new brick as well then you can use >>>>> gluster volume reset-brick commit force >>>>> Here old-brick & new-brick's hostname & path should be same. >>>>> >>>>> After replacing the brick, make sure the brick comes online using volume status. >>>>> Heal should automatically start, you can check the heal status to see all the files gets replicated to the newly added brick. If it does not start automatically, you can manually start that by running gluster volume heal . >>>>> >>>>> HTH, >>>>> Karthik >>>>> >>>>> On Wed, Apr 10, 2019 at 3:13 PM Martin Toth > wrote: >>>>> Hi all, >>>>> >>>>> I am running replica 3 gluster with 3 bricks. One of my servers failed - all disks are showing errors and raid is in fault state. >>>>> >>>>> Type: Replicate >>>>> Volume ID: 41d5c283-3a74-4af8-a55d-924447bfa59a >>>>> Status: Started >>>>> Number of Bricks: 1 x 3 = 3 >>>>> Transport-type: tcp >>>>> Bricks: >>>>> Brick1: node1.san:/tank/gluster/gv0imagestore/brick1 >>>>> Brick2: node2.san:/tank/gluster/gv0imagestore/brick1 >>>> Brick3: node3.san:/tank/gluster/gv0imagestore/brick1 >>>>> >>>>> So one of my bricks is totally failed (node2). 
It went down and all data are lost (failed raid on node2). Now I am running only two bricks on 2 servers out from 3. >>>>> This is really critical problem for us, we can lost all data. I want to add new disks to node2, create new raid array on them and try to replace failed brick on this node. >>>>> >>>>> What is the procedure of replacing Brick2 on node2, can someone advice? I can?t find anything relevant in documentation. >>>>> >>>>> Thanks in advance, >>>>> Martin >>>>> _______________________________________________ >>>>> Gluster-users mailing list >>>>> Gluster-users at gluster.org >>>>> https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From hunter86_bg at yahoo.com Sat Apr 20 10:37:51 2019 From: hunter86_bg at yahoo.com (Strahil) Date: Sat, 20 Apr 2019 13:37:51 +0300 Subject: [Gluster-users] adding thin arbiter Message-ID: The docs still do not clarify it, but thin arbiter is only supported by GlusterD2 (for now). Best Regards, Strahil NikolovOn Apr 19, 2019 18:03, xpk at headdesk.me wrote: > > Hi guys, > > On an existing volume, I have a volume with 3 replica. One of them is an arbiter. Is there a way to change the arbiter to a thin-arbiter? I tried removing the arbiter brick and add it back, but the add-brick command does't take the --thin-arbiter option. > > xpk -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrickmrennie at gmail.com Sat Apr 20 13:22:13 2019 From: patrickmrennie at gmail.com (Patrick Rennie) Date: Sat, 20 Apr 2019 21:22:13 +0800 Subject: [Gluster-users] Extremely slow cluster performance Message-ID: Hello Gluster Users, I am hoping someone can help me with resolving an ongoing issue I've been having, I'm new to mailing lists so forgive me if I have gotten anything wrong. We have noticed our performance deteriorating over the last few weeks, easily measured by trying to do an ls on one of our top-level folders, and timing it, which usually would take 2-5 seconds, and now takes up to 20 minutes, which obviously renders our cluster basically unusable. This has been intermittent in the past but is now almost constant and I am not sure how to work out the exact cause. We have noticed some errors in the brick logs, and have noticed that if we kill the right brick process, performance instantly returns back to normal, this is not always the same brick, but it indicates to me something in the brick processes or background tasks may be causing extreme latency. Due to this ability to fix it by killing the right brick process off, I think it's a specific file, or folder, or operation which may be hanging and causing the increased latency, but I am not sure how to work it out. One last thing to add is that our bricks are getting quite full (~95% full), we are trying to migrate data off to new storage but that is going slowly, not helped by this issue. I am currently trying to run a full heal as there appear to be many files needing healing, and I have all brick processes running so they have an opportunity to heal, but this means performance is very poor. It currently takes over 15-20 minutes to do an ls of one of our top-level folders, which just contains 60-80 other folders, this should take 2-5 seconds. This is all being checked by FUSE mount locally on the storage node itself, but it is the same for other clients and VMs accessing the cluster. 
Initially, it seemed our NFS mounts were not affected and operated at normal speed, but testing over the last day has shown that our NFS clients are also extremely slow, so it doesn't seem specific to FUSE as I first thought it might be. I am not sure how to proceed from here, I am fairly new to gluster having inherited this setup from my predecessor and trying to keep it going. I have included some info below to try and help with diagnosis, please let me know if any further info would be helpful. I would really appreciate any advice on what I could try to work out the cause. Thank you in advance for reading this, and any suggestions you might be able to offer. - Patrick This is an example of the main error I see in our brick logs, there have been others, I can post them when I see them again too: [2019-04-20 04:54:43.055680] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick1/ library: system.posix_acl_default [Operation not supported] [2019-04-20 05:01:29.476313] W [posix.c:4929:posix_getxattr] 0-gvAA01-posix: Extended attributes not supported (try remounting brick with 'user_xattr' flag) Our setup consists of 2 storage nodes and an arbiter node. I have noticed our nodes are on slightly different versions, I'm not sure if this could be an issue. We have 9 bricks on each node, made up of ZFS RAIDZ2 pools - total capacity is around 560TB. We have bonded 10gbps NICS on each node, and I have tested bandwidth with iperf and found that it's what would be expected from this config. Individual brick performance seems ok, I've tested several bricks using dd and can write a 10GB files at 1.7GB/s. # dd if=/dev/zero of=/brick1/test/test.file bs=1M count=10000 10000+0 records in 10000+0 records out 10485760000 bytes (10 GB, 9.8 GiB) copied, 6.20303 s, 1.7 GB/s Node 1: # glusterfs --version glusterfs 3.12.15 Node 2: # glusterfs --version glusterfs 3.12.14 Arbiter: # glusterfs --version glusterfs 3.12.14 Here is our gluster volume status: # gluster volume status Status of volume: gvAA01 Gluster process TCP Port RDMA Port Online Pid ------------------------------------------------------------------------------ Brick 01-B:/brick1/gvAA01/brick 49152 0 Y 7219 Brick 02-B:/brick1/gvAA01/brick 49152 0 Y 21845 Brick 00-A:/arbiterAA01/gvAA01/bri ck1 49152 0 Y 6931 Brick 01-B:/brick2/gvAA01/brick 49153 0 Y 7239 Brick 02-B:/brick2/gvAA01/brick 49153 0 Y 9916 Brick 00-A:/arbiterAA01/gvAA01/bri ck2 49153 0 Y 6939 Brick 01-B:/brick3/gvAA01/brick 49154 0 Y 7235 Brick 02-B:/brick3/gvAA01/brick 49154 0 Y 21858 Brick 00-A:/arbiterAA01/gvAA01/bri ck3 49154 0 Y 6947 Brick 01-B:/brick4/gvAA01/brick 49155 0 Y 31840 Brick 02-B:/brick4/gvAA01/brick 49155 0 Y 9933 Brick 00-A:/arbiterAA01/gvAA01/bri ck4 49155 0 Y 6956 Brick 01-B:/brick5/gvAA01/brick 49156 0 Y 7233 Brick 02-B:/brick5/gvAA01/brick 49156 0 Y 9942 Brick 00-A:/arbiterAA01/gvAA01/bri ck5 49156 0 Y 6964 Brick 01-B:/brick6/gvAA01/brick 49157 0 Y 7234 Brick 02-B:/brick6/gvAA01/brick 49157 0 Y 9952 Brick 00-A:/arbiterAA01/gvAA01/bri ck6 49157 0 Y 6974 Brick 01-B:/brick7/gvAA01/brick 49158 0 Y 7248 Brick 02-B:/brick7/gvAA01/brick 49158 0 Y 9960 Brick 00-A:/arbiterAA01/gvAA01/bri ck7 49158 0 Y 6984 Brick 01-B:/brick8/gvAA01/brick 49159 0 Y 7253 Brick 02-B:/brick8/gvAA01/brick 49159 0 Y 9970 Brick 00-A:/arbiterAA01/gvAA01/bri ck8 49159 0 Y 6993 Brick 01-B:/brick9/gvAA01/brick 49160 0 Y 7245 Brick 02-B:/brick9/gvAA01/brick 49160 0 Y 9984 Brick 00-A:/arbiterAA01/gvAA01/bri ck9 49160 0 Y 7001 NFS Server on localhost 2049 0 Y 17276 
Self-heal Daemon on localhost N/A N/A Y 25245 NFS Server on 02-B 2049 0 Y 9089 Self-heal Daemon on 02-B N/A N/A Y 17838 NFS Server on 00-a 2049 0 Y 15660 Self-heal Daemon on 00-a N/A N/A Y 16218 Task Status of Volume gvAA01 ------------------------------------------------------------------------------ There are no active volume tasks And gluster volume info: # gluster volume info Volume Name: gvAA01 Type: Distributed-Replicate Volume ID: ca4ece2c-13fe-414b-856c-2878196d6118 Status: Started Snapshot Count: 0 Number of Bricks: 9 x (2 + 1) = 27 Transport-type: tcp Bricks: Brick1: 01-B:/brick1/gvAA01/brick Brick2: 02-B:/brick1/gvAA01/brick Brick3: 00-A:/arbiterAA01/gvAA01/brick1 (arbiter) Brick4: 01-B:/brick2/gvAA01/brick Brick5: 02-B:/brick2/gvAA01/brick Brick6: 00-A:/arbiterAA01/gvAA01/brick2 (arbiter) Brick7: 01-B:/brick3/gvAA01/brick Brick8: 02-B:/brick3/gvAA01/brick Brick9: 00-A:/arbiterAA01/gvAA01/brick3 (arbiter) Brick10: 01-B:/brick4/gvAA01/brick Brick11: 02-B:/brick4/gvAA01/brick Brick12: 00-A:/arbiterAA01/gvAA01/brick4 (arbiter) Brick13: 01-B:/brick5/gvAA01/brick Brick14: 02-B:/brick5/gvAA01/brick Brick15: 00-A:/arbiterAA01/gvAA01/brick5 (arbiter) Brick16: 01-B:/brick6/gvAA01/brick Brick17: 02-B:/brick6/gvAA01/brick Brick18: 00-A:/arbiterAA01/gvAA01/brick6 (arbiter) Brick19: 01-B:/brick7/gvAA01/brick Brick20: 02-B:/brick7/gvAA01/brick Brick21: 00-A:/arbiterAA01/gvAA01/brick7 (arbiter) Brick22: 01-B:/brick8/gvAA01/brick Brick23: 02-B:/brick8/gvAA01/brick Brick24: 00-A:/arbiterAA01/gvAA01/brick8 (arbiter) Brick25: 01-B:/brick9/gvAA01/brick Brick26: 02-B:/brick9/gvAA01/brick Brick27: 00-A:/arbiterAA01/gvAA01/brick9 (arbiter) Options Reconfigured: cluster.shd-max-threads: 4 performance.least-prio-threads: 16 cluster.readdir-optimize: on performance.quick-read: off performance.stat-prefetch: off cluster.data-self-heal: on cluster.lookup-unhashed: auto cluster.lookup-optimize: on cluster.favorite-child-policy: mtime server.allow-insecure: on transport.address-family: inet client.bind-insecure: on cluster.entry-self-heal: off cluster.metadata-self-heal: off performance.md-cache-timeout: 600 cluster.self-heal-daemon: enable performance.readdir-ahead: on diagnostics.brick-log-level: INFO nfs.disable: off Thank you for any assistance. - Patrick -------------- next part -------------- An HTML attachment was scrubbed... URL: From budic at onholyground.com Sat Apr 20 14:50:08 2019 From: budic at onholyground.com (Darrell Budic) Date: Sat, 20 Apr 2019 09:50:08 -0500 Subject: [Gluster-users] Extremely slow cluster performance In-Reply-To: References: Message-ID: Patrick, I would definitely upgrade your two nodes from 3.12.14 to 3.12.15. You also mention ZFS, and that error you show makes me think you need to check to be sure you have ?xattr=sa? and ?acltype=posixacl? set on your ZFS volumes. You also observed your bricks are crossing the 95% full line, ZFS performance will degrade significantly the closer you get to full. In my experience, this starts somewhere between 10% and 5% free space remaining, so you?re in that realm. How?s your free memory on the servers doing? Do you have your zfs arc cache limited to something less than all the RAM? It shares pretty well, but I?ve encountered situations where other things won?t try and take ram back properly if they think it?s in use, so ZFS never gets the opportunity to give it up. Since your volume is a disperse-replica, you might try tuning disperse.shd-max-threads, default is 1, I?d try it at 2, 4, or even more if the CPUs are beefy enough. 
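Worth noting the volume info above shows Type: Distributed-Replicate with cluster.shd-max-threads already set to 4, so for this volume it is the cluster.* option that applies rather than the disperse one; bumping it further would be something along the lines of:

# gluster volume set gvAA01 cluster.shd-max-threads 8     # currently 4 per the options listed above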
And setting server.event-threads to 4 and client.event-threads to 8 has proven helpful in many cases. After you get upgraded to 3.12.15, enabling performance.stat-prefetch may help as well. I don?t know if it matters, but I?d also recommend resetting performance.least-prio-threads to the default of 1 (or try 2 or 4) and/or also setting performance.io -thread-count to 32 if those have beefy CPUs. Beyond those general ideas, more info about your hardware (CPU and RAM) and workload (VMs, direct storage for web servers or enders, etc) may net you some more ideas. Then you?re going to have to do more digging into brick logs looking for errors and/or warnings to see what?s going on. -Darrell > On Apr 20, 2019, at 8:22 AM, Patrick Rennie wrote: > > Hello Gluster Users, > > I am hoping someone can help me with resolving an ongoing issue I've been having, I'm new to mailing lists so forgive me if I have gotten anything wrong. We have noticed our performance deteriorating over the last few weeks, easily measured by trying to do an ls on one of our top-level folders, and timing it, which usually would take 2-5 seconds, and now takes up to 20 minutes, which obviously renders our cluster basically unusable. This has been intermittent in the past but is now almost constant and I am not sure how to work out the exact cause. We have noticed some errors in the brick logs, and have noticed that if we kill the right brick process, performance instantly returns back to normal, this is not always the same brick, but it indicates to me something in the brick processes or background tasks may be causing extreme latency. Due to this ability to fix it by killing the right brick process off, I think it's a specific file, or folder, or operation which may be hanging and causing the increased latency, but I am not sure how to work it out. One last thing to add is that our bricks are getting quite full (~95% full), we are trying to migrate data off to new storage but that is going slowly, not helped by this issue. I am currently trying to run a full heal as there appear to be many files needing healing, and I have all brick processes running so they have an opportunity to heal, but this means performance is very poor. It currently takes over 15-20 minutes to do an ls of one of our top-level folders, which just contains 60-80 other folders, this should take 2-5 seconds. This is all being checked by FUSE mount locally on the storage node itself, but it is the same for other clients and VMs accessing the cluster. Initially, it seemed our NFS mounts were not affected and operated at normal speed, but testing over the last day has shown that our NFS clients are also extremely slow, so it doesn't seem specific to FUSE as I first thought it might be. > > I am not sure how to proceed from here, I am fairly new to gluster having inherited this setup from my predecessor and trying to keep it going. I have included some info below to try and help with diagnosis, please let me know if any further info would be helpful. I would really appreciate any advice on what I could try to work out the cause. Thank you in advance for reading this, and any suggestions you might be able to offer. 
> > - Patrick > > This is an example of the main error I see in our brick logs, there have been others, I can post them when I see them again too: > [2019-04-20 04:54:43.055680] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick1/ library: system.posix_acl_default [Operation not supported] > [2019-04-20 05:01:29.476313] W [posix.c:4929:posix_getxattr] 0-gvAA01-posix: Extended attributes not supported (try remounting brick with 'user_xattr' flag) > > Our setup consists of 2 storage nodes and an arbiter node. I have noticed our nodes are on slightly different versions, I'm not sure if this could be an issue. We have 9 bricks on each node, made up of ZFS RAIDZ2 pools - total capacity is around 560TB. > We have bonded 10gbps NICS on each node, and I have tested bandwidth with iperf and found that it's what would be expected from this config. > Individual brick performance seems ok, I've tested several bricks using dd and can write a 10GB files at 1.7GB/s. > > # dd if=/dev/zero of=/brick1/test/test.file bs=1M count=10000 > 10000+0 records in > 10000+0 records out > 10485760000 bytes (10 GB, 9.8 GiB) copied, 6.20303 s, 1.7 GB/s > > Node 1: > # glusterfs --version > glusterfs 3.12.15 > > Node 2: > # glusterfs --version > glusterfs 3.12.14 > > Arbiter: > # glusterfs --version > glusterfs 3.12.14 > > Here is our gluster volume status: > > # gluster volume status > Status of volume: gvAA01 > Gluster process TCP Port RDMA Port Online Pid > ------------------------------------------------------------------------------ > Brick 01-B:/brick1/gvAA01/brick 49152 0 Y 7219 > Brick 02-B:/brick1/gvAA01/brick 49152 0 Y 21845 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck1 49152 0 Y 6931 > Brick 01-B:/brick2/gvAA01/brick 49153 0 Y 7239 > Brick 02-B:/brick2/gvAA01/brick 49153 0 Y 9916 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck2 49153 0 Y 6939 > Brick 01-B:/brick3/gvAA01/brick 49154 0 Y 7235 > Brick 02-B:/brick3/gvAA01/brick 49154 0 Y 21858 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck3 49154 0 Y 6947 > Brick 01-B:/brick4/gvAA01/brick 49155 0 Y 31840 > Brick 02-B:/brick4/gvAA01/brick 49155 0 Y 9933 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck4 49155 0 Y 6956 > Brick 01-B:/brick5/gvAA01/brick 49156 0 Y 7233 > Brick 02-B:/brick5/gvAA01/brick 49156 0 Y 9942 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck5 49156 0 Y 6964 > Brick 01-B:/brick6/gvAA01/brick 49157 0 Y 7234 > Brick 02-B:/brick6/gvAA01/brick 49157 0 Y 9952 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck6 49157 0 Y 6974 > Brick 01-B:/brick7/gvAA01/brick 49158 0 Y 7248 > Brick 02-B:/brick7/gvAA01/brick 49158 0 Y 9960 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck7 49158 0 Y 6984 > Brick 01-B:/brick8/gvAA01/brick 49159 0 Y 7253 > Brick 02-B:/brick8/gvAA01/brick 49159 0 Y 9970 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck8 49159 0 Y 6993 > Brick 01-B:/brick9/gvAA01/brick 49160 0 Y 7245 > Brick 02-B:/brick9/gvAA01/brick 49160 0 Y 9984 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck9 49160 0 Y 7001 > NFS Server on localhost 2049 0 Y 17276 > Self-heal Daemon on localhost N/A N/A Y 25245 > NFS Server on 02-B 2049 0 Y 9089 > Self-heal Daemon on 02-B N/A N/A Y 17838 > NFS Server on 00-a 2049 0 Y 15660 > Self-heal Daemon on 00-a N/A N/A Y 16218 > > Task Status of Volume gvAA01 > ------------------------------------------------------------------------------ > There are no active volume tasks > > And gluster volume info: > > # gluster volume info > > Volume Name: gvAA01 > Type: Distributed-Replicate > Volume ID: ca4ece2c-13fe-414b-856c-2878196d6118 > Status: Started > 
Snapshot Count: 0 > Number of Bricks: 9 x (2 + 1) = 27 > Transport-type: tcp > Bricks: > Brick1: 01-B:/brick1/gvAA01/brick > Brick2: 02-B:/brick1/gvAA01/brick > Brick3: 00-A:/arbiterAA01/gvAA01/brick1 (arbiter) > Brick4: 01-B:/brick2/gvAA01/brick > Brick5: 02-B:/brick2/gvAA01/brick > Brick6: 00-A:/arbiterAA01/gvAA01/brick2 (arbiter) > Brick7: 01-B:/brick3/gvAA01/brick > Brick8: 02-B:/brick3/gvAA01/brick > Brick9: 00-A:/arbiterAA01/gvAA01/brick3 (arbiter) > Brick10: 01-B:/brick4/gvAA01/brick > Brick11: 02-B:/brick4/gvAA01/brick > Brick12: 00-A:/arbiterAA01/gvAA01/brick4 (arbiter) > Brick13: 01-B:/brick5/gvAA01/brick > Brick14: 02-B:/brick5/gvAA01/brick > Brick15: 00-A:/arbiterAA01/gvAA01/brick5 (arbiter) > Brick16: 01-B:/brick6/gvAA01/brick > Brick17: 02-B:/brick6/gvAA01/brick > Brick18: 00-A:/arbiterAA01/gvAA01/brick6 (arbiter) > Brick19: 01-B:/brick7/gvAA01/brick > Brick20: 02-B:/brick7/gvAA01/brick > Brick21: 00-A:/arbiterAA01/gvAA01/brick7 (arbiter) > Brick22: 01-B:/brick8/gvAA01/brick > Brick23: 02-B:/brick8/gvAA01/brick > Brick24: 00-A:/arbiterAA01/gvAA01/brick8 (arbiter) > Brick25: 01-B:/brick9/gvAA01/brick > Brick26: 02-B:/brick9/gvAA01/brick > Brick27: 00-A:/arbiterAA01/gvAA01/brick9 (arbiter) > Options Reconfigured: > cluster.shd-max-threads: 4 > performance.least-prio-threads: 16 > cluster.readdir-optimize: on > performance.quick-read: off > performance.stat-prefetch: off > cluster.data-self-heal: on > cluster.lookup-unhashed: auto > cluster.lookup-optimize: on > cluster.favorite-child-policy: mtime > server.allow-insecure: on > transport.address-family: inet > client.bind-insecure: on > cluster.entry-self-heal: off > cluster.metadata-self-heal: off > performance.md-cache-timeout: 600 > cluster.self-heal-daemon: enable > performance.readdir-ahead: on > diagnostics.brick-log-level: INFO > nfs.disable: off > > Thank you for any assistance. > > - Patrick > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrickmrennie at gmail.com Sat Apr 20 15:09:23 2019 From: patrickmrennie at gmail.com (Patrick Rennie) Date: Sat, 20 Apr 2019 23:09:23 +0800 Subject: [Gluster-users] Extremely slow cluster performance In-Reply-To: References: Message-ID: Hi Darrell, Thanks for your reply, this issue seems to be getting worse over the last few days, really has me tearing my hair out. I will do as you have suggested and get started on upgrading from 3.12.14 to 3.12.15. I've checked the zfs properties and all bricks have "xattr=sa" set, but none of them has "acltype=posixacl" set, currently the acltype property shows "off", if I make these changes will it apply retroactively to the existing data? I'm unfamiliar with what this will change so I may need to look into that before I proceed. I understand performance is going to slow down as the bricks get full, I am currently trying to free space and migrate data to some newer storage, I have fresh several hundred TB storage I just setup recently but with these performance issues it's really slow. I also believe there is significant data which has been deleted directly from the bricks in the past, so if I can reclaim this space in a safe manner then I will have at least around 10-15% free space. 
These servers have dual 8 core Xeon (E5-2620v4) and 512GB of RAM so generally they have plenty of resources available, currently only using around 330/512GB of memory. I will look into what your suggested settings will change, and then will probably go ahead with your recommendations, for our specs as stated above, what would you suggest for performance.io-thread-count ? Our workload is nothing too extreme, we have a few VMs which write backup data to this storage nightly for our clients, our VMs don't live on this cluster, but just write to it. I've been going through all of the logs I can, below are some slightly sanitized errors I've come across, but I'm not sure what to make of them. The main error I am seeing is the first one below, across several of my bricks, but possibly only for specific folders on the cluster, I'm not 100% about that yet though. [2019-04-20 05:56:59.512649] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not supported] [2019-04-20 05:59:06.084333] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not supported] [2019-04-20 05:59:43.289030] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not supported] [2019-04-20 05:59:50.582257] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not supported] [2019-04-20 06:01:42.501701] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not supported] [2019-04-20 06:01:51.665354] W [posix.c:4929:posix_getxattr] 0-gvAA01-posix: Extended attributes not supported (try remounting brick with 'user_xattr' flag) [2019-04-20 13:12:36.131856] E [MSGID: 113002] [posix-helpers.c:893:posix_gfid_set] 0-gvAA01-posix: gfid is null for /xxxxxxxxxxxxxxxxxxxx [Invalid argument] [2019-04-20 13:12:36.131959] E [MSGID: 113002] [posix.c:362:posix_lookup] 0-gvAA01-posix: buf->ia_gfid is null for /brick2/xxxxxxxxxxxxxxxxxxxx_62906_tmp [No data available] [2019-04-20 13:12:36.132016] E [MSGID: 115050] [server-rpc-fops.c:175:server_lookup_cbk] 0-gvAA01-server: 24274759: LOOKUP /xxxxxxxxxxxxxxxxxxxx (a7c9b4a0-b7ee-4d01-a79e-576013c8ac87/Cloud Backup_clone1.vbm_62906_tmp), client: 00-A-16217-2019/04/08-21:23:03:692424-gvAA01-client-4-0-3, error-xlator: gvAA01-posix [No data available] [2019-04-20 13:12:38.093719] E [MSGID: 115050] [server-rpc-fops.c:175:server_lookup_cbk] 0-gvAA01-server: 24276491: LOOKUP /xxxxxxxxxxxxxxxxxxxx (a7c9b4a0-b7ee-4d01-a79e-576013c8ac87/Cloud Backup_clone1.vbm_62906_tmp), client: 00-A-16217-2019/04/08-21:23:03:692424-gvAA01-client-4-0-3, error-xlator: gvAA01-posix [No data available] [2019-04-20 13:12:38.093660] E [MSGID: 113002] [posix-helpers.c:893:posix_gfid_set] 0-gvAA01-posix: gfid is null for /xxxxxxxxxxxxxxxxxxxx [Invalid argument] [2019-04-20 13:12:38.093696] E [MSGID: 113002] [posix.c:362:posix_lookup] 0-gvAA01-posix: buf->ia_gfid is null for /brick2/xxxxxxxxxxxxxxxxxxxx [No data available] [2019-04-20 14:25:59.654576] E [inodelk.c:404:__inode_unlock_lock] 0-gvAA01-locks: Matching lock not found for unlock 0-9223372036854775807, by 980fdbbd367f0000 on 0x7fc4f0161440 [2019-04-20 14:25:59.654668] E [MSGID: 115053] 
[server-rpc-fops.c:295:server_inodelk_cbk] 0-gvAA01-server: 6092928: INODELK /xxxxxxxxxxxxxxxxxxxx.cdr$ (25b14631-a179-4274-8243-6e272d4f2ad8), client: cb-per-worker18-53637-2019/04/19-14:25:37:927673-gvAA01-client-1-0-4, error-xlator: gvAA01-locks [Invalid argument] [2019-04-20 13:35:07.495495] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x247c644, Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport (tcp.gvAA01-server) [2019-04-20 13:35:07.495619] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/debug/io-stats.so(+0x1696a) [0x7ff4ae6f796a] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/protocol/server.so(+0x2d6e8) [0x7ff4ae2a96e8] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/protocol/server.so(+0x928d) [0x7ff4ae28528d] ) 0-: Reply submission failed Thank you again for your assistance. It is greatly appreciated. - Patrick On Sat, Apr 20, 2019 at 10:50 PM Darrell Budic wrote: > Patrick, > > I would definitely upgrade your two nodes from 3.12.14 to 3.12.15. You > also mention ZFS, and that error you show makes me think you need to check > to be sure you have ?xattr=sa? and ?acltype=posixacl? set on your ZFS > volumes. > > You also observed your bricks are crossing the 95% full line, ZFS > performance will degrade significantly the closer you get to full. In my > experience, this starts somewhere between 10% and 5% free space remaining, > so you?re in that realm. > > How?s your free memory on the servers doing? Do you have your zfs arc > cache limited to something less than all the RAM? It shares pretty well, > but I?ve encountered situations where other things won?t try and take ram > back properly if they think it?s in use, so ZFS never gets the opportunity > to give it up. > > Since your volume is a disperse-replica, you might try tuning > disperse.shd-max-threads, default is 1, I?d try it at 2, 4, or even more if > the CPUs are beefy enough. And setting server.event-threads to 4 and > client.event-threads to 8 has proven helpful in many cases. After you get > upgraded to 3.12.15, enabling performance.stat-prefetch may help as well. I > don?t know if it matters, but I?d also recommend resetting > performance.least-prio-threads to the default of 1 (or try 2 or 4) and/or > also setting performance.io-thread-count to 32 if those have beefy CPUs. > > Beyond those general ideas, more info about your hardware (CPU and RAM) > and workload (VMs, direct storage for web servers or enders, etc) may net > you some more ideas. Then you?re going to have to do more digging into > brick logs looking for errors and/or warnings to see what?s going on. > > -Darrell > > > On Apr 20, 2019, at 8:22 AM, Patrick Rennie > wrote: > > Hello Gluster Users, > > I am hoping someone can help me with resolving an ongoing issue I've been > having, I'm new to mailing lists so forgive me if I have gotten anything > wrong. We have noticed our performance deteriorating over the last few > weeks, easily measured by trying to do an ls on one of our top-level > folders, and timing it, which usually would take 2-5 seconds, and now takes > up to 20 minutes, which obviously renders our cluster basically unusable. > This has been intermittent in the past but is now almost constant and I am > not sure how to work out the exact cause. 
We have noticed some errors in > the brick logs, and have noticed that if we kill the right brick process, > performance instantly returns back to normal, this is not always the same > brick, but it indicates to me something in the brick processes or > background tasks may be causing extreme latency. Due to this ability to fix > it by killing the right brick process off, I think it's a specific file, or > folder, or operation which may be hanging and causing the increased > latency, but I am not sure how to work it out. One last thing to add is > that our bricks are getting quite full (~95% full), we are trying to > migrate data off to new storage but that is going slowly, not helped by > this issue. I am currently trying to run a full heal as there appear to be > many files needing healing, and I have all brick processes running so they > have an opportunity to heal, but this means performance is very poor. It > currently takes over 15-20 minutes to do an ls of one of our top-level > folders, which just contains 60-80 other folders, this should take 2-5 > seconds. This is all being checked by FUSE mount locally on the storage > node itself, but it is the same for other clients and VMs accessing the > cluster. Initially, it seemed our NFS mounts were not affected and operated > at normal speed, but testing over the last day has shown that our NFS > clients are also extremely slow, so it doesn't seem specific to FUSE as I > first thought it might be. > > I am not sure how to proceed from here, I am fairly new to gluster having > inherited this setup from my predecessor and trying to keep it going. I > have included some info below to try and help with diagnosis, please let me > know if any further info would be helpful. I would really appreciate any > advice on what I could try to work out the cause. Thank you in advance for > reading this, and any suggestions you might be able to offer. > > - Patrick > > This is an example of the main error I see in our brick logs, there have > been others, I can post them when I see them again too: > [2019-04-20 04:54:43.055680] E [MSGID: 113001] > [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on > /brick1/ library: system.posix_acl_default [Operation not > supported] > [2019-04-20 05:01:29.476313] W [posix.c:4929:posix_getxattr] > 0-gvAA01-posix: Extended attributes not supported (try remounting brick > with 'user_xattr' flag) > > Our setup consists of 2 storage nodes and an arbiter node. I have noticed > our nodes are on slightly different versions, I'm not sure if this could be > an issue. We have 9 bricks on each node, made up of ZFS RAIDZ2 pools - > total capacity is around 560TB. > We have bonded 10gbps NICS on each node, and I have tested bandwidth with > iperf and found that it's what would be expected from this config. > Individual brick performance seems ok, I've tested several bricks using dd > and can write a 10GB files at 1.7GB/s. 
> > # dd if=/dev/zero of=/brick1/test/test.file bs=1M count=10000 > 10000+0 records in > 10000+0 records out > 10485760000 bytes (10 GB, 9.8 GiB) copied, 6.20303 s, 1.7 GB/s > > Node 1: > # glusterfs --version > glusterfs 3.12.15 > > Node 2: > # glusterfs --version > glusterfs 3.12.14 > > Arbiter: > # glusterfs --version > glusterfs 3.12.14 > > Here is our gluster volume status: > > # gluster volume status > Status of volume: gvAA01 > Gluster process TCP Port RDMA Port Online > Pid > > ------------------------------------------------------------------------------ > Brick 01-B:/brick1/gvAA01/brick 49152 0 Y 7219 > Brick 02-B:/brick1/gvAA01/brick 49152 0 Y 21845 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck1 49152 0 Y > 6931 > Brick 01-B:/brick2/gvAA01/brick 49153 0 Y 7239 > Brick 02-B:/brick2/gvAA01/brick 49153 0 Y 9916 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck2 49153 0 Y > 6939 > Brick 01-B:/brick3/gvAA01/brick 49154 0 Y 7235 > Brick 02-B:/brick3/gvAA01/brick 49154 0 Y 21858 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck3 49154 0 Y > 6947 > Brick 01-B:/brick4/gvAA01/brick 49155 0 Y 31840 > Brick 02-B:/brick4/gvAA01/brick 49155 0 Y 9933 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck4 49155 0 Y > 6956 > Brick 01-B:/brick5/gvAA01/brick 49156 0 Y 7233 > Brick 02-B:/brick5/gvAA01/brick 49156 0 Y 9942 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck5 49156 0 Y > 6964 > Brick 01-B:/brick6/gvAA01/brick 49157 0 Y 7234 > Brick 02-B:/brick6/gvAA01/brick 49157 0 Y 9952 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck6 49157 0 Y > 6974 > Brick 01-B:/brick7/gvAA01/brick 49158 0 Y 7248 > Brick 02-B:/brick7/gvAA01/brick 49158 0 Y 9960 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck7 49158 0 Y > 6984 > Brick 01-B:/brick8/gvAA01/brick 49159 0 Y 7253 > Brick 02-B:/brick8/gvAA01/brick 49159 0 Y 9970 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck8 49159 0 Y > 6993 > Brick 01-B:/brick9/gvAA01/brick 49160 0 Y 7245 > Brick 02-B:/brick9/gvAA01/brick 49160 0 Y 9984 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck9 49160 0 Y > 7001 > NFS Server on localhost 2049 0 Y > 17276 > Self-heal Daemon on localhost N/A N/A Y > 25245 > NFS Server on 02-B 2049 0 Y 9089 > Self-heal Daemon on 02-B N/A N/A Y 17838 > NFS Server on 00-a 2049 0 Y 15660 > Self-heal Daemon on 00-a N/A N/A Y 16218 > > Task Status of Volume gvAA01 > > ------------------------------------------------------------------------------ > There are no active volume tasks > > And gluster volume info: > > # gluster volume info > > Volume Name: gvAA01 > Type: Distributed-Replicate > Volume ID: ca4ece2c-13fe-414b-856c-2878196d6118 > Status: Started > Snapshot Count: 0 > Number of Bricks: 9 x (2 + 1) = 27 > Transport-type: tcp > Bricks: > Brick1: 01-B:/brick1/gvAA01/brick > Brick2: 02-B:/brick1/gvAA01/brick > Brick3: 00-A:/arbiterAA01/gvAA01/brick1 (arbiter) > Brick4: 01-B:/brick2/gvAA01/brick > Brick5: 02-B:/brick2/gvAA01/brick > Brick6: 00-A:/arbiterAA01/gvAA01/brick2 (arbiter) > Brick7: 01-B:/brick3/gvAA01/brick > Brick8: 02-B:/brick3/gvAA01/brick > Brick9: 00-A:/arbiterAA01/gvAA01/brick3 (arbiter) > Brick10: 01-B:/brick4/gvAA01/brick > Brick11: 02-B:/brick4/gvAA01/brick > Brick12: 00-A:/arbiterAA01/gvAA01/brick4 (arbiter) > Brick13: 01-B:/brick5/gvAA01/brick > Brick14: 02-B:/brick5/gvAA01/brick > Brick15: 00-A:/arbiterAA01/gvAA01/brick5 (arbiter) > Brick16: 01-B:/brick6/gvAA01/brick > Brick17: 02-B:/brick6/gvAA01/brick > Brick18: 00-A:/arbiterAA01/gvAA01/brick6 (arbiter) > Brick19: 01-B:/brick7/gvAA01/brick > Brick20: 02-B:/brick7/gvAA01/brick > Brick21: 00-A:/arbiterAA01/gvAA01/brick7 (arbiter) > Brick22: 
01-B:/brick8/gvAA01/brick > Brick23: 02-B:/brick8/gvAA01/brick > Brick24: 00-A:/arbiterAA01/gvAA01/brick8 (arbiter) > Brick25: 01-B:/brick9/gvAA01/brick > Brick26: 02-B:/brick9/gvAA01/brick > Brick27: 00-A:/arbiterAA01/gvAA01/brick9 (arbiter) > Options Reconfigured: > cluster.shd-max-threads: 4 > performance.least-prio-threads: 16 > cluster.readdir-optimize: on > performance.quick-read: off > performance.stat-prefetch: off > cluster.data-self-heal: on > cluster.lookup-unhashed: auto > cluster.lookup-optimize: on > cluster.favorite-child-policy: mtime > server.allow-insecure: on > transport.address-family: inet > client.bind-insecure: on > cluster.entry-self-heal: off > cluster.metadata-self-heal: off > performance.md-cache-timeout: 600 > cluster.self-heal-daemon: enable > performance.readdir-ahead: on > diagnostics.brick-log-level: INFO > nfs.disable: off > > Thank you for any assistance. > > - Patrick > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From budic at onholyground.com Sat Apr 20 15:57:37 2019 From: budic at onholyground.com (Darrell Budic) Date: Sat, 20 Apr 2019 10:57:37 -0500 Subject: [Gluster-users] Extremely slow cluster performance In-Reply-To: References: Message-ID: <93FC9B39-2E8C-4579-8C9D-DEF1A28B7384@onholyground.com> See inline: > On Apr 20, 2019, at 10:09 AM, Patrick Rennie wrote: > > Hi Darrell, > > Thanks for your reply, this issue seems to be getting worse over the last few days, really has me tearing my hair out. I will do as you have suggested and get started on upgrading from 3.12.14 to 3.12.15. > I've checked the zfs properties and all bricks have "xattr=sa" set, but none of them has "acltype=posixacl" set, currently the acltype property shows "off", if I make these changes will it apply retroactively to the existing data? I'm unfamiliar with what this will change so I may need to look into that before I proceed. It is safe to apply that now, any new set/get calls will then use it if new posixacls exist, and use older if not. ZFS is good that way. It should clear up your posix_acl and posix errors over time. > I understand performance is going to slow down as the bricks get full, I am currently trying to free space and migrate data to some newer storage, I have fresh several hundred TB storage I just setup recently but with these performance issues it's really slow. I also believe there is significant data which has been deleted directly from the bricks in the past, so if I can reclaim this space in a safe manner then I will have at least around 10-15% free space. Full ZFS volumes will have a much larger impact on performance than you?d think, I?d prioritize this. If you have been taking zfs snapshots, consider deleting them to get the overall volume free space back up. And just to be sure it?s been said, delete from within the mounted volumes, don?t delete directly from the bricks (gluster will just try and heal it later, compounding your issues). Does not apply to deleting other data from the ZFS volume if it?s not part of the brick directory, of course. > These servers have dual 8 core Xeon (E5-2620v4) and 512GB of RAM so generally they have plenty of resources available, currently only using around 330/512GB of memory. 
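If you want to see how much of that is ZFS ARC and put a ceiling on it, roughly (the 256 GiB figure is just an example):

# grep -E '^(size|c_max)' /proc/spl/kstat/zfs/arcstats                            # current ARC size and ceiling, in bytes
# echo $((256 * 1024**3)) > /sys/module/zfs/parameters/zfs_arc_max                # cap it on the fly
# echo "options zfs zfs_arc_max=$((256 * 1024**3))" >> /etc/modprobe.d/zfs.conf   # and make it persistent across reboots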
> > I will look into what your suggested settings will change, and then will probably go ahead with your recommendations, for our specs as stated above, what would you suggest for performance.io-thread-count ? I run single 2630v4s on my servers, which have a smaller storage footprint than yours. I?d go with 32 for performance.io -thread-count. I?d try 4 for the shd thread settings on that gear. Your memory use sounds fine, so no worries there. > Our workload is nothing too extreme, we have a few VMs which write backup data to this storage nightly for our clients, our VMs don't live on this cluster, but just write to it. If they are writing compressible data, you?ll get immediate benefit by setting compression=lz4 on your ZFS volumes. It won?t help any old data, of course, but it will compress new data going forward. This is another one that?s safe to enable on the fly. > I've been going through all of the logs I can, below are some slightly sanitized errors I've come across, but I'm not sure what to make of them. The main error I am seeing is the first one below, across several of my bricks, but possibly only for specific folders on the cluster, I'm not 100% about that yet though. > > [2019-04-20 05:56:59.512649] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not supported] > [2019-04-20 05:59:06.084333] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not supported] > [2019-04-20 05:59:43.289030] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not supported] > [2019-04-20 05:59:50.582257] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not supported] > [2019-04-20 06:01:42.501701] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not supported] > [2019-04-20 06:01:51.665354] W [posix.c:4929:posix_getxattr] 0-gvAA01-posix: Extended attributes not supported (try remounting brick with 'user_xattr' flag) > > > [2019-04-20 13:12:36.131856] E [MSGID: 113002] [posix-helpers.c:893:posix_gfid_set] 0-gvAA01-posix: gfid is null for /xxxxxxxxxxxxxxxxxxxx [Invalid argument] > [2019-04-20 13:12:36.131959] E [MSGID: 113002] [posix.c:362:posix_lookup] 0-gvAA01-posix: buf->ia_gfid is null for /brick2/xxxxxxxxxxxxxxxxxxxx_62906_tmp [No data available] > [2019-04-20 13:12:36.132016] E [MSGID: 115050] [server-rpc-fops.c:175:server_lookup_cbk] 0-gvAA01-server: 24274759: LOOKUP /xxxxxxxxxxxxxxxxxxxx (a7c9b4a0-b7ee-4d01-a79e-576013c8ac87/Cloud Backup_clone1.vbm_62906_tmp), client: 00-A-16217-2019/04/08-21:23:03:692424-gvAA01-client-4-0-3, error-xlator: gvAA01-posix [No data available] > [2019-04-20 13:12:38.093719] E [MSGID: 115050] [server-rpc-fops.c:175:server_lookup_cbk] 0-gvAA01-server: 24276491: LOOKUP /xxxxxxxxxxxxxxxxxxxx (a7c9b4a0-b7ee-4d01-a79e-576013c8ac87/Cloud Backup_clone1.vbm_62906_tmp), client: 00-A-16217-2019/04/08-21:23:03:692424-gvAA01-client-4-0-3, error-xlator: gvAA01-posix [No data available] > [2019-04-20 13:12:38.093660] E [MSGID: 113002] [posix-helpers.c:893:posix_gfid_set] 0-gvAA01-posix: gfid is null for /xxxxxxxxxxxxxxxxxxxx [Invalid argument] > [2019-04-20 13:12:38.093696] E [MSGID: 113002] 
[posix.c:362:posix_lookup] 0-gvAA01-posix: buf->ia_gfid is null for /brick2/xxxxxxxxxxxxxxxxxxxx [No data available] > posixacls should clear those up, as mentioned. > > [2019-04-20 14:25:59.654576] E [inodelk.c:404:__inode_unlock_lock] 0-gvAA01-locks: Matching lock not found for unlock 0-9223372036854775807, by 980fdbbd367f0000 on 0x7fc4f0161440 > [2019-04-20 14:25:59.654668] E [MSGID: 115053] [server-rpc-fops.c:295:server_inodelk_cbk] 0-gvAA01-server: 6092928: INODELK /xxxxxxxxxxxxxxxxxxxx.cdr$ (25b14631-a179-4274-8243-6e272d4f2ad8), client: cb-per-worker18-53637-2019/04/19-14:25:37:927673-gvAA01-client-1-0-4, error-xlator: gvAA01-locks [Invalid argument] > > > [2019-04-20 13:35:07.495495] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x247c644, Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport (tcp.gvAA01-server) > [2019-04-20 13:35:07.495619] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/debug/io-stats.so(+0x1696a) [0x7ff4ae6f796a] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/protocol/server.so(+0x2d6e8) [0x7ff4ae2a96e8] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/protocol/server.so(+0x928d) [0x7ff4ae28528d] ) 0-: Reply submission failed > Fix the posix acls and see if these clear up over time as well, I?m unclear on what the overall effect of running without the posix acls will be to total gluster health. Your biggest problem sounds like you need to free up space on the volumes and get the overall volume health back up to par and see if that doesn?t resolve the symptoms you?re seeing. > > Thank you again for your assistance. It is greatly appreciated. > > - Patrick > > > > On Sat, Apr 20, 2019 at 10:50 PM Darrell Budic > wrote: > Patrick, > > I would definitely upgrade your two nodes from 3.12.14 to 3.12.15. You also mention ZFS, and that error you show makes me think you need to check to be sure you have ?xattr=sa? and ?acltype=posixacl? set on your ZFS volumes. > > You also observed your bricks are crossing the 95% full line, ZFS performance will degrade significantly the closer you get to full. In my experience, this starts somewhere between 10% and 5% free space remaining, so you?re in that realm. > > How?s your free memory on the servers doing? Do you have your zfs arc cache limited to something less than all the RAM? It shares pretty well, but I?ve encountered situations where other things won?t try and take ram back properly if they think it?s in use, so ZFS never gets the opportunity to give it up. > > Since your volume is a disperse-replica, you might try tuning disperse.shd-max-threads, default is 1, I?d try it at 2, 4, or even more if the CPUs are beefy enough. And setting server.event-threads to 4 and client.event-threads to 8 has proven helpful in many cases. After you get upgraded to 3.12.15, enabling performance.stat-prefetch may help as well. I don?t know if it matters, but I?d also recommend resetting performance.least-prio-threads to the default of 1 (or try 2 or 4) and/or also setting performance.io -thread-count to 32 if those have beefy CPUs. > > Beyond those general ideas, more info about your hardware (CPU and RAM) and workload (VMs, direct storage for web servers or enders, etc) may net you some more ideas. Then you?re going to have to do more digging into brick logs looking for errors and/or warnings to see what?s going on. 
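For reference, those tunings map to volume-set commands roughly as follows (gvAA01 is the volume name from this thread; the values are only the starting points suggested above, not something verified on this cluster). One note: gvAA01 is a distributed-replicate volume with arbiters rather than a disperse volume, so the self-heal knob that applies is cluster.shd-max-threads, which the volume info earlier already shows set to 4.

# gluster volume set gvAA01 server.event-threads 4
# gluster volume set gvAA01 client.event-threads 8
# gluster volume set gvAA01 performance.io-thread-count 32
# gluster volume set gvAA01 performance.least-prio-threads 1
# gluster volume set gvAA01 cluster.shd-max-threads 4
# gluster volume get gvAA01 all | grep -E 'event-threads|io-thread-count|shd-max-threads'

All of these can be applied online; the final get is only there to confirm what the volume is actually running with afterwards.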
> > -Darrell > > >> On Apr 20, 2019, at 8:22 AM, Patrick Rennie > wrote: >> >> Hello Gluster Users, >> >> I am hoping someone can help me with resolving an ongoing issue I've been having, I'm new to mailing lists so forgive me if I have gotten anything wrong. We have noticed our performance deteriorating over the last few weeks, easily measured by trying to do an ls on one of our top-level folders, and timing it, which usually would take 2-5 seconds, and now takes up to 20 minutes, which obviously renders our cluster basically unusable. This has been intermittent in the past but is now almost constant and I am not sure how to work out the exact cause. We have noticed some errors in the brick logs, and have noticed that if we kill the right brick process, performance instantly returns back to normal, this is not always the same brick, but it indicates to me something in the brick processes or background tasks may be causing extreme latency. Due to this ability to fix it by killing the right brick process off, I think it's a specific file, or folder, or operation which may be hanging and causing the increased latency, but I am not sure how to work it out. One last thing to add is that our bricks are getting quite full (~95% full), we are trying to migrate data off to new storage but that is going slowly, not helped by this issue. I am currently trying to run a full heal as there appear to be many files needing healing, and I have all brick processes running so they have an opportunity to heal, but this means performance is very poor. It currently takes over 15-20 minutes to do an ls of one of our top-level folders, which just contains 60-80 other folders, this should take 2-5 seconds. This is all being checked by FUSE mount locally on the storage node itself, but it is the same for other clients and VMs accessing the cluster. Initially, it seemed our NFS mounts were not affected and operated at normal speed, but testing over the last day has shown that our NFS clients are also extremely slow, so it doesn't seem specific to FUSE as I first thought it might be. >> >> I am not sure how to proceed from here, I am fairly new to gluster having inherited this setup from my predecessor and trying to keep it going. I have included some info below to try and help with diagnosis, please let me know if any further info would be helpful. I would really appreciate any advice on what I could try to work out the cause. Thank you in advance for reading this, and any suggestions you might be able to offer. >> >> - Patrick >> >> This is an example of the main error I see in our brick logs, there have been others, I can post them when I see them again too: >> [2019-04-20 04:54:43.055680] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick1/ library: system.posix_acl_default [Operation not supported] >> [2019-04-20 05:01:29.476313] W [posix.c:4929:posix_getxattr] 0-gvAA01-posix: Extended attributes not supported (try remounting brick with 'user_xattr' flag) >> >> Our setup consists of 2 storage nodes and an arbiter node. I have noticed our nodes are on slightly different versions, I'm not sure if this could be an issue. We have 9 bricks on each node, made up of ZFS RAIDZ2 pools - total capacity is around 560TB. >> We have bonded 10gbps NICS on each node, and I have tested bandwidth with iperf and found that it's what would be expected from this config. >> Individual brick performance seems ok, I've tested several bricks using dd and can write a 10GB files at 1.7GB/s. 
>> >> # dd if=/dev/zero of=/brick1/test/test.file bs=1M count=10000 >> 10000+0 records in >> 10000+0 records out >> 10485760000 bytes (10 GB, 9.8 GiB) copied, 6.20303 s, 1.7 GB/s >> >> Node 1: >> # glusterfs --version >> glusterfs 3.12.15 >> >> Node 2: >> # glusterfs --version >> glusterfs 3.12.14 >> >> Arbiter: >> # glusterfs --version >> glusterfs 3.12.14 >> >> Here is our gluster volume status: >> >> # gluster volume status >> Status of volume: gvAA01 >> Gluster process TCP Port RDMA Port Online Pid >> ------------------------------------------------------------------------------ >> Brick 01-B:/brick1/gvAA01/brick 49152 0 Y 7219 >> Brick 02-B:/brick1/gvAA01/brick 49152 0 Y 21845 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck1 49152 0 Y 6931 >> Brick 01-B:/brick2/gvAA01/brick 49153 0 Y 7239 >> Brick 02-B:/brick2/gvAA01/brick 49153 0 Y 9916 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck2 49153 0 Y 6939 >> Brick 01-B:/brick3/gvAA01/brick 49154 0 Y 7235 >> Brick 02-B:/brick3/gvAA01/brick 49154 0 Y 21858 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck3 49154 0 Y 6947 >> Brick 01-B:/brick4/gvAA01/brick 49155 0 Y 31840 >> Brick 02-B:/brick4/gvAA01/brick 49155 0 Y 9933 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck4 49155 0 Y 6956 >> Brick 01-B:/brick5/gvAA01/brick 49156 0 Y 7233 >> Brick 02-B:/brick5/gvAA01/brick 49156 0 Y 9942 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck5 49156 0 Y 6964 >> Brick 01-B:/brick6/gvAA01/brick 49157 0 Y 7234 >> Brick 02-B:/brick6/gvAA01/brick 49157 0 Y 9952 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck6 49157 0 Y 6974 >> Brick 01-B:/brick7/gvAA01/brick 49158 0 Y 7248 >> Brick 02-B:/brick7/gvAA01/brick 49158 0 Y 9960 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck7 49158 0 Y 6984 >> Brick 01-B:/brick8/gvAA01/brick 49159 0 Y 7253 >> Brick 02-B:/brick8/gvAA01/brick 49159 0 Y 9970 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck8 49159 0 Y 6993 >> Brick 01-B:/brick9/gvAA01/brick 49160 0 Y 7245 >> Brick 02-B:/brick9/gvAA01/brick 49160 0 Y 9984 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck9 49160 0 Y 7001 >> NFS Server on localhost 2049 0 Y 17276 >> Self-heal Daemon on localhost N/A N/A Y 25245 >> NFS Server on 02-B 2049 0 Y 9089 >> Self-heal Daemon on 02-B N/A N/A Y 17838 >> NFS Server on 00-a 2049 0 Y 15660 >> Self-heal Daemon on 00-a N/A N/A Y 16218 >> >> Task Status of Volume gvAA01 >> ------------------------------------------------------------------------------ >> There are no active volume tasks >> >> And gluster volume info: >> >> # gluster volume info >> >> Volume Name: gvAA01 >> Type: Distributed-Replicate >> Volume ID: ca4ece2c-13fe-414b-856c-2878196d6118 >> Status: Started >> Snapshot Count: 0 >> Number of Bricks: 9 x (2 + 1) = 27 >> Transport-type: tcp >> Bricks: >> Brick1: 01-B:/brick1/gvAA01/brick >> Brick2: 02-B:/brick1/gvAA01/brick >> Brick3: 00-A:/arbiterAA01/gvAA01/brick1 (arbiter) >> Brick4: 01-B:/brick2/gvAA01/brick >> Brick5: 02-B:/brick2/gvAA01/brick >> Brick6: 00-A:/arbiterAA01/gvAA01/brick2 (arbiter) >> Brick7: 01-B:/brick3/gvAA01/brick >> Brick8: 02-B:/brick3/gvAA01/brick >> Brick9: 00-A:/arbiterAA01/gvAA01/brick3 (arbiter) >> Brick10: 01-B:/brick4/gvAA01/brick >> Brick11: 02-B:/brick4/gvAA01/brick >> Brick12: 00-A:/arbiterAA01/gvAA01/brick4 (arbiter) >> Brick13: 01-B:/brick5/gvAA01/brick >> Brick14: 02-B:/brick5/gvAA01/brick >> Brick15: 00-A:/arbiterAA01/gvAA01/brick5 (arbiter) >> Brick16: 01-B:/brick6/gvAA01/brick >> Brick17: 02-B:/brick6/gvAA01/brick >> Brick18: 00-A:/arbiterAA01/gvAA01/brick6 (arbiter) >> Brick19: 01-B:/brick7/gvAA01/brick >> Brick20: 
02-B:/brick7/gvAA01/brick >> Brick21: 00-A:/arbiterAA01/gvAA01/brick7 (arbiter) >> Brick22: 01-B:/brick8/gvAA01/brick >> Brick23: 02-B:/brick8/gvAA01/brick >> Brick24: 00-A:/arbiterAA01/gvAA01/brick8 (arbiter) >> Brick25: 01-B:/brick9/gvAA01/brick >> Brick26: 02-B:/brick9/gvAA01/brick >> Brick27: 00-A:/arbiterAA01/gvAA01/brick9 (arbiter) >> Options Reconfigured: >> cluster.shd-max-threads: 4 >> performance.least-prio-threads: 16 >> cluster.readdir-optimize: on >> performance.quick-read: off >> performance.stat-prefetch: off >> cluster.data-self-heal: on >> cluster.lookup-unhashed: auto >> cluster.lookup-optimize: on >> cluster.favorite-child-policy: mtime >> server.allow-insecure: on >> transport.address-family: inet >> client.bind-insecure: on >> cluster.entry-self-heal: off >> cluster.metadata-self-heal: off >> performance.md-cache-timeout: 600 >> cluster.self-heal-daemon: enable >> performance.readdir-ahead: on >> diagnostics.brick-log-level: INFO >> nfs.disable: off >> >> Thank you for any assistance. >> >> - Patrick >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrickmrennie at gmail.com Sat Apr 20 16:54:15 2019 From: patrickmrennie at gmail.com (Patrick Rennie) Date: Sun, 21 Apr 2019 00:54:15 +0800 Subject: [Gluster-users] Extremely slow cluster performance In-Reply-To: <93FC9B39-2E8C-4579-8C9D-DEF1A28B7384@onholyground.com> References: <93FC9B39-2E8C-4579-8C9D-DEF1A28B7384@onholyground.com> Message-ID: Hi Darrell, Thanks again for your advice, I've applied the acltype=posixacl on my zpools and I think that has reduced some of the noise from my brick logs. I also bumped up some of the thread counts you suggested but my CPU load skyrocketed, so I dropped it back down to something slightly lower, but still higher than it was before, and will see how that goes for a while. Although low space is a definite issue, if I run an ls anywhere on my bricks directly it's instant, <1 second, and still takes several minutes via gluster, so there is still a problem in my gluster configuration somewhere. We don't have any snapshots, but I am trying to work out if any data on there is safe to delete, or if there is any way I can safely find and delete data which has been removed directly from the bricks in the past. I also have lz4 compression already enabled on each zpool which does help a bit, we get between 1.05 and 1.08x compression on this data. I've tried to go through each client and checked it's cluster mount logs and also my brick logs and looking for errors, so far nothing is jumping out at me, but there are some warnings and errors here and there, I am trying to work out what they mean. It's already 1 am here and unfortunately, I'm still awake working on this issue, but I think that I will have to leave the version upgrades until tomorrow. Thanks again for your advice so far. If anyone has any ideas on where I can look for errors other than brick logs or the cluster mount logs to help resolve this issue, it would be much appreciated. Cheers, - Patrick On Sat, Apr 20, 2019 at 11:57 PM Darrell Budic wrote: > See inline: > > On Apr 20, 2019, at 10:09 AM, Patrick Rennie > wrote: > > Hi Darrell, > > Thanks for your reply, this issue seems to be getting worse over the last > few days, really has me tearing my hair out. 
I will do as you have > suggested and get started on upgrading from 3.12.14 to 3.12.15. > I've checked the zfs properties and all bricks have "xattr=sa" set, but > none of them has "acltype=posixacl" set, currently the acltype property > shows "off", if I make these changes will it apply retroactively to the > existing data? I'm unfamiliar with what this will change so I may need to > look into that before I proceed. > > > It is safe to apply that now, any new set/get calls will then use it if > new posixacls exist, and use older if not. ZFS is good that way. It should > clear up your posix_acl and posix errors over time. > > I understand performance is going to slow down as the bricks get full, I > am currently trying to free space and migrate data to some newer storage, I > have fresh several hundred TB storage I just setup recently but with these > performance issues it's really slow. I also believe there is significant > data which has been deleted directly from the bricks in the past, so if I > can reclaim this space in a safe manner then I will have at least around > 10-15% free space. > > > Full ZFS volumes will have a much larger impact on performance than you?d > think, I?d prioritize this. If you have been taking zfs snapshots, consider > deleting them to get the overall volume free space back up. And just to be > sure it?s been said, delete from within the mounted volumes, don?t delete > directly from the bricks (gluster will just try and heal it later, > compounding your issues). Does not apply to deleting other data from the > ZFS volume if it?s not part of the brick directory, of course. > > These servers have dual 8 core Xeon (E5-2620v4) and 512GB of RAM so > generally they have plenty of resources available, currently only using > around 330/512GB of memory. > > I will look into what your suggested settings will change, and then will > probably go ahead with your recommendations, for our specs as stated above, > what would you suggest for performance.io-thread-count ? > > > I run single 2630v4s on my servers, which have a smaller storage footprint > than yours. I?d go with 32 for performance.io-thread-count. I?d try 4 for > the shd thread settings on that gear. Your memory use sounds fine, so no > worries there. > > Our workload is nothing too extreme, we have a few VMs which write backup > data to this storage nightly for our clients, our VMs don't live on this > cluster, but just write to it. > > > If they are writing compressible data, you?ll get immediate benefit by > setting compression=lz4 on your ZFS volumes. It won?t help any old data, of > course, but it will compress new data going forward. This is another one > that?s safe to enable on the fly. > > I've been going through all of the logs I can, below are some slightly > sanitized errors I've come across, but I'm not sure what to make of them. > The main error I am seeing is the first one below, across several of my > bricks, but possibly only for specific folders on the cluster, I'm not 100% > about that yet though. 
> > [2019-04-20 05:56:59.512649] E [MSGID: 113001] > [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on > /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not > supported] > [2019-04-20 05:59:06.084333] E [MSGID: 113001] > [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on > /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not > supported] > [2019-04-20 05:59:43.289030] E [MSGID: 113001] > [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on > /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not > supported] > [2019-04-20 05:59:50.582257] E [MSGID: 113001] > [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on > /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not > supported] > [2019-04-20 06:01:42.501701] E [MSGID: 113001] > [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on > /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not > supported] > [2019-04-20 06:01:51.665354] W [posix.c:4929:posix_getxattr] > 0-gvAA01-posix: Extended attributes not supported (try remounting brick > with 'user_xattr' flag) > > > [2019-04-20 13:12:36.131856] E [MSGID: 113002] > [posix-helpers.c:893:posix_gfid_set] 0-gvAA01-posix: gfid is null for > /xxxxxxxxxxxxxxxxxxxx [Invalid argument] > [2019-04-20 13:12:36.131959] E [MSGID: 113002] [posix.c:362:posix_lookup] > 0-gvAA01-posix: buf->ia_gfid is null for > /brick2/xxxxxxxxxxxxxxxxxxxx_62906_tmp [No data available] > [2019-04-20 13:12:36.132016] E [MSGID: 115050] > [server-rpc-fops.c:175:server_lookup_cbk] 0-gvAA01-server: 24274759: LOOKUP > /xxxxxxxxxxxxxxxxxxxx (a7c9b4a0-b7ee-4d01-a79e-576013c8ac87/Cloud > Backup_clone1.vbm_62906_tmp), client: > 00-A-16217-2019/04/08-21:23:03:692424-gvAA01-client-4-0-3, error-xlator: > gvAA01-posix [No data available] > [2019-04-20 13:12:38.093719] E [MSGID: 115050] > [server-rpc-fops.c:175:server_lookup_cbk] 0-gvAA01-server: 24276491: LOOKUP > /xxxxxxxxxxxxxxxxxxxx (a7c9b4a0-b7ee-4d01-a79e-576013c8ac87/Cloud > Backup_clone1.vbm_62906_tmp), client: > 00-A-16217-2019/04/08-21:23:03:692424-gvAA01-client-4-0-3, error-xlator: > gvAA01-posix [No data available] > [2019-04-20 13:12:38.093660] E [MSGID: 113002] > [posix-helpers.c:893:posix_gfid_set] 0-gvAA01-posix: gfid is null for > /xxxxxxxxxxxxxxxxxxxx [Invalid argument] > [2019-04-20 13:12:38.093696] E [MSGID: 113002] [posix.c:362:posix_lookup] > 0-gvAA01-posix: buf->ia_gfid is null for /brick2/xxxxxxxxxxxxxxxxxxxx [No > data available] > > > posixacls should clear those up, as mentioned. 
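A quick way to confirm the ACL change has taken effect on a node, without touching anything inside the brick data directory, is an ACL round-trip on a scratch file at the dataset's mountpoint (assuming the pool backing /brick7 is mounted there; the file name is arbitrary):

# touch /brick7/acltest
# setfacl -m u:nobody:r /brick7/acltest
# getfacl /brick7/acltest
# rm /brick7/acltest

With acltype=off the setfacl fails with "Operation not supported", matching the getxattr errors quoted above; once acltype=posixacl is set it succeeds and getfacl shows the extra entry.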
> > > [2019-04-20 14:25:59.654576] E [inodelk.c:404:__inode_unlock_lock] > 0-gvAA01-locks: Matching lock not found for unlock 0-9223372036854775807, > by 980fdbbd367f0000 on 0x7fc4f0161440 > [2019-04-20 14:25:59.654668] E [MSGID: 115053] > [server-rpc-fops.c:295:server_inodelk_cbk] 0-gvAA01-server: 6092928: > INODELK /xxxxxxxxxxxxxxxxxxxx.cdr$ (25b14631-a179-4274-8243-6e272d4f2ad8), > client: > cb-per-worker18-53637-2019/04/19-14:25:37:927673-gvAA01-client-1-0-4, > error-xlator: gvAA01-locks [Invalid argument] > > > [2019-04-20 13:35:07.495495] E [rpcsvc.c:1364:rpcsvc_submit_generic] > 0-rpc-service: failed to submit message (XID: 0x247c644, Program: GlusterFS > 3.3, ProgVers: 330, Proc: 27) to rpc-transport (tcp.gvAA01-server) > [2019-04-20 13:35:07.495619] E [server.c:195:server_submit_reply] > (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/debug/io-stats.so(+0x1696a) > [0x7ff4ae6f796a] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/protocol/server.so(+0x2d6e8) > [0x7ff4ae2a96e8] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/protocol/server.so(+0x928d) > [0x7ff4ae28528d] ) 0-: Reply submission failed > > > Fix the posix acls and see if these clear up over time as well, I?m > unclear on what the overall effect of running without the posix acls will > be to total gluster health. Your biggest problem sounds like you need to > free up space on the volumes and get the overall volume health back up to > par and see if that doesn?t resolve the symptoms you?re seeing. > > > > Thank you again for your assistance. It is greatly appreciated. > > - Patrick > > > > On Sat, Apr 20, 2019 at 10:50 PM Darrell Budic > wrote: > >> Patrick, >> >> I would definitely upgrade your two nodes from 3.12.14 to 3.12.15. You >> also mention ZFS, and that error you show makes me think you need to check >> to be sure you have ?xattr=sa? and ?acltype=posixacl? set on your ZFS >> volumes. >> >> You also observed your bricks are crossing the 95% full line, ZFS >> performance will degrade significantly the closer you get to full. In my >> experience, this starts somewhere between 10% and 5% free space remaining, >> so you?re in that realm. >> >> How?s your free memory on the servers doing? Do you have your zfs arc >> cache limited to something less than all the RAM? It shares pretty well, >> but I?ve encountered situations where other things won?t try and take ram >> back properly if they think it?s in use, so ZFS never gets the opportunity >> to give it up. >> >> Since your volume is a disperse-replica, you might try tuning >> disperse.shd-max-threads, default is 1, I?d try it at 2, 4, or even more if >> the CPUs are beefy enough. And setting server.event-threads to 4 and >> client.event-threads to 8 has proven helpful in many cases. After you get >> upgraded to 3.12.15, enabling performance.stat-prefetch may help as well. I >> don?t know if it matters, but I?d also recommend resetting >> performance.least-prio-threads to the default of 1 (or try 2 or 4) and/or >> also setting performance.io-thread-count to 32 if those have beefy CPUs. >> >> Beyond those general ideas, more info about your hardware (CPU and RAM) >> and workload (VMs, direct storage for web servers or enders, etc) may net >> you some more ideas. Then you?re going to have to do more digging into >> brick logs looking for errors and/or warnings to see what?s going on. 
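On the log-digging point, the daemon and brick logs all live under /var/log/glusterfs on each node, so a blunt first pass for errors and warnings can look something like this (brick log names follow the brick paths, e.g. brick1-gvAA01-brick.log; glusterd's own log may be glusterd.log or etc-glusterfs-glusterd.vol.log depending on the build):

# grep -E ' [EW] \[' /var/log/glusterfs/bricks/*.log | tail -50
# grep -E ' [EW] \[' /var/log/glusterfs/glustershd.log | tail -50
# grep -E ' [EW] \[' /var/log/glusterfs/glusterd.log | tail -50

The client-side view is in the mount log on each client, named after the mount point (e.g. /var/log/glusterfs/mnt-gvAA01.log), which is where the dht and setattr warnings quoted later in this thread show up.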
>> >> -Darrell >> >> >> On Apr 20, 2019, at 8:22 AM, Patrick Rennie >> wrote: >> >> Hello Gluster Users, >> >> I am hoping someone can help me with resolving an ongoing issue I've been >> having, I'm new to mailing lists so forgive me if I have gotten anything >> wrong. We have noticed our performance deteriorating over the last few >> weeks, easily measured by trying to do an ls on one of our top-level >> folders, and timing it, which usually would take 2-5 seconds, and now takes >> up to 20 minutes, which obviously renders our cluster basically unusable. >> This has been intermittent in the past but is now almost constant and I am >> not sure how to work out the exact cause. We have noticed some errors in >> the brick logs, and have noticed that if we kill the right brick process, >> performance instantly returns back to normal, this is not always the same >> brick, but it indicates to me something in the brick processes or >> background tasks may be causing extreme latency. Due to this ability to fix >> it by killing the right brick process off, I think it's a specific file, or >> folder, or operation which may be hanging and causing the increased >> latency, but I am not sure how to work it out. One last thing to add is >> that our bricks are getting quite full (~95% full), we are trying to >> migrate data off to new storage but that is going slowly, not helped by >> this issue. I am currently trying to run a full heal as there appear to be >> many files needing healing, and I have all brick processes running so they >> have an opportunity to heal, but this means performance is very poor. It >> currently takes over 15-20 minutes to do an ls of one of our top-level >> folders, which just contains 60-80 other folders, this should take 2-5 >> seconds. This is all being checked by FUSE mount locally on the storage >> node itself, but it is the same for other clients and VMs accessing the >> cluster. Initially, it seemed our NFS mounts were not affected and operated >> at normal speed, but testing over the last day has shown that our NFS >> clients are also extremely slow, so it doesn't seem specific to FUSE as I >> first thought it might be. >> >> I am not sure how to proceed from here, I am fairly new to gluster having >> inherited this setup from my predecessor and trying to keep it going. I >> have included some info below to try and help with diagnosis, please let me >> know if any further info would be helpful. I would really appreciate any >> advice on what I could try to work out the cause. Thank you in advance for >> reading this, and any suggestions you might be able to offer. >> >> - Patrick >> >> This is an example of the main error I see in our brick logs, there have >> been others, I can post them when I see them again too: >> [2019-04-20 04:54:43.055680] E [MSGID: 113001] >> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >> /brick1/ library: system.posix_acl_default [Operation not >> supported] >> [2019-04-20 05:01:29.476313] W [posix.c:4929:posix_getxattr] >> 0-gvAA01-posix: Extended attributes not supported (try remounting brick >> with 'user_xattr' flag) >> >> Our setup consists of 2 storage nodes and an arbiter node. I have noticed >> our nodes are on slightly different versions, I'm not sure if this could be >> an issue. We have 9 bricks on each node, made up of ZFS RAIDZ2 pools - >> total capacity is around 560TB. 
>> We have bonded 10gbps NICS on each node, and I have tested bandwidth with >> iperf and found that it's what would be expected from this config. >> Individual brick performance seems ok, I've tested several bricks using >> dd and can write a 10GB files at 1.7GB/s. >> >> # dd if=/dev/zero of=/brick1/test/test.file bs=1M count=10000 >> 10000+0 records in >> 10000+0 records out >> 10485760000 bytes (10 GB, 9.8 GiB) copied, 6.20303 s, 1.7 GB/s >> >> Node 1: >> # glusterfs --version >> glusterfs 3.12.15 >> >> Node 2: >> # glusterfs --version >> glusterfs 3.12.14 >> >> Arbiter: >> # glusterfs --version >> glusterfs 3.12.14 >> >> Here is our gluster volume status: >> >> # gluster volume status >> Status of volume: gvAA01 >> Gluster process TCP Port RDMA Port Online >> Pid >> >> ------------------------------------------------------------------------------ >> Brick 01-B:/brick1/gvAA01/brick 49152 0 Y 7219 >> Brick 02-B:/brick1/gvAA01/brick 49152 0 Y 21845 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck1 49152 0 Y >> 6931 >> Brick 01-B:/brick2/gvAA01/brick 49153 0 Y 7239 >> Brick 02-B:/brick2/gvAA01/brick 49153 0 Y 9916 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck2 49153 0 Y >> 6939 >> Brick 01-B:/brick3/gvAA01/brick 49154 0 Y 7235 >> Brick 02-B:/brick3/gvAA01/brick 49154 0 Y 21858 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck3 49154 0 Y >> 6947 >> Brick 01-B:/brick4/gvAA01/brick 49155 0 Y 31840 >> Brick 02-B:/brick4/gvAA01/brick 49155 0 Y 9933 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck4 49155 0 Y >> 6956 >> Brick 01-B:/brick5/gvAA01/brick 49156 0 Y 7233 >> Brick 02-B:/brick5/gvAA01/brick 49156 0 Y 9942 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck5 49156 0 Y >> 6964 >> Brick 01-B:/brick6/gvAA01/brick 49157 0 Y 7234 >> Brick 02-B:/brick6/gvAA01/brick 49157 0 Y 9952 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck6 49157 0 Y >> 6974 >> Brick 01-B:/brick7/gvAA01/brick 49158 0 Y 7248 >> Brick 02-B:/brick7/gvAA01/brick 49158 0 Y 9960 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck7 49158 0 Y >> 6984 >> Brick 01-B:/brick8/gvAA01/brick 49159 0 Y 7253 >> Brick 02-B:/brick8/gvAA01/brick 49159 0 Y 9970 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck8 49159 0 Y >> 6993 >> Brick 01-B:/brick9/gvAA01/brick 49160 0 Y 7245 >> Brick 02-B:/brick9/gvAA01/brick 49160 0 Y 9984 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck9 49160 0 Y >> 7001 >> NFS Server on localhost 2049 0 Y >> 17276 >> Self-heal Daemon on localhost N/A N/A Y >> 25245 >> NFS Server on 02-B 2049 0 Y 9089 >> Self-heal Daemon on 02-B N/A N/A Y 17838 >> NFS Server on 00-a 2049 0 Y 15660 >> Self-heal Daemon on 00-a N/A N/A Y 16218 >> >> Task Status of Volume gvAA01 >> >> ------------------------------------------------------------------------------ >> There are no active volume tasks >> >> And gluster volume info: >> >> # gluster volume info >> >> Volume Name: gvAA01 >> Type: Distributed-Replicate >> Volume ID: ca4ece2c-13fe-414b-856c-2878196d6118 >> Status: Started >> Snapshot Count: 0 >> Number of Bricks: 9 x (2 + 1) = 27 >> Transport-type: tcp >> Bricks: >> Brick1: 01-B:/brick1/gvAA01/brick >> Brick2: 02-B:/brick1/gvAA01/brick >> Brick3: 00-A:/arbiterAA01/gvAA01/brick1 (arbiter) >> Brick4: 01-B:/brick2/gvAA01/brick >> Brick5: 02-B:/brick2/gvAA01/brick >> Brick6: 00-A:/arbiterAA01/gvAA01/brick2 (arbiter) >> Brick7: 01-B:/brick3/gvAA01/brick >> Brick8: 02-B:/brick3/gvAA01/brick >> Brick9: 00-A:/arbiterAA01/gvAA01/brick3 (arbiter) >> Brick10: 01-B:/brick4/gvAA01/brick >> Brick11: 02-B:/brick4/gvAA01/brick >> Brick12: 00-A:/arbiterAA01/gvAA01/brick4 (arbiter) >> Brick13: 
01-B:/brick5/gvAA01/brick >> Brick14: 02-B:/brick5/gvAA01/brick >> Brick15: 00-A:/arbiterAA01/gvAA01/brick5 (arbiter) >> Brick16: 01-B:/brick6/gvAA01/brick >> Brick17: 02-B:/brick6/gvAA01/brick >> Brick18: 00-A:/arbiterAA01/gvAA01/brick6 (arbiter) >> Brick19: 01-B:/brick7/gvAA01/brick >> Brick20: 02-B:/brick7/gvAA01/brick >> Brick21: 00-A:/arbiterAA01/gvAA01/brick7 (arbiter) >> Brick22: 01-B:/brick8/gvAA01/brick >> Brick23: 02-B:/brick8/gvAA01/brick >> Brick24: 00-A:/arbiterAA01/gvAA01/brick8 (arbiter) >> Brick25: 01-B:/brick9/gvAA01/brick >> Brick26: 02-B:/brick9/gvAA01/brick >> Brick27: 00-A:/arbiterAA01/gvAA01/brick9 (arbiter) >> Options Reconfigured: >> cluster.shd-max-threads: 4 >> performance.least-prio-threads: 16 >> cluster.readdir-optimize: on >> performance.quick-read: off >> performance.stat-prefetch: off >> cluster.data-self-heal: on >> cluster.lookup-unhashed: auto >> cluster.lookup-optimize: on >> cluster.favorite-child-policy: mtime >> server.allow-insecure: on >> transport.address-family: inet >> client.bind-insecure: on >> cluster.entry-self-heal: off >> cluster.metadata-self-heal: off >> performance.md-cache-timeout: 600 >> cluster.self-heal-daemon: enable >> performance.readdir-ahead: on >> diagnostics.brick-log-level: INFO >> nfs.disable: off >> >> Thank you for any assistance. >> >> - Patrick >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From budic at onholyground.com Sat Apr 20 17:22:06 2019 From: budic at onholyground.com (Darrell Budic) Date: Sat, 20 Apr 2019 12:22:06 -0500 Subject: [Gluster-users] Extremely slow cluster performance In-Reply-To: References: <93FC9B39-2E8C-4579-8C9D-DEF1A28B7384@onholyground.com> Message-ID: Patrick, Sounds like progress. Be aware that gluster is expected to max out the CPUs on at least one of your servers while healing. This is normal and won?t adversely affect overall performance (any more than having bricks in need of healing, at any rate) unless you?re overdoing it. shd threads <= 4 should not do that on your hardware. Other tunings may have also increased overall performance, so you may see higher CPU than previously anyway. I?d recommend upping those thread counts and letting it heal as fast as possible, especially if these are dedicated Gluster storage servers (Ie: not also running VMs, etc). You should see ?normal? CPU use one heals are completed. I see ~15-30% overall normally, 95-98% while healing (x my 20 cores). It?s also likely to be different between your servers, in a pure replica, one tends to max and one tends to be a little higher, in a distributed-replica, I?d expect more than one to run harder while healing. Keep the differences between doing an ls on a brick and doing an ls on a gluster mount in mind. When you do a ls on a gluster volume, it isn?t just doing a ls on one brick, it?s effectively doing it on ALL of your bricks, and they all have to return data before the ls succeeds. In a distributed volume, it?s figuring out where on each volume things live and getting the stat() from each to assemble the whole thing. And if things are in need of healing, it will take even longer to decide which version is current and use it (shd triggers a heal anytime it encounters this). Any of these things being slow slows down the overall response. 
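One way to keep an eye on heal progress without walking the full heal-info listing (which, as becomes clear later in the thread, can take hours at this scale) is the per-brick pending-entry count, assuming the statistics sub-command is available in this 3.12 build:

# gluster volume heal gvAA01 statistics heal-count

That prints how many entries each brick still has queued for self-heal, so the numbers can be watched trending down as the daemons catch up, without the cost of enumerating every path.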
At this point, I?d get some sleep too, and let your cluster heal while you do. I?d really want it fully healed before I did any updates anyway, so let it use CPU and get itself sorted out. Expect it to do a round of healing after you upgrade each machine too, this is normal so don?t let the CPU spike surprise you, It?s just catching up from the downtime incurred by the update and/or reboot if you did one. That reminds me, check your gluster cluster.op-version and cluster.max-op-version (gluster vol get all all | grep op-version). If op-version isn?t at the max-op-verison, set it to it so you?re taking advantage of the latest features available to your version. -Darrell > On Apr 20, 2019, at 11:54 AM, Patrick Rennie wrote: > > Hi Darrell, > > Thanks again for your advice, I've applied the acltype=posixacl on my zpools and I think that has reduced some of the noise from my brick logs. > I also bumped up some of the thread counts you suggested but my CPU load skyrocketed, so I dropped it back down to something slightly lower, but still higher than it was before, and will see how that goes for a while. > > Although low space is a definite issue, if I run an ls anywhere on my bricks directly it's instant, <1 second, and still takes several minutes via gluster, so there is still a problem in my gluster configuration somewhere. We don't have any snapshots, but I am trying to work out if any data on there is safe to delete, or if there is any way I can safely find and delete data which has been removed directly from the bricks in the past. I also have lz4 compression already enabled on each zpool which does help a bit, we get between 1.05 and 1.08x compression on this data. > I've tried to go through each client and checked it's cluster mount logs and also my brick logs and looking for errors, so far nothing is jumping out at me, but there are some warnings and errors here and there, I am trying to work out what they mean. > > It's already 1 am here and unfortunately, I'm still awake working on this issue, but I think that I will have to leave the version upgrades until tomorrow. > > Thanks again for your advice so far. If anyone has any ideas on where I can look for errors other than brick logs or the cluster mount logs to help resolve this issue, it would be much appreciated. > > Cheers, > > - Patrick > > On Sat, Apr 20, 2019 at 11:57 PM Darrell Budic > wrote: > See inline: > >> On Apr 20, 2019, at 10:09 AM, Patrick Rennie > wrote: >> >> Hi Darrell, >> >> Thanks for your reply, this issue seems to be getting worse over the last few days, really has me tearing my hair out. I will do as you have suggested and get started on upgrading from 3.12.14 to 3.12.15. >> I've checked the zfs properties and all bricks have "xattr=sa" set, but none of them has "acltype=posixacl" set, currently the acltype property shows "off", if I make these changes will it apply retroactively to the existing data? I'm unfamiliar with what this will change so I may need to look into that before I proceed. > > It is safe to apply that now, any new set/get calls will then use it if new posixacls exist, and use older if not. ZFS is good that way. It should clear up your posix_acl and posix errors over time. > >> I understand performance is going to slow down as the bricks get full, I am currently trying to free space and migrate data to some newer storage, I have fresh several hundred TB storage I just setup recently but with these performance issues it's really slow. 
I also believe there is significant data which has been deleted directly from the bricks in the past, so if I can reclaim this space in a safe manner then I will have at least around 10-15% free space. > > Full ZFS volumes will have a much larger impact on performance than you?d think, I?d prioritize this. If you have been taking zfs snapshots, consider deleting them to get the overall volume free space back up. And just to be sure it?s been said, delete from within the mounted volumes, don?t delete directly from the bricks (gluster will just try and heal it later, compounding your issues). Does not apply to deleting other data from the ZFS volume if it?s not part of the brick directory, of course. > >> These servers have dual 8 core Xeon (E5-2620v4) and 512GB of RAM so generally they have plenty of resources available, currently only using around 330/512GB of memory. >> >> I will look into what your suggested settings will change, and then will probably go ahead with your recommendations, for our specs as stated above, what would you suggest for performance.io -thread-count ? > > I run single 2630v4s on my servers, which have a smaller storage footprint than yours. I?d go with 32 for performance.io -thread-count. I?d try 4 for the shd thread settings on that gear. Your memory use sounds fine, so no worries there. > >> Our workload is nothing too extreme, we have a few VMs which write backup data to this storage nightly for our clients, our VMs don't live on this cluster, but just write to it. > > If they are writing compressible data, you?ll get immediate benefit by setting compression=lz4 on your ZFS volumes. It won?t help any old data, of course, but it will compress new data going forward. This is another one that?s safe to enable on the fly. > >> I've been going through all of the logs I can, below are some slightly sanitized errors I've come across, but I'm not sure what to make of them. The main error I am seeing is the first one below, across several of my bricks, but possibly only for specific folders on the cluster, I'm not 100% about that yet though. 
>> >> [2019-04-20 05:56:59.512649] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not supported] >> [2019-04-20 05:59:06.084333] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not supported] >> [2019-04-20 05:59:43.289030] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not supported] >> [2019-04-20 05:59:50.582257] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not supported] >> [2019-04-20 06:01:42.501701] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not supported] >> [2019-04-20 06:01:51.665354] W [posix.c:4929:posix_getxattr] 0-gvAA01-posix: Extended attributes not supported (try remounting brick with 'user_xattr' flag) >> >> >> [2019-04-20 13:12:36.131856] E [MSGID: 113002] [posix-helpers.c:893:posix_gfid_set] 0-gvAA01-posix: gfid is null for /xxxxxxxxxxxxxxxxxxxx [Invalid argument] >> [2019-04-20 13:12:36.131959] E [MSGID: 113002] [posix.c:362:posix_lookup] 0-gvAA01-posix: buf->ia_gfid is null for /brick2/xxxxxxxxxxxxxxxxxxxx_62906_tmp [No data available] >> [2019-04-20 13:12:36.132016] E [MSGID: 115050] [server-rpc-fops.c:175:server_lookup_cbk] 0-gvAA01-server: 24274759: LOOKUP /xxxxxxxxxxxxxxxxxxxx (a7c9b4a0-b7ee-4d01-a79e-576013c8ac87/Cloud Backup_clone1.vbm_62906_tmp), client: 00-A-16217-2019/04/08-21:23:03:692424-gvAA01-client-4-0-3, error-xlator: gvAA01-posix [No data available] >> [2019-04-20 13:12:38.093719] E [MSGID: 115050] [server-rpc-fops.c:175:server_lookup_cbk] 0-gvAA01-server: 24276491: LOOKUP /xxxxxxxxxxxxxxxxxxxx (a7c9b4a0-b7ee-4d01-a79e-576013c8ac87/Cloud Backup_clone1.vbm_62906_tmp), client: 00-A-16217-2019/04/08-21:23:03:692424-gvAA01-client-4-0-3, error-xlator: gvAA01-posix [No data available] >> [2019-04-20 13:12:38.093660] E [MSGID: 113002] [posix-helpers.c:893:posix_gfid_set] 0-gvAA01-posix: gfid is null for /xxxxxxxxxxxxxxxxxxxx [Invalid argument] >> [2019-04-20 13:12:38.093696] E [MSGID: 113002] [posix.c:362:posix_lookup] 0-gvAA01-posix: buf->ia_gfid is null for /brick2/xxxxxxxxxxxxxxxxxxxx [No data available] >> > > posixacls should clear those up, as mentioned. 
> >> >> [2019-04-20 14:25:59.654576] E [inodelk.c:404:__inode_unlock_lock] 0-gvAA01-locks: Matching lock not found for unlock 0-9223372036854775807, by 980fdbbd367f0000 on 0x7fc4f0161440 >> [2019-04-20 14:25:59.654668] E [MSGID: 115053] [server-rpc-fops.c:295:server_inodelk_cbk] 0-gvAA01-server: 6092928: INODELK /xxxxxxxxxxxxxxxxxxxx.cdr$ (25b14631-a179-4274-8243-6e272d4f2ad8), client: cb-per-worker18-53637-2019/04/19-14:25:37:927673-gvAA01-client-1-0-4, error-xlator: gvAA01-locks [Invalid argument] >> >> >> [2019-04-20 13:35:07.495495] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x247c644, Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport (tcp.gvAA01-server) >> [2019-04-20 13:35:07.495619] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/debug/io-stats.so(+0x1696a) [0x7ff4ae6f796a] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/protocol/server.so(+0x2d6e8) [0x7ff4ae2a96e8] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/protocol/server.so(+0x928d) [0x7ff4ae28528d] ) 0-: Reply submission failed >> > > Fix the posix acls and see if these clear up over time as well, I?m unclear on what the overall effect of running without the posix acls will be to total gluster health. Your biggest problem sounds like you need to free up space on the volumes and get the overall volume health back up to par and see if that doesn?t resolve the symptoms you?re seeing. > > >> >> Thank you again for your assistance. It is greatly appreciated. >> >> - Patrick >> >> >> >> On Sat, Apr 20, 2019 at 10:50 PM Darrell Budic > wrote: >> Patrick, >> >> I would definitely upgrade your two nodes from 3.12.14 to 3.12.15. You also mention ZFS, and that error you show makes me think you need to check to be sure you have ?xattr=sa? and ?acltype=posixacl? set on your ZFS volumes. >> >> You also observed your bricks are crossing the 95% full line, ZFS performance will degrade significantly the closer you get to full. In my experience, this starts somewhere between 10% and 5% free space remaining, so you?re in that realm. >> >> How?s your free memory on the servers doing? Do you have your zfs arc cache limited to something less than all the RAM? It shares pretty well, but I?ve encountered situations where other things won?t try and take ram back properly if they think it?s in use, so ZFS never gets the opportunity to give it up. >> >> Since your volume is a disperse-replica, you might try tuning disperse.shd-max-threads, default is 1, I?d try it at 2, 4, or even more if the CPUs are beefy enough. And setting server.event-threads to 4 and client.event-threads to 8 has proven helpful in many cases. After you get upgraded to 3.12.15, enabling performance.stat-prefetch may help as well. I don?t know if it matters, but I?d also recommend resetting performance.least-prio-threads to the default of 1 (or try 2 or 4) and/or also setting performance.io -thread-count to 32 if those have beefy CPUs. >> >> Beyond those general ideas, more info about your hardware (CPU and RAM) and workload (VMs, direct storage for web servers or enders, etc) may net you some more ideas. Then you?re going to have to do more digging into brick logs looking for errors and/or warnings to see what?s going on. 
>> >> -Darrell >> >> >>> On Apr 20, 2019, at 8:22 AM, Patrick Rennie > wrote: >>> >>> Hello Gluster Users, >>> >>> I am hoping someone can help me with resolving an ongoing issue I've been having, I'm new to mailing lists so forgive me if I have gotten anything wrong. We have noticed our performance deteriorating over the last few weeks, easily measured by trying to do an ls on one of our top-level folders, and timing it, which usually would take 2-5 seconds, and now takes up to 20 minutes, which obviously renders our cluster basically unusable. This has been intermittent in the past but is now almost constant and I am not sure how to work out the exact cause. We have noticed some errors in the brick logs, and have noticed that if we kill the right brick process, performance instantly returns back to normal, this is not always the same brick, but it indicates to me something in the brick processes or background tasks may be causing extreme latency. Due to this ability to fix it by killing the right brick process off, I think it's a specific file, or folder, or operation which may be hanging and causing the increased latency, but I am not sure how to work it out. One last thing to add is that our bricks are getting quite full (~95% full), we are trying to migrate data off to new storage but that is going slowly, not helped by this issue. I am currently trying to run a full heal as there appear to be many files needing healing, and I have all brick processes running so they have an opportunity to heal, but this means performance is very poor. It currently takes over 15-20 minutes to do an ls of one of our top-level folders, which just contains 60-80 other folders, this should take 2-5 seconds. This is all being checked by FUSE mount locally on the storage node itself, but it is the same for other clients and VMs accessing the cluster. Initially, it seemed our NFS mounts were not affected and operated at normal speed, but testing over the last day has shown that our NFS clients are also extremely slow, so it doesn't seem specific to FUSE as I first thought it might be. >>> >>> I am not sure how to proceed from here, I am fairly new to gluster having inherited this setup from my predecessor and trying to keep it going. I have included some info below to try and help with diagnosis, please let me know if any further info would be helpful. I would really appreciate any advice on what I could try to work out the cause. Thank you in advance for reading this, and any suggestions you might be able to offer. >>> >>> - Patrick >>> >>> This is an example of the main error I see in our brick logs, there have been others, I can post them when I see them again too: >>> [2019-04-20 04:54:43.055680] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick1/ library: system.posix_acl_default [Operation not supported] >>> [2019-04-20 05:01:29.476313] W [posix.c:4929:posix_getxattr] 0-gvAA01-posix: Extended attributes not supported (try remounting brick with 'user_xattr' flag) >>> >>> Our setup consists of 2 storage nodes and an arbiter node. I have noticed our nodes are on slightly different versions, I'm not sure if this could be an issue. We have 9 bricks on each node, made up of ZFS RAIDZ2 pools - total capacity is around 560TB. >>> We have bonded 10gbps NICS on each node, and I have tested bandwidth with iperf and found that it's what would be expected from this config. 
>>> Individual brick performance seems ok, I've tested several bricks using dd and can write a 10GB files at 1.7GB/s. >>> >>> # dd if=/dev/zero of=/brick1/test/test.file bs=1M count=10000 >>> 10000+0 records in >>> 10000+0 records out >>> 10485760000 bytes (10 GB, 9.8 GiB) copied, 6.20303 s, 1.7 GB/s >>> >>> Node 1: >>> # glusterfs --version >>> glusterfs 3.12.15 >>> >>> Node 2: >>> # glusterfs --version >>> glusterfs 3.12.14 >>> >>> Arbiter: >>> # glusterfs --version >>> glusterfs 3.12.14 >>> >>> Here is our gluster volume status: >>> >>> # gluster volume status >>> Status of volume: gvAA01 >>> Gluster process TCP Port RDMA Port Online Pid >>> ------------------------------------------------------------------------------ >>> Brick 01-B:/brick1/gvAA01/brick 49152 0 Y 7219 >>> Brick 02-B:/brick1/gvAA01/brick 49152 0 Y 21845 >>> Brick 00-A:/arbiterAA01/gvAA01/bri >>> ck1 49152 0 Y 6931 >>> Brick 01-B:/brick2/gvAA01/brick 49153 0 Y 7239 >>> Brick 02-B:/brick2/gvAA01/brick 49153 0 Y 9916 >>> Brick 00-A:/arbiterAA01/gvAA01/bri >>> ck2 49153 0 Y 6939 >>> Brick 01-B:/brick3/gvAA01/brick 49154 0 Y 7235 >>> Brick 02-B:/brick3/gvAA01/brick 49154 0 Y 21858 >>> Brick 00-A:/arbiterAA01/gvAA01/bri >>> ck3 49154 0 Y 6947 >>> Brick 01-B:/brick4/gvAA01/brick 49155 0 Y 31840 >>> Brick 02-B:/brick4/gvAA01/brick 49155 0 Y 9933 >>> Brick 00-A:/arbiterAA01/gvAA01/bri >>> ck4 49155 0 Y 6956 >>> Brick 01-B:/brick5/gvAA01/brick 49156 0 Y 7233 >>> Brick 02-B:/brick5/gvAA01/brick 49156 0 Y 9942 >>> Brick 00-A:/arbiterAA01/gvAA01/bri >>> ck5 49156 0 Y 6964 >>> Brick 01-B:/brick6/gvAA01/brick 49157 0 Y 7234 >>> Brick 02-B:/brick6/gvAA01/brick 49157 0 Y 9952 >>> Brick 00-A:/arbiterAA01/gvAA01/bri >>> ck6 49157 0 Y 6974 >>> Brick 01-B:/brick7/gvAA01/brick 49158 0 Y 7248 >>> Brick 02-B:/brick7/gvAA01/brick 49158 0 Y 9960 >>> Brick 00-A:/arbiterAA01/gvAA01/bri >>> ck7 49158 0 Y 6984 >>> Brick 01-B:/brick8/gvAA01/brick 49159 0 Y 7253 >>> Brick 02-B:/brick8/gvAA01/brick 49159 0 Y 9970 >>> Brick 00-A:/arbiterAA01/gvAA01/bri >>> ck8 49159 0 Y 6993 >>> Brick 01-B:/brick9/gvAA01/brick 49160 0 Y 7245 >>> Brick 02-B:/brick9/gvAA01/brick 49160 0 Y 9984 >>> Brick 00-A:/arbiterAA01/gvAA01/bri >>> ck9 49160 0 Y 7001 >>> NFS Server on localhost 2049 0 Y 17276 >>> Self-heal Daemon on localhost N/A N/A Y 25245 >>> NFS Server on 02-B 2049 0 Y 9089 >>> Self-heal Daemon on 02-B N/A N/A Y 17838 >>> NFS Server on 00-a 2049 0 Y 15660 >>> Self-heal Daemon on 00-a N/A N/A Y 16218 >>> >>> Task Status of Volume gvAA01 >>> ------------------------------------------------------------------------------ >>> There are no active volume tasks >>> >>> And gluster volume info: >>> >>> # gluster volume info >>> >>> Volume Name: gvAA01 >>> Type: Distributed-Replicate >>> Volume ID: ca4ece2c-13fe-414b-856c-2878196d6118 >>> Status: Started >>> Snapshot Count: 0 >>> Number of Bricks: 9 x (2 + 1) = 27 >>> Transport-type: tcp >>> Bricks: >>> Brick1: 01-B:/brick1/gvAA01/brick >>> Brick2: 02-B:/brick1/gvAA01/brick >>> Brick3: 00-A:/arbiterAA01/gvAA01/brick1 (arbiter) >>> Brick4: 01-B:/brick2/gvAA01/brick >>> Brick5: 02-B:/brick2/gvAA01/brick >>> Brick6: 00-A:/arbiterAA01/gvAA01/brick2 (arbiter) >>> Brick7: 01-B:/brick3/gvAA01/brick >>> Brick8: 02-B:/brick3/gvAA01/brick >>> Brick9: 00-A:/arbiterAA01/gvAA01/brick3 (arbiter) >>> Brick10: 01-B:/brick4/gvAA01/brick >>> Brick11: 02-B:/brick4/gvAA01/brick >>> Brick12: 00-A:/arbiterAA01/gvAA01/brick4 (arbiter) >>> Brick13: 01-B:/brick5/gvAA01/brick >>> Brick14: 02-B:/brick5/gvAA01/brick >>> Brick15: 
00-A:/arbiterAA01/gvAA01/brick5 (arbiter) >>> Brick16: 01-B:/brick6/gvAA01/brick >>> Brick17: 02-B:/brick6/gvAA01/brick >>> Brick18: 00-A:/arbiterAA01/gvAA01/brick6 (arbiter) >>> Brick19: 01-B:/brick7/gvAA01/brick >>> Brick20: 02-B:/brick7/gvAA01/brick >>> Brick21: 00-A:/arbiterAA01/gvAA01/brick7 (arbiter) >>> Brick22: 01-B:/brick8/gvAA01/brick >>> Brick23: 02-B:/brick8/gvAA01/brick >>> Brick24: 00-A:/arbiterAA01/gvAA01/brick8 (arbiter) >>> Brick25: 01-B:/brick9/gvAA01/brick >>> Brick26: 02-B:/brick9/gvAA01/brick >>> Brick27: 00-A:/arbiterAA01/gvAA01/brick9 (arbiter) >>> Options Reconfigured: >>> cluster.shd-max-threads: 4 >>> performance.least-prio-threads: 16 >>> cluster.readdir-optimize: on >>> performance.quick-read: off >>> performance.stat-prefetch: off >>> cluster.data-self-heal: on >>> cluster.lookup-unhashed: auto >>> cluster.lookup-optimize: on >>> cluster.favorite-child-policy: mtime >>> server.allow-insecure: on >>> transport.address-family: inet >>> client.bind-insecure: on >>> cluster.entry-self-heal: off >>> cluster.metadata-self-heal: off >>> performance.md-cache-timeout: 600 >>> cluster.self-heal-daemon: enable >>> performance.readdir-ahead: on >>> diagnostics.brick-log-level: INFO >>> nfs.disable: off >>> >>> Thank you for any assistance. >>> >>> - Patrick >>> _______________________________________________ >>> Gluster-users mailing list >>> Gluster-users at gluster.org >>> https://lists.gluster.org/mailman/listinfo/gluster-users > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rgowdapp at redhat.com Sun Apr 21 02:33:39 2019 From: rgowdapp at redhat.com (Raghavendra Gowdappa) Date: Sun, 21 Apr 2019 08:03:39 +0530 Subject: [Gluster-users] Settings for VM hosting In-Reply-To: <20190419071816.GH25080@althea.ulrar.net> References: <20190418072722.GF25080@althea.ulrar.net> <20190419071816.GH25080@althea.ulrar.net> Message-ID: On Fri, Apr 19, 2019 at 12:48 PM wrote: > On Fri, Apr 19, 2019 at 06:47:49AM +0530, Krutika Dhananjay wrote: > > Looks good mostly. > > You can also turn on performance.stat-prefetch, and also set > > Ah the corruption bug has been fixed, I missed that. Great ! > Do you have details or bug report of the corruption you saw earlier? Just want to understand what's the exact fix that helped you. > > client.event-threads and server.event-threads to 4. > > I didn't realize that would also apply to libgfapi ? > Good to know, thanks. > > > And if your bricks are on ssds, then you could also enable > > performance.client-io-threads. > > I'm surprised by that, the doc says "This feature is not recommended for > distributed, replicated or distributed-replicated volumes." > Since this volume is just a replica 3, shouldn't this stay off ? > The disks are all nvme, which I assume would count as ssd. > > > And if your bricks and hypervisors are on same set of machines > > (hyperconverged), > > then you can turn off cluster.choose-local and see if it helps read > > performance. > > Thanks, we'll give those a try ! > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 
From patrickmrennie at gmail.com Sun Apr 21 07:50:08 2019
From: patrickmrennie at gmail.com (Patrick Rennie)
Date: Sun, 21 Apr 2019 15:50:08 +0800
Subject: [Gluster-users] Extremely slow cluster performance
In-Reply-To: 
References: <93FC9B39-2E8C-4579-8C9D-DEF1A28B7384@onholyground.com>
Message-ID: 

Hi Darrell,

Thanks again for your advice, I've left it for a while but unfortunately it's still just as slow and causing more problems for our operations now. I will need to try to take some steps to at least bring performance back to normal while continuing to investigate the issue longer term. I can definitely see one node with heavier CPU than the other, almost double, which I am OK with, but I think the heal process is going to take forever; checking "gluster volume heal info" shows thousands and thousands of files which may need healing, and I have no idea how many in total as the command is still running after hours, so I am not sure what has gone so wrong to cause this.

I've checked cluster.op-version and cluster.max-op-version and it looks like I'm on the latest version there.

I have no idea how long the healing is going to take on this cluster, we have around 560TB of data on here, but I don't think I can wait that long to try and restore performance to normal.

Can anyone think of anything else I can try in the meantime to work out what's causing the extreme latency?

I've been going through the cluster client logs of some of our VMs, and on some of our FTP servers I found this in the cluster mount log, but I am not seeing it on any of our other servers, just our FTP servers.

[2019-04-21 07:16:19.925388] E [MSGID: 101046] [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null
[2019-04-21 07:19:43.413834] W [MSGID: 114031] [client-rpc-fops.c:2203:client3_3_setattr_cbk] 0-gvAA01-client-19: remote operation failed [No such file or directory]
[2019-04-21 07:19:43.414153] W [MSGID: 114031] [client-rpc-fops.c:2203:client3_3_setattr_cbk] 0-gvAA01-client-20: remote operation failed [No such file or directory]
[2019-04-21 07:23:33.154717] E [MSGID: 101046] [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null
[2019-04-21 07:33:24.943913] E [MSGID: 101046] [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null

Any ideas what this could mean? I am basically just grasping at straws here.

I am going to hold off on the version upgrade until I know there are no files which need healing, which could be a while; from some reading I've done there shouldn't be any issues with this as both are on v3.12.x.

I've freed up a small amount of space, but I still need to work on this further.

I've read of a command "find .glusterfs -type f -links -2 -exec rm {} \;" which could be run on each brick and would potentially clean up any files which were deleted straight from the bricks, but not via the client. I have a feeling this could help me free up about 5-10TB per brick from what I've been told about the history of this cluster. Can anyone confirm if this is actually safe to run?

At this stage, I'm open to any suggestions as to how to proceed, thanks again for any advice.

Cheers,

- Patrick

On Sun, Apr 21, 2019 at 1:22 AM Darrell Budic wrote:

> Patrick,
>
> Sounds like progress. Be aware that gluster is expected to max out the
> CPUs on at least one of your servers while healing. This is normal and
> won't adversely affect overall performance (any more than having bricks in
> need of healing, at any rate) unless you're overdoing it. 
shd threads <= 4 > should not do that on your hardware. Other tunings may have also increased > overall performance, so you may see higher CPU than previously anyway. I?d > recommend upping those thread counts and letting it heal as fast as > possible, especially if these are dedicated Gluster storage servers (Ie: > not also running VMs, etc). You should see ?normal? CPU use one heals are > completed. I see ~15-30% overall normally, 95-98% while healing (x my 20 > cores). It?s also likely to be different between your servers, in a pure > replica, one tends to max and one tends to be a little higher, in a > distributed-replica, I?d expect more than one to run harder while healing. > > Keep the differences between doing an ls on a brick and doing an ls on a > gluster mount in mind. When you do a ls on a gluster volume, it isn?t just > doing a ls on one brick, it?s effectively doing it on ALL of your bricks, > and they all have to return data before the ls succeeds. In a distributed > volume, it?s figuring out where on each volume things live and getting the > stat() from each to assemble the whole thing. And if things are in need of > healing, it will take even longer to decide which version is current and > use it (shd triggers a heal anytime it encounters this). Any of these > things being slow slows down the overall response. > > At this point, I?d get some sleep too, and let your cluster heal while you > do. I?d really want it fully healed before I did any updates anyway, so let > it use CPU and get itself sorted out. Expect it to do a round of healing > after you upgrade each machine too, this is normal so don?t let the CPU > spike surprise you, It?s just catching up from the downtime incurred by the > update and/or reboot if you did one. > > That reminds me, check your gluster cluster.op-version and > cluster.max-op-version (gluster vol get all all | grep op-version). If > op-version isn?t at the max-op-verison, set it to it so you?re taking > advantage of the latest features available to your version. > > -Darrell > > On Apr 20, 2019, at 11:54 AM, Patrick Rennie > wrote: > > Hi Darrell, > > Thanks again for your advice, I've applied the acltype=posixacl on my > zpools and I think that has reduced some of the noise from my brick logs. > I also bumped up some of the thread counts you suggested but my CPU load > skyrocketed, so I dropped it back down to something slightly lower, but > still higher than it was before, and will see how that goes for a while. > > Although low space is a definite issue, if I run an ls anywhere on my > bricks directly it's instant, <1 second, and still takes several minutes > via gluster, so there is still a problem in my gluster configuration > somewhere. We don't have any snapshots, but I am trying to work out if any > data on there is safe to delete, or if there is any way I can safely find > and delete data which has been removed directly from the bricks in the > past. I also have lz4 compression already enabled on each zpool which does > help a bit, we get between 1.05 and 1.08x compression on this data. > I've tried to go through each client and checked it's cluster mount logs > and also my brick logs and looking for errors, so far nothing is jumping > out at me, but there are some warnings and errors here and there, I am > trying to work out what they mean. > > It's already 1 am here and unfortunately, I'm still awake working on this > issue, but I think that I will have to leave the version upgrades until > tomorrow. 
> > Thanks again for your advice so far. If anyone has any ideas on where I > can look for errors other than brick logs or the cluster mount logs to help > resolve this issue, it would be much appreciated. > > Cheers, > > - Patrick > > On Sat, Apr 20, 2019 at 11:57 PM Darrell Budic > wrote: > >> See inline: >> >> On Apr 20, 2019, at 10:09 AM, Patrick Rennie >> wrote: >> >> Hi Darrell, >> >> Thanks for your reply, this issue seems to be getting worse over the last >> few days, really has me tearing my hair out. I will do as you have >> suggested and get started on upgrading from 3.12.14 to 3.12.15. >> I've checked the zfs properties and all bricks have "xattr=sa" set, but >> none of them has "acltype=posixacl" set, currently the acltype property >> shows "off", if I make these changes will it apply retroactively to the >> existing data? I'm unfamiliar with what this will change so I may need to >> look into that before I proceed. >> >> >> It is safe to apply that now, any new set/get calls will then use it if >> new posixacls exist, and use older if not. ZFS is good that way. It should >> clear up your posix_acl and posix errors over time. >> >> I understand performance is going to slow down as the bricks get full, I >> am currently trying to free space and migrate data to some newer storage, I >> have fresh several hundred TB storage I just setup recently but with these >> performance issues it's really slow. I also believe there is significant >> data which has been deleted directly from the bricks in the past, so if I >> can reclaim this space in a safe manner then I will have at least around >> 10-15% free space. >> >> >> Full ZFS volumes will have a much larger impact on performance than you?d >> think, I?d prioritize this. If you have been taking zfs snapshots, consider >> deleting them to get the overall volume free space back up. And just to be >> sure it?s been said, delete from within the mounted volumes, don?t delete >> directly from the bricks (gluster will just try and heal it later, >> compounding your issues). Does not apply to deleting other data from the >> ZFS volume if it?s not part of the brick directory, of course. >> >> These servers have dual 8 core Xeon (E5-2620v4) and 512GB of RAM so >> generally they have plenty of resources available, currently only using >> around 330/512GB of memory. >> >> I will look into what your suggested settings will change, and then will >> probably go ahead with your recommendations, for our specs as stated above, >> what would you suggest for performance.io-thread-count ? >> >> >> I run single 2630v4s on my servers, which have a smaller storage >> footprint than yours. I?d go with 32 for performance.io-thread-count. >> I?d try 4 for the shd thread settings on that gear. Your memory use sounds >> fine, so no worries there. >> >> Our workload is nothing too extreme, we have a few VMs which write backup >> data to this storage nightly for our clients, our VMs don't live on this >> cluster, but just write to it. >> >> >> If they are writing compressible data, you?ll get immediate benefit by >> setting compression=lz4 on your ZFS volumes. It won?t help any old data, of >> course, but it will compress new data going forward. This is another one >> that?s safe to enable on the fly. >> >> I've been going through all of the logs I can, below are some slightly >> sanitized errors I've come across, but I'm not sure what to make of them. 
>> The main error I am seeing is the first one below, across several of my >> bricks, but possibly only for specific folders on the cluster, I'm not 100% >> about that yet though. >> >> [2019-04-20 05:56:59.512649] E [MSGID: 113001] >> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >> supported] >> [2019-04-20 05:59:06.084333] E [MSGID: 113001] >> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >> supported] >> [2019-04-20 05:59:43.289030] E [MSGID: 113001] >> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >> supported] >> [2019-04-20 05:59:50.582257] E [MSGID: 113001] >> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >> supported] >> [2019-04-20 06:01:42.501701] E [MSGID: 113001] >> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >> supported] >> [2019-04-20 06:01:51.665354] W [posix.c:4929:posix_getxattr] >> 0-gvAA01-posix: Extended attributes not supported (try remounting brick >> with 'user_xattr' flag) >> >> >> [2019-04-20 13:12:36.131856] E [MSGID: 113002] >> [posix-helpers.c:893:posix_gfid_set] 0-gvAA01-posix: gfid is null for >> /xxxxxxxxxxxxxxxxxxxx [Invalid argument] >> [2019-04-20 13:12:36.131959] E [MSGID: 113002] [posix.c:362:posix_lookup] >> 0-gvAA01-posix: buf->ia_gfid is null for >> /brick2/xxxxxxxxxxxxxxxxxxxx_62906_tmp [No data available] >> [2019-04-20 13:12:36.132016] E [MSGID: 115050] >> [server-rpc-fops.c:175:server_lookup_cbk] 0-gvAA01-server: 24274759: LOOKUP >> /xxxxxxxxxxxxxxxxxxxx (a7c9b4a0-b7ee-4d01-a79e-576013c8ac87/Cloud >> Backup_clone1.vbm_62906_tmp), client: >> 00-A-16217-2019/04/08-21:23:03:692424-gvAA01-client-4-0-3, error-xlator: >> gvAA01-posix [No data available] >> [2019-04-20 13:12:38.093719] E [MSGID: 115050] >> [server-rpc-fops.c:175:server_lookup_cbk] 0-gvAA01-server: 24276491: LOOKUP >> /xxxxxxxxxxxxxxxxxxxx (a7c9b4a0-b7ee-4d01-a79e-576013c8ac87/Cloud >> Backup_clone1.vbm_62906_tmp), client: >> 00-A-16217-2019/04/08-21:23:03:692424-gvAA01-client-4-0-3, error-xlator: >> gvAA01-posix [No data available] >> [2019-04-20 13:12:38.093660] E [MSGID: 113002] >> [posix-helpers.c:893:posix_gfid_set] 0-gvAA01-posix: gfid is null for >> /xxxxxxxxxxxxxxxxxxxx [Invalid argument] >> [2019-04-20 13:12:38.093696] E [MSGID: 113002] [posix.c:362:posix_lookup] >> 0-gvAA01-posix: buf->ia_gfid is null for /brick2/xxxxxxxxxxxxxxxxxxxx [No >> data available] >> >> >> posixacls should clear those up, as mentioned. 
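If it helps to inspect one of the affected files directly on a brick while this is going on, the usual approach is to dump its extended attributes there; the path below is only an illustration, substitute a real file under one of the brick directories and run it as root on the storage node:

# getfattr -d -m . -e hex /brick2/gvAA01/brick/path/to/suspect.file

A file gluster has fully tracked will normally carry a trusted.gfid value; the "gfid is null" messages above suggest lookups are hitting entries where that xattr is missing.
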
>> >> >> [2019-04-20 14:25:59.654576] E [inodelk.c:404:__inode_unlock_lock] >> 0-gvAA01-locks: Matching lock not found for unlock 0-9223372036854775807, >> by 980fdbbd367f0000 on 0x7fc4f0161440 >> [2019-04-20 14:25:59.654668] E [MSGID: 115053] >> [server-rpc-fops.c:295:server_inodelk_cbk] 0-gvAA01-server: 6092928: >> INODELK /xxxxxxxxxxxxxxxxxxxx.cdr$ (25b14631-a179-4274-8243-6e272d4f2ad8), >> client: >> cb-per-worker18-53637-2019/04/19-14:25:37:927673-gvAA01-client-1-0-4, >> error-xlator: gvAA01-locks [Invalid argument] >> >> >> [2019-04-20 13:35:07.495495] E [rpcsvc.c:1364:rpcsvc_submit_generic] >> 0-rpc-service: failed to submit message (XID: 0x247c644, Program: GlusterFS >> 3.3, ProgVers: 330, Proc: 27) to rpc-transport (tcp.gvAA01-server) >> [2019-04-20 13:35:07.495619] E [server.c:195:server_submit_reply] >> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/debug/io-stats.so(+0x1696a) >> [0x7ff4ae6f796a] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/protocol/server.so(+0x2d6e8) >> [0x7ff4ae2a96e8] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/protocol/server.so(+0x928d) >> [0x7ff4ae28528d] ) 0-: Reply submission failed >> >> >> Fix the posix acls and see if these clear up over time as well, I?m >> unclear on what the overall effect of running without the posix acls will >> be to total gluster health. Your biggest problem sounds like you need to >> free up space on the volumes and get the overall volume health back up to >> par and see if that doesn?t resolve the symptoms you?re seeing. >> >> >> >> Thank you again for your assistance. It is greatly appreciated. >> >> - Patrick >> >> >> >> On Sat, Apr 20, 2019 at 10:50 PM Darrell Budic >> wrote: >> >>> Patrick, >>> >>> I would definitely upgrade your two nodes from 3.12.14 to 3.12.15. You >>> also mention ZFS, and that error you show makes me think you need to check >>> to be sure you have ?xattr=sa? and ?acltype=posixacl? set on your ZFS >>> volumes. >>> >>> You also observed your bricks are crossing the 95% full line, ZFS >>> performance will degrade significantly the closer you get to full. In my >>> experience, this starts somewhere between 10% and 5% free space remaining, >>> so you?re in that realm. >>> >>> How?s your free memory on the servers doing? Do you have your zfs arc >>> cache limited to something less than all the RAM? It shares pretty well, >>> but I?ve encountered situations where other things won?t try and take ram >>> back properly if they think it?s in use, so ZFS never gets the opportunity >>> to give it up. >>> >>> Since your volume is a disperse-replica, you might try tuning >>> disperse.shd-max-threads, default is 1, I?d try it at 2, 4, or even more if >>> the CPUs are beefy enough. And setting server.event-threads to 4 and >>> client.event-threads to 8 has proven helpful in many cases. After you get >>> upgraded to 3.12.15, enabling performance.stat-prefetch may help as well. I >>> don?t know if it matters, but I?d also recommend resetting >>> performance.least-prio-threads to the default of 1 (or try 2 or 4) and/or >>> also setting performance.io-thread-count to 32 if those have beefy CPUs. >>> >>> Beyond those general ideas, more info about your hardware (CPU and RAM) >>> and workload (VMs, direct storage for web servers or enders, etc) may net >>> you some more ideas. Then you?re going to have to do more digging into >>> brick logs looking for errors and/or warnings to see what?s going on. 
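For reference, the tunables suggested above are ordinary volume options; on this volume (a replicate/arbiter layout, so cluster.shd-max-threads rather than disperse.shd-max-threads) the changes would look roughly like the following, with the values taken from the suggestions above:

# gluster volume set gvAA01 cluster.shd-max-threads 4
# gluster volume set gvAA01 server.event-threads 4
# gluster volume set gvAA01 client.event-threads 8
# gluster volume set gvAA01 performance.io-thread-count 32
# gluster volume set gvAA01 performance.least-prio-threads 1

Per the volume info quoted elsewhere in this thread, cluster.shd-max-threads is already 4 and performance.least-prio-threads is currently 16; the event-thread and io-thread-count options are not set at all yet.
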
>>> >>> -Darrell >>> >>> >>> On Apr 20, 2019, at 8:22 AM, Patrick Rennie >>> wrote: >>> >>> Hello Gluster Users, >>> >>> I am hoping someone can help me with resolving an ongoing issue I've >>> been having, I'm new to mailing lists so forgive me if I have gotten >>> anything wrong. We have noticed our performance deteriorating over the last >>> few weeks, easily measured by trying to do an ls on one of our top-level >>> folders, and timing it, which usually would take 2-5 seconds, and now takes >>> up to 20 minutes, which obviously renders our cluster basically unusable. >>> This has been intermittent in the past but is now almost constant and I am >>> not sure how to work out the exact cause. We have noticed some errors in >>> the brick logs, and have noticed that if we kill the right brick process, >>> performance instantly returns back to normal, this is not always the same >>> brick, but it indicates to me something in the brick processes or >>> background tasks may be causing extreme latency. Due to this ability to fix >>> it by killing the right brick process off, I think it's a specific file, or >>> folder, or operation which may be hanging and causing the increased >>> latency, but I am not sure how to work it out. One last thing to add is >>> that our bricks are getting quite full (~95% full), we are trying to >>> migrate data off to new storage but that is going slowly, not helped by >>> this issue. I am currently trying to run a full heal as there appear to be >>> many files needing healing, and I have all brick processes running so they >>> have an opportunity to heal, but this means performance is very poor. It >>> currently takes over 15-20 minutes to do an ls of one of our top-level >>> folders, which just contains 60-80 other folders, this should take 2-5 >>> seconds. This is all being checked by FUSE mount locally on the storage >>> node itself, but it is the same for other clients and VMs accessing the >>> cluster. Initially, it seemed our NFS mounts were not affected and operated >>> at normal speed, but testing over the last day has shown that our NFS >>> clients are also extremely slow, so it doesn't seem specific to FUSE as I >>> first thought it might be. >>> >>> I am not sure how to proceed from here, I am fairly new to gluster >>> having inherited this setup from my predecessor and trying to keep it >>> going. I have included some info below to try and help with diagnosis, >>> please let me know if any further info would be helpful. I would really >>> appreciate any advice on what I could try to work out the cause. Thank you >>> in advance for reading this, and any suggestions you might be able to >>> offer. >>> >>> - Patrick >>> >>> This is an example of the main error I see in our brick logs, there have >>> been others, I can post them when I see them again too: >>> [2019-04-20 04:54:43.055680] E [MSGID: 113001] >>> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >>> /brick1/ library: system.posix_acl_default [Operation not >>> supported] >>> [2019-04-20 05:01:29.476313] W [posix.c:4929:posix_getxattr] >>> 0-gvAA01-posix: Extended attributes not supported (try remounting brick >>> with 'user_xattr' flag) >>> >>> Our setup consists of 2 storage nodes and an arbiter node. I have >>> noticed our nodes are on slightly different versions, I'm not sure if this >>> could be an issue. We have 9 bricks on each node, made up of ZFS RAIDZ2 >>> pools - total capacity is around 560TB. 
>>> We have bonded 10gbps NICS on each node, and I have tested bandwidth >>> with iperf and found that it's what would be expected from this config. >>> Individual brick performance seems ok, I've tested several bricks using >>> dd and can write a 10GB files at 1.7GB/s. >>> >>> # dd if=/dev/zero of=/brick1/test/test.file bs=1M count=10000 >>> 10000+0 records in >>> 10000+0 records out >>> 10485760000 bytes (10 GB, 9.8 GiB) copied, 6.20303 s, 1.7 GB/s >>> >>> Node 1: >>> # glusterfs --version >>> glusterfs 3.12.15 >>> >>> Node 2: >>> # glusterfs --version >>> glusterfs 3.12.14 >>> >>> Arbiter: >>> # glusterfs --version >>> glusterfs 3.12.14 >>> >>> Here is our gluster volume status: >>> >>> # gluster volume status >>> Status of volume: gvAA01 >>> Gluster process TCP Port RDMA Port Online >>> Pid >>> >>> ------------------------------------------------------------------------------ >>> Brick 01-B:/brick1/gvAA01/brick 49152 0 Y 7219 >>> Brick 02-B:/brick1/gvAA01/brick 49152 0 Y 21845 >>> Brick 00-A:/arbiterAA01/gvAA01/bri >>> ck1 49152 0 Y >>> 6931 >>> Brick 01-B:/brick2/gvAA01/brick 49153 0 Y 7239 >>> Brick 02-B:/brick2/gvAA01/brick 49153 0 Y 9916 >>> Brick 00-A:/arbiterAA01/gvAA01/bri >>> ck2 49153 0 Y >>> 6939 >>> Brick 01-B:/brick3/gvAA01/brick 49154 0 Y 7235 >>> Brick 02-B:/brick3/gvAA01/brick 49154 0 Y 21858 >>> Brick 00-A:/arbiterAA01/gvAA01/bri >>> ck3 49154 0 Y >>> 6947 >>> Brick 01-B:/brick4/gvAA01/brick 49155 0 Y 31840 >>> Brick 02-B:/brick4/gvAA01/brick 49155 0 Y 9933 >>> Brick 00-A:/arbiterAA01/gvAA01/bri >>> ck4 49155 0 Y >>> 6956 >>> Brick 01-B:/brick5/gvAA01/brick 49156 0 Y 7233 >>> Brick 02-B:/brick5/gvAA01/brick 49156 0 Y 9942 >>> Brick 00-A:/arbiterAA01/gvAA01/bri >>> ck5 49156 0 Y >>> 6964 >>> Brick 01-B:/brick6/gvAA01/brick 49157 0 Y 7234 >>> Brick 02-B:/brick6/gvAA01/brick 49157 0 Y 9952 >>> Brick 00-A:/arbiterAA01/gvAA01/bri >>> ck6 49157 0 Y >>> 6974 >>> Brick 01-B:/brick7/gvAA01/brick 49158 0 Y 7248 >>> Brick 02-B:/brick7/gvAA01/brick 49158 0 Y 9960 >>> Brick 00-A:/arbiterAA01/gvAA01/bri >>> ck7 49158 0 Y >>> 6984 >>> Brick 01-B:/brick8/gvAA01/brick 49159 0 Y 7253 >>> Brick 02-B:/brick8/gvAA01/brick 49159 0 Y 9970 >>> Brick 00-A:/arbiterAA01/gvAA01/bri >>> ck8 49159 0 Y >>> 6993 >>> Brick 01-B:/brick9/gvAA01/brick 49160 0 Y 7245 >>> Brick 02-B:/brick9/gvAA01/brick 49160 0 Y 9984 >>> Brick 00-A:/arbiterAA01/gvAA01/bri >>> ck9 49160 0 Y >>> 7001 >>> NFS Server on localhost 2049 0 Y >>> 17276 >>> Self-heal Daemon on localhost N/A N/A Y >>> 25245 >>> NFS Server on 02-B 2049 0 Y 9089 >>> Self-heal Daemon on 02-B N/A N/A Y 17838 >>> NFS Server on 00-a 2049 0 Y 15660 >>> Self-heal Daemon on 00-a N/A N/A Y 16218 >>> >>> Task Status of Volume gvAA01 >>> >>> ------------------------------------------------------------------------------ >>> There are no active volume tasks >>> >>> And gluster volume info: >>> >>> # gluster volume info >>> >>> Volume Name: gvAA01 >>> Type: Distributed-Replicate >>> Volume ID: ca4ece2c-13fe-414b-856c-2878196d6118 >>> Status: Started >>> Snapshot Count: 0 >>> Number of Bricks: 9 x (2 + 1) = 27 >>> Transport-type: tcp >>> Bricks: >>> Brick1: 01-B:/brick1/gvAA01/brick >>> Brick2: 02-B:/brick1/gvAA01/brick >>> Brick3: 00-A:/arbiterAA01/gvAA01/brick1 (arbiter) >>> Brick4: 01-B:/brick2/gvAA01/brick >>> Brick5: 02-B:/brick2/gvAA01/brick >>> Brick6: 00-A:/arbiterAA01/gvAA01/brick2 (arbiter) >>> Brick7: 01-B:/brick3/gvAA01/brick >>> Brick8: 02-B:/brick3/gvAA01/brick >>> Brick9: 00-A:/arbiterAA01/gvAA01/brick3 (arbiter) >>> Brick10: 
01-B:/brick4/gvAA01/brick >>> Brick11: 02-B:/brick4/gvAA01/brick >>> Brick12: 00-A:/arbiterAA01/gvAA01/brick4 (arbiter) >>> Brick13: 01-B:/brick5/gvAA01/brick >>> Brick14: 02-B:/brick5/gvAA01/brick >>> Brick15: 00-A:/arbiterAA01/gvAA01/brick5 (arbiter) >>> Brick16: 01-B:/brick6/gvAA01/brick >>> Brick17: 02-B:/brick6/gvAA01/brick >>> Brick18: 00-A:/arbiterAA01/gvAA01/brick6 (arbiter) >>> Brick19: 01-B:/brick7/gvAA01/brick >>> Brick20: 02-B:/brick7/gvAA01/brick >>> Brick21: 00-A:/arbiterAA01/gvAA01/brick7 (arbiter) >>> Brick22: 01-B:/brick8/gvAA01/brick >>> Brick23: 02-B:/brick8/gvAA01/brick >>> Brick24: 00-A:/arbiterAA01/gvAA01/brick8 (arbiter) >>> Brick25: 01-B:/brick9/gvAA01/brick >>> Brick26: 02-B:/brick9/gvAA01/brick >>> Brick27: 00-A:/arbiterAA01/gvAA01/brick9 (arbiter) >>> Options Reconfigured: >>> cluster.shd-max-threads: 4 >>> performance.least-prio-threads: 16 >>> cluster.readdir-optimize: on >>> performance.quick-read: off >>> performance.stat-prefetch: off >>> cluster.data-self-heal: on >>> cluster.lookup-unhashed: auto >>> cluster.lookup-optimize: on >>> cluster.favorite-child-policy: mtime >>> server.allow-insecure: on >>> transport.address-family: inet >>> client.bind-insecure: on >>> cluster.entry-self-heal: off >>> cluster.metadata-self-heal: off >>> performance.md-cache-timeout: 600 >>> cluster.self-heal-daemon: enable >>> performance.readdir-ahead: on >>> diagnostics.brick-log-level: INFO >>> nfs.disable: off >>> >>> Thank you for any assistance. >>> >>> - Patrick >>> _______________________________________________ >>> Gluster-users mailing list >>> Gluster-users at gluster.org >>> https://lists.gluster.org/mailman/listinfo/gluster-users >>> >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrickmrennie at gmail.com Sun Apr 21 07:55:21 2019 From: patrickmrennie at gmail.com (Patrick Rennie) Date: Sun, 21 Apr 2019 15:55:21 +0800 Subject: [Gluster-users] Extremely slow cluster performance In-Reply-To: References: <93FC9B39-2E8C-4579-8C9D-DEF1A28B7384@onholyground.com> Message-ID: Just another small update, I'm continuing to watch my brick logs and I just saw these errors come up in the recent events too. I am going to continue to post any errors I see in the hope of finding the right one to try and fix.. This is from the logs on brick1, seems to be occurring on both nodes on brick1, although at different times. I'm not sure what this means, can anyone shed any light? I guess I am looking for some kind of specific error which may indicate something is broken or stuck and locking up and causing the extreme latency I'm seeing in the cluster. 
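Before wading through the raw entries below, it can help to summarise the error-level lines across all the brick logs at once and see which bricks and code paths dominate; assuming the default log location, something along these lines:

# grep -c "] E " /var/log/glusterfs/bricks/*.log
(count of error-level entries per brick log)

# grep -h "] E " /var/log/glusterfs/bricks/*.log | awk '{print $4}' | sort | uniq -c | sort -rn | head
(most frequent error sources across all bricks)
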
[2019-04-21 07:25:55.064497] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c700c, Program: GlusterFS 3.3, ProgVers: 330, Proc: 29) to rpc-transport (tcp.gvAA01-server) [2019-04-21 07:25:55.064612] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e58a) [0x7f3b3e93158a] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17d45) [0x7f3b3e4c5d45] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed [2019-04-21 07:25:55.064675] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c70af, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) [2019-04-21 07:25:55.064705] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed [2019-04-21 07:25:55.064742] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c723c, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) [2019-04-21 07:25:55.064768] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed [2019-04-21 07:25:55.064812] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c72b4, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) [2019-04-21 07:25:55.064837] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed [2019-04-21 07:25:55.064880] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c740b, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) [2019-04-21 07:25:55.064905] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed [2019-04-21 07:25:55.064939] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c7441, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) [2019-04-21 07:25:55.064962] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] 
-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed [2019-04-21 07:25:55.064996] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c74d5, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) [2019-04-21 07:25:55.065020] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed [2019-04-21 07:25:55.065052] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c7551, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) [2019-04-21 07:25:55.065076] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed [2019-04-21 07:25:55.065110] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c76d1, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) [2019-04-21 07:25:55.065133] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed Thanks again, -Patrick On Sun, Apr 21, 2019 at 3:50 PM Patrick Rennie wrote: > Hi Darrell, > > Thanks again for your advice, I've left it for a while but unfortunately > it's still just as slow and causing more problems for our operations now. I > will need to try and take some steps to at least bring performance back to > normal while continuing to investigate the issue longer term. I can > definitely see one node with heavier CPU than the other, almost double, > which I am OK with, but I think the heal process is going to take forever, > trying to check the "gluster volume heal info" shows thousands and > thousands of files which may need healing, I have no idea how many in total > the command is still running after hours, so I am not sure what has gone so > wrong to cause this. > > I've checked cluster.op-version and cluster.max-op-version and it looks > like I'm on the latest version there. > > I have no idea how long the healing is going to take on this cluster, we > have around 560TB of data on here, but I don't think I can wait that long > to try and restore performance to normal. > > Can anyone think of anything else I can try in the meantime to work out > what's causing the extreme latency? > > I've been going through cluster client the logs of some of our VMs and on > some of our FTP servers I found this in the cluster mount log, but I am not > seeing it on any of our other servers, just our FTP servers. 
> > [2019-04-21 07:16:19.925388] E [MSGID: 101046] > [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null > [2019-04-21 07:19:43.413834] W [MSGID: 114031] > [client-rpc-fops.c:2203:client3_3_setattr_cbk] 0-gvAA01-client-19: remote > operation failed [No such file or directory] > [2019-04-21 07:19:43.414153] W [MSGID: 114031] > [client-rpc-fops.c:2203:client3_3_setattr_cbk] 0-gvAA01-client-20: remote > operation failed [No such file or directory] > [2019-04-21 07:23:33.154717] E [MSGID: 101046] > [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null > [2019-04-21 07:33:24.943913] E [MSGID: 101046] > [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null > > Any ideas what this could mean? I am basically just grasping at straws > here. > > I am going to hold off on the version upgrade until I know there are no > files which need healing, which could be a while, from some reading I've > done there shouldn't be any issues with this as both are on v3.12.x > > I've free'd up a small amount of space, but I still need to work on this > further. > > I've read of a command "find .glusterfs -type f -links -2 -exec rm {} \;" > which could be run on each brick and it would potentially clean up any > files which were deleted straight from the bricks, but not via the client, > I have a feeling this could help me free up about 5-10TB per brick from > what I've been told about the history of this cluster. Can anyone confirm > if this is actually safe to run? > > At this stage, I'm open to any suggestions as to how to proceed, thanks > again for any advice. > > Cheers, > > - Patrick > > On Sun, Apr 21, 2019 at 1:22 AM Darrell Budic > wrote: > >> Patrick, >> >> Sounds like progress. Be aware that gluster is expected to max out the >> CPUs on at least one of your servers while healing. This is normal and >> won?t adversely affect overall performance (any more than having bricks in >> need of healing, at any rate) unless you?re overdoing it. shd threads <= 4 >> should not do that on your hardware. Other tunings may have also increased >> overall performance, so you may see higher CPU than previously anyway. I?d >> recommend upping those thread counts and letting it heal as fast as >> possible, especially if these are dedicated Gluster storage servers (Ie: >> not also running VMs, etc). You should see ?normal? CPU use one heals are >> completed. I see ~15-30% overall normally, 95-98% while healing (x my 20 >> cores). It?s also likely to be different between your servers, in a pure >> replica, one tends to max and one tends to be a little higher, in a >> distributed-replica, I?d expect more than one to run harder while healing. >> >> Keep the differences between doing an ls on a brick and doing an ls on a >> gluster mount in mind. When you do a ls on a gluster volume, it isn?t just >> doing a ls on one brick, it?s effectively doing it on ALL of your bricks, >> and they all have to return data before the ls succeeds. In a distributed >> volume, it?s figuring out where on each volume things live and getting the >> stat() from each to assemble the whole thing. And if things are in need of >> healing, it will take even longer to decide which version is current and >> use it (shd triggers a heal anytime it encounters this). Any of these >> things being slow slows down the overall response. >> >> At this point, I?d get some sleep too, and let your cluster heal while >> you do. 
I?d really want it fully healed before I did any updates anyway, so >> let it use CPU and get itself sorted out. Expect it to do a round of >> healing after you upgrade each machine too, this is normal so don?t let the >> CPU spike surprise you, It?s just catching up from the downtime incurred by >> the update and/or reboot if you did one. >> >> That reminds me, check your gluster cluster.op-version and >> cluster.max-op-version (gluster vol get all all | grep op-version). If >> op-version isn?t at the max-op-verison, set it to it so you?re taking >> advantage of the latest features available to your version. >> >> -Darrell >> >> On Apr 20, 2019, at 11:54 AM, Patrick Rennie >> wrote: >> >> Hi Darrell, >> >> Thanks again for your advice, I've applied the acltype=posixacl on my >> zpools and I think that has reduced some of the noise from my brick logs. >> I also bumped up some of the thread counts you suggested but my CPU load >> skyrocketed, so I dropped it back down to something slightly lower, but >> still higher than it was before, and will see how that goes for a while. >> >> Although low space is a definite issue, if I run an ls anywhere on my >> bricks directly it's instant, <1 second, and still takes several minutes >> via gluster, so there is still a problem in my gluster configuration >> somewhere. We don't have any snapshots, but I am trying to work out if any >> data on there is safe to delete, or if there is any way I can safely find >> and delete data which has been removed directly from the bricks in the >> past. I also have lz4 compression already enabled on each zpool which does >> help a bit, we get between 1.05 and 1.08x compression on this data. >> I've tried to go through each client and checked it's cluster mount logs >> and also my brick logs and looking for errors, so far nothing is jumping >> out at me, but there are some warnings and errors here and there, I am >> trying to work out what they mean. >> >> It's already 1 am here and unfortunately, I'm still awake working on this >> issue, but I think that I will have to leave the version upgrades until >> tomorrow. >> >> Thanks again for your advice so far. If anyone has any ideas on where I >> can look for errors other than brick logs or the cluster mount logs to help >> resolve this issue, it would be much appreciated. >> >> Cheers, >> >> - Patrick >> >> On Sat, Apr 20, 2019 at 11:57 PM Darrell Budic >> wrote: >> >>> See inline: >>> >>> On Apr 20, 2019, at 10:09 AM, Patrick Rennie >>> wrote: >>> >>> Hi Darrell, >>> >>> Thanks for your reply, this issue seems to be getting worse over the >>> last few days, really has me tearing my hair out. I will do as you have >>> suggested and get started on upgrading from 3.12.14 to 3.12.15. >>> I've checked the zfs properties and all bricks have "xattr=sa" set, but >>> none of them has "acltype=posixacl" set, currently the acltype property >>> shows "off", if I make these changes will it apply retroactively to the >>> existing data? I'm unfamiliar with what this will change so I may need to >>> look into that before I proceed. >>> >>> >>> It is safe to apply that now, any new set/get calls will then use it if >>> new posixacls exist, and use older if not. ZFS is good that way. It should >>> clear up your posix_acl and posix errors over time. 
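For what it's worth, acltype is a normal ZFS dataset property, so applying and then verifying it per brick pool looks something like this (the dataset name below is a placeholder for whatever backs each of /brick1../brick9 and the arbiter path):

# zfs set acltype=posixacl pool/brick1
# zfs get acltype,xattr pool/brick1
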
>>> >>> I understand performance is going to slow down as the bricks get full, I >>> am currently trying to free space and migrate data to some newer storage, I >>> have fresh several hundred TB storage I just setup recently but with these >>> performance issues it's really slow. I also believe there is significant >>> data which has been deleted directly from the bricks in the past, so if I >>> can reclaim this space in a safe manner then I will have at least around >>> 10-15% free space. >>> >>> >>> Full ZFS volumes will have a much larger impact on performance than >>> you?d think, I?d prioritize this. If you have been taking zfs snapshots, >>> consider deleting them to get the overall volume free space back up. And >>> just to be sure it?s been said, delete from within the mounted volumes, >>> don?t delete directly from the bricks (gluster will just try and heal it >>> later, compounding your issues). Does not apply to deleting other data from >>> the ZFS volume if it?s not part of the brick directory, of course. >>> >>> These servers have dual 8 core Xeon (E5-2620v4) and 512GB of RAM so >>> generally they have plenty of resources available, currently only using >>> around 330/512GB of memory. >>> >>> I will look into what your suggested settings will change, and then will >>> probably go ahead with your recommendations, for our specs as stated above, >>> what would you suggest for performance.io-thread-count ? >>> >>> >>> I run single 2630v4s on my servers, which have a smaller storage >>> footprint than yours. I?d go with 32 for performance.io-thread-count. >>> I?d try 4 for the shd thread settings on that gear. Your memory use sounds >>> fine, so no worries there. >>> >>> Our workload is nothing too extreme, we have a few VMs which write >>> backup data to this storage nightly for our clients, our VMs don't live on >>> this cluster, but just write to it. >>> >>> >>> If they are writing compressible data, you?ll get immediate benefit by >>> setting compression=lz4 on your ZFS volumes. It won?t help any old data, of >>> course, but it will compress new data going forward. This is another one >>> that?s safe to enable on the fly. >>> >>> I've been going through all of the logs I can, below are some slightly >>> sanitized errors I've come across, but I'm not sure what to make of them. >>> The main error I am seeing is the first one below, across several of my >>> bricks, but possibly only for specific folders on the cluster, I'm not 100% >>> about that yet though. 
>>> >>> [2019-04-20 05:56:59.512649] E [MSGID: 113001] >>> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >>> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >>> supported] >>> [2019-04-20 05:59:06.084333] E [MSGID: 113001] >>> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >>> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >>> supported] >>> [2019-04-20 05:59:43.289030] E [MSGID: 113001] >>> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >>> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >>> supported] >>> [2019-04-20 05:59:50.582257] E [MSGID: 113001] >>> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >>> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >>> supported] >>> [2019-04-20 06:01:42.501701] E [MSGID: 113001] >>> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >>> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >>> supported] >>> [2019-04-20 06:01:51.665354] W [posix.c:4929:posix_getxattr] >>> 0-gvAA01-posix: Extended attributes not supported (try remounting brick >>> with 'user_xattr' flag) >>> >>> >>> [2019-04-20 13:12:36.131856] E [MSGID: 113002] >>> [posix-helpers.c:893:posix_gfid_set] 0-gvAA01-posix: gfid is null for >>> /xxxxxxxxxxxxxxxxxxxx [Invalid argument] >>> [2019-04-20 13:12:36.131959] E [MSGID: 113002] >>> [posix.c:362:posix_lookup] 0-gvAA01-posix: buf->ia_gfid is null for >>> /brick2/xxxxxxxxxxxxxxxxxxxx_62906_tmp [No data available] >>> [2019-04-20 13:12:36.132016] E [MSGID: 115050] >>> [server-rpc-fops.c:175:server_lookup_cbk] 0-gvAA01-server: 24274759: LOOKUP >>> /xxxxxxxxxxxxxxxxxxxx (a7c9b4a0-b7ee-4d01-a79e-576013c8ac87/Cloud >>> Backup_clone1.vbm_62906_tmp), client: >>> 00-A-16217-2019/04/08-21:23:03:692424-gvAA01-client-4-0-3, error-xlator: >>> gvAA01-posix [No data available] >>> [2019-04-20 13:12:38.093719] E [MSGID: 115050] >>> [server-rpc-fops.c:175:server_lookup_cbk] 0-gvAA01-server: 24276491: LOOKUP >>> /xxxxxxxxxxxxxxxxxxxx (a7c9b4a0-b7ee-4d01-a79e-576013c8ac87/Cloud >>> Backup_clone1.vbm_62906_tmp), client: >>> 00-A-16217-2019/04/08-21:23:03:692424-gvAA01-client-4-0-3, error-xlator: >>> gvAA01-posix [No data available] >>> [2019-04-20 13:12:38.093660] E [MSGID: 113002] >>> [posix-helpers.c:893:posix_gfid_set] 0-gvAA01-posix: gfid is null for >>> /xxxxxxxxxxxxxxxxxxxx [Invalid argument] >>> [2019-04-20 13:12:38.093696] E [MSGID: 113002] >>> [posix.c:362:posix_lookup] 0-gvAA01-posix: buf->ia_gfid is null for >>> /brick2/xxxxxxxxxxxxxxxxxxxx [No data available] >>> >>> >>> posixacls should clear those up, as mentioned. 
>>> >>> >>> [2019-04-20 14:25:59.654576] E [inodelk.c:404:__inode_unlock_lock] >>> 0-gvAA01-locks: Matching lock not found for unlock 0-9223372036854775807, >>> by 980fdbbd367f0000 on 0x7fc4f0161440 >>> [2019-04-20 14:25:59.654668] E [MSGID: 115053] >>> [server-rpc-fops.c:295:server_inodelk_cbk] 0-gvAA01-server: 6092928: >>> INODELK /xxxxxxxxxxxxxxxxxxxx.cdr$ (25b14631-a179-4274-8243-6e272d4f2ad8), >>> client: >>> cb-per-worker18-53637-2019/04/19-14:25:37:927673-gvAA01-client-1-0-4, >>> error-xlator: gvAA01-locks [Invalid argument] >>> >>> >>> [2019-04-20 13:35:07.495495] E [rpcsvc.c:1364:rpcsvc_submit_generic] >>> 0-rpc-service: failed to submit message (XID: 0x247c644, Program: GlusterFS >>> 3.3, ProgVers: 330, Proc: 27) to rpc-transport (tcp.gvAA01-server) >>> [2019-04-20 13:35:07.495619] E [server.c:195:server_submit_reply] >>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/debug/io-stats.so(+0x1696a) >>> [0x7ff4ae6f796a] >>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/protocol/server.so(+0x2d6e8) >>> [0x7ff4ae2a96e8] >>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/protocol/server.so(+0x928d) >>> [0x7ff4ae28528d] ) 0-: Reply submission failed >>> >>> >>> Fix the posix acls and see if these clear up over time as well, I?m >>> unclear on what the overall effect of running without the posix acls will >>> be to total gluster health. Your biggest problem sounds like you need to >>> free up space on the volumes and get the overall volume health back up to >>> par and see if that doesn?t resolve the symptoms you?re seeing. >>> >>> >>> >>> Thank you again for your assistance. It is greatly appreciated. >>> >>> - Patrick >>> >>> >>> >>> On Sat, Apr 20, 2019 at 10:50 PM Darrell Budic >>> wrote: >>> >>>> Patrick, >>>> >>>> I would definitely upgrade your two nodes from 3.12.14 to 3.12.15. You >>>> also mention ZFS, and that error you show makes me think you need to check >>>> to be sure you have ?xattr=sa? and ?acltype=posixacl? set on your ZFS >>>> volumes. >>>> >>>> You also observed your bricks are crossing the 95% full line, ZFS >>>> performance will degrade significantly the closer you get to full. In my >>>> experience, this starts somewhere between 10% and 5% free space remaining, >>>> so you?re in that realm. >>>> >>>> How?s your free memory on the servers doing? Do you have your zfs arc >>>> cache limited to something less than all the RAM? It shares pretty well, >>>> but I?ve encountered situations where other things won?t try and take ram >>>> back properly if they think it?s in use, so ZFS never gets the opportunity >>>> to give it up. >>>> >>>> Since your volume is a disperse-replica, you might try tuning >>>> disperse.shd-max-threads, default is 1, I?d try it at 2, 4, or even more if >>>> the CPUs are beefy enough. And setting server.event-threads to 4 and >>>> client.event-threads to 8 has proven helpful in many cases. After you get >>>> upgraded to 3.12.15, enabling performance.stat-prefetch may help as well. I >>>> don?t know if it matters, but I?d also recommend resetting >>>> performance.least-prio-threads to the default of 1 (or try 2 or 4) and/or >>>> also setting performance.io-thread-count to 32 if those have beefy >>>> CPUs. >>>> >>>> Beyond those general ideas, more info about your hardware (CPU and RAM) >>>> and workload (VMs, direct storage for web servers or enders, etc) may net >>>> you some more ideas. Then you?re going to have to do more digging into >>>> brick logs looking for errors and/or warnings to see what?s going on. 
>>>> >>>> -Darrell >>>> >>>> >>>> On Apr 20, 2019, at 8:22 AM, Patrick Rennie >>>> wrote: >>>> >>>> Hello Gluster Users, >>>> >>>> I am hoping someone can help me with resolving an ongoing issue I've >>>> been having, I'm new to mailing lists so forgive me if I have gotten >>>> anything wrong. We have noticed our performance deteriorating over the last >>>> few weeks, easily measured by trying to do an ls on one of our top-level >>>> folders, and timing it, which usually would take 2-5 seconds, and now takes >>>> up to 20 minutes, which obviously renders our cluster basically unusable. >>>> This has been intermittent in the past but is now almost constant and I am >>>> not sure how to work out the exact cause. We have noticed some errors in >>>> the brick logs, and have noticed that if we kill the right brick process, >>>> performance instantly returns back to normal, this is not always the same >>>> brick, but it indicates to me something in the brick processes or >>>> background tasks may be causing extreme latency. Due to this ability to fix >>>> it by killing the right brick process off, I think it's a specific file, or >>>> folder, or operation which may be hanging and causing the increased >>>> latency, but I am not sure how to work it out. One last thing to add is >>>> that our bricks are getting quite full (~95% full), we are trying to >>>> migrate data off to new storage but that is going slowly, not helped by >>>> this issue. I am currently trying to run a full heal as there appear to be >>>> many files needing healing, and I have all brick processes running so they >>>> have an opportunity to heal, but this means performance is very poor. It >>>> currently takes over 15-20 minutes to do an ls of one of our top-level >>>> folders, which just contains 60-80 other folders, this should take 2-5 >>>> seconds. This is all being checked by FUSE mount locally on the storage >>>> node itself, but it is the same for other clients and VMs accessing the >>>> cluster. Initially, it seemed our NFS mounts were not affected and operated >>>> at normal speed, but testing over the last day has shown that our NFS >>>> clients are also extremely slow, so it doesn't seem specific to FUSE as I >>>> first thought it might be. >>>> >>>> I am not sure how to proceed from here, I am fairly new to gluster >>>> having inherited this setup from my predecessor and trying to keep it >>>> going. I have included some info below to try and help with diagnosis, >>>> please let me know if any further info would be helpful. I would really >>>> appreciate any advice on what I could try to work out the cause. Thank you >>>> in advance for reading this, and any suggestions you might be able to >>>> offer. >>>> >>>> - Patrick >>>> >>>> This is an example of the main error I see in our brick logs, there >>>> have been others, I can post them when I see them again too: >>>> [2019-04-20 04:54:43.055680] E [MSGID: 113001] >>>> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >>>> /brick1/ library: system.posix_acl_default [Operation not >>>> supported] >>>> [2019-04-20 05:01:29.476313] W [posix.c:4929:posix_getxattr] >>>> 0-gvAA01-posix: Extended attributes not supported (try remounting brick >>>> with 'user_xattr' flag) >>>> >>>> Our setup consists of 2 storage nodes and an arbiter node. I have >>>> noticed our nodes are on slightly different versions, I'm not sure if this >>>> could be an issue. We have 9 bricks on each node, made up of ZFS RAIDZ2 >>>> pools - total capacity is around 560TB. 
>>>> We have bonded 10gbps NICS on each node, and I have tested bandwidth >>>> with iperf and found that it's what would be expected from this config. >>>> Individual brick performance seems ok, I've tested several bricks using >>>> dd and can write a 10GB files at 1.7GB/s. >>>> >>>> # dd if=/dev/zero of=/brick1/test/test.file bs=1M count=10000 >>>> 10000+0 records in >>>> 10000+0 records out >>>> 10485760000 bytes (10 GB, 9.8 GiB) copied, 6.20303 s, 1.7 GB/s >>>> >>>> Node 1: >>>> # glusterfs --version >>>> glusterfs 3.12.15 >>>> >>>> Node 2: >>>> # glusterfs --version >>>> glusterfs 3.12.14 >>>> >>>> Arbiter: >>>> # glusterfs --version >>>> glusterfs 3.12.14 >>>> >>>> Here is our gluster volume status: >>>> >>>> # gluster volume status >>>> Status of volume: gvAA01 >>>> Gluster process TCP Port RDMA Port >>>> Online Pid >>>> >>>> ------------------------------------------------------------------------------ >>>> Brick 01-B:/brick1/gvAA01/brick 49152 0 Y 7219 >>>> Brick 02-B:/brick1/gvAA01/brick 49152 0 Y 21845 >>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>> ck1 49152 0 Y >>>> 6931 >>>> Brick 01-B:/brick2/gvAA01/brick 49153 0 Y 7239 >>>> Brick 02-B:/brick2/gvAA01/brick 49153 0 Y 9916 >>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>> ck2 49153 0 Y >>>> 6939 >>>> Brick 01-B:/brick3/gvAA01/brick 49154 0 Y 7235 >>>> Brick 02-B:/brick3/gvAA01/brick 49154 0 Y 21858 >>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>> ck3 49154 0 Y >>>> 6947 >>>> Brick 01-B:/brick4/gvAA01/brick 49155 0 Y 31840 >>>> Brick 02-B:/brick4/gvAA01/brick 49155 0 Y 9933 >>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>> ck4 49155 0 Y >>>> 6956 >>>> Brick 01-B:/brick5/gvAA01/brick 49156 0 Y 7233 >>>> Brick 02-B:/brick5/gvAA01/brick 49156 0 Y 9942 >>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>> ck5 49156 0 Y >>>> 6964 >>>> Brick 01-B:/brick6/gvAA01/brick 49157 0 Y 7234 >>>> Brick 02-B:/brick6/gvAA01/brick 49157 0 Y 9952 >>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>> ck6 49157 0 Y >>>> 6974 >>>> Brick 01-B:/brick7/gvAA01/brick 49158 0 Y 7248 >>>> Brick 02-B:/brick7/gvAA01/brick 49158 0 Y 9960 >>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>> ck7 49158 0 Y >>>> 6984 >>>> Brick 01-B:/brick8/gvAA01/brick 49159 0 Y 7253 >>>> Brick 02-B:/brick8/gvAA01/brick 49159 0 Y 9970 >>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>> ck8 49159 0 Y >>>> 6993 >>>> Brick 01-B:/brick9/gvAA01/brick 49160 0 Y 7245 >>>> Brick 02-B:/brick9/gvAA01/brick 49160 0 Y 9984 >>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>> ck9 49160 0 Y >>>> 7001 >>>> NFS Server on localhost 2049 0 Y >>>> 17276 >>>> Self-heal Daemon on localhost N/A N/A Y >>>> 25245 >>>> NFS Server on 02-B 2049 0 Y 9089 >>>> Self-heal Daemon on 02-B N/A N/A Y 17838 >>>> NFS Server on 00-a 2049 0 Y 15660 >>>> Self-heal Daemon on 00-a N/A N/A Y 16218 >>>> >>>> Task Status of Volume gvAA01 >>>> >>>> ------------------------------------------------------------------------------ >>>> There are no active volume tasks >>>> >>>> And gluster volume info: >>>> >>>> # gluster volume info >>>> >>>> Volume Name: gvAA01 >>>> Type: Distributed-Replicate >>>> Volume ID: ca4ece2c-13fe-414b-856c-2878196d6118 >>>> Status: Started >>>> Snapshot Count: 0 >>>> Number of Bricks: 9 x (2 + 1) = 27 >>>> Transport-type: tcp >>>> Bricks: >>>> Brick1: 01-B:/brick1/gvAA01/brick >>>> Brick2: 02-B:/brick1/gvAA01/brick >>>> Brick3: 00-A:/arbiterAA01/gvAA01/brick1 (arbiter) >>>> Brick4: 01-B:/brick2/gvAA01/brick >>>> Brick5: 02-B:/brick2/gvAA01/brick >>>> Brick6: 00-A:/arbiterAA01/gvAA01/brick2 (arbiter) >>>> Brick7: 01-B:/brick3/gvAA01/brick >>>> Brick8: 
02-B:/brick3/gvAA01/brick >>>> Brick9: 00-A:/arbiterAA01/gvAA01/brick3 (arbiter) >>>> Brick10: 01-B:/brick4/gvAA01/brick >>>> Brick11: 02-B:/brick4/gvAA01/brick >>>> Brick12: 00-A:/arbiterAA01/gvAA01/brick4 (arbiter) >>>> Brick13: 01-B:/brick5/gvAA01/brick >>>> Brick14: 02-B:/brick5/gvAA01/brick >>>> Brick15: 00-A:/arbiterAA01/gvAA01/brick5 (arbiter) >>>> Brick16: 01-B:/brick6/gvAA01/brick >>>> Brick17: 02-B:/brick6/gvAA01/brick >>>> Brick18: 00-A:/arbiterAA01/gvAA01/brick6 (arbiter) >>>> Brick19: 01-B:/brick7/gvAA01/brick >>>> Brick20: 02-B:/brick7/gvAA01/brick >>>> Brick21: 00-A:/arbiterAA01/gvAA01/brick7 (arbiter) >>>> Brick22: 01-B:/brick8/gvAA01/brick >>>> Brick23: 02-B:/brick8/gvAA01/brick >>>> Brick24: 00-A:/arbiterAA01/gvAA01/brick8 (arbiter) >>>> Brick25: 01-B:/brick9/gvAA01/brick >>>> Brick26: 02-B:/brick9/gvAA01/brick >>>> Brick27: 00-A:/arbiterAA01/gvAA01/brick9 (arbiter) >>>> Options Reconfigured: >>>> cluster.shd-max-threads: 4 >>>> performance.least-prio-threads: 16 >>>> cluster.readdir-optimize: on >>>> performance.quick-read: off >>>> performance.stat-prefetch: off >>>> cluster.data-self-heal: on >>>> cluster.lookup-unhashed: auto >>>> cluster.lookup-optimize: on >>>> cluster.favorite-child-policy: mtime >>>> server.allow-insecure: on >>>> transport.address-family: inet >>>> client.bind-insecure: on >>>> cluster.entry-self-heal: off >>>> cluster.metadata-self-heal: off >>>> performance.md-cache-timeout: 600 >>>> cluster.self-heal-daemon: enable >>>> performance.readdir-ahead: on >>>> diagnostics.brick-log-level: INFO >>>> nfs.disable: off >>>> >>>> Thank you for any assistance. >>>> >>>> - Patrick >>>> _______________________________________________ >>>> Gluster-users mailing list >>>> Gluster-users at gluster.org >>>> https://lists.gluster.org/mailman/listinfo/gluster-users >>>> >>>> >>>> >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrickmrennie at gmail.com Sun Apr 21 09:41:01 2019 From: patrickmrennie at gmail.com (Patrick Rennie) Date: Sun, 21 Apr 2019 17:41:01 +0800 Subject: [Gluster-users] Extremely slow cluster performance In-Reply-To: References: <93FC9B39-2E8C-4579-8C9D-DEF1A28B7384@onholyground.com> Message-ID: Another small update from me, I have been keeping an eye on the glustershd.log file to see what is going on and I keep seeing the same file names come up in there every 10 minutes, but not a lot of other activity. Logs below. How can I be sure my heal is progressing through the files which actually need to be healed? I thought it would show up in these logs. I also increased the "cluster.shd-max-threads" from 4 to 8 to try and speed things up too. Any ideas here? Thanks, - Patrick On 01-B ------- [2019-04-21 09:12:54.575689] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-gvAA01-replicate-6: performing metadata selfheal on 5354c112-2e58-451d-a6f7-6bfcc1c9d904 [2019-04-21 09:12:54.733601] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed metadata selfheal on 5354c112-2e58-451d-a6f7-6bfcc1c9d904. 
sources=[0] 2 sinks=1 [2019-04-21 09:13:12.028509] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-5: performing entry selfheal on 30547ab6-1fbd-422e-9c81-2009f9ff7ebe [2019-04-21 09:13:12.047470] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-5: expunging file 30547ab6-1fbd-422e-9c81-2009f9ff7ebe/XXXXXXXX.vbm_346744_tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-17 [2019-04-21 09:23:13.044377] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-5: performing entry selfheal on 30547ab6-1fbd-422e-9c81-2009f9ff7ebe [2019-04-21 09:23:13.051479] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-5: expunging file 30547ab6-1fbd-422e-9c81-2009f9ff7ebe/XXXXXXXX.vbm_346744_tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-17 [2019-04-21 09:33:07.400369] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed data selfheal on 2fd9899f-192b-49cb-ae9c-df35d3f004fa. sources=[0] 2 sinks=1 [2019-04-21 09:33:11.825449] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-gvAA01-replicate-6: performing metadata selfheal on 2fd9899f-192b-49cb-ae9c-df35d3f004fa [2019-04-21 09:33:14.029837] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-5: performing entry selfheal on 30547ab6-1fbd-422e-9c81-2009f9ff7ebe [2019-04-21 09:33:14.037436] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-5: expunging file 30547ab6-1fbd-422e-9c81-2009f9ff7ebe/XXXXXXXX.vbm_346744_tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-17 [2019-04-21 09:33:23.913882] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed metadata selfheal on 2fd9899f-192b-49cb-ae9c-df35d3f004fa. sources=[0] 2 sinks=1 [2019-04-21 09:33:43.874201] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-gvAA01-replicate-6: performing metadata selfheal on c25b80fd-f7df-4c6d-92bd-db930e89a0b1 [2019-04-21 09:34:02.273898] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed metadata selfheal on c25b80fd-f7df-4c6d-92bd-db930e89a0b1. sources=[0] 2 sinks=1 [2019-04-21 09:35:12.282045] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed data selfheal on 94027f22-a7d7-4827-be0d-09cf5ddda885. sources=[0] 2 sinks=1 [2019-04-21 09:35:15.146252] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-gvAA01-replicate-6: performing metadata selfheal on 94027f22-a7d7-4827-be0d-09cf5ddda885 [2019-04-21 09:35:15.254538] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed metadata selfheal on 94027f22-a7d7-4827-be0d-09cf5ddda885. sources=[0] 2 sinks=1 [2019-04-21 09:35:22.900803] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed data selfheal on 84c93069-cfd8-441b-a6e8-958bed535b45. sources=[0] 2 sinks=1 [2019-04-21 09:35:27.150963] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-gvAA01-replicate-6: performing metadata selfheal on 84c93069-cfd8-441b-a6e8-958bed535b45 [2019-04-21 09:35:29.186295] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed metadata selfheal on 84c93069-cfd8-441b-a6e8-958bed535b45. 
sources=[0] 2 sinks=1 [2019-04-21 09:35:35.967451] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed data selfheal on e747c32e-4353-4173-9024-855c69cdf9b9. sources=[0] 2 sinks=1 [2019-04-21 09:35:40.733444] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-gvAA01-replicate-6: performing metadata selfheal on e747c32e-4353-4173-9024-855c69cdf9b9 [2019-04-21 09:35:58.707593] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed metadata selfheal on e747c32e-4353-4173-9024-855c69cdf9b9. sources=[0] 2 sinks=1 [2019-04-21 09:36:25.554260] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed data selfheal on 4758d581-9de0-403b-af8b-bfd3d71d020d. sources=[0] 2 sinks=1 [2019-04-21 09:36:26.031422] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-gvAA01-replicate-6: performing metadata selfheal on 4758d581-9de0-403b-af8b-bfd3d71d020d [2019-04-21 09:36:26.083982] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed metadata selfheal on 4758d581-9de0-403b-af8b-bfd3d71d020d. sources=[0] 2 sinks=1 On 02-B ------- [2019-04-21 09:03:15.815250] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: performing entry selfheal on 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01 [2019-04-21 09:03:15.863153] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: expunging file 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01/C_VOL-b003-i174-cd.md5.tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 [2019-04-21 09:03:15.867432] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: performing entry selfheal on 65f6d9fb-a441-4e47-b91a-0936d11a8c8f [2019-04-21 09:03:15.875134] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: expunging file 65f6d9fb-a441-4e47-b91a-0936d11a8c8f/C_VOL-b001-i14937-cd.md5.tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 [2019-04-21 09:03:39.020198] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-5: performing entry selfheal on 30547ab6-1fbd-422e-9c81-2009f9ff7ebe [2019-04-21 09:03:39.027345] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-5: expunging file 30547ab6-1fbd-422e-9c81-2009f9ff7ebe/XXXXXXXX.vbm_346744_tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-17 [2019-04-21 09:13:18.524874] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: performing entry selfheal on 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01 [2019-04-21 09:13:20.070172] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: expunging file 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01/C_VOL-b003-i174-cd.md5.tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 [2019-04-21 09:13:20.074977] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: performing entry selfheal on 65f6d9fb-a441-4e47-b91a-0936d11a8c8f [2019-04-21 09:13:20.080827] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: expunging file 65f6d9fb-a441-4e47-b91a-0936d11a8c8f/C_VOL-b001-i14937-cd.md5.tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 [2019-04-21 09:13:40.015763] I [MSGID: 108026] 
[afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-5: performing entry selfheal on 30547ab6-1fbd-422e-9c81-2009f9ff7ebe [2019-04-21 09:13:40.021805] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-5: expunging file 30547ab6-1fbd-422e-9c81-2009f9ff7ebe/XXXXXXXX.vbm_346744_tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-17 [2019-04-21 09:23:21.991032] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: performing entry selfheal on 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01 [2019-04-21 09:23:22.054565] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: expunging file 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01/C_VOL-b003-i174-cd.md5.tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 [2019-04-21 09:23:22.059225] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: performing entry selfheal on 65f6d9fb-a441-4e47-b91a-0936d11a8c8f [2019-04-21 09:23:22.066266] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: expunging file 65f6d9fb-a441-4e47-b91a-0936d11a8c8f/C_VOL-b001-i14937-cd.md5.tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 [2019-04-21 09:23:41.129962] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-5: performing entry selfheal on 30547ab6-1fbd-422e-9c81-2009f9ff7ebe [2019-04-21 09:23:41.135919] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-5: expunging file 30547ab6-1fbd-422e-9c81-2009f9ff7ebe/XXXXXXXX.vbm_346744_tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-17 [2019-04-21 09:33:24.015223] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: performing entry selfheal on 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01 [2019-04-21 09:33:24.069686] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: expunging file 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01/C_VOL-b003-i174-cd.md5.tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 [2019-04-21 09:33:24.074341] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: performing entry selfheal on 65f6d9fb-a441-4e47-b91a-0936d11a8c8f [2019-04-21 09:33:24.080065] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: expunging file 65f6d9fb-a441-4e47-b91a-0936d11a8c8f/C_VOL-b001-i14937-cd.md5.tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 [2019-04-21 09:33:42.099515] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-5: performing entry selfheal on 30547ab6-1fbd-422e-9c81-2009f9ff7ebe [2019-04-21 09:33:42.107481] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-5: expunging file 30547ab6-1fbd-422e-9c81-2009f9ff7ebe/XXXXXXXX.vbm_346744_tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-17 On Sun, Apr 21, 2019 at 3:55 PM Patrick Rennie wrote: > Just another small update, I'm continuing to watch my brick logs and I > just saw these errors come up in the recent events too. I am going to > continue to post any errors I see in the hope of finding the right one to > try and fix.. > This is from the logs on brick1, seems to be occurring on both nodes on > brick1, although at different times. I'm not sure what this means, can > anyone shed any light? 
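One way to make sense of the entries that keep reappearing in glustershd.log is to resolve the GFIDs it names back to real paths on a brick. Inside every brick there is a hard link (or, for directories, a symlink) at .glusterfs/<first two hex characters>/<next two hex characters>/<full GFID>. A minimal sketch for the GFID 30547ab6-1fbd-422e-9c81-2009f9ff7ebe quoted above, using /brick6/gvAA01/brick purely as an example path; substitute a brick that belongs to the replica set named in the log line:

# ls -ld /brick6/gvAA01/brick/.glusterfs/30/54/30547ab6-1fbd-422e-9c81-2009f9ff7ebe

If that entry is a symlink, the GFID is a directory and readlink shows its parent GFID plus its name:

# readlink /brick6/gvAA01/brick/.glusterfs/30/54/30547ab6-1fbd-422e-9c81-2009f9ff7ebe

If it is a regular file, find -samefile locates the named path sharing the same inode (this walks the whole brick, so it can take a while on bricks this size):

# find /brick6/gvAA01/brick -samefile /brick6/gvAA01/brick/.glusterfs/30/54/30547ab6-1fbd-422e-9c81-2009f9ff7ebe -not -path '*/.glusterfs/*'

Once the GFID resolves to an actual directory or file, its trusted.afr.* xattrs can be compared across the three bricks of that replica set to see why the same entry is being picked up by every heal crawl.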
> I guess I am looking for some kind of specific error which may indicate > something is broken or stuck and locking up and causing the extreme latency > I'm seeing in the cluster. > > [2019-04-21 07:25:55.064497] E [rpcsvc.c:1364:rpcsvc_submit_generic] > 0-rpc-service: failed to submit message (XID: 0x7c700c, Program: GlusterFS > 3.3, ProgVers: 330, Proc: 29) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.064612] E [server.c:195:server_submit_reply] > (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e58a) > [0x7f3b3e93158a] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17d45) > [0x7f3b3e4c5d45] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) > [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.064675] E [rpcsvc.c:1364:rpcsvc_submit_generic] > 0-rpc-service: failed to submit message (XID: 0x7c70af, Program: GlusterFS > 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.064705] E [server.c:195:server_submit_reply] > (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) > [0x7f3b3e9318fa] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) > [0x7f3b3e4c5f35] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) > [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.064742] E [rpcsvc.c:1364:rpcsvc_submit_generic] > 0-rpc-service: failed to submit message (XID: 0x7c723c, Program: GlusterFS > 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.064768] E [server.c:195:server_submit_reply] > (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) > [0x7f3b3e9318fa] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) > [0x7f3b3e4c5f35] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) > [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.064812] E [rpcsvc.c:1364:rpcsvc_submit_generic] > 0-rpc-service: failed to submit message (XID: 0x7c72b4, Program: GlusterFS > 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.064837] E [server.c:195:server_submit_reply] > (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) > [0x7f3b3e9318fa] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) > [0x7f3b3e4c5f35] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) > [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.064880] E [rpcsvc.c:1364:rpcsvc_submit_generic] > 0-rpc-service: failed to submit message (XID: 0x7c740b, Program: GlusterFS > 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.064905] E [server.c:195:server_submit_reply] > (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) > [0x7f3b3e9318fa] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) > [0x7f3b3e4c5f35] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) > [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.064939] E [rpcsvc.c:1364:rpcsvc_submit_generic] > 0-rpc-service: failed to submit message (XID: 0x7c7441, Program: GlusterFS > 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 
07:25:55.064962] E [server.c:195:server_submit_reply] > (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) > [0x7f3b3e9318fa] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) > [0x7f3b3e4c5f35] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) > [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.064996] E [rpcsvc.c:1364:rpcsvc_submit_generic] > 0-rpc-service: failed to submit message (XID: 0x7c74d5, Program: GlusterFS > 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.065020] E [server.c:195:server_submit_reply] > (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) > [0x7f3b3e9318fa] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) > [0x7f3b3e4c5f35] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) > [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.065052] E [rpcsvc.c:1364:rpcsvc_submit_generic] > 0-rpc-service: failed to submit message (XID: 0x7c7551, Program: GlusterFS > 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.065076] E [server.c:195:server_submit_reply] > (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) > [0x7f3b3e9318fa] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) > [0x7f3b3e4c5f35] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) > [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.065110] E [rpcsvc.c:1364:rpcsvc_submit_generic] > 0-rpc-service: failed to submit message (XID: 0x7c76d1, Program: GlusterFS > 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.065133] E [server.c:195:server_submit_reply] > (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) > [0x7f3b3e9318fa] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) > [0x7f3b3e4c5f35] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) > [0x7f3b3e4b72cd] ) 0-: Reply submission failed > > Thanks again, > > -Patrick > > On Sun, Apr 21, 2019 at 3:50 PM Patrick Rennie > wrote: > >> Hi Darrell, >> >> Thanks again for your advice, I've left it for a while but unfortunately >> it's still just as slow and causing more problems for our operations now. I >> will need to try and take some steps to at least bring performance back to >> normal while continuing to investigate the issue longer term. I can >> definitely see one node with heavier CPU than the other, almost double, >> which I am OK with, but I think the heal process is going to take forever, >> trying to check the "gluster volume heal info" shows thousands and >> thousands of files which may need healing, I have no idea how many in total >> the command is still running after hours, so I am not sure what has gone so >> wrong to cause this. >> >> I've checked cluster.op-version and cluster.max-op-version and it looks >> like I'm on the latest version there. >> >> I have no idea how long the healing is going to take on this cluster, we >> have around 560TB of data on here, but I don't think I can wait that long >> to try and restore performance to normal. >> >> Can anyone think of anything else I can try in the meantime to work out >> what's causing the extreme latency? 
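On the question of how to tell whether the heal is actually progressing when "gluster volume heal gvAA01 info" takes hours to enumerate everything: the per-brick pending counts are far cheaper to query and are usually enough. A minimal sketch, assuming the volume name gvAA01 used throughout this thread:

# gluster volume heal gvAA01 statistics heal-count

# gluster volume heal gvAA01 statistics

# watch -n 600 'gluster volume heal gvAA01 statistics heal-count'

The first command prints the number of entries still pending per brick; sampling it every few minutes and seeing the numbers fall is the simplest confirmation that the self-heal daemon is working through the backlog. The second prints the self-heal daemon's crawl history, including how many entries each crawl healed and how many failed, which also shows whether the same entries are failing over and over.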
>> >> I've been going through cluster client the logs of some of our VMs and on >> some of our FTP servers I found this in the cluster mount log, but I am not >> seeing it on any of our other servers, just our FTP servers. >> >> [2019-04-21 07:16:19.925388] E [MSGID: 101046] >> [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null >> [2019-04-21 07:19:43.413834] W [MSGID: 114031] >> [client-rpc-fops.c:2203:client3_3_setattr_cbk] 0-gvAA01-client-19: remote >> operation failed [No such file or directory] >> [2019-04-21 07:19:43.414153] W [MSGID: 114031] >> [client-rpc-fops.c:2203:client3_3_setattr_cbk] 0-gvAA01-client-20: remote >> operation failed [No such file or directory] >> [2019-04-21 07:23:33.154717] E [MSGID: 101046] >> [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null >> [2019-04-21 07:33:24.943913] E [MSGID: 101046] >> [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null >> >> Any ideas what this could mean? I am basically just grasping at straws >> here. >> >> I am going to hold off on the version upgrade until I know there are no >> files which need healing, which could be a while, from some reading I've >> done there shouldn't be any issues with this as both are on v3.12.x >> >> I've free'd up a small amount of space, but I still need to work on this >> further. >> >> I've read of a command "find .glusterfs -type f -links -2 -exec rm {} \;" >> which could be run on each brick and it would potentially clean up any >> files which were deleted straight from the bricks, but not via the client, >> I have a feeling this could help me free up about 5-10TB per brick from >> what I've been told about the history of this cluster. Can anyone confirm >> if this is actually safe to run? >> >> At this stage, I'm open to any suggestions as to how to proceed, thanks >> again for any advice. >> >> Cheers, >> >> - Patrick >> >> On Sun, Apr 21, 2019 at 1:22 AM Darrell Budic >> wrote: >> >>> Patrick, >>> >>> Sounds like progress. Be aware that gluster is expected to max out the >>> CPUs on at least one of your servers while healing. This is normal and >>> won?t adversely affect overall performance (any more than having bricks in >>> need of healing, at any rate) unless you?re overdoing it. shd threads <= 4 >>> should not do that on your hardware. Other tunings may have also increased >>> overall performance, so you may see higher CPU than previously anyway. I?d >>> recommend upping those thread counts and letting it heal as fast as >>> possible, especially if these are dedicated Gluster storage servers (Ie: >>> not also running VMs, etc). You should see ?normal? CPU use one heals are >>> completed. I see ~15-30% overall normally, 95-98% while healing (x my 20 >>> cores). It?s also likely to be different between your servers, in a pure >>> replica, one tends to max and one tends to be a little higher, in a >>> distributed-replica, I?d expect more than one to run harder while healing. >>> >>> Keep the differences between doing an ls on a brick and doing an ls on a >>> gluster mount in mind. When you do a ls on a gluster volume, it isn?t just >>> doing a ls on one brick, it?s effectively doing it on ALL of your bricks, >>> and they all have to return data before the ls succeeds. In a distributed >>> volume, it?s figuring out where on each volume things live and getting the >>> stat() from each to assemble the whole thing. 
And if things are in need of >>> healing, it will take even longer to decide which version is current and >>> use it (shd triggers a heal anytime it encounters this). Any of these >>> things being slow slows down the overall response. >>> >>> At this point, I?d get some sleep too, and let your cluster heal while >>> you do. I?d really want it fully healed before I did any updates anyway, so >>> let it use CPU and get itself sorted out. Expect it to do a round of >>> healing after you upgrade each machine too, this is normal so don?t let the >>> CPU spike surprise you, It?s just catching up from the downtime incurred by >>> the update and/or reboot if you did one. >>> >>> That reminds me, check your gluster cluster.op-version and >>> cluster.max-op-version (gluster vol get all all | grep op-version). If >>> op-version isn?t at the max-op-verison, set it to it so you?re taking >>> advantage of the latest features available to your version. >>> >>> -Darrell >>> >>> On Apr 20, 2019, at 11:54 AM, Patrick Rennie >>> wrote: >>> >>> Hi Darrell, >>> >>> Thanks again for your advice, I've applied the acltype=posixacl on my >>> zpools and I think that has reduced some of the noise from my brick logs. >>> I also bumped up some of the thread counts you suggested but my CPU load >>> skyrocketed, so I dropped it back down to something slightly lower, but >>> still higher than it was before, and will see how that goes for a while. >>> >>> Although low space is a definite issue, if I run an ls anywhere on my >>> bricks directly it's instant, <1 second, and still takes several minutes >>> via gluster, so there is still a problem in my gluster configuration >>> somewhere. We don't have any snapshots, but I am trying to work out if any >>> data on there is safe to delete, or if there is any way I can safely find >>> and delete data which has been removed directly from the bricks in the >>> past. I also have lz4 compression already enabled on each zpool which does >>> help a bit, we get between 1.05 and 1.08x compression on this data. >>> I've tried to go through each client and checked it's cluster mount logs >>> and also my brick logs and looking for errors, so far nothing is jumping >>> out at me, but there are some warnings and errors here and there, I am >>> trying to work out what they mean. >>> >>> It's already 1 am here and unfortunately, I'm still awake working on >>> this issue, but I think that I will have to leave the version upgrades >>> until tomorrow. >>> >>> Thanks again for your advice so far. If anyone has any ideas on where I >>> can look for errors other than brick logs or the cluster mount logs to help >>> resolve this issue, it would be much appreciated. >>> >>> Cheers, >>> >>> - Patrick >>> >>> On Sat, Apr 20, 2019 at 11:57 PM Darrell Budic >>> wrote: >>> >>>> See inline: >>>> >>>> On Apr 20, 2019, at 10:09 AM, Patrick Rennie >>>> wrote: >>>> >>>> Hi Darrell, >>>> >>>> Thanks for your reply, this issue seems to be getting worse over the >>>> last few days, really has me tearing my hair out. I will do as you have >>>> suggested and get started on upgrading from 3.12.14 to 3.12.15. >>>> I've checked the zfs properties and all bricks have "xattr=sa" set, but >>>> none of them has "acltype=posixacl" set, currently the acltype property >>>> shows "off", if I make these changes will it apply retroactively to the >>>> existing data? I'm unfamiliar with what this will change so I may need to >>>> look into that before I proceed. 
>>>> >>>> >>>> It is safe to apply that now, any new set/get calls will then use it if >>>> new posixacls exist, and use older if not. ZFS is good that way. It should >>>> clear up your posix_acl and posix errors over time. >>>> >>>> I understand performance is going to slow down as the bricks get full, >>>> I am currently trying to free space and migrate data to some newer storage, >>>> I have fresh several hundred TB storage I just setup recently but with >>>> these performance issues it's really slow. I also believe there is >>>> significant data which has been deleted directly from the bricks in the >>>> past, so if I can reclaim this space in a safe manner then I will have at >>>> least around 10-15% free space. >>>> >>>> >>>> Full ZFS volumes will have a much larger impact on performance than >>>> you?d think, I?d prioritize this. If you have been taking zfs snapshots, >>>> consider deleting them to get the overall volume free space back up. And >>>> just to be sure it?s been said, delete from within the mounted volumes, >>>> don?t delete directly from the bricks (gluster will just try and heal it >>>> later, compounding your issues). Does not apply to deleting other data from >>>> the ZFS volume if it?s not part of the brick directory, of course. >>>> >>>> These servers have dual 8 core Xeon (E5-2620v4) and 512GB of RAM so >>>> generally they have plenty of resources available, currently only using >>>> around 330/512GB of memory. >>>> >>>> I will look into what your suggested settings will change, and then >>>> will probably go ahead with your recommendations, for our specs as stated >>>> above, what would you suggest for performance.io-thread-count ? >>>> >>>> >>>> I run single 2630v4s on my servers, which have a smaller storage >>>> footprint than yours. I?d go with 32 for performance.io-thread-count. >>>> I?d try 4 for the shd thread settings on that gear. Your memory use sounds >>>> fine, so no worries there. >>>> >>>> Our workload is nothing too extreme, we have a few VMs which write >>>> backup data to this storage nightly for our clients, our VMs don't live on >>>> this cluster, but just write to it. >>>> >>>> >>>> If they are writing compressible data, you?ll get immediate benefit by >>>> setting compression=lz4 on your ZFS volumes. It won?t help any old data, of >>>> course, but it will compress new data going forward. This is another one >>>> that?s safe to enable on the fly. >>>> >>>> I've been going through all of the logs I can, below are some slightly >>>> sanitized errors I've come across, but I'm not sure what to make of them. >>>> The main error I am seeing is the first one below, across several of my >>>> bricks, but possibly only for specific folders on the cluster, I'm not 100% >>>> about that yet though. 
>>>> >>>> [2019-04-20 05:56:59.512649] E [MSGID: 113001] >>>> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >>>> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >>>> supported] >>>> [2019-04-20 05:59:06.084333] E [MSGID: 113001] >>>> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >>>> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >>>> supported] >>>> [2019-04-20 05:59:43.289030] E [MSGID: 113001] >>>> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >>>> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >>>> supported] >>>> [2019-04-20 05:59:50.582257] E [MSGID: 113001] >>>> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >>>> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >>>> supported] >>>> [2019-04-20 06:01:42.501701] E [MSGID: 113001] >>>> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >>>> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >>>> supported] >>>> [2019-04-20 06:01:51.665354] W [posix.c:4929:posix_getxattr] >>>> 0-gvAA01-posix: Extended attributes not supported (try remounting brick >>>> with 'user_xattr' flag) >>>> >>>> >>>> [2019-04-20 13:12:36.131856] E [MSGID: 113002] >>>> [posix-helpers.c:893:posix_gfid_set] 0-gvAA01-posix: gfid is null for >>>> /xxxxxxxxxxxxxxxxxxxx [Invalid argument] >>>> [2019-04-20 13:12:36.131959] E [MSGID: 113002] >>>> [posix.c:362:posix_lookup] 0-gvAA01-posix: buf->ia_gfid is null for >>>> /brick2/xxxxxxxxxxxxxxxxxxxx_62906_tmp [No data available] >>>> [2019-04-20 13:12:36.132016] E [MSGID: 115050] >>>> [server-rpc-fops.c:175:server_lookup_cbk] 0-gvAA01-server: 24274759: LOOKUP >>>> /xxxxxxxxxxxxxxxxxxxx (a7c9b4a0-b7ee-4d01-a79e-576013c8ac87/Cloud >>>> Backup_clone1.vbm_62906_tmp), client: >>>> 00-A-16217-2019/04/08-21:23:03:692424-gvAA01-client-4-0-3, error-xlator: >>>> gvAA01-posix [No data available] >>>> [2019-04-20 13:12:38.093719] E [MSGID: 115050] >>>> [server-rpc-fops.c:175:server_lookup_cbk] 0-gvAA01-server: 24276491: LOOKUP >>>> /xxxxxxxxxxxxxxxxxxxx (a7c9b4a0-b7ee-4d01-a79e-576013c8ac87/Cloud >>>> Backup_clone1.vbm_62906_tmp), client: >>>> 00-A-16217-2019/04/08-21:23:03:692424-gvAA01-client-4-0-3, error-xlator: >>>> gvAA01-posix [No data available] >>>> [2019-04-20 13:12:38.093660] E [MSGID: 113002] >>>> [posix-helpers.c:893:posix_gfid_set] 0-gvAA01-posix: gfid is null for >>>> /xxxxxxxxxxxxxxxxxxxx [Invalid argument] >>>> [2019-04-20 13:12:38.093696] E [MSGID: 113002] >>>> [posix.c:362:posix_lookup] 0-gvAA01-posix: buf->ia_gfid is null for >>>> /brick2/xxxxxxxxxxxxxxxxxxxx [No data available] >>>> >>>> >>>> posixacls should clear those up, as mentioned. 
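For reference, the ZFS property change being discussed here is applied per dataset and takes effect immediately for new xattr and ACL operations; existing data is not rewritten. A minimal sketch, using brick7/gvAA01 as a hypothetical pool/dataset name; substitute the dataset that actually backs each brick and repeat on both data nodes and the arbiter:

# zfs get xattr,acltype brick7/gvAA01

# zfs set xattr=sa brick7/gvAA01
# zfs set acltype=posixacl brick7/gvAA01

With acltype=posixacl in place, the system.posix_acl_default "Operation not supported" errors quoted above should stop appearing for new operations.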
>>>> >>>> >>>> [2019-04-20 14:25:59.654576] E [inodelk.c:404:__inode_unlock_lock] >>>> 0-gvAA01-locks: Matching lock not found for unlock 0-9223372036854775807, >>>> by 980fdbbd367f0000 on 0x7fc4f0161440 >>>> [2019-04-20 14:25:59.654668] E [MSGID: 115053] >>>> [server-rpc-fops.c:295:server_inodelk_cbk] 0-gvAA01-server: 6092928: >>>> INODELK /xxxxxxxxxxxxxxxxxxxx.cdr$ (25b14631-a179-4274-8243-6e272d4f2ad8), >>>> client: >>>> cb-per-worker18-53637-2019/04/19-14:25:37:927673-gvAA01-client-1-0-4, >>>> error-xlator: gvAA01-locks [Invalid argument] >>>> >>>> >>>> [2019-04-20 13:35:07.495495] E [rpcsvc.c:1364:rpcsvc_submit_generic] >>>> 0-rpc-service: failed to submit message (XID: 0x247c644, Program: GlusterFS >>>> 3.3, ProgVers: 330, Proc: 27) to rpc-transport (tcp.gvAA01-server) >>>> [2019-04-20 13:35:07.495619] E [server.c:195:server_submit_reply] >>>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/debug/io-stats.so(+0x1696a) >>>> [0x7ff4ae6f796a] >>>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/protocol/server.so(+0x2d6e8) >>>> [0x7ff4ae2a96e8] >>>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/protocol/server.so(+0x928d) >>>> [0x7ff4ae28528d] ) 0-: Reply submission failed >>>> >>>> >>>> Fix the posix acls and see if these clear up over time as well, I?m >>>> unclear on what the overall effect of running without the posix acls will >>>> be to total gluster health. Your biggest problem sounds like you need to >>>> free up space on the volumes and get the overall volume health back up to >>>> par and see if that doesn?t resolve the symptoms you?re seeing. >>>> >>>> >>>> >>>> Thank you again for your assistance. It is greatly appreciated. >>>> >>>> - Patrick >>>> >>>> >>>> >>>> On Sat, Apr 20, 2019 at 10:50 PM Darrell Budic >>>> wrote: >>>> >>>>> Patrick, >>>>> >>>>> I would definitely upgrade your two nodes from 3.12.14 to 3.12.15. You >>>>> also mention ZFS, and that error you show makes me think you need to check >>>>> to be sure you have ?xattr=sa? and ?acltype=posixacl? set on your ZFS >>>>> volumes. >>>>> >>>>> You also observed your bricks are crossing the 95% full line, ZFS >>>>> performance will degrade significantly the closer you get to full. In my >>>>> experience, this starts somewhere between 10% and 5% free space remaining, >>>>> so you?re in that realm. >>>>> >>>>> How?s your free memory on the servers doing? Do you have your zfs arc >>>>> cache limited to something less than all the RAM? It shares pretty well, >>>>> but I?ve encountered situations where other things won?t try and take ram >>>>> back properly if they think it?s in use, so ZFS never gets the opportunity >>>>> to give it up. >>>>> >>>>> Since your volume is a disperse-replica, you might try tuning >>>>> disperse.shd-max-threads, default is 1, I?d try it at 2, 4, or even more if >>>>> the CPUs are beefy enough. And setting server.event-threads to 4 and >>>>> client.event-threads to 8 has proven helpful in many cases. After you get >>>>> upgraded to 3.12.15, enabling performance.stat-prefetch may help as well. I >>>>> don?t know if it matters, but I?d also recommend resetting >>>>> performance.least-prio-threads to the default of 1 (or try 2 or 4) and/or >>>>> also setting performance.io-thread-count to 32 if those have beefy >>>>> CPUs. >>>>> >>>>> Beyond those general ideas, more info about your hardware (CPU and >>>>> RAM) and workload (VMs, direct storage for web servers or enders, etc) may >>>>> net you some more ideas. 
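All of the tunables mentioned above are applied with gluster volume set and take effect without restarting the volume. One detail worth noting for this thread: gvAA01 is a distributed-replicate volume, so the self-heal thread option that applies is cluster.shd-max-threads; disperse.shd-max-threads only affects dispersed (erasure-coded) volumes. A minimal sketch using the values suggested here, to be treated as starting points rather than definitive settings:

# gluster volume set gvAA01 cluster.shd-max-threads 4
# gluster volume set gvAA01 server.event-threads 4
# gluster volume set gvAA01 client.event-threads 8
# gluster volume set gvAA01 performance.io-thread-count 32

# gluster volume get gvAA01 all | grep -E 'shd-max-threads|event-threads|io-thread-count'

Any of them can be put back to its default later with gluster volume reset gvAA01 <option-name>.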
Then you?re going to have to do more digging into >>>>> brick logs looking for errors and/or warnings to see what?s going on. >>>>> >>>>> -Darrell >>>>> >>>>> >>>>> On Apr 20, 2019, at 8:22 AM, Patrick Rennie >>>>> wrote: >>>>> >>>>> Hello Gluster Users, >>>>> >>>>> I am hoping someone can help me with resolving an ongoing issue I've >>>>> been having, I'm new to mailing lists so forgive me if I have gotten >>>>> anything wrong. We have noticed our performance deteriorating over the last >>>>> few weeks, easily measured by trying to do an ls on one of our top-level >>>>> folders, and timing it, which usually would take 2-5 seconds, and now takes >>>>> up to 20 minutes, which obviously renders our cluster basically unusable. >>>>> This has been intermittent in the past but is now almost constant and I am >>>>> not sure how to work out the exact cause. We have noticed some errors in >>>>> the brick logs, and have noticed that if we kill the right brick process, >>>>> performance instantly returns back to normal, this is not always the same >>>>> brick, but it indicates to me something in the brick processes or >>>>> background tasks may be causing extreme latency. Due to this ability to fix >>>>> it by killing the right brick process off, I think it's a specific file, or >>>>> folder, or operation which may be hanging and causing the increased >>>>> latency, but I am not sure how to work it out. One last thing to add is >>>>> that our bricks are getting quite full (~95% full), we are trying to >>>>> migrate data off to new storage but that is going slowly, not helped by >>>>> this issue. I am currently trying to run a full heal as there appear to be >>>>> many files needing healing, and I have all brick processes running so they >>>>> have an opportunity to heal, but this means performance is very poor. It >>>>> currently takes over 15-20 minutes to do an ls of one of our top-level >>>>> folders, which just contains 60-80 other folders, this should take 2-5 >>>>> seconds. This is all being checked by FUSE mount locally on the storage >>>>> node itself, but it is the same for other clients and VMs accessing the >>>>> cluster. Initially, it seemed our NFS mounts were not affected and operated >>>>> at normal speed, but testing over the last day has shown that our NFS >>>>> clients are also extremely slow, so it doesn't seem specific to FUSE as I >>>>> first thought it might be. >>>>> >>>>> I am not sure how to proceed from here, I am fairly new to gluster >>>>> having inherited this setup from my predecessor and trying to keep it >>>>> going. I have included some info below to try and help with diagnosis, >>>>> please let me know if any further info would be helpful. I would really >>>>> appreciate any advice on what I could try to work out the cause. Thank you >>>>> in advance for reading this, and any suggestions you might be able to >>>>> offer. >>>>> >>>>> - Patrick >>>>> >>>>> This is an example of the main error I see in our brick logs, there >>>>> have been others, I can post them when I see them again too: >>>>> [2019-04-20 04:54:43.055680] E [MSGID: 113001] >>>>> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >>>>> /brick1/ library: system.posix_acl_default [Operation not >>>>> supported] >>>>> [2019-04-20 05:01:29.476313] W [posix.c:4929:posix_getxattr] >>>>> 0-gvAA01-posix: Extended attributes not supported (try remounting brick >>>>> with 'user_xattr' flag) >>>>> >>>>> Our setup consists of 2 storage nodes and an arbiter node. 
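Since killing the right brick process instantly restores performance, a statedump of the suspect brick is often the quickest way to see what that process is stuck on, because pending inode/entry locks and long-running call stacks are listed in it. A minimal sketch, assuming default settings, where the dump files normally land under /var/run/gluster on whichever server hosts each brick:

# gluster volume statedump gvAA01
# ls -lt /var/run/gluster | head

In the resulting *.dump.* files, look for lock sections where one client holds a granted inodelk or entrylk while many others sit blocked on the same inode, or for call stacks that never complete; either points at the file or directory a brick process is wedged on. The "Matching lock not found for unlock" errors quoted earlier make the lock tables a reasonable first place to look.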
I have >>>>> noticed our nodes are on slightly different versions, I'm not sure if this >>>>> could be an issue. We have 9 bricks on each node, made up of ZFS RAIDZ2 >>>>> pools - total capacity is around 560TB. >>>>> We have bonded 10gbps NICS on each node, and I have tested bandwidth >>>>> with iperf and found that it's what would be expected from this config. >>>>> Individual brick performance seems ok, I've tested several bricks >>>>> using dd and can write a 10GB files at 1.7GB/s. >>>>> >>>>> # dd if=/dev/zero of=/brick1/test/test.file bs=1M count=10000 >>>>> 10000+0 records in >>>>> 10000+0 records out >>>>> 10485760000 bytes (10 GB, 9.8 GiB) copied, 6.20303 s, 1.7 GB/s >>>>> >>>>> Node 1: >>>>> # glusterfs --version >>>>> glusterfs 3.12.15 >>>>> >>>>> Node 2: >>>>> # glusterfs --version >>>>> glusterfs 3.12.14 >>>>> >>>>> Arbiter: >>>>> # glusterfs --version >>>>> glusterfs 3.12.14 >>>>> >>>>> Here is our gluster volume status: >>>>> >>>>> # gluster volume status >>>>> Status of volume: gvAA01 >>>>> Gluster process TCP Port RDMA Port >>>>> Online Pid >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> Brick 01-B:/brick1/gvAA01/brick 49152 0 Y 7219 >>>>> Brick 02-B:/brick1/gvAA01/brick 49152 0 Y 21845 >>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>> ck1 49152 0 Y >>>>> 6931 >>>>> Brick 01-B:/brick2/gvAA01/brick 49153 0 Y 7239 >>>>> Brick 02-B:/brick2/gvAA01/brick 49153 0 Y 9916 >>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>> ck2 49153 0 Y >>>>> 6939 >>>>> Brick 01-B:/brick3/gvAA01/brick 49154 0 Y 7235 >>>>> Brick 02-B:/brick3/gvAA01/brick 49154 0 Y 21858 >>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>> ck3 49154 0 Y >>>>> 6947 >>>>> Brick 01-B:/brick4/gvAA01/brick 49155 0 Y 31840 >>>>> Brick 02-B:/brick4/gvAA01/brick 49155 0 Y 9933 >>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>> ck4 49155 0 Y >>>>> 6956 >>>>> Brick 01-B:/brick5/gvAA01/brick 49156 0 Y 7233 >>>>> Brick 02-B:/brick5/gvAA01/brick 49156 0 Y 9942 >>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>> ck5 49156 0 Y >>>>> 6964 >>>>> Brick 01-B:/brick6/gvAA01/brick 49157 0 Y 7234 >>>>> Brick 02-B:/brick6/gvAA01/brick 49157 0 Y 9952 >>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>> ck6 49157 0 Y >>>>> 6974 >>>>> Brick 01-B:/brick7/gvAA01/brick 49158 0 Y 7248 >>>>> Brick 02-B:/brick7/gvAA01/brick 49158 0 Y 9960 >>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>> ck7 49158 0 Y >>>>> 6984 >>>>> Brick 01-B:/brick8/gvAA01/brick 49159 0 Y 7253 >>>>> Brick 02-B:/brick8/gvAA01/brick 49159 0 Y 9970 >>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>> ck8 49159 0 Y >>>>> 6993 >>>>> Brick 01-B:/brick9/gvAA01/brick 49160 0 Y 7245 >>>>> Brick 02-B:/brick9/gvAA01/brick 49160 0 Y 9984 >>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>> ck9 49160 0 Y >>>>> 7001 >>>>> NFS Server on localhost 2049 0 Y >>>>> 17276 >>>>> Self-heal Daemon on localhost N/A N/A Y >>>>> 25245 >>>>> NFS Server on 02-B 2049 0 Y 9089 >>>>> Self-heal Daemon on 02-B N/A N/A Y 17838 >>>>> NFS Server on 00-a 2049 0 Y 15660 >>>>> Self-heal Daemon on 00-a N/A N/A Y 16218 >>>>> >>>>> Task Status of Volume gvAA01 >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> There are no active volume tasks >>>>> >>>>> And gluster volume info: >>>>> >>>>> # gluster volume info >>>>> >>>>> Volume Name: gvAA01 >>>>> Type: Distributed-Replicate >>>>> Volume ID: ca4ece2c-13fe-414b-856c-2878196d6118 >>>>> Status: Started >>>>> Snapshot Count: 0 >>>>> Number of Bricks: 9 x (2 + 1) = 27 >>>>> Transport-type: tcp >>>>> Bricks: >>>>> Brick1: 
01-B:/brick1/gvAA01/brick >>>>> Brick2: 02-B:/brick1/gvAA01/brick >>>>> Brick3: 00-A:/arbiterAA01/gvAA01/brick1 (arbiter) >>>>> Brick4: 01-B:/brick2/gvAA01/brick >>>>> Brick5: 02-B:/brick2/gvAA01/brick >>>>> Brick6: 00-A:/arbiterAA01/gvAA01/brick2 (arbiter) >>>>> Brick7: 01-B:/brick3/gvAA01/brick >>>>> Brick8: 02-B:/brick3/gvAA01/brick >>>>> Brick9: 00-A:/arbiterAA01/gvAA01/brick3 (arbiter) >>>>> Brick10: 01-B:/brick4/gvAA01/brick >>>>> Brick11: 02-B:/brick4/gvAA01/brick >>>>> Brick12: 00-A:/arbiterAA01/gvAA01/brick4 (arbiter) >>>>> Brick13: 01-B:/brick5/gvAA01/brick >>>>> Brick14: 02-B:/brick5/gvAA01/brick >>>>> Brick15: 00-A:/arbiterAA01/gvAA01/brick5 (arbiter) >>>>> Brick16: 01-B:/brick6/gvAA01/brick >>>>> Brick17: 02-B:/brick6/gvAA01/brick >>>>> Brick18: 00-A:/arbiterAA01/gvAA01/brick6 (arbiter) >>>>> Brick19: 01-B:/brick7/gvAA01/brick >>>>> Brick20: 02-B:/brick7/gvAA01/brick >>>>> Brick21: 00-A:/arbiterAA01/gvAA01/brick7 (arbiter) >>>>> Brick22: 01-B:/brick8/gvAA01/brick >>>>> Brick23: 02-B:/brick8/gvAA01/brick >>>>> Brick24: 00-A:/arbiterAA01/gvAA01/brick8 (arbiter) >>>>> Brick25: 01-B:/brick9/gvAA01/brick >>>>> Brick26: 02-B:/brick9/gvAA01/brick >>>>> Brick27: 00-A:/arbiterAA01/gvAA01/brick9 (arbiter) >>>>> Options Reconfigured: >>>>> cluster.shd-max-threads: 4 >>>>> performance.least-prio-threads: 16 >>>>> cluster.readdir-optimize: on >>>>> performance.quick-read: off >>>>> performance.stat-prefetch: off >>>>> cluster.data-self-heal: on >>>>> cluster.lookup-unhashed: auto >>>>> cluster.lookup-optimize: on >>>>> cluster.favorite-child-policy: mtime >>>>> server.allow-insecure: on >>>>> transport.address-family: inet >>>>> client.bind-insecure: on >>>>> cluster.entry-self-heal: off >>>>> cluster.metadata-self-heal: off >>>>> performance.md-cache-timeout: 600 >>>>> cluster.self-heal-daemon: enable >>>>> performance.readdir-ahead: on >>>>> diagnostics.brick-log-level: INFO >>>>> nfs.disable: off >>>>> >>>>> Thank you for any assistance. >>>>> >>>>> - Patrick >>>>> _______________________________________________ >>>>> Gluster-users mailing list >>>>> Gluster-users at gluster.org >>>>> https://lists.gluster.org/mailman/listinfo/gluster-users >>>>> >>>>> >>>>> >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From hunter86_bg at yahoo.com Sun Apr 21 11:51:33 2019 From: hunter86_bg at yahoo.com (Strahil) Date: Sun, 21 Apr 2019 14:51:33 +0300 Subject: [Gluster-users] Extremely slow cluster performance Message-ID: Hi Patrick, I guess you can collect some data via the 'gluster profile' command. At least it should show any issues from performance point of view. gluster volume profile volume start; Do an 'ls' gluster volume profile volume info gluster volume profile volume stop Also, can you define top 3 errors seen in the logs. If you manage to fix them (with the help of the community) one by one - you might restore your full functionality. By the way, do you have the option to archive the data and thus reduce the ammount stored - which obviously will increase ZFS performance. Best Regards, Strahil Nikolov On Apr 21, 2019 10:50, Patrick Rennie wrote: > > Hi Darrell,? > > Thanks again for your advice, I've left it for a while but unfortunately it's still just as slow and causing more problems for our operations now. I will need to try and take some steps to at least bring performance back to normal while continuing to investigate the issue longer term. 
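Spelling out the profiling sequence suggested above, assuming the volume name gvAA01 and using /mnt/gvAA01 as a stand-in for an actual FUSE mount point: start profiling, reproduce one slow operation, dump the counters, then stop so the extra accounting is not left running:

# gluster volume profile gvAA01 start
# time ls /mnt/gvAA01/some-top-level-folder
# gluster volume profile gvAA01 info
# gluster volume profile gvAA01 stop

The info output lists, per brick, call counts and average/maximum latencies for each file operation; a single brick whose LOOKUP or INODELK latencies sit far above the others is a strong hint about which brick process (or which underlying pool) is stalling the clients.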
I can definitely see one node with heavier CPU than the other, almost double, which I am OK with, but I think the heal process is going to take forever, trying to check the "gluster volume heal info" shows thousands and thousands of files which may need healing, I have no idea how many in total the command is still running after hours, so I am not sure what has gone so wrong to cause this.? > > I've checked cluster.op-version and cluster.max-op-version and it looks like I'm on the latest version there.? > > I have no idea how long the healing is going to take on this cluster, we have around 560TB of data on here, but I don't think I can wait that long to try and restore performance to normal.? > > Can anyone think of anything else I can try in the meantime to work out what's causing the extreme latency?? > > I've been going through cluster client the logs of some of our VMs and on some of our FTP servers I found this in the cluster mount log, but I am not seeing it on any of our other servers, just our FTP servers.? > > [2019-04-21 07:16:19.925388] E [MSGID: 101046] [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null > [2019-04-21 07:19:43.413834] W [MSGID: 114031] [client-rpc-fops.c:2203:client3_3_setattr_cbk] 0-gvAA01-client-19: remote operation failed [No such file or directory] > [2019-04-21 07:19:43.414153] W [MSGID: 114031] [client-rpc-fops.c:2203:client3_3_setattr_cbk] 0-gvAA01-client-20: remote operation failed [No such file or directory] > [2019-04-21 07:23:33.154717] E [MSGID: 101046] [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null > [2019-04-21 07:33:24.943913] E [MSGID: 101046] [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null > > Any ideas what this could mean? I am basically just grasping at straws here. > > I am going to hold off on the version upgrade until I know there are no files which need healing, which could be a while, from some reading I've done there shouldn't be any issues with this as both are on v3.12.x? > > I've free'd up a small amount of space, but I still need to work on this further.? > > I've read of a command "find .glusterfs -type f -links -2 -exec rm {} \;" which could be run on each brick and it would potentially clean up any files which were deleted straight from the bricks, but not via the client, I have a feeling this could help me free up about 5-10TB per brick from what I've been told about the history of this cluster. Can anyone confirm if this is actually safe to run?? > > At this stage, I'm open to any suggestions as to how to proceed, thanks again for any advice.? > > Cheers,? > > - Patrick > > On Sun, Apr 21, 2019 at 1:22 AM Darrell Budic wrote: >> >> Patrick, >> >> Sounds like progress. Be aware that gluster is expected to max out the CPUs on at least one of your servers while healing. This is normal and won?t adversely affect overall performance (any more than having bricks in need of healing, at any rate) unless you?re overdoing it. shd threads <= 4 should not do that on your hardware. Other tunings may have also increased overall performance, so you may see higher CPU than previously anyway. I?d recommend upping those thread counts and letting it heal as fast as possible, especially if these are dedicated Gluster storage servers (Ie: not also running VMs, etc). You should see ?normal? CPU use one heals are completed. I see ~15-30% overall normally, 95-98% while healing (x my 20 cores). 
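On the earlier question about "find .glusterfs -type f -links -2 -exec rm {} \;": before letting anything delete, it is worth doing a listing-only pass per brick and reviewing the output, since the same pattern can also match housekeeping files GlusterFS keeps under .glusterfs (for example entries under .glusterfs/indices or .glusterfs/changelogs). A minimal, non-destructive sketch, assuming brick paths like those in the volume status; repeat for each brick:

# find /brick1/gvAA01/brick/.glusterfs -type f -links -2 > /root/brick1-orphan-candidates.txt
# wc -l /root/brick1-orphan-candidates.txt

Only after confirming the listed entries really are leftovers of files that were deleted directly from the bricks, and estimating how much space they hold, would the -exec rm variant be reasonable, and even then doing it one brick at a time with healthy backups would be prudent.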
It?s also likely to be different between your servers, in a pure replica, one tends to max and one tends to be a little higher, in a distributed-replica, I?d expect more than one to run harder while healin -------------- next part -------------- An HTML attachment was scrubbed... URL: From hunter86_bg at yahoo.com Sun Apr 21 11:56:04 2019 From: hunter86_bg at yahoo.com (Strahil) Date: Sun, 21 Apr 2019 14:56:04 +0300 Subject: [Gluster-users] Extremely slow cluster performance Message-ID: By the way, can you provide the 'volume info' and the mount options on all clients? Maybe , there is an option that uses a lot of resources due to some client's mount options. Best Regards, Strahil NikolovOn Apr 21, 2019 10:55, Patrick Rennie wrote: > > Just another small update, I'm continuing to watch my brick logs and I just saw these errors come up in the recent events too. I am going to continue to post any errors I see in the hope of finding the right one to try and fix..? > This is from the logs on brick1, seems to be occurring on both nodes on brick1, although at different times. I'm not sure what this means, can anyone shed any light?? > I guess I am looking for some kind of specific error which may indicate something is broken or stuck and locking up and causing the extreme latency I'm seeing in the cluster.? > > [2019-04-21 07:25:55.064497] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c700c, Program: GlusterFS 3.3, ProgVers: 330, Proc: 29) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.064612] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e58a) [0x7f3b3e93158a] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17d45) [0x7f3b3e4c5d45] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.064675] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c70af, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.064705] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.064742] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c723c, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.064768] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.064812] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c72b4, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.064837] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] 
-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.064880] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c740b, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.064905] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.064939] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c7441, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.064962] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.064996] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c74d5, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.065020] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.065052] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c7551, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.065076] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.065110] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c76d1, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.065133] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed > > Thanks again, > > -Patrick > > On Sun, Apr 21, 2019 at 3:50 PM Patrick Rennie wrote: >> >> Hi Darrell,? >> >> Thanks again for your advice, I've left it for a while but unfortunately it's still just as slow and causing more problems for our operations now. 
I will need to try and take some steps to at least bring performance back to normal while continuing to investigate the issue longer term. I can definitely see one node with heavier CPU than the other, almost double, which I am OK with, but I think the heal process is going to take forever, trying to check the "gluster volume heal info" shows thousands and thousands of files which may need healing, I have no idea how many in total the command is still running after hours, so I am not sure what has gone so wrong to cause this.? >> >> I've checked cluster.op-version and cluster.max-op-version and it looks like I'm on the latest version there.? >> >> I have no idea how long the healing is going to take on this cluster, we have around 560TB of data on here, but I don't think I can wait that long to try and restore performance to normal.? >> >> Can anyone think of anything else I can try in the meantime to work out what's causing the extreme latency?? >> >> I've been going through cluster client the logs of some of our VMs and on some of our FTP servers I found this in the cluster mount log, but I am not seeing it on any of our other servers, just our FTP servers.? >> >> [2019-04-21 07:16:19.925388] E [MSGID: 101046] [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null >> [2019-04-21 07:19:43.413834] W [MSGID: 114031] [client-rpc-fops.c:2203:client3_3_setattr_cbk] 0-gvAA01-client-19: remote operation failed [No such file or directory] >> [2019-04-21 07:19:43.414153] W [MSGID: 114031] [client-rpc-fops.c:2203:client3_3_setattr_cbk] 0-gvAA01-client-20: remote operation failed [No such file or directory] >> [2019-04-21 07:23:33.154717] E [MSGID: 101046] [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null >> [2019-04-21 07:33:24.943913] E [MSGID: 101046] [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null >> >> Any ideas what this could mean? I am basically just grasping at straws here. >> >> I am going to hold off on the version upgrade until I know there are no files which need healing, which could be a while, from some reading I've done there shouldn't be any issues with this as both are on v3.12.x? >> >> I've free'd up a small amount of space, but I still need to work on this further.? >> >> I've read of a command "find .glusterfs -type f -links -2 -exec rm {} \;" which could be run on each brick and it would potentially clean up any files which were deleted straight from the bricks, but not via the client, I have a feeling this could help me free up about 5-10TB per brick from what I've been told about the history of this cluster. Can anyone confirm if this is actually safe to run?? >> >> At this stage, I'm open to any suggestions as to how to proceed, thanks again for any advice.? >> >> Cheers,? >> >> - Patrick >> >> On Sun, Apr 21, 2019 at 1:22 AM Darrell Budic wrote: >>> >>> Patrick, >>> >>> Sounds like progress. Be aware that gluster is expected to max out the CPUs on at least one of your servers while healing. This is normal and won?t adversely affect overall performance (any more than having bricks in need of healing, at any rate) unless you?re overdoing it. shd threads <= 4 should not do that on your hardware. Other tunings may have also increased overall performance, so you may see higher CPU than previously anyway. I?d recommend upping those thread counts and letting it heal as fast as possible, especially if these are dedicated Gluster storage servers (Ie: not also running VMs, etc). You should see ?normal? CPU use one heals are completed. 
I see ~15-30% overall normally, 95-98% while healing (x my 20 cores). It?s also likely to be different between your servers, in a pure replica, one tends to max and one tends to be a little higher, in a distributed-replica, I?d expect more than one to run harder while healing. >>> >>> Keep the differences between doing an ls on a brick and doing an ls on a gluster mount in mind. When you do a ls on a gluster volume, it isn?t just doing a ls on one brick, it?s effectively doing it on ALL of your bricks, and they all have to return data before the ls succeeds. In a distributed volume, it?s figuring out where on each volume things live and getting the stat() from each to assemble the whole thing. And if things are in need of healing, it will take even longer to decide which version is current and use it (shd triggers a heal anytime it encounters this). Any of these things being slow slows down the overall response.? >>> >>> At this point, I?d get some sleep too, and let your cluster heal while you do. I?d really want it fully healed before I did any updates anyway, so let it use CPU and get itself sorted out. Expect it to do a round of healing after you upgrade each machine too, this is normal so don?t let the CPU spike surprise you, It?s just catching up from the downtime incurred by the update and/or reboot if you did one. >>> >>> That reminds me, check your gluster cluster.op-version and cluster.max-op-version (gluster vol get all all | grep op-version). If op-version isn?t at the max-op-verison, set it to it so you?re taking advantage of the latest features available to your version. >>> >>> ? -Darrell >>> >>>> On Apr 20, 2019, at 11:54 AM, Patrick Rennie wrote: >>>> >>>> Hi Darrell,? >>>> >>>> Thanks again for your advice, I've applied the acltype=posixacl on my zpools and I think that has reduced some of the noise from my brick logs.? >>>> I also bumped up some of the thread counts you suggested but my CPU load skyrocketed, so I dropped it back down to something slightly lower, but still higher than it was before, and will see how that goes for a while.? >>>> >>>> Although low space is a definite issue, if I run an ls anywhere on my bricks directly it's instant, <1 second, and still takes several minutes via gluster, so there is still a problem in my gluster configuration somewhere. We don't have any snapshots, but I am trying to work out if any data on there is safe to delete, or if there is any way I can safely find and delete data which has been removed directly from the bricks in the past. I also have lz4 compression already enabled on each zpool which does help a bit, we get between 1.05 and 1.08x compression on this data.? >>>> I've tried to go through each client and checked it's cluster mount logs and also my brick logs and looking for errors, so far nothing is jumping out at me, but there are some warnings and errors here and there, I am trying to work out what they mean.? >>>> >>>> It's already 1 am here and unfortunately, I'm still awake working on this issue, but I think that I will have to leave the version upgrades until tomorrow.? >>>> >>>> Thanks again for your advice so far. If anyone has any ideas on where I can look for errors other than brick logs or the cluster mount logs to help resolve this issue, it would be much appreciated.? >>>> >>>> Cheers, >>>> >>>> - Patrick >>>> >>>> On Sat, Apr 20, 2019 at 11:57 PM Darrell Budic wrote: >>>>> >>>>> See inline: >>>>> >>>>>> On Apr 20, 2019, at 10:09 AM, Patrick Rennie wrote: >>>>>> >>>>>> Hi Darrell,? 
>>>>>> >>>>>> Thanks for your reply, this issue seems to be getting worse over the last few days, really has me tearing my hair out. I will do as you have suggested and get started on upgrading from 3.12.14 to 3.12.15.? >>>>>> I've checked the zfs properties and all bricks have "xattr=sa" set, but none of them has "acltype=posixacl" set, currently the acltype property shows "off", if I make these changes will it apply retroactively to the existing data? I'm unfamiliar with what this will change so I may need to look into that before I proceed.? >>>>> >>>>> >>>>> It is safe to apply that now, any new set/get calls will then use it if new posixacls exist, and use older if not. ZFS is good that way. It should clear up your posix_acl and posix errors over time. >>>>> >>>>>> I understand performance is going to slow down as the bricks get full, I am currently trying to free space and migrate data to some newer storage, I have fresh several hundred TB storage I just setup recently but with these performance issues it's really slow. I also believe there is significant data which has been deleted directly from the bricks in the past, so if I can reclaim this space in a safe manner then I will have at least around 10-15% free space.? >>>>> >>>>> >>>>> Full ZFS volumes will have a much larger impact on performance than you?d think, I?d prioritize this. If you have been taking zfs snapshots, consider deleting them to get the overall volume free space back up. And just to be sure it?s been said, delete from within the mounted volumes, don?t delete directly from the bricks (gluster will just try and heal it later, compounding your issues). Does not apply to deleting other data from the ZFS volume if it?s not part of the brick directory, of course. >>>>> >>>>>> These servers have dual 8 core Xeon (E5-2620v4) and 512GB of RAM so generally they have plenty of resources available, currently only using around 330/512GB of memory. >>>>>> >>>>>> I will look into what your suggested settings will change, and then will probably go ahead with your recommendations, for our specs as stated above, what would you suggest for performance.io-thread-count ? >>>>> >>>>> >>>>> I run single 2630v4s on my servers, which have a smaller storage footprint than yours. I?d go with 32 for performance.io-thread-count. I?d try 4 for the shd thread settings on that gear. Your memory use sounds fine, so no worries there. >>>>> >>>>>> Our workload is nothing too extreme, we have a few VMs which write backup data to this storage nightly for our clients, our VMs don't live on this cluster, but just write to it.? >>>>> >>>>> >>>>> If they are writing compressible data, you?ll get immediate benefit by setting compression=lz4 on your ZFS volumes. It won?t help any old data, of course, but it will compress new data going forward. This is another one that?s safe to enable on the fly. >>>>> >>>>>> I've been going through all of the logs I can, below are some slightly sanitized errors I've come across, but I'm not sure what to make of them. The main error I am seeing is the first one below, across several of my bricks, but possibly only for specific folders on the cluster, I'm not 100% about that yet though.? >>>>>> >>>>>> [2019-04-20 05:56:59.512649] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default? 
[Operation not supported] >>>>>> [2019-04-20 05:59:06.084333] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default? [Operation not supported] >>>>>> [2019-04-20 05:59:43.289030] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default? [Operation not supported] >>>>>> [2019-04-20 05:59:50.582257] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default? [Operation not supported] >>>>>> [2019-04-20 06:01:42.501701] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default? [Operation not supported] >>>>>> [2019-04-20 06:01:51.665354] W [posix.c:4929:posix_getxattr] 0-gvAA01-posix: Extended attributes not supported (try remounting brick with 'user_xattr' flag) >>>>>> >>>>>> >>>>>> [2019-04-20 13:12:36.131856] E [MSGID: 113002] [posix-helpers.c:893:posix_gfid_set] 0-gvAA01-posix: gfid is null for /xxxxxxxxxxxxxxxxxxxx [Invalid argument] >>>>>> [2019-04-20 13:12:36.131959] E [MSGID: 113002] [posix.c:362:posix_lookup] 0-gvAA01-posix: buf->ia_gfid is null for /brick2/xxxxxxxxxxxxxxxxxxxx_62906_tmp [No data available] >>>>>> [2019-04-20 13:12:36.132016] E [MSGID: 115050] [server-rpc-fops.c:175:server_lookup_cbk] 0-gvAA01-server: 24274759: LOOKUP /xxxxxxxxxxxxxxxxxxxx (a7c9b4a0-b7ee-4d01-a79e-576013c8ac87/Cloud Backup_clone1.vbm_62906_tmp), client: 00-A-16217-2019/04/08-21:23:03:692424-gvAA01-client-4-0-3, error-xlator: gvAA01-posix [No data available] >>>>>> [2019-04-20 13:12:38.093719] E [MSGID: 115050] [server-rpc-fops.c:175:server_lookup_cbk] 0-gvAA01-server: 24276491: LOOKUP /xxxxxxxxxxxxxxxxxxxx (a7c9b4a0-b7ee-4d01-a79e-576013c8ac87/Cloud Backup_clone1.vbm_62906_tmp), client: 00-A-16217-2019/04/08-21:23:03:692424-gvAA01-client-4-0-3, error-xlator: gvAA01-posix [No data available] >>>>>> [2019-04-20 13:12:38.093660] E [MSGID: 113002] [posix-helpers.c:893:posix_gfid_set] 0-gvAA01-posix: gfid is null for /xxxxxxxxxxxxxxxxxxxx [Invalid argument] >>>>>> [2019-04-20 13:12:38.093696] E [MSGID: 113002] [posix.c:362:posix_lookup] 0-gvAA01-posix: buf->ia_gfid is null for /brick2/xxxxxxxxxxxxxxxxxxxx [No data available] >>>>>> >>>>> >>>>> posixacls should clear those up, as mentioned. >>>>> >>>>>> >>>>>> [2019-04-20 14:25:59.654576] E [inodelk.c:404:__inode_unlock_lock] 0-gvAA01-locks:? 
Matching lock not found for unlock 0-9223372036854775807, by 980fdbbd367f0000 on 0x7fc4f0161440 >>>>>> [2019-04-20 14:25:59.654668] E [MSGID: 115053] [server-rpc-fops.c:295:server_inodelk_cbk] 0-gvAA01-server: 6092928: INODELK /xxxxxxxxxxxxxxxxxxxx.cdr$ (25b14631-a179-4274-8243-6e272d4f2ad8), client: cb-per-worker18-53637-2019/04/19-14:25:37:927673-gvAA01-client-1-0-4, error-xlator: gvAA01-locks [Invalid argument] >>>>>> >>>>>> >>>>>> [2019-04-20 13:35:07.495495] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x247c644, Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport (tcp.gvAA01-server) >>>>>> [2019-04-20 13:35:07.495619] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/debug/io-stats.so(+0x1696a) [0x7ff4ae6f796a] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/protocol/server.so(+0x2d6e8) [0x7ff4ae2a96e8] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/protocol/server.so(+0x928d) [0x7ff4ae28528d] ) 0-: Reply submission failed >>>>>> >>>>> >>>>> Fix the posix acls and see if these clear up over time as well, I?m unclear on what the overall effect of running without the posix acls will be to total gluster health. Your biggest problem sounds like you need to free up space on the volumes and get the overall volume health back up to par and see if that doesn?t resolve the symptoms you?re seeing. >>>>> >>>>> >>>>>> >>>>>> Thank you again for your assistance. It is greatly appreciated.? >>>>>> >>>>>> - Patrick >>>>>> >>>>>> >>>>>> >>>>>> On Sat, Apr 20, 2019 at 10:50 PM Darrell Budic wrote: >>>>>>> >>>>>>> Patrick, >>>>>>> >>>>>>> I would definitely upgrade your two nodes from 3.12.14 to 3.12.15. You also mention ZFS, and that error you show makes me think you need to check to be sure you have ?xattr=sa? and ?acltype=posixacl? set on your ZFS volumes. >>>>>>> >>>>>>> You also observed your bricks are crossing the 95% full line, ZFS performance will degrade significantly the closer you get to full. In my experience, this starts somewhere between 10% and 5% free space remaining, so you?re in that realm.? >>>>>>> >>>>>>> How?s your free memory on the servers doing? Do you have your zfs arc cache limited to something less than all the RAM? It shares pretty well, but I?ve encountered situations where other things won?t try and take ram back properly if they think it?s in use, so ZFS never gets the opportunity to give it up. >>>>>>> >>>>>>> Since your volume is a disperse-replica, you might try tuning disperse.shd-max-threads, default is 1, I?d try it at 2, 4, or even more if the CPUs are beefy enough. And setting server.event-threads to 4 and client.event-threads to 8 has proven helpful in many cases. After you get upgraded to 3.12.15, enabling performance.stat-prefetch may help as well. I don?t know if it matters, but I?d also recommend resetting performance.least-prio-threads to the default of 1 (or try 2 or 4) and/or also setting performance.io-thread-count to 32 if those have beefy CPUs. >>>>>>> >>>>>>> Beyond those general ideas, more info about your hardware (CPU and RAM) and workload (VMs, direct storage for web servers or enders, etc) may net you some more ideas. Then you?re going to have to do more digging into brick logs looking for errors and/or warnings to see what?s going on. >>>>>>> >>>>>>> ? -Darrell >>>>>>> >>>>>>> >>>>>>>> On Apr 20, 2019, at 8:22 AM, Patrick Rennie wrote: >>>>>>>> >>>>>>>> Hello Gluster Users,? 
>>>>>>>> >>>>>>>> I am hoping someone can help me with resolving an ongoing issue I've been having, I'm new to mailing lists so forgive me if I have gotten anything wrong. We have noticed our performance deteriorating over the last few weeks, easily measured by trying to do an ls on one of our top-level folders, and timing it, which usually would take 2-5 seconds, and now takes up to 20 minutes, which obviously renders our cluster basically unusable. This has been intermittent in the past but is now almost constant and I am not sure how to work out the exact cause. We have noticed some errors in the brick logs, and have noticed that if we kill the right brick process, performance instantly returns back to normal, this is not always the same brick, but it indicates to me something in the brick processes or background tasks may be causing extreme latency. Due to this ability to fix it by killing the right brick process off, I think it's a specific file, or folder, or operation which may be hanging and causing the increased latency, but I am not sure how to work it out. One last thing to add is that our bricks are getting quite full (~95% full), we are trying to migrate data off to new storage but that is going slowly, not helped by this issue. I am currently trying to run a full heal as there appear to be many files needing healing, and I have all brick processes running so they have an opportunity to heal, but this means performance is very poor. It currently takes over 15-20 minutes to do an ls of one of our top-level folders, which just contains 60-80 other folders, this should take 2-5 seconds. This is all being checked by FUSE mount locally on the storage node itself, but it is the same for other clients and VMs accessing the cluster. Initially, it seemed our NFS mounts were not affected and operated at normal speed, but testing over the last day has shown that our NFS clients are also extremely slow, so it doesn't seem specific to FUSE as I first thought it might be.? >>>>>>>> >>>>>>>> I am not sure how to proceed from here, I am fairly new to gluster having inherited this setup from my predecessor and trying to keep it going. I have included some info below to try and help with diagnosis, please let me know if any further info would be helpful. I would really appreciate any advice on what I could try to work out the cause. Thank you in advance for reading this, and any suggestions you might be able to offer.? >>>>>>>> >>>>>>>> - Patrick >>>>>>>> >>>>>>>> This is an example of the main error I see in our brick logs, there have been others, I can post them when I see them again too: >>>>>>>> [2019-04-20 04:54:43.055680] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick1/ library: system.posix_acl_default? [Operation not supported] >>>>>>>> [2019-04-20 05:01:29.476313] W [posix.c:4929:posix_getxattr] 0-gvAA01-posix: Extended attributes not supported (try remounting brick with 'user_xattr' flag) >>>>>>>> >>>>>>>> Our setup consists of 2 storage nodes and an arbiter node. I have noticed our nodes are on slightly different versions, I'm not sure if this could be an issue. We have 9 bricks on each node, made up of ZFS RAIDZ2 pools - total capacity is around 560TB.? >>>>>>>> We have bonded 10gbps NICS on each node, and I have tested bandwidth with iperf and found that it's what would be expected from this config.? >>>>>>>> Individual brick performance seems ok, I've tested several bricks using dd and can write a 10GB files at 1.7GB/s.? 
>>>>>>>> >>>>>>>> # dd if=/dev/zero of=/brick1/test/test.file bs=1M count=10000 >>>>>>>> 10000+0 records in >>>>>>>> 10000+0 records out >>>>>>>> 10485760000 bytes (10 GB, 9.8 GiB) copied, 6.20303 s, 1.7 GB/s >>>>>>>> >>>>>>>> Node 1: >>>>>>>> # glusterfs --version >>>>>>>> glusterfs 3.12.15 >>>>>>>> >>>>>>>> Node 2: >>>>>>>> # glusterfs --version >>>>>>>> glusterfs 3.12.14 >>>>>>>> >>>>>>>> Arbiter: >>>>>>>> # glusterfs --version >>>>>>>> glusterfs 3.12.14 >>>>>>>> >>>>>>>> Here is our gluster volume status: >>>>>>>> >>>>>>>> # gluster volume status >>>>>>>> Status of volume: gvAA01 >>>>>>>> Gluster process? ? ? ? ? ? ? ? ? ? ? ? ? ? ?TCP Port? RDMA Port? Online? Pid >>>>>>>> ------------------------------------------------------------------------------ >>>>>>>> Brick 01-B:/brick1/gvAA01/brick? ? 49152? ? ?0? ? ? ? ? Y? ? ? ?7219 >>>>>>>> Brick 02-B:/brick1/gvAA01/brick? ? 49152? ? ?0? ? ? ? ? Y? ? ? ?21845 >>>>>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>>>>> ck1? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?49152? ? ?0? ? ? ? ? Y? ? ? ?6931 >>>>>>>> Brick 01-B:/brick2/gvAA01/brick? ? 49153? ? ?0? ? ? ? ? Y? ? ? ?7239 >>>>>>>> Brick 02-B:/brick2/gvAA01/brick? ? 49153? ? ?0? ? ? ? ? Y? ? ? ?9916 >>>>>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>>>>> ck2? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?49153? ? ?0? ? ? ? ? Y? ? ? ?6939 >>>>>>>> Brick 01-B:/brick3/gvAA01/brick? ? 49154? ? ?0? ? ? ? ? Y? ? ? ?7235 >>>>>>>> Brick 02-B:/brick3/gvAA01/brick? ? 49154? ? ?0? ? ? ? ? Y? ? ? ?21858 >>>>>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>>>>> ck3? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?49154? ? ?0? ? ? ? ? Y? ? ? ?6947 >>>>>>>> Brick 01-B:/brick4/gvAA01/brick? ? 49155? ? ?0? ? ? ? ? Y? ? ? ?31840 >>>>>>>> Brick 02-B:/brick4/gvAA01/brick? ? 49155? ? ?0? ? ? ? ? Y? ? ? ?9933 >>>>>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>>>>> ck4? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?49155? ? ?0? ? ? ? ? Y? ? ? ?6956 >>>>>>>> Brick 01-B:/brick5/gvAA01/brick? ? 49156? ? ?0? ? ? ? ? Y? ? ? ?7233 >>>>>>>> Brick 02-B:/brick5/gvAA01/brick? ? 49156? ? ?0? ? ? ? ? Y? ? ? ?9942 >>>>>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>>>>> ck5? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?49156? ? ?0? ? ? ? ? Y? ? ? ?6964 >>>>>>>> Brick 01-B:/brick6/gvAA01/brick? ? 49157? ? ?0? ? ? ? ? Y? ? ? ?7234 >>>>>>>> Brick 02-B:/brick6/gvAA01/brick? ? 49157? ? ?0? ? ? ? ? Y? ? ? ?9952 >>>>>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>>>>> ck6? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?49157? ? ?0? ? ? ? ? Y? ? ? ?6974 >>>>>>>> Brick 01-B:/brick7/gvAA01/brick? ? 49158? ? ?0? ? ? ? ? Y? ? ? ?7248 >>>>>>>> Brick 02-B:/brick7/gvAA01/brick? ? 49158? ? ?0? ? ? ? ? Y? ? ? ?9960 >>>>>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>>>>> ck7? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?49158? ? ?0? ? ? ? ? Y? ? ? ?6984 >>>>>>>> Brick 01-B:/brick8/gvAA01/brick? ? 49159? ? ?0? ? ? ? ? Y? ? ? ?7253 >>>>>>>> Brick 02-B:/brick8/gvAA01/brick? ? 49159? ? ?0? ? ? ? ? Y? ? ? ?9970 >>>>>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>>>>> ck8? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?49159? ? ?0? ? ? ? ? Y? ? ? ?6993 >>>>>>>> Brick 01-B:/brick9/gvAA01/brick? ? 49160? ? ?0? ? ? ? ? Y? ? ? ?7245 >>>>>>>> Brick 02-B:/brick9/gvAA01/brick? ? 49160? ? ?0? ? ? ? ? Y? ? ? ?9984 >>>>>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>>>>> ck9? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?49160? ? ?0? ? ? ? ? Y? ? ? ?7001 >>>>>>>> NFS Server on localhost? ? ? ? ? ? ? ? ? ? ?2049? ? ? 0? ? ? ? ? Y? ? ? ?17276 >>>>>>>> Self-heal Daemon on localhost? ? ? ? ? ? ? ?N/A? ? ? ?N/A? ? ? ? Y? ? ? 
?25245 >>>>>>>> NFS Server on 02-B? ? ? ? ? ? ? ? ?2049? ? ? 0? ? ? ? ? Y? ? ? ?9089 >>>>>>>> Self-heal Daemon on 02-B? ? ? ? ? ?N/A? ? ? ?N/A? ? ? ? Y? ? ? ?17838 >>>>>>>> NFS Server on 00-a? ? ? ? ? ? ? ? ?2049? ? ? 0? ? ? ? ? Y? ? ? ?15660 >>>>>>>> Self-heal Daemon on 00-a? ? ? ? ? ?N/A? ? ? ?N/A? ? ? ? Y? ? ? ?16218 >>>>>>>> >>>>>>>> Task Status of Volume gvAA01 >>>>>>>> ------------------------------------------------------------------------------ >>>>>>>> There are no active volume tasks >>>>>>>> >>>>>>>> And gluster volume info:? >>>>>>>> >>>>>>>> # gluster volume info >>>>>>>> >>>>>>>> Volume Name: gvAA01 >>>>>>>> Type: Distributed-Replicate >>>>>>>> Volume ID: ca4ece2c-13fe-414b-856c-2878196d6118 >>>>>>>> Status: Started >>>>>>>> Snapshot Count: 0 >>>>>>>> Number of Bricks: 9 x (2 + 1) = 27 >>>>>>>> Transport-type: tcp >>>>>>>> Bricks: >>>>>>>> Brick1: 01-B:/brick1/gvAA01/brick >>>>>>>> Brick2: 02-B:/brick1/gvAA01/brick >>>>>>>> Brick3: 00-A:/arbiterAA01/gvAA01/brick1 (arbiter) >>>>>>>> Brick4: 01-B:/brick2/gvAA01/brick >>>>>>>> Brick5: 02-B:/brick2/gvAA01/brick >>>>>>>> Brick6: 00-A:/arbiterAA01/gvAA01/brick2 (arbiter) >>>>>>>> Brick7: 01-B:/brick3/gvAA01/brick >>>>>>>> Brick8: 02-B:/brick3/gvAA01/brick >>>>>>>> Brick9: 00-A:/arbiterAA01/gvAA01/brick3 (arbiter) >>>>>>>> Brick10: 01-B:/brick4/gvAA01/brick >>>>>>>> Brick11: 02-B:/brick4/gvAA01/brick >>>>>>>> Brick12: 00-A:/arbiterAA01/gvAA01/brick4 (arbiter) >>>>>>>> Brick13: 01-B:/brick5/gvAA01/brick >>>>>>>> Brick14: 02-B:/brick5/gvAA01/brick >>>>>>>> Brick15: 00-A:/arbiterAA01/gvAA01/brick5 (arbiter) >>>>>>>> Brick16: 01-B:/brick6/gvAA01/brick >>>>>>>> Brick17: 02-B:/brick6/gvAA01/brick >>>>>>>> Brick18: 00-A:/arbiterAA01/gvAA01/brick6 (arbiter) >>>>>>>> Brick19: 01-B:/brick7/gvAA01/brick >>>>>>>> Brick20: 02-B:/brick7/gvAA01/brick >>>>>>>> Brick21: 00-A:/arbiterAA01/gvAA01/brick7 (arbiter) >>>>>>>> Brick22: 01-B:/brick8/gvAA01/brick >>>>>>>> Brick23: 02-B:/brick8/gvAA01/brick >>>>>>>> Brick24: 00-A:/arbiterAA01/gvAA01/brick8 (arbiter) >>>>>>>> Brick25: 01-B:/brick9/gvAA01/brick >>>>>>>> Brick26: 02-B:/brick9/gvAA01/brick >>>>>>>> Brick27: 00-A:/arbiterAA01/gvAA01/brick9 (arbiter) >>>>>>>> Options Reconfigured: >>>>>>>> cluster.shd-max-threads: 4 >>>>>>>> performance.least-prio-threads: 16 >>>>>>>> cluster.readdir-optimize: on >>>>>>>> performance.quick-read: off >>>>>>>> performance.stat-prefetch: off >>>>>>>> cluster.data-self-heal: on >>>>>>>> cluster.lookup-unhashed: auto >>>>>>>> cluster.lookup-optimize: on >>>>>>>> cluster.favorite-child-policy: mtime >>>>>>>> server.allow-insecure: on >>>>>>>> transport.address-family: inet >>>>>>>> client.bind-insecure: on >>>>>>>> cluster.entry-self-heal: off >>>>>>>> cluster.metadata-self-heal: off >>>>>>>> performance.md-cache-timeout: 600 >>>>>>>> cluster.self-heal-daemon: enable >>>>>>>> performance.readdir-ahead: on >>>>>>>> diagnostics.brick-log-level: INFO >>>>>>>> nfs.disable: off >>>>>>>> >>>>>>>> Thank you for any assistance.? >>>>>>>> >>>>>>>> - Patrick >>>>>>>> _______________________________________________ >>>>>>>> Gluster-users mailing list >>>>>>>> Gluster-users at gluster.org >>>>>>>> https://lists.gluster.org/mailman/listinfo/gluster-users >>>>>>> >>>>>>> >>>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From patrickmrennie at gmail.com Sun Apr 21 14:24:34 2019 From: patrickmrennie at gmail.com (Patrick Rennie) Date: Sun, 21 Apr 2019 22:24:34 +0800 Subject: [Gluster-users] Extremely slow cluster performance In-Reply-To: References: Message-ID: Hi Strahil, Thank you for your reply and your suggestions. I'm not sure which logs would be most relevant to be checking to diagnose this issue, we have the brick logs, the cluster mount logs, the shd logs or something else? I have posted a few that I have seen repeated a few times already. I will continue to post anything further that I see. I am working on migrating data to some new storage, so this will slowly free up space, although this is a production cluster and new data is being uploaded every day, sometimes faster than I can migrate it off. I have several other similar clusters and none of them have the same problem, one the others is actually at 98-99% right now (big problem, I know) but still performs perfectly fine compared to this cluster, I am not sure low space is the root cause here. I currently have 13 VMs accessing this cluster, I have checked each one and all of them use one of the two options below to mount the cluster in fstab HOSTNAME:/gvAA01 /mountpoint glusterfs defaults,_netdev,rw,log-level=WARNING,direct-io-mode=disable,use-readdirp=no 0 0 HOSTNAME:/gvAA01 /mountpoint glusterfs defaults,_netdev,rw,log-level=WARNING,direct-io-mode=disable I also have a few other VMs which use NFS to access the cluster, and these machines appear to be significantly quicker, initially I get a similar delay with NFS but if I cancel the first "ls" and try it again I get < 1 sec lookups, this can take over 10 minutes by FUSE/gluster client, but the same trick of cancelling and trying again doesn't work for FUSE/gluster. Sometimes the NFS queries have no delay at all, so this is a bit strange to me. HOSTNAME:/gvAA01 /mountpoint/ nfs defaults,_netdev,vers=3,async,noatime 0 0 Example: user at VM:~$ time ls /cluster/folder ^C real 9m49.383s user 0m0.001s sys 0m0.010s user at VM:~$ time ls /cluster/folder real 0m0.069s user 0m0.001s sys 0m0.007s --- I have checked the profiling as you suggested, I let it run for around a minute, then cancelled it and saved the profile info. root at HOSTNAME:/var/log/glusterfs# gluster volume profile gvAA01 start Starting volume profile on gvAA01 has been successful root at HOSTNAME:/var/log/glusterfs# time ls /cluster/folder ^C real 1m1.660s user 0m0.000s sys 0m0.002s root at HOSTNAME:/var/log/glusterfs# gluster volume profile gvAA01 info >> ~/profile.txt root at HOSTNAME:/var/log/glusterfs# gluster volume profile gvAA01 stop I will attach the results to this email as it's over 1000 lines. Unfortunately, I'm not sure what I'm looking at but possibly somebody will be able to help me make sense of it and let me know if it highlights any specific issues. Happy to try any further suggestions. Thank you, -Patrick On Sun, Apr 21, 2019 at 7:55 PM Strahil wrote: > By the way, can you provide the 'volume info' and the mount options on all > clients? > Maybe , there is an option that uses a lot of resources due to some > client's mount options. > > Best Regards, > Strahil Nikolov > On Apr 21, 2019 10:55, Patrick Rennie wrote: > > Just another small update, I'm continuing to watch my brick logs and I > just saw these errors come up in the recent events too. I am going to > continue to post any errors I see in the hope of finding the right one to > try and fix.. 
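(In case it helps anyone following along, the same class of errors can be pulled out of every brick log at once with something along these lines - the path is the stock packaged log location, so adjust if yours differs:)

# grep '] E \[' /var/log/glusterfs/bricks/*.log | tail -n 50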
> This is from the logs on brick1, seems to be occurring on both nodes on > brick1, although at different times. I'm not sure what this means, can > anyone shed any light? > I guess I am looking for some kind of specific error which may indicate > something is broken or stuck and locking up and causing the extreme latency > I'm seeing in the cluster. > > [2019-04-21 07:25:55.064497] E [rpcsvc.c:1364:rpcsvc_submit_generic] > 0-rpc-service: failed to submit message (XID: 0x7c700c, Program: GlusterFS > 3.3, ProgVers: 330, Proc: 29) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.064612] E [server.c:195:server_submit_reply] > (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e58a) > [0x7f3b3e93158a] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17d45) > [0x7f3b3e4c5d45] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) > [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.064675] E [rpcsvc.c:1364:rpcsvc_submit_generic] > 0-rpc-service: failed to submit message (XID: 0x7c70af, Program: GlusterFS > 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.064705] E [server.c:195:server_submit_reply] > (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) > [0x7f3b3e9318fa] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) > [0x7f3b3e4c5f35] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) > [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.064742] E [rpcsvc.c:1364:rpcsvc_submit_generic] > 0-rpc-service: failed to submit message (XID: 0x7c723c, Program: GlusterFS > 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.064768] E [server.c:195:server_submit_reply] > (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) > [0x7f3b3e9318fa] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) > [0x7f3b3e4c5f35] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) > [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.064812] E [rpcsvc.c:1364:rpcsvc_submit_generic] > 0-rpc-service: failed to submit message (XID: 0x7c72b4, Program: GlusterFS > 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.064837] E [server.c:195:server_submit_reply] > (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) > [0x7f3b3e9318fa] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) > [0x7f3b3e4c5f35] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) > [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.064880] E [rpcsvc.c:1364:rpcsvc_submit_generic] > 0-rpc-service: failed to submit message (XID: 0x7c740b, Program: GlusterFS > 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.064905] E [server.c:195:server_submit_reply] > (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) > [0x7f3b3e9318fa] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) > [0x7f3b3e4c5f35] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) > [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.064939] E 
[rpcsvc.c:1364:rpcsvc_submit_generic] > 0-rpc-service: failed to submit message (XID: 0x7c7441, Program: GlusterFS > 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.064962] E [server.c:195:server_submit_reply] > (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) > [0x7f3b3e9318fa] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) > [0x7f3b3e4c5f35] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) > [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.064996] E [rpcsvc.c:1364:rpcsvc_submit_generic] > 0-rpc-service: failed to submit message (XID: 0x7c74d5, Program: GlusterFS > 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.065020] E [server.c:195:server_submit_reply] > (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) > [0x7f3b3e9318fa] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) > [0x7f3b3e4c5f35] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) > [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.065052] E [rpcsvc.c:1364:rpcsvc_submit_generic] > 0-rpc-service: failed to submit message (XID: 0x7c7551, Program: GlusterFS > 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.065076] E [server.c:195:server_submit_reply] > (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) > [0x7f3b3e9318fa] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) > [0x7f3b3e4c5f35] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) > [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.065110] E [rpcsvc.c:1364:rpcsvc_submit_generic] > 0-rpc-service: failed to submit message (XID: 0x7c76d1, Program: GlusterFS > 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.065133] E [server.c:195:server_submit_reply] > (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) > [0x7f3b3e9318fa] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) > [0x7f3b3e4c5f35] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) > [0x7f3b3e4b72cd] ) 0-: Reply submission failed > > Thanks again, > > -Patrick > > On Sun, Apr 21, 2019 at 3:50 PM Patrick Rennie > wrote: > > Hi Darrell, > > Thanks again for your advice, I've left it for a while but unfortunately > it's still just as slow and causing more problems for our operations now. I > will need to try and take some steps to at least bring performance back to > normal while continuing to investigate the issue longer term. I can > definitely see one node with heavier CPU than the other, almost double, > which I am OK with, but I think the heal process is going to take forever, > trying to check the "gluster volume heal info" shows thousands and > thousands of files which may need healing, I have no idea how many in total > the command is still running after hours, so I am not sure what has gone so > wrong to cause this. > > I've checked cluster.op-version and cluster.max-op-version and it looks > like I'm on the latest version there. 
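(For reference, a minimal sketch of those two checks - the volume name is the one from this thread, and the value in the last command is a placeholder that is only needed if op-version is actually below the maximum:)

# gluster volume heal gvAA01 statistics heal-count
# gluster volume get all all | grep op-version
# gluster volume set all cluster.op-version <max-op-version>

The heal-count form reports a per-brick count of pending entries without trying to list them all, which is much quicker than waiting for the full "heal info" output on a backlog this size.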
> > I have no idea how long the healing is going to take on this cluster, we > have around 560TB of data on here, but I don't think I can wait that long > to try and restore performance to normal. > > Can anyone think of anything else I can try in the meantime to work out > what's causing the extreme latency? > > I've been going through the cluster client logs of some of our VMs and on > some of our FTP servers I found this in the cluster mount log, but I am not > seeing it on any of our other servers, just our FTP servers. > > [2019-04-21 07:16:19.925388] E [MSGID: 101046] > [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null > [2019-04-21 07:19:43.413834] W [MSGID: 114031] > [client-rpc-fops.c:2203:client3_3_setattr_cbk] 0-gvAA01-client-19: remote > operation failed [No such file or directory] > [2019-04-21 07:19:43.414153] W [MSGID: 114031] > [client-rpc-fops.c:2203:client3_3_setattr_cbk] 0-gvAA01-client-20: remote > operation failed [No such file or directory] > [2019-04-21 07:23:33.154717] E [MSGID: 101046] > [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null > [2019-04-21 07:33:24.943913] E [MSGID: 101046] > [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null > > Any ideas what this could mean? I am basically just grasping at straws > here. > > I am going to hold off on the version upgrade until I know there are no > files which need healing, which could be a while, from some reading I've > done there shouldn't be any issues with this as both are on v3.12.x > > I've freed up a small amount of space, but I still need to work on this > further. > > I've read of a command "find .glusterfs -type f -links -2 -exec rm {} \;" > which could be run on each brick and it would potentially clean up any > files which were deleted straight from the bricks, but not via the client, > I have a feeling this could help me free up about 5-10TB per brick from > what I've been told about the history of this cluster. Can anyone confirm > if this is actually safe to run? > > At this stage, I'm open to any suggestions as to how to proceed, thanks > again for any advice. > > Cheers, > > - Patrick > > On Sun, Apr 21, 2019 at 1:22 AM Darrell Budic > wrote: > > Patrick, > > Sounds like progress. Be aware that gluster is expected to max out the > CPUs on at least one of your servers while healing. This is normal and > won't adversely affect overall performance (any more than having bricks in > need of healing, at any rate) unless you're overdoing it. shd threads <= 4 > should not do that on your hardware. Other tunings may have also increased > overall performance, so you may see higher CPU than previously anyway. I'd > recommend upping those thread counts and letting it heal as fast as > possible, especially if these are dedicated Gluster storage servers (Ie: > not also running VMs, etc). You should see 'normal' CPU use once heals are > completed. I see ~15-30% overall normally, 95-98% while healing (x my 20 > cores). It's also likely to be different between your servers, in a pure > replica, one tends to max and one tends to be a little higher, in a > distributed-replica, I'd expect more than one to run harder while healing. > > Keep the differences between doing an ls on a brick and doing an ls on a > gluster mount in mind. When you do a ls on a gluster volume, it isn't just > doing a ls on one brick, it's effectively doing it on ALL of your bricks, > and they all have to return data before the ls succeeds.
In a distributed > volume, it's figuring out where on each volume things live and getting the > stat() from each to assemble the whole thing. And if things are in need of > healing, it will take even longer to decide which version is current and > use it (shd triggers a heal anytime it encounters this). Any of these > things being slow slows down the overall response. > > At this point, I'd get some sleep too, and let your cluster heal while you > do. I'd really want it fully healed before I did any updates anyway, so let > it use CPU and get itself sorted out. Expect it to do a round of healing > after you upgrade each machine too, this is normal so don't let the CPU > spike surprise you, It's just catching up from the downtime incurred by the > update and/or reboot if you did one. > > That reminds me, check your gluster cluster.op-version and > cluster.max-op-version (gluster vol get all all | grep op-version). If > op-version isn't at the max-op-version, set it to it so you're taking > advantage of the latest features available to your version. > > -Darrell > > On Apr 20, 2019, at 11:54 AM, Patrick Rennie > wrote: > > Hi Darrell, > > Thanks again for your advice, I've applied the acltype=posixacl on my > zpools and I think that has reduced some of the noise from my brick logs. > I also bumped up some of the thread counts you suggested but my CPU load > skyrocketed, so I dropped it back down to something slightly lower, but > still higher than it was before, and will see how that goes for a while. > > Although low space is a definite issue, if I run an ls anywhere on my > bricks directly it's instant, <1 second, and still takes several minutes > via gluster, so there is still a problem in my gluster configuration > somewhere. We don't have any snapshots, but I am trying to work out if any > data on there is safe to delete, or if there is any way I can safely find > and delete data which has been removed directly from the bricks in the > past. I also have lz4 compression already enabled on each zpool which does > help a bit, we get between 1.05 and 1.08x compression on this data. > I've tried to go through each client and checked its cluster mount logs > and also my brick logs and looking for errors, so far nothing is jumping > out at me, but there are some warnings and errors here and there, I am > trying to work out what they mean. > > It's already 1 am here and unfortunately, I'm still awake working on this > issue, but I think that I will have to leave the version upgrades until > tomorrow. > > Thanks again for your advice so far. If anyone has any ideas on where I > can look for errors other than brick logs or the cluster mount logs to help > resolve this issue, it would be much appreciated. > > Cheers, > > - Patrick > > On Sat, Apr 20, 2019 at 11:57 PM Darrell Budic > wrote: > > See inline: > > On Apr 20, 2019, at 10:09 AM, Patrick Rennie > wrote: > > Hi Darrell, > > Thanks for your reply, this issue seems to be getting worse over the last > few days, really has me tearing my hair out. I will do as you have > suggested and get started on upgrading from 3.12.14 to 3.12.15. > I've checked the zfs properties and all bricks have "xattr=sa" set, but > none of them has "acltype=posixacl" set, currently the acltype property > shows "off", if I make these changes will it apply retroactively to the > existing data? I'm unfamiliar with what this will change so I may need to > look into that before I proceed.
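(A minimal sketch of that change, run per brick dataset - the pool/dataset name below is a placeholder for whatever backs each brick; the properties apply to xattrs and ACLs written after the change, existing on-disk data is not rewritten:)

# zfs set xattr=sa pool/brick1
# zfs set acltype=posixacl pool/brick1
# zfs get xattr,acltype pool/brick1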
> > > It is safe to apply that now, any new set/get calls will then use it if > new posixacls exist, and use older if not. ZFS is good that way. It should > clear up your posix_acl and posix errors over time. > > I understand performance is going to slow down as the bricks get full, I > am currently trying to free space and migrate data to some newer storage, I > have fresh several hundred TB storage I just setup recently but with these > performance issues it's really slow. I also believe there is significant > data which has been deleted directly from the bricks in the past, so if I > can reclaim this space in a safe manner then I will have at least around > 10-15% free space. > > > Full ZFS volumes will have a much larger impact on performance than you?d > think, I?d prioritize this. If you have been taking zfs snapshots, consider > deleting them to get the overall volume free space back up. And just to be > sure it?s been said, delete from within the mounted volumes, don?t delete > directly from the bricks (gluster will just try and heal it later, > compounding your issues). Does not apply to deleting other data from the > ZFS volume if it?s not part of the brick directory, of course. > > These servers have dual 8 core Xeon (E5-2620v4) and 512GB of RAM so > generally they have plenty of resources available, currently only using > around 330/512GB of memory. > > I will look into what your suggested settings will change, and then will > probably go ahead with your recommendations, for our specs as stated above, > what would you suggest for performance.io-thread-count ? > > > I run single 2630v4s on my servers, which have a smaller storage footprint > than yours. I?d go with 32 for performance.io-thread-count. I?d try 4 for > the shd thread settings on that gear. Your memory use sounds fine, so no > worries there. > > Our workload is nothing too extreme, we have a few VMs which write backup > data to this storage nightly for our clients, our VMs don't live on this > cluster, but just write to it. > > > If they are writing compressible data, you?ll get immediate benefit by > setting compression=lz4 on your ZFS volumes. It won?t help any old data, of > course, but it will compress new data going forward. This is another one > that?s safe to enable on the fly. > > I've been going through all of the logs I can, below are some slightly > sanitized errors I've come across, but I'm not sure what to make of them. > The main error I am seeing is the first one below, across several of my > bricks, but possibly only for specific folders on the cluster, I'm not 100% > about that yet though. 
> > [2019-04-20 05:56:59.512649] E [MSGID: 113001] > [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on > /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not > supported] > [2019-04-20 05:59:06.084333] E [MSGID: 113001] > [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on > /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not > supported] > [2019-04-20 05:59:43.289030] E [MSGID: 113001] > [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on > /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not > supported] > [2019-04-20 05:59:50.582257] E [MSGID: 113001] > [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on > /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not > supported] > [2019-04-20 06:01:42.501701] E [MSGID: 113001] > [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on > /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not > supported] > [2019-04-20 06:01:51.665354] W [posix.c:4929:posix_getxattr] > 0-gvAA01-posix: Extended attributes not supported (try remounting brick > with 'user_xattr' flag) > > > [2019-04-20 13:12:36.131856] E [MSGID: 113002] > [posix-helpers.c:893:posix_gfid_set] 0-gvAA01-posix: gfid is null for > /xxxxxxxxxxxxxxxxxxxx [Invalid argument] > [2019-04-20 13:12:36.131959] E [MSGID: 113002] [posix.c:362:posix_lookup] > 0-gvAA01-posix: buf->ia_gfid is null for > /brick2/xxxxxxxxxxxxxxxxxxxx_62906_tmp [No data available] > [2019-04-20 13:12:36.132016] E [MSGID: 115050] > [server-rpc-fops.c:175:server_lookup_cbk] 0-gvAA01-server: 24274759: LOOKUP > /xxxxxxxxxxxxxxxxxxxx (a7c9b4a0-b7ee-4d01-a79e-576013c8ac87/Cloud > Backup_clone1.vbm_62906_tmp), client: > 00-A-16217-2019/04/08-21:23:03:692424-gvAA01-client-4-0-3, error-xlator: > gvAA01-posix [No data available] > [2019-04-20 13:12:38.093719] E [MSGID: 115050] > [server-rpc-fops.c:175:server_lookup_cbk] 0-gvAA01-server: 24276491: LOOKUP > /xxxxxxxxxxxxxxxxxxxx (a7c9b4a0-b7ee-4d01-a79e-576013c8ac87/Cloud > Backup_clone1.vbm_62906_tmp), client: > 00-A-16217-2019/04/08-21:23:03:692424-gvAA01-client-4-0-3, error-xlator: > gvAA01-posix [No data available] > [2019-04-20 13:12:38.093660] E [MSGID: 113002] > [posix-helpers.c:893:posix_gfid_set] 0-gvAA01-posix: gfid is null for > /xxxxxxxxxxxxxxxxxxxx [Invalid argument] > [2019-04-20 13:12:38.093696] E [MSGID: 113002] [posix.c:362:posix_lookup] > 0-gvAA01-posix: buf->ia_gfid is null for /brick2/xxxxxxxxxxxxxxxxxxxx [No > data available] > > > posixacls should clear those up, as mentioned. 
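(If it helps to confirm what a brick is actually storing, the extended attributes on a brick path can be inspected directly - the path below is only an example under one of the bricks listed later in this thread:)

# getfattr -d -m . -e hex /brick7/gvAA01/brick/<some-path>

A file that shows no trusted.gfid here - typically one copied onto the brick directly rather than through a client mount - would line up with the "gfid is null" lookups above.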
> > > [2019-04-20 14:25:59.654576] E [inodelk.c:404:__inode_unlock_lock] > 0-gvAA01-locks: Matching lock not found for unlock 0-9223372036854775807, > by 980fdbbd367f0000 on 0x7fc4f0161440 > [2019-04-20 14:25:59.654668] E [MSGID: 115053] > [server-rpc-fops.c:295:server_inodelk_cbk] 0-gvAA01-server: 6092928: > INODELK /xxxxxxxxxxxxxxxxxxxx.cdr$ (25b14631-a179-4274-8243-6e272d4f2ad8), > client: > cb-per-worker18-53637-2019/04/19-14:25:37:927673-gvAA01-client-1-0-4, > error-xlator: gvAA01-locks [Invalid argument] > > > [2019-04-20 13:35:07.495495] E [rpcsvc.c:1364:rpcsvc_submit_generic] > 0-rpc-service: failed to submit message (XID: 0x247c644, Program: GlusterFS > 3.3, ProgVers: 330, Proc: 27) to rpc-transport (tcp.gvAA01-server) > [2019-04-20 13:35:07.495619] E [server.c:195:server_submit_reply] > (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/debug/io-stats.so(+0x1696a) > [0x7ff4ae6f796a] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/protocol/server.so(+0x2d6e8) > [0x7ff4ae2a96e8] > -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/protocol/server.so(+0x928d) > [0x7ff4ae28528d] ) 0-: Reply submission failed > > > Fix the posix acls and see if these clear up over time as well, I?m > unclear on what the overall effect of running without the posix acls will > be to total gluster health. Your biggest problem sounds like you need to > free up space on the volumes and get the overall volume health back up to > par and see if that doesn?t resolve the symptoms you?re seeing. > > > > Thank you again for your assistance. It is greatly appreciated. > > - Patrick > > > > On Sat, Apr 20, 2019 at 10:50 PM Darrell Budic > wrote: > > Patrick, > > I would definitely upgrade your two nodes from 3.12.14 to 3.12.15. You > also mention ZFS, and that error you show makes me think you need to check > to be sure you have ?xattr=sa? and ?acltype=posixacl? set on your ZFS > volumes. > > You also observed your bricks are crossing the 95% full line, ZFS > performance will degrade significantly the closer you get to full. In my > experience, this starts somewhere between 10% and 5% free space remaining, > so you?re in that realm. > > How?s your free memory on the servers doing? Do you have your zfs arc > cache limited to something less than all the RAM? It shares pretty well, > but I?ve encountered situations where other things won?t try and take ram > back properly if they think it?s in use, so ZFS never gets the opportunity > to give it up. > > Since your volume is a disperse-replica, you might try tuning > disperse.shd-max-threads, default is 1, I?d try it at 2, 4, or even more if > the CPUs are beefy enough. And setting server.event-threads to 4 and > client.event-threads to 8 has proven helpful in many cases. After you get > upgraded to 3.12.15, enabling performance.stat-prefetch may help as well. I > don?t know if it matters, but I?d also recommend resetting > performance.least-prio-threads to the default of 1 (or try 2 or 4) and/or > also setting performance.io-thread-count to 32 if those have beefy CPUs. > > Beyond those general ideas, more info about your hardware (CPU and RAM) > and workload (VMs, direct storage for web servers or enders, etc) may net > you some more ideas. Then you?re going to have to do more digging into > brick logs looking for errors and/or warnings to see what?s going on. 
> > -Darrell > > > On Apr 20, 2019, at 8:22 AM, Patrick Rennie > wrote: > > Hello Gluster Users, > > I am hoping someone can help me with resolving an ongoing issue I've been > having, I'm new to mailing lists so forgive me if I have gotten anything > wrong. We have noticed our performance deteriorating over the last few > weeks, easily measured by trying to do an ls on one of our top-level > folders, and timing it, which usually would take 2-5 seconds, and now takes > up to 20 minutes, which obviously renders our cluster basically unusable. > This has been intermittent in the past but is now almost constant and I am > not sure how to work out the exact cause. We have noticed some errors in > the brick logs, and have noticed that if we kill the right brick process, > performance instantly returns back to normal, this is not always the same > brick, but it indicates to me something in the brick processes or > background tasks may be causing extreme latency. Due to this ability to fix > it by killing the right brick process off, I think it's a specific file, or > folder, or operation which may be hanging and causing the increased > latency, but I am not sure how to work it out. One last thing to add is > that our bricks are getting quite full (~95% full), we are trying to > migrate data off to new storage but that is going slowly, not helped by > this issue. I am currently trying to run a full heal as there appear to be > many files needing healing, and I have all brick processes running so they > have an opportunity to heal, but this means performance is very poor. It > currently takes over 15-20 minutes to do an ls of one of our top-level > folders, which just contains 60-80 other folders, this should take 2-5 > seconds. This is all being checked by FUSE mount locally on the storage > node itself, but it is the same for other clients and VMs accessing the > cluster. Initially, it seemed our NFS mounts were not affected and operated > at normal speed, but testing over the last day has shown that our NFS > clients are also extremely slow, so it doesn't seem specific to FUSE as I > first thought it might be. > > I am not sure how to proceed from here, I am fairly new to gluster having > inherited this setup from my predecessor and trying to keep it going. I > have included some info below to try and help with diagnosis, please let me > know if any further info would be helpful. I would really appreciate any > advice on what I could try to work out the cause. Thank you in advance for > reading this, and any suggestions you might be able to offer. > > - Patrick > > This is an example of the main error I see in our brick logs, there have > been others, I can post them when I see them again too: > [2019-04-20 04:54:43.055680] E [MSGID: 113001] > [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on > /brick1/ library: system.posix_acl_default [Operation not > supported] > [2019-04-20 05:01:29.476313] W [posix.c:4929:posix_getxattr] > 0-gvAA01-posix: Extended attributes not supported (try remounting brick > with 'user_xattr' flag) > > Our setup consists of 2 storage nodes and an arbiter node. I have noticed > our nodes are on slightly different versions, I'm not sure if this could be > an issue. We have 9 bricks on each node, made up of ZFS RAIDZ2 pools - > total capacity is around 560TB. > We have bonded 10gbps NICS on each node, and I have tested bandwidth with > iperf and found that it's what would be expected from this config. 
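(That check is normally just the stock iperf pair, roughly as below - hostnames are placeholders:)

on one node:    # iperf -s
on the other:   # iperf -c <peer-hostname> -P 4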
> Individual brick performance seems ok, I've tested several bricks using dd > and can write a 10GB files at 1.7GB/s. > > # dd if=/dev/zero of=/brick1/test/test.file bs=1M count=10000 > 10000+0 records in > 10000+0 records out > 10485760000 bytes (10 GB, 9.8 GiB) copied, 6.20303 s, 1.7 GB/s > > Node 1: > # glusterfs --version > glusterfs 3.12.15 > > Node 2: > # glusterfs --version > glusterfs 3.12.14 > > Arbiter: > # glusterfs --version > glusterfs 3.12.14 > > Here is our gluster volume status: > > # gluster volume status > Status of volume: gvAA01 > Gluster process TCP Port RDMA Port Online > Pid > > ------------------------------------------------------------------------------ > Brick 01-B:/brick1/gvAA01/brick 49152 0 Y 7219 > Brick 02-B:/brick1/gvAA01/brick 49152 0 Y 21845 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck1 49152 0 Y > 6931 > Brick 01-B:/brick2/gvAA01/brick 49153 0 Y 7239 > Brick 02-B:/brick2/gvAA01/brick 49153 0 Y 9916 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck2 49153 0 Y > 6939 > Brick 01-B:/brick3/gvAA01/brick 49154 0 Y 7235 > Brick 02-B:/brick3/gvAA01/brick 49154 0 Y 21858 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck3 49154 0 Y > 6947 > Brick 01-B:/brick4/gvAA01/brick 49155 0 Y 31840 > Brick 02-B:/brick4/gvAA01/brick 49155 0 Y 9933 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck4 49155 0 Y > 6956 > Brick 01-B:/brick5/gvAA01/brick 49156 0 Y 7233 > Brick 02-B:/brick5/gvAA01/brick 49156 0 Y 9942 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck5 49156 0 Y > 6964 > Brick 01-B:/brick6/gvAA01/brick 49157 0 Y 7234 > Brick 02-B:/brick6/gvAA01/brick 49157 0 Y 9952 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck6 49157 0 Y > 6974 > Brick 01-B:/brick7/gvAA01/brick 49158 0 Y 7248 > Brick 02-B:/brick7/gvAA01/brick 49158 0 Y 9960 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck7 49158 0 Y > 6984 > Brick 01-B:/brick8/gvAA01/brick 49159 0 Y 7253 > Brick 02-B:/brick8/gvAA01/brick 49159 0 Y 9970 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck8 49159 0 Y > 6993 > Brick 01-B:/brick9/gvAA01/brick 49160 0 Y 7245 > Brick 02-B:/brick9/gvAA01/brick 49160 0 Y 9984 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck9 49160 0 Y > 7001 > NFS Server on localhost 2049 0 Y > 17276 > Self-heal Daemon on localhost N/A N/A Y > 25245 > NFS Server on 02-B 2049 0 Y 9089 > Self-heal Daemon on 02-B N/A N/A Y 17838 > NFS Server on 00-a 2049 0 Y 15660 > Self-heal Daemon on 00-a N/A N/A Y 16218 > > Task Status of Volume gvAA01 > > ------------------------------------------------------------------------------ > There are no active volume tasks > > And gluster volume info: > > # gluster volume info > > Volume Name: gvAA01 > Type: Distributed-Replicate > Volume ID: ca4ece2c-13fe-414b-856c-2878196d6118 > Status: Started > Snapshot Count: 0 > Number of Bricks: 9 x (2 + 1) = 27 > Transport-type: tcp > Bricks: > Brick1: 01-B:/brick1/gvAA01/brick > Brick2: 02-B:/brick1/gvAA01/brick > Brick3: 00-A:/arbiterAA01/gvAA01/brick1 (arbiter) > Brick4: 01-B:/brick2/gvAA01/brick > Brick5: 02-B:/brick2/gvAA01/brick > Brick6: 00-A:/arbiterAA01/gvAA01/brick2 (arbiter) > Brick7: 01-B:/brick3/gvAA01/brick > Brick8: 02-B:/brick3/gvAA01/brick > Brick9: 00-A:/arbiterAA01/gvAA01/brick3 (arbiter) > Brick10: 01-B:/brick4/gvAA01/brick > Brick11: 02-B:/brick4/gvAA01/brick > Brick12: 00-A:/arbiterAA01/gvAA01/brick4 (arbiter) > Brick13: 01-B:/brick5/gvAA01/brick > Brick14: 02-B:/brick5/gvAA01/brick > Brick15: 00-A:/arbiterAA01/gvAA01/brick5 (arbiter) > Brick16: 01-B:/brick6/gvAA01/brick > Brick17: 02-B:/brick6/gvAA01/brick > Brick18: 00-A:/arbiterAA01/gvAA01/brick6 (arbiter) > Brick19: 
01-B:/brick7/gvAA01/brick > Brick20: 02-B:/brick7/gvAA01/brick > Brick21: 00-A:/arbiterAA01/gvAA01/brick7 (arbiter) > Brick22: 01-B:/brick8/gvAA01/brick > Brick23: 02-B:/brick8/gvAA01/brick > Brick24: 00-A:/arbiterAA01/gvAA01/brick8 (arbiter) > Brick25: 01-B:/brick9/gvAA01/brick > Brick26: 02-B:/brick9/gvAA01/brick > Brick27: 00-A:/arbiterAA01/gvAA01/brick9 (arbiter) > Options Reconfigured: > cluster.shd-max-threads: 4 > performance.least-prio-threads: 16 > cluster.readdir-optimize: on > performance.quick-read: off > performance.stat-prefetch: off > cluster.data-self-heal: on > cluster.lookup-unhashed: auto > cluster.lookup-optimize: on > cluster.favorite-child-policy: mtime > server.allow-insecure: on > transport.address-family: inet > client.bind-insecure: on > cluster.entry-self-heal: off > cluster.metadata-self-heal: off > performance.md-cache-timeout: 600 > cluster.self-heal-daemon: enable > performance.readdir-ahead: on > diagnostics.brick-log-level: INFO > nfs.disable: off > > Thank you for any assistance. > > - Patrick > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- Brick: HOSTNAME-02-B:/brick1/gvAA01/brick ----------------------------------------- Cumulative Stats: Block Size: 32b+ 64b+ 128b+ No. of Reads: 1 0 0 No. of Writes: 138 7 45 Block Size: 256b+ 512b+ 1024b+ No. of Reads: 0 0 0 No. of Writes: 1 588 8321 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 0 2 5 No. of Writes: 9294 88957 21544 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 13 22 41 No. of Writes: 253312 24305 207953 Block Size: 131072b+ No. of Reads: 415121 No. of Writes: 632261 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 1656 FORGET 0.00 0.00 us 0.00 us 0.00 us 3098 RELEASE 0.00 0.00 us 0.00 us 0.00 us 30293 RELEASEDIR 0.00 38.00 us 38.00 us 38.00 us 1 FLUSH 0.00 881.00 us 881.00 us 881.00 us 1 UNLINK 0.00 252705.00 us 252705.00 us 252705.00 us 1 LK 0.00 45581.06 us 275.00 us 760038.00 us 17 SETXATTR 0.00 178264.40 us 1180.00 us 567795.00 us 5 MKNOD 0.01 52932.15 us 269.00 us 710452.00 us 34 SETATTR 0.01 56835.03 us 22.00 us 875479.00 us 34 GETXATTR 0.01 33009.92 us 55.00 us 690160.00 us 59 READ 0.01 130547.06 us 199.00 us 746238.00 us 17 REMOVEXATTR 0.01 101512.69 us 45.00 us 898760.00 us 35 READDIR 0.02 176200.21 us 79.00 us 767224.00 us 24 READDIRP 0.03 149224.76 us 1210.00 us 847273.00 us 50 MKDIR 0.04 477780.00 us 33.00 us 894975.00 us 20 FSTAT 0.04 294297.09 us 766.00 us 1055615.00 us 33 XATTROP 0.10 121732.71 us 19.00 us 1134288.00 us 211 ENTRYLK 0.10 61611.54 us 470.00 us 1132038.00 us 439 FSYNC 0.11 308381.30 us 56.00 us 1117021.00 us 97 OPENDIR 0.26 352045.54 us 25.00 us 1117112.00 us 200 STATFS 0.54 59824.41 us 66.00 us 1050503.00 us 2449 WRITE 0.62 42784.77 us 135.00 us 982688.00 us 3920 FXATTROP 1.09 48299.64 us 12.00 us 1475231.00 us 6113 FINODELK 1.99 74036.50 us 337.00 us 1504736.00 us 7270 RCHECKSUM 5.02 91592.07 us 14.00 us 1727644.00 us 14800 INODELK 18.22 391339.18 us 13.00 us 1118801.00 us 12565 STAT 71.77 447238.86 us 65.00 us 1520575.00 us 43295 LOOKUP Duration: 78214 seconds Data Read: 54416941626 bytes Data Written: 113658424203 bytes Interval 1 Stats: Block Size: 4096b+ 16384b+ 32768b+ No. of Reads: 0 0 0 No. 
of Writes: 10 1 37 Block Size: 65536b+ 131072b+ No. of Reads: 0 99 No. of Writes: 364 2494 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 115 RELEASEDIR 0.00 159976.00 us 1747.00 us 318205.00 us 2 MKNOD 0.01 43052.35 us 72.00 us 711323.00 us 17 READDIR 0.01 95923.62 us 306.00 us 760038.00 us 8 SETXATTR 0.01 66875.50 us 283.00 us 703396.00 us 16 SETATTR 0.01 99516.19 us 22.00 us 875479.00 us 16 GETXATTR 0.01 186339.50 us 268.00 us 670244.00 us 10 READDIRP 0.01 38121.54 us 55.00 us 690160.00 us 49 READ 0.02 276285.38 us 199.00 us 746238.00 us 8 REMOVEXATTR 0.03 234487.19 us 766.00 us 992806.00 us 16 XATTROP 0.04 160636.43 us 1337.00 us 847273.00 us 35 MKDIR 0.05 445612.14 us 33.00 us 885814.00 us 14 FSTAT 0.10 318281.76 us 56.00 us 1117021.00 us 42 OPENDIR 0.11 77332.15 us 512.00 us 1132038.00 us 184 FSYNC 0.13 140282.40 us 19.00 us 1134288.00 us 121 ENTRYLK 0.26 371711.56 us 25.00 us 1117112.00 us 94 STATFS 0.50 65120.02 us 77.00 us 1048875.00 us 1037 WRITE 0.52 44151.04 us 145.00 us 982688.00 us 1588 FXATTROP 0.96 53136.48 us 15.00 us 1131465.00 us 2444 FINODELK 2.04 75382.00 us 337.00 us 1135632.00 us 3653 RCHECKSUM 4.89 89311.00 us 14.00 us 1727644.00 us 7403 INODELK 19.46 432093.99 us 13.00 us 1118801.00 us 6093 STAT 70.86 461973.08 us 65.00 us 1520575.00 us 20751 LOOKUP Duration: 246 seconds Data Read: 12976128 bytes Data Written: 374076928 bytes Brick: HOSTNAME-02-B:/brick7/gvAA01/brick ----------------------------------------- Cumulative Stats: Block Size: 32b+ 64b+ 512b+ No. of Reads: 0 0 0 No. of Writes: 3 1 2 Block Size: 1024b+ 2048b+ 4096b+ No. of Reads: 0 0 0 No. of Writes: 5 6 1174 Block Size: 8192b+ 16384b+ 32768b+ No. of Reads: 0 0 1 No. of Writes: 354 415 1133 Block Size: 65536b+ 131072b+ No. of Reads: 0 5403 No. of Writes: 7199 16939 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 59 FORGET 0.00 0.00 us 0.00 us 0.00 us 414 RELEASE 0.00 0.00 us 0.00 us 0.00 us 812 RELEASEDIR 0.00 51.00 us 51.00 us 51.00 us 1 LK 0.00 811.00 us 811.00 us 811.00 us 1 LINK 0.00 1962.67 us 116.00 us 5155.00 us 3 OPEN 0.00 19450.00 us 19450.00 us 19450.00 us 1 RENAME 0.00 425385.00 us 3485.00 us 847285.00 us 2 MKNOD 0.00 514930.00 us 94.00 us 1029766.00 us 2 FTRUNCATE 0.01 355550.20 us 457.00 us 713035.00 us 5 UNLINK 0.01 1192925.00 us 955005.00 us 1430845.00 us 2 XATTROP 0.01 493677.80 us 1865.00 us 1133142.00 us 5 CREATE 0.02 429648.22 us 26.00 us 1005252.00 us 9 FLUSH 0.02 375096.13 us 31.00 us 743624.00 us 15 FSTAT 0.03 553294.91 us 124.00 us 2047492.00 us 11 SETATTR 0.04 241321.34 us 210.00 us 1130863.00 us 35 READDIRP 0.11 308798.10 us 37.00 us 1090701.00 us 80 OPENDIR 0.14 437895.92 us 20.00 us 2390710.00 us 76 ENTRYLK 0.24 274161.42 us 20.00 us 1131003.00 us 206 STATFS 0.26 380771.70 us 111.00 us 2217146.00 us 156 FSYNC 0.62 608724.21 us 98.00 us 2805007.00 us 234 RCHECKSUM 0.68 436962.29 us 65.00 us 2218008.00 us 359 READ 0.88 292235.22 us 16.00 us 1860745.00 us 696 FINODELK 1.16 517693.60 us 20.00 us 3188822.00 us 516 INODELK 3.10 1007821.69 us 185.00 us 8062558.00 us 710 FXATTROP 5.47 459404.39 us 60.00 us 2424925.00 us 2747 WRITE 19.30 343135.28 us 11.00 us 1508447.00 us 12965 STAT 67.89 449890.07 us 56.00 us 3189202.00 us 34791 LOOKUP Duration: 2760 seconds Data Read: 708235477 bytes Data Written: 3108339591 bytes Interval 1 Stats: Block Size: 32b+ 2048b+ 4096b+ No. 
of Reads: 0 0 0 No. of Writes: 1 2 87 Block Size: 8192b+ 16384b+ 32768b+ No. of Reads: 0 0 0 No. of Writes: 33 46 155 Block Size: 65536b+ 131072b+ No. of Reads: 0 384 No. of Writes: 898 1821 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 7 FORGET 0.00 0.00 us 0.00 us 0.00 us 5 RELEASE 0.00 0.00 us 0.00 us 0.00 us 90 RELEASEDIR 0.00 811.00 us 811.00 us 811.00 us 1 LINK 0.00 3485.00 us 3485.00 us 3485.00 us 1 MKNOD 0.00 1962.67 us 116.00 us 5155.00 us 3 OPEN 0.01 364139.50 us 35.00 us 728244.00 us 2 FLUSH 0.01 955005.00 us 955005.00 us 955005.00 us 1 XATTROP 0.01 514930.00 us 94.00 us 1029766.00 us 2 FTRUNCATE 0.01 568496.50 us 3851.00 us 1133142.00 us 2 CREATE 0.02 443739.00 us 457.00 us 713035.00 us 4 UNLINK 0.03 340935.25 us 31.00 us 700342.00 us 8 FSTAT 0.03 701433.25 us 124.00 us 2047492.00 us 4 SETATTR 0.03 319603.10 us 1904.00 us 1130863.00 us 10 READDIRP 0.10 277164.95 us 40.00 us 1090701.00 us 39 OPENDIR 0.18 558491.82 us 25.00 us 2390710.00 us 34 ENTRYLK 0.25 275750.60 us 20.00 us 1131003.00 us 96 STATFS 0.26 362599.55 us 111.00 us 2217146.00 us 77 FSYNC 0.78 761765.55 us 98.00 us 2805007.00 us 110 RCHECKSUM 0.81 519364.02 us 65.00 us 2218008.00 us 168 READ 0.86 292915.32 us 16.00 us 1860745.00 us 313 FINODELK 1.36 618592.78 us 23.00 us 3188822.00 us 235 INODELK 3.03 1037876.55 us 220.00 us 8062558.00 us 313 FXATTROP 6.12 518399.40 us 67.00 us 2424925.00 us 1265 WRITE 21.23 369324.98 us 11.00 us 1496678.00 us 6163 STAT 64.89 406497.92 us 60.00 us 3189202.00 us 17113 LOOKUP Duration: 246 seconds Data Read: 50331648 bytes Data Written: 349692935 bytes Brick: HOSTNAME-02-B:/brick4/gvAA01/brick ----------------------------------------- Cumulative Stats: Block Size: 2b+ 4b+ 8b+ No. of Reads: 1 0 1 No. of Writes: 2 1 7 Block Size: 16b+ 32b+ 64b+ No. of Reads: 4 392 51 No. of Writes: 12 135 57 Block Size: 128b+ 256b+ 512b+ No. of Reads: 443 127 10 No. of Writes: 140 304 1444 Block Size: 1024b+ 2048b+ 4096b+ No. of Reads: 90 261 364 No. of Writes: 219068 121020 345512 Block Size: 8192b+ 16384b+ 32768b+ No. of Reads: 514 856 1478 No. of Writes: 118098 171851 108390 Block Size: 65536b+ 131072b+ 262144b+ No. of Reads: 2306 8288758 0 No. of Writes: 371871 702539 10 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 45329 FORGET 0.00 0.00 us 0.00 us 0.00 us 48418 RELEASE 0.00 0.00 us 0.00 us 0.00 us 311777 RELEASEDIR 0.00 38.00 us 38.00 us 38.00 us 1 LK 0.00 49.50 us 45.00 us 54.00 us 2 ENTRYLK 0.00 191.00 us 191.00 us 191.00 us 1 SETATTR 0.00 114.50 us 82.00 us 212.00 us 10 OPEN 0.00 53.83 us 28.00 us 258.00 us 29 FINODELK 0.00 32180.00 us 32180.00 us 32180.00 us 1 MKNOD 0.05 13870.29 us 44.00 us 565518.00 us 80 OPENDIR 0.11 84408.74 us 164.00 us 483823.00 us 27 READDIRP 0.15 15813.07 us 27.00 us 642324.00 us 200 STATFS 0.21 10819.69 us 30.00 us 2037193.00 us 404 FSTAT 0.34 49.82 us 10.00 us 159830.00 us 141050 INODELK 0.47 155572.08 us 67.00 us 469682.00 us 62 WRITE 1.93 376836.99 us 162.00 us 3309818.00 us 105 FXATTROP 9.35 2726.16 us 388.00 us 708391.00 us 70513 RCHECKSUM 17.39 6686.00 us 36.00 us 689525.00 us 53448 READ 18.05 29067.22 us 14.00 us 1708922.00 us 12766 STAT 51.94 33265.70 us 59.00 us 3308011.00 us 32089 LOOKUP Duration: 87061 seconds Data Read: 1086749393549 bytes Data Written: 147537171706 bytes Interval 1 Stats: Block Size: 8192b+ 32768b+ 65536b+ No. 
of Reads: 0 0 0 No. of Writes: 1 3 11 Block Size: 131072b+ No. of Reads: 85818 No. of Writes: 47 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 90 RELEASEDIR 0.00 77.00 us 33.00 us 258.00 us 7 FINODELK 0.00 109.12 us 46.00 us 397.00 us 34 OPENDIR 0.06 120138.50 us 837.00 us 483823.00 us 6 READDIRP 0.10 12370.53 us 31.00 us 642324.00 us 94 STATFS 0.29 54.74 us 10.00 us 159830.00 us 62641 INODELK 0.32 12202.91 us 30.00 us 2037193.00 us 313 FSTAT 0.42 166359.60 us 67.00 us 469682.00 us 30 WRITE 1.76 424134.63 us 162.00 us 3309818.00 us 49 FXATTROP 9.44 3566.54 us 396.00 us 708391.00 us 31320 RCHECKSUM 21.56 41576.42 us 14.00 us 1708922.00 us 6138 STAT 22.70 6488.02 us 36.00 us 689525.00 us 41414 READ 43.34 33595.04 us 81.00 us 3308011.00 us 15268 LOOKUP Duration: 246 seconds Data Read: 11248336896 bytes Data Written: 7489024 bytes Brick: HOSTNAME-02-B:/brick8/gvAA01/brick ----------------------------------------- Cumulative Stats: Block Size: 32b+ 512b+ 1024b+ No. of Reads: 0 1 0 No. of Writes: 2 415 835 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 4 2 4 No. of Writes: 1788 22766 9151 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 12 28 129 No. of Writes: 12333 23668 124056 Block Size: 131072b+ 262144b+ No. of Reads: 1565084 0 No. of Writes: 331920 2 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 99 FORGET 0.00 0.00 us 0.00 us 0.00 us 1590 RELEASE 0.00 0.00 us 0.00 us 0.00 us 32065 RELEASEDIR 0.00 158.00 us 158.00 us 158.00 us 1 SETATTR 0.00 42.75 us 34.00 us 57.00 us 4 ENTRYLK 0.00 95.00 us 91.00 us 99.00 us 2 OPEN 0.00 36.90 us 31.00 us 49.00 us 10 INODELK 0.00 70.67 us 23.00 us 139.00 us 6 GETXATTR 0.03 3341.00 us 3341.00 us 3341.00 us 1 UNLINK 0.03 4044.00 us 4044.00 us 4044.00 us 1 MKNOD 0.09 135.14 us 2.00 us 1219.00 us 83 OPENDIR 0.30 187.68 us 24.00 us 7544.00 us 200 STATFS 0.67 4177.55 us 263.00 us 28261.00 us 20 READDIRP 1.02 21242.50 us 153.00 us 122468.00 us 6 READDIR 14.15 138.01 us 16.00 us 21061.00 us 12765 STAT 83.70 336.20 us 35.00 us 30375.00 us 31007 LOOKUP Duration: 87061 seconds Data Read: 205153204736 bytes Data Written: 59695163000 bytes Interval 1 Stats: %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 90 RELEASEDIR 0.00 41.00 us 33.00 us 49.00 us 2 INODELK 0.00 158.00 us 158.00 us 158.00 us 1 SETATTR 0.00 42.75 us 34.00 us 57.00 us 4 ENTRYLK 0.05 3341.00 us 3341.00 us 3341.00 us 1 UNLINK 0.06 116.15 us 43.00 us 427.00 us 34 OPENDIR 0.06 4044.00 us 4044.00 us 4044.00 us 1 MKNOD 0.25 176.27 us 31.00 us 7544.00 us 94 STATFS 0.34 4370.40 us 3093.00 us 5741.00 us 5 READDIRP 14.33 152.05 us 17.00 us 21061.00 us 6138 STAT 84.90 366.58 us 73.00 us 30375.00 us 15084 LOOKUP Duration: 246 seconds Data Read: 0 bytes Data Written: 0 bytes Brick: HOSTNAME-02-B:/brick5/gvAA01/brick ----------------------------------------- Cumulative Stats: Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 0 0 0 No. of Writes: 675 2015 5669 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 3 0 2 No. of Writes: 86933 18437 13563 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 13 11 2226145 No. of Writes: 28736 150681 348421 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 114 FORGET 0.00 0.00 us 0.00 us 0.00 us 1644 RELEASE 0.00 0.00 us 0.00 us 0.00 us 29197 RELEASEDIR 0.00 93.00 us 93.00 us 93.00 us 1 OPEN 0.00 121.00 us 111.00 us 131.00 us 2 SETATTR 0.00 62.43 us 22.00 us 134.00 us 7 GETXATTR 0.00 25.81 us 17.00 us 74.00 us 70 ENTRYLK 0.00 5628.50 us 2325.00 us 8932.00 us 2 MKNOD 0.01 124665.17 us 231.00 us 624079.00 us 6 READDIR 0.02 101333.57 us 165.00 us 520857.00 us 30 READDIRP 0.14 237618.40 us 1.00 us 763629.00 us 85 OPENDIR 0.24 170075.12 us 22.00 us 762654.00 us 200 STATFS 0.28 1494.00 us 387.00 us 658594.00 us 26497 RCHECKSUM 0.38 67376.52 us 67.00 us 592219.00 us 782 WRITE 0.39 36555.77 us 136.00 us 970840.00 us 1506 FXATTROP 0.41 76461.54 us 1824.00 us 784619.00 us 751 FSYNC 0.44 27318.09 us 14.00 us 779529.00 us 2256 FINODELK 1.39 3660.22 us 13.00 us 727062.00 us 53075 INODELK 18.68 205180.63 us 12.00 us 767356.00 us 12736 STAT 77.62 351003.58 us 30.00 us 971220.00 us 30942 LOOKUP Duration: 78214 seconds Data Read: 291787269120 bytes Data Written: 65192541184 bytes Interval 1 Stats: Block Size: 1024b+ 2048b+ 4096b+ No. of Reads: 0 0 0 No. of Writes: 2 1 5 Block Size: 8192b+ 16384b+ 32768b+ No. of Reads: 0 0 0 No. of Writes: 10 19 31 Block Size: 65536b+ 131072b+ No. of Reads: 0 0 No. of Writes: 206 686 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 90 RELEASEDIR 0.02 264573.67 us 103764.00 us 499127.00 us 6 READDIRP 0.09 197846.12 us 43.00 us 651148.00 us 34 OPENDIR 0.23 173971.46 us 27.00 us 663534.00 us 94 STATFS 0.25 1340.23 us 403.00 us 609386.00 us 13504 RCHECKSUM 0.33 30710.89 us 136.00 us 970840.00 us 774 FXATTROP 0.33 59568.81 us 79.00 us 592219.00 us 400 WRITE 0.40 74721.05 us 1894.00 us 645840.00 us 386 FSYNC 0.42 25870.04 us 17.00 us 641253.00 us 1160 FINODELK 1.35 3568.56 us 13.00 us 677567.00 us 27047 INODELK 20.17 235917.45 us 12.00 us 668351.00 us 6109 STAT 76.39 365037.23 us 61.00 us 971220.00 us 14951 LOOKUP Duration: 246 seconds Data Read: 0 bytes Data Written: 116390912 bytes Brick: HOSTNAME-02-B:/brick6/gvAA01/brick ----------------------------------------- Cumulative Stats: Block Size: 64b+ 512b+ 1024b+ No. of Reads: 2 1 0 No. of Writes: 0 249 554 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 0 0 7 No. of Writes: 1147 5601 4339 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 9 14 52 No. of Writes: 8060 16044 85807 Block Size: 131072b+ 262144b+ No. of Reads: 57503 0 No. of Writes: 233443 2 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 177 FORGET 0.00 0.00 us 0.00 us 0.00 us 267 RELEASE 0.00 0.00 us 0.00 us 0.00 us 32347 RELEASEDIR 0.00 77.80 us 18.00 us 140.00 us 5 GETXATTR 0.01 110.21 us 28.00 us 884.00 us 14 ENTRYLK 0.01 2082.00 us 2082.00 us 2082.00 us 1 RENAME 0.10 185.46 us 2.00 us 1256.00 us 84 OPENDIR 0.20 148.97 us 30.00 us 2192.00 us 200 STATFS 0.64 16299.00 us 163.00 us 95266.00 us 6 READDIR 0.83 5265.21 us 315.00 us 28553.00 us 24 READDIRP 15.27 181.48 us 14.00 us 30885.00 us 12765 STAT 82.92 409.24 us 29.00 us 31576.00 us 30731 LOOKUP Duration: 87061 seconds Data Read: 7542971594 bytes Data Written: 41717849088 bytes Interval 1 Stats: %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 90 RELEASEDIR 0.01 275.25 us 65.00 us 884.00 us 4 ENTRYLK 0.03 2082.00 us 2082.00 us 2082.00 us 1 RENAME 0.10 217.26 us 63.00 us 1044.00 us 34 OPENDIR 0.17 134.26 us 30.00 us 939.00 us 94 STATFS 0.33 4005.17 us 2273.00 us 6785.00 us 6 READDIRP 14.30 171.33 us 16.00 us 19199.00 us 6138 STAT 85.06 413.91 us 68.00 us 31576.00 us 15114 LOOKUP Duration: 246 seconds Data Read: 0 bytes Data Written: 0 bytes Brick: HOSTNAME-02-B:/brick2/gvAA01/brick ----------------------------------------- Cumulative Stats: Block Size: 8b+ 32b+ 64b+ No. of Reads: 0 0 0 No. of Writes: 1 54 46 Block Size: 128b+ 256b+ 512b+ No. of Reads: 0 0 0 No. of Writes: 180 277 721 Block Size: 1024b+ 2048b+ 4096b+ No. of Reads: 0 0 3 No. of Writes: 20590 17995 62409 Block Size: 8192b+ 16384b+ 32768b+ No. of Reads: 4 26 28 No. of Writes: 68162 107367 71851 Block Size: 65536b+ 131072b+ 262144b+ No. of Reads: 59 542702 0 No. of Writes: 218072 310390 8 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 573 FORGET 0.00 0.00 us 0.00 us 0.00 us 2644 RELEASE 0.00 0.00 us 0.00 us 0.00 us 28568 RELEASEDIR 0.00 33.00 us 33.00 us 33.00 us 1 FLUSH 0.00 67.50 us 56.00 us 79.00 us 2 LK 0.00 167.00 us 167.00 us 167.00 us 1 OPEN 0.01 144.96 us 42.00 us 1139.00 us 80 OPENDIR 0.01 111.08 us 24.00 us 2868.00 us 200 STATFS 0.03 157.12 us 73.00 us 9048.00 us 429 WRITE 0.86 135.98 us 15.00 us 13652.00 us 12765 STAT 1.18 81686.03 us 231.00 us 851745.00 us 29 READDIRP 3.98 45.61 us 14.00 us 11734.00 us 175875 INODELK 8.95 577.68 us 74.00 us 224862.00 us 31188 LOOKUP 84.98 1946.12 us 330.00 us 811977.00 us 87936 RCHECKSUM Duration: 78214 seconds Data Read: 71140696438 bytes Data Written: 71129235470 bytes Interval 1 Stats: Block Size: 131072b+ No. of Reads: 0 No. of Writes: 551 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 1 FORGET 0.00 0.00 us 0.00 us 0.00 us 90 RELEASEDIR 0.01 150.06 us 42.00 us 1139.00 us 34 OPENDIR 0.01 90.35 us 31.00 us 735.00 us 94 STATFS 0.04 202.23 us 81.00 us 9048.00 us 144 WRITE 1.11 133.66 us 16.00 us 9674.00 us 6138 STAT 1.90 234569.83 us 1225.00 us 851745.00 us 6 READDIRP 5.01 46.95 us 14.00 us 11257.00 us 78964 INODELK 11.66 568.83 us 74.00 us 114583.00 us 15164 LOOKUP 80.26 1503.60 us 330.00 us 558433.00 us 39481 RCHECKSUM Duration: 246 seconds Data Read: 0 bytes Data Written: 72220672 bytes Brick: HOSTNAME-02-B:/brick9/gvAA01/brick ----------------------------------------- Cumulative Stats: Block Size: 64b+ 512b+ 1024b+ No. of Reads: 1 2 0 No. of Writes: 0 1488 3133 Block Size: 2048b+ 4096b+ 8192b+ No. of Reads: 2 12 6 No. of Writes: 6241 20337 17112 Block Size: 16384b+ 32768b+ 65536b+ No. of Reads: 41 40 109 No. of Writes: 36536 97167 400752 Block Size: 131072b+ 262144b+ No. of Reads: 1392662 0 No. of Writes: 734025 9 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 245 FORGET 0.00 0.00 us 0.00 us 0.00 us 1040 RELEASE 0.00 0.00 us 0.00 us 0.00 us 32052 RELEASEDIR 0.00 40.50 us 32.00 us 49.00 us 2 FLUSH 0.00 64.50 us 54.00 us 75.00 us 2 ENTRYLK 0.00 45.25 us 14.00 us 92.00 us 4 GETXATTR 0.00 66.33 us 43.00 us 94.00 us 3 LK 0.00 230.00 us 230.00 us 230.00 us 1 OPEN 0.06 14688.00 us 14688.00 us 14688.00 us 1 UNLINK 0.13 358.22 us 2.00 us 6730.00 us 83 OPENDIR 0.16 178.78 us 26.00 us 4082.00 us 200 STATFS 0.57 21934.83 us 164.00 us 122756.00 us 6 READDIR 2.74 24249.73 us 122.00 us 185638.00 us 26 READDIRP 16.00 287.83 us 14.00 us 53088.00 us 12765 STAT 80.33 596.96 us 60.00 us 55814.00 us 30911 LOOKUP Duration: 87061 seconds Data Read: 182552716865 bytes Data Written: 146735579648 bytes Interval 1 Stats: %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 90 RELEASEDIR 0.00 64.50 us 54.00 us 75.00 us 2 ENTRYLK 0.14 14688.00 us 14688.00 us 14688.00 us 1 UNLINK 0.16 174.14 us 26.00 us 3573.00 us 94 STATFS 0.17 541.12 us 45.00 us 6730.00 us 34 OPENDIR 2.16 45567.60 us 951.00 us 139415.00 us 5 READDIRP 14.97 257.11 us 14.00 us 30619.00 us 6138 STAT 82.40 576.00 us 70.00 us 49717.00 us 15086 LOOKUP Duration: 246 seconds Data Read: 0 bytes Data Written: 0 bytes Brick: HOSTNAME-02-B:/brick3/gvAA01/brick ----------------------------------------- Cumulative Stats: Block Size: 4b+ 16b+ 32b+ No. of Reads: 0 0 0 No. of Writes: 1 1 13 Block Size: 64b+ 128b+ 256b+ No. of Reads: 0 0 0 No. of Writes: 10 125 37 Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 0 0 1 No. of Writes: 324 326351 4542 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 4 2 9 No. of Writes: 56358 18819 27128 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 17 42 3816011 No. of Writes: 35096 76153 521853 Block Size: 262144b+ No. of Reads: 0 No. of Writes: 12 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 175 FORGET 0.00 0.00 us 0.00 us 0.00 us 2519 RELEASE 0.00 0.00 us 0.00 us 0.00 us 28490 RELEASEDIR 0.00 27.00 us 27.00 us 27.00 us 1 FLUSH 0.00 512.00 us 512.00 us 512.00 us 1 UNLINK 0.00 414686.00 us 414686.00 us 414686.00 us 1 LK 0.00 253232.00 us 39.00 us 506425.00 us 2 ENTRYLK 0.05 497784.80 us 124.00 us 1570288.00 us 30 READDIRP 0.11 7015.25 us 336.00 us 1565069.00 us 4505 RCHECKSUM 0.13 494818.47 us 52.00 us 1557805.00 us 80 OPENDIR 0.15 21618.97 us 19.00 us 1410202.00 us 2051 FSTAT 0.23 68568.80 us 1781.00 us 1573119.00 us 984 FSYNC 0.27 415087.87 us 24.00 us 1423645.00 us 196 STATFS 0.33 10851.24 us 16.00 us 1570142.00 us 9075 INODELK 0.60 85769.69 us 152.00 us 2328777.00 us 2065 FXATTROP 0.63 104161.26 us 53.00 us 1573331.00 us 1808 WRITE 0.71 24037.55 us 37.00 us 1570577.00 us 8788 READ 1.34 123004.33 us 13.00 us 1572980.00 us 3244 FINODELK 20.01 469342.63 us 11.00 us 1570713.00 us 12689 STAT 75.44 675578.60 us 98.00 us 1963137.00 us 33238 LOOKUP Duration: 78214 seconds Data Read: 500177719808 bytes Data Written: 79883172260 bytes Interval 1 Stats: Block Size: 128b+ 1024b+ 2048b+ No. of Reads: 0 0 0 No. of Writes: 1 566 14 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 0 0 0 No. of Writes: 623 191 123 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 0 0 10368 No. of Writes: 81 126 445 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 90 RELEASEDIR 0.00 512.00 us 512.00 us 512.00 us 1 UNLINK 0.00 253232.00 us 39.00 us 506425.00 us 2 ENTRYLK 0.02 567538.67 us 61455.00 us 1096240.00 us 6 READDIRP 0.10 425877.85 us 52.00 us 1412618.00 us 34 OPENDIR 0.10 6947.28 us 340.00 us 1416193.00 us 2133 RCHECKSUM 0.17 24302.80 us 20.00 us 1393421.00 us 1016 FSTAT 0.20 79981.08 us 1784.00 us 1444418.00 us 364 FSYNC 0.27 432331.46 us 24.00 us 1393263.00 us 90 STATFS 0.34 11557.54 us 16.00 us 1441854.00 us 4303 INODELK 0.51 94985.64 us 162.00 us 2328777.00 us 775 FXATTROP 0.63 128079.62 us 58.00 us 1442703.00 us 712 WRITE 0.73 23597.75 us 37.00 us 1416135.00 us 4456 READ 0.86 109550.04 us 13.00 us 1443792.00 us 1140 FINODELK 19.86 470022.73 us 14.00 us 1412768.00 us 6122 STAT 76.20 702701.46 us 98.00 us 1963137.00 us 15711 LOOKUP Duration: 246 seconds Data Read: 1358954496 bytes Data Written: 85018520 bytes Brick: HOSTNAME-01-B:/brick8/gvAA01/brick ----------------------------------------- Cumulative Stats: Block Size: 1b+ 2b+ 4b+ No. of Reads: 0 148 9 No. of Writes: 2198 856 1414 Block Size: 8b+ 16b+ 32b+ No. of Reads: 17 52 2454 No. of Writes: 2883 23076 27561 Block Size: 64b+ 128b+ 256b+ No. of Reads: 2418 1055 843 No. of Writes: 148757 245049 274675 Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 8942 12954 30505 No. of Writes: 1985063 409642431 64134867 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 112381 168949 285944 No. of Writes: 92567046 63502163 116489833 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 743469 1210030 816372704 No. of Writes: 17318963 70068548 142121633 Block Size: 262144b+ 524288b+ 1048576b+ No. of Reads: 0 0 0 No. of Writes: 1088 11 111 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 1179035 FORGET 0.00 0.00 us 0.00 us 0.00 us 2293524 RELEASE 0.00 0.00 us 0.00 us 0.00 us 26725606 RELEASEDIR 0.00 292.00 us 292.00 us 292.00 us 1 LINK 0.00 36.47 us 24.00 us 45.00 us 15 FLUSH 0.00 428.50 us 358.00 us 499.00 us 2 RENAME 0.00 166.80 us 121.00 us 220.00 us 10 SETATTR 0.00 70.97 us 35.00 us 326.00 us 30 LK 0.00 58.15 us 37.00 us 105.00 us 55 FINODELK 0.00 1624.25 us 770.00 us 3488.00 us 8 MKNOD 0.00 61.19 us 27.00 us 334.00 us 214 ENTRYLK 0.01 242.92 us 154.00 us 560.00 us 78 SETXATTR 0.03 85.31 us 28.00 us 2293.00 us 894 STATFS 0.05 26925.20 us 294.00 us 90304.00 us 5 UNLINK 0.06 3003.38 us 297.00 us 57010.00 us 55 FXATTROP 0.07 2474.06 us 475.00 us 36988.00 us 78 MKDIR 0.10 68.06 us 27.00 us 2641.00 us 4293 FSTAT 0.11 82114.75 us 351.00 us 310420.00 us 4 READDIR 0.21 97.59 us 3.00 us 3330.00 us 6260 OPENDIR 0.39 134.22 us 47.00 us 26252.00 us 8479 OPEN 0.44 128.85 us 12.00 us 25805.00 us 9866 STAT 0.80 151.41 us 75.00 us 69114.00 us 15222 WRITE 0.83 69.77 us 18.00 us 599472.00 us 34079 INODELK 3.07 476.92 us 48.00 us 59897.00 us 18573 GETXATTR 8.90 505.04 us 47.00 us 120915.00 us 50768 READ 10.30 284.59 us 20.00 us 120764.00 us 104241 LOOKUP 74.63 19961.45 us 52.00 us 5262484.00 us 10772 READDIRP Duration: 4279793 seconds Data Read: 107167648980220 bytes Data Written: 31737926706873 bytes Interval 5 Stats: Block Size: 131072b+ No. of Reads: 2903 No. of Writes: 0 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 90 RELEASEDIR 0.00 85.00 us 62.00 us 108.00 us 2 ENTRYLK 0.00 185.00 us 185.00 us 185.00 us 1 SETATTR 0.03 1168.50 us 77.00 us 2260.00 us 2 INODELK 0.07 93.63 us 40.00 us 165.00 us 62 FSTAT 0.07 177.49 us 54.00 us 955.00 us 35 OPENDIR 0.14 135.89 us 28.00 us 2293.00 us 91 STATFS 0.49 43269.00 us 43269.00 us 43269.00 us 1 UNLINK 2.25 1365.04 us 164.00 us 22604.00 us 145 READDIRP 5.95 153.65 us 18.00 us 9302.00 us 3397 STAT 36.13 2655.90 us 61.00 us 101625.00 us 1194 READ 54.87 321.63 us 74.00 us 33476.00 us 14974 LOOKUP Duration: 246 seconds Data Read: 380502016 bytes Data Written: 0 bytes Brick: HOSTNAME-01-B:/brick6/gvAA01/brick ----------------------------------------- Cumulative Stats: Block Size: 1b+ 2b+ 4b+ No. of Reads: 0 1 0 No. of Writes: 2161 831 1435 Block Size: 8b+ 16b+ 32b+ No. of Reads: 0 0 2903 No. of Writes: 4166 5588026 44908 Block Size: 64b+ 128b+ 256b+ No. of Reads: 3536 5 80 No. of Writes: 87870 241581 376594 Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 8020 7205 19984 No. of Writes: 7805307 539972646 80512909 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 62273 100742 210754 No. of Writes: 102909970 76529387 166357551 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 551103 735158 761717336 No. of Writes: 15712591 54848367 117383734 Block Size: 262144b+ 524288b+ 1048576b+ No. of Reads: 0 5 360612 No. of Writes: 3717 5306 1855286 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 835383 FORGET 0.00 0.00 us 0.00 us 0.00 us 817859 RELEASE 0.00 0.00 us 0.00 us 0.00 us 28832755 RELEASEDIR 0.00 42.75 us 30.00 us 67.00 us 4 FLUSH 0.00 65.38 us 43.00 us 120.00 us 8 LK 0.00 172.33 us 123.00 us 238.00 us 9 SETATTR 0.00 548.14 us 264.00 us 1315.00 us 7 UNLINK 0.00 1554.89 us 596.00 us 4101.00 us 9 MKNOD 0.00 59.64 us 24.00 us 520.00 us 238 ENTRYLK 0.00 245.50 us 150.00 us 493.00 us 78 SETXATTR 0.00 20433.00 us 20433.00 us 20433.00 us 1 RENAME 0.00 2955.50 us 81.00 us 22895.00 us 8 FSTAT 0.01 7511.17 us 216.00 us 42274.00 us 6 READDIR 0.01 85.94 us 26.00 us 2189.00 us 894 STATFS 0.01 4187.81 us 2124.00 us 20241.00 us 21 FSYNC 0.04 1532.93 us 223.00 us 66019.00 us 202 FXATTROP 0.05 4863.85 us 568.00 us 145709.00 us 78 MKDIR 0.11 131.42 us 61.00 us 3438.00 us 6721 OPEN 0.11 91.46 us 2.00 us 29957.00 us 10315 OPENDIR 0.17 142.09 us 15.00 us 12445.00 us 9865 STAT 0.35 107.33 us 18.00 us 774166.00 us 27066 INODELK 1.21 480.50 us 23.00 us 121342.00 us 21084 GETXATTR 2.71 145604.99 us 38.00 us 1277202.00 us 155 FINODELK 3.14 247.45 us 14.00 us 64731.00 us 105762 LOOKUP 3.36 282.18 us 68.00 us 131903.00 us 99206 READ 15.81 7013.42 us 49.00 us 4783006.00 us 18800 READDIRP 72.91 61747.58 us 84.00 us 612823.00 us 9846 WRITE Duration: 4279793 seconds Data Read: 100321376892903 bytes Data Written: 29652026734449 bytes Interval 5 Stats: %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 90 RELEASEDIR 0.01 157.00 us 50.00 us 436.00 us 4 ENTRYLK 0.12 177.06 us 58.00 us 1407.00 us 35 OPENDIR 0.24 138.63 us 39.00 us 1793.00 us 91 STATFS 0.38 20433.00 us 20433.00 us 20433.00 us 1 RENAME 2.46 1564.36 us 149.00 us 18439.00 us 84 READDIRP 8.49 133.50 us 19.00 us 7417.00 us 3397 STAT 88.30 313.58 us 81.00 us 20208.00 us 15040 LOOKUP Duration: 246 seconds Data Read: 0 bytes Data Written: 0 bytes Brick: HOSTNAME-01-B:/brick4/gvAA01/brick ----------------------------------------- Cumulative Stats: Block Size: 2b+ 4b+ 8b+ No. of Reads: 0 0 0 No. of Writes: 3 1 7 Block Size: 16b+ 32b+ 64b+ No. of Reads: 0 3 0 No. of Writes: 14 176 55 Block Size: 128b+ 256b+ 512b+ No. of Reads: 0 0 6 No. of Writes: 551 409 1327 Block Size: 1024b+ 2048b+ 4096b+ No. of Reads: 2 25 20 No. of Writes: 163855 98807 281768 Block Size: 8192b+ 16384b+ 32768b+ No. of Reads: 52 81 142 No. of Writes: 105053 164786 104540 Block Size: 65536b+ 131072b+ 262144b+ No. of Reads: 311 2267877 0 No. of Writes: 353531 7948711 10 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 61063 FORGET 0.00 0.00 us 0.00 us 0.00 us 14884 RELEASE 0.00 0.00 us 0.00 us 0.00 us 95159 RELEASEDIR 0.00 70.00 us 70.00 us 70.00 us 1 LK 0.00 59.50 us 56.00 us 63.00 us 2 ENTRYLK 0.00 174.00 us 174.00 us 174.00 us 1 SETATTR 0.00 142.19 us 70.00 us 229.00 us 26 OPEN 0.00 264.95 us 47.00 us 3405.00 us 81 OPENDIR 0.01 5896.86 us 27.00 us 127468.00 us 22 FINODELK 0.02 199117.00 us 199117.00 us 199117.00 us 1 MKNOD 0.12 5566.34 us 20.00 us 545166.00 us 197 STATFS 0.25 312.26 us 26.00 us 700937.00 us 7444 FSTAT 0.86 63127.77 us 162.00 us 952344.00 us 125 READDIRP 0.91 77367.29 us 212.00 us 1039173.00 us 107 FXATTROP 1.42 209840.71 us 121.00 us 620314.00 us 62 WRITE 1.48 95.93 us 15.00 us 722187.00 us 141085 INODELK 4.15 2543.50 us 57.00 us 721863.00 us 14886 READ 13.69 15652.88 us 16.00 us 721267.00 us 7986 STAT 17.78 2302.89 us 363.00 us 834931.00 us 70497 RCHECKSUM 59.29 16876.75 us 68.00 us 1036627.00 us 32076 LOOKUP Duration: 77227 seconds Data Read: 297294236329 bytes Data Written: 1094314598941 bytes Interval 1 Stats: Block Size: 8192b+ 32768b+ 65536b+ No. of Reads: 0 0 0 No. of Writes: 1 3 11 Block Size: 131072b+ No. of Reads: 17261 No. of Writes: 47 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 90 RELEASEDIR 0.00 193.43 us 54.00 us 1791.00 us 35 OPENDIR 0.02 554.58 us 20.00 us 31421.00 us 91 STATFS 0.06 12897.40 us 32.00 us 127468.00 us 10 FINODELK 1.02 312.28 us 26.00 us 700937.00 us 7443 FSTAT 1.47 44044.13 us 162.00 us 917890.00 us 76 READDIRP 2.55 92.91 us 16.00 us 722187.00 us 62715 INODELK 2.99 227428.40 us 121.00 us 534197.00 us 30 WRITE 3.12 139535.71 us 259.00 us 1039173.00 us 51 FXATTROP 8.24 5538.18 us 18.00 us 721267.00 us 3397 STAT 16.69 2558.28 us 57.00 us 721863.00 us 14886 READ 22.42 1632.14 us 363.00 us 718808.00 us 31356 RCHECKSUM 41.42 6177.50 us 76.00 us 1036627.00 us 15304 LOOKUP Duration: 246 seconds Data Read: 2262433792 bytes Data Written: 7489024 bytes Brick: HOSTNAME-01-B:/brick5/gvAA01/brick ----------------------------------------- Cumulative Stats: Block Size: 1b+ 2b+ 4b+ No. of Reads: 0 0 2 No. of Writes: 482 547 864 Block Size: 8b+ 16b+ 32b+ No. of Reads: 12 39 3645 No. 
of Writes: 1717 18223 16043 Block Size: 64b+ 128b+ 256b+ No. of Reads: 2441 542 390 No. of Writes: 607834 219970 204446 Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 6262 5164 13211 No. of Writes: 1064280 203635430 61786149 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 70712 123308 170598 No. of Writes: 80206120 104132098 110568574 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 493854 698297 533801800 No. of Writes: 18319266 76209603 132443069 Block Size: 262144b+ 524288b+ 1048576b+ No. of Reads: 0 0 2 No. of Writes: 837 1 11 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 583048 FORGET 0.00 0.00 us 0.00 us 0.00 us 1791579 RELEASE 0.00 0.00 us 0.00 us 0.00 us 25457262 RELEASEDIR 0.00 322.17 us 205.00 us 808.00 us 6 SETATTR 0.00 2292.58 us 31.00 us 16199.00 us 12 FLUSH 0.00 28625.50 us 10170.00 us 47081.00 us 2 UNLINK 0.00 2983.42 us 51.00 us 37006.00 us 24 LK 0.00 15029.17 us 1096.00 us 44572.00 us 6 MKNOD 0.00 4302.60 us 150.00 us 66649.00 us 78 SETXATTR 0.01 13873.86 us 808.00 us 107723.00 us 78 MKDIR 0.01 755.09 us 121.00 us 211768.00 us 1698 FXATTROP 0.01 4013.43 us 23.00 us 71921.00 us 334 ENTRYLK 0.02 220483.93 us 257.00 us 516216.00 us 15 READDIR 0.03 4518.96 us 31.00 us 76937.00 us 894 STATFS 0.04 7217.03 us 1872.00 us 1382417.00 us 811 FSYNC 0.09 5690.68 us 2.00 us 2641373.00 us 2147 OPENDIR 0.10 1368.02 us 15.00 us 134024.00 us 9865 STAT 0.16 1421.97 us 48.00 us 3165390.00 us 15489 OPEN 0.17 2595.75 us 26.00 us 2281303.00 us 8854 GETXATTR 0.21 2879.52 us 22.00 us 132848.00 us 9957 FSTAT 0.21 11982.66 us 54.00 us 2340814.00 us 2438 READDIRP 1.42 2157.46 us 39.00 us 3980667.00 us 90417 LOOKUP 2.01 1851.28 us 13.00 us 4353557.00 us 149524 INODELK 2.09 6592.01 us 333.00 us 2340514.00 us 43669 RCHECKSUM 3.92 8208.32 us 37.00 us 284665.00 us 65801 READ 13.14 726280.54 us 26.00 us 4778309.00 us 2490 FINODELK 76.36 304426.35 us 47.00 us 3078703.00 us 34522 WRITE Duration: 4279793 seconds Data Read: 70063314642216 bytes Data Written: 30894365505026 bytes Interval 5 Stats: Block Size: 1024b+ 2048b+ 4096b+ No. of Reads: 0 0 0 No. of Writes: 2 1 5 Block Size: 8192b+ 16384b+ 32768b+ No. of Reads: 0 0 0 No. of Writes: 10 19 31 Block Size: 65536b+ 131072b+ No. of Reads: 0 19354 No. of Writes: 206 675 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 90 RELEASEDIR 0.00 142.03 us 51.00 us 558.00 us 35 OPENDIR 0.00 207.80 us 37.00 us 4278.00 us 91 STATFS 0.00 79.99 us 31.00 us 141.00 us 290 FSTAT 0.01 1539.89 us 170.00 us 8853.00 us 54 READDIRP 0.02 260.63 us 121.00 us 1374.00 us 776 FXATTROP 0.11 3223.89 us 2100.00 us 5998.00 us 388 FSYNC 0.14 444.37 us 15.00 us 15289.00 us 3397 STAT 0.83 593.88 us 76.00 us 47077.00 us 15531 LOOKUP 2.89 83325.97 us 86.00 us 502479.00 us 387 WRITE 5.95 8661.33 us 47.00 us 267392.00 us 7668 READ 8.17 3374.06 us 18.00 us 3834382.00 us 27019 INODELK 10.53 8711.57 us 411.00 us 496077.00 us 13488 RCHECKSUM 71.35 688576.36 us 37.00 us 4310436.00 us 1156 FINODELK Duration: 246 seconds Data Read: 2536767488 bytes Data Written: 114949120 bytes Brick: HOSTNAME-01-B:/brick1/gvAA01/brick ----------------------------------------- Cumulative Stats: Block Size: 32b+ 128b+ 512b+ No. of Reads: 10 16 0 No. of Writes: 2 0 3 Block Size: 1024b+ 2048b+ 4096b+ No. of Reads: 2 2 4 No. of Writes: 4 1 230 Block Size: 8192b+ 16384b+ 32768b+ No. of Reads: 2 8 10 No. 
of Writes: 62 111 342 Block Size: 65536b+ 131072b+ No. of Reads: 12 39099 No. of Writes: 1999 14756 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 110 FORGET 0.00 0.00 us 0.00 us 0.00 us 1552 RELEASE 0.00 0.00 us 0.00 us 0.00 us 5486 RELEASEDIR 0.00 33.00 us 33.00 us 33.00 us 1 FLUSH 0.00 75.00 us 75.00 us 75.00 us 1 LK 0.00 177.50 us 93.00 us 313.00 us 10 FSTAT 0.00 23790.00 us 23790.00 us 23790.00 us 1 UNLINK 0.00 308.52 us 28.00 us 9424.00 us 197 STATFS 0.01 5786.87 us 124.00 us 196553.00 us 102 READDIRP 0.01 166.50 us 54.00 us 37190.00 us 3921 FXATTROP 0.02 1713.96 us 155.00 us 57875.00 us 440 FSYNC 0.03 13994.76 us 26.00 us 115223.00 us 86 GETXATTR 0.03 9081.56 us 33.00 us 119977.00 us 134 OPENDIR 0.04 58848.03 us 100.00 us 709195.00 us 33 READDIR 0.05 262.36 us 15.00 us 16663.00 us 8244 STAT 0.08 40910.06 us 598.00 us 174924.00 us 88 XATTROP 0.16 20063.18 us 25.00 us 165004.00 us 355 ENTRYLK 0.21 17832.65 us 77.00 us 171068.00 us 545 READ 0.31 7303.16 us 65.00 us 559590.00 us 1961 WRITE 2.79 2933.18 us 67.00 us 208762.00 us 43381 LOOKUP 5.41 33917.36 us 430.00 us 842225.00 us 7262 RCHECKSUM 13.39 41023.35 us 20.00 us 13484708.00 us 14865 INODELK 77.46 603032.78 us 17.00 us 14543086.00 us 5850 FINODELK Duration: 2952 seconds Data Read: 5126633408 bytes Data Written: 2195986825 bytes Interval 1 Stats: Block Size: 4096b+ 16384b+ 32768b+ No. of Reads: 0 0 0 No. of Writes: 10 1 37 Block Size: 65536b+ 131072b+ No. of Reads: 0 678 No. of Writes: 364 1888 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 17 FORGET 0.00 0.00 us 0.00 us 0.00 us 159 RELEASEDIR 0.00 175.83 us 93.00 us 313.00 us 6 FSTAT 0.00 372.52 us 34.00 us 9424.00 us 91 STATFS 0.01 155.47 us 54.00 us 15012.00 us 1586 FXATTROP 0.02 1662.29 us 603.00 us 19925.00 us 185 FSYNC 0.02 5773.04 us 124.00 us 196553.00 us 56 READDIRP 0.03 11697.47 us 56.00 us 119977.00 us 55 OPENDIR 0.04 19956.53 us 41.00 us 115223.00 us 36 GETXATTR 0.04 48883.81 us 100.00 us 119392.00 us 16 READDIR 0.06 344.76 us 15.00 us 16663.00 us 3514 STAT 0.10 39258.90 us 800.00 us 140015.00 us 52 XATTROP 0.17 19739.70 us 28.00 us 165004.00 us 169 ENTRYLK 0.29 19225.55 us 85.00 us 154901.00 us 299 READ 0.38 9418.15 us 74.00 us 180442.00 us 795 WRITE 3.27 3127.60 us 74.00 us 208762.00 us 20788 LOOKUP 6.44 35274.85 us 431.00 us 310417.00 us 3637 RCHECKSUM 13.94 37449.35 us 25.00 us 12546026.00 us 7412 INODELK 75.20 635507.62 us 19.00 us 14543086.00 us 2356 FINODELK Duration: 246 seconds Data Read: 88866816 bytes Data Written: 294647296 bytes Brick: HOSTNAME-01-B:/brick9/gvAA01/brick ----------------------------------------- Cumulative Stats: Block Size: 1b+ 2b+ 4b+ No. of Reads: 0 55 9 No. of Writes: 1491 1000 2102 Block Size: 8b+ 16b+ 32b+ No. of Reads: 64 50 4812 No. of Writes: 4696 7222905 44892 Block Size: 64b+ 128b+ 256b+ No. of Reads: 4319 4022 3248 No. of Writes: 293891 418033 542883 Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 12528 18194 54796 No. of Writes: 8494202 664144673 121431414 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 216483 292636 465186 No. of Writes: 206999000 164921303 247436952 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 1428505 2116990 1026270560 No. of Writes: 33487910 131100929 250121202 Block Size: 262144b+ 524288b+ 1048576b+ No. of Reads: 0 0 0 No. 
of Writes: 3112 1649 486445 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 1466412 FORGET 0.00 0.00 us 0.00 us 0.00 us 1649265 RELEASE 0.00 0.00 us 0.00 us 0.00 us 26523053 RELEASEDIR 0.00 41.00 us 18.00 us 134.00 us 13 FLUSH 0.00 61.28 us 37.00 us 155.00 us 25 LK 0.00 6961.00 us 6961.00 us 6961.00 us 1 RENAME 0.00 1194.50 us 237.00 us 4764.00 us 6 READDIR 0.00 28213.00 us 28213.00 us 28213.00 us 1 UNLINK 0.00 6334.43 us 126.00 us 25589.00 us 7 SETATTR 0.00 12989.29 us 800.00 us 37294.00 us 7 MKNOD 0.00 1259.36 us 128.00 us 32443.00 us 78 SETXATTR 0.01 1803.81 us 25.00 us 44751.00 us 204 ENTRYLK 0.01 7197.82 us 506.00 us 90051.00 us 78 MKDIR 0.04 3359.54 us 21.00 us 216237.00 us 894 STATFS 0.05 25889.78 us 600.00 us 236221.00 us 130 FSYNC 0.06 1592.46 us 2.00 us 381435.00 us 2539 OPENDIR 0.06 463.18 us 60.00 us 539032.00 us 8960 OPEN 0.12 841.19 us 14.00 us 217750.00 us 9865 STAT 0.25 1861.16 us 20.00 us 247927.00 us 9467 FSTAT 0.28 544.06 us 17.00 us 1376223.00 us 36022 INODELK 0.30 13330.61 us 159.00 us 399369.00 us 1583 FXATTROP 0.33 2010.68 us 29.00 us 537519.00 us 11519 GETXATTR 0.66 35582.06 us 18.00 us 983226.00 us 1290 FINODELK 1.93 1499.23 us 47.00 us 1543777.00 us 89791 LOOKUP 4.19 87790.99 us 52.00 us 11504765.00 us 3329 READDIRP 13.62 2616.76 us 35.00 us 534914.00 us 363070 READ 78.09 65133.76 us 49.00 us 1024146.00 us 83658 WRITE Duration: 4279793 seconds Data Read: 134801101778949 bytes Data Written: 58578267669735 bytes Interval 5 Stats: Block Size: 131072b+ No. of Reads: 1129 No. of Writes: 0 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 90 RELEASEDIR 0.00 81.00 us 77.00 us 85.00 us 2 ENTRYLK 0.10 197.94 us 61.00 us 1277.00 us 35 OPENDIR 0.17 127.99 us 37.00 us 1322.00 us 91 STATFS 0.40 827.09 us 58.00 us 24347.00 us 33 FSTAT 0.41 28213.00 us 28213.00 us 28213.00 us 1 UNLINK 7.75 156.57 us 17.00 us 16695.00 us 3397 STAT 8.23 3870.01 us 138.00 us 284848.00 us 146 READDIRP 10.90 1623.22 us 50.00 us 23865.00 us 461 READ 72.04 327.04 us 87.00 us 60571.00 us 15125 LOOKUP Duration: 246 seconds Data Read: 147980288 bytes Data Written: 0 bytes Brick: HOSTNAME-01-B:/brick3/gvAA01/brick ----------------------------------------- Cumulative Stats: Block Size: 1b+ 2b+ 4b+ No. of Reads: 0 12 6 No. of Writes: 2567 781 1340 Block Size: 8b+ 16b+ 32b+ No. of Reads: 31 66 4036 No. of Writes: 2948 440590 38913 Block Size: 64b+ 128b+ 256b+ No. of Reads: 2734 1728 1167 No. of Writes: 75799 347793 470159 Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 9774 16256 29003 No. of Writes: 2265947 446846351 86298960 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 118753 170352 279731 No. of Writes: 147940171 119810004 134459511 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 1011892 1204426 639618284 No. of Writes: 29300584 134801413 251115548 Block Size: 262144b+ 524288b+ 1048576b+ No. of Reads: 0 3 368642 No. of Writes: 1808 14 312 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 556431 FORGET 0.00 0.00 us 0.00 us 0.00 us 646004 RELEASE 0.00 0.00 us 0.00 us 0.00 us 24454045 RELEASEDIR 0.00 308.00 us 308.00 us 308.00 us 1 RENAME 0.00 39.80 us 27.00 us 67.00 us 15 FLUSH 0.00 452.50 us 304.00 us 601.00 us 2 READDIR 0.00 62.24 us 43.00 us 88.00 us 29 LK 0.00 167.55 us 133.00 us 195.00 us 11 SETATTR 0.00 230.03 us 138.00 us 610.00 us 78 SETXATTR 0.00 252.74 us 27.00 us 40634.00 us 205 ENTRYLK 0.00 4938.91 us 750.00 us 39272.00 us 11 MKNOD 0.00 24506.75 us 379.00 us 59086.00 us 4 UNLINK 0.00 64.45 us 26.00 us 1244.00 us 1721 FSTAT 0.00 93.36 us 3.00 us 6609.00 us 5953 OPENDIR 0.00 725.79 us 29.00 us 556857.00 us 894 STATFS 0.00 151.83 us 73.00 us 40777.00 us 4636 OPEN 0.00 9566.51 us 531.00 us 85953.00 us 78 MKDIR 0.05 192.95 us 40.00 us 96090.00 us 46282 READ 0.05 2198.09 us 349.00 us 123915.00 us 4724 RCHECKSUM 0.12 2289.39 us 15.00 us 574911.00 us 9870 STAT 0.16 2874.39 us 29.00 us 725497.00 us 10731 GETXATTR 0.22 5318.90 us 112.00 us 613165.00 us 7887 FXATTROP 0.62 630.24 us 49.00 us 436551.00 us 188339 WRITE 1.04 19727.28 us 54.00 us 2411484.00 us 10085 READDIRP 1.71 3781.30 us 21.00 us 613196.00 us 86277 LOOKUP 1.86 120602.70 us 146.00 us 22966912.00 us 2935 FSYNC 4.43 29711.50 us 19.00 us 36688198.00 us 28412 INODELK 89.72 1630926.05 us 19.00 us 39459053.00 us 10488 FINODELK Duration: 4279793 seconds Data Read: 84395109496663 bytes Data Written: 55381384251629 bytes Interval 5 Stats: Block Size: 128b+ 1024b+ 2048b+ No. of Reads: 0 0 0 No. of Writes: 1 566 14 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 0 0 0 No. of Writes: 623 191 123 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 6 2 19 No. of Writes: 81 126 445 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 90 RELEASEDIR 0.00 71.00 us 44.00 us 98.00 us 3 ENTRYLK 0.00 117.00 us 102.00 us 132.00 us 2 FSTAT 0.00 160.97 us 54.00 us 989.00 us 35 OPENDIR 0.00 182.71 us 37.00 us 2711.00 us 91 STATFS 0.00 37863.00 us 37863.00 us 37863.00 us 1 UNLINK 0.01 10687.38 us 93.00 us 96090.00 us 24 READ 0.04 906.65 us 112.00 us 140471.00 us 793 FXATTROP 0.04 10671.96 us 130.00 us 156901.00 us 74 READDIRP 0.05 248.06 us 17.00 us 26637.00 us 3400 STAT 0.06 2929.57 us 1830.00 us 42843.00 us 365 FSYNC 0.20 1782.90 us 377.00 us 114465.00 us 2135 RCHECKSUM 0.62 699.57 us 87.00 us 44929.00 us 16450 LOOKUP 0.91 23643.10 us 80.00 us 227392.00 us 716 WRITE 7.92 34236.55 us 19.00 us 21085778.00 us 4296 INODELK 90.15 1542280.19 us 21.00 us 23438747.00 us 1086 FINODELK Duration: 246 seconds Data Read: 2984448 bytes Data Written: 85018520 bytes Brick: HOSTNAME-01-B:/brick2/gvAA01/brick ----------------------------------------- Cumulative Stats: Block Size: 1b+ 2b+ 4b+ No. of Reads: 0 6 1 No. of Writes: 5690 1158 1507 Block Size: 8b+ 16b+ 32b+ No. of Reads: 3 8 4220 No. of Writes: 4257 653093 38884 Block Size: 64b+ 128b+ 256b+ No. of Reads: 2057 143 237 No. of Writes: 138246 284967 301583 Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 9964 14318 24878 No. of Writes: 2799071 412706426 75565460 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 111870 177302 261376 No. of Writes: 190929249 79365618 255744283 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 1009823 1146396 686781597 No. of Writes: 27530847 83970418 194993443 Block Size: 262144b+ 524288b+ 1048576b+ No. of Reads: 0 1 0 No. 
of Writes: 1208 19 69 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 723412 FORGET 0.00 0.00 us 0.00 us 0.00 us 731255 RELEASE 0.00 0.00 us 0.00 us 0.00 us 24362990 RELEASEDIR 0.00 374.71 us 20.00 us 4282.00 us 21 FLUSH 0.00 262.48 us 35.00 us 2950.00 us 40 LK 0.00 2522.33 us 138.00 us 13040.00 us 15 SETATTR 0.00 8113.44 us 298.00 us 28903.00 us 9 RENAME 0.00 400.89 us 71.00 us 85670.00 us 1193 OPEN 0.00 6982.42 us 138.00 us 31124.00 us 78 SETXATTR 0.00 40219.80 us 955.00 us 114403.00 us 15 MKNOD 0.00 807.49 us 25.00 us 37564.00 us 894 STATFS 0.01 586.83 us 27.00 us 52770.00 us 1780 FSTAT 0.01 94259.50 us 1264.00 us 259989.00 us 12 UNLINK 0.01 238.64 us 15.00 us 418394.00 us 5290 ENTRYLK 0.03 20173.97 us 302.00 us 1014766.00 us 195 XATTROP 0.05 90005.90 us 1022.00 us 480876.00 us 78 MKDIR 0.06 908.45 us 15.00 us 46094.00 us 10388 STAT 0.06 2539.71 us 38.00 us 981906.00 us 3719 OPENDIR 0.29 11452.34 us 20.00 us 5427128.00 us 3916 GETXATTR 0.95 1026081.59 us 632.00 us 18320824.00 us 141 FSYNC 1.19 554.78 us 14.00 us 9904525.00 us 327694 INODELK 1.67 3739.81 us 67.00 us 9923314.00 us 68290 LOOKUP 1.76 2587364.42 us 45.00 us 134512764.00 us 104 READDIR 4.39 208181.93 us 58.00 us 39750208.00 us 3224 READDIRP 4.59 4377.20 us 259.00 us 10053127.00 us 160191 RCHECKSUM 7.95 5493.00 us 38.00 us 1437578.00 us 221339 READ 21.05 110957.49 us 151.00 us 15390646.00 us 29002 FXATTROP 27.49 29140.19 us 47.00 us 5380701.00 us 144248 WRITE 28.45 103576.62 us 12.00 us 39311904.00 us 41993 FINODELK Duration: 4279793 seconds Data Read: 90182870915545 bytes Data Written: 43304871600094 bytes Interval 5 Stats: Block Size: 131072b+ No. of Reads: 585 No. of Writes: 0 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 90 RELEASEDIR 0.00 390.00 us 390.00 us 390.00 us 1 FSTAT 0.01 363.34 us 54.00 us 4960.00 us 35 OPENDIR 0.01 175.43 us 30.00 us 3792.00 us 91 STATFS 0.22 3670.76 us 65.00 us 173128.00 us 148 READ 0.54 18823.38 us 146.00 us 428923.00 us 72 READDIRP 1.75 55.66 us 16.00 us 63524.00 us 78812 INODELK 2.19 1615.88 us 16.00 us 28474.00 us 3397 STAT 12.31 2043.55 us 96.00 us 127094.00 us 15129 LOOKUP 82.99 5290.98 us 359.00 us 998606.00 us 39404 RCHECKSUM Duration: 246 seconds Data Read: 76677120 bytes Data Written: 0 bytes Brick: HOSTNAME-01-B:/brick7/gvAA01/brick ----------------------------------------- Cumulative Stats: Block Size: 1b+ 2b+ 4b+ No. of Reads: 0 13 4 No. of Writes: 1196 734 1040 Block Size: 8b+ 16b+ 32b+ No. of Reads: 1 36 3299 No. of Writes: 2062 300726 37753 Block Size: 64b+ 128b+ 256b+ No. of Reads: 2023 496 525 No. of Writes: 78098 422006 308797 Block Size: 512b+ 1024b+ 2048b+ No. of Reads: 18849 14540 35638 No. of Writes: 1846989 429384177 84605233 Block Size: 4096b+ 8192b+ 16384b+ No. of Reads: 150787 230288 420816 No. of Writes: 137132537 56105000 83554076 Block Size: 32768b+ 65536b+ 131072b+ No. of Reads: 1193840 1821978 1328519682 No. of Writes: 23571537 108072617 188243753 Block Size: 262144b+ 524288b+ 1048576b+ No. of Reads: 0 0 0 No. of Writes: 1439 45 467 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 584854 FORGET 0.00 0.00 us 0.00 us 0.00 us 847666 RELEASE 0.00 0.00 us 0.00 us 0.00 us 25363464 RELEASEDIR 0.00 64.12 us 40.00 us 118.00 us 8 LK 0.00 55.38 us 28.00 us 195.00 us 13 FLUSH 0.00 584.50 us 296.00 us 873.00 us 2 XATTROP 0.00 191.79 us 117.00 us 302.00 us 14 SETATTR 0.00 861.00 us 700.00 us 921.00 us 6 MKNOD 0.00 15796.00 us 15796.00 us 15796.00 us 1 RENAME 0.00 242.94 us 137.00 us 827.00 us 78 SETXATTR 0.00 80.85 us 29.00 us 2109.00 us 280 ENTRYLK 0.00 40533.00 us 40533.00 us 40533.00 us 1 LINK 0.00 12168.00 us 3663.00 us 29808.00 us 5 CREATE 0.00 85.45 us 27.00 us 4758.00 us 894 STATFS 0.00 1005.23 us 84.00 us 6375.00 us 157 FSYNC 0.00 2083.33 us 570.00 us 29605.00 us 78 MKDIR 0.00 46997.20 us 1143.00 us 78119.00 us 5 UNLINK 0.00 57905.17 us 236.00 us 344520.00 us 6 READDIR 0.01 76.43 us 28.00 us 86185.00 us 8213 FSTAT 0.02 155.31 us 76.00 us 61948.00 us 9793 OPEN 0.02 7686.27 us 146.00 us 89509.00 us 231 RCHECKSUM 0.04 100.69 us 3.00 us 41331.00 us 31020 OPENDIR 0.09 3927.64 us 81.00 us 852406.00 us 1732 FXATTROP 0.13 238.47 us 21.00 us 36324.00 us 39834 GETXATTR 0.20 217.49 us 42.00 us 104309.00 us 68968 READ 0.55 1549.97 us 62.00 us 186554.00 us 25900 WRITE 1.14 8343.75 us 13.00 us 877090.00 us 10010 STAT 2.36 2870.24 us 55.00 us 133764.00 us 60231 READDIRP 3.39 1648.07 us 15.00 us 1386407.00 us 150815 LOOKUP 18.82 34568.94 us 20.00 us 255108468.00 us 39875 INODELK 73.20 2950233.38 us 22.00 us 238932201.00 us 1817 FINODELK Duration: 4279793 seconds Data Read: 174378591384147 bytes Data Written: 42067381218610 bytes Interval 5 Stats: Block Size: 32b+ 2048b+ 4096b+ No. of Reads: 1 0 0 No. of Writes: 0 2 90 Block Size: 8192b+ 16384b+ 32768b+ No. of Reads: 0 0 1 No. of Writes: 34 59 174 Block Size: 65536b+ 131072b+ No. of Reads: 0 102 No. of Writes: 1045 1863 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 8 FORGET 0.00 0.00 us 0.00 us 0.00 us 5 RELEASE 0.00 0.00 us 0.00 us 0.00 us 93 RELEASEDIR 0.00 48.00 us 35.00 us 61.00 us 2 FLUSH 0.00 182.00 us 79.00 us 285.00 us 2 FSTAT 0.00 188.50 us 184.00 us 193.00 us 2 SETATTR 0.00 873.00 us 873.00 us 873.00 us 1 XATTROP 0.00 492.00 us 154.00 us 1073.00 us 3 OPEN 0.00 314.67 us 128.00 us 658.00 us 6 GETXATTR 0.00 115.02 us 29.00 us 1537.00 us 44 ENTRYLK 0.00 168.39 us 45.00 us 861.00 us 38 OPENDIR 0.00 166.43 us 39.00 us 4758.00 us 91 STATFS 0.00 40533.00 us 40533.00 us 40533.00 us 1 LINK 0.00 23496.50 us 17185.00 us 29808.00 us 2 CREATE 0.00 1991.03 us 209.00 us 17197.00 us 35 READ 0.00 1044.09 us 84.00 us 3060.00 us 74 FSYNC 0.00 1268.33 us 159.00 us 8139.00 us 70 READDIRP 0.00 39216.75 us 1143.00 us 60962.00 us 4 UNLINK 0.01 149.32 us 16.00 us 55158.00 us 3479 STAT 0.01 7967.72 us 146.00 us 76151.00 us 106 RCHECKSUM 0.02 2188.39 us 81.00 us 87414.00 us 500 FXATTROP 0.08 296.47 us 80.00 us 20575.00 us 16886 LOOKUP 0.24 10693.63 us 85.00 us 186554.00 us 1365 WRITE 20.68 5275482.42 us 32.00 us 255108468.00 us 239 INODELK 78.95 8146074.61 us 22.00 us 238932201.00 us 591 FINODELK Duration: 246 seconds Data Read: 13408908 bytes Data Written: 373263308 bytes Brick: HOSTNAME-00-A:/arbiterAA01/gvAA01/brick2 ----------------------------------------------- Cumulative Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 4595631725 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 5210160 FORGET 0.00 0.00 us 0.00 us 0.00 us 19833857 RELEASE 0.00 0.00 us 0.00 us 0.00 us 121845659 RELEASEDIR 0.00 47.77 us 29.00 us 72.00 us 22 FLUSH 0.00 67.21 us 32.00 us 161.00 us 42 LK 0.00 190.20 us 142.00 us 368.00 us 15 SETATTR 0.00 355.78 us 255.00 us 408.00 us 9 RENAME 0.01 660.58 us 239.00 us 988.00 us 12 UNLINK 0.01 223.95 us 142.00 us 652.00 us 78 SETXATTR 0.02 1351.87 us 991.00 us 1807.00 us 15 MKNOD 0.11 1625.97 us 1013.00 us 3225.00 us 78 MKDIR 0.14 134.56 us 73.00 us 1296.00 us 1194 OPEN 0.18 1095.03 us 265.00 us 17673.00 us 195 XATTROP 0.25 56.99 us 18.00 us 3303.00 us 5290 ENTRYLK 0.57 286.73 us 46.00 us 28784.00 us 2379 OPENDIR 2.47 69.67 us 17.00 us 31954.00 us 42231 FINODELK 3.17 26765.84 us 73.00 us 243526.00 us 141 FSYNC 6.30 73.60 us 22.00 us 18488.00 us 101758 WRITE 9.51 389.77 us 167.00 us 21472.00 us 29019 FXATTROP 17.26 62.64 us 16.00 us 37178.00 us 327699 INODELK 59.99 1046.54 us 81.00 us 83922.00 us 68182 LOOKUP Duration: 19297631 seconds Data Read: 0 bytes Data Written: 4595631725 bytes Interval 5 Stats: %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 90 RELEASEDIR 0.11 1109.06 us 64.00 us 28784.00 us 35 OPENDIR 14.97 64.35 us 16.00 us 36247.00 us 78809 INODELK 84.92 1903.39 us 101.00 us 57323.00 us 15117 LOOKUP Duration: 246 seconds Data Read: 0 bytes Data Written: 0 bytes Brick: HOSTNAME-00-A:/arbiterAA01/gvAA01/brick5 ----------------------------------------------- Cumulative Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 3725579480 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 4675824 FORGET 0.00 0.00 us 0.00 us 0.00 us 3501480 RELEASE 0.00 0.00 us 0.00 us 0.00 us 121321048 RELEASEDIR 0.00 43.60 us 26.00 us 60.00 us 5 GETXATTR 0.00 183.50 us 136.00 us 254.00 us 6 SETATTR 0.00 94.50 us 34.00 us 360.00 us 12 FLUSH 0.00 637.50 us 378.00 us 897.00 us 2 UNLINK 0.00 152.08 us 54.00 us 1647.00 us 24 LK 0.01 923.40 us 352.00 us 2505.00 us 5 READDIR 0.02 1971.33 us 952.00 us 4079.00 us 6 MKNOD 0.03 278.36 us 158.00 us 1290.00 us 78 SETXATTR 0.03 70.37 us 27.00 us 906.00 us 334 ENTRYLK 0.06 147.48 us 83.00 us 1130.00 us 311 OPEN 0.18 1747.03 us 991.00 us 14702.00 us 78 MKDIR 0.28 178.04 us 33.00 us 9265.00 us 1190 OPENDIR 0.31 92.93 us 27.00 us 28425.00 us 2510 FINODELK 2.11 1980.44 us 117.00 us 116330.00 us 812 FSYNC 2.48 1109.55 us 141.00 us 193978.00 us 1702 FXATTROP 3.54 77.89 us 21.00 us 19627.00 us 34622 WRITE 7.06 60.51 us 21.00 us 43675.00 us 88861 INODELK 83.89 470.13 us 46.00 us 227967.00 us 135916 LOOKUP Duration: 19297631 seconds Data Read: 0 bytes Data Written: 3725579480 bytes Interval 5 Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 949 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 90 RELEASEDIR 0.09 532.37 us 59.00 us 4354.00 us 35 OPENDIR 0.15 79.14 us 42.00 us 813.00 us 388 WRITE 0.44 75.07 us 27.00 us 1936.00 us 1164 FINODELK 0.94 241.59 us 141.00 us 3508.00 us 776 FXATTROP 1.86 957.12 us 124.00 us 9367.00 us 388 FSYNC 8.59 63.47 us 21.00 us 43675.00 us 27031 INODELK 87.92 1125.41 us 103.00 us 60160.00 us 15599 LOOKUP Duration: 246 seconds Data Read: 0 bytes Data Written: 949 bytes Brick: HOSTNAME-00-A:/arbiterAA01/gvAA01/brick9 ----------------------------------------------- Cumulative Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 13824276920 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 11714931 FORGET 0.00 0.00 us 0.00 us 0.00 us 12574659 RELEASE 0.00 0.00 us 0.00 us 0.00 us 121307214 RELEASEDIR 0.00 612.00 us 612.00 us 612.00 us 1 RENAME 0.00 53.69 us 32.00 us 75.00 us 13 FLUSH 0.00 146.43 us 118.00 us 157.00 us 7 SETATTR 0.00 1053.00 us 1053.00 us 1053.00 us 1 UNLINK 0.00 90.76 us 39.00 us 575.00 us 25 LK 0.01 1011.83 us 280.00 us 3498.00 us 6 READDIR 0.01 1154.50 us 873.00 us 1578.00 us 6 MKNOD 0.01 65.37 us 32.00 us 609.00 us 202 ENTRYLK 0.02 212.12 us 124.00 us 561.00 us 78 SETXATTR 0.05 408.86 us 31.00 us 989.00 us 111 GETXATTR 0.07 66.08 us 25.00 us 1177.00 us 1094 FINODELK 0.11 1413.50 us 793.00 us 6939.00 us 78 MKDIR 0.17 145.79 us 3.00 us 16855.00 us 1188 OPENDIR 0.90 566.03 us 178.00 us 101200.00 us 1583 FXATTROP 1.17 132.19 us 70.00 us 15892.00 us 8843 OPEN 2.28 63.93 us 25.00 us 14064.00 us 35560 INODELK 3.30 25327.51 us 77.00 us 275527.00 us 130 FSYNC 6.02 71.86 us 20.00 us 29539.00 us 83532 WRITE 85.86 950.72 us 45.00 us 234956.00 us 89984 LOOKUP Duration: 19297631 seconds Data Read: 0 bytes Data Written: 13824276920 bytes Interval 5 Stats: %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 90 RELEASEDIR 0.00 107.50 us 84.00 us 131.00 us 2 ENTRYLK 0.00 1053.00 us 1053.00 us 1053.00 us 1 UNLINK 0.13 946.06 us 58.00 us 16855.00 us 35 OPENDIR 99.87 1700.08 us 89.00 us 67917.00 us 15086 LOOKUP Duration: 246 seconds Data Read: 0 bytes Data Written: 0 bytes Brick: HOSTNAME-00-A:/arbiterAA01/gvAA01/brick1 ----------------------------------------------- Cumulative Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 2407865216 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 3885977 FORGET 0.00 0.00 us 0.00 us 0.00 us 1908766 RELEASE 0.00 0.00 us 0.00 us 0.00 us 121336583 RELEASEDIR 0.00 41.33 us 37.00 us 44.00 us 3 FLUSH 0.00 107.40 us 60.00 us 216.00 us 5 LK 0.00 197.00 us 144.00 us 367.00 us 9 SETATTR 0.00 559.25 us 269.00 us 1318.00 us 4 UNLINK 0.01 117.55 us 34.00 us 426.00 us 49 GETXATTR 0.01 729.10 us 111.00 us 2755.00 us 10 FSTAT 0.02 283.04 us 169.00 us 1903.00 us 78 SETXATTR 0.04 71.01 us 24.00 us 3460.00 us 547 ENTRYLK 0.05 5719.78 us 1045.00 us 39386.00 us 9 MKNOD 0.10 1199.26 us 338.00 us 2419.00 us 88 XATTROP 0.10 18965.50 us 244.00 us 111009.00 us 6 READDIR 0.13 1842.74 us 740.00 us 17409.00 us 78 MKDIR 0.22 196.96 us 3.00 us 13973.00 us 1241 OPENDIR 0.37 848.02 us 97.00 us 22427.00 us 478 FSYNC 0.58 142.08 us 80.00 us 9771.00 us 4495 OPEN 0.71 79.24 us 30.00 us 8519.00 us 9794 WRITE 0.72 197.05 us 83.00 us 36661.00 us 4020 FXATTROP 2.42 80.26 us 16.00 us 20655.00 us 33176 INODELK 4.57 791.78 us 19.00 us 854250.00 us 6345 FINODELK 89.97 1130.79 us 47.00 us 716773.00 us 87554 LOOKUP Duration: 19297631 seconds Data Read: 0 bytes Data Written: 2407865216 bytes Interval 5 Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 2300 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 159 RELEASEDIR 0.02 993.33 us 161.00 us 2755.00 us 6 FSTAT 0.04 85.40 us 24.00 us 3460.00 us 169 ENTRYLK 0.14 957.84 us 61.00 us 13973.00 us 55 OPENDIR 0.16 76.25 us 38.00 us 1284.00 us 795 WRITE 0.17 1246.94 us 468.00 us 2063.00 us 52 XATTROP 0.36 755.13 us 97.00 us 10173.00 us 185 FSYNC 0.68 165.35 us 92.00 us 6470.00 us 1586 FXATTROP 1.75 90.50 us 16.00 us 7633.00 us 7475 INODELK 6.20 972.86 us 23.00 us 854250.00 us 2464 FINODELK 90.50 1686.49 us 89.00 us 78332.00 us 20759 LOOKUP Duration: 246 seconds Data Read: 0 bytes Data Written: 2300 bytes Brick: HOSTNAME-00-A:/arbiterAA01/gvAA01/brick8 ----------------------------------------------- Cumulative Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 5492133233 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 6710878 FORGET 0.00 0.00 us 0.00 us 0.00 us 5709109 RELEASE 0.00 0.00 us 0.00 us 0.00 us 121308841 RELEASEDIR 0.00 381.00 us 381.00 us 381.00 us 1 LINK 0.00 321.50 us 309.00 us 334.00 us 2 RENAME 0.00 50.13 us 31.00 us 92.00 us 15 FLUSH 0.00 68.93 us 43.00 us 136.00 us 30 LK 0.00 207.60 us 137.00 us 379.00 us 10 SETATTR 0.00 456.20 us 247.00 us 837.00 us 5 UNLINK 0.00 67.16 us 39.00 us 233.00 us 56 FINODELK 0.01 1313.12 us 943.00 us 1632.00 us 8 MKNOD 0.01 61.85 us 34.00 us 384.00 us 214 ENTRYLK 0.02 260.12 us 141.00 us 1053.00 us 78 SETXATTR 0.04 688.39 us 327.00 us 2844.00 us 56 FXATTROP 0.11 1360.77 us 862.00 us 3735.00 us 78 MKDIR 0.22 177.44 us 49.00 us 11813.00 us 1185 OPENDIR 0.90 147.72 us 78.00 us 17203.00 us 5732 OPEN 1.30 79.82 us 37.00 us 11963.00 us 15230 WRITE 1.65 67.26 us 23.00 us 13182.00 us 23091 INODELK 95.71 796.62 us 69.00 us 131536.00 us 112767 LOOKUP Duration: 19297631 seconds Data Read: 0 bytes Data Written: 5492133233 bytes Interval 5 Stats: %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 90 RELEASEDIR 0.00 106.00 us 84.00 us 128.00 us 2 ENTRYLK 0.00 379.00 us 379.00 us 379.00 us 1 SETATTR 0.00 513.00 us 513.00 us 513.00 us 1 UNLINK 0.00 665.50 us 662.00 us 669.00 us 2 INODELK 0.15 1173.54 us 86.00 us 11813.00 us 35 OPENDIR 99.84 1831.00 us 105.00 us 93579.00 us 15150 LOOKUP Duration: 246 seconds Data Read: 0 bytes Data Written: 0 bytes Brick: HOSTNAME-00-A:/arbiterAA01/gvAA01/brick3 ----------------------------------------------- Cumulative Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 4572691669 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 4594149 FORGET 0.00 0.00 us 0.00 us 0.00 us 3305802 RELEASE 0.00 0.00 us 0.00 us 0.00 us 121331753 RELEASEDIR 0.00 351.00 us 351.00 us 351.00 us 1 RENAME 0.00 52.20 us 31.00 us 74.00 us 15 FLUSH 0.00 150.27 us 116.00 us 192.00 us 11 SETATTR 0.00 70.48 us 40.00 us 155.00 us 29 LK 0.00 698.00 us 294.00 us 1204.00 us 4 UNLINK 0.01 101.02 us 45.00 us 170.00 us 123 GETXATTR 0.02 226.96 us 132.00 us 661.00 us 78 SETXATTR 0.03 2449.45 us 981.00 us 14588.00 us 11 MKNOD 0.03 133.80 us 33.00 us 14384.00 us 205 ENTRYLK 0.08 13713.00 us 250.00 us 79575.00 us 6 READDIR 0.11 1403.08 us 856.00 us 3307.00 us 78 MKDIR 0.15 130.34 us 3.00 us 7373.00 us 1188 OPENDIR 0.63 135.12 us 75.00 us 3913.00 us 4698 OPEN 1.58 151.39 us 20.00 us 720477.00 us 10441 FINODELK 2.03 70.30 us 26.00 us 31988.00 us 28857 INODELK 3.61 457.95 us 148.00 us 96986.00 us 7889 FXATTROP 13.93 74.03 us 19.00 us 38536.00 us 188198 WRITE 15.54 5296.92 us 87.00 us 285482.00 us 2936 FSYNC 62.24 723.17 us 19.00 us 1037869.00 us 86100 LOOKUP Duration: 19297631 seconds Data Read: 0 bytes Data Written: 4572691669 bytes Interval 5 Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 2170 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 90 RELEASEDIR 0.00 105.67 us 65.00 us 164.00 us 3 ENTRYLK 0.01 950.00 us 950.00 us 950.00 us 1 UNLINK 0.07 398.43 us 64.00 us 5324.00 us 35 OPENDIR 0.34 88.92 us 36.00 us 4283.00 us 716 WRITE 0.52 88.23 us 22.00 us 6820.00 us 1112 FINODELK 1.21 289.14 us 156.00 us 6432.00 us 793 FXATTROP 1.53 796.04 us 104.00 us 2668.00 us 365 FSYNC 1.55 68.03 us 29.00 us 1880.00 us 4317 INODELK 94.77 1086.38 us 76.00 us 101220.00 us 16536 LOOKUP Duration: 246 seconds Data Read: 0 bytes Data Written: 2170 bytes Brick: HOSTNAME-00-A:/arbiterAA01/gvAA01/brick6 ----------------------------------------------- Cumulative Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 3299040333 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 4224301 FORGET 0.00 0.00 us 0.00 us 0.00 us 2341147 RELEASE 0.00 0.00 us 0.00 us 0.00 us 121317598 RELEASEDIR 0.00 75.25 us 39.00 us 117.00 us 4 FLUSH 0.00 70.62 us 47.00 us 100.00 us 8 LK 0.00 156.11 us 125.00 us 189.00 us 9 SETATTR 0.00 1980.00 us 1980.00 us 1980.00 us 1 RENAME 0.00 364.29 us 311.00 us 434.00 us 7 UNLINK 0.01 134.92 us 31.00 us 662.00 us 53 GETXATTR 0.01 1307.44 us 1005.00 us 1850.00 us 9 MKNOD 0.01 79.36 us 26.00 us 665.00 us 157 FINODELK 0.02 275.58 us 162.00 us 1593.00 us 78 SETXATTR 0.03 108.84 us 33.00 us 4176.00 us 238 ENTRYLK 0.12 1469.38 us 947.00 us 7741.00 us 78 MKDIR 0.16 17107.56 us 88.00 us 150018.00 us 9 READDIR 0.22 170.89 us 3.00 us 12533.00 us 1191 OPENDIR 0.24 1117.26 us 216.00 us 79180.00 us 202 FXATTROP 0.85 38142.10 us 498.00 us 126241.00 us 21 FSYNC 0.89 85.41 us 21.00 us 27516.00 us 9833 WRITE 0.98 138.12 us 74.00 us 3527.00 us 6670 OPEN 1.87 65.50 us 24.00 us 18157.00 us 26861 INODELK 94.56 840.11 us 19.00 us 150836.00 us 105706 LOOKUP Duration: 19297631 seconds Data Read: 0 bytes Data Written: 3299040333 bytes Interval 5 Stats: %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 90 RELEASEDIR 0.01 1980.00 us 1980.00 us 1980.00 us 1 RENAME 0.04 2164.25 us 113.00 us 4176.00 us 4 ENTRYLK 0.14 780.77 us 62.00 us 12533.00 us 35 OPENDIR 99.81 1300.71 us 65.00 us 96686.00 us 15135 LOOKUP Duration: 246 seconds Data Read: 0 bytes Data Written: 0 bytes Brick: HOSTNAME-00-A:/arbiterAA01/gvAA01/brick4 ----------------------------------------------- Cumulative Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 4188179800 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 4843776 FORGET 0.00 0.00 us 0.00 us 0.00 us 4734700 RELEASE 0.00 0.00 us 0.00 us 0.00 us 122187144 RELEASEDIR 0.00 453.00 us 453.00 us 453.00 us 1 RENAME 0.00 2935.00 us 2935.00 us 2935.00 us 1 MKNOD 0.00 314.83 us 275.00 us 348.00 us 12 LINK 0.00 46.91 us 22.00 us 394.00 us 116 FLUSH 0.00 78.42 us 37.00 us 725.00 us 78 LK 0.00 152.00 us 66.00 us 2303.00 us 41 FTRUNCATE 0.01 120.96 us 27.00 us 211.00 us 80 GETXATTR 0.01 1123.27 us 400.00 us 1955.00 us 11 XATTROP 0.01 264.98 us 89.00 us 4166.00 us 59 SETATTR 0.01 522.77 us 219.00 us 1988.00 us 31 UNLINK 0.01 3116.33 us 335.00 us 15814.00 us 6 READDIR 0.02 194.27 us 87.00 us 1585.00 us 146 OPEN 0.02 404.47 us 146.00 us 10053.00 us 78 SETXATTR 0.03 94.93 us 24.00 us 12262.00 us 407 ENTRYLK 0.04 380.95 us 82.00 us 30028.00 us 169 FSTAT 0.08 1825.35 us 1237.00 us 3241.00 us 62 CREATE 0.11 2054.41 us 1177.00 us 7853.00 us 78 MKDIR 0.11 137.97 us 4.00 us 4662.00 us 1188 OPENDIR 0.93 140.91 us 18.00 us 368752.00 us 9912 FINODELK 2.28 404.58 us 88.00 us 26154.00 us 8431 FXATTROP 20.16 9694.89 us 64.00 us 288694.00 us 3111 FSYNC 20.76 68.66 us 19.00 us 45520.00 us 452355 WRITE 26.67 59.16 us 16.00 us 26108.00 us 674404 INODELK 28.72 687.59 us 46.00 us 85079.00 us 62473 LOOKUP Duration: 19297631 seconds Data Read: 0 bytes Data Written: 4188179800 bytes Interval 5 Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 62 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 90 RELEASEDIR 0.01 71.83 us 46.00 us 116.00 us 30 WRITE 0.03 150.81 us 33.00 us 1289.00 us 37 FINODELK 0.08 489.54 us 63.00 us 4603.00 us 35 OPENDIR 0.20 874.41 us 240.00 us 2977.00 us 51 FXATTROP 19.05 66.56 us 17.00 us 14272.00 us 62733 INODELK 80.63 1162.50 us 99.00 us 85079.00 us 15201 LOOKUP Duration: 246 seconds Data Read: 0 bytes Data Written: 62 bytes Brick: HOSTNAME-00-A:/arbiterAA01/gvAA01/brick7 ----------------------------------------------- Cumulative Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 4459447660 %-latency Avg-latency Min-Latency Max-Latency No. of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 4903888 FORGET 0.00 0.00 us 0.00 us 0.00 us 4906100 RELEASE 0.00 0.00 us 0.00 us 0.00 us 121313760 RELEASEDIR 0.00 157.00 us 157.00 us 157.00 us 1 FSTAT 0.00 334.00 us 334.00 us 334.00 us 1 LINK 0.00 492.00 us 492.00 us 492.00 us 1 RENAME 0.00 81.56 us 50.00 us 121.00 us 9 LK 0.00 600.00 us 433.00 us 767.00 us 2 XATTROP 0.00 284.69 us 26.00 us 1664.00 us 13 FLUSH 0.00 343.79 us 162.00 us 1811.00 us 14 SETATTR 0.00 1061.67 us 236.00 us 3527.00 us 6 UNLINK 0.01 1371.00 us 997.00 us 2107.00 us 6 MKNOD 0.01 109.50 us 33.00 us 166.00 us 103 GETXATTR 0.01 3619.60 us 1046.00 us 6783.00 us 5 CREATE 0.02 258.63 us 161.00 us 1183.00 us 78 SETXATTR 0.05 214.24 us 27.00 us 6443.00 us 280 ENTRYLK 0.09 1455.38 us 615.00 us 7156.00 us 78 MKDIR 0.11 24019.00 us 296.00 us 140960.00 us 6 READDIR 0.17 181.83 us 4.00 us 11536.00 us 1191 OPENDIR 0.20 1608.87 us 48.00 us 21829.00 us 157 FSYNC 1.05 138.38 us 73.00 us 1645.00 us 9762 OPEN 1.27 945.52 us 99.00 us 35935.00 us 1730 FXATTROP 2.19 71.12 us 25.00 us 12922.00 us 39768 INODELK 2.31 115.41 us 18.00 us 17045.00 us 25869 WRITE 10.45 6550.60 us 19.00 us 5168504.00 us 2062 FINODELK 82.07 703.56 us 22.00 us 235243.00 us 150709 LOOKUP Duration: 19297631 seconds Data Read: 0 bytes Data Written: 4459447660 bytes Interval 5 Stats: Block Size: 1b+ No. of Reads: 0 No. of Writes: 3267 %-latency Avg-latency Min-Latency Max-Latency No. 
of calls Fop --------- ----------- ----------- ----------- ------------ ---- 0.00 0.00 us 0.00 us 0.00 us 9 FORGET 0.00 0.00 us 0.00 us 0.00 us 5 RELEASE 0.00 0.00 us 0.00 us 0.00 us 93 RELEASEDIR 0.00 157.00 us 157.00 us 157.00 us 1 FSTAT 0.00 334.00 us 334.00 us 334.00 us 1 LINK 0.00 187.00 us 170.00 us 204.00 us 2 SETATTR 0.00 272.50 us 54.00 us 491.00 us 2 FLUSH 0.00 767.00 us 767.00 us 767.00 us 1 XATTROP 0.01 654.00 us 153.00 us 1645.00 us 3 OPEN 0.03 1135.40 us 236.00 us 3527.00 us 5 UNLINK 0.04 3975.50 us 1713.00 us 6238.00 us 2 CREATE 0.05 305.97 us 56.00 us 2057.00 us 38 OPENDIR 0.12 607.39 us 27.00 us 6443.00 us 44 ENTRYLK 0.59 1786.74 us 48.00 us 21829.00 us 74 FSYNC 0.73 661.93 us 33.00 us 12922.00 us 245 INODELK 3.02 1345.12 us 101.00 us 31614.00 us 500 FXATTROP 3.23 526.99 us 18.00 us 14746.00 us 1365 WRITE 22.47 6998.94 us 19.00 us 2596341.00 us 715 FINODELK 69.70 916.77 us 78.00 us 51010.00 us 16930 LOOKUP Duration: 246 seconds Data Read: 0 bytes Data Written: 3267 bytes From patrickmrennie at gmail.com Sun Apr 21 14:28:52 2019 From: patrickmrennie at gmail.com (Patrick Rennie) Date: Sun, 21 Apr 2019 22:28:52 +0800 Subject: [Gluster-users] Extremely slow cluster performance In-Reply-To: References: Message-ID: I think just worked out why NFS lookups are sometimes slow and sometimes fast as the hostname uses round robin DNS lookups, if I change to a specific host, 01-B, it's always quick, and if I change to the other brick host, 02-B, it's always slow. Maybe that will help to narrow this down? On Sun, Apr 21, 2019 at 10:24 PM Patrick Rennie wrote: > Hi Strahil, > > Thank you for your reply and your suggestions. I'm not sure which logs > would be most relevant to be checking to diagnose this issue, we have the > brick logs, the cluster mount logs, the shd logs or something else? I have > posted a few that I have seen repeated a few times already. I will continue > to post anything further that I see. > I am working on migrating data to some new storage, so this will slowly > free up space, although this is a production cluster and new data is being > uploaded every day, sometimes faster than I can migrate it off. I have > several other similar clusters and none of them have the same problem, one > the others is actually at 98-99% right now (big problem, I know) but still > performs perfectly fine compared to this cluster, I am not sure low space > is the root cause here. > > I currently have 13 VMs accessing this cluster, I have checked each one > and all of them use one of the two options below to mount the cluster in > fstab > > HOSTNAME:/gvAA01 /mountpoint glusterfs > defaults,_netdev,rw,log-level=WARNING,direct-io-mode=disable,use-readdirp=no > 0 0 > HOSTNAME:/gvAA01 /mountpoint glusterfs > defaults,_netdev,rw,log-level=WARNING,direct-io-mode=disable > > I also have a few other VMs which use NFS to access the cluster, and these > machines appear to be significantly quicker, initially I get a similar > delay with NFS but if I cancel the first "ls" and try it again I get < 1 > sec lookups, this can take over 10 minutes by FUSE/gluster client, but the > same trick of cancelling and trying again doesn't work for FUSE/gluster. > Sometimes the NFS queries have no delay at all, so this is a bit strange to > me. 
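On the round-robin DNS point above: it is easy to confirm whether the shared hostname hands out multiple A records, and an NFS client can be pinned to the faster node while the slow one is investigated. The hostnames below are the sanitized placeholders used in this thread, so treat this as a sketch rather than exact configuration:

  # Does the shared name resolve to more than one address, in rotating order?
  host HOSTNAME
  dig +short HOSTNAME

  # Pin the NFS mount to one node by putting its name directly in the fstab entry
  # (same options as the NFS line quoted just below, only the host changes):
  01-B:/gvAA01  /mountpoint  nfs  defaults,_netdev,vers=3,async,noatime  0 0

Pinning trades away load spreading and only masks the symptom: the FUSE clients still talk to every brick on both nodes, so the slow node itself is still worth chasing.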
> HOSTNAME:/gvAA01 /mountpoint/ nfs > defaults,_netdev,vers=3,async,noatime 0 0 > > Example: > user at VM:~$ time ls /cluster/folder > ^C > > real 9m49.383s > user 0m0.001s > sys 0m0.010s > > user at VM:~$ time ls /cluster/folder > > > real 0m0.069s > user 0m0.001s > sys 0m0.007s > > --- > > I have checked the profiling as you suggested, I let it run for around a > minute, then cancelled it and saved the profile info. > > root at HOSTNAME:/var/log/glusterfs# gluster volume profile gvAA01 start > Starting volume profile on gvAA01 has been successful > root at HOSTNAME:/var/log/glusterfs# time ls /cluster/folder > ^C > > real 1m1.660s > user 0m0.000s > sys 0m0.002s > > root at HOSTNAME:/var/log/glusterfs# gluster volume profile gvAA01 info >> > ~/profile.txt > root at HOSTNAME:/var/log/glusterfs# gluster volume profile gvAA01 stop > > I will attach the results to this email as it's over 1000 lines. > Unfortunately, I'm not sure what I'm looking at but possibly somebody will > be able to help me make sense of it and let me know if it highlights any > specific issues. > > Happy to try any further suggestions. Thank you, > > -Patrick > > On Sun, Apr 21, 2019 at 7:55 PM Strahil wrote: > >> By the way, can you provide the 'volume info' and the mount options on >> all clients? >> Maybe , there is an option that uses a lot of resources due to some >> client's mount options. >> >> Best Regards, >> Strahil Nikolov >> On Apr 21, 2019 10:55, Patrick Rennie wrote: >> >> Just another small update, I'm continuing to watch my brick logs and I >> just saw these errors come up in the recent events too. I am going to >> continue to post any errors I see in the hope of finding the right one to >> try and fix.. >> This is from the logs on brick1, seems to be occurring on both nodes on >> brick1, although at different times. I'm not sure what this means, can >> anyone shed any light? >> I guess I am looking for some kind of specific error which may indicate >> something is broken or stuck and locking up and causing the extreme latency >> I'm seeing in the cluster. 
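Since the full profile output was attached earlier in the thread, a rough way to see which file operations dominate latency per brick is to filter the saved file. This assumes the profile was written to ~/profile.txt as in the transcript above, and the 10% cut-off is arbitrary:

  awk '/^Brick:/ {brick=$2}
       $NF ~ /^[A-Z]+$/ && $1+0 > 10 {printf "%-50s %7s%%  %s\n", brick, $1, $NF}' ~/profile.txt

In the figures quoted above, LOOKUP (and on some bricks INODELK) accounts for the bulk of the latency on almost every brick, which points at metadata lookups and lock contention rather than raw write throughput.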
>> >> [2019-04-21 07:25:55.064497] E [rpcsvc.c:1364:rpcsvc_submit_generic] >> 0-rpc-service: failed to submit message (XID: 0x7c700c, Program: GlusterFS >> 3.3, ProgVers: 330, Proc: 29) to rpc-transport (tcp.gvAA01-server) >> [2019-04-21 07:25:55.064612] E [server.c:195:server_submit_reply] >> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e58a) >> [0x7f3b3e93158a] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17d45) >> [0x7f3b3e4c5d45] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) >> [0x7f3b3e4b72cd] ) 0-: Reply submission failed >> [2019-04-21 07:25:55.064675] E [rpcsvc.c:1364:rpcsvc_submit_generic] >> 0-rpc-service: failed to submit message (XID: 0x7c70af, Program: GlusterFS >> 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >> [2019-04-21 07:25:55.064705] E [server.c:195:server_submit_reply] >> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) >> [0x7f3b3e9318fa] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) >> [0x7f3b3e4c5f35] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) >> [0x7f3b3e4b72cd] ) 0-: Reply submission failed >> [2019-04-21 07:25:55.064742] E [rpcsvc.c:1364:rpcsvc_submit_generic] >> 0-rpc-service: failed to submit message (XID: 0x7c723c, Program: GlusterFS >> 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >> [2019-04-21 07:25:55.064768] E [server.c:195:server_submit_reply] >> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) >> [0x7f3b3e9318fa] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) >> [0x7f3b3e4c5f35] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) >> [0x7f3b3e4b72cd] ) 0-: Reply submission failed >> [2019-04-21 07:25:55.064812] E [rpcsvc.c:1364:rpcsvc_submit_generic] >> 0-rpc-service: failed to submit message (XID: 0x7c72b4, Program: GlusterFS >> 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >> [2019-04-21 07:25:55.064837] E [server.c:195:server_submit_reply] >> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) >> [0x7f3b3e9318fa] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) >> [0x7f3b3e4c5f35] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) >> [0x7f3b3e4b72cd] ) 0-: Reply submission failed >> [2019-04-21 07:25:55.064880] E [rpcsvc.c:1364:rpcsvc_submit_generic] >> 0-rpc-service: failed to submit message (XID: 0x7c740b, Program: GlusterFS >> 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >> [2019-04-21 07:25:55.064905] E [server.c:195:server_submit_reply] >> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) >> [0x7f3b3e9318fa] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) >> [0x7f3b3e4c5f35] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) >> [0x7f3b3e4b72cd] ) 0-: Reply submission failed >> [2019-04-21 07:25:55.064939] E [rpcsvc.c:1364:rpcsvc_submit_generic] >> 0-rpc-service: failed to submit message (XID: 0x7c7441, Program: GlusterFS >> 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >> [2019-04-21 07:25:55.064962] E [server.c:195:server_submit_reply] >> 
(-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) >> [0x7f3b3e9318fa] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) >> [0x7f3b3e4c5f35] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) >> [0x7f3b3e4b72cd] ) 0-: Reply submission failed >> [2019-04-21 07:25:55.064996] E [rpcsvc.c:1364:rpcsvc_submit_generic] >> 0-rpc-service: failed to submit message (XID: 0x7c74d5, Program: GlusterFS >> 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >> [2019-04-21 07:25:55.065020] E [server.c:195:server_submit_reply] >> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) >> [0x7f3b3e9318fa] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) >> [0x7f3b3e4c5f35] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) >> [0x7f3b3e4b72cd] ) 0-: Reply submission failed >> [2019-04-21 07:25:55.065052] E [rpcsvc.c:1364:rpcsvc_submit_generic] >> 0-rpc-service: failed to submit message (XID: 0x7c7551, Program: GlusterFS >> 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >> [2019-04-21 07:25:55.065076] E [server.c:195:server_submit_reply] >> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) >> [0x7f3b3e9318fa] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) >> [0x7f3b3e4c5f35] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) >> [0x7f3b3e4b72cd] ) 0-: Reply submission failed >> [2019-04-21 07:25:55.065110] E [rpcsvc.c:1364:rpcsvc_submit_generic] >> 0-rpc-service: failed to submit message (XID: 0x7c76d1, Program: GlusterFS >> 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >> [2019-04-21 07:25:55.065133] E [server.c:195:server_submit_reply] >> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) >> [0x7f3b3e9318fa] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) >> [0x7f3b3e4c5f35] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) >> [0x7f3b3e4b72cd] ) 0-: Reply submission failed >> >> Thanks again, >> >> -Patrick >> >> On Sun, Apr 21, 2019 at 3:50 PM Patrick Rennie >> wrote: >> >> Hi Darrell, >> >> Thanks again for your advice, I've left it for a while but unfortunately >> it's still just as slow and causing more problems for our operations now. I >> will need to try and take some steps to at least bring performance back to >> normal while continuing to investigate the issue longer term. I can >> definitely see one node with heavier CPU than the other, almost double, >> which I am OK with, but I think the heal process is going to take forever, >> trying to check the "gluster volume heal info" shows thousands and >> thousands of files which may need healing, I have no idea how many in total >> the command is still running after hours, so I am not sure what has gone so >> wrong to cause this. >> >> I've checked cluster.op-version and cluster.max-op-version and it looks >> like I'm on the latest version there. >> >> I have no idea how long the healing is going to take on this cluster, we >> have around 560TB of data on here, but I don't think I can wait that long >> to try and restore performance to normal. >> >> Can anyone think of anything else I can try in the meantime to work out >> what's causing the extreme latency? 
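For reference, the op-version check mentioned above can be done like this; the value passed to set is only an example, and should be whatever max-op-version reports for this cluster:

  gluster volume get all cluster.op-version
  gluster volume get all cluster.max-op-version
  # only if op-version is below max-op-version:
  gluster volume set all cluster.op-version 31202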
>> >> I've been going through cluster client the logs of some of our VMs and on >> some of our FTP servers I found this in the cluster mount log, but I am not >> seeing it on any of our other servers, just our FTP servers. >> >> [2019-04-21 07:16:19.925388] E [MSGID: 101046] >> [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null >> [2019-04-21 07:19:43.413834] W [MSGID: 114031] >> [client-rpc-fops.c:2203:client3_3_setattr_cbk] 0-gvAA01-client-19: remote >> operation failed [No such file or directory] >> [2019-04-21 07:19:43.414153] W [MSGID: 114031] >> [client-rpc-fops.c:2203:client3_3_setattr_cbk] 0-gvAA01-client-20: remote >> operation failed [No such file or directory] >> [2019-04-21 07:23:33.154717] E [MSGID: 101046] >> [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null >> [2019-04-21 07:33:24.943913] E [MSGID: 101046] >> [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null >> >> Any ideas what this could mean? I am basically just grasping at straws >> here. >> >> I am going to hold off on the version upgrade until I know there are no >> files which need healing, which could be a while, from some reading I've >> done there shouldn't be any issues with this as both are on v3.12.x >> >> I've free'd up a small amount of space, but I still need to work on this >> further. >> >> I've read of a command "find .glusterfs -type f -links -2 -exec rm {} \;" >> which could be run on each brick and it would potentially clean up any >> files which were deleted straight from the bricks, but not via the client, >> I have a feeling this could help me free up about 5-10TB per brick from >> what I've been told about the history of this cluster. Can anyone confirm >> if this is actually safe to run? >> >> At this stage, I'm open to any suggestions as to how to proceed, thanks >> again for any advice. >> >> Cheers, >> >> - Patrick >> >> On Sun, Apr 21, 2019 at 1:22 AM Darrell Budic >> wrote: >> >> Patrick, >> >> Sounds like progress. Be aware that gluster is expected to max out the >> CPUs on at least one of your servers while healing. This is normal and >> won?t adversely affect overall performance (any more than having bricks in >> need of healing, at any rate) unless you?re overdoing it. shd threads <= 4 >> should not do that on your hardware. Other tunings may have also increased >> overall performance, so you may see higher CPU than previously anyway. I?d >> recommend upping those thread counts and letting it heal as fast as >> possible, especially if these are dedicated Gluster storage servers (Ie: >> not also running VMs, etc). You should see ?normal? CPU use one heals are >> completed. I see ~15-30% overall normally, 95-98% while healing (x my 20 >> cores). It?s also likely to be different between your servers, in a pure >> replica, one tends to max and one tends to be a little higher, in a >> distributed-replica, I?d expect more than one to run harder while healing. >> >> Keep the differences between doing an ls on a brick and doing an ls on a >> gluster mount in mind. When you do a ls on a gluster volume, it isn?t just >> doing a ls on one brick, it?s effectively doing it on ALL of your bricks, >> and they all have to return data before the ls succeeds. In a distributed >> volume, it?s figuring out where on each volume things live and getting the >> stat() from each to assemble the whole thing. 
And if things are in need of >> healing, it will take even longer to decide which version is current and >> use it (shd triggers a heal anytime it encounters this). Any of these >> things being slow slows down the overall response. >> >> At this point, I?d get some sleep too, and let your cluster heal while >> you do. I?d really want it fully healed before I did any updates anyway, so >> let it use CPU and get itself sorted out. Expect it to do a round of >> healing after you upgrade each machine too, this is normal so don?t let the >> CPU spike surprise you, It?s just catching up from the downtime incurred by >> the update and/or reboot if you did one. >> >> That reminds me, check your gluster cluster.op-version and >> cluster.max-op-version (gluster vol get all all | grep op-version). If >> op-version isn?t at the max-op-verison, set it to it so you?re taking >> advantage of the latest features available to your version. >> >> -Darrell >> >> On Apr 20, 2019, at 11:54 AM, Patrick Rennie >> wrote: >> >> Hi Darrell, >> >> Thanks again for your advice, I've applied the acltype=posixacl on my >> zpools and I think that has reduced some of the noise from my brick logs. >> I also bumped up some of the thread counts you suggested but my CPU load >> skyrocketed, so I dropped it back down to something slightly lower, but >> still higher than it was before, and will see how that goes for a while. >> >> Although low space is a definite issue, if I run an ls anywhere on my >> bricks directly it's instant, <1 second, and still takes several minutes >> via gluster, so there is still a problem in my gluster configuration >> somewhere. We don't have any snapshots, but I am trying to work out if any >> data on there is safe to delete, or if there is any way I can safely find >> and delete data which has been removed directly from the bricks in the >> past. I also have lz4 compression already enabled on each zpool which does >> help a bit, we get between 1.05 and 1.08x compression on this data. >> I've tried to go through each client and checked it's cluster mount logs >> and also my brick logs and looking for errors, so far nothing is jumping >> out at me, but there are some warnings and errors here and there, I am >> trying to work out what they mean. >> >> It's already 1 am here and unfortunately, I'm still awake working on this >> issue, but I think that I will have to leave the version upgrades until >> tomorrow. >> >> Thanks again for your advice so far. If anyone has any ideas on where I >> can look for errors other than brick logs or the cluster mount logs to help >> resolve this issue, it would be much appreciated. >> >> Cheers, >> >> - Patrick >> >> On Sat, Apr 20, 2019 at 11:57 PM Darrell Budic >> wrote: >> >> See inline: >> >> On Apr 20, 2019, at 10:09 AM, Patrick Rennie >> wrote: >> >> Hi Darrell, >> >> Thanks for your reply, this issue seems to be getting worse over the last >> few days, really has me tearing my hair out. I will do as you have >> suggested and get started on upgrading from 3.12.14 to 3.12.15. >> I've checked the zfs properties and all bricks have "xattr=sa" set, but >> none of them has "acltype=posixacl" set, currently the acltype property >> shows "off", if I make these changes will it apply retroactively to the >> existing data? I'm unfamiliar with what this will change so I may need to >> look into that before I proceed. >> >> >> It is safe to apply that now, any new set/get calls will then use it if >> new posixacls exist, and use older if not. ZFS is good that way. 
It should >> clear up your posix_acl and posix errors over time. >> >> I understand performance is going to slow down as the bricks get full, I >> am currently trying to free space and migrate data to some newer storage, I >> have fresh several hundred TB storage I just setup recently but with these >> performance issues it's really slow. I also believe there is significant >> data which has been deleted directly from the bricks in the past, so if I >> can reclaim this space in a safe manner then I will have at least around >> 10-15% free space. >> >> >> Full ZFS volumes will have a much larger impact on performance than you?d >> think, I?d prioritize this. If you have been taking zfs snapshots, consider >> deleting them to get the overall volume free space back up. And just to be >> sure it?s been said, delete from within the mounted volumes, don?t delete >> directly from the bricks (gluster will just try and heal it later, >> compounding your issues). Does not apply to deleting other data from the >> ZFS volume if it?s not part of the brick directory, of course. >> >> These servers have dual 8 core Xeon (E5-2620v4) and 512GB of RAM so >> generally they have plenty of resources available, currently only using >> around 330/512GB of memory. >> >> I will look into what your suggested settings will change, and then will >> probably go ahead with your recommendations, for our specs as stated above, >> what would you suggest for performance.io-thread-count ? >> >> >> I run single 2630v4s on my servers, which have a smaller storage >> footprint than yours. I?d go with 32 for performance.io-thread-count. >> I?d try 4 for the shd thread settings on that gear. Your memory use sounds >> fine, so no worries there. >> >> Our workload is nothing too extreme, we have a few VMs which write backup >> data to this storage nightly for our clients, our VMs don't live on this >> cluster, but just write to it. >> >> >> If they are writing compressible data, you?ll get immediate benefit by >> setting compression=lz4 on your ZFS volumes. It won?t help any old data, of >> course, but it will compress new data going forward. This is another one >> that?s safe to enable on the fly. >> >> I've been going through all of the logs I can, below are some slightly >> sanitized errors I've come across, but I'm not sure what to make of them. >> The main error I am seeing is the first one below, across several of my >> bricks, but possibly only for specific folders on the cluster, I'm not 100% >> about that yet though. 
>> >> [2019-04-20 05:56:59.512649] E [MSGID: 113001] >> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >> supported] >> [2019-04-20 05:59:06.084333] E [MSGID: 113001] >> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >> supported] >> [2019-04-20 05:59:43.289030] E [MSGID: 113001] >> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >> supported] >> [2019-04-20 05:59:50.582257] E [MSGID: 113001] >> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >> supported] >> [2019-04-20 06:01:42.501701] E [MSGID: 113001] >> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >> supported] >> [2019-04-20 06:01:51.665354] W [posix.c:4929:posix_getxattr] >> 0-gvAA01-posix: Extended attributes not supported (try remounting brick >> with 'user_xattr' flag) >> >> >> [2019-04-20 13:12:36.131856] E [MSGID: 113002] >> [posix-helpers.c:893:posix_gfid_set] 0-gvAA01-posix: gfid is null for >> /xxxxxxxxxxxxxxxxxxxx [Invalid argument] >> [2019-04-20 13:12:36.131959] E [MSGID: 113002] [posix.c:362:posix_lookup] >> 0-gvAA01-posix: buf->ia_gfid is null for >> /brick2/xxxxxxxxxxxxxxxxxxxx_62906_tmp [No data available] >> [2019-04-20 13:12:36.132016] E [MSGID: 115050] >> [server-rpc-fops.c:175:server_lookup_cbk] 0-gvAA01-server: 24274759: LOOKUP >> /xxxxxxxxxxxxxxxxxxxx (a7c9b4a0-b7ee-4d01-a79e-576013c8ac87/Cloud >> Backup_clone1.vbm_62906_tmp), client: >> 00-A-16217-2019/04/08-21:23:03:692424-gvAA01-client-4-0-3, error-xlator: >> gvAA01-posix [No data available] >> [2019-04-20 13:12:38.093719] E [MSGID: 115050] >> [server-rpc-fops.c:175:server_lookup_cbk] 0-gvAA01-server: 24276491: LOOKUP >> /xxxxxxxxxxxxxxxxxxxx (a7c9b4a0-b7ee-4d01-a79e-576013c8ac87/Cloud >> Backup_clone1.vbm_62906_tmp), client: >> 00-A-16217-2019/04/08-21:23:03:692424-gvAA01-client-4-0-3, error-xlator: >> gvAA01-posix [No data available] >> [2019-04-20 13:12:38.093660] E [MSGID: 113002] >> [posix-helpers.c:893:posix_gfid_set] 0-gvAA01-posix: gfid is null for >> /xxxxxxxxxxxxxxxxxxxx [Invalid argument] >> [2019-04-20 13:12:38.093696] E [MSGID: 113002] [posix.c:362:posix_lookup] >> 0-gvAA01-posix: buf->ia_gfid is null for /brick2/xxxxxxxxxxxxxxxxxxxx [No >> data available] >> >> >> posixacls should clear those up, as mentioned. 
>> >> >> [2019-04-20 14:25:59.654576] E [inodelk.c:404:__inode_unlock_lock] >> 0-gvAA01-locks: Matching lock not found for unlock 0-9223372036854775807, >> by 980fdbbd367f0000 on 0x7fc4f0161440 >> [2019-04-20 14:25:59.654668] E [MSGID: 115053] >> [server-rpc-fops.c:295:server_inodelk_cbk] 0-gvAA01-server: 6092928: >> INODELK /xxxxxxxxxxxxxxxxxxxx.cdr$ (25b14631-a179-4274-8243-6e272d4f2ad8), >> client: >> cb-per-worker18-53637-2019/04/19-14:25:37:927673-gvAA01-client-1-0-4, >> error-xlator: gvAA01-locks [Invalid argument] >> >> >> [2019-04-20 13:35:07.495495] E [rpcsvc.c:1364:rpcsvc_submit_generic] >> 0-rpc-service: failed to submit message (XID: 0x247c644, Program: GlusterFS >> 3.3, ProgVers: 330, Proc: 27) to rpc-transport (tcp.gvAA01-server) >> [2019-04-20 13:35:07.495619] E [server.c:195:server_submit_reply] >> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/debug/io-stats.so(+0x1696a) >> [0x7ff4ae6f796a] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/protocol/server.so(+0x2d6e8) >> [0x7ff4ae2a96e8] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/protocol/server.so(+0x928d) >> [0x7ff4ae28528d] ) 0-: Reply submission failed >> >> >> Fix the posix acls and see if these clear up over time as well, I?m >> unclear on what the overall effect of running without the posix acls will >> be to total gluster health. Your biggest problem sounds like you need to >> free up space on the volumes and get the overall volume health back up to >> par and see if that doesn?t resolve the symptoms you?re seeing. >> >> >> >> Thank you again for your assistance. It is greatly appreciated. >> >> - Patrick >> >> >> >> On Sat, Apr 20, 2019 at 10:50 PM Darrell Budic >> wrote: >> >> Patrick, >> >> I would definitely upgrade your two nodes from 3.12.14 to 3.12.15. You >> also mention ZFS, and that error you show makes me think you need to check >> to be sure you have ?xattr=sa? and ?acltype=posixacl? set on your ZFS >> volumes. >> >> You also observed your bricks are crossing the 95% full line, ZFS >> performance will degrade significantly the closer you get to full. In my >> experience, this starts somewhere between 10% and 5% free space remaining, >> so you?re in that realm. >> >> How?s your free memory on the servers doing? Do you have your zfs arc >> cache limited to something less than all the RAM? It shares pretty well, >> but I?ve encountered situations where other things won?t try and take ram >> back properly if they think it?s in use, so ZFS never gets the opportunity >> to give it up. >> >> Since your volume is a disperse-replica, you might try tuning >> disperse.shd-max-threads, default is 1, I?d try it at 2, 4, or even more if >> the CPUs are beefy enough. And setting server.event-threads to 4 and >> client.event-threads to 8 has proven helpful in many cases. After you get >> upgraded to 3.12.15, enabling performance.stat-prefetch may help as well. I >> don?t know if it matters, but I?d also recommend resetting >> performance.least-prio-threads to the default of 1 (or try 2 or 4) and/or >> also setting performance.io-thread-count to 32 if those have beefy CPUs. >> >> Beyond those general ideas, more info about your hardware (CPU and RAM) >> and workload (VMs, direct storage for web servers or enders, etc) may net >> you some more ideas. Then you?re going to have to do more digging into >> brick logs looking for errors and/or warnings to see what?s going on. 
>> >> -Darrell >> >> >> On Apr 20, 2019, at 8:22 AM, Patrick Rennie >> wrote: >> >> Hello Gluster Users, >> >> I am hoping someone can help me with resolving an ongoing issue I've been >> having, I'm new to mailing lists so forgive me if I have gotten anything >> wrong. We have noticed our performance deteriorating over the last few >> weeks, easily measured by trying to do an ls on one of our top-level >> folders, and timing it, which usually would take 2-5 seconds, and now takes >> up to 20 minutes, which obviously renders our cluster basically unusable. >> This has been intermittent in the past but is now almost constant and I am >> not sure how to work out the exact cause. We have noticed some errors in >> the brick logs, and have noticed that if we kill the right brick process, >> performance instantly returns back to normal, this is not always the same >> brick, but it indicates to me something in the brick processes or >> background tasks may be causing extreme latency. Due to this ability to fix >> it by killing the right brick process off, I think it's a specific file, or >> folder, or operation which may be hanging and causing the increased >> latency, but I am not sure how to work it out. One last thing to add is >> that our bricks are getting quite full (~95% full), we are trying to >> migrate data off to new storage but that is going slowly, not helped by >> this issue. I am currently trying to run a full heal as there appear to be >> many files needing healing, and I have all brick processes running so they >> have an opportunity to heal, but this means performance is very poor. It >> currently takes over 15-20 minutes to do an ls of one of our top-level >> folders, which just contains 60-80 other folders, this should take 2-5 >> seconds. This is all being checked by FUSE mount locally on the storage >> node itself, but it is the same for other clients and VMs accessing the >> cluster. Initially, it seemed our NFS mounts were not affected and operated >> at normal speed, but testing over the last day has shown that our NFS >> clients are also extremely slow, so it doesn't seem specific to FUSE as I >> first thought it might be. >> >> I am not sure how to proceed from here, I am fairly new to gluster having >> inherited this setup from my predecessor and trying to keep it going. I >> have included some info below to try and help with diagnosis, please let me >> know if any further info would be helpful. I would really appreciate any >> advice on what I could try to work out the cause. Thank you in advance for >> reading this, and any suggestions you might be able to offer. >> >> - Patrick >> >> This is an example of the main error I see in our brick logs, there have >> been others, I can post them when I see them again too: >> [2019-04-20 04:54:43.055680] E [MSGID: 113001] >> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >> /brick1/ library: system.posix_acl_default [Operation not >> supported] >> [2019-04-20 05:01:29.476313] W [posix.c:4929:posix_getxattr] >> 0-gvAA01-posix: Extended attributes not supported (try remounting brick >> with 'user_xattr' flag) >> >> Our setup consists of 2 storage nodes and an arbiter node. I have noticed >> our nodes are on slightly different versions, I'm not sure if this could be >> an issue. We have 9 bricks on each node, made up of ZFS RAIDZ2 pools - >> total capacity is around 560TB. 
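Given how much of this thread hinges on the bricks being ~95% full, the quickest way to see exactly where each pool stands, and whether snapshots or non-brick datasets are holding space, is:

  zpool list
  zfs list -o space -r brick1     # per-dataset and snapshot usage; pool name is an example

As noted elsewhere in the thread, any space reclaimed by deleting data should be deleted through the gluster mount, not directly on the bricks.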
>> We have bonded 10gbps NICS on each node, and I have tested bandwidth with >> iperf and found that it's what would be expected from this config. >> Individual brick performance seems ok, I've tested several bricks using >> dd and can write a 10GB files at 1.7GB/s. >> >> # dd if=/dev/zero of=/brick1/test/test.file bs=1M count=10000 >> 10000+0 records in >> 10000+0 records out >> 10485760000 bytes (10 GB, 9.8 GiB) copied, 6.20303 s, 1.7 GB/s >> >> Node 1: >> # glusterfs --version >> glusterfs 3.12.15 >> >> Node 2: >> # glusterfs --version >> glusterfs 3.12.14 >> >> Arbiter: >> # glusterfs --version >> glusterfs 3.12.14 >> >> Here is our gluster volume status: >> >> # gluster volume status >> Status of volume: gvAA01 >> Gluster process TCP Port RDMA Port Online >> Pid >> >> ------------------------------------------------------------------------------ >> Brick 01-B:/brick1/gvAA01/brick 49152 0 Y 7219 >> Brick 02-B:/brick1/gvAA01/brick 49152 0 Y 21845 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck1 49152 0 Y >> 6931 >> Brick 01-B:/brick2/gvAA01/brick 49153 0 Y 7239 >> Brick 02-B:/brick2/gvAA01/brick 49153 0 Y 9916 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck2 49153 0 Y >> 6939 >> Brick 01-B:/brick3/gvAA01/brick 49154 0 Y 7235 >> Brick 02-B:/brick3/gvAA01/brick 49154 0 Y 21858 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck3 49154 0 Y >> 6947 >> Brick 01-B:/brick4/gvAA01/brick 49155 0 Y 31840 >> Brick 02-B:/brick4/gvAA01/brick 49155 0 Y 9933 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck4 49155 0 Y >> 6956 >> Brick 01-B:/brick5/gvAA01/brick 49156 0 Y 7233 >> Brick 02-B:/brick5/gvAA01/brick 49156 0 Y 9942 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck5 49156 0 Y >> 6964 >> Brick 01-B:/brick6/gvAA01/brick 49157 0 Y 7234 >> Brick 02-B:/brick6/gvAA01/brick 49157 0 Y 9952 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck6 49157 0 Y >> 6974 >> Brick 01-B:/brick7/gvAA01/brick 49158 0 Y 7248 >> Brick 02-B:/brick7/gvAA01/brick 49158 0 Y 9960 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck7 49158 0 Y >> 6984 >> Brick 01-B:/brick8/gvAA01/brick 49159 0 Y 7253 >> Brick 02-B:/brick8/gvAA01/brick 49159 0 Y 9970 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck8 49159 0 Y >> 6993 >> Brick 01-B:/brick9/gvAA01/brick 49160 0 Y 7245 >> Brick 02-B:/brick9/gvAA01/brick 49160 0 Y 9984 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck9 49160 0 Y >> 7001 >> NFS Server on localhost 2049 0 Y >> 17276 >> Self-heal Daemon on localhost N/A N/A Y >> 25245 >> NFS Server on 02-B 2049 0 Y 9089 >> Self-heal Daemon on 02-B N/A N/A Y 17838 >> NFS Server on 00-a 2049 0 Y 15660 >> Self-heal Daemon on 00-a N/A N/A Y 16218 >> >> Task Status of Volume gvAA01 >> >> ------------------------------------------------------------------------------ >> There are no active volume tasks >> >> And gluster volume info: >> >> # gluster volume info >> >> Volume Name: gvAA01 >> Type: Distributed-Replicate >> Volume ID: ca4ece2c-13fe-414b-856c-2878196d6118 >> Status: Started >> Snapshot Count: 0 >> Number of Bricks: 9 x (2 + 1) = 27 >> Transport-type: tcp >> Bricks: >> Brick1: 01-B:/brick1/gvAA01/brick >> Brick2: 02-B:/brick1/gvAA01/brick >> Brick3: 00-A:/arbiterAA01/gvAA01/brick1 (arbiter) >> Brick4: 01-B:/brick2/gvAA01/brick >> Brick5: 02-B:/brick2/gvAA01/brick >> Brick6: 00-A:/arbiterAA01/gvAA01/brick2 (arbiter) >> Brick7: 01-B:/brick3/gvAA01/brick >> Brick8: 02-B:/brick3/gvAA01/brick >> Brick9: 00-A:/arbiterAA01/gvAA01/brick3 (arbiter) >> Brick10: 01-B:/brick4/gvAA01/brick >> Brick11: 02-B:/brick4/gvAA01/brick >> Brick12: 00-A:/arbiterAA01/gvAA01/brick4 (arbiter) >> Brick13: 
01-B:/brick5/gvAA01/brick >> Brick14: 02-B:/brick5/gvAA01/brick >> Brick15: 00-A:/arbiterAA01/gvAA01/brick5 (arbiter) >> Brick16: 01-B:/brick6/gvAA01/brick >> Brick17: 02-B:/brick6/gvAA01/brick >> Brick18: 00-A:/arbiterAA01/gvAA01/brick6 (arbiter) >> Brick19: 01-B:/brick7/gvAA01/brick >> Brick20: 02-B:/brick7/gvAA01/brick >> Brick21: 00-A:/arbiterAA01/gvAA01/brick7 (arbiter) >> Brick22: 01-B:/brick8/gvAA01/brick >> Brick23: 02-B:/brick8/gvAA01/brick >> Brick24: 00-A:/arbiterAA01/gvAA01/brick8 (arbiter) >> Brick25: 01-B:/brick9/gvAA01/brick >> Brick26: 02-B:/brick9/gvAA01/brick >> Brick27: 00-A:/arbiterAA01/gvAA01/brick9 (arbiter) >> Options Reconfigured: >> cluster.shd-max-threads: 4 >> performance.least-prio-threads: 16 >> cluster.readdir-optimize: on >> performance.quick-read: off >> performance.stat-prefetch: off >> cluster.data-self-heal: on >> cluster.lookup-unhashed: auto >> cluster.lookup-optimize: on >> cluster.favorite-child-policy: mtime >> server.allow-insecure: on >> transport.address-family: inet >> client.bind-insecure: on >> cluster.entry-self-heal: off >> cluster.metadata-self-heal: off >> performance.md-cache-timeout: 600 >> cluster.self-heal-daemon: enable >> performance.readdir-ahead: on >> diagnostics.brick-log-level: INFO >> nfs.disable: off >> >> Thank you for any assistance. >> >> - Patrick >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users >> >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From meira at cesup.ufrgs.br Sun Apr 21 14:34:09 2019 From: meira at cesup.ufrgs.br (Lindolfo Meira) Date: Sun, 21 Apr 2019 11:34:09 -0300 (-03) Subject: [Gluster-users] Enabling quotas on gluster In-Reply-To: References: Message-ID: Thanks Hari. Lindolfo Meira, MSc Diretor Geral, Centro Nacional de Supercomputa??o Universidade Federal do Rio Grande do Sul +55 (51) 3308-3139 On Thu, 4 Apr 2019, Hari Gowtham wrote: > Hi, > > The performance hit that quota causes depended on a number of factors > like: > 1) the number of files, > 2) the depth of the directories in the FS > 3) the breadth of the directories in the FS > 4) the number of bricks. > > These are the main contributions to the performance hit. > If the volume is of lesser size then quota should work fine. > Let us know more about your use case to help you better. > > Note: gluster quota is not being actively worked on. > > On Thu, Apr 4, 2019 at 3:45 AM Lindolfo Meira wrote: > > > > Hi folks. > > > > Does anyone know how significant is the performance penalty for enabling > > directory level quotas on a gluster fs, compared to the case with no > > quotas at all? > > > > > > Lindolfo Meira, MSc > > Diretor Geral, Centro Nacional de Supercomputa??o > > Universidade Federal do Rio Grande do Sul > > +55 (51) 3308-3139_______________________________________________ > > Gluster-users mailing list > > Gluster-users at gluster.org > > https://lists.gluster.org/mailman/listinfo/gluster-users > > > > -- > Regards, > Hari Gowtham. 
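For completeness on the quota question above: directory-level quotas are enabled per volume and then limited per path, roughly as follows (volume and directory names are examples only):

  gluster volume quota testvol enable
  gluster volume quota testvol limit-usage /projects/alpha 100GB
  gluster volume quota testvol list

As the reply above points out, the overhead grows with the number of bricks and the depth and breadth of the directory tree, and gluster's quota feature was not being actively developed at the time.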
> From patrickmrennie at gmail.com Sun Apr 21 15:03:48 2019 From: patrickmrennie at gmail.com (Patrick Rennie) Date: Sun, 21 Apr 2019 23:03:48 +0800 Subject: [Gluster-users] Extremely slow cluster performance In-Reply-To: References: Message-ID: I just tried to check my "gluster volume heal gvAA01 statistics" and it doesn't seem like a full heal was still in progress, just an index, I have started the full heal again and am trying to monitor it with "gluster volume heal gvAA01 info" which just shows me thousands of gfid file identifiers scrolling past. What is the best way to check the status of a heal and track the files healed and progress to completion? Thank you, - Patrick On Sun, Apr 21, 2019 at 10:28 PM Patrick Rennie wrote: > I think just worked out why NFS lookups are sometimes slow and sometimes > fast as the hostname uses round robin DNS lookups, if I change to a > specific host, 01-B, it's always quick, and if I change to the other brick > host, 02-B, it's always slow. > Maybe that will help to narrow this down? > > On Sun, Apr 21, 2019 at 10:24 PM Patrick Rennie > wrote: > >> Hi Strahil, >> >> Thank you for your reply and your suggestions. I'm not sure which logs >> would be most relevant to be checking to diagnose this issue, we have the >> brick logs, the cluster mount logs, the shd logs or something else? I have >> posted a few that I have seen repeated a few times already. I will continue >> to post anything further that I see. >> I am working on migrating data to some new storage, so this will slowly >> free up space, although this is a production cluster and new data is being >> uploaded every day, sometimes faster than I can migrate it off. I have >> several other similar clusters and none of them have the same problem, one >> the others is actually at 98-99% right now (big problem, I know) but still >> performs perfectly fine compared to this cluster, I am not sure low space >> is the root cause here. >> >> I currently have 13 VMs accessing this cluster, I have checked each one >> and all of them use one of the two options below to mount the cluster in >> fstab >> >> HOSTNAME:/gvAA01 /mountpoint glusterfs >> defaults,_netdev,rw,log-level=WARNING,direct-io-mode=disable,use-readdirp=no >> 0 0 >> HOSTNAME:/gvAA01 /mountpoint glusterfs >> defaults,_netdev,rw,log-level=WARNING,direct-io-mode=disable >> >> I also have a few other VMs which use NFS to access the cluster, and >> these machines appear to be significantly quicker, initially I get a >> similar delay with NFS but if I cancel the first "ls" and try it again I >> get < 1 sec lookups, this can take over 10 minutes by FUSE/gluster client, >> but the same trick of cancelling and trying again doesn't work for >> FUSE/gluster. Sometimes the NFS queries have no delay at all, so this is a >> bit strange to me. >> HOSTNAME:/gvAA01 /mountpoint/ nfs >> defaults,_netdev,vers=3,async,noatime 0 0 >> >> Example: >> user at VM:~$ time ls /cluster/folder >> ^C >> >> real 9m49.383s >> user 0m0.001s >> sys 0m0.010s >> >> user at VM:~$ time ls /cluster/folder >> >> >> real 0m0.069s >> user 0m0.001s >> sys 0m0.007s >> >> --- >> >> I have checked the profiling as you suggested, I let it run for around a >> minute, then cancelled it and saved the profile info. 
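On the question above about tracking heal progress: counting pending entries is usually more practical than scrolling through gfids. A minimal sketch:

  gluster volume heal gvAA01 statistics heal-count
  # or keep an eye on the per-brick counts over time:
  watch -n 300 'gluster volume heal gvAA01 statistics heal-count'

The heal is effectively done when every brick's count stays at zero; 'gluster volume heal gvAA01 statistics' (as used above) then shows the crawl history.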
>> >> root at HOSTNAME:/var/log/glusterfs# gluster volume profile gvAA01 start >> Starting volume profile on gvAA01 has been successful >> root at HOSTNAME:/var/log/glusterfs# time ls /cluster/folder >> ^C >> >> real 1m1.660s >> user 0m0.000s >> sys 0m0.002s >> >> root at HOSTNAME:/var/log/glusterfs# gluster volume profile gvAA01 info >> >> ~/profile.txt >> root at HOSTNAME:/var/log/glusterfs# gluster volume profile gvAA01 stop >> >> I will attach the results to this email as it's over 1000 lines. >> Unfortunately, I'm not sure what I'm looking at but possibly somebody will >> be able to help me make sense of it and let me know if it highlights any >> specific issues. >> >> Happy to try any further suggestions. Thank you, >> >> -Patrick >> >> On Sun, Apr 21, 2019 at 7:55 PM Strahil wrote: >> >>> By the way, can you provide the 'volume info' and the mount options on >>> all clients? >>> Maybe , there is an option that uses a lot of resources due to some >>> client's mount options. >>> >>> Best Regards, >>> Strahil Nikolov >>> On Apr 21, 2019 10:55, Patrick Rennie wrote: >>> >>> Just another small update, I'm continuing to watch my brick logs and I >>> just saw these errors come up in the recent events too. I am going to >>> continue to post any errors I see in the hope of finding the right one to >>> try and fix.. >>> This is from the logs on brick1, seems to be occurring on both nodes on >>> brick1, although at different times. I'm not sure what this means, can >>> anyone shed any light? >>> I guess I am looking for some kind of specific error which may indicate >>> something is broken or stuck and locking up and causing the extreme latency >>> I'm seeing in the cluster. >>> >>> [2019-04-21 07:25:55.064497] E [rpcsvc.c:1364:rpcsvc_submit_generic] >>> 0-rpc-service: failed to submit message (XID: 0x7c700c, Program: GlusterFS >>> 3.3, ProgVers: 330, Proc: 29) to rpc-transport (tcp.gvAA01-server) >>> [2019-04-21 07:25:55.064612] E [server.c:195:server_submit_reply] >>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e58a) >>> [0x7f3b3e93158a] >>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17d45) >>> [0x7f3b3e4c5d45] >>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) >>> [0x7f3b3e4b72cd] ) 0-: Reply submission failed >>> [2019-04-21 07:25:55.064675] E [rpcsvc.c:1364:rpcsvc_submit_generic] >>> 0-rpc-service: failed to submit message (XID: 0x7c70af, Program: GlusterFS >>> 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >>> [2019-04-21 07:25:55.064705] E [server.c:195:server_submit_reply] >>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) >>> [0x7f3b3e9318fa] >>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) >>> [0x7f3b3e4c5f35] >>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) >>> [0x7f3b3e4b72cd] ) 0-: Reply submission failed >>> [2019-04-21 07:25:55.064742] E [rpcsvc.c:1364:rpcsvc_submit_generic] >>> 0-rpc-service: failed to submit message (XID: 0x7c723c, Program: GlusterFS >>> 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >>> [2019-04-21 07:25:55.064768] E [server.c:195:server_submit_reply] >>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) >>> [0x7f3b3e9318fa] >>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) >>> [0x7f3b3e4c5f35] >>> 
-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) >>> [0x7f3b3e4b72cd] ) 0-: Reply submission failed >>> [2019-04-21 07:25:55.064812] E [rpcsvc.c:1364:rpcsvc_submit_generic] >>> 0-rpc-service: failed to submit message (XID: 0x7c72b4, Program: GlusterFS >>> 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >>> [2019-04-21 07:25:55.064837] E [server.c:195:server_submit_reply] >>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) >>> [0x7f3b3e9318fa] >>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) >>> [0x7f3b3e4c5f35] >>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) >>> [0x7f3b3e4b72cd] ) 0-: Reply submission failed >>> [2019-04-21 07:25:55.064880] E [rpcsvc.c:1364:rpcsvc_submit_generic] >>> 0-rpc-service: failed to submit message (XID: 0x7c740b, Program: GlusterFS >>> 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >>> [2019-04-21 07:25:55.064905] E [server.c:195:server_submit_reply] >>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) >>> [0x7f3b3e9318fa] >>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) >>> [0x7f3b3e4c5f35] >>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) >>> [0x7f3b3e4b72cd] ) 0-: Reply submission failed >>> [2019-04-21 07:25:55.064939] E [rpcsvc.c:1364:rpcsvc_submit_generic] >>> 0-rpc-service: failed to submit message (XID: 0x7c7441, Program: GlusterFS >>> 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >>> [2019-04-21 07:25:55.064962] E [server.c:195:server_submit_reply] >>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) >>> [0x7f3b3e9318fa] >>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) >>> [0x7f3b3e4c5f35] >>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) >>> [0x7f3b3e4b72cd] ) 0-: Reply submission failed >>> [2019-04-21 07:25:55.064996] E [rpcsvc.c:1364:rpcsvc_submit_generic] >>> 0-rpc-service: failed to submit message (XID: 0x7c74d5, Program: GlusterFS >>> 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >>> [2019-04-21 07:25:55.065020] E [server.c:195:server_submit_reply] >>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) >>> [0x7f3b3e9318fa] >>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) >>> [0x7f3b3e4c5f35] >>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) >>> [0x7f3b3e4b72cd] ) 0-: Reply submission failed >>> [2019-04-21 07:25:55.065052] E [rpcsvc.c:1364:rpcsvc_submit_generic] >>> 0-rpc-service: failed to submit message (XID: 0x7c7551, Program: GlusterFS >>> 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >>> [2019-04-21 07:25:55.065076] E [server.c:195:server_submit_reply] >>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) >>> [0x7f3b3e9318fa] >>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) >>> [0x7f3b3e4c5f35] >>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) >>> [0x7f3b3e4b72cd] ) 0-: Reply submission failed >>> [2019-04-21 07:25:55.065110] E [rpcsvc.c:1364:rpcsvc_submit_generic] >>> 0-rpc-service: failed to submit message (XID: 0x7c76d1, Program: GlusterFS >>> 3.3, ProgVers: 330, Proc: 30) to 
rpc-transport (tcp.gvAA01-server) >>> [2019-04-21 07:25:55.065133] E [server.c:195:server_submit_reply] >>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) >>> [0x7f3b3e9318fa] >>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) >>> [0x7f3b3e4c5f35] >>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) >>> [0x7f3b3e4b72cd] ) 0-: Reply submission failed >>> >>> Thanks again, >>> >>> -Patrick >>> >>> On Sun, Apr 21, 2019 at 3:50 PM Patrick Rennie >>> wrote: >>> >>> Hi Darrell, >>> >>> Thanks again for your advice, I've left it for a while but unfortunately >>> it's still just as slow and causing more problems for our operations now. I >>> will need to try and take some steps to at least bring performance back to >>> normal while continuing to investigate the issue longer term. I can >>> definitely see one node with heavier CPU than the other, almost double, >>> which I am OK with, but I think the heal process is going to take forever, >>> trying to check the "gluster volume heal info" shows thousands and >>> thousands of files which may need healing, I have no idea how many in total >>> the command is still running after hours, so I am not sure what has gone so >>> wrong to cause this. >>> >>> I've checked cluster.op-version and cluster.max-op-version and it looks >>> like I'm on the latest version there. >>> >>> I have no idea how long the healing is going to take on this cluster, we >>> have around 560TB of data on here, but I don't think I can wait that long >>> to try and restore performance to normal. >>> >>> Can anyone think of anything else I can try in the meantime to work out >>> what's causing the extreme latency? >>> >>> I've been going through cluster client the logs of some of our VMs and >>> on some of our FTP servers I found this in the cluster mount log, but I am >>> not seeing it on any of our other servers, just our FTP servers. >>> >>> [2019-04-21 07:16:19.925388] E [MSGID: 101046] >>> [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null >>> [2019-04-21 07:19:43.413834] W [MSGID: 114031] >>> [client-rpc-fops.c:2203:client3_3_setattr_cbk] 0-gvAA01-client-19: remote >>> operation failed [No such file or directory] >>> [2019-04-21 07:19:43.414153] W [MSGID: 114031] >>> [client-rpc-fops.c:2203:client3_3_setattr_cbk] 0-gvAA01-client-20: remote >>> operation failed [No such file or directory] >>> [2019-04-21 07:23:33.154717] E [MSGID: 101046] >>> [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null >>> [2019-04-21 07:33:24.943913] E [MSGID: 101046] >>> [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null >>> >>> Any ideas what this could mean? I am basically just grasping at straws >>> here. >>> >>> I am going to hold off on the version upgrade until I know there are no >>> files which need healing, which could be a while, from some reading I've >>> done there shouldn't be any issues with this as both are on v3.12.x >>> >>> I've free'd up a small amount of space, but I still need to work on this >>> further. >>> >>> I've read of a command "find .glusterfs -type f -links -2 -exec rm {} >>> \;" which could be run on each brick and it would potentially clean up any >>> files which were deleted straight from the bricks, but not via the client, >>> I have a feeling this could help me free up about 5-10TB per brick from >>> what I've been told about the history of this cluster. Can anyone confirm >>> if this is actually safe to run? 
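On the '-links -2' cleanup asked about just above: whether it is safe on this cluster is exactly the open question, so the sketch below only covers the dry-run step, run per brick and only against the .glusterfs directory. Paths are examples; it is best done while no heal is pending on that brick, and nothing should be deleted until the candidate list has been reviewed:

  cd /brick1/gvAA01/brick/.glusterfs
  find . -type f -links -2 -print > /root/brick1-orphan-gfids.txt
  wc -l /root/brick1-orphan-gfids.txt
  # review the list first; only then, if satisfied:
  # xargs -a /root/brick1-orphan-gfids.txt -d '\n' rm --

The idea is that a regular file under .glusterfs normally has a second hard link to the real file elsewhere on the brick; a link count below 2 suggests the real file was removed directly on the brick, which matches the history described earlier in the thread.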
>>> >>> At this stage, I'm open to any suggestions as to how to proceed, thanks >>> again for any advice. >>> >>> Cheers, >>> >>> - Patrick >>> >>> On Sun, Apr 21, 2019 at 1:22 AM Darrell Budic >>> wrote: >>> >>> Patrick, >>> >>> Sounds like progress. Be aware that gluster is expected to max out the >>> CPUs on at least one of your servers while healing. This is normal and >>> won?t adversely affect overall performance (any more than having bricks in >>> need of healing, at any rate) unless you?re overdoing it. shd threads <= 4 >>> should not do that on your hardware. Other tunings may have also increased >>> overall performance, so you may see higher CPU than previously anyway. I?d >>> recommend upping those thread counts and letting it heal as fast as >>> possible, especially if these are dedicated Gluster storage servers (Ie: >>> not also running VMs, etc). You should see ?normal? CPU use one heals are >>> completed. I see ~15-30% overall normally, 95-98% while healing (x my 20 >>> cores). It?s also likely to be different between your servers, in a pure >>> replica, one tends to max and one tends to be a little higher, in a >>> distributed-replica, I?d expect more than one to run harder while healing. >>> >>> Keep the differences between doing an ls on a brick and doing an ls on a >>> gluster mount in mind. When you do a ls on a gluster volume, it isn?t just >>> doing a ls on one brick, it?s effectively doing it on ALL of your bricks, >>> and they all have to return data before the ls succeeds. In a distributed >>> volume, it?s figuring out where on each volume things live and getting the >>> stat() from each to assemble the whole thing. And if things are in need of >>> healing, it will take even longer to decide which version is current and >>> use it (shd triggers a heal anytime it encounters this). Any of these >>> things being slow slows down the overall response. >>> >>> At this point, I?d get some sleep too, and let your cluster heal while >>> you do. I?d really want it fully healed before I did any updates anyway, so >>> let it use CPU and get itself sorted out. Expect it to do a round of >>> healing after you upgrade each machine too, this is normal so don?t let the >>> CPU spike surprise you, It?s just catching up from the downtime incurred by >>> the update and/or reboot if you did one. >>> >>> That reminds me, check your gluster cluster.op-version and >>> cluster.max-op-version (gluster vol get all all | grep op-version). If >>> op-version isn?t at the max-op-verison, set it to it so you?re taking >>> advantage of the latest features available to your version. >>> >>> -Darrell >>> >>> On Apr 20, 2019, at 11:54 AM, Patrick Rennie >>> wrote: >>> >>> Hi Darrell, >>> >>> Thanks again for your advice, I've applied the acltype=posixacl on my >>> zpools and I think that has reduced some of the noise from my brick logs. >>> I also bumped up some of the thread counts you suggested but my CPU load >>> skyrocketed, so I dropped it back down to something slightly lower, but >>> still higher than it was before, and will see how that goes for a while. >>> >>> Although low space is a definite issue, if I run an ls anywhere on my >>> bricks directly it's instant, <1 second, and still takes several minutes >>> via gluster, so there is still a problem in my gluster configuration >>> somewhere. 
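As a quick check that the property change really landed everywhere (it only affects xattr calls made after it is set), the relevant ZFS properties can be listed per dataset on each storage node, with no pool names assumed:

# zfs get -t filesystem acltype,xattr,compression

Any dataset still showing acltype off can be corrected in place with zfs set acltype=posixacl xattr=sa pool/dataset (names there are placeholders).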
We don't have any snapshots, but I am trying to work out if any >>> data on there is safe to delete, or if there is any way I can safely find >>> and delete data which has been removed directly from the bricks in the >>> past. I also have lz4 compression already enabled on each zpool which does >>> help a bit, we get between 1.05 and 1.08x compression on this data. >>> I've tried to go through each client and checked it's cluster mount logs >>> and also my brick logs and looking for errors, so far nothing is jumping >>> out at me, but there are some warnings and errors here and there, I am >>> trying to work out what they mean. >>> >>> It's already 1 am here and unfortunately, I'm still awake working on >>> this issue, but I think that I will have to leave the version upgrades >>> until tomorrow. >>> >>> Thanks again for your advice so far. If anyone has any ideas on where I >>> can look for errors other than brick logs or the cluster mount logs to help >>> resolve this issue, it would be much appreciated. >>> >>> Cheers, >>> >>> - Patrick >>> >>> On Sat, Apr 20, 2019 at 11:57 PM Darrell Budic >>> wrote: >>> >>> See inline: >>> >>> On Apr 20, 2019, at 10:09 AM, Patrick Rennie >>> wrote: >>> >>> Hi Darrell, >>> >>> Thanks for your reply, this issue seems to be getting worse over the >>> last few days, really has me tearing my hair out. I will do as you have >>> suggested and get started on upgrading from 3.12.14 to 3.12.15. >>> I've checked the zfs properties and all bricks have "xattr=sa" set, but >>> none of them has "acltype=posixacl" set, currently the acltype property >>> shows "off", if I make these changes will it apply retroactively to the >>> existing data? I'm unfamiliar with what this will change so I may need to >>> look into that before I proceed. >>> >>> >>> It is safe to apply that now, any new set/get calls will then use it if >>> new posixacls exist, and use older if not. ZFS is good that way. It should >>> clear up your posix_acl and posix errors over time. >>> >>> I understand performance is going to slow down as the bricks get full, I >>> am currently trying to free space and migrate data to some newer storage, I >>> have fresh several hundred TB storage I just setup recently but with these >>> performance issues it's really slow. I also believe there is significant >>> data which has been deleted directly from the bricks in the past, so if I >>> can reclaim this space in a safe manner then I will have at least around >>> 10-15% free space. >>> >>> >>> Full ZFS volumes will have a much larger impact on performance than >>> you?d think, I?d prioritize this. If you have been taking zfs snapshots, >>> consider deleting them to get the overall volume free space back up. And >>> just to be sure it?s been said, delete from within the mounted volumes, >>> don?t delete directly from the bricks (gluster will just try and heal it >>> later, compounding your issues). Does not apply to deleting other data from >>> the ZFS volume if it?s not part of the brick directory, of course. >>> >>> These servers have dual 8 core Xeon (E5-2620v4) and 512GB of RAM so >>> generally they have plenty of resources available, currently only using >>> around 330/512GB of memory. >>> >>> I will look into what your suggested settings will change, and then will >>> probably go ahead with your recommendations, for our specs as stated above, >>> what would you suggest for performance.io-thread-count ? >>> >>> >>> I run single 2630v4s on my servers, which have a smaller storage >>> footprint than yours. 
I?d go with 32 for performance.io-thread-count. >>> I?d try 4 for the shd thread settings on that gear. Your memory use sounds >>> fine, so no worries there. >>> >>> Our workload is nothing too extreme, we have a few VMs which write >>> backup data to this storage nightly for our clients, our VMs don't live on >>> this cluster, but just write to it. >>> >>> >>> If they are writing compressible data, you?ll get immediate benefit by >>> setting compression=lz4 on your ZFS volumes. It won?t help any old data, of >>> course, but it will compress new data going forward. This is another one >>> that?s safe to enable on the fly. >>> >>> I've been going through all of the logs I can, below are some slightly >>> sanitized errors I've come across, but I'm not sure what to make of them. >>> The main error I am seeing is the first one below, across several of my >>> bricks, but possibly only for specific folders on the cluster, I'm not 100% >>> about that yet though. >>> >>> [2019-04-20 05:56:59.512649] E [MSGID: 113001] >>> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >>> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >>> supported] >>> [2019-04-20 05:59:06.084333] E [MSGID: 113001] >>> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >>> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >>> supported] >>> [2019-04-20 05:59:43.289030] E [MSGID: 113001] >>> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >>> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >>> supported] >>> [2019-04-20 05:59:50.582257] E [MSGID: 113001] >>> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >>> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >>> supported] >>> [2019-04-20 06:01:42.501701] E [MSGID: 113001] >>> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >>> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >>> supported] >>> [2019-04-20 06:01:51.665354] W [posix.c:4929:posix_getxattr] >>> 0-gvAA01-posix: Extended attributes not supported (try remounting brick >>> with 'user_xattr' flag) >>> >>> >>> [2019-04-20 13:12:36.131856] E [MSGID: 113002] >>> [posix-helpers.c:893:posix_gfid_set] 0-gvAA01-posix: gfid is null for >>> /xxxxxxxxxxxxxxxxxxxx [Invalid argument] >>> [2019-04-20 13:12:36.131959] E [MSGID: 113002] >>> [posix.c:362:posix_lookup] 0-gvAA01-posix: buf->ia_gfid is null for >>> /brick2/xxxxxxxxxxxxxxxxxxxx_62906_tmp [No data available] >>> [2019-04-20 13:12:36.132016] E [MSGID: 115050] >>> [server-rpc-fops.c:175:server_lookup_cbk] 0-gvAA01-server: 24274759: LOOKUP >>> /xxxxxxxxxxxxxxxxxxxx (a7c9b4a0-b7ee-4d01-a79e-576013c8ac87/Cloud >>> Backup_clone1.vbm_62906_tmp), client: >>> 00-A-16217-2019/04/08-21:23:03:692424-gvAA01-client-4-0-3, error-xlator: >>> gvAA01-posix [No data available] >>> [2019-04-20 13:12:38.093719] E [MSGID: 115050] >>> [server-rpc-fops.c:175:server_lookup_cbk] 0-gvAA01-server: 24276491: LOOKUP >>> /xxxxxxxxxxxxxxxxxxxx (a7c9b4a0-b7ee-4d01-a79e-576013c8ac87/Cloud >>> Backup_clone1.vbm_62906_tmp), client: >>> 00-A-16217-2019/04/08-21:23:03:692424-gvAA01-client-4-0-3, error-xlator: >>> gvAA01-posix [No data available] >>> [2019-04-20 13:12:38.093660] E [MSGID: 113002] >>> [posix-helpers.c:893:posix_gfid_set] 0-gvAA01-posix: gfid is null for >>> /xxxxxxxxxxxxxxxxxxxx [Invalid argument] >>> [2019-04-20 13:12:38.093696] E [MSGID: 113002] >>> [posix.c:362:posix_lookup] 0-gvAA01-posix: buf->ia_gfid is null 
for >>> /brick2/xxxxxxxxxxxxxxxxxxxx [No data available] >>> >>> >>> posixacls should clear those up, as mentioned. >>> >>> >>> [2019-04-20 14:25:59.654576] E [inodelk.c:404:__inode_unlock_lock] >>> 0-gvAA01-locks: Matching lock not found for unlock 0-9223372036854775807, >>> by 980fdbbd367f0000 on 0x7fc4f0161440 >>> [2019-04-20 14:25:59.654668] E [MSGID: 115053] >>> [server-rpc-fops.c:295:server_inodelk_cbk] 0-gvAA01-server: 6092928: >>> INODELK /xxxxxxxxxxxxxxxxxxxx.cdr$ (25b14631-a179-4274-8243-6e272d4f2ad8), >>> client: >>> cb-per-worker18-53637-2019/04/19-14:25:37:927673-gvAA01-client-1-0-4, >>> error-xlator: gvAA01-locks [Invalid argument] >>> >>> >>> [2019-04-20 13:35:07.495495] E [rpcsvc.c:1364:rpcsvc_submit_generic] >>> 0-rpc-service: failed to submit message (XID: 0x247c644, Program: GlusterFS >>> 3.3, ProgVers: 330, Proc: 27) to rpc-transport (tcp.gvAA01-server) >>> [2019-04-20 13:35:07.495619] E [server.c:195:server_submit_reply] >>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/debug/io-stats.so(+0x1696a) >>> [0x7ff4ae6f796a] >>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/protocol/server.so(+0x2d6e8) >>> [0x7ff4ae2a96e8] >>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/protocol/server.so(+0x928d) >>> [0x7ff4ae28528d] ) 0-: Reply submission failed >>> >>> >>> Fix the posix acls and see if these clear up over time as well, I?m >>> unclear on what the overall effect of running without the posix acls will >>> be to total gluster health. Your biggest problem sounds like you need to >>> free up space on the volumes and get the overall volume health back up to >>> par and see if that doesn?t resolve the symptoms you?re seeing. >>> >>> >>> >>> Thank you again for your assistance. It is greatly appreciated. >>> >>> - Patrick >>> >>> >>> >>> On Sat, Apr 20, 2019 at 10:50 PM Darrell Budic >>> wrote: >>> >>> Patrick, >>> >>> I would definitely upgrade your two nodes from 3.12.14 to 3.12.15. You >>> also mention ZFS, and that error you show makes me think you need to check >>> to be sure you have ?xattr=sa? and ?acltype=posixacl? set on your ZFS >>> volumes. >>> >>> You also observed your bricks are crossing the 95% full line, ZFS >>> performance will degrade significantly the closer you get to full. In my >>> experience, this starts somewhere between 10% and 5% free space remaining, >>> so you?re in that realm. >>> >>> How?s your free memory on the servers doing? Do you have your zfs arc >>> cache limited to something less than all the RAM? It shares pretty well, >>> but I?ve encountered situations where other things won?t try and take ram >>> back properly if they think it?s in use, so ZFS never gets the opportunity >>> to give it up. >>> >>> Since your volume is a disperse-replica, you might try tuning >>> disperse.shd-max-threads, default is 1, I?d try it at 2, 4, or even more if >>> the CPUs are beefy enough. And setting server.event-threads to 4 and >>> client.event-threads to 8 has proven helpful in many cases. After you get >>> upgraded to 3.12.15, enabling performance.stat-prefetch may help as well. I >>> don?t know if it matters, but I?d also recommend resetting >>> performance.least-prio-threads to the default of 1 (or try 2 or 4) and/or >>> also setting performance.io-thread-count to 32 if those have beefy CPUs. >>> >>> Beyond those general ideas, more info about your hardware (CPU and RAM) >>> and workload (VMs, direct storage for web servers or enders, etc) may net >>> you some more ideas. 
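For reference, all of those are plain volume-set options, so applied one at a time (watching the effect before the next) they would look roughly like this, with the values suggested above and stat-prefetch left until both servers are on 3.12.15:

# gluster volume set gvAA01 server.event-threads 4
# gluster volume set gvAA01 client.event-threads 8
# gluster volume set gvAA01 performance.io-thread-count 32
# gluster volume set gvAA01 performance.least-prio-threads 1
# gluster volume set gvAA01 performance.stat-prefetch on
# gluster volume get gvAA01 all | grep -E 'event-threads|io-thread-count|stat-prefetch'

The final volume get confirms what actually took effect.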
Then you?re going to have to do more digging into >>> brick logs looking for errors and/or warnings to see what?s going on. >>> >>> -Darrell >>> >>> >>> On Apr 20, 2019, at 8:22 AM, Patrick Rennie >>> wrote: >>> >>> Hello Gluster Users, >>> >>> I am hoping someone can help me with resolving an ongoing issue I've >>> been having, I'm new to mailing lists so forgive me if I have gotten >>> anything wrong. We have noticed our performance deteriorating over the last >>> few weeks, easily measured by trying to do an ls on one of our top-level >>> folders, and timing it, which usually would take 2-5 seconds, and now takes >>> up to 20 minutes, which obviously renders our cluster basically unusable. >>> This has been intermittent in the past but is now almost constant and I am >>> not sure how to work out the exact cause. We have noticed some errors in >>> the brick logs, and have noticed that if we kill the right brick process, >>> performance instantly returns back to normal, this is not always the same >>> brick, but it indicates to me something in the brick processes or >>> background tasks may be causing extreme latency. Due to this ability to fix >>> it by killing the right brick process off, I think it's a specific file, or >>> folder, or operation which may be hanging and causing the increased >>> latency, but I am not sure how to work it out. One last thing to add is >>> that our bricks are getting quite full (~95% full), we are trying to >>> migrate data off to new storage but that is going slowly, not helped by >>> this issue. I am currently trying to run a full heal as there appear to be >>> many files needing healing, and I have all brick processes running so they >>> have an opportunity to heal, but this means performance is very poor. It >>> currently takes over 15-20 minutes to do an ls of one of our top-level >>> folders, which just contains 60-80 other folders, this should take 2-5 >>> seconds. This is all being checked by FUSE mount locally on the storage >>> node itself, but it is the same for other clients and VMs accessing the >>> cluster. Initially, it seemed our NFS mounts were not affected and operated >>> at normal speed, but testing over the last day has shown that our NFS >>> clients are also extremely slow, so it doesn't seem specific to FUSE as I >>> first thought it might be. >>> >>> I am not sure how to proceed from here, I am fairly new to gluster >>> having inherited this setup from my predecessor and trying to keep it >>> going. I have included some info below to try and help with diagnosis, >>> please let me know if any further info would be helpful. I would really >>> appreciate any advice on what I could try to work out the cause. Thank you >>> in advance for reading this, and any suggestions you might be able to >>> offer. >>> >>> - Patrick >>> >>> This is an example of the main error I see in our brick logs, there have >>> been others, I can post them when I see them again too: >>> [2019-04-20 04:54:43.055680] E [MSGID: 113001] >>> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >>> /brick1/ library: system.posix_acl_default [Operation not >>> supported] >>> [2019-04-20 05:01:29.476313] W [posix.c:4929:posix_getxattr] >>> 0-gvAA01-posix: Extended attributes not supported (try remounting brick >>> with 'user_xattr' flag) >>> >>> Our setup consists of 2 storage nodes and an arbiter node. I have >>> noticed our nodes are on slightly different versions, I'm not sure if this >>> could be an issue. 
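One quick way to see whether that version skew matters in practice is to compare the installed binaries with the cluster's operating version on each node:

# glusterfs --version | head -n 1
# gluster volume get all cluster.op-version
# gluster volume get all cluster.max-op-version

If op-version already equals max-op-version on every node, the mixed 3.12.14/3.12.15 servers are at least speaking the same protocol level, though bringing them onto the same point release is still worthwhile.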
We have 9 bricks on each node, made up of ZFS RAIDZ2 >>> pools - total capacity is around 560TB. >>> We have bonded 10gbps NICS on each node, and I have tested bandwidth >>> with iperf and found that it's what would be expected from this config. >>> Individual brick performance seems ok, I've tested several bricks using >>> dd and can write a 10GB files at 1.7GB/s. >>> >>> # dd if=/dev/zero of=/brick1/test/test.file bs=1M count=10000 >>> 10000+0 records in >>> 10000+0 records out >>> 10485760000 bytes (10 GB, 9.8 GiB) copied, 6.20303 s, 1.7 GB/s >>> >>> Node 1: >>> # glusterfs --version >>> glusterfs 3.12.15 >>> >>> Node 2: >>> # glusterfs --version >>> glusterfs 3.12.14 >>> >>> Arbiter: >>> # glusterfs --version >>> glusterfs 3.12.14 >>> >>> Here is our gluster volume status: >>> >>> # gluster volume status >>> Status of volume: gvAA01 >>> Gluster process TCP Port RDMA Port Online >>> Pid >>> >>> ------------------------------------------------------------------------------ >>> Brick 01-B:/brick1/gvAA01/brick 49152 0 Y 7219 >>> Brick 02-B:/brick1/gvAA01/brick 49152 0 Y 21845 >>> Brick 00-A:/arbiterAA01/gvAA01/bri >>> ck1 49152 0 Y >>> 6931 >>> Brick 01-B:/brick2/gvAA01/brick 49153 0 Y 7239 >>> Brick 02-B:/brick2/gvAA01/brick 49153 0 Y 9916 >>> Brick 00-A:/arbiterAA01/gvAA01/bri >>> ck2 49153 0 Y >>> 6939 >>> Brick 01-B:/brick3/gvAA01/brick 49154 0 Y 7235 >>> Brick 02-B:/brick3/gvAA01/brick 49154 0 Y 21858 >>> Brick 00-A:/arbiterAA01/gvAA01/bri >>> ck3 49154 0 Y >>> 6947 >>> Brick 01-B:/brick4/gvAA01/brick 49155 0 Y 31840 >>> Brick 02-B:/brick4/gvAA01/brick 49155 0 Y 9933 >>> Brick 00-A:/arbiterAA01/gvAA01/bri >>> ck4 49155 0 Y >>> 6956 >>> Brick 01-B:/brick5/gvAA01/brick 49156 0 Y 7233 >>> Brick 02-B:/brick5/gvAA01/brick 49156 0 Y 9942 >>> Brick 00-A:/arbiterAA01/gvAA01/bri >>> ck5 49156 0 Y >>> 6964 >>> Brick 01-B:/brick6/gvAA01/brick 49157 0 Y 7234 >>> Brick 02-B:/brick6/gvAA01/brick 49157 0 Y 9952 >>> Brick 00-A:/arbiterAA01/gvAA01/bri >>> ck6 49157 0 Y >>> 6974 >>> Brick 01-B:/brick7/gvAA01/brick 49158 0 Y 7248 >>> Brick 02-B:/brick7/gvAA01/brick 49158 0 Y 9960 >>> Brick 00-A:/arbiterAA01/gvAA01/bri >>> ck7 49158 0 Y >>> 6984 >>> Brick 01-B:/brick8/gvAA01/brick 49159 0 Y 7253 >>> Brick 02-B:/brick8/gvAA01/brick 49159 0 Y 9970 >>> Brick 00-A:/arbiterAA01/gvAA01/bri >>> ck8 49159 0 Y >>> 6993 >>> Brick 01-B:/brick9/gvAA01/brick 49160 0 Y 7245 >>> Brick 02-B:/brick9/gvAA01/brick 49160 0 Y 9984 >>> Brick 00-A:/arbiterAA01/gvAA01/bri >>> ck9 49160 0 Y >>> 7001 >>> NFS Server on localhost 2049 0 Y >>> 17276 >>> Self-heal Daemon on localhost N/A N/A Y >>> 25245 >>> NFS Server on 02-B 2049 0 Y 9089 >>> Self-heal Daemon on 02-B N/A N/A Y 17838 >>> NFS Server on 00-a 2049 0 Y 15660 >>> Self-heal Daemon on 00-a N/A N/A Y 16218 >>> >>> Task Status of Volume gvAA01 >>> >>> ------------------------------------------------------------------------------ >>> There are no active volume tasks >>> >>> And gluster volume info: >>> >>> # gluster volume info >>> >>> Volume Name: gvAA01 >>> Type: Distributed-Replicate >>> Volume ID: ca4ece2c-13fe-414b-856c-2878196d6118 >>> Status: Started >>> Snapshot Count: 0 >>> Number of Bricks: 9 x (2 + 1) = 27 >>> Transport-type: tcp >>> Bricks: >>> Brick1: 01-B:/brick1/gvAA01/brick >>> Brick2: 02-B:/brick1/gvAA01/brick >>> Brick3: 00-A:/arbiterAA01/gvAA01/brick1 (arbiter) >>> Brick4: 01-B:/brick2/gvAA01/brick >>> Brick5: 02-B:/brick2/gvAA01/brick >>> Brick6: 00-A:/arbiterAA01/gvAA01/brick2 (arbiter) >>> Brick7: 01-B:/brick3/gvAA01/brick >>> Brick8: 
02-B:/brick3/gvAA01/brick >>> Brick9: 00-A:/arbiterAA01/gvAA01/brick3 (arbiter) >>> Brick10: 01-B:/brick4/gvAA01/brick >>> Brick11: 02-B:/brick4/gvAA01/brick >>> Brick12: 00-A:/arbiterAA01/gvAA01/brick4 (arbiter) >>> Brick13: 01-B:/brick5/gvAA01/brick >>> Brick14: 02-B:/brick5/gvAA01/brick >>> Brick15: 00-A:/arbiterAA01/gvAA01/brick5 (arbiter) >>> Brick16: 01-B:/brick6/gvAA01/brick >>> Brick17: 02-B:/brick6/gvAA01/brick >>> Brick18: 00-A:/arbiterAA01/gvAA01/brick6 (arbiter) >>> Brick19: 01-B:/brick7/gvAA01/brick >>> Brick20: 02-B:/brick7/gvAA01/brick >>> Brick21: 00-A:/arbiterAA01/gvAA01/brick7 (arbiter) >>> Brick22: 01-B:/brick8/gvAA01/brick >>> Brick23: 02-B:/brick8/gvAA01/brick >>> Brick24: 00-A:/arbiterAA01/gvAA01/brick8 (arbiter) >>> Brick25: 01-B:/brick9/gvAA01/brick >>> Brick26: 02-B:/brick9/gvAA01/brick >>> Brick27: 00-A:/arbiterAA01/gvAA01/brick9 (arbiter) >>> Options Reconfigured: >>> cluster.shd-max-threads: 4 >>> performance.least-prio-threads: 16 >>> cluster.readdir-optimize: on >>> performance.quick-read: off >>> performance.stat-prefetch: off >>> cluster.data-self-heal: on >>> cluster.lookup-unhashed: auto >>> cluster.lookup-optimize: on >>> cluster.favorite-child-policy: mtime >>> server.allow-insecure: on >>> transport.address-family: inet >>> client.bind-insecure: on >>> cluster.entry-self-heal: off >>> cluster.metadata-self-heal: off >>> performance.md-cache-timeout: 600 >>> cluster.self-heal-daemon: enable >>> performance.readdir-ahead: on >>> diagnostics.brick-log-level: INFO >>> nfs.disable: off >>> >>> Thank you for any assistance. >>> >>> - Patrick >>> _______________________________________________ >>> Gluster-users mailing list >>> Gluster-users at gluster.org >>> https://lists.gluster.org/mailman/listinfo/gluster-users >>> >>> >>> >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From hunter86_bg at yahoo.com Sun Apr 21 15:39:19 2019 From: hunter86_bg at yahoo.com (Strahil) Date: Sun, 21 Apr 2019 18:39:19 +0300 Subject: [Gluster-users] Extremely slow cluster performance Message-ID: <79s1555j5gp2peufvf8r68qu.1555861159775@email.android.com> This looks more like FUSE problem. Are the clients on v3.12.xx ? Can you setup a VM for a test and run FUSE mounts using v5.6 and with v6.x Best Regards, Strahil NikolovOn Apr 21, 2019 17:24, Patrick Rennie wrote: > > Hi Strahil,? > > Thank you for your reply and your suggestions. I'm not sure which logs would be most relevant to be checking to diagnose this issue, we have the brick logs, the cluster mount logs, the shd logs or something else? I have posted a few that I have seen repeated a few times already. I will continue to post anything further that I see.? > I am working on migrating data to some new storage, so this will slowly free up space, although this is a production cluster and new data is being uploaded every day, sometimes faster than I can migrate it off. I have several other similar clusters and none of them have the same problem, one the others is actually at 98-99% right now (big problem, I know) but still performs perfectly fine compared to this cluster, I am not sure low space is the root cause here.? > > I currently have 13 VMs accessing this cluster, I have checked each one and all of them use one of the two options below to mount the cluster in fstab > > HOSTNAME:/gvAA01? ?/mountpoint? ? glusterfs? ? ? ?defaults,_netdev,rw,log-level=WARNING,direct-io-mode=disable,use-readdirp=no? ? 0 0 > HOSTNAME:/gvAA01? ?/mountpoint? ? glusterfs? ? ? 
?defaults,_netdev,rw,log-level=WARNING,direct-io-mode=disable > > I also have a few other VMs which use NFS to access the cluster, and these machines appear to be significantly quicker, initially I get a similar delay with NFS but if I cancel the first "ls" and try it again I get < 1 sec lookups, this can take over 10 minutes by FUSE/gluster client, but the same trick of cancelling and trying again doesn't work for FUSE/gluster. Sometimes the NFS queries have no delay at all, so this is a bit strange to me.? > HOSTNAME:/gvAA01? ? ? ? /mountpoint/ nfs defaults,_netdev,vers=3,async,noatime 0 0 > > Example: > user at VM:~$ time ls /cluster/folder > ^C > > real? ? 9m49.383s > user? ? 0m0.001s > sys? ? ?0m0.010s > > user at VM:~$ time ls /cluster/folder > > > real? ? 0m0.069s > user? ? 0m0.001s > sys? ? ?0m0.007s > > --- > > I have checked the profiling as you suggested, I let it run for around a minute, then cancelled it and saved the profile info.? > > root at HOSTNAME:/var/log/glusterfs# gluster volume profile gvAA01 start > Starting volume profile on gvAA01 has been successful > root at HOSTNAME:/var/log/glusterfs# time ls /cluster/folder > ^C > > real? ? 1m1.660s > user? ? 0m0.000s > sys? ? ?0m0.002s > > root at HOSTNAME:/var/log/glusterfs# gluster volume profile gvAA01 info >> ~/profile.txt > root at HOSTNAME:/var/log/glusterfs# gluster volume profile gvAA01 stop > > I will attach the results to this email as it's over 1000 lines. Unfortunately, I'm not sure what I'm looking at but possibly somebody will be able to help me make sense of it and let me know if it highlights any specific issues.? > > Happy to try any further suggestions. Thank you, > > -Patrick > > On Sun, Apr 21, 2019 at 7:55 PM Strahil wrote: >> >> By the way, can you provide the 'volume info' and the mount options on all clients? >> Maybe , there is an option that uses a lot of resources due to some client's mount options. >> >> Best Regards, >> Strahil Nikolov >> >> On Apr 21, 2019 10:55, Patrick Rennie wrote: >>> >>> Just another small update, I'm continuing to watch my brick logs and I just saw these errors come up in the recent events too. I am going to continue to post any errors I see in the hope of finding the right one to try and fix..? >>> This is from the logs on brick1 -------------- next part -------------- An HTML attachment was scrubbed... URL: From hunter86_bg at yahoo.com Sun Apr 21 15:50:43 2019 From: hunter86_bg at yahoo.com (Strahil) Date: Sun, 21 Apr 2019 18:50:43 +0300 Subject: [Gluster-users] Extremely slow cluster performance Message-ID: Usually when this happens I run '/find /fuse/mount/point -exec stat {} \;' from a client (using gluster with oVirt). Yet, my scale is multiple times smaller and I don't know how this will affect you (except it will trigger a heal). So the round-robin of the DNS clarifies the mystery .In such case, maybe FUSE client is not the problem.Still it is worth trying a VM with the new gluster version to mount the cluster. From the profile (took a short glance over it from my phone), not all bricks are spending much of their time in LOOKUP. Maybe your data is not evenly distributed? Is that ever possible ? Sadly you can't rebalance untill all those heals are pending.(Maybe I'm wrong) Have you checked the speed of 'ls /my/brick/subdir1/' on each brick ? Sadly, I'm just a gluster user, so take everything with a grain of salt. 
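A small loop makes both of those checks quick to repeat on each storage node; the brick paths below are taken from the volume status earlier in the thread, and 'subdir1' is just a placeholder for a directory that exists on every brick:

# for b in /brick{1..9}/gvAA01/brick; do printf '%s ' "$b"; df -hP "$b" | awk 'NR==2 {print $2, $3, $5}'; done
# for b in /brick{1..9}/gvAA01/brick; do echo "$b"; time ls "$b/subdir1" > /dev/null; done

The first loop shows whether one brick is markedly fuller than its peers, the second whether any single brick is slow to list on its own.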
Best Regards, Strahil NikolovOn Apr 21, 2019 18:03, Patrick Rennie wrote: > > I just tried to check my "gluster volume heal gvAA01 statistics" and it doesn't seem like a full heal was still in progress, just an index, I have started the full heal again and am trying to monitor it with "gluster volume heal gvAA01 info" which just shows me thousands of gfid file identifiers scrolling past.? > What is the best way to check the status of a heal and track the files healed and progress to completion?? > > Thank you, > - Patrick > > On Sun, Apr 21, 2019 at 10:28 PM Patrick Rennie wrote: >> >> I think just worked out why NFS lookups are sometimes slow and sometimes fast as the hostname uses round robin DNS lookups, if I change to a specific host, 01-B, it's always quick, and if I change to the other brick host, 02-B, it's always slow.? >> Maybe that will help to narrow this down?? >> >> On Sun, Apr 21, 2019 at 10:24 PM Patrick Rennie wrote: >>> >>> Hi Strahil,? >>> >>> Thank you for your reply and your suggestions. I'm not sure which logs would be most relevant to be checking to diagnose this issue, we have the brick logs, the cluster mount logs, the shd logs or something else? I have posted a few that I have seen repeated a few times already. I will continue to post anything further that I see.? >>> I am working on migrating data to some new storage, so this will slowly free up space, although this is a production cluster and new data is being uploaded every day, sometimes faster than I can migrate it off. I have several other similar clusters and none of them have the same problem, one the others is actually at 98-99% right now (big problem, I know) but still performs perfectly fine compared to this cluster, I am not sure low space is the root cause here.? >>> >>> I currently have 13 VMs accessing this cluster, I have checked each one and all of them use one of the two options below to mount the cluster in fstab >>> >>> HOSTNAME:/gvAA01? ?/mountpoint? ? glusterfs? ? ? ?defaults,_netdev,rw,log-level=WARNING,direct-io-mode=disable,use-readdirp=no? ? 0 0 >>> HOSTNAME:/gvAA01? ?/mountpoint? ? glusterfs? ? ? ?defaults,_netdev,rw,log-level=WARNING,direct-io-mode=disable >>> >>> I also have a few other VMs which use NFS to access the cluster, and these machines appear to be significantly quicker, initially I get a similar delay with NFS but if I cancel the first "ls" and try it again I get < 1 sec lookups, this can take over 10 minutes by FUSE/gluster client, but the same trick of cancelling and trying again doesn't work for FUSE/gluster. Sometimes the NFS queries have no delay at all, so this is a bit strange to me.? >>> HOSTNAME:/gvAA01? ? ? ? /mountpoint/ nfs defaults,_netdev,vers=3,async,noatime 0 0 >>> >>> Example: >>> user at VM:~$ time ls /cluster/folder >>> ^C >>> >>> real? ? 9m49.383s >>> user? ? 0m0.001s >>> sys? ? ?0m0.010s >>> >>> user at VM:~$ time ls /cluster/folder >>> >>> >>> real? ? 0m0.069s >>> user? ? 0m0.001s >>> sys? ? ?0m0.007s >>> >>> --- >>> >>> I have checked the profiling as you suggested, I let it run for around a minute, then cancelled it and saved the profile info.? >>> >>> root at HOSTNAME:/var/log/glusterfs# gluster volume profile gvAA01 start >>> Starting volume profile on gvAA01 has been successful >>> root at HOSTNAME:/var/log/glusterfs# time ls /cluster/folder >>> ^C >>> >>> real? ? 1m1.660s >>> user? ? 0m0.000s >>> sys? ? 
?0m0.002s >>> >>> root at HOSTNAME:/var/log/glusterfs# gluster volume profile gvAA01 info >> ~/profile.txt >>> root at HOSTNAME:/var/log/glusterfs# gluster volume profile gvAA01 stop >>> >>> I will attach the results to this email as it's o -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrickmrennie at gmail.com Sun Apr 21 16:24:39 2019 From: patrickmrennie at gmail.com (Patrick Rennie) Date: Mon, 22 Apr 2019 00:24:39 +0800 Subject: [Gluster-users] Extremely slow cluster performance In-Reply-To: <79s1555j5gp2peufvf8r68qu.1555861159775@email.android.com> References: <79s1555j5gp2peufvf8r68qu.1555861159775@email.android.com> Message-ID: Hi Strahil, Thanks again for your help, I checked most of my clients are on 3.13.2 which I think is the default packaged with Ubuntu. I upgraded a test VM to v5.6 and tested again and there is no difference, performance accessing the cluster is the same. Cheers, -Patrick On Sun, Apr 21, 2019 at 11:39 PM Strahil wrote: > This looks more like FUSE problem. > Are the clients on v3.12.xx ? > Can you setup a VM for a test and run FUSE mounts using v5.6 and with v6.x > > Best Regards, > Strahil Nikolov > On Apr 21, 2019 17:24, Patrick Rennie wrote: > > Hi Strahil, > > Thank you for your reply and your suggestions. I'm not sure which logs > would be most relevant to be checking to diagnose this issue, we have the > brick logs, the cluster mount logs, the shd logs or something else? I have > posted a few that I have seen repeated a few times already. I will continue > to post anything further that I see. > I am working on migrating data to some new storage, so this will slowly > free up space, although this is a production cluster and new data is being > uploaded every day, sometimes faster than I can migrate it off. I have > several other similar clusters and none of them have the same problem, one > the others is actually at 98-99% right now (big problem, I know) but still > performs perfectly fine compared to this cluster, I am not sure low space > is the root cause here. > > I currently have 13 VMs accessing this cluster, I have checked each one > and all of them use one of the two options below to mount the cluster in > fstab > > HOSTNAME:/gvAA01 /mountpoint glusterfs > defaults,_netdev,rw,log-level=WARNING,direct-io-mode=disable,use-readdirp=no > 0 0 > HOSTNAME:/gvAA01 /mountpoint glusterfs > defaults,_netdev,rw,log-level=WARNING,direct-io-mode=disable > > I also have a few other VMs which use NFS to access the cluster, and these > machines appear to be significantly quicker, initially I get a similar > delay with NFS but if I cancel the first "ls" and try it again I get < 1 > sec lookups, this can take over 10 minutes by FUSE/gluster client, but the > same trick of cancelling and trying again doesn't work for FUSE/gluster. > Sometimes the NFS queries have no delay at all, so this is a bit strange to > me. > HOSTNAME:/gvAA01 /mountpoint/ nfs > defaults,_netdev,vers=3,async,noatime 0 0 > > Example: > user at VM:~$ time ls /cluster/folder > ^C > > real 9m49.383s > user 0m0.001s > sys 0m0.010s > > user at VM:~$ time ls /cluster/folder > > > real 0m0.069s > user 0m0.001s > sys 0m0.007s > > --- > > I have checked the profiling as you suggested, I let it run for around a > minute, then cancelled it and saved the profile info. 
> > root at HOSTNAME:/var/log/glusterfs# gluster volume profile gvAA01 start > Starting volume profile on gvAA01 has been successful > root at HOSTNAME:/var/log/glusterfs# time ls /cluster/folder > ^C > > real 1m1.660s > user 0m0.000s > sys 0m0.002s > > root at HOSTNAME:/var/log/glusterfs# gluster volume profile gvAA01 info >> > ~/profile.txt > root at HOSTNAME:/var/log/glusterfs# gluster volume profile gvAA01 stop > > I will attach the results to this email as it's over 1000 lines. > Unfortunately, I'm not sure what I'm looking at but possibly somebody will > be able to help me make sense of it and let me know if it highlights any > specific issues. > > Happy to try any further suggestions. Thank you, > > -Patrick > > On Sun, Apr 21, 2019 at 7:55 PM Strahil wrote: > > By the way, can you provide the 'volume info' and the mount options on all > clients? > Maybe , there is an option that uses a lot of resources due to some > client's mount options. > > Best Regards, > Strahil Nikolov > On Apr 21, 2019 10:55, Patrick Rennie wrote: > > Just another small update, I'm continuing to watch my brick logs and I > just saw these errors come up in the recent events too. I am going to > continue to post any errors I see in the hope of finding the right one to > try and fix.. > This is from the logs on brick1 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrickmrennie at gmail.com Sun Apr 21 16:33:23 2019 From: patrickmrennie at gmail.com (Patrick Rennie) Date: Mon, 22 Apr 2019 00:33:23 +0800 Subject: [Gluster-users] Extremely slow cluster performance In-Reply-To: References: Message-ID: Thanks again, I have tried to run a find over the cluster to try and trigger self-healing, but it's very slow so I don't have it running right now. If I check the same "ls /brick/folder" on all bricks, it takes less than 0.01 sec so I don't think any individual brick is causing the problem, performance on each brick seems to be normal. I think the issue is somewhere in the gluster internal communication as I believe FUSE mounted clients will try to communicate with all bricks. Unfortunately, I am not sure how to confirm this or narrow this down. Really struggling with this one now, it's starting to significantly impact our operations. I'm not sure what else I can try so appreciate any suggestions. Thank you, - Patrick On Sun, Apr 21, 2019 at 11:50 PM Strahil wrote: > Usually when this happens I run '/find /fuse/mount/point -exec stat {} > \;' from a client (using gluster with oVirt). > Yet, my scale is multiple times smaller and I don't know how this will > affect you (except it will trigger a heal). > > So the round-robin of the DNS clarifies the mystery .In such case, maybe > FUSE client is not the problem.Still it is worth trying a VM with the new > gluster version to mount the cluster. > > From the profile (took a short glance over it from my phone), not all > bricks are spending much of their time in LOOKUP. > Maybe your data is not evenly distributed? Is that ever possible ? > Sadly you can't rebalance untill all those heals are pending.(Maybe I'm > wrong) > > Have you checked the speed of 'ls /my/brick/subdir1/' on each brick ? > > Sadly, I'm just a gluster user, so take everything with a grain of salt. 
> > Best Regards, > Strahil Nikolov > On Apr 21, 2019 18:03, Patrick Rennie wrote: > > I just tried to check my "gluster volume heal gvAA01 statistics" and it > doesn't seem like a full heal was still in progress, just an index, I have > started the full heal again and am trying to monitor it with "gluster > volume heal gvAA01 info" which just shows me thousands of gfid file > identifiers scrolling past. > What is the best way to check the status of a heal and track the files > healed and progress to completion? > > Thank you, > - Patrick > > On Sun, Apr 21, 2019 at 10:28 PM Patrick Rennie > wrote: > > I think just worked out why NFS lookups are sometimes slow and sometimes > fast as the hostname uses round robin DNS lookups, if I change to a > specific host, 01-B, it's always quick, and if I change to the other brick > host, 02-B, it's always slow. > Maybe that will help to narrow this down? > > On Sun, Apr 21, 2019 at 10:24 PM Patrick Rennie > wrote: > > Hi Strahil, > > Thank you for your reply and your suggestions. I'm not sure which logs > would be most relevant to be checking to diagnose this issue, we have the > brick logs, the cluster mount logs, the shd logs or something else? I have > posted a few that I have seen repeated a few times already. I will continue > to post anything further that I see. > I am working on migrating data to some new storage, so this will slowly > free up space, although this is a production cluster and new data is being > uploaded every day, sometimes faster than I can migrate it off. I have > several other similar clusters and none of them have the same problem, one > the others is actually at 98-99% right now (big problem, I know) but still > performs perfectly fine compared to this cluster, I am not sure low space > is the root cause here. > > I currently have 13 VMs accessing this cluster, I have checked each one > and all of them use one of the two options below to mount the cluster in > fstab > > HOSTNAME:/gvAA01 /mountpoint glusterfs > defaults,_netdev,rw,log-level=WARNING,direct-io-mode=disable,use-readdirp=no > 0 0 > HOSTNAME:/gvAA01 /mountpoint glusterfs > defaults,_netdev,rw,log-level=WARNING,direct-io-mode=disable > > I also have a few other VMs which use NFS to access the cluster, and these > machines appear to be significantly quicker, initially I get a similar > delay with NFS but if I cancel the first "ls" and try it again I get < 1 > sec lookups, this can take over 10 minutes by FUSE/gluster client, but the > same trick of cancelling and trying again doesn't work for FUSE/gluster. > Sometimes the NFS queries have no delay at all, so this is a bit strange to > me. > HOSTNAME:/gvAA01 /mountpoint/ nfs > defaults,_netdev,vers=3,async,noatime 0 0 > > Example: > user at VM:~$ time ls /cluster/folder > ^C > > real 9m49.383s > user 0m0.001s > sys 0m0.010s > > user at VM:~$ time ls /cluster/folder > > > real 0m0.069s > user 0m0.001s > sys 0m0.007s > > --- > > I have checked the profiling as you suggested, I let it run for around a > minute, then cancelled it and saved the profile info. 
> > root at HOSTNAME:/var/log/glusterfs# gluster volume profile gvAA01 start > Starting volume profile on gvAA01 has been successful > root at HOSTNAME:/var/log/glusterfs# time ls /cluster/folder > ^C > > real 1m1.660s > user 0m0.000s > sys 0m0.002s > > root at HOSTNAME:/var/log/glusterfs# gluster volume profile gvAA01 info >> > ~/profile.txt > root at HOSTNAME:/var/log/glusterfs# gluster volume profile gvAA01 stop > > I will attach the results to this email as it's o > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From budic at onholyground.com Sun Apr 21 16:43:23 2019 From: budic at onholyground.com (Darrell Budic) Date: Sun, 21 Apr 2019 11:43:23 -0500 Subject: [Gluster-users] Extremely slow cluster performance In-Reply-To: References: <93FC9B39-2E8C-4579-8C9D-DEF1A28B7384@onholyground.com> Message-ID: <0A865F28-C4A6-41EF-AE37-70216670B4F0@onholyground.com> Patrick- Specifically re: > Thanks again for your advice, I've left it for a while but unfortunately it's still just as slow and causing more problems for our operations now. I will need to try and take some steps to at least bring performance back to normal while continuing to investigate the issue longer term. I can definitely see one node with heavier CPU than the other, almost double, which I am OK with, but I think the heal process is going to take forever, trying to check the "gluster volume heal info" shows thousands and thousands of files which may need healing, I have no idea how many in total the command is still running after hours, so I am not sure what has gone so wrong to cause this. > ... > I have no idea how long the healing is going to take on this cluster, we have around 560TB of data on here, but I don't think I can wait that long to try and restore performance to normal. You?re in a bind, I know, but it?s just going to take some time recover. You have a lot of data, and even at the best speeds your disks and networks can muster, it?s going to take a while. Until your cluster is fully healed, anything else you try may not have the full effect it would on a fully operational cluster. Your predecessor may have made things worse by not having proper posix attributes on the ZFS file system. You may have made things worse by killing brick processes in your distributed-replicated setup, creating an additional need for healing and possibly compounding the overall performance issues. I?m not trying to blame you or make you feel bad, but I do want to point out that there?s a problem here, and there is unlikely to be a silver bullet that will resolve the issue instantly. You?re going to have to give it time to get back into a ?normal" condition, which seems to be what your setup was configured and tested for in the first place. Those things said, rather than trying to move things from this cluster to different storage, what about having your VMs mount different storage in the first place and move the write load off of this cluster while it recovers? Looking at the profile you posted for Strahil, your bricks are spending a lot of time doing LOOKUPs, and some are slower than others by a significant margin. If you haven?t already, check the zfs pools on those, make sure they don?t have any failed disks that might be slowing them down. Consider if you can speed them up with a ZIL or SLOG if they are spinning disks (although your previous server descriptions sound like you don?t need a SLOG, ZILs may help fi they are HDDs)? 
Just saw your additional comments that one server is faster than the other, it's possible that it's got the actual data and the other one is doing healings every time it gets accessed, or it's just got fuller and slower volumes. It may make sense to try forcing all your VM mounts to the faster server for a while, even if it's the one with higher load (serving will get preference to healing, but don't push the shd-max-threads too high, they can squash performance). Given it's a dispersed volume, make sure you've got disperse.shd-max-threads at 4 or 8, and raise disperse.shd-wait-qlength to 4096 or so. You're getting into things best tested with everything working, but desperate times call for accelerated testing, right? You could experiment with different values of performance.io-thread-count, try 48. But if your CPU load is already near max, you're getting everything you can out of your CPU already, so don't spend too much time on it. Check out https://github.com/gluster/glusterfs/blob/release-3.11/extras/group-nl-cache and try applying these to your gluster volume. Without knowing more about your workload, these may help if you're doing a lot of directory listing and file lookups or tests for the (non)existence of a file from your VMs. If those help, search the mailing list for info on the mount option 'negative_cache=1' and a thread titled '[Gluster-users] Gluster native mount is really slow compared to nfs', it may have some client side mount options that could give you further benefits. Have a look at https://docs.gluster.org/en/v3/Administrator%20Guide/Managing%20Volumes/#tuning-options , cluster.data-self-heal-algorithm full may help things heal faster for you. performance.flush-behind & related may improve write response to the clients, use caution unless you have UPSs & battery backed raids, etc. If you have stats on network traffic on/between your two 'real' node servers, you can use that as a proxy value for healing performance. I looked up the performance.stat-prefetch bug for you, it was fixed back in 3.8, so it should be safe to enable on your 3.12.x system even with servers at .15 & .14. You'll probably have to wait for devs to get anything else out of those logs, but make sure your servers can all see each other (gluster peer status, everything should be 'Peer in Cluster (Connected)' on all servers), and all 3 see all the bricks in the 'gluster vol status'. Maybe check for split brain files on those you keep seeing in the logs? Good luck, have patience, and remember (& remind others) that things are not in their normal state at this moment, and look for things outside of the gluster server cluster to try to help (https://joejulian.name/post/optimizing-web-performance-with-glusterfs/) get through the healing as well. -Darrell > On Apr 21, 2019, at 4:41 AM, Patrick Rennie wrote: > > Another small update from me, I have been keeping an eye on the glustershd.log file to see what is going on and I keep seeing the same file names come up in there every 10 minutes, but not a lot of other activity. Logs below. > How can I be sure my heal is progressing through the files which actually need to be healed? I thought it would show up in these logs. > I also increased the "cluster.shd-max-threads" from 4 to 8 to try and speed things up too. > > Any ideas here? 
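One way to track heal progress without scrolling through gfids is the per-brick pending count, which should trend downwards if the heal is actually getting through the backlog:

# gluster volume heal gvAA01 statistics heal-count
# watch -n 300 'gluster volume heal gvAA01 statistics heal-count'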
> > Thanks, > > - Patrick > > On 01-B > ------- > [2019-04-21 09:12:54.575689] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-gvAA01-replicate-6: performing metadata selfheal on 5354c112-2e58-451d-a6f7-6bfcc1c9d904 > [2019-04-21 09:12:54.733601] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed metadata selfheal on 5354c112-2e58-451d-a6f7-6bfcc1c9d904. sources=[0] 2 sinks=1 > [2019-04-21 09:13:12.028509] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-5: performing entry selfheal on 30547ab6-1fbd-422e-9c81-2009f9ff7ebe > [2019-04-21 09:13:12.047470] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-5: expunging file 30547ab6-1fbd-422e-9c81-2009f9ff7ebe/XXXXXXXX.vbm_346744_tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-17 > > [2019-04-21 09:23:13.044377] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-5: performing entry selfheal on 30547ab6-1fbd-422e-9c81-2009f9ff7ebe > [2019-04-21 09:23:13.051479] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-5: expunging file 30547ab6-1fbd-422e-9c81-2009f9ff7ebe/XXXXXXXX.vbm_346744_tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-17 > > [2019-04-21 09:33:07.400369] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed data selfheal on 2fd9899f-192b-49cb-ae9c-df35d3f004fa. sources=[0] 2 sinks=1 > [2019-04-21 09:33:11.825449] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-gvAA01-replicate-6: performing metadata selfheal on 2fd9899f-192b-49cb-ae9c-df35d3f004fa > [2019-04-21 09:33:14.029837] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-5: performing entry selfheal on 30547ab6-1fbd-422e-9c81-2009f9ff7ebe > [2019-04-21 09:33:14.037436] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-5: expunging file 30547ab6-1fbd-422e-9c81-2009f9ff7ebe/XXXXXXXX.vbm_346744_tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-17 > [2019-04-21 09:33:23.913882] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed metadata selfheal on 2fd9899f-192b-49cb-ae9c-df35d3f004fa. sources=[0] 2 sinks=1 > [2019-04-21 09:33:43.874201] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-gvAA01-replicate-6: performing metadata selfheal on c25b80fd-f7df-4c6d-92bd-db930e89a0b1 > [2019-04-21 09:34:02.273898] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed metadata selfheal on c25b80fd-f7df-4c6d-92bd-db930e89a0b1. sources=[0] 2 sinks=1 > [2019-04-21 09:35:12.282045] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed data selfheal on 94027f22-a7d7-4827-be0d-09cf5ddda885. sources=[0] 2 sinks=1 > [2019-04-21 09:35:15.146252] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-gvAA01-replicate-6: performing metadata selfheal on 94027f22-a7d7-4827-be0d-09cf5ddda885 > [2019-04-21 09:35:15.254538] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed metadata selfheal on 94027f22-a7d7-4827-be0d-09cf5ddda885. 
sources=[0] 2 sinks=1 > [2019-04-21 09:35:22.900803] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed data selfheal on 84c93069-cfd8-441b-a6e8-958bed535b45. sources=[0] 2 sinks=1 > [2019-04-21 09:35:27.150963] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-gvAA01-replicate-6: performing metadata selfheal on 84c93069-cfd8-441b-a6e8-958bed535b45 > [2019-04-21 09:35:29.186295] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed metadata selfheal on 84c93069-cfd8-441b-a6e8-958bed535b45. sources=[0] 2 sinks=1 > [2019-04-21 09:35:35.967451] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed data selfheal on e747c32e-4353-4173-9024-855c69cdf9b9. sources=[0] 2 sinks=1 > [2019-04-21 09:35:40.733444] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-gvAA01-replicate-6: performing metadata selfheal on e747c32e-4353-4173-9024-855c69cdf9b9 > [2019-04-21 09:35:58.707593] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed metadata selfheal on e747c32e-4353-4173-9024-855c69cdf9b9. sources=[0] 2 sinks=1 > [2019-04-21 09:36:25.554260] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed data selfheal on 4758d581-9de0-403b-af8b-bfd3d71d020d. sources=[0] 2 sinks=1 > [2019-04-21 09:36:26.031422] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-gvAA01-replicate-6: performing metadata selfheal on 4758d581-9de0-403b-af8b-bfd3d71d020d > [2019-04-21 09:36:26.083982] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed metadata selfheal on 4758d581-9de0-403b-af8b-bfd3d71d020d. 
sources=[0] 2 sinks=1 > > On 02-B > ------- > [2019-04-21 09:03:15.815250] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: performing entry selfheal on 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01 > [2019-04-21 09:03:15.863153] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: expunging file 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01/C_VOL-b003-i174-cd.md5.tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 > [2019-04-21 09:03:15.867432] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: performing entry selfheal on 65f6d9fb-a441-4e47-b91a-0936d11a8c8f > [2019-04-21 09:03:15.875134] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: expunging file 65f6d9fb-a441-4e47-b91a-0936d11a8c8f/C_VOL-b001-i14937-cd.md5.tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 > [2019-04-21 09:03:39.020198] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-5: performing entry selfheal on 30547ab6-1fbd-422e-9c81-2009f9ff7ebe > [2019-04-21 09:03:39.027345] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-5: expunging file 30547ab6-1fbd-422e-9c81-2009f9ff7ebe/XXXXXXXX.vbm_346744_tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-17 > > [2019-04-21 09:13:18.524874] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: performing entry selfheal on 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01 > [2019-04-21 09:13:20.070172] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: expunging file 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01/C_VOL-b003-i174-cd.md5.tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 > [2019-04-21 09:13:20.074977] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: performing entry selfheal on 65f6d9fb-a441-4e47-b91a-0936d11a8c8f > [2019-04-21 09:13:20.080827] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: expunging file 65f6d9fb-a441-4e47-b91a-0936d11a8c8f/C_VOL-b001-i14937-cd.md5.tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 > [2019-04-21 09:13:40.015763] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-5: performing entry selfheal on 30547ab6-1fbd-422e-9c81-2009f9ff7ebe > [2019-04-21 09:13:40.021805] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-5: expunging file 30547ab6-1fbd-422e-9c81-2009f9ff7ebe/XXXXXXXX.vbm_346744_tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-17 > > [2019-04-21 09:23:21.991032] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: performing entry selfheal on 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01 > [2019-04-21 09:23:22.054565] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: expunging file 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01/C_VOL-b003-i174-cd.md5.tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 > [2019-04-21 09:23:22.059225] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: performing entry selfheal on 65f6d9fb-a441-4e47-b91a-0936d11a8c8f > [2019-04-21 09:23:22.066266] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: expunging file 
65f6d9fb-a441-4e47-b91a-0936d11a8c8f/C_VOL-b001-i14937-cd.md5.tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 > [2019-04-21 09:23:41.129962] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-5: performing entry selfheal on 30547ab6-1fbd-422e-9c81-2009f9ff7ebe > [2019-04-21 09:23:41.135919] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-5: expunging file 30547ab6-1fbd-422e-9c81-2009f9ff7ebe/XXXXXXXX.vbm_346744_tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-17 > > [2019-04-21 09:33:24.015223] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: performing entry selfheal on 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01 > [2019-04-21 09:33:24.069686] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: expunging file 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01/C_VOL-b003-i174-cd.md5.tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 > [2019-04-21 09:33:24.074341] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: performing entry selfheal on 65f6d9fb-a441-4e47-b91a-0936d11a8c8f > [2019-04-21 09:33:24.080065] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: expunging file 65f6d9fb-a441-4e47-b91a-0936d11a8c8f/C_VOL-b001-i14937-cd.md5.tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 > [2019-04-21 09:33:42.099515] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-5: performing entry selfheal on 30547ab6-1fbd-422e-9c81-2009f9ff7ebe > [2019-04-21 09:33:42.107481] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-5: expunging file 30547ab6-1fbd-422e-9c81-2009f9ff7ebe/XXXXXXXX.vbm_346744_tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-17 > > > On Sun, Apr 21, 2019 at 3:55 PM Patrick Rennie > wrote: > Just another small update, I'm continuing to watch my brick logs and I just saw these errors come up in the recent events too. I am going to continue to post any errors I see in the hope of finding the right one to try and fix.. > This is from the logs on brick1, seems to be occurring on both nodes on brick1, although at different times. I'm not sure what this means, can anyone shed any light? > I guess I am looking for some kind of specific error which may indicate something is broken or stuck and locking up and causing the extreme latency I'm seeing in the cluster. 
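When hunting for a needle like that, a frequency count of error sources in a brick log can help separate one endlessly repeated symptom from a genuinely new error; the file name below follows gluster's usual convention of the brick path with slashes turned into dashes, so adjust it to the actual log file:

# grep ' E \[' /var/log/glusterfs/bricks/brick1-gvAA01-brick.log | awk '{print $4}' | sort | uniq -c | sort -rn | head

The counts point at which source locations are flooding the log, which narrows down where to look next.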
> > [2019-04-21 07:25:55.064497] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c700c, Program: GlusterFS 3.3, ProgVers: 330, Proc: 29) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.064612] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e58a) [0x7f3b3e93158a] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17d45) [0x7f3b3e4c5d45] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.064675] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c70af, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.064705] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.064742] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c723c, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.064768] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.064812] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c72b4, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.064837] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.064880] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c740b, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.064905] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.064939] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c7441, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.064962] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] 
-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.064996] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c74d5, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.065020] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.065052] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c7551, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.065076] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed > [2019-04-21 07:25:55.065110] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c76d1, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) > [2019-04-21 07:25:55.065133] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed > > Thanks again, > > -Patrick > > On Sun, Apr 21, 2019 at 3:50 PM Patrick Rennie > wrote: > Hi Darrell, > > Thanks again for your advice, I've left it for a while but unfortunately it's still just as slow and causing more problems for our operations now. I will need to try and take some steps to at least bring performance back to normal while continuing to investigate the issue longer term. I can definitely see one node with heavier CPU than the other, almost double, which I am OK with, but I think the heal process is going to take forever, trying to check the "gluster volume heal info" shows thousands and thousands of files which may need healing, I have no idea how many in total the command is still running after hours, so I am not sure what has gone so wrong to cause this. > > I've checked cluster.op-version and cluster.max-op-version and it looks like I'm on the latest version there. > > I have no idea how long the healing is going to take on this cluster, we have around 560TB of data on here, but I don't think I can wait that long to try and restore performance to normal. > > Can anyone think of anything else I can try in the meantime to work out what's causing the extreme latency? > > I've been going through cluster client the logs of some of our VMs and on some of our FTP servers I found this in the cluster mount log, but I am not seeing it on any of our other servers, just our FTP servers. 
> > [2019-04-21 07:16:19.925388] E [MSGID: 101046] [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null > [2019-04-21 07:19:43.413834] W [MSGID: 114031] [client-rpc-fops.c:2203:client3_3_setattr_cbk] 0-gvAA01-client-19: remote operation failed [No such file or directory] > [2019-04-21 07:19:43.414153] W [MSGID: 114031] [client-rpc-fops.c:2203:client3_3_setattr_cbk] 0-gvAA01-client-20: remote operation failed [No such file or directory] > [2019-04-21 07:23:33.154717] E [MSGID: 101046] [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null > [2019-04-21 07:33:24.943913] E [MSGID: 101046] [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null > > Any ideas what this could mean? I am basically just grasping at straws here. > > I am going to hold off on the version upgrade until I know there are no files which need healing, which could be a while, from some reading I've done there shouldn't be any issues with this as both are on v3.12.x > > I've free'd up a small amount of space, but I still need to work on this further. > > I've read of a command "find .glusterfs -type f -links -2 -exec rm {} \;" which could be run on each brick and it would potentially clean up any files which were deleted straight from the bricks, but not via the client, I have a feeling this could help me free up about 5-10TB per brick from what I've been told about the history of this cluster. Can anyone confirm if this is actually safe to run? > > At this stage, I'm open to any suggestions as to how to proceed, thanks again for any advice. > > Cheers, > > - Patrick > > On Sun, Apr 21, 2019 at 1:22 AM Darrell Budic > wrote: > Patrick, > > Sounds like progress. Be aware that gluster is expected to max out the CPUs on at least one of your servers while healing. This is normal and won?t adversely affect overall performance (any more than having bricks in need of healing, at any rate) unless you?re overdoing it. shd threads <= 4 should not do that on your hardware. Other tunings may have also increased overall performance, so you may see higher CPU than previously anyway. I?d recommend upping those thread counts and letting it heal as fast as possible, especially if these are dedicated Gluster storage servers (Ie: not also running VMs, etc). You should see ?normal? CPU use one heals are completed. I see ~15-30% overall normally, 95-98% while healing (x my 20 cores). It?s also likely to be different between your servers, in a pure replica, one tends to max and one tends to be a little higher, in a distributed-replica, I?d expect more than one to run harder while healing. > > Keep the differences between doing an ls on a brick and doing an ls on a gluster mount in mind. When you do a ls on a gluster volume, it isn?t just doing a ls on one brick, it?s effectively doing it on ALL of your bricks, and they all have to return data before the ls succeeds. In a distributed volume, it?s figuring out where on each volume things live and getting the stat() from each to assemble the whole thing. And if things are in need of healing, it will take even longer to decide which version is current and use it (shd triggers a heal anytime it encounters this). Any of these things being slow slows down the overall response. > > At this point, I?d get some sleep too, and let your cluster heal while you do. I?d really want it fully healed before I did any updates anyway, so let it use CPU and get itself sorted out. 
Expect it to do a round of healing after you upgrade each machine too, this is normal so don?t let the CPU spike surprise you, It?s just catching up from the downtime incurred by the update and/or reboot if you did one. > > That reminds me, check your gluster cluster.op-version and cluster.max-op-version (gluster vol get all all | grep op-version). If op-version isn?t at the max-op-verison, set it to it so you?re taking advantage of the latest features available to your version. > > -Darrell > >> On Apr 20, 2019, at 11:54 AM, Patrick Rennie > wrote: >> >> Hi Darrell, >> >> Thanks again for your advice, I've applied the acltype=posixacl on my zpools and I think that has reduced some of the noise from my brick logs. >> I also bumped up some of the thread counts you suggested but my CPU load skyrocketed, so I dropped it back down to something slightly lower, but still higher than it was before, and will see how that goes for a while. >> >> Although low space is a definite issue, if I run an ls anywhere on my bricks directly it's instant, <1 second, and still takes several minutes via gluster, so there is still a problem in my gluster configuration somewhere. We don't have any snapshots, but I am trying to work out if any data on there is safe to delete, or if there is any way I can safely find and delete data which has been removed directly from the bricks in the past. I also have lz4 compression already enabled on each zpool which does help a bit, we get between 1.05 and 1.08x compression on this data. >> I've tried to go through each client and checked it's cluster mount logs and also my brick logs and looking for errors, so far nothing is jumping out at me, but there are some warnings and errors here and there, I am trying to work out what they mean. >> >> It's already 1 am here and unfortunately, I'm still awake working on this issue, but I think that I will have to leave the version upgrades until tomorrow. >> >> Thanks again for your advice so far. If anyone has any ideas on where I can look for errors other than brick logs or the cluster mount logs to help resolve this issue, it would be much appreciated. >> >> Cheers, >> >> - Patrick >> >> On Sat, Apr 20, 2019 at 11:57 PM Darrell Budic > wrote: >> See inline: >> >>> On Apr 20, 2019, at 10:09 AM, Patrick Rennie > wrote: >>> >>> Hi Darrell, >>> >>> Thanks for your reply, this issue seems to be getting worse over the last few days, really has me tearing my hair out. I will do as you have suggested and get started on upgrading from 3.12.14 to 3.12.15. >>> I've checked the zfs properties and all bricks have "xattr=sa" set, but none of them has "acltype=posixacl" set, currently the acltype property shows "off", if I make these changes will it apply retroactively to the existing data? I'm unfamiliar with what this will change so I may need to look into that before I proceed. >> >> It is safe to apply that now, any new set/get calls will then use it if new posixacls exist, and use older if not. ZFS is good that way. It should clear up your posix_acl and posix errors over time. >> >>> I understand performance is going to slow down as the bricks get full, I am currently trying to free space and migrate data to some newer storage, I have fresh several hundred TB storage I just setup recently but with these performance issues it's really slow. I also believe there is significant data which has been deleted directly from the bricks in the past, so if I can reclaim this space in a safe manner then I will have at least around 10-15% free space. 
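On the "find .glusterfs -type f -links -2" cleanup asked about above, a listing-only dry run is a safer first step than deleting straight away. A rough sketch, run per brick (the brick path matches the bricks shown later in this thread, the temp file name is arbitrary, and only the last command removes anything - it should only be run after the list has been reviewed):

# find /brick1/gvAA01/brick/.glusterfs -type f -links -2 > /tmp/brick1_orphans.txt
# wc -l /tmp/brick1_orphans.txt
# xargs -r du -ch < /tmp/brick1_orphans.txt | tail -n 1
# xargs -r rm -- < /tmp/brick1_orphans.txt

The "-links -2" test matches files under .glusterfs with a link count below 2, i.e. gfid hardlinks whose named file was already removed directly from the brick, so it should not touch anything still visible through the client. Paths under .glusterfs are gfid-based and contain no spaces, but on a very long list xargs may split the du run, so treat the size total as a rough estimate.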
>> >> Full ZFS volumes will have a much larger impact on performance than you?d think, I?d prioritize this. If you have been taking zfs snapshots, consider deleting them to get the overall volume free space back up. And just to be sure it?s been said, delete from within the mounted volumes, don?t delete directly from the bricks (gluster will just try and heal it later, compounding your issues). Does not apply to deleting other data from the ZFS volume if it?s not part of the brick directory, of course. >> >>> These servers have dual 8 core Xeon (E5-2620v4) and 512GB of RAM so generally they have plenty of resources available, currently only using around 330/512GB of memory. >>> >>> I will look into what your suggested settings will change, and then will probably go ahead with your recommendations, for our specs as stated above, what would you suggest for performance.io -thread-count ? >> >> I run single 2630v4s on my servers, which have a smaller storage footprint than yours. I?d go with 32 for performance.io -thread-count. I?d try 4 for the shd thread settings on that gear. Your memory use sounds fine, so no worries there. >> >>> Our workload is nothing too extreme, we have a few VMs which write backup data to this storage nightly for our clients, our VMs don't live on this cluster, but just write to it. >> >> If they are writing compressible data, you?ll get immediate benefit by setting compression=lz4 on your ZFS volumes. It won?t help any old data, of course, but it will compress new data going forward. This is another one that?s safe to enable on the fly. >> >>> I've been going through all of the logs I can, below are some slightly sanitized errors I've come across, but I'm not sure what to make of them. The main error I am seeing is the first one below, across several of my bricks, but possibly only for specific folders on the cluster, I'm not 100% about that yet though. 
>>> >>> [2019-04-20 05:56:59.512649] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not supported] >>> [2019-04-20 05:59:06.084333] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not supported] >>> [2019-04-20 05:59:43.289030] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not supported] >>> [2019-04-20 05:59:50.582257] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not supported] >>> [2019-04-20 06:01:42.501701] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not supported] >>> [2019-04-20 06:01:51.665354] W [posix.c:4929:posix_getxattr] 0-gvAA01-posix: Extended attributes not supported (try remounting brick with 'user_xattr' flag) >>> >>> >>> [2019-04-20 13:12:36.131856] E [MSGID: 113002] [posix-helpers.c:893:posix_gfid_set] 0-gvAA01-posix: gfid is null for /xxxxxxxxxxxxxxxxxxxx [Invalid argument] >>> [2019-04-20 13:12:36.131959] E [MSGID: 113002] [posix.c:362:posix_lookup] 0-gvAA01-posix: buf->ia_gfid is null for /brick2/xxxxxxxxxxxxxxxxxxxx_62906_tmp [No data available] >>> [2019-04-20 13:12:36.132016] E [MSGID: 115050] [server-rpc-fops.c:175:server_lookup_cbk] 0-gvAA01-server: 24274759: LOOKUP /xxxxxxxxxxxxxxxxxxxx (a7c9b4a0-b7ee-4d01-a79e-576013c8ac87/Cloud Backup_clone1.vbm_62906_tmp), client: 00-A-16217-2019/04/08-21:23:03:692424-gvAA01-client-4-0-3, error-xlator: gvAA01-posix [No data available] >>> [2019-04-20 13:12:38.093719] E [MSGID: 115050] [server-rpc-fops.c:175:server_lookup_cbk] 0-gvAA01-server: 24276491: LOOKUP /xxxxxxxxxxxxxxxxxxxx (a7c9b4a0-b7ee-4d01-a79e-576013c8ac87/Cloud Backup_clone1.vbm_62906_tmp), client: 00-A-16217-2019/04/08-21:23:03:692424-gvAA01-client-4-0-3, error-xlator: gvAA01-posix [No data available] >>> [2019-04-20 13:12:38.093660] E [MSGID: 113002] [posix-helpers.c:893:posix_gfid_set] 0-gvAA01-posix: gfid is null for /xxxxxxxxxxxxxxxxxxxx [Invalid argument] >>> [2019-04-20 13:12:38.093696] E [MSGID: 113002] [posix.c:362:posix_lookup] 0-gvAA01-posix: buf->ia_gfid is null for /brick2/xxxxxxxxxxxxxxxxxxxx [No data available] >>> >> >> posixacls should clear those up, as mentioned. 
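For reference, the ZFS properties discussed in this thread can be checked and applied per pool or dataset in one go; the dataset name below is only a placeholder for whatever backs each brick:

# zfs get acltype,xattr,compression tank/brick1
# zfs set acltype=posixacl tank/brick1
# zfs set xattr=sa tank/brick1
# zfs set compression=lz4 tank/brick1

As noted above, these only take effect for new xattr/ACL calls and newly written blocks; existing data is left as-is, so the brick log errors should fade over time rather than disappear immediately.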
>> >>> >>> [2019-04-20 14:25:59.654576] E [inodelk.c:404:__inode_unlock_lock] 0-gvAA01-locks: Matching lock not found for unlock 0-9223372036854775807, by 980fdbbd367f0000 on 0x7fc4f0161440 >>> [2019-04-20 14:25:59.654668] E [MSGID: 115053] [server-rpc-fops.c:295:server_inodelk_cbk] 0-gvAA01-server: 6092928: INODELK /xxxxxxxxxxxxxxxxxxxx.cdr$ (25b14631-a179-4274-8243-6e272d4f2ad8), client: cb-per-worker18-53637-2019/04/19-14:25:37:927673-gvAA01-client-1-0-4, error-xlator: gvAA01-locks [Invalid argument] >>> >>> >>> [2019-04-20 13:35:07.495495] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x247c644, Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport (tcp.gvAA01-server) >>> [2019-04-20 13:35:07.495619] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/debug/io-stats.so(+0x1696a) [0x7ff4ae6f796a] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/protocol/server.so(+0x2d6e8) [0x7ff4ae2a96e8] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/protocol/server.so(+0x928d) [0x7ff4ae28528d] ) 0-: Reply submission failed >>> >> >> Fix the posix acls and see if these clear up over time as well, I?m unclear on what the overall effect of running without the posix acls will be to total gluster health. Your biggest problem sounds like you need to free up space on the volumes and get the overall volume health back up to par and see if that doesn?t resolve the symptoms you?re seeing. >> >> >>> >>> Thank you again for your assistance. It is greatly appreciated. >>> >>> - Patrick >>> >>> >>> >>> On Sat, Apr 20, 2019 at 10:50 PM Darrell Budic > wrote: >>> Patrick, >>> >>> I would definitely upgrade your two nodes from 3.12.14 to 3.12.15. You also mention ZFS, and that error you show makes me think you need to check to be sure you have ?xattr=sa? and ?acltype=posixacl? set on your ZFS volumes. >>> >>> You also observed your bricks are crossing the 95% full line, ZFS performance will degrade significantly the closer you get to full. In my experience, this starts somewhere between 10% and 5% free space remaining, so you?re in that realm. >>> >>> How?s your free memory on the servers doing? Do you have your zfs arc cache limited to something less than all the RAM? It shares pretty well, but I?ve encountered situations where other things won?t try and take ram back properly if they think it?s in use, so ZFS never gets the opportunity to give it up. >>> >>> Since your volume is a disperse-replica, you might try tuning disperse.shd-max-threads, default is 1, I?d try it at 2, 4, or even more if the CPUs are beefy enough. And setting server.event-threads to 4 and client.event-threads to 8 has proven helpful in many cases. After you get upgraded to 3.12.15, enabling performance.stat-prefetch may help as well. I don?t know if it matters, but I?d also recommend resetting performance.least-prio-threads to the default of 1 (or try 2 or 4) and/or also setting performance.io -thread-count to 32 if those have beefy CPUs. >>> >>> Beyond those general ideas, more info about your hardware (CPU and RAM) and workload (VMs, direct storage for web servers or enders, etc) may net you some more ideas. Then you?re going to have to do more digging into brick logs looking for errors and/or warnings to see what?s going on. 
>>> >>> -Darrell >>> >>> >>>> On Apr 20, 2019, at 8:22 AM, Patrick Rennie > wrote: >>>> >>>> Hello Gluster Users, >>>> >>>> I am hoping someone can help me with resolving an ongoing issue I've been having, I'm new to mailing lists so forgive me if I have gotten anything wrong. We have noticed our performance deteriorating over the last few weeks, easily measured by trying to do an ls on one of our top-level folders, and timing it, which usually would take 2-5 seconds, and now takes up to 20 minutes, which obviously renders our cluster basically unusable. This has been intermittent in the past but is now almost constant and I am not sure how to work out the exact cause. We have noticed some errors in the brick logs, and have noticed that if we kill the right brick process, performance instantly returns back to normal, this is not always the same brick, but it indicates to me something in the brick processes or background tasks may be causing extreme latency. Due to this ability to fix it by killing the right brick process off, I think it's a specific file, or folder, or operation which may be hanging and causing the increased latency, but I am not sure how to work it out. One last thing to add is that our bricks are getting quite full (~95% full), we are trying to migrate data off to new storage but that is going slowly, not helped by this issue. I am currently trying to run a full heal as there appear to be many files needing healing, and I have all brick processes running so they have an opportunity to heal, but this means performance is very poor. It currently takes over 15-20 minutes to do an ls of one of our top-level folders, which just contains 60-80 other folders, this should take 2-5 seconds. This is all being checked by FUSE mount locally on the storage node itself, but it is the same for other clients and VMs accessing the cluster. Initially, it seemed our NFS mounts were not affected and operated at normal speed, but testing over the last day has shown that our NFS clients are also extremely slow, so it doesn't seem specific to FUSE as I first thought it might be. >>>> >>>> I am not sure how to proceed from here, I am fairly new to gluster having inherited this setup from my predecessor and trying to keep it going. I have included some info below to try and help with diagnosis, please let me know if any further info would be helpful. I would really appreciate any advice on what I could try to work out the cause. Thank you in advance for reading this, and any suggestions you might be able to offer. >>>> >>>> - Patrick >>>> >>>> This is an example of the main error I see in our brick logs, there have been others, I can post them when I see them again too: >>>> [2019-04-20 04:54:43.055680] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick1/ library: system.posix_acl_default [Operation not supported] >>>> [2019-04-20 05:01:29.476313] W [posix.c:4929:posix_getxattr] 0-gvAA01-posix: Extended attributes not supported (try remounting brick with 'user_xattr' flag) >>>> >>>> Our setup consists of 2 storage nodes and an arbiter node. I have noticed our nodes are on slightly different versions, I'm not sure if this could be an issue. We have 9 bricks on each node, made up of ZFS RAIDZ2 pools - total capacity is around 560TB. >>>> We have bonded 10gbps NICS on each node, and I have tested bandwidth with iperf and found that it's what would be expected from this config. 
>>>> Individual brick performance seems ok, I've tested several bricks using dd and can write a 10GB files at 1.7GB/s. >>>> >>>> # dd if=/dev/zero of=/brick1/test/test.file bs=1M count=10000 >>>> 10000+0 records in >>>> 10000+0 records out >>>> 10485760000 bytes (10 GB, 9.8 GiB) copied, 6.20303 s, 1.7 GB/s >>>> >>>> Node 1: >>>> # glusterfs --version >>>> glusterfs 3.12.15 >>>> >>>> Node 2: >>>> # glusterfs --version >>>> glusterfs 3.12.14 >>>> >>>> Arbiter: >>>> # glusterfs --version >>>> glusterfs 3.12.14 >>>> >>>> Here is our gluster volume status: >>>> >>>> # gluster volume status >>>> Status of volume: gvAA01 >>>> Gluster process TCP Port RDMA Port Online Pid >>>> ------------------------------------------------------------------------------ >>>> Brick 01-B:/brick1/gvAA01/brick 49152 0 Y 7219 >>>> Brick 02-B:/brick1/gvAA01/brick 49152 0 Y 21845 >>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>> ck1 49152 0 Y 6931 >>>> Brick 01-B:/brick2/gvAA01/brick 49153 0 Y 7239 >>>> Brick 02-B:/brick2/gvAA01/brick 49153 0 Y 9916 >>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>> ck2 49153 0 Y 6939 >>>> Brick 01-B:/brick3/gvAA01/brick 49154 0 Y 7235 >>>> Brick 02-B:/brick3/gvAA01/brick 49154 0 Y 21858 >>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>> ck3 49154 0 Y 6947 >>>> Brick 01-B:/brick4/gvAA01/brick 49155 0 Y 31840 >>>> Brick 02-B:/brick4/gvAA01/brick 49155 0 Y 9933 >>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>> ck4 49155 0 Y 6956 >>>> Brick 01-B:/brick5/gvAA01/brick 49156 0 Y 7233 >>>> Brick 02-B:/brick5/gvAA01/brick 49156 0 Y 9942 >>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>> ck5 49156 0 Y 6964 >>>> Brick 01-B:/brick6/gvAA01/brick 49157 0 Y 7234 >>>> Brick 02-B:/brick6/gvAA01/brick 49157 0 Y 9952 >>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>> ck6 49157 0 Y 6974 >>>> Brick 01-B:/brick7/gvAA01/brick 49158 0 Y 7248 >>>> Brick 02-B:/brick7/gvAA01/brick 49158 0 Y 9960 >>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>> ck7 49158 0 Y 6984 >>>> Brick 01-B:/brick8/gvAA01/brick 49159 0 Y 7253 >>>> Brick 02-B:/brick8/gvAA01/brick 49159 0 Y 9970 >>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>> ck8 49159 0 Y 6993 >>>> Brick 01-B:/brick9/gvAA01/brick 49160 0 Y 7245 >>>> Brick 02-B:/brick9/gvAA01/brick 49160 0 Y 9984 >>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>> ck9 49160 0 Y 7001 >>>> NFS Server on localhost 2049 0 Y 17276 >>>> Self-heal Daemon on localhost N/A N/A Y 25245 >>>> NFS Server on 02-B 2049 0 Y 9089 >>>> Self-heal Daemon on 02-B N/A N/A Y 17838 >>>> NFS Server on 00-a 2049 0 Y 15660 >>>> Self-heal Daemon on 00-a N/A N/A Y 16218 >>>> >>>> Task Status of Volume gvAA01 >>>> ------------------------------------------------------------------------------ >>>> There are no active volume tasks >>>> >>>> And gluster volume info: >>>> >>>> # gluster volume info >>>> >>>> Volume Name: gvAA01 >>>> Type: Distributed-Replicate >>>> Volume ID: ca4ece2c-13fe-414b-856c-2878196d6118 >>>> Status: Started >>>> Snapshot Count: 0 >>>> Number of Bricks: 9 x (2 + 1) = 27 >>>> Transport-type: tcp >>>> Bricks: >>>> Brick1: 01-B:/brick1/gvAA01/brick >>>> Brick2: 02-B:/brick1/gvAA01/brick >>>> Brick3: 00-A:/arbiterAA01/gvAA01/brick1 (arbiter) >>>> Brick4: 01-B:/brick2/gvAA01/brick >>>> Brick5: 02-B:/brick2/gvAA01/brick >>>> Brick6: 00-A:/arbiterAA01/gvAA01/brick2 (arbiter) >>>> Brick7: 01-B:/brick3/gvAA01/brick >>>> Brick8: 02-B:/brick3/gvAA01/brick >>>> Brick9: 00-A:/arbiterAA01/gvAA01/brick3 (arbiter) >>>> Brick10: 01-B:/brick4/gvAA01/brick >>>> Brick11: 02-B:/brick4/gvAA01/brick >>>> Brick12: 00-A:/arbiterAA01/gvAA01/brick4 (arbiter) >>>> Brick13: 
01-B:/brick5/gvAA01/brick >>>> Brick14: 02-B:/brick5/gvAA01/brick >>>> Brick15: 00-A:/arbiterAA01/gvAA01/brick5 (arbiter) >>>> Brick16: 01-B:/brick6/gvAA01/brick >>>> Brick17: 02-B:/brick6/gvAA01/brick >>>> Brick18: 00-A:/arbiterAA01/gvAA01/brick6 (arbiter) >>>> Brick19: 01-B:/brick7/gvAA01/brick >>>> Brick20: 02-B:/brick7/gvAA01/brick >>>> Brick21: 00-A:/arbiterAA01/gvAA01/brick7 (arbiter) >>>> Brick22: 01-B:/brick8/gvAA01/brick >>>> Brick23: 02-B:/brick8/gvAA01/brick >>>> Brick24: 00-A:/arbiterAA01/gvAA01/brick8 (arbiter) >>>> Brick25: 01-B:/brick9/gvAA01/brick >>>> Brick26: 02-B:/brick9/gvAA01/brick >>>> Brick27: 00-A:/arbiterAA01/gvAA01/brick9 (arbiter) >>>> Options Reconfigured: >>>> cluster.shd-max-threads: 4 >>>> performance.least-prio-threads: 16 >>>> cluster.readdir-optimize: on >>>> performance.quick-read: off >>>> performance.stat-prefetch: off >>>> cluster.data-self-heal: on >>>> cluster.lookup-unhashed: auto >>>> cluster.lookup-optimize: on >>>> cluster.favorite-child-policy: mtime >>>> server.allow-insecure: on >>>> transport.address-family: inet >>>> client.bind-insecure: on >>>> cluster.entry-self-heal: off >>>> cluster.metadata-self-heal: off >>>> performance.md-cache-timeout: 600 >>>> cluster.self-heal-daemon: enable >>>> performance.readdir-ahead: on >>>> diagnostics.brick-log-level: INFO >>>> nfs.disable: off >>>> >>>> Thank you for any assistance. >>>> >>>> - Patrick >>>> _______________________________________________ >>>> Gluster-users mailing list >>>> Gluster-users at gluster.org >>>> https://lists.gluster.org/mailman/listinfo/gluster-users >> > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From hunter86_bg at yahoo.com Mon Apr 22 04:27:04 2019 From: hunter86_bg at yahoo.com (Strahil) Date: Mon, 22 Apr 2019 07:27:04 +0300 Subject: [Gluster-users] Gluster 5.6 slow read despite fast local brick Message-ID: Hello Community, I have been left with the impression that FUSE mounts will read from both local and remote bricks , is that right? I'm using oVirt as a hyperconverged setup and despite my slow network (currently 1 gbit/s, will be expanded soon), I was expecting that at least the reads from the local brick will be fast, yet I can't reach more than 250 MB/s while the 2 data bricks are NVME with much higher capabilities. Is there something I can do about that ? Maybe change cluster.choose-local, as I don't see it on my other volumes ? What are the risks associated with that? 
Volume Name: data_fast Type: Replicate Volume ID: b78aa52a-4c49-407d-bfd8-fdffb2a3610a Status: Started Snapshot Count: 0 Number of Bricks: 1 x (2 + 1) = 3 Transport-type: tcp Bricks: Brick1: ovirt1:/gluster_bricks/data_fast/data_fast Brick2: ovirt2:/gluster_bricks/data_fast/data_fast Brick3: ovirt3:/gluster_bricks/data_fast/data_fast (arbiter) Options Reconfigured: performance.client-io-threads: off nfs.disable: on transport.address-family: inet performance.quick-read: off performance.read-ahead: off performance.io-cache: off performance.low-prio-threads: 32 network.remote-dio: off cluster.eager-lock: enable cluster.quorum-type: auto cluster.server-quorum-type: server cluster.data-self-heal-algorithm: full cluster.locking-scheme: granular cluster.shd-max-threads: 8 cluster.shd-wait-qlength: 10000 features.shard: on user.cifs: off cluster.choose-local: off storage.owner-uid: 36 storage.owner-gid: 36 performance.strict-o-direct: on cluster.granular-entry-heal: enable network.ping-timeout: 30 cluster.enable-shared-storage: enable Best Regards, Strahil Nikolov -------------- next part -------------- An HTML attachment was scrubbed... URL: From kdhananj at redhat.com Mon Apr 22 06:47:49 2019 From: kdhananj at redhat.com (Krutika Dhananjay) Date: Mon, 22 Apr 2019 12:17:49 +0530 Subject: [Gluster-users] Settings for VM hosting In-Reply-To: <20190419071816.GH25080@althea.ulrar.net> References: <20190418072722.GF25080@althea.ulrar.net> <20190419071816.GH25080@althea.ulrar.net> Message-ID: On Fri, Apr 19, 2019 at 12:48 PM wrote: > On Fri, Apr 19, 2019 at 06:47:49AM +0530, Krutika Dhananjay wrote: > > Looks good mostly. > > You can also turn on performance.stat-prefetch, and also set > > Ah the corruption bug has been fixed, I missed that. Great ! > > > client.event-threads and server.event-threads to 4. > > I didn't realize that would also apply to libgfapi ? > Good to know, thanks. > > > And if your bricks are on ssds, then you could also enable > > performance.client-io-threads. > > I'm surprised by that, the doc says "This feature is not recommended for > distributed, replicated or distributed-replicated volumes." > Since this volume is just a replica 3, shouldn't this stay off ? > The disks are all nvme, which I assume would count as ssd. > They're not recommended if you're using slower disks (HDDs for instance) as it can increase the number of fsyncs triggered by replicate module and their slowness can degrade performance. With nvme/ssds this should not be a problem and the net result of enabling client-io-threads there should be an improvement in perf. -Krutika > > And if your bricks and hypervisors are on same set of machines > > (hyperconverged), > > then you can turn off cluster.choose-local and see if it helps read > > performance. > > Thanks, we'll give those a try ! > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksubrahm at redhat.com Mon Apr 22 09:41:22 2019 From: ksubrahm at redhat.com (Karthik Subrahmanya) Date: Mon, 22 Apr 2019 15:11:22 +0530 Subject: [Gluster-users] adding thin arbiter In-Reply-To: References: Message-ID: Hi, Currently we do not have support for converting an existing volume to a thin-arbiter volume. It is also not supported to replace the thin-arbiter brick with a new one. You can create a fresh thin arbiter volume using GD2 framework and play around that. Feel free to share your experience with thin-arbiter. The GD1 CLIs are being implemented. We will keep things posted on this list as and when they are ready to consume. 
Regards, Karthik On Fri, Apr 19, 2019 at 8:39 PM wrote: > Hi guys, > > On an existing volume, I have a volume with 3 replica. One of them is an > arbiter. Is there a way to change the arbiter to a thin-arbiter? I tried > removing the arbiter brick and add it back, but the add-brick command > does't take the --thin-arbiter option. > > xpk > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From srangana at redhat.com Mon Apr 22 11:38:39 2019 From: srangana at redhat.com (Shyam Ranganathan) Date: Mon, 22 Apr 2019 07:38:39 -0400 Subject: [Gluster-users] v6.0 release notes fix request In-Reply-To: References: Message-ID: Thanks for reporting, this is fixed now. On 4/19/19 2:57 AM, Artem Russakovskii wrote: > Hi, > > https://docs.gluster.org/en/latest/release-notes/6.0/?currently contains > a list of fixed bugs that's run-on and should be fixed with proper line > breaks: > image.png > > Sincerely, > Artem > > -- > Founder, Android Police ,?APK Mirror > , Illogical Robot LLC > beerpla.net | +ArtemRussakovskii > |?@ArtemR > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users > From srangana at redhat.com Mon Apr 22 13:30:45 2019 From: srangana at redhat.com (Shyam Ranganathan) Date: Mon, 22 Apr 2019 09:30:45 -0400 Subject: [Gluster-users] Announcing Gluster release 6.1 Message-ID: The Gluster community is pleased to announce the release of Gluster 6.1 (packages available at [1]). Release notes for the release can be found at [2]. Major changes, features and limitations addressed in this release: None Thanks, Gluster community [1] Packages for 6.1: https://download.gluster.org/pub/gluster/glusterfs/6/6.1/ [2] Release notes for 6.1: https://docs.gluster.org/en/latest/release-notes/6.1/ _______________________________________________ maintainers mailing list maintainers at gluster.org https://lists.gluster.org/mailman/listinfo/maintainers From hunter86_bg at yahoo.com Mon Apr 22 14:18:42 2019 From: hunter86_bg at yahoo.com (Strahil Nikolov) Date: Mon, 22 Apr 2019 14:18:42 +0000 (UTC) Subject: [Gluster-users] Gluster 5.6 slow read despite fast local brick In-Reply-To: References: Message-ID: <1248551588.2770560.1555942722144@mail.yahoo.com> As I had the option to rebuild the volume - I did it and it still reads quite slower than before 5.6 upgrade. I have set cluster.choose-local to 'on' but still the same performance. 
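As an aside, toggling and verifying the option is a one-liner each, and a direct sequential read from the FUSE mount gives a rough before/after number. The mount path and file below are placeholders, and if the mount refuses O_DIRECT the iflag can be dropped as long as caches are cleared between runs:

# gluster volume set data_fast cluster.choose-local on
# gluster volume get data_fast cluster.choose-local
# dd if=/mnt/data_fast/some_large_file of=/dev/null bs=1M count=4096 iflag=direct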
Volume Name: data_fast Type: Replicate Volume ID: 888a32ea-9b5c-4001-a9c5-8bc7ee0bddce Status: Started Snapshot Count: 0 Number of Bricks: 1 x (2 + 1) = 3 Transport-type: tcp Bricks: Brick1: ovirt1:/gluster_bricks/data_fast/data_fast Brick2: ovirt2:/gluster_bricks/data_fast/data_fast Brick3: ovirt3:/gluster_bricks/data_fast/data_fast (arbiter) Options Reconfigured: cluster.choose-local: on network.ping-timeout: 30 cluster.granular-entry-heal: enable performance.strict-o-direct: on storage.owner-gid: 36 storage.owner-uid: 36 user.cifs: off features.shard: on cluster.shd-wait-qlength: 10000 cluster.shd-max-threads: 8 cluster.locking-scheme: granular cluster.data-self-heal-algorithm: full cluster.server-quorum-type: server cluster.quorum-type: auto cluster.eager-lock: enable network.remote-dio: off performance.low-prio-threads: 32 performance.io-cache: off performance.read-ahead: off performance.quick-read: off transport.address-family: inet nfs.disable: on performance.client-io-threads: off cluster.enable-shared-storage: enable Any issues expected when downgrading the version ? Best Regards, Strahil Nikolov On Monday, April 22, 2019, 0:26:51 AM GMT-4, Strahil wrote: Hello Community, I have been left with the impression that FUSE mounts will read from both local and remote bricks , is that right? I'm using oVirt as a hyperconverged setup and despite my slow network (currently 1 gbit/s, will be expanded soon), I was expecting that at least the reads from the local brick will be fast, yet I can't reach more than 250 MB/s while the 2 data bricks are NVME with much higher capabilities. Is there something I can do about that ? Maybe change cluster.choose-local, as I don't see it on my other volumes ? What are the risks associated with that? 
Volume Name: data_fast Type: Replicate Volume ID: b78aa52a-4c49-407d-bfd8-fdffb2a3610a Status: Started Snapshot Count: 0 Number of Bricks: 1 x (2 + 1) = 3 Transport-type: tcp Bricks: Brick1: ovirt1:/gluster_bricks/data_fast/data_fast Brick2: ovirt2:/gluster_bricks/data_fast/data_fast Brick3: ovirt3:/gluster_bricks/data_fast/data_fast (arbiter) Options Reconfigured: performance.client-io-threads: off nfs.disable: on transport.address-family: inet performance.quick-read: off performance.read-ahead: off performance.io-cache: off performance.low-prio-threads: 32 network.remote-dio: off cluster.eager-lock: enable cluster.quorum-type: auto cluster.server-quorum-type: server cluster.data-self-heal-algorithm: full cluster.locking-scheme: granular cluster.shd-max-threads: 8 cluster.shd-wait-qlength: 10000 features.shard: on user.cifs: off cluster.choose-local: off storage.owner-uid: 36 storage.owner-gid: 36 performance.strict-o-direct: on cluster.granular-entry-heal: enable network.ping-timeout: 30 cluster.enable-shared-storage: enable Best Regards, Strahil Nikolov -------------- next part -------------- An HTML attachment was scrubbed... URL: From amye at redhat.com Mon Apr 22 14:43:35 2019 From: amye at redhat.com (Amye Scavarda) Date: Mon, 22 Apr 2019 07:43:35 -0700 Subject: [Gluster-users] Community Happy Hour at Red Hat Summit Message-ID: The Ceph and Gluster teams are joining forces to put on a Community Happy Hour in Boston on Tuesday, May 7th as part of Red Hat Summit. More details, including RSVP at: https://cephandglusterhappyhour_rhsummit.eventbrite.com -- amye -- Amye Scavarda | amye at redhat.com | Gluster Community Lead From hunter86_bg at yahoo.com Mon Apr 22 18:00:56 2019 From: hunter86_bg at yahoo.com (Strahil) Date: Mon, 22 Apr 2019 21:00:56 +0300 Subject: [Gluster-users] Gluster 5.6 slow read despite fast local brick Message-ID: I've set 'cluster.choose-local: on' and the sequential read is approx. 550 MB/s, but this is far below the 1.3 GB/s I have observed with gluster v5.5. Should I consider it a bug, or do some options need to be changed ? What about rolling back? I've tried to roll back one of my nodes, but it never came back until I upgraded to 5.6. Maybe a full offline downgrade could work... Best Regards, Strahil Nikolov
On Apr 22, 2019 17:18, Strahil Nikolov wrote: > > As I had the option to rebuild the volume - I did it and it still reads quite slower than before 5.6 upgrade. > > I have set cluster.choose-local to 'on' but still the same performance. 
URL: From rabhat at redhat.com Mon Apr 22 21:20:58 2019 From: rabhat at redhat.com (FNU Raghavendra Manjunath) Date: Mon, 22 Apr 2019 17:20:58 -0400 Subject: [Gluster-users] Proposal: Changes in Gluster Community meetings In-Reply-To: References: <62104B6F-99CF-4C22-80FC-9C177F73E897@onholyground.com> Message-ID: Hi, This is the agenda for tomorrow's community meeting for NA/EMEA timezone. https://hackmd.io/OqZbh7gfQe6uvVUXUVKJ5g?both ---- On Thu, Apr 11, 2019 at 4:56 AM Amar Tumballi Suryanarayan < atumball at redhat.com> wrote: > Hi All, > > Below is the final details of our community meeting, and I will be sending > invites to mailing list following this email. You can add Gluster Community > Calendar so you can get notifications on the meetings. > > We are starting the meetings from next week. For the first meeting, we > need 1 volunteer from users to discuss the use case / what went well, and > what went bad, etc. preferrably in APAC region. NA/EMEA region, next week. > > Draft Content: https://hackmd.io/OqZbh7gfQe6uvVUXUVKJ5g > ---- > Gluster Community Meeting > Previous > Meeting minutes: > > - http://github.com/gluster/community > > > Date/Time: > Check the community calendar > > Bridge > > - APAC friendly hours > - Bridge: https://bluejeans.com/836554017 > - NA/EMEA > - Bridge: https://bluejeans.com/486278655 > > ------------------------------ > Attendance > > - Name, Company > > Host > > - Who will host next meeting? > - Host will need to send out the agenda 24hr - 12hrs in advance to > mailing list, and also make sure to send the meeting minutes. > - Host will need to reach out to one user at least who can talk > about their usecase, their experience, and their needs. > - Host needs to send meeting minutes as PR to > http://github.com/gluster/community > > User stories > > - Discuss 1 usecase from a user. > - How was the architecture derived, what volume type used, options, > etc? > - What were the major issues faced ? How to improve them? > - What worked good? > - How can we all collaborate well, so it is win-win for the > community and the user? How can we > > Community > > - > > Any release updates? > - > > Blocker issues across the project? > - > > Metrics > - Number of new bugs since previous meeting. How many are not triaged? > - Number of emails, anything unanswered? > > Conferences > / Meetups > > - Any conference in next 1 month where gluster-developers are going? > gluster-users are going? So we can meet and discuss. > > Developer > focus > > - > > Any design specs to discuss? > - > > Metrics of the week? > - Coverity > - Clang-Scan > - Number of patches from new developers. > - Did we increase test coverage? > - [Atin] Also talk about most frequent test failures in the CI and > carve out an AI to get them fixed. > > RoundTable > > - > > ---- > > Regards, > Amar > > On Mon, Mar 25, 2019 at 8:53 PM Amar Tumballi Suryanarayan < > atumball at redhat.com> wrote: > >> Thanks for the feedback Darrell, >> >> The new proposal is to have one in North America 'morning' time. (10AM >> PST), And another in ASIA day time, which is evening 7pm/6pm in Australia, >> 9pm Newzealand, 5pm Tokyo, 4pm Beijing. >> >> For example, if we choose Every other Tuesday for meeting, and 1st of the >> month is Tuesday, we would have North America time for 1st, and on 15th it >> would be ASIA/Pacific time. >> >> Hopefully, this way, we can cover all the timezones, and meeting minutes >> would be committed to github repo, so that way, it will be easier for >> everyone to be aware of what is happening. 
>> >> Regards, >> Amar >> >> On Mon, Mar 25, 2019 at 8:40 PM Darrell Budic >> wrote: >> >>> As a user, I?d like to visit more of these, but the time slot is my 3AM. >>> Any possibility for a rolling schedule (move meeting +6 hours each week >>> with rolling attendance from maintainers?) or an occasional regional >>> meeting 12 hours opposed to the one you?re proposing? >>> >>> -Darrell >>> >>> On Mar 25, 2019, at 4:25 AM, Amar Tumballi Suryanarayan < >>> atumball at redhat.com> wrote: >>> >>> All, >>> >>> We currently have 3 meetings which are public: >>> >>> 1. Maintainer's Meeting >>> >>> - Runs once in 2 weeks (on Mondays), and current attendance is around >>> 3-5 on an avg, and not much is discussed. >>> - Without majority attendance, we can't take any decisions too. >>> >>> 2. Community meeting >>> >>> - Supposed to happen on #gluster-meeting, every 2 weeks, and is the >>> only meeting which is for 'Community/Users'. Others are for developers >>> as of now. >>> Sadly attendance is getting closer to 0 in recent times. >>> >>> 3. GCS meeting >>> >>> - We started it as an effort inside Red Hat gluster team, and opened it >>> up for community from Jan 2019, but the attendance was always from RHT >>> members, and haven't seen any traction from wider group. >>> >>> So, I have a proposal to call out for cancelling all these meeting, and >>> keeping just 1 weekly 'Community' meeting, where even topics related to >>> maintainers and GCS and other projects can be discussed. >>> >>> I have a template of a draft template @ >>> https://hackmd.io/OqZbh7gfQe6uvVUXUVKJ5g >>> >>> Please feel free to suggest improvements, both in agenda and in timings. >>> So, we can have more participation from members of community, which allows >>> more user - developer interactions, and hence quality of project. >>> >>> Waiting for feedbacks, >>> >>> Regards, >>> Amar >>> >>> >>> _______________________________________________ >>> Gluster-users mailing list >>> Gluster-users at gluster.org >>> https://lists.gluster.org/mailman/listinfo/gluster-users >>> >>> >>> >> >> -- >> Amar Tumballi (amarts) >> > > > -- > Amar Tumballi (amarts) > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrickmrennie at gmail.com Tue Apr 23 10:07:42 2019 From: patrickmrennie at gmail.com (Patrick Rennie) Date: Tue, 23 Apr 2019 18:07:42 +0800 Subject: [Gluster-users] Extremely slow cluster performance In-Reply-To: <0A865F28-C4A6-41EF-AE37-70216670B4F0@onholyground.com> References: <93FC9B39-2E8C-4579-8C9D-DEF1A28B7384@onholyground.com> <0A865F28-C4A6-41EF-AE37-70216670B4F0@onholyground.com> Message-ID: Hi Darrel, Thanks again for your advice, I tried to take yesterday off and just not think about it, back at it again today. Still no real progress, however my colleague upgraded our version to 3.13 yesterday, this has broken NFS and caused some other issues for us now. It did add the 'gluster volume heal info summary' so I can use that to try and keep an eye on how many files do seem to need healing, if it's accurate it's possibly less than I though. We are in the progress of moving this data to new storage, but it does take a long time to move so much data around, and more keeps coming in each day. 
We do have 3 cache SSDs for each brick so generally performance on the bricks themselves is quite quick, I can DD a 10GB file at ~1.7-2GB/s directly on a brick so I think the performance of each brick is actually ok. It's a distribute/replicate volume, not dispearsed so I can't change disperse.shd-max-threads. I have checked the basics like all peers connected and no scrubs in progress etc. Will keep working away at this, and will start to read through some of your performance tuning suggestions. Really appreciate your advice. Cheers, -Patrick On Mon, Apr 22, 2019 at 12:43 AM Darrell Budic wrote: > Patrick- > > Specifically re: > > Thanks again for your advice, I've left it for a while but unfortunately >>> it's still just as slow and causing more problems for our operations now. I >>> will need to try and take some steps to at least bring performance back to >>> normal while continuing to investigate the issue longer term. I can >>> definitely see one node with heavier CPU than the other, almost double, >>> which I am OK with, but I think the heal process is going to take forever, >>> trying to check the "gluster volume heal info" shows thousands and >>> thousands of files which may need healing, I have no idea how many in total >>> the command is still running after hours, so I am not sure what has gone so >>> wrong to cause this. >>> ... >>> I have no idea how long the healing is going to take on this cluster, we >>> have around 560TB of data on here, but I don't think I can wait that long >>> to try and restore performance to normal. >>> >> > You?re in a bind, I know, but it?s just going to take some time recover. > You have a lot of data, and even at the best speeds your disks and networks > can muster, it?s going to take a while. Until your cluster is fully healed, > anything else you try may not have the full effect it would on a fully > operational cluster. Your predecessor may have made things worse by not > having proper posix attributes on the ZFS file system. You may have made > things worse by killing brick processes in your distributed-replicated > setup, creating an additional need for healing and possibly compounding the > overall performance issues. I?m not trying to blame you or make you feel > bad, but I do want to point out that there?s a problem here, and there is > unlikely to be a silver bullet that will resolve the issue instantly. > You?re going to have to give it time to get back into a ?normal" condition, > which seems to be what your setup was configured and tested for in the > first place. > > Those things said, rather than trying to move things from this cluster to > different storage, what about having your VMs mount different storage in > the first place and move the write load off of this cluster while it > recovers? > > Looking at the profile you posted for Strahil, your bricks are spending a > lot of time doing LOOKUPs, and some are slower than others by a significant > margin. If you haven?t already, check the zfs pools on those, make sure > they don?t have any failed disks that might be slowing them down. Consider > if you can speed them up with a ZIL or SLOG if they are spinning disks > (although your previous server descriptions sound like you don?t need a > SLOG, ZILs may help fi they are HDDs)? Just saw your additional comments > that one server is faster than than the other, it?s possible that it?s got > the actual data and the other one is doing healings every time it gets > accessed, or it?s just got fuller and slower volumes. 
It may make sense to > try forcing all your VM mounts to the faster server for a while, even if > it?s the one with higher load (serving will get preference to healing, but > don?t push the shd-max-threads too high, they can squash performance. Given > it?s a dispersed volume, make sure you?ve got disperse.shd-max-threads at 4 > or 8, and raise disperse.shd-wait-qlength to 4096 or so. > > You?re getting into things best tested with everything working, but > desperate times call for accelerated testing, right? > > You could experiment with different values of performance.io-thread-cound, > try 48. But if your CPU load is already near max, you?re getting everything > you can out of your CPU already, so don?t spend too much time on it. > > Check out > https://github.com/gluster/glusterfs/blob/release-3.11/extras/group-nl-cache and > try applying these to your gluster volume. Without knowing more about your > workload, these may help if you?re doing a lot of directory listing and > file lookups or tests for the (non)existence of a file from your VMs. If > those help, search the mailing list for info on the mount option > ?negative_cache=1? and a thread titled '[Gluster-users] Gluster native > mount is really slow compared to nfs?, it may have some client side mount > options that could give you further benefits. > > Have a look at > https://docs.gluster.org/en/v3/Administrator%20Guide/Managing%20Volumes/#tuning-options, > cluster.data-sef-heal-algorithm full may help things heal faster for you. > performance.flush-behind & related may improve write response to the > clients, use caution unless you have UPSs & battery backed raids, etc. If > you have stats on network traffic on/between your two ?real? node servers, > you can use that as a proxy value for healing performance. > > I looked up the performance.stat-prefetch bug for you, it was fixed back > in 3.8, so it should be safe to enable on your 3.12.x system even with > servers at .15 & .14. > > You?ll probably have to wait for devs to get anything else out of those logs, > but make sure your servers can all see each other (gluster peer status, > everything should be ?Peer in Cluster (Connected)? on all servers), and > all 3 see all the bricks in the ?gluster vol status?. Maybe check for > split brain files on those you keep seeing in the logs? > > Good luck, have patience, and remember (& remind others) that things are > not in their normal state at this moment, and look for things outside of > the gluster server cluster to try to help ( > https://joejulian.name/post/optimizing-web-performance-with-glusterfs/) > get through the healing as well. > > -Darrell > > On Apr 21, 2019, at 4:41 AM, Patrick Rennie > wrote: > > Another small update from me, I have been keeping an eye on the > glustershd.log file to see what is going on and I keep seeing the same file > names come up in there every 10 minutes, but not a lot of other activity. > Logs below. > How can I be sure my heal is progressing through the files which actually > need to be healed? I thought it would show up in these logs. > I also increased the "cluster.shd-max-threads" from 4 to 8 to try and > speed things up too. > > Any ideas here? 
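One way to answer the question above without waiting for a full 'heal info' crawl is the self-heal statistics output, which reports per-brick pending counts and crawl times; a sketch, again assuming the gvAA01 volume name used in this thread:

# gluster volume heal gvAA01 statistics heal-count
# gluster volume heal gvAA01 statistics

Comparing heal-count snapshots taken a few hours apart gives an approximate healing rate per brick, and also makes it obvious if one brick is not healing at all.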
> > Thanks, > > - Patrick > > On 01-B > ------- > [2019-04-21 09:12:54.575689] I [MSGID: 108026] > [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] > 0-gvAA01-replicate-6: performing metadata selfheal on > 5354c112-2e58-451d-a6f7-6bfcc1c9d904 > [2019-04-21 09:12:54.733601] I [MSGID: 108026] > [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: > Completed metadata selfheal on 5354c112-2e58-451d-a6f7-6bfcc1c9d904. > sources=[0] 2 sinks=1 > [2019-04-21 09:13:12.028509] I [MSGID: 108026] > [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-5: > performing entry selfheal on 30547ab6-1fbd-422e-9c81-2009f9ff7ebe > [2019-04-21 09:13:12.047470] W [MSGID: 108015] > [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-5: > expunging file 30547ab6-1fbd-422e-9c81-2009f9ff7ebe/XXXXXXXX.vbm_346744_tmp > (00000000-0000-0000-0000-000000000000) on gvAA01-client-17 > > [2019-04-21 09:23:13.044377] I [MSGID: 108026] > [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-5: > performing entry selfheal on 30547ab6-1fbd-422e-9c81-2009f9ff7ebe > [2019-04-21 09:23:13.051479] W [MSGID: 108015] > [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-5: > expunging file 30547ab6-1fbd-422e-9c81-2009f9ff7ebe/XXXXXXXX.vbm_346744_tmp > (00000000-0000-0000-0000-000000000000) on gvAA01-client-17 > > [2019-04-21 09:33:07.400369] I [MSGID: 108026] > [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: > Completed data selfheal on 2fd9899f-192b-49cb-ae9c-df35d3f004fa. > sources=[0] 2 sinks=1 > [2019-04-21 09:33:11.825449] I [MSGID: 108026] > [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] > 0-gvAA01-replicate-6: performing metadata selfheal on > 2fd9899f-192b-49cb-ae9c-df35d3f004fa > [2019-04-21 09:33:14.029837] I [MSGID: 108026] > [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-5: > performing entry selfheal on 30547ab6-1fbd-422e-9c81-2009f9ff7ebe > [2019-04-21 09:33:14.037436] W [MSGID: 108015] > [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-5: > expunging file 30547ab6-1fbd-422e-9c81-2009f9ff7ebe/XXXXXXXX.vbm_346744_tmp > (00000000-0000-0000-0000-000000000000) on gvAA01-client-17 > [2019-04-21 09:33:23.913882] I [MSGID: 108026] > [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: > Completed metadata selfheal on 2fd9899f-192b-49cb-ae9c-df35d3f004fa. > sources=[0] 2 sinks=1 > [2019-04-21 09:33:43.874201] I [MSGID: 108026] > [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] > 0-gvAA01-replicate-6: performing metadata selfheal on > c25b80fd-f7df-4c6d-92bd-db930e89a0b1 > [2019-04-21 09:34:02.273898] I [MSGID: 108026] > [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: > Completed metadata selfheal on c25b80fd-f7df-4c6d-92bd-db930e89a0b1. > sources=[0] 2 sinks=1 > [2019-04-21 09:35:12.282045] I [MSGID: 108026] > [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: > Completed data selfheal on 94027f22-a7d7-4827-be0d-09cf5ddda885. > sources=[0] 2 sinks=1 > [2019-04-21 09:35:15.146252] I [MSGID: 108026] > [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] > 0-gvAA01-replicate-6: performing metadata selfheal on > 94027f22-a7d7-4827-be0d-09cf5ddda885 > [2019-04-21 09:35:15.254538] I [MSGID: 108026] > [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: > Completed metadata selfheal on 94027f22-a7d7-4827-be0d-09cf5ddda885. 
> sources=[0] 2 sinks=1 > [2019-04-21 09:35:22.900803] I [MSGID: 108026] > [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: > Completed data selfheal on 84c93069-cfd8-441b-a6e8-958bed535b45. > sources=[0] 2 sinks=1 > [2019-04-21 09:35:27.150963] I [MSGID: 108026] > [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] > 0-gvAA01-replicate-6: performing metadata selfheal on > 84c93069-cfd8-441b-a6e8-958bed535b45 > [2019-04-21 09:35:29.186295] I [MSGID: 108026] > [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: > Completed metadata selfheal on 84c93069-cfd8-441b-a6e8-958bed535b45. > sources=[0] 2 sinks=1 > [2019-04-21 09:35:35.967451] I [MSGID: 108026] > [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: > Completed data selfheal on e747c32e-4353-4173-9024-855c69cdf9b9. > sources=[0] 2 sinks=1 > [2019-04-21 09:35:40.733444] I [MSGID: 108026] > [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] > 0-gvAA01-replicate-6: performing metadata selfheal on > e747c32e-4353-4173-9024-855c69cdf9b9 > [2019-04-21 09:35:58.707593] I [MSGID: 108026] > [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: > Completed metadata selfheal on e747c32e-4353-4173-9024-855c69cdf9b9. > sources=[0] 2 sinks=1 > [2019-04-21 09:36:25.554260] I [MSGID: 108026] > [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: > Completed data selfheal on 4758d581-9de0-403b-af8b-bfd3d71d020d. > sources=[0] 2 sinks=1 > [2019-04-21 09:36:26.031422] I [MSGID: 108026] > [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] > 0-gvAA01-replicate-6: performing metadata selfheal on > 4758d581-9de0-403b-af8b-bfd3d71d020d > [2019-04-21 09:36:26.083982] I [MSGID: 108026] > [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: > Completed metadata selfheal on 4758d581-9de0-403b-af8b-bfd3d71d020d. 
> sources=[0] 2 sinks=1 > > On 02-B > ------- > [2019-04-21 09:03:15.815250] I [MSGID: 108026] > [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: > performing entry selfheal on 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01 > [2019-04-21 09:03:15.863153] W [MSGID: 108015] > [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: > expunging file > 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01/C_VOL-b003-i174-cd.md5.tmp > (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 > [2019-04-21 09:03:15.867432] I [MSGID: 108026] > [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: > performing entry selfheal on 65f6d9fb-a441-4e47-b91a-0936d11a8c8f > [2019-04-21 09:03:15.875134] W [MSGID: 108015] > [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: > expunging file > 65f6d9fb-a441-4e47-b91a-0936d11a8c8f/C_VOL-b001-i14937-cd.md5.tmp > (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 > [2019-04-21 09:03:39.020198] I [MSGID: 108026] > [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-5: > performing entry selfheal on 30547ab6-1fbd-422e-9c81-2009f9ff7ebe > [2019-04-21 09:03:39.027345] W [MSGID: 108015] > [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-5: > expunging file 30547ab6-1fbd-422e-9c81-2009f9ff7ebe/XXXXXXXX.vbm_346744_tmp > (00000000-0000-0000-0000-000000000000) on gvAA01-client-17 > > [2019-04-21 09:13:18.524874] I [MSGID: 108026] > [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: > performing entry selfheal on 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01 > [2019-04-21 09:13:20.070172] W [MSGID: 108015] > [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: > expunging file > 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01/C_VOL-b003-i174-cd.md5.tmp > (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 > [2019-04-21 09:13:20.074977] I [MSGID: 108026] > [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: > performing entry selfheal on 65f6d9fb-a441-4e47-b91a-0936d11a8c8f > [2019-04-21 09:13:20.080827] W [MSGID: 108015] > [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: > expunging file > 65f6d9fb-a441-4e47-b91a-0936d11a8c8f/C_VOL-b001-i14937-cd.md5.tmp > (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 > [2019-04-21 09:13:40.015763] I [MSGID: 108026] > [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-5: > performing entry selfheal on 30547ab6-1fbd-422e-9c81-2009f9ff7ebe > [2019-04-21 09:13:40.021805] W [MSGID: 108015] > [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-5: > expunging file 30547ab6-1fbd-422e-9c81-2009f9ff7ebe/XXXXXXXX.vbm_346744_tmp > (00000000-0000-0000-0000-000000000000) on gvAA01-client-17 > > [2019-04-21 09:23:21.991032] I [MSGID: 108026] > [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: > performing entry selfheal on 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01 > [2019-04-21 09:23:22.054565] W [MSGID: 108015] > [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: > expunging file > 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01/C_VOL-b003-i174-cd.md5.tmp > (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 > [2019-04-21 09:23:22.059225] I [MSGID: 108026] > [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: > performing entry selfheal on 65f6d9fb-a441-4e47-b91a-0936d11a8c8f > [2019-04-21 09:23:22.066266] W [MSGID: 108015] > 
[afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: > expunging file > 65f6d9fb-a441-4e47-b91a-0936d11a8c8f/C_VOL-b001-i14937-cd.md5.tmp > (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 > [2019-04-21 09:23:41.129962] I [MSGID: 108026] > [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-5: > performing entry selfheal on 30547ab6-1fbd-422e-9c81-2009f9ff7ebe > [2019-04-21 09:23:41.135919] W [MSGID: 108015] > [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-5: > expunging file 30547ab6-1fbd-422e-9c81-2009f9ff7ebe/XXXXXXXX.vbm_346744_tmp > (00000000-0000-0000-0000-000000000000) on gvAA01-client-17 > > [2019-04-21 09:33:24.015223] I [MSGID: 108026] > [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: > performing entry selfheal on 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01 > [2019-04-21 09:33:24.069686] W [MSGID: 108015] > [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: > expunging file > 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01/C_VOL-b003-i174-cd.md5.tmp > (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 > [2019-04-21 09:33:24.074341] I [MSGID: 108026] > [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: > performing entry selfheal on 65f6d9fb-a441-4e47-b91a-0936d11a8c8f > [2019-04-21 09:33:24.080065] W [MSGID: 108015] > [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: > expunging file > 65f6d9fb-a441-4e47-b91a-0936d11a8c8f/C_VOL-b001-i14937-cd.md5.tmp > (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 > [2019-04-21 09:33:42.099515] I [MSGID: 108026] > [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-5: > performing entry selfheal on 30547ab6-1fbd-422e-9c81-2009f9ff7ebe > [2019-04-21 09:33:42.107481] W [MSGID: 108015] > [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-5: > expunging file 30547ab6-1fbd-422e-9c81-2009f9ff7ebe/XXXXXXXX.vbm_346744_tmp > (00000000-0000-0000-0000-000000000000) on gvAA01-client-17 > > > On Sun, Apr 21, 2019 at 3:55 PM Patrick Rennie > wrote: > >> Just another small update, I'm continuing to watch my brick logs and I >> just saw these errors come up in the recent events too. I am going to >> continue to post any errors I see in the hope of finding the right one to >> try and fix.. >> This is from the logs on brick1, seems to be occurring on both nodes on >> brick1, although at different times. I'm not sure what this means, can >> anyone shed any light? >> I guess I am looking for some kind of specific error which may indicate >> something is broken or stuck and locking up and causing the extreme latency >> I'm seeing in the cluster. 
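A quick way to triage brick logs like the ones pasted below is to count error-level entries by message ID instead of reading them individually; something along these lines, assuming the default log location (the per-brick file name mirrors the brick path, so it may differ on this setup):

# grep " E \[" /var/log/glusterfs/bricks/*.log | grep -oE "MSGID: [0-9]+" | sort | uniq -c | sort -rn | head
# tail -f /var/log/glusterfs/bricks/brick1-gvAA01-brick.log | grep " E \["

The message IDs that repeat most often are usually a better lead than any single entry, since a hung operation tends to fail the same way over and over.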
>> >> [2019-04-21 07:25:55.064497] E [rpcsvc.c:1364:rpcsvc_submit_generic] >> 0-rpc-service: failed to submit message (XID: 0x7c700c, Program: GlusterFS >> 3.3, ProgVers: 330, Proc: 29) to rpc-transport (tcp.gvAA01-server) >> [2019-04-21 07:25:55.064612] E [server.c:195:server_submit_reply] >> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e58a) >> [0x7f3b3e93158a] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17d45) >> [0x7f3b3e4c5d45] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) >> [0x7f3b3e4b72cd] ) 0-: Reply submission failed >> [2019-04-21 07:25:55.064675] E [rpcsvc.c:1364:rpcsvc_submit_generic] >> 0-rpc-service: failed to submit message (XID: 0x7c70af, Program: GlusterFS >> 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >> [2019-04-21 07:25:55.064705] E [server.c:195:server_submit_reply] >> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) >> [0x7f3b3e9318fa] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) >> [0x7f3b3e4c5f35] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) >> [0x7f3b3e4b72cd] ) 0-: Reply submission failed >> [2019-04-21 07:25:55.064742] E [rpcsvc.c:1364:rpcsvc_submit_generic] >> 0-rpc-service: failed to submit message (XID: 0x7c723c, Program: GlusterFS >> 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >> [2019-04-21 07:25:55.064768] E [server.c:195:server_submit_reply] >> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) >> [0x7f3b3e9318fa] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) >> [0x7f3b3e4c5f35] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) >> [0x7f3b3e4b72cd] ) 0-: Reply submission failed >> [2019-04-21 07:25:55.064812] E [rpcsvc.c:1364:rpcsvc_submit_generic] >> 0-rpc-service: failed to submit message (XID: 0x7c72b4, Program: GlusterFS >> 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >> [2019-04-21 07:25:55.064837] E [server.c:195:server_submit_reply] >> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) >> [0x7f3b3e9318fa] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) >> [0x7f3b3e4c5f35] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) >> [0x7f3b3e4b72cd] ) 0-: Reply submission failed >> [2019-04-21 07:25:55.064880] E [rpcsvc.c:1364:rpcsvc_submit_generic] >> 0-rpc-service: failed to submit message (XID: 0x7c740b, Program: GlusterFS >> 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >> [2019-04-21 07:25:55.064905] E [server.c:195:server_submit_reply] >> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) >> [0x7f3b3e9318fa] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) >> [0x7f3b3e4c5f35] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) >> [0x7f3b3e4b72cd] ) 0-: Reply submission failed >> [2019-04-21 07:25:55.064939] E [rpcsvc.c:1364:rpcsvc_submit_generic] >> 0-rpc-service: failed to submit message (XID: 0x7c7441, Program: GlusterFS >> 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >> [2019-04-21 07:25:55.064962] E [server.c:195:server_submit_reply] >> 
(-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) >> [0x7f3b3e9318fa] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) >> [0x7f3b3e4c5f35] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) >> [0x7f3b3e4b72cd] ) 0-: Reply submission failed >> [2019-04-21 07:25:55.064996] E [rpcsvc.c:1364:rpcsvc_submit_generic] >> 0-rpc-service: failed to submit message (XID: 0x7c74d5, Program: GlusterFS >> 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >> [2019-04-21 07:25:55.065020] E [server.c:195:server_submit_reply] >> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) >> [0x7f3b3e9318fa] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) >> [0x7f3b3e4c5f35] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) >> [0x7f3b3e4b72cd] ) 0-: Reply submission failed >> [2019-04-21 07:25:55.065052] E [rpcsvc.c:1364:rpcsvc_submit_generic] >> 0-rpc-service: failed to submit message (XID: 0x7c7551, Program: GlusterFS >> 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >> [2019-04-21 07:25:55.065076] E [server.c:195:server_submit_reply] >> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) >> [0x7f3b3e9318fa] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) >> [0x7f3b3e4c5f35] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) >> [0x7f3b3e4b72cd] ) 0-: Reply submission failed >> [2019-04-21 07:25:55.065110] E [rpcsvc.c:1364:rpcsvc_submit_generic] >> 0-rpc-service: failed to submit message (XID: 0x7c76d1, Program: GlusterFS >> 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >> [2019-04-21 07:25:55.065133] E [server.c:195:server_submit_reply] >> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) >> [0x7f3b3e9318fa] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) >> [0x7f3b3e4c5f35] >> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) >> [0x7f3b3e4b72cd] ) 0-: Reply submission failed >> >> Thanks again, >> >> -Patrick >> >> On Sun, Apr 21, 2019 at 3:50 PM Patrick Rennie >> wrote: >> >>> Hi Darrell, >>> >>> Thanks again for your advice, I've left it for a while but unfortunately >>> it's still just as slow and causing more problems for our operations now. I >>> will need to try and take some steps to at least bring performance back to >>> normal while continuing to investigate the issue longer term. I can >>> definitely see one node with heavier CPU than the other, almost double, >>> which I am OK with, but I think the heal process is going to take forever, >>> trying to check the "gluster volume heal info" shows thousands and >>> thousands of files which may need healing, I have no idea how many in total >>> the command is still running after hours, so I am not sure what has gone so >>> wrong to cause this. >>> >>> I've checked cluster.op-version and cluster.max-op-version and it looks >>> like I'm on the latest version there. >>> >>> I have no idea how long the healing is going to take on this cluster, we >>> have around 560TB of data on here, but I don't think I can wait that long >>> to try and restore performance to normal. >>> >>> Can anyone think of anything else I can try in the meantime to work out >>> what's causing the extreme latency? 
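One more data point that can help localize latency like this is the built-in profiler, which reports per-brick, per-FOP call counts and latencies (it sounds like a profile was already captured earlier in this thread; the commands are roughly as follows, and profiling adds some overhead, so it is worth stopping it once a sample is saved):

# gluster volume profile gvAA01 start
# gluster volume profile gvAA01 info > /tmp/gvAA01-profile-$(date +%s).txt
# gluster volume profile gvAA01 stop

A single brick whose average LOOKUP or INODELK latency sits far above its peers is a strong hint about which brick process is worth restarting or digging into.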
>>> >>> I've been going through cluster client the logs of some of our VMs and >>> on some of our FTP servers I found this in the cluster mount log, but I am >>> not seeing it on any of our other servers, just our FTP servers. >>> >>> [2019-04-21 07:16:19.925388] E [MSGID: 101046] >>> [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null >>> [2019-04-21 07:19:43.413834] W [MSGID: 114031] >>> [client-rpc-fops.c:2203:client3_3_setattr_cbk] 0-gvAA01-client-19: remote >>> operation failed [No such file or directory] >>> [2019-04-21 07:19:43.414153] W [MSGID: 114031] >>> [client-rpc-fops.c:2203:client3_3_setattr_cbk] 0-gvAA01-client-20: remote >>> operation failed [No such file or directory] >>> [2019-04-21 07:23:33.154717] E [MSGID: 101046] >>> [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null >>> [2019-04-21 07:33:24.943913] E [MSGID: 101046] >>> [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null >>> >>> Any ideas what this could mean? I am basically just grasping at straws >>> here. >>> >>> I am going to hold off on the version upgrade until I know there are no >>> files which need healing, which could be a while, from some reading I've >>> done there shouldn't be any issues with this as both are on v3.12.x >>> >>> I've free'd up a small amount of space, but I still need to work on this >>> further. >>> >>> I've read of a command "find .glusterfs -type f -links -2 -exec rm {} >>> \;" which could be run on each brick and it would potentially clean up any >>> files which were deleted straight from the bricks, but not via the client, >>> I have a feeling this could help me free up about 5-10TB per brick from >>> what I've been told about the history of this cluster. Can anyone confirm >>> if this is actually safe to run? >>> >>> At this stage, I'm open to any suggestions as to how to proceed, thanks >>> again for any advice. >>> >>> Cheers, >>> >>> - Patrick >>> >>> On Sun, Apr 21, 2019 at 1:22 AM Darrell Budic >>> wrote: >>> >>>> Patrick, >>>> >>>> Sounds like progress. Be aware that gluster is expected to max out the >>>> CPUs on at least one of your servers while healing. This is normal and >>>> won?t adversely affect overall performance (any more than having bricks in >>>> need of healing, at any rate) unless you?re overdoing it. shd threads <= 4 >>>> should not do that on your hardware. Other tunings may have also increased >>>> overall performance, so you may see higher CPU than previously anyway. I?d >>>> recommend upping those thread counts and letting it heal as fast as >>>> possible, especially if these are dedicated Gluster storage servers (Ie: >>>> not also running VMs, etc). You should see ?normal? CPU use one heals are >>>> completed. I see ~15-30% overall normally, 95-98% while healing (x my 20 >>>> cores). It?s also likely to be different between your servers, in a pure >>>> replica, one tends to max and one tends to be a little higher, in a >>>> distributed-replica, I?d expect more than one to run harder while healing. >>>> >>>> Keep the differences between doing an ls on a brick and doing an ls on >>>> a gluster mount in mind. When you do a ls on a gluster volume, it isn?t >>>> just doing a ls on one brick, it?s effectively doing it on ALL of your >>>> bricks, and they all have to return data before the ls succeeds. In a >>>> distributed volume, it?s figuring out where on each volume things live and >>>> getting the stat() from each to assemble the whole thing. 
And if things are >>>> in need of healing, it will take even longer to decide which version is >>>> current and use it (shd triggers a heal anytime it encounters this). Any of >>>> these things being slow slows down the overall response. >>>> >>>> At this point, I?d get some sleep too, and let your cluster heal while >>>> you do. I?d really want it fully healed before I did any updates anyway, so >>>> let it use CPU and get itself sorted out. Expect it to do a round of >>>> healing after you upgrade each machine too, this is normal so don?t let the >>>> CPU spike surprise you, It?s just catching up from the downtime incurred by >>>> the update and/or reboot if you did one. >>>> >>>> That reminds me, check your gluster cluster.op-version and >>>> cluster.max-op-version (gluster vol get all all | grep op-version). If >>>> op-version isn?t at the max-op-verison, set it to it so you?re taking >>>> advantage of the latest features available to your version. >>>> >>>> -Darrell >>>> >>>> On Apr 20, 2019, at 11:54 AM, Patrick Rennie >>>> wrote: >>>> >>>> Hi Darrell, >>>> >>>> Thanks again for your advice, I've applied the acltype=posixacl on my >>>> zpools and I think that has reduced some of the noise from my brick logs. >>>> I also bumped up some of the thread counts you suggested but my CPU >>>> load skyrocketed, so I dropped it back down to something slightly lower, >>>> but still higher than it was before, and will see how that goes for a >>>> while. >>>> >>>> Although low space is a definite issue, if I run an ls anywhere on my >>>> bricks directly it's instant, <1 second, and still takes several minutes >>>> via gluster, so there is still a problem in my gluster configuration >>>> somewhere. We don't have any snapshots, but I am trying to work out if any >>>> data on there is safe to delete, or if there is any way I can safely find >>>> and delete data which has been removed directly from the bricks in the >>>> past. I also have lz4 compression already enabled on each zpool which does >>>> help a bit, we get between 1.05 and 1.08x compression on this data. >>>> I've tried to go through each client and checked it's cluster mount >>>> logs and also my brick logs and looking for errors, so far nothing is >>>> jumping out at me, but there are some warnings and errors here and there, I >>>> am trying to work out what they mean. >>>> >>>> It's already 1 am here and unfortunately, I'm still awake working on >>>> this issue, but I think that I will have to leave the version upgrades >>>> until tomorrow. >>>> >>>> Thanks again for your advice so far. If anyone has any ideas on where I >>>> can look for errors other than brick logs or the cluster mount logs to help >>>> resolve this issue, it would be much appreciated. >>>> >>>> Cheers, >>>> >>>> - Patrick >>>> >>>> On Sat, Apr 20, 2019 at 11:57 PM Darrell Budic >>>> wrote: >>>> >>>>> See inline: >>>>> >>>>> On Apr 20, 2019, at 10:09 AM, Patrick Rennie >>>>> wrote: >>>>> >>>>> Hi Darrell, >>>>> >>>>> Thanks for your reply, this issue seems to be getting worse over the >>>>> last few days, really has me tearing my hair out. I will do as you have >>>>> suggested and get started on upgrading from 3.12.14 to 3.12.15. >>>>> I've checked the zfs properties and all bricks have "xattr=sa" set, >>>>> but none of them has "acltype=posixacl" set, currently the acltype property >>>>> shows "off", if I make these changes will it apply retroactively to the >>>>> existing data? 
I'm unfamiliar with what this will change so I may need to >>>>> look into that before I proceed. >>>>> >>>>> >>>>> It is safe to apply that now, any new set/get calls will then use it >>>>> if new posixacls exist, and use older if not. ZFS is good that way. It >>>>> should clear up your posix_acl and posix errors over time. >>>>> >>>>> I understand performance is going to slow down as the bricks get full, >>>>> I am currently trying to free space and migrate data to some newer storage, >>>>> I have fresh several hundred TB storage I just setup recently but with >>>>> these performance issues it's really slow. I also believe there is >>>>> significant data which has been deleted directly from the bricks in the >>>>> past, so if I can reclaim this space in a safe manner then I will have at >>>>> least around 10-15% free space. >>>>> >>>>> >>>>> Full ZFS volumes will have a much larger impact on performance than >>>>> you?d think, I?d prioritize this. If you have been taking zfs snapshots, >>>>> consider deleting them to get the overall volume free space back up. And >>>>> just to be sure it?s been said, delete from within the mounted volumes, >>>>> don?t delete directly from the bricks (gluster will just try and heal it >>>>> later, compounding your issues). Does not apply to deleting other data from >>>>> the ZFS volume if it?s not part of the brick directory, of course. >>>>> >>>>> These servers have dual 8 core Xeon (E5-2620v4) and 512GB of RAM so >>>>> generally they have plenty of resources available, currently only using >>>>> around 330/512GB of memory. >>>>> >>>>> I will look into what your suggested settings will change, and then >>>>> will probably go ahead with your recommendations, for our specs as stated >>>>> above, what would you suggest for performance.io-thread-count ? >>>>> >>>>> >>>>> I run single 2630v4s on my servers, which have a smaller storage >>>>> footprint than yours. I?d go with 32 for performance.io-thread-count. >>>>> I?d try 4 for the shd thread settings on that gear. Your memory use sounds >>>>> fine, so no worries there. >>>>> >>>>> Our workload is nothing too extreme, we have a few VMs which write >>>>> backup data to this storage nightly for our clients, our VMs don't live on >>>>> this cluster, but just write to it. >>>>> >>>>> >>>>> If they are writing compressible data, you?ll get immediate benefit by >>>>> setting compression=lz4 on your ZFS volumes. It won?t help any old data, of >>>>> course, but it will compress new data going forward. This is another one >>>>> that?s safe to enable on the fly. >>>>> >>>>> I've been going through all of the logs I can, below are some slightly >>>>> sanitized errors I've come across, but I'm not sure what to make of them. >>>>> The main error I am seeing is the first one below, across several of my >>>>> bricks, but possibly only for specific folders on the cluster, I'm not 100% >>>>> about that yet though. 
>>>>> >>>>> [2019-04-20 05:56:59.512649] E [MSGID: 113001] >>>>> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >>>>> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >>>>> supported] >>>>> [2019-04-20 05:59:06.084333] E [MSGID: 113001] >>>>> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >>>>> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >>>>> supported] >>>>> [2019-04-20 05:59:43.289030] E [MSGID: 113001] >>>>> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >>>>> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >>>>> supported] >>>>> [2019-04-20 05:59:50.582257] E [MSGID: 113001] >>>>> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >>>>> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >>>>> supported] >>>>> [2019-04-20 06:01:42.501701] E [MSGID: 113001] >>>>> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >>>>> /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not >>>>> supported] >>>>> [2019-04-20 06:01:51.665354] W [posix.c:4929:posix_getxattr] >>>>> 0-gvAA01-posix: Extended attributes not supported (try remounting brick >>>>> with 'user_xattr' flag) >>>>> >>>>> >>>>> [2019-04-20 13:12:36.131856] E [MSGID: 113002] >>>>> [posix-helpers.c:893:posix_gfid_set] 0-gvAA01-posix: gfid is null for >>>>> /xxxxxxxxxxxxxxxxxxxx [Invalid argument] >>>>> [2019-04-20 13:12:36.131959] E [MSGID: 113002] >>>>> [posix.c:362:posix_lookup] 0-gvAA01-posix: buf->ia_gfid is null for >>>>> /brick2/xxxxxxxxxxxxxxxxxxxx_62906_tmp [No data available] >>>>> [2019-04-20 13:12:36.132016] E [MSGID: 115050] >>>>> [server-rpc-fops.c:175:server_lookup_cbk] 0-gvAA01-server: 24274759: LOOKUP >>>>> /xxxxxxxxxxxxxxxxxxxx (a7c9b4a0-b7ee-4d01-a79e-576013c8ac87/Cloud >>>>> Backup_clone1.vbm_62906_tmp), client: >>>>> 00-A-16217-2019/04/08-21:23:03:692424-gvAA01-client-4-0-3, error-xlator: >>>>> gvAA01-posix [No data available] >>>>> [2019-04-20 13:12:38.093719] E [MSGID: 115050] >>>>> [server-rpc-fops.c:175:server_lookup_cbk] 0-gvAA01-server: 24276491: LOOKUP >>>>> /xxxxxxxxxxxxxxxxxxxx (a7c9b4a0-b7ee-4d01-a79e-576013c8ac87/Cloud >>>>> Backup_clone1.vbm_62906_tmp), client: >>>>> 00-A-16217-2019/04/08-21:23:03:692424-gvAA01-client-4-0-3, error-xlator: >>>>> gvAA01-posix [No data available] >>>>> [2019-04-20 13:12:38.093660] E [MSGID: 113002] >>>>> [posix-helpers.c:893:posix_gfid_set] 0-gvAA01-posix: gfid is null for >>>>> /xxxxxxxxxxxxxxxxxxxx [Invalid argument] >>>>> [2019-04-20 13:12:38.093696] E [MSGID: 113002] >>>>> [posix.c:362:posix_lookup] 0-gvAA01-posix: buf->ia_gfid is null for >>>>> /brick2/xxxxxxxxxxxxxxxxxxxx [No data available] >>>>> >>>>> >>>>> posixacls should clear those up, as mentioned. 
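For reference, the ZFS property changes discussed above would look something like this on each brick dataset (pool/dataset names are placeholders; the properties only affect xattr and ACL operations from the time they are set):

# zfs get xattr,acltype,compression pool/brick1
# zfs set xattr=sa pool/brick1
# zfs set acltype=posixacl pool/brick1
# zfs set compression=lz4 pool/brick1

As noted above, existing data is not rewritten, so the posix_acl errors should taper off rather than stop instantly.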
>>>>> >>>>> >>>>> [2019-04-20 14:25:59.654576] E [inodelk.c:404:__inode_unlock_lock] >>>>> 0-gvAA01-locks: Matching lock not found for unlock 0-9223372036854775807, >>>>> by 980fdbbd367f0000 on 0x7fc4f0161440 >>>>> [2019-04-20 14:25:59.654668] E [MSGID: 115053] >>>>> [server-rpc-fops.c:295:server_inodelk_cbk] 0-gvAA01-server: 6092928: >>>>> INODELK /xxxxxxxxxxxxxxxxxxxx.cdr$ (25b14631-a179-4274-8243-6e272d4f2ad8), >>>>> client: >>>>> cb-per-worker18-53637-2019/04/19-14:25:37:927673-gvAA01-client-1-0-4, >>>>> error-xlator: gvAA01-locks [Invalid argument] >>>>> >>>>> >>>>> [2019-04-20 13:35:07.495495] E [rpcsvc.c:1364:rpcsvc_submit_generic] >>>>> 0-rpc-service: failed to submit message (XID: 0x247c644, Program: GlusterFS >>>>> 3.3, ProgVers: 330, Proc: 27) to rpc-transport (tcp.gvAA01-server) >>>>> [2019-04-20 13:35:07.495619] E [server.c:195:server_submit_reply] >>>>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/debug/io-stats.so(+0x1696a) >>>>> [0x7ff4ae6f796a] >>>>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/protocol/server.so(+0x2d6e8) >>>>> [0x7ff4ae2a96e8] >>>>> -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/protocol/server.so(+0x928d) >>>>> [0x7ff4ae28528d] ) 0-: Reply submission failed >>>>> >>>>> >>>>> Fix the posix acls and see if these clear up over time as well, I?m >>>>> unclear on what the overall effect of running without the posix acls will >>>>> be to total gluster health. Your biggest problem sounds like you need to >>>>> free up space on the volumes and get the overall volume health back up to >>>>> par and see if that doesn?t resolve the symptoms you?re seeing. >>>>> >>>>> >>>>> >>>>> Thank you again for your assistance. It is greatly appreciated. >>>>> >>>>> - Patrick >>>>> >>>>> >>>>> >>>>> On Sat, Apr 20, 2019 at 10:50 PM Darrell Budic >>>>> wrote: >>>>> >>>>>> Patrick, >>>>>> >>>>>> I would definitely upgrade your two nodes from 3.12.14 to 3.12.15. >>>>>> You also mention ZFS, and that error you show makes me think you need to >>>>>> check to be sure you have ?xattr=sa? and ?acltype=posixacl? set on your ZFS >>>>>> volumes. >>>>>> >>>>>> You also observed your bricks are crossing the 95% full line, ZFS >>>>>> performance will degrade significantly the closer you get to full. In my >>>>>> experience, this starts somewhere between 10% and 5% free space remaining, >>>>>> so you?re in that realm. >>>>>> >>>>>> How?s your free memory on the servers doing? Do you have your zfs arc >>>>>> cache limited to something less than all the RAM? It shares pretty well, >>>>>> but I?ve encountered situations where other things won?t try and take ram >>>>>> back properly if they think it?s in use, so ZFS never gets the opportunity >>>>>> to give it up. >>>>>> >>>>>> Since your volume is a disperse-replica, you might try tuning >>>>>> disperse.shd-max-threads, default is 1, I?d try it at 2, 4, or even more if >>>>>> the CPUs are beefy enough. And setting server.event-threads to 4 and >>>>>> client.event-threads to 8 has proven helpful in many cases. After you get >>>>>> upgraded to 3.12.15, enabling performance.stat-prefetch may help as well. I >>>>>> don?t know if it matters, but I?d also recommend resetting >>>>>> performance.least-prio-threads to the default of 1 (or try 2 or 4) and/or >>>>>> also setting performance.io-thread-count to 32 if those have beefy >>>>>> CPUs. 
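As a concrete sketch, the tunings suggested above map to volume-set commands along these lines (values are the suggestions from this thread rather than defaults; for the distributed-replicate volume discussed here the self-heal knobs live under cluster.* rather than disperse.*):

# gluster volume set gvAA01 cluster.shd-max-threads 4
# gluster volume set gvAA01 server.event-threads 4
# gluster volume set gvAA01 client.event-threads 8
# gluster volume set gvAA01 performance.io-thread-count 32
# gluster volume set gvAA01 performance.least-prio-threads 1
# gluster volume set gvAA01 performance.stat-prefetch on

These can be applied to a live volume; it is worth confirming each with 'gluster volume get gvAA01 <option>' afterwards.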
>>>>>> >>>>>> Beyond those general ideas, more info about your hardware (CPU and >>>>>> RAM) and workload (VMs, direct storage for web servers or enders, etc) may >>>>>> net you some more ideas. Then you?re going to have to do more digging into >>>>>> brick logs looking for errors and/or warnings to see what?s going on. >>>>>> >>>>>> -Darrell >>>>>> >>>>>> >>>>>> On Apr 20, 2019, at 8:22 AM, Patrick Rennie >>>>>> wrote: >>>>>> >>>>>> Hello Gluster Users, >>>>>> >>>>>> I am hoping someone can help me with resolving an ongoing issue I've >>>>>> been having, I'm new to mailing lists so forgive me if I have gotten >>>>>> anything wrong. We have noticed our performance deteriorating over the last >>>>>> few weeks, easily measured by trying to do an ls on one of our top-level >>>>>> folders, and timing it, which usually would take 2-5 seconds, and now takes >>>>>> up to 20 minutes, which obviously renders our cluster basically unusable. >>>>>> This has been intermittent in the past but is now almost constant and I am >>>>>> not sure how to work out the exact cause. We have noticed some errors in >>>>>> the brick logs, and have noticed that if we kill the right brick process, >>>>>> performance instantly returns back to normal, this is not always the same >>>>>> brick, but it indicates to me something in the brick processes or >>>>>> background tasks may be causing extreme latency. Due to this ability to fix >>>>>> it by killing the right brick process off, I think it's a specific file, or >>>>>> folder, or operation which may be hanging and causing the increased >>>>>> latency, but I am not sure how to work it out. One last thing to add is >>>>>> that our bricks are getting quite full (~95% full), we are trying to >>>>>> migrate data off to new storage but that is going slowly, not helped by >>>>>> this issue. I am currently trying to run a full heal as there appear to be >>>>>> many files needing healing, and I have all brick processes running so they >>>>>> have an opportunity to heal, but this means performance is very poor. It >>>>>> currently takes over 15-20 minutes to do an ls of one of our top-level >>>>>> folders, which just contains 60-80 other folders, this should take 2-5 >>>>>> seconds. This is all being checked by FUSE mount locally on the storage >>>>>> node itself, but it is the same for other clients and VMs accessing the >>>>>> cluster. Initially, it seemed our NFS mounts were not affected and operated >>>>>> at normal speed, but testing over the last day has shown that our NFS >>>>>> clients are also extremely slow, so it doesn't seem specific to FUSE as I >>>>>> first thought it might be. >>>>>> >>>>>> I am not sure how to proceed from here, I am fairly new to gluster >>>>>> having inherited this setup from my predecessor and trying to keep it >>>>>> going. I have included some info below to try and help with diagnosis, >>>>>> please let me know if any further info would be helpful. I would really >>>>>> appreciate any advice on what I could try to work out the cause. Thank you >>>>>> in advance for reading this, and any suggestions you might be able to >>>>>> offer. 
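The measurement described above can be captured as a side-by-side timing of the same directory through the FUSE mount and directly on one of its bricks (paths here are examples; the real brick paths appear in the volume status further down):

# time ls /mnt/gvAA01/top-level-folder | wc -l
# time ls /brick1/gvAA01/brick/top-level-folder | wc -l

A large gap between the two points at the client-side graph, the network, or self-heal contention rather than at the underlying ZFS bricks themselves.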
>>>>>> >>>>>> - Patrick >>>>>> >>>>>> This is an example of the main error I see in our brick logs, there >>>>>> have been others, I can post them when I see them again too: >>>>>> [2019-04-20 04:54:43.055680] E [MSGID: 113001] >>>>>> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >>>>>> /brick1/ library: system.posix_acl_default [Operation not >>>>>> supported] >>>>>> [2019-04-20 05:01:29.476313] W [posix.c:4929:posix_getxattr] >>>>>> 0-gvAA01-posix: Extended attributes not supported (try remounting brick >>>>>> with 'user_xattr' flag) >>>>>> >>>>>> Our setup consists of 2 storage nodes and an arbiter node. I have >>>>>> noticed our nodes are on slightly different versions, I'm not sure if this >>>>>> could be an issue. We have 9 bricks on each node, made up of ZFS RAIDZ2 >>>>>> pools - total capacity is around 560TB. >>>>>> We have bonded 10gbps NICS on each node, and I have tested bandwidth >>>>>> with iperf and found that it's what would be expected from this config. >>>>>> Individual brick performance seems ok, I've tested several bricks >>>>>> using dd and can write a 10GB files at 1.7GB/s. >>>>>> >>>>>> # dd if=/dev/zero of=/brick1/test/test.file bs=1M count=10000 >>>>>> 10000+0 records in >>>>>> 10000+0 records out >>>>>> 10485760000 bytes (10 GB, 9.8 GiB) copied, 6.20303 s, 1.7 GB/s >>>>>> >>>>>> Node 1: >>>>>> # glusterfs --version >>>>>> glusterfs 3.12.15 >>>>>> >>>>>> Node 2: >>>>>> # glusterfs --version >>>>>> glusterfs 3.12.14 >>>>>> >>>>>> Arbiter: >>>>>> # glusterfs --version >>>>>> glusterfs 3.12.14 >>>>>> >>>>>> Here is our gluster volume status: >>>>>> >>>>>> # gluster volume status >>>>>> Status of volume: gvAA01 >>>>>> Gluster process TCP Port RDMA Port >>>>>> Online Pid >>>>>> >>>>>> ------------------------------------------------------------------------------ >>>>>> Brick 01-B:/brick1/gvAA01/brick 49152 0 Y 7219 >>>>>> Brick 02-B:/brick1/gvAA01/brick 49152 0 Y 21845 >>>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>>> ck1 49152 0 Y >>>>>> 6931 >>>>>> Brick 01-B:/brick2/gvAA01/brick 49153 0 Y 7239 >>>>>> Brick 02-B:/brick2/gvAA01/brick 49153 0 Y 9916 >>>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>>> ck2 49153 0 Y >>>>>> 6939 >>>>>> Brick 01-B:/brick3/gvAA01/brick 49154 0 Y 7235 >>>>>> Brick 02-B:/brick3/gvAA01/brick 49154 0 Y 21858 >>>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>>> ck3 49154 0 Y >>>>>> 6947 >>>>>> Brick 01-B:/brick4/gvAA01/brick 49155 0 Y 31840 >>>>>> Brick 02-B:/brick4/gvAA01/brick 49155 0 Y 9933 >>>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>>> ck4 49155 0 Y >>>>>> 6956 >>>>>> Brick 01-B:/brick5/gvAA01/brick 49156 0 Y 7233 >>>>>> Brick 02-B:/brick5/gvAA01/brick 49156 0 Y 9942 >>>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>>> ck5 49156 0 Y >>>>>> 6964 >>>>>> Brick 01-B:/brick6/gvAA01/brick 49157 0 Y 7234 >>>>>> Brick 02-B:/brick6/gvAA01/brick 49157 0 Y 9952 >>>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>>> ck6 49157 0 Y >>>>>> 6974 >>>>>> Brick 01-B:/brick7/gvAA01/brick 49158 0 Y 7248 >>>>>> Brick 02-B:/brick7/gvAA01/brick 49158 0 Y 9960 >>>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>>> ck7 49158 0 Y >>>>>> 6984 >>>>>> Brick 01-B:/brick8/gvAA01/brick 49159 0 Y 7253 >>>>>> Brick 02-B:/brick8/gvAA01/brick 49159 0 Y 9970 >>>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>>> ck8 49159 0 Y >>>>>> 6993 >>>>>> Brick 01-B:/brick9/gvAA01/brick 49160 0 Y 7245 >>>>>> Brick 02-B:/brick9/gvAA01/brick 49160 0 Y 9984 >>>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>>> ck9 49160 0 Y >>>>>> 7001 >>>>>> NFS Server on localhost 2049 0 Y >>>>>> 17276 >>>>>> Self-heal 
Daemon on localhost N/A N/A Y >>>>>> 25245 >>>>>> NFS Server on 02-B 2049 0 Y 9089 >>>>>> Self-heal Daemon on 02-B N/A N/A Y 17838 >>>>>> NFS Server on 00-a 2049 0 Y 15660 >>>>>> Self-heal Daemon on 00-a N/A N/A Y 16218 >>>>>> >>>>>> Task Status of Volume gvAA01 >>>>>> >>>>>> ------------------------------------------------------------------------------ >>>>>> There are no active volume tasks >>>>>> >>>>>> And gluster volume info: >>>>>> >>>>>> # gluster volume info >>>>>> >>>>>> Volume Name: gvAA01 >>>>>> Type: Distributed-Replicate >>>>>> Volume ID: ca4ece2c-13fe-414b-856c-2878196d6118 >>>>>> Status: Started >>>>>> Snapshot Count: 0 >>>>>> Number of Bricks: 9 x (2 + 1) = 27 >>>>>> Transport-type: tcp >>>>>> Bricks: >>>>>> Brick1: 01-B:/brick1/gvAA01/brick >>>>>> Brick2: 02-B:/brick1/gvAA01/brick >>>>>> Brick3: 00-A:/arbiterAA01/gvAA01/brick1 (arbiter) >>>>>> Brick4: 01-B:/brick2/gvAA01/brick >>>>>> Brick5: 02-B:/brick2/gvAA01/brick >>>>>> Brick6: 00-A:/arbiterAA01/gvAA01/brick2 (arbiter) >>>>>> Brick7: 01-B:/brick3/gvAA01/brick >>>>>> Brick8: 02-B:/brick3/gvAA01/brick >>>>>> Brick9: 00-A:/arbiterAA01/gvAA01/brick3 (arbiter) >>>>>> Brick10: 01-B:/brick4/gvAA01/brick >>>>>> Brick11: 02-B:/brick4/gvAA01/brick >>>>>> Brick12: 00-A:/arbiterAA01/gvAA01/brick4 (arbiter) >>>>>> Brick13: 01-B:/brick5/gvAA01/brick >>>>>> Brick14: 02-B:/brick5/gvAA01/brick >>>>>> Brick15: 00-A:/arbiterAA01/gvAA01/brick5 (arbiter) >>>>>> Brick16: 01-B:/brick6/gvAA01/brick >>>>>> Brick17: 02-B:/brick6/gvAA01/brick >>>>>> Brick18: 00-A:/arbiterAA01/gvAA01/brick6 (arbiter) >>>>>> Brick19: 01-B:/brick7/gvAA01/brick >>>>>> Brick20: 02-B:/brick7/gvAA01/brick >>>>>> Brick21: 00-A:/arbiterAA01/gvAA01/brick7 (arbiter) >>>>>> Brick22: 01-B:/brick8/gvAA01/brick >>>>>> Brick23: 02-B:/brick8/gvAA01/brick >>>>>> Brick24: 00-A:/arbiterAA01/gvAA01/brick8 (arbiter) >>>>>> Brick25: 01-B:/brick9/gvAA01/brick >>>>>> Brick26: 02-B:/brick9/gvAA01/brick >>>>>> Brick27: 00-A:/arbiterAA01/gvAA01/brick9 (arbiter) >>>>>> Options Reconfigured: >>>>>> cluster.shd-max-threads: 4 >>>>>> performance.least-prio-threads: 16 >>>>>> cluster.readdir-optimize: on >>>>>> performance.quick-read: off >>>>>> performance.stat-prefetch: off >>>>>> cluster.data-self-heal: on >>>>>> cluster.lookup-unhashed: auto >>>>>> cluster.lookup-optimize: on >>>>>> cluster.favorite-child-policy: mtime >>>>>> server.allow-insecure: on >>>>>> transport.address-family: inet >>>>>> client.bind-insecure: on >>>>>> cluster.entry-self-heal: off >>>>>> cluster.metadata-self-heal: off >>>>>> performance.md-cache-timeout: 600 >>>>>> cluster.self-heal-daemon: enable >>>>>> performance.readdir-ahead: on >>>>>> diagnostics.brick-log-level: INFO >>>>>> nfs.disable: off >>>>>> >>>>>> Thank you for any assistance. >>>>>> >>>>>> - Patrick >>>>>> _______________________________________________ >>>>>> Gluster-users mailing list >>>>>> Gluster-users at gluster.org >>>>>> https://lists.gluster.org/mailman/listinfo/gluster-users >>>>>> >>>>>> >>>>>> >>>>> >>>> _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgurusid at redhat.com Tue Apr 23 13:34:06 2019 From: pgurusid at redhat.com (Poornima Gurusiddaiah) Date: Tue, 23 Apr 2019 19:04:06 +0530 Subject: [Gluster-users] Upgrade 5.5 -> 5.6: network traffic bug fixed? 
In-Reply-To: References: Message-ID: Hi, Thank you for the update, sorry for the delay. I did some more tests, but couldn't see the behaviour of spiked network bandwidth usage when quick-read is on. After upgrading, have you remounted the clients? As in the fix will not be effective until the process is restarted. If you have already restarted the client processes, then there must be something related to workload in the live system that is triggering a bug in quick-read. Would need wireshark capture if possible, to debug further. Regards, Poornima On Tue, Apr 16, 2019 at 6:25 PM Hu Bert wrote: > Hi Poornima, > > thx for your efforts. I made a couple of tests and the results are the > same, so the options are not related. Anyway, i'm not able to > reproduce the problem on my testing system, although the volume > options are the same. > > About 1.5 hours ago i set performance.quick-read to on again and > watched: load/iowait went up (not bad at the moment, little traffic), > but network traffic went up - from <20 MBit/s up to 160 MBit/s. After > deactivating quick-read traffic dropped to < 20 MBit/s again. > > munin graph: https://abload.de/img/network-client4s0kle.png > > The 2nd peak is from the last test. > > > Thx, > Hubert > > Am Di., 16. Apr. 2019 um 09:43 Uhr schrieb Hu Bert >: > > > > In my first test on my testing setup the traffic was on a normal > > level, so i thought i was "safe". But on my live system the network > > traffic was a multiple of the traffic one would expect. > > performance.quick-read was enabled in both, the only difference in the > > volume options between live and testing are: > > > > performance.read-ahead: testing on, live off > > performance.io-cache: testing on, live off > > > > I ran another test on my testing setup, deactivated both and copied 9 > > GB of data. Now the traffic went up as well, from before ~9-10 MBit/s > > up to 100 MBit/s with both options off. Does performance.quick-read > > require one of those options set to 'on'? > > > > I'll start another test shortly, and activate on of those 2 options, > > maybe there's a connection between those 3 options? > > > > > > Best Regards, > > Hubert > > > > Am Di., 16. Apr. 2019 um 08:57 Uhr schrieb Poornima Gurusiddaiah > > : > > > > > > Thank you for reporting this. I had done testing on my local setup and > the issue was resolved even with quick-read enabled. Let me test it again. > > > > > > Regards, > > > Poornima > > > > > > On Mon, Apr 15, 2019 at 12:25 PM Hu Bert > wrote: > > >> > > >> fyi: after setting performance.quick-read to off network traffic > > >> dropped to normal levels, client load/iowait back to normal as well. > > >> > > >> client: https://abload.de/img/network-client-afterihjqi.png > > >> server: https://abload.de/img/network-server-afterwdkrl.png > > >> > > >> Am Mo., 15. Apr. 2019 um 08:33 Uhr schrieb Hu Bert < > revirii at googlemail.com>: > > >> > > > >> > Good Morning, > > >> > > > >> > today i updated my replica 3 setup (debian stretch) from version 5.5 > > >> > to 5.6, as i thought the network traffic bug (#1673058) was fixed > and > > >> > i could re-activate 'performance.quick-read' again. See release > notes: > > >> > > > >> > https://review.gluster.org/#/c/glusterfs/+/22538/ > > >> > > http://git.gluster.org/cgit/glusterfs.git/commit/?id=34a2347780c2429284f57232f3aabb78547a9795 > > >> > > > >> > Upgrade went fine, and then i was watching iowait and network > traffic. 
> > >> > It seems that the network traffic went up after upgrade and > > >> > reactivation of performance.quick-read. Here are some graphs: > > >> > > > >> > network client1: https://abload.de/img/network-clientfwj1m.png > > >> > network client2: https://abload.de/img/network-client2trkow.png > > >> > network server: https://abload.de/img/network-serverv3jjr.png > > >> > > > >> > gluster volume info: https://pastebin.com/ZMuJYXRZ > > >> > > > >> > Just wondering if the network traffic bug really got fixed or if > this > > >> > is a new problem. I'll wait a couple of minutes and then deactivate > > >> > performance.quick-read again, just to see if network traffic goes > down > > >> > to normal levels. > > >> > > > >> > > > >> > Best regards, > > >> > Hubert > > >> _______________________________________________ > > >> Gluster-users mailing list > > >> Gluster-users at gluster.org > > >> https://lists.gluster.org/mailman/listinfo/gluster-users > -------------- next part -------------- An HTML attachment was scrubbed... URL: From budic at onholyground.com Tue Apr 23 15:16:25 2019 From: budic at onholyground.com (Darrell Budic) Date: Tue, 23 Apr 2019 10:16:25 -0500 Subject: [Gluster-users] Proposal: Changes in Gluster Community meetings In-Reply-To: References: <62104B6F-99CF-4C22-80FC-9C177F73E897@onholyground.com> Message-ID: <907BA003-F786-46CF-A31B-38C93CE9BB20@onholyground.com> I was one of the folk who wanted a NA/EMEA scheduled meeting, and I?m going to have to miss it due to some real life issues (clogged sewer I?m going to have to be dealing with at the time). Apologies, I?ll work on making the next one. -Darrell > On Apr 22, 2019, at 4:20 PM, FNU Raghavendra Manjunath wrote: > > > Hi, > > This is the agenda for tomorrow's community meeting for NA/EMEA timezone. > > https://hackmd.io/OqZbh7gfQe6uvVUXUVKJ5g?both > ---- > > > > On Thu, Apr 11, 2019 at 4:56 AM Amar Tumballi Suryanarayan > wrote: > Hi All, > > Below is the final details of our community meeting, and I will be sending invites to mailing list following this email. You can add Gluster Community Calendar so you can get notifications on the meetings. > > We are starting the meetings from next week. For the first meeting, we need 1 volunteer from users to discuss the use case / what went well, and what went bad, etc. preferrably in APAC region. NA/EMEA region, next week. > > Draft Content: https://hackmd.io/OqZbh7gfQe6uvVUXUVKJ5g > ---- > Gluster Community Meeting > > Previous Meeting minutes: > > http://github.com/gluster/community > Date/Time: Check the community calendar > Bridge > > APAC friendly hours > Bridge: https://bluejeans.com/836554017 > NA/EMEA > Bridge: https://bluejeans.com/486278655 > Attendance > > Name, Company > Host > > Who will host next meeting? > Host will need to send out the agenda 24hr - 12hrs in advance to mailing list, and also make sure to send the meeting minutes. > Host will need to reach out to one user at least who can talk about their usecase, their experience, and their needs. > Host needs to send meeting minutes as PR to http://github.com/gluster/community > User stories > > Discuss 1 usecase from a user. > How was the architecture derived, what volume type used, options, etc? > What were the major issues faced ? How to improve them? > What worked good? > How can we all collaborate well, so it is win-win for the community and the user? How can we > Community > > Any release updates? > > Blocker issues across the project? > > Metrics > > Number of new bugs since previous meeting. 
How many are not triaged? > Number of emails, anything unanswered? > Conferences / Meetups > > Any conference in next 1 month where gluster-developers are going? gluster-users are going? So we can meet and discuss. > Developer focus > > Any design specs to discuss? > > Metrics of the week? > > Coverity > Clang-Scan > Number of patches from new developers. > Did we increase test coverage? > [Atin] Also talk about most frequent test failures in the CI and carve out an AI to get them fixed. > RoundTable > > > ---- > > Regards, > Amar > > On Mon, Mar 25, 2019 at 8:53 PM Amar Tumballi Suryanarayan > wrote: > Thanks for the feedback Darrell, > > The new proposal is to have one in North America 'morning' time. (10AM PST), And another in ASIA day time, which is evening 7pm/6pm in Australia, 9pm Newzealand, 5pm Tokyo, 4pm Beijing. > > For example, if we choose Every other Tuesday for meeting, and 1st of the month is Tuesday, we would have North America time for 1st, and on 15th it would be ASIA/Pacific time. > > Hopefully, this way, we can cover all the timezones, and meeting minutes would be committed to github repo, so that way, it will be easier for everyone to be aware of what is happening. > > Regards, > Amar > > On Mon, Mar 25, 2019 at 8:40 PM Darrell Budic > wrote: > As a user, I?d like to visit more of these, but the time slot is my 3AM. Any possibility for a rolling schedule (move meeting +6 hours each week with rolling attendance from maintainers?) or an occasional regional meeting 12 hours opposed to the one you?re proposing? > > -Darrell > >> On Mar 25, 2019, at 4:25 AM, Amar Tumballi Suryanarayan > wrote: >> >> All, >> >> We currently have 3 meetings which are public: >> >> 1. Maintainer's Meeting >> >> - Runs once in 2 weeks (on Mondays), and current attendance is around 3-5 on an avg, and not much is discussed. >> - Without majority attendance, we can't take any decisions too. >> >> 2. Community meeting >> >> - Supposed to happen on #gluster-meeting, every 2 weeks, and is the only meeting which is for 'Community/Users'. Others are for developers as of now. >> Sadly attendance is getting closer to 0 in recent times. >> >> 3. GCS meeting >> >> - We started it as an effort inside Red Hat gluster team, and opened it up for community from Jan 2019, but the attendance was always from RHT members, and haven't seen any traction from wider group. >> >> So, I have a proposal to call out for cancelling all these meeting, and keeping just 1 weekly 'Community' meeting, where even topics related to maintainers and GCS and other projects can be discussed. >> >> I have a template of a draft template @ https://hackmd.io/OqZbh7gfQe6uvVUXUVKJ5g >> >> Please feel free to suggest improvements, both in agenda and in timings. So, we can have more participation from members of community, which allows more user - developer interactions, and hence quality of project. 
>> >> Waiting for feedbacks, >> >> Regards, >> Amar >> >> >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users > > > -- > Amar Tumballi (amarts) > > > -- > Amar Tumballi (amarts) > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From cody at platform9.com Fri Apr 19 02:33:16 2019 From: cody at platform9.com (Cody Hill) Date: Thu, 18 Apr 2019 21:33:16 -0500 Subject: [Gluster-users] GlusterFS on ZFS In-Reply-To: <085deed5-f048-4baa-84f8-1f6ef1436a5b@email.android.com> References: <085deed5-f048-4baa-84f8-1f6ef1436a5b@email.android.com> Message-ID: Thanks for the info Karli, I wasn?t aware ZFS Dedup was such a dog. I guess I?ll leave that off. My data get?s 3.5:1 savings on compression alone. I was aware of stripped sets. I will be doing 6x Striped sets across 12x disks. On top of this design I?m going to try and test Intel Optane DIMM (512GB) as a ?Tier? for GlusterFS to try and get further write acceleration. And issues with GlusterFS ?Tier? functionality that anyone is aware of? Thank you, Cody Hill > On Apr 18, 2019, at 2:32 AM, Karli Sj?berg wrote: > > > > Den 17 apr. 2019 16:30 skrev Cody Hill : > Hey folks. > > I?m looking to deploy GlusterFS to host some VMs. I?ve done a lot of reading and would like to implement Deduplication and Compression in this setup. My thought would be to run ZFS to handle the Compression and Deduplication. > > You _really_ don't want ZFS doing dedup for any reason. > > > ZFS would give me the following benefits: > 1. If a single disk fails rebuilds happen locally instead of over the network > 2. Zil & L2Arc should add a slight performance increase > > Adding two really good NVME SSD's as a mirrored SLOG vdev does a huge deal for synchronous write performance, turning every random write into large streams that the spinning drives handle better. > > Don't know how picky Gluster is about synchronicity though, most "performance" tweaking suggests setting stuff to async, which I wouldn't recommend, but it's a huge boost for throughput obviously; not having to wait for stuff to actually get written, but it's dangerous. > > With mirrored NVME SLOG's, you could probably get that throughput without going asynchronous, which saves you from potential data corruption in a sudden power loss. > > L2ARC on the other hand does a bit for read latency, but for a general purpose file server- in practice- not a huge difference, the working set is just too large. Also keep in mind that L2ARC isn't "free". You need more RAM to know where you've cached stuff... > > 3. Deduplication and Compression are inline and have pretty good performance with modern hardware (Intel Skylake) > > ZFS deduplication has terrible performance. Watch your throughput automatically drop from hundreds or thousands of MB/s down to, like 5. It's a feature;) > > 4. Automated Snapshotting > > I can then layer GlusterFS on top to handle distribution to allow 3x Replicas of my storage. > My question is? Why aren?t more people doing this? Is this a horrible idea for some reason that I?m missing? 
> > While it could save a lot of space in some hypothetical instance, the drawbacks can never motivate it. E.g. if you want one node to suddenly die and never recover because of RAM exhaustion, go with ZFS dedup ;) > > I?d be very interested to hear your thoughts. > > Avoid ZFS dedup at all costs. LZ4 compression on the hand is awesome, definitely use that! It's basically a free performance enhancer the also saves space :) > > As another person has said, the best performance layout is RAID10- striped mirrors. I understand you'd want to get as much volume as possible with RAID-Z/RAID(5|6) since gluster also replicates/distributes, but it has a huge impact on IOPS. If performance is the main concern, do striped mirrors with replica 3 in Gluster. My advice is to test thoroughly with different pool layouts to see what gives acceptable performance against your volume requirements. > > /K > > > Additional thoughts: > I?d like to use Ganesha pNFS to connect to this storage. (Any issues here?) > I think I?d need KeepAliveD across these 3x nodes to store in the FSTAB (Is this correct?) > I?m also thinking about creating a ?Gluster Tier? of 512GB of Intel Optane DIMM to really smooth out write latencies? Any issues here? > > Thank you, > Cody Hill > > > > > > > > > > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users > -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrickmrennie at gmail.com Sat Apr 20 05:28:44 2019 From: patrickmrennie at gmail.com (Patrick Rennie) Date: Sat, 20 Apr 2019 13:28:44 +0800 Subject: [Gluster-users] Extremely slow Gluster performance Message-ID: Hello Gluster Users, I am hoping someone can help me with resolving an ongoing issue I've been having, I'm new to mailing lists so forgive me if I have gotten anything wrong. We have noticed our performance deteriorating over the last few weeks, easily measured by trying to do an ls on one of our top-level folders, and timing it, which usually would take 2-5 seconds, and now takes up to 20 minutes, which obviously renders our cluster basically unusable. This has been intermittent in the past but is now almost constant and I am not sure how to work out the exact cause. We have noticed some errors in the brick logs, and have noticed that if we kill the right brick process, performance instantly returns back to normal, this is not always the same brick, but it indicates to me something in the brick processes or background tasks may be causing extreme latency. Due to this ability to fix it by killing the right brick process off, I think it's a specific file, or folder, or operation which may be hanging and causing the increased latency, but I am not sure how to work it out. One last thing to add is that our bricks are getting quite full (~95% full), we are trying to migrate data off to new storage but that is going slowly, not helped by this issue. I am currently trying to run a full heal as there appear to be many files needing healing, and I have all brick processes running so they have an opportunity to heal, but this means performance is very poor. It currently takes over 15-20 minutes to do an ls of one of our top-level folders, which just contains 60-80 other folders, this should take 2-5 seconds. This is all being checked by FUSE mount locally on the storage node itself, but it is the same for other clients and VMs accessing the cluster. 
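A minimal sketch of how the slowdown described here can be quantified before deciding which brick process to touch; the mount point /mnt/gvAA01 and the folder name are placeholders, and iostat assumes the sysstat package is installed:

# time ls -l /mnt/gvAA01/<top-level-folder> > /dev/null
      (repeat a few times and note the wall-clock time)
# gluster volume status gvAA01
      (the Pid column maps each brick to its glusterfsd process)
# iostat -dxm 5
      (watch per-disk utilisation to spot the brick whose pool is saturated)
# top -p <pid-of-suspect-glusterfsd>
      (CPU and state of that one brick process)

Comparing the timing from the FUSE mount with a plain ls of the same directory taken directly on a brick path helps separate cluster-wide latency from local disk latency.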
Initially it seemed our NFS mounts were not affected and operated at normal speed, but testing over the last day has shown that our NFS clients are also extremely slow, so it doesn't seem specific to FUSE as I first thought it might be. I am not sure how to proceed from here, I am fairly new to gluster having inherited this setup from my predecessor and trying to keep it going. I have included some info below to try and help with diagnosis, please let me know if any further info would be helpful. I would really appreciate any advice on what I could try to work out the cause. Thank you in advance for reading this, and any suggestions you might be able to offer. - Patrick This is an example of the main error I see in our brick logs, there have been others, I can post them when I see them again too: [2019-04-20 04:54:43.055680] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick1/ library: system.posix_acl_default [Operation not supported] [2019-04-20 05:01:29.476313] W [posix.c:4929:posix_getxattr] 0-gvAA01-posix: Extended attributes not supported (try remounting brick with 'user_xattr' flag) Our setup consists of 2 storage nodes and an arbiter node. I have noticed our nodes are on slightly different versions, I'm not sure if this could be an issue. We have 9 bricks on each node, made up of ZFS RAIDZ2 pools - total capacity is around 560TB. We have bonded 10gbps NICS on each node, and I have tested bandwidth with iperf and found that it's what would be expected from this config. Individual brick performance seems ok, I've tested several bricks using dd and can write a 10GB files at 1.7GB/s. # dd if=/dev/zero of=/brick1/test/test.file bs=1M count=10000 10000+0 records in 10000+0 records out 10485760000 bytes (10 GB, 9.8 GiB) copied, 6.20303 s, 1.7 GB/s Node 1: # glusterfs --version glusterfs 3.12.15 Node 2: # glusterfs --version glusterfs 3.12.14 Arbiter: # glusterfs --version glusterfs 3.12.14 Here is our gluster volume status: # gluster volume status Status of volume: gvAA01 Gluster process TCP Port RDMA Port Online Pid ------------------------------------------------------------------------------ Brick 01-B:/brick1/gvAA01/brick 49152 0 Y 7219 Brick 02-B:/brick1/gvAA01/brick 49152 0 Y 21845 Brick 00-A:/arbiterAA01/gvAA01/bri ck1 49152 0 Y 6931 Brick 01-B:/brick2/gvAA01/brick 49153 0 Y 7239 Brick 02-B:/brick2/gvAA01/brick 49153 0 Y 9916 Brick 00-A:/arbiterAA01/gvAA01/bri ck2 49153 0 Y 6939 Brick 01-B:/brick3/gvAA01/brick 49154 0 Y 7235 Brick 02-B:/brick3/gvAA01/brick 49154 0 Y 21858 Brick 00-A:/arbiterAA01/gvAA01/bri ck3 49154 0 Y 6947 Brick 01-B:/brick4/gvAA01/brick 49155 0 Y 31840 Brick 02-B:/brick4/gvAA01/brick 49155 0 Y 9933 Brick 00-A:/arbiterAA01/gvAA01/bri ck4 49155 0 Y 6956 Brick 01-B:/brick5/gvAA01/brick 49156 0 Y 7233 Brick 02-B:/brick5/gvAA01/brick 49156 0 Y 9942 Brick 00-A:/arbiterAA01/gvAA01/bri ck5 49156 0 Y 6964 Brick 01-B:/brick6/gvAA01/brick 49157 0 Y 7234 Brick 02-B:/brick6/gvAA01/brick 49157 0 Y 9952 Brick 00-A:/arbiterAA01/gvAA01/bri ck6 49157 0 Y 6974 Brick 01-B:/brick7/gvAA01/brick 49158 0 Y 7248 Brick 02-B:/brick7/gvAA01/brick 49158 0 Y 9960 Brick 00-A:/arbiterAA01/gvAA01/bri ck7 49158 0 Y 6984 Brick 01-B:/brick8/gvAA01/brick 49159 0 Y 7253 Brick 02-B:/brick8/gvAA01/brick 49159 0 Y 9970 Brick 00-A:/arbiterAA01/gvAA01/bri ck8 49159 0 Y 6993 Brick 01-B:/brick9/gvAA01/brick 49160 0 Y 7245 Brick 02-B:/brick9/gvAA01/brick 49160 0 Y 9984 Brick 00-A:/arbiterAA01/gvAA01/bri ck9 49160 0 Y 7001 NFS Server on localhost 2049 0 Y 17276 
Self-heal Daemon on localhost N/A N/A Y 25245 NFS Server on 02-B 2049 0 Y 9089 Self-heal Daemon on 02-B N/A N/A Y 17838 NFS Server on 00-a 2049 0 Y 15660 Self-heal Daemon on 00-a N/A N/A Y 16218 Task Status of Volume gvAA01 ------------------------------------------------------------------------------ There are no active volume tasks And gluster volume info: # gluster volume info Volume Name: gvAA01 Type: Distributed-Replicate Volume ID: ca4ece2c-13fe-414b-856c-2878196d6118 Status: Started Snapshot Count: 0 Number of Bricks: 9 x (2 + 1) = 27 Transport-type: tcp Bricks: Brick1: 01-B:/brick1/gvAA01/brick Brick2: 02-B:/brick1/gvAA01/brick Brick3: 00-A:/arbiterAA01/gvAA01/brick1 (arbiter) Brick4: 01-B:/brick2/gvAA01/brick Brick5: 02-B:/brick2/gvAA01/brick Brick6: 00-A:/arbiterAA01/gvAA01/brick2 (arbiter) Brick7: 01-B:/brick3/gvAA01/brick Brick8: 02-B:/brick3/gvAA01/brick Brick9: 00-A:/arbiterAA01/gvAA01/brick3 (arbiter) Brick10: 01-B:/brick4/gvAA01/brick Brick11: 02-B:/brick4/gvAA01/brick Brick12: 00-A:/arbiterAA01/gvAA01/brick4 (arbiter) Brick13: 01-B:/brick5/gvAA01/brick Brick14: 02-B:/brick5/gvAA01/brick Brick15: 00-A:/arbiterAA01/gvAA01/brick5 (arbiter) Brick16: 01-B:/brick6/gvAA01/brick Brick17: 02-B:/brick6/gvAA01/brick Brick18: 00-A:/arbiterAA01/gvAA01/brick6 (arbiter) Brick19: 01-B:/brick7/gvAA01/brick Brick20: 02-B:/brick7/gvAA01/brick Brick21: 00-A:/arbiterAA01/gvAA01/brick7 (arbiter) Brick22: 01-B:/brick8/gvAA01/brick Brick23: 02-B:/brick8/gvAA01/brick Brick24: 00-A:/arbiterAA01/gvAA01/brick8 (arbiter) Brick25: 01-B:/brick9/gvAA01/brick Brick26: 02-B:/brick9/gvAA01/brick Brick27: 00-A:/arbiterAA01/gvAA01/brick9 (arbiter) Options Reconfigured: cluster.shd-max-threads: 4 performance.least-prio-threads: 16 cluster.readdir-optimize: on performance.quick-read: off performance.stat-prefetch: off cluster.data-self-heal: on cluster.lookup-unhashed: auto cluster.lookup-optimize: on cluster.favorite-child-policy: mtime server.allow-insecure: on transport.address-family: inet client.bind-insecure: on cluster.entry-self-heal: off cluster.metadata-self-heal: off performance.md-cache-timeout: 600 cluster.self-heal-daemon: enable performance.readdir-ahead: on diagnostics.brick-log-level: INFO nfs.disable: off -------------- next part -------------- An HTML attachment was scrubbed... URL: From nbalacha at redhat.com Wed Apr 24 04:04:12 2019 From: nbalacha at redhat.com (Nithya Balachandran) Date: Wed, 24 Apr 2019 09:34:12 +0530 Subject: [Gluster-users] Extremely slow Gluster performance In-Reply-To: References: Message-ID: Hi Patrick, Did this start only after the upgrade? How do you determine which brick process to kill? Are there a lot of files to be healed on the volume? Can you provide a tcpdump of the slow listing from a separate test client mount ? 1. Mount the gluster volume on a different mount point than the one being used by your users. 2. Start capturing packets on the system where you have mounted the volume in (1). - tcpdump -i any -s 0 -w /var/tmp/dirls.pcap tcp and not port 22 3. List the directory that is slow from the fuse client 4. Stop the capture (after a couple of minutes or after the listing returns, whichever is earlier) 5. Send us the pcap and the listing of the same directory from one of the bricks in order to compare the entries. We may need more information post looking at the tcpdump. 
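Put together, steps 1-5 could look roughly like the following on a test client; the mount point /mnt/glustertest is a placeholder, while the volume and server names are taken from this thread:

# mkdir -p /mnt/glustertest
# mount -t glusterfs 01-B:/gvAA01 /mnt/glustertest
# tcpdump -i any -s 0 -w /var/tmp/dirls.pcap tcp and not port 22 &
# time ls -l /mnt/glustertest/<slow-directory> > /dev/null
# kill %1
      (stop the capture once the listing returns or a couple of minutes have passed)

and on one of the storage nodes, a listing of the same directory straight from a brick for comparison:

# ls -l /brick1/gvAA01/brick/<slow-directory> > /var/tmp/brick1-listing.txt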
Regards, Nithya On Tue, 23 Apr 2019 at 23:39, Patrick Rennie wrote: > Hello Gluster Users, > > I am hoping someone can help me with resolving an ongoing issue I've been > having, I'm new to mailing lists so forgive me if I have gotten anything > wrong. We have noticed our performance deteriorating over the last few > weeks, easily measured by trying to do an ls on one of our top-level > folders, and timing it, which usually would take 2-5 seconds, and now takes > up to 20 minutes, which obviously renders our cluster basically unusable. > This has been intermittent in the past but is now almost constant and I am > not sure how to work out the exact cause. We have noticed some errors in > the brick logs, and have noticed that if we kill the right brick process, > performance instantly returns back to normal, this is not always the same > brick, but it indicates to me something in the brick processes or > background tasks may be causing extreme latency. Due to this ability to fix > it by killing the right brick process off, I think it's a specific file, or > folder, or operation which may be hanging and causing the increased > latency, but I am not sure how to work it out. One last thing to add is > that our bricks are getting quite full (~95% full), we are trying to > migrate data off to new storage but that is going slowly, not helped by > this issue. I am currently trying to run a full heal as there appear to be > many files needing healing, and I have all brick processes running so they > have an opportunity to heal, but this means performance is very poor. It > currently takes over 15-20 minutes to do an ls of one of our top-level > folders, which just contains 60-80 other folders, this should take 2-5 > seconds. This is all being checked by FUSE mount locally on the storage > node itself, but it is the same for other clients and VMs accessing the > cluster. Initially it seemed our NFS mounts were not affected and operated > at normal speed, but testing over the last day has shown that our NFS > clients are also extremely slow, so it doesn't seem specific to FUSE as I > first thought it might be. > > I am not sure how to proceed from here, I am fairly new to gluster having > inherited this setup from my predecessor and trying to keep it going. I > have included some info below to try and help with diagnosis, please let me > know if any further info would be helpful. I would really appreciate any > advice on what I could try to work out the cause. Thank you in advance for > reading this, and any suggestions you might be able to offer. > > - Patrick > > This is an example of the main error I see in our brick logs, there have > been others, I can post them when I see them again too: > [2019-04-20 04:54:43.055680] E [MSGID: 113001] > [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on > /brick1/ library: system.posix_acl_default [Operation not > supported] > [2019-04-20 05:01:29.476313] W [posix.c:4929:posix_getxattr] > 0-gvAA01-posix: Extended attributes not supported (try remounting brick > with 'user_xattr' flag) > > Our setup consists of 2 storage nodes and an arbiter node. I have noticed > our nodes are on slightly different versions, I'm not sure if this could be > an issue. We have 9 bricks on each node, made up of ZFS RAIDZ2 pools - > total capacity is around 560TB. > We have bonded 10gbps NICS on each node, and I have tested bandwidth with > iperf and found that it's what would be expected from this config. 
> Individual brick performance seems ok, I've tested several bricks using dd > and can write a 10GB files at 1.7GB/s. > > # dd if=/dev/zero of=/brick1/test/test.file bs=1M count=10000 > 10000+0 records in > 10000+0 records out > 10485760000 bytes (10 GB, 9.8 GiB) copied, 6.20303 s, 1.7 GB/s > > Node 1: > # glusterfs --version > glusterfs 3.12.15 > > Node 2: > # glusterfs --version > glusterfs 3.12.14 > > Arbiter: > # glusterfs --version > glusterfs 3.12.14 > > Here is our gluster volume status: > > # gluster volume status > Status of volume: gvAA01 > Gluster process TCP Port RDMA Port Online > Pid > > ------------------------------------------------------------------------------ > Brick 01-B:/brick1/gvAA01/brick 49152 0 Y 7219 > Brick 02-B:/brick1/gvAA01/brick 49152 0 Y 21845 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck1 49152 0 Y > 6931 > Brick 01-B:/brick2/gvAA01/brick 49153 0 Y 7239 > Brick 02-B:/brick2/gvAA01/brick 49153 0 Y 9916 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck2 49153 0 Y > 6939 > Brick 01-B:/brick3/gvAA01/brick 49154 0 Y 7235 > Brick 02-B:/brick3/gvAA01/brick 49154 0 Y 21858 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck3 49154 0 Y > 6947 > Brick 01-B:/brick4/gvAA01/brick 49155 0 Y 31840 > Brick 02-B:/brick4/gvAA01/brick 49155 0 Y 9933 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck4 49155 0 Y > 6956 > Brick 01-B:/brick5/gvAA01/brick 49156 0 Y 7233 > Brick 02-B:/brick5/gvAA01/brick 49156 0 Y 9942 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck5 49156 0 Y > 6964 > Brick 01-B:/brick6/gvAA01/brick 49157 0 Y 7234 > Brick 02-B:/brick6/gvAA01/brick 49157 0 Y 9952 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck6 49157 0 Y > 6974 > Brick 01-B:/brick7/gvAA01/brick 49158 0 Y 7248 > Brick 02-B:/brick7/gvAA01/brick 49158 0 Y 9960 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck7 49158 0 Y > 6984 > Brick 01-B:/brick8/gvAA01/brick 49159 0 Y 7253 > Brick 02-B:/brick8/gvAA01/brick 49159 0 Y 9970 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck8 49159 0 Y > 6993 > Brick 01-B:/brick9/gvAA01/brick 49160 0 Y 7245 > Brick 02-B:/brick9/gvAA01/brick 49160 0 Y 9984 > Brick 00-A:/arbiterAA01/gvAA01/bri > ck9 49160 0 Y > 7001 > NFS Server on localhost 2049 0 Y > 17276 > Self-heal Daemon on localhost N/A N/A Y > 25245 > NFS Server on 02-B 2049 0 Y 9089 > Self-heal Daemon on 02-B N/A N/A Y 17838 > NFS Server on 00-a 2049 0 Y 15660 > Self-heal Daemon on 00-a N/A N/A Y 16218 > > Task Status of Volume gvAA01 > > ------------------------------------------------------------------------------ > There are no active volume tasks > > And gluster volume info: > > # gluster volume info > > Volume Name: gvAA01 > Type: Distributed-Replicate > Volume ID: ca4ece2c-13fe-414b-856c-2878196d6118 > Status: Started > Snapshot Count: 0 > Number of Bricks: 9 x (2 + 1) = 27 > Transport-type: tcp > Bricks: > Brick1: 01-B:/brick1/gvAA01/brick > Brick2: 02-B:/brick1/gvAA01/brick > Brick3: 00-A:/arbiterAA01/gvAA01/brick1 (arbiter) > Brick4: 01-B:/brick2/gvAA01/brick > Brick5: 02-B:/brick2/gvAA01/brick > Brick6: 00-A:/arbiterAA01/gvAA01/brick2 (arbiter) > Brick7: 01-B:/brick3/gvAA01/brick > Brick8: 02-B:/brick3/gvAA01/brick > Brick9: 00-A:/arbiterAA01/gvAA01/brick3 (arbiter) > Brick10: 01-B:/brick4/gvAA01/brick > Brick11: 02-B:/brick4/gvAA01/brick > Brick12: 00-A:/arbiterAA01/gvAA01/brick4 (arbiter) > Brick13: 01-B:/brick5/gvAA01/brick > Brick14: 02-B:/brick5/gvAA01/brick > Brick15: 00-A:/arbiterAA01/gvAA01/brick5 (arbiter) > Brick16: 01-B:/brick6/gvAA01/brick > Brick17: 02-B:/brick6/gvAA01/brick > Brick18: 00-A:/arbiterAA01/gvAA01/brick6 (arbiter) > Brick19: 
01-B:/brick7/gvAA01/brick > Brick20: 02-B:/brick7/gvAA01/brick > Brick21: 00-A:/arbiterAA01/gvAA01/brick7 (arbiter) > Brick22: 01-B:/brick8/gvAA01/brick > Brick23: 02-B:/brick8/gvAA01/brick > Brick24: 00-A:/arbiterAA01/gvAA01/brick8 (arbiter) > Brick25: 01-B:/brick9/gvAA01/brick > Brick26: 02-B:/brick9/gvAA01/brick > Brick27: 00-A:/arbiterAA01/gvAA01/brick9 (arbiter) > Options Reconfigured: > cluster.shd-max-threads: 4 > performance.least-prio-threads: 16 > cluster.readdir-optimize: on > performance.quick-read: off > performance.stat-prefetch: off > cluster.data-self-heal: on > cluster.lookup-unhashed: auto > cluster.lookup-optimize: on > cluster.favorite-child-policy: mtime > server.allow-insecure: on > transport.address-family: inet > client.bind-insecure: on > cluster.entry-self-heal: off > cluster.metadata-self-heal: off > performance.md-cache-timeout: 600 > cluster.self-heal-daemon: enable > performance.readdir-ahead: on > diagnostics.brick-log-level: INFO > nfs.disable: off > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrickmrennie at gmail.com Wed Apr 24 07:27:54 2019 From: patrickmrennie at gmail.com (Patrick Rennie) Date: Wed, 24 Apr 2019 15:27:54 +0800 Subject: [Gluster-users] Extremely slow Gluster performance In-Reply-To: References: Message-ID: Hi Nithya, Thanks for your reply. I believe this issue first began a few weeks ago, and has been intermittent and gradually gotten worse. I've seen it take as long as 20 minutes to do a simple ls on our gluster mount via FUSE. My colleague upgraded us from 3.12 to 3.13 about 2 days ago as he thought it might help but it didn't change much. Immediately following the upgrade, the performance was good, but within 10-15 minutes it had gotten much worse again. We have one brick process which has been killed right now, and performance has gone from 5-20 mins to do an "ls" down to about 20-60 seconds, we are hoping that this will allow some other bricks to heal and then once those issues are resolved, we will restart this brick process again. We determined which brick to try and kill by monitoring the load of our disks and finding out which disk was struggling the most. Within seconds of killing the brick process, performance significantly improved, then gradually got a little bit worse again. Our current theory is that there are too many heal operations going on, and our CPU/disks are struggling a bit with all of the load from heals, plus hundreds of clients accessing and trying to write new data. This is probably not the best way to go about it, but we have had to do this to keep things usable for our clients. Our workload consists of 24/7 backup data being written to this cluster. I am also in the process of trying to delete several million files from a directory we no longer need to reduce some small file i/o from the cluster. I believe there are still a lot of files which do need to be healed, I ran a "gluster volume heal info summary" which I can do now in 3.13 and there are a handful of files on some bricks which need healing, and several thousand on one brick, I will include a copy of that too. I have done as you suggested and collected 2 sample tcp dumps for you. If it's OK with you, I will email this to you separately as it may contain sensitive production data. Thank you for your assistance, much appreciated. 
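For tracking whether the heal backlog is actually shrinking while that brick process is down, a rough sketch; heal-count is available on recent 3.x releases, though the exact output format varies between versions:

# gluster volume heal gvAA01 info summary
# gluster volume heal gvAA01 statistics heal-count
# watch -n 300 'gluster volume heal gvAA01 statistics heal-count'

If the per-brick counts trend downwards between samples, the self-heal daemon is making progress; if they keep growing while the backup clients write, healing is falling behind the incoming load.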
Kind Regards, - Patrick On Wed, Apr 24, 2019 at 12:04 PM Nithya Balachandran wrote: > Hi Patrick, > > Did this start only after the upgrade? > How do you determine which brick process to kill? > Are there a lot of files to be healed on the volume? > > Can you provide a tcpdump of the slow listing from a separate test client > mount ? > > 1. Mount the gluster volume on a different mount point than the one > being used by your users. > 2. Start capturing packets on the system where you have mounted the > volume in (1). > - tcpdump -i any -s 0 -w /var/tmp/dirls.pcap tcp and not port 22 > 3. List the directory that is slow from the fuse client > 4. Stop the capture (after a couple of minutes or after the listing > returns, whichever is earlier) > 5. Send us the pcap and the listing of the same directory from one of > the bricks in order to compare the entries. > > > We may need more information post looking at the tcpdump. > > Regards, > Nithya > > On Tue, 23 Apr 2019 at 23:39, Patrick Rennie > wrote: > >> Hello Gluster Users, >> >> I am hoping someone can help me with resolving an ongoing issue I've been >> having, I'm new to mailing lists so forgive me if I have gotten anything >> wrong. We have noticed our performance deteriorating over the last few >> weeks, easily measured by trying to do an ls on one of our top-level >> folders, and timing it, which usually would take 2-5 seconds, and now takes >> up to 20 minutes, which obviously renders our cluster basically unusable. >> This has been intermittent in the past but is now almost constant and I am >> not sure how to work out the exact cause. We have noticed some errors in >> the brick logs, and have noticed that if we kill the right brick process, >> performance instantly returns back to normal, this is not always the same >> brick, but it indicates to me something in the brick processes or >> background tasks may be causing extreme latency. Due to this ability to fix >> it by killing the right brick process off, I think it's a specific file, or >> folder, or operation which may be hanging and causing the increased >> latency, but I am not sure how to work it out. One last thing to add is >> that our bricks are getting quite full (~95% full), we are trying to >> migrate data off to new storage but that is going slowly, not helped by >> this issue. I am currently trying to run a full heal as there appear to be >> many files needing healing, and I have all brick processes running so they >> have an opportunity to heal, but this means performance is very poor. It >> currently takes over 15-20 minutes to do an ls of one of our top-level >> folders, which just contains 60-80 other folders, this should take 2-5 >> seconds. This is all being checked by FUSE mount locally on the storage >> node itself, but it is the same for other clients and VMs accessing the >> cluster. Initially it seemed our NFS mounts were not affected and operated >> at normal speed, but testing over the last day has shown that our NFS >> clients are also extremely slow, so it doesn't seem specific to FUSE as I >> first thought it might be. >> >> I am not sure how to proceed from here, I am fairly new to gluster having >> inherited this setup from my predecessor and trying to keep it going. I >> have included some info below to try and help with diagnosis, please let me >> know if any further info would be helpful. I would really appreciate any >> advice on what I could try to work out the cause. 
Thank you in advance for >> reading this, and any suggestions you might be able to offer. >> >> - Patrick >> >> This is an example of the main error I see in our brick logs, there have >> been others, I can post them when I see them again too: >> [2019-04-20 04:54:43.055680] E [MSGID: 113001] >> [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on >> /brick1/ library: system.posix_acl_default [Operation not >> supported] >> [2019-04-20 05:01:29.476313] W [posix.c:4929:posix_getxattr] >> 0-gvAA01-posix: Extended attributes not supported (try remounting brick >> with 'user_xattr' flag) >> >> Our setup consists of 2 storage nodes and an arbiter node. I have noticed >> our nodes are on slightly different versions, I'm not sure if this could be >> an issue. We have 9 bricks on each node, made up of ZFS RAIDZ2 pools - >> total capacity is around 560TB. >> We have bonded 10gbps NICS on each node, and I have tested bandwidth with >> iperf and found that it's what would be expected from this config. >> Individual brick performance seems ok, I've tested several bricks using >> dd and can write a 10GB files at 1.7GB/s. >> >> # dd if=/dev/zero of=/brick1/test/test.file bs=1M count=10000 >> 10000+0 records in >> 10000+0 records out >> 10485760000 bytes (10 GB, 9.8 GiB) copied, 6.20303 s, 1.7 GB/s >> >> Node 1: >> # glusterfs --version >> glusterfs 3.12.15 >> >> Node 2: >> # glusterfs --version >> glusterfs 3.12.14 >> >> Arbiter: >> # glusterfs --version >> glusterfs 3.12.14 >> >> Here is our gluster volume status: >> >> # gluster volume status >> Status of volume: gvAA01 >> Gluster process TCP Port RDMA Port Online >> Pid >> >> ------------------------------------------------------------------------------ >> Brick 01-B:/brick1/gvAA01/brick 49152 0 Y 7219 >> Brick 02-B:/brick1/gvAA01/brick 49152 0 Y 21845 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck1 49152 0 Y >> 6931 >> Brick 01-B:/brick2/gvAA01/brick 49153 0 Y 7239 >> Brick 02-B:/brick2/gvAA01/brick 49153 0 Y 9916 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck2 49153 0 Y >> 6939 >> Brick 01-B:/brick3/gvAA01/brick 49154 0 Y 7235 >> Brick 02-B:/brick3/gvAA01/brick 49154 0 Y 21858 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck3 49154 0 Y >> 6947 >> Brick 01-B:/brick4/gvAA01/brick 49155 0 Y 31840 >> Brick 02-B:/brick4/gvAA01/brick 49155 0 Y 9933 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck4 49155 0 Y >> 6956 >> Brick 01-B:/brick5/gvAA01/brick 49156 0 Y 7233 >> Brick 02-B:/brick5/gvAA01/brick 49156 0 Y 9942 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck5 49156 0 Y >> 6964 >> Brick 01-B:/brick6/gvAA01/brick 49157 0 Y 7234 >> Brick 02-B:/brick6/gvAA01/brick 49157 0 Y 9952 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck6 49157 0 Y >> 6974 >> Brick 01-B:/brick7/gvAA01/brick 49158 0 Y 7248 >> Brick 02-B:/brick7/gvAA01/brick 49158 0 Y 9960 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck7 49158 0 Y >> 6984 >> Brick 01-B:/brick8/gvAA01/brick 49159 0 Y 7253 >> Brick 02-B:/brick8/gvAA01/brick 49159 0 Y 9970 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck8 49159 0 Y >> 6993 >> Brick 01-B:/brick9/gvAA01/brick 49160 0 Y 7245 >> Brick 02-B:/brick9/gvAA01/brick 49160 0 Y 9984 >> Brick 00-A:/arbiterAA01/gvAA01/bri >> ck9 49160 0 Y >> 7001 >> NFS Server on localhost 2049 0 Y >> 17276 >> Self-heal Daemon on localhost N/A N/A Y >> 25245 >> NFS Server on 02-B 2049 0 Y 9089 >> Self-heal Daemon on 02-B N/A N/A Y 17838 >> NFS Server on 00-a 2049 0 Y 15660 >> Self-heal Daemon on 00-a N/A N/A Y 16218 >> >> Task Status of Volume gvAA01 >> >> 
------------------------------------------------------------------------------ >> There are no active volume tasks >> >> And gluster volume info: >> >> # gluster volume info >> >> Volume Name: gvAA01 >> Type: Distributed-Replicate >> Volume ID: ca4ece2c-13fe-414b-856c-2878196d6118 >> Status: Started >> Snapshot Count: 0 >> Number of Bricks: 9 x (2 + 1) = 27 >> Transport-type: tcp >> Bricks: >> Brick1: 01-B:/brick1/gvAA01/brick >> Brick2: 02-B:/brick1/gvAA01/brick >> Brick3: 00-A:/arbiterAA01/gvAA01/brick1 (arbiter) >> Brick4: 01-B:/brick2/gvAA01/brick >> Brick5: 02-B:/brick2/gvAA01/brick >> Brick6: 00-A:/arbiterAA01/gvAA01/brick2 (arbiter) >> Brick7: 01-B:/brick3/gvAA01/brick >> Brick8: 02-B:/brick3/gvAA01/brick >> Brick9: 00-A:/arbiterAA01/gvAA01/brick3 (arbiter) >> Brick10: 01-B:/brick4/gvAA01/brick >> Brick11: 02-B:/brick4/gvAA01/brick >> Brick12: 00-A:/arbiterAA01/gvAA01/brick4 (arbiter) >> Brick13: 01-B:/brick5/gvAA01/brick >> Brick14: 02-B:/brick5/gvAA01/brick >> Brick15: 00-A:/arbiterAA01/gvAA01/brick5 (arbiter) >> Brick16: 01-B:/brick6/gvAA01/brick >> Brick17: 02-B:/brick6/gvAA01/brick >> Brick18: 00-A:/arbiterAA01/gvAA01/brick6 (arbiter) >> Brick19: 01-B:/brick7/gvAA01/brick >> Brick20: 02-B:/brick7/gvAA01/brick >> Brick21: 00-A:/arbiterAA01/gvAA01/brick7 (arbiter) >> Brick22: 01-B:/brick8/gvAA01/brick >> Brick23: 02-B:/brick8/gvAA01/brick >> Brick24: 00-A:/arbiterAA01/gvAA01/brick8 (arbiter) >> Brick25: 01-B:/brick9/gvAA01/brick >> Brick26: 02-B:/brick9/gvAA01/brick >> Brick27: 00-A:/arbiterAA01/gvAA01/brick9 (arbiter) >> Options Reconfigured: >> cluster.shd-max-threads: 4 >> performance.least-prio-threads: 16 >> cluster.readdir-optimize: on >> performance.quick-read: off >> performance.stat-prefetch: off >> cluster.data-self-heal: on >> cluster.lookup-unhashed: auto >> cluster.lookup-optimize: on >> cluster.favorite-child-policy: mtime >> server.allow-insecure: on >> transport.address-family: inet >> client.bind-insecure: on >> cluster.entry-self-heal: off >> cluster.metadata-self-heal: off >> performance.md-cache-timeout: 600 >> cluster.self-heal-daemon: enable >> performance.readdir-ahead: on >> diagnostics.brick-log-level: INFO >> nfs.disable: off >> >> >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From budic at onholyground.com Wed Apr 24 10:04:01 2019 From: budic at onholyground.com (Darrell Budic) Date: Wed, 24 Apr 2019 05:04:01 -0500 Subject: [Gluster-users] Extremely slow cluster performance In-Reply-To: References: <93FC9B39-2E8C-4579-8C9D-DEF1A28B7384@onholyground.com> <0A865F28-C4A6-41EF-AE37-70216670B4F0@onholyground.com> Message-ID: Patrick- What did you upgrade to? I?m probably missing something, but there wasn?t really a 3.13 version, and it isn?t listed on https://www.gluster.org/release-schedule/ Sorry about the confusion between dispersed and distribute-replicate, you?re absolutely correct that you need the normal shd max-threads settings there. Any improvements over time? Did you make sure each client or VM host can see all the servers? I?ve had an issue where a client was only talking to one of the servers, so it forced the servers to heal everything all the time, had a big performance impact. Probably don?t apply to an NFS mount, but may to your fuse mounts. Along those lines, any errors on the switches connecting the servers to the clients? 
Could explain why one is slow and the other isn?t so slow if one?s erroring a lot on the net. -Darrell > On Apr 23, 2019, at 5:07 AM, Patrick Rennie wrote: > > Hi Darrel, > > Thanks again for your advice, I tried to take yesterday off and just not think about it, back at it again today. Still no real progress, however my colleague upgraded our version to 3.13 yesterday, this has broken NFS and caused some other issues for us now. It did add the 'gluster volume heal info summary' so I can use that to try and keep an eye on how many files do seem to need healing, if it's accurate it's possibly less than I though. > > We are in the progress of moving this data to new storage, but it does take a long time to move so much data around, and more keeps coming in each day. > > We do have 3 cache SSDs for each brick so generally performance on the bricks themselves is quite quick, I can DD a 10GB file at ~1.7-2GB/s directly on a brick so I think the performance of each brick is actually ok. > > It's a distribute/replicate volume, not dispearsed so I can't change disperse.shd-max-threads. > > I have checked the basics like all peers connected and no scrubs in progress etc. > > Will keep working away at this, and will start to read through some of your performance tuning suggestions. Really appreciate your advice. > > Cheers, > > -Patrick > > > > On Mon, Apr 22, 2019 at 12:43 AM Darrell Budic > wrote: > Patrick- > > Specifically re: >> Thanks again for your advice, I've left it for a while but unfortunately it's still just as slow and causing more problems for our operations now. I will need to try and take some steps to at least bring performance back to normal while continuing to investigate the issue longer term. I can definitely see one node with heavier CPU than the other, almost double, which I am OK with, but I think the heal process is going to take forever, trying to check the "gluster volume heal info" shows thousands and thousands of files which may need healing, I have no idea how many in total the command is still running after hours, so I am not sure what has gone so wrong to cause this. >> ... >> I have no idea how long the healing is going to take on this cluster, we have around 560TB of data on here, but I don't think I can wait that long to try and restore performance to normal. > > You?re in a bind, I know, but it?s just going to take some time recover. You have a lot of data, and even at the best speeds your disks and networks can muster, it?s going to take a while. Until your cluster is fully healed, anything else you try may not have the full effect it would on a fully operational cluster. Your predecessor may have made things worse by not having proper posix attributes on the ZFS file system. You may have made things worse by killing brick processes in your distributed-replicated setup, creating an additional need for healing and possibly compounding the overall performance issues. I?m not trying to blame you or make you feel bad, but I do want to point out that there?s a problem here, and there is unlikely to be a silver bullet that will resolve the issue instantly. You?re going to have to give it time to get back into a ?normal" condition, which seems to be what your setup was configured and tested for in the first place. > > Those things said, rather than trying to move things from this cluster to different storage, what about having your VMs mount different storage in the first place and move the write load off of this cluster while it recovers? 
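On the earlier point about a client that only talks to one of the servers, a quick way to sanity-check connectivity from both sides; bond0 is a placeholder interface name and the port range comes from the volume status posted earlier in the thread:

# gluster peer status
      (every peer should show "Peer in Cluster (Connected)")
# gluster volume status gvAA01 clients
      (each brick should list roughly the same set of client addresses)
# ss -tn | grep ':4915'
      (run on a client: a rough check that it holds TCP connections to the
       brick ports 49152-49160 on both storage nodes, not just one)
# ip -s link show bond0
      (RX/TX error and drop counters on the bonded interface, servers and clients)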
> > Looking at the profile you posted for Strahil, your bricks are spending a lot of time doing LOOKUPs, and some are slower than others by a significant margin. If you haven?t already, check the zfs pools on those, make sure they don?t have any failed disks that might be slowing them down. Consider if you can speed them up with a ZIL or SLOG if they are spinning disks (although your previous server descriptions sound like you don?t need a SLOG, ZILs may help fi they are HDDs)? Just saw your additional comments that one server is faster than than the other, it?s possible that it?s got the actual data and the other one is doing healings every time it gets accessed, or it?s just got fuller and slower volumes. It may make sense to try forcing all your VM mounts to the faster server for a while, even if it?s the one with higher load (serving will get preference to healing, but don?t push the shd-max-threads too high, they can squash performance. Given it?s a dispersed volume, make sure you?ve got disperse.shd-max-threads at 4 or 8, and raise disperse.shd-wait-qlength to 4096 or so. > > You?re getting into things best tested with everything working, but desperate times call for accelerated testing, right? > > You could experiment with different values of performance.io -thread-cound, try 48. But if your CPU load is already near max, you?re getting everything you can out of your CPU already, so don?t spend too much time on it. > > Check out https://github.com/gluster/glusterfs/blob/release-3.11/extras/group-nl-cache and try applying these to your gluster volume. Without knowing more about your workload, these may help if you?re doing a lot of directory listing and file lookups or tests for the (non)existence of a file from your VMs. If those help, search the mailing list for info on the mount option ?negative_cache=1? and a thread titled '[Gluster-users] Gluster native mount is really slow compared to nfs?, it may have some client side mount options that could give you further benefits. > > Have a look at https://docs.gluster.org/en/v3/Administrator%20Guide/Managing%20Volumes/#tuning-options , cluster.data-sef-heal-algorithm full may help things heal faster for you. performance.flush-behind & related may improve write response to the clients, use caution unless you have UPSs & battery backed raids, etc. If you have stats on network traffic on/between your two ?real? node servers, you can use that as a proxy value for healing performance. > > I looked up the performance.stat-prefetch bug for you, it was fixed back in 3.8, so it should be safe to enable on your 3.12.x system even with servers at .15 & .14. > > You?ll probably have to wait for devs to get anything else out of those logs, but make sure your servers can all see each other (gluster peer status, everything should be ?Peer in Cluster (Connected)? on all servers), and all 3 see all the bricks in the ?gluster vol status?. Maybe check for split brain files on those you keep seeing in the logs? > > Good luck, have patience, and remember (& remind others) that things are not in their normal state at this moment, and look for things outside of the gluster server cluster to try to help (https://joejulian.name/post/optimizing-web-performance-with-glusterfs/ ) get through the healing as well. 
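A sketch of how the tunables mentioned above could be applied and rolled back one at a time, so any change that hurts can be undone quickly. The values are the ones suggested in this thread rather than general recommendations, and 'group nl-cache' only works if the nl-cache group file is installed under /var/lib/glusterd/groups on this version:

# gluster volume get gvAA01 performance.io-thread-count
# gluster volume set gvAA01 performance.io-thread-count 48
# gluster volume set gvAA01 cluster.data-self-heal-algorithm full
# gluster volume set gvAA01 cluster.shd-max-threads 8
# gluster volume set gvAA01 group nl-cache
# gluster volume reset gvAA01 performance.io-thread-count
      (reset reverts a single option to its default if a change backfires)

Checking 'gluster volume get gvAA01 all' after each change confirms what is actually in effect, which is useful when options are being set from more than one terminal during an incident.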
> > -Darrell > >> On Apr 21, 2019, at 4:41 AM, Patrick Rennie > wrote: >> >> Another small update from me, I have been keeping an eye on the glustershd.log file to see what is going on and I keep seeing the same file names come up in there every 10 minutes, but not a lot of other activity. Logs below. >> How can I be sure my heal is progressing through the files which actually need to be healed? I thought it would show up in these logs. >> I also increased the "cluster.shd-max-threads" from 4 to 8 to try and speed things up too. >> >> Any ideas here? >> >> Thanks, >> >> - Patrick >> >> On 01-B >> ------- >> [2019-04-21 09:12:54.575689] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-gvAA01-replicate-6: performing metadata selfheal on 5354c112-2e58-451d-a6f7-6bfcc1c9d904 >> [2019-04-21 09:12:54.733601] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed metadata selfheal on 5354c112-2e58-451d-a6f7-6bfcc1c9d904. sources=[0] 2 sinks=1 >> [2019-04-21 09:13:12.028509] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-5: performing entry selfheal on 30547ab6-1fbd-422e-9c81-2009f9ff7ebe >> [2019-04-21 09:13:12.047470] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-5: expunging file 30547ab6-1fbd-422e-9c81-2009f9ff7ebe/XXXXXXXX.vbm_346744_tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-17 >> >> [2019-04-21 09:23:13.044377] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-5: performing entry selfheal on 30547ab6-1fbd-422e-9c81-2009f9ff7ebe >> [2019-04-21 09:23:13.051479] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-5: expunging file 30547ab6-1fbd-422e-9c81-2009f9ff7ebe/XXXXXXXX.vbm_346744_tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-17 >> >> [2019-04-21 09:33:07.400369] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed data selfheal on 2fd9899f-192b-49cb-ae9c-df35d3f004fa. sources=[0] 2 sinks=1 >> [2019-04-21 09:33:11.825449] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-gvAA01-replicate-6: performing metadata selfheal on 2fd9899f-192b-49cb-ae9c-df35d3f004fa >> [2019-04-21 09:33:14.029837] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-5: performing entry selfheal on 30547ab6-1fbd-422e-9c81-2009f9ff7ebe >> [2019-04-21 09:33:14.037436] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-5: expunging file 30547ab6-1fbd-422e-9c81-2009f9ff7ebe/XXXXXXXX.vbm_346744_tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-17 >> [2019-04-21 09:33:23.913882] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed metadata selfheal on 2fd9899f-192b-49cb-ae9c-df35d3f004fa. sources=[0] 2 sinks=1 >> [2019-04-21 09:33:43.874201] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-gvAA01-replicate-6: performing metadata selfheal on c25b80fd-f7df-4c6d-92bd-db930e89a0b1 >> [2019-04-21 09:34:02.273898] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed metadata selfheal on c25b80fd-f7df-4c6d-92bd-db930e89a0b1. 
sources=[0] 2 sinks=1 >> [2019-04-21 09:35:12.282045] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed data selfheal on 94027f22-a7d7-4827-be0d-09cf5ddda885. sources=[0] 2 sinks=1 >> [2019-04-21 09:35:15.146252] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-gvAA01-replicate-6: performing metadata selfheal on 94027f22-a7d7-4827-be0d-09cf5ddda885 >> [2019-04-21 09:35:15.254538] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed metadata selfheal on 94027f22-a7d7-4827-be0d-09cf5ddda885. sources=[0] 2 sinks=1 >> [2019-04-21 09:35:22.900803] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed data selfheal on 84c93069-cfd8-441b-a6e8-958bed535b45. sources=[0] 2 sinks=1 >> [2019-04-21 09:35:27.150963] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-gvAA01-replicate-6: performing metadata selfheal on 84c93069-cfd8-441b-a6e8-958bed535b45 >> [2019-04-21 09:35:29.186295] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed metadata selfheal on 84c93069-cfd8-441b-a6e8-958bed535b45. sources=[0] 2 sinks=1 >> [2019-04-21 09:35:35.967451] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed data selfheal on e747c32e-4353-4173-9024-855c69cdf9b9. sources=[0] 2 sinks=1 >> [2019-04-21 09:35:40.733444] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-gvAA01-replicate-6: performing metadata selfheal on e747c32e-4353-4173-9024-855c69cdf9b9 >> [2019-04-21 09:35:58.707593] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed metadata selfheal on e747c32e-4353-4173-9024-855c69cdf9b9. sources=[0] 2 sinks=1 >> [2019-04-21 09:36:25.554260] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed data selfheal on 4758d581-9de0-403b-af8b-bfd3d71d020d. sources=[0] 2 sinks=1 >> [2019-04-21 09:36:26.031422] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 0-gvAA01-replicate-6: performing metadata selfheal on 4758d581-9de0-403b-af8b-bfd3d71d020d >> [2019-04-21 09:36:26.083982] I [MSGID: 108026] [afr-self-heal-common.c:1726:afr_log_selfheal] 0-gvAA01-replicate-6: Completed metadata selfheal on 4758d581-9de0-403b-af8b-bfd3d71d020d. 
sources=[0] 2 sinks=1 >> >> On 02-B >> ------- >> [2019-04-21 09:03:15.815250] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: performing entry selfheal on 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01 >> [2019-04-21 09:03:15.863153] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: expunging file 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01/C_VOL-b003-i174-cd.md5.tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 >> [2019-04-21 09:03:15.867432] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: performing entry selfheal on 65f6d9fb-a441-4e47-b91a-0936d11a8c8f >> [2019-04-21 09:03:15.875134] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: expunging file 65f6d9fb-a441-4e47-b91a-0936d11a8c8f/C_VOL-b001-i14937-cd.md5.tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 >> [2019-04-21 09:03:39.020198] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-5: performing entry selfheal on 30547ab6-1fbd-422e-9c81-2009f9ff7ebe >> [2019-04-21 09:03:39.027345] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-5: expunging file 30547ab6-1fbd-422e-9c81-2009f9ff7ebe/XXXXXXXX.vbm_346744_tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-17 >> >> [2019-04-21 09:13:18.524874] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: performing entry selfheal on 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01 >> [2019-04-21 09:13:20.070172] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: expunging file 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01/C_VOL-b003-i174-cd.md5.tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 >> [2019-04-21 09:13:20.074977] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: performing entry selfheal on 65f6d9fb-a441-4e47-b91a-0936d11a8c8f >> [2019-04-21 09:13:20.080827] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: expunging file 65f6d9fb-a441-4e47-b91a-0936d11a8c8f/C_VOL-b001-i14937-cd.md5.tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 >> [2019-04-21 09:13:40.015763] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-5: performing entry selfheal on 30547ab6-1fbd-422e-9c81-2009f9ff7ebe >> [2019-04-21 09:13:40.021805] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-5: expunging file 30547ab6-1fbd-422e-9c81-2009f9ff7ebe/XXXXXXXX.vbm_346744_tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-17 >> >> [2019-04-21 09:23:21.991032] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: performing entry selfheal on 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01 >> [2019-04-21 09:23:22.054565] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: expunging file 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01/C_VOL-b003-i174-cd.md5.tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 >> [2019-04-21 09:23:22.059225] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: performing entry selfheal on 65f6d9fb-a441-4e47-b91a-0936d11a8c8f >> [2019-04-21 09:23:22.066266] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: expunging file 
65f6d9fb-a441-4e47-b91a-0936d11a8c8f/C_VOL-b001-i14937-cd.md5.tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 >> [2019-04-21 09:23:41.129962] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-5: performing entry selfheal on 30547ab6-1fbd-422e-9c81-2009f9ff7ebe >> [2019-04-21 09:23:41.135919] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-5: expunging file 30547ab6-1fbd-422e-9c81-2009f9ff7ebe/XXXXXXXX.vbm_346744_tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-17 >> >> [2019-04-21 09:33:24.015223] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: performing entry selfheal on 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01 >> [2019-04-21 09:33:24.069686] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: expunging file 8fa9513e-82fd-4a6e-8ac9-e1f1cd8afb01/C_VOL-b003-i174-cd.md5.tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 >> [2019-04-21 09:33:24.074341] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-4: performing entry selfheal on 65f6d9fb-a441-4e47-b91a-0936d11a8c8f >> [2019-04-21 09:33:24.080065] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-4: expunging file 65f6d9fb-a441-4e47-b91a-0936d11a8c8f/C_VOL-b001-i14937-cd.md5.tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-14 >> [2019-04-21 09:33:42.099515] I [MSGID: 108026] [afr-self-heal-entry.c:887:afr_selfheal_entry_do] 0-gvAA01-replicate-5: performing entry selfheal on 30547ab6-1fbd-422e-9c81-2009f9ff7ebe >> [2019-04-21 09:33:42.107481] W [MSGID: 108015] [afr-self-heal-entry.c:56:afr_selfheal_entry_delete] 0-gvAA01-replicate-5: expunging file 30547ab6-1fbd-422e-9c81-2009f9ff7ebe/XXXXXXXX.vbm_346744_tmp (00000000-0000-0000-0000-000000000000) on gvAA01-client-17 >> >> >> On Sun, Apr 21, 2019 at 3:55 PM Patrick Rennie > wrote: >> Just another small update, I'm continuing to watch my brick logs and I just saw these errors come up in the recent events too. I am going to continue to post any errors I see in the hope of finding the right one to try and fix.. >> This is from the logs on brick1, seems to be occurring on both nodes on brick1, although at different times. I'm not sure what this means, can anyone shed any light? >> I guess I am looking for some kind of specific error which may indicate something is broken or stuck and locking up and causing the extreme latency I'm seeing in the cluster. 
>> >> [2019-04-21 07:25:55.064497] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c700c, Program: GlusterFS 3.3, ProgVers: 330, Proc: 29) to rpc-transport (tcp.gvAA01-server) >> [2019-04-21 07:25:55.064612] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e58a) [0x7f3b3e93158a] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17d45) [0x7f3b3e4c5d45] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed >> [2019-04-21 07:25:55.064675] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c70af, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >> [2019-04-21 07:25:55.064705] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed >> [2019-04-21 07:25:55.064742] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c723c, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >> [2019-04-21 07:25:55.064768] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed >> [2019-04-21 07:25:55.064812] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c72b4, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >> [2019-04-21 07:25:55.064837] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed >> [2019-04-21 07:25:55.064880] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c740b, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >> [2019-04-21 07:25:55.064905] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed >> [2019-04-21 07:25:55.064939] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c7441, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >> [2019-04-21 07:25:55.064962] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] 
-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed >> [2019-04-21 07:25:55.064996] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c74d5, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >> [2019-04-21 07:25:55.065020] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed >> [2019-04-21 07:25:55.065052] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c7551, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >> [2019-04-21 07:25:55.065076] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed >> [2019-04-21 07:25:55.065110] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x7c76d1, Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (tcp.gvAA01-server) >> [2019-04-21 07:25:55.065133] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/debug/io-stats.so(+0x1e8fa) [0x7f3b3e9318fa] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x17f35) [0x7f3b3e4c5f35] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.15/xlator/protocol/server.so(+0x92cd) [0x7f3b3e4b72cd] ) 0-: Reply submission failed >> >> Thanks again, >> >> -Patrick >> >> On Sun, Apr 21, 2019 at 3:50 PM Patrick Rennie > wrote: >> Hi Darrell, >> >> Thanks again for your advice, I've left it for a while but unfortunately it's still just as slow and causing more problems for our operations now. I will need to try and take some steps to at least bring performance back to normal while continuing to investigate the issue longer term. I can definitely see one node with heavier CPU than the other, almost double, which I am OK with, but I think the heal process is going to take forever, trying to check the "gluster volume heal info" shows thousands and thousands of files which may need healing, I have no idea how many in total the command is still running after hours, so I am not sure what has gone so wrong to cause this. >> >> I've checked cluster.op-version and cluster.max-op-version and it looks like I'm on the latest version there. >> >> I have no idea how long the healing is going to take on this cluster, we have around 560TB of data on here, but I don't think I can wait that long to try and restore performance to normal. >> >> Can anyone think of anything else I can try in the meantime to work out what's causing the extreme latency? >> >> I've been going through cluster client the logs of some of our VMs and on some of our FTP servers I found this in the cluster mount log, but I am not seeing it on any of our other servers, just our FTP servers. 
>> >> [2019-04-21 07:16:19.925388] E [MSGID: 101046] [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null >> [2019-04-21 07:19:43.413834] W [MSGID: 114031] [client-rpc-fops.c:2203:client3_3_setattr_cbk] 0-gvAA01-client-19: remote operation failed [No such file or directory] >> [2019-04-21 07:19:43.414153] W [MSGID: 114031] [client-rpc-fops.c:2203:client3_3_setattr_cbk] 0-gvAA01-client-20: remote operation failed [No such file or directory] >> [2019-04-21 07:23:33.154717] E [MSGID: 101046] [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null >> [2019-04-21 07:33:24.943913] E [MSGID: 101046] [dht-common.c:1904:dht_revalidate_cbk] 0-gvAA01-dht: dict is null >> >> Any ideas what this could mean? I am basically just grasping at straws here. >> >> I am going to hold off on the version upgrade until I know there are no files which need healing, which could be a while, from some reading I've done there shouldn't be any issues with this as both are on v3.12.x >> >> I've free'd up a small amount of space, but I still need to work on this further. >> >> I've read of a command "find .glusterfs -type f -links -2 -exec rm {} \;" which could be run on each brick and it would potentially clean up any files which were deleted straight from the bricks, but not via the client, I have a feeling this could help me free up about 5-10TB per brick from what I've been told about the history of this cluster. Can anyone confirm if this is actually safe to run? >> >> At this stage, I'm open to any suggestions as to how to proceed, thanks again for any advice. >> >> Cheers, >> >> - Patrick >> >> On Sun, Apr 21, 2019 at 1:22 AM Darrell Budic > wrote: >> Patrick, >> >> Sounds like progress. Be aware that gluster is expected to max out the CPUs on at least one of your servers while healing. This is normal and won't adversely affect overall performance (any more than having bricks in need of healing, at any rate) unless you're overdoing it. shd threads <= 4 should not do that on your hardware. Other tunings may have also increased overall performance, so you may see higher CPU than previously anyway. I'd recommend upping those thread counts and letting it heal as fast as possible, especially if these are dedicated Gluster storage servers (Ie: not also running VMs, etc). You should see 'normal' CPU use once heals are completed. I see ~15-30% overall normally, 95-98% while healing (x my 20 cores). It's also likely to be different between your servers, in a pure replica, one tends to max and one tends to be a little higher, in a distributed-replica, I'd expect more than one to run harder while healing. >> >> Keep the differences between doing an ls on a brick and doing an ls on a gluster mount in mind. When you do a ls on a gluster volume, it isn't just doing a ls on one brick, it's effectively doing it on ALL of your bricks, and they all have to return data before the ls succeeds. In a distributed volume, it's figuring out where on each volume things live and getting the stat() from each to assemble the whole thing. And if things are in need of healing, it will take even longer to decide which version is current and use it (shd triggers a heal anytime it encounters this). Any of these things being slow slows down the overall response. >> >> At this point, I'd get some sleep too, and let your cluster heal while you do. I'd really want it fully healed before I did any updates anyway, so let it use CPU and get itself sorted out.
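A couple of practical sketches for the two open questions above, offered as untested examples rather than verified procedures. To watch healing progress without waiting on the very slow full "heal info" listing, the per-brick pending counters are much cheaper to query:

# gluster volume heal gvAA01 statistics heal-count

For the orphaned-file cleanup question, a safer first pass is to run the same find in report-only mode on a single brick and review the list before ever adding the rm (the brick path here is just one example brick):

# cd /brick1/gvAA01/brick
# find .glusterfs -type f -links -2 -print > /tmp/brick1-orphan-candidates.txt
# wc -l /tmp/brick1-orphan-candidates.txt

Only if the listed entries really are leftovers of files removed directly from the brick, and nothing is still pending heal, would re-running it with -exec rm {} \; instead of -print seem reasonable.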
Expect it to do a round of healing after you upgrade each machine too, this is normal so don't let the CPU spike surprise you, It's just catching up from the downtime incurred by the update and/or reboot if you did one. >> >> That reminds me, check your gluster cluster.op-version and cluster.max-op-version (gluster vol get all all | grep op-version). If op-version isn't at the max-op-version, set it to it so you're taking advantage of the latest features available to your version. >> >> -Darrell >> >>> On Apr 20, 2019, at 11:54 AM, Patrick Rennie > wrote: >>> >>> Hi Darrell, >>> >>> Thanks again for your advice, I've applied the acltype=posixacl on my zpools and I think that has reduced some of the noise from my brick logs. >>> I also bumped up some of the thread counts you suggested but my CPU load skyrocketed, so I dropped it back down to something slightly lower, but still higher than it was before, and will see how that goes for a while. >>> >>> Although low space is a definite issue, if I run an ls anywhere on my bricks directly it's instant, <1 second, and still takes several minutes via gluster, so there is still a problem in my gluster configuration somewhere. We don't have any snapshots, but I am trying to work out if any data on there is safe to delete, or if there is any way I can safely find and delete data which has been removed directly from the bricks in the past. I also have lz4 compression already enabled on each zpool which does help a bit, we get between 1.05 and 1.08x compression on this data. >>> I've tried to go through each client and checked its cluster mount logs and also my brick logs looking for errors, so far nothing is jumping out at me, but there are some warnings and errors here and there, I am trying to work out what they mean. >>> >>> It's already 1 am here and unfortunately, I'm still awake working on this issue, but I think that I will have to leave the version upgrades until tomorrow. >>> >>> Thanks again for your advice so far. If anyone has any ideas on where I can look for errors other than brick logs or the cluster mount logs to help resolve this issue, it would be much appreciated. >>> >>> Cheers, >>> >>> - Patrick >>> >>> On Sat, Apr 20, 2019 at 11:57 PM Darrell Budic > wrote: >>> See inline: >>> >>>> On Apr 20, 2019, at 10:09 AM, Patrick Rennie > wrote: >>>> >>>> Hi Darrell, >>>> >>>> Thanks for your reply, this issue seems to be getting worse over the last few days, really has me tearing my hair out. I will do as you have suggested and get started on upgrading from 3.12.14 to 3.12.15. >>>> I've checked the zfs properties and all bricks have "xattr=sa" set, but none of them has "acltype=posixacl" set, currently the acltype property shows "off", if I make these changes will it apply retroactively to the existing data? I'm unfamiliar with what this will change so I may need to look into that before I proceed. >>> >>> It is safe to apply that now, any new set/get calls will then use it if new posixacls exist, and use older if not. ZFS is good that way. It should clear up your posix_acl and posix errors over time. >>> >>>> I understand performance is going to slow down as the bricks get full, I am currently trying to free space and migrate data to some newer storage, I have fresh several hundred TB storage I just setup recently but with these performance issues it's really slow.
I also believe there is significant data which has been deleted directly from the bricks in the past, so if I can reclaim this space in a safe manner then I will have at least around 10-15% free space. >>> >>> Full ZFS volumes will have a much larger impact on performance than you'd think, I'd prioritize this. If you have been taking zfs snapshots, consider deleting them to get the overall volume free space back up. And just to be sure it's been said, delete from within the mounted volumes, don't delete directly from the bricks (gluster will just try and heal it later, compounding your issues). Does not apply to deleting other data from the ZFS volume if it's not part of the brick directory, of course. >>> >>>> These servers have dual 8 core Xeon (E5-2620v4) and 512GB of RAM so generally they have plenty of resources available, currently only using around 330/512GB of memory. >>>> >>>> I will look into what your suggested settings will change, and then will probably go ahead with your recommendations, for our specs as stated above, what would you suggest for performance.io-thread-count? >>> >>> I run single 2630v4s on my servers, which have a smaller storage footprint than yours. I'd go with 32 for performance.io-thread-count. I'd try 4 for the shd thread settings on that gear. Your memory use sounds fine, so no worries there. >>> >>>> Our workload is nothing too extreme, we have a few VMs which write backup data to this storage nightly for our clients, our VMs don't live on this cluster, but just write to it. >>> >>> If they are writing compressible data, you'll get immediate benefit by setting compression=lz4 on your ZFS volumes. It won't help any old data, of course, but it will compress new data going forward. This is another one that's safe to enable on the fly. >>> >>>> I've been going through all of the logs I can, below are some slightly sanitized errors I've come across, but I'm not sure what to make of them. The main error I am seeing is the first one below, across several of my bricks, but possibly only for specific folders on the cluster, I'm not 100% about that yet though.
>>>> >>>> [2019-04-20 05:56:59.512649] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not supported] >>>> [2019-04-20 05:59:06.084333] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not supported] >>>> [2019-04-20 05:59:43.289030] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not supported] >>>> [2019-04-20 05:59:50.582257] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not supported] >>>> [2019-04-20 06:01:42.501701] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick7/xxxxxxxxxxxxxxxxxxxx: system.posix_acl_default [Operation not supported] >>>> [2019-04-20 06:01:51.665354] W [posix.c:4929:posix_getxattr] 0-gvAA01-posix: Extended attributes not supported (try remounting brick with 'user_xattr' flag) >>>> >>>> >>>> [2019-04-20 13:12:36.131856] E [MSGID: 113002] [posix-helpers.c:893:posix_gfid_set] 0-gvAA01-posix: gfid is null for /xxxxxxxxxxxxxxxxxxxx [Invalid argument] >>>> [2019-04-20 13:12:36.131959] E [MSGID: 113002] [posix.c:362:posix_lookup] 0-gvAA01-posix: buf->ia_gfid is null for /brick2/xxxxxxxxxxxxxxxxxxxx_62906_tmp [No data available] >>>> [2019-04-20 13:12:36.132016] E [MSGID: 115050] [server-rpc-fops.c:175:server_lookup_cbk] 0-gvAA01-server: 24274759: LOOKUP /xxxxxxxxxxxxxxxxxxxx (a7c9b4a0-b7ee-4d01-a79e-576013c8ac87/Cloud Backup_clone1.vbm_62906_tmp), client: 00-A-16217-2019/04/08-21:23:03:692424-gvAA01-client-4-0-3, error-xlator: gvAA01-posix [No data available] >>>> [2019-04-20 13:12:38.093719] E [MSGID: 115050] [server-rpc-fops.c:175:server_lookup_cbk] 0-gvAA01-server: 24276491: LOOKUP /xxxxxxxxxxxxxxxxxxxx (a7c9b4a0-b7ee-4d01-a79e-576013c8ac87/Cloud Backup_clone1.vbm_62906_tmp), client: 00-A-16217-2019/04/08-21:23:03:692424-gvAA01-client-4-0-3, error-xlator: gvAA01-posix [No data available] >>>> [2019-04-20 13:12:38.093660] E [MSGID: 113002] [posix-helpers.c:893:posix_gfid_set] 0-gvAA01-posix: gfid is null for /xxxxxxxxxxxxxxxxxxxx [Invalid argument] >>>> [2019-04-20 13:12:38.093696] E [MSGID: 113002] [posix.c:362:posix_lookup] 0-gvAA01-posix: buf->ia_gfid is null for /brick2/xxxxxxxxxxxxxxxxxxxx [No data available] >>>> >>> >>> posixacls should clear those up, as mentioned. 
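For reference, the property change being discussed is a per-dataset one-liner on ZFS on Linux; a minimal sketch, using one brick pool as an example (the actual pool/dataset names on these servers will differ):

# zfs set acltype=posixacl brick7
# zfs set xattr=sa brick7
# zfs get acltype,xattr brick7

Both properties can be changed on a live dataset; as noted above, existing files only pick up the new ACL handling as they are accessed or written again.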
>>> >>>> [2019-04-20 14:25:59.654576] E [inodelk.c:404:__inode_unlock_lock] 0-gvAA01-locks: Matching lock not found for unlock 0-9223372036854775807, by 980fdbbd367f0000 on 0x7fc4f0161440 >>>> [2019-04-20 14:25:59.654668] E [MSGID: 115053] [server-rpc-fops.c:295:server_inodelk_cbk] 0-gvAA01-server: 6092928: INODELK /xxxxxxxxxxxxxxxxxxxx.cdr$ (25b14631-a179-4274-8243-6e272d4f2ad8), client: cb-per-worker18-53637-2019/04/19-14:25:37:927673-gvAA01-client-1-0-4, error-xlator: gvAA01-locks [Invalid argument] >>>> >>>> >>>> [2019-04-20 13:35:07.495495] E [rpcsvc.c:1364:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0x247c644, Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport (tcp.gvAA01-server) >>>> [2019-04-20 13:35:07.495619] E [server.c:195:server_submit_reply] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/debug/io-stats.so(+0x1696a) [0x7ff4ae6f796a] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/protocol/server.so(+0x2d6e8) [0x7ff4ae2a96e8] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/protocol/server.so(+0x928d) [0x7ff4ae28528d] ) 0-: Reply submission failed >>>> >>> >>> Fix the posix acls and see if these clear up over time as well, I'm unclear on what the overall effect of running without the posix acls will be to total gluster health. Your biggest problem sounds like you need to free up space on the volumes and get the overall volume health back up to par and see if that doesn't resolve the symptoms you're seeing. >>> >>> >>>> >>>> Thank you again for your assistance. It is greatly appreciated. >>>> >>>> - Patrick >>>> >>>> >>>> >>>> On Sat, Apr 20, 2019 at 10:50 PM Darrell Budic > wrote: >>>> Patrick, >>>> >>>> I would definitely upgrade your two nodes from 3.12.14 to 3.12.15. You also mention ZFS, and that error you show makes me think you need to check to be sure you have "xattr=sa" and "acltype=posixacl" set on your ZFS volumes. >>>> >>>> You also observed your bricks are crossing the 95% full line, ZFS performance will degrade significantly the closer you get to full. In my experience, this starts somewhere between 10% and 5% free space remaining, so you're in that realm. >>>> >>>> How's your free memory on the servers doing? Do you have your zfs arc cache limited to something less than all the RAM? It shares pretty well, but I've encountered situations where other things won't try and take ram back properly if they think it's in use, so ZFS never gets the opportunity to give it up. >>>> >>>> Since your volume is a disperse-replica, you might try tuning disperse.shd-max-threads, default is 1, I'd try it at 2, 4, or even more if the CPUs are beefy enough. And setting server.event-threads to 4 and client.event-threads to 8 has proven helpful in many cases. After you get upgraded to 3.12.15, enabling performance.stat-prefetch may help as well. I don't know if it matters, but I'd also recommend resetting performance.least-prio-threads to the default of 1 (or try 2 or 4) and/or also setting performance.io-thread-count to 32 if those have beefy CPUs. >>>> >>>> Beyond those general ideas, more info about your hardware (CPU and RAM) and workload (VMs, direct storage for web servers or enders, etc) may net you some more ideas. Then you're going to have to do more digging into brick logs looking for errors and/or warnings to see what's going on.
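Spelled out as commands, the tuning mentioned above would look roughly like the following. This is an illustrative sketch only: it uses cluster.shd-max-threads, the replicate-volume counterpart of the disperse option named above (this volume is distributed-replicate), and each setting can be undone with 'gluster volume reset gvAA01 <option>':

# gluster volume set gvAA01 cluster.shd-max-threads 4
# gluster volume set gvAA01 server.event-threads 4
# gluster volume set gvAA01 client.event-threads 8
# gluster volume set gvAA01 performance.io-thread-count 32
# gluster volume get gvAA01 all | grep -E 'shd-max-threads|event-threads|io-thread-count'

The values are the ones suggested in this thread, not measured recommendations; raise or lower them based on the CPU load observed while heals are running.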
>>>> >>>> -Darrell >>>> >>>> >>>>> On Apr 20, 2019, at 8:22 AM, Patrick Rennie > wrote: >>>>> >>>>> Hello Gluster Users, >>>>> >>>>> I am hoping someone can help me with resolving an ongoing issue I've been having, I'm new to mailing lists so forgive me if I have gotten anything wrong. We have noticed our performance deteriorating over the last few weeks, easily measured by trying to do an ls on one of our top-level folders, and timing it, which usually would take 2-5 seconds, and now takes up to 20 minutes, which obviously renders our cluster basically unusable. This has been intermittent in the past but is now almost constant and I am not sure how to work out the exact cause. We have noticed some errors in the brick logs, and have noticed that if we kill the right brick process, performance instantly returns back to normal, this is not always the same brick, but it indicates to me something in the brick processes or background tasks may be causing extreme latency. Due to this ability to fix it by killing the right brick process off, I think it's a specific file, or folder, or operation which may be hanging and causing the increased latency, but I am not sure how to work it out. One last thing to add is that our bricks are getting quite full (~95% full), we are trying to migrate data off to new storage but that is going slowly, not helped by this issue. I am currently trying to run a full heal as there appear to be many files needing healing, and I have all brick processes running so they have an opportunity to heal, but this means performance is very poor. It currently takes over 15-20 minutes to do an ls of one of our top-level folders, which just contains 60-80 other folders, this should take 2-5 seconds. This is all being checked by FUSE mount locally on the storage node itself, but it is the same for other clients and VMs accessing the cluster. Initially, it seemed our NFS mounts were not affected and operated at normal speed, but testing over the last day has shown that our NFS clients are also extremely slow, so it doesn't seem specific to FUSE as I first thought it might be. >>>>> >>>>> I am not sure how to proceed from here, I am fairly new to gluster having inherited this setup from my predecessor and trying to keep it going. I have included some info below to try and help with diagnosis, please let me know if any further info would be helpful. I would really appreciate any advice on what I could try to work out the cause. Thank you in advance for reading this, and any suggestions you might be able to offer. >>>>> >>>>> - Patrick >>>>> >>>>> This is an example of the main error I see in our brick logs, there have been others, I can post them when I see them again too: >>>>> [2019-04-20 04:54:43.055680] E [MSGID: 113001] [posix.c:4940:posix_getxattr] 0-gvAA01-posix: getxattr failed on /brick1/ library: system.posix_acl_default [Operation not supported] >>>>> [2019-04-20 05:01:29.476313] W [posix.c:4929:posix_getxattr] 0-gvAA01-posix: Extended attributes not supported (try remounting brick with 'user_xattr' flag) >>>>> >>>>> Our setup consists of 2 storage nodes and an arbiter node. I have noticed our nodes are on slightly different versions, I'm not sure if this could be an issue. We have 9 bricks on each node, made up of ZFS RAIDZ2 pools - total capacity is around 560TB. >>>>> We have bonded 10gbps NICS on each node, and I have tested bandwidth with iperf and found that it's what would be expected from this config. 
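For anyone repeating that kind of network check, a typical iperf run (sketched here with the node names from the status output below; flags differ slightly between iperf and iperf3) is a server on one node and a multi-stream client on the other:

# iperf -s                    (on 01-B)
# iperf -c 01-B -P 4 -t 30    (on 02-B)

With bonded 10gbps links the aggregate should come out near line rate, though how the streams spread across the bond depends on the bonding mode in use.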
>>>>> Individual brick performance seems ok, I've tested several bricks using dd and can write a 10GB files at 1.7GB/s. >>>>> >>>>> # dd if=/dev/zero of=/brick1/test/test.file bs=1M count=10000 >>>>> 10000+0 records in >>>>> 10000+0 records out >>>>> 10485760000 bytes (10 GB, 9.8 GiB) copied, 6.20303 s, 1.7 GB/s >>>>> >>>>> Node 1: >>>>> # glusterfs --version >>>>> glusterfs 3.12.15 >>>>> >>>>> Node 2: >>>>> # glusterfs --version >>>>> glusterfs 3.12.14 >>>>> >>>>> Arbiter: >>>>> # glusterfs --version >>>>> glusterfs 3.12.14 >>>>> >>>>> Here is our gluster volume status: >>>>> >>>>> # gluster volume status >>>>> Status of volume: gvAA01 >>>>> Gluster process TCP Port RDMA Port Online Pid >>>>> ------------------------------------------------------------------------------ >>>>> Brick 01-B:/brick1/gvAA01/brick 49152 0 Y 7219 >>>>> Brick 02-B:/brick1/gvAA01/brick 49152 0 Y 21845 >>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>> ck1 49152 0 Y 6931 >>>>> Brick 01-B:/brick2/gvAA01/brick 49153 0 Y 7239 >>>>> Brick 02-B:/brick2/gvAA01/brick 49153 0 Y 9916 >>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>> ck2 49153 0 Y 6939 >>>>> Brick 01-B:/brick3/gvAA01/brick 49154 0 Y 7235 >>>>> Brick 02-B:/brick3/gvAA01/brick 49154 0 Y 21858 >>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>> ck3 49154 0 Y 6947 >>>>> Brick 01-B:/brick4/gvAA01/brick 49155 0 Y 31840 >>>>> Brick 02-B:/brick4/gvAA01/brick 49155 0 Y 9933 >>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>> ck4 49155 0 Y 6956 >>>>> Brick 01-B:/brick5/gvAA01/brick 49156 0 Y 7233 >>>>> Brick 02-B:/brick5/gvAA01/brick 49156 0 Y 9942 >>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>> ck5 49156 0 Y 6964 >>>>> Brick 01-B:/brick6/gvAA01/brick 49157 0 Y 7234 >>>>> Brick 02-B:/brick6/gvAA01/brick 49157 0 Y 9952 >>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>> ck6 49157 0 Y 6974 >>>>> Brick 01-B:/brick7/gvAA01/brick 49158 0 Y 7248 >>>>> Brick 02-B:/brick7/gvAA01/brick 49158 0 Y 9960 >>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>> ck7 49158 0 Y 6984 >>>>> Brick 01-B:/brick8/gvAA01/brick 49159 0 Y 7253 >>>>> Brick 02-B:/brick8/gvAA01/brick 49159 0 Y 9970 >>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>> ck8 49159 0 Y 6993 >>>>> Brick 01-B:/brick9/gvAA01/brick 49160 0 Y 7245 >>>>> Brick 02-B:/brick9/gvAA01/brick 49160 0 Y 9984 >>>>> Brick 00-A:/arbiterAA01/gvAA01/bri >>>>> ck9 49160 0 Y 7001 >>>>> NFS Server on localhost 2049 0 Y 17276 >>>>> Self-heal Daemon on localhost N/A N/A Y 25245 >>>>> NFS Server on 02-B 2049 0 Y 9089 >>>>> Self-heal Daemon on 02-B N/A N/A Y 17838 >>>>> NFS Server on 00-a 2049 0 Y 15660 >>>>> Self-heal Daemon on 00-a N/A N/A Y 16218 >>>>> >>>>> Task Status of Volume gvAA01 >>>>> ------------------------------------------------------------------------------ >>>>> There are no active volume tasks >>>>> >>>>> And gluster volume info: >>>>> >>>>> # gluster volume info >>>>> >>>>> Volume Name: gvAA01 >>>>> Type: Distributed-Replicate >>>>> Volume ID: ca4ece2c-13fe-414b-856c-2878196d6118 >>>>> Status: Started >>>>> Snapshot Count: 0 >>>>> Number of Bricks: 9 x (2 + 1) = 27 >>>>> Transport-type: tcp >>>>> Bricks: >>>>> Brick1: 01-B:/brick1/gvAA01/brick >>>>> Brick2: 02-B:/brick1/gvAA01/brick >>>>> Brick3: 00-A:/arbiterAA01/gvAA01/brick1 (arbiter) >>>>> Brick4: 01-B:/brick2/gvAA01/brick >>>>> Brick5: 02-B:/brick2/gvAA01/brick >>>>> Brick6: 00-A:/arbiterAA01/gvAA01/brick2 (arbiter) >>>>> Brick7: 01-B:/brick3/gvAA01/brick >>>>> Brick8: 02-B:/brick3/gvAA01/brick >>>>> Brick9: 00-A:/arbiterAA01/gvAA01/brick3 (arbiter) >>>>> Brick10: 01-B:/brick4/gvAA01/brick >>>>> Brick11: 
02-B:/brick4/gvAA01/brick >>>>> Brick12: 00-A:/arbiterAA01/gvAA01/brick4 (arbiter) >>>>> Brick13: 01-B:/brick5/gvAA01/brick >>>>> Brick14: 02-B:/brick5/gvAA01/brick >>>>> Brick15: 00-A:/arbiterAA01/gvAA01/brick5 (arbiter) >>>>> Brick16: 01-B:/brick6/gvAA01/brick >>>>> Brick17: 02-B:/brick6/gvAA01/brick >>>>> Brick18: 00-A:/arbiterAA01/gvAA01/brick6 (arbiter) >>>>> Brick19: 01-B:/brick7/gvAA01/brick >>>>> Brick20: 02-B:/brick7/gvAA01/brick >>>>> Brick21: 00-A:/arbiterAA01/gvAA01/brick7 (arbiter) >>>>> Brick22: 01-B:/brick8/gvAA01/brick >>>>> Brick23: 02-B:/brick8/gvAA01/brick >>>>> Brick24: 00-A:/arbiterAA01/gvAA01/brick8 (arbiter) >>>>> Brick25: 01-B:/brick9/gvAA01/brick >>>>> Brick26: 02-B:/brick9/gvAA01/brick >>>>> Brick27: 00-A:/arbiterAA01/gvAA01/brick9 (arbiter) >>>>> Options Reconfigured: >>>>> cluster.shd-max-threads: 4 >>>>> performance.least-prio-threads: 16 >>>>> cluster.readdir-optimize: on >>>>> performance.quick-read: off >>>>> performance.stat-prefetch: off >>>>> cluster.data-self-heal: on >>>>> cluster.lookup-unhashed: auto >>>>> cluster.lookup-optimize: on >>>>> cluster.favorite-child-policy: mtime >>>>> server.allow-insecure: on >>>>> transport.address-family: inet >>>>> client.bind-insecure: on >>>>> cluster.entry-self-heal: off >>>>> cluster.metadata-self-heal: off >>>>> performance.md-cache-timeout: 600 >>>>> cluster.self-heal-daemon: enable >>>>> performance.readdir-ahead: on >>>>> diagnostics.brick-log-level: INFO >>>>> nfs.disable: off >>>>> >>>>> Thank you for any assistance. >>>>> >>>>> - Patrick >>>>> _______________________________________________ >>>>> Gluster-users mailing list >>>>> Gluster-users at gluster.org >>>>> https://lists.gluster.org/mailman/listinfo/gluster-users >>> >> >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From hongzhi.song at windriver.com Thu Apr 25 02:21:05 2019 From: hongzhi.song at windriver.com (Hongzhi, Song) Date: Thu, 25 Apr 2019 10:21:05 +0800 Subject: [Gluster-users] Download server is crashed? Message-ID: <9d67dde1-f0f5-f304-e554-997120af4c72@windriver.com> Hi all, I try to download .tar.gz from https://download.gluster.org/pub/gluster/glusterfs/LATEST/glusterfs-6.1.tar.gz and https://bits.gluster.org/pub/gluster/glusterfs/src/glusterfs-6.1.tar.gz But connection is always been closed. Anyone else meet this issue? --Hongzhi -------------- next part -------------- An HTML attachment was scrubbed... URL: From vbellur at redhat.com Thu Apr 25 19:45:57 2019 From: vbellur at redhat.com (Vijay Bellur) Date: Thu, 25 Apr 2019 12:45:57 -0700 Subject: [Gluster-users] Download server is crashed? In-Reply-To: <9d67dde1-f0f5-f304-e554-997120af4c72@windriver.com> References: <9d67dde1-f0f5-f304-e554-997120af4c72@windriver.com> Message-ID: On Wed, Apr 24, 2019 at 7:36 PM Hongzhi, Song wrote: > Hi all, > > I try to download .tar.gz from > https://download.gluster.org/pub/gluster/glusterfs/LATEST/ > glusterfs-6.1.tar.gz > > > and > https://bits.gluster.org/pub/gluster/glusterfs/src/glusterfs-6.1.tar.gz > > > But connection is always been closed. Anyone else meet this issue? > > > Checked now and am able to download successfully. Do you still observe the problem? Thanks, Vijay -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From phaley at mit.edu Fri Apr 26 01:05:45 2019 From: phaley at mit.edu (Pat Haley) Date: Thu, 25 Apr 2019 21:05:45 -0400 Subject: [Gluster-users] Expanding brick size in glusterfs 3.7.11 Message-ID: <29119358-18e7-caab-2535-e2530255fa75@mit.edu> Hi, Last summer we added a new brick to our gluster volume (running glusterfs 3.7.11).? The new brick was a new server with with 12 of 24 disk bays filled (we couldn't afford to fill them all at the time).? These 12 disks are managed in a hardware RAID-6.? We have recently been able to purchase another 12 disks.? We would like to just add these new disks to the existing hardware RAID and thus expand the size of the brick.? If we can successfully add them to the hardware RAID like this, will gluster have any problems with the expanded brick size? -- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- Pat Haley Email: phaley at mit.edu Center for Ocean Engineering Phone: (617) 253-6824 Dept. of Mechanical Engineering Fax: (617) 253-8125 MIT, Room 5-213 http://web.mit.edu/phaley/www/ 77 Massachusetts Avenue Cambridge, MA 02139-4301 From hongzhi.song at windriver.com Fri Apr 26 01:27:04 2019 From: hongzhi.song at windriver.com (Hongzhi, Song) Date: Fri, 26 Apr 2019 09:27:04 +0800 Subject: [Gluster-users] Download server is crashed? In-Reply-To: References: <9d67dde1-f0f5-f304-e554-997120af4c72@windriver.com> Message-ID: Yeah, it's works for me. Thanks very much. --Hongzhi On 4/26/19 3:45 AM, Vijay Bellur wrote: > > > On Wed, Apr 24, 2019 at 7:36 PM Hongzhi, Song > > wrote: > > Hi all, > > I try to download .tar.gz from > https://download.gluster.org/pub/gluster/glusterfs/LATEST/glusterfs-6.1.tar.gz > > > and > https://bits.gluster.org/pub/gluster/glusterfs/src/glusterfs-6.1.tar.gz > > > But connection is always been closed. Anyone else meet this issue? > > > > Checked now and am able to download successfully. Do you still observe > the problem? > > Thanks, > Vijay -------------- next part -------------- An HTML attachment was scrubbed... URL: From jim.kinney at gmail.com Fri Apr 26 03:35:43 2019 From: jim.kinney at gmail.com (Jim Kinney) Date: Thu, 25 Apr 2019 23:35:43 -0400 Subject: [Gluster-users] Expanding brick size in glusterfs 3.7.11 In-Reply-To: <29119358-18e7-caab-2535-e2530255fa75@mit.edu> References: <29119358-18e7-caab-2535-e2530255fa75@mit.edu> Message-ID: I've expanded bricks using lvm and there was no problems at all with gluster seeing the change. The expansion was performed basically simultaneously on both existing bricks of a replica. I would expect the raid expansion to behave similarly. On April 25, 2019 9:05:45 PM EDT, Pat Haley wrote: > >Hi, > >Last summer we added a new brick to our gluster volume (running >glusterfs 3.7.11).? The new brick was a new server with with 12 of 24 >disk bays filled (we couldn't afford to fill them all at the time).? >These 12 disks are managed in a hardware RAID-6.? We have recently been > >able to purchase another 12 disks.? We would like to just add these new > >disks to the existing hardware RAID and thus expand the size of the >brick.? If we can successfully add them to the hardware RAID like this, > >will gluster have any problems with the expanded brick size? > >-- > >-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- >Pat Haley Email: phaley at mit.edu >Center for Ocean Engineering Phone: (617) 253-6824 >Dept. 
of Mechanical Engineering Fax: (617) 253-8125 >MIT, Room 5-213 http://web.mit.edu/phaley/www/ >77 Massachusetts Avenue >Cambridge, MA 02139-4301 > >_______________________________________________ >Gluster-users mailing list >Gluster-users at gluster.org >https://lists.gluster.org/mailman/listinfo/gluster-users -- Sent from my Android device with K-9 Mail. All tyopes are thumb related and reflect authenticity. -------------- next part -------------- An HTML attachment was scrubbed... URL: From pascal.suter at dalco.ch Fri Apr 26 06:22:22 2019 From: pascal.suter at dalco.ch (Pascal Suter) Date: Fri, 26 Apr 2019 08:22:22 +0200 Subject: [Gluster-users] Expanding brick size in glusterfs 3.7.11 In-Reply-To: References: <29119358-18e7-caab-2535-e2530255fa75@mit.edu> Message-ID: <5091be7d-8af9-dd7d-a258-2b2a981f2abf@dalco.ch> I may add to that that i have expanded linux filesystems (xfs and ext4) both via LVM and some by adding disks to a hardware raid. from the OS point of view it does not make a difference, the procedure once the block device on which the filesytem resides is expanded is prettymuch the same and so far always worked like a charm. one word of caution though: i've just recently had a case with a raid 6 across 12 disks (1TB, a 5 year old RAID array) where during a planned power outage a disk failed, when turnging the storage back on, a second failed right after that and the third failed during rebuild. luckily this was a retired server used for backup only, so no harm done.. but this just shows us, that under the "ritght" circumstances, multi disk failures are possible. the more disks you have in your raidset the higher the chance of a disk failure.. by doubling the amount of disks in your raidset you double the chance of a disk failure and therefore a double or tripple disk failure as well. long story short.. i'd consider creating a second raid acorss your 12 new disks and adding this as a second brick to gluster storage.. that's what gluster's for after all .. to scale your storage :) in the case of raid 6 you will loose the capacity of two disks but you will gain alot in terms of redundancy and dataprotection. also you will not have the performance impact of the raid expansion.. this is usually a rather long process which will eat a lot of your performance while it's ongoing. of course, if you have mirrored bricks, that's a different story, but i assume you don't. cheers Pascal On 26.04.19 05:35, Jim Kinney wrote: > I've expanded bricks using lvm and there was no problems at all with > gluster seeing the change. The expansion was performed basically > simultaneously on both existing bricks of a replica. I would expect > the raid expansion to behave similarly. > > On April 25, 2019 9:05:45 PM EDT, Pat Haley wrote: > > Hi, > > Last summer we added a new brick to our gluster volume (running > glusterfs 3.7.11).? The new brick was a new server with with 12 of 24 > disk bays filled (we couldn't afford to fill them all at the time). > These 12 disks are managed in a hardware RAID-6.? We have recently been > able to purchase another 12 disks.? We would like to just add these new > disks to the existing hardware RAID and thus expand the size of the > brick.? If we can successfully add them to the hardware RAID like this, > will gluster have any problems with the expanded brick size? > > -- > ------------------------------------------------------------------------ > Pat Haley Email: phaley at mit.edu > Center for Ocean Engineering Phone: (617) 253-6824 > Dept. 
of Mechanical Engineering Fax: (617) 253-8125 > MIT, Room 5-213http://web.mit.edu/phaley/www/ > 77 Massachusetts Avenue > Cambridge, MA 02139-4301 > ------------------------------------------------------------------------ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users > > > -- > Sent from my Android device with K-9 Mail. All tyopes are thumb > related and reflect authenticity. > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From aspandey at redhat.com Fri Apr 26 06:44:07 2019 From: aspandey at redhat.com (Ashish Pandey) Date: Fri, 26 Apr 2019 02:44:07 -0400 (EDT) Subject: [Gluster-users] Expanding brick size in glusterfs 3.7.11 In-Reply-To: <5091be7d-8af9-dd7d-a258-2b2a981f2abf@dalco.ch> References: <29119358-18e7-caab-2535-e2530255fa75@mit.edu> <5091be7d-8af9-dd7d-a258-2b2a981f2abf@dalco.ch> Message-ID: <626137912.15064407.1556261047198.JavaMail.zimbra@redhat.com> Pat, I would like to see the final configuration of your gluster volume after you added bricks on new node. You mentioned that - "The new brick was a new server with with 12 of 24 disk bays filled (we couldn't afford to fill them all at the time). These 12 disks are managed in a hardware RAID-6." If all the new bricks are on one new node then probably that is not a good situation to be in . @Pascal, I agree with your suggestion.. "long story short.. i'd consider creating a second raid acorss your 12 new disks and adding this as a second brick to gluster storage.. that's what gluster's for after all .. to scale your storage :) in the case of raid 6 you will loose the capacity of two disks but you will gain alot in terms of redundancy and dataprotection." --- Ashish ----- Original Message ----- From: "Pascal Suter" To: gluster-users at gluster.org Sent: Friday, April 26, 2019 11:52:22 AM Subject: Re: [Gluster-users] Expanding brick size in glusterfs 3.7.11 I may add to that that i have expanded linux filesystems (xfs and ext4) both via LVM and some by adding disks to a hardware raid. from the OS point of view it does not make a difference, the procedure once the block device on which the filesytem resides is expanded is prettymuch the same and so far always worked like a charm. one word of caution though: i've just recently had a case with a raid 6 across 12 disks (1TB, a 5 year old RAID array) where during a planned power outage a disk failed, when turnging the storage back on, a second failed right after that and the third failed during rebuild. luckily this was a retired server used for backup only, so no harm done.. but this just shows us, that under the "ritght" circumstances, multi disk failures are possible. the more disks you have in your raidset the higher the chance of a disk failure.. by doubling the amount of disks in your raidset you double the chance of a disk failure and therefore a double or tripple disk failure as well. long story short.. i'd consider creating a second raid acorss your 12 new disks and adding this as a second brick to gluster storage.. that's what gluster's for after all .. to scale your storage :) in the case of raid 6 you will loose the capacity of two disks but you will gain alot in terms of redundancy and dataprotection. also you will not have the performance impact of the raid expansion.. 
this is usually a rather long process which will eat a lot of your performance while it's ongoing. of course, if you have mirrored bricks, that's a different story, but i assume you don't. cheers Pascal On 26.04.19 05:35, Jim Kinney wrote: I've expanded bricks using lvm and there was no problems at all with gluster seeing the change. The expansion was performed basically simultaneously on both existing bricks of a replica. I would expect the raid expansion to behave similarly. On April 25, 2019 9:05:45 PM EDT, Pat Haley wrote:
Hi, Last summer we added a new brick to our gluster volume (running glusterfs 3.7.11).? The new brick was a new server with with 12 of 24 disk bays filled (we couldn't afford to fill them all at the time).? These 12 disks are managed in a hardware RAID-6.? We have recently been able to purchase another 12 disks.? We would like to just add these new disks to the existing hardware RAID and thus expand the size of the brick.? If we can successfully add them to the hardware RAID like this, will gluster have any problems with the expanded brick size? -- Pat Haley Email: phaley at mit.edu Center for Ocean Engineering Phone: (617) 253-6824 Dept. of Mechanical Engineering Fax: (617) 253-8125 MIT, Room 5-213 http://web.mit.edu/phaley/www/ 77 Massachusetts Avenue Cambridge, MA 02139-4301 Gluster-users mailing list Gluster-users at gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users -- Sent from my Android device with K-9 Mail. All tyopes are thumb related and reflect authenticity. _______________________________________________ Gluster-users mailing list Gluster-users at gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
_______________________________________________ Gluster-users mailing list Gluster-users at gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From atumball at redhat.com Fri Apr 26 07:54:00 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Fri, 26 Apr 2019 13:24:00 +0530 Subject: [Gluster-users] One more way to contact Gluster team - Slack (gluster.slack.com) Message-ID: Hi All, We wanted to move to Slack from IRC for our official communication channel from sometime, but couldn't as we didn't had a proper URL for us to register. 'gluster' was taken and we didn't knew who had it registered. Thanks to constant ask from Satish, Slack team has now agreed to let us use https://gluster.slack.com and I am happy to invite you all there. (Use this link to join) Please note that, it won't be a replacement for mailing list. But can be used by all developers and users for quick communication. Also note that, no information there would be 'stored' beyond 10k lines as we are using the free version of Slack. Regards, Amar -------------- next part -------------- An HTML attachment was scrubbed... URL: From mscherer at redhat.com Fri Apr 26 08:16:39 2019 From: mscherer at redhat.com (Michael Scherer) Date: Fri, 26 Apr 2019 10:16:39 +0200 Subject: [Gluster-users] [Gluster-devel] One more way to contact Gluster team - Slack (gluster.slack.com) In-Reply-To: References: Message-ID: Le vendredi 26 avril 2019 ? 13:24 +0530, Amar Tumballi Suryanarayan a ?crit : > Hi All, > > We wanted to move to Slack from IRC for our official communication > channel > from sometime, but couldn't as we didn't had a proper URL for us to > register. 'gluster' was taken and we didn't knew who had it > registered. > Thanks to constant ask from Satish, Slack team has now agreed to let > us use > https://gluster.slack.com and I am happy to invite you all there. > (Use this > link > < > https://join.slack.com/t/gluster/shared_invite/enQtNjIxMTA1MTk3MDE1LWIzZWZjNzhkYWEwNDdiZWRiOTczMTc4ZjdiY2JiMTc3MDE5YmEyZTRkNzg0MWJiMWM3OGEyMDU2MmYzMTViYTA > > > to > join) > > Please note that, it won't be a replacement for mailing list. But can > be > used by all developers and users for quick communication. Also note > that, > no information there would be 'stored' beyond 10k lines as we are > using the > free version of Slack. Aren't we concerned about the ToS of slack ? Last time I did read them, they were quite scary (like, if you use your corporate email, you engage your employer, and that wasn't the worst part). Also, to anticipate the question, my employer Legal department told me to not setup a bridge between IRC and slack, due to the said ToS. -- Michael Scherer Sysadmin, Community Infrastructure -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: This is a digitally signed message part URL: From hunter86_bg at yahoo.com Fri Apr 26 09:30:24 2019 From: hunter86_bg at yahoo.com (Strahil) Date: Fri, 26 Apr 2019 12:30:24 +0300 Subject: [Gluster-users] Expanding brick size in glusterfs 3.7.11 Message-ID: <917597svcslwraoaswvs1d5k.1556271024750@email.android.com> I have gluster bricks ontop of thin LVM and when I resized my LV, everything went live and without issues. As gluster is working ontop the File System, it relies on the information from it (in my case XFS). 
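The kind of online grow being described is normally just a couple of steps once the extra capacity is visible to LVM; a rough sketch for an XFS brick on LVM (the volume group, logical volume and mount point names here are invented):

# lvextend -L +20T /dev/vg_bricks/brick1
# xfs_growfs /bricks/brick1
# df -h /bricks/brick1

Gluster simply reports the larger filesystem once it has grown, so nothing needs to change on the gluster volume itself.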
Just make sure that bricks for replica volumes have the same size (verify with 'df /brick/path'). Best Regards, Strahil NikolovOn Apr 26, 2019 04:05, Pat Haley wrote: > > > Hi, > > Last summer we added a new brick to our gluster volume (running > glusterfs 3.7.11).? The new brick was a new server with with 12 of 24 > disk bays filled (we couldn't afford to fill them all at the time).? > These 12 disks are managed in a hardware RAID-6.? We have recently been > able to purchase another 12 disks.? We would like to just add these new > disks to the existing hardware RAID and thus expand the size of the > brick.? If we can successfully add them to the hardware RAID like this, > will gluster have any problems with the expanded brick size? > > -- > > -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- > Pat Haley????????????????????????? Email:? phaley at mit.edu > Center for Ocean Engineering?????? Phone:? (617) 253-6824 > Dept. of Mechanical Engineering??? Fax:??? (617) 253-8125 > MIT, Room 5-213??????????????????? http://web.mit.edu/phaley/www/ > 77 Massachusetts Avenue > Cambridge, MA? 02139-4301 > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users From scott.c.worthington at gmail.com Fri Apr 26 09:59:24 2019 From: scott.c.worthington at gmail.com (Scott Worthington) Date: Fri, 26 Apr 2019 04:59:24 -0500 Subject: [Gluster-users] [Gluster-devel] One more way to contact Gluster team - Slack (gluster.slack.com) In-Reply-To: References: Message-ID: Hello, are you not _BOTH_ Red Hat FTEs or contractors? On Fri, Apr 26, 2019, 3:16 AM Michael Scherer wrote: > Le vendredi 26 avril 2019 ? 13:24 +0530, Amar Tumballi Suryanarayan a > ?crit : > > Hi All, > > > > We wanted to move to Slack from IRC for our official communication > > channel > > from sometime, but couldn't as we didn't had a proper URL for us to > > register. 'gluster' was taken and we didn't knew who had it > > registered. > > Thanks to constant ask from Satish, Slack team has now agreed to let > > us use > > https://gluster.slack.com and I am happy to invite you all there. > > (Use this > > link > > < > > > https://join.slack.com/t/gluster/shared_invite/enQtNjIxMTA1MTk3MDE1LWIzZWZjNzhkYWEwNDdiZWRiOTczMTc4ZjdiY2JiMTc3MDE5YmEyZTRkNzg0MWJiMWM3OGEyMDU2MmYzMTViYTA > > > > > to > > join) > > > > Please note that, it won't be a replacement for mailing list. But can > > be > > used by all developers and users for quick communication. Also note > > that, > > no information there would be 'stored' beyond 10k lines as we are > > using the > > free version of Slack. > > Aren't we concerned about the ToS of slack ? Last time I did read them, > they were quite scary (like, if you use your corporate email, you > engage your employer, and that wasn't the worst part). > > Also, to anticipate the question, my employer Legal department told me > to not setup a bridge between IRC and slack, due to the said ToS. > > -- > Michael Scherer > Sysadmin, Community Infrastructure > > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From harold at redhat.com Fri Apr 26 12:20:21 2019 From: harold at redhat.com (Harold Miller) Date: Fri, 26 Apr 2019 08:20:21 -0400 Subject: [Gluster-users] [Gluster-devel] One more way to contact Gluster team - Slack (gluster.slack.com) In-Reply-To: References: Message-ID: Has Red Hat security cleared the Slack systems for confidential / customer information? If not, it will make it difficult for support to collect/answer questions. Harold Miller, Associate Manager, Red Hat, Enterprise Cloud Support Desk - US (650) 254-4346 On Fri, Apr 26, 2019 at 6:00 AM Scott Worthington < scott.c.worthington at gmail.com> wrote: > Hello, are you not _BOTH_ Red Hat FTEs or contractors? > > On Fri, Apr 26, 2019, 3:16 AM Michael Scherer wrote: > >> Le vendredi 26 avril 2019 ? 13:24 +0530, Amar Tumballi Suryanarayan a >> ?crit : >> > Hi All, >> > >> > We wanted to move to Slack from IRC for our official communication >> > channel >> > from sometime, but couldn't as we didn't had a proper URL for us to >> > register. 'gluster' was taken and we didn't knew who had it >> > registered. >> > Thanks to constant ask from Satish, Slack team has now agreed to let >> > us use >> > https://gluster.slack.com and I am happy to invite you all there. >> > (Use this >> > link >> > < >> > >> https://join.slack.com/t/gluster/shared_invite/enQtNjIxMTA1MTk3MDE1LWIzZWZjNzhkYWEwNDdiZWRiOTczMTc4ZjdiY2JiMTc3MDE5YmEyZTRkNzg0MWJiMWM3OGEyMDU2MmYzMTViYTA >> > > >> > to >> > join) >> > >> > Please note that, it won't be a replacement for mailing list. But can >> > be >> > used by all developers and users for quick communication. Also note >> > that, >> > no information there would be 'stored' beyond 10k lines as we are >> > using the >> > free version of Slack. >> >> Aren't we concerned about the ToS of slack ? Last time I did read them, >> they were quite scary (like, if you use your corporate email, you >> engage your employer, and that wasn't the worst part). >> >> Also, to anticipate the question, my employer Legal department told me >> to not setup a bridge between IRC and slack, due to the said ToS. >> >> -- >> Michael Scherer >> Sysadmin, Community Infrastructure >> >> >> >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -- HAROLD MILLER ASSOCIATE MANAGER, ENTERPRISE CLOUD SUPPORT Red Hat Harold at RedHat.com T: (650)-254-4346 TRIED. TESTED. TRUSTED. -------------- next part -------------- An HTML attachment was scrubbed... URL: From kkeithle at redhat.com Fri Apr 26 12:57:43 2019 From: kkeithle at redhat.com (Kaleb Keithley) Date: Fri, 26 Apr 2019 08:57:43 -0400 Subject: [Gluster-users] [Gluster-devel] One more way to contact Gluster team - Slack (gluster.slack.com) In-Reply-To: References: Message-ID: On Fri, Apr 26, 2019 at 8:21 AM Harold Miller wrote: > Has Red Hat security cleared the Slack systems for confidential / customer > information? > > If not, it will make it difficult for support to collect/answer questions. > I'm pretty sure Amar meant as a replacement for the freenode #gluster and #gluster-dev channels, given that he sent this to the public gluster mailing lists @gluster.org. Nobody should have even been posting confidential and/or customer information to any of those lists or channels. And AFAIK nobody ever has. 
Amar, would you like to clarify which IRC channels you meant? > Harold Miller, Associate Manager, > Red Hat, Enterprise Cloud Support > Desk - US (650) 254-4346 > > > > On Fri, Apr 26, 2019 at 6:00 AM Scott Worthington < > scott.c.worthington at gmail.com> wrote: > >> Hello, are you not _BOTH_ Red Hat FTEs or contractors? >> >> On Fri, Apr 26, 2019, 3:16 AM Michael Scherer >> wrote: >> >>> Le vendredi 26 avril 2019 ? 13:24 +0530, Amar Tumballi Suryanarayan a >>> ?crit : >>> > Hi All, >>> > >>> > We wanted to move to Slack from IRC for our official communication >>> > channel >>> > from sometime, but couldn't as we didn't had a proper URL for us to >>> > register. 'gluster' was taken and we didn't knew who had it >>> > registered. >>> > Thanks to constant ask from Satish, Slack team has now agreed to let >>> > us use >>> > https://gluster.slack.com and I am happy to invite you all there. >>> > (Use this >>> > link >>> > < >>> > >>> https://join.slack.com/t/gluster/shared_invite/enQtNjIxMTA1MTk3MDE1LWIzZWZjNzhkYWEwNDdiZWRiOTczMTc4ZjdiY2JiMTc3MDE5YmEyZTRkNzg0MWJiMWM3OGEyMDU2MmYzMTViYTA >>> > > >>> > to >>> > join) >>> > >>> > Please note that, it won't be a replacement for mailing list. But can >>> > be >>> > used by all developers and users for quick communication. Also note >>> > that, >>> > no information there would be 'stored' beyond 10k lines as we are >>> > using the >>> > free version of Slack. >>> >>> Aren't we concerned about the ToS of slack ? Last time I did read them, >>> they were quite scary (like, if you use your corporate email, you >>> engage your employer, and that wasn't the worst part). >>> >>> Also, to anticipate the question, my employer Legal department told me >>> to not setup a bridge between IRC and slack, due to the said ToS. >>> >>> -- >>> Michael Scherer >>> Sysadmin, Community Infrastructure >>> >>> >>> >>> _______________________________________________ >>> Gluster-users mailing list >>> Gluster-users at gluster.org >>> https://lists.gluster.org/mailman/listinfo/gluster-users >> >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users > > > > -- > > HAROLD MILLER > > ASSOCIATE MANAGER, ENTERPRISE CLOUD SUPPORT > > Red Hat > > > > Harold at RedHat.com T: (650)-254-4346 > > TRIED. TESTED. TRUSTED. > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From atumball at redhat.com Fri Apr 26 13:20:09 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Fri, 26 Apr 2019 18:50:09 +0530 Subject: [Gluster-users] [Gluster-devel] One more way to contact Gluster team - Slack (gluster.slack.com) In-Reply-To: References: Message-ID: On Fri, Apr 26, 2019 at 6:27 PM Kaleb Keithley wrote: > > > On Fri, Apr 26, 2019 at 8:21 AM Harold Miller wrote: > >> Has Red Hat security cleared the Slack systems for confidential / >> customer information? >> >> If not, it will make it difficult for support to collect/answer questions. >> > > I'm pretty sure Amar meant as a replacement for the freenode #gluster and > #gluster-dev channels, given that he sent this to the public gluster > mailing lists @gluster.org. Nobody should have even been posting > confidential and/or customer information to any of those lists or channels. > And AFAIK nobody ever has. 
> > Yep, I am only talking about IRC (from freenode, #gluster, #gluster-dev etc). Also, I am not saying we are 'replacing IRC'. Gluster as a project started in pre-Slack era, and we have many users who prefer to stay in IRC. So, for now, no pressure to make a statement calling Slack channel as a 'Replacement' to IRC. > Amar, would you like to clarify which IRC channels you meant? > > Thanks Kaleb. I was bit confused on why the concern of it came up in this group. > >> On Fri, Apr 26, 2019 at 6:00 AM Scott Worthington < >> scott.c.worthington at gmail.com> wrote: >> >>> Hello, are you not _BOTH_ Red Hat FTEs or contractors? >>> >>> Yes! but come from very different internal teams. Michael supports Gluster (the project) team's Infrastructure needs, and has valid concerns from his perspective :-) I, on the other hand, bother more about code, users, and how to make sure we are up-to-date with other technologies and communities, from the engineering view point. > On Fri, Apr 26, 2019, 3:16 AM Michael Scherer wrote: >>> >>>> Le vendredi 26 avril 2019 ? 13:24 +0530, Amar Tumballi Suryanarayan a >>>> ?crit : >>>> > Hi All, >>>> > >>>> > We wanted to move to Slack from IRC for our official communication >>>> > channel >>>> > from sometime, but couldn't as we didn't had a proper URL for us to >>>> > register. 'gluster' was taken and we didn't knew who had it >>>> > registered. >>>> > Thanks to constant ask from Satish, Slack team has now agreed to let >>>> > us use >>>> > https://gluster.slack.com and I am happy to invite you all there. >>>> > (Use this >>>> > link >>>> > < >>>> > >>>> https://join.slack.com/t/gluster/shared_invite/enQtNjIxMTA1MTk3MDE1LWIzZWZjNzhkYWEwNDdiZWRiOTczMTc4ZjdiY2JiMTc3MDE5YmEyZTRkNzg0MWJiMWM3OGEyMDU2MmYzMTViYTA >>>> > > >>>> > to >>>> > join) >>>> > >>>> > Please note that, it won't be a replacement for mailing list. But can >>>> > be >>>> > used by all developers and users for quick communication. Also note >>>> > that, >>>> > no information there would be 'stored' beyond 10k lines as we are >>>> > using the >>>> > free version of Slack. >>>> >>>> Aren't we concerned about the ToS of slack ? Last time I did read them, >>>> they were quite scary (like, if you use your corporate email, you >>>> engage your employer, and that wasn't the worst part). >>>> >>>> Also, to anticipate the question, my employer Legal department told me >>>> to not setup a bridge between IRC and slack, due to the said ToS. >>>> >>>> Again, re-iterating here. Not planning to use any bridges from IRC to Slack. I re-read the Slack API Terms and condition. And it makes sense. They surely don't want us to build another slack, or abuse slack with too many API requests made for collecting logs. Currently, to start with, we are not adding any bots (other than github bot). Hopefully, that will keep us under proper usage guidelines. -Amar > -- >>>> Michael Scherer >>>> Sysadmin, Community Infrastructure >>>> >>>> >>>> >>>> _______________________________________________ >>>> Gluster-users mailing list >>>> Gluster-users at gluster.org >>>> https://lists.gluster.org/mailman/listinfo/gluster-users >>> >>> _______________________________________________ >>> Gluster-users mailing list >>> Gluster-users at gluster.org >>> https://lists.gluster.org/mailman/listinfo/gluster-users >> >> >> >> -- >> >> HAROLD MILLER >> >> ASSOCIATE MANAGER, ENTERPRISE CLOUD SUPPORT >> >> Red Hat >> >> >> >> Harold at RedHat.com T: (650)-254-4346 >> >> TRIED. TESTED. TRUSTED. 
>> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users > > -- Amar Tumballi (amarts) -------------- next part -------------- An HTML attachment was scrubbed... URL: From harold at redhat.com Fri Apr 26 13:26:35 2019 From: harold at redhat.com (Harold Miller) Date: Fri, 26 Apr 2019 09:26:35 -0400 Subject: [Gluster-users] [Gluster-devel] One more way to contact Gluster team - Slack (gluster.slack.com) In-Reply-To: References: Message-ID: Amar, Thanks for the clarification. I'll go climb back into my cave. :) Harold On Fri, Apr 26, 2019 at 9:20 AM Amar Tumballi Suryanarayan < atumball at redhat.com> wrote: > > > On Fri, Apr 26, 2019 at 6:27 PM Kaleb Keithley > wrote: > >> >> >> On Fri, Apr 26, 2019 at 8:21 AM Harold Miller wrote: >> >>> Has Red Hat security cleared the Slack systems for confidential / >>> customer information? >>> >>> If not, it will make it difficult for support to collect/answer >>> questions. >>> >> >> I'm pretty sure Amar meant as a replacement for the freenode #gluster and >> #gluster-dev channels, given that he sent this to the public gluster >> mailing lists @gluster.org. Nobody should have even been posting >> confidential and/or customer information to any of those lists or channels. >> And AFAIK nobody ever has. >> >> > Yep, I am only talking about IRC (from freenode, #gluster, #gluster-dev > etc). Also, I am not saying we are 'replacing IRC'. Gluster as a project > started in pre-Slack era, and we have many users who prefer to stay in IRC. > So, for now, no pressure to make a statement calling Slack channel as a > 'Replacement' to IRC. > > >> Amar, would you like to clarify which IRC channels you meant? >> >> > > Thanks Kaleb. I was bit confused on why the concern of it came up in this > group. > > > >> >>> On Fri, Apr 26, 2019 at 6:00 AM Scott Worthington < >>> scott.c.worthington at gmail.com> wrote: >>> >>>> Hello, are you not _BOTH_ Red Hat FTEs or contractors? >>>> >>>> > Yes! but come from very different internal teams. > > Michael supports Gluster (the project) team's Infrastructure needs, and > has valid concerns from his perspective :-) I, on the other hand, bother > more about code, users, and how to make sure we are up-to-date with other > technologies and communities, from the engineering view point. > > >> On Fri, Apr 26, 2019, 3:16 AM Michael Scherer >>>> wrote: >>>> >>>>> Le vendredi 26 avril 2019 ? 13:24 +0530, Amar Tumballi Suryanarayan a >>>>> ?crit : >>>>> > Hi All, >>>>> > >>>>> > We wanted to move to Slack from IRC for our official communication >>>>> > channel >>>>> > from sometime, but couldn't as we didn't had a proper URL for us to >>>>> > register. 'gluster' was taken and we didn't knew who had it >>>>> > registered. >>>>> > Thanks to constant ask from Satish, Slack team has now agreed to let >>>>> > us use >>>>> > https://gluster.slack.com and I am happy to invite you all there. >>>>> > (Use this >>>>> > link >>>>> > < >>>>> > >>>>> https://join.slack.com/t/gluster/shared_invite/enQtNjIxMTA1MTk3MDE1LWIzZWZjNzhkYWEwNDdiZWRiOTczMTc4ZjdiY2JiMTc3MDE5YmEyZTRkNzg0MWJiMWM3OGEyMDU2MmYzMTViYTA >>>>> > > >>>>> > to >>>>> > join) >>>>> > >>>>> > Please note that, it won't be a replacement for mailing list. But can >>>>> > be >>>>> > used by all developers and users for quick communication. Also note >>>>> > that, >>>>> > no information there would be 'stored' beyond 10k lines as we are >>>>> > using the >>>>> > free version of Slack. 
>>>>> >>>>> Aren't we concerned about the ToS of slack ? Last time I did read them, >>>>> they were quite scary (like, if you use your corporate email, you >>>>> engage your employer, and that wasn't the worst part). >>>>> >>>>> Also, to anticipate the question, my employer Legal department told me >>>>> to not setup a bridge between IRC and slack, due to the said ToS. >>>>> >>>>> > Again, re-iterating here. Not planning to use any bridges from IRC to > Slack. I re-read the Slack API Terms and condition. And it makes sense. > They surely don't want us to build another slack, or abuse slack with too > many API requests made for collecting logs. > > Currently, to start with, we are not adding any bots (other than github > bot). Hopefully, that will keep us under proper usage guidelines. > > -Amar > > >> -- >>>>> Michael Scherer >>>>> Sysadmin, Community Infrastructure >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> Gluster-users mailing list >>>>> Gluster-users at gluster.org >>>>> https://lists.gluster.org/mailman/listinfo/gluster-users >>>> >>>> _______________________________________________ >>>> Gluster-users mailing list >>>> Gluster-users at gluster.org >>>> https://lists.gluster.org/mailman/listinfo/gluster-users >>> >>> >>> >>> -- >>> >>> HAROLD MILLER >>> >>> ASSOCIATE MANAGER, ENTERPRISE CLOUD SUPPORT >>> >>> Red Hat >>> >>> >>> >>> Harold at RedHat.com T: (650)-254-4346 >>> >>> TRIED. TESTED. TRUSTED. >>> _______________________________________________ >>> Gluster-users mailing list >>> Gluster-users at gluster.org >>> https://lists.gluster.org/mailman/listinfo/gluster-users >> >> > > -- > Amar Tumballi (amarts) > -- HAROLD MILLER ASSOCIATE MANAGER, ENTERPRISE CLOUD SUPPORT Red Hat Harold at RedHat.com T: (650)-254-4346 TRIED. TESTED. TRUSTED. -------------- next part -------------- An HTML attachment was scrubbed... URL: From shonrs at redhat.com Fri Apr 26 13:29:23 2019 From: shonrs at redhat.com (Shon Stephens) Date: Fri, 26 Apr 2019 09:29:23 -0400 Subject: [Gluster-users] Gluster 5 Geo-replication Guide Message-ID: Dear All, Is there a good, step by step guide for setting up geo-replication with Glusterfs 5? The docs are a difficult to decipher read, for me, and seem more feature guide than actual instruction. Thank you, Shon -- SHON STEPHENS SENIOR CONSULTANT Red Hat T: 571-781-0787 M: 703-297-0682 TRIED. TESTED. TRUSTED. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mscherer at redhat.com Fri Apr 26 14:43:53 2019 From: mscherer at redhat.com (Michael Scherer) Date: Fri, 26 Apr 2019 16:43:53 +0200 Subject: [Gluster-users] [Gluster-devel] One more way to contact Gluster team - Slack (gluster.slack.com) In-Reply-To: References: Message-ID: <52de190db15f5c1225029b6785f266d72751657e.camel@redhat.com> Le vendredi 26 avril 2019 ? 04:59 -0500, Scott Worthington a ?crit : > Hello, are you not _BOTH_ Red Hat FTEs or contractors? We do, that was just a turn of phrase to be more generic. > On Fri, Apr 26, 2019, 3:16 AM Michael Scherer > wrote: > > > Le vendredi 26 avril 2019 ? 13:24 +0530, Amar Tumballi Suryanarayan > > a > > ?crit : > > > Hi All, > > > > > > We wanted to move to Slack from IRC for our official > > > communication > > > channel > > > from sometime, but couldn't as we didn't had a proper URL for us > > > to > > > register. 'gluster' was taken and we didn't knew who had it > > > registered. 
> > > Thanks to constant ask from Satish, Slack team has now agreed to > > > let > > > us use > > > https://gluster.slack.com and I am happy to invite you all there. > > > (Use this > > > link > > > < > > > > > > > https://join.slack.com/t/gluster/shared_invite/enQtNjIxMTA1MTk3MDE1LWIzZWZjNzhkYWEwNDdiZWRiOTczMTc4ZjdiY2JiMTc3MDE5YmEyZTRkNzg0MWJiMWM3OGEyMDU2MmYzMTViYTA > > > > > > > > > > to > > > join) > > > > > > Please note that, it won't be a replacement for mailing list. But > > > can > > > be > > > used by all developers and users for quick communication. Also > > > note > > > that, > > > no information there would be 'stored' beyond 10k lines as we are > > > using the > > > free version of Slack. > > > > Aren't we concerned about the ToS of slack ? Last time I did read > > them, > > they were quite scary (like, if you use your corporate email, you > > engage your employer, and that wasn't the worst part). > > > > Also, to anticipate the question, my employer Legal department told > > me > > to not setup a bridge between IRC and slack, due to the said ToS. > > > > -- > > Michael Scherer > > Sysadmin, Community Infrastructure > > > > > > > > _______________________________________________ > > Gluster-users mailing list > > Gluster-users at gluster.org > > https://lists.gluster.org/mailman/listinfo/gluster-users -- Michael Scherer Sysadmin, Community Infrastructure -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: This is a digitally signed message part URL: From amye at redhat.com Fri Apr 26 15:10:42 2019 From: amye at redhat.com (Amye Scavarda) Date: Fri, 26 Apr 2019 08:10:42 -0700 Subject: [Gluster-users] Gluster Monthly Newsletter, April 2019 Message-ID: Gluster Monthly Newsletter, April 2019 Upcoming Community Happy Hour at Red Hat Summit! Tue, May 7, 2019, 6:30 PM ? 7:30 PM EDT https://cephandglusterhappyhour_rhsummit.eventbrite.com has all the details. Gluster 7 Roadmap Discussion kicked off for our 7 roadmap on the mailing lists, see [Gluster-users] GlusterFS v7.0 (and v8.0) roadmap discussion https://lists.gluster.org/pipermail/gluster-users/2019-March/036139.html for more details. Community Survey Feedback is available! Come see how other people are using Gluster and what features they?d like! 
https://www.gluster.org/community-survey-feedback-2019/ Gluster Friday Five: Find our Friday Five podcast at https://www.youtube.com/channel/UCfilWh0JA5NfCjbqq1vsBVA ---- Contributors Top Contributing Companies: Red Hat, Comcast, DataLab, Gentoo Linux, Facebook, BioDec, Samsung, Etersoft Top Contributors in April: Atin Mukherjee, Sanju Rakonde, Kotresh HR, Pranith Kumar Karampuri, Kinglong Mee ---- Noteworthy Threads: [Gluster-users] [Gluster-devel] Upgrade testing to gluster 6 https://lists.gluster.org/pipermail/gluster-users/2019-April/036229.html [Gluster-users] Heketi v9.0.0 available for download https://lists.gluster.org/pipermail/gluster-users/2019-April/036285.html [Gluster-users] Proposal: Changes in Gluster Community meetings https://lists.gluster.org/pipermail/gluster-users/2019-April/036337.html [Gluster-users] XFS, WORM and the Year-2038 Problem https://lists.gluster.org/pipermail/gluster-users/2019-April/036356.html [Gluster-users] Community Happy Hour at Red Hat Summit https://lists.gluster.org/pipermail/gluster-users/2019-April/036422.html [Gluster-devel] Backporting important fixes in release branches https://lists.gluster.org/pipermail/gluster-devel/2019-April/056041.html [Gluster-devel] BZ updates https://lists.gluster.org/pipermail/gluster-devel/2019-April/056153.html [Gluster-users] One more way to contact Gluster team - Slack (gluster.slack.com) https://lists.gluster.org/pipermail/gluster-users/2019-April/036440.html Events: Red Hat Summit, May 4-6, 2019 - https://www.redhat.com/en/summit/2019 Open Source Summit and KubeCon + CloudNativeCon Shanghai, June 24-26, 2019 https://www.lfasiallc.com/events/kubecon-cloudnativecon-china-2019/ DevConf India, August 2- 3 2019, Bengaluru - https://devconf.info/in DevConf USA, August 15-17, 2019, Boston - https://devconf.info/us/ -- Amye Scavarda | amye at redhat.com | Gluster Community Lead From amye at redhat.com Fri Apr 26 15:25:22 2019 From: amye at redhat.com (Amye Scavarda) Date: Fri, 26 Apr 2019 08:25:22 -0700 Subject: [Gluster-users] Signing off from Gluster Message-ID: It's been a delight to work with this community for the past few years, and as of today, I'm stepping away for a new opportunity. Amar Tumballi has already taken up several of the community initiatives that were in flight. You've already seen his work in creating new communication channels and meeting times, and I look forward to seeing what he does in the future! Mike Perez, the Ceph Commmunity Lead, has graciously volunteered to support the community in the interim, and he's copied on this message as well. Stormy Peters, the Community Team Manager is also available for questions. Thank you all! -- amye -- Amye Scavarda | amye at redhat.com | Gluster Community Lead From snowmailer at gmail.com Fri Apr 26 15:33:07 2019 From: snowmailer at gmail.com (Martin Toth) Date: Fri, 26 Apr 2019 17:33:07 +0200 Subject: [Gluster-users] Signing off from Gluster In-Reply-To: References: Message-ID: <0842FB62-3DBD-4012-B3BF-8C303DB36E00@gmail.com> Thanks for all. We will miss you! BR! > On 26 Apr 2019, at 17:25, Amye Scavarda wrote: > > It's been a delight to work with this community for the past few > years, and as of today, I'm stepping away for a new opportunity. Amar > Tumballi has already taken up several of the community initiatives > that were in flight. You've already seen his work in creating new > communication channels and meeting times, and I look forward to seeing > what he does in the future! 
> > Mike Perez, the Ceph Commmunity Lead, has graciously volunteered to > support the community in the interim, and he's copied on this message > as well. Stormy Peters, the Community Team Manager is also available > for questions. > > Thank you all! > -- amye > > -- > Amye Scavarda | amye at redhat.com | Gluster Community Lead > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users From hetz at hetz.biz Sat Apr 27 15:05:32 2019 From: hetz at hetz.biz (Hetz Ben Hamo) Date: Sat, 27 Apr 2019 18:05:32 +0300 Subject: [Gluster-users] puzzled about calculation Message-ID: Hi, I've looked at a YouTube video about Gluster volumes creation. The video is here: https://www.youtube.com/watch?v=9SRsvFZZa5E One thing that is weird to me is this: the guy creates a volume of replica 2, where each brick is 66TB (see at approx 12:10), yet at the end on windows where it shows the shares - each share is 132TB... Shouldn't each share be 66TB or am I missing something? Thanks -------------- next part -------------- An HTML attachment was scrubbed... URL: From amukherj at redhat.com Sat Apr 27 16:38:40 2019 From: amukherj at redhat.com (Atin Mukherjee) Date: Sat, 27 Apr 2019 22:08:40 +0530 Subject: [Gluster-users] puzzled about calculation In-Reply-To: References: Message-ID: On Sat, 27 Apr 2019 at 20:36, Hetz Ben Hamo wrote: > Hi, > > I've looked at a YouTube video about Gluster volumes creation. The video > is here: > https://www.youtube.com/watch?v=9SRsvFZZa5E > > One thing that is weird to me is this: the guy creates a volume of replica > 2, where each brick is 66TB (see at approx 12:10), yet at the end on > windows where it shows the shares - each share is 132TB... > Its a 2x2 volume which means it will comprise of total of 66X4 TB of space where 66X2=132 TB will be the size of actual storage space offered to the unified namespace. > Shouldn't each share be 66TB or am I missing something? > > Thanks > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -- - Atin (atinm) -------------- next part -------------- An HTML attachment was scrubbed... URL: From hetz at hetz.biz Sat Apr 27 16:40:04 2019 From: hetz at hetz.biz (Hetz Ben Hamo) Date: Sat, 27 Apr 2019 19:40:04 +0300 Subject: [Gluster-users] puzzled about calculation In-Reply-To: References: Message-ID: Ok, thanks for the clarification. ;) On Sat, Apr 27, 2019 at 7:38 PM Atin Mukherjee wrote: > > > On Sat, 27 Apr 2019 at 20:36, Hetz Ben Hamo wrote: > >> Hi, >> >> I've looked at a YouTube video about Gluster volumes creation. The video >> is here: >> https://www.youtube.com/watch?v=9SRsvFZZa5E >> >> One thing that is weird to me is this: the guy creates a volume of >> replica 2, where each brick is 66TB (see at approx 12:10), yet at the end >> on windows where it shows the shares - each share is 132TB... >> > > Its a 2x2 volume which means it will comprise of total of 66X4 TB of space > where 66X2=132 TB will be the size of actual storage space offered to the > unified namespace. > > >> Shouldn't each share be 66TB or am I missing something? >> >> Thanks >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users > > -- > - Atin (atinm) > -------------- next part -------------- An HTML attachment was scrubbed... 
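To make the arithmetic concrete, here is a hypothetical 2 x 2 distributed-replicate layout built from four 66 TB bricks; the host and brick names are invented purely for illustration:

# gluster volume create bigvol replica 2 \
    host1:/bricks/b1 host2:/bricks/b1 host3:/bricks/b1 host4:/bricks/b1

host1/host2 form one replica pair and host3/host4 the other, and the two pairs are distributed over. Raw capacity is therefore 4 x 66 TB = 264 TB, while the unified namespace offers 2 x 66 TB = 132 TB, which is the 132 TB visible on the Windows share. (Newer releases will also warn that replica 2 is prone to split-brain.)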
URL: From revirii at googlemail.com Mon Apr 29 07:17:06 2019 From: revirii at googlemail.com (Hu Bert) Date: Mon, 29 Apr 2019 09:17:06 +0200 Subject: [Gluster-users] Upgrade 5.5 -> 5.6: network traffic bug fixed? In-Reply-To: References: Message-ID: Good morning, back in office... ;-) i reactivated quick-read on both volumes and watched traffic, which now looks normal. Well, i did umount/mount of both gluster volumes after installing the upgrades 5.5 -> 5.6, but it seems that this wasn't enough? And the changes took place after doing a reboot (kernel update...) of all clients? Maybe some processes were still running? I'll keep watching network traffic and report if i see that it's higher than usual. Best regards, Hubert Am Di., 23. Apr. 2019 um 15:34 Uhr schrieb Poornima Gurusiddaiah : > > Hi, > > Thank you for the update, sorry for the delay. > > I did some more tests, but couldn't see the behaviour of spiked network bandwidth usage when quick-read is on. After upgrading, have you remounted the clients? As in the fix will not be effective until the process is restarted. > If you have already restarted the client processes, then there must be something related to workload in the live system that is triggering a bug in quick-read. Would need wireshark capture if possible, to debug further. > > Regards, > Poornima > > On Tue, Apr 16, 2019 at 6:25 PM Hu Bert wrote: >> >> Hi Poornima, >> >> thx for your efforts. I made a couple of tests and the results are the >> same, so the options are not related. Anyway, i'm not able to >> reproduce the problem on my testing system, although the volume >> options are the same. >> >> About 1.5 hours ago i set performance.quick-read to on again and >> watched: load/iowait went up (not bad at the moment, little traffic), >> but network traffic went up - from <20 MBit/s up to 160 MBit/s. After >> deactivating quick-read traffic dropped to < 20 MBit/s again. >> >> munin graph: https://abload.de/img/network-client4s0kle.png >> >> The 2nd peak is from the last test. >> >> >> Thx, >> Hubert >> >> Am Di., 16. Apr. 2019 um 09:43 Uhr schrieb Hu Bert : >> > >> > In my first test on my testing setup the traffic was on a normal >> > level, so i thought i was "safe". But on my live system the network >> > traffic was a multiple of the traffic one would expect. >> > performance.quick-read was enabled in both, the only difference in the >> > volume options between live and testing are: >> > >> > performance.read-ahead: testing on, live off >> > performance.io-cache: testing on, live off >> > >> > I ran another test on my testing setup, deactivated both and copied 9 >> > GB of data. Now the traffic went up as well, from before ~9-10 MBit/s >> > up to 100 MBit/s with both options off. Does performance.quick-read >> > require one of those options set to 'on'? >> > >> > I'll start another test shortly, and activate on of those 2 options, >> > maybe there's a connection between those 3 options? >> > >> > >> > Best Regards, >> > Hubert >> > >> > Am Di., 16. Apr. 2019 um 08:57 Uhr schrieb Poornima Gurusiddaiah >> > : >> > > >> > > Thank you for reporting this. I had done testing on my local setup and the issue was resolved even with quick-read enabled. Let me test it again. >> > > >> > > Regards, >> > > Poornima >> > > >> > > On Mon, Apr 15, 2019 at 12:25 PM Hu Bert wrote: >> > >> >> > >> fyi: after setting performance.quick-read to off network traffic >> > >> dropped to normal levels, client load/iowait back to normal as well. 
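For reference, the quick-read toggle and client remount discussed above boil down to the following steps; the volume name, server and mount point are placeholders rather than the ones from this setup:

# gluster volume set myvol performance.quick-read off

and on each client:

# umount /mnt/myvol
# mount -t glusterfs server1:/myvol /mnt/myvol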
>> > >> >> > >> client: https://abload.de/img/network-client-afterihjqi.png >> > >> server: https://abload.de/img/network-server-afterwdkrl.png >> > >> >> > >> Am Mo., 15. Apr. 2019 um 08:33 Uhr schrieb Hu Bert : >> > >> > >> > >> > Good Morning, >> > >> > >> > >> > today i updated my replica 3 setup (debian stretch) from version 5.5 >> > >> > to 5.6, as i thought the network traffic bug (#1673058) was fixed and >> > >> > i could re-activate 'performance.quick-read' again. See release notes: >> > >> > >> > >> > https://review.gluster.org/#/c/glusterfs/+/22538/ >> > >> > http://git.gluster.org/cgit/glusterfs.git/commit/?id=34a2347780c2429284f57232f3aabb78547a9795 >> > >> > >> > >> > Upgrade went fine, and then i was watching iowait and network traffic. >> > >> > It seems that the network traffic went up after upgrade and >> > >> > reactivation of performance.quick-read. Here are some graphs: >> > >> > >> > >> > network client1: https://abload.de/img/network-clientfwj1m.png >> > >> > network client2: https://abload.de/img/network-client2trkow.png >> > >> > network server: https://abload.de/img/network-serverv3jjr.png >> > >> > >> > >> > gluster volume info: https://pastebin.com/ZMuJYXRZ >> > >> > >> > >> > Just wondering if the network traffic bug really got fixed or if this >> > >> > is a new problem. I'll wait a couple of minutes and then deactivate >> > >> > performance.quick-read again, just to see if network traffic goes down >> > >> > to normal levels. >> > >> > >> > >> > >> > >> > Best regards, >> > >> > Hubert >> > >> _______________________________________________ >> > >> Gluster-users mailing list >> > >> Gluster-users at gluster.org >> > >> https://lists.gluster.org/mailman/listinfo/gluster-users From spisla80 at gmail.com Mon Apr 29 08:49:08 2019 From: spisla80 at gmail.com (David Spisla) Date: Mon, 29 Apr 2019 10:49:08 +0200 Subject: [Gluster-users] XFS, WORM and the Year-2038 Problem In-Reply-To: References: Message-ID: Hello Gluster Community, here is a possible explanation why the LastAccess date is changed at brick level resp why can XFS ever have a date of e.g. Can store 2070 in an INT32 field: It's amazing that you can set timestamps well above 2038 for the atime and these are also displayed via the usual system tools. After a while, it was observed that the values change and are mapped to the range between 1902-1969. I suspect that the initially successful setting of a well over 2038 stationary atime corresponds to an *in-memory* representation of the timestamp. This seems to allow setting over 2038. The *on-disk* representation of XFS, on the other hand, only allows the maximum value of 2038, values above are then mapped to the range 1902-1969, which is the negative number range of a signed int32. This is what I have taken from this thread: https://lkml.org/lkml/2014/6/1/240 Finally I observed, that after reboot or remount of the XFS Filesystem the in-memory representation changes to the on-disk representation. Concerning the WORM functionality it seems to be neccessary to enable the ctime feature, otherwise the information of the Retention would be lost, if the Retention date is above 2038 in case of reboot or remount of the XFS Filesystem. Regards David Spisla Am Mo., 15. Apr. 2019 um 11:51 Uhr schrieb David Spisla : > Hello Amar, > > Am Mo., 15. Apr. 2019 um 11:27 Uhr schrieb Amar Tumballi Suryanarayan < > atumball at redhat.com>: > >> >> >> On Mon, Apr 15, 2019 at 2:40 PM David Spisla wrote: >> >>> Hi folks, >>> I tried out default retention periods e.g. 
to set the Retention date to >>> 2071. When I did the WORMing, everything seems to be OK. From FUSE and also >>> at Brick-Level, the retention was set to 2071 on all nodes.Additionally I >>> enabled the storage.ctime option, so that the timestamps are stored in the >>> mdata xattr, too. But after a while I obeserved, that on Brick-Level the >>> atime (which stores the retention) was switched to 1934: >>> >>> # stat /gluster/brick1/glusterbrick/data/file3.txt >>> File: /gluster/brick1/glusterbrick/data/file3.txt >>> Size: 5 Blocks: 16 IO Block: 4096 regular file >>> Device: 830h/2096d Inode: 115 Links: 2 >>> Access: (0544/-r-xr--r--) Uid: ( 2000/ gluster) Gid: ( 2000/ >>> gluster) >>> Access: 1934-12-13 20:45:51.000000000 +0000 >>> Modify: 2019-04-10 09:50:09.000000000 +0000 >>> Change: 2019-04-10 10:13:39.703623917 +0000 >>> Birth: - >>> >>> From FUSE I get the correct atime: >>> # stat /gluster/volume1/data/file3.txt >>> File: /gluster/volume1/data/file3.txt >>> Size: 5 Blocks: 1 IO Block: 131072 regular file >>> Device: 2eh/46d Inode: 10812026387234582248 Links: 1 >>> Access: (0544/-r-xr--r--) Uid: ( 2000/ gluster) Gid: ( 2000/ >>> gluster) >>> Access: 2071-01-19 03:14:07.000000000 +0000 >>> Modify: 2019-04-10 09:50:09.000000000 +0000 >>> Change: 2019-04-10 10:13:39.705341476 +0000 >>> Birth: - >>> >>> >> From FUSE you get the time of what the clients set, as we now store >> timestamp as extended attribute, not the 'stat->st_atime'. >> >> This is called 'ctime' feature which we introduced in glusterfs-5.0, It >> helps us to support statx() variables. >> > So I am assuming that the values in the default xfs timestamps are not > important for WORM, if I use storage.ctime? > Does it work correctly with other clients like samba-vfs-glusterfs? > >> >> >>> I find out that XFS supports only 32-Bit timestamp values. So in my >>> expectation it should not be possible to set the atime to 2071. But at >>> first it was 2071 and later it was switched to 1934 due to the YEAR-2038 >>> problem. I am asking myself: >>> 1. Why it is possible to set atime on XFS greater than 2038? >>> 2. And why this atime switched to a time lower 1970 after a while? >>> >>> Regards >>> David Spisla >>> >>> _______________________________________________ >>> Gluster-users mailing list >>> Gluster-users at gluster.org >>> https://lists.gluster.org/mailman/listinfo/gluster-users >> >> >> >> -- >> Amar Tumballi (amarts) >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mscherer at redhat.com Mon Apr 29 09:20:54 2019 From: mscherer at redhat.com (Michael Scherer) Date: Mon, 29 Apr 2019 11:20:54 +0200 Subject: [Gluster-users] [Gluster-devel] One more way to contact Gluster team - Slack (gluster.slack.com) In-Reply-To: References: Message-ID: <9ab07686a7beb07344fe39f76023dee4743619c8.camel@redhat.com> Le vendredi 26 avril 2019 ? 18:50 +0530, Amar Tumballi Suryanarayan a ?crit : > On Fri, Apr 26, 2019 at 6:27 PM Kaleb Keithley > wrote: > > > > > > > On Fri, Apr 26, 2019 at 8:21 AM Harold Miller > > wrote: > > > > > Has Red Hat security cleared the Slack systems for confidential / > > > customer information? > > > > > > If not, it will make it difficult for support to collect/answer > > > questions. > > > > > > > I'm pretty sure Amar meant as a replacement for the freenode > > #gluster and > > #gluster-dev channels, given that he sent this to the public > > gluster > > mailing lists @gluster.org. 
Nobody should have even been posting > > confidential and/or customer information to any of those lists or > > channels. > > And AFAIK nobody ever has. > > > > > > Yep, I am only talking about IRC (from freenode, #gluster, #gluster- > dev etc). Also, I am not saying we are 'replacing IRC'. Gluster as a > project started in pre-Slack era, and we have many users who prefer > to stay in IRC. > So, for now, no pressure to make a statement calling Slack channel as > a 'Replacement' to IRC. > > > > Amar, would you like to clarify which IRC channels you meant? > > > > > > Thanks Kaleb. I was bit confused on why the concern of it came up in > this group. Well, unless people start to be on both irc and slack and everything, that's fragmentation. Also, since people can't access old logs (per design with the free plan of slack), but they are still here on slack servers, how is it going to work from a GDPR point of view ? Shouldn't it requires a update to the privacy policy ? -- Michael Scherer Sysadmin, Community Infrastructure -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: This is a digitally signed message part URL: From joao.bauto at neuro.fchampalimaud.org Mon Apr 29 10:25:32 2019 From: joao.bauto at neuro.fchampalimaud.org (=?UTF-8?B?Sm/Do28gQmHDunRv?=) Date: Mon, 29 Apr 2019 11:25:32 +0100 Subject: [Gluster-users] parallel-readdir prevents directories and files listing - Bug 1670382 Message-ID: Hi, I have an 8 brick distributed volume where Windows and Linux clients mount the volume via samba and headless compute servers using gluster native fuse. With parallel-readdir on, if a Windows client creates a new folder, the folder is indeed created but invisible to the Windows client. Accessing the same samba share in a Linux client, the folder is again visible and with normal behavior. The same folder is also visible when mounting via gluster native fuse. The Windows client can list existing directories and rename them while, for files, everything seems to be working fine. 
Gluster servers: CentOS 7.5 with Gluster 5.3 and Samba 4.8.3-4.el7.0.1 from @fasttrack Clients tested: Windows 10, Ubuntu 18.10, CentOS 7.5 https://bugzilla.redhat.com/show_bug.cgi?id=1670382 Volume Name: tank Type: Distribute Volume ID: 9582685f-07fa-41fd-b9fc-ebab3a6989cf Status: Started Snapshot Count: 0 Number of Bricks: 8 Transport-type: tcp Bricks: Brick1: swp-gluster-01:/tank/volume1/brick Brick2: swp-gluster-02:/tank/volume1/brick Brick3: swp-gluster-03:/tank/volume1/brick Brick4: swp-gluster-04:/tank/volume1/brick Brick5: swp-gluster-01:/tank/volume2/brick Brick6: swp-gluster-02:/tank/volume2/brick Brick7: swp-gluster-03:/tank/volume2/brick Brick8: swp-gluster-04:/tank/volume2/brick Options Reconfigured: performance.parallel-readdir: on performance.readdir-ahead: on performance.cache-invalidation: on performance.md-cache-timeout: 600 storage.batch-fsync-delay-usec: 0 performance.write-behind-window-size: 32MB performance.stat-prefetch: on performance.read-ahead: on performance.read-ahead-page-count: 16 performance.rda-request-size: 131072 performance.quick-read: on performance.open-behind: on performance.nl-cache-timeout: 600 performance.nl-cache: on performance.io-thread-count: 64 performance.io-cache: off performance.flush-behind: on performance.client-io-threads: off performance.write-behind: off performance.cache-samba-metadata: on network.inode-lru-limit: 0 features.cache-invalidation-timeout: 600 features.cache-invalidation: on cluster.readdir-optimize: on cluster.lookup-optimize: on client.event-threads: 4 server.event-threads: 16 features.quota-deem-statfs: on nfs.disable: on features.quota: on features.inode-quota: on cluster.enable-shared-storage: disable Cheers, Jo?o Ba?to -------------- next part -------------- An HTML attachment was scrubbed... URL: From aspandey at redhat.com Mon Apr 29 10:43:59 2019 From: aspandey at redhat.com (aspandey at redhat.com) Date: Mon, 29 Apr 2019 10:43:59 +0000 Subject: [Gluster-users] Invitation: Gluster Community Meeting (APAC friendly hours) @ Tue Apr 30, 2019 11:30am - 12:30pm (IST) (gluster-users@gluster.org) Message-ID: <000000000000463e3a0587a8f6ae@google.com> You have been invited to the following event. Title: Gluster Community Meeting (APAC friendly hours) Bridge: https://bluejeans.com/836554017 Meeting minutes: https://hackmd.io/OqZbh7gfQe6uvVUXUVKJ5g?both Previous Meeting notes: http://github.com/gluster/community When: Tue Apr 30, 2019 11:30am ? 12:30pm India Standard Time - Kolkata Where: https://bluejeans.com/836554017 Calendar: gluster-users at gluster.org Who: * aspandey at redhat.com - organizer * gluster-users at gluster.org * maintainers at gluster.org * gluster-devel at gluster.org Event details: https://www.google.com/calendar/event?action=VIEW&eid=N2NpMWp1YjRkbmZoYjhxNWMyZ2ZxdTB1dmUgZ2x1c3Rlci11c2Vyc0BnbHVzdGVyLm9yZw&tok=MTkjYXNwYW5kZXlAcmVkaGF0LmNvbWRmNDE5YmMxMTg3ZTY4ZDA5ZWUwODY4MjJjMDYwOGEzZDNiMGVlNzE&ctz=Asia%2FKolkata&hl=en&es=0 Invitation from Google Calendar: https://www.google.com/calendar/ You are receiving this courtesy email at the account gluster-users at gluster.org because you are an attendee of this event. To stop receiving future updates for this event, decline this event. Alternatively you can sign up for a Google account at https://www.google.com/calendar/ and control your notification settings for your entire calendar. 
Forwarding this invitation could allow any recipient to send a response to the organizer and be added to the guest list, or invite others regardless of their own invitation status, or to modify your RSVP. Learn more at https://support.google.com/calendar/answer/37135#forwarding -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: text/calendar Size: 1882 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: invite.ics Type: application/ics Size: 1923 bytes Desc: not available URL: From jthottan at redhat.com Tue Apr 30 07:20:11 2019 From: jthottan at redhat.com (Jiffin Tony Thottan) Date: Tue, 30 Apr 2019 12:50:11 +0530 Subject: [Gluster-users] Proposing to previous ganesha HA cluster solution back to gluster code as gluster-7 feature Message-ID: Hi all, Some of you folks may be familiar with HA solution provided for nfs-ganesha by gluster using pacemaker and corosync. That feature was removed in glusterfs 3.10 in favour for common HA project "Storhaug". Even Storhaug was not progressed much from last two years and current development is in halt state, hence planning to restore old HA ganesha solution back to gluster code repository with some improvement and targetting for next gluster release 7. I have opened up an issue [1] with details and posted initial set of patches [2] Please share your thoughts on the same Regards, Jiffin [1]https://github.com/gluster/glusterfs/issues/663 [2] https://review.gluster.org/#/q/topic:rfc-663+(status:open+OR+status:merged) -------------- next part -------------- An HTML attachment was scrubbed... URL: From jim.kinney at gmail.com Tue Apr 30 12:19:42 2019 From: jim.kinney at gmail.com (Jim Kinney) Date: Tue, 30 Apr 2019 08:19:42 -0400 Subject: [Gluster-users] Proposing to previous ganesha HA cluster solution back to gluster code as gluster-7 feature In-Reply-To: References: Message-ID: <9BE7F129-DE42-46A5-896B-81460E605E9E@gmail.com> +1! I'm using nfs-ganesha in my next upgrade so my client systems can use NFS instead of fuse mounts. Having an integrated, designed in process to coordinate multiple nodes into an HA cluster will very welcome. On April 30, 2019 3:20:11 AM EDT, Jiffin Tony Thottan wrote: >Hi all, > >Some of you folks may be familiar with HA solution provided for >nfs-ganesha by gluster using pacemaker and corosync. > >That feature was removed in glusterfs 3.10 in favour for common HA >project "Storhaug". Even Storhaug was not progressed > >much from last two years and current development is in halt state, >hence >planning to restore old HA ganesha solution back > >to gluster code repository with some improvement and targetting for >next >gluster release 7. > >I have opened up an issue [1] with details and posted initial set of >patches [2] > >Please share your thoughts on the same > >Regards, > >Jiffin > >[1]https://github.com/gluster/glusterfs/issues/663 > > >[2] >https://review.gluster.org/#/q/topic:rfc-663+(status:open+OR+status:merged) -- Sent from my Android device with K-9 Mail. All tyopes are thumb related and reflect authenticity. -------------- next part -------------- An HTML attachment was scrubbed... 
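For anyone who never used the old scheme, the pre-3.10 setup was driven by a small /etc/ganesha/ganesha-ha.conf plus one CLI switch. The sketch below is only a rough reconstruction; the node names and addresses are invented, and the exact key names and command varied a little between releases:

HA_NAME="ganesha-ha-demo"
HA_CLUSTER_NODES="node1.example.com,node2.example.com"
VIP_node1="192.168.122.201"
VIP_node2="192.168.122.202"

# gluster nfs-ganesha enable

Pacemaker and corosync then own the virtual IPs and move them to a surviving node on failure, so NFS clients keep their mounts.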
URL: From hunter86_bg at yahoo.com Tue Apr 30 13:04:50 2019 From: hunter86_bg at yahoo.com (Strahil) Date: Tue, 30 Apr 2019 16:04:50 +0300 Subject: [Gluster-users] Proposing to previous ganesha HA clustersolution back to gluster code as gluster-7 feature Message-ID: Keep in mind that corosync/pacemaker is hard for proper setup by new admins/users. I'm still trying to remediate the effects of poor configuration at work. Also, storhaug is nice for hyperconverged setups where the host is not only hosting bricks, but other workloads. Corosync/pacemaker require proper fencing to be setup and most of the stonith resources 'shoot the other node in the head'. I would be happy to see an easy to deploy (let say 'cluster.enable-ha-ganesha true') and gluster to be bringing up the Floating IPs and taking care of the NFS locks, so no disruption will be felt by the clients. Still, this will be a lot of work to achieve. Best Regards, Strahil NikolovOn Apr 30, 2019 15:19, Jim Kinney wrote: > > +1! > I'm using nfs-ganesha in my next upgrade so my client systems can use NFS instead of fuse mounts. Having an integrated, designed in process to coordinate multiple nodes into an HA cluster will very welcome. > > On April 30, 2019 3:20:11 AM EDT, Jiffin Tony Thottan wrote: >> >> Hi all, >> >> Some of you folks may be familiar with HA solution provided for nfs-ganesha by gluster using pacemaker and corosync. >> >> That feature was removed in glusterfs 3.10 in favour for common HA project "Storhaug". Even Storhaug was not progressed >> >> much from last two years and current development is in halt state, hence planning to restore old HA ganesha solution back >> >> to gluster code repository with some improvement and targetting for next gluster release 7. >> >> I have opened up an issue [1] with details and posted initial set of patches [2] >> >> Please share your thoughts on the same >> >> Regards, >> >> Jiffin?? >> >> [1] https://github.com/gluster/glusterfs/issues/663 >> >> [2] https://review.gluster.org/#/q/topic:rfc-663+(status:open+OR+status:merged) > > > -- > Sent from my Android device with K-9 Mail. All tyopes are thumb related and reflect authenticity. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Renaud.Fortier at fsaa.ulaval.ca Tue Apr 30 13:11:49 2019 From: Renaud.Fortier at fsaa.ulaval.ca (Renaud Fortier) Date: Tue, 30 Apr 2019 13:11:49 +0000 Subject: [Gluster-users] Proposing to previous ganesha HA cluster solution back to gluster code as gluster-7 feature In-Reply-To: <9BE7F129-DE42-46A5-896B-81460E605E9E@gmail.com> References: <9BE7F129-DE42-46A5-896B-81460E605E9E@gmail.com> Message-ID: <7d75b62f0eb0495782c46ef8521790d5@ul-exc-pr-mbx13.ulaval.ca> IMO, you should keep storhaug and maintain it. At the beginning, we were with pacemaker and corosync. Then we move to storhaug with the upgrade to gluster 4.1.x. Now you are talking about going back like it was. Maybe it will be better with pacemake and corosync but the important is to have a solution that will be stable and maintained. thanks Renaud De : gluster-users-bounces at gluster.org [mailto:gluster-users-bounces at gluster.org] De la part de Jim Kinney Envoy? : 30 avril 2019 08:20 ? : gluster-users at gluster.org; Jiffin Tony Thottan ; gluster-users at gluster.org; Gluster Devel ; gluster-maintainers at gluster.org; nfs-ganesha ; devel at lists.nfs-ganesha.org Objet : Re: [Gluster-users] Proposing to previous ganesha HA cluster solution back to gluster code as gluster-7 feature +1! 
I'm using nfs-ganesha in my next upgrade so my client systems can use NFS instead of fuse mounts. Having an integrated, designed in process to coordinate multiple nodes into an HA cluster will very welcome. On April 30, 2019 3:20:11 AM EDT, Jiffin Tony Thottan > wrote: Hi all, Some of you folks may be familiar with HA solution provided for nfs-ganesha by gluster using pacemaker and corosync. That feature was removed in glusterfs 3.10 in favour for common HA project "Storhaug". Even Storhaug was not progressed much from last two years and current development is in halt state, hence planning to restore old HA ganesha solution back to gluster code repository with some improvement and targetting for next gluster release 7. I have opened up an issue [1] with details and posted initial set of patches [2] Please share your thoughts on the same Regards, Jiffin [1] https://github.com/gluster/glusterfs/issues/663 [2] https://review.gluster.org/#/q/topic:rfc-663+(status:open+OR+status:merged) -- Sent from my Android device with K-9 Mail. All tyopes are thumb related and reflect authenticity. -------------- next part -------------- An HTML attachment was scrubbed... URL: From hunter86_bg at yahoo.com Tue Apr 30 13:29:51 2019 From: hunter86_bg at yahoo.com (Strahil Nikolov) Date: Tue, 30 Apr 2019 13:29:51 +0000 (UTC) Subject: [Gluster-users] Proposing to previous ganesha HA clustersolution back to gluster code as gluster-7 feature In-Reply-To: References: Message-ID: <1028413072.2343069.1556630991785@mail.yahoo.com> Hi, I'm posting this again as it got bounced. Keep in mind that corosync/pacemaker? is hard for proper setup by new admins/users. I'm still trying to remediate the effects of poor configuration at work. Also, storhaug is nice for hyperconverged setups where the host is not only hosting bricks, but? other? workloads. Corosync/pacemaker require proper fencing to be setup and most of the stonith resources 'shoot the other node in the head'. I would be happy to see an easy to deploy (let say 'cluster.enable-ha-ganesha true') and gluster to be bringing up the Floating IPs and taking care of the NFS locks, so no disruption will be felt by the clients. Still, this will be a lot of work to achieve. Best Regards, Strahil Nikolov On Apr 30, 2019 15:19, Jim Kinney wrote: >?? > +1! > I'm using nfs-ganesha in my next upgrade so my client systems can use NFS instead of fuse mounts. Having an integrated, designed in process to coordinate multiple nodes into an HA cluster will very welcome. > > On April 30, 2019 3:20:11 AM EDT, Jiffin Tony Thottan wrote: >>?? >> Hi all, >> >> Some of you folks may be familiar with HA solution provided for nfs-ganesha by gluster using pacemaker and corosync. >> >> That feature was removed in glusterfs 3.10 in favour for common HA project "Storhaug". Even Storhaug was not progressed >> >> much from last two years and current development is in halt state, hence planning to restore old HA ganesha solution back >> >> to gluster code repository with some improvement and targetting for next gluster release 7. >> >>??I have opened up an issue [1] with details and posted initial set of patches [2] >> >> Please share your thoughts on the same >> >> >> Regards, >> >> Jiffin?? >> >> [1] https://github.com/gluster/glusterfs/issues/663 >> >> [2] https://review.gluster.org/#/q/topic:rfc-663+(status:open+OR+status:merged) >> >> > > -- > Sent from my Android device with K-9 Mail. All tyopes are thumb related and reflect authenticity. Keep in mind that corosync/pacemaker? 
is hard for proper setup by new admins/users. I'm still trying to remediate the effects of poor configuration at work. Also, storhaug is nice for hyperconverged setups where the host is not only hosting bricks, but? other? workloads. Corosync/pacemaker require proper fencing to be setup and most of the stonith resources 'shoot the other node in the head'. I would be happy to see an easy to deploy (let say 'cluster.enable-ha-ganesha true') and gluster to be bringing up the Floating IPs and taking care of the NFS locks, so no disruption will be felt by the clients. Still, this will be a lot of work to achieve. Best Regards, Strahil NikolovOn Apr 30, 2019 15:19, Jim Kinney wrote: > > +1! > I'm using nfs-ganesha in my next upgrade so my client systems can use NFS instead of fuse mounts. Having an integrated, designed in process to coordinate multiple nodes into an HA cluster will very welcome. > > On April 30, 2019 3:20:11 AM EDT, Jiffin Tony Thottan wrote: >> >> Hi all, >> >> Some of you folks may be familiar with HA solution provided for nfs-ganesha by gluster using pacemaker and corosync. >> >> That feature was removed in glusterfs 3.10 in favour for common HA project "Storhaug". Even Storhaug was not progressed >> >> much from last two years and current development is in halt state, hence planning to restore old HA ganesha solution back >> >> to gluster code repository with some improvement and targetting for next gluster release 7. >> >> I have opened up an issue [1] with details and posted initial set of patches [2] >> >> Please share your thoughts on the same >> >> Regards, >> >> Jiffin?? >> >> [1] https://github.com/gluster/glusterfs/issues/663 >> >> [2] https://review.gluster.org/#/q/topic:rfc-663+(status:open+OR+status:merged) > > > -- > Sent from my Android device with K-9 Mail. All tyopes are thumb related and reflect authenticity. From gbok at gelbergroup.com Sat Apr 27 02:02:17 2019 From: gbok at gelbergroup.com (Greg Bok) Date: Sat, 27 Apr 2019 02:02:17 -0000 Subject: [Gluster-users] Synchronous Client-Side Replication Behavior Message-ID: What are the guarantees at write completion time with client-side replication and replicated volume types? When the write completes has it successfully been written to all replicas or just one? How does write-behind/flush-behind affect this behavior? -------------- next part -------------- An HTML attachment was scrubbed... URL:
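As a starting point for experimenting with this, the two options mentioned can be inspected and, where strict write-through behaviour is preferred, disabled per volume; the volume name below is a placeholder:

# gluster volume get myvol performance.write-behind
# gluster volume get myvol performance.flush-behind
# gluster volume set myvol performance.write-behind off

With write-behind disabled, the client no longer acknowledges writes out of its own buffer, which removes one source of ambiguity when reasoning about what has actually reached the replicas. The exact guarantees at write completion still depend on the AFR quorum settings of the volume.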