From vbellur at redhat.com Fri Feb 1 06:53:48 2019 From: vbellur at redhat.com (Vijay Bellur) Date: Thu, 31 Jan 2019 22:53:48 -0800 Subject: [Gluster-devel] I/O performance In-Reply-To: References: Message-ID: On Thu, Jan 31, 2019 at 10:01 AM Xavi Hernandez wrote: > Hi, > > I've been doing some tests with the global thread pool [1], and I've > observed one important thing: > > Since this new thread pool has very low contention (apparently), it > exposes other problems when the number of threads grows. What I've seen is > that some workloads use all available threads on bricks to do I/O, causing > avgload to grow rapidly and saturating the machine (or it seems so), which > really makes everything slower. Reducing the maximum number of threads > improves performance actually. Other workloads, though, do little I/O > (probably most is locking or smallfile operations). In this case limiting > the number of threads to a small value causes a performance reduction. To > increase performance we need more threads. > > So this is making me thing that maybe we should implement some sort of I/O > queue with a maximum I/O depth for each brick (or disk if bricks share same > disk). This way we can limit the amount of requests physically accessing > the underlying FS concurrently, without actually limiting the number of > threads that can be doing other things on each brick. I think this could > improve performance. > Perhaps we could throttle both aspects - number of I/O requests per disk and the number of threads too? That way we will have the ability to behave well when there is bursty I/O to the same disk and when there are multiple concurrent requests to different disks. Do you have a reason to not limit the number of threads? > Maybe this approach could also be useful in client side, but I think it's > not so critical there. > Agree, rate limiting on the server side would be more appropriate. Thanks, Vijay -------------- next part -------------- An HTML attachment was scrubbed... URL: From xhernandez at redhat.com Fri Feb 1 07:12:05 2019 From: xhernandez at redhat.com (Xavi Hernandez) Date: Fri, 1 Feb 2019 08:12:05 +0100 Subject: [Gluster-devel] I/O performance In-Reply-To: References: Message-ID: On Fri, Feb 1, 2019 at 7:54 AM Vijay Bellur wrote: > > > On Thu, Jan 31, 2019 at 10:01 AM Xavi Hernandez > wrote: > >> Hi, >> >> I've been doing some tests with the global thread pool [1], and I've >> observed one important thing: >> >> Since this new thread pool has very low contention (apparently), it >> exposes other problems when the number of threads grows. What I've seen is >> that some workloads use all available threads on bricks to do I/O, causing >> avgload to grow rapidly and saturating the machine (or it seems so), which >> really makes everything slower. Reducing the maximum number of threads >> improves performance actually. Other workloads, though, do little I/O >> (probably most is locking or smallfile operations). In this case limiting >> the number of threads to a small value causes a performance reduction. To >> increase performance we need more threads. >> >> So this is making me thing that maybe we should implement some sort of >> I/O queue with a maximum I/O depth for each brick (or disk if bricks share >> same disk). This way we can limit the amount of requests physically >> accessing the underlying FS concurrently, without actually limiting the >> number of threads that can be doing other things on each brick. I think >> this could improve performance. 
>> > > Perhaps we could throttle both aspects - number of I/O requests per disk > and the number of threads too? That way we will have the ability to behave > well when there is bursty I/O to the same disk and when there are multiple > concurrent requests to different disks. Do you have a reason to not limit > the number of threads? > No, in fact the global thread pool does have a limit for the number of threads. I'm not saying to replace the thread limit for I/O depth control, I think we need both. I think we need to clearly identify which threads are doing I/O and limit them, even if there are more threads available. The reason is easy: suppose we have a fixed number of threads. If we have heavy load sent in parallel, it's quite possible that all threads get blocked doing some I/O. This has two consequences: 1. There are no more threads to execute other things, like sending answers to the client, or start processing new incoming requests. So CPU is underutilized. 2. Massive parallel access to a FS actually decreases performance This means that we can do less work and this work takes more time, which is bad. If we limit the number of threads that can actually be doing FS I/O, it's easy to keep FS responsive and we'll still have more threads to do other work. > >> Maybe this approach could also be useful in client side, but I think it's >> not so critical there. >> > > Agree, rate limiting on the server side would be more appropriate. > Only thing to consider here is that if we limit rate on servers but clients can generate more requests without limit, we may require lots of memory to track all ongoing requests. Anyway, I think this is not the most important thing now, so if we solve the server-side problem, then we can check if this is really needed or not (it could happen that client applications limit themselves automatically because they will be waiting for answers from server before sending more requests, unless the number of application running concurrently is really huge). Xavi -------------- next part -------------- An HTML attachment was scrubbed... URL: From vbellur at redhat.com Fri Feb 1 07:27:27 2019 From: vbellur at redhat.com (Vijay Bellur) Date: Thu, 31 Jan 2019 23:27:27 -0800 Subject: [Gluster-devel] I/O performance In-Reply-To: References: Message-ID: On Thu, Jan 31, 2019 at 11:12 PM Xavi Hernandez wrote: > On Fri, Feb 1, 2019 at 7:54 AM Vijay Bellur wrote: > >> >> >> On Thu, Jan 31, 2019 at 10:01 AM Xavi Hernandez >> wrote: >> >>> Hi, >>> >>> I've been doing some tests with the global thread pool [1], and I've >>> observed one important thing: >>> >>> Since this new thread pool has very low contention (apparently), it >>> exposes other problems when the number of threads grows. What I've seen is >>> that some workloads use all available threads on bricks to do I/O, causing >>> avgload to grow rapidly and saturating the machine (or it seems so), which >>> really makes everything slower. Reducing the maximum number of threads >>> improves performance actually. Other workloads, though, do little I/O >>> (probably most is locking or smallfile operations). In this case limiting >>> the number of threads to a small value causes a performance reduction. To >>> increase performance we need more threads. >>> >>> So this is making me thing that maybe we should implement some sort of >>> I/O queue with a maximum I/O depth for each brick (or disk if bricks share >>> same disk). 
This way we can limit the amount of requests physically >>> accessing the underlying FS concurrently, without actually limiting the >>> number of threads that can be doing other things on each brick. I think >>> this could improve performance. >>> >> >> Perhaps we could throttle both aspects - number of I/O requests per disk >> and the number of threads too? That way we will have the ability to behave >> well when there is bursty I/O to the same disk and when there are multiple >> concurrent requests to different disks. Do you have a reason to not limit >> the number of threads? >> > > No, in fact the global thread pool does have a limit for the number of > threads. I'm not saying to replace the thread limit for I/O depth control, > I think we need both. I think we need to clearly identify which threads are > doing I/O and limit them, even if there are more threads available. The > reason is easy: suppose we have a fixed number of threads. If we have heavy > load sent in parallel, it's quite possible that all threads get blocked > doing some I/O. This has two consequences: > > 1. There are no more threads to execute other things, like sending > answers to the client, or start processing new incoming requests. So CPU is > underutilized. > 2. Massive parallel access to a FS actually decreases performance > > This means that we can do less work and this work takes more time, which > is bad. > > If we limit the number of threads that can actually be doing FS I/O, it's > easy to keep FS responsive and we'll still have more threads to do other > work. > Got it, thx. > > >> >>> Maybe this approach could also be useful in client side, but I think >>> it's not so critical there. >>> >> >> Agree, rate limiting on the server side would be more appropriate. >> > > Only thing to consider here is that if we limit rate on servers but > clients can generate more requests without limit, we may require lots of > memory to track all ongoing requests. Anyway, I think this is not the most > important thing now, so if we solve the server-side problem, then we can > check if this is really needed or not (it could happen that client > applications limit themselves automatically because they will be waiting > for answers from server before sending more requests, unless the number of > application running concurrently is really huge). > We could enable throttling in the rpc layer to handle a client performing aggressive I/O. RPC throttling should be able to handle the scenario described above. -Vijay -------------- next part -------------- An HTML attachment was scrubbed... URL: From manu at netbsd.org Fri Feb 1 12:03:49 2019 From: manu at netbsd.org (Emmanuel Dreyfus) Date: Fri, 1 Feb 2019 12:03:49 +0000 Subject: [Gluster-devel] I/O performance In-Reply-To: References: Message-ID: <20190201120349.GM4509@homeworld.netbsd.org> On Thu, Jan 31, 2019 at 10:53:48PM -0800, Vijay Bellur wrote: > Perhaps we could throttle both aspects - number of I/O requests per disk While there it would be nice to detect and report a disk with lower than peer performance: that happen sometimes when a disk is dying, and last time I was hit by that performance problem, I had a hard time finding the culprit. 
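A rough way to narrow down such a culprit by hand is to compare per-brick latencies and the raw device statistics across the peers; for example (the volume name is a placeholder and the sysstat tools are assumed to be installed on the brick nodes):

  # per-brick fop latencies, to spot a brick that is consistently slower
  gluster volume profile <volname> start
  gluster volume profile <volname> info
  # on the suspect node, raw device latency/utilisation (needs sysstat)
  iostat -dxm 5

A dying disk usually stands out with a much higher await/%util than its peers under similar load, but having Gluster itself detect and flag the outlier would clearly be nicer.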
-- Emmanuel Dreyfus manu at netbsd.org From pgurusid at redhat.com Fri Feb 1 12:25:31 2019 From: pgurusid at redhat.com (Poornima Gurusiddaiah) Date: Fri, 1 Feb 2019 17:55:31 +0530 Subject: [Gluster-devel] I/O performance In-Reply-To: <20190201120349.GM4509@homeworld.netbsd.org> References: <20190201120349.GM4509@homeworld.netbsd.org> Message-ID: Can the threads be categorised to do certain kinds of fops? Read/write affinitise to certain set of threads, the other metadata fops to other set of threads. So we limit the read/write threads and not the metadata threads? Also if aio is enabled in the backend the threads will not be blocked on disk IO right? All this is based on the assumption that large number of parallel read writes make the disk perf bad but not the large number of dentry and metadata ops. Is that true? Thanks, Poornima On Fri, Feb 1, 2019, 5:34 PM Emmanuel Dreyfus On Thu, Jan 31, 2019 at 10:53:48PM -0800, Vijay Bellur wrote: > > Perhaps we could throttle both aspects - number of I/O requests per disk > > While there it would be nice to detect and report a disk with lower than > peer performance: that happen sometimes when a disk is dying, and last > time I was hit by that performance problem, I had a hard time finding > the culprit. > > -- > Emmanuel Dreyfus > manu at netbsd.org > _______________________________________________ > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel > -------------- next part -------------- An HTML attachment was scrubbed... URL: From xhernandez at redhat.com Fri Feb 1 12:51:54 2019 From: xhernandez at redhat.com (Xavi Hernandez) Date: Fri, 1 Feb 2019 13:51:54 +0100 Subject: [Gluster-devel] I/O performance In-Reply-To: References: <20190201120349.GM4509@homeworld.netbsd.org> Message-ID: On Fri, Feb 1, 2019 at 1:25 PM Poornima Gurusiddaiah wrote: > Can the threads be categorised to do certain kinds of fops? > Could be, but creating multiple thread groups for different tasks is generally bad because many times you end up with lots of idle threads which waste resources and could increase contention. I think we should only differentiate threads if it's absolutely necessary. > Read/write affinitise to certain set of threads, the other metadata fops > to other set of threads. So we limit the read/write threads and not the > metadata threads? Also if aio is enabled in the backend the threads will > not be blocked on disk IO right? > If we don't block the thread but we don't prevent more requests to go to the disk, then we'll probably have the same problem. Anyway, I'll try to run some tests with AIO to see if anything changes. All this is based on the assumption that large number of parallel read > writes make the disk perf bad but not the large number of dentry and > metadata ops. Is that true? > It depends. If metadata is not cached, it's as bad as a read or write since it requires a disk access (a clear example of this is the bad performance of 'ls' in cold cache, which is basically metadata reads). In fact, cached data reads are also very fast, and data writes could go to the cache and be updated later in background, so I think the important point is if things are cached or not, instead of if they are data or metadata. Since we don't have this information from the user side, it's hard to tell what's better. My opinion is that we shouldn't differentiate requests of data/metadata. 
If metadata requests happen to be faster, then that thread will be able to handle other requests immediately, which seems good enough. However there's one thing that I would do. I would differentiate reads (data or metadata) from writes. Normally writes come from cached information that is flushed to disk at some point, so this normally happens in the background. But reads tend to be in foreground, meaning that someone (user or application) is waiting for it. So I would give preference to reads over writes. To do so effectively, we need to not saturate the backend, otherwise when we need to send a read, it will still need to wait for all pending requests to complete. If disks are not saturated, we can have the answer to the read quite fast, and then continue processing the remaining writes. Anyway, I may be wrong, since all these things depend on too many factors. I haven't done any specific tests about this. It's more like a brainstorming. As soon as I can I would like to experiment with this and get some empirical data. Xavi > Thanks, > Poornima > > > On Fri, Feb 1, 2019, 5:34 PM Emmanuel Dreyfus >> On Thu, Jan 31, 2019 at 10:53:48PM -0800, Vijay Bellur wrote: >> > Perhaps we could throttle both aspects - number of I/O requests per disk >> >> While there it would be nice to detect and report a disk with lower than >> peer performance: that happen sometimes when a disk is dying, and last >> time I was hit by that performance problem, I had a hard time finding >> the culprit. >> >> -- >> Emmanuel Dreyfus >> manu at netbsd.org >> _______________________________________________ >> Gluster-devel mailing list >> Gluster-devel at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-devel >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jenkins at build.gluster.org Mon Feb 4 01:45:03 2019 From: jenkins at build.gluster.org (jenkins at build.gluster.org) Date: Mon, 4 Feb 2019 01:45:03 +0000 (UTC) Subject: [Gluster-devel] Weekly Untriaged Bugs Message-ID: <2040632523.2.1549244703864.JavaMail.jenkins@jenkins-el7.rht.gluster.org> [...truncated 6 lines...] https://bugzilla.redhat.com/1668227 / core: gluster(8) - Add SELinux context glusterd_brick_t to man page https://bugzilla.redhat.com/1670334 / core: Some memory leaks found in GlusterFS 5.3 https://bugzilla.redhat.com/1668239 / disperse: [man page] Gluster(8) - Missing disperse-data parameter Gluster Console Manager man page https://bugzilla.redhat.com/1671556 / fuse: glusterfs FUSE client crashing every few days with 'Failed to dispatch handler' https://bugzilla.redhat.com/1671014 / fuse: gluster-fuse seg fault PTHREAD_MUTEX_TYPE_ELISION https://bugzilla.redhat.com/1668118 / geo-replication: Failure to start geo-replication for tiered volume. https://bugzilla.redhat.com/1664524 / geo-replication: Non-root geo-replication session goes to faulty state, when the session is started https://bugzilla.redhat.com/1672076 / glusterd: chrome / chromium crash on gluster, sqlite issue? 
https://bugzilla.redhat.com/1670382 / gluster-smb: parallel-readdir prevents directories and files listing https://bugzilla.redhat.com/1666326 / open-behind: reopening bug 1405147: Failed to dispatch handler: glusterfs seems to check for "write permission" instead for "file owner" during open() when writing to a file https://bugzilla.redhat.com/1668259 / packaging: Glusterfs 5.3 RPMs can't be build on rhel7 https://bugzilla.redhat.com/1665361 / project-infrastructure: Alerts for offline nodes https://bugzilla.redhat.com/1671647 / project-infrastructure: Anomalies in python-lint build job https://bugzilla.redhat.com/1663780 / project-infrastructure: On docs.gluster.org, we should convert spaces in folder or file names to 301 redirects to hypens https://bugzilla.redhat.com/1666634 / protocol: nfs client cannot compile files on dispersed volume https://bugzilla.redhat.com/1665677 / rdma: volume create and transport change with rdma failed https://bugzilla.redhat.com/1668286 / read-ahead: READDIRP incorrectly updates posix-acl inode ctx https://bugzilla.redhat.com/1664215 / read-ahead: Toggling readdir-ahead translator off causes some clients to umount some of its volumes https://bugzilla.redhat.com/1671207 / rpc: Several fixes on socket pollin and pollout return value https://bugzilla.redhat.com/1664398 / tests: ./tests/00-geo-rep/00-georep-verify-setup.t does not work with ./run-tests-in-vagrant.sh https://bugzilla.redhat.com/1670155 / tiering: Tiered volume files disappear when a hot brick is failed/restored until the tier detached. [...truncated 2 lines...] -------------- next part -------------- A non-text attachment was scrubbed... Name: build.log Type: application/octet-stream Size: 2821 bytes Desc: not available URL: From rgowdapp at redhat.com Mon Feb 4 10:02:42 2019 From: rgowdapp at redhat.com (Raghavendra Gowdappa) Date: Mon, 4 Feb 2019 15:32:42 +0530 Subject: [Gluster-devel] Memory management, OOM kills and glusterfs Message-ID: All, Me, Csaba and Manoj are presenting our experiences with using FUSE as an interface for Glusterfs at Vault'19 [1]. One of the areas Glusterfs has faced difficulties is with memory management. One of the reasons for high memory consumption has been the amount of memory consumed by glusterfs mount process to maintain the inodes looked up by the kernel. Though we've a solution [2] for this problem, things would've been much easier and effective if Glusterfs was in kernel space (for the case of memory management). In kernel space, the memory consumed by inodes would be accounted for kernel's inode cache and hence kernel memory management would manage the inodes more effectively and intelligently. However, being in userspace there is no way to account the memory consumed for an inode in user space and hence only a very small part of the memory gets accounted (the inode maintained by fuse kernel module). The objective of this mail is to collect more cases/user issues/bugs such as these so that we can present them as evidence. I am currently aware of a tracker issue [3] which covers the issue I mentioned above. Also, if you are aware of any other memory management issues, we are interested in them. [1] https://www.usenix.org/conference/vault19/presentation/pillai [2] https://review.gluster.org/#/c/glusterfs/+/19778/ [3] https://bugzilla.redhat.com/show_bug.cgi?id=1647277 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From atumball at redhat.com Tue Feb 5 08:47:38 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Tue, 5 Feb 2019 14:17:38 +0530 Subject: [Gluster-devel] Meeting minutes: Feb 04th, 2019 Message-ID:

BJ Link
- Bridge: https://bluejeans.com/217609845
- Watch: https://bluejeans.com/s/37SS6/

Attendance
- Nigel Babu
- Sunil Heggodu
- Amar Tumballi
- Aravinda VK
- Atin Mukherjee

Agenda
- Gluster Performance Runs (on Master):
  - Some regression since 3.12 compared to current master.
  - A few operations had major regressions.
  - The entry serialization (SDFS) feature caused a regression. We have disabled it by default and plan to ask users to turn it on for edge cases.
  - Some patches for perf improvements are currently being reviewed; they are not enabled by default.
  - See Xavi's email for perf improvements in self-heal. This can cause some regression on sequential IO.
  - [Nigel] Can we publish posts on 3.12 perf and our machine specs? Then we can do a follow-up post after the 6 release.
    - Yes. This is a release highlight that we want to talk about.
- GlusterFS 6.0 branching:
  - Upgrade tests, especially with some removed volume types and options.
    - [Atin] I've started testing some of the upgrade tests (glusterfs-5 to latest master) and have some observations around some of the tiering related options which lead to a peer rejection issue post upgrade; we need changes to avoid the peer rejection failures. The GlusterD team will focus on this testing in the coming days.
  - Performance patches
    - Discussed earlier.
  - shd-mux
    - [Atin] Shyam highlighted concern about accepting this big change so late and so near to the branching timelines, so it is most likely not going to make it into 6.0.
    - A risk because of the timeline. We will keep testing it on master for now, and once it is stable we could make an exception to merge it into release-6.
    - The changes are glusterd heavy, so we want to make sure it's thoroughly tested so we don't cause regressions.
- GCS
  - v1.0 - Can we announce it, yet?
    - [Atin] Hit a blocker issue in GD2, https://github.com/gluster/gcs/issues/129 , root cause is in progress. Testing of https://github.com/gluster/gcs/pull/130 is blocked because of this. We are still positive we can nail this down by tomorrow and call out GCS 1.0 by tomorrow.
  - GCS has a website now: https://gluster.github.io/gcs/
    Contribute by sending patches to the gh-pages branch of the github.com/gluster/gcs repo.
  - What does it take to run the containers from Gluster (CSI/GD2 etc) on ARM architecture host machines?
    - It should theoretically work, given Gluster has been known to work on ARM. And we know that k8s on ARM is something that people do.
    - Might be useful to kick it off on a Raspberry Pi and see what breaks.
- We need more content on the website, and in general on the internet. How do we motivate developers to write blogs?
- A new theme is proposed for the upstream documentation via the pull request https://github.com/gluster/glusterdocs/pull/454
  - Test website: https://my-doc-sunil.readthedocs.io/en/latest/
- Round Table:
  - Nigel: The AWS migration will happen this week and regressions will be a little flaky. Please bear with us.

-- 
Amar Tumballi (amarts)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From xhernandez at redhat.com Tue Feb 5 17:23:21 2019 From: xhernandez at redhat.com (Xavi Hernandez) Date: Tue, 5 Feb 2019 18:23:21 +0100 Subject: [Gluster-devel] I/O performance In-Reply-To: References: <20190201120349.GM4509@homeworld.netbsd.org> Message-ID: On Fri, Feb 1, 2019 at 1:51 PM Xavi Hernandez wrote: > On Fri, Feb 1, 2019 at 1:25 PM Poornima Gurusiddaiah > wrote: > >> Can the threads be categorised to do certain kinds of fops? >> > > Could be, but creating multiple thread groups for different tasks is > generally bad because many times you end up with lots of idle threads which > waste resources and could increase contention. I think we should only > differentiate threads if it's absolutely necessary. > > >> Read/write affinitise to certain set of threads, the other metadata fops >> to other set of threads. So we limit the read/write threads and not the >> metadata threads? Also if aio is enabled in the backend the threads will >> not be blocked on disk IO right? >> > > If we don't block the thread but we don't prevent more requests to go to > the disk, then we'll probably have the same problem. Anyway, I'll try to > run some tests with AIO to see if anything changes. > I've run some simple tests with AIO enabled and results are not good. A simple dd takes >25% more time. Multiple parallel dd take 35% more time to complete. Xavi > All this is based on the assumption that large number of parallel read >> writes make the disk perf bad but not the large number of dentry and >> metadata ops. Is that true? >> > > It depends. If metadata is not cached, it's as bad as a read or write > since it requires a disk access (a clear example of this is the bad > performance of 'ls' in cold cache, which is basically metadata reads). In > fact, cached data reads are also very fast, and data writes could go to the > cache and be updated later in background, so I think the important point is > if things are cached or not, instead of if they are data or metadata. Since > we don't have this information from the user side, it's hard to tell what's > better. My opinion is that we shouldn't differentiate requests of > data/metadata. If metadata requests happen to be faster, then that thread > will be able to handle other requests immediately, which seems good enough. > > However there's one thing that I would do. I would differentiate reads > (data or metadata) from writes. Normally writes come from cached > information that is flushed to disk at some point, so this normally happens > in the background. But reads tend to be in foreground, meaning that someone > (user or application) is waiting for it. So I would give preference to > reads over writes. To do so effectively, we need to not saturate the > backend, otherwise when we need to send a read, it will still need to wait > for all pending requests to complete. If disks are not saturated, we can > have the answer to the read quite fast, and then continue processing the > remaining writes. > > Anyway, I may be wrong, since all these things depend on too many factors. > I haven't done any specific tests about this. It's more like a > brainstorming. As soon as I can I would like to experiment with this and > get some empirical data. 
> > Xavi > > >> Thanks, >> Poornima >> >> >> On Fri, Feb 1, 2019, 5:34 PM Emmanuel Dreyfus > >>> On Thu, Jan 31, 2019 at 10:53:48PM -0800, Vijay Bellur wrote: >>> > Perhaps we could throttle both aspects - number of I/O requests per >>> disk >>> >>> While there it would be nice to detect and report a disk with lower than >>> peer performance: that happen sometimes when a disk is dying, and last >>> time I was hit by that performance problem, I had a hard time finding >>> the culprit. >>> >>> -- >>> Emmanuel Dreyfus >>> manu at netbsd.org >>> _______________________________________________ >>> Gluster-devel mailing list >>> Gluster-devel at gluster.org >>> https://lists.gluster.org/mailman/listinfo/gluster-devel >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From srangana at redhat.com Wed Feb 6 01:25:35 2019 From: srangana at redhat.com (Shyam Ranganathan) Date: Tue, 5 Feb 2019 20:25:35 -0500 Subject: [Gluster-devel] Release 6: Branched and next steps Message-ID: <54d9fe44-5d59-db0d-2e76-4583351c7eba@redhat.com> Hi, Release 6 is branched, and tracker bug for 6.0 is created [1]. Do mark blockers for the release against [1]. As of now we are only tracking [2] "core: implement a global thread pool " for a backport as a feature into the release. We expect to create RC0 tag and builds for upgrade and other testing close to mid-week next week (around 13th Feb), and the release is slated for the first week of March for GA. I will post updates to this thread around release notes and other related activity. Thanks, Shyam [1] Tracker: https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-6.0 [2] Patches tracked for a backport: - https://review.gluster.org/c/glusterfs/+/20636 From pgurusid at redhat.com Wed Feb 6 06:00:36 2019 From: pgurusid at redhat.com (Poornima Gurusiddaiah) Date: Wed, 6 Feb 2019 11:30:36 +0530 Subject: [Gluster-devel] I/O performance In-Reply-To: References: <20190201120349.GM4509@homeworld.netbsd.org> Message-ID: On Tue, Feb 5, 2019, 10:53 PM Xavi Hernandez On Fri, Feb 1, 2019 at 1:51 PM Xavi Hernandez > wrote: > >> On Fri, Feb 1, 2019 at 1:25 PM Poornima Gurusiddaiah >> wrote: >> >>> Can the threads be categorised to do certain kinds of fops? >>> >> >> Could be, but creating multiple thread groups for different tasks is >> generally bad because many times you end up with lots of idle threads which >> waste resources and could increase contention. I think we should only >> differentiate threads if it's absolutely necessary. >> >> >>> Read/write affinitise to certain set of threads, the other metadata fops >>> to other set of threads. So we limit the read/write threads and not the >>> metadata threads? Also if aio is enabled in the backend the threads will >>> not be blocked on disk IO right? >>> >> >> If we don't block the thread but we don't prevent more requests to go to >> the disk, then we'll probably have the same problem. Anyway, I'll try to >> run some tests with AIO to see if anything changes. >> > > I've run some simple tests with AIO enabled and results are not good. A > simple dd takes >25% more time. Multiple parallel dd take 35% more time to > complete. > Thank you. That is strange! Had few questions, what tests are you running for measuring the io-threads performance(not particularly aoi)? is it dd from multiple clients? Regards, Poornima > Xavi > > >> All this is based on the assumption that large number of parallel read >>> writes make the disk perf bad but not the large number of dentry and >>> metadata ops. Is that true? 
>>> >> >> It depends. If metadata is not cached, it's as bad as a read or write >> since it requires a disk access (a clear example of this is the bad >> performance of 'ls' in cold cache, which is basically metadata reads). In >> fact, cached data reads are also very fast, and data writes could go to the >> cache and be updated later in background, so I think the important point is >> if things are cached or not, instead of if they are data or metadata. Since >> we don't have this information from the user side, it's hard to tell what's >> better. My opinion is that we shouldn't differentiate requests of >> data/metadata. If metadata requests happen to be faster, then that thread >> will be able to handle other requests immediately, which seems good enough. >> >> However there's one thing that I would do. I would differentiate reads >> (data or metadata) from writes. Normally writes come from cached >> information that is flushed to disk at some point, so this normally happens >> in the background. But reads tend to be in foreground, meaning that someone >> (user or application) is waiting for it. So I would give preference to >> reads over writes. To do so effectively, we need to not saturate the >> backend, otherwise when we need to send a read, it will still need to wait >> for all pending requests to complete. If disks are not saturated, we can >> have the answer to the read quite fast, and then continue processing the >> remaining writes. >> >> Anyway, I may be wrong, since all these things depend on too many >> factors. I haven't done any specific tests about this. It's more like a >> brainstorming. As soon as I can I would like to experiment with this and >> get some empirical data. >> >> Xavi >> >> >>> Thanks, >>> Poornima >>> >>> >>> On Fri, Feb 1, 2019, 5:34 PM Emmanuel Dreyfus >> >>>> On Thu, Jan 31, 2019 at 10:53:48PM -0800, Vijay Bellur wrote: >>>> > Perhaps we could throttle both aspects - number of I/O requests per >>>> disk >>>> >>>> While there it would be nice to detect and report a disk with lower >>>> than >>>> peer performance: that happen sometimes when a disk is dying, and last >>>> time I was hit by that performance problem, I had a hard time finding >>>> the culprit. >>>> >>>> -- >>>> Emmanuel Dreyfus >>>> manu at netbsd.org >>>> _______________________________________________ >>>> Gluster-devel mailing list >>>> Gluster-devel at gluster.org >>>> https://lists.gluster.org/mailman/listinfo/gluster-devel >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From xhernandez at redhat.com Wed Feb 6 06:57:41 2019 From: xhernandez at redhat.com (Xavi Hernandez) Date: Wed, 6 Feb 2019 07:57:41 +0100 Subject: [Gluster-devel] I/O performance In-Reply-To: References: <20190201120349.GM4509@homeworld.netbsd.org> Message-ID: On Wed, Feb 6, 2019 at 7:00 AM Poornima Gurusiddaiah wrote: > > > On Tue, Feb 5, 2019, 10:53 PM Xavi Hernandez >> On Fri, Feb 1, 2019 at 1:51 PM Xavi Hernandez >> wrote: >> >>> On Fri, Feb 1, 2019 at 1:25 PM Poornima Gurusiddaiah < >>> pgurusid at redhat.com> wrote: >>> >>>> Can the threads be categorised to do certain kinds of fops? >>>> >>> >>> Could be, but creating multiple thread groups for different tasks is >>> generally bad because many times you end up with lots of idle threads which >>> waste resources and could increase contention. I think we should only >>> differentiate threads if it's absolutely necessary. 
>>> >>> >>>> Read/write affinitise to certain set of threads, the other metadata >>>> fops to other set of threads. So we limit the read/write threads and not >>>> the metadata threads? Also if aio is enabled in the backend the threads >>>> will not be blocked on disk IO right? >>>> >>> >>> If we don't block the thread but we don't prevent more requests to go to >>> the disk, then we'll probably have the same problem. Anyway, I'll try to >>> run some tests with AIO to see if anything changes. >>> >> >> I've run some simple tests with AIO enabled and results are not good. A >> simple dd takes >25% more time. Multiple parallel dd take 35% more time to >> complete. >> > > > Thank you. That is strange! Had few questions, what tests are you running > for measuring the io-threads performance(not particularly aoi)? is it dd > from multiple clients? > Yes, it's a bit strange. What I see is that many threads from the thread pool are active but using very little CPU. I also see an AIO thread for each brick, but its CPU usage is not big either. Wait time is always 0 (I think this is a side effect of AIO activity). However system load grows very high. I've seen around 50, while on the normal test without AIO it's stays around 20-25. Right now I'm running the tests on a single machine (no real network communication) using an NVMe disk as storage. I use a single mount point. The tests I'm running are these: - Single dd, 128 GiB, blocks of 1MiB - 16 parallel dd, 8 GiB per dd, blocks of 1MiB - fio in sequential write mode, direct I/O, blocks of 128k, 16 threads, 8GiB per file - fio in sequential read mode, direct I/O, blocks of 128k, 16 threads, 8GiB per file - fio in random write mode, direct I/O, blocks of 128k, 16 threads, 8GiB per file - fio in random read mode, direct I/O, blocks of 128k, 16 threads, 8GiB per file - smallfile create, 16 threads, 256 files per thread, 32 MiB per file (with one brick down, for the following test) - self-heal of an entire brick (from the previous smallfile test) - pgbench init phase with scale 100 I run all these tests for a replica 3 volume and a disperse 4+2 volume. Xavi > Regards, > Poornima > > >> Xavi >> >> >>> All this is based on the assumption that large number of parallel read >>>> writes make the disk perf bad but not the large number of dentry and >>>> metadata ops. Is that true? >>>> >>> >>> It depends. If metadata is not cached, it's as bad as a read or write >>> since it requires a disk access (a clear example of this is the bad >>> performance of 'ls' in cold cache, which is basically metadata reads). In >>> fact, cached data reads are also very fast, and data writes could go to the >>> cache and be updated later in background, so I think the important point is >>> if things are cached or not, instead of if they are data or metadata. Since >>> we don't have this information from the user side, it's hard to tell what's >>> better. My opinion is that we shouldn't differentiate requests of >>> data/metadata. If metadata requests happen to be faster, then that thread >>> will be able to handle other requests immediately, which seems good enough. >>> >>> However there's one thing that I would do. I would differentiate reads >>> (data or metadata) from writes. Normally writes come from cached >>> information that is flushed to disk at some point, so this normally happens >>> in the background. But reads tend to be in foreground, meaning that someone >>> (user or application) is waiting for it. So I would give preference to >>> reads over writes. 
To do so effectively, we need to not saturate the >>> backend, otherwise when we need to send a read, it will still need to wait >>> for all pending requests to complete. If disks are not saturated, we can >>> have the answer to the read quite fast, and then continue processing the >>> remaining writes. >>> >>> Anyway, I may be wrong, since all these things depend on too many >>> factors. I haven't done any specific tests about this. It's more like a >>> brainstorming. As soon as I can I would like to experiment with this and >>> get some empirical data. >>> >>> Xavi >>> >>> >>>> Thanks, >>>> Poornima >>>> >>>> >>>> On Fri, Feb 1, 2019, 5:34 PM Emmanuel Dreyfus >>> >>>>> On Thu, Jan 31, 2019 at 10:53:48PM -0800, Vijay Bellur wrote: >>>>> > Perhaps we could throttle both aspects - number of I/O requests per >>>>> disk >>>>> >>>>> While there it would be nice to detect and report a disk with lower >>>>> than >>>>> peer performance: that happen sometimes when a disk is dying, and last >>>>> time I was hit by that performance problem, I had a hard time finding >>>>> the culprit. >>>>> >>>>> -- >>>>> Emmanuel Dreyfus >>>>> manu at netbsd.org >>>>> _______________________________________________ >>>>> Gluster-devel mailing list >>>>> Gluster-devel at gluster.org >>>>> https://lists.gluster.org/mailman/listinfo/gluster-devel >>>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgurusid at redhat.com Wed Feb 6 08:54:13 2019 From: pgurusid at redhat.com (Poornima Gurusiddaiah) Date: Wed, 6 Feb 2019 14:24:13 +0530 Subject: [Gluster-devel] I/O performance In-Reply-To: References: <20190201120349.GM4509@homeworld.netbsd.org> Message-ID: Thank You for all the detailed explanation. If its the disk saturating, if we run some of the above mentioned tests(with multithreads) on plain xfs, we should hit the saturation right. Will try out some tests, this is interesting. Thanks, Poornima On Wed, Feb 6, 2019 at 12:27 PM Xavi Hernandez wrote: > On Wed, Feb 6, 2019 at 7:00 AM Poornima Gurusiddaiah > wrote: > >> >> >> On Tue, Feb 5, 2019, 10:53 PM Xavi Hernandez > wrote: >> >>> On Fri, Feb 1, 2019 at 1:51 PM Xavi Hernandez >>> wrote: >>> >>>> On Fri, Feb 1, 2019 at 1:25 PM Poornima Gurusiddaiah < >>>> pgurusid at redhat.com> wrote: >>>> >>>>> Can the threads be categorised to do certain kinds of fops? >>>>> >>>> >>>> Could be, but creating multiple thread groups for different tasks is >>>> generally bad because many times you end up with lots of idle threads which >>>> waste resources and could increase contention. I think we should only >>>> differentiate threads if it's absolutely necessary. >>>> >>>> >>>>> Read/write affinitise to certain set of threads, the other metadata >>>>> fops to other set of threads. So we limit the read/write threads and not >>>>> the metadata threads? Also if aio is enabled in the backend the threads >>>>> will not be blocked on disk IO right? >>>>> >>>> >>>> If we don't block the thread but we don't prevent more requests to go >>>> to the disk, then we'll probably have the same problem. Anyway, I'll try to >>>> run some tests with AIO to see if anything changes. >>>> >>> >>> I've run some simple tests with AIO enabled and results are not good. A >>> simple dd takes >25% more time. Multiple parallel dd take 35% more time to >>> complete. >>> >> >> >> Thank you. That is strange! Had few questions, what tests are you running >> for measuring the io-threads performance(not particularly aoi)? is it dd >> from multiple clients? 
>> > > Yes, it's a bit strange. What I see is that many threads from the thread > pool are active but using very little CPU. I also see an AIO thread for > each brick, but its CPU usage is not big either. Wait time is always 0 (I > think this is a side effect of AIO activity). However system load grows > very high. I've seen around 50, while on the normal test without AIO it's > stays around 20-25. > > Right now I'm running the tests on a single machine (no real network > communication) using an NVMe disk as storage. I use a single mount point. > The tests I'm running are these: > > - Single dd, 128 GiB, blocks of 1MiB > - 16 parallel dd, 8 GiB per dd, blocks of 1MiB > - fio in sequential write mode, direct I/O, blocks of 128k, 16 > threads, 8GiB per file > - fio in sequential read mode, direct I/O, blocks of 128k, 16 threads, > 8GiB per file > - fio in random write mode, direct I/O, blocks of 128k, 16 threads, > 8GiB per file > - fio in random read mode, direct I/O, blocks of 128k, 16 threads, > 8GiB per file > - smallfile create, 16 threads, 256 files per thread, 32 MiB per file > (with one brick down, for the following test) > - self-heal of an entire brick (from the previous smallfile test) > - pgbench init phase with scale 100 > > I run all these tests for a replica 3 volume and a disperse 4+2 volume. > > Xavi > > >> Regards, >> Poornima >> >> >>> Xavi >>> >>> >>>> All this is based on the assumption that large number of parallel read >>>>> writes make the disk perf bad but not the large number of dentry and >>>>> metadata ops. Is that true? >>>>> >>>> >>>> It depends. If metadata is not cached, it's as bad as a read or write >>>> since it requires a disk access (a clear example of this is the bad >>>> performance of 'ls' in cold cache, which is basically metadata reads). In >>>> fact, cached data reads are also very fast, and data writes could go to the >>>> cache and be updated later in background, so I think the important point is >>>> if things are cached or not, instead of if they are data or metadata. Since >>>> we don't have this information from the user side, it's hard to tell what's >>>> better. My opinion is that we shouldn't differentiate requests of >>>> data/metadata. If metadata requests happen to be faster, then that thread >>>> will be able to handle other requests immediately, which seems good enough. >>>> >>>> However there's one thing that I would do. I would differentiate reads >>>> (data or metadata) from writes. Normally writes come from cached >>>> information that is flushed to disk at some point, so this normally happens >>>> in the background. But reads tend to be in foreground, meaning that someone >>>> (user or application) is waiting for it. So I would give preference to >>>> reads over writes. To do so effectively, we need to not saturate the >>>> backend, otherwise when we need to send a read, it will still need to wait >>>> for all pending requests to complete. If disks are not saturated, we can >>>> have the answer to the read quite fast, and then continue processing the >>>> remaining writes. >>>> >>>> Anyway, I may be wrong, since all these things depend on too many >>>> factors. I haven't done any specific tests about this. It's more like a >>>> brainstorming. As soon as I can I would like to experiment with this and >>>> get some empirical data. 
>>>> >>>> Xavi >>>> >>>> >>>>> Thanks, >>>>> Poornima >>>>> >>>>> >>>>> On Fri, Feb 1, 2019, 5:34 PM Emmanuel Dreyfus >>>> >>>>>> On Thu, Jan 31, 2019 at 10:53:48PM -0800, Vijay Bellur wrote: >>>>>> > Perhaps we could throttle both aspects - number of I/O requests per >>>>>> disk >>>>>> >>>>>> While there it would be nice to detect and report a disk with lower >>>>>> than >>>>>> peer performance: that happen sometimes when a disk is dying, and last >>>>>> time I was hit by that performance problem, I had a hard time finding >>>>>> the culprit. >>>>>> >>>>>> -- >>>>>> Emmanuel Dreyfus >>>>>> manu at netbsd.org >>>>>> _______________________________________________ >>>>>> Gluster-devel mailing list >>>>>> Gluster-devel at gluster.org >>>>>> https://lists.gluster.org/mailman/listinfo/gluster-devel >>>>>> >>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From nigelb at redhat.com Thu Feb 7 13:16:59 2019 From: nigelb at redhat.com (Nigel Babu) Date: Thu, 7 Feb 2019 18:46:59 +0530 Subject: [Gluster-devel] Regression logs issue Message-ID: Hello folks, In the last week, if you have had a regression job that failed, you will not find a log for it. This is due to a mistake I made while deleting code. Rather than deleting the code for the push to an internal HTTP server, I also deleted a line which handled the log creation. Apologies for the mistake. This has now been corrected and the fix pushed to all regression nodes. Any future failures should have logs attached as artifacts. -- nigelb -------------- next part -------------- An HTML attachment was scrubbed... URL: From srangana at redhat.com Thu Feb 7 14:35:13 2019 From: srangana at redhat.com (Shyam Ranganathan) Date: Thu, 7 Feb 2019 09:35:13 -0500 Subject: [Gluster-devel] gfapi: Changed API signatures in release-6 Message-ID: <3693277e-97cd-7fef-b801-349bb44ecb89@redhat.com> Hi, A few GFAPI signatures have changed in release-6 and the list can be seen here [1]. The change is to adapt to a more elaborate stat structure than what POSIX provides and details around the same can be seen here [2]. If you have an application that compiles against gfapi, then it needs to adapt to the new APIs in case you are buildnig against master or against release-6. Existing compiled applications will continue to work as the older versions of the symbol are present and have not changed their ABI. Further, if your build environment still depends on a version less than release-6 or master, the builds and functionality remains the same (IOW, there is not immediate need to adapt to the new APIs). Components that I know integrate to gfapi and hence may need to change are, - Samba Gluster plugin - NFS Ganesha FSAL for Gluster - TCMU-Gluster integration - QEMU Gluster integration - Other language API bindings - go - python - - Anything else? Request respective maintainers or members working with these integration to post required patches to the respective projects. 
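A quick way to tell whether an already-built integration binds to the old or the new symbols is to look at the versioned gfapi symbols it references; a rough sketch (the binary and library paths are illustrative, and binutils' objdump is assumed to be available on the system):

  # symbol versions a consumer binary currently binds to
  objdump -T /path/to/integration-binary-or-.so | grep GFAPI_
  # e.g. glfs_stat@GFAPI_3.x  -> old prototype, keeps working via the old ABI
  #      glfs_stat@GFAPI_6.0  -> rebuilt against the new release-6 prototypes
  # symbol versions exported by the installed library itself
  objdump -T /usr/lib64/libgfapi.so.0 | grep GFAPI_

Anything that binds only to the pre-6.0 versions keeps running unchanged; the new prototypes need to be adopted when the component is rebuilt against the release-6 (or master) headers.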
Thanks, Shyam [1] APIs changed/added in release-6: https://github.com/gluster/glusterfs/blob/release-6/api/src/gfapi.map#L245 (NOTE: Version will change to 6.0 as this patch is merged https://review.gluster.org/c/glusterfs/+/22173 ) [2] Issue dealing with statx like stat information returns from gfapi: - https://github.com/gluster/glusterfs/issues/389 - https://github.com/gluster/glusterfs/issues/273 From nigelb at redhat.com Fri Feb 8 02:19:57 2019 From: nigelb at redhat.com (Nigel Babu) Date: Fri, 8 Feb 2019 07:49:57 +0530 Subject: [Gluster-devel] Jenkins switched over to new builders for regression Message-ID: Hello, We've reached the half way mark in the migration and half our builders today are now running on AWS. I've turned off the RAX builders and have them try to be online only if the AWS builders cannot handle the number of jobs running at any given point. The new builders are named builder2xx.aws.gluster.org. If you notice an infra issue with them, please file a bug. I will be working on adding more AWS builders during the day today. -- nigelb -------------- next part -------------- An HTML attachment was scrubbed... URL: From avishwan at redhat.com Fri Feb 8 12:01:15 2019 From: avishwan at redhat.com (Aravinda) Date: Fri, 08 Feb 2019 17:31:15 +0530 Subject: [Gluster-devel] Path based Geo-replication Message-ID: <42033e7a4dd95af649dcef5de1af8c6fe4024be8.camel@redhat.com> Hi All, I prepared a design document for Path based Geo-replication feature. Similar to existing GFID based solution, it uses Changelogs to detect the changes but converts to Path using GFID-to-path feature before syncing. Feel free to add comments, suggestions or any issues or challenges if I have not considered. https://docs.google.com/document/d/1gW5ETQxNiy9tt4uV1ohRH1g5AMmWLtbQYD3QPs_v8Ec/edit?usp=sharing -- regards Aravinda From nigelb at redhat.com Fri Feb 8 12:49:52 2019 From: nigelb at redhat.com (Nigel Babu) Date: Fri, 8 Feb 2019 18:19:52 +0530 Subject: [Gluster-devel] Jenkins switched over to new builders for regression In-Reply-To: References: Message-ID: All the RAX builders are now gone. We're running off AWS entirely now. Please file an infra bug if you notice something odd. For future reference, logs and cores are going to be available on https://logs.aws.gluster.org rather than individual build servers. This should, in the future, be printed in the logs. On Fri, Feb 8, 2019 at 7:49 AM Nigel Babu wrote: > Hello, > > We've reached the half way mark in the migration and half our builders > today are now running on AWS. I've turned off the RAX builders and have > them try to be online only if the AWS builders cannot handle the number of > jobs running at any given point. > > The new builders are named builder2xx.aws.gluster.org. If you notice an > infra issue with them, please file a bug. I will be working on adding more > AWS builders during the day today. > > -- > nigelb > -- nigelb -------------- next part -------------- An HTML attachment was scrubbed... URL: From jenkins at build.gluster.org Mon Feb 11 01:45:03 2019 From: jenkins at build.gluster.org (jenkins at build.gluster.org) Date: Mon, 11 Feb 2019 01:45:03 +0000 (UTC) Subject: [Gluster-devel] Weekly Untriaged Bugs Message-ID: <1866052718.15.1549849503746.JavaMail.jenkins@jenkins-el7.rht.gluster.org> [...truncated 6 lines...] https://bugzilla.redhat.com/1672076 / core: chrome / chromium crash on gluster, sqlite issue? 
https://bugzilla.redhat.com/1668227 / core: gluster(8) - Add SELinux context glusterd_brick_t to man page https://bugzilla.redhat.com/1670334 / core: Some memory leaks found in GlusterFS 5.3 https://bugzilla.redhat.com/1668239 / disperse: [man page] Gluster(8) - Missing disperse-data parameter Gluster Console Manager man page https://bugzilla.redhat.com/1672656 / eventsapi: glustereventsd: crash, ABRT report for package glusterfs has reached 100 occurrences https://bugzilla.redhat.com/1672258 / fuse: fuse takes memory and doesn't free https://bugzilla.redhat.com/1671014 / fuse: gluster-fuse seg fault PTHREAD_MUTEX_TYPE_ELISION https://bugzilla.redhat.com/1668118 / geo-replication: Failure to start geo-replication for tiered volume. https://bugzilla.redhat.com/1673058 / glusterd: Network throughput usage increased x5 https://bugzilla.redhat.com/1670382 / gluster-smb: parallel-readdir prevents directories and files listing https://bugzilla.redhat.com/1666326 / open-behind: reopening bug 1405147: Failed to dispatch handler: glusterfs seems to check for "write permission" instead for "file owner" during open() when writing to a file https://bugzilla.redhat.com/1668259 / packaging: Glusterfs 5.3 RPMs can't be build on rhel7 https://bugzilla.redhat.com/1672711 / packaging: Upgrade from glusterfs 3.12 to gluster 4/5 broken https://bugzilla.redhat.com/1666634 / protocol: nfs client cannot compile files on dispersed volume https://bugzilla.redhat.com/1668286 / read-ahead: READDIRP incorrectly updates posix-acl inode ctx https://bugzilla.redhat.com/1671207 / rpc: Several fixes on socket pollin and pollout return value https://bugzilla.redhat.com/1672480 / tests: Bugs Test Module tests failing on s390x https://bugzilla.redhat.com/1670155 / tiering: Tiered volume files disappear when a hot brick is failed/restored until the tier detached. [...truncated 2 lines...] -------------- next part -------------- A non-text attachment was scrubbed... Name: build.log Type: application/octet-stream Size: 2327 bytes Desc: not available URL: From vbellur at redhat.com Tue Feb 12 00:29:45 2019 From: vbellur at redhat.com (Vijay Bellur) Date: Mon, 11 Feb 2019 16:29:45 -0800 Subject: [Gluster-devel] I/O performance In-Reply-To: References: <20190201120349.GM4509@homeworld.netbsd.org> Message-ID: On Tue, Feb 5, 2019 at 10:57 PM Xavi Hernandez wrote: > On Wed, Feb 6, 2019 at 7:00 AM Poornima Gurusiddaiah > wrote: > >> >> >> On Tue, Feb 5, 2019, 10:53 PM Xavi Hernandez > wrote: >> >>> On Fri, Feb 1, 2019 at 1:51 PM Xavi Hernandez >>> wrote: >>> >>>> On Fri, Feb 1, 2019 at 1:25 PM Poornima Gurusiddaiah < >>>> pgurusid at redhat.com> wrote: >>>> >>>>> Can the threads be categorised to do certain kinds of fops? >>>>> >>>> >>>> Could be, but creating multiple thread groups for different tasks is >>>> generally bad because many times you end up with lots of idle threads which >>>> waste resources and could increase contention. I think we should only >>>> differentiate threads if it's absolutely necessary. >>>> >>>> >>>>> Read/write affinitise to certain set of threads, the other metadata >>>>> fops to other set of threads. So we limit the read/write threads and not >>>>> the metadata threads? Also if aio is enabled in the backend the threads >>>>> will not be blocked on disk IO right? >>>>> >>>> >>>> If we don't block the thread but we don't prevent more requests to go >>>> to the disk, then we'll probably have the same problem. Anyway, I'll try to >>>> run some tests with AIO to see if anything changes. 
>>>> >>> >>> I've run some simple tests with AIO enabled and results are not good. A >>> simple dd takes >25% more time. Multiple parallel dd take 35% more time to >>> complete. >>> >> >> >> Thank you. That is strange! Had few questions, what tests are you running >> for measuring the io-threads performance(not particularly aoi)? is it dd >> from multiple clients? >> > > Yes, it's a bit strange. What I see is that many threads from the thread > pool are active but using very little CPU. I also see an AIO thread for > each brick, but its CPU usage is not big either. Wait time is always 0 (I > think this is a side effect of AIO activity). However system load grows > very high. I've seen around 50, while on the normal test without AIO it's > stays around 20-25. > > Right now I'm running the tests on a single machine (no real network > communication) using an NVMe disk as storage. I use a single mount point. > The tests I'm running are these: > > - Single dd, 128 GiB, blocks of 1MiB > - 16 parallel dd, 8 GiB per dd, blocks of 1MiB > - fio in sequential write mode, direct I/O, blocks of 128k, 16 > threads, 8GiB per file > - fio in sequential read mode, direct I/O, blocks of 128k, 16 threads, > 8GiB per file > - fio in random write mode, direct I/O, blocks of 128k, 16 threads, > 8GiB per file > - fio in random read mode, direct I/O, blocks of 128k, 16 threads, > 8GiB per file > - smallfile create, 16 threads, 256 files per thread, 32 MiB per file > (with one brick down, for the following test) > - self-heal of an entire brick (from the previous smallfile test) > - pgbench init phase with scale 100 > > I run all these tests for a replica 3 volume and a disperse 4+2 volume. > Are these performance results available somewhere? I am quite curious to understand the performance gains on NVMe! Thanks, Vijay -------------- next part -------------- An HTML attachment was scrubbed... URL: From rgowdapp at redhat.com Tue Feb 12 12:08:00 2019 From: rgowdapp at redhat.com (Raghavendra Gowdappa) Date: Tue, 12 Feb 2019 17:38:00 +0530 Subject: [Gluster-devel] Disabling read-ahead and io-cache for native fuse mounts Message-ID: All, We've found perf xlators io-cache and read-ahead not adding any performance improvement. At best read-ahead is redundant due to kernel read-ahead and at worst io-cache is degrading the performance for workloads that doesn't involve re-read. Given that VFS already have both these functionalities, I am proposing to have these two translators turned off by default for native fuse mounts. For non-native fuse mounts like gfapi (NFS-ganesha/samba) we can have these xlators on by having custom profiles. Comments? [1] https://bugzilla.redhat.com/show_bug.cgi?id=1665029 regards, Raghavendra -------------- next part -------------- An HTML attachment was scrubbed... URL: From rgowdapp at redhat.com Tue Feb 12 13:22:43 2019 From: rgowdapp at redhat.com (Raghavendra Gowdappa) Date: Tue, 12 Feb 2019 18:52:43 +0530 Subject: [Gluster-devel] Disabling read-ahead and io-cache for native fuse mounts In-Reply-To: References: Message-ID: https://review.gluster.org/22203 On Tue, Feb 12, 2019 at 5:38 PM Raghavendra Gowdappa wrote: > All, > > We've found perf xlators io-cache and read-ahead not adding any > performance improvement. At best read-ahead is redundant due to kernel > read-ahead and at worst io-cache is degrading the performance for workloads > that doesn't involve re-read. 
Given that VFS already have both these > functionalities, I am proposing to have these two translators turned off by > default for native fuse mounts. > > For non-native fuse mounts like gfapi (NFS-ganesha/samba) we can have > these xlators on by having custom profiles. Comments? > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1665029 > > regards, > Raghavendra > -------------- next part -------------- An HTML attachment was scrubbed... URL: From moagrawa at redhat.com Tue Feb 12 13:44:29 2019 From: moagrawa at redhat.com (Mohit Agrawal) Date: Tue, 12 Feb 2019 19:14:29 +0530 Subject: [Gluster-devel] Failing test case ./tests/bugs/distribute/bug-1161311.t Message-ID: Hi, I have observed the test case ./tests/bugs/distribute/bug-1161311.t is getting timed out on build server at the time of running centos regression on one of my patch https://review.gluster.org/22166 I have executed test case for i in {1..30}; do time prove -vf ./tests/bugs/distribute/bug-1161311.t; done 30 times on softserv vm that is similar to build infra, the test case is not taking time more than 3 minutes but on build server test case is getting timed out. Kindly share your input if you are facing the same. Thanks, Mohit Agrawal -------------- next part -------------- An HTML attachment was scrubbed... URL: From rgowdapp at redhat.com Tue Feb 12 13:58:45 2019 From: rgowdapp at redhat.com (Raghavendra Gowdappa) Date: Tue, 12 Feb 2019 19:28:45 +0530 Subject: [Gluster-devel] Failing test case ./tests/bugs/distribute/bug-1161311.t In-Reply-To: References: Message-ID: On Tue, Feb 12, 2019 at 7:16 PM Mohit Agrawal wrote: > Hi, > > I have observed the test case ./tests/bugs/distribute/bug-1161311.t is > getting timed > I've seen failure of this too in some of my patches. out on build server at the time of running centos regression on one of my > patch https://review.gluster.org/22166 > > I have executed test case for i in {1..30}; do time prove -vf > ./tests/bugs/distribute/bug-1161311.t; done 30 times on softserv vm that is > similar to build infra, the test case is not taking time more than 3 > minutes but on build server test case is getting timed out. > > Kindly share your input if you are facing the same. > > Thanks, > Mohit Agrawal > _______________________________________________ > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel -------------- next part -------------- An HTML attachment was scrubbed... URL: From dkhandel at redhat.com Tue Feb 12 14:01:34 2019 From: dkhandel at redhat.com (Deepshikha Khandelwal) Date: Tue, 12 Feb 2019 19:31:34 +0530 Subject: [Gluster-devel] [Gluster-infra]Softserve will be down Message-ID: Hi, After all the RAX builders are moved to AWS. We are planning to migrate softserve application to AWS completely. As a part of this migration, we're bringing down softserve for a few days. So softserve will not be able to lend any more machines until we are ready with the migrated code. Thanks, Deepshikha From nbalacha at redhat.com Wed Feb 13 04:19:44 2019 From: nbalacha at redhat.com (Nithya Balachandran) Date: Wed, 13 Feb 2019 09:49:44 +0530 Subject: [Gluster-devel] Failing test case ./tests/bugs/distribute/bug-1161311.t In-Reply-To: References: Message-ID: I'll take a look at this today. The logs indicate the test completed in under 3 minutes but something seems to be holding up the cleanup. 
On Tue, 12 Feb 2019 at 19:30, Raghavendra Gowdappa wrote: > > > On Tue, Feb 12, 2019 at 7:16 PM Mohit Agrawal wrote: > >> Hi, >> >> I have observed the test case ./tests/bugs/distribute/bug-1161311.t is >> getting timed >> > > I've seen failure of this too in some of my patches. > > out on build server at the time of running centos regression on one of my >> patch https://review.gluster.org/22166 >> >> I have executed test case for i in {1..30}; do time prove -vf >> ./tests/bugs/distribute/bug-1161311.t; done 30 times on softserv vm that is >> similar to build infra, the test case is not taking time more than 3 >> minutes but on build server test case is getting timed out. >> >> Kindly share your input if you are facing the same. >> >> Thanks, >> Mohit Agrawal >> _______________________________________________ >> Gluster-devel mailing list >> Gluster-devel at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-devel > > _______________________________________________ > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel -------------- next part -------------- An HTML attachment was scrubbed... URL: From atumball at redhat.com Wed Feb 13 04:23:21 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Wed, 13 Feb 2019 09:53:21 +0530 Subject: [Gluster-devel] Failing test case ./tests/bugs/distribute/bug-1161311.t In-Reply-To: References: Message-ID: On Wed, Feb 13, 2019 at 9:51 AM Nithya Balachandran wrote: > I'll take a look at this today. The logs indicate the test completed in > under 3 minutes but something seems to be holding up the cleanup. > > Just a look on some successful runs show output like below: -- *17:44:49* ok 57, LINENUM:155*17:44:49* umount: /d/backends/patchy1: target is busy.*17:44:49* (In some cases useful info about processes that use*17:44:49* the device is found by lsof(8) or fuser(1))*17:44:49* umount: /d/backends/patchy2: target is busy.*17:44:49* (In some cases useful info about processes that use*17:44:49* the device is found by lsof(8) or fuser(1))*17:44:49* umount: /d/backends/patchy3: target is busy.*17:44:49* (In some cases useful info about processes that use*17:44:49* the device is found by lsof(8) or fuser(1))*17:44:49* N*17:44:49* ok -- This is just before finish, so , the cleanup is being held for sure. Regards, Amar On Tue, 12 Feb 2019 at 19:30, Raghavendra Gowdappa > wrote: > >> >> >> On Tue, Feb 12, 2019 at 7:16 PM Mohit Agrawal >> wrote: >> >>> Hi, >>> >>> I have observed the test case ./tests/bugs/distribute/bug-1161311.t is >>> getting timed >>> >> >> I've seen failure of this too in some of my patches. >> >> out on build server at the time of running centos regression on one of my >>> patch https://review.gluster.org/22166 >>> >>> I have executed test case for i in {1..30}; do time prove -vf >>> ./tests/bugs/distribute/bug-1161311.t; done 30 times on softserv vm that is >>> similar to build infra, the test case is not taking time more than 3 >>> minutes but on build server test case is getting timed out. >>> >>> Kindly share your input if you are facing the same. 
>>> >>> Thanks, >>> Mohit Agrawal >>> _______________________________________________ >>> Gluster-devel mailing list >>> Gluster-devel at gluster.org >>> https://lists.gluster.org/mailman/listinfo/gluster-devel >> >> _______________________________________________ >> Gluster-devel mailing list >> Gluster-devel at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-devel > > _______________________________________________ > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel -- Amar Tumballi (amarts) -------------- next part -------------- An HTML attachment was scrubbed... URL: From rgowdapp at redhat.com Wed Feb 13 04:30:36 2019 From: rgowdapp at redhat.com (Raghavendra Gowdappa) Date: Wed, 13 Feb 2019 10:00:36 +0530 Subject: [Gluster-devel] Failing test case ./tests/bugs/distribute/bug-1161311.t In-Reply-To: References: Message-ID: On Wed, Feb 13, 2019 at 9:54 AM Amar Tumballi Suryanarayan < atumball at redhat.com> wrote: > > > On Wed, Feb 13, 2019 at 9:51 AM Nithya Balachandran > wrote: > >> I'll take a look at this today. The logs indicate the test completed in >> under 3 minutes but something seems to be holding up the cleanup. >> >> > Just a look on some successful runs show output like below: > > -- > > *17:44:49* ok 57, LINENUM:155*17:44:49* umount: /d/backends/patchy1: target is busy.*17:44:49* (In some cases useful info about processes that use*17:44:49* the device is found by lsof(8) or fuser(1))*17:44:49* umount: /d/backends/patchy2: target is busy.*17:44:49* (In some cases useful info about processes that use*17:44:49* the device is found by lsof(8) or fuser(1))*17:44:49* umount: /d/backends/patchy3: target is busy.*17:44:49* (In some cases useful info about processes that use*17:44:49* the device is found by lsof(8) or fuser(1))*17:44:49* N*17:44:49* ok > > -- > > This is just before finish, so , the cleanup is being held for sure. > Yes. In my tests too, I saw these msgs. But, i thought they are not accounted in waiting time. > Regards, > Amar > > On Tue, 12 Feb 2019 at 19:30, Raghavendra Gowdappa >> wrote: >> >>> >>> >>> On Tue, Feb 12, 2019 at 7:16 PM Mohit Agrawal >>> wrote: >>> >>>> Hi, >>>> >>>> I have observed the test case ./tests/bugs/distribute/bug-1161311.t is >>>> getting timed >>>> >>> >>> I've seen failure of this too in some of my patches. >>> >>> out on build server at the time of running centos regression on one of >>>> my patch https://review.gluster.org/22166 >>>> >>>> I have executed test case for i in {1..30}; do time prove -vf >>>> ./tests/bugs/distribute/bug-1161311.t; done 30 times on softserv vm that is >>>> similar to build infra, the test case is not taking time more than 3 >>>> minutes but on build server test case is getting timed out. >>>> >>>> Kindly share your input if you are facing the same. 
>>>> >>>> Thanks, >>>> Mohit Agrawal >>>> _______________________________________________ >>>> Gluster-devel mailing list >>>> Gluster-devel at gluster.org >>>> https://lists.gluster.org/mailman/listinfo/gluster-devel >>> >>> _______________________________________________ >>> Gluster-devel mailing list >>> Gluster-devel at gluster.org >>> https://lists.gluster.org/mailman/listinfo/gluster-devel >> >> _______________________________________________ >> Gluster-devel mailing list >> Gluster-devel at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-devel > > > > -- > Amar Tumballi (amarts) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nbalacha at redhat.com Wed Feb 13 04:39:00 2019 From: nbalacha at redhat.com (Nithya Balachandran) Date: Wed, 13 Feb 2019 10:09:00 +0530 Subject: [Gluster-devel] Failing test case ./tests/bugs/distribute/bug-1161311.t In-Reply-To: References: Message-ID: The volume is not stopped before unmounting the bricks. I will send a fix. On Wed, 13 Feb 2019 at 10:00, Raghavendra Gowdappa wrote: > > > On Wed, Feb 13, 2019 at 9:54 AM Amar Tumballi Suryanarayan < > atumball at redhat.com> wrote: > >> >> >> On Wed, Feb 13, 2019 at 9:51 AM Nithya Balachandran >> wrote: >> >>> I'll take a look at this today. The logs indicate the test completed in >>> under 3 minutes but something seems to be holding up the cleanup. >>> >>> >> Just a look on some successful runs show output like below: >> >> -- >> >> *17:44:49* ok 57, LINENUM:155*17:44:49* umount: /d/backends/patchy1: target is busy.*17:44:49* (In some cases useful info about processes that use*17:44:49* the device is found by lsof(8) or fuser(1))*17:44:49* umount: /d/backends/patchy2: target is busy.*17:44:49* (In some cases useful info about processes that use*17:44:49* the device is found by lsof(8) or fuser(1))*17:44:49* umount: /d/backends/patchy3: target is busy.*17:44:49* (In some cases useful info about processes that use*17:44:49* the device is found by lsof(8) or fuser(1))*17:44:49* N*17:44:49* ok >> >> -- >> >> This is just before finish, so , the cleanup is being held for sure. >> > > Yes. In my tests too, I saw these msgs. But, i thought they are not > accounted in waiting time. > > >> Regards, >> Amar >> >> On Tue, 12 Feb 2019 at 19:30, Raghavendra Gowdappa >>> wrote: >>> >>>> >>>> >>>> On Tue, Feb 12, 2019 at 7:16 PM Mohit Agrawal >>>> wrote: >>>> >>>>> Hi, >>>>> >>>>> I have observed the test case ./tests/bugs/distribute/bug-1161311.t is >>>>> getting timed >>>>> >>>> >>>> I've seen failure of this too in some of my patches. >>>> >>>> out on build server at the time of running centos regression on one of >>>>> my patch https://review.gluster.org/22166 >>>>> >>>>> I have executed test case for i in {1..30}; do time prove -vf >>>>> ./tests/bugs/distribute/bug-1161311.t; done 30 times on softserv vm that is >>>>> similar to build infra, the test case is not taking time more than 3 >>>>> minutes but on build server test case is getting timed out. >>>>> >>>>> Kindly share your input if you are facing the same. 
>>>>> >>>>> Thanks, >>>>> Mohit Agrawal >>>>> _______________________________________________ >>>>> Gluster-devel mailing list >>>>> Gluster-devel at gluster.org >>>>> https://lists.gluster.org/mailman/listinfo/gluster-devel >>>> >>>> _______________________________________________ >>>> Gluster-devel mailing list >>>> Gluster-devel at gluster.org >>>> https://lists.gluster.org/mailman/listinfo/gluster-devel >>> >>> _______________________________________________ >>> Gluster-devel mailing list >>> Gluster-devel at gluster.org >>> https://lists.gluster.org/mailman/listinfo/gluster-devel >> >> >> >> -- >> Amar Tumballi (amarts) >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rgowdapp at redhat.com Wed Feb 13 05:14:37 2019 From: rgowdapp at redhat.com (Raghavendra Gowdappa) Date: Wed, 13 Feb 2019 10:44:37 +0530 Subject: [Gluster-devel] [Gluster-users] Disabling read-ahead and io-cache for native fuse mounts In-Reply-To: <59A9002B-F427-4D94-A653-31A99DEF6CD8@onholyground.com> References: <59A9002B-F427-4D94-A653-31A99DEF6CD8@onholyground.com> Message-ID: On Tue, Feb 12, 2019 at 11:09 PM Darrell Budic wrote: > Is there an example of a custom profile you can share for my ovirt use > case (with gfapi enabled)? > I was speaking about a group setting like "group metadata-cache". Its just that custom options one would turn on for a class of applications or problems. Or are you just talking about the standard group settings for virt as a > custom profile? > > On Feb 12, 2019, at 7:22 AM, Raghavendra Gowdappa > wrote: > > https://review.gluster.org/22203 > > On Tue, Feb 12, 2019 at 5:38 PM Raghavendra Gowdappa > wrote: > >> All, >> >> We've found perf xlators io-cache and read-ahead not adding any >> performance improvement. At best read-ahead is redundant due to kernel >> read-ahead and at worst io-cache is degrading the performance for workloads >> that doesn't involve re-read. Given that VFS already have both these >> functionalities, I am proposing to have these two translators turned off by >> default for native fuse mounts. >> >> For non-native fuse mounts like gfapi (NFS-ganesha/samba) we can have >> these xlators on by having custom profiles. Comments? >> >> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1665029 >> >> regards, >> Raghavendra >> > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rgowdapp at redhat.com Wed Feb 13 05:21:38 2019 From: rgowdapp at redhat.com (Raghavendra Gowdappa) Date: Wed, 13 Feb 2019 10:51:38 +0530 Subject: [Gluster-devel] Disabling read-ahead and io-cache for native fuse mounts In-Reply-To: References: Message-ID: On Tue, Feb 12, 2019 at 5:38 PM Raghavendra Gowdappa wrote: > All, > > We've found perf xlators io-cache and read-ahead not adding any > performance improvement. At best read-ahead is redundant due to kernel > read-ahead > One thing we are still figuring out is whether kernel read-ahead is tunable. From what we've explored, it _looks_ like (may not be entirely correct), ra is capped at 128KB. If that's the case, I am interested in few things: * Are there any realworld applications/usecases, which would benefit from larger read-ahead (Manoj says block devices can do ra of 4MB)? * Is the limit on kernel ra tunable a hard one? IOW, what does it take to make it to do higher ra? 
If its difficult, can glusterfs read-ahead provide the expected performance improvement for these applications that would benefit from aggressive ra (as glusterfs can support larger ra sizes)? I am still inclined to prefer kernel ra as I think its more intelligent and can identify more sequential patterns than Glusterfs read-ahead [1][2]. [1] https://www.kernel.org/doc/ols/2007/ols2007v2-pages-273-284.pdf [2] https://lwn.net/Articles/155510/ and at worst io-cache is degrading the performance for workloads that > doesn't involve re-read. Given that VFS already have both these > functionalities, I am proposing to have these two translators turned off by > default for native fuse mounts. > > For non-native fuse mounts like gfapi (NFS-ganesha/samba) we can have > these xlators on by having custom profiles. Comments? > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1665029 > > regards, > Raghavendra > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mpillai at redhat.com Wed Feb 13 05:45:32 2019 From: mpillai at redhat.com (Manoj Pillai) Date: Wed, 13 Feb 2019 11:15:32 +0530 Subject: [Gluster-devel] Disabling read-ahead and io-cache for native fuse mounts In-Reply-To: References: Message-ID: On Wed, Feb 13, 2019 at 10:51 AM Raghavendra Gowdappa wrote: > > > On Tue, Feb 12, 2019 at 5:38 PM Raghavendra Gowdappa > wrote: > >> All, >> >> We've found perf xlators io-cache and read-ahead not adding any >> performance improvement. At best read-ahead is redundant due to kernel >> read-ahead >> > > One thing we are still figuring out is whether kernel read-ahead is > tunable. From what we've explored, it _looks_ like (may not be entirely > correct), ra is capped at 128KB. If that's the case, I am interested in few > things: > * Are there any realworld applications/usecases, which would benefit from > larger read-ahead (Manoj says block devices can do ra of 4MB)? > kernel read-ahead is adaptive but influenced by the read-ahead setting on the block device (/sys/block//queue/read_ahead_kb), which can be tuned. For RHEL specifically, the default is 128KB (last I checked) but the default RHEL tuned-profile, throughput-performance, bumps that up to 4MB. It should be fairly easy to rig up a test where 4MB read-ahead on the block device gives better performance than 128KB read-ahead. -- Manoj * Is the limit on kernel ra tunable a hard one? IOW, what does it take to > make it to do higher ra? If its difficult, can glusterfs read-ahead provide > the expected performance improvement for these applications that would > benefit from aggressive ra (as glusterfs can support larger ra sizes)? > > I am still inclined to prefer kernel ra as I think its more intelligent > and can identify more sequential patterns than Glusterfs read-ahead [1][2]. > [1] https://www.kernel.org/doc/ols/2007/ols2007v2-pages-273-284.pdf > [2] https://lwn.net/Articles/155510/ > > and at worst io-cache is degrading the performance for workloads that >> doesn't involve re-read. Given that VFS already have both these >> functionalities, I am proposing to have these two translators turned off by >> default for native fuse mounts. >> >> For non-native fuse mounts like gfapi (NFS-ganesha/samba) we can have >> these xlators on by having custom profiles. Comments? >> >> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1665029 >> >> regards, >> Raghavendra >> > -------------- next part -------------- An HTML attachment was scrubbed... 
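On the question of how far kernel read-ahead can be pushed for a mounted filesystem: besides the per-block-device read_ahead_kb setting, the per-file interfaces an application has include posix_fadvise() and the Linux readahead(2) call. Whether these translate into larger read-ahead windows through FUSE is exactly the open question above; the snippet below only shows the interfaces and is not a claim about FUSE behaviour.

#define _GNU_SOURCE          /* for readahead(2) */
#include <fcntl.h>
#include <unistd.h>

/* Illustrative only: the standard per-file knobs for influencing kernel
 * read-ahead on an already-open file descriptor. */
static void
hint_sequential(int fd, off_t offset, off_t len)
{
    /* Ask the kernel to use a more aggressive read-ahead window. */
    (void)posix_fadvise(fd, offset, len, POSIX_FADV_SEQUENTIAL);

    /* Or explicitly populate the page cache for a given range up front. */
    (void)readahead(fd, offset, (size_t)len);
}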
URL: From rgowdapp at redhat.com Wed Feb 13 06:03:05 2019 From: rgowdapp at redhat.com (Raghavendra Gowdappa) Date: Wed, 13 Feb 2019 11:33:05 +0530 Subject: [Gluster-devel] Disabling read-ahead and io-cache for native fuse mounts In-Reply-To: References: Message-ID: On Wed, Feb 13, 2019 at 11:16 AM Manoj Pillai wrote: > > > On Wed, Feb 13, 2019 at 10:51 AM Raghavendra Gowdappa > wrote: > >> >> >> On Tue, Feb 12, 2019 at 5:38 PM Raghavendra Gowdappa >> wrote: >> >>> All, >>> >>> We've found perf xlators io-cache and read-ahead not adding any >>> performance improvement. At best read-ahead is redundant due to kernel >>> read-ahead >>> >> >> One thing we are still figuring out is whether kernel read-ahead is >> tunable. From what we've explored, it _looks_ like (may not be entirely >> correct), ra is capped at 128KB. If that's the case, I am interested in few >> things: >> * Are there any realworld applications/usecases, which would benefit from >> larger read-ahead (Manoj says block devices can do ra of 4MB)? >> > > kernel read-ahead is adaptive but influenced by the read-ahead setting on > the block device (/sys/block//queue/read_ahead_kb), which can be > tuned. For RHEL specifically, the default is 128KB (last I checked) but the > default RHEL tuned-profile, throughput-performance, bumps that up to 4MB. > It should be fairly easy to rig up a test where 4MB read-ahead on the > block device gives better performance than 128KB read-ahead. > Thanks Manoj. To add to what Manoj said and give more context here, Glusterfs being a fuse-based fs is not exposed as a block device. So, that's the first problem of where/how to tune and I've listed other problems earlier. > -- Manoj > > * Is the limit on kernel ra tunable a hard one? IOW, what does it take to >> make it to do higher ra? If its difficult, can glusterfs read-ahead provide >> the expected performance improvement for these applications that would >> benefit from aggressive ra (as glusterfs can support larger ra sizes)? >> >> I am still inclined to prefer kernel ra as I think its more intelligent >> and can identify more sequential patterns than Glusterfs read-ahead [1][2]. >> [1] https://www.kernel.org/doc/ols/2007/ols2007v2-pages-273-284.pdf >> [2] https://lwn.net/Articles/155510/ >> >> and at worst io-cache is degrading the performance for workloads that >>> doesn't involve re-read. Given that VFS already have both these >>> functionalities, I am proposing to have these two translators turned off by >>> default for native fuse mounts. >>> >>> For non-native fuse mounts like gfapi (NFS-ganesha/samba) we can have >>> these xlators on by having custom profiles. Comments? >>> >>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1665029 >>> >>> regards, >>> Raghavendra >>> >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From xhernandez at redhat.com Wed Feb 13 10:34:50 2019 From: xhernandez at redhat.com (Xavi Hernandez) Date: Wed, 13 Feb 2019 11:34:50 +0100 Subject: [Gluster-devel] I/O performance In-Reply-To: References: <20190201120349.GM4509@homeworld.netbsd.org> Message-ID: On Tue, Feb 12, 2019 at 1:30 AM Vijay Bellur wrote: > > > On Tue, Feb 5, 2019 at 10:57 PM Xavi Hernandez > wrote: > >> On Wed, Feb 6, 2019 at 7:00 AM Poornima Gurusiddaiah >> wrote: >> >>> >>> >>> On Tue, Feb 5, 2019, 10:53 PM Xavi Hernandez >> wrote: >>> >>>> On Fri, Feb 1, 2019 at 1:51 PM Xavi Hernandez >>>> wrote: >>>> >>>>> On Fri, Feb 1, 2019 at 1:25 PM Poornima Gurusiddaiah < >>>>> pgurusid at redhat.com> wrote: >>>>> >>>>>> Can the threads be categorised to do certain kinds of fops? >>>>>> >>>>> >>>>> Could be, but creating multiple thread groups for different tasks is >>>>> generally bad because many times you end up with lots of idle threads which >>>>> waste resources and could increase contention. I think we should only >>>>> differentiate threads if it's absolutely necessary. >>>>> >>>>> >>>>>> Read/write affinitise to certain set of threads, the other metadata >>>>>> fops to other set of threads. So we limit the read/write threads and not >>>>>> the metadata threads? Also if aio is enabled in the backend the threads >>>>>> will not be blocked on disk IO right? >>>>>> >>>>> >>>>> If we don't block the thread but we don't prevent more requests to go >>>>> to the disk, then we'll probably have the same problem. Anyway, I'll try to >>>>> run some tests with AIO to see if anything changes. >>>>> >>>> >>>> I've run some simple tests with AIO enabled and results are not good. A >>>> simple dd takes >25% more time. Multiple parallel dd take 35% more time to >>>> complete. >>>> >>> >>> >>> Thank you. That is strange! Had few questions, what tests are you >>> running for measuring the io-threads performance(not particularly aoi)? is >>> it dd from multiple clients? >>> >> >> Yes, it's a bit strange. What I see is that many threads from the thread >> pool are active but using very little CPU. I also see an AIO thread for >> each brick, but its CPU usage is not big either. Wait time is always 0 (I >> think this is a side effect of AIO activity). However system load grows >> very high. I've seen around 50, while on the normal test without AIO it's >> stays around 20-25. >> >> Right now I'm running the tests on a single machine (no real network >> communication) using an NVMe disk as storage. I use a single mount point. >> The tests I'm running are these: >> >> - Single dd, 128 GiB, blocks of 1MiB >> - 16 parallel dd, 8 GiB per dd, blocks of 1MiB >> - fio in sequential write mode, direct I/O, blocks of 128k, 16 >> threads, 8GiB per file >> - fio in sequential read mode, direct I/O, blocks of 128k, 16 >> threads, 8GiB per file >> - fio in random write mode, direct I/O, blocks of 128k, 16 threads, >> 8GiB per file >> - fio in random read mode, direct I/O, blocks of 128k, 16 threads, >> 8GiB per file >> - smallfile create, 16 threads, 256 files per thread, 32 MiB per file >> (with one brick down, for the following test) >> - self-heal of an entire brick (from the previous smallfile test) >> - pgbench init phase with scale 100 >> >> I run all these tests for a replica 3 volume and a disperse 4+2 volume. >> > > > Are these performance results available somewhere? I am quite curious to > understand the performance gains on NVMe! > I'm updating test results with the latest build. I'll report it here once it's complete. 
Xavi > -------------- next part -------------- An HTML attachment was scrubbed... URL: From atumball at redhat.com Thu Feb 14 05:41:36 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Thu, 14 Feb 2019 11:11:36 +0530 Subject: [Gluster-devel] Gluster Container Storage: Release Update Message-ID: Hello everyone, We are announcing v1.0RC release of GlusterCS this week!** The version 1.0 is due along with *glusterfs-6.0* next month. Below are the Goals for v1.0: - RWX PVs - Scale and Performance - RWO PVs - Simple, leaner stack with Gluster?s Virtual Block. - Thin Arbiter (2 DataCenter Replicate) Support for RWX volume. - RWO hosting volume to use Thin Arbiter volume type would be still in Alpha. - Integrated monitoring. - Simple Install / Overall user-experience. Along with above, we are in Alpha state to support GCS on ARM architecture. We are also trying to get the website done for GCS @ https://gluster.github.io/gcs We are looking for some validation of the GCS containers, and the overall gluster stack, in your k8s setup. While we are focusing more on getting stability, and better user-experience, we are also trying to ship few tech-preview items, for early preview. The main item on this is loopback based bricks ( https://github.com/gluster/glusterd2/pull/1473), which allows us to bring more data services on top of Gluster with more options in container world, specially with backup and recovery. The above feature also makes better snapshot/clone story for gluster in containers with reflink support on XFS. *(NOTE: this will be a future improvement)* This email is a request for help with regard to testing and feedback on this new stack, in its alpha release tag. Do let us know if there are any concerns. We are ready to take anything from ?This is BS!!? to ?Wow! this looks really simple, works without hassle? [image: :smile:] Btw, if you are interested to try / help, few things to note: - GCS uses CSI spec v1.0, which is only available from k8s 1.13+ - We do have weekly meetings on GCS as announced in https://lists.gluster.org/pipermail/gluster-devel/2019-January/055774.html - Feel free to jump in if interested. - ie, Every Thursday, 15:00 UTC. - GCS doesn?t have any operator support yet, but for simplicity, you can also try using https://github.com/aravindavk/kubectl-gluster - Planned to be integrated in later versions. - We are not great at creating cool website, help in making GCS homepage would be great too :-) Interested? feel free to jump into Architecture call today. Regards, Gluster Container Storage Team PS: The meeting minutes, where the release pointers were discussed is @ https://hackmd.io/sj9ik9SCTYm81YcQDOOrtw?both ** - subject to resolving some blockers @ https://waffle.io/gluster/gcs?label=GCS%2F1.0 -- Amar Tumballi (amarts) -------------- next part -------------- An HTML attachment was scrubbed... URL: From xhernandez at redhat.com Thu Feb 14 07:35:16 2019 From: xhernandez at redhat.com (Xavi Hernandez) Date: Thu, 14 Feb 2019 08:35:16 +0100 Subject: [Gluster-devel] I/O performance In-Reply-To: References: <20190201120349.GM4509@homeworld.netbsd.org> Message-ID: Here are the results of the last run: https://docs.google.com/spreadsheets/d/19JqvuFKZxKifgrhLF-5-bgemYj8XKldUox1QwsmGj2k/edit?usp=sharing Each test has been run with a rough approximation of the best configuration I've found (in number of client and brick threads), but I haven't done an exhaustive search of the best configuration in each case. 
The "fio rand write" test seems to have a big regression. An initial check of the data shows that 2 of the 5 runs have taken > 50% more time. I'll try to check why. Many of the tests show a very high disk utilization, so comparisons may not be accurate. In any case it's clear that we need a method to automatically adjust the number of worker threads to the given load to make this useful. Without that it's virtually impossible to find a fixed number of threads that will work fine in all cases. I'm currently working on this. Xavi On Wed, Feb 13, 2019 at 11:34 AM Xavi Hernandez wrote: > On Tue, Feb 12, 2019 at 1:30 AM Vijay Bellur wrote: > >> >> >> On Tue, Feb 5, 2019 at 10:57 PM Xavi Hernandez >> wrote: >> >>> On Wed, Feb 6, 2019 at 7:00 AM Poornima Gurusiddaiah < >>> pgurusid at redhat.com> wrote: >>> >>>> >>>> >>>> On Tue, Feb 5, 2019, 10:53 PM Xavi Hernandez >>> wrote: >>>> >>>>> On Fri, Feb 1, 2019 at 1:51 PM Xavi Hernandez >>>>> wrote: >>>>> >>>>>> On Fri, Feb 1, 2019 at 1:25 PM Poornima Gurusiddaiah < >>>>>> pgurusid at redhat.com> wrote: >>>>>> >>>>>>> Can the threads be categorised to do certain kinds of fops? >>>>>>> >>>>>> >>>>>> Could be, but creating multiple thread groups for different tasks is >>>>>> generally bad because many times you end up with lots of idle threads which >>>>>> waste resources and could increase contention. I think we should only >>>>>> differentiate threads if it's absolutely necessary. >>>>>> >>>>>> >>>>>>> Read/write affinitise to certain set of threads, the other metadata >>>>>>> fops to other set of threads. So we limit the read/write threads and not >>>>>>> the metadata threads? Also if aio is enabled in the backend the threads >>>>>>> will not be blocked on disk IO right? >>>>>>> >>>>>> >>>>>> If we don't block the thread but we don't prevent more requests to go >>>>>> to the disk, then we'll probably have the same problem. Anyway, I'll try to >>>>>> run some tests with AIO to see if anything changes. >>>>>> >>>>> >>>>> I've run some simple tests with AIO enabled and results are not good. >>>>> A simple dd takes >25% more time. Multiple parallel dd take 35% more time >>>>> to complete. >>>>> >>>> >>>> >>>> Thank you. That is strange! Had few questions, what tests are you >>>> running for measuring the io-threads performance(not particularly aoi)? is >>>> it dd from multiple clients? >>>> >>> >>> Yes, it's a bit strange. What I see is that many threads from the thread >>> pool are active but using very little CPU. I also see an AIO thread for >>> each brick, but its CPU usage is not big either. Wait time is always 0 (I >>> think this is a side effect of AIO activity). However system load grows >>> very high. I've seen around 50, while on the normal test without AIO it's >>> stays around 20-25. >>> >>> Right now I'm running the tests on a single machine (no real network >>> communication) using an NVMe disk as storage. I use a single mount point. 
>>> The tests I'm running are these: >>> >>> - Single dd, 128 GiB, blocks of 1MiB >>> - 16 parallel dd, 8 GiB per dd, blocks of 1MiB >>> - fio in sequential write mode, direct I/O, blocks of 128k, 16 >>> threads, 8GiB per file >>> - fio in sequential read mode, direct I/O, blocks of 128k, 16 >>> threads, 8GiB per file >>> - fio in random write mode, direct I/O, blocks of 128k, 16 threads, >>> 8GiB per file >>> - fio in random read mode, direct I/O, blocks of 128k, 16 threads, >>> 8GiB per file >>> - smallfile create, 16 threads, 256 files per thread, 32 MiB per >>> file (with one brick down, for the following test) >>> - self-heal of an entire brick (from the previous smallfile test) >>> - pgbench init phase with scale 100 >>> >>> I run all these tests for a replica 3 volume and a disperse 4+2 volume. >>> >> >> >> Are these performance results available somewhere? I am quite curious to >> understand the performance gains on NVMe! >> > > I'm updating test results with the latest build. I'll report it here once > it's complete. > > Xavi > >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From jenkins at build.gluster.org Mon Feb 18 01:45:03 2019 From: jenkins at build.gluster.org (jenkins at build.gluster.org) Date: Mon, 18 Feb 2019 01:45:03 +0000 (UTC) Subject: [Gluster-devel] Weekly Untriaged Bugs Message-ID: <2077170077.36.1550454304159.JavaMail.jenkins@jenkins-el7.rht.gluster.org> [...truncated 6 lines...] https://bugzilla.redhat.com/1672076 / core: chrome / chromium crash on gluster, sqlite issue? https://bugzilla.redhat.com/1668227 / core: gluster(8) - Add SELinux context glusterd_brick_t to man page https://bugzilla.redhat.com/1677555 / core: Glusterfs brick is crashed due to segfault caused by broken gfid symlink https://bugzilla.redhat.com/1674412 / core: listing a file while writing to it causes deadlock https://bugzilla.redhat.com/1673058 / core: Network throughput usage increased x5 https://bugzilla.redhat.com/1670334 / core: Some memory leaks found in GlusterFS 5.3 https://bugzilla.redhat.com/1668239 / disperse: [man page] Gluster(8) - Missing disperse-data parameter Gluster Console Manager man page https://bugzilla.redhat.com/1676429 / distribute: distribute: Perf regression in mkdir path https://bugzilla.redhat.com/1672656 / eventsapi: glustereventsd: crash, ABRT report for package glusterfs has reached 100 occurrences https://bugzilla.redhat.com/1672258 / fuse: fuse takes memory and doesn't free https://bugzilla.redhat.com/1668118 / geo-replication: Failure to start geo-replication for tiered volume. https://bugzilla.redhat.com/1676546 / glusterd: Getting client connection error in gluster logs https://bugzilla.redhat.com/1670382 / gluster-smb: parallel-readdir prevents directories and files listing https://bugzilla.redhat.com/1677557 / nfs: gNFS crashed when processing "gluster v profile [vol] info nfs" https://bugzilla.redhat.com/1677559 / nfs: gNFS crashed when processing "gluster v profile [vol] info nfs" https://bugzilla.redhat.com/1677804 / posix-acl: POSIX ACLs are absent on FUSE-mounted volume using tmpfs bricks (posix-acl-autoload usually returns -1) https://bugzilla.redhat.com/1668286 / read-ahead: READDIRP incorrectly updates posix-acl inode ctx https://bugzilla.redhat.com/1671207 / rpc: Several fixes on socket pollin and pollout return value https://bugzilla.redhat.com/1670155 / tiering: Tiered volume files disappear when a hot brick is failed/restored until the tier detached. 
https://bugzilla.redhat.com/1676356 / write-behind: glusterfs FUSE client crashing every few days with 'Failed to dispatch handler' [...truncated 2 lines...] -------------- next part -------------- A non-text attachment was scrubbed... Name: build.log Type: application/octet-stream Size: 2540 bytes Desc: not available URL: From budic at onholyground.com Tue Feb 12 17:39:12 2019 From: budic at onholyground.com (Darrell Budic) Date: Tue, 12 Feb 2019 11:39:12 -0600 Subject: [Gluster-devel] [Gluster-users] Disabling read-ahead and io-cache for native fuse mounts In-Reply-To: References: Message-ID: <59A9002B-F427-4D94-A653-31A99DEF6CD8@onholyground.com> Is there an example of a custom profile you can share for my ovirt use case (with gfapi enabled)? Or are you just talking about the standard group settings for virt as a custom profile? > On Feb 12, 2019, at 7:22 AM, Raghavendra Gowdappa wrote: > > https://review.gluster.org/22203 > > On Tue, Feb 12, 2019 at 5:38 PM Raghavendra Gowdappa > wrote: > All, > > We've found perf xlators io-cache and read-ahead not adding any performance improvement. At best read-ahead is redundant due to kernel read-ahead and at worst io-cache is degrading the performance for workloads that doesn't involve re-read. Given that VFS already have both these functionalities, I am proposing to have these two translators turned off by default for native fuse mounts. > > For non-native fuse mounts like gfapi (NFS-ganesha/samba) we can have these xlators on by having custom profiles. Comments? > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1665029 > > regards, > Raghavendra > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From budic at onholyground.com Wed Feb 13 14:51:06 2019 From: budic at onholyground.com (Darrell Budic) Date: Wed, 13 Feb 2019 08:51:06 -0600 Subject: [Gluster-devel] [Gluster-users] Disabling read-ahead and io-cache for native fuse mounts In-Reply-To: References: <59A9002B-F427-4D94-A653-31A99DEF6CD8@onholyground.com> Message-ID: <869C2772-A443-4668-AA0B-B7ACB7A865B5@onholyground.com> Ah, ok, that?s what I thought. Then I have no complaints about improved defaults for the fuse case as long as the use case groups retain appropriately optimized settings. Thanks! > On Feb 12, 2019, at 11:14 PM, Raghavendra Gowdappa wrote: > > > > On Tue, Feb 12, 2019 at 11:09 PM Darrell Budic > wrote: > Is there an example of a custom profile you can share for my ovirt use case (with gfapi enabled)? > > I was speaking about a group setting like "group metadata-cache". Its just that custom options one would turn on for a class of applications or problems. > > Or are you just talking about the standard group settings for virt as a custom profile? > >> On Feb 12, 2019, at 7:22 AM, Raghavendra Gowdappa > wrote: >> >> https://review.gluster.org/22203 >> >> On Tue, Feb 12, 2019 at 5:38 PM Raghavendra Gowdappa > wrote: >> All, >> >> We've found perf xlators io-cache and read-ahead not adding any performance improvement. At best read-ahead is redundant due to kernel read-ahead and at worst io-cache is degrading the performance for workloads that doesn't involve re-read. Given that VFS already have both these functionalities, I am proposing to have these two translators turned off by default for native fuse mounts. 
>> >> For non-native fuse mounts like gfapi (NFS-ganesha/samba) we can have these xlators on by having custom profiles. Comments? >> >> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1665029 >> >> regards, >> Raghavendra >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chandranana.Naik at ibm.com Wed Feb 13 05:00:28 2019 From: Chandranana.Naik at ibm.com (Chandranana Naik) Date: Wed, 13 Feb 2019 10:30:28 +0530 Subject: [Gluster-devel] GlusterFs v4.1.5: Need help on bitrot detection Message-ID: Hi Team, We are working with Glusterfs v4.1.5 on big endian platform(Ubuntu 16.04) and encountered that the subtest 20 of test ./tests/bitrot/bug-1207627-bitrot-scrub-status.t is failing. Subtest 20 is failing as below: trusted.bit-rot.bad-file check_for_xattr trusted.bit-rot.bad-file //d/backends/patchy1/FILE1 not ok 20 Got "" instead of "trusted.bit-rot.bad-file", LINENUM:50 FAILED COMMAND: trusted.bit-rot.bad-file check_for_xattr trusted.bit-rot.bad-file //d/backends/patchy1/FILE1 The test is failing with error "remote operation failed [Cannot allocate memory]" logged in /var/log/glusterfs/scrub.log. Could you please let us know if anything is missing in making this test pass, PFA the logs for the test case (See attached file: bug-1207627-bitrot-scrub-status.7z) Note: Enough memory is available on the system. Regards, Chandranana Naik -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: bug-1207627-bitrot-scrub-status.7z Type: application/octet-stream Size: 11030 bytes Desc: not available URL: From revirii at googlemail.com Wed Feb 13 07:43:03 2019 From: revirii at googlemail.com (Hu Bert) Date: Wed, 13 Feb 2019 08:43:03 +0100 Subject: [Gluster-devel] [Gluster-users] Disabling read-ahead and io-cache for native fuse mounts In-Reply-To: References: Message-ID: fyi: we have 3 servers, each with 2 SW RAID10 used as bricks in a replicate 3 setup (so 2 volumes); the default values set by OS (debian stretch) are: /dev/md3 Array Size : 29298911232 (27941.62 GiB 30002.09 GB) /sys/block/md3/queue/read_ahead_kb : 3027 /dev/md4 Array Size : 19532607488 (18627.75 GiB 20001.39 GB) /sys/block/md4/queue/read_ahead_kb : 2048 maybe that helps somehow :) Hubert Am Mi., 13. Feb. 2019 um 06:46 Uhr schrieb Manoj Pillai : > > > > On Wed, Feb 13, 2019 at 10:51 AM Raghavendra Gowdappa wrote: >> >> >> >> On Tue, Feb 12, 2019 at 5:38 PM Raghavendra Gowdappa wrote: >>> >>> All, >>> >>> We've found perf xlators io-cache and read-ahead not adding any performance improvement. At best read-ahead is redundant due to kernel read-ahead >> >> >> One thing we are still figuring out is whether kernel read-ahead is tunable. From what we've explored, it _looks_ like (may not be entirely correct), ra is capped at 128KB. If that's the case, I am interested in few things: >> * Are there any realworld applications/usecases, which would benefit from larger read-ahead (Manoj says block devices can do ra of 4MB)? > > > kernel read-ahead is adaptive but influenced by the read-ahead setting on the block device (/sys/block//queue/read_ahead_kb), which can be tuned. For RHEL specifically, the default is 128KB (last I checked) but the default RHEL tuned-profile, throughput-performance, bumps that up to 4MB. 
It should be fairly easy to rig up a test where 4MB read-ahead on the block device gives better performance than 128KB read-ahead. > > -- Manoj > >> * Is the limit on kernel ra tunable a hard one? IOW, what does it take to make it to do higher ra? If its difficult, can glusterfs read-ahead provide the expected performance improvement for these applications that would benefit from aggressive ra (as glusterfs can support larger ra sizes)? >> >> I am still inclined to prefer kernel ra as I think its more intelligent and can identify more sequential patterns than Glusterfs read-ahead [1][2]. >> [1] https://www.kernel.org/doc/ols/2007/ols2007v2-pages-273-284.pdf >> [2] https://lwn.net/Articles/155510/ >> >>> and at worst io-cache is degrading the performance for workloads that doesn't involve re-read. Given that VFS already have both these functionalities, I am proposing to have these two translators turned off by default for native fuse mounts. >>> >>> For non-native fuse mounts like gfapi (NFS-ganesha/samba) we can have these xlators on by having custom profiles. Comments? >>> >>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1665029 >>> >>> regards, >>> Raghavendra > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users From srangana at redhat.com Mon Feb 18 20:06:36 2019 From: srangana at redhat.com (Shyam Ranganathan) Date: Mon, 18 Feb 2019 15:06:36 -0500 Subject: [Gluster-devel] Release 6: Branched and next steps In-Reply-To: <54d9fe44-5d59-db0d-2e76-4583351c7eba@redhat.com> References: <54d9fe44-5d59-db0d-2e76-4583351c7eba@redhat.com> Message-ID: In preparation for RC0 I have put up an intial patch for the release notes [1]. Request the following actions on the same (either a followup patchset, or a dependent one), - Please review! - Required GD2 section updated to latest GD2 status - Require notes on "Reduce the number or threads used in the brick process" and the actual status of the same in the notes RC0 build target would be tomorrow or by Wednesday. Thanks, Shyam [1] Release notes patch: https://review.gluster.org/c/glusterfs/+/22226 On 2/5/19 8:25 PM, Shyam Ranganathan wrote: > Hi, > > Release 6 is branched, and tracker bug for 6.0 is created [1]. > > Do mark blockers for the release against [1]. > > As of now we are only tracking [2] "core: implement a global thread pool > " for a backport as a feature into the release. > > We expect to create RC0 tag and builds for upgrade and other testing > close to mid-week next week (around 13th Feb), and the release is slated > for the first week of March for GA. > > I will post updates to this thread around release notes and other > related activity. > > Thanks, > Shyam > > [1] Tracker: https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-6.0 > > [2] Patches tracked for a backport: > - https://review.gluster.org/c/glusterfs/+/20636 > _______________________________________________ > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel > From spisla80 at gmail.com Tue Feb 19 12:07:35 2019 From: spisla80 at gmail.com (David Spisla) Date: Tue, 19 Feb 2019 13:07:35 +0100 Subject: [Gluster-devel] md-cache: May bug found in md-cache.c Message-ID: Hi folks, The 'struct md_cache' in md-cache.c uses int data types which are not in common with the data types used in the 'struct iatt' in iatt.h . 
If one take a closer look to the implementations one can see that the struct in md-cache.c uses still the int data types like in the struct 'old_iatt' . This can lead to unexpected side effects and some values of iatt maybe will not mapped correctly. I would suggest to open a bug report. What do you think? Additional info: struct md_cache { ia_prot_t md_prot; uint32_t md_nlink; uint32_t md_uid; uint32_t md_gid; uint32_t md_atime; uint32_t md_atime_nsec; uint32_t md_mtime; uint32_t md_mtime_nsec; uint32_t md_ctime; uint32_t md_ctime_nsec; uint64_t md_rdev; uint64_t md_size; uint64_t md_blocks; uint64_t invalidation_time; uint64_t generation; dict_t *xattr; char *linkname; time_t ia_time; time_t xa_time; gf_boolean_t need_lookup; gf_boolean_t valid; gf_boolean_t gen_rollover; gf_boolean_t invalidation_rollover; gf_lock_t lock; }; struct iatt { uint64_t ia_flags; uint64_t ia_ino; /* inode number */ uint64_t ia_dev; /* backing device ID */ uint64_t ia_rdev; /* device ID (if special file) */ uint64_t ia_size; /* file size in bytes */ uint32_t ia_nlink; /* Link count */ uint32_t ia_uid; /* user ID of owner */ uint32_t ia_gid; /* group ID of owner */ uint32_t ia_blksize; /* blocksize for filesystem I/O */ uint64_t ia_blocks; /* number of 512B blocks allocated */ int64_t ia_atime; /* last access time */ int64_t ia_mtime; /* last modification time */ int64_t ia_ctime; /* last status change time */ int64_t ia_btime; /* creation time. Fill using statx */ uint32_t ia_atime_nsec; uint32_t ia_mtime_nsec; uint32_t ia_ctime_nsec; uint32_t ia_btime_nsec; uint64_t ia_attributes; /* chattr related:compressed, immutable, * append only, encrypted etc.*/ uint64_t ia_attributes_mask; /* Mask for the attributes */ uuid_t ia_gfid; ia_type_t ia_type; /* type of file */ ia_prot_t ia_prot; /* protection */ }; struct old_iatt { uint64_t ia_ino; /* inode number */ uuid_t ia_gfid; uint64_t ia_dev; /* backing device ID */ ia_type_t ia_type; /* type of file */ ia_prot_t ia_prot; /* protection */ uint32_t ia_nlink; /* Link count */ uint32_t ia_uid; /* user ID of owner */ uint32_t ia_gid; /* group ID of owner */ uint64_t ia_rdev; /* device ID (if special file) */ uint64_t ia_size; /* file size in bytes */ uint32_t ia_blksize; /* blocksize for filesystem I/O */ uint64_t ia_blocks; /* number of 512B blocks allocated */ uint32_t ia_atime; /* last access time */ uint32_t ia_atime_nsec; uint32_t ia_mtime; /* last modification time */ uint32_t ia_mtime_nsec; uint32_t ia_ctime; /* last status change time */ uint32_t ia_ctime_nsec; }; -------------- next part -------------- An HTML attachment was scrubbed... URL: From spisla80 at gmail.com Tue Feb 19 15:18:20 2019 From: spisla80 at gmail.com (David Spisla) Date: Tue, 19 Feb 2019 16:18:20 +0100 Subject: [Gluster-devel] md-cache: May bug found in md-cache.c In-Reply-To: References: Message-ID: Hello, I already open a bug: https://bugzilla.redhat.com/show_bug.cgi?id=1678726 There is also a link to a bug fix patch Regards David Spisla Am Di., 19. Feb. 2019 um 13:07 Uhr schrieb David Spisla : > Hi folks, > > The 'struct md_cache' in md-cache.c uses int data types which are not in > common with the data types used in the 'struct iatt' in iatt.h . If one > take a closer look to the implementations one can see that the struct in > md-cache.c uses still the int data types like in the struct 'old_iatt' . > This can lead to unexpected side effects and some values of iatt maybe will > not mapped correctly. I would suggest to open a bug report. What do you > think? 
> > Additional info: > > struct md_cache { > ia_prot_t md_prot; > uint32_t md_nlink; > uint32_t md_uid; > uint32_t md_gid; > uint32_t md_atime; > uint32_t md_atime_nsec; > uint32_t md_mtime; > uint32_t md_mtime_nsec; > uint32_t md_ctime; > uint32_t md_ctime_nsec; > uint64_t md_rdev; > uint64_t md_size; > uint64_t md_blocks; > uint64_t invalidation_time; > uint64_t generation; > dict_t *xattr; > char *linkname; > time_t ia_time; > time_t xa_time; > gf_boolean_t need_lookup; > gf_boolean_t valid; > gf_boolean_t gen_rollover; > gf_boolean_t invalidation_rollover; > gf_lock_t lock; > }; > > struct iatt { > uint64_t ia_flags; > uint64_t ia_ino; /* inode number */ > uint64_t ia_dev; /* backing device ID */ > uint64_t ia_rdev; /* device ID (if special file) */ > uint64_t ia_size; /* file size in bytes */ > uint32_t ia_nlink; /* Link count */ > uint32_t ia_uid; /* user ID of owner */ > uint32_t ia_gid; /* group ID of owner */ > uint32_t ia_blksize; /* blocksize for filesystem I/O */ > uint64_t ia_blocks; /* number of 512B blocks allocated */ > int64_t ia_atime; /* last access time */ > int64_t ia_mtime; /* last modification time */ > int64_t ia_ctime; /* last status change time */ > int64_t ia_btime; /* creation time. Fill using statx */ > uint32_t ia_atime_nsec; > uint32_t ia_mtime_nsec; > uint32_t ia_ctime_nsec; > uint32_t ia_btime_nsec; > uint64_t ia_attributes; /* chattr related:compressed, immutable, > * append only, encrypted etc.*/ > uint64_t ia_attributes_mask; /* Mask for the attributes */ > > uuid_t ia_gfid; > ia_type_t ia_type; /* type of file */ > ia_prot_t ia_prot; /* protection */ > }; > > struct old_iatt { > uint64_t ia_ino; /* inode number */ > uuid_t ia_gfid; > uint64_t ia_dev; /* backing device ID */ > ia_type_t ia_type; /* type of file */ > ia_prot_t ia_prot; /* protection */ > uint32_t ia_nlink; /* Link count */ > uint32_t ia_uid; /* user ID of owner */ > uint32_t ia_gid; /* group ID of owner */ > uint64_t ia_rdev; /* device ID (if special file) */ > uint64_t ia_size; /* file size in bytes */ > uint32_t ia_blksize; /* blocksize for filesystem I/O */ > uint64_t ia_blocks; /* number of 512B blocks allocated */ > uint32_t ia_atime; /* last access time */ > uint32_t ia_atime_nsec; > uint32_t ia_mtime; /* last modification time */ > uint32_t ia_mtime_nsec; > uint32_t ia_ctime; /* last status change time */ > uint32_t ia_ctime_nsec; > }; > -------------- next part -------------- An HTML attachment was scrubbed... URL: From atumball at redhat.com Wed Feb 20 12:45:07 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Wed, 20 Feb 2019 18:15:07 +0530 Subject: [Gluster-devel] Release 6: Branched and next steps In-Reply-To: References: <54d9fe44-5d59-db0d-2e76-4583351c7eba@redhat.com> Message-ID: On Tue, Feb 19, 2019 at 1:37 AM Shyam Ranganathan wrote: > In preparation for RC0 I have put up an intial patch for the release > notes [1]. Request the following actions on the same (either a followup > patchset, or a dependent one), > > - Please review! > - Required GD2 section updated to latest GD2 status > I am inclined to drop the GD2 section for 'standalone' users. As the team worked with goals of making GD2 invisible with containers (GCS) in mind. So, should we call out any features of GD2 at all? Anyways, as per my previous email on GCS release updates, we are planning to have a container available with gd2 and glusterfs, which can be used by people who are trying out options with GD2. 
> - Require notes on "Reduce the number or threads used in the brick > process" and the actual status of the same in the notes > > This work is still in progress, and we are treating it as a bug fix for 'brick-multiplex' usecase, which is mainly required in scaled volume number usecase in container world. My guess is, we won't have much content to add for glusterfs-6.0 at the moment. > RC0 build target would be tomorrow or by Wednesday. > > Thanks, I was testing for few upgrade and different version clusters support. With 4.1.6 and latest release-6.0 branch, things works fine. I haven't done much of a load testing yet. Requesting people to support in upgrade testing. From different volume options, and different usecase scenarios. Regards, Amar > Thanks, > Shyam > > [1] Release notes patch: https://review.gluster.org/c/glusterfs/+/22226 > > On 2/5/19 8:25 PM, Shyam Ranganathan wrote: > > Hi, > > > > Release 6 is branched, and tracker bug for 6.0 is created [1]. > > > > Do mark blockers for the release against [1]. > > > > As of now we are only tracking [2] "core: implement a global thread pool > > " for a backport as a feature into the release. > > > > We expect to create RC0 tag and builds for upgrade and other testing > > close to mid-week next week (around 13th Feb), and the release is slated > > for the first week of March for GA. > > > > I will post updates to this thread around release notes and other > > related activity. > > > > Thanks, > > Shyam > > > > [1] Tracker: https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-6.0 > > > > [2] Patches tracked for a backport: > > - https://review.gluster.org/c/glusterfs/+/20636 > > _______________________________________________ > > Gluster-devel mailing list > > Gluster-devel at gluster.org > > https://lists.gluster.org/mailman/listinfo/gluster-devel > > > _______________________________________________ > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel > > > -- Amar Tumballi (amarts) -------------- next part -------------- An HTML attachment was scrubbed... URL: From atumball at redhat.com Wed Feb 20 13:17:38 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Wed, 20 Feb 2019 18:47:38 +0530 Subject: [Gluster-devel] md-cache: May bug found in md-cache.c In-Reply-To: References: Message-ID: Hi David, Thanks for the patch, it got merged in master now. Can you please post it into release branches, so we can take them in release-6, release-5 branch, so next releases can have them. Regards, Amar On Tue, Feb 19, 2019 at 8:49 PM David Spisla wrote: > Hello, > > I already open a bug: > https://bugzilla.redhat.com/show_bug.cgi?id=1678726 > > There is also a link to a bug fix patch > > Regards > David Spisla > > Am Di., 19. Feb. 2019 um 13:07 Uhr schrieb David Spisla < > spisla80 at gmail.com>: > >> Hi folks, >> >> The 'struct md_cache' in md-cache.c uses int data types which are not in >> common with the data types used in the 'struct iatt' in iatt.h . If one >> take a closer look to the implementations one can see that the struct in >> md-cache.c uses still the int data types like in the struct 'old_iatt' . >> This can lead to unexpected side effects and some values of iatt maybe will >> not mapped correctly. I would suggest to open a bug report. What do you >> think? 
>> >> Additional info: >> >> struct md_cache { >> ia_prot_t md_prot; >> uint32_t md_nlink; >> uint32_t md_uid; >> uint32_t md_gid; >> uint32_t md_atime; >> uint32_t md_atime_nsec; >> uint32_t md_mtime; >> uint32_t md_mtime_nsec; >> uint32_t md_ctime; >> uint32_t md_ctime_nsec; >> uint64_t md_rdev; >> uint64_t md_size; >> uint64_t md_blocks; >> uint64_t invalidation_time; >> uint64_t generation; >> dict_t *xattr; >> char *linkname; >> time_t ia_time; >> time_t xa_time; >> gf_boolean_t need_lookup; >> gf_boolean_t valid; >> gf_boolean_t gen_rollover; >> gf_boolean_t invalidation_rollover; >> gf_lock_t lock; >> }; >> >> struct iatt { >> uint64_t ia_flags; >> uint64_t ia_ino; /* inode number */ >> uint64_t ia_dev; /* backing device ID */ >> uint64_t ia_rdev; /* device ID (if special file) */ >> uint64_t ia_size; /* file size in bytes */ >> uint32_t ia_nlink; /* Link count */ >> uint32_t ia_uid; /* user ID of owner */ >> uint32_t ia_gid; /* group ID of owner */ >> uint32_t ia_blksize; /* blocksize for filesystem I/O */ >> uint64_t ia_blocks; /* number of 512B blocks allocated */ >> int64_t ia_atime; /* last access time */ >> int64_t ia_mtime; /* last modification time */ >> int64_t ia_ctime; /* last status change time */ >> int64_t ia_btime; /* creation time. Fill using statx */ >> uint32_t ia_atime_nsec; >> uint32_t ia_mtime_nsec; >> uint32_t ia_ctime_nsec; >> uint32_t ia_btime_nsec; >> uint64_t ia_attributes; /* chattr related:compressed, immutable, >> * append only, encrypted etc.*/ >> uint64_t ia_attributes_mask; /* Mask for the attributes */ >> >> uuid_t ia_gfid; >> ia_type_t ia_type; /* type of file */ >> ia_prot_t ia_prot; /* protection */ >> }; >> >> struct old_iatt { >> uint64_t ia_ino; /* inode number */ >> uuid_t ia_gfid; >> uint64_t ia_dev; /* backing device ID */ >> ia_type_t ia_type; /* type of file */ >> ia_prot_t ia_prot; /* protection */ >> uint32_t ia_nlink; /* Link count */ >> uint32_t ia_uid; /* user ID of owner */ >> uint32_t ia_gid; /* group ID of owner */ >> uint64_t ia_rdev; /* device ID (if special file) */ >> uint64_t ia_size; /* file size in bytes */ >> uint32_t ia_blksize; /* blocksize for filesystem I/O */ >> uint64_t ia_blocks; /* number of 512B blocks allocated */ >> uint32_t ia_atime; /* last access time */ >> uint32_t ia_atime_nsec; >> uint32_t ia_mtime; /* last modification time */ >> uint32_t ia_mtime_nsec; >> uint32_t ia_ctime; /* last status change time */ >> uint32_t ia_ctime_nsec; >> }; >> > _______________________________________________ > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel -- Amar Tumballi (amarts) -------------- next part -------------- An HTML attachment was scrubbed... URL: From atumball at redhat.com Wed Feb 20 13:55:18 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Wed, 20 Feb 2019 19:25:18 +0530 Subject: [Gluster-devel] md-cache: May bug found in md-cache.c In-Reply-To: References: Message-ID: Hi David, https://docs.gluster.org/en/latest/Developer-guide/Backport-Guidelines/ gives more details about it. But easiest is to go to your patch (https://review.gluster.org/22234), and then click on 'Cherry Pick' button. In the pop-up, 'branch:' field, give 'release-6' and Submit. If you want it in release-5 branch too, repeat the same, with branch being 'release-5'. Siimlarly we need 'clone-of' bug for both the branches (the original bug used in patch is for master branch). That should be it. Rest, we can take care. Thanks a lot! 
Regards, Amar On Wed, Feb 20, 2019 at 6:58 PM David Spisla wrote: > Hello Amar, > > > > no problem. How can I do that? Can you please tell me the procedure? > > > > Regards > > David > > > > *Von:* Amar Tumballi Suryanarayan > *Gesendet:* Mittwoch, 20. Februar 2019 14:18 > *An:* David Spisla > *Cc:* Gluster Devel > *Betreff:* Re: [Gluster-devel] md-cache: May bug found in md-cache.c > > > > Hi David, > > > > Thanks for the patch, it got merged in master now. Can you please post it > into release branches, so we can take them in release-6, release-5 branch, > so next releases can have them. > > > > Regards, > > Amar > > > > On Tue, Feb 19, 2019 at 8:49 PM David Spisla wrote: > > Hello, > > > > I already open a bug: > > https://bugzilla.redhat.com/show_bug.cgi?id=1678726 > > > > There is also a link to a bug fix patch > > > > Regards > > David Spisla > > > > Am Di., 19. Feb. 2019 um 13:07 Uhr schrieb David Spisla < > spisla80 at gmail.com>: > > Hi folks, > > > > The 'struct md_cache' in md-cache.c uses int data types which are not in > common with the data types used in the 'struct iatt' in iatt.h . If one > take a closer look to the implementations one can see that the struct in > md-cache.c uses still the int data types like in the struct 'old_iatt' . > This can lead to unexpected side effects and some values of iatt maybe will > not mapped correctly. I would suggest to open a bug report. What do you > think? > > Additional info: > > struct md_cache { > ia_prot_t md_prot; > uint32_t md_nlink; > uint32_t md_uid; > uint32_t md_gid; > uint32_t md_atime; > uint32_t md_atime_nsec; > uint32_t md_mtime; > uint32_t md_mtime_nsec; > uint32_t md_ctime; > uint32_t md_ctime_nsec; > uint64_t md_rdev; > uint64_t md_size; > uint64_t md_blocks; > uint64_t invalidation_time; > uint64_t generation; > dict_t *xattr; > char *linkname; > time_t ia_time; > time_t xa_time; > gf_boolean_t need_lookup; > gf_boolean_t valid; > gf_boolean_t gen_rollover; > gf_boolean_t invalidation_rollover; > gf_lock_t lock; > }; > > struct iatt { > uint64_t ia_flags; > uint64_t ia_ino; /* inode number */ > uint64_t ia_dev; /* backing device ID */ > uint64_t ia_rdev; /* device ID (if special file) */ > uint64_t ia_size; /* file size in bytes */ > uint32_t ia_nlink; /* Link count */ > uint32_t ia_uid; /* user ID of owner */ > uint32_t ia_gid; /* group ID of owner */ > uint32_t ia_blksize; /* blocksize for filesystem I/O */ > uint64_t ia_blocks; /* number of 512B blocks allocated */ > int64_t ia_atime; /* last access time */ > int64_t ia_mtime; /* last modification time */ > int64_t ia_ctime; /* last status change time */ > int64_t ia_btime; /* creation time. 
Fill using statx */ > uint32_t ia_atime_nsec; > uint32_t ia_mtime_nsec; > uint32_t ia_ctime_nsec; > uint32_t ia_btime_nsec; > uint64_t ia_attributes; /* chattr related:compressed, immutable, > * append only, encrypted etc.*/ > uint64_t ia_attributes_mask; /* Mask for the attributes */ > > uuid_t ia_gfid; > ia_type_t ia_type; /* type of file */ > ia_prot_t ia_prot; /* protection */ > }; > > struct old_iatt { > uint64_t ia_ino; /* inode number */ > uuid_t ia_gfid; > uint64_t ia_dev; /* backing device ID */ > ia_type_t ia_type; /* type of file */ > ia_prot_t ia_prot; /* protection */ > uint32_t ia_nlink; /* Link count */ > uint32_t ia_uid; /* user ID of owner */ > uint32_t ia_gid; /* group ID of owner */ > uint64_t ia_rdev; /* device ID (if special file) */ > uint64_t ia_size; /* file size in bytes */ > uint32_t ia_blksize; /* blocksize for filesystem I/O */ > uint64_t ia_blocks; /* number of 512B blocks allocated */ > uint32_t ia_atime; /* last access time */ > uint32_t ia_atime_nsec; > uint32_t ia_mtime; /* last modification time */ > uint32_t ia_mtime_nsec; > uint32_t ia_ctime; /* last status change time */ > uint32_t ia_ctime_nsec; > }; > > _______________________________________________ > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel > > > > > -- > > Amar Tumballi (amarts) > -- Amar Tumballi (amarts) -------------- next part -------------- An HTML attachment was scrubbed... URL: From srangana at redhat.com Wed Feb 20 14:40:12 2019 From: srangana at redhat.com (Shyam Ranganathan) Date: Wed, 20 Feb 2019 09:40:12 -0500 Subject: [Gluster-devel] Release 6: Branched and next steps In-Reply-To: References: <54d9fe44-5d59-db0d-2e76-4583351c7eba@redhat.com> Message-ID: <0bf75579-b777-a6f0-c848-0518aa0700c6@redhat.com> On 2/20/19 7:45 AM, Amar Tumballi Suryanarayan wrote: > > > On Tue, Feb 19, 2019 at 1:37 AM Shyam Ranganathan > wrote: > > In preparation for RC0 I have put up an intial patch for the release > notes [1]. Request the following actions on the same (either a followup > patchset, or a dependent one), > > - Please review! > - Required GD2 section updated to latest GD2 status > > > I am inclined to drop the GD2 section for 'standalone' users. As the > team worked with goals of making GD2 invisible with containers (GCS) in > mind. So, should we call out any features of GD2 at all? This is fine, we possibly need to add a note in the release notes, on the GD2 future and where it would land, so that we can inform users about the continued use of GD1 in non-GCS use cases. I will add some text around the same in the release-notes. > > Anyways, as per my previous email on GCS release updates, we are > planning to have a container available with gd2 and glusterfs, which can > be used by people who are trying out options with GD2. > ? > > - Require notes on "Reduce the number or threads used in the brick > process" and the actual status of the same in the notes > > > This work is still in progress, and we are treating it as a bug fix for > 'brick-multiplex' usecase, which is mainly required in scaled volume > number usecase in container world. My guess is, we won't have much > content to add for glusterfs-6.0 at the moment. Ack! > ? > > RC0 build target would be tomorrow or by Wednesday. > > > Thanks, I was testing for few upgrade and different version clusters > support. With 4.1.6 and latest release-6.0 branch, things works fine. I > haven't done much of a load testing yet. Awesome! Helps write out the upgrade guide as well. 
As this time content there would/should carry data regarding how to upgrade if any of the deprecated xlators are in use by a deployment. > > Requesting people to support in upgrade testing. From different volume > options, and different usecase scenarios. > > Regards, > Amar > > ? > > Thanks, > Shyam > > [1] Release notes patch: https://review.gluster.org/c/glusterfs/+/22226 > > On 2/5/19 8:25 PM, Shyam Ranganathan wrote: > > Hi, > > > > Release 6 is branched, and tracker bug for 6.0 is created [1]. > > > > Do mark blockers for the release against [1]. > > > > As of now we are only tracking [2] "core: implement a global > thread pool > > " for a backport as a feature into the release. > > > > We expect to create RC0 tag and builds for upgrade and other testing > > close to mid-week next week (around 13th Feb), and the release is > slated > > for the first week of March for GA. > > > > I will post updates to this thread around release notes and other > > related activity. > > > > Thanks, > > Shyam > > > > [1] Tracker: https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-6.0 > > > > [2] Patches tracked for a backport: > >? ?- https://review.gluster.org/c/glusterfs/+/20636 > > _______________________________________________ > > Gluster-devel mailing list > > Gluster-devel at gluster.org > > https://lists.gluster.org/mailman/listinfo/gluster-devel > > > _______________________________________________ > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel > > > > > -- > Amar Tumballi (amarts) From atumball at redhat.com Wed Feb 20 15:04:54 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Wed, 20 Feb 2019 20:34:54 +0530 Subject: [Gluster-devel] GlusterFs v4.1.5: Need help on bitrot detection In-Reply-To: References: Message-ID: Hi Chandranana, We are trying to find a BigEndian platform to test this out at the moment, will get back to you on this. Meantime, did you run the entire regression suit? Is it the only test failing? To run the entire regression suite, please run `run-tests.sh -c` from glusterfs source repo. -Amar On Tue, Feb 19, 2019 at 1:31 AM Chandranana Naik wrote: > Hi Team, > > We are working with Glusterfs v4.1.5 on big endian platform(Ubuntu 16.04) > and encountered that the subtest 20 of test > ./tests/bitrot/bug-1207627-bitrot-scrub-status.t is failing. > > Subtest 20 is failing as below: > *trusted.bit-rot.bad-file check_for_xattr trusted.bit-rot.bad-file > //d/backends/patchy1/FILE1* > *not ok 20 Got "" instead of "trusted.bit-rot.bad-file", LINENUM:50* > *FAILED COMMAND: trusted.bit-rot.bad-file check_for_xattr > trusted.bit-rot.bad-file //d/backends/patchy1/FILE1* > > The test is failing with error "*remote operation failed [Cannot allocate > memory]"* logged in /var/log/glusterfs/scrub.log. > Could you please let us know if anything is missing in making this test > pass, PFA the logs for the test case > > *(See attached file: bug-1207627-bitrot-scrub-status.7z)* > > Note: *Enough memory is available on the system*. > > Regards, > Chandranana Naik > _______________________________________________ > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel -- Amar Tumballi (amarts) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From nbalacha at redhat.com Thu Feb 21 03:15:31 2019 From: nbalacha at redhat.com (Nithya Balachandran) Date: Thu, 21 Feb 2019 08:45:31 +0530 Subject: [Gluster-devel] GlusterFs v4.1.5: Need help on bitrot detection In-Reply-To: References: Message-ID: On Wed, 20 Feb 2019 at 21:03, Amar Tumballi Suryanarayan < atumball at redhat.com> wrote: > Hi Chandranana, > > We are trying to find a BigEndian platform to test this out at the moment, > will get back to you on this. > > Meantime, did you run the entire regression suit? Is it the only test > failing? To run the entire regression suite, please run `run-tests.sh -c` > from glusterfs source repo. > > They are seeing other issues as well [1] , mostly related to the hashed values in Big endian systems and the hardcoded names and paths in the .t files. I have fixed 2 .t files and asked them to debug the remaining tests and provide patches as it was taking a long time to go back and forth with various suggested changes. There are debug logs attached for all the failing tests, including one for the failing bitrot case which indicates a very large value being returned in fgetxattr (probably also related to endianess). [2019-02-14 09:12:05.140750] D [MSGID: 0] [io-threads.c:372:iot_schedule] 0-patchy-io-threads: FGETXATTR scheduled as least priority fop [2019-02-14 09:12:05.140828] A [MSGID: 0] [mem-pool.c:118:__gf_calloc] : no memory available for size (176093659239) [call stack follows] /usr/local/lib/libglusterfs.so.0(+0x28eaa)[0x3ffb2da8eaa] /usr/local/lib/libglusterfs.so.0(_gf_msg_nomem+0x31c)[0x3ffb2da93c4] /usr/local/lib/libglusterfs.so.0(__gf_calloc+0x13c)[0x3ffb2dd595c] /usr/local/lib/glusterfs/4.1.5/xlator/features/bitrot-stub.so(+0xe3c4)[0x3ffae28e3c4] /usr/local/lib/glusterfs/4.1.5/xlator/storage/posix.so(+0x32154)[0x3ffae7b2154] Regards, Nithya [1] https://bugzilla.redhat.com/show_bug.cgi?id=1672480 > -Amar > > On Tue, Feb 19, 2019 at 1:31 AM Chandranana Naik > wrote: > >> Hi Team, >> >> We are working with Glusterfs v4.1.5 on big endian platform(Ubuntu 16.04) >> and encountered that the subtest 20 of test >> ./tests/bitrot/bug-1207627-bitrot-scrub-status.t is failing. >> >> Subtest 20 is failing as below: >> *trusted.bit-rot.bad-file check_for_xattr trusted.bit-rot.bad-file >> //d/backends/patchy1/FILE1* >> *not ok 20 Got "" instead of "trusted.bit-rot.bad-file", LINENUM:50* >> *FAILED COMMAND: trusted.bit-rot.bad-file check_for_xattr >> trusted.bit-rot.bad-file //d/backends/patchy1/FILE1* >> >> The test is failing with error "*remote operation failed [Cannot >> allocate memory]"* logged in /var/log/glusterfs/scrub.log. >> Could you please let us know if anything is missing in making this test >> pass, PFA the logs for the test case >> >> *(See attached file: bug-1207627-bitrot-scrub-status.7z)* >> >> Note: *Enough memory is available on the system*. >> >> Regards, >> Chandranana Naik >> _______________________________________________ >> Gluster-devel mailing list >> Gluster-devel at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-devel > > > > -- > Amar Tumballi (amarts) > _______________________________________________ > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dan at danfries.net Wed Feb 20 11:07:06 2019 From: dan at danfries.net (Dan Fries) Date: Wed, 20 Feb 2019 03:07:06 -0800 Subject: [Gluster-devel] Contributing to Gluster Message-ID: Dear Gluster, I hope you do not mind me contacting you directly, as I was given your email by a colleague. My name is Dan Fries, and I am a technical copywriter focused covering the open source software community. I'm reaching out to you today in the hopes of writing for Gluster. Is there any availability to contribute to the site as a guest author? I'm not seeking employment nor remuneration, only volunteer work. Thanks for your time and consideration. Dan __ *Daniel Fries - Linkedin * If I've reached you in error or you would prefer to not receive another message at this address, I apologize for the inconvenience. Click here and you shouldn't hear from me again. -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.spisla at iternity.com Wed Feb 20 14:14:30 2019 From: david.spisla at iternity.com (David Spisla) Date: Wed, 20 Feb 2019 14:14:30 +0000 Subject: [Gluster-devel] md-cache: May bug found in md-cache.c In-Reply-To: References: Message-ID: Hello Amar, it should be done. Thank you for the explanation Regards David Von: Amar Tumballi Suryanarayan Gesendet: Mittwoch, 20. Februar 2019 14:55 An: David Spisla Cc: Gluster Devel Betreff: Re: [Gluster-devel] md-cache: May bug found in md-cache.c Hi David, https://docs.gluster.org/en/latest/Developer-guide/Backport-Guidelines/ gives more details about it. But easiest is to go to your patch (https://review.gluster.org/22234), and then click on 'Cherry Pick' button. In the pop-up, 'branch:' field, give 'release-6' and Submit. If you want it in release-5 branch too, repeat the same, with branch being 'release-5'. Siimlarly we need 'clone-of' bug for both the branches (the original bug used in patch is for master branch). That should be it. Rest, we can take care. Thanks a lot! Regards, Amar On Wed, Feb 20, 2019 at 6:58 PM David Spisla > wrote: Hello Amar, no problem. How can I do that? Can you please tell me the procedure? Regards David Von: Amar Tumballi Suryanarayan > Gesendet: Mittwoch, 20. Februar 2019 14:18 An: David Spisla > Cc: Gluster Devel > Betreff: Re: [Gluster-devel] md-cache: May bug found in md-cache.c Hi David, Thanks for the patch, it got merged in master now. Can you please post it into release branches, so we can take them in release-6, release-5 branch, so next releases can have them. Regards, Amar On Tue, Feb 19, 2019 at 8:49 PM David Spisla > wrote: Hello, I already open a bug: https://bugzilla.redhat.com/show_bug.cgi?id=1678726 There is also a link to a bug fix patch Regards David Spisla Am Di., 19. Feb. 2019 um 13:07 Uhr schrieb David Spisla >: Hi folks, The 'struct md_cache' in md-cache.c uses int data types which are not in common with the data types used in the 'struct iatt' in iatt.h . If one take a closer look to the implementations one can see that the struct in md-cache.c uses still the int data types like in the struct 'old_iatt' . This can lead to unexpected side effects and some values of iatt maybe will not mapped correctly. I would suggest to open a bug report. What do you think? 
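To make the concern concrete before the struct listings below, here is a minimal, self-contained sketch of the truncation; the values are invented for illustration and this is not gluster code, it only mirrors the field widths involved:

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        /* 64-bit timestamps as struct iatt stores them */
        int64_t ia_mtime = 4500000000LL; /* a time in the year 2112 */
        int64_t ia_atime = -1;           /* pre-epoch / sentinel value */

        /* 32-bit fields as struct md_cache keeps them today */
        uint32_t md_mtime = (uint32_t)ia_mtime;
        uint32_t md_atime = (uint32_t)ia_atime;

        printf("iatt:     mtime=%lld atime=%lld\n",
               (long long)ia_mtime, (long long)ia_atime);
        printf("md-cache: mtime=%u atime=%u\n",
               (unsigned)md_mtime, (unsigned)md_atime);
        /* prints mtime=205032704 and atime=4294967295: both cached
         * values silently differ from what the brick reported */
        return 0;
    }
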
Additional info: struct md_cache { ia_prot_t md_prot; uint32_t md_nlink; uint32_t md_uid; uint32_t md_gid; uint32_t md_atime; uint32_t md_atime_nsec; uint32_t md_mtime; uint32_t md_mtime_nsec; uint32_t md_ctime; uint32_t md_ctime_nsec; uint64_t md_rdev; uint64_t md_size; uint64_t md_blocks; uint64_t invalidation_time; uint64_t generation; dict_t *xattr; char *linkname; time_t ia_time; time_t xa_time; gf_boolean_t need_lookup; gf_boolean_t valid; gf_boolean_t gen_rollover; gf_boolean_t invalidation_rollover; gf_lock_t lock; }; struct iatt { uint64_t ia_flags; uint64_t ia_ino; /* inode number */ uint64_t ia_dev; /* backing device ID */ uint64_t ia_rdev; /* device ID (if special file) */ uint64_t ia_size; /* file size in bytes */ uint32_t ia_nlink; /* Link count */ uint32_t ia_uid; /* user ID of owner */ uint32_t ia_gid; /* group ID of owner */ uint32_t ia_blksize; /* blocksize for filesystem I/O */ uint64_t ia_blocks; /* number of 512B blocks allocated */ int64_t ia_atime; /* last access time */ int64_t ia_mtime; /* last modification time */ int64_t ia_ctime; /* last status change time */ int64_t ia_btime; /* creation time. Fill using statx */ uint32_t ia_atime_nsec; uint32_t ia_mtime_nsec; uint32_t ia_ctime_nsec; uint32_t ia_btime_nsec; uint64_t ia_attributes; /* chattr related:compressed, immutable, * append only, encrypted etc.*/ uint64_t ia_attributes_mask; /* Mask for the attributes */ uuid_t ia_gfid; ia_type_t ia_type; /* type of file */ ia_prot_t ia_prot; /* protection */ }; struct old_iatt { uint64_t ia_ino; /* inode number */ uuid_t ia_gfid; uint64_t ia_dev; /* backing device ID */ ia_type_t ia_type; /* type of file */ ia_prot_t ia_prot; /* protection */ uint32_t ia_nlink; /* Link count */ uint32_t ia_uid; /* user ID of owner */ uint32_t ia_gid; /* group ID of owner */ uint64_t ia_rdev; /* device ID (if special file) */ uint64_t ia_size; /* file size in bytes */ uint32_t ia_blksize; /* blocksize for filesystem I/O */ uint64_t ia_blocks; /* number of 512B blocks allocated */ uint32_t ia_atime; /* last access time */ uint32_t ia_atime_nsec; uint32_t ia_mtime; /* last modification time */ uint32_t ia_mtime_nsec; uint32_t ia_ctime; /* last status change time */ uint32_t ia_ctime_nsec; }; _______________________________________________ Gluster-devel mailing list Gluster-devel at gluster.org https://lists.gluster.org/mailman/listinfo/gluster-devel -- Amar Tumballi (amarts) -- Amar Tumballi (amarts) -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.spisla at iternity.com Wed Feb 20 13:28:48 2019 From: david.spisla at iternity.com (David Spisla) Date: Wed, 20 Feb 2019 13:28:48 +0000 Subject: [Gluster-devel] md-cache: May bug found in md-cache.c In-Reply-To: References: Message-ID: Hello Amar, no problem. How can I do that? Can you please tell me the procedure? Regards David Von: Amar Tumballi Suryanarayan Gesendet: Mittwoch, 20. Februar 2019 14:18 An: David Spisla Cc: Gluster Devel Betreff: Re: [Gluster-devel] md-cache: May bug found in md-cache.c Hi David, Thanks for the patch, it got merged in master now. Can you please post it into release branches, so we can take them in release-6, release-5 branch, so next releases can have them. Regards, Amar On Tue, Feb 19, 2019 at 8:49 PM David Spisla > wrote: Hello, I already open a bug: https://bugzilla.redhat.com/show_bug.cgi?id=1678726 There is also a link to a bug fix patch Regards David Spisla Am Di., 19. Feb. 
2019 um 13:07 Uhr schrieb David Spisla >: Hi folks, The 'struct md_cache' in md-cache.c uses int data types which are not in common with the data types used in the 'struct iatt' in iatt.h . If one take a closer look to the implementations one can see that the struct in md-cache.c uses still the int data types like in the struct 'old_iatt' . This can lead to unexpected side effects and some values of iatt maybe will not mapped correctly. I would suggest to open a bug report. What do you think? Additional info: struct md_cache { ia_prot_t md_prot; uint32_t md_nlink; uint32_t md_uid; uint32_t md_gid; uint32_t md_atime; uint32_t md_atime_nsec; uint32_t md_mtime; uint32_t md_mtime_nsec; uint32_t md_ctime; uint32_t md_ctime_nsec; uint64_t md_rdev; uint64_t md_size; uint64_t md_blocks; uint64_t invalidation_time; uint64_t generation; dict_t *xattr; char *linkname; time_t ia_time; time_t xa_time; gf_boolean_t need_lookup; gf_boolean_t valid; gf_boolean_t gen_rollover; gf_boolean_t invalidation_rollover; gf_lock_t lock; }; struct iatt { uint64_t ia_flags; uint64_t ia_ino; /* inode number */ uint64_t ia_dev; /* backing device ID */ uint64_t ia_rdev; /* device ID (if special file) */ uint64_t ia_size; /* file size in bytes */ uint32_t ia_nlink; /* Link count */ uint32_t ia_uid; /* user ID of owner */ uint32_t ia_gid; /* group ID of owner */ uint32_t ia_blksize; /* blocksize for filesystem I/O */ uint64_t ia_blocks; /* number of 512B blocks allocated */ int64_t ia_atime; /* last access time */ int64_t ia_mtime; /* last modification time */ int64_t ia_ctime; /* last status change time */ int64_t ia_btime; /* creation time. Fill using statx */ uint32_t ia_atime_nsec; uint32_t ia_mtime_nsec; uint32_t ia_ctime_nsec; uint32_t ia_btime_nsec; uint64_t ia_attributes; /* chattr related:compressed, immutable, * append only, encrypted etc.*/ uint64_t ia_attributes_mask; /* Mask for the attributes */ uuid_t ia_gfid; ia_type_t ia_type; /* type of file */ ia_prot_t ia_prot; /* protection */ }; struct old_iatt { uint64_t ia_ino; /* inode number */ uuid_t ia_gfid; uint64_t ia_dev; /* backing device ID */ ia_type_t ia_type; /* type of file */ ia_prot_t ia_prot; /* protection */ uint32_t ia_nlink; /* Link count */ uint32_t ia_uid; /* user ID of owner */ uint32_t ia_gid; /* group ID of owner */ uint64_t ia_rdev; /* device ID (if special file) */ uint64_t ia_size; /* file size in bytes */ uint32_t ia_blksize; /* blocksize for filesystem I/O */ uint64_t ia_blocks; /* number of 512B blocks allocated */ uint32_t ia_atime; /* last access time */ uint32_t ia_atime_nsec; uint32_t ia_mtime; /* last modification time */ uint32_t ia_mtime_nsec; uint32_t ia_ctime; /* last status change time */ uint32_t ia_ctime_nsec; }; _______________________________________________ Gluster-devel mailing list Gluster-devel at gluster.org https://lists.gluster.org/mailman/listinfo/gluster-devel -- Amar Tumballi (amarts) -------------- next part -------------- An HTML attachment was scrubbed... URL: From amye at redhat.com Thu Feb 21 18:49:14 2019 From: amye at redhat.com (Amye Scavarda) Date: Thu, 21 Feb 2019 10:49:14 -0800 Subject: [Gluster-devel] Contributing to Gluster In-Reply-To: References: Message-ID: Hello! We're always welcoming contributors, what do you have in mind? - amye On Thu, Feb 21, 2019 at 10:41 AM Dan Fries wrote: > > Dear Gluster, > > I hope you do not mind me contacting you directly, as I was given your email by a colleague. 
My name is Dan Fries, and I am a technical copywriter focused covering the open source software community. > > I'm reaching out to you today in the hopes of writing for Gluster. Is there any availability to contribute to the site as a guest author? I'm not seeking employment nor remuneration, only volunteer work. > > Thanks for your time and consideration. > > Dan > __ > Daniel Fries - Linkedin > > > If I've reached you in error or you would prefer to not receive another message at this address, I apologize for the inconvenience. Click here and you shouldn't hear from me again. > > _______________________________________________ > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel -- Amye Scavarda | amye at redhat.com | Gluster Community Lead From jenkins at build.gluster.org Mon Feb 25 01:45:02 2019 From: jenkins at build.gluster.org (jenkins at build.gluster.org) Date: Mon, 25 Feb 2019 01:45:02 +0000 (UTC) Subject: [Gluster-devel] Weekly Untriaged Bugs Message-ID: <210522922.12.1551059102831.JavaMail.jenkins@jenkins-el7.rht.gluster.org> [...truncated 6 lines...] https://bugzilla.redhat.com/1672076 / core: chrome / chromium crash on gluster, sqlite issue? https://bugzilla.redhat.com/1679904 / core: client log flooding with intentional socket shutdown message when a brick is down https://bugzilla.redhat.com/1677555 / core: Glusterfs brick is crashed due to segfault caused by broken gfid symlink https://bugzilla.redhat.com/1674412 / core: listing a file while writing to it causes deadlock https://bugzilla.redhat.com/1673058 / core: Network throughput usage increased x5 https://bugzilla.redhat.com/1670334 / core: Some memory leaks found in GlusterFS 5.3 https://bugzilla.redhat.com/1676429 / distribute: distribute: Perf regression in mkdir path https://bugzilla.redhat.com/1672656 / eventsapi: glustereventsd: crash, ABRT report for package glusterfs has reached 100 occurrences https://bugzilla.redhat.com/1672258 / fuse: fuse takes memory and doesn't free https://bugzilla.redhat.com/1679892 / glusterd: assertion failure log in glusterd.log file when a volume start is triggered https://bugzilla.redhat.com/1679744 / glusterd: Minio gateway nas does not work with 2 + 1 dispersed volumes https://bugzilla.redhat.com/1678640 / glusterd: Running 'control-cpu-load.sh' prevents CTDB starting https://bugzilla.redhat.com/1670382 / gluster-smb: parallel-readdir prevents directories and files listing https://bugzilla.redhat.com/1679169 / md-cache: Integer Overflow possible in md-cache.c due to data type inconsistency https://bugzilla.redhat.com/1679170 / md-cache: Integer Overflow possible in md-cache.c due to data type inconsistency https://bugzilla.redhat.com/1677557 / nfs: gNFS crashed when processing "gluster v profile [vol] info nfs" https://bugzilla.redhat.com/1677804 / posix-acl: POSIX ACLs are absent on FUSE-mounted volume using tmpfs bricks (posix-acl-autoload usually returns -1) https://bugzilla.redhat.com/1678378 / project-infrastructure: Add a nightly build verification job in Jenkins for release-6 https://bugzilla.redhat.com/1676546 / replicate: Getting client connection error in gluster logs https://bugzilla.redhat.com/1671207 / rpc: Several fixes on socket pollin and pollout return value https://bugzilla.redhat.com/1670155 / tiering: Tiered volume files disappear when a hot brick is failed/restored until the tier detached. 
https://bugzilla.redhat.com/1676356 / write-behind: glusterfs FUSE client crashing every few days with 'Failed to dispatch handler' [...truncated 2 lines...] -------------- next part -------------- A non-text attachment was scrubbed... Name: build.log Type: application/octet-stream Size: 2807 bytes Desc: not available URL: From srangana at redhat.com Mon Feb 25 15:10:21 2019 From: srangana at redhat.com (Shyam Ranganathan) Date: Mon, 25 Feb 2019 10:10:21 -0500 Subject: [Gluster-devel] [Gluster-Maintainers] glusterfs-6.0rc0 released In-Reply-To: References: <430948742.3.1550808940070.JavaMail.jenkins@jenkins-el7.rht.gluster.org> Message-ID: Hi, Release-6 RC0 packages are built (see mail below). This is a good time to start testing the release bits, and reporting any issues on bugzilla. Do post on the lists any testing done and feedback from the same. We have about 2 weeks to GA of release-6 barring any major blockers uncovered during the test phase. Please take this time to help make the release effective, by testing the same. Thanks, Shyam NOTE: CentOS StorageSIG packages for the same are still pending and should be available in due course. On 2/23/19 9:41 AM, Kaleb Keithley wrote: > > GlusterFS 6.0rc0 is built in Fedora 30 and Fedora 31/rawhide. > > Packages for Fedora 29, RHEL 8, RHEL 7, and RHEL 6* and Debian 9/stretch > and Debian 10/buster are at > https://download.gluster.org/pub/gluster/glusterfs/qa-releases/6.0rc0/ > > Packages are signed. The public key is at > https://download.gluster.org/pub/gluster/glusterfs/6/rsa.pub > > * RHEL 6 is client-side only. Fedora 29, RHEL 7, and RHEL 6 RPMs are > Fedora Koji scratch builds. RHEL 7 and RHEL 6 RPMs are provided here for > convenience only, and are independent of the RPMs in the CentOS Storage SIG. From atumball at redhat.com Mon Feb 25 18:11:34 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Mon, 25 Feb 2019 23:41:34 +0530 Subject: [Gluster-devel] GlusterFS - 6.0RC - Test days (27th, 28th Feb) Message-ID: Hi all, We are calling out our users, and developers to contribute in validating ?glusterfs-6.0rc? build in their usecase. Specially for the cases of upgrade, stability, and performance. Some of the key highlights of the release are listed in release-notes draft . Please note that there are some of the features which are being dropped out of this release, and hence making sure your setup is not going to have an issue is critical. Also the default lru-limit option in fuse mount for Inodes should help to control the memory usage of client processes. All the good reason to give it a shot in your test setup. If you are developer using gfapi interface to integrate with other projects, you also have some signature changes, so please make sure your project would work with latest release. Or even if you are using a project which depends on gfapi, report the error with new RPMs (if any). We will help fix it. As part of test days, we want to focus on testing the latest upcoming release i.e. GlusterFS-6, and one or the other gluster volunteers would be there on #gluster channel on freenode to assist the people. Some of the key things we are looking as bug reports are: - See if upgrade from your current version to 6.0rc is smooth, and works as documented. - Report bugs in process, or in documentation if you find mismatch. - Functionality is all as expected for your usecase. - No issues with actual application you would run on production etc. - Performance has not degraded in your usecase. 
- While we have added some performance options to the code, not all of them are turned on, as they have to be done based on usecases. - Make sure the default setup is at least same as your current version - Try out few options mentioned in release notes (especially, --auto-invalidation=no) and see if it helps performance. - While doing all the above, check below: - see if the log files are making sense, and not flooding with some ?for developer only? type of messages. - get ?profile info? output from old and now, and see if there is anything which is out of normal expectation. Check with us on the numbers. - get a ?statedump? when there are some issues. Try to make sense of it, and raise a bug if you don?t understand it completely. Process expected on test days. - We have a tracker bug [0] - We will attach all the ?blocker? bugs to this bug. - Use this link to report bugs, so that we have more metadata around given bugzilla. - Click Here [1] - The test cases which are to be tested are listed here in this sheet [2], please add, update, and keep it up-to-date to reduce duplicate efforts. Lets together make this release a success. Also check if we covered some of the open issues from Weekly untriaged bugs [3] For details on build and RPMs check this email [4] Finally, the dates :-) - Wednesday - Feb 27th, and - Thursday - Feb 28th Note that our goal is to identify as many issues as possible in upgrade and stability scenarios, and if any blockers are found, want to make sure we release with the fix for same. So each of you, Gluster users, feel comfortable to upgrade to 6.0 version. Regards, Gluster Ants. -- Amar Tumballi (amarts) -------------- next part -------------- An HTML attachment was scrubbed... URL: From amukherj at redhat.com Tue Feb 26 11:57:30 2019 From: amukherj at redhat.com (Atin Mukherjee) Date: Tue, 26 Feb 2019 17:27:30 +0530 Subject: [Gluster-devel] test failure reports for last 30 days Message-ID: [1] captures the test failures report since last 30 days and we'd need volunteers/component owners to see why the number of failures are so high against few tests. [1] https://fstat.gluster.org/summary?start_date=2019-01-26&end_date=2019-02-25&job=all -------------- next part -------------- An HTML attachment was scrubbed... URL: From abhishpaliwal at gmail.com Tue Feb 26 12:17:05 2019 From: abhishpaliwal at gmail.com (ABHISHEK PALIWAL) Date: Tue, 26 Feb 2019 17:47:05 +0530 Subject: [Gluster-devel] Version uplift query Message-ID: Hi, Currently we are using Glusterfs 3.7.6 and thinking to switch on Glusterfs 4.1 or 5.0, when I see there are too much code changes between these version, could you please let us know, is there any compatibility issue when we uplift any of the new mentioned version? Regards Abhishek -------------- next part -------------- An HTML attachment was scrubbed... URL: From abhishpaliwal at gmail.com Wed Feb 27 11:15:11 2019 From: abhishpaliwal at gmail.com (ABHISHEK PALIWAL) Date: Wed, 27 Feb 2019 16:45:11 +0530 Subject: [Gluster-devel] Version uplift query In-Reply-To: References: Message-ID: Hi, Could you please update on this and also let us know what is GlusterD2 (as it is under development in 5.0 release), so it is ok to uplift to 5.0? 
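(For context, one concrete compatibility check after such an uplift is the cluster op-version. A sketch only, with example values; the max-op-version query exists in the newer releases, not in 3.7.x:

    # on any node, after all nodes are upgraded and glusterd restarted
    gluster volume get all cluster.op-version        # what the cluster runs at now
    gluster volume get all cluster.max-op-version    # highest supported by the new bits

    # raise it only once every node runs the new version, e.g. for a 5.4 cluster:
    gluster volume set all cluster.op-version 50400
)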
Regards, Abhishek On Tue, Feb 26, 2019 at 5:47 PM ABHISHEK PALIWAL wrote: > Hi, > > Currently we are using Glusterfs 3.7.6 and thinking to switch on Glusterfs > 4.1 or 5.0, when I see there are too much code changes between these > version, could you please let us know, is there any compatibility issue > when we uplift any of the new mentioned version? > > Regards > Abhishek > -- Regards Abhishek Paliwal -------------- next part -------------- An HTML attachment was scrubbed... URL: From atumball at redhat.com Wed Feb 27 15:11:25 2019 From: atumball at redhat.com (Amar Tumballi Suryanarayan) Date: Wed, 27 Feb 2019 20:41:25 +0530 Subject: [Gluster-devel] [Gluster-users] Version uplift query In-Reply-To: References: Message-ID: GlusterD2 is not yet called out for standalone deployments. You can happily update to glusterfs-5.x (recommend you to wait for glusterfs-5.4 which is already tagged, and waiting for packages to be built). Regards, Amar On Wed, Feb 27, 2019 at 4:46 PM ABHISHEK PALIWAL wrote: > Hi, > > Could you please update on this and also let us know what is GlusterD2 > (as it is under development in 5.0 release), so it is ok to uplift to 5.0? > > Regards, > Abhishek > > On Tue, Feb 26, 2019 at 5:47 PM ABHISHEK PALIWAL > wrote: > >> Hi, >> >> Currently we are using Glusterfs 3.7.6 and thinking to switch on >> Glusterfs 4.1 or 5.0, when I see there are too much code changes between >> these version, could you please let us know, is there any compatibility >> issue when we uplift any of the new mentioned version? >> >> Regards >> Abhishek >> > > > -- > > > > > Regards > Abhishek Paliwal > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -- Amar Tumballi (amarts) -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chandranana.Naik at ibm.com Wed Feb 27 06:11:36 2019 From: Chandranana.Naik at ibm.com (Chandranana Naik) Date: Wed, 27 Feb 2019 11:41:36 +0530 Subject: [Gluster-devel] GlusterFs v4.1.5: Need help on bitrot detection In-Reply-To: References: Message-ID: Thanks Nithya & Amar for the response. Please let us know if you need more info from our side. We are looking into the source code of Glusterfs. Regards, Chandranana From: Nithya Balachandran To: Amar Tumballi Suryanarayan Cc: Chandranana Naik , Abhay Singh , Gluster Devel Date: 02/21/2019 08:46 AM Subject: Re: [Gluster-devel] GlusterFs v4.1.5: Need help on bitrot detection On Wed, 20 Feb 2019 at 21:03, Amar Tumballi Suryanarayan < atumball at redhat.com> wrote: Hi Chandranana, We are trying to find a BigEndian platform to test this out at the moment, will get back to you on this. Meantime, did you run the entire regression suit? Is it the only test failing? To run the entire regression suite, please run `run-tests.sh -c` from glusterfs source repo. They are seeing other issues as well [1] , mostly related to the hashed values in Big endian systems and the hardcoded names and paths in the .t files.? I have fixed 2 .t files and asked them to debug the remaining tests and provide patches as it was taking a long time to go back and forth with various suggested changes. There are debug logs attached for all the failing tests, including one for the failing bitrot case which indicates a very large value being returned in fgetxattr (probably also related to endianess). 
[2019-02-14 09:12:05.140750] D [MSGID: 0] [io-threads.c:372:iot_schedule] 0-patchy-io-threads: FGETXATTR scheduled as least priority fop [2019-02-14 09:12:05.140828] A [MSGID: 0] [mem-pool.c:118:__gf_calloc] : no memory available for size (176093659239) [call stack follows] /usr/local/lib/libglusterfs.so.0(+0x28eaa)[0x3ffb2da8eaa] /usr/local/lib/libglusterfs.so.0(_gf_msg_nomem+0x31c)[0x3ffb2da93c4] /usr/local/lib/libglusterfs.so.0(__gf_calloc+0x13c)[0x3ffb2dd595c] /usr/local/lib/glusterfs/4.1.5/xlator/features/bitrot-stub.so (+0xe3c4)[0x3ffae28e3c4] /usr/local/lib/glusterfs/4.1.5/xlator/storage/posix.so (+0x32154)[0x3ffae7b2154] Regards, Nithya [1]?https://bugzilla.redhat.com/show_bug.cgi?id=1672480 -Amar On Tue, Feb 19, 2019 at 1:31 AM Chandranana Naik < Chandranana.Naik at ibm.com> wrote: Hi Team, We are working with Glusterfs v4.1.5 on big endian platform(Ubuntu 16.04) and encountered that the subtest 20 of test ./tests/bitrot/bug-1207627-bitrot-scrub-status.t is failing. Subtest 20 is failing as below: trusted.bit-rot.bad-file check_for_xattr trusted.bit-rot.bad-file //d/backends/patchy1/FILE1 not ok 20 Got "" instead of "trusted.bit-rot.bad-file", LINENUM:50 FAILED COMMAND: trusted.bit-rot.bad-file check_for_xattr trusted.bit-rot.bad-file //d/backends/patchy1/FILE1 The test is failing with error "remote operation failed [Cannot allocate memory]" logged in /var/log/glusterfs/scrub.log. Could you please let us know if anything is missing in making this test pass, PFA the logs for the test case (See attached file: bug-1207627-bitrot-scrub-status.7z) Note: Enough memory is available on the system. Regards, Chandranana Naik _______________________________________________ Gluster-devel mailing list Gluster-devel at gluster.org https://lists.gluster.org/mailman/listinfo/gluster-devel -- Amar Tumballi (amarts) _______________________________________________ Gluster-devel mailing list Gluster-devel at gluster.org https://lists.gluster.org/mailman/listinfo/gluster-devel -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From pgurusid at redhat.com Thu Feb 28 02:46:15 2019 From: pgurusid at redhat.com (Poornima Gurusiddaiah) Date: Thu, 28 Feb 2019 08:16:15 +0530 Subject: [Gluster-devel] [Gluster-users] Version uplift query In-Reply-To: References: Message-ID: On Wed, Feb 27, 2019, 11:52 PM Ingo Fischer wrote: > Hi Amar, > > sorry to jump into this thread with an connected question. > > When installing via "apt-get" and so using debian packages and also > systemd to start/stop glusterd is the online upgrade process from > 3.x/4.x to 5.x still needed as described at > https://docs.gluster.org/en/latest/Upgrade-Guide/upgrade_to_4.1/ ? > > Especially because there is manual killall and such for processes > handled by systemd in my case. Or is there an other upgrade guide or > recommendations for use on ubuntu? > > Would systemctl stop glusterd, then using apt-get update with changes > sources and a reboot be enough? > I think you would still need to kill the process manually, AFAIK systemd only stops glusterd not the other Gluster processes like glusterfsd(bricks), heal process etc. Reboot of system is not required, if that's what you meant by reboot. Also you need follow all the other steps mentioned, for the cluster to work smoothly after upgrade. Especially the steps to perform heal are important. 
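Roughly, the per-node sequence on a Debian/Ubuntu install would look like the sketch below; the package name, repository setup and the heal check are assumptions about your environment, and the upgrade guide linked above stays the authoritative list of steps:

    systemctl stop glusterd
    killall glusterfs glusterfsd      # bricks, self-heal daemon etc. are not systemd units
    apt-get update && apt-get install glusterfs-server
    systemctl start glusterd

    # wait for self-heal to finish on this node before upgrading the next one
    gluster volume heal <volname> info
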
Regards, Poornima > Ingo > > Am 27.02.19 um 16:11 schrieb Amar Tumballi Suryanarayan: > > GlusterD2 is not yet called out for standalone deployments. > > > > You can happily update to glusterfs-5.x (recommend you to wait for > > glusterfs-5.4 which is already tagged, and waiting for packages to be > > built). > > > > Regards, > > Amar > > > > On Wed, Feb 27, 2019 at 4:46 PM ABHISHEK PALIWAL > > > wrote: > > > > Hi, > > > > Could you please update on this and also let us know what is > > GlusterD2 (as it is under development in 5.0 release), so it is ok > > to uplift to 5.0? > > > > Regards, > > Abhishek > > > > On Tue, Feb 26, 2019 at 5:47 PM ABHISHEK PALIWAL > > > wrote: > > > > Hi, > > > > Currently we are using Glusterfs 3.7.6 and thinking to switch on > > Glusterfs 4.1 or 5.0, when I see there are too much code changes > > between these version, could you please let us know, is there > > any compatibility issue when we uplift any of the new mentioned > > version? > > > > Regards > > Abhishek > > > > > > > > -- > > > > > > > > > > Regards > > Abhishek Paliwal > > _______________________________________________ > > Gluster-users mailing list > > Gluster-users at gluster.org > > https://lists.gluster.org/mailman/listinfo/gluster-users > > > > > > > > -- > > Amar Tumballi (amarts) > > > > _______________________________________________ > > Gluster-users mailing list > > Gluster-users at gluster.org > > https://lists.gluster.org/mailman/listinfo/gluster-users > > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: From abhishpaliwal at gmail.com Thu Feb 28 07:10:51 2019 From: abhishpaliwal at gmail.com (ABHISHEK PALIWAL) Date: Thu, 28 Feb 2019 12:40:51 +0530 Subject: [Gluster-devel] [Gluster-users] Version uplift query In-Reply-To: References: Message-ID: I am trying to build Gluster5.4 but getting below error at the time of configure conftest.c:11:28: fatal error: ac_nonexistent.h: No such file or directory Could you please help me what is the reason of the above error. Regards, Abhishek On Wed, Feb 27, 2019 at 8:42 PM Amar Tumballi Suryanarayan < atumball at redhat.com> wrote: > GlusterD2 is not yet called out for standalone deployments. > > You can happily update to glusterfs-5.x (recommend you to wait for > glusterfs-5.4 which is already tagged, and waiting for packages to be > built). > > Regards, > Amar > > On Wed, Feb 27, 2019 at 4:46 PM ABHISHEK PALIWAL > wrote: > >> Hi, >> >> Could you please update on this and also let us know what is GlusterD2 >> (as it is under development in 5.0 release), so it is ok to uplift to 5.0? >> >> Regards, >> Abhishek >> >> On Tue, Feb 26, 2019 at 5:47 PM ABHISHEK PALIWAL >> wrote: >> >>> Hi, >>> >>> Currently we are using Glusterfs 3.7.6 and thinking to switch on >>> Glusterfs 4.1 or 5.0, when I see there are too much code changes between >>> these version, could you please let us know, is there any compatibility >>> issue when we uplift any of the new mentioned version? 
>>> >>> Regards >>> Abhishek >>> >> >> >> -- >> >> >> >> >> Regards >> Abhishek Paliwal >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users > > > > -- > Amar Tumballi (amarts) > -- Regards Abhishek Paliwal -------------- next part -------------- An HTML attachment was scrubbed... URL: From mchangir at redhat.com Thu Feb 28 07:31:24 2019 From: mchangir at redhat.com (Milind Changire) Date: Thu, 28 Feb 2019 13:01:24 +0530 Subject: [Gluster-devel] [Gluster-users] Version uplift query In-Reply-To: References: Message-ID: you might want to check what build.log says ... especially at the very bottom Here's a hint from StackExhange . On Thu, Feb 28, 2019 at 12:42 PM ABHISHEK PALIWAL wrote: > I am trying to build Gluster5.4 but getting below error at the time of > configure > > conftest.c:11:28: fatal error: ac_nonexistent.h: No such file or directory > > Could you please help me what is the reason of the above error. > > Regards, > Abhishek > > On Wed, Feb 27, 2019 at 8:42 PM Amar Tumballi Suryanarayan < > atumball at redhat.com> wrote: > >> GlusterD2 is not yet called out for standalone deployments. >> >> You can happily update to glusterfs-5.x (recommend you to wait for >> glusterfs-5.4 which is already tagged, and waiting for packages to be >> built). >> >> Regards, >> Amar >> >> On Wed, Feb 27, 2019 at 4:46 PM ABHISHEK PALIWAL >> wrote: >> >>> Hi, >>> >>> Could you please update on this and also let us know what is GlusterD2 >>> (as it is under development in 5.0 release), so it is ok to uplift to 5.0? >>> >>> Regards, >>> Abhishek >>> >>> On Tue, Feb 26, 2019 at 5:47 PM ABHISHEK PALIWAL < >>> abhishpaliwal at gmail.com> wrote: >>> >>>> Hi, >>>> >>>> Currently we are using Glusterfs 3.7.6 and thinking to switch on >>>> Glusterfs 4.1 or 5.0, when I see there are too much code changes between >>>> these version, could you please let us know, is there any compatibility >>>> issue when we uplift any of the new mentioned version? >>>> >>>> Regards >>>> Abhishek >>>> >>> >>> >>> -- >>> >>> >>> >>> >>> Regards >>> Abhishek Paliwal >>> _______________________________________________ >>> Gluster-users mailing list >>> Gluster-users at gluster.org >>> https://lists.gluster.org/mailman/listinfo/gluster-users >> >> >> >> -- >> Amar Tumballi (amarts) >> > > > -- > > > > > Regards > Abhishek Paliwal > _______________________________________________ > Gluster-devel mailing list > Gluster-devel at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-devel -- Milind -------------- next part -------------- An HTML attachment was scrubbed... URL: From 994506334 at qq.com Thu Feb 28 03:03:50 2019 From: 994506334 at qq.com (=?gb18030?B?v+zA1g==?=) Date: Thu, 28 Feb 2019 03:03:50 -0000 Subject: [Gluster-devel] Different glusterfs clients's data not consistent. Message-ID: Three node? node1? node2, node3 Steps: 1. gluster volume create volume_test node1:/brick1 2. gluster volume set volume_test cluster.server-quorum-ratio 51 3. gluster volume set volume_test cluster.server-quorum-type server 4. On node1, mount -t glusterfs node1:/volume_test /mnt. 5. On node2, mount -t glusterfs node2:/volume_test /mnt. 6. On node1, killall glusterd 7. On node2, gluster volume add-brick volume_test node2:/brick2 8. On node2. mkdir /mnt/test 8. touch /mnt/test/file1 on two nodes. On node1, found /brick1/file1. But on node2, also found /brick2/file1. 
I don't want to set cluster.server-quorum-ratio to 100. Cound you help me to solve this porblem? -------------- next part -------------- An HTML attachment was scrubbed... URL: From amudhan83 at gmail.com Thu Feb 28 06:09:20 2019 From: amudhan83 at gmail.com (Amudhan P) Date: Thu, 28 Feb 2019 06:09:20 -0000 Subject: [Gluster-devel] [Gluster-users] Version uplift query In-Reply-To: References: Message-ID: Hi Poornima, Instead of killing process stopping volume followed by stopping service in nodes and update glusterfs. can't we follow the above step? regards Amudhan On Thu, Feb 28, 2019 at 8:16 AM Poornima Gurusiddaiah wrote: > > > On Wed, Feb 27, 2019, 11:52 PM Ingo Fischer wrote: > >> Hi Amar, >> >> sorry to jump into this thread with an connected question. >> >> When installing via "apt-get" and so using debian packages and also >> systemd to start/stop glusterd is the online upgrade process from >> 3.x/4.x to 5.x still needed as described at >> https://docs.gluster.org/en/latest/Upgrade-Guide/upgrade_to_4.1/ ? >> >> Especially because there is manual killall and such for processes >> handled by systemd in my case. Or is there an other upgrade guide or >> recommendations for use on ubuntu? >> >> Would systemctl stop glusterd, then using apt-get update with changes >> sources and a reboot be enough? >> > > I think you would still need to kill the process manually, AFAIK systemd > only stops glusterd not the other Gluster processes like > glusterfsd(bricks), heal process etc. Reboot of system is not required, if > that's what you meant by reboot. Also you need follow all the other steps > mentioned, for the cluster to work smoothly after upgrade. Especially the > steps to perform heal are important. > > Regards, > Poornima > > >> Ingo >> >> Am 27.02.19 um 16:11 schrieb Amar Tumballi Suryanarayan: >> > GlusterD2 is not yet called out for standalone deployments. >> > >> > You can happily update to glusterfs-5.x (recommend you to wait for >> > glusterfs-5.4 which is already tagged, and waiting for packages to be >> > built). >> > >> > Regards, >> > Amar >> > >> > On Wed, Feb 27, 2019 at 4:46 PM ABHISHEK PALIWAL >> > > wrote: >> > >> > Hi, >> > >> > Could you please update on this and also let us know what is >> > GlusterD2 (as it is under development in 5.0 release), so it is ok >> > to uplift to 5.0? >> > >> > Regards, >> > Abhishek >> > >> > On Tue, Feb 26, 2019 at 5:47 PM ABHISHEK PALIWAL >> > > wrote: >> > >> > Hi, >> > >> > Currently we are using Glusterfs 3.7.6 and thinking to switch on >> > Glusterfs 4.1 or 5.0, when I see there are too much code changes >> > between these version, could you please let us know, is there >> > any compatibility issue when we uplift any of the new mentioned >> > version? 
>> > >> > Regards >> > Abhishek >> > >> > >> > >> > -- >> > >> > >> > >> > >> > Regards >> > Abhishek Paliwal >> > _______________________________________________ >> > Gluster-users mailing list >> > Gluster-users at gluster.org >> > https://lists.gluster.org/mailman/listinfo/gluster-users >> > >> > >> > >> > -- >> > Amar Tumballi (amarts) >> > >> > _______________________________________________ >> > Gluster-users mailing list >> > Gluster-users at gluster.org >> > https://lists.gluster.org/mailman/listinfo/gluster-users >> > >> _______________________________________________ >> Gluster-users mailing list >> Gluster-users at gluster.org >> https://lists.gluster.org/mailman/listinfo/gluster-users > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org > https://lists.gluster.org/mailman/listinfo/gluster-users -------------- next part -------------- An HTML attachment was scrubbed... URL: