From mscherer at redhat.com  Tue May  7 16:11:29 2019
From: mscherer at redhat.com (Michael Scherer)
Date: Tue, 07 May 2019 18:11:29 +0200
Subject: [Gluster-infra] [Gluster-devel] is_nfs_export_available from
	nfs.rc failing too often?
In-Reply-To:
References: <2056284426.17636953.1554272780313.JavaMail.zimbra@redhat.com>
	<797512f6ff7f1b9fedbf8b7968dd86a6968d9105.camel@redhat.com>
Message-ID:

On Tue, 07 May 2019 at 20:04 +0530, Sanju Rakonde wrote:
> Looks like is_nfs_export_available started failing again in recent
> centos-regressions.
>
> Michael, can you please check?

I will try, but I am leaving for vacation tonight, so if I find
nothing before I leave, I guess Deepshika will have to look.

> On Wed, Apr 24, 2019 at 5:30 PM Yaniv Kaul wrote:
> > On Tue, Apr 23, 2019 at 5:15 PM Michael Scherer
> > <mscherer at redhat.com> wrote:
> > > On Mon, 22 Apr 2019 at 22:57 +0530, Atin Mukherjee wrote:
> > > > Is this back again? The recent patches are failing regression
> > > > :-\ .
> > >
> > > So, on builder206, it took me a while to find that the issue is
> > > that nfs (the service) was running.
> > >
> > > ./tests/basic/afr/tarissue.t failed, because the nfs
> > > initialisation failed with a rather cryptic message:
> > >
> > > [2019-04-23 13:17:05.371733] I [socket.c:991:__socket_server_bind] 0-
> > > socket.nfs-server: process started listening on port (38465)
> > > [2019-04-23 13:17:05.385819] E [socket.c:972:__socket_server_bind] 0-
> > > socket.nfs-server: binding to failed: Address already in use
> > > [2019-04-23 13:17:05.385843] E [socket.c:974:__socket_server_bind] 0-
> > > socket.nfs-server: Port is already in use
> > > [2019-04-23 13:17:05.385852] E [socket.c:3788:socket_listen] 0-
> > > socket.nfs-server: __socket_server_bind failed;closing socket 14
> > >
> > > I found where this came from, but a few things surprised me:
> > >
> > > - the order of the printed messages differs from the order in
> > > the code
> >
> > Indeed strange...
> >
> > > - the "started listening" message does not take into account the
> > > fact that the bind failed; see:
> >
> > Shouldn't it bail out if it failed to bind?
> > Some missing 'goto out' around line 975/976?
> > Y.
> >
> > > https://github.com/gluster/glusterfs/blob/master/rpc/rpc-transport/socket/src/socket.c#L967
> > >
> > > The message about port 38465 also threw me off the track. The
> > > real issue is that the nfs service was already running, and I
> > > couldn't find anything listening on port 38465.
> > >
> > > Once I ran "service nfs stop", it no longer failed.
> > >
> > > So far, I do not know why nfs.service was activated.
> > >
> > > But at least 206 should be fixed, and we know a bit more about
> > > what could be causing some of the failures.
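When a bind fails with "Address already in use" but nothing appears to
be listening, a few standard commands usually identify the owner of the
port. The sketch below is generic rather than part of the test suite; it
assumes ss, rpcinfo and systemctl are available on the builder, and the
port list (2049 plus the 38465-38467 range used by Gluster NFS) is an
assumption worth verifying. Note that sockets held by the in-kernel NFS
server show an empty process column in ss output, which can make a taken
port look unclaimed.

    # Generic port triage on a builder (sketch; adjust the ports as needed)
    ss -tlnp | grep -E ':(2049|3846[5-7]) '   # kernel-held sockets show no PID
    rpcinfo -p localhost                      # RPC programs registered with rpcbind
    systemctl status nfs rpcbind rpc-statd    # is the distro NFS stack active?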
> > > > On Wed, 3 Apr 2019 at 19:26, Michael Scherer
> > > > <mscherer at redhat.com> wrote:
> > > > > On Wed, 03 Apr 2019 at 16:30 +0530, Atin Mukherjee wrote:
> > > > > > On Wed, Apr 3, 2019 at 11:56 AM Jiffin Thottan
> > > > > > <jthottan at redhat.com> wrote:
> > > > > > > Hi,
> > > > > > >
> > > > > > > is_nfs_export_available is just a wrapper around the
> > > > > > > "showmount" command AFAIR.
> > > > > > > I saw the following messages in the console output:
> > > > > > > mount.nfs: rpc.statd is not running but is required for
> > > > > > > remote locking.
> > > > > > > 05:06:55 mount.nfs: Either use '-o nolock' to keep locks
> > > > > > > local, or start statd.
> > > > > > > 05:06:55 mount.nfs: an incorrect mount option was
> > > > > > > specified
> > > > > > >
> > > > > > > To me it looks like rpcbind may not be running on the
> > > > > > > machine.
> > > > > > > Usually rpcbind starts automatically on machines; I don't
> > > > > > > know whether this can happen or not.
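For readers unfamiliar with the helper, a check of this shape could look
like the sketch below. It is guessed from Jiffin's description above (a
wrapper around showmount), not copied from the actual nfs.rc; the volume
and host arguments are illustrative.

    # Hedged sketch of an is_nfs_export_available-style helper
    # (not the real nfs.rc code).
    function is_nfs_export_available () {
        local vol=${1:-patchy}
        local host=${2:-127.0.0.1}
        # showmount queries mountd via rpcbind; if rpcbind (or, for the
        # later mount, rpc.statd) is down, this fails.
        showmount -e "$host" 2>/dev/null | grep -qw "$vol"
    }

    # Typical usage: poll until the export shows up.
    for i in $(seq 1 20); do
        is_nfs_export_available patchy && break
        sleep 1
    done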
> > > > > > That's precisely what the question is: why are we suddenly
> > > > > > seeing this happen so frequently? Today I saw at least 4 to
> > > > > > 5 such failures already.
> > > > > >
> > > > > > Deepshika - Can you please help in inspecting this?
> > > > >
> > > > > So we think (we are not sure) that the issue is a bit complex.
> > > > >
> > > > > What we were investigating was the nightly run failing on AWS.
> > > > > When the build crashes, the builder is restarted, since that's
> > > > > the easiest way to clean everything (even with a perfect test
> > > > > suite that cleaned up after itself, we could always end up with
> > > > > the system in a corrupt state with respect to mounts,
> > > > > filesystems, etc.).
> > > > >
> > > > > In turn, this seems to cause trouble on AWS, since cloud-init
> > > > > or something renames the eth0 interface to ens5 without
> > > > > cleaning up the network configuration.
> > > > >
> > > > > So the network init script fails (because the image says
> > > > > "start eth0" and that interface is not present), but it fails
> > > > > in a weird way. The network is initialised and working (we can
> > > > > connect), but the dhclient process is not in the right cgroup,
> > > > > and network.service is in a failed state. Restarting the
> > > > > network didn't work. In turn, this means that rpc-statd
> > > > > refuses to start (due to systemd dependencies), which seems to
> > > > > impact various NFS tests.
> > > > >
> > > > > We have also seen that on some builders rpcbind picks up some
> > > > > IPv6 autoconfiguration, but we can't reproduce that, and there
> > > > > is no IPv6 set up anywhere. I suspect the network.service
> > > > > failure is somehow involved, but I fail to see how. In turn,
> > > > > rpcbind.socket not starting could cause NFS test troubles.
> > > > >
> > > > > Our current stopgap fix was to fix all the builders one by
> > > > > one: remove the config, kill the rogue dhclient, restart the
> > > > > network service.
> > > > >
> > > > > However, we can't be sure this is going to fix the problem
> > > > > long term, since it only manifests after a crash of the test
> > > > > suite, and that doesn't happen so often. (Plus, it was working
> > > > > until some day in the past when something made it fail, and I
> > > > > do not know whether that was a system upgrade, a test change,
> > > > > or both.)
> > > > >
> > > > > So we are still looking at it to get a complete understanding
> > > > > of the issue, but so far we have hacked our way to make it
> > > > > work (or so I think).
> > > > >
> > > > > Deepshika is working to fix it long term, by fixing the issue
> > > > > regarding eth0/ens5 with a new base image.
> > > > > --
> > > > > Michael Scherer
> > > > > Sysadmin, Community Infrastructure and Platform, OSAS
> > > >
> > > > --
> > > > - Atin (atinm)
> > >
> > > --
> > > Michael Scherer
> > > Sysadmin, Community Infrastructure
> > >
> > > _______________________________________________
> > > Gluster-devel mailing list
> > > Gluster-devel at gluster.org
> > > https://lists.gluster.org/mailman/listinfo/gluster-devel
> >
> > _______________________________________________
> > Gluster-devel mailing list
> > Gluster-devel at gluster.org
> > https://lists.gluster.org/mailman/listinfo/gluster-devel

-- 
Michael Scherer
Sysadmin, Community Infrastructure
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: This is a digitally signed message part
URL: 
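The stopgap described above reduces to a small checklist. The sketch
below assumes a CentOS 7 builder using the legacy network-scripts; the
ifcfg path is a guess at where the stale eth0 configuration would live,
not a detail confirmed in the thread.

    # Confirm the failure chain: network.service failed, so rpc-statd
    # (and possibly rpcbind.socket) refuse to start.
    systemctl status network rpc-statd rpcbind rpcbind.socket

    # Find the rogue dhclient, then apply the stopgap.
    pgrep -a dhclient
    rm -f /etc/sysconfig/network-scripts/ifcfg-eth0   # assumed stale config path
    pkill dhclient
    systemctl restart network
    systemctl start rpc-statd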
From dkhandel at redhat.com  Tue May  7 18:53:05 2019
From: dkhandel at redhat.com (Deepshikha Khandelwal)
Date: Wed, 8 May 2019 00:23:05 +0530
Subject: [Gluster-infra] [Gluster-devel] is_nfs_export_available from
	nfs.rc failing too often?
In-Reply-To:
References: <2056284426.17636953.1554272780313.JavaMail.zimbra@redhat.com>
	<797512f6ff7f1b9fedbf8b7968dd86a6968d9105.camel@redhat.com>
Message-ID:

Sanju, can you please give us more info about the failures?

I see the failures occurring on just one of the builders (builder206).
I'm taking it back offline for now.

On Tue, May 7, 2019 at 9:42 PM Michael Scherer <mscherer at redhat.com>
wrote:
> [...]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From bugzilla at redhat.com  Wed May  8 03:49:36 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Wed, 08 May 2019 03:49:36 +0000
Subject: [Gluster-infra] [Bug 1707671] New: Cronjob of feeding gluster
	blogs from different account into planet gluster isn't working
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1707671

            Bug ID: 1707671
           Summary: Cronjob of feeding gluster blogs from different
                    account into planet gluster isn't working
           Product: GlusterFS
           Version: mainline
            Status: NEW
         Component: project-infrastructure
          Assignee: bugs at gluster.org
          Reporter: amukherj at redhat.com
                CC: bugs at gluster.org, gluster-infra at gluster.org
  Target Milestone: ---
    Classification: Community

Description of problem:
As mentioned in the title. For example,
https://github.com/gluster/planet-gluster/blob/master/data/feeds.yml
has feed: https://atinmu.wordpress.com/feed/ configured; however, I
don't see my latest blog
https://atinmu.wordpress.com/2019/04/03/glusterd-volume-scalability-improvements-in-glusterfs-7/
in planet gluster.
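Since the cron side is not visible from the bug, the feed side can at
least be ruled out by hand. Nothing below is specific to the
planet-gluster tooling; it only confirms that the configured feed URL
resolves and contains entries.

    # Does the configured feed respond, and does it contain items?
    curl -fsSL https://atinmu.wordpress.com/feed/ | head -n 5        # expect RSS XML
    curl -fsSL https://atinmu.wordpress.com/feed/ | grep -c '<item>' # entry count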
-- 
You are receiving this mail because:
You are on the CC list for the bug.

From amukherj at redhat.com  Wed May  8 04:23:04 2019
From: amukherj at redhat.com (Atin Mukherjee)
Date: Wed, 8 May 2019 09:53:04 +0530
Subject: [Gluster-infra] [Gluster-devel] is_nfs_export_available from
	nfs.rc failing too often?
In-Reply-To:
References: <2056284426.17636953.1554272780313.JavaMail.zimbra@redhat.com>
	<797512f6ff7f1b9fedbf8b7968dd86a6968d9105.camel@redhat.com>
Message-ID:

On Wed, May 8, 2019 at 7:16 AM Sanju Rakonde wrote:
> Deepshikha,
>
> I see the failure here[1] which ran on builder206. So, we are good.

Not really:
https://build.gluster.org/job/centos7-regression/5909/consoleFull
failed on builder204 for similar reasons, I believe.

I am a bit more worried about this issue resurfacing more often these
days. What can we do to fix this permanently?

> [1] https://build.gluster.org/job/centos7-regression/5901/consoleFull
>
> On Wed, May 8, 2019 at 12:23 AM Deepshikha Khandelwal
> <dkhandel at redhat.com> wrote:
> [...]

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
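Checking whether failures track a particular builder can be scripted
against the Jenkins JSON API. A sketch follows; builtOn is a standard
field in Jenkins build JSON, and the build numbers are taken from the
links above.

    # Which machine ran a given regression build?
    for build in 5901 5909; do
        curl -fsSL "https://build.gluster.org/job/centos7-regression/$build/api/json" \
            | grep -o '"builtOn":"[^"]*"'
    done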
From amukherj at redhat.com  Wed May  8 14:08:15 2019
From: amukherj at redhat.com (Atin Mukherjee)
Date: Wed, 8 May 2019 19:38:15 +0530
Subject: [Gluster-infra] [Gluster-devel] is_nfs_export_available from
	nfs.rc failing too often?
In-Reply-To:
References: <2056284426.17636953.1554272780313.JavaMail.zimbra@redhat.com>
	<797512f6ff7f1b9fedbf8b7968dd86a6968d9105.camel@redhat.com>
Message-ID:

builder204 needs to be fixed; too many failures, and mostly none of
the patches are passing regression.

On Wed, May 8, 2019 at 9:53 AM Atin Mukherjee wrote:
> [...]

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From amukherj at redhat.com  Thu May  9 04:31:47 2019
From: amukherj at redhat.com (Atin Mukherjee)
Date: Thu, 9 May 2019 10:01:47 +0530
Subject: [Gluster-infra] [Gluster-devel] is_nfs_export_available from
	nfs.rc failing too often?
In-Reply-To:
References: <2056284426.17636953.1554272780313.JavaMail.zimbra@redhat.com>
	<797512f6ff7f1b9fedbf8b7968dd86a6968d9105.camel@redhat.com>
Message-ID:

On Wed, May 8, 2019 at 7:38 PM Atin Mukherjee wrote:
> builder204 needs to be fixed; too many failures, and mostly none of
> the patches are passing regression.

And with that, builder201 joins the pool:
https://build.gluster.org/job/centos7-regression/5943/consoleFull

> On Wed, May 8, 2019 at 9:53 AM Atin Mukherjee wrote:
> [...]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From dkhandel at redhat.com  Thu May  9 05:56:22 2019
From: dkhandel at redhat.com (Deepshikha Khandelwal)
Date: Thu, 9 May 2019 11:26:22 +0530
Subject: [Gluster-infra] [Gluster-devel] is_nfs_export_available from
	nfs.rc failing too often?
In-Reply-To:
References: <2056284426.17636953.1554272780313.JavaMail.zimbra@redhat.com>
	<797512f6ff7f1b9fedbf8b7968dd86a6968d9105.camel@redhat.com>
Message-ID:

I took a quick look at the builders and noticed both show the same
error, 'Cannot allocate memory', which comes up every time the builder
is rebooted after a build abort. It happens in the same pattern each
time, though there is no such memory consumption on the builders. I'm
investigating more on this.
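'Cannot allocate memory' (ENOMEM) on an otherwise idle machine is worth
triaging with generic tools before suspecting the tests. A sketch, with
nothing Gluster-specific assumed:

    free -m                                          # headline memory and swap usage
    dmesg -T | grep -iE 'oom|out of memory' | tail   # OOM-killer activity since boot
    cat /proc/sys/vm/overcommit_memory               # strict overcommit (2) fails allocations early
    ps -e --no-headers | wc -l                       # ENOMEM can also mean pid exhaustion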
On Thu, May 9, 2019 at 10:02 AM Atin Mukherjee wrote:
> [...]

-------------- next part --------------
An HTML attachment was scrubbed...
From bugzilla at redhat.com  Thu May 9 13:17:53 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Thu, 09 May 2019 13:17:53 +0000
Subject: [Gluster-infra] [Bug 1708257] New: Grant additional maintainers merge rights on release branches
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1708257

            Bug ID: 1708257
           Summary: Grant additional maintainers merge rights on release
                    branches
           Product: GlusterFS
           Version: mainline
            Status: NEW
         Component: project-infrastructure
          Assignee: bugs at gluster.org
          Reporter: srangana at redhat.com
                CC: bugs at gluster.org, gluster-infra at gluster.org,
                    hgowtham at redhat.com, rkothiya at redhat.com,
                    sunkumar at redhat.com
  Target Milestone: ---
    Classification: Community

Going forward the following owners would be managing the minor release
branches and require merge rights for the same:

- Hari Gowtham
- Rinku Kothiya
- Sunny Kumar

I am marking this bug NEEDINFO from each of the above users, for them to
provide their github username and also to ensure that 2FA is setup on their
github accounts before permissions are granted to them.

Branches that they need merge rights to are:
release-4.1
release-5
release-6

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Thu May 9 13:18:57 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Thu, 09 May 2019 13:18:57 +0000
Subject: [Gluster-infra] [Bug 1708257] Grant additional maintainers merge rights on release branches
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1708257

Shyamsundar changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
              Flags|                               |needinfo?(hgowtham at redhat.com)

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Thu May 9 13:19:10 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Thu, 09 May 2019 13:19:10 +0000
Subject: [Gluster-infra] [Bug 1708257] Grant additional maintainers merge rights on release branches
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1708257

Shyamsundar changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
              Flags|                               |needinfo?(sunkumar at redhat.com)

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Thu May 9 13:19:25 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Thu, 09 May 2019 13:19:25 +0000
Subject: [Gluster-infra] [Bug 1708257] Grant additional maintainers merge rights on release branches
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1708257

Shyamsundar changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
              Flags|                               |needinfo?(rkothiya at redhat.com)

--
You are receiving this mail because:
You are on the CC list for the bug.
From bugzilla at redhat.com  Thu May 9 13:49:43 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Thu, 09 May 2019 13:49:43 +0000
Subject: [Gluster-infra] [Bug 1708257] Grant additional maintainers merge rights on release branches
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1708257

hari gowtham changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
              Flags|needinfo?(hgowtham at redhat.com) |
                   |needinfo?(sunkumar at redhat.com) |
                   |needinfo?(rkothiya at redhat.com) |

--- Comment #1 from hari gowtham ---
hgowtham's username: harigowtham and 2FA has been activated for github.

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Thu May 9 19:52:19 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Thu, 09 May 2019 19:52:19 +0000
Subject: [Gluster-infra] [Bug 1348071] Change backup process so we only backups in a specific pattern
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1348071

Amar Tumballi changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
             Status|NEW                            |CLOSED
                 CC|                               |atumball at redhat.com
         Resolution|---                            |WORKSFORME
        Last Closed|                               |2019-05-09 19:52:19

--- Comment #1 from Amar Tumballi ---
This is mostly done now with each job having a limit on how long it stores
data.

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Thu May 9 19:53:35 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Thu, 09 May 2019 19:53:35 +0000
Subject: [Gluster-infra] [Bug 1348072] Backups for Gerrit
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1348072

Amar Tumballi changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Priority|unspecified                    |high
                 CC|                               |atumball at redhat.com
           Severity|unspecified                    |high

--- Comment #4 from Amar Tumballi ---
Any idea if this is done?

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Fri May 10 05:28:15 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 10 May 2019 05:28:15 +0000
Subject: [Gluster-infra] [Bug 1708257] Grant additional maintainers merge rights on release branches
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1708257

Sunny Kumar changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
              Flags|needinfo?(sunkumar at redhat.com) |needinfo?(rkothiya at redhat.com)
                   |needinfo?(rkothiya at redhat.com) |

--- Comment #3 from Sunny Kumar ---
Hi

Sunny's Username: sunnyku and 2FA is enabled for this account.

-Sunny

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Fri May 10 12:54:11 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 10 May 2019 12:54:11 +0000
Subject: [Gluster-infra] [Bug 1428047] Require a Jenkins job to validate Change-ID on commits to branches in glusterfs repository
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1428047

Amar Tumballi changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
                 CC|                               |atumball at redhat.com

--- Comment #19 from Amar Tumballi ---
I guess for now, the work we have done through ./rfc.sh is good. Prefer to
close as WORKSFORME.

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Fri May 10 13:10:06 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 10 May 2019 13:10:06 +0000
Subject: [Gluster-infra] [Bug 1458719] download.gluster.org occasionally has the wrong permissions causing problems for users
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1458719

Amar Tumballi changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
             Status|NEW                            |CLOSED
         Resolution|---                            |WORKSFORME
        Last Closed|                               |2019-05-10 13:10:06

--- Comment #2 from Amar Tumballi ---
Not seen in a long time now.

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Fri May 10 13:16:03 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 10 May 2019 13:16:03 +0000
Subject: [Gluster-infra] [Bug 1431199] Request to automate closing github PRs
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1431199

Amar Tumballi changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
             Status|ASSIGNED                       |NEW
           Severity|unspecified                    |medium

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Fri May 10 13:16:38 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 10 May 2019 13:16:38 +0000
Subject: [Gluster-infra] [Bug 1439706] Change default name in gerrit patch
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1439706

Amar Tumballi changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Priority|unspecified                    |low
             Status|ASSIGNED                       |NEW
                 CC|                               |atumball at redhat.com
           Severity|unspecified                    |low

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Fri May 10 13:20:54 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 10 May 2019 13:20:54 +0000
Subject: [Gluster-infra] [Bug 1557127] github issue update on spec commits
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1557127

Amar Tumballi changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Priority|high                           |low
             Status|ASSIGNED                       |NEW
           Severity|high                           |low

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Fri May 10 13:20:54 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 10 May 2019 13:20:54 +0000
Subject: [Gluster-infra] [Bug 1564451] The abandon job for patches should post info in bugzilla that some patch is abandon'd.
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1564451

Amar Tumballi changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Priority|unspecified                    |low
             Status|ASSIGNED                       |NEW
           Severity|high                           |low

--
You are receiving this mail because:
You are on the CC list for the bug.
From bugzilla at redhat.com  Fri May 10 13:20:55 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 10 May 2019 13:20:55 +0000
Subject: [Gluster-infra] [Bug 1489325] Place to host gerritstats
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1489325

Amar Tumballi changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Priority|medium                         |low
             Status|ASSIGNED                       |NEW
           Severity|high                           |low

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Fri May 10 13:20:55 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 10 May 2019 13:20:55 +0000
Subject: [Gluster-infra] [Bug 1584998] Need automatic inclusion of few reviewers to a given patch
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1584998

Amar Tumballi changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Priority|unspecified                    |low
             Status|ASSIGNED                       |NEW
           Severity|high                           |low

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Fri May 10 13:20:56 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 10 May 2019 13:20:56 +0000
Subject: [Gluster-infra] [Bug 1631390] Run smoke and regression on a patch only after passing clang-format job
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1631390

Amar Tumballi changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Priority|unspecified                    |low
             Status|ASSIGNED                       |NEW
           Severity|medium                         |low

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Fri May 10 13:20:57 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 10 May 2019 13:20:57 +0000
Subject: [Gluster-infra] [Bug 1584992] Need python pep8 and other relevant tests in smoke if a patch includes any python file
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1584992

Amar Tumballi changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Priority|unspecified                    |low
             Status|ASSIGNED                       |NEW
           Severity|high                           |low

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Fri May 10 13:20:57 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 10 May 2019 13:20:57 +0000
Subject: [Gluster-infra] [Bug 1657584] Re-enable TSAN jobs
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1657584

Amar Tumballi changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Priority|unspecified                    |low
             Status|ASSIGNED                       |NEW
           Severity|unspecified                    |low

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Fri May 10 13:20:58 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 10 May 2019 13:20:58 +0000
Subject: [Gluster-infra] [Bug 1562670] Run libgfapi-python tests on Gerrit against glusterfs changes
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1562670

Amar Tumballi changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Priority|unspecified                    |low
             Status|ASSIGNED                       |NEW
           Severity|unspecified                    |low

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Fri May 10 13:20:59 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 10 May 2019 13:20:59 +0000
Subject: [Gluster-infra] [Bug 1564130] need option 'cherry-pick to release-x.y' in reviews
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1564130

Amar Tumballi changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Priority|unspecified                    |low
             Status|ASSIGNED                       |NEW
           Severity|high                           |low

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Fri May 10 13:20:59 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 10 May 2019 13:20:59 +0000
Subject: [Gluster-infra] [Bug 1620377] Coverity scan setup for gluster-block and related projects
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1620377

Amar Tumballi changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Priority|unspecified                    |low
             Status|ASSIGNED                       |NEW
           Severity|unspecified                    |low

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Fri May 10 13:20:59 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 10 May 2019 13:20:59 +0000
Subject: [Gluster-infra] [Bug 1623596] Git plugin might be suffering from memory leak
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1623596

Amar Tumballi changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Priority|unspecified                    |low
             Status|ASSIGNED                       |NEW
           Severity|unspecified                    |low

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Fri May 10 13:21:00 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 10 May 2019 13:21:00 +0000
Subject: [Gluster-infra] [Bug 1597731] need 'shellcheck' in smoke.
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1597731

Amar Tumballi changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Priority|medium                         |low
             Status|ASSIGNED                       |NEW
           Severity|high                           |low

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Fri May 10 13:21:00 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 10 May 2019 13:21:00 +0000
Subject: [Gluster-infra] [Bug 1638030] Need a regression job to test out Py3 support in Glusterfs code base
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1638030

Amar Tumballi changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Priority|unspecified                    |low
             Status|ASSIGNED                       |NEW
           Severity|unspecified                    |low

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Fri May 10 13:21:01 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 10 May 2019 13:21:01 +0000
Subject: [Gluster-infra] [Bug 1594857] Make smoke runs detect test cases added to patch
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1594857

Amar Tumballi changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Priority|high                           |low
             Status|ASSIGNED                       |NEW
           Severity|high                           |low

--
You are receiving this mail because:
You are on the CC list for the bug.
From bugzilla at redhat.com  Fri May 10 13:21:01 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 10 May 2019 13:21:01 +0000
Subject: [Gluster-infra] [Bug 1463273] infra: include bugzilla query in the weekly BZ email
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1463273

Amar Tumballi changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Priority|unspecified                    |low
             Status|ASSIGNED                       |NEW
           Severity|unspecified                    |low

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Fri May 10 13:21:02 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 10 May 2019 13:21:02 +0000
Subject: [Gluster-infra] [Bug 1598326] Setup CI for gluster-block
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1598326

Amar Tumballi changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Priority|unspecified                    |low
             Status|ASSIGNED                       |NEW
           Severity|unspecified                    |low

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Fri May 10 13:25:22 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 10 May 2019 13:25:22 +0000
Subject: [Gluster-infra] [Bug 1685051] New Project create request
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1685051

Amar Tumballi changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
             Status|ASSIGNED                       |CLOSED
         Resolution|---                            |CURRENTRELEASE
        Last Closed|2019-03-04 09:26:26            |2019-05-10 13:25:22

--- Comment #8 from Amar Tumballi ---
https://github.com/gluster/devblog && https://gluster.github.io/devblog/

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Wed May 15 06:01:17 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Wed, 15 May 2019 06:01:17 +0000
Subject: [Gluster-infra] [Bug 1707671] Cronjob of feeding gluster blogs from different account into planet gluster isn't working
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1707671

Amar Tumballi changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Priority|unspecified                    |high
                 CC|                               |atumball at redhat.com
           Severity|unspecified                    |urgent

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Wed May 15 10:57:43 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Wed, 15 May 2019 10:57:43 +0000
Subject: [Gluster-infra] [Bug 1707671] Cronjob of feeding gluster blogs from different account into planet gluster isn't working
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1707671

Deepshikha khandelwal changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
             Status|NEW                            |CLOSED
                 CC|                               |dkhandel at redhat.com
         Resolution|---                            |CURRENTRELEASE
        Last Closed|                               |2019-05-15 10:57:43

--- Comment #1 from Deepshikha khandelwal ---
It is fixed. I can see your blog there: https://planet.gluster.org/

For other blogs you need to update the feed.yml

--
You are receiving this mail because:
You are on the CC list for the bug.
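(A quick, hedged sanity check for a candidate blog feed before it goes into
feed.yml -- this is not the planet-gluster tooling itself, and the feed URL
below is a made-up placeholder. An entry with an empty title is exactly what
later breaks the planet import in bug 1713429.)

    FEED_URL="https://example.com/blog/feed.xml"   # placeholder, not a real blog
    curl -fsSL "$FEED_URL" -o /tmp/feed.xml
    # xmllint ships with libxml2; a non-zero count means entries with an
    # empty <title> (use //entry for Atom feeds instead of //item).
    xmllint --xpath 'count(//item[not(string(title))])' /tmp/feed.xml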
From bugzilla at redhat.com  Mon May 20 13:07:35 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Mon, 20 May 2019 13:07:35 +0000
Subject: [Gluster-infra] [Bug 1711945] New: create account on download.gluster.org
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1711945

            Bug ID: 1711945
           Summary: create account on download.gluster.org
           Product: GlusterFS
           Version: mainline
            Status: NEW
         Component: project-infrastructure
          Assignee: bugs at gluster.org
          Reporter: spamecha at redhat.com
                CC: bugs at gluster.org, gluster-infra at gluster.org
  Target Milestone: ---
    Classification: Community

Description - To upload the packages

public key -
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDaSvn31f+1My0S9aWAvjIWPrVOiENmWrM62CEF/wzBvAnMRuRh0qaGrcrJ1ZKz9+yjetwX7/obuynjjUTd0f/245Jc+f06E66jUJHKGjtk8bwfa0JMUzGYrFVyNFXMPqewRvcHZoFnjZF3xOIbCqTy4H9CZfJZszc83+FLoITPir3HNMJo0ATrSe9XHBRJHne6el+zxfaGQMEe4M5p76oWJORsvYkGjqAEnQSRTbdF9e51VvLz3ME3pdWiPviWF4TIkXolAjD7A2Jm9KK06t9SiOIP9AuVS9llVyf8gOZrwP+IR5gbZeiL5+9G+xWQTi7Pw5anAfJY1Mbe2l31yAen root at server1

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Mon May 20 13:10:26 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Mon, 20 May 2019 13:10:26 +0000
Subject: [Gluster-infra] [Bug 1711945] create account on download.gluster.org
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1711945

spamecha at redhat.com changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Assignee|bugs at gluster.org            |dkhandel at redhat.com

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Mon May 20 13:25:51 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Mon, 20 May 2019 13:25:51 +0000
Subject: [Gluster-infra] [Bug 1711950] New: Account in download.gluster.org to upload upload the build packages
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1711950

            Bug ID: 1711950
           Summary: Account in download.gluster.org to upload upload the
                    build packages
           Product: GlusterFS
           Version: 4.1
            Status: NEW
         Component: project-infrastructure
          Assignee: bugs at gluster.org
          Reporter: sacharya at redhat.com
                CC: bugs at gluster.org, gluster-infra at gluster.org
  Target Milestone: ---
    Classification: Community

Description of problem:
Need an account in download.gluster.org to upload the build packages.

rsa-public key:
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDXjiqCChrq8B/cJaRx9W3YQdtEo60dGwxuILdtw4xvQQz2/NwPKNAeOZ/1McfLv8zzuJa2Jm8mBzVk3Cc1NO0lRy3hUUphSHGrGe7BjL2WysXk4pYNYrNIza1X6EXjEDphvfRw7FU3DKVMIisOPnOgWW0xGT8Wb5XVfIfQzpW3ZJJX/aR2Nsjas2Dwxbf9hMfPHRNz5OQmNtpbqmkrcr/PC+9t7B5JJ+kdTe8x920/+7EaCTuAIOsin8fPxK4XoynA6BBuZu7B0rZbOm4DfL59loE2304epXbhvJkaTrNnkZOoQJRn4ruLDGq4F5jzCrOZyOH86TmExOz2rJdZC/wP root at vm1

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:

--
You are receiving this mail because:
You are on the CC list for the bug.
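(What granting such an account typically involves on the server side -- a
minimal, hedged sketch only. The actual provisioning on download.gluster.org
may well be done via configuration management, and the username and paths
here are assumptions.)

    # Create the account and install the key quoted in the bug report.
    useradd -m spamecha                                    # assumed username
    install -d -m 700 -o spamecha -g spamecha /home/spamecha/.ssh
    echo 'ssh-rsa AAAAB3NzaC1yc2E... root at server1' \
        >> /home/spamecha/.ssh/authorized_keys             # key truncated here
    chmod 600 /home/spamecha/.ssh/authorized_keys
    chown spamecha:spamecha /home/spamecha/.ssh/authorized_keys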
From bugzilla at redhat.com  Mon May 20 13:27:24 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Mon, 20 May 2019 13:27:24 +0000
Subject: [Gluster-infra] [Bug 1711950] Account in download.gluster.org to upload the build packages
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1711950

Shwetha K Acharya changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Assignee|bugs at gluster.org            |dkhandel at redhat.com
            Summary|Account in                     |Account in
                   |download.gluster.org to        |download.gluster.org to
                   |upload upload the build        |upload the build packages
                   |packages                       |

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Wed May 22 06:00:41 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Wed, 22 May 2019 06:00:41 +0000
Subject: [Gluster-infra] [Bug 1711950] Account in download.gluster.org to upload the build packages
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1711950

Deepshikha khandelwal changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
                 CC|                               |mscherer at redhat.com
              Flags|                               |needinfo?
                   |                               |needinfo?(mscherer at redhat.com)

--- Comment #1 from Deepshikha khandelwal ---
Misc, can you please take a look at it.

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Wed May 22 06:21:36 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Wed, 22 May 2019 06:21:36 +0000
Subject: [Gluster-infra] [Bug 1711945] create account on download.gluster.org
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1711945

Deepshikha khandelwal changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
                 CC|                               |mscherer at redhat.com
              Flags|                               |needinfo?(mscherer at redhat.com)

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Thu May 23 09:38:02 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Thu, 23 May 2019 09:38:02 +0000
Subject: [Gluster-infra] [Bug 1713260] New: Using abrt-action-analyze-c on core dumps on CI
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1713260

            Bug ID: 1713260
           Summary: Using abrt-action-analyze-c on core dumps on CI
           Product: GlusterFS
           Version: mainline
            Status: NEW
         Component: project-infrastructure
          Severity: medium
          Priority: medium
          Assignee: bugs at gluster.org
          Reporter: sankarshan at redhat.com
                CC: bugs at gluster.org, gluster-infra at gluster.org
  Target Milestone: ---
    Classification: Community

--
You are receiving this mail because:
You are on the CC list for the bug.
From bugzilla at redhat.com  Thu May 23 14:28:25 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Thu, 23 May 2019 14:28:25 +0000
Subject: [Gluster-infra] [Bug 1713391] New: Access to wordpress instance of gluster.org required for release management
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1713391

            Bug ID: 1713391
           Summary: Access to wordpress instance of gluster.org required
                    for release management
           Product: GlusterFS
           Version: mainline
            Status: NEW
         Component: project-infrastructure
          Severity: medium
          Assignee: bugs at gluster.org
          Reporter: rkothiya at redhat.com
                CC: bugs at gluster.org, gluster-infra at gluster.org
  Target Milestone: ---
    Classification: Community

Description of problem:
As I am managing the release of glusterfs, I need access to the gluster.org
wordpress instance, to publish the release schedule and do other activities
related to the release.

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Thu May 23 14:31:31 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Thu, 23 May 2019 14:31:31 +0000
Subject: [Gluster-infra] [Bug 1713391] Access to wordpress instance of gluster.org required for release management
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1713391

Shyamsundar changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
                 CC|                               |atumball at redhat.com,
                   |                               |srangana at redhat.com
              Flags|                               |needinfo?(atumball at redhat.com)

--- Comment #1 from Shyamsundar ---
Approved by me, adding Amar for another ack!

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Thu May 23 14:33:55 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Thu, 23 May 2019 14:33:55 +0000
Subject: [Gluster-infra] [Bug 1708257] Grant additional maintainers merge rights on release branches
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1708257

--- Comment #4 from Shyamsundar ---
Deepshika/Misc, can we get this done? Otherwise they will not be able to
manage the releases. Thanks!

Ping Rinku for your github details as well.

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Thu May 23 16:05:36 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Thu, 23 May 2019 16:05:36 +0000
Subject: [Gluster-infra] [Bug 1713429] New: My personal blog content is not feeding to https://planet.gluster.org/
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1713429

            Bug ID: 1713429
           Summary: My personal blog content is not feeding to
                    https://planet.gluster.org/
           Product: GlusterFS
           Version: mainline
            Status: NEW
         Component: project-infrastructure
          Assignee: bugs at gluster.org
          Reporter: rkavunga at redhat.com
                CC: bugs at gluster.org, gluster-infra at gluster.org
  Target Milestone: ---
    Classification: Community

Description of problem:
My personal content having the tag glusterfs should have been available in
https://planet.gluster.org/ after merging
https://github.com/gluster/planet-gluster/pull/47. The content is not
feeding to the site.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:

--
You are receiving this mail because:
You are on the CC list for the bug.
From bugzilla at redhat.com  Thu May 23 19:23:40 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Thu, 23 May 2019 19:23:40 +0000
Subject: [Gluster-infra] [Bug 1708257] Grant additional maintainers merge rights on release branches
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1708257

Rinku changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
              Flags|needinfo?(rkothiya at redhat.com) |

--- Comment #5 from Rinku ---
Hi

Rinku's Username: rkothiya and 2FA is enabled for this account on github.

Regards
Rinku

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Fri May 24 05:06:46 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 24 May 2019 05:06:46 +0000
Subject: [Gluster-infra] [Bug 1708257] Grant additional maintainers merge rights on release branches
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1708257

Deepshikha khandelwal changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
                 CC|                               |dkhandel at redhat.com

--- Comment #6 from Deepshikha khandelwal ---
Done. You all now have merge rights on the given branches.

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Fri May 24 05:07:23 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 24 May 2019 05:07:23 +0000
Subject: [Gluster-infra] [Bug 1708257] Grant additional maintainers merge rights on release branches
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1708257

Deepshikha khandelwal changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
             Status|NEW                            |CLOSED
         Resolution|---                            |CURRENTRELEASE
        Last Closed|                               |2019-05-24 05:07:23

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Fri May 24 15:05:30 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 24 May 2019 15:05:30 +0000
Subject: [Gluster-infra] [Bug 1713429] My personal blog content is not feeding to https://planet.gluster.org/
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1713429

Ravishankar N changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Keywords|                               |Triaged
                 CC|                               |dkhandel at redhat.com

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Fri May 24 15:08:12 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 24 May 2019 15:08:12 +0000
Subject: [Gluster-infra] [Bug 1713260] Using abrt-action-analyze-c on core dumps on CI
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1713260

Ravishankar N changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Keywords|                               |Triaged
                 CC|                               |dkhandel at redhat.com,
                   |                               |ravishankar at redhat.com

--
You are receiving this mail because:
You are on the CC list for the bug.
From bugzilla at redhat.com  Fri May 24 15:08:41 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 24 May 2019 15:08:41 +0000
Subject: [Gluster-infra] [Bug 1711950] Account in download.gluster.org to upload the build packages
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1711950

Ravishankar N changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Keywords|                               |Triaged
              Flags|needinfo?                      |
                   |needinfo?(mscherer at redhat.com) |

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Fri May 24 15:09:16 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 24 May 2019 15:09:16 +0000
Subject: [Gluster-infra] [Bug 1711945] create account on download.gluster.org
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1711945

Ravishankar N changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Keywords|                               |Triaged

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Fri May 24 15:45:30 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 24 May 2019 15:45:30 +0000
Subject: [Gluster-infra] [Bug 1703435] gluster-block: Upstream Jenkins job which gets triggered at PR level
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1703435

Ravishankar N changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Keywords|                               |Triaged

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Fri May 24 15:45:42 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 24 May 2019 15:45:42 +0000
Subject: [Gluster-infra] [Bug 1703433] gluster-block: setup GCOV & LCOV job
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1703433

Ravishankar N changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Keywords|                               |Triaged

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Sun May 26 14:05:36 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Sun, 26 May 2019 14:05:36 +0000
Subject: [Gluster-infra] [Bug 1713391] Access to wordpress instance of gluster.org required for release management
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1713391

Amar Tumballi changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
              Flags|needinfo?(atumball at redhat.com) |

--- Comment #2 from Amar Tumballi ---
Ack, Approved by me, too!

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Mon May 27 01:00:54 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Mon, 27 May 2019 01:00:54 +0000
Subject: [Gluster-infra] [Bug 1348072] Backups for Gerrit
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1348072

sankarshan changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
                 CC|                               |sankarshan at redhat.com
              Flags|                               |needinfo?(mscherer at redhat.com)

--- Comment #5 from sankarshan ---
Michael, is this being planned in the near future?

--
You are receiving this mail because:
You are on the CC list for the bug.
From bugzilla at redhat.com  Mon May 27 01:47:48 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Mon, 27 May 2019 01:47:48 +0000
Subject: [Gluster-infra] [Bug 1693385] request to change the version of fedora in fedora-smoke-job
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1693385

sankarshan changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
                 CC|                               |sankarshan at redhat.com

--- Comment #4 from sankarshan ---
(In reply to Amar Tumballi from comment #3)
> Agree, I was asking for a job without DEBUG mainly because a few times,
> there may be warning without DEBUG being there during compile (ref:
> https://review.gluster.org/22347 && https://review.gluster.org/22389 )
>
> As I had --enable-debug while testing locally, never saw the warning, and
> none of the smoke tests captured the error. If we had a job without
> --enable-debug, we could have seen the warning while compiling, which would
> have failed Smoke.

Is the request here to have a job without --enable-debug? Attempting to
understand this because there have not been many updates or much clarity on
the work.

Also, Fedora 30 is now GA - https://fedoramagazine.org/announcing-fedora-30/

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Mon May 27 01:52:40 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Mon, 27 May 2019 01:52:40 +0000
Subject: [Gluster-infra] [Bug 1504713] Move planet build to be triggered by Jenkins
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1504713

sankarshan changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
                 CC|                               |sankarshan at redhat.com
              Flags|                               |needinfo?(mscherer at redhat.com)

--- Comment #2 from sankarshan ---
(In reply to M. Scherer from comment #1)
> I also wonder if we could integrate jenkins with github, to replace the
> travis build. It tend to change underneat us and may surprise users.
>
> Ideally, I also would like a system that open ticket when a feed fail for
> too long.

Is this now in place? The last time I checked with Deepshikha, there was a
refrain of the cron behind the planet scripts failing randomly.

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Mon May 27 01:54:11 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Mon, 27 May 2019 01:54:11 +0000
Subject: [Gluster-infra] [Bug 1514365] Generate report to identify first time contributors
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1514365

sankarshan changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
                 CC|                               |dkhandel at redhat.com
              Flags|                               |needinfo?(dkhandel at redhat.com)

--
You are receiving this mail because:
You are on the CC list for the bug.
From bugzilla at redhat.com  Mon May 27 02:09:34 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Mon, 27 May 2019 02:09:34 +0000
Subject: [Gluster-infra] [Bug 1665361] Alerts for offline nodes
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1665361

sankarshan changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
                 CC|                               |dkhandel at redhat.com,
                   |                               |sankarshan at redhat.com
              Flags|                               |needinfo?(dkhandel at redhat.com)

--- Comment #2 from sankarshan ---
Is there any decision on whether Option#1 can be implemented? Deepshikha, can
we have Naresh look into this?

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Mon May 27 02:10:24 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Mon, 27 May 2019 02:10:24 +0000
Subject: [Gluster-infra] [Bug 1678378] Add a nightly build verification job in Jenkins for release-6
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1678378

sankarshan changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
                 CC|                               |dkhandel at redhat.com,
                   |                               |sankarshan at redhat.com
              Flags|                               |needinfo?(dkhandel at redhat.com)

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Mon May 27 02:11:52 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Mon, 27 May 2019 02:11:52 +0000
Subject: [Gluster-infra] [Bug 1692349] gluster-csi-containers job is failing
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1692349

sankarshan changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
             Status|NEW                            |CLOSED
                 CC|                               |sankarshan at redhat.com
         Resolution|---                            |DEFERRED
        Last Closed|                               |2019-05-27 02:11:52

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Mon May 27 02:39:45 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Mon, 27 May 2019 02:39:45 +0000
Subject: [Gluster-infra] [Bug 1713391] Access to wordpress instance of gluster.org required for release management
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1713391

Ravishankar N changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Keywords|                               |Triaged
                 CC|                               |ravishankar at redhat.com

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Mon May 27 03:36:22 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Mon, 27 May 2019 03:36:22 +0000
Subject: [Gluster-infra] [Bug 1713391] Access to wordpress instance of gluster.org required for release management
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1713391

Deepshikha khandelwal changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
                 CC|                               |dkhandel at redhat.com,
                   |                               |mscherer at redhat.com
              Flags|                               |needinfo?(mscherer at redhat.com)

--
You are receiving this mail because:
You are on the CC list for the bug.
From bugzilla at redhat.com  Mon May 27 03:43:51 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Mon, 27 May 2019 03:43:51 +0000
Subject: [Gluster-infra] [Bug 1489325] Place to host gerritstats
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1489325

Deepshikha khandelwal changed:

           What    |Removed                          |Added
----------------------------------------------------------------------
                 CC|                                 |mscherer at redhat.com
              Flags|needinfo?(sankarshan at redhat.com) |needinfo?(mscherer at redhat.com)
                   |needinfo?(dkhandel at redhat.com)   |

--- Comment #4 from Deepshikha khandelwal ---
I need more info about this bug. What kind of stats do we want? What would be
the end result of this?

It will be altogether a hosted service. We can have it in the cage or on the
same machine as gerrit.

@misc thoughts?

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Mon May 27 03:50:03 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Mon, 27 May 2019 03:50:03 +0000
Subject: [Gluster-infra] [Bug 1678378] Add a nightly build verification job in Jenkins for release-6
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1678378

Deepshikha khandelwal changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
              Flags|needinfo?(dkhandel at redhat.com) |

--- Comment #2 from Deepshikha khandelwal ---
No, we have a nightly master job. Will add this job.

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Mon May 27 03:53:59 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Mon, 27 May 2019 03:53:59 +0000
Subject: [Gluster-infra] [Bug 1693385] request to change the version of fedora in fedora-smoke-job
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1693385

Amar Tumballi changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
              Flags|needinfo?(atumball at redhat.com) |

--- Comment #5 from Amar Tumballi ---
First request in this bug is:

* Change the version of Fedora (currently 28) to 30.
  - This can't be done without some head start time, because there are some
    20-30 warnings with the newer compiler version. We need to fix them
    before making the job vote.
  - The best way is to have a job which doesn't vote (skip), but reports
    failure/success for at least a week or so. In that time, we fix all the
    warnings, make the job GREEN, and then make it vote.

* `--enable-debug` is used everywhere in smoke tests, but the release bits
  don't involve the same. Hence, I was thinking of having at least one of the
  smoke jobs not have it. Probably we should consider opening another bug for
  the same. It can be done in fedora-smoke, or even in centos-smoke; it
  doesn't matter.

--
You are receiving this mail because:
You are on the CC list for the bug.
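(The two build variants Amar contrasts, as a hedged sketch: ./autogen.sh and
the --enable-debug switch exist in the GlusterFS tree, but the exact smoke
job wiring around them is assumed here.)

    git clone https://github.com/gluster/glusterfs && cd glusterfs
    ./autogen.sh

    # What the smoke jobs do today: a debug build, which is why the extra
    # compiler warnings were never seen there.
    ./configure --enable-debug
    make -j"$(nproc)" 2> debug-warnings.log

    # The proposed variant: a release-like build without --enable-debug,
    # where those warnings actually surface and could fail smoke.
    make distclean
    ./configure
    make -j"$(nproc)" 2> release-warnings.log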
From bugzilla at redhat.com  Mon May 27 04:00:05 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Mon, 27 May 2019 04:00:05 +0000
Subject: [Gluster-infra] [Bug 1665361] Alerts for offline nodes
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1665361

Deepshikha khandelwal changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
              Flags|needinfo?(dkhandel at redhat.com) |

--- Comment #3 from Deepshikha khandelwal ---
According to me we should have it on nagios rather than an alerting jenkins
job. Nagios is already in place for builders, to alert about any memory
failures and so on. Though I don't receive notifications (that's a different
story), it would be good to have just one such source of alerting.

Naresh can look at the script if we agree on this.

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Mon May 27 04:12:07 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Mon, 27 May 2019 04:12:07 +0000
Subject: [Gluster-infra] [Bug 1663780] On docs.gluster.org, we should convert spaces in folder or file names to 301 redirects to hyphens
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1663780

Deepshikha khandelwal changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Assignee|bugs at gluster.org            |dkhandel at redhat.com
              Flags|needinfo?(dkhandel at redhat.com) |

--- Comment #3 from Deepshikha khandelwal ---
I can take this up once I gather enough understanding of how this can be
done.

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Mon May 27 15:35:48 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Mon, 27 May 2019 15:35:48 +0000
Subject: [Gluster-infra] [Bug 1637652] Glusterd2 is not cleaning itself
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1637652

Amar Tumballi changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
             Status|ASSIGNED                       |CLOSED
                 CC|                               |atumball at redhat.com
         Resolution|---                            |DEFERRED
        Last Closed|                               |2019-05-27 15:35:48

--- Comment #4 from Amar Tumballi ---
Not working actively on glusterd2, and hence marking it as DEFERRED.

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Mon May 27 16:12:28 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Mon, 27 May 2019 16:12:28 +0000
Subject: [Gluster-infra] [Bug 1598326] Setup CI for gluster-block
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1598326

Amar Tumballi changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
             Status|NEW                            |CLOSED
                 CC|                               |atumball at redhat.com
         Resolution|---                            |WORKSFORME
        Last Closed|                               |2019-05-27 16:12:28

--- Comment #4 from Amar Tumballi ---
We have travis now in https://github.com/gluster/gluster-block for every PR,
and nightly line-coverage tests on jenkins. Should be good to close this
issue.

--
You are receiving this mail because:
You are on the CC list for the bug.
From bugzilla at redhat.com  Tue May 28 10:10:57 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Tue, 28 May 2019 10:10:57 +0000
Subject: [Gluster-infra] [Bug 1678378] Add a nightly build verification job in Jenkins for release-6
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1678378

--- Comment #3 from Deepshikha khandelwal ---
Pushed a change to add this job too:
https://review.gluster.org/#/c/build-jobs/+/22781/

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Wed May 29 03:02:02 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Wed, 29 May 2019 03:02:02 +0000
Subject: [Gluster-infra] [Bug 1665361] Alerts for offline nodes
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1665361

Deepshikha khandelwal changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Assignee|bugs at gluster.org            |narekuma at redhat.com

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Fri May 31 03:31:53 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 31 May 2019 03:31:53 +0000
Subject: [Gluster-infra] [Bug 1713429] My personal blog content is not feeding to https://planet.gluster.org/
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1713429

Atin Mukherjee changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
                 CC|                               |amukherj at redhat.com
              Flags|                               |needinfo?(dkhandel at redhat.com)

--- Comment #2 from Atin Mukherjee ---
Any update on this?

--
You are receiving this mail because:
You are on the CC list for the bug.

From bugzilla at redhat.com  Fri May 31 07:33:21 2019
From: bugzilla at redhat.com (bugzilla at redhat.com)
Date: Fri, 31 May 2019 07:33:21 +0000
Subject: [Gluster-infra] [Bug 1713429] My personal blog content is not feeding to https://planet.gluster.org/
In-Reply-To:
References:
Message-ID:

https://bugzilla.redhat.com/show_bug.cgi?id=1713429

Deepshikha khandelwal changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
              Flags|needinfo?(dkhandel at redhat.com) |

--- Comment #3 from Deepshikha khandelwal ---
It's failing because the title field is empty on Rafi's blog feed.

@rafi Can you please check.

--
You are receiving this mail because:
You are on the CC list for the bug.

From srakonde at redhat.com  Tue May 7 14:35:04 2019
From: srakonde at redhat.com (Sanju Rakonde)
Date: Tue, 07 May 2019 14:35:04 -0000
Subject: [Gluster-infra] [Gluster-devel] is_nfs_export_available from nfs.rc failing too often?
In-Reply-To:
References: <2056284426.17636953.1554272780313.JavaMail.zimbra@redhat.com>
	<797512f6ff7f1b9fedbf8b7968dd86a6968d9105.camel@redhat.com>
Message-ID:

Looks like is_nfs_export_available started failing again in recent
centos-regressions.

Michael, can you please check?
>> > > Network is initialised and working (we can connect), but the
>> > > dhclient process is not in the right cgroup, and network.service
>> > > is in a failed state. Restarting the network didn't work. In turn,
>> > > this means that rpc-statd refuses to start (due to systemd
>> > > dependencies), which seems to impact various NFS tests.
>> > >
>> > > We have also seen that on some builders, rpcbind picks up some
>> > > IPv6 autoconfiguration, but we can't reproduce that, and there is
>> > > no IPv6 set up anywhere. I suspect the network.service failure is
>> > > somehow involved, but I fail to see how. In turn, rpcbind.socket
>> > > not starting could cause NFS test troubles.
>> > >
>> > > Our current stopgap fix was to repair all the builders one by one:
>> > > remove the config, kill the rogue dhclient, restart the network
>> > > service.
>> > >
>> > > However, we can't be sure this is going to fix the problem long
>> > > term, since this only manifests after a crash of the test suite,
>> > > and that doesn't happen very often. (Plus, it was working at some
>> > > point in the past before something made it fail, and I do not know
>> > > whether that was a system upgrade, a test change, or both.)
>> > >
>> > > So we are still looking at it to get a complete understanding of
>> > > the issue, but so far, we hacked our way to make it work (or so I
>> > > think).
>> > >
>> > > Deepshika is working to fix it long term, by fixing the issue
>> > > regarding eth0/ens5 with a new base image.
>> > > --
>> > > Michael Scherer
>> > > Sysadmin, Community Infrastructure and Platform, OSAS
>> >
>> > --
>> > - Atin (atinm)
>> --
>> Michael Scherer
>> Sysadmin, Community Infrastructure

-- 
Thanks,
Sanju

From srakonde at redhat.com Wed May 8 01:46:06 2019
From: srakonde at redhat.com (Sanju Rakonde)
Date: Wed, 08 May 2019 01:46:06 -0000
Subject: [Gluster-infra] [Gluster-devel] is_nfs_export_available from nfs.rc failing too often?
In-Reply-To: 
References: <2056284426.17636953.1554272780313.JavaMail.zimbra@redhat.com> <797512f6ff7f1b9fedbf8b7968dd86a6968d9105.camel@redhat.com>
Message-ID: 

Deepshikha, I see the failure here[1], and it ran on builder206. So, we
are good.

[1] https://build.gluster.org/job/centos7-regression/5901/consoleFull

On Wed, May 8, 2019 at 12:23 AM Deepshikha Khandelwal wrote:

> Sanju, can you please give us more info about the failures?
>
> I see the failures occurring on just one of the builders (builder206).
> I'm taking it back offline for now.
>
> On Tue, May 7, 2019 at 9:42 PM Michael Scherer wrote:
>
>> On Tuesday, 7 May 2019 at 20:04 +0530, Sanju Rakonde wrote:
>> > Looks like is_nfs_export_available started failing again in recent
>> > centos-regressions.
>> >
>> > Michael, can you please check?
>>
>> I will try, but I am leaving for vacation tonight, so if I find
>> nothing before I leave, I guess Deepshika will have to look.

-- 
Thanks,
Sanju
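For anyone chasing the same class of failure on their own machines, the thread above boils down to two checks: is something other than Gluster's built-in NFS server already holding the NFS ports, and are rpcbind/rpc-statd actually running. A minimal diagnostic sketch follows; it assumes a systemd-based CentOS 7 builder, and the unit names (nfs-server, rpcbind, rpc-statd) are the stock distribution ones, not anything specific to the Gluster test setup:

    # Is the kernel NFS server running? Gluster's own NFS server cannot
    # register its ports if the distribution's NFS service claimed them first.
    systemctl --no-pager status nfs-server rpcbind rpc-statd

    # List RPC programs registered with rpcbind; pre-existing nfs/mountd
    # entries mean another NFS implementation already owns the ports.
    rpcinfo -p localhost

    # Check the ports from the log snippet above (2049 for NFS, 38465 for
    # Gluster's mountd) and see which process, if any, is bound to them.
    ss -tlnp | grep -E ':2049|:38465'

    # If the kernel NFS server is the squatter, stop it and keep it from
    # coming back on the next boot.
    systemctl stop nfs-server && systemctl disable nfs-server

On an older SysV-style system the stop step would be "service nfs stop" plus "chkconfig nfs off", the former being essentially what was run on builder206.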
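The underlying eth0/ens5 rename is a separate fix. One common way to make a RHEL/CentOS cloud image keep the legacy ethN names, so that the shipped ifcfg-eth0 keeps matching an interface that exists, is to disable predictable interface naming on the kernel command line. This is only a sketch of that approach, assuming grub2 on a BIOS image; it is not necessarily what the new base image mentioned above actually does:

    # Add net.ifnames=0 biosdevname=0 so udev keeps the legacy ethN names
    # instead of renaming the NIC to ens5 on the next boot.
    sed -i 's/^GRUB_CMDLINE_LINUX="/&net.ifnames=0 biosdevname=0 /' /etc/default/grub
    grub2-mkconfig -o /boot/grub2/grub.cfg

    # Sanity-check that the network config still refers to an interface
    # name that will actually exist after the reboot.
    ip -o link show
    grep DEVICE /etc/sysconfig/network-scripts/ifcfg-*

The other direction, regenerating the ifcfg files to match whatever name the NIC ends up with, is what cloud-init is supposed to handle; the trouble described in the thread is precisely that the interface was renamed without that cleanup.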
From marco.dominguez at gmail.com Sat May 11 05:16:14 2019
From: marco.dominguez at gmail.com (Marco Andres Dominguez)
Date: Sat, 11 May 2019 05:16:14 -0000
Subject: [Gluster-infra] Could not start a VM with GlusterFS and SELinux active
Message-ID: 

Hi,

I am trying to use GlusterFS to share disk images (qcow2) between servers,
but although the path to the files is correct, I get these errors:

error: Failed to start domain srv02-vm02
error: internal error process exited while connecting to monitor: char
device redirected to /dev/pts/5

/mnt/gluster/vms
-rw-r--r--. root root system_u:object_r:fusefs_t:s0   srv02-vm02-Disk1.qcow2
-rw-r--r--. root root system_u:object_r:fusefs_t:s0   srv02-vm02-Disk2.qcow2

/mnt/xfs/vms
[root at server02 vms]# ls -lZ
drwxr-xr-x. root root unconfined_u:object_r:file_t:s0 definitions
-rw-r--r--. root root unconfined_u:object_r:file_t:s0 srv02-vm02-Disk1.qcow2
-rw-r--r--. root root unconfined_u:object_r:file_t:s0 srv02-vm02-Disk2.qcow2

I am running CentOS:

# cat /etc/redhat-release
CentOS release 6.2 (Final)
# lsb_release
LSB Version: :core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch

Gluster version:

# gluster --version
glusterfs 3.12.2
Repository revision: git://git.gluster.org/glusterfs.git
Copyright (c) 2006-2016 Red Hat, Inc.
GlusterFS comes with ABSOLUTELY NO WARRANTY.
It is licensed to you under your choice of the GNU Lesser General Public
License, version 3 or any later version (LGPLv3 or later), or the GNU
General Public License, version 2 (GPLv2), in all cases as published by
the Free Software Foundation.

Does anyone have an idea of what I should do to fix this without disabling
SELinux?

Best regards,
Marco
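The fusefs_t labels in the listing above point at a well-known SELinux interaction: confined qemu/libvirt domains are not allowed to touch FUSE-backed files unless the virt_use_fusefs boolean is enabled. A hedged sketch of the usual check-and-fix, assuming a stock selinux-policy (the boolean name is standard on recent RHEL/CentOS policies, but whether it exists on a policy as old as CentOS 6.2 should be verified first):

    # Look for recent AVC denials involving the FUSE-labelled image files.
    ausearch -m avc -ts recent | grep -i fusefs

    # Check, then persistently enable, the boolean that lets confined VMs
    # read and write files on FUSE filesystems such as a glusterfs mount.
    getsebool virt_use_fusefs
    setsebool -P virt_use_fusefs on

This keeps SELinux enforcing while granting only the FUSE access the VMs need, which is generally preferable to setting the whole system permissive.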