[Gluster-devel] Cores generated with ./tests/geo-rep/georep-basic-dr-tarssh.t
Soumya Koduri
skoduri at redhat.com
Thu Mar 3 06:54:00 UTC 2016
Thanks a lot Kotresh.
On 03/03/2016 08:47 AM, Raghavendra G wrote:
> Hi Soumya,
>
> Can you send a fix to this regression on upstream master too? This patch
> is merged there.
>
I have submitted the patch below.
http://review.gluster.org/#/c/13587/
Kindly review the same.
Thanks,
Soumya
> regards,
> Raghavendra
>
> On Tue, Mar 1, 2016 at 10:34 PM, Kotresh Hiremath Ravishankar
> <khiremat at redhat.com> wrote:
>
> Hi Soumya,
>
> I analysed the issue and found that the crash is caused by the
> patch [1].
>
> The patch no longer sets the transport object to NULL in
> 'rpc_clnt_disable'; it does so in 'rpc_clnt_trigger_destroy' instead.
> So if there are pending rpc invocations on an rpc object that has
> been disabled (such instances are possible, as is happening now in
> changelog), they trigger a CONNECT notify again with a 'mydata' that
> has already been freed, causing a crash. This happens because
> 'rpc_clnt_submit' reconnects if the rpc is not connected.
>
> rpc_clnt_submit (...) {
>     ...
>     if (conn->connected == 0) {
>         ret = rpc_transport_connect (conn->trans,
>                                      conn->config.remote_port);
>     }
>     ...
> }
>
> Without your patch, conn->trans was set to NULL, so the CONNECT
> failed and did not result in a CONNECT notify call. The cleanup also
> happened in that failure path.
>
> So the memory leak can happen if no rpc invocation is attempted after
> DISCONNECT. Otherwise it is cleaned up.
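
The sequence described above can be sketched with a small, self-contained C model. Every name in it (client_t, owner_data_t, client_disable, client_submit, run_scenario) is a hypothetical stand-in and not the GlusterFS rpc-clnt code. The freed 'mydata' is modelled with a flag, so the sketch itself never touches freed memory; it only counts the notifications that would have landed on a freed object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the 'mydata' passed to the notify callback. */
typedef struct {
    bool freed;         /* set once the owner has released the object */
    int  notify_count;  /* CONNECT notifications delivered while alive */
} owner_data_t;

/* Hypothetical stand-in for the transport object. */
typedef struct {
    int unused;
} transport_t;

/* Hypothetical stand-in for the rpc client and its connection state. */
typedef struct {
    bool          connected;
    transport_t  *trans;          /* NULL: nothing to reconnect on */
    owner_data_t *mydata;
    int           stale_notifies; /* notifications aimed at a freed owner */
} client_t;

/* Connect fails when there is no transport; on success it notifies the owner. */
static int transport_connect(client_t *clnt)
{
    if (clnt->trans == NULL)
        return -1;                /* failure path: no CONNECT notify */
    clnt->connected = true;
    if (clnt->mydata->freed)
        clnt->stale_notifies++;   /* real code would dereference freed memory */
    else
        clnt->mydata->notify_count++;
    return 0;
}

/* Disable the client; 'clear_transport' selects the old or new behaviour. */
static void client_disable(client_t *clnt, bool clear_transport)
{
    clnt->connected = false;
    if (clear_transport)
        clnt->trans = NULL;
}

/* Like the quoted snippet: a submit on a disconnected client reconnects. */
static int client_submit(client_t *clnt)
{
    assert(clnt != NULL);
    if (!clnt->connected)
        return transport_connect(clnt);
    return 0;
}

/* Disable, release the owner, then submit a pending call.
 * Returns how many notifications reached the released owner. */
static int run_scenario(bool clear_transport)
{
    transport_t  trans = { 0 };
    owner_data_t owner = { false, 0 };
    client_t     clnt  = { true, &trans, &owner, 0 };

    client_disable(&clnt, clear_transport);
    owner.freed = true;           /* modelled with a flag, nothing is freed */
    (void) client_submit(&clnt);
    return clnt.stale_notifies;
}
```

In this model, run_scenario(false) corresponds to leaving the transport set on disable, so the pending submit reconnects and notifies the released owner. run_scenario(true) corresponds to clearing the transport on disable, so the connect fails and no notification is delivered.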
>
>
> [1] http://review.gluster.org/#/c/13507/
>
> Thanks and Regards,
> Kotresh H R
>
> ----- Original Message -----
> > From: "Kotresh Hiremath Ravishankar" <khiremat at redhat.com>
> > To: "Soumya Koduri" <skoduri at redhat.com>
> > Cc: avishwan at redhat.com, "Gluster Devel" <gluster-devel at gluster.org>
> > Sent: Monday, February 29, 2016 4:15:22 PM
> > Subject: Re: Cores generated with ./tests/geo-rep/georep-basic-dr-tarssh.t
> >
> > Hi Soumya,
> >
> > I just tested that it is reproducible only with your patch, both in
> > master and the 3.7 branch.
> > The geo-rep test cases are marked bad in master, so it is not hit in
> > master. rpc was introduced in the changelog xlator to communicate
> > with applications via libgfchangelog.
> > Venky or I will check why the crash is happening and will update.
> >
> >
> > Thanks and Regards,
> > Kotresh H R
> >
> > ----- Original Message -----
> > > From: "Soumya Koduri" <skoduri at redhat.com>
> > > To: avishwan at redhat.com, "kotresh" <khiremat at redhat.com>
> > > Cc: "Gluster Devel" <gluster-devel at gluster.org>
> > > Sent: Monday, February 29, 2016 2:10:51 PM
> > > Subject: Cores generated with ./tests/geo-rep/georep-basic-dr-tarssh.t
> > >
> > > Hi Aravinda/Kotresh,
> > >
> > > With [1], I consistently see cores generated with the test
> > > './tests/geo-rep/georep-basic-dr-tarssh.t' in the release-3.7
> > > branch. From the cores, it looks like we are trying to dereference
> > > a freed changelog_rpc_clnt_t (crpc) object in
> > > changelog_rpc_notify(). Strangely, this was not reported in the
> > > master branch.
> > >
> > > I tried debugging but couldn't find any likely suspects. I request
> > > you to take a look and let me know if [1] caused any regression.
> > >
> > > Thanks,
> > > Soumya
> > >
> > > [1] http://review.gluster.org/#/c/13507/
> > >
> >
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>
> --
> Raghavendra G