[Gluster-users] Problem with glusterd locks on gluster 3.6.1
B.K.Raghuram
bkrram at gmail.com
Fri Jun 17 10:06:37 UTC 2016
I tried that some time back but ran into some merge conflicts and was not
sure whom to turn to :) May I come to you for help with that?!
On Fri, Jun 17, 2016 at 3:29 PM, Atin Mukherjee <amukherj at redhat.com> wrote:
>
>
> On 06/17/2016 03:21 PM, B.K.Raghuram wrote:
> > Thanks a ton Atin. That fixed the cherry-pick. Will build it and let you
> > know how it goes. Does it make sense to try and merge the whole upstream
> > glusterfs repo for the 3.6 branch in order to get all the other bug
> > fixes? That may bring in many more merge conflicts though..
>
> Yup, I'd not recommend that. Applying your local changes on the latest
> version is a much easier option :)
>
> >
> > On Fri, Jun 17, 2016 at 3:07 PM, Atin Mukherjee <amukherj at redhat.com> wrote:
> >
> > I've resolved the merge conflicts and files are attached. Copy these
> > files and follow the instructions from the cherry pick command which
> > failed.
> >
> > ~Atin
> >
> > On 06/17/2016 02:55 PM, B.K.Raghuram wrote:
> > >
> > > Thanks Atin, I had three merge conflicts in the third patch.. I've
> > > attached the files with the conflicts. Would any of the intervening
> > > commits be needed as well?
> > >
> > > The conflicts were in :
> > >
> > > both modified: libglusterfs/src/mem-types.h
> > > both modified: xlators/mgmt/glusterd/src/glusterd-utils.c
> > > both modified: xlators/mgmt/glusterd/src/glusterd-utils.h
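When a cherry-pick stops on conflicts, `git status` lists the affected files as "both modified", as in the three entries above. A minimal sketch, assuming the status text has already been captured as a string, that pulls those paths out so they can be reviewed one by one (the function name `conflicted_files` is illustrative, not part of git or gluster):

```python
import re

# Matches the "both modified:" entries that `git status` prints for files
# left in a conflicted state by a cherry-pick or merge.
_CONFLICT = re.compile(r"^\s*both modified:\s+(.+?)\s*$")


def conflicted_files(status_text):
    """Return the paths reported as 'both modified' in `git status` output."""
    paths = []
    for line in status_text.splitlines():
        match = _CONFLICT.match(line)
        if match:
            paths.append(match.group(1))
    return paths


# Sample input built from the conflict list quoted in this thread.
sample_status = """
    both modified:   libglusterfs/src/mem-types.h
    both modified:   xlators/mgmt/glusterd/src/glusterd-utils.c
    both modified:   xlators/mgmt/glusterd/src/glusterd-utils.h
"""

for path in conflicted_files(sample_status):
    print(path)
```

After each file is fixed by hand it still has to be staged with `git add` before `git cherry-pick --continue` will proceed.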
> > >
> > >
> > > On Fri, Jun 17, 2016 at 2:17 PM, Atin Mukherjee <amukherj at redhat.com> wrote:
> > >
> > >
> > >
> > > On 06/17/2016 12:44 PM, B.K.Raghuram wrote:
> > > > > Thanks Atin.. I'm not familiar with pulling patches from the review system
> > > > > but will try :)
> > >
> > > > It's not that difficult. Open the Gerrit review link, go to the download
> > > > drop-down at the top right corner, click on it and you will see a
> > > > cherry-pick option. Copy that command and paste it into a shell in the
> > > > source code repo you host. If there are no merge conflicts, it should
> > > > apply automatically; otherwise you'd need to fix them manually.
> > >
> > > HTH.
> > > Atin
> > >
> > > >
> > > > > On Fri, Jun 17, 2016 at 12:35 PM, Atin Mukherjee <amukherj at redhat.com> wrote:
> > > >
> > > >
> > > >
> > > > On 06/16/2016 06:17 PM, Atin Mukherjee wrote:
> > > > >
> > > > >
> > > > > On 06/16/2016 01:32 PM, B.K.Raghuram wrote:
> > > > >> Thanks a lot Atin,
> > > > >>
> > > > > >> The problem is that we are using a forked version of 3.6.1 which has
> > > > > >> been modified to work with ZFS (for snapshots) but we do not have the
> > > > > >> resources to port that over to the later versions of gluster.
> > > > > >>
> > > > > >> Would you know of anyone who would be willing to take this on?!
> > > > >
> > > > > > If you can cherry pick the patches and apply them on your source and
> > > > > > rebuild it, I can point you to the patches, but you'd need to give me a
> > > > > > day's time as I have some other items on my plate to finish.
> > > >
> > > >
> > > > > Here is the list of patches that need to be applied, in the following
> > > > > order:
> > > >
> > > > http://review.gluster.org/9328
> > > > http://review.gluster.org/9393
> > > > http://review.gluster.org/10023
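Gerrit publishes each change under a ref of the form `refs/changes/<last two digits of the change number>/<change number>/<patchset>`, which is what the cherry-pick entry in the review page's download menu fetches. A rough sketch that builds the fetch and cherry-pick commands for the three changes above, in the stated order. The remote URL and the patchset number `1` are placeholders assumed for illustration only; the actual values should be copied from each review page.

```python
def gerrit_change_ref(change, patchset):
    """Build the Gerrit ref: refs/changes/<last two digits>/<change>/<patchset>."""
    return "refs/changes/{:02d}/{}/{}".format(change % 100, change, patchset)


def cherry_pick_commands(remote, changes):
    """Return one 'fetch && cherry-pick' shell command per (change, patchset), in order."""
    commands = []
    for change, patchset in changes:
        ref = gerrit_change_ref(change, patchset)
        commands.append(
            "git fetch {} {} && git cherry-pick FETCH_HEAD".format(remote, ref)
        )
    return commands


# ASSUMPTION: placeholder remote URL; use the one shown on the review page.
REMOTE = "http://review.gluster.org/glusterfs"
# ASSUMPTION: patchset 1 is a placeholder; use the patchset that was merged.
CHANGES = [(9328, 1), (9393, 1), (10023, 1)]

for command in cherry_pick_commands(REMOTE, CHANGES):
    print(command)
```

Printing the commands rather than running them keeps the order visible and lets each one be run by hand, which is useful when a pick stops on a conflict.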
> > > >
> > > > >
> > > > > ~Atin
> > > > >>
> > > > >> Regards,
> > > > >> -Ram
> > > > >>
> > > > > >> On Thu, Jun 16, 2016 at 11:02 AM, Atin Mukherjee <amukherj at redhat.com> wrote:
> > > > >>
> > > > >>
> > > > >>
> > > > >> On 06/16/2016 10:49 AM, B.K.Raghuram wrote:
> > > > >> >
> > > > >> >
> > > > > >> > On Wed, Jun 15, 2016 at 5:01 PM, Atin Mukherjee <amukherj at redhat.com> wrote:
> > > > >> >
> > > > >> >
> > > > >> >
> > > > >> > On 06/15/2016 04:24 PM, B.K.Raghuram wrote:
> > > > >> > > Hi,
> > > > >> > >
> > > > > >> > > We're using gluster 3.6.1 and we periodically find that gluster commands
> > > > > >> > > fail saying that it could not get the lock on one of the brick machines.
> > > > > >> > > The logs on that machine then say something like:
> > > > > >> > >
> > > > > >> > > [2016-06-15 08:17:03.076119] E [glusterd-op-sm.c:3058:glusterd_op_ac_lock] 0-management: Unable to acquire lock for vol2
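To see how often this happens and for which volumes, the glusterd log can be scanned for the message quoted above. A minimal sketch, assuming each log entry sits on a single line in the same layout as that sample; the pattern is derived only from the one line shown in this thread, so other glusterd versions may format it differently.

```python
import re
from collections import Counter

# Pattern derived from the single sample line quoted in this thread.
LOCK_FAILURE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+E\s+\[[^\]]*glusterd_op_ac_lock\]\s+"
    r"\S+:\s+Unable to acquire lock for (?P<volume>\S+)"
)


def lock_failures(lines):
    """Count 'Unable to acquire lock' errors per volume name."""
    counts = Counter()
    for line in lines:
        match = LOCK_FAILURE.match(line.strip())
        if match:
            counts[match.group("volume")] += 1
    return counts


# Two lines taken from this thread: one lock failure, one unrelated entry.
sample_log = [
    "[2016-06-15 08:17:03.076119] E [glusterd-op-sm.c:3058:glusterd_op_ac_lock] "
    "0-management: Unable to acquire lock for vol2",
    "[2016-06-14 06:48:57.282290] I [glusterd-store.c:2063:glusterd_restore_op_version] "
    "0-management: Detected new install. Setting op-version to maximum : 30600",
]

for volume, count in lock_failures(sample_log).items():
    print(volume, count)
```

In practice the lines would come from iterating over the open log file instead of the in-memory sample.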
> > > > >> >
> > > > > >> > This is possible if concurrent volume operations are run. Do you
> > > > > >> > have any script which checks volume status at an interval from all
> > > > > >> > the nodes? If so, then this is expected behavior.
> > > > >> >
> > > > >> >
> > > > > >> > Yes, I do have a couple of scripts that check on volume and quota
> > > > > >> > status.. Given this, I do get a "Another transaction is in progress.."
> > > > > >> > message which is ok. The problem is that sometimes I get the volume lock
> > > > > >> > held message which never goes away. This sometimes results in glusterd
> > > > > >> > consuming a lot of memory and CPU and the problem can only be fixed with
> > > > > >> > a reboot. The log files are huge so I'm not sure if it's ok to attach
> > > > > >> > them to an email.
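For the transient case, where a monitoring script only collides with another operation and sees "Another transaction is in progress", retrying with a growing delay is one way to keep the scripts from piling up on the lock. The sketch below is illustrative only: `run` is a hypothetical callable that returns `(returncode, output)`, for example a thin wrapper around `subprocess.run(["gluster", "volume", "status"], ...)`. It does nothing for the stale lock described above, which never clears on its own.

```python
import time

# Message text as quoted in this thread.
BUSY_MARKER = "Another transaction is in progress"


def run_with_retry(run, attempts=5, delay=2.0, sleep=time.sleep):
    """Call run() until its output no longer contains the busy message.

    run      -- hypothetical callable returning (returncode, output_text)
    attempts -- maximum number of calls (must be at least 1)
    delay    -- initial wait in seconds, doubled after every busy result
    sleep    -- injectable for testing; defaults to time.sleep
    """
    for attempt in range(attempts):
        code, output = run()
        if BUSY_MARKER not in output:
            return code, output
        if attempt < attempts - 1:
            sleep(delay * (2 ** attempt))  # exponential backoff
    # Still busy after the last attempt: hand back the last result.
    return code, output
```

Staggering the scripts across nodes, or running the status checks from a single node, reduces the number of collisions in the first place.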
> > > > >>
> > > > > >> Ok, so this is known. We have fixed lots of stale lock issues in the 3.7
> > > > > >> branch and some of them, if not all, were also backported to the 3.6 branch.
> > > > > >> The issue is that you are using 3.6.1, which is quite old. If you can upgrade
> > > > > >> to the latest version of 3.7, or at worst of 3.6, I am confident that this
> > > > > >> will go away.
> > > > >>
> > > > >> ~Atin
> > > > >> >
> > > > >> > >
> > > > > >> > > After some time, glusterd then seems to give up and die..
> > > > >> >
> > > > > >> > Do you mean glusterd shuts down or segfaults? If so, I am more interested
> > > > > >> > in analyzing this part. Could you provide us the glusterd log and the
> > > > > >> > cmd_history log file, along with the core (in case of SEGV), from all the
> > > > > >> > nodes for further analysis?
> > > > >> >
> > > > >> >
> > > > > >> > There is no segfault. glusterd just shuts down. As I said above,
> > > > > >> > sometimes this happens and sometimes it just continues to hog a lot of
> > > > > >> > memory and CPU..
> > > > >> >
> > > > >> >
> > > > >> > >
> > > > > >> > > Interestingly, I also find the following line at the beginning of
> > > > > >> > > etc-glusterfs-glusterd.vol.log and I don't know if this has any
> > > > > >> > > significance to the issue:
> > > > > >> > >
> > > > > >> > > [2016-06-14 06:48:57.282290] I [glusterd-store.c:2063:glusterd_restore_op_version] 0-management: Detected new install. Setting op-version to maximum : 30600
> > > > >> > >
> > > > >> >
> > > > >> >
> > > > >> > What does this line signify?
> > > > >>
> > > > >>
> > > >
> > > >
> > >
> > >
> >
> >
>