[Gluster-users] Problem with glusterd locks on gluster 3.6.1
B.K.Raghuram
bkrram at gmail.com
Fri Jun 17 07:14:00 UTC 2016
Thanks Atin. I'm not familiar with pulling patches from the review
system, but will try :)
On Fri, Jun 17, 2016 at 12:35 PM, Atin Mukherjee <amukherj at redhat.com>
wrote:
>
>
> On 06/16/2016 06:17 PM, Atin Mukherjee wrote:
> >
> >
> > On 06/16/2016 01:32 PM, B.K.Raghuram wrote:
> >> Thanks a lot Atin,
> >>
> >> The problem is that we are using a forked version of 3.6.1, which
> >> has been modified to work with ZFS (for snapshots), but we do not
> >> have the resources to port that over to later versions of gluster.
> >>
> >> Would you know of anyone who would be willing to take this on?!
> >
> > If you can cherry-pick the patches, apply them to your source, and
> > rebuild it, I can point you to the patches, but you'd need to give me
> > a day's time as I have some other items to finish on my plate.
>
>
> Here is the list of patches that need to be applied, in the following order:
>
> http://review.gluster.org/9328
> http://review.gluster.org/9393
> http://review.gluster.org/10023
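>
> In case it helps, a minimal sketch of pulling these from Gerrit and
> applying them (the anonymous repo URL and the patch-set number "1" are
> assumptions; check each review page for the latest patch set):
>
>     # Gerrit exposes each change at
>     # refs/changes/<last two digits of change>/<change number>/<patch set>
>     cd /path/to/your/glusterfs-3.6.1-fork
>     for ref in refs/changes/28/9328/1 \
>                refs/changes/93/9393/1 \
>                refs/changes/23/10023/1; do
>         git fetch http://review.gluster.org/glusterfs "$ref" && \
>         git cherry-pick FETCH_HEAD || break
>     done
>
> Conflicts are likely against a forked 3.6.1 tree, so each pick may need
> manual resolution before you rebuild.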
>
> >
> > ~Atin
> >>
> >> Regards,
> >> -Ram
> >>
> >> On Thu, Jun 16, 2016 at 11:02 AM, Atin Mukherjee
> >> <amukherj at redhat.com> wrote:
> >>
> >>
> >>
> >> On 06/16/2016 10:49 AM, B.K.Raghuram wrote:
> >> >
> >> >
> >> > On Wed, Jun 15, 2016 at 5:01 PM, Atin Mukherjee
> >> > <amukherj at redhat.com> wrote:
> >> >
> >> >
> >> >
> >> > On 06/15/2016 04:24 PM, B.K.Raghuram wrote:
> >> > > Hi,
> >> > >
> >> > > We're using gluster 3.6.1, and we periodically find that gluster
> >> > > commands fail, saying that they could not get the lock on one of
> >> > > the brick machines. The logs on that machine then say something
> >> > > like:
> >> > >
> >> > > [2016-06-15 08:17:03.076119] E
> >> > > [glusterd-op-sm.c:3058:glusterd_op_ac_lock] 0-management: Unable
> >> > > to acquire lock for vol2
> >> >
> >> > This is a possible case if concurrent volume operations are run.
> >> > Do you have any script which checks the volume status on an
> >> > interval from all the nodes? If so, then this is expected
> >> > behavior.
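> >> >
> >> > If those checks are only for monitoring, one way to keep them from
> >> > piling up on the cluster-wide lock is to serialize them on each
> >> > node. A minimal sketch (the lock-file path here is an arbitrary
> >> > choice):
> >> >
> >> >     # -n: skip this run if the previous check is still in flight,
> >> >     # instead of queueing another lock-holding operation behind it
> >> >     flock -n /var/run/gluster-status-check.lock \
> >> >         gluster volume status all > /tmp/gluster-status.out 2>&1
> >> >
> >> > Running the check from a single node instead of all nodes also
> >> > cuts down the contention.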
> >> >
> >> >
> >> > Yes, I do have a couple of scripts that check on volume and quota
> >> > status. Given this, I do get an "Another transaction is in
> >> > progress..." message, which is ok. The problem is that sometimes I
> >> > get the volume-lock-held message, which never goes away. This
> >> > sometimes results in glusterd consuming a lot of memory and CPU,
> >> > and the problem can only be fixed with a reboot. The log files are
> >> > huge, so I'm not sure if it's ok to attach them to an email.
> >>
> >> Ok, so this is known. We have fixed lots of stale-lock issues in the
> >> 3.7 branch, and some of them, if not all, were also backported to the
> >> 3.6 branch. The issue is that you are using 3.6.1, which is quite
> >> old. If you can upgrade to the latest version of 3.7, or at worst of
> >> 3.6, I am confident that this will go away.
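> >>
> >> In the meantime, when a lock does go stale, restarting just the
> >> management daemon on the node that logs the "Unable to acquire lock"
> >> error should be enough, since the lock is held in glusterd's memory;
> >> a full node reboot shouldn't be needed (the service name below
> >> assumes a standard packaged install):
> >>
> >>     # bounces only the management daemon; brick processes and
> >>     # client I/O keep running
> >>     service glusterd restart    # or: systemctl restart glusterd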
> >>
> >> ~Atin
> >> >
> >> > >
> >> > > After some time, glusterd then seems to give up and die.
> >> >
> >> > Do you mean glusterd shuts down or segfaults? If so, I am more
> >> > interested in analyzing this part. Could you provide us the
> >> > glusterd log and cmd_history log file, along with the core (in
> >> > case of SEGV), from all the nodes for further analysis?
> >> >
> >> >
> >> > There is no segfault; glusterd just shuts down. As I said above,
> >> > sometimes this happens, and sometimes it just continues to hog a
> >> > lot of memory and CPU.
> >> >
> >> >
> >> > >
> >> > > Interestingly, I also find the following line at the beginning
> >> > > of etc-glusterfs-glusterd.vol.log, and I don't know if this has
> >> > > any significance to the issue:
> >> > >
> >> > > [2016-06-14 06:48:57.282290] I
> >> > > [glusterd-store.c:2063:glusterd_restore_op_version] 0-management:
> >> > > Detected new install. Setting op-version to maximum : 30600
> >> > >
> >> >
> >> >
> >> > What does this line signify?