[Gluster-users] Another transaction could be in progress
Franco Broi
franco.broi at iongeo.com
Tue Mar 18 08:21:07 UTC 2014
Sorry, I didn't think to look in the log file; I can see I have bigger
problems. The last time I saw this it was because I had changed an IP
address, but this time all I did was reboot the server. I've checked all
the files in vols and everything looks good.
[2014-03-18 08:09:18.117040] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-0
[2014-03-18 08:09:18.117074] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-1
[2014-03-18 08:09:18.117087] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-2
[2014-03-18 08:09:18.117097] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-3
[2014-03-18 08:09:18.117107] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-4
[2014-03-18 08:09:18.117117] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-5
[2014-03-18 08:09:18.117128] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-6
[2014-03-18 08:09:18.117138] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-7
[2014-03-18 08:09:18.117148] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-8
[2014-03-18 08:09:18.117158] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-9
[2014-03-18 08:09:18.117168] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-10
[2014-03-18 08:09:18.117178] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-11
[2014-03-18 08:09:18.117196] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-12
[2014-03-18 08:09:18.117209] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-13
[2014-03-18 08:09:18.117219] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-14
[2014-03-18 08:09:18.117229] E [glusterd-store.c:1858:glusterd_store_retrieve_volume] 0-: Unknown key: brick-15
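Those "Unknown key: brick-N" lines come from glusterd re-parsing its on-disk volume store at startup. A quick way to eyeball that store is sketched below; the path assumes a default install, and "data" is the volume name taken from the status output in this thread:

```shell
# Sketch: inspect the persisted volume definition that glusterd
# re-reads at startup. Path assumes the default glusterd working
# directory; adjust for non-standard builds.
VOLDIR=/var/lib/glusterd/vols/data
if [ -d "$VOLDIR" ]; then
    grep -c '^brick-' "$VOLDIR/info"   # expect one line per brick (16 here)
    ls "$VOLDIR/bricks"                # one config file per brick
else
    echo "no volume store at $VOLDIR"
fi
```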
This is from another server
[root at nas1 bricks]# gluster vol status
Status of volume: data
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick nas1-10g:/data1/gvol                      49152   Y       17331
Brick nas2-10g:/data5/gvol                      49160   Y       3933
Brick nas1-10g:/data2/gvol                      49153   Y       17340
Brick nas2-10g:/data6/gvol                      49161   Y       3942
Brick nas1-10g:/data3/gvol                      49154   Y       17350
Brick nas2-10g:/data7/gvol                      49162   Y       3951
Brick nas1-10g:/data4/gvol                      49155   Y       17360
Brick nas2-10g:/data8/gvol                      49163   Y       3960
Brick nas3-10g:/data9/gvol                      49156   Y       10076
Brick nas3-10g:/data10/gvol                     49157   Y       10085
Brick nas3-10g:/data11/gvol                     49158   Y       10094
Brick nas3-10g:/data12/gvol                     49159   Y       10108
Brick nas4-10g:/data13/gvol                     N/A     N       8879
Brick nas4-10g:/data14/gvol                     N/A     N       8884
Brick nas4-10g:/data15/gvol                     N/A     N       8888
Brick nas4-10g:/data16/gvol                     N/A     N       8892
NFS Server on localhost                         2049    Y       18725
NFS Server on nas3-10g                          2049    Y       11667
NFS Server on nas2-10g                          2049    Y       4980
NFS Server on nas4-10g                          N/A     N       N/A

There are no active volume tasks
Any ideas?
On Tue, 2014-03-18 at 12:39 +0530, Kaushal M wrote:
> The lock is an in-memory structure which isn't persisted. Restarting
> should reset the lock. You could possibly reset the lock by gdbing
> into the glusterd process.
>
> Since this is happening to you consistently, something else is wrong.
> Could you please give more details on your cluster, and the glusterd
> logs of the misbehaving peer (if possible, of all the peers)? That
> would help in tracking it down.
>
>
>
> On Tue, Mar 18, 2014 at 12:24 PM, Franco Broi <franco.broi at iongeo.com> wrote:
> >
> > Restarted the glusterd daemons on all 4 servers, still the same.
> >
> > It always fails on the same server, and always works on the other
> > servers.
> >
> > I had to reboot the server in question this morning, perhaps it's got
> > itself in a funny state.
> >
> > Is the lock something that can be examined? And removed?
> >
> > On Tue, 2014-03-18 at 12:08 +0530, Kaushal M wrote:
> >> This mostly occurs when you run two gluster commands simultaneously.
> >> Gluster uses a lock on each peer to synchronize commands. Any command
> >> that needs to operate on multiple peers first acquires this lock and
> >> releases it once the operation is done. If a command cannot acquire
> >> the lock because another command holds it, it fails with the above
> >> error message.
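The try-lock behaviour described above can be sketched in miniature. This is only an illustration of the pattern, not GlusterD's actual code; the names are made up:

```python
import threading

# Stand-in for glusterd's per-peer transaction lock (illustration only).
cluster_lock = threading.Lock()

def run_command(name):
    # Commands try to take the lock without blocking; if another
    # transaction holds it, they fail immediately instead of queueing.
    if not cluster_lock.acquire(blocking=False):
        return name + ": Another transaction could be in progress."
    try:
        return name + ": ok"
    finally:
        cluster_lock.release()

print(run_command("gluster vol status"))  # succeeds: the lock was free
cluster_lock.acquire()                    # simulate a stale, unreleased lock
print(run_command("gluster vol status"))  # now fails every time
```

A stale lock is exactly the second case: once a holder never releases, every later command fails with the same message until glusterd is restarted.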
> >>
> >> It sometimes happens that a command fails to release the lock on
> >> some peers. When this happens, all further commands that need the
> >> lock will fail with the same error. In that case your only option is
> >> to restart glusterd on the peers still holding the stale lock. This
> >> causes no downtime, as the brick processes are not affected by
> >> restarting glusterd.
> >>
> >> In your case, since you can run commands on other nodes, most likely
> >> you are running commands simultaneously or at least running a command
> >> before an old one finishes.
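One cheap way to test that hypothesis is to look for another gluster CLI still running on the failing peer; a monitoring script or cron job quietly running `gluster vol status` is a common culprit. A rough check (assumes a standard install where the CLI binary is named `gluster`):

```shell
# Look for another gluster CLI process that could be holding the
# transaction lock. The bracketed [g] keeps grep from matching its
# own command line; the trailing space skips glusterd/glusterfsd.
ps -ef | grep '[g]luster ' || echo "no other gluster CLI running"
```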
> >>
> >> ~kaushal
> >>
> >> On Tue, Mar 18, 2014 at 11:24 AM, Franco Broi <franco.broi at iongeo.com> wrote:
> >> >
> >> > What causes this error? And how do I get rid of it?
> >> >
> >> > [root at nas4 ~]# gluster vol status
> >> > Another transaction could be in progress. Please try again after sometime.
> >> >
> >> >
> >> > Looks normal on any other server.
> >> >
> >> > _______________________________________________
> >> > Gluster-users mailing list
> >> > Gluster-users at gluster.org
> >> > http://supercolony.gluster.org/mailman/listinfo/gluster-users
> >
> >