[Gluster-devel] [Gluster-users] Unable to get lock for uuid peers

Wed Jun 10 12:42:46 UTC 2015


On 06/10/2015 05:32 PM, Tiemen Ruiten wrote:
> Hello Atin,
> 
> We are running 3.7.0 on our storage nodes and suffer from the same issue.
> Is it safe to perform the same command or should we first upgrade to 3.7.1?
You should upgrade to 3.7.1 first.
> 
> On 10 June 2015 at 13:45, Atin Mukherjee <amukherj at redhat.com> wrote:
> 
>>
>>
>> On 06/10/2015 02:58 PM, Sergio Traldi wrote:
>>> On 06/10/2015 10:27 AM, Krishnan Parthasarathi wrote:
>>>>> Hi all,
>>>>> I have two servers running 3.7.1 and see the same problem as in this issue:
>>>>> http://comments.gmane.org/gmane.comp.file-systems.gluster.user/20693
>>>>>
>>>>> My servers packages:
>>>>> # rpm -qa | grep gluster | sort
>>>>> glusterfs-3.7.1-1.el6.x86_64
>>>>> glusterfs-api-3.7.1-1.el6.x86_64
>>>>> glusterfs-cli-3.7.1-1.el6.x86_64
>>>>> glusterfs-client-xlators-3.7.1-1.el6.x86_64
>>>>> glusterfs-fuse-3.7.1-1.el6.x86_64
>>>>> glusterfs-geo-replication-3.7.1-1.el6.x86_64
>>>>> glusterfs-libs-3.7.1-1.el6.x86_64
>>>>> glusterfs-server-3.7.1-1.el6.x86_64
>>>>>
>>>>> Command:
>>>>> # gluster volume status
>>>>> Another transaction is in progress. Please try again after sometime.
>> The problem is that although you are running 3.7.1 binaries, the cluster
>> op-version is still set to 30501. Because of that, glusterd still acquires
>> the cluster-wide lock instead of a volume-wise lock for every request. The
>> command log history indicates that glusterd is receiving multiple volume
>> status requests and, because of this, fails to acquire the cluster lock.
>> Could you bump up your cluster's op-version with the following command and
>> recheck?
>>
>> gluster volume set all cluster.op-version 30701
>>
>> ~Atin
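
Before bumping the op-version, it can help to confirm what each node currently has recorded. The following is a minimal sketch, assuming that glusterd keeps the value as an `operating-version=` line in /var/lib/glusterd/glusterd.info. The function names and the sample text are hypothetical and only illustrate the comparison against 30701.

```python
REQUIRED_OP_VERSION = 30701  # target value from the command quoted above


def parse_op_version(info_text):
    """Return the operating-version found in glusterd.info-style text, or None."""
    for line in info_text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key == "operating-version":
            return int(value)
    return None


def needs_bump(info_text, required=REQUIRED_OP_VERSION):
    """True when an op-version is recorded and it is lower than `required`."""
    version = parse_op_version(info_text)
    return version is not None and version < required


# Hypothetical file content; on a real node, read the file from disk instead.
sample = "UUID=99a41a2a-2ce5-461c-aec0-510bd5b37bf2\noperating-version=30501\n"
print(parse_op_version(sample), needs_bump(sample))
```

On a real node the text would come from reading the file on each peer. The sketch only reads the value and does not change any cluster setting.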
>>>>>
>>>>>
>>>>> In /var/log/glusterfs/etc-glusterfs-glusterd.vol.log I found:
>>>>>
>>>>> [2015-06-09 16:12:38.949842] E [glusterd-utils.c:164:glusterd_lock]
>>>>> 0-management: Unable to get lock for uuid:
>>>>> 99a41a2a-2ce5-461c-aec0-510bd5b37bf2, lock held by:
>>>>> 04a7d2bb-bdd9-4e0d-b460-87ad4adbe12c
>>>>> [2015-06-09 16:12:38.949864] E
>>>>> [glusterd-syncop.c:1766:gd_sync_task_begin]
>>>>> 0-management: Unable to acquire lock
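
To see which peer is holding the lock, the UUID pairs can be pulled out of the glusterd log. This is a rough sketch based only on the message format shown in the excerpt above. The regular expression and the function name are assumptions, not part of any gluster tooling.

```python
import re

# Matches the "Unable to get lock" message format seen in the log excerpt above.
LOCK_RE = re.compile(
    r"Unable to get lock for uuid:\s*(?P<requester>[0-9a-f-]{36}),\s*"
    r"lock held by:\s*(?P<holder>[0-9a-f-]{36})"
)


def find_lock_conflicts(log_text):
    """Return a list of (requester_uuid, holder_uuid) tuples found in the log."""
    # Collapse all whitespace so entries wrapped across lines still match.
    flat = " ".join(log_text.split())
    return [(m.group("requester"), m.group("holder")) for m in LOCK_RE.finditer(flat)]


sample = """[2015-06-09 16:12:38.949842] E [glusterd-utils.c:164:glusterd_lock]
0-management: Unable to get lock for uuid:
99a41a2a-2ce5-461c-aec0-510bd5b37bf2, lock held by:
04a7d2bb-bdd9-4e0d-b460-87ad4adbe12c"""

for requester, holder in find_lock_conflicts(sample):
    print(requester, "is blocked by", holder)
```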
>>>>>
>>>>> I checked the files:
>>>>>   From server 1:
>>>>> # cat /var/lib/glusterd/peers/04a7d2bb-bdd9-4e0d-b460-87ad4adbe12c
>>>>> uuid=04a7d2bb-bdd9-4e0d-b460-87ad4adbe12c
>>>>> state=3
>>>>> hostname1=192.168.61.101
>>>>>
>>>>>   From server 2:
>>>>> # cat /var/lib/glusterd/peers/99a41a2a-2ce5-461c-aec0-510bd5b37bf2
>>>>> uuid=99a41a2a-2ce5-461c-aec0-510bd5b37bf2
>>>>> state=3
>>>>> hostname1=192.168.61.100
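
The peer files above are plain key=value text, so a lock-holder UUID from the log can be mapped back to a host. Below is a small sketch under that assumption. The helper names are hypothetical, and the sample strings are copied from the two files shown in this message.

```python
def parse_peer_file(text):
    """Parse key=value lines from a /var/lib/glusterd/peers/<uuid> file into a dict."""
    peer = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep:
            peer[key] = value
    return peer


def hostname_for(uuid, peer_files):
    """Return hostname1 of the peer whose uuid matches, or None if unknown."""
    for text in peer_files:
        peer = parse_peer_file(text)
        if peer.get("uuid") == uuid:
            return peer.get("hostname1")
    return None


peer_files = [
    "uuid=04a7d2bb-bdd9-4e0d-b460-87ad4adbe12c\nstate=3\nhostname1=192.168.61.101\n",
    "uuid=99a41a2a-2ce5-461c-aec0-510bd5b37bf2\nstate=3\nhostname1=192.168.61.100\n",
]
print(hostname_for("04a7d2bb-bdd9-4e0d-b460-87ad4adbe12c", peer_files))
```

With the data from this thread, the UUID reported as holding the lock resolves to 192.168.61.101.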
>>>> Could you attach the complete glusterd log file and the cmd_history.log
>>>> file from the /var/log/glusterfs directory? Could you provide a more
>>>> detailed listing of things you did before hitting this issue?
>>> Hi Krishnan,
>>> thanks for the quick answer.
>>> Attached you can find the two logs you requested:
>>> cmd_history.log
>>> etc-glusterfs-glusterd.vol.log
>>>
>>> We use the gluster volume as the backend for OpenStack nova, glance and cinder.
>>>
>>> The volume is configured using 2 bricks mounted from an iSCSI device:
>>> [root@cld-stg-01 glusterfs]# gluster volume info volume-nova-prod
>>> Volume Name: volume-nova-prod
>>> Type: Distribute
>>> Volume ID: 4bbef4c8-0441-4e81-a2c5-559401adadc0
>>> Status: Started
>>> Number of Bricks: 2
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: 192.168.61.100:/brickOpenstack/nova-prod/mpathb
>>> Brick2: 192.168.61.101:/brickOpenstack/nova-prod/mpathb
>>> Options Reconfigured:
>>> storage.owner-gid: 162
>>> storage.owner-uid: 162
>>>
>>> Last week we updated OpenStack from Havana to Icehouse and renamed the
>>> storage hosts, but we didn't change the IPs.
>>> All volumes were created using IP addresses.
>>>
>>> So last week we stopped all services (OpenStack, gluster and also iSCSI).
>>> We changed the DNS names of the private IPs of the 2 NICs.
>>> We rebooted the storage servers.
>>> We started iSCSI, multipath and the glusterd process again.
>>> We had to stop and start the volumes, but after that everything worked
>>> fine.
>>> Now we don't observe any other problems except this one.
>>>
>>> We have a nagios probe which checks the volume status every 5 minutes to
>>> ensure all gluster processes are working fine, and that is how we found
>>> the problem I posted.
>>>
>>> Cheers,
>>> Sergio
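
Since a monitoring probe that polls the volume status can itself run into the lock contention described in this thread, one option is for the probe to treat the "Another transaction is in progress" reply as transient rather than as a failure. The following is only a sketch of that idea: the `classify` function is hypothetical, and the numeric values follow the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL).

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2

# Message text as printed by the gluster CLI in this thread.
BUSY_MESSAGE = "Another transaction is in progress"


def classify(returncode, output):
    """Map a `gluster volume status` result to a Nagios exit code.

    Lock contention is reported as WARNING so that a busy glusterd is not
    mistaken for a failed volume; any other non-zero exit is CRITICAL.
    """
    if BUSY_MESSAGE in output:
        return WARNING
    return OK if returncode == 0 else CRITICAL


print(classify(1, "Another transaction is in progress. Please try again after sometime."))
```

A probe would pass in the exit status and combined output of the CLI call. Whether WARNING is the right level for a busy glusterd is a local policy choice.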
>>>
>>>
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>
>>
>> --
>> ~Atin
>>
> 
> 
> 

-- 
~Atin