[Gluster-users] Locking failed - since upgrade to 3.6.4

Mon Aug 3 14:22:36 UTC 2015

Could you check the glusterd log on the other nodes? That should give you
a hint as to the exact issue. Also, looking at .cmd_log_history will show
you the time interval at which the volume status commands are executed. If
the gap is in milliseconds, then you are bound to hit this, and it is expected.
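
As a rough way to measure those gaps, here is a minimal Python sketch. It
assumes each entry in .cmd_log_history looks like
"[2015-08-03 13:45:29.974560]  : volume status : SUCCESS"; that line shape
and the helper name status_gaps are assumptions for illustration, so adjust
the pattern to whatever your file actually contains.

```python
import re
from datetime import datetime

# Assumed entry shape (not confirmed for every release):
#   [2015-08-03 13:45:29.974560]  : volume status : SUCCESS
STAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\]\s+:\s+(.*)$")


def status_gaps(lines):
    """Return the gaps, in seconds, between consecutive 'volume status' entries."""
    times = []
    for line in lines:
        match = STAMP.match(line)
        # Keep only entries whose command text starts with "volume status".
        if match and match.group(2).startswith("volume status"):
            times.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f"))
    # Pair each timestamp with the next one and take the difference.
    return [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]
```

Feeding it the lines of .cmd_log_history from one node and looking at the
smallest values gives a quick idea of whether two status commands landed
close together. Entries from other nodes would need to be merged and sorted
by timestamp first to see cross-node collisions.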

-Atin
On Aug 3, 2015 7:32 PM, "Osborne, Paul (paul.osborne at canterbury.ac.uk)" <
paul.osborne at canterbury.ac.uk> wrote:

>
> Hi,
>
> Last week I upgraded one of my gluster clusters (3 hosts with bricks as
> replica 3) to 3.6.4 from 3.5.4 and all seemed well.
>
> Today I am getting reports that locking has failed:
>
>
> gfse-cant-01:/var/log/glusterfs# gluster volume status
> Locking failed on gfse-rh-01.core.canterbury.ac.uk. Please check log file
> for details.
> Locking failed on gfse-isr-01.core.canterbury.ac.uk. Please check log
> file for details.
>
> Logs:
> [2015-08-03 13:45:29.974560] E [glusterd-syncop.c:1640:gd_sync_task_begin]
> 0-management: Locking Peers Failed.
> [2015-08-03 13:49:48.273159] E [glusterd-syncop.c:105:gd_collate_errors]
> 0-: Locking failed on gfse-rh-01.core.canterbury.ac.uk. Please check
> log file for details.
> [2015-08-03 13:49:48.273778] E [glusterd-syncop.c:105:gd_collate_errors]
> 0-: Locking failed on gfse-isr-01.core.canterbury.ac.uk. Please check
> log file for details.
>
>
> I am wondering if this is a new feature due to 3.6.4 or something that has
> gone wrong.
>
> Restarting gluster entirely (btw the restart script does not actually
> appear to kill the processes...) resolves the issue, but then it recurs a
> few minutes later, which is rather suboptimal for a running service.
>
> Googling suggests that there may be simultaneous actions going on that can
> cause a locking issue.
>
> I know that I have nagios running volume status <volname> for each of my
> volumes on each host every few minutes; however, this is not new and has
> been in place for the last 8-9 months against 3.5 without issue, so I would
> hope that this is not causing the issue.
>
> I am not sure where to look now tbh.
>
>
>
>
> Paul Osborne
> Senior Systems Engineer
> Canterbury Christ Church University
> Tel: 01227 782751