[Gluster-users] KVM lockups on Gluster 4.1.1

Amar Tumballi atumball at redhat.com
Mon Aug 20 09:30:23 UTC 2018


On Wed, Aug 15, 2018 at 2:54 AM, Walter Deignan <WDeignan at uline.com> wrote:

> I am using gluster to host KVM/QEMU images. I am seeing an intermittent
> issue where access to an image will hang. I have to do a lazy dismount of
> the gluster volume in order to break the lock and then reset the impacted
> virtual machine.
>
> It happened again today and I caught the events below in the client side
> logs. Any thoughts on what might cause this? It seemed to begin after I
> upgraded from 3.12.10 to 4.1.1 a few weeks ago.
>
> [2018-08-14 14:22:15.549501] E [MSGID: 114031] [client-rpc-fops_v2.c:1352:client4_0_finodelk_cbk]
> 2-gv1-client-4: remote operation failed [Invalid argument]
> [2018-08-14 14:22:15.549576] E [MSGID: 114031] [client-rpc-fops_v2.c:1352:client4_0_finodelk_cbk]
> 2-gv1-client-5: remote operation failed [Invalid argument]
> [2018-08-14 14:22:15.549583] E [MSGID: 108010] [afr-lk-common.c:284:afr_unlock_inodelk_cbk]
> 2-gv1-replicate-2: path=(null) gfid=00000000-0000-0000-0000-000000000000:
> unlock failed on subvolume gv1-client-4 with lock owner d89caca92b7f0000
> [Invalid argument]
> [2018-08-14 14:22:15.549615] E [MSGID: 108010] [afr-lk-common.c:284:afr_unlock_inodelk_cbk]
> 2-gv1-replicate-2: path=(null) gfid=00000000-0000-0000-0000-000000000000:
> unlock failed on subvolume gv1-client-5 with lock owner d89caca92b7f0000
> [Invalid argument]
> [2018-08-14 14:52:18.726219] E [rpc-clnt.c:184:call_bail] 2-gv1-client-4:
> bailing out frame type(GlusterFS 4.x v1) op(FINODELK(30)) xid = 0xc5e00
> sent = 2018-08-14 14:22:15.699082. timeout = 1800 for 10.35.20.106:49159
> [2018-08-14 14:52:18.726254] E [MSGID: 114031] [client-rpc-fops_v2.c:1352:client4_0_finodelk_cbk]
> 2-gv1-client-4: remote operation failed [Transport endpoint is not
> connected]
> [2018-08-14 15:22:25.962546] E [rpc-clnt.c:184:call_bail] 2-gv1-client-5:
> bailing out frame type(GlusterFS 4.x v1) op(FINODELK(30)) xid = 0xc4a6d
> sent = 2018-08-14 14:52:18.726329. timeout = 1800 for 10.35.20.107:49164
>


Hi Walter,

Do you see any warnings or errors in the brick logs around this time?
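
If it helps, below is a minimal sketch for pulling out just the W and E lines near the first failure (14:22:15 in your client log). It assumes the brick logs use the same "[timestamp] LEVEL ..." line prefix as the client log you pasted. The function name and the 5-minute window are only illustrative choices, not anything Gluster provides.

```python
import re
from datetime import datetime, timedelta

# Matches the line prefix seen in the client log quoted in this thread, e.g.
# "[2018-08-14 14:22:15.549501] E [MSGID: 114031] ..."
# Assumption: brick logs use the same prefix.
LINE_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+\] ([A-Z]) ")


def warnings_and_errors(lines, center, window_minutes=5):
    """Yield W/E log lines stamped within window_minutes of center.

    Hypothetical helper: lines without a timestamp prefix (wrapped
    continuation lines, for example) are skipped.
    """
    delta = timedelta(minutes=window_minutes)
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        if match.group(2) in ("W", "E") and abs(stamp - center) <= delta:
            yield line.rstrip("\n")
```

To use it, open a brick log file on each server and pass the file object as `lines`, with `center=datetime(2018, 8, 14, 14, 22, 15)`. The log location depends on your install, so check where your bricks actually write their logs.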

Regards,
Amar


> [2018-08-14 15:22:25.962587] E [MSGID: 114031] [client-rpc-fops_v2.c:1352:client4_0_finodelk_cbk]
> 2-gv1-client-5: remote operation failed [Transport endpoint is not
> connected]
> [2018-08-14 15:22:25.962618] W [MSGID: 108019] [afr-lk-common.c:601:is_blocking_locks_count_sufficient]
> 2-gv1-replicate-2: Unable to obtain blocking inode lock on even one child
> for gfid:24a48cae-53fe-4634-8fb7-0254c85ad672.
> [2018-08-14 15:22:25.962668] W [fuse-bridge.c:1441:fuse_err_cbk]
> 0-glusterfs-fuse: 3715808: FSYNC() ERR => -1 (Transport endpoint is not
> connected)
>
> Volume configuration -
>
> Volume Name: gv1
> Type: Distributed-Replicate
> Volume ID: 66ad703e-3bae-4e79-a0b7-29ea38e8fcfc
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 5 x 2 = 10
> Transport-type: tcp
> Bricks:
> Brick1: dc-vihi44:/gluster/bricks/megabrick/data
> Brick2: dc-vihi45:/gluster/bricks/megabrick/data
> Brick3: dc-vihi44:/gluster/bricks/brick1/data
> Brick4: dc-vihi45:/gluster/bricks/brick1/data
> Brick5: dc-vihi44:/gluster/bricks/brick2_1/data
> Brick6: dc-vihi45:/gluster/bricks/brick2/data
> Brick7: dc-vihi44:/gluster/bricks/brick3/data
> Brick8: dc-vihi45:/gluster/bricks/brick3/data
> Brick9: dc-vihi44:/gluster/bricks/brick4/data
> Brick10: dc-vihi45:/gluster/bricks/brick4/data
> Options Reconfigured:
> cluster.min-free-inodes: 6%
> performance.client-io-threads: off
> nfs.disable: on
> transport.address-family: inet
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.low-prio-threads: 32
> network.remote-dio: enable
> cluster.eager-lock: enable
> cluster.server-quorum-type: server
> cluster.data-self-heal-algorithm: full
> cluster.locking-scheme: granular
> cluster.shd-max-threads: 8
> cluster.shd-wait-qlength: 10000
> user.cifs: off
> cluster.choose-local: off
> features.shard: on
> cluster.server-quorum-ratio: 51%
>
> -Walter Deignan
> -Uline IT, Systems Architect
>
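
To narrow down which brick logs are worth checking first: the failing subvolumes in the client log are gv1-client-4 and gv1-client-5 under gv1-replicate-2. The sketch below works out which bricks those are. It assumes the usual naming, where client translators are numbered from 0 in brick order and each replicate-N groups consecutive bricks. Please confirm against the volfile rather than taking this as authoritative. The helper name is made up for illustration.

```python
# Brick list copied from the volume info quoted above (Brick1..Brick10).
BRICKS = [
    "dc-vihi44:/gluster/bricks/megabrick/data",
    "dc-vihi45:/gluster/bricks/megabrick/data",
    "dc-vihi44:/gluster/bricks/brick1/data",
    "dc-vihi45:/gluster/bricks/brick1/data",
    "dc-vihi44:/gluster/bricks/brick2_1/data",
    "dc-vihi45:/gluster/bricks/brick2/data",
    "dc-vihi44:/gluster/bricks/brick3/data",
    "dc-vihi45:/gluster/bricks/brick3/data",
    "dc-vihi44:/gluster/bricks/brick4/data",
    "dc-vihi45:/gluster/bricks/brick4/data",
]


def bricks_for_replica_set(bricks, replica_count, set_index, volume="gv1"):
    """Map client translator names in one replica set to their bricks.

    Assumption (not taken from this thread): clients are 0-indexed in
    brick order and replicate-N covers replica_count consecutive bricks.
    """
    start = set_index * replica_count
    return {
        f"{volume}-client-{i}": bricks[i]
        for i in range(start, start + replica_count)
    }


# "5 x 2 = 10" means replica_count is 2; the log names gv1-replicate-2.
for client, brick in bricks_for_replica_set(BRICKS, 2, 2).items():
    print(client, "->", brick)
```

Under that assumption the two subvolumes are Brick5 (dc-vihi44, brick2_1) and Brick6 (dc-vihi45, brick2).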



-- 
Amar Tumballi (amarts)