[Gluster-users] VM going down

Krutika Dhananjay kdhananj at redhat.com
Mon May 8 10:38:25 UTC 2017


The newly introduced "SEEK" fop seems to be failing at the bricks.
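
For context: "No such device or address" is strerror(ENXIO), and on
Linux lseek(2) fails with ENXIO for SEEK_DATA/SEEK_HOLE whenever the
requested offset is at or beyond end-of-file; the brick's posix xlator
essentially forwards SEEK to lseek(). Note that the failing offsets in
the log below (42957209600 and 859136720896) look like the sizes of the
qcow2 images themselves, i.e. seeks exactly at EOF. Here is a minimal
standalone sketch (plain POSIX, not Gluster code; the scratch-file path
is made up) that reproduces the same error string:

#define _GNU_SOURCE             /* exposes SEEK_DATA on glibc */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    /* Hypothetical scratch file; any regular file will do. */
    int fd = open("/tmp/seek-probe", O_CREAT | O_RDWR | O_TRUNC, 0600);
    if (fd < 0) { perror("open"); return 1; }
    if (write(fd, "data", 4) != 4) { perror("write"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

    /* Asking for the next data region at offset == file size, i.e.
     * exactly at EOF, fails with ENXIO, which strerror() renders as
     * "No such device or address", the string seen in the brick log. */
    if (lseek(fd, st.st_size, SEEK_DATA) == (off_t)-1)
        printf("SEEK_DATA at offset %lld: %s\n",
               (long long)st.st_size, strerror(errno));

    close(fd);
    unlink("/tmp/seek-probe");
    return 0;
}

Compiled with gcc and run on Linux, this should print "SEEK_DATA at
offset 4: No such device or address".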

Adding Niels for his inputs/help.
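
In the meantime, it may help to raise the brick log level while
reproducing the problem, e.g. "gluster volume set datastore2
diagnostics.brick-log-level DEBUG" (and back to INFO afterwards), and
to look at what happens just before the disconnect/fd-cleanup messages.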

-Krutika

On Mon, May 8, 2017 at 3:43 PM, Alessandro Briosi <ab1 at metalit.com> wrote:

> Hi all,
> I have a VM that sporadically goes down; its disk files are on a
> Gluster volume.
>
> If I look at the gluster logs the only events I find are:
> /var/log/glusterfs/bricks/data-brick2-brick.log
>
> [2017-05-08 09:51:17.661697] I [MSGID: 115036]
> [server.c:548:server_rpc_notify] 0-datastore2-server: disconnecting
> connection from
> srvpve2-9074-2017/05/04-14:12:53:301448-datastore2-client-0-0-0
> [2017-05-08 09:51:17.661697] I [MSGID: 115036]
> [server.c:548:server_rpc_notify] 0-datastore2-server: disconnecting
> connection from
> srvpve2-9074-2017/05/04-14:12:53:367950-datastore2-client-0-0-0
> [2017-05-08 09:51:17.661810] W [inodelk.c:399:pl_inodelk_log_cleanup]
> 0-datastore2-server: releasing lock on
> 66d9eefb-ee55-40ad-9f44-c55d1e809006 held by {client=0x7f4c7c004880,
> pid=0 lk-owner=5c7099efc97f0000}
> [2017-05-08 09:51:17.661810] W [inodelk.c:399:pl_inodelk_log_cleanup]
> 0-datastore2-server: releasing lock on
> a8d82b3d-1cf9-45cf-9858-d8546710b49c held by {client=0x7f4c840f31d0,
> pid=0 lk-owner=5c7019fac97f0000}
> [2017-05-08 09:51:17.661835] I [MSGID: 115013]
> [server-helpers.c:293:do_fd_cleanup] 0-datastore2-server: fd cleanup on
> /images/201/vm-201-disk-2.qcow2
> [2017-05-08 09:51:17.661838] I [MSGID: 115013]
> [server-helpers.c:293:do_fd_cleanup] 0-datastore2-server: fd cleanup on
> /images/201/vm-201-disk-1.qcow2
> [2017-05-08 09:51:17.661953] I [MSGID: 101055]
> [client_t.c:415:gf_client_unref] 0-datastore2-server: Shutting down
> connection srvpve2-9074-2017/05/04-14:12:53:301448-datastore2-client-0-0-0
> [2017-05-08 09:51:17.661953] I [MSGID: 101055]
> [client_t.c:415:gf_client_unref] 0-datastore2-server: Shutting down
> connection srvpve2-9074-2017/05/04-14:12:53:367950-datastore2-client-0-0-0
> [2017-05-08 10:01:06.210392] I [MSGID: 115029]
> [server-handshake.c:692:server_setvolume] 0-datastore2-server: accepted
> client from
> srvpve2-162483-2017/05/08-10:01:06:189720-datastore2-client-0-0-0
> (version: 3.8.11)
> [2017-05-08 10:01:06.237433] E [MSGID: 113107] [posix.c:1079:posix_seek]
> 0-datastore2-posix: seek failed on fd 18 length 42957209600 [No such
> device or address]
> [2017-05-08 10:01:06.237463] E [MSGID: 115089]
> [server-rpc-fops.c:2007:server_seek_cbk] 0-datastore2-server: 18: SEEK-2
> (a8d82b3d-1cf9-45cf-9858-d8546710b49c) ==> (No such device or address)
> [No such device or address]
> [2017-05-08 10:01:07.019974] I [MSGID: 115029]
> [server-handshake.c:692:server_setvolume] 0-datastore2-server: accepted
> client from
> srvpve2-162483-2017/05/08-10:01:07:3687-datastore2-client-0-0-0
> (version: 3.8.11)
> [2017-05-08 10:01:07.041967] E [MSGID: 113107] [posix.c:1079:posix_seek]
> 0-datastore2-posix: seek failed on fd 19 length 859136720896 [No such
> device or address]
> [2017-05-08 10:01:07.041992] E [MSGID: 115089]
> [server-rpc-fops.c:2007:server_seek_cbk] 0-datastore2-server: 18: SEEK-2
> (66d9eefb-ee55-40ad-9f44-c55d1e809006) ==> (No such device or address)
> [No such device or address]
>
> The strange part is that I cannot find any other error.
> If I restart the VM, everything works as expected (it stopped at ~09:51
> UTC and was started at ~10:01 UTC).
>
> This is not the first time this has happened, and I see no problems
> with networking or the hosts.
>
> Gluster version is 3.8.11.
> This is the affected volume (though it has happened on a different one too):
>
> Volume Name: datastore2
> Type: Replicate
> Volume ID: c95ebb5f-6e04-4f09-91b9-bbbe63d83aea
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: srvpve2g:/data/brick2/brick
> Brick2: srvpve3g:/data/brick2/brick
> Brick3: srvpve1g:/data/brick2/brick (arbiter)
> Options Reconfigured:
> nfs.disable: on
> performance.readdir-ahead: on
> transport.address-family: inet
>
> Any hint on how to dig more deeply into the cause would be greatly
> appreciated.
>
> Alessandro
>