[Bugs] [Bug 1452766] New: VM crashing but there's no apparent reason

bugzilla at redhat.com bugzilla at redhat.com
Fri May 19 15:05:31 UTC 2017


https://bugzilla.redhat.com/show_bug.cgi?id=1452766

            Bug ID: 1452766
           Summary: VM crashing but there's no apparent reason
           Product: GlusterFS
           Version: 3.8
         Component: glusterd
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: ab1 at metalitnord.com
                CC: bugs at gluster.org



Description of problem:

A VM running on a Proxmox cluster, whose nodes are also the Gluster servers,
crashes. The logs do not show any obvious reason (at least not to me).

Gluster version is 3.8.11

How reproducible:
I have not found a way to reproduce it consistently. It crashes at random.

Additional info:

The brick log reports the following:
[2017-05-19 14:40:25.264115] I [MSGID: 115036] [server.c:548:server_rpc_notify] 0-datastore1-server: disconnecting connection from srvpve1-81350-2017/05/10-09:04:55:966276-datastore1-client-0-0-0
[2017-05-19 14:40:25.264220] I [MSGID: 115036] [server.c:548:server_rpc_notify] 0-datastore1-server: disconnecting connection from srvpve1-81350-2017/05/10-09:04:55:912580-datastore1-client-0-0-0
[2017-05-19 14:40:25.264229] W [inodelk.c:399:pl_inodelk_log_cleanup] 0-datastore1-server: releasing lock on 9e66f0d2-501b-4cf9-80db-f423e2e2ef0f held by {client=0x7ffa841014e0, pid=0 lk-owner=5c807b04f67f0000}
[2017-05-19 14:40:25.264246] W [inodelk.c:399:pl_inodelk_log_cleanup] 0-datastore1-server: releasing lock on bc8f6a7e-31e5-4b48-946c-f779a4b2e64f held by {client=0x7ffa7c005d20, pid=0 lk-owner=5cd0180df67f0000}
[2017-05-19 14:40:25.264255] I [MSGID: 115013] [server-helpers.c:293:do_fd_cleanup] 0-datastore1-server: fd cleanup on /images/101/vm-101-disk-2.qcow2
[2017-05-19 14:40:25.264259] I [MSGID: 115013] [server-helpers.c:293:do_fd_cleanup] 0-datastore1-server: fd cleanup on /images/101/vm-101-disk-1.qcow2
[2017-05-19 14:40:25.264332] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-datastore1-server: Shutting down connection srvpve1-81350-2017/05/10-09:04:55:966276-datastore1-client-0-0-0
[2017-05-19 14:40:25.264339] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-datastore1-server: Shutting down connection srvpve1-81350-2017/05/10-09:04:55:912580-datastore1-client-0-0-0
[2017-05-19 14:48:15.756077] I [login.c:76:gf_auth] 0-auth/login: allowed user names: b10d5e19-7220-4f42-b7c4-480eebe1f118
[2017-05-19 14:48:15.756158] I [MSGID: 115029] [server-handshake.c:692:server_setvolume] 0-datastore1-server: accepted client from srvpve1-144293-2017/05/19-14:48:15:661020-datastore1-client-0-0-0 (version: 3.8.11)
[2017-05-19 14:48:15.782046] E [MSGID: 113107] [posix.c:1051:posix_seek] 0-datastore1-posix: seek failed on fd 20 length 539669364736 [No such device or address]
[2017-05-19 14:48:15.782084] E [MSGID: 115089] [server-rpc-fops.c:2007:server_seek_cbk] 0-datastore1-server: 18: SEEK-2 (bc8f6a7e-31e5-4b48-946c-f779a4b2e64f) ==> (No such device or address) [No such device or address]
[2017-05-19 14:48:16.853352] I [login.c:76:gf_auth] 0-auth/login: allowed user names: b10d5e19-7220-4f42-b7c4-480eebe1f118
[2017-05-19 14:48:16.853376] I [MSGID: 115029] [server-handshake.c:692:server_setvolume] 0-datastore1-server: accepted client from srvpve1-144293-2017/05/19-14:48:16:834981-datastore1-client-0-0-0 (version: 3.8.11)
[2017-05-19 14:48:16.875513] E [MSGID: 113107] [posix.c:1051:posix_seek] 0-datastore1-posix: seek failed on fd 21 length 966582009856 [No such device or address]
[2017-05-19 14:48:16.875547] E [MSGID: 115089] [server-rpc-fops.c:2007:server_seek_cbk] 0-datastore1-server: 18: SEEK-2 (9e66f0d2-501b-4cf9-80db-f423e2e2ef0f) ==> (No such device or address) [No such device or address]
[2017-05-19 14:50:42.399211] I [MSGID: 115036] [server.c:548:server_rpc_notify] 0-datastore1-server: disconnecting connection from srvpve1-144293-2017/05/19-14:48:16:834981-datastore1-client-0-0-0
[2017-05-19 14:50:42.399261] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-datastore1-server: Shutting down connection srvpve1-144293-2017/05/19-14:48:16:834981-datastore1-client-0-0-0
[2017-05-19 14:50:42.399258] I [MSGID: 115036] [server.c:548:server_rpc_notify] 0-datastore1-server: disconnecting connection from srvpve1-144293-2017/05/19-14:48:15:661020-datastore1-client-0-0-0
[2017-05-19 14:50:42.399290] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-datastore1-server: Shutting down connection srvpve1-144293-2017/05/19-14:48:15:661020-datastore1-client-0-0-0
[2017-05-19 14:50:43.385444] I [login.c:76:gf_auth] 0-auth/login: allowed user names: b10d5e19-7220-4f42-b7c4-480eebe1f118
[2017-05-19 14:50:43.385464] I [MSGID: 115029] [server-handshake.c:692:server_setvolume] 0-datastore1-server: accepted client from srvpve1-144530-2017/05/19-14:50:43:303790-datastore1-client-0-0-0 (version: 3.8.11)
[2017-05-19 14:50:43.411762] E [MSGID: 113107] [posix.c:1051:posix_seek] 0-datastore1-posix: seek failed on fd 20 length 539669364736 [No such device or address]
[2017-05-19 14:50:43.411803] E [MSGID: 115089] [server-rpc-fops.c:2007:server_seek_cbk] 0-datastore1-server: 18: SEEK-2 (bc8f6a7e-31e5-4b48-946c-f779a4b2e64f) ==> (No such device or address) [No such device or address]
[2017-05-19 14:50:44.030125] I [login.c:76:gf_auth] 0-auth/login: allowed user names: b10d5e19-7220-4f42-b7c4-480eebe1f118
[2017-05-19 14:50:44.030149] I [MSGID: 115029] [server-handshake.c:692:server_setvolume] 0-datastore1-server: accepted client from srvpve1-144530-2017/05/19-14:50:44:15375-datastore1-client-0-0-0 (version: 3.8.11)
[2017-05-19 14:50:44.056989] E [MSGID: 113107] [posix.c:1051:posix_seek] 0-datastore1-posix: seek failed on fd 21 length 966582009856 [No such device or address]
[2017-05-19 14:50:44.057016] E [MSGID: 115089] [server-rpc-fops.c:2007:server_seek_cbk] 0-datastore1-server: 18: SEEK-2 (9e66f0d2-501b-4cf9-80db-f423e2e2ef0f) ==> (No such device or address) [No such device or address]
[2017-05-19 14:51:09.079185] I [MSGID: 115036] [server.c:548:server_rpc_notify] 0-datastore1-server: disconnecting connection from srvpve1-144530-2017/05/19-14:50:43:303790-datastore1-client-0-0-0
[2017-05-19 14:51:09.079186] I [MSGID: 115036] [server.c:548:server_rpc_notify] 0-datastore1-server: disconnecting connection from srvpve1-144530-2017/05/19-14:50:44:15375-datastore1-client-0-0-0
[2017-05-19 14:51:09.079284] I [MSGID: 115013] [server-helpers.c:293:do_fd_cleanup] 0-datastore1-server: fd cleanup on /images/101/vm-101-disk-1.qcow2
[2017-05-19 14:51:09.079285] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-datastore1-server: Shutting down connection srvpve1-144530-2017/05/19-14:50:44:15375-datastore1-client-0-0-0
[2017-05-19 14:51:09.079325] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-datastore1-server: Shutting down connection srvpve1-144530-2017/05/19-14:50:43:303790-datastore1-client-0-0-0
[2017-05-19 14:51:37.825420] I [login.c:76:gf_auth] 0-auth/login: allowed user names: b10d5e19-7220-4f42-b7c4-480eebe1f118
[2017-05-19 14:51:37.825449] I [MSGID: 115029] [server-handshake.c:692:server_setvolume] 0-datastore1-server: accepted client from srvpve1-144656-2017/05/19-14:51:37:744255-datastore1-client-0-0-0 (version: 3.8.11)
[2017-05-19 14:51:37.850244] E [MSGID: 113107] [posix.c:1051:posix_seek] 0-datastore1-posix: seek failed on fd 20 length 539669364736 [No such device or address]
[2017-05-19 14:51:37.850291] E [MSGID: 115089] [server-rpc-fops.c:2007:server_seek_cbk] 0-datastore1-server: 18: SEEK-2 (bc8f6a7e-31e5-4b48-946c-f779a4b2e64f) ==> (No such device or address) [No such device or address]
[2017-05-19 14:51:38.083555] I [login.c:76:gf_auth] 0-auth/login: allowed user names: b10d5e19-7220-4f42-b7c4-480eebe1f118
[2017-05-19 14:51:38.083576] I [MSGID: 115029] [server-handshake.c:692:server_setvolume] 0-datastore1-server: accepted client from srvpve1-144656-2017/05/19-14:51:38:66829-datastore1-client-0-0-0 (version: 3.8.11)
[2017-05-19 14:51:38.108907] E [MSGID: 113107] [posix.c:1051:posix_seek] 0-datastore1-posix: seek failed on fd 21 length 966582009856 [No such device or address]
[2017-05-19 14:51:38.108940] E [MSGID: 115089] [server-rpc-fops.c:2007:server_seek_cbk] 0-datastore1-server: 18: SEEK-2 (9e66f0d2-501b-4cf9-80db-f423e2e2ef0f) ==> (No such device or address) [No such device or address]
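
For anyone triaging the "seek failed ... [No such device or address]" entries
above: that text is the message for ENXIO. On Linux, lseek(2) documents ENXIO
for SEEK_DATA/SEEK_HOLE when the offset is at or beyond the end of the file.
The sketch below reproduces that one documented case on a local temporary
file. It is only an illustration: the helper name seek_data_errno is made up
here, it does not touch Gluster, and it does not establish that this is what
happened on the brick or that it is related to the VM crash.

```python
import errno
import os
import tempfile


def seek_data_errno(size: int, offset: int) -> int:
    """Return 0 if lseek(fd, offset, SEEK_DATA) succeeds, else the errno.

    Hypothetical helper for illustration; works on a local temp file only.
    """
    with tempfile.TemporaryFile() as f:
        f.write(b"x" * size)
        f.flush()
        try:
            # SEEK_DATA asks for the next offset >= `offset` that holds data.
            os.lseek(f.fileno(), offset, os.SEEK_DATA)
        except OSError as exc:
            return exc.errno
        return 0


if __name__ == "__main__":
    # Offset inside the file: the seek succeeds.
    print(seek_data_errno(4096, 0))
    # Offset equal to the file size: lseek fails with ENXIO.
    print(seek_data_errno(4096, 4096) == errno.ENXIO)
```

If the "length" value in the posix_seek message is the offset that was
requested, comparing it with the size of the qcow2 file on the brick would
show whether this case applies; that reading of the message is an assumption.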

This time I also get:
root@srvpve1:/var/log# gluster volume heal datastore1 info
Brick srvpve1g:/data/brick1/brick
/images/101/vm-101-disk-1.qcow2 - Possibly undergoing heal

/images/101/vm-101-disk-2.qcow2 - Possibly undergoing heal

Status: Connected
Number of entries: 2

Brick srvpve2g:/data/brick1/brick
/images/101/vm-101-disk-1.qcow2 - Possibly undergoing heal

/images/101/vm-101-disk-2.qcow2 - Possibly undergoing heal

Status: Connected
Number of entries: 2

Brick srvpve3g:/data/brick1/brick
/images/101/vm-101-disk-1.qcow2 - Possibly undergoing heal

/images/101/vm-101-disk-2.qcow2 - Possibly undergoing heal

Status: Connected
Number of entries: 2
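
To compare the heal listing across bricks without reading it by eye, the
output above can be grouped per brick with a few lines of Python. This is a
minimal sketch that assumes the layout shown in this report (a "Brick ..."
line followed by one path per line, optionally with a " - status" suffix).
The function name parse_heal_info is hypothetical, and entries that do not
start with "/" are ignored.

```python
def parse_heal_info(text):
    """Group paths from `gluster volume heal <vol> info` output by brick.

    Assumes the plain-text layout shown in this report; hypothetical helper.
    """
    bricks = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Brick "):
            current = line[len("Brick "):]
            bricks[current] = []
        elif current is not None and line.startswith("/"):
            # Drop a trailing " - Possibly undergoing heal" style suffix.
            bricks[current].append(line.split(" - ", 1)[0])
    return bricks


if __name__ == "__main__":
    sample = (
        "Brick srvpve1g:/data/brick1/brick\n"
        "/images/101/vm-101-disk-1.qcow2 - Possibly undergoing heal\n"
        "\n"
        "/images/101/vm-101-disk-2.qcow2 - Possibly undergoing heal\n"
        "\n"
        "Status: Connected\n"
        "Number of entries: 2\n"
    )
    for brick, paths in parse_heal_info(sample).items():
        print(brick, len(paths), paths)
```

Feeding it the full output from all three bricks makes it easy to check that
each brick lists the same two image files, as it does here.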


