[Gluster-users] brick offline after restart glusterd
Tiemen Ruiten
t.ruiten at rdmedia.com
Mon Jul 13 16:25:45 UTC 2015
Hello,
We have a two-node gluster cluster, running version 3.7.1, that hosts an
oVirt storage domain. This afternoon I tried creating a template in oVirt,
but within a minute VMs stopped responding and Gluster started generating
errors like the following:
[2015-07-13 14:09:51.772629] W [rpcsvc.c:270:rpcsvc_program_actor] 0-rpc-service: RPC program not available (req 1298437 330) for 10.100.3.40:1021
[2015-07-13 14:09:51.772675] E [rpcsvc.c:565:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
I managed to get things in working order again by restarting glusterd and
glusterfsd, but now one brick is down:
$ sudo gluster volume status vmimage
Status of volume: vmimage
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.100.3.10:/export/gluster01/brick   N/A       N/A        N       36736
Brick 10.100.3.11:/export/gluster01/brick   49153     0          Y       11897
NFS Server on localhost                     2049      0          Y       36720
Self-heal Daemon on localhost               N/A       N/A        Y       36730
NFS Server on 10.100.3.11                   2049      0          Y       11919
Self-heal Daemon on 10.100.3.11             N/A       N/A        Y       11924
Task Status of Volume vmimage
------------------------------------------------------------------------------
There are no active volume tasks
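To pick the offline brick out of that status output without reading the table by eye, a small awk filter can do it. This is only a sketch: the function name offline_bricks is made up for illustration, and it assumes each brick's fields appear on one line in the order shown above (brick path, TCP port, RDMA port, Online).

```shell
# List bricks whose "Online" column is "N" in `gluster volume status` output.
# Assumption: fields are "Brick <host:path> <TCP> <RDMA> <Online> [Pid]",
# so the Online flag is the fifth whitespace-separated field.
offline_bricks() {
  awk '$1 == "Brick" && $5 == "N" { print $2 }'
}
```

It reads the status text on stdin, for example: sudo gluster volume status vmimage | offline_bricks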
$ sudo gluster peer status
Number of Peers: 1
Hostname: 10.100.3.11
Uuid: f9872fea-47f5-41f6-8094-c9fabd3c1339
State: Peer in Cluster (Connected)
Additionally, in etc-glusterfs-glusterd.vol.log I see these messages
repeating every 3 seconds:
[2015-07-13 16:15:21.737044] W [socket.c:642:__socket_rwv] 0-management: readv on /var/run/gluster/2bfe3a2242d586d0850775f601f1c3ee.socket failed (Invalid argument)
The message "I [MSGID: 106005] [glusterd-handler.c:4667:__glusterd_brick_rpc_notify] 0-management: Brick 10.100.3.10:/export/gluster01/brick has disconnected from glusterd." repeated 39 times between [2015-07-13 16:13:24.717611] and [2015-07-13 16:15:21.737862]
[2015-07-13 16:15:24.737694] W [socket.c:642:__socket_rwv] 0-management: readv on /var/run/gluster/2bfe3a2242d586d0850775f601f1c3ee.socket failed (Invalid argument)
[2015-07-13 16:15:24.738498] I [MSGID: 106005] [glusterd-handler.c:4667:__glusterd_brick_rpc_notify] 0-management: Brick 10.100.3.10:/export/gluster01/brick has disconnected from glusterd.
[2015-07-13 16:15:27.738194] W [socket.c:642:__socket_rwv] 0-management: readv on /var/run/gluster/2bfe3a2242d586d0850775f601f1c3ee.socket failed (Invalid argument)
[2015-07-13 16:15:30.738991] W [socket.c:642:__socket_rwv] 0-management: readv on /var/run/gluster/2bfe3a2242d586d0850775f601f1c3ee.socket failed (Invalid argument)
[2015-07-13 16:15:33.739735] W [socket.c:642:__socket_rwv] 0-management: readv on /var/run/gluster/2bfe3a2242d586d0850775f601f1c3ee.socket failed (Invalid argument)
Can I get this brick back up without bringing the volume/cluster down?
--
Tiemen Ruiten
Systems Engineer
R&D Media