[Gluster-users] Resolve brick failed in restore

Ayelet Shemesh shemesh.ayelet at gmail.com
Thu Jan 3 13:07:06 UTC 2013


Hi,

I have a lab with 10 machines acting as storage servers for some compute
machines, using glusterfs to distribute the data as two volumes.

Created using:
gluster volume create vol1 192.168.10.{221..230}:/data/vol1
gluster volume create vol2 replica 2 192.168.10.{221..230}:/data/vol2

and mounted on the client and server machines using:
mount -t glusterfs 192.168.10.221:/vol1 /mnt/vol1
mount -t glusterfs 192.168.10.221:/vol2 /mnt/vol2
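
For reference, this is how the brick lists in the create commands above
should expand and, as far as I understand the layout, how vol2 groups them:
with "replica 2", consecutive bricks on the command line form a replica set.
A small sketch in plain shell; `bricks` and `replica_pairs` are helper names
made up for illustration, not gluster commands.

```shell
# List the ten bricks of a volume in the order given to "gluster volume create".
# ("bricks" is an illustrative helper, not part of the gluster CLI.)
bricks() {
  vol=$1
  host=221
  while [ "$host" -le 230 ]; do
    printf '192.168.10.%s:/data/%s\n' "$host" "$vol"
    host=$((host + 1))
  done
}

# Group consecutive bricks two at a time, which is how "replica 2"
# forms replica sets from the brick order on the command line.
replica_pairs() {
  bricks "$1" | paste -d ' ' - -
}

bricks vol1
replica_pairs vol2
```

The last pair printed for vol2 is 192.168.10.229 with 192.168.10.230, so vol2
should still have a copy of that data on .229, while vol1 has no second copy
of whatever lives on the .230 brick.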

Everything has worked great for almost two months, but for some reason the
bricks at 192.168.10.230 no longer respond, which makes the non-replicated
volume (vol1) very troublesome.

On the client machine, in /var/log/gluster/mnt-vol1.log, I see lots and
lots of:
0-vol1-client-9: remote operation failed: Transport endpoint is not
connected
and some:
0-vol1-client-9: remote operation failed: Transport endpoint is not
connected. Path: / (00000000-0000-0000-0000-000000000001)
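
If I read the translator names correctly, the number in "vol1-client-9" is
the zero-based position of the brick in the create command, so index 9 would
be the tenth brick. A rough sketch of that mapping; the helper name
`client_to_brick` is mine, and the zero-based numbering is an assumption I
have not checked against the generated volfile.

```shell
# Map a client translator index (e.g. 9 from "vol1-client-9") to a brick,
# ASSUMING indices are zero-based and follow the brick order of
# "gluster volume create vol1 192.168.10.{221..230}:/data/vol1".
# ("client_to_brick" is an illustrative helper, not a gluster command.)
client_to_brick() {
  index=$1
  vol=${2:-vol1}
  if [ "$index" -lt 0 ] || [ "$index" -gt 9 ]; then
    echo "index out of range: $index" >&2
    return 1
  fi
  printf '192.168.10.%s:/data/%s\n' "$((221 + index))" "$vol"
}

client_to_brick 9
```

Under that assumption the failing translator points at
192.168.10.230:/data/vol1, which matches the machine that stopped responding.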

On the server, in /var/log/gluster/etc-glusterfs-glusterd.vol.log, I see:
0-: Unknown key: brick-0
0-: Unknown key: brick-1
0-: Unknown key: brick-2
0-: Unknown key: brick-3
0-: Unknown key: brick-4
0-: Unknown key: brick-5
0-: Unknown key: brick-6
0-: Unknown key: brick-7
0-: Unknown key: brick-8
0-: Unknown key: brick-9
...
0-management: setting frame-timeout to 600
0-management: connect returned 0
....
0-glusterd: resolve brick failed in restore
0-glusterd: cannot resolve brick: 192.168.10.230:/data/vol1
0-glusterd: cannot resolve brick: 192.168.10.230:/data/vol2
0-management: Found brick
....
0-: Stopping gluster glustershd running in pid: 3589
...
Given volfile:
+----------------------------------------
1: volume management
2:   type mgmt/glusterd
3:   option working directory /var/lib/glusterd
4:   option transport-type socket,rdma
...
8: end-volume
+--------------------------------------
0-transport: disconnecting now
...
0-management: connection to  failed (Connection timed out)
....
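
To pull just the bricks that glusterd could not resolve out of a long log,
a filter along these lines should do. It relies only on the "cannot resolve
brick:" wording shown in the excerpt above, and `unresolved_bricks` is an
illustrative name, not an existing tool.

```shell
# Read glusterd log text on stdin and print each unresolved brick once.
# Matches only the "cannot resolve brick: <host>:<path>" wording seen above.
unresolved_bricks() {
  sed -n 's/.*cannot resolve brick: *//p' | sort -u
}

# Example with lines taken from the excerpt above:
printf '%s\n' \
  '0-glusterd: cannot resolve brick: 192.168.10.230:/data/vol1' \
  '0-glusterd: cannot resolve brick: 192.168.10.230:/data/vol2' \
  '0-management: Found brick' |
  unresolved_bricks
```

On the server this would be fed the real file, for example
`unresolved_bricks < /var/log/gluster/etc-glusterfs-glusterd.vol.log`.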



What's the correct way to resolve this problem?

(By the way, sorry that I cannot attach the actual log fragments: my lab is
not connected to the Internet, so I had to copy them by hand.)


Thanks in advance,
Ayelet