[Gluster-users] Error at boot and can't mount: failed to get the port number
Nicolas Repentin
nicolas at shivaserv.fr
Wed Jul 8 07:50:33 UTC 2015
Hello,
I'm trying to find a solution to an error; maybe someone can help.
I'm using CentOS 7 and GlusterFS 3.6.3.
I've got 2 nodes on the same network and a replicated volume.
When both nodes are up, the volume is fine and I can mount it over NFS on each node.
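For reference, the mount command looks roughly like this (hostname and mount point are from my setup; Gluster's built-in NFS server only speaks NFSv3, hence vers=3):

# mount -t nfs -o vers=3 host1:/data-sync /data-sync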
If one node is down and I reboot the other, the volume can't be mounted.
The log shows "failed to get the port number for remote subvolume":
+------------------------------------------------------------------------------+
1: volume data-sync-client-0
2: type protocol/client
3: option ping-timeout 42
4: option remote-host host1
5: option remote-subvolume /gluster
6: option transport-type socket
7: option username fbd26745-afb8-4729-801e-e1a2db8ff38f
8: option password d077f325-1d03-494d-bfe5-d662ce2d22fe
9: option send-gids true
10: end-volume
11:
12: volume data-sync-client-1
13: type protocol/client
14: option ping-timeout 42
15: option remote-host host2
16: option remote-subvolume /gluster
17: option transport-type socket
18: option username fbd26745-afb8-4729-801e-e1a2db8ff38f
19: option password d077f325-1d03-494d-bfe5-d662ce2d22fe
20: option send-gids true
21: end-volume
22:
23: volume data-sync-replicate-0
24: type cluster/replicate
25: subvolumes data-sync-client-0 data-sync-client-1
26: end-volume
27:
28: volume data-sync-dht
29: type cluster/distribute
30: subvolumes data-sync-replicate-0
31: end-volume
32:
33: volume data-sync-write-behind
34: type performance/write-behind
35: subvolumes data-sync-dht
36: end-volume
37:
38: volume data-sync-read-ahead
39: type performance/read-ahead
40: subvolumes data-sync-write-behind
41: end-volume
42:
43: volume data-sync-io-cache
44: type performance/io-cache
45: subvolumes data-sync-read-ahead
46: end-volume
47:
48: volume data-sync-quick-read
49: type performance/quick-read
50: subvolumes data-sync-io-cache
51: end-volume
52:
53: volume data-sync-open-behind
54: type performance/open-behind
55: subvolumes data-sync-quick-read
56: end-volume
57:
58: volume data-sync-md-cache
59: type performance/md-cache
60: subvolumes data-sync-open-behind
61: end-volume
62:
63: volume data-sync
64: type debug/io-stats
65: option latency-measurement off
66: option count-fop-hits off
67: subvolumes data-sync-md-cache
68: end-volume
69:
70: volume meta-autoload
71: type meta
72: subvolumes data-sync
73: end-volume
74:
+------------------------------------------------------------------------------+
[2015-07-08 06:06:08.088983] E [client-handshake.c:1496:client_query_portmap_cbk]
0-data-sync-client-1: failed to get the port number for remote subvolume. Please run 'gluster
volume status' on server to see if brick process is running.
[2015-07-08 06:06:08.089034] I [client.c:2215:client_rpc_notify] 0-data-sync-client-1: disconnected
from data-sync-client-1. Client process will keep trying to connect to glusterd until brick's port
is available
[2015-07-08 06:06:10.769962] E [socket.c:2276:socket_connect_finish] 0-data-sync-client-0:
connection to 192.168.1.12:24007 failed (No route to host)
[2015-07-08 06:06:10.769991] E [MSGID: 108006] [afr-common.c:3708:afr_notify]
0-data-sync-replicate-0: All subvolumes are down. Going offline until atleast one of them comes
back up.
[2015-07-08 06:06:10.772310] I [fuse-bridge.c:5080:fuse_graph_setup] 0-fuse: switched to graph 0
[2015-07-08 06:06:10.772430] I [fuse-bridge.c:4009:fuse_init] 0-glusterfs-fuse: FUSE inited with
protocol versions: glusterfs 7.22 kernel 7.22
[2015-07-08 06:06:10.772503] I [afr-common.c:3839:afr_local_init] 0-data-sync-replicate-0: no
subvolumes up
[2015-07-08 06:06:10.772631] I [afr-common.c:3839:afr_local_init] 0-data-sync-replicate-0: no
subvolumes up
[2015-07-08 06:06:10.772653] W [fuse-bridge.c:779:fuse_attr_cbk] 0-glusterfs-fuse: 2: LOOKUP() / =>
-1 (Transport endpoint is not connected)
[2015-07-08 06:06:10.776974] I [afr-common.c:3839:afr_local_init] 0-data-sync-replicate-0: no
subvolumes up
[2015-07-08 06:06:10.777810] I [fuse-bridge.c:4921:fuse_thread_proc] 0-fuse: unmounting /data-sync
[2015-07-08 06:06:10.778007] W [glusterfsd.c:1194:cleanup_and_exit] (--> 0-: received signum (15),
shutting down
[2015-07-08 06:06:10.778022] I [fuse-bridge.c:5599:fini] 0-fuse: Unmounting '/data-sync'.
The volume shows as started, but nothing is online:
# gluster volume status
Status of volume: data-sync
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick host2:/gluster                            N/A     N       N/A
NFS Server on localhost                         N/A     N       N/A
Self-heal Daemon on localhost                   N/A     N       N/A

Task Status of Volume data-sync
------------------------------------------------------------------------------
There are no active volume tasks
To work around it, I have to stop the volume, start it again, and then mount.
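Concretely, that means something like this (volume name as above; 'stop' asks for confirmation):

# gluster volume stop data-sync
# gluster volume start data-sync
# mount -t nfs -o vers=3 host1:/data-sync /data-sync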
I can't figure out how to make the volume come up correctly at each boot.
I saw in a bug report that this is an intentional protection: the volume stays offline when the other node is down, to avoid serving stale data.
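If that protection is Gluster's server-side quorum, I suppose it could be relaxed with something like this (untested on my side, and probably unwise on a 2-node replica since it removes the split-brain safety net):

# gluster volume set data-sync cluster.server-quorum-type none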
Any idea how to force it online at boot?
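Or would an fstab entry with a fallback volfile server change anything at boot time? Something like this (backup-volfile-servers is taken from the mount.glusterfs documentation; I haven't verified it against 3.6.3):

host1:/data-sync  /data-sync  glusterfs  defaults,_netdev,backup-volfile-servers=host2  0 0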
Thanks
Nicolas