[Bugs] [Bug 1524058] New: gluster peer command stops working with unhelpful error messages when DNS doesn't work
bugzilla at redhat.com
Sat Dec 9 17:44:38 UTC 2017
https://bugzilla.redhat.com/show_bug.cgi?id=1524058
Bug ID: 1524058
Summary: gluster peer command stops working with unhelpful
error messages when DNS doesn't work
Product: GlusterFS
Version: mainline
Component: core
Assignee: bugs at gluster.org
Reporter: nh2-redhatbugzilla at deditus.de
CC: bugs at gluster.org
Description of problem:
Gluster 3.12.3 on Linux.
Consider the following outputs. None of that makes any sense to me:
[root@node-1:~]# time gluster peer probe status
peer probe: success. Host status port 24007 already in peer list
real 0m10.060s
[root@node-1:~]# time gluster peer status
peer status: failed
real 0m0.051s
[root@node-1:~]# time gluster pool list
pool list: failed
real 0m0.050s
[root@node-1:~]# gluster peer probe 10.0.0.1
peer probe: success. Probe on localhost not needed
[root@node-1:~]# gluster peer probe 10.0.0.2
peer probe: success. Host 10.0.0.2 port 24007 already in peer list
[root@node-1:~]# gluster peer detach status
peer detach: failed: One of the peers is probably down. Check with 'peer status'
[root@node-1:~]# gluster peer status
peer status: failed
First, when I run `gluster peer probe status` (not a reasonable command, since
it treats `status` as a hostname), why does it say "peer probe: success. Host
status port 24007 already in peer list"? That makes no sense; there is no host
called "status" in my network.
Next, `gluster peer status` fails, and the error message "peer status: failed"
is extremely unhelpful, as it contains no information about the failure.
Later probes of e.g. `10.0.0.1` suggest that there's already a working "peer
list" with some contents, but apparently I have no way at all to list those
peers.
When I try to detach the apparently-attached garbage peer called "status", I
get told to run `peer status`, but it doesn't work.
What's going on here?
The glusterd log (/var/log/glusterfs/glusterd.log) gives some insight:
[2017-12-09 17:34:21.858454] I [MSGID: 106487]
[glusterd-handler.c:1485:__glusterd_handle_cli_list_friends] 0-glusterd:
Received cli list req
[2017-12-09 17:34:21.858517] W [dict.c:912:str_to_data]
(-->/nix/store/y9qg9jan88wnsszmb1badhyfak2znpz7-glusterfs-3.12.3/lib/glusterfs/3.12.3/xlator/mgmt/glusterd.so(+0x104db4)
[0x7f6f54fdadb4]
-->/nix/store/y9qg9jan88wnsszmb1badhyfak2znpz7-glusterfs-3.12.3/lib/libglusterfs.so.0(dict_set_str+0x16)
[0x7f6f60919be6]
-->/nix/store/y9qg9jan88wnsszmb1badhyfak2znpz7-glusterfs-3.12.3/lib/libglusterfs.so.0(str_to_data+0x82)
[0x7f6f60918122] ) 0-dict: value is NULL [Invalid argument]
[2017-12-09 17:34:23.103687] E [MSGID: 101075]
[common-utils.c:320:gf_resolve_ip6] 0-resolver: getaddrinfo failed (Temporary
failure in name resolution)
[2017-12-09 17:34:23.103714] E [name.c:267:af_inet_client_get_remote_sockaddr]
0-management: DNS resolution failed on host status
Looks like the real error is `getaddrinfo failed`, probably some DNS problem on
my system.
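To illustrate the kind of message I would have expected from the CLI, here is a minimal Python sketch (not GlusterFS code) that turns a `getaddrinfo` failure into a readable error. The function name `explain_resolution` and its `resolver` parameter are hypothetical and exist only for this illustration; port 24007 is taken from the output above.

```python
import socket


def explain_resolution(host, port, resolver=socket.getaddrinfo):
    """Return (ok, message) describing whether `host` can be resolved.

    `resolver` is injectable so the failure path can be exercised
    without depending on the state of the local DNS setup.
    """
    try:
        resolver(host, port)
    except socket.gaierror as err:
        # err.strerror holds the resolver's own text, e.g.
        # "Temporary failure in name resolution" for EAI_AGAIN.
        return False, f"DNS resolution failed on host {host}: {err.strerror}"
    return True, f"host {host} resolved"


if __name__ == "__main__":
    ok, message = explain_resolution("status", 24007)
    print(message)
```

The point of the sketch is only that the resolver's error text is available to the caller and could be passed through to the user instead of a bare "failed".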
So:
* Could `gluster peer status` tell me directly about this problem, instead of
saying "failed"?
* Why do `gluster peer status` and `gluster pool list` fail if DNS doesn't
work? I'd assume if there is a list of hosts, I should be able to view it, any
time.
* What's going on with the weird success message of adding a non-existent host?
* What's up with `0-dict: value is NULL [Invalid argument]`?
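The behaviour asked for in the second point could look roughly like the following Python sketch: the stored peer list is always printed, and a resolution failure is reported per host rather than failing the whole command. This is an assumption about desirable behaviour, not a description of how glusterd works; `peer_list_report` and its parameters are hypothetical names.

```python
import socket


def peer_list_report(peers, port=24007, resolver=socket.getaddrinfo):
    """Build one report line per stored peer, never failing as a whole.

    A host that cannot be resolved is still listed, annotated with
    the resolver's error text.
    """
    lines = []
    for host in peers:
        try:
            resolver(host, port)
            state = "resolvable"
        except socket.gaierror as err:
            state = f"unresolvable ({err.strerror})"
        lines.append(f"{host}\t{state}")
    return lines


if __name__ == "__main__":
    # Hosts taken from the session above; "status" is the bogus peer.
    for line in peer_list_report(["10.0.0.2", "status"]):
        print(line)
```

With such a listing, the bogus peer "status" would be visible and could be identified as the cause of the failure.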
--
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.