[Gluster-users] Big problems after update to 9.6
David Cunningham
dcunningham at voisonics.com
Thu Feb 23 21:56:10 UTC 2023
Is it possible that versions 9.1 and 9.6 can't talk to each other? My
understanding was that they should be able to.
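
Is there a way to confirm whether an op-version mismatch could explain
this? We were thinking of checking along these lines (an assumption on
our part that cluster.op-version / cluster.max-op-version are the right
options to query in 9.x):

# on each node; peers should agree on cluster.op-version
gluster volume get all cluster.op-version
gluster volume get all cluster.max-op-version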
On Fri, 24 Feb 2023 at 10:36, David Cunningham <dcunningham at voisonics.com>
wrote:
> We've tried to remove "sg" from the cluster so we can re-install
> GlusterFS on it, but the following command, run on "br", also gives a
> timeout error:
>
> gluster volume remove-brick gvol0 replica 1 sg:/nodirectwritedata/gluster/gvol0 force
>
> How can we tell "br" to just remove "sg" without trying to contact it?
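>
> Would a forced peer detach be the right way to do that, something like
> the following (an assumption on our part that "force" avoids waiting on
> the unreachable node)?
>
> # run on "br"; assumption: the brick on "sg" has to be removed first
> gluster peer detach sg force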
>
>
> On Fri, 24 Feb 2023 at 10:31, David Cunningham <dcunningham at voisonics.com>
> wrote:
>
>> Hello,
>>
>> We have a cluster with two nodes, "sg" and "br", which were running
>> GlusterFS 9.1, installed via the Ubuntu package manager. We updated the
>> Ubuntu packages on "sg" to version 9.6, and now have big problems. The "br"
>> node is still on version 9.1.
>>
>> Running "gluster volume status" on either host gives "Error : Request
>> timed out". On "sg" not all processes are running, compared to "br", as
>> below. Restarting the services on "sg" doesn't help. Can anyone advise how
>> we should proceed? This is a production system.
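>>
>> For reference, the restarts we tried on "sg" were roughly these
>> (assuming the usual Ubuntu service names):
>>
>> systemctl restart glusterd
>> systemctl restart glustereventsd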
>>
>> root at sg:~# ps -ef | grep gluster
>> root 15196 1 0 22:37 ? 00:00:00 /usr/sbin/glusterd -p
>> /var/run/glusterd.pid --log-level INFO
>> root 15426 1 0 22:39 ? 00:00:00 /usr/bin/python3
>> /usr/sbin/glustereventsd --pid-file /var/run/glustereventsd.pid
>> root 15457 15426 0 22:39 ? 00:00:00 /usr/bin/python3
>> /usr/sbin/glustereventsd --pid-file /var/run/glustereventsd.pid
>> root 19341 13695 0 23:24 pts/1 00:00:00 grep --color=auto gluster
>>
>> root at br:~# ps -ef | grep gluster
>> root 2052 1 0 2022 ? 00:00:00 /usr/bin/python3
>> /usr/sbin/glustereventsd --pid-file /var/run/glustereventsd.pid
>> root 2062 1 3 2022 ? 10-11:57:16 /usr/sbin/glusterfs
>> --fuse-mountopts=noatime --process-name fuse --volfile-server=br
>> --volfile-server=sg --volfile-id=/gvol0 --fuse-mountopts=noatime
>> /mnt/glusterfs
>> root 2379 2052 0 2022 ? 00:00:00 /usr/bin/python3
>> /usr/sbin/glustereventsd --pid-file /var/run/glustereventsd.pid
>> root 5884 1 5 2022 ? 18-16:08:53 /usr/sbin/glusterfsd
>> -s br --volfile-id gvol0.br.nodirectwritedata-gluster-gvol0 -p
>> /var/run/gluster/vols/gvol0/br-nodirectwritedata-gluster-gvol0.pid -S
>> /var/run/gluster/61df1d4e1c65300e.socket --brick-name
>> /nodirectwritedata/gluster/gvol0 -l
>> /var/log/glusterfs/bricks/nodirectwritedata-gluster-gvol0.log
>> --xlator-option *-posix.glusterd-uuid=11e528b0-8c69-4b5d-82ed-c41dd25536d6
>> --process-name brick --brick-port 49152 --xlator-option
>> gvol0-server.listen-port=49152
>> root 10463 18747 0 23:24 pts/1 00:00:00 grep --color=auto gluster
>> root 27744 1 0 2022 ? 03:55:10 /usr/sbin/glusterfsd -s
>> br --volfile-id gvol0.br.nodirectwritedata-gluster-gvol0 -p
>> /var/run/gluster/vols/gvol0/br-nodirectwritedata-gluster-gvol0.pid -S
>> /var/run/gluster/61df1d4e1c65300e.socket --brick-name
>> /nodirectwritedata/gluster/gvol0 -l
>> /var/log/glusterfs/bricks/nodirectwritedata-gluster-gvol0.log
>> --xlator-option *-posix.glusterd-uuid=11e528b0-8c69-4b5d-82ed-c41dd25536d6
>> --process-name brick --brick-port 49153 --xlator-option
>> gvol0-server.listen-port=49153
>> root 48227 1 0 Feb17 ? 00:00:26 /usr/sbin/glusterd -p
>> /var/run/glusterd.pid --log-level INFO
>>
>> On "sg" in glusterd.log we're seeing:
>>
>> [2023-02-23 20:26:57.619318 +0000] E [rpc-clnt.c:181:call_bail]
>> 0-management: bailing out frame type(glusterd mgmt v3), op(--(6)), xid =
>> 0x11, unique = 27, sent = 2023-02-23 20:16:50.596447 +0000, timeout = 600
>> for 10.20.20.11:24007
>> [2023-02-23 20:26:57.619425 +0000] E [MSGID: 106115]
>> [glusterd-mgmt.c:122:gd_mgmt_v3_collate_errors] 0-management: Unlocking
>> failed on br. Please check log file for details.
>> [2023-02-23 20:26:57.619545 +0000] E [MSGID: 106151]
>> [glusterd-syncop.c:1655:gd_unlock_op_phase] 0-management: Failed to unlock
>> on some peer(s)
>> [2023-02-23 20:26:57.619693 +0000] W
>> [glusterd-locks.c:817:glusterd_mgmt_v3_unlock]
>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/9.6/xlator/mgmt/glusterd.so(+0xe19b9)
>> [0x7fadf47fa9b9]
>> -->/usr/lib/x86_64-linux-gnu/glusterfs/9.6/xlator/mgmt/glusterd.so(+0xe0e20)
>> [0x7fadf47f9e20]
>> -->/usr/lib/x86_64-linux-gnu/glusterfs/9.6/xlator/mgmt/glusterd.so(+0xe7904)
>> [0x7fadf4800904] ) 0-management: Lock owner mismatch. Lock for vol gvol0
>> held by 11e528b0-8c69-4b5d-82ed-c41dd25536d6
>> [2023-02-23 20:26:57.619780 +0000] E [MSGID: 106117]
>> [glusterd-syncop.c:1679:gd_unlock_op_phase] 0-management: Unable to release
>> lock for gvol0
>> [2023-02-23 20:26:57.619939 +0000] I
>> [socket.c:3811:socket_submit_outgoing_msg] 0-socket.management: not
>> connected (priv->connected = -1)
>> [2023-02-23 20:26:57.619969 +0000] E
>> [rpcsvc.c:1567:rpcsvc_submit_generic] 0-rpc-service: failed to submit
>> message (XID: 0x3, Program: GlusterD svc cli, ProgVers: 2, Proc: 27) to
>> rpc-transport (socket.management)
>> [2023-02-23 20:26:57.619995 +0000] E [MSGID: 106430]
>> [glusterd-utils.c:678:glusterd_submit_reply] 0-glusterd: Reply submission
>> failed
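>>
>> The lock-owner UUID in the "Lock owner mismatch" line is the same one
>> that appears in the glusterfsd command line on "br" above, so it looks
>> like "br" is holding a stale volume lock. We compared each node's UUID
>> roughly like this (assuming the default glusterd state directory):
>>
>> # run on each node
>> grep UUID /var/lib/glusterd/glusterd.info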
>>
>> And in the brick log:
>>
>> [2023-02-23 20:22:56.717721 +0000] I [addr.c:54:compare_addr_and_update]
>> 0-/nodirectwritedata/gluster/gvol0: allowed = "*", received addr =
>> "10.20.20.11"
>> [2023-02-23 20:22:56.717817 +0000] I [login.c:110:gf_auth] 0-auth/login:
>> allowed user names: a26c7de4-1236-4e0a-944a-cb82de7f7f0e
>> [2023-02-23 20:22:56.717840 +0000] I [MSGID: 115029]
>> [server-handshake.c:561:server_setvolume] 0-gvol0-server: accepted client
>> from
>> CTX_ID:46b23c19-5114-4a20-9306-9ea6faf02d51-GRAPH_ID:0-PID:35568-HOST:br.m5voip.com-PC_NAME:gvol0-client-0-RECON_NO:-0
>> (version: 9.1) with subvol /nodirectwritedata/gluster/gvol0
>> [2023-02-23 20:22:56.741545 +0000] W [socket.c:766:__socket_rwv]
>> 0-tcp.gvol0-server: readv on 10.20.20.11:49144 failed (No data available)
>> [2023-02-23 20:22:56.741599 +0000] I [MSGID: 115036]
>> [server.c:500:server_rpc_notify] 0-gvol0-server: disconnecting connection
>> [{client-uid=CTX_ID:46b23c19-5114-4a20-9306-9ea6faf02d51-GRAPH_ID:0-PID:35568-HOST:br.m5voip.com-PC_NAME:gvol0-client-0-RECON_NO:-0}]
>>
>> [2023-02-23 20:22:56.741866 +0000] I [MSGID: 101055]
>> [client_t.c:397:gf_client_unref] 0-gvol0-server: Shutting down connection
>> CTX_ID:46b23c19-5114-4a20-9306-9ea6faf02d51-GRAPH_ID:0-PID:35568-HOST:br.m5voip.com-PC_NAME:gvol0-client-0-RECON_NO:-0
>>
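>> So the 9.1 client from "br" is accepted and then disconnected almost
>> immediately. We pulled these entries out of the brick log roughly like
>> this:
>>
>> grep -E 'server_setvolume|disconnecting connection' \
>>   /var/log/glusterfs/bricks/nodirectwritedata-gluster-gvol0.log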
>>
>> Thanks for your help,
>>
--
David Cunningham, Voisonics Limited
http://voisonics.com/
USA: +1 213 221 1092
New Zealand: +64 (0)28 2558 3782