[Gluster-users] gluster volume status show second node is offline
Dario Lesca
d.lesca at solinos.it
Tue Sep 7 17:04:34 UTC 2021
I have set up a similar test environment with two VMs on my PC, identical
to the one in production.
Everything works fine.
But when I restart node 2, the node boots and everything seems to work,
yet the volume status shows the brick on node 2 as offline:
[root at virt2 ~]# gluster volume status gfsvol1
Status of volume: gfsvol1
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick virt1.local:/gfsvol1/brick1 49152 0 Y 8591
Brick virt2.local:/gfsvol1/brick1 N/A N/A N N/A
Self-heal Daemon on localhost N/A N/A Y 970
Self-heal Daemon on virt1.local N/A N/A Y 8608
Task Status of Volume gfsvol1
------------------------------------------------------------------------------
There are no active volume tasks
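A quick way to confirm that the brick process simply never started on node 2
(just a sketch, not from the original message):

pgrep -af glusterfsd      # the brick daemon; no output means the brick process is not running
systemctl status glusterd # the management daemon that is supposed to spawn it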
I found this suggestion:
https://bobcares.com/blog/gluster-bring-brick-online/
and when I run "gluster volume start gfsvol1 force" the brick comes
back online:
[root at virt2 ~]# gluster volume start gfsvol1 force
volume start: gfsvol1: success
[root at virt2 ~]# gluster volume status gfsvol1
Status of volume: gfsvol1
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick virt1.local:/gfsvol1/brick1 49152 0 Y 8591
Brick virt2.local:/gfsvol1/brick1 49153 0 Y 1422
Self-heal Daemon on localhost N/A N/A Y 970
Self-heal Daemon on virt1.local N/A N/A Y 8608
Task Status of Volume gfsvol1
------------------------------------------------------------------------------
There are no active volume tasks
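Once the brick is back online it is probably worth checking that anything
written while it was down gets healed; a sketch, using the volume name from
this thread:

gluster volume heal gfsvol1 info summary   # per-brick count of entries still pending heal
gluster volume heal gfsvol1 info           # list the pending entries themselves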
But if I reboot the node 2 server, the brick is offline again as soon as
the system comes up.
If I just restart the glusterd service on node 2, the brick comes back online:
systemctl restart glusterd
This looks like a systemd startup-ordering problem:
glusterd seems to be started before the network is online.
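One way to sanity-check that ordering (a sketch, not something from the
original thread):

systemctl show glusterd.service -p After -p Wants   # what glusterd is ordered after / pulls in
systemd-analyze critical-chain glusterd.service     # when it actually started relative to its dependencies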
I tried modifying the systemd unit:
- from this
After=network.target
Before=network-online.target
- to this
After=network.target network-online.target
#Before=network-online.target
Now when I restart the node 2 server everything works and the brick is
always online.
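A possibly safer variant of the same change, instead of editing the packaged
unit file (which a package update may overwrite), is a drop-in; a minimal
sketch, assuming NetworkManager manages the interfaces (the file name is
illustrative):

# /etc/systemd/system/glusterd.service.d/wait-online.conf
[Unit]
Wants=network-online.target
After=network-online.target

# then reload units and make sure network-online.target really waits for the links
systemctl daemon-reload
systemctl enable --now NetworkManager-wait-online.service

Without the wait-online service enabled, network-online.target can be reached
before the dedicated 172.16.3.x link has its address, which would reproduce
the same race.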
Is this change correct, or is something else wrong?
Let me know, and many thanks for your help
Dario
On Tue, 07/09/2021 at 09.46 +0200, Dario Lesca wrote:
> These are the last lines of /var/log/glusterfs/bricks/gfsvol1-brick1.log:
>
> [2021-09-06 21:29:02.165238 +0000] I [addr.c:54:compare_addr_and_update] 0-/gfsvol1/brick1: allowed = "*", received addr = "172.16.3.1"
> [2021-09-06 21:29:02.165365 +0000] I [login.c:110:gf_auth] 0-auth/login: allowed user names: 12261a60-60a5-4791-a3f1-6da397046ee5
> [2021-09-06 21:29:02.165402 +0000] I [MSGID: 115029] [server-handshake.c:561:server_setvolume] 0-gfsvol1-server: accepted client from CTX_ID:444e0582-ac68-4f20-9552-c4dbc7724967-GRAPH_ID:0-PID:227500-HOST:s-virt1.realdomain.it-PC_NAME:gfsvol1-client-1-RECON_NO:-0 (version: 9.3) with subvol /gfsvol1/brick1
> [2021-09-06 21:29:02.179387 +0000] W [socket.c:767:__socket_rwv] 0-tcp.gfsvol1-server: readv on 172.16.3.1:49144 failed (No data available)
> [2021-09-06 21:29:02.179451 +0000] I [MSGID: 115036] [server.c:500:server_rpc_notify] 0-gfsvol1-server: disconnecting connection [{client-uid=CTX_ID:444e0582-ac68-4f20-9552-c4dbc7724967-GRAPH_ID:0-PID:227500-HOST:s-virt1.realdomain.it-PC_NAME:gfsvol1-client-1-RECON_NO:-0}]
> [2021-09-06 21:29:02.179877 +0000] I [MSGID: 101055] [client_t.c:397:gf_client_unref] 0-gfsvol1-server: Shutting down connection CTX_ID:444e0582-ac68-4f20-9552-c4dbc7724967-GRAPH_ID:0-PID:227500-HOST:s-virt1.realdomain.it-PC_NAME:gfsvol1-client-1-RECON_NO:-0
> [2021-09-06 21:29:10.254230 +0000] I [addr.c:54:compare_addr_and_update] 0-/gfsvol1/brick1: allowed = "*", received addr = "172.16.3.1"
> [2021-09-06 21:29:10.254283 +0000] I [login.c:110:gf_auth] 0-auth/login: allowed user names: 12261a60-60a5-4791-a3f1-6da397046ee5
> [2021-09-06 21:29:10.254300 +0000] I [MSGID: 115029] [server-handshake.c:561:server_setvolume] 0-gfsvol1-server: accepted client from CTX_ID:fef710c3-11bf-4a91-b749-f52a536d6dad-GRAPH_ID:0-PID:227541-HOST:s-virt1.realdomain.it-PC_NAME:gfsvol1-client-1-RECON_NO:-0 (version: 9.3) with subvol /gfsvol1/brick1
> [2021-09-06 21:29:10.272069 +0000] W [socket.c:767:__socket_rwv] 0-tcp.gfsvol1-server: readv on 172.16.3.1:49140 failed (No data available)
> [2021-09-06 21:29:10.272133 +0000] I [MSGID: 115036] [server.c:500:server_rpc_notify] 0-gfsvol1-server: disconnecting connection [{client-uid=CTX_ID:fef710c3-11bf-4a91-b749-f52a536d6dad-GRAPH_ID:0-PID:227541-HOST:s-virt1.realdomain.it-PC_NAME:gfsvol1-client-1-RECON_NO:-0}]
> [2021-09-06 21:29:10.272430 +0000] I [MSGID: 101055] [client_t.c:397:gf_client_unref] 0-gfsvol1-server: Shutting down connection CTX_ID:fef710c3-11bf-4a91-b749-f52a536d6dad-GRAPH_ID:0-PID:227541-HOST:s-virt1.realdomain.it-PC_NAME:gfsvol1-client-1-RECON_NO:-0
>
> I have a dedicated network adapter in each server, directly connected
> between the two machines, with IPs 172.16.3.1/30 and 172.16.3.2/30,
> named virt1.local and virt2.local via /etc/hosts.
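> (For reference, a sketch of the /etc/hosts entries on both nodes, using
> the addresses above:
> 172.16.3.1   virt1.local
> 172.16.3.2   virt2.local )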
>
> In these logs I also see the real server name (... HOST:s-
> virt1.realdomain.it-PC_NAME: ...), which has a different IP on a
> different network.
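> One way to see which hostnames Gluster has recorded for each peer (a
> sketch, not part of the original message):
> gluster peer status   # shows the Hostname and any "Other names" per peer
> gluster pool list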
>
> Now this cluster is in production and supports some VMs.
>
> What is the best way to fix this situation safely, without putting the
> running VMs at risk?
>
> Many thanks
> Dario
>
> On Tue, 07/09/2021 at 05.28 +0000, Strahil Nikolov wrote:
> > No, it's not normal.
> > Go to virt2; in the /var/log/glusterfs directory you will find
> > 'bricks'. Check the logs in 'bricks' for more information.
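> > (Concretely, on virt2 that would be something like:
> > ls /var/log/glusterfs/bricks/
> > tail -n 100 /var/log/glusterfs/bricks/gfsvol1-brick1.log )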
> >
> > Best Regards,
> > Strahil Nikolov
> >
> >
> > > On Tue, Sep 7, 2021 at 1:13, Dario Lesca
> > > <d.lesca at solinos.it> wrote:
> > > Hello everybody!
> > > I'm a novice with Gluster. I have set up my first cluster with two
> > > nodes.
> > >
> > > This is the current volume info:
> > >
> > > [root at s-virt1 ~]# gluster volume info gfsvol1
> > > Volume Name: gfsvol1
> > > Type: Replicate
> > > Volume ID: 5bad4a23-58cc-44d7-8195-88409720b941
> > > Status: Started
> > > Snapshot Count: 0
> > > Number of Bricks: 1 x 2 = 2
> > > Transport-type: tcp
> > > Bricks:
> > > Brick1: virt1.local:/gfsvol1/brick1
> > > Brick2: virt2.local:/gfsvol1/brick1
> > > Options Reconfigured:
> > > performance.client-io-threads: off
> > > nfs.disable: on
> > > transport.address-family: inet
> > > storage.fips-mode-rchecksum: on
> > > cluster.granular-entry-heal: on
> > > storage.owner-uid: 107
> > > storage.owner-gid: 107
> > > server.allow-insecure: on
> > >
> > > For now everything seems to work fine.
> > >
> > > I have mounted the gfs volume on both nodes and run the VMs on it.
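> > > (For context, the client mount on each node looks something like
> > > this; the mount point here is illustrative, not taken from the thread:
> > > mount -t glusterfs -o backup-volfile-servers=virt2.local virt1.local:/gfsvol1 /mnt/gfsvol1 )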
> > >
> > > But today I noticed that the second node (virt2) is offline:
> > >
> > > [root at s-virt1 ~]# gluster volume status
> > > Status of volume: gfsvol1
> > > Gluster process                       TCP Port  RDMA Port  Online  Pid
> > > ------------------------------------------------------------------------------
> > > Brick virt1.local:/gfsvol1/brick1     49152     0          Y       3090
> > > Brick virt2.local:/gfsvol1/brick1     N/A       N/A        N       N/A
> > > Self-heal Daemon on localhost         N/A       N/A        Y       3105
> > > Self-heal Daemon on virt2.local       N/A       N/A        Y       3140
> > >
> > > Task Status of Volume gfsvol1
> > > ------------------------------------------------------------------------------
> > > There are no active volume tasks
> > >
> > > [root at s-virt1 ~]# gluster volume status gfsvol1 detail
> > > Status of volume: gfsvol1
> > > ------------------------------------------------------------------------------
> > > Brick                : Brick virt1.local:/gfsvol1/brick1
> > > TCP Port             : 49152
> > > RDMA Port            : 0
> > > Online               : Y
> > > Pid                  : 3090
> > > File System          : xfs
> > > Device               : /dev/mapper/rl-gfsvol1
> > > Mount Options        : rw,seclabel,relatime,attr2,inode64,logbufs=8,logbsize=32k,sunit=128,swidth=128,noquota
> > > Inode Size           : 512
> > > Disk Space Free      : 146.4GB
> > > Total Disk Space     : 999.9GB
> > > Inode Count          : 307030856
> > > Free Inodes          : 307026149
> > > ------------------------------------------------------------------------------
> > > Brick                : Brick virt2.local:/gfsvol1/brick1
> > > TCP Port             : N/A
> > > RDMA Port            : N/A
> > > Online               : N
> > > Pid                  : N/A
> > > File System          : xfs
> > > Device               : /dev/mapper/rl-gfsvol1
> > > Mount Options        : rw,seclabel,relatime,attr2,inode64,logbufs=8,logbsize=32k,sunit=128,swidth=128,noquota
> > > Inode Size           : 512
> > > Disk Space Free      : 146.4GB
> > > Total Disk Space     : 999.9GB
> > > Inode Count          : 307052016
> > > Free Inodes          : 307047307
> > >
> > > What does it mean?
> > > What's wrong?
> > > Is this normal, or am I missing some setting?
> > >
> > > If you need more information let me know
> > >
> > > Many thanks for your help
> > >
> > >