[Gluster-users] Upgrade 5.3 -> 5.4 on debian: public IP is used instead of LAN IP

Tue Mar 5 10:01:42 UTC 2019

FYI: I downgraded from 5.4 to 5.3 and it worked. All replicas are up and
running. Awaiting the updated v5.4.
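
For anyone repeating the downgrade, "up and running" can be confirmed from the 'gluster peer status' output quoted further down in this thread. The following is a minimal Python sketch, not an official tool. The function names are made up for illustration, and it assumes the output keeps the Hostname/Uuid/State layout shown in this thread.

```python
# Sample copied from the 'gluster peer status' output quoted in this thread.
SAMPLE = """\
Number of Peers: 2

Hostname: gluster1
Uuid: 9a360776-7b58-49ae-831e-a0ce4e4afbef
State: Peer Rejected (Connected)

Hostname: gluster3
Uuid: c7b4a448-ca6a-4051-877f-788f9ee9bc4a
State: Peer in Cluster (Connected)
"""


def parse_peer_status(text):
    """Turn 'gluster peer status' text into a list of dicts, one per peer."""
    peers = []
    for raw in text.splitlines():
        key, sep, value = raw.strip().partition(":")
        if not sep:
            continue  # blank lines and anything that is not "key: value"
        key, value = key.strip().lower(), value.strip()
        if key == "hostname":
            peers.append({"hostname": value})  # a Hostname line starts a new peer
        elif key in ("uuid", "state") and peers:
            peers[-1][key] = value
    return peers


def peers_not_in_cluster(text):
    """Return hostnames whose state is anything other than 'Peer in Cluster'."""
    return [
        p["hostname"]
        for p in parse_peer_status(text)
        if not p.get("state", "").startswith("Peer in Cluster")
    ]
```

On a live node the text could come from subprocess.run(["gluster", "peer", "status"], capture_output=True, text=True).stdout. An empty list from peers_not_in_cluster() would mean every listed peer reports "Peer in Cluster".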

thx :-)
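
The question in the subject line (public IP vs. LAN IP) comes down to what the system resolver returns for the peer names. Below is a small Python sketch for checking that on each node. It is only an illustration: the function names are hypothetical, and the LAN range 192.168.0.0/24 is an assumption inferred from the 192.168.0.50-52 entries quoted below. It does not address the checksum problem tracked in the bug report; it only helps rule name resolution in or out.

```python
import ipaddress
import socket


def resolve_ipv4(hostname):
    """Ask the system resolver (/etc/hosts, DNS, ...) for IPv4 addresses."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET the sockaddr is (address, port).
    return sorted({info[4][0] for info in infos})


def check_hosts(hostnames, lan_network, resolver=resolve_ipv4):
    """Report, per hostname, all resolved addresses and those outside the LAN.

    lan_network is an assumption supplied by the caller, e.g. "192.168.0.0/24".
    resolver can be swapped out for testing without touching the network.
    """
    lan = ipaddress.ip_network(lan_network)
    report = {}
    for name in hostnames:
        try:
            addrs = resolver(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            addrs = []
        outside = [a for a in addrs if ipaddress.ip_address(a) not in lan]
        report[name] = {"addresses": addrs, "outside_lan": outside}
    return report
```

Calling check_hosts(["gluster1", "gluster2", "gluster3"], "192.168.0.0/24") on each node and looking for a non-empty "outside_lan" list would show whether a peer name resolves to a non-LAN address there.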

On Tue, Mar 5, 2019 at 09:26 Hari Gowtham <hgowtham at redhat.com> wrote:
>
> There are plans to revert the patch causing this error and rebuild 5.4.
> This should happen fairly quickly. The rebuilt 5.4 should be free of this upgrade issue.
>
> In the meantime, you can use 5.3 for this cluster.
> Downgrading to 5.3 will work if only one node was upgraded to 5.4
> and the other nodes are still on 5.3.
>
> On Tue, Mar 5, 2019 at 1:07 PM Hu Bert <revirii at googlemail.com> wrote:
> >
> > Hi Hari,
> >
> > thanks for the hint. Do you know when this will be fixed? Would a downgrade
> > from 5.4 to 5.3 be a workaround?
> >
> > Hubert
> >
> > On Tue, Mar 5, 2019 at 08:32 Hari Gowtham <hgowtham at redhat.com> wrote:
> > >
> > > Hi,
> > >
> > > This is a known issue we are working on.
> > > Because the checksum differs between the upgraded and non-upgraded nodes, the
> > > peers are getting rejected.
> > > The bricks aren't coming up because of the same issue.
> > >
> > > More about the issue: https://bugzilla.redhat.com/show_bug.cgi?id=1685120
> > >
> > > On Tue, Mar 5, 2019 at 12:56 PM Hu Bert <revirii at googlemail.com> wrote:
> > > >
> > > > Interestingly, 'gluster volume status' omits gluster1, while the heal
> > > > statistics show gluster1:
> > > >
> > > > gluster volume status workdata
> > > > Status of volume: workdata
> > > > Gluster process                             TCP Port  RDMA Port  Online  Pid
> > > > ------------------------------------------------------------------------------
> > > > Brick gluster2:/gluster/md4/workdata        49153     0          Y       1723
> > > > Brick gluster3:/gluster/md4/workdata        49153     0          Y       2068
> > > > Self-heal Daemon on localhost               N/A       N/A        Y       1732
> > > > Self-heal Daemon on gluster3                N/A       N/A        Y       2077
> > > >
> > > > vs.
> > > >
> > > > gluster volume heal workdata statistics heal-count
> > > > Gathering count of entries to be healed on volume workdata has been successful
> > > >
> > > > Brick gluster1:/gluster/md4/workdata
> > > > Number of entries: 0
> > > >
> > > > Brick gluster2:/gluster/md4/workdata
> > > > Number of entries: 10745
> > > >
> > > > Brick gluster3:/gluster/md4/workdata
> > > > Number of entries: 10744
> > > >
> > > > On Tue, Mar 5, 2019 at 08:18 Hu Bert <revirii at googlemail.com> wrote:
> > > > >
> > > > > Hi Milind,
> > > > >
> > > > > well, there are such entries, but those haven't been a problem during
> > > > > install and the last kernel update+reboot. The entries look like:
> > > > >
> > > > > PUBLIC_IP  gluster2.alpserver.de gluster2
> > > > >
> > > > > 192.168.0.50 gluster1
> > > > > 192.168.0.51 gluster2
> > > > > 192.168.0.52 gluster3
> > > > >
> > > > > 'ping gluster2' resolves to the LAN IP. I removed the last entry in the
> > > > > 1st line and rebooted, but that didn't help. From
> > > > > /var/log/glusterfs/glusterd.log on gluster2:
> > > > >
> > > > > [2019-03-05 07:04:36.188128] E [MSGID: 106010]
> > > > > [glusterd-utils.c:3483:glusterd_compare_friend_volume] 0-management:
> > > > > Version of Cksums persistent differ. local cksum = 3950307018, remote
> > > > > cksum = 455409345 on peer gluster1
> > > > > [2019-03-05 07:04:36.188314] I [MSGID: 106493]
> > > > > [glusterd-handler.c:3843:glusterd_xfer_friend_add_resp] 0-glusterd:
> > > > > Responded to gluster1 (0), ret: 0, op_ret: -1
> > > > >
> > > > > Interestingly there are no entries in the brick logs of the rejected
> > > > > server. Well, not surprising as no brick process is running. The
> > > > > server gluster1 is still in rejected state.
> > > > >
> > > > > 'gluster volume start workdata force' starts the brick process on
> > > > > gluster1, and some heals are happening on gluster2+3, but the output of
> > > > > 'gluster volume status workdata' is still incomplete.
> > > > >
> > > > > gluster1:
> > > > > ------------------------------------------------------------------------------
> > > > > Brick gluster1:/gluster/md4/workdata        49152     0          Y       2523
> > > > > Self-heal Daemon on localhost               N/A       N/A        Y       2549
> > > > >
> > > > > gluster2:
> > > > > Gluster process                             TCP Port  RDMA Port  Online  Pid
> > > > > ------------------------------------------------------------------------------
> > > > > Brick gluster2:/gluster/md4/workdata        49153     0          Y       1723
> > > > > Brick gluster3:/gluster/md4/workdata        49153     0          Y       2068
> > > > > Self-heal Daemon on localhost               N/A       N/A        Y       1732
> > > > > Self-heal Daemon on gluster3                N/A       N/A        Y       2077
> > > > >
> > > > >
> > > > > Hubert
> > > > >
> > > > > On Tue, Mar 5, 2019 at 07:58 Milind Changire <mchangir at redhat.com> wrote:
> > > > > >
> > > > > > There are probably DNS or /etc/hosts entries that resolve the host names (gluster1, gluster2, gluster3) to the public IP addresses.
> > > > > > /etc/resolv.conf shows which default domain is searched for the node names and which DNS servers answer the queries.
> > > > > >
> > > > > >
> > > > > > On Tue, Mar 5, 2019 at 12:14 PM Hu Bert <revirii at googlemail.com> wrote:
> > > > > >>
> > > > > >> Good morning,
> > > > > >>
> > > > > > >> I have a replica 3 setup with 2 volumes, running version 5.3 on
> > > > > > >> Debian stretch. This morning I upgraded one server to version 5.4 and
> > > > > > >> rebooted the machine; after the restart I noticed that:
> > > > > >>
> > > > > >> - no brick process is running
> > > > > >> - gluster volume status only shows the server itself:
> > > > > >> gluster volume status workdata
> > > > > >> Status of volume: workdata
> > > > > >> Gluster process                             TCP Port  RDMA Port  Online  Pid
> > > > > >> ------------------------------------------------------------------------------
> > > > > >> Brick gluster1:/gluster/md4/workdata        N/A       N/A        N       N/A
> > > > > >> NFS Server on localhost                     N/A       N/A        N       N/A
> > > > > >>
> > > > > >> - gluster peer status on the server
> > > > > >> gluster peer status
> > > > > >> Number of Peers: 2
> > > > > >>
> > > > > >> Hostname: gluster3
> > > > > >> Uuid: c7b4a448-ca6a-4051-877f-788f9ee9bc4a
> > > > > >> State: Peer Rejected (Connected)
> > > > > >>
> > > > > >> Hostname: gluster2
> > > > > >> Uuid: 162fea82-406a-4f51-81a3-e90235d8da27
> > > > > >> State: Peer Rejected (Connected)
> > > > > >>
> > > > > >> - gluster peer status on the other 2 servers:
> > > > > >> gluster peer status
> > > > > >> Number of Peers: 2
> > > > > >>
> > > > > >> Hostname: gluster1
> > > > > >> Uuid: 9a360776-7b58-49ae-831e-a0ce4e4afbef
> > > > > >> State: Peer Rejected (Connected)
> > > > > >>
> > > > > >> Hostname: gluster3
> > > > > >> Uuid: c7b4a448-ca6a-4051-877f-788f9ee9bc4a
> > > > > >> State: Peer in Cluster (Connected)
> > > > > >>
> > > > > > >> I also noticed in the brick logs that the public IP is used
> > > > > > >> instead of the LAN IP. Brick logs from one of the volumes:
> > > > > >>
> > > > > >> rejected node: https://pastebin.com/qkpj10Sd
> > > > > >> connected nodes: https://pastebin.com/8SxVVYFV
> > > > > >>
> > > > > >> Why is the public IP suddenly used instead of the LAN IP? Killing all
> > > > > >> gluster processes and rebooting (again) didn't help.
> > > > > >>
> > > > > >>
> > > > > >> Thx,
> > > > > >> Hubert
> > > > > >> _______________________________________________
> > > > > >> Gluster-users mailing list
> > > > > >> Gluster-users at gluster.org
> > > > > >> https://lists.gluster.org/mailman/listinfo/gluster-users
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Milind
> > > > > >
> > >
> > >
> > >
> > > --
> > > Regards,
> > > Hari Gowtham.
>
>
>
> --
> Regards,
> Hari Gowtham.