[Gluster-users] Geo-replication status Faulty

Tue Oct 27 12:15:04 UTC 2020

Dear Felix

I have applied these parameters to the 2-node Gluster cluster:

gluster vol set VMS cluster.heal-timeout 10
gluster volume heal VMS enable
gluster vol set VMS cluster.quorum-reads false
gluster vol set VMS cluster.quorum-count 1
gluster vol set VMS network.ping-timeout 2
gluster volume set VMS cluster.favorite-child-policy mtime
gluster volume heal VMS granular-entry-heal enable
gluster volume set VMS cluster.data-self-heal-algorithm full
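
To confirm that the options took effect, they can be read back with
"gluster volume get". This is only a sketch: the helper name
show_volume_options is made up for illustration, and the option list covers
only the keys set with "gluster vol set" above.

```shell
# Hypothetical helper: print the current value of each option set above.
# Usage: show_volume_options VOLNAME
show_volume_options() {
    vol="$1"
    for opt in cluster.heal-timeout cluster.quorum-reads cluster.quorum-count \
               network.ping-timeout cluster.favorite-child-policy \
               cluster.data-self-heal-algorithm; do
        # "gluster volume get" reports the effective value of one option
        gluster volume get "$vol" "$opt"
    done
}
```

For the volume above this would be called as: show_volume_options VMS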

As you can see, I use this for virtualization.
I then mount the Gluster volume by putting this line in the fstab file:

On gluster01:

gluster01:VMS /vms glusterfs defaults,_netdev,x-systemd.automount,backupvolfile-server=gluster02 0 0

On gluster02:

gluster02:VMS /vms glusterfs defaults,_netdev,x-systemd.automount,backupvolfile-server=gluster01 0 0
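
The same mount can also be tried by hand before relying on fstab. A sketch
only, assuming the backupvolfile-server option behaves the same on the
command line as it does in fstab; try_mount is a hypothetical helper name.

```shell
# Hypothetical helper: mount a Gluster volume with a fallback volfile server.
# Usage: try_mount PRIMARY BACKUP VOLUME MOUNTPOINT
try_mount() {
    # PRIMARY serves the volfile; BACKUP is asked if PRIMARY is unreachable
    mount -t glusterfs -o "backupvolfile-server=$2" "$1:$3" "$4"
}
```

On gluster01 this would be: try_mount gluster01 gluster02 VMS /vms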

After shutting down gluster01, gluster02 can still access the mounted
Gluster volume.

Only the geo-replication fails.

I could not see why, but I will investigate further.
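
For that investigation, one starting point is the session status. The sketch
below counts the workers reported as Faulty; georep_faulty_count is a
hypothetical helper name, and the session arguments are the ones quoted
further down in this thread.

```shell
# Hypothetical helper: count geo-replication workers reported as Faulty.
# Usage: georep_faulty_count MASTERVOL SLAVE_URL
georep_faulty_count() {
    # "status" prints one row per master brick; count the rows marked Faulty
    gluster volume geo-replication "$1" "$2" status | grep -c 'Faulty'
}
```

For the session in this thread:
georep_faulty_count DATA root@gluster03::DATA-SLAVE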

Thanks

---
Gilberto Nunes Ferreira

On Tue, Oct 27, 2020 at 04:57, Felix Kölzow <felix.koelzow at gmx.de>
wrote:

> Dear Gilberto,
>
>
> If I am right, you ran into server quorum when you started a 2-node replica
> and shut down one host.
>
> From my perspective, it's fine.
>
>
> Please correct me if I am wrong here.
>
>
> Regards,
>
> Felix
> On 27/10/2020 01:46, Gilberto Nunes wrote:
>
> Well, I did not reboot the host; I shut it down. Then, after 15 minutes,
> I gave up.
> I don't know why that happened.
> I will try it later.
>
> ---
> Gilberto Nunes Ferreira
>
> On Mon, Oct 26, 2020 at 21:31, Strahil Nikolov <
> hunter86_bg at yahoo.com> wrote:
>
>> Usually there is only 1 "master", but when you power off one of the
>> 2 nodes, geo-replication should handle that and the second node should
>> take over the job.
>>
>> How long did you wait after gluster01 was rebooted?
>>
>>
>> Best Regards,
>> Strahil Nikolov
>>
>> On Monday, October 26, 2020, 22:46:21 GMT+2, Gilberto Nunes <
>> gilberto.nunes32 at gmail.com> wrote:
>>
>> I was able to solve the issue by restarting all servers.
>>
>> Now I have another issue!
>>
>> I just powered off the gluster01 server, and the geo-replication then
>> entered Faulty status.
>> I tried to stop and start the Gluster geo-replication like this:
>>
>> gluster volume geo-replication DATA root@gluster03::DATA-SLAVE resume
>> Peer gluster01.home.local, which is a part of DATA volume, is down. Please
>> bring up the peer and retry.
>> geo-replication command failed
>>
>> How can I have geo-replication with 2 masters and 1 slave?
>>
>> Thanks
>>
>>
>> ---
>> Gilberto Nunes Ferreira
>>
>> On Mon, Oct 26, 2020 at 17:23, Gilberto Nunes <
>> gilberto.nunes32 at gmail.com> wrote:
>> > Hi there...
>> >
>> > I created a 2-node Gluster volume, plus one more Gluster server acting
>> > as a backup server, using geo-replication.
>> > So on gluster01 I issued these commands:
>> >
>> > gluster peer probe gluster02;gluster peer probe gluster03
>> > gluster vol create DATA replica 2 gluster01:/DATA/master01-data gluster02:/DATA/master01-data/
>> >
>> > Then on the gluster03 server:
>> >
>> > gluster vol create DATA-SLAVE gluster03:/DATA/slave-data/
>> >
>> > I set up passwordless SSH sessions between these 3 servers.
>> >
>> > Then I used this script
>> >
>> > https://github.com/gilbertoferreira/georepsetup
>> >
>> > like this
>> >
>> > georepsetup
>> > /usr/local/lib/python2.7/dist-packages/paramiko-2.7.2-py2.7.egg/paramiko/transport.py:33:
>> > CryptographyDeprecationWarning: Python 2 is no longer supported by the
>> > Python core team. Support for it is now deprecated in cryptography, and
>> > will be removed in a future release.
>> >   from cryptography.hazmat.backends import default_backend
>> > usage: georepsetup [-h] [--force] [--no-color] MASTERVOL SLAVE SLAVEVOL
>> > georepsetup: error: too few arguments
>> >
>> > gluster01:~# georepsetup DATA gluster03 DATA-SLAVE
>> > /usr/local/lib/python2.7/dist-packages/paramiko-2.7.2-py2.7.egg/paramiko/transport.py:33:
>> > CryptographyDeprecationWarning: Python 2 is no longer supported by the
>> > Python core team. Support for it is now deprecated in cryptography, and
>> > will be removed in a future release.
>> >   from cryptography.hazmat.backends import default_backend
>> > Geo-replication session will be established between DATA and gluster03::DATA-SLAVE
>> > Root password of gluster03 is required to complete the setup. NOTE: Password will not be stored.
>> >
>> > root@gluster03's password:
>> > [    OK] gluster03 is Reachable(Port 22)
>> > [    OK] SSH Connection established root@gluster03
>> > [    OK] Master Volume and Slave Volume are compatible (Version: 8.2)
>> > [    OK] Common secret pub file present at /var/lib/glusterd/geo-replication/common_secret.pem.pub
>> > [    OK] common_secret.pem.pub file copied to gluster03
>> > [    OK] Master SSH Keys copied to all Up Slave nodes
>> > [    OK] Updated Master SSH Keys to all Up Slave nodes authorized_keys file
>> > [    OK] Geo-replication Session Established
>> >
>> > Then I rebooted the 3 servers.
>> > At first everything worked OK, but after a few minutes I got
>> > Faulty status on gluster01.
>> >
>> > Here is the log:
>> >
>> >
>> > [2020-10-26 20:16:41.362584] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}]
>> > [2020-10-26 20:16:41.362937] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/DATA/master01-data}, {slave_node=gluster03}]
>> > [2020-10-26 20:16:41.508884] I [resource(worker /DATA/master01-data):1387:connect_remote] SSH: Initializing SSH connection between master and slave...
>> > [2020-10-26 20:16:42.996678] I [resource(worker /DATA/master01-data):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=1.4873}]
>> > [2020-10-26 20:16:42.997121] I [resource(worker /DATA/master01-data):1116:connect] GLUSTER: Mounting gluster volume locally...
>> > [2020-10-26 20:16:44.170661] E [syncdutils(worker /DATA/master01-data):110:gf_mount_ready] <top>: failed to get the xattr value
>> > [2020-10-26 20:16:44.171281] I [resource(worker /DATA/master01-data):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1739}]
>> > [2020-10-26 20:16:44.171772] I [subcmds(worker /DATA/master01-data):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor
>> > [2020-10-26 20:16:46.200603] I [master(worker /DATA/master01-data):1645:register] _GMaster: Working dir [{path=/var/lib/misc/gluster/gsyncd/DATA_gluster03_DATA-SLAVE/DATA-master01-data}]
>> > [2020-10-26 20:16:46.201798] I [resource(worker /DATA/master01-data):1292:service_loop] GLUSTER: Register time [{time=1603743406}]
>> > [2020-10-26 20:16:46.226415] I [gsyncdstatus(worker /DATA/master01-data):281:set_active] GeorepStatus: Worker Status Change [{status=Active}]
>> > [2020-10-26 20:16:46.395112] I [gsyncdstatus(worker /DATA/master01-data):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change [{status=History Crawl}]
>> > [2020-10-26 20:16:46.396491] I [master(worker /DATA/master01-data):1559:crawl] _GMaster: starting history crawl [{turns=1}, {stime=(1603742506, 0)},{etime=1603743406}, {entry_stime=(1603743226, 0)}]
>> > [2020-10-26 20:16:46.399292] E [resource(worker /DATA/master01-data):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Sucesso}]
>> > [2020-10-26 20:16:47.177205] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/DATA/master01-data}]
>> > [2020-10-26 20:16:47.184525] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}]
>> >
>> > Any advice will be welcome.
>> >
>> > Thanks
>> >
>> > ---
>> > Gilberto Nunes Ferreira
>> >
>> >
>> >
>> >
>> >
>> >
>> ________
>>
>>
>>
>> Community Meeting Calendar:
>>
>> Schedule -
>> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>> Bridge: https://bluejeans.com/441850968
>>
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>
>