[Gluster-users] Upgrade to 4.1.1 geo-replication does not work

Kotresh Hiremath Ravishankar khiremat at redhat.com
Thu Jul 12 05:41:36 UTC 2018


Hi Marcus,

I think the fix [1] is needed in 4.1.
Could you please try this out and let us know if that works for you?

[1] https://review.gluster.org/#/c/20207/
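
For context, the "invalid choice" error in the log quoted below is what
Python's argparse prints when a positional value is not one of the known
subcommands. The ssh command in the log passes old-style arguments
(--session-owner <uuid> ...), and the gsyncd.py on the other end appears to
expect a subcommand first. A minimal sketch of that failure mode follows. It
is not the real gsyncd.py code: build_parser and exit_code_for are
hypothetical helpers, and only the subcommand names are taken from the usage
line in the log.

```python
import argparse

# Subcommand names copied from the usage line in the quoted log.
SUBCOMMANDS = [
    "monitor-status", "monitor", "worker", "agent", "slave", "status",
    "config-check", "config-get", "config-set", "config-reset",
    "voluuidget", "delete",
]


def build_parser():
    """Build a stand-in parser (hypothetical, not gsyncd's actual parser)."""
    parser = argparse.ArgumentParser(prog="gsyncd.py")
    subparsers = parser.add_subparsers(dest="subcmd")
    for name in SUBCOMMANDS:
        subparsers.add_parser(name)
    return parser


def exit_code_for(argv):
    """Return 0 if argv parses, otherwise the exit code argparse exits with."""
    try:
        build_parser().parse_args(argv)
    except SystemExit as exc:
        return exc.code
    return 0


# Old-style arguments, shaped like the failing ssh command in the log.
old_style = [
    "--session-owner", "5e94eb7d-219f-4741-a179-d4ae6b50c7ee",
    "--local-node", "urd-gds-001", "-N", "--listen", "--timeout", "120",
    "gluster://localhost:urd-gds-volume",
]

if __name__ == "__main__":
    # The unknown option is skipped and the UUID is read as the subcommand.
    print(exit_code_for(old_style))
    print(exit_code_for(["status"]))
```

In this sketch the old-style call exits with a usage error, in line with the
error=2 reported in the log, while a call that starts with a known
subcommand parses cleanly.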

Thanks,
Kotresh HR

On Thu, Jul 12, 2018 at 1:49 AM, Marcus Pedersén <marcus.pedersen at slu.se>
wrote:

> Hi all,
>
> I have upgraded from 3.12.9 to 4.1.1, following the upgrade
> instructions for an offline upgrade.
>
> I upgraded the geo-replication side first, 1 x (2+1), and the master side
> after that, 2 x (2+1).
>
> Both clusters work the way they should on their own.
>
> After the upgrade on the master side, the status for all geo-replication
> nodes is Stopped.
>
> I tried to start geo-replication from a master node, and the response
> was that it started successfully.
>
> I checked the status again: still Stopped.
>
> I tried to start it again and got the response that it started
> successfully; after that, glusterd crashed on all master nodes.
>
> After restarting glusterd on all of them, the master cluster was up again.
>
> The status for geo-replication is still Stopped, and every attempt to
> start it after this reports success, but the status remains Stopped.
>
>
> Please help me get the geo-replication up and running again.
>
>
> Best regards
>
> Marcus Pedersén
>
>
> Part of geo-replication log from master node:
>
> [2018-07-11 18:42:48.941760] I [changelogagent(/urd-gds/gluster):73:__init__]
> ChangelogAgent: Agent listining...
> [2018-07-11 18:42:48.947567] I [resource(/urd-gds/gluster):1780:connect_remote]
> SSH: Initializing SSH connection between master and slave...
> [2018-07-11 18:42:49.363514] E [syncdutils(/urd-gds/gluster):304:log_raise_exception]
> <top>: connection to peer is broken
> [2018-07-11 18:42:49.364279] E [resource(/urd-gds/gluster):210:errlog]
> Popen: command returned error    cmd=ssh -oPasswordAuthentication=no
> -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem
> -p 22 -oControlMaster=auto
> -S /tmp/gsyncd-aux-ssh-hjRhBo/7e5534547f3675a710a107722317484f.sock
> geouser@urd-gds-geo-000 /nonexistent/gsyncd
> --session-owner 5e94eb7d-219f-4741-a179-d4ae6b50c7ee
> --local-id .%2Furd-gds%2Fgluster --local-node urd-gds-001 -N --listen
> --timeout 120 gluster://localhost:urd-gds-volume   error=2
> [2018-07-11 18:42:49.364586] E [resource(/urd-gds/gluster):214:logerr]
> Popen: ssh> usage: gsyncd.py [-h]
> [2018-07-11 18:42:49.364799] E [resource(/urd-gds/gluster):214:logerr]
> Popen: ssh>
> [2018-07-11 18:42:49.364989] E [resource(/urd-gds/gluster):214:logerr]
> Popen: ssh>                  {monitor-status,monitor,worker,agent,slave,
> status,config-check,config-get,config-set,config-reset,voluuidget,delete}
> [2018-07-11 18:42:49.365210] E [resource(/urd-gds/gluster):214:logerr]
> Popen: ssh>                  ...
> [2018-07-11 18:42:49.365408] E [resource(/urd-gds/gluster):214:logerr]
> Popen: ssh> gsyncd.py: error: argument subcmd: invalid choice:
> '5e94eb7d-219f-4741-a179-d4ae6b50c7ee' (choose from 'monitor-status',
> 'monitor', 'worker', 'agent', 'slave', 'status', 'config-check',
> 'config-get', 'config-set', 'config-reset', 'voluuidget', 'delete')
> [2018-07-11 18:42:49.365919] I [syncdutils(/urd-gds/gluster):271:finalize]
> <top>: exiting.
> [2018-07-11 18:42:49.369316] I [repce(/urd-gds/gluster):92:service_loop]
> RepceServer: terminating on reaching EOF.
> [2018-07-11 18:42:49.369921] I [syncdutils(/urd-gds/gluster):271:finalize]
> <top>: exiting.
> [2018-07-11 18:42:49.369694] I [monitor(monitor):353:monitor] Monitor:
> worker died before establishing connection       brick=/urd-gds/gluster
> [2018-07-11 18:42:59.492762] I [monitor(monitor):280:monitor] Monitor:
> starting gsyncd worker   brick=/urd-gds/gluster
> slave_node=ssh://geouser@urd-gds-geo-000:gluster://localhost:urd-gds-volume
> [2018-07-11 18:42:59.558491] I [resource(/urd-gds/gluster):1780:connect_remote]
> SSH: Initializing SSH connection between master and slave...
> [2018-07-11 18:42:59.559056] I [changelogagent(/urd-gds/gluster):73:__init__]
> ChangelogAgent: Agent listining...
> [2018-07-11 18:42:59.945693] E [syncdutils(/urd-gds/gluster):304:log_raise_exception]
> <top>: connection to peer is broken
> [2018-07-11 18:42:59.946439] E [resource(/urd-gds/gluster):210:errlog]
> Popen: command returned error    cmd=ssh -oPasswordAuthentication=no
> -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem
> -p 22 -oControlMaster=auto
> -S /tmp/gsyncd-aux-ssh-992bk7/7e5534547f3675a710a107722317484f.sock
> geouser@urd-gds-geo-000 /nonexistent/gsyncd
> --session-owner 5e94eb7d-219f-4741-a179-d4ae6b50c7ee
> --local-id .%2Furd-gds%2Fgluster --local-node urd-gds-001 -N --listen
> --timeout 120 gluster://localhost:urd-gds-volume   error=2
> [2018-07-11 18:42:59.946748] E [resource(/urd-gds/gluster):214:logerr]
> Popen: ssh> usage: gsyncd.py [-h]
> [2018-07-11 18:42:59.946962] E [resource(/urd-gds/gluster):214:logerr]
> Popen: ssh>
> [2018-07-11 18:42:59.947150] E [resource(/urd-gds/gluster):214:logerr]
> Popen: ssh>                  {monitor-status,monitor,worker,agent,slave,
> status,config-check,config-get,config-set,config-reset,voluuidget,delete}
> [2018-07-11 18:42:59.947369] E [resource(/urd-gds/gluster):214:logerr]
> Popen: ssh>                  ...
> [2018-07-11 18:42:59.947552] E [resource(/urd-gds/gluster):214:logerr]
> Popen: ssh> gsyncd.py: error: argument subcmd: invalid choice:
> '5e94eb7d-219f-4741-a179-d4ae6b50c7ee' (choose from 'monitor-status',
> 'monitor', 'worker', 'agent', 'slave', 'status', 'config-check',
> 'config-get', 'config-set', 'config-reset', 'voluuidget', 'delete')
> [2018-07-11 18:42:59.948046] I [syncdutils(/urd-gds/gluster):271:finalize]
> <top>: exiting.
> [2018-07-11 18:42:59.951392] I [repce(/urd-gds/gluster):92:service_loop]
> RepceServer: terminating on reaching EOF.
> [2018-07-11 18:42:59.951760] I [syncdutils(/urd-gds/gluster):271:finalize]
> <top>: exiting.
> [2018-07-11 18:42:59.951817] I [monitor(monitor):353:monitor] Monitor:
> worker died before establishing connection       brick=/urd-gds/gluster
> [2018-07-11 18:43:10.54580] I [monitor(monitor):280:monitor] Monitor:
> starting gsyncd worker    brick=/urd-gds/gluster
> slave_node=ssh://geouser@urd-gds-geo-000:gluster://localhost:urd-gds-volume
> [2018-07-11 18:43:10.88356] I [monitor(monitor):345:monitor] Monitor:
> Changelog Agent died, Aborting Worker     brick=/urd-gds/gluster
> [2018-07-11 18:43:10.88613] I [monitor(monitor):353:monitor] Monitor:
> worker died before establishing connection        brick=/urd-gds/gluster
> [2018-07-11 18:43:20.112435] I [gsyncdstatus(monitor):242:set_worker_status]
> GeorepStatus: Worker Status Change status=inconsistent
> [2018-07-11 18:43:20.112885] E [syncdutils(monitor):331:log_raise_exception]
> <top>: FAIL:
> Traceback (most recent call last):
>   File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 361, in twrap
>     except:
>   File "/usr/libexec/glusterfs/python/syncdaemon/monitor.py", line 428, in wmon
>     sys.exit()
> TypeError: 'int' object is not iterable
> [2018-07-11 18:43:20.114610] I [syncdutils(monitor):271:finalize] <top>:
> exiting.
>



-- 
Thanks and Regards,
Kotresh H R