[Gluster-users] Was: Upgrade to 4.1.2 geo-replication does not work Now: Upgraded to 4.1.3 geo node Faulty

Kotresh Hiremath Ravishankar khiremat at redhat.com
Fri Aug 31 09:09:13 UTC 2018


Hi Marcus,

Could you attach the full logs? Is the same traceback happening repeatedly? It
would also be helpful if you attach the corresponding mount log.
What rsync version are you using?
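
If it helps, here is a quick sketch to collect those on the faulty master
node. It is only a sketch: the log directory below assumes the default
CentOS 7 / Gluster layout and takes the session name from your gsyncd.log,
so adjust both if your setup differs.

    # Sketch only: print the rsync version and list the geo-rep logs
    # (gsyncd.log plus the mnt-*.log mount logs) that would be useful
    # to attach. The paths are assumptions based on the log you pasted.
    import glob
    import subprocess

    # First line of `rsync --version`, e.g. "rsync  version 3.1.2 ..."
    out = subprocess.check_output(["rsync", "--version"])
    print(out.decode().splitlines()[0])

    logdir = ("/var/log/glusterfs/geo-replication/"
              "urd-gds-volume_urd-gds-geo-001_urd-gds-volume")
    for path in sorted(glob.glob(logdir + "/*.log")):
        print(path)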

Thanks,
Kotresh HR

On Fri, Aug 31, 2018 at 12:16 PM, Marcus Pedersén <marcus.pedersen at slu.se>
wrote:

> Hi all,
>
> I had problems with sync stopping after the upgrade to 4.1.2.
>
> I upgraded to 4.1.3 and it ran fine for one day, but now one of the master
> nodes shows faulty.
>
> Most of the sync jobs exit with return code 23; how do I resolve this? (See
> the note after the quoted log below.)
>
> I see messages like:
>
> _GMaster: Sucessfully fixed all entry ops with gfid mismatch
>
> Will this resolve error code 23?
>
> There is also a Python error.
>
> The Python error turned out to be an SELinux problem; turning off SELinux
> made the node go Active again.
>
> See log below.
>
>
> CentOS 7, Gluster installed through the CentOS Storage SIG (OS updated to
> latest at the same time)
>
> Master cluster: 2 x (2 + 1) distributed-replicated
>
> Client cluster: 1 x (2 + 1) replicated
>
>
> Many thanks in advance!
>
>
> Best regards
>
> Marcus Pedersén
>
>
>
> gsyncd.log from Faulty node:
>
> [2018-08-31 06:25:51.375267] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.8099 num_files=57    job=3
> return_code=23
> [2018-08-31 06:25:51.465895] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.0904 num_files=3     job=3
> return_code=23
> [2018-08-31 06:25:52.562107] E [repce(worker /urd-gds/gluster):197:__call__]
> RepceClient: call failed   call=30069:139655665837888:1535696752.35
> method=entry_ops        error=OSError
> [2018-08-31 06:25:52.562346] E [syncdutils(worker
> /urd-gds/gluster):332:log_raise_exception] <top>: FAIL:
> Traceback (most recent call last):
>   File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 311, in
> main
>     func(args)
>   File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 72, in
> subcmd_worker
>     local.service_loop(remote)
>   File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1288,
> in service_loop
>     g3.crawlwrap(oneshot=True)
>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 615, in
> crawlwrap
>     self.crawl()
>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1545,
> in crawl
>     self.changelogs_batch_process(changes)
>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1445,
> in changelogs_batch_process
>     self.process(batch)
>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1280,
> in process
>     self.process_change(change, done, retry)
>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1179,
> in process_change
>     failures = self.slave.server.entry_ops(entries)
>   File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 216, in
> __call__
>     return self.ins(self.meth, *a)
>   File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 198, in
> __call__
>     raise res
> OSError: [Errno 13] Permission denied
> [2018-08-31 06:25:52.578367] I [repce(agent /urd-gds/gluster):80:service_loop]
> RepceServer: terminating on reaching EOF.
> [2018-08-31 06:25:53.558765] I [monitor(monitor):279:monitor] Monitor:
> worker died in startup phase     brick=/urd-gds/gluster
> [2018-08-31 06:25:53.569777] I [gsyncdstatus(monitor):244:set_worker_status]
> GeorepStatus: Worker Status Change status=Faulty
> [2018-08-31 06:26:03.593161] I [monitor(monitor):158:monitor] Monitor:
> starting gsyncd worker   brick=/urd-gds/gluster  slave_node=urd-gds-geo-000
> [2018-08-31 06:26:03.636452] I [gsyncd(agent /urd-gds/gluster):297:main]
> <top>: Using session config file       path=/var/lib/glusterd/geo-
> replication/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/gsyncd.conf
> [2018-08-31 06:26:03.636810] I [gsyncd(worker /urd-gds/gluster):297:main]
> <top>: Using session config file      path=/var/lib/glusterd/geo-
> replication/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/gsyncd.conf
> [2018-08-31 06:26:03.637486] I [changelogagent(agent
> /urd-gds/gluster):72:__init__] ChangelogAgent: Agent listining...
> [2018-08-31 06:26:03.650330] I [resource(worker /urd-gds/gluster):1377:connect_remote]
> SSH: Initializing SSH connection between master and slave...
> [2018-08-31 06:26:05.296473] I [resource(worker /urd-gds/gluster):1424:connect_remote]
> SSH: SSH connection between master and slave established.
> duration=1.6457
> [2018-08-31 06:26:05.297904] I [resource(worker /urd-gds/gluster):1096:connect]
> GLUSTER: Mounting gluster volume locally...
> [2018-08-31 06:26:06.396939] I [resource(worker /urd-gds/gluster):1119:connect]
> GLUSTER: Mounted gluster volume duration=1.0985
> [2018-08-31 06:26:06.397691] I [subcmds(worker /urd-gds/gluster):70:subcmd_worker]
> <top>: Worker spawn successful. Acknowledging back to monitor
> [2018-08-31 06:26:16.815566] I [master(worker /urd-gds/gluster):1593:register]
> _GMaster: Working dir    path=/var/lib/misc/gluster/
> gsyncd/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/urd-gds-gluster
> [2018-08-31 06:26:16.816423] I [resource(worker /urd-gds/gluster):1282:service_loop]
> GLUSTER: Register time     time=1535696776
> [2018-08-31 06:26:16.888772] I [gsyncdstatus(worker
> /urd-gds/gluster):277:set_active] GeorepStatus: Worker Status
> Change        status=Active
> [2018-08-31 06:26:16.892049] I [gsyncdstatus(worker
> /urd-gds/gluster):249:set_worker_crawl_status] GeorepStatus: Crawl Status
> Change    status=History Crawl
> [2018-08-31 06:26:16.892703] I [master(worker
> /urd-gds/gluster):1507:crawl] _GMaster: starting history crawl    turns=1
> stime=(1525739167, 0)   entry_stime=(1525740143, 0)     etime=1535696776
> [2018-08-31 06:26:17.914803] I [master(worker
> /urd-gds/gluster):1536:crawl] _GMaster: slave's time
> stime=(1525739167, 0)
> [2018-08-31 06:26:18.521718] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.1063 num_files=17    job=3
> return_code=23
> [2018-08-31 06:26:19.260137] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.3441 num_files=34    job=1
> return_code=23
> [2018-08-31 06:26:19.615191] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.0923 num_files=7     job=3
> return_code=23
> [2018-08-31 06:26:19.891227] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.1302 num_files=12    job=1
> return_code=23
> [2018-08-31 06:26:19.922700] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.5024 num_files=50    job=2
> return_code=23
>
> [2018-08-31 06:26:21.639342] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=1.5233 num_files=5     job=3
> return_code=23
> [2018-08-31 06:26:22.12726] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken    duration=0.1191 num_files=7     job=1
> return_code=23
> [2018-08-31 06:26:22.86136] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken    duration=0.0731 num_files=4     job=1
> return_code=23
> [2018-08-31 06:26:22.503290] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.0779 num_files=15    job=2
> return_code=23
> [2018-08-31 06:26:23.214704] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.0738 num_files=9     job=3
> return_code=23
> [2018-08-31 06:26:23.251876] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.2478 num_files=33    job=2
> return_code=23
> [2018-08-31 06:26:23.802699] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.0873 num_files=9     job=3
> return_code=23
> [2018-08-31 06:26:23.828176] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.0758 num_files=3     job=2
> return_code=23
> [2018-08-31 06:26:23.854063] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.2662 num_files=34    job=1
> return_code=23
> [2018-08-31 06:26:24.403228] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.0997 num_files=30    job=3
> return_code=23
> [2018-08-31 06:26:25.526] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken      duration=0.0965 num_files=8     job=3
> return_code=23
> [2018-08-31 06:26:25.438527] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.0832 num_files=9     job=1
> return_code=23
> [2018-08-31 06:26:25.447256] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.6180 num_files=86    job=2
> return_code=23
> [2018-08-31 06:26:25.571913] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.0706 num_files=2     job=3
> return_code=0
> [2018-08-31 06:26:27.21325] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken    duration=0.0814 num_files=1     job=1
> return_code=23
> [2018-08-31 06:26:27.615520] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.0933 num_files=13    job=1
> return_code=23
> [2018-08-31 06:26:27.668323] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.2190 num_files=95    job=2
> return_code=23
> [2018-08-31 06:26:27.740139] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.0716 num_files=11    job=2
> return_code=23
> [2018-08-31 06:26:28.191068] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.1167 num_files=38    job=3
> return_code=23
> [2018-08-31 06:26:28.268213] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.0768 num_files=7     job=3
> return_code=23
> [2018-08-31 06:26:28.317909] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.0770 num_files=4     job=2
> return_code=23
> [2018-08-31 06:26:28.710064] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.0932 num_files=23    job=1
> return_code=23
> [2018-08-31 06:26:28.907250] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.0886 num_files=26    job=2
> return_code=23
> [2018-08-31 06:26:28.976679] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.0692 num_files=4     job=2
> return_code=23
> [2018-08-31 06:26:29.55774] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken    duration=0.0788 num_files=9     job=2
> return_code=23
> [2018-08-31 06:26:29.295576] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.0847 num_files=16    job=1
> return_code=23
> [2018-08-31 06:26:29.665076] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.1087 num_files=25    job=2
> return_code=23
> [2018-08-31 06:26:30.277998] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.1122 num_files=40    job=2
> return_code=23
> [2018-08-31 06:26:31.153105] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.3822 num_files=74    job=3
> return_code=23
> [2018-08-31 06:26:31.227639] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.0743 num_files=18    job=3
> return_code=23
> [2018-08-31 06:26:31.302660] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.0748 num_files=18    job=3
> return_code=23
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>
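
Regarding the return_code=23 entries and the OSError [Errno 13] traceback
above: rsync exit code 23 means "partial transfer due to error", i.e. some
of the files in that batch could not be transferred, and a permission
problem on the files being synced is one typical cause. Below is a minimal,
self-contained sketch of that behaviour, assuming rsync is installed and
the script is run as a non-root user; it is an illustration only, not
gsyncd code.

    # Make one source file unreadable and watch rsync return 23
    # ("partial transfer due to error") for the run as a whole.
    import os
    import subprocess
    import tempfile

    src = tempfile.mkdtemp()
    dst = tempfile.mkdtemp()

    with open(os.path.join(src, "ok.txt"), "w") as f:
        f.write("readable\n")

    blocked = os.path.join(src, "blocked.txt")
    with open(blocked, "w") as f:
        f.write("unreadable\n")
    os.chmod(blocked, 0)  # unreadable for a non-root user

    rc = subprocess.call(["rsync", "-a", src + "/", dst + "/"])
    print("rsync exit code: %d" % rc)  # expect 23 (0 if run as root)

Rather than leaving SELinux disabled, checking the audit log on the node
where the denial happened (for example with "ausearch -m avc") should show
which operation SELinux blocked, so it can be allowed specifically.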



-- 
Thanks and Regards,
Kotresh H R

