[Bugs] [Bug 1364422] [libgfchangelog]: If changelogs are not available for the requested time range, no distinguished error
bugzilla at redhat.com
bugzilla at redhat.com
Fri Mar 10 11:08:04 UTC 2017
https://bugzilla.redhat.com/show_bug.cgi?id=1364422
Rahul Hinduja <rhinduja at redhat.com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ON_QA |VERIFIED
--- Comment #16 from Rahul Hinduja <rhinduja at redhat.com> ---
Thanks to kotresh in helping this race condition verification. With 3.1.3 it
crashes and remains faulty. With 3.2.0 it goes to hybrid and continuous. Moving
this bug to verified state.
On 3.1.3
Set stime to less than start time available as follows.
[root at dhcp42-195 ~]# getfattr -d -m . -e hex /rhs/brick3/b3/
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick3/b3/
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.d820a210-d7a7-4b6c-807f-be7e5d5fbad6.5c513245-edca-436e-a98f-9367f1105a6a.stime=0x59d24c220008dc36
trusted.glusterfs.d820a210-d7a7-4b6c-807f-be7e5d5fbad6.xtime=0x58c24c2c000acca9
trusted.glusterfs.dht=0x0000000100000000000000007ffffffe
trusted.glusterfs.volume-id=0xd820a210d7a74b6c807fbe7e5d5fbad6
[root at dhcp42-195 ~]# getfattr -d -m . -e hex /rhs/brick3/b3/
getfattr: Removing leading '/' from absolute path names
# file: rhs/brick3/b3/
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.d820a210-d7a7-4b6c-807f-be7e5d5fbad6.5c513245-edca-436e-a98f-9367f1105a6a.stime=0x59d24c220008dc36
trusted.glusterfs.d820a210-d7a7-4b6c-807f-be7e5d5fbad6.xtime=0x58c24c2c000acca9
trusted.glusterfs.dht=0x0000000100000000000000007ffffffe
trusted.glusterfs.volume-id=0xd820a210d7a74b6c807fbe7e5d5fbad6
[root at dhcp42-195 ~]# less
/rhs/brick3/b3/.glusterfs/changelogs/htime/HTIME.1489086377
1489086392 <<<< start time available
[root at dhcp42-195 ~]# python
Python 2.7.5 (default, Oct 11 2015, 17:47:16)
[GCC 4.8.3 20140911 (Red Hat 4.8.3-9)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import xattr
>>> import os
>>> xattr.set("/rhs/brick3/b3/", "trusted.glusterfs.d820a210-d7a7-4b6c-807f-be7e5d5fbad6.5c513245-edca-436e-a98f-9367f1105a6a.stime", struct.pack("!II", 1489086300, 0))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'struct' is not defined
>>> import struct
>>> xattr.set("/rhs/brick3/b3/", "trusted.glusterfs.d820a210-d7a7-4b6c-807f-be7e5d5fbad6.5c513245-edca-436e-a98f-9367f1105a6a.stime", struct.pack("!II", 1489086300, 0))
>>>
KeyboardInterrupt
>>> quit
Use quit() or Ctrl-D (i.e. EOF) to exit
>>>
[root at dhcp42-195 ~]# gluster volume geo-replication r 10.70.43.124::s status
MASTER NODE MASTER VOL MASTER BRICK SLAVE USER SLAVE
SLAVE NODE STATUS CRAWL STATUS LAST_SYNCED
------------------------------------------------------------------------------------------------------------------------------------------------
10.70.42.195 r /rhs/brick3/b3 root 10.70.43.124::s
N/A Faulty N/A N/A
10.70.42.195 r /rhs/brick2/b2 root 10.70.43.124::s
10.70.43.124 Active Changelog Crawl 2017-03-10 01:48:15
[root at dhcp42-195 ~]# gluster volume geo-replication r 10.70.43.124::s stop
Stopping geo-replication session between r & 10.70.43.124::s has been
successful
[root at dhcp42-195 ~]# gluster volume geo-replication r 10.70.43.124::s start
Starting geo-replication session between r & 10.70.43.124::s has been
successful
[root at dhcp42-195 ~]# gluster volume geo-replication r 10.70.43.124::s status
MASTER NODE MASTER VOL MASTER BRICK SLAVE USER SLAVE
SLAVE NODE STATUS CRAWL STATUS LAST_SYNCED
--------------------------------------------------------------------------------------------------------------------------------------------
10.70.42.195 r /rhs/brick3/b3 root 10.70.43.124::s
N/A Initializing... N/A N/A
10.70.42.195 r /rhs/brick2/b2 root 10.70.43.124::s
N/A Initializing... N/A N/A
[root at dhcp42-195 ~]# gluster volume geo-replication r 10.70.43.124::s status
MASTER NODE MASTER VOL MASTER BRICK SLAVE USER SLAVE
SLAVE NODE STATUS CRAWL STATUS LAST_SYNCED
----------------------------------------------------------------------------------------------------------------------------------------------
10.70.42.195 r /rhs/brick3/b3 root 10.70.43.124::s
10.70.43.124 Active History Crawl 2017-03-10 01:48:12
10.70.42.195 r /rhs/brick2/b2 root 10.70.43.124::s
10.70.43.124 Active History Crawl 2017-03-10 01:48:15
[root at dhcp42-195 ~]# gluster volume geo-replication r 10.70.43.124::s status
MASTER NODE MASTER VOL MASTER BRICK SLAVE USER SLAVE
SLAVE NODE STATUS CRAWL STATUS LAST_SYNCED
----------------------------------------------------------------------------------------------------------------------------------------------
10.70.42.195 r /rhs/brick3/b3 root 10.70.43.124::s
N/A Faulty N/A N/A
10.70.42.195 r /rhs/brick2/b2 root 10.70.43.124::s
10.70.43.124 Active History Crawl 2017-03-10 01:48:15
[root at dhcp42-195 ~]# gluster volume geo-replication r 10.70.43.124::s status
MASTER NODE MASTER VOL MASTER BRICK SLAVE USER SLAVE
SLAVE NODE STATUS CRAWL STATUS LAST_SYNCED
----------------------------------------------------------------------------------------------------------------------------------------------
10.70.42.195 r /rhs/brick3/b3 root 10.70.43.124::s
N/A Faulty N/A N/A
10.70.42.195 r /rhs/brick2/b2 root 10.70.43.124::s
10.70.43.124 Active History Crawl 2017-03-10 01:48:15
[root at dhcp42-195 ~]# less
/var/log/glusterfs/geo-replication/r/ssh%3A%2F%2Froot%4010.70.43.124%3Agluster%3A%2F%2F127.0.0.1%3As.log
[root at dhcp42-195 ~]#
It crashes with following traceback:
[2017-03-10 07:40:44.911781] I [monitor(monitor):343:monitor] Monitor:
worker(/rhs/brick3/b3) died in startup phase
[2017-03-10 07:40:55.161108] I [monitor(monitor):266:monitor] Monitor:
------------------------------------------------------------
[2017-03-10 07:40:55.161722] I [monitor(monitor):267:monitor] Monitor: starting
gsyncd worker
[2017-03-10 07:40:55.412802] I [gsyncd(/rhs/brick3/b3):710:main_i] <top>:
syncing: gluster://localhost:r -> ssh://root@10.70.43.124:gluster://localhost:s
[2017-03-10 07:40:55.417292] I [changelogagent(agent):73:__init__]
ChangelogAgent: Agent listining...
[2017-03-10 07:40:58.478969] I [master(/rhs/brick3/b3):83:gmaster_builder]
<top>: setting up xsync change detection mode
[2017-03-10 07:40:58.479608] I [master(/rhs/brick3/b3):367:__init__] _GMaster:
using 'rsync' as the sync engine
[2017-03-10 07:40:58.481811] I [master(/rhs/brick3/b3):83:gmaster_builder]
<top>: setting up changelog change detection mode
[2017-03-10 07:40:58.482242] I [master(/rhs/brick3/b3):367:__init__] _GMaster:
using 'rsync' as the sync engine
[2017-03-10 07:40:58.484389] I [master(/rhs/brick3/b3):83:gmaster_builder]
<top>: setting up changeloghistory change detection mode
[2017-03-10 07:40:58.484896] I [master(/rhs/brick3/b3):367:__init__] _GMaster:
using 'rsync' as the sync engine
[2017-03-10 07:41:00.517904] I [master(/rhs/brick3/b3):1249:register] _GMaster:
xsync temp directory:
/var/lib/misc/glusterfsd/r/ssh%3A%2F%2Froot%4010.70.43.124%3Agluster%3A%2F%2F127.0.0.1%3As/c4853a41d2b23f970bf0b4e046255cd5/xsync
[2017-03-10 07:41:00.518466] I [resource(/rhs/brick3/b3):1491:service_loop]
GLUSTER: Register time: 1489131660
[2017-03-10 07:41:00.547487] I [master(/rhs/brick3/b3):510:crawlwrap] _GMaster:
primary master with volume id d820a210-d7a7-4b6c-807f-be7e5d5fbad6 ...
[2017-03-10 07:41:00.553791] I [master(/rhs/brick3/b3):519:crawlwrap] _GMaster:
crawl interval: 1 seconds
[2017-03-10 07:41:00.562241] I [master(/rhs/brick3/b3):1163:crawl] _GMaster:
starting history crawl... turns: 1, stime: (1489086300, 0), etime: 1489131660
[2017-03-10 07:41:00.563826] E [repce(agent):117:worker] <top>: call failed:
Traceback (most recent call last):
File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 113, in worker
res = getattr(self.obj, rmeth)(*in_data[2:])
File "/usr/libexec/glusterfs/python/syncdaemon/changelogagent.py", line 54,
in history
num_parallel)
File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line 100,
in cl_history_changelog
cls.raise_changelog_err()
File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line 27,
in raise_changelog_err
raise ChangelogException(errn, os.strerror(errn))
ChangelogException: [Errno 2] No such file or directory
[2017-03-10 07:41:00.565355] E [repce(/rhs/brick3/b3):207:__call__]
RepceClient: call 11457:140177899616064:1489131660.56 (history) failed on peer
with ChangelogException
[2017-03-10 07:41:00.565733] E [resource(/rhs/brick3/b3):1506:service_loop]
GLUSTER: Changelog History Crawl failed, [Errno 2] No such file or directory
[2017-03-10 07:41:00.566288] I [syncdutils(/rhs/brick3/b3):220:finalize] <top>:
exiting.
[2017-03-10 07:41:00.571651] I [repce(agent):92:service_loop] RepceServer:
terminating on reaching EOF.
[2017-03-10 07:41:00.572334] I [syncdutils(agent):220:finalize] <top>: exiting.
[2017-03-10 07:41:01.483324] I [monitor(monitor):343:monitor] Monitor:
worker(/rhs/brick3/b3) died in startup phase
[2017-03-10 07:41:11.734217] I [monitor(monitor):266:monitor] Monitor:
------------------------------------------------------------
[2017-03-10 07:41:11.734749] I [monitor(monitor):267:monitor] Monitor: starting
gsyncd worker
[2017-03-10 07:41:11.922350] I [gsyncd(/rhs/brick3/b3):710:main_i] <top>:
syncing: gluster://localhost:r -> ssh://root@10.70.43.124:gluster://localhost:s
[2017-03-10 07:41:11.922379] I [changelogagent(agent):73:__init__]
ChangelogAgent: Agent listining...
[2017-03-10 07:41:14.980911] I [master(/rhs/brick3/b3):83:gmaster_builder]
<top>: setting up xsync change detection mode
[2017-03-10 07:41:14.981561] I [master(/rhs/brick3/b3):367:__init__] _GMaster:
using 'rsync' as the sync engine
[2017-03-10 07:41:14.983578] I [master(/rhs/brick3/b3):83:gmaster_builder]
<top>: setting up changelog change detection mode
[2017-03-10 07:41:14.983989] I [master(/rhs/brick3/b3):367:__init__] _GMaster:
using 'rsync' as the sync engine
[2017-03-10 07:41:14.985780] I [master(/rhs/brick3/b3):83:gmaster_builder]
<top>: setting up changeloghistory change detection mode
[2017-03-10 07:41:14.986200] I [master(/rhs/brick3/b3):367:__init__] _GMaster:
using 'rsync' as the sync engine
======================================================================================================================================================
On 3.2: 1489129736 is the start time available. Set it to 1489129600. It does
not crash it goes to hybrid crawl instead.
[root at dhcp42-7 ~]# gluster vol geo-rep ra 10.70.41.211::rs stop
Stopping geo-replication session between ra & 10.70.41.211::rs has been
successful
[root at dhcp42-7 ~]# python
Python 2.7.5 (default, Aug 2 2016, 04:20:16)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import xattr
>>> import os
>>> import struct
>>> xattr.set("/rhs/brick1/b1", "trusted.glusterfs.04979d53-79f1-47a9-9eb8-8b19c6765368.39db963b-d234-4409-8b06-bb5bcea75911.stime", struct.pack("!II", 1489129600, 0))
>>>
[root at dhcp42-7 ~]# gluster vol geo-rep ra 10.70.41.211::rs status
MASTER NODE MASTER VOL MASTER BRICK SLAVE USER SLAVE
SLAVE NODE STATUS CRAWL STATUS LAST_SYNCED
------------------------------------------------------------------------------------------------------------------------------------
10.70.42.7 ra /rhs/brick1/b1 root 10.70.41.211::rs
N/A Stopped N/A N/A
10.70.42.7 ra /rhs/brick2/b2 root 10.70.41.211::rs
N/A Stopped N/A N/A
[root at dhcp42-7 ~]# gluster vol geo-rep ra 10.70.41.211::rs start
Starting geo-replication session between ra & 10.70.41.211::rs has been
successful
[root at dhcp42-7 ~]# gluster vol geo-rep ra 10.70.41.211::rs status
^[[A
MASTER NODE MASTER VOL MASTER BRICK SLAVE USER SLAVE
SLAVE NODE STATUS CRAWL STATUS LAST_SYNCED
--------------------------------------------------------------------------------------------------------------------------------------------
10.70.42.7 ra /rhs/brick1/b1 root 10.70.41.211::rs
N/A Initializing... N/A N/A
10.70.42.7 ra /rhs/brick2/b2 root 10.70.41.211::rs
N/A Initializing... N/A N/A
[root at dhcp42-7 ~]# gluster vol geo-rep ra 10.70.41.211::rs status
[root at dhcp42-7 ~]#
[root at dhcp42-7 ~]#
[root at dhcp42-7 ~]# gluster vol geo-rep ra 10.70.41.211::rs status
MASTER NODE MASTER VOL MASTER BRICK SLAVE USER SLAVE
SLAVE NODE STATUS CRAWL STATUS LAST_SYNCED
----------------------------------------------------------------------------------------------------------------------------------------------
10.70.42.7 ra /rhs/brick1/b1 root 10.70.41.211::rs
10.70.41.211 Active Hybrid Crawl 2017-03-10 07:09:25
10.70.42.7 ra /rhs/brick2/b2 root 10.70.41.211::rs
10.70.41.211 Active History Crawl 2017-03-10 07:09:25
[root at dhcp42-7 ~]# gluster vol geo-rep ra 10.70.41.211::rs status
MASTER NODE MASTER VOL MASTER BRICK SLAVE USER SLAVE
SLAVE NODE STATUS CRAWL STATUS LAST_SYNCED
------------------------------------------------------------------------------------------------------------------------------------------------
10.70.42.7 ra /rhs/brick1/b1 root 10.70.41.211::rs
10.70.41.211 Active Changelog Crawl 2017-03-10 07:09:13
10.70.42.7 ra /rhs/brick2/b2 root 10.70.41.211::rs
10.70.41.211 Active History Crawl 2017-03-10 07:09:25
Does not crash:
[2017-03-10 07:44:56.887825] I [gsyncd(/rhs/brick1/b1):747:main_i] <top>:
syncing: gluster://localhost:ra ->
ssh://root@10.70.41.211:gluster://localhost:rs
[2017-03-10 07:45:08.826819] I [master(/rhs/brick2/b2):83:gmaster_builder]
<top>: setting up xsync change detection mode
[2017-03-10 07:45:08.827837] I [master(/rhs/brick2/b2):369:__init__] _GMaster:
using 'rsync' as the sync engine
[2017-03-10 07:45:08.827426] I [master(/rhs/brick1/b1):83:gmaster_builder]
<top>: setting up xsync change detection mode
[2017-03-10 07:45:08.829051] I [master(/rhs/brick1/b1):369:__init__] _GMaster:
using 'rsync' as the sync engine
[2017-03-10 07:45:08.831597] I [master(/rhs/brick2/b2):83:gmaster_builder]
<top>: setting up changelog change detection mode
[2017-03-10 07:45:08.832722] I [master(/rhs/brick2/b2):369:__init__] _GMaster:
using 'rsync' as the sync engine
[2017-03-10 07:45:08.836221] I [master(/rhs/brick1/b1):83:gmaster_builder]
<top>: setting up changelog change detection mode
[2017-03-10 07:45:08.837248] I [master(/rhs/brick1/b1):369:__init__] _GMaster:
using 'rsync' as the sync engine
[2017-03-10 07:45:08.838807] I [master(/rhs/brick2/b2):83:gmaster_builder]
<top>: setting up changeloghistory change detection mode
[2017-03-10 07:45:08.839443] I [master(/rhs/brick2/b2):369:__init__] _GMaster:
using 'rsync' as the sync engine
[2017-03-10 07:45:08.839672] I [master(/rhs/brick1/b1):83:gmaster_builder]
<top>: setting up changeloghistory change detection mode
[2017-03-10 07:45:08.840241] I [master(/rhs/brick1/b1):369:__init__] _GMaster:
using 'rsync' as the sync engine
[2017-03-10 07:45:10.901714] I [master(/rhs/brick2/b2):1253:register] _GMaster:
xsync temp directory:
/var/lib/misc/glusterfsd/ra/ssh%3A%2F%2Froot%4010.70.41.211%3Agluster%3A%2F%2F127.0.0.1%3Ars/764586b145d7206a154a778f64bd2f50/xsync
[2017-03-10 07:45:10.902595] I [resource(/rhs/brick2/b2):1533:service_loop]
GLUSTER: Register time: 1489131910
[2017-03-10 07:45:10.919552] I [master(/rhs/brick2/b2):512:crawlwrap] _GMaster:
primary master with volume id 04979d53-79f1-47a9-9eb8-8b19c6765368 ...
[2017-03-10 07:45:10.919812] I [master(/rhs/brick1/b1):1253:register] _GMaster:
xsync temp directory:
/var/lib/misc/glusterfsd/ra/ssh%3A%2F%2Froot%4010.70.41.211%3Agluster%3A%2F%2F127.0.0.1%3Ars/c19b89ac45352ab8c894d210d136dd56/xsync
[2017-03-10 07:45:10.920842] I [resource(/rhs/brick1/b1):1533:service_loop]
GLUSTER: Register time: 1489131910
[2017-03-10 07:45:10.930277] I [master(/rhs/brick2/b2):521:crawlwrap] _GMaster:
crawl interval: 1 seconds
[2017-03-10 07:45:10.930512] I [master(/rhs/brick1/b1):512:crawlwrap] _GMaster:
primary master with volume id 04979d53-79f1-47a9-9eb8-8b19c6765368 ...
[2017-03-10 07:45:10.955785] I [master(/rhs/brick1/b1):521:crawlwrap] _GMaster:
crawl interval: 1 seconds
[2017-03-10 07:45:10.962408] I [master(/rhs/brick2/b2):1167:crawl] _GMaster:
starting history crawl... turns: 1, stime: (1489129765, 0), etime: 1489131910
[2017-03-10 07:45:10.972481] I [master(/rhs/brick1/b1):1167:crawl] _GMaster:
starting history crawl... turns: 1, stime: (-1, 0), etime: 1489131910
[2017-03-10 07:45:10.972976] I [master(/rhs/brick1/b1):1170:crawl] _GMaster:
stime not available, abandoning history crawl
[2017-03-10 07:45:10.973434] I [resource(/rhs/brick1/b1):1548:service_loop]
GLUSTER: No stime available, using xsync crawl
[2017-03-10 07:45:10.977698] I [master(/rhs/brick1/b1):512:crawlwrap] _GMaster:
primary master with volume id 04979d53-79f1-47a9-9eb8-8b19c6765368 ...
[2017-03-10 07:45:10.984404] I [master(/rhs/brick1/b1):521:crawlwrap] _GMaster:
crawl interval: 60 seconds
[2017-03-10 07:45:10.992049] I [master(/rhs/brick1/b1):1275:crawl] _GMaster:
starting hybrid crawl..., stime: (-1, 0)
[2017-03-10 07:45:11.968451] I [master(/rhs/brick2/b2):1196:crawl] _GMaster:
slave's time: (1489129765, 0)
[2017-03-10 07:45:12.15741] I [master(/rhs/brick1/b1):1285:crawl] _GMaster:
processing xsync changelog
/var/lib/misc/glusterfsd/ra/ssh%3A%2F%2Froot%4010.70.41.211%3Agluster%3A%2F%2F127.0.0.1%3Ars/c19b89ac45352ab8c894d210d136dd56/xsync/XSYNC-CHANGELOG.1489131910
[2017-03-10 07:45:13.115169] I [master(/rhs/brick1/b1):1282:crawl] _GMaster:
finished hybrid crawl syncing, stime: (1489129753, 210362)
[2017-03-10 07:45:13.119362] I [master(/rhs/brick1/b1):512:crawlwrap] _GMaster:
primary master with volume id 04979d53-79f1-47a9-9eb8-8b19c6765368 ...
[2017-03-10 07:45:13.126692] I [master(/rhs/brick1/b1):521:crawlwrap] _GMaster:
crawl interval: 3 seconds
[2017-03-10 07:45:13.235455] I [master(/rhs/brick2/b2):1210:crawl] _GMaster:
finished history crawl syncing, endtime: 1489131896, stime: (1489129765, 0)
[2017-03-10 07:45:25.250497] I [master(/rhs/brick2/b2):1167:crawl] _GMaster:
starting history crawl... turns: 2, stime: (1489129765, 0), etime: 1489131925
[2017-03-10 07:45:25.256480] I [master(/rhs/brick2/b2):1210:crawl] _GMaster:
finished history crawl syncing, endtime: 1489131911, stime: (1489129765, 0)
[2017-03-10 07:45:25.261801] I [master(/rhs/brick2/b2):512:crawlwrap] _GMaster:
primary master with volume id 04979d53-79f1-47a9-9eb8-8b19c6765368 ...
--
You are receiving this mail because:
You are on the CC list for the bug.
Unsubscribe from this bug https://bugzilla.redhat.com/token.cgi?t=3O89N7ieFe&a=cc_unsubscribe
More information about the Bugs
mailing list