[Bugs] [Bug 1611114] New: [geo-rep]: [Errno 2] No such file or directory
bugzilla at redhat.com
Thu Aug 2 05:42:05 UTC 2018
https://bugzilla.redhat.com/show_bug.cgi?id=1611114
Bug ID: 1611114
Summary: [geo-rep]: [Errno 2] No such file or directory
Product: GlusterFS
Version: 4.1
Component: geo-replication
Severity: high
Assignee: bugs at gluster.org
Reporter: khiremat at redhat.com
CC: avishwan at redhat.com, bugs at gluster.org,
csaba at redhat.com, rallan at redhat.com,
rhinduja at redhat.com, rhs-bugs at redhat.com,
sankarshan at redhat.com, storage-qa-internal at redhat.com
Depends On: 1598384, 1598884
+++ This bug was initially created as a clone of Bug #1598884 +++
+++ This bug was initially created as a clone of Bug #1598384 +++
Description of problem:
=========================
Worker crashed with the following traceback while using geo-rep scheduler:
[2018-07-04 06:35:30.242285] E [syncdutils(/rhs/brick2/b4):348:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 210, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 803, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1568, in service_loop
    g3.crawlwrap(oneshot=True)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 597, in crawlwrap
    self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1470, in crawl
    self.changelogs_batch_process(changes)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1370, in changelogs_batch_process
    self.process(batch)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1204, in process
    self.process_change(change, done, retry)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1114, in process_change
    failures = self.slave.server.entry_ops(entries)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 228, in __call__
    return self.ins(self.meth, *a)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 210, in __call__
    raise res
OSError: [Errno 2] No such file or directory: '/rhs/brick1/b1/.glusterfs/2e/94/2e9400f3-61c5-4943-bc5d-26562fc7f47d'
[2018-07-04 06:35:30.294939] I [syncdutils(/rhs/brick2/b4):288:finalize] <top>: exiting.
Version-Release number of selected component (if applicable):
============================================================
mainline
How reproducible:
================
1/1
Steps to Reproduce:
===================
1. Have a geo-replication session up.
2. Create IO on the master.
3. Run the scheduler:
   python /usr/share/glusterfs/scripts/schedule_georep.py master 10.70.42.164 slave
The geo-rep scheduler does the following:
1. Stop geo-replication if it is started.
2. Start geo-replication.
3. Set a checkpoint.
4. Check the status in a loop until the checkpoint is complete.
5. Once the checkpoint is complete, stop geo-replication.
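The steps above can be sketched as follows. This is a minimal illustration,
not the code of schedule_georep.py. The names run_schedule, run and
checkpoint_done are hypothetical: run is assumed to execute an argument list
and return its output as text, and checkpoint_done is assumed to decide from
the status output whether the checkpoint has completed (the status output
format is deliberately not parsed here).

```python
import time


def run_schedule(run, checkpoint_done, master, slave,
                 poll_interval=10, max_polls=60, sleep=time.sleep):
    """Drive one stop/start/checkpoint/stop cycle for a geo-rep session.

    run             -- hypothetical callable: takes an argv list, returns output text
    checkpoint_done -- hypothetical callable: takes status output, returns bool
    slave           -- slave spec such as "10.70.42.164::slave"
    Returns True if the checkpoint completed within max_polls status checks.
    """
    base = ["gluster", "volume", "geo-replication", master, slave]

    # 1. Stop geo-replication if it is started.
    run(base + ["stop", "force"])
    # 2. Start geo-replication.
    run(base + ["start"])
    # 3. Set a checkpoint at the current time.
    run(base + ["config", "checkpoint", "now"])

    # 4. Poll the status until the checkpoint is reported complete.
    for _ in range(max_polls):
        if checkpoint_done(run(base + ["status", "detail"])):
            # 5. Checkpoint complete: stop geo-replication.
            run(base + ["stop"])
            return True
        sleep(poll_interval)
    return False
```

Because every cycle stops and restarts the session, the worker goes through
its start-up crawl on each run, which is the code path visible in the
traceback above (crawlwrap(oneshot=True)).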
Actual results:
===============
Worker crashed with "OSError: [Errno 2] No such file or directory"
Expected results:
=================
Worker should not crash
Traceback on the slave:
-----------------------
[2018-07-04 06:37:14.873959] E [repce(slave):117:worker] <top>: call failed:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 113, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 644, in entry_ops
    gfid[2:4], gfid))
OSError: [Errno 2] No such file or directory: '/rhs/brick1/b1/.glusterfs/e1/ab/e1ab496b-a1bb-4f1c-a543-a44ee1ee0272'
[2018-07-04 06:37:14.942406] I [repce(slave):92:service_loop] RepceServer: terminating on reaching EOF.
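The path in the slave traceback follows the layout
<brick>/.glusterfs/<first two characters of the gfid>/<next two>/<gfid>, as
the fragment "gfid[2:4], gfid))" and the reported path suggest. A small
sketch of that mapping, with gfid_backend_path as a hypothetical helper name
(it is not a function from resource.py):

```python
import posixpath


def gfid_backend_path(brick_root, gfid):
    """Build the .glusterfs backend path for a gfid under a brick root.

    Hypothetical helper for illustration; it mirrors the path seen in the
    traceback rather than reproducing the gsyncd implementation.
    """
    return posixpath.join(brick_root, ".glusterfs", gfid[0:2], gfid[2:4], gfid)


path = gfid_backend_path("/rhs/brick1/b1",
                         "e1ab496b-a1bb-4f1c-a543-a44ee1ee0272")
```

For the gfid in the slave log this yields the same path that the OSError
reports as missing.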
--- Additional comment from Worker Ant on 2018-07-06 14:23:10 EDT ---
REVIEW: https://review.gluster.org/20473 (geo-rep: Fix issues with gfid
conflict handling) posted (#1) for review on master by Kotresh HR
--- Additional comment from Worker Ant on 2018-07-20 12:23:45 EDT ---
COMMIT: https://review.gluster.org/20473 committed in master by "Kotresh HR"
<khiremat at redhat.com> with a commit message- geo-rep: Fix issues with gfid
conflict handling
1. MKDIR/RMDIR is recorded on all bricks. So if
one brick succeeds creating it, other bricks
should ignore it. But this was not happening.
The fix for rename of directories in hybrid crawl
was trying to rename the directory to itself
and in the process crashing with ENOENT if the
directory had been removed.
2. If file is created, deleted and a directory is
created with same name, it was failing to sync.
Again the issue is around the fix for rename
of directories in hybrid crawl. Fixed the same.
If the same case was done with hardlink present
for the file, it was failing. This patch fixes
that too.
fixes: bz#1598884
Change-Id: I6f3bca44e194e415a3d4de3b9d03cc8976439284
Signed-off-by: Kotresh HR <khiremat at redhat.com>
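The first point of the commit message can be illustrated with a short
sketch. This is not the patch from review 20473; it only shows the general
idea of skipping a rename onto the same path and treating ENOENT as "nothing
left to do" instead of letting it crash the worker. The function name
rename_ignoring_missing is hypothetical.

```python
import errno
import os


def rename_ignoring_missing(src, dst):
    """Rename src to dst; return True if a rename happened, False if skipped.

    Skips the call when src and dst are the same path, and swallows ENOENT
    (for example, the directory was already removed). Any other error is
    re-raised unchanged.
    """
    if src == dst:
        # Renaming a directory to itself is a no-op; do not touch the disk.
        return False
    try:
        os.rename(src, dst)
    except OSError as e:
        if e.errno == errno.ENOENT:
            # Source (or a parent of the destination) no longer exists.
            return False
        raise
    return True
```

A caller that gets False back can treat the entry as already handled and
continue with the next one.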
Referenced Bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1598384
[Bug 1598384] [geo-rep]: [Errno 2] No such file or directory
https://bugzilla.redhat.com/show_bug.cgi?id=1598884
[Bug 1598884] [geo-rep]: [Errno 2] No such file or directory
--
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.