[Bugs] [Bug 1655333] New: OSError: [Errno 116] Stale file handle due to rotated files
bugzilla at redhat.com
Sun Dec 2 21:21:26 UTC 2018
https://bugzilla.redhat.com/show_bug.cgi?id=1655333
Bug ID: 1655333
Summary: OSError: [Errno 116] Stale file handle due to rotated files
Product: GlusterFS
Version: 4.1
Component: geo-replication
Assignee: bugs at gluster.org
Reporter: mrxlazuardin at gmail.com
CC: bugs at gluster.org
Description of problem:
The geo-rep worker goes faulty on some bricks (not all bricks) when a file is
rotated inside the GlusterFS mount.
Version-Release number of selected component (if applicable):
4.1.5 on CentOS 7.5 (I have not tested other versions or operating systems)
How reproducible:
Always
Steps to Reproduce:
1. Mount a geo-replicated volume from the Master node
2. Create a file (such as a log file) on the mount
3. Rotate that file
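Step 3 does not say which rotation scheme was used. The slave log below shows
a call with glue_app_debug_log.log.82 and glue_app_debug_log.log.83 as
arguments, which suggests numbered-suffix rotation by rename. A minimal sketch
of that kind of rotation, under that assumption (the function name, the `keep`
parameter and the paths are hypothetical, not taken from the report):

```python
import os


def rotate(path, keep=5):
    """Rename-based rotation: path.N -> path.N+1, then path -> path.1.

    Hypothetical helper for reproducing the report; the reporter's actual
    rotation tool and settings are not known.
    """
    # Drop the oldest generation so the shift below never overwrites a file.
    oldest = f"{path}.{keep}"
    if os.path.exists(oldest):
        os.remove(oldest)

    # Shift existing generations upward, highest number first.
    for n in range(keep - 1, 0, -1):
        src = f"{path}.{n}"
        if os.path.exists(src):
            os.rename(src, f"{path}.{n + 1}")

    # Move the live file to generation 1 and recreate an empty live file.
    if os.path.exists(path):
        os.rename(path, f"{path}.1")
    open(path, "a").close()
```

Calling rotate() repeatedly on a file inside the mounted volume produces the
same chain of renames (.1 -> .2, .2 -> .3, ...) that geo-replication then has
to replay on the slave.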
Actual results:
Geo-rep worker goes faulty on some bricks (not all bricks)
gsyncd.log on Master
--------------------
[2018-12-01 20:39:49.653356] E [repce(worker /mnt/BRICK3):197:__call__]
RepceClient: call failed call=25197:139717822179136:1543696787.31
method=entry_ops error=OSError
[2018-12-01 20:39:49.653767] E [syncdutils(worker
/mnt/BRICK3):332:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 311, in main
func(args)
File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 72, in
subcmd_worker
local.service_loop(remote)
File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1295, in
service_loop
g3.crawlwrap(oneshot=True)
File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 615, in
crawlwrap
self.crawl()
File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1545, in
crawl
self.changelogs_batch_process(changes)
File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1445, in
changelogs_batch_process
self.process(batch)
File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1280, in
process
self.process_change(change, done, retry)
File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1179, in
process_change
failures = self.slave.server.entry_ops(entries)
File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 216, in
__call__
return self.ins(self.meth, *a)
File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 198, in
__call__
raise res
OSError: [Errno 116] Stale file handle
gsyncd.log on Slave
-------------------
[2018-12-01 20:59:52.571860] W [syncdutils(slave
gluster-eadmin-data.vm/mnt/BRICK3):552:errno_wrap] <top>: reached maximum
retries
args=['.gfid/86ba8c38-5ab0-417e-9130-64dd2d7cf4aa/glue_app_debug_log.log.82',
'.gfid/86ba8c38-5ab0-417e-9130-64dd2d7cf4aa/glue_app_debug_log.log.83']
error=[Errno 116] Stale file handle
[2018-12-01 20:59:52.572635] E [repce(slave
gluster-eadmin-data.vm/mnt/BRICK3):105:worker] <top>: call failed:
Traceback (most recent call last):
File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 101, in worker
res = getattr(self.obj, rmeth)(*in_data[2:])
File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 675, in
entry_ops
uid, gid)
File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 526, in
rename_with_disk_gfid_confirmation
[ENOENT, EEXIST], [ESTALE, EBUSY])
File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 540, in
errno_wrap
return call(*arg)
OSError: [Errno 116] Stale file handle
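
The slave traceback passes two errno lists to errno_wrap, [ENOENT, EEXIST] and
[ESTALE, EBUSY], and the warning above says "reached maximum retries". One
reading is that the first list is tolerated and the second is retried a
bounded number of times before the error is raised. The sketch below
illustrates that pattern only; it is not the gsyncd implementation, and the
function name, parameters and retry count are assumptions:

```python
import errno
import time


def call_with_errno_policy(func, args=(), ignore=(), retry=(),
                           max_retries=3, delay=0.0):
    """Call func(*args) with a per-errno policy (illustrative only).

    - errno in `ignore`: swallow the error and return the errno value
    - errno in `retry`: try again, up to `max_retries` attempts in total
    - anything else: re-raise immediately
    """
    attempts = 0
    while True:
        try:
            return func(*args)
        except OSError as exc:
            if exc.errno in ignore:
                return exc.errno
            if exc.errno not in retry:
                raise
            attempts += 1
            if attempts >= max_retries:
                # Retries exhausted: surface the error to the caller,
                # which is the point where the worker would fail.
                raise
            time.sleep(delay)
```

Under this reading, a rename that keeps returning ESTALE exhausts the retries
and the OSError propagates back to the master over RepCe, which matches the
"call failed ... error=OSError" line in the master log.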
Expected results:
Geo-rep worker stays in a normal (non-faulty) state after file rotation
Additional info:
The errors go away if I move the rotated files (glue_app_debug_log.log.82 and
glue_app_debug_log.log.83 in the log above) from the Gluster mount to a
temporary location and then move them back to their original location on the
Gluster mount.
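
The workaround described above could be scripted roughly as follows. This is
a sketch only: the function name and the example paths are hypothetical, and
it assumes the staging directory is outside the Gluster mount, so each move
is a copy-and-delete across filesystems rather than a rename.

```python
import os
import shutil


def bounce(path, staging_dir):
    """Move a file to staging_dir and back to its original location.

    Hypothetical helper mirroring the manual workaround in the report.
    """
    staged = os.path.join(staging_dir, os.path.basename(path))
    shutil.move(path, staged)   # off the Gluster mount
    shutil.move(staged, path)   # back to the original place
    return path


# Example (paths are placeholders, not from the report):
# for name in ("glue_app_debug_log.log.82", "glue_app_debug_log.log.83"):
#     bounce(os.path.join("/mnt/gluster/logs", name), "/tmp")
```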
--
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.