[Bugs] [Bug 1689981] New: OSError: [Errno 1] Operation not permitted - failing with socket files?

bugzilla at redhat.com bugzilla at redhat.com
Mon Mar 18 14:50:40 UTC 2019


https://bugzilla.redhat.com/show_bug.cgi?id=1689981

            Bug ID: 1689981
           Summary: OSError: [Errno 1] Operation not permitted - failing
                    with socket files?
           Product: GlusterFS
           Version: 4.1
          Hardware: x86_64
                OS: Linux
            Status: NEW
         Component: geo-replication
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: davobbi at gmail.com
                CC: bugs at gluster.org
  Target Milestone: ---
    Classification: Community



Description of problem:


Geo-replication during "History Crawl" starts failing on each of the three
bricks, one after the other. I have enabled DEBUG for all the logs configurable
by the geo-replication command.

Running glusterfs v4.16, the behaviour is as follows:
- The "History Crawl" worked fine for about one hour; it actually replicated
some files and folders, albeit most of them look empty
- at some point the worker becomes faulty, tries to start on another brick,
becomes faulty there too, and so on
- in the logs, the Python exception mentioned above is raised:
[2019-03-17 18:52:49.565040] E [syncdutils(worker /var/lib/heketi/mounts/vg_b088aec908c959c75674e01fb8598c21/brick_f90f425ecb89c3eec6ef2ef4a2f0a973/brick):332:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 311, in main
    func(args)
  File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 72, in subcmd_worker
    local.service_loop(remote)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1291, in service_loop
    g3.crawlwrap(oneshot=True)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 615, in crawlwrap
    self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1569, in crawl
    self.changelogs_batch_process(changes)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1469, in changelogs_batch_process
    self.process(batch)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1304, in process
    self.process_change(change, done, retry)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1203, in process_change
    failures = self.slave.server.entry_ops(entries)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 216, in __call__
    return self.ins(self.meth, *a)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 198, in __call__
    raise res
OSError: [Errno 1] Operation not permitted

- The operation before the exception:
[2019-03-17 18:52:49.545103] D [master(worker /var/lib/heketi/mounts/vg_b088aec908c959c75674e01fb8598c21/brick_f90f425ecb89c3eec6ef2ef4a2f0a973/brick):1186:process_change] _GMaster: entries: [{'uid': 7575, 'gfid': 'e1ad7c98-f32a-4e48-9902-cc75840de7c3', 'gid': 100, 'mode': 49536, 'entry': '.gfid/5219e4b8-a1f3-4a4e-b9c7-c9b129abe671/.control_f7c33270dc9db9234d005406a13deb4375459715.6lvofzOuVnfAwOwY', 'op': 'MKNOD'}, {'gfid': 'e1ad7c98-f32a-4e48-9902-cc75840de7c3', 'entry': '.gfid/5219e4b8-a1f3-4a4e-b9c7-c9b129abe671/.control_f7c33270dc9db9234d005406a13deb4375459715', 'stat': {'atime': 1552661403.3846507, 'gid': 100, 'mtime': 1552661403.3846507, 'uid': 7575, 'mode': 49536}, 'link': None, 'op': 'LINK'}, {'gfid': 'e1ad7c98-f32a-4e48-9902-cc75840de7c3', 'entry': '.gfid/5219e4b8-a1f3-4a4e-b9c7-c9b129abe671/.control_f7c33270dc9db9234d005406a13deb4375459715.6lvofzOuVnfAwOwY', 'op': 'UNLINK'}]
[2019-03-17 18:52:49.548614] D [repce(worker /var/lib/heketi/mounts/vg_b088aec908c959c75674e01fb8598c21/brick_f90f425ecb89c3eec6ef2ef4a2f0a973/brick):179:push] RepceClient: call 56917:140179359156032:1552848769.55 entry_ops([{'uid': 7575, 'gfid': 'e1ad7c98-f32a-4e48-9902-cc75840de7c3', 'gid': 100, 'mode': 49536, 'entry': '.gfid/5219e4b8-a1f3-4a4e-b9c7-c9b129abe671/.control_f7c33270dc9db9234d005406a13deb4375459715.6lvofzOuVnfAwOwY', 'op': 'MKNOD'}, {'gfid': 'e1ad7c98-f32a-4e48-9902-cc75840de7c3', 'entry': '.gfid/5219e4b8-a1f3-4a4e-b9c7-c9b129abe671/.control_f7c33270dc9db9234d005406a13deb4375459715', 'stat': {'atime': 1552661403.3846507, 'gid': 100, 'mtime': 1552661403.3846507, 'uid': 7575, 'mode': 49536}, 'link': None, 'op': 'LINK'}, {'gfid': 'e1ad7c98-f32a-4e48-9902-cc75840de7c3', 'entry': '.gfid/5219e4b8-a1f3-4a4e-b9c7-c9b129abe671/.control_f7c33270dc9db9234d005406a13deb4375459715.6lvofzOuVnfAwOwY', 'op': 'UNLINK'}],) ...

- The gfid in these entries points to control files which are "unix sockets",
as shown below:
srw-------  2 pippo users     0 Mar 14 16:32 .control_31c3a99664c1f956f949311e58434037e6a52d22
srw-------  2 pippo users     0 Mar 14 16:33 .control_a9b82937042529bca677b9f43eba9eb02ca7c5ee
srw-------  2 pippo users     0 Mar 14 16:32 .control_f429221460d52570066d9f25521011fe7e081cf5
srw-------  2 pippo users     0 Mar 15 15:50 .control_f7c33270dc9db9234d005406a13deb4375459715
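
The 'mode': 49536 value in the logged entries can be cross-checked against this
listing. A small check using only Python's standard stat module (nothing here
is gsyncd code) shows the value decodes to a socket with permissions 0600:

```python
import stat

# 'mode' value taken from the MKNOD entry in the debug log
mode = 49536

print(oct(mode))                # 0o140600
print(stat.S_ISSOCK(mode))      # True: the file type bits say "socket"
print(oct(stat.S_IMODE(mode)))  # 0o600: permission bits only
print(stat.filemode(mode))      # srw-------
```

So the MKNOD that precedes the exception is an attempt to create a socket
node on the slave side.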

So it seems geo-replication should at least skip such files rather than
raise an exception?
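
To make the request concrete, here is a rough sketch of the kind of filtering
I have in mind. It is not the gsyncd implementation and not a proposed patch:
the helper names entry_mode and drop_socket_entries are made up for
illustration, and the only assumption is the entry dict layout visible in the
debug log above ('gfid', 'op', and either 'mode' or 'stat'['mode']).

```python
import stat


def entry_mode(entry):
    """Return the file mode carried by an entry dict, or None if it has none."""
    if "mode" in entry:
        return entry["mode"]
    st = entry.get("stat")
    return st.get("mode") if st else None


def drop_socket_entries(entries):
    """Split out every entry whose gfid belongs to a unix socket.

    Hypothetical helper: returns (kept_entries, skipped_gfids).
    """
    socket_gfids = set()
    for entry in entries:
        mode = entry_mode(entry)
        if mode is not None and stat.S_ISSOCK(mode):
            socket_gfids.add(entry["gfid"])
    # Drop by gfid so that LINK/UNLINK ops on the same socket are skipped too,
    # even when they carry no mode of their own (as the UNLINK entry above).
    kept = [e for e in entries if e["gfid"] not in socket_gfids]
    return kept, socket_gfids
```

With the three entries from the log (MKNOD, LINK, UNLINK on the same gfid),
this would return an empty list of kept entries and one skipped gfid.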


Steps to Reproduce:
1. replicate unix socket files
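
One way to produce such a file for step 1 is sketched below, using only the
Python standard library. The demo writes into a temporary directory; to
reproduce the bug the path would have to point into a directory on the mounted
master volume instead. The function name make_unix_socket_file and the file
name .control_demo are placeholders, not anything from my setup.

```python
import os
import socket
import stat
import tempfile


def make_unix_socket_file(path):
    """Bind an AF_UNIX socket at `path`; the socket file stays after close()."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)
    finally:
        sock.close()
    return path


if __name__ == "__main__":
    # Placeholder location; replace with a directory on the master mount.
    demo_dir = tempfile.mkdtemp()
    path = make_unix_socket_file(os.path.join(demo_dir, ".control_demo"))
    # First character of the mode string is "s" for a socket.
    print(stat.filemode(os.lstat(path).st_mode), path)
```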

Actual results:
OSError exception

Expected results:
Files to be skipped and replication continues

Additional info:
