[Bugs] [Bug 1644322] New: flooding log with "glusterfs-fuse: read from /dev/fuse returned -1 (Operation not permitted)"

bugzilla at redhat.com bugzilla at redhat.com
Tue Oct 30 14:06:43 UTC 2018


https://bugzilla.redhat.com/show_bug.cgi?id=1644322

            Bug ID: 1644322
           Summary: flooding log with "glusterfs-fuse: read from /dev/fuse
                    returned -1 (Operation not permitted)"
           Product: GlusterFS
           Version: 4.1
         Component: geo-replication
          Assignee: bugs at gluster.org
          Reporter: lohmaier+rhbz at gmail.com
                CC: bugs at gluster.org



Description of problem:
From time to time gluster runs amok and floods the geo-replication logs with
"W [fuse-bridge.c:5098:fuse_thread_proc] 0-glusterfs-fuse: read from /dev/fuse
returned -1 (Operation not permitted)"

at a rate as fast as the disk can keep up with, filling /var with tens of
gigabytes of the same log message, until either /var is full or gluster
recovers on its own for no apparent reason.

I have seen this on both the master and the geo-replication slave, but it is
more frequent on the master (which tends to have more load).

How reproducible:

No clear recipe, but it seems to be related to system load.

Sample log lines from when the flood starts:

[2018-10-30 13:42:35.336146] W [rpc-clnt.c:1753:rpc_clnt_submit]
0-backup1-client-0: error returned while attempting to connect to host:(null),
port:0
[2018-10-30 13:42:35.336533] W [dict.c:923:str_to_data]
(-->/usr/lib/x86_64-linux-gnu/glusterfs/4.1.5/xlator/protocol/client.so(+0x3d6a3)
[0x7f1bc82336a3]
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_set_str+0x16)
[0x7f1bce3d16e6]
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(str_to_data+0x8a)
[0x7f1bce3ce67a] ) 0-dict: value is NULL [Invalid argument]
[2018-10-30 13:42:35.336560] I [MSGID: 114006]
[client-handshake.c:1308:client_setvolume] 0-backup1-client-0: failed to set
process-name in handshake msg
[2018-10-30 13:42:35.336594] W [rpc-clnt.c:1753:rpc_clnt_submit]
0-backup1-client-0: error returned while attempting to connect to host:(null),
port:0
[2018-10-30 13:42:35.337190] I [MSGID: 114046]
[client-handshake.c:1176:client_setvolume_cbk] 0-backup1-client-0: Connected to
backup1-client-0, attached to remote volume '/srv/backup/brick1'.
[2018-10-30 13:42:35.338506] W [fuse-bridge.c:5098:fuse_thread_proc]
0-glusterfs-fuse: read from /dev/fuse returned -1 (Operation not permitted)
[2018-10-30 13:42:35.338531] W [fuse-bridge.c:5098:fuse_thread_proc]
0-glusterfs-fuse: read from /dev/fuse returned -1 (Operation not permitted)
[2018-10-30 13:42:35.338539] W [fuse-bridge.c:5098:fuse_thread_proc]
0-glusterfs-fuse: read from /dev/fuse returned -1 (Operation not permitted)
[2018-10-30 13:42:35.338547] W [fuse-bridge.c:5098:fuse_thread_proc]
0-glusterfs-fuse: read from /dev/fuse returned -1 (Operation not permitted)

This continues for gigabytes until /var is filled up, at which point follow-up
errors appear, since no further changelogs etc. can be created in /var with no
free disk space.
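
For triage, a flooded log like the one above can be summarized by collapsing
runs of identical messages. The following is only a rough sketch, not part of
gluster: the helper name collapse_repeats and the timestamp pattern are
assumptions based on the sample lines in this report, and it expects one
message per line as in the actual log file (not wrapped as in this mail).

```python
import re
from itertools import groupby

# Assumed timestamp prefix, modeled on the sample lines in this report,
# e.g. "[2018-10-30 13:42:35.338506] ".
TIMESTAMP = re.compile(r"^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+\] ")


def collapse_repeats(lines):
    """Yield (count, message) for each run of consecutive identical messages.

    The timestamp prefix is stripped before comparing, so messages that
    differ only in their timestamp count as repeats.
    """
    stripped = (TIMESTAMP.sub("", line.rstrip("\n")) for line in lines)
    for message, run in groupby(stripped):
        yield sum(1 for _ in run), message
```

Feeding it an open log file (for example "for count, msg in
collapse_repeats(open(path))") prints one entry per run instead of one per
line, which makes the start of the flood easy to spot.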

The logging desperately needs some rate limiting, as filling up /var is
effectively a DoS attack on geo-replication: all sessions go to faulty mode and
geo-replication falls behind until /var is cleaned up and the sessions can be
resumed, likely having caused some inconsistencies in between, since it couldn't
properly write changelogs in the meantime.
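
To illustrate the kind of rate limiting meant here, below is a minimal sketch
in Python. It is not gluster's logging code and does not reflect how
fuse-bridge.c logs; the class name RateLimitedLogger and its parameters
(interval, burst, clock) are made up for this example. The idea is to let a
small burst of identical messages through per time window and replace the rest
with a single "suppressed N repeats" line.

```python
import time


class RateLimitedLogger:
    """Emit at most `burst` copies of a message per `interval` seconds.

    Hypothetical helper for illustration; `emit` is any callable that
    takes the final log string (e.g. a file's write method or print).
    """

    def __init__(self, emit, interval=5.0, burst=10, clock=time.monotonic):
        self.emit = emit
        self.interval = interval
        self.burst = burst
        self.clock = clock
        # message -> [window_start, emitted_in_window, suppressed_in_window]
        self._state = {}

    def log(self, message):
        now = self.clock()
        state = self._state.get(message)
        if state is None or now - state[0] >= self.interval:
            # A new window starts; report what the previous one dropped.
            if state is not None and state[2] > 0:
                self.emit(f"{message} (suppressed {state[2]} repeats)")
            state = [now, 0, 0]
            self._state[message] = state
        if state[1] < self.burst:
            state[1] += 1
            self.emit(message)
        else:
            state[2] += 1
```

With something like this in the logging path, a tight loop hitting the same
error would write a bounded number of lines per interval instead of filling
/var. Note that the suppressed count is only written when the same message is
logged again in a later window.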


