[Bugs] [Bug 1772006] New: NULL dict messages flooding fuse mount log

bugzilla at redhat.com bugzilla at redhat.com
Wed Nov 13 12:21:25 UTC 2019


https://bugzilla.redhat.com/show_bug.cgi?id=1772006

            Bug ID: 1772006
           Summary: NULL dict messages flooding fuse mount log
           Product: GlusterFS
           Version: mainline
            Status: NEW
         Component: replicate
          Assignee: bugs at gluster.org
          Reporter: ravishankar at redhat.com
                CC: bugs at gluster.org
  Target Milestone: ---
    Classification: Community



Description of problem:
Commit "ccf33e789 - dict.c: remove redundant checks (6 days ago) <Yaniv Kaul>"
removed some NULL checks in certain dict functions. This caused flooding of
fuse mount logs when I/O was done on the mount:

Message:
W [dict.c:1478:dict_get_with_refn]
(-->/usr/local/lib/libglusterfs.so.0(dict_get_uint32+0x4d) [0x7ff9121ec963]
-->/usr/local/lib/libglusterfs.so.0(dict_get_with_ref+0x90) [0x7ff9121eb93f]
-->/usr/local/lib/libglusterfs.so.0(+0x229be) [0x7ff9121eb9be] ) 0-dict: dict
OR key (glusterfs.lk.lkmode) is NULL [Invalid argument]

Two observed code paths in AFR that caused these logs:
==============================================================
#0  dict_get_with_refn (this=0x0, key=0x7ff8ffa90e16 "glusterfs.lk.lkmode",
keylen=19, data=0x7ff900b49690) at dict.c:1478
#1  0x00007ff9121eb93f in dict_get_with_ref (this=0x0, key=0x7ff8ffa90e16
"glusterfs.lk.lkmode", data=0x7ff900b49690) at dict.c:1467
#2  0x00007ff9121ec963 in dict_get_uint32 (this=0x0, key=0x7ff8ffa90e16
"glusterfs.lk.lkmode", val=0x7ff900b496c8) at dict.c:1894
#3  0x00007ff8ffa631fd in afr_is_lock_mode_mandatory (xdata=0x0) at
afr-common.c:667
#4  0x00007ff8ffa6d0c0 in afr_discover_unwind (frame=0x7ff8ec00ee78,
this=0x7ff8f801caa0) at afr-common.c:3650
#5  0x00007ff8ffa6dddd in afr_discover_done (frame=0x7ff8ec00ee78,
this=0x7ff8f801caa0) at afr-common.c:3745
#6  0x00007ff8ffa6bd4b in afr_lookup_metadata_heal_check (frame=0x7ff8ec00ee78,
this=0x7ff8f801caa0) at afr-common.c:3400
#7  0x00007ff8ffa6e166 in afr_discover_cbk (frame=0x7ff8ec00ee78, cookie=0x2,
this=0x7ff8f801caa0, op_ret=0, op_errno=0, inode=0x7ff8f81b22f8,
buf=0x7ff900b499b0, xdata=0x7ff8f0024798, postparent=0x7ff900b49910) at
afr-common.c:3786
#8  0x00007ff8ffb1dfae in client4_0_lookup_cbk (req=0x7ff8ec01ef08,
iov=0x7ff8ec01ef38, count=1, myframe=0x7ff8ec006078) at
client-rpc-fops_v2.c:2631
#9  0x00007ff91219769a in rpc_clnt_handle_reply (clnt=0x7ff8f8069740,
pollin=0x7ff8f0011fd0) at rpc-clnt.c:768
#10 0x00007ff912197bcb in rpc_clnt_notify (trans=0x7ff8f8069a40,
mydata=0x7ff8f8069770, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7ff8f0011fd0)
at rpc-clnt.c:935
#11 0x00007ff912193a4a in rpc_transport_notify (this=0x7ff8f8069a40,
event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7ff8f0011fd0) at rpc-transport.c:545
#12 0x00007ff900c1ca5e in socket_event_poll_in_async (xl=0x7ff8f8010b60,
async=0x7ff8f00120e8) at socket.c:2601
#13 0x00007ff900c13e67 in gf_async (async=0x7ff8f00120e8, xl=0x7ff8f8010b60,
cbk=0x7ff900c1ca07 <socket_event_poll_in_async>) at
../../../../libglusterfs/src/glusterfs/async.h:189
#14 0x00007ff900c1cbed in socket_event_poll_in (this=0x7ff8f8069a40,
notify_handled=true) at socket.c:2642
#15 0x00007ff900c1db8a in socket_event_handler (fd=16, idx=4, gen=1,
data=0x7ff8f8069a40, poll_in=1, poll_out=0, poll_err=0, event_thread_died=0
'\000') at socket.c:3033
#16 0x00007ff91227504b in event_dispatch_epoll_handler (event_pool=0xdd10e0,
event=0x7ff900b4a02c) at event-epoll.c:650
#17 0x00007ff91227553e in event_dispatch_epoll_worker (data=0xe1b930) at
event-epoll.c:763
#18 0x00007ff911f915a2 in start_thread () from /lib64/libpthread.so.0
#19 0x00007ff911bdd023 in clone () from /lib64/libc.so.6
==============================================================
#0  dict_get_with_refn (this=0x0, key=0x7f893c059250
"trusted.afr.testvol-client-0", keylen=28, data=0x7f8944378460) at dict.c:1478
#1  0x00007f8955a1a93f in dict_get_with_ref (this=0x0, key=0x7f893c059250
"trusted.afr.testvol-client-0", data=0x7f8944378460) at dict.c:1467
#2  0x00007f8955a1c6d4 in dict_get_ptr (this=0x0, key=0x7f893c059250
"trusted.afr.testvol-client-0", ptr=0x7f89443784b0) at dict.c:2247
#3  0x00007f8943294d79 in afr_accused_fill (this=0x7f893c01caa0, xdata=0x0,
accused=0x7f8944378590 "", type=AFR_ENTRY_TRANSACTION) at afr-common.c:1569
#4  0x00007f8943295173 in afr_readables_fill (frame=0x7f8934152848,
this=0x7f893c01caa0, inode=0x7f89300272a8, data_accused=0x7f8944378590 "",
metadata_accused=0x7f8944378570 "", data_readable=0x7f8944378580
"\001\001\001<\211\177",
    metadata_readable=0x7f8944378560 "\001\001\001\064\211\177",
replies=0x7f8934173550) at afr-common.c:1655
#5  0x00007f89432954b7 in afr_replies_interpret (frame=0x7f8934152848,
this=0x7f893c01caa0, inode=0x7f89300272a8, start_heal=0x0) at afr-common.c:1711
#6  0x00007f89432223cf in __afr_dir_write_finalize (frame=0x7f8934152848,
this=0x7f893c01caa0) at afr-dir-write.c:98
#7  0x00007f8943223a40 in __afr_dir_write_cbk (frame=0x7f8934152848,
cookie=0x2, this=0x7f893c01caa0, op_ret=0, op_errno=2, buf=0x7f8944378910,
preparent=0x7f8944378870, postparent=0x7f89443787d0, preparent2=0x0,
postparent2=0x0,
    xdata=0x0) at afr-dir-write.c:237
#8  0x00007f8943226d68 in afr_mkdir_wind_cbk (frame=0x7f8934152848, cookie=0x2,
this=0x7f893c01caa0, op_ret=0, op_errno=2, inode=0x7f89300272a8,
buf=0x7f8944378910, preparent=0x7f8944378870, postparent=0x7f89443787d0,
xdata=0x0)
    at afr-dir-write.c:629
#9  0x00007f894333cdc4 in client4_0_mkdir_cbk (req=0x7f893c0020f8,
iov=0x7f893c002128, count=1, myframe=0x7f893c003cd8) at
client-rpc-fops_v2.c:217
==============================================================

Version-Release number of selected component (if applicable):
Master.

How reproducible:
Always

Steps to Reproduce:
1. Create dist-rep volume and fuse mount it.
2. perform I/O- mkdir/rmdir, touch, stat, rm etc.


Actual results:
Fuse log is flooded with above messages.

Expected results:
No spurious logs.

Additional info:
Not sure if it is better to revert Yaniv's patch or retain it, but fix AFR (and
possibly any other xlators that may do call dict_* functions on a null dict in
future). In any case, I'm sending out the AFR fix to see what the consensus is.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.


More information about the Bugs mailing list