[Gluster-users] Message repeated over and over after upgrade from 4.1 to 5.3: W [dict.c:761:dict_ref] (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329) [0x7fd966fcd329] -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5) [0x7fd9671deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58) [0x7fd9731ea218] ) 2-dict: dict is NULL [Invalid argument]

Artem Russakovskii archon810 at gmail.com
Fri Feb 1 17:03:33 UTC 2019


Hi,

The first (and so far only) crash happened at 2am the day after we
upgraded, on only one of four servers and on only one of two mounts.

I have no idea what caused it, but yeah, we do have a pretty busy site
(apkmirror.com), and the crash disrupted all uploads and downloads
from that server until I woke up and fixed the mount.

I wish I could be more helpful but all I have is that stack trace.
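
In case it helps anyone who has the matching debug symbols installed, here is a rough sketch of how the frames in that trace could be turned into addr2line lookups. It assumes the frame format shown in the quoted backtrace below (module(symbol+offset)[address]); the helper names parse_frames and addr2line_commands are made up for illustration, and whether addr2line resolves anything depends on debuginfo being available for these exact builds.

```python
import re

# Matches frames such as:
#   /usr/lib64/libglusterfs.so.0(gf_print_trace+0x306)[0x7fccd7070cb6]
#   /usr/lib64/glusterfs/5.3/xlator/cluster/replicate.so(+0x32c4d)[0x7fcccbb01c4d]
FRAME_RE = re.compile(
    r"(?P<module>/\S+?)\((?P<symbol>[^+)]*)\+(?P<offset>0x[0-9a-f]+)\)"
    r"\[(?P<addr>0x[0-9a-f]+)\]"
)


def parse_frames(text):
    """Return (module, symbol or None, offset) for every frame found in text."""
    frames = []
    for m in FRAME_RE.finditer(text):
        symbol = m.group("symbol") or None  # empty when only an offset was printed
        frames.append((m.group("module"), symbol, int(m.group("offset"), 16)))
    return frames


def addr2line_commands(frames):
    """Build addr2line command lines for frames that have no symbol name.

    Assumption: the offset printed as (+0x...) is relative to the module,
    so it can be passed to addr2line together with -e <module>.
    """
    return [
        f"addr2line -f -e {module} {offset:#x}"
        for module, symbol, offset in frames
        if symbol is None
    ]
```

Feeding it the pasted crash text (quote markers included) and printing the result of addr2line_commands(parse_frames(text)) gives one command per unnamed frame, for example the replicate.so and client.so frames.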

I'm glad it's a blocker and will hopefully be resolved soon.
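
On the log flooding side, a small sketch like the following could give a rough per-message count for a mount log, which may be useful when comparing before and after the fix. It assumes the line format seen in the quoted logs below, including the 'The message "..." repeated N times' summary lines, and that each entry sits on a single line in the real log file (the excerpts in this thread are wrapped by email). The function name tally is hypothetical, and the counts are approximate.

```python
import re
from collections import Counter

# A regular entry: [timestamp] LEVEL [optional MSGID] [file:line:function] ...
ENTRY_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) (?:\[MSGID: \d+\] )?\[(?P<origin>[^\]]+)\]"
)
# A summary entry: The message "LEVEL [optional MSGID] [origin] ..." repeated N times ...
REPEAT_RE = re.compile(
    r'^The message "(?P<level>[A-Z]) (?:\[MSGID: \d+\] )?\[(?P<origin>[^\]]+)\]'
    r'.*" repeated (?P<count>\d+) times'
)


def tally(lines):
    """Count log entries per (level, origin), folding in 'repeated N times' lines."""
    counts = Counter()
    for line in lines:
        m = REPEAT_RE.match(line)
        if m:
            counts[(m.group("level"), m.group("origin"))] += int(m.group("count"))
            continue
        m = ENTRY_RE.match(line)
        if m:
            counts[(m.group("level"), m.group("origin"))] += 1
    return counts


# Example use (the path is an assumption; adjust it to wherever the mount log lives):
# with open("/var/log/glusterfs/mnt-SITE_data1.log") as fh:
#     for key, n in tally(fh).most_common(5):
#         print(n, *key)
```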

On Thu, Jan 31, 2019, 7:26 PM Amar Tumballi Suryanarayan <
atumball at redhat.com> wrote:

> Hi Artem,
>
> Opened https://bugzilla.redhat.com/show_bug.cgi?id=1671603 (i.e., as a
> clone of other bugs where recent discussions happened), and marked it as a
> blocker for the glusterfs-5.4 release.
>
> We already have fixes for the log flooding - https://review.gluster.org/22128,
> and are in the process of identifying and fixing the issue behind the crash.
>
> Can you please tell us whether the crash happened right after the upgrade,
> or was there any particular pattern you observed before the crash?
>
> -Amar
>
>
> On Thu, Jan 31, 2019 at 11:40 PM Artem Russakovskii <archon810 at gmail.com>
> wrote:
>
>> Within 24 hours after updating from rock solid 4.1 to 5.3, I already got
>> a crash which others have mentioned in
>> https://bugzilla.redhat.com/show_bug.cgi?id=1313567 and had to unmount,
>> kill gluster, and remount:
>>
>>
>> [2019-01-31 09:38:04.317604] W [dict.c:761:dict_ref]
>> (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329)
>> [0x7fcccafcd329]
>> -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5)
>> [0x7fcccb1deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58)
>> [0x7fccd705b218] ) 2-dict: dict is NULL [Invalid argument]
>> [2019-01-31 09:38:04.319308] W [dict.c:761:dict_ref]
>> (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329)
>> [0x7fcccafcd329]
>> -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5)
>> [0x7fcccb1deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58)
>> [0x7fccd705b218] ) 2-dict: dict is NULL [Invalid argument]
>> [2019-01-31 09:38:04.320047] W [dict.c:761:dict_ref]
>> (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329)
>> [0x7fcccafcd329]
>> -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5)
>> [0x7fcccb1deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58)
>> [0x7fccd705b218] ) 2-dict: dict is NULL [Invalid argument]
>> [2019-01-31 09:38:04.320677] W [dict.c:761:dict_ref]
>> (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329)
>> [0x7fcccafcd329]
>> -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5)
>> [0x7fcccb1deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58)
>> [0x7fccd705b218] ) 2-dict: dict is NULL [Invalid argument]
>> The message "I [MSGID: 108031]
>> [afr-common.c:2543:afr_local_discovery_cbk] 2-SITE_data1-replicate-0:
>> selecting local read_child SITE_data1-client-3" repeated 5 times between
>> [2019-01-31 09:37:54.751905] and [2019-01-31 09:38:03.958061]
>> The message "E [MSGID: 101191]
>> [event-epoll.c:671:event_dispatch_epoll_worker] 2-epoll: Failed to dispatch
>> handler" repeated 72 times between [2019-01-31 09:37:53.746741] and
>> [2019-01-31 09:38:04.696993]
>> pending frames:
>> frame : type(1) op(READ)
>> frame : type(1) op(OPEN)
>> frame : type(0) op(0)
>> patchset: git://git.gluster.org/glusterfs.git
>> signal received: 6
>> time of crash:
>> 2019-01-31 09:38:04
>> configuration details:
>> argp 1
>> backtrace 1
>> dlfcn 1
>> libpthread 1
>> llistxattr 1
>> setfsid 1
>> spinlock 1
>> epoll.h 1
>> xattr.h 1
>> st_atim.tv_nsec 1
>> package-string: glusterfs 5.3
>> /usr/lib64/libglusterfs.so.0(+0x2764c)[0x7fccd706664c]
>> /usr/lib64/libglusterfs.so.0(gf_print_trace+0x306)[0x7fccd7070cb6]
>> /lib64/libc.so.6(+0x36160)[0x7fccd622d160]
>> /lib64/libc.so.6(gsignal+0x110)[0x7fccd622d0e0]
>> /lib64/libc.so.6(abort+0x151)[0x7fccd622e6c1]
>> /lib64/libc.so.6(+0x2e6fa)[0x7fccd62256fa]
>> /lib64/libc.so.6(+0x2e772)[0x7fccd6225772]
>> /lib64/libpthread.so.0(pthread_mutex_lock+0x228)[0x7fccd65bb0b8]
>> /usr/lib64/glusterfs/5.3/xlator/cluster/replicate.so(+0x32c4d)[0x7fcccbb01c4d]
>> /usr/lib64/glusterfs/5.3/xlator/protocol/client.so(+0x65778)[0x7fcccbdd1778]
>> /usr/lib64/libgfrpc.so.0(+0xe820)[0x7fccd6e31820]
>> /usr/lib64/libgfrpc.so.0(+0xeb6f)[0x7fccd6e31b6f]
>> /usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7fccd6e2e063]
>> /usr/lib64/glusterfs/5.3/rpc-transport/socket.so(+0xa0b2)[0x7fccd0b7e0b2]
>> /usr/lib64/libglusterfs.so.0(+0x854c3)[0x7fccd70c44c3]
>> /lib64/libpthread.so.0(+0x7559)[0x7fccd65b8559]
>> /lib64/libc.so.6(clone+0x3f)[0x7fccd62ef81f]
>> ---------
>>
>> Do the pending patches fix the crash or only the repeated warnings? I'm
>> running glusterfs on openSUSE Leap 15.0, installed via
>> http://download.opensuse.org/repositories/home:/glusterfs:/Leap15-5/openSUSE_Leap_15.0/,
>> and I'm not sure how to make it produce a core dump.
>>
>> If it's not fixed by the patches above, has anyone already opened a
>> ticket for the crashes that I can join and monitor? This is going to create
>> a massive problem for us since production systems are crashing.
>>
>> Thanks.
>>
>> Sincerely,
>> Artem
>>
>> --
>> Founder, Android Police <http://www.androidpolice.com>, APK Mirror
>> <http://www.apkmirror.com/>, Illogical Robot LLC
>> beerpla.net | +ArtemRussakovskii
>> <https://plus.google.com/+ArtemRussakovskii> | @ArtemR
>> <http://twitter.com/ArtemR>
>>
>>
>> On Wed, Jan 30, 2019 at 6:37 PM Raghavendra Gowdappa <rgowdapp at redhat.com>
>> wrote:
>>
>>>
>>>
>>> On Thu, Jan 31, 2019 at 2:14 AM Artem Russakovskii <archon810 at gmail.com>
>>> wrote:
>>>
>>>> Also, not sure if related or not, but I got a ton of these "Failed to
>>>> dispatch handler" in my logs as well. Many people have been commenting
>>>> about this issue here
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1651246.
>>>>
>>>
>>> https://review.gluster.org/#/c/glusterfs/+/22046/ addresses this.
>>>
>>>
>>>>> ==> mnt-SITE_data1.log <==
>>>>> [2019-01-30 20:38:20.783713] W [dict.c:761:dict_ref]
>>>>> (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329)
>>>>> [0x7fd966fcd329]
>>>>> -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5)
>>>>> [0x7fd9671deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58)
>>>>> [0x7fd9731ea218] ) 2-dict: dict is NULL [Invalid argument]
>>>>> ==> mnt-SITE_data3.log <==
>>>>> The message "E [MSGID: 101191]
>>>>> [event-epoll.c:671:event_dispatch_epoll_worker] 2-epoll: Failed to dispatch
>>>>> handler" repeated 413 times between [2019-01-30 20:36:23.881090] and
>>>>> [2019-01-30 20:38:20.015593]
>>>>> The message "I [MSGID: 108031]
>>>>> [afr-common.c:2543:afr_local_discovery_cbk] 2-SITE_data3-replicate-0:
>>>>> selecting local read_child SITE_data3-client-0" repeated 42 times between
>>>>> [2019-01-30 20:36:23.290287] and [2019-01-30 20:38:20.280306]
>>>>> ==> mnt-SITE_data1.log <==
>>>>> The message "I [MSGID: 108031]
>>>>> [afr-common.c:2543:afr_local_discovery_cbk] 2-SITE_data1-replicate-0:
>>>>> selecting local read_child SITE_data1-client-0" repeated 50 times between
>>>>> [2019-01-30 20:36:22.247367] and [2019-01-30 20:38:19.459789]
>>>>> The message "E [MSGID: 101191]
>>>>> [event-epoll.c:671:event_dispatch_epoll_worker] 2-epoll: Failed to dispatch
>>>>> handler" repeated 2654 times between [2019-01-30 20:36:22.667327] and
>>>>> [2019-01-30 20:38:20.546355]
>>>>> [2019-01-30 20:38:21.492319] I [MSGID: 108031]
>>>>> [afr-common.c:2543:afr_local_discovery_cbk] 2-SITE_data1-replicate-0:
>>>>> selecting local read_child SITE_data1-client-0
>>>>> ==> mnt-SITE_data3.log <==
>>>>> [2019-01-30 20:38:22.349689] I [MSGID: 108031]
>>>>> [afr-common.c:2543:afr_local_discovery_cbk] 2-SITE_data3-replicate-0:
>>>>> selecting local read_child SITE_data3-client-0
>>>>> ==> mnt-SITE_data1.log <==
>>>>> [2019-01-30 20:38:22.762941] E [MSGID: 101191]
>>>>> [event-epoll.c:671:event_dispatch_epoll_worker] 2-epoll: Failed to dispatch
>>>>> handler
>>>>
>>>>
>>>> I'm hoping raising the issue here on the mailing list may bring some
>>>> additional eyeballs and get them both fixed.
>>>>
>>>> Thanks.
>>>>
>>>> Sincerely,
>>>> Artem
>>>>
>>>>
>>>> On Wed, Jan 30, 2019 at 12:26 PM Artem Russakovskii <
>>>> archon810 at gmail.com> wrote:
>>>>
>>>>> I found a similar issue here:
>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1313567. There's a
>>>>> comment from 3 days ago from someone else with 5.3 who started seeing the
>>>>> spam.
>>>>>
>>>>> Here's the message that repeats over and over:
>>>>> [2019-01-30 20:23:24.481581] W [dict.c:761:dict_ref]
>>>>> (-->/usr/lib64/glusterfs/5.3/xlator/performance/quick-read.so(+0x7329)
>>>>> [0x7fd966fcd329]
>>>>> -->/usr/lib64/glusterfs/5.3/xlator/performance/io-cache.so(+0xaaf5)
>>>>> [0x7fd9671deaf5] -->/usr/lib64/libglusterfs.so.0(dict_ref+0x58)
>>>>> [0x7fd9731ea218] ) 2-dict: dict is NULL [Invalid argument]
>>>>>
>>>>
>>> +Milind Changire <mchangir at redhat.com> Can you check why this message
>>> is logged and send a fix?
>>>
>>>
>>>>> Is there any fix for this issue?
>>>>>
>>>>> Thanks.
>>>>>
>>>>> Sincerely,
>>>>> Artem
>>>>>
>>>> _______________________________________________
>>>> Gluster-users mailing list
>>>> Gluster-users at gluster.org
>>>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>>
>
>
>
> --
> Amar Tumballi (amarts)
>