[Bugs] [Bug 1806595] New: glusterfsd crashed with "'MemoryError' Cannot access memory at address"

bugzilla at redhat.com bugzilla at redhat.com
Mon Feb 24 15:24:26 UTC 2020


https://bugzilla.redhat.com/show_bug.cgi?id=1806595

            Bug ID: 1806595
           Summary: glusterfsd crashed with "'MemoryError' Cannot access
                    memory at address"
           Product: GlusterFS
           Version: 5
          Hardware: All
                OS: Linux
            Status: NEW
         Component: rpc
          Severity: high
          Priority: high
          Assignee: bugs at gluster.org
          Reporter: moagrawa at redhat.com
                CC: alchan at redhat.com, amukherj at redhat.com,
                    bkunal at redhat.com, bugs at gluster.org,
                    jpankaja at redhat.com, kramdoss at redhat.com,
                    madam at redhat.com, moagrawa at redhat.com,
                    nbalacha at redhat.com, nravinas at redhat.com,
                    pasik at iki.fi, pdhange at redhat.com, pprakash at redhat.com,
                    rgowdapp at redhat.com, rhs-bugs at redhat.com,
                    rtalur at redhat.com, sheggodu at redhat.com,
                    ykaul at redhat.com
        Depends On: 1763036
            Blocks: 1804523, 1741193, 1778175, 1778182
  Target Milestone: ---
    Classification: Community



+++ This bug was initially created as a clone of Bug #1763036 +++

+++ This bug was initially created as a clone of Bug #1741193 +++

Description of problem:
glusterfsd crashed on all the gluster nodes and created a huge corefile,
causing the mounts to go down.

Preliminary analysis of the corefile:
(/cases/02441282/_additional_files/x-coredump/inf03-core.6572.) 

[file seems incomplete]

Core was generated by `/usr/sbin/glusterfsd -s 10.7.199.48 --volfile-id
vol_0005ed9e52ceb05f16b1bb48cd'.
Program terminated with signal 6, Aborted.
#0  0x00007f35de36e207 in ?? ()
(gdb) bt
Python Exception <class 'gdb.MemoryError'> Cannot access memory at address
0x7f2bdef8da58: 

(gdb) info proc glusterfsd
exe = '/usr/sbin/glusterfsd -s 10.7.199.48 --volfile-id
vol_0005ed9e52ceb05f16b1bb48cd'


(gdb) backtrace
#0  0x00007f35de36e207 in ?? ()
Cannot access memory at address 0x7f2bdef8da58
(gdb) down 0x7f2bdef8da58
Cannot access memory at address 0x7f2bdef8da58

(gdb) info frame
Stack level 0, frame at 0x7f2bdef8da60:
 rip = 0x7f35de36e207; saved rip Cannot access memory at address 0x7f2bdef8da58

(gdb) up 0x7f2bdef8da58
#0  0x00007f35de36e207 in ?? ()
(gdb) i r
rax            0x0      0
rbx            0x7f35e01ff000   139869370183680
rcx            0xffffffffffffffff       -1
rdx            0x6      6
rsi            0x2da6   11686
rdi            0x19ac   6572
rbp            0x7f35de4c2520   0x7f35de4c2520
rsp            0x7f2bdef8da58   0x7f2bdef8da58
r8             0x0      0
r9             0x7f17b4f04770   139739796621168
r10            0x8      8
r11            0x206    518
r12            0x7f35dfdb875a   139869365700442
r13            0x7f35dfdb8b00   139869365701376
r14            0x0      0
r15            0x7f255fc70ac0   139798497397440
rip            0x7f35de36e207   0x7f35de36e207
eflags         0x206    [ PF IF ]
cs             0x33     51
ss             0x2b     43
ds             0x0      0
es             0x0      0
fs             0x0      0
gs             0x0      0
(gdb) 

Please verify the corefile further to see if the root cause can be isolated.
10.7.199.48:-var-lib-heketi-mounts-vg_77cff90104e3566f7856e523efd82152-brick_3ac56467ac42041325057fdfd61bf7cc-brick

uuid=e0b70592-6ab6-4d66-b1b6-d54bd4797d70
hostname=10.7.199.48
path=/var/lib/heketi/mounts/vg_77cff90104e3566f7856e523efd82152/brick_3ac56467ac42041325057fdfd61bf7cc/brick
real_path=/var/lib/heketi/mounts/vg_77cff90104e3566f7856e523efd82152/brick_3ac56467ac42041325057fdfd61bf7cc/brick
listen-port=49153
rdma.listen-port=0
decommissioned=0
brick-id=vol_0005ed9e52ceb05f16b1bb48cd526ea9-client-2
mount_dir=/brick
snap-status=0
brick-fsid=16842170


[2019-08-10 06:34:18.342656] I [MSGID: 106005]
[glusterd-handler.c:6179:__glusterd_brick_rpc_notify] 0-management: Brick
10.7.199.48:/var/lib/heketi/mounts/vg_77cff90104e3566f7856e523efd82152/brick_3ac56467ac42041325057fdfd61bf7cc/brick
has disconnected from glusterd.

(Please feel free to change the "Component" if found inappropriate.)

Many thanks,
Jay

Version-Release number of selected component (if applicable):
rhgs3.4

How reproducible:
The issue has occurred twice, both around the same time of day (morning); the
latest occurrence was on Aug 10.

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

--- Additional comment from Jay Samson on 2019-09-23 10:19:15 UTC ---


Here is the analysis of the issue so far (will post more when I get time):

[New LWP 21371]
[New LWP 23168]
[New LWP 117329]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Core was generated by `/usr/sbin/glusterfsd -s 10.7.199.47 --volfile-id
vol_b763bb6afb974e6f80384673ba'.
Program terminated with signal 6, Aborted.
#0  0x00007f9efaed5207 in __GI_raise (sig=sig at entry=6) at
../nptl/sysdeps/unix/sysv/linux/raise.c:55
55        return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) bt
#0  0x00007f9efaed5207 in __GI_raise (sig=sig at entry=6) at
../nptl/sysdeps/unix/sysv/linux/raise.c:55
#1  0x00007f9efaed68f8 in __GI_abort () at abort.c:90
#2  0x00007f9efaece026 in __assert_fail_base (fmt=0x7f9efb028ea0 "%s%s%s:%u:
%s%sAssertion `%s' failed.\n%n", assertion=assertion at entry=0x7f9efc94507a
"slot->fd == fd", file=file at entry=0x7f9efc945054 "event-epoll.c",
line=line at entry=417, 
    function=function at entry=0x7f9efc945420 <__PRETTY_FUNCTION__.11118>
"event_register_epoll") at assert.c:92
#3  0x00007f9efaece0d2 in __GI___assert_fail
(assertion=assertion at entry=0x7f9efc94507a "slot->fd == fd",
file=file at entry=0x7f9efc945054 "event-epoll.c", line=line at entry=417,
function=function at entry=0x7f9efc945420 <__PRETTY_FUNCTION__.11118>
"event_register_epoll")
    at assert.c:101
#4  0x00007f9efc8f7d04 in event_register_epoll (event_pool=0x563fac588150,
fd=<optimized out>, handler=<optimized out>, data=<optimized out>,
poll_in=<optimized out>, poll_out=<optimized out>, notify_poller_death=0
'\000') at event-epoll.c:417
#5  0x00007f9ef798ceb2 in socket_server_event_handler (fd=<optimized out>,
idx=<optimized out>, gen=<optimized out>, data=0x7f9ee80403f0,
poll_in=<optimized out>, poll_out=<optimized out>, poll_err=0,
event_thread_died=0 '\000') at socket.c:2950
#6  0x00007f9efc8f8870 in event_dispatch_epoll_handler (event=0x7f98fffc9e70,
event_pool=0x563fac588150) at event-epoll.c:643
#7  event_dispatch_epoll_worker (data=0x7f992c8ea110) at event-epoll.c:759
#8  0x00007f9efb6d5dd5 in start_thread (arg=0x7f98fffca700) at
pthread_create.c:307
#9  0x00007f9efaf9cead in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb) f 7
#7  event_dispatch_epoll_worker (data=0x7f992c8ea110) at event-epoll.c:759
759                     ret = event_dispatch_epoll_handler (event_pool,
&event);
(gdb) p *event_pool
$1 = {ops = 0x7f9efcb6b4e0 <event_ops_epoll>, fd = 3, breaker = {0, 0}, count =
16384, reg = 0x0, ereg = {0x563fac5b6e50, 0x7f9e247f81b0, 0x0 <repeats 1022
times>}, slots_used = {1021, 10, 0 <repeats 1022 times>}, poller_death = {next
= 0x563fac5b6fe0, 
    prev = 0x563fac5b6fe0}, poller_death_sliced = 0, poller_gen = 54, used = 0,
changed = 0, mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers
= 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next =
0x0}}, 
    __size = '\000' <repeats 39 times>, __align = 0}, cond = {__data = {__lock
= 0, __futex = 0, __total_seq = 0, __wakeup_seq = 0, __woken_seq = 0, __mutex =
0x0, __nwaiters = 0, __broadcast_seq = 0}, __size = '\000' <repeats 47 times>,
__align = 0}, evcache = 0x0, 
  evcache_size = 0, eventthreadcount = 197, pollers = {140320709637888,
140320701245184, 140319441008384, 140318519322368, 140318091491072,
140317311362816, 140317135181568, 140316095010560, 140315306489600,
140314870265600, 140314023028480, 140313586804480, 
    140312873785088, 140312370452224, 140311590323968, 140311028274944,
140310508189440, 140310340400896, 140310247036672, 140310153672448,
140310060308224, 140309966944000, 140309803562752, 140309710198528,
140309616834304, 140309451233024, 140309357868800, 
    140309130221312, 140309028464384, 140308935100160, 140308841735936,
140308748371712, 140308655007488, 140308561643264, 140308468279040,
140308374914816, 140308281550592, 140308188186368, 140308086429440,
140307993065216, 140307899700992, 140307806336768, 
    140307704579840, 140307611215616, 140307509458688, 140307416094464,
140307322730240, 140307229366016, 140307136001792, 140307042637568,
140306949273344, 140306855909120, 140306762544896, 140306660787968,
140306559031040, 140306457274112, 140306363909888, 
    140306270545664, 140306177181440, 140306083817216, 140305990452992,
140305818961664, 140305725597440, 140305632233216, 140305538868992,
140305445504768, 140305352140544, 140305258776320, 140305022048000,
140304928683776, 140304762005248, 140304668641024, 
    140304566884096, 140304465127168, 140304363370240, 140304261613312,
140304168249088, 140304074884864, 140303981520640, 140303879763712,
140303786399488, 140303693035264, 140303599671040, 140303506306816,
140303412942592, 140303317473024, 140303222003456, 
    140303126533888, 140303031064320, 140302935594752, 140302840125184,
140302744655616, 140302640527104, 140302536398592, 140302432270080,
140302336534272, 140302240798464, 140302145062656, 140302049326848,
140301953591040, 140301857855232, 140301711763200, 
    140301607634688, 140301503506176, 140301407770368, 140301230380800,
140301128623872, 140301026866944, 140300924057344, 140300819928832,
140300715800320, 140300550899456, 140300383143680, 140300281386752,
140300106315520, 140300004558592, 140299902801664, 
    140299801044736, 140299697182464, 140299601712896, 140299506243328,
140299410507520, 140299306379008, 140299202250496, 140299098121984,
140298993993472, 140298898257664, 140298802521856, 140298698393344,
140298602657536, 140298498529024, 140298402793216, 
    140298307057408, 140298211321600, 140298107193088, 140298003064576,
140297907328768, 140297811592960, 140297715857152, 140297603335936,
140297499207424, 140297403471616, 140297307735808, 140297212000000,
140297116264192, 140297020528384, 140296924792576, 
    140296820664064, 140296650192640, 140296556828416, 140296463464192,
140296370099968, 140296274630400, 140296178894592, 140296012674816,
140295836526336, 140295743162112, 140295576483584, 140295483119360,
140295389755136, 140295296390912, 140295201974016, 
    140295106504448, 140295002642176, 140294907172608, 140294811436800,
140294715700992, 140294619965184, 140294524229376, 140294428493568,
140294332757760, 140294237021952, 140294141286144, 140294045550336,
140293949814528, 140293854078720, 140293758342912, 
    140293662607104, 140293558478592, 140293454350080, 140293350221568,
140293254485760, 140293158749952, 140293063014144, 140292967278336,
140292863149824, 140292767414016, 140292598515456, 140292496758528,
140292403394304, 140292308711168, 140292212975360, 
    140292117239552, 140292021503744, 140291925767936, 140291751245568,
140291717674752, 0 <repeats 827 times>}, destroy = 0, activethreadcount = 198,
auto_thread_count = 195}
(gdb) f 6
#6  0x00007f9efc8f8870 in event_dispatch_epoll_handler (event=0x7f98fffc9e70,
event_pool=0x563fac588150) at event-epoll.c:643
643                     ret = handler (fd, idx, gen, data,
(gdb) p fd
$2 = 12
(gdb) p idx
$3 = 3
(gdb) l    
638     
639             if (!handler)
640                     goto out;
641     
642             if (!handled_error_previously) {
643                     ret = handler (fd, idx, gen, data,
644                                    (event->events & (EPOLLIN|EPOLLPRI)),
645                                    (event->events & (EPOLLOUT)),
646                                    (event->events & (EPOLLERR|EPOLLHUP)),
0);
647             }
(gdb) p data
$4 = (void *) 0x7f9ee80403f0
(gdb) f 5socket_server_event_handler
Invalid number "5socket".
(gdb) f 5
#5  0x00007f9ef798ceb2 in socket_server_event_handler (fd=<optimized out>,
idx=<optimized out>, gen=<optimized out>, data=0x7f9ee80403f0,
poll_in=<optimized out>, poll_out=<optimized out>, poll_err=0,
event_thread_died=0 '\000') at socket.c:2950
2950                                            event_register
(ctx->event_pool,
(gdb) l
2945                             */
2946                            ret = rpc_transport_notify (this,
RPC_TRANSPORT_ACCEPT, new_trans);
2947    
2948                            if (ret != -1) {
2949                                    new_priv->idx =
2950                                            event_register
(ctx->event_pool,
2951                                                            new_sock,
2952                                                           
socket_event_handler,
2953                                                            new_trans,
2954                                                            1, 0,
(gdb) p nds_sock
No symbol "nds_sock" in current context.
(gdb) p new_sock
$5 = 1340
(gdb) f 4
#4  0x00007f9efc8f7d04 in event_register_epoll (event_pool=0x563fac588150,
fd=<optimized out>, handler=<optimized out>, data=<optimized out>,
poll_in=<optimized out>, poll_out=<optimized out>, notify_poller_death=0
'\000') at event-epoll.c:417
417             assert (slot->fd == fd);
(gdb) l
412                     return -1;
413             }
414     
415             slot = event_slot_get (event_pool, idx);
416     
417             assert (slot->fd == fd);
418     
419             LOCK (&slot->lock);
420             {
421                     /* make epoll 'singleshot', which
(gdb) p *event_pool
$6 = {ops = 0x7f9efcb6b4e0 <event_ops_epoll>, fd = 3, breaker = {0, 0}, count =
16384, reg = 0x0, ereg = {0x563fac5b6e50, 0x7f9e247f81b0, 0x0 <repeats 1022
times>}, slots_used = {1021, 10, 0 <repeats 1022 times>}, poller_death = {next
= 0x563fac5b6fe0, 
    prev = 0x563fac5b6fe0}, poller_death_sliced = 0, poller_gen = 54, used = 0,
changed = 0, mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers
= 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next =
0x0}}, 
    __size = '\000' <repeats 39 times>, __align = 0}, cond = {__data = {__lock
= 0, __futex = 0, __total_seq = 0, __wakeup_seq = 0, __woken_seq = 0, __mutex =
0x0, __nwaiters = 0, __broadcast_seq = 0}, __size = '\000' <repeats 47 times>,
__align = 0}, evcache = 0x0, 
  evcache_size = 0, eventthreadcount = 197, pollers = {140320709637888,
140320701245184, 140319441008384, 140318519322368, 140318091491072,
140317311362816, 140317135181568, 140316095010560, 140315306489600,
140314870265600, 140314023028480, 140313586804480, 
    140312873785088, 140312370452224, 140311590323968, 140311028274944,
140310508189440, 140310340400896, 140310247036672, 140310153672448,
140310060308224, 140309966944000, 140309803562752, 140309710198528,
140309616834304, 140309451233024, 140309357868800, 
    140309130221312, 140309028464384, 140308935100160, 140308841735936,
140308748371712, 140308655007488, 140308561643264, 140308468279040,
140308374914816, 140308281550592, 140308188186368, 140308086429440,
140307993065216, 140307899700992, 140307806336768, 
    140307704579840, 140307611215616, 140307509458688, 140307416094464,
140307322730240, 140307229366016, 140307136001792, 140307042637568,
140306949273344, 140306855909120, 140306762544896, 140306660787968,
140306559031040, 140306457274112, 140306363909888, 
    140306270545664, 140306177181440, 140306083817216, 140305990452992,
140305818961664, 140305725597440, 140305632233216, 140305538868992,
140305445504768, 140305352140544, 140305258776320, 140305022048000,
140304928683776, 140304762005248, 140304668641024, 
    140304566884096, 140304465127168, 140304363370240, 140304261613312,
140304168249088, 140304074884864, 140303981520640, 140303879763712,
140303786399488, 140303693035264, 140303599671040, 140303506306816,
140303412942592, 140303317473024, 140303222003456, 
    140303126533888, 140303031064320, 140302935594752, 140302840125184,
140302744655616, 140302640527104, 140302536398592, 140302432270080,
140302336534272, 140302240798464, 140302145062656, 140302049326848,
140301953591040, 140301857855232, 140301711763200, 
    140301607634688, 140301503506176, 140301407770368, 140301230380800,
140301128623872, 140301026866944, 140300924057344, 140300819928832,
140300715800320, 140300550899456, 140300383143680, 140300281386752,
140300106315520, 140300004558592, 140299902801664, 
    140299801044736, 140299697182464, 140299601712896, 140299506243328,
140299410507520, 140299306379008, 140299202250496, 140299098121984,
140298993993472, 140298898257664, 140298802521856, 140298698393344,
140298602657536, 140298498529024, 140298402793216, 
    140298307057408, 140298211321600, 140298107193088, 140298003064576,
140297907328768, 140297811592960, 140297715857152, 140297603335936,
140297499207424, 140297403471616, 140297307735808, 140297212000000,
140297116264192, 140297020528384, 140296924792576, 
    140296820664064, 140296650192640, 140296556828416, 140296463464192,
140296370099968, 140296274630400, 140296178894592, 140296012674816,
140295836526336, 140295743162112, 140295576483584, 140295483119360,
140295389755136, 140295296390912, 140295201974016, 
    140295106504448, 140295002642176, 140294907172608, 140294811436800,
140294715700992, 140294619965184, 140294524229376, 140294428493568,
140294332757760, 140294237021952, 140294141286144, 140294045550336,
140293949814528, 140293854078720, 140293758342912, 
    140293662607104, 140293558478592, 140293454350080, 140293350221568,
140293254485760, 140293158749952, 140293063014144, 140292967278336,
140292863149824, 140292767414016, 140292598515456, 140292496758528,
140292403394304, 140292308711168, 140292212975360, 
    140292117239552, 140292021503744, 140291925767936, 140291751245568,
140291717674752, 0 <repeats 827 times>}, destroy = 0, activethreadcount = 198,
auto_thread_count = 195}
(gdb) p idx
$7 = <optimized out>
(gdb) ptype event_pool
type = struct event_pool {
    struct event_ops *ops;
    int fd;
    int breaker[2];
    int count;
    struct event_slot_poll *reg;
    struct event_slot_epoll *ereg[1024];
    int slots_used[1024];
    struct list_head poller_death;
    int poller_death_sliced;
    int poller_gen;
    int used;
    int changed;
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    void *evcache;
    int evcache_size;
    int eventthreadcount;
    pthread_t pollers[1024];
    int destroy;
    int activethreadcount;
    int auto_thread_count;
} *
(gdb) p *event_pool->event_slot_epoll
There is no member named event_slot_epoll.
(gdb) p *event_pool->ereg
$8 = (struct event_slot_epoll *) 0x563fac5b6e50
(gdb) p (struct event_slot_epoll *)event_pool->ereg
$9 = (struct event_slot_epoll *) 0x563fac588170
(gdb) p *(struct event_slot_epoll *)event_pool->ereg
$10 = {fd = -1403294128, events = 22079, gen = 612336048, idx = 32670, ref = 0,
do_close = 0, in_handler = 0, handled_error = 0, data = 0x0, handler = 0x0,
lock = {spinlock = 0, mutex = {__data = {__lock = 0, __count = 0, __owner = 0,
__nusers = 0, __kind = 0, 
        __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
__size = '\000' <repeats 39 times>, __align = 0}}, poller_death = {next = 0x0,
prev = 0x0}}
(gdb) p *event_pool->slots_used
$11 = 1021
(gdb) p *event_pool->slots_used[1]
Cannot access memory at address 0xa
(gdb) p *event_pool->slots_used[0]
Cannot access memory at address 0x3fd
(gdb) p event_pool->slots_used[0]
$12 = 1021
(gdb) p event_pool->slots_used[1]
$13 = 10
(gdb) p event_pool->slots_used[2]
$14 = 0
(gdb) p event_pool->slots_used[3]
$15 = 0
(gdb) p event_pool->slots_used[1021]
$16 = 0
(gdb) p event_pool->slots_used[1024]
$17 = -1403293728
(gdb) p event_pool->slots_used[1023]
$18 = 0
(gdb) p event_pool->slots_used[10]
$19 = 0
(gdb) p *event_pool->ereg[0]
$20 = {fd = 7, events = 1073741851, gen = 1, idx = 0, ref = 1, do_close = 0,
in_handler = 0, handled_error = 0, data = 0x563fac595a20, handler =
0x7f9ef798c780 <socket_server_event_handler>, lock = {spinlock = 0, mutex =
{__data = {__lock = 0, __count = 0, __owner = 0, 
        __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev
= 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}},
poller_death = {next = 0x563fac5b6ea8, prev = 0x563fac5b6ea8}}
(gdb) p *event_pool->ereg[1]
$21 = {fd = -1, events = 1073741851, gen = 216, idx = 0, ref = 1, do_close = 1,
in_handler = 0, handled_error = 0, data = 0x7f9e3c050be0, handler =
0x7f9ef7989980 <socket_event_handler>, lock = {spinlock = 0, mutex = {__data =
{__lock = 0, __count = 0, __owner = 0, 
        __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev
= 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}},
poller_death = {next = 0x7f9e247f8208, prev = 0x7f9e247f8208}}
(gdb) bt
#0  0x00007f9efaed5207 in __GI_raise (sig=sig at entry=6) at
../nptl/sysdeps/unix/sysv/linux/raise.c:55
#1  0x00007f9efaed68f8 in __GI_abort () at abort.c:90
#2  0x00007f9efaece026 in __assert_fail_base (fmt=0x7f9efb028ea0 "%s%s%s:%u:
%s%sAssertion `%s' failed.\n%n", assertion=assertion at entry=0x7f9efc94507a
"slot->fd == fd", file=file at entry=0x7f9efc945054 "event-epoll.c",
line=line at entry=417, 
    function=function at entry=0x7f9efc945420 <__PRETTY_FUNCTION__.11118>
"event_register_epoll") at assert.c:92
#3  0x00007f9efaece0d2 in __GI___assert_fail
(assertion=assertion at entry=0x7f9efc94507a "slot->fd == fd",
file=file at entry=0x7f9efc945054 "event-epoll.c", line=line at entry=417,
function=function at entry=0x7f9efc945420 <__PRETTY_FUNCTION__.11118>
"event_register_epoll")
    at assert.c:101
#4  0x00007f9efc8f7d04 in event_register_epoll (event_pool=0x563fac588150,
fd=<optimized out>, handler=<optimized out>, data=<optimized out>,
poll_in=<optimized out>, poll_out=<optimized out>, notify_poller_death=0
'\000') at event-epoll.c:417
#5  0x00007f9ef798ceb2 in socket_server_event_handler (fd=<optimized out>,
idx=<optimized out>, gen=<optimized out>, data=0x7f9ee80403f0,
poll_in=<optimized out>, poll_out=<optimized out>, poll_err=0,
event_thread_died=0 '\000') at socket.c:2950
#6  0x00007f9efc8f8870 in event_dispatch_epoll_handler (event=0x7f98fffc9e70,
event_pool=0x563fac588150) at event-epoll.c:643
#7  event_dispatch_epoll_worker (data=0x7f992c8ea110) at event-epoll.c:759
#8  0x00007f9efb6d5dd5 in start_thread (arg=0x7f98fffca700) at
pthread_create.c:307
#9  0x00007f9efaf9cead in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb) l
422                        means we need to re-add the fd with
423                        epoll_ctl(EPOLL_CTL_MOD) after delivery of every
424                        single event. This assures us that while a poller
425                        thread has picked up and is processing an event,
426                        another poller will not try to pick this at the same
427                        time as well.
428                     */
429     
430                     slot->events = EPOLLPRI | EPOLLHUP | EPOLLERR |
EPOLLONESHOT;
431                     slot->handler = handler;
(gdb) p *slot
$22 = {fd = -1, events = 1073741851, gen = 216, idx = 0, ref = 1, do_close = 1,
in_handler = 0, handled_error = 0, data = 0x7f9e3c050be0, handler =
0x7f9ef7989980 <socket_event_handler>, lock = {spinlock = 0, mutex = {__data =
{__lock = 0, __count = 0, __owner = 0, 
        __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev
= 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}},
poller_death = {next = 0x7f9e247f8208, prev = 0x7f9e247f8208}}
(gdb) p event_pool->slots_used[11]
$23 = 0
(gdb) p event_pool->slots_used[12]
$24 = 0
(gdb) p *slot
$25 = {fd = -1, events = 1073741851, gen = 216, idx = 0, ref = 1, do_close = 1,
in_handler = 0, handled_error = 0, data = 0x7f9e3c050be0, handler =
0x7f9ef7989980 <socket_event_handler>, lock = {spinlock = 0, mutex = {__data =
{__lock = 0, __count = 0, __owner = 0, 
        __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev
= 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}},
poller_death = {next = 0x7f9e247f8208, prev = 0x7f9e247f8208}}
(gdb) p slot->fd
$26 = -1
(gdb) p fd      
$27 = <optimized out>
(gdb) f 4
#4  0x00007f9efc8f7d04 in event_register_epoll (event_pool=0x563fac588150,
fd=<optimized out>, handler=<optimized out>, data=<optimized out>,
poll_in=<optimized out>, poll_out=<optimized out>, notify_poller_death=0
'\000') at event-epoll.c:417
417             assert (slot->fd == fd);
(gdb) p slot->fd
$28 = -1

--- Additional comment from Prashant Dhange on 2019-09-25 03:43:12 UTC ---

(In reply to Prashant Dhange from comment #33)
> (In reply to Prashant Dhange from comment #32)
> > (In reply to nravinas from comment #30)
> > > Hi, Nithya.
> > > 
> > > * I've been requested to try to help here. Unfortunately, setting a digicore
> > > environment to analyze this core is not working as expected:
> > > 
> > > IP: 10.72.93.42
> > > Username / password: root / redhat
> > > 
> > > cd /root/digicore
> > > 
> > > sh digicore.sh attach 02441282
> > > 
> > > Once attached to the container, I have downloaded a core file at /core
> > > 
> > > gdb /usr/sbin/glusterfsd core
> > > 
> > > .
> > > .
> > > .
> > > 
> > > Core was generated by `/usr/sbin/glusterfsd -s 10.7.199.47 --volfile-id
> > > vol_b763bb6afb974e6f80384673ba'.
> > > Program terminated with signal 6, Aborted.
> > > #0  0x00007f9efaed5207 in sigorset (dest=0x4193, left=0x5377, right=0x6) at
> > > sigorset.c:27
> > > 27	  if (dest == NULL || left == NULL || right == NULL)
> > > 
> > The above digicore environment has different gluster/glibc package
> > versions that do not match the customer's coredump environment, which is
> > why we are getting an incomplete backtrace.
> > 
> > We had passed the wrong installed-rpms list (rpm -qa) to digicore while
> > creating the digicore container, whereas the coredump was generated on an
> > environment having glusterfs 3.12.2-47.el7rhgs.x86_64 and glibc
> > 2.17-260.el7_6.4.x86_64
> > 
> > The above digicore container has glusterfs 3.12.2-47.4.el7rhgs.x86_64
> > and glibc 2.17-292.el7.x86_64:
> > [root at 7c3000b2e002 case]# rpm -qa |egrep 'gluster|glibc'
> > glibc-2.17-292.el7.x86_64                                   Tue Sep  3
> > 09:12:15 2019
> > glibc-common-2.17-292.el7.x86_64                            Tue Sep  3
> > 09:12:12 2019
> > glusterfs-3.12.2-47.4.el7rhgs.x86_64                        Sun Sep 15
> > 10:20:34 2019
> > glusterfs-api-3.12.2-47.4.el7rhgs.x86_64                    Sun Sep 15
> > 10:20:35 2019
> > glusterfs-client-xlators-3.12.2-47.4.el7rhgs.x86_64         Sun Sep 15
> > 10:20:34 2019
> > glusterfs-cli-3.12.2-47.4.el7rhgs.x86_64                    Sun Sep 15
> > 10:20:34 2019
> > glusterfs-events-3.12.2-47.4.el7rhgs.x86_64                 Sun Sep 15
> > 10:20:36 2019
> > glusterfs-fuse-3.12.2-47.4.el7rhgs.x86_64                   Sun Sep 15
> > 10:20:35 2019
> > glusterfs-geo-replication-3.12.2-47.4.el7rhgs.x86_64        Sun Sep 15
> > 10:20:36 2019
> > glusterfs-libs-3.12.2-47.4.el7rhgs.x86_64                   Sun Sep 15
> > 10:20:34 2019
> > glusterfs-rdma-3.12.2-47.4.el7rhgs.x86_64                   Sun Sep 15
> > 10:20:38 2019
> > glusterfs-server-3.12.2-47.4.el7rhgs.x86_64                 Sun Sep 15
> > 10:20:35 2019
> > gluster-block-0.2.1-32.el7rhgs.x86_64                       Sun Sep 15
> > 10:25:48 2019
> > gluster-nagios-addons-0.2.10-2.el7rhgs.x86_64               Wed Sep 11
> > 13:24:39 2019
> > gluster-nagios-common-0.2.4-1.el7rhgs.noarch                Wed Sep 11
> > 13:24:12 2019
> > libvirt-daemon-driver-storage-gluster-4.5.0-23.el7_7.1.x86_64 Sun Sep 15
> > 10:20:36 2019
> > python2-gluster-3.12.2-47.4.el7rhgs.x86_64                  Sun Sep 15
> > 10:20:36 2019
> > vdsm-gluster-4.30.18-1.0.el7rhgs.x86_64                     Sun Sep 15
> > 10:20:38 2019
> > 
> > Correcting glusterfs/glibc version to
> > 3.12.2-47.4.el7rhgs.x86_64/2.17-292.el7.x86_64 respectively on above
> > container should fix the issue and we should get complete backtrace here.
> Correction: it should be "Correcting glusterfs/glibc version to
> 3.12.2-47.el7rhgs.x86_64/2.17-260.el7_6.4.x86_64 respectively on above
> container should fix the issue and we should get complete backtrace here."

I have fixed the package mismatch on the above container, and we are now able
to get a proper backtrace.
[root at 7c3000b2e002 case]# rpm -qa|egrep 'gluster|glibc'
glusterfs-libs-3.12.2-47.el7rhgs.x86_64
glusterfs-client-xlators-3.12.2-47.el7rhgs.x86_64
glusterfs-cli-3.12.2-47.el7rhgs.x86_64
glusterfs-server-3.12.2-47.el7rhgs.x86_64
gluster-block-0.2.1-31.el7rhgs.x86_64
glibc-2.17-260.el7_6.4.x86_64
glibc-debuginfo-2.17-260.el7_6.4.x86_64
glusterfs-3.12.2-47.el7rhgs.x86_64
glusterfs-api-3.12.2-47.el7rhgs.x86_64
glusterfs-fuse-3.12.2-47.el7rhgs.x86_64
python2-gluster-3.12.2-47.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-47.el7rhgs.x86_64
glibc-common-2.17-260.el7_6.4.x86_64
glibc-debuginfo-common-2.17-260.el7_6.4.x86_64
glusterfs-debuginfo-3.12.2-47.el7rhgs.x86_64

[root at 7c3000b2e002 case]# gdb /usr/sbin/glusterfsd core 
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-115.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/sbin/glusterfsd...Reading symbols from
/usr/lib/debug/usr/sbin/glusterfsd.debug...done.
done.
....
....
[New LWP 21301]
[New LWP 21371]
[New LWP 23168]
[New LWP 117329]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/glusterfsd -s 10.7.199.47 --volfile-id
vol_b763bb6afb974e6f80384673ba'.
Program terminated with signal 6, Aborted.
#0  0x00007f9efaed5207 in __GI_raise (sig=sig at entry=6) at
../nptl/sysdeps/unix/sysv/linux/raise.c:55
55        return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) bt
#0  0x00007f9efaed5207 in __GI_raise (sig=sig at entry=6) at
../nptl/sysdeps/unix/sysv/linux/raise.c:55
#1  0x00007f9efaed68f8 in __GI_abort () at abort.c:90
#2  0x00007f9efaece026 in __assert_fail_base (fmt=0x7f9efb028ea0 "%s%s%s:%u:
%s%sAssertion `%s' failed.\n%n", assertion=assertion at entry=0x7f9efc94507a
"slot->fd == fd", file=file at entry=0x7f9efc945054 "event-epoll.c",
line=line at entry=417, 
    function=function at entry=0x7f9efc945420 <__PRETTY_FUNCTION__.11118>
"event_register_epoll") at assert.c:92
#3  0x00007f9efaece0d2 in __GI___assert_fail
(assertion=assertion at entry=0x7f9efc94507a "slot->fd == fd",
file=file at entry=0x7f9efc945054 "event-epoll.c", line=line at entry=417,
function=function at entry=0x7f9efc945420 <__PRETTY_FUNCTION__.11118>
"event_register_epoll")
    at assert.c:101
#4  0x00007f9efc8f7d04 in event_register_epoll (event_pool=0x563fac588150,
fd=<optimized out>, handler=<optimized out>, data=<optimized out>,
poll_in=<optimized out>, poll_out=<optimized out>, notify_poller_death=0
'\000') at event-epoll.c:417
#5  0x00007f9ef798ceb2 in socket_server_event_handler (fd=<optimized out>,
idx=<optimized out>, gen=<optimized out>, data=0x7f9ee80403f0,
poll_in=<optimized out>, poll_out=<optimized out>, poll_err=0,
event_thread_died=0 '\000') at socket.c:2950
#6  0x00007f9efc8f8870 in event_dispatch_epoll_handler (event=0x7f98fffc9e70,
event_pool=0x563fac588150) at event-epoll.c:643
#7  event_dispatch_epoll_worker (data=0x7f992c8ea110) at event-epoll.c:759
#8  0x00007f9efb6d5dd5 in start_thread (arg=0x7f98fffca700) at
pthread_create.c:307
#9  0x00007f9efaf9cead in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb)


So we have 2 coredump debug environments:
1. Natalia's ocs environment

ocs master:
ip-address : 10.74.251.54 
username : root
password : RedHat1! 

[root at vm251-54 ~]# oc get pods -o wide
NAME                                          READY     STATUS    RESTARTS  
AGE       IP              NODE                               NOMINATED NODE
glusterblock-storage-provisioner-dc-1-49bmc   1/1       Running   0         
11h       10.131.0.119    vm252-19.gsslab.pnq2.redhat.com    <none>
glusterfs-storage-mqkck                       1/1       Running   0         
11h       10.74.252.19    vm252-19.gsslab.pnq2.redhat.com    <none>
glusterfs-storage-w9xfz                       1/1       Running   0         
11h       10.74.254.205   vm254-205.gsslab.pnq2.redhat.com   <none>
glusterfs-storage-zzwd9                       1/1       Running   0         
11h       10.74.252.69    vm252-69.gsslab.pnq2.redhat.com    <none>
heketi-storage-1-8vwsc                        1/1       Running   0         
11h       10.131.0.117    vm252-19.gsslab.pnq2.redhat.com    <none>

[root at vm251-54 ~]# oc rsh glusterfs-storage-mqkck 
sh-4.2# cd /cores/
sh-4.2# gdb /usr/sbin/glusterfsd inf02-core.16787

2. digicore container mentioned in comment#30

--- Additional comment from Prashant Dhange on 2019-09-25 05:07:11 UTC ---

Echoing what Mohit has found earlier here in comment#8

(In reply to Mohit Agrawal from comment #8)
> Hi,
> 
> I have checked the gluster process logs specific to the brick crash, and I
> have not found any reason for the brick crash in the logs.
> There are no error messages in the logs just before the brick crash.
> 
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> 
> [2019-08-10 06:32:32.415690] I [addr.c:55:compare_addr_and_update]
> 0-/var/lib/heketi/mounts/vg_dac7467f7481ef3dc72981494e55c93c/
> brick_705d09c3f50d87b740ab828b8c819fee/brick: allowed = "*", received addr =
> "10.7.199.96"
> [2019-08-10 06:32:32.415729] I [MSGID: 115029]
> [server-handshake.c:817:server_setvolume]
> 0-vol_0005ed9e52ceb05f16b1bb48cd526ea9-server: accepted client from
> vlmazocpapp32.fisdev.local-57970-2019/08/10-06:32:32:332389-
> vol_ec6393163cc25b9fed3c1e5e2ff8375c-client-2-0-0 (version: 3.12.2)
> /lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xa0)[0x7f8e5b3a6cc0]
There was not enough memory available to hold the backtrace message.

> /lib64/libglusterfs.so.0(gf_print_trace+0x334)[0x7f8e5b3b0c04]
> /lib64/libc.so.6(+0x36280)[0x7f8e59a06280]
> /lib64/libc.so.6(gsignal+0x37)[0x7f8e59a06207]
> /lib64/libc.so.6(abort+0x148)[0x7f8e59a078f8]
> /lib64/libc.so.6(+0x2f026)[0x7f8e599ff026]
> /lib64/libc.so.6(+0x2f0d2)[0x7f8e599ff0d2]
> /lib64/libglusterfs.so.0(+0x88c4b)[0x7f8e5b403c4b]
> /usr/lib64/glusterfs/3.12.2/rpc-transport/socket.so(+0xcdaf)[0x7f8e564bedaf]
> /lib64/libglusterfs.so.0(+0x894c4)[0x7f8e5b4044c4]
> /lib64/libpthread.so.0(+0x7dd5)[0x7f8e5a205dd5]
> /lib64/libc.so.6(clone+0x6d)[0x7f8e59aceb3d]
> 
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> 
> On the system side, I have found that some processes (like heapster) got
> OOM killed multiple times due to the configured cgroup memory limit, but
> on the basis of the current logs I cannot say the same is true for gluster.
> We don't have a complete backtrace, and the backtrace pattern available in
> the logs is not sufficient to find out the reason for the crash.
> 
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> 
> [11799234.175888] SELinux: mount invalid.  Same superblock, different
> security settings for (dev mqueue, type mqueue)
> [11799235.185554] XFS (dm-1222): Unmounting Filesystem
> [11799532.330862] heapster invoked oom-killer: gfp_mask=0x50, order=0,
> oom_score_adj=985
> [11799532.341972] heapster
> cpuset=docker-
> 6e4a91e46c97a87310aab482bdd8b5350d58b0dc2c537f56769f3deddd73246e.scope
> mems_allowed=0
> [11799532.349607] CPU: 5 PID: 95365 Comm: heapster Not tainted
> 3.10.0-957.10.1.el7.x86_64 #1
> [11799532.355169] Hardware name: Microsoft Corporation Virtual
> Machine/Virtual Machine, BIOS 090007  06/02/2017
> [11799532.361995] Call Trace:
> [11799532.364181]  [<ffffffffae162e41>] dump_stack+0x19/0x1b
> [11799532.367727]  [<ffffffffae15d86a>] dump_header+0x90/0x229
> [11799532.371755]  [<ffffffffadbba156>] ? find_lock_task_mm+0x56/0xc0
> [11799532.375694]  [<ffffffffadc31628>] ?
> try_get_mem_cgroup_from_mm+0x28/0x60
> [11799532.381063]  [<ffffffffadbba604>] oom_kill_process+0x254/0x3d0
> [11799532.385394]  [<ffffffffadc35406>]
> mem_cgroup_oom_synchronize+0x546/0x570
> [11799532.389867]  [<ffffffffadc34880>] ? mem_cgroup_charge_common+0xc0/0xc0
> [11799532.393860]  [<ffffffffadbbae94>] pagefault_out_of_memory+0x14/0x90
> [11799532.397908]  [<ffffffffae15bd72>] mm_fault_error+0x6a/0x157
> [11799532.401558]  [<ffffffffae1707a8>] __do_page_fault+0x3c8/0x500
> [11799532.406047]  [<ffffffffae170915>] do_page_fault+0x35/0x90
> [11799532.410130]  [<ffffffffae16c758>] page_fault+0x28/0x30
> [11799532.419650] Task in
> /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-
> pod76432b3c_a713_11e9_93ef_000d3a046b95.slice/docker-
> 6e4a91e46c97a87310aab482bdd8b5350d58b0dc2c537f56769f3deddd73246e.scope
> killed as a result of limit of
> /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-
> pod76432b3c_a713_11e9_93ef_000d3a046b95.slice
> [11799532.446763] memory: usage 3662108kB, limit 3662108kB, failcnt 5883951
> [11799532.450874] memory+swap: usage 3662108kB, limit 9007199254740988kB,
> failcnt 0
> [11799532.455828] kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
> [11799532.459804] Memory cgroup stats for
> /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-
> pod76432b3c_a713_11e9_93ef_000d3a046b95.slice: cache:0KB rss:0KB
> rss_huge:0KB mapped_file:0KB swap:0KB inactive_anon:0KB active_anon:0KB
> inactive_file:0KB active_file:0KB unevictable:0KB
> [11799532.476295] Memory cgroup stats for
> /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-
> pod76432b3c_a713_11e9_93ef_000d3a046b95.slice/docker-
> 56f2a10610e0295e6716b9de5e32334b96be7201b37f8160f728af4c3e366b08.scope:
> cache:0KB rss:428KB rss_huge:0KB mapped_file:0KB swap:0KB inactive_anon:0KB
> active_anon:428KB inactive_file:0KB active_file:0KB unevictable:0KB
> [11799532.500710] Memory cgroup stats for
> /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-
> pod76432b3c_a713_11e9_93ef_000d3a046b95.slice/docker-
> 6e4a91e46c97a87310aab482bdd8b5350d58b0dc2c537f56769f3deddd73246e.scope:
> cache:452KB rss:3661228KB rss_huge:0KB mapped_file:452KB swap:0KB
> inactive_anon:0KB active_anon:3661224KB inactive_file:0KB active_file:368KB
> unevictable:0KB
> [11799532.521834] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents
> oom_score_adj name
> 
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
The cgroup memory limit is not allowing glusterfsd to allocate the additional
pages it requested here.

> 
> I have not found any logs indicating high memory consumption by a gluster
> process
> 
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> 
> messages:Aug  5 04:48:38 vlmazocpinf01 kernel: [11799532.330862] heapster
> invoked oom-killer: gfp_mask=0x50, order=0, oom_score_adj=985
> messages:Aug  5 04:48:38 vlmazocpinf01 kernel: [11799532.381063] 
> [<ffffffffadbba604>] oom_kill_process+0x254/0x3d0
> messages:Aug  5 04:48:38 vlmazocpinf01 kernel: [11799532.419650] Task in
> /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-
> pod76432b3c_a713_11e9_93ef_000d3a046b95.slice/docker-
> 6e4a91e46c97a87310aab482bdd8b5350d58b0dc2c537f56769f3deddd73246e.scope
> killed as a result of limit of
> /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-
> pod76432b3c_a713_11e9_93ef_000d3a046b95.slice
> messages:Aug  5 04:48:38 vlmazocpinf01 kernel: [11799532.545283] Memory
> cgroup out of memory: Kill process 107416 (heapster) score 1985 or sacrifice
> child
> messages:Aug  5 04:48:38 vlmazocpinf01 kernel: [11799532.551985] Killed
> process 101585 (heapster) total-vm:5095012kB, anon-rss:3659164kB,
> file-rss:0kB, shmem-rss:0kB
> grep: messages-20190728: No such file or directory
> messages-20190804:Jul 31 03:06:10 vlmazocpinf01 kernel: [11361383.951387]
> heapster invoked oom-killer: gfp_mask=0x50, order=0, oom_score_adj=985
> messages-20190804:Jul 31 03:06:10 vlmazocpinf01 kernel: [11361384.041571] 
> [<ffffffffadbba604>] oom_kill_process+0x254/0x3d0
> messages-20190804:Jul 31 03:06:10 vlmazocpinf01 kernel: [11361384.112290]
> Task in
> /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-
> pod76432b3c_a713_11e9_93ef_000d3a046b95.slice/docker-
> 3d1520b587fa1705dd72e9aacea58b31ae8b3d1b90592e1d00453f2292af1d20.scope
> killed as a result of limit of
> /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-
> pod76432b3c_a713_11e9_93ef_000d3a046b95.slice
> messages-20190804:Jul 31 03:06:10 vlmazocpinf01 kernel: [11361384.353431]
> Memory cgroup out of memory: Kill process 40432 (heapster) score 1985 or
> sacrifice child
> messages-20190804:Jul 31 03:06:10 vlmazocpinf01 kernel: [11361384.368790]
> Killed process 9732 (heapster) total-vm:4980396kB, anon-rss:3659704kB,
> file-rss:0kB, shmem-rss:0kB
> 
> 
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> 
> Can we ask whether the customer observed any odd behavior at the time of
> the brick crash?
> 
> Thanks,
> Mohit Agrawal

The crash seems to be because event_slot_alloc could not find a slot for the
new socket fd=1340; it looks like __event_newtable failed to allocate memory
for a new table.

--- Additional comment from Mohit Agrawal on 2019-09-25 05:18:40 UTC ---

Hi,

It seems the brick process is crashing because the function event_slot_alloc
is not able to return a valid slot.

bt
#0  0x00007f9efaed5207 in __GI_raise (sig=sig at entry=6) at
../nptl/sysdeps/unix/sysv/linux/raise.c:55
#1  0x00007f9efaed68f8 in __GI_abort () at abort.c:90
#2  0x00007f9efaece026 in __assert_fail_base (fmt=0x7f9efb028ea0 "%s%s%s:%u:
%s%sAssertion `%s' failed.\n%n", 
    assertion=assertion at entry=0x7f9efc94507a "slot->fd == fd",
file=file at entry=0x7f9efc945054 "event-epoll.c", 
    line=line at entry=417, function=function at entry=0x7f9efc945420
<__PRETTY_FUNCTION__.11118> "event_register_epoll")
    at assert.c:92
#3  0x00007f9efaece0d2 in __GI___assert_fail
(assertion=assertion at entry=0x7f9efc94507a "slot->fd == fd", 
    file=file at entry=0x7f9efc945054 "event-epoll.c", line=line at entry=417, 
    function=function at entry=0x7f9efc945420 <__PRETTY_FUNCTION__.11118>
"event_register_epoll") at assert.c:101
#4  0x00007f9efc8f7d04 in event_register_epoll (event_pool=0x563fac588150,
fd=<optimized out>, handler=<optimized out>, 
    data=<optimized out>, poll_in=<optimized out>, poll_out=<optimized out>,
notify_poller_death=0 '\000')
    at event-epoll.c:417
#5  0x00007f9ef798ceb2 in socket_server_event_handler (fd=<optimized out>,
idx=<optimized out>, gen=<optimized out>, 
    data=0x7f9ee80403f0, poll_in=<optimized out>, poll_out=<optimized out>,
poll_err=0, event_thread_died=0 '\000')
    at socket.c:2950
#6  0x00007f9efc8f8870 in event_dispatch_epoll_handler (event=0x7f98fffc9e70,
event_pool=0x563fac588150) at event-epoll.c:643
#7  event_dispatch_epoll_worker (data=0x7f992c8ea110) at event-epoll.c:759
#8  0x00007f9efb6d5dd5 in start_thread (arg=0x7f98fffca700) at
pthread_create.c:307
#9  0x00007f9efaf9cead in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb) f 4
#4  0x00007f9efc8f7d04 in event_register_epoll (event_pool=0x563fac588150,
fd=<optimized out>, handler=<optimized out>, 
    data=<optimized out>, poll_in=<optimized out>, poll_out=<optimized out>,
notify_poller_death=0 '\000')
    at event-epoll.c:417
417             assert (slot->fd == fd);
(gdb) p slot
$3184 = (struct event_slot_epoll *) 0x7f9e247f81b0
(gdb) p *slot
$3185 = {fd = -1, events = 1073741851, gen = 216, idx = 0, ref = 1, do_close =
1, in_handler = 0, handled_error = 0, 
  data = 0x7f9e3c050be0, handler = 0x7f9ef7989980 <socket_event_handler>, lock
= {spinlock = 0, mutex = {__data = {
        __lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins
= 0, __elision = 0, __list = {__prev = 0x0, 
          __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}},
poller_death = {next = 0x7f9e247f8208, 
    prev = 0x7f9e247f8208}}

p event_pool->slots_used
$3188 = {1021, 10, 0 <repeats 1022 times>}
f 4
After printing the complete fd table, I am not able to find any entry for
socket 1340
set $tmp = 1
while ($tmp != 1023)
print ((struct event_slot_epoll *)event_pool->ereg[0])[$tmp].fd
set $tmp = $tmp + 1
f 5
p new_sock
1340


As per the current code of event_slot_alloc, it first checks the value of
slots_used to find a table with a free entry. The current bt shows that
slots_used is 1021 (less than 1024), which means the table should still have
free slots, so it selects the table at registry index 0.

>>>>>>>>>>>>>>>>>>>>>

for (i = 0; i < EVENT_EPOLL_TABLES; i++) {
        switch (event_pool->slots_used[i]) {
            case EVENT_EPOLL_SLOTS:
                continue;
            case 0:
                if (!event_pool->ereg[i]) {
                    table = __event_newtable(event_pool, i);
                    if (!table)
                        return -1;
                } else {
                    table = event_pool->ereg[i];
                }
                break;
            default:
                table = event_pool->ereg[i];
                break;
        }

        if (table)
            /* break out of the loop */
            break;
    }

    if (!table)
        return -1;

    table_idx = i;

>>>>>>>>>>>>>>>>>>>>>>>>>

In the code below it scans the table for a free entry. As per the current
slots_used value, ideally 3 entries should be free in the table, but somehow
no entry is free here.

The code does not validate that the fd was actually assigned in the table.
It just returns the idx.

>>>>>>>>>>>>>>>>>>>>>>>>>>

    for (i = 0; i < EVENT_EPOLL_SLOTS; i++) {
        if (table[i].fd == -1) {
            /* wipe everything except bump the generation */
            gen = table[i].gen;
            memset(&table[i], 0, sizeof(table[i]));
            table[i].gen = gen + 1;

            LOCK_INIT(&table[i].lock);
            INIT_LIST_HEAD(&table[i].poller_death);

            table[i].fd = fd;
            if (notify_poller_death) {
                table[i].idx = table_idx * EVENT_EPOLL_SLOTS + i;
                list_add_tail(&table[i].poller_death,
                              &event_pool->poller_death);
            }

            event_pool->slots_used[table_idx]++;

            break;
        }
    }

>>>>>>>>>>>>>>>>>
    return table_idx * EVENT_EPOLL_SLOTS + i;


I think we need to update the code.
I have checked the event code, but I am not able to figure out why slots_used
is not showing the correct value.
Ideally the slots_used value should be 1024, because no index in the registry
table is free, and before returning the index the code should validate
whether the fd was successfully assigned or not.

I am still checking the code and need some more time to understand the issue.

Thanks,
Mohit Agrawal

--- Additional comment from Jay Samson on 2019-09-27 13:26:36 UTC ---

Hello,
May I know if we have an update that can be passed to the customer?

Regards,
Jay

--- Additional comment from Mohit Agrawal on 2019-09-30 10:23:38 UTC ---

Hi Jay,

 We are still working on this. I don't have any update yet.
 We are suspecting there is some race condition in socket code path. 
 Do you have any idea what is the frequency of getting this crash?
 Is the issue easily reproducible, or can we ask the customer for a reproducer?

Thanks,
Mohit Agrawal

--- Additional comment from Mohit Agrawal on 2019-10-01 11:34:26 UTC ---

Hi Jay,

Can you confirm with the customer whether it is possible to test a patch?
On the basis of the core alone, we are not able to find the root cause of the
race. Does the customer have a test environment where the issue is easily
reproducible?


Thanks,
Mohit Agrawal

--- Additional comment from Jay Samson on 2019-10-01 11:48:10 UTC ---

Hello Mohit,
The customer has updated this today:

" Actually, in taking a look, it does look like one of the gluster pods core
dumped saturday morning. I'm grabbing the core file now to upload. The problem
is, even though we do tend to see a similar trend here with WHEN the files are
generated (usually saturday mornings 3am-7am or so) I have a really hard time
trying to figure out just what is going on in the cluster at that time that may
be causing this, or trying to map load back to specific pvcs, etc.. 

Last week we updated these nodes to RHEL 7.7 and they were rebooted as well,
just FYI."

He has also attached the core file. I have copied it to supportshell. (Please
let me know if you have a system IP or something where I can copy this file)

I will verify the feasibility of installing a test patch with them and let you
know their inputs. (I'll keep the needinfo on me until this is available)

Thanks,
Jay

--- Additional comment from Mohit Agrawal on 2019-10-01 13:33:33 UTC ---

Hi,

 I don't have any extra system; kindly upload the new corefile to the same
system.

Regards,
Mohit Agrawal

--- Additional comment from Jay Samson on 2019-10-01 14:26:16 UTC ---

Hello,
The latest corefile is copied to 10.72.93.42 under /

filename: core.9838.gz

Regards,
Jay

--- Additional comment from Mohit Agrawal on 2019-10-01 15:04:05 UTC ---

Hi Jay,

  It seems there is sufficient space available on the system; can you please
untar the core and set up a gdb environment to debug it? Kindly ask for the
logs as well if they are not available.

Regards,
Mohit Agrawal

--- Additional comment from Raghavendra G on 2019-10-01 17:21:16 UTC ---

(In reply to Mohit Agrawal from comment #38)
> 
> I think we need to update the code.
> I have checked the event code, but I am not able to figure out why
> slots_used is not showing the correct value.
> Ideally the slots_used value should be 1024, because no index in the
> registry table is free, and before returning the index the code should
> validate whether the fd was successfully assigned or not.
> 
> I am still checking the code, need some more time to understand the issue.

I found the reason why slots_used for table 0 was wrong. From one of the logs
of a crashed brick process:

var-lib-heketi-mounts-vg_862823908fb7cc063683414098b0ab22-brick_99bcb9083a46f31076bfc008c88791ef-brick.log-20190817:[2019-08-17
04:14:03.801645] E [MSGID: 101063]
[event-epoll.c:611:event_dispatch_epoll_handler] 0-epoll: stale fd found on
idx=500, gen=130, events=1, slot->gen=132

The msg "stale fd found.." indicates that the slot is already deallocated and
hence slot->ref would be 0. However:
1. event_dispatch_epoll_handler would call slot_ref on this slot (ref would
be 0+1 = 1 at this point)
2. event_dispatch_epoll_handler identifies that it is a stale fd, hence the
slot is not valid, and exits
3. event_dispatch_epoll_handler would call slot_unref on this slot (ref would
be 1-1 = 0 again, resulting in a second dealloc without any slot_alloc)

Note that only event_slot_alloc increments event_pool->slots_used[table_idx]
(event_slot_get doesn't). This means for this slot there was one call to
event_slot_alloc but two calls to event_slot_dealloc and hence the value of
slots_used[0] would actually be 1 less than the actual number of slots
allocated in table 0.

Coming back to the log message printed above, idx=500 means this slot belongs
to table 0, and hence there is wrong accounting (1 less than the actual
value) in table 0. So if event_pool->slots_used[0] is 1023, the table is
actually full. But event_slot_alloc would wrongly believe there is one slot
available and will loop through all slots of table[0] without any success.
At the end of __event_slot_alloc we don't actually test whether we got an
empty slot, but instead assume that the iterator 'i' is pointing to an empty
slot in table 0 and construct an index out of it. As Mohit has already
pointed out above, since the index is not pointing to a valid slot it doesn't
contain the expected fd, resulting in the assertion failure.

The root cause here is that slot->ref is not incremented atomically when the
slot is allocated. Instead it is done later, as part of event_slot_get. If in
this window we happen to run into ref/unref cycles (as explained above), it
results in more calls to event_slot_dealloc than needed, and hence in wrong
accounting of slots_used in the slot table.

The fix would be:
1. increment slot->ref atomically in __event_slot_alloc
2. add checks in __event_slot_alloc to verify that it actually returns a
valid slot instead of assuming it does.

> 
> Thanks,
> Mohit Agrawal

--- Additional comment from Raghavendra G on 2019-10-02 02:46:39 UTC ---

(In reply to Raghavendra G from comment #46)
>
> The root cause here is that the slot->ref is not incremented atomically when
> slot is allocated. Instead it is done later as part of event_slot_get. If in
> this window if we happen to run into ref/unref cycles (as explained above)
> it would result in more number of calls to event_slot_deallocation than
> actually needed resulting in wrong accounting of slots_used in slot table.
> 
> The fix would be:
> 1. increment slot->ref atomically in __event_slot_alloc
> 2. Add checks to __event_slot_alloc whether it actually returns a valid slot
> instead of assuming it does.


While the problem I mentioned above, the non-atomic increment of slot->ref
while allocating a slot, is a valid one, the other problem is ref/unref
cycles on a deallocated slot, which result in deallocating an already
deallocated slot. That problem won't be fixed by atomically incrementing
slot->ref in __event_slot_alloc. The fix for it would require preventing
ref/unref cycles and/or deallocation of an unused socket. One trivial fix
that comes to my mind is to not return a slot at all (instead return NULL)
if the index is not pointing to a valid slot (a valid slot will have a
non-negative fd and a generation number matching the value passed as an
argument to event_slot_get). We can continue discussions on this over the
patch on gerrit.

--- Additional comment from Mohit Agrawal on 2019-10-03 03:29:35 UTC ---

Hi Jay,

  It seems there is not sufficient space available on the system
(10.72.93.42); can you please unzip the core and set up a gdb environment to
debug it? Kindly ask for the logs as well if they are not available.

Regards,
Mohit Agrawal

--- Additional comment from Jay Samson on 2019-10-03 13:28:34 UTC ---

Hi Mohit,

The core debug env is set-up (access instructions same as Comment #30)

# gdb /usr/sbin/glusterfsd core.9838
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-115.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/sbin/glusterfsd...Reading symbols from
/usr/lib/debug/usr/sbin/glusterfsd.debug...done.
done.


Core was generated by `/usr/sbin/glusterfsd -s 10.7.199.47 --volfile-id
vol_ad9ae40b319276e6e36ff24f81'.
Program terminated with signal 6, Aborted.
#0  0x00007fb134b52207 in __GI_raise (sig=sig at entry=6) at
../nptl/sysdeps/unix/sysv/linux/raise.c:55
55        return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);

(gdb) t a a bt

Thread 3594 (Thread 0x7fae1f7fe700 (LWP 12072)):
#0  0x00007fb134c10f73 in select () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fb12d8d4aca in changelog_ev_dispatch (data=0x7fae9beba838) at
changelog-ev-handle.c:350
#2  0x00007fb135352dd5 in start_thread (arg=0x7fae1f7fe700) at
pthread_create.c:307
#3  0x00007fb134c19ead in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 3593 (Thread 0x7fa90143a700 (LWP 46542)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at
../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007fb12c5f4f50 in iot_worker (data=0x7fac1ec29890) at io-threads.c:194
#2  0x00007fb135352dd5 in start_thread (arg=0x7fa90143a700) at
pthread_create.c:307
#3  0x00007fb134c19ead in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 3592 (Thread 0x7faa7f7fe700 (LWP 14802)):
#0  0x00007fb134be0e2d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fb134be0cc4 in __sleep (seconds=0, seconds at entry=5) at
../sysdeps/unix/sysv/linux/sleep.c:137
#2  0x00007fb12ea3f5c1 in posix_disk_space_check_thread_proc
(data=0x7fab07c5d7a0) at posix-helpers.c:1920
#3  0x00007fb135352dd5 in start_thread (arg=0x7faa7f7fe700) at
pthread_create.c:307
#4  0x00007fb134c19ead in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 3591 (Thread 0x7fabb6158700 (LWP 14062)):
#0  0x00007fb134c10f73 in select () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fb12d8d4aca in changelog_ev_dispatch (data=0x7fac0685e0b8) at
changelog-ev-handle.c:350
#2  0x00007fb135352dd5 in start_thread (arg=0x7fabb6158700) at
pthread_create.c:307
#3  0x00007fb134c19ead in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 3590 (Thread 0x7fad5ad98700 (LWP 12601)):
#0  0x00007fb134c10f73 in select () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fb12d8d4aca in changelog_ev_dispatch (data=0x7fad95d8c0a8) at
changelog-ev-handle.c:350
#2  0x00007fb135352dd5 in start_thread (arg=0x7fad5ad98700) at
pthread_create.c:307
#3  0x00007fb134c19ead in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:111

<snipped>

Regards,
Jay

--- Additional comment from Mohit Agrawal on 2019-10-14 04:57:03 UTC ---

Hi,

Thanks for setting up the environment to debug the core.
The new core (./case/core.9838) is also similar to the older one.

>>>>>>>>>>>>>>>>>>>

#0  0x00007fb134b52207 in __GI_raise (sig=sig at entry=6) at
../nptl/sysdeps/unix/sysv/linux/raise.c:55
#1  0x00007fb134b538f8 in __GI_abort () at abort.c:90
#2  0x00007fb134b4b026 in __assert_fail_base (fmt=0x7fb134ca5ea0 "%s%s%s:%u:
%s%sAssertion `%s' failed.\n%n", 
    assertion=assertion at entry=0x7fb1365c207a "slot->fd == fd",
file=file at entry=0x7fb1365c2054 "event-epoll.c", 
    line=line at entry=417, function=function at entry=0x7fb1365c2420
<__PRETTY_FUNCTION__.11118> "event_register_epoll")
    at assert.c:92
#3  0x00007fb134b4b0d2 in __GI___assert_fail
(assertion=assertion at entry=0x7fb1365c207a "slot->fd == fd", 
    file=file at entry=0x7fb1365c2054 "event-epoll.c", line=line at entry=417, 
    function=function at entry=0x7fb1365c2420 <__PRETTY_FUNCTION__.11118>
"event_register_epoll") at assert.c:101
#4  0x00007fb136574d04 in event_register_epoll (event_pool=0x561792cb1150,
fd=<optimized out>, handler=<optimized out>, 
    data=<optimized out>, poll_in=<optimized out>, poll_out=<optimized out>,
notify_poller_death=0 '\000')
    at event-epoll.c:417
#5  0x00007fb131509eb2 in socket_server_event_handler (fd=<optimized out>,
idx=<optimized out>, gen=<optimized out>, 
    data=0x7fb12003eee0, poll_in=<optimized out>, poll_out=<optimized out>,
poll_err=0, event_thread_died=0 '\000')
    at socket.c:2950
#6  0x00007fb136575870 in event_dispatch_epoll_handler (event=0x7faceb1ece30,
event_pool=0x561792cb1150) at event-epoll.c:643
#7  event_dispatch_epoll_worker (data=0x7fad1d7569a0) at event-epoll.c:759
#8  0x00007fb135352dd5 in start_thread (arg=0x7faceb1ed700) at
pthread_create.c:307
#9  0x00007fb134c19ead in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:111

>>>>>>>>>>>>>>>>>>>.

Thanks,
Mohit Agrawal

--- Additional comment from Mohit Agrawal on 2019-10-18 06:01:22 UTC ---

Hi,

It seems the brick process is crashing because the function event_slot_alloc
is not able to return a valid slot.

bt
#0  0x00007f9efaed5207 in __GI_raise (sig=sig at entry=6) at
../nptl/sysdeps/unix/sysv/linux/raise.c:55
#1  0x00007f9efaed68f8 in __GI_abort () at abort.c:90
#2  0x00007f9efaece026 in __assert_fail_base (fmt=0x7f9efb028ea0 "%s%s%s:%u:
%s%sAssertion `%s' failed.\n%n", 
    assertion=assertion at entry=0x7f9efc94507a "slot->fd == fd",
file=file at entry=0x7f9efc945054 "event-epoll.c", 
    line=line at entry=417, function=function at entry=0x7f9efc945420
<__PRETTY_FUNCTION__.11118> "event_register_epoll")
    at assert.c:92
#3  0x00007f9efaece0d2 in __GI___assert_fail
(assertion=assertion at entry=0x7f9efc94507a "slot->fd == fd", 
    file=file at entry=0x7f9efc945054 "event-epoll.c", line=line at entry=417, 
    function=function at entry=0x7f9efc945420 <__PRETTY_FUNCTION__.11118>
"event_register_epoll") at assert.c:101
#4  0x00007f9efc8f7d04 in event_register_epoll (event_pool=0x563fac588150,
fd=<optimized out>, handler=<optimized out>, 
    data=<optimized out>, poll_in=<optimized out>, poll_out=<optimized out>,
notify_poller_death=0 '\000')
    at event-epoll.c:417
#5  0x00007f9ef798ceb2 in socket_server_event_handler (fd=<optimized out>,
idx=<optimized out>, gen=<optimized out>, 
    data=0x7f9ee80403f0, poll_in=<optimized out>, poll_out=<optimized out>,
poll_err=0, event_thread_died=0 '\000')
    at socket.c:2950
#6  0x00007f9efc8f8870 in event_dispatch_epoll_handler (event=0x7f98fffc9e70,
event_pool=0x563fac588150) at event-epoll.c:643
#7  event_dispatch_epoll_worker (data=0x7f992c8ea110) at event-epoll.c:759
#8  0x00007f9efb6d5dd5 in start_thread (arg=0x7f98fffca700) at
pthread_create.c:307
#9  0x00007f9efaf9cead in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb) f 4
#4  0x00007f9efc8f7d04 in event_register_epoll (event_pool=0x563fac588150,
fd=<optimized out>, handler=<optimized out>, 
    data=<optimized out>, poll_in=<optimized out>, poll_out=<optimized out>,
notify_poller_death=0 '\000')
    at event-epoll.c:417
417             assert (slot->fd == fd);
(gdb) p slot
$3184 = (struct event_slot_epoll *) 0x7f9e247f81b0
(gdb) p *slot
$3185 = {fd = -1, events = 1073741851, gen = 216, idx = 0, ref = 1, do_close =
1, in_handler = 0, handled_error = 0, 
  data = 0x7f9e3c050be0, handler = 0x7f9ef7989980 <socket_event_handler>, lock
= {spinlock = 0, mutex = {__data = {
        __lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins
= 0, __elision = 0, __list = {__prev = 0x0, 
          __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}},
poller_death = {next = 0x7f9e247f8208, 
    prev = 0x7f9e247f8208}}

p event_pool->slots_used
$3188 = {1021, 10, 0 <repeats 1022 times>}
f 4
After printing the complete fd table, I am not able to find any entry for
socket 1340
set $tmp = 1
while ($tmp != 1023)
print ((struct event_slot_epoll *)event_pool->ereg[0])[$tmp].fd
set $tmp = $tmp + 1
f 5
p new_sock
1340


As per the current code of event_slot_alloc, it first checks the value of
slots_used to find a table with a free entry. The current bt shows that
slots_used is 1021 (less than 1024), which means the table should still have
free slots, so it selects the table at registry index 0.

>>>>>>>>>>>>>>>>>>>>>

for (i = 0; i < EVENT_EPOLL_TABLES; i++) {
        switch (event_pool->slots_used[i]) {
            case EVENT_EPOLL_SLOTS:
                continue;
            case 0:
                if (!event_pool->ereg[i]) {
                    table = __event_newtable(event_pool, i);
                    if (!table)
                        return -1;
                } else {
                    table = event_pool->ereg[i];
                }
                break;
            default:
                table = event_pool->ereg[i];
                break;
        }

        if (table)
            /* break out of the loop */
            break;
    }

    if (!table)
        return -1;

    table_idx = i;

>>>>>>>>>>>>>>>>>>>>>>>>>

In the code below it scans the table for a free entry. As per the current
slots_used value, ideally 3 entries should be free in the table, but somehow
no entry is free here.

The code does not validate that the fd was actually assigned in the table.
It just returns the idx.

>>>>>>>>>>>>>>>>>>>>>>>>>>

    for (i = 0; i < EVENT_EPOLL_SLOTS; i++) {
        if (table[i].fd == -1) {
            /* wipe everything except bump the generation */
            gen = table[i].gen;
            memset(&table[i], 0, sizeof(table[i]));
            table[i].gen = gen + 1;

            LOCK_INIT(&table[i].lock);
            INIT_LIST_HEAD(&table[i].poller_death);

            table[i].fd = fd;
            if (notify_poller_death) {
                table[i].idx = table_idx * EVENT_EPOLL_SLOTS + i;
                list_add_tail(&table[i].poller_death,
                              &event_pool->poller_death);
            }

            event_pool->slots_used[table_idx]++;

            break;
        }
    }

>>>>>>>>>>>>>>>>>
    return table_idx * EVENT_EPOLL_SLOTS + i;


I think we need to update the code.
I have checked the event code, but I am not able to figure out why slots_used
is not showing the correct value.
Ideally the slots_used value should be 1024, because no index in the registry
table is free, and before returning the index the code should validate whether
the fd was successfully assigned or not.

RCA: The slot->ref is not incremented atomically when the slot is allocated;
instead it is done later, as part of event_slot_get. If, in this window, we
happen to run into ref/unref cycles (as explained above), it results in more
calls to event_slot_dealloc than actually needed, and hence wrong accounting
of slots_used in the slot table.

The fix would be:
1. Increment slot->ref atomically in __event_slot_alloc.
2. Add checks to __event_slot_alloc so that it returns a valid slot (or an
error) instead of assuming a free entry exists.

Thanks,
Mohit Agrawal

--- Additional comment from Mohit Agrawal on 2019-10-18 06:05:29 UTC ---

Patch is posted to resolve the same
https://review.gluster.org/#/c/glusterfs/+/23508/

--- Additional comment from Worker Ant on 2019-10-22 13:47:20 UTC ---

REVIEW: https://review.gluster.org/23508 (rpc: Synchronize slot allocation
code) merged (#7) on master by Raghavendra G


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1763036
[Bug 1763036] glusterfsd crashed with "'MemoryError' Cannot access memory at
address"
https://bugzilla.redhat.com/show_bug.cgi?id=1778175
[Bug 1778175] glusterfsd crashed with "'MemoryError' Cannot access memory at
address"
https://bugzilla.redhat.com/show_bug.cgi?id=1778182
[Bug 1778182] glusterfsd crashed with "'MemoryError' Cannot access memory at
address"
https://bugzilla.redhat.com/show_bug.cgi?id=1804523
[Bug 1804523] glusterfsd crashed with "'MemoryError' Cannot access memory at
address"