[Bugs] [Bug 1716626] New: Invalid memory access while executing cleanup_and_exit

bugzilla at redhat.com bugzilla at redhat.com
Mon Jun 3 19:22:48 UTC 2019


https://bugzilla.redhat.com/show_bug.cgi?id=1716626

            Bug ID: 1716626
           Summary: Invalid memory access while executing cleanup_and_exit
           Product: Red Hat Gluster Storage
           Version: rhgs-3.5
            Status: NEW
         Component: replicate
          Keywords: Reopened
          Assignee: ksubrahm at redhat.com
          Reporter: rkavunga at redhat.com
        QA Contact: nchilaka at redhat.com
                CC: bugs at gluster.org, pkarampu at redhat.com,
                    rhs-bugs at redhat.com, sankarshan at redhat.com,
                    storage-qa-internal at redhat.com
        Depends On: 1708926
  Target Milestone: ---
    Classification: Red Hat



+++ This bug was initially created as a clone of Bug #1708926 +++

Description of problem:

While executing cleanup_and_exit, the shd daemon crashes. This is because a
parallel graph-free thread may be executing another cleanup on the same graph
object at the same time, leading to an invalid memory access.
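
To make the race concrete, here is a minimal, self-contained C sketch (all
names are hypothetical; this is not the GlusterFS source) of one thread
freeing a shared graph object while another thread is still reading it.
Serializing both paths under one mutex, in the spirit of the eventual fix
("glusterfsd/cleanup: Protect graph object under a lock"), avoids the
invalid access:

    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical stand-in for a graph object; not glusterfs_graph_t. */
    struct graph {
        int active;
    };

    static struct graph *shared_graph;
    static pthread_mutex_t graph_lock = PTHREAD_MUTEX_INITIALIZER;

    /* Cleanup path: frees the graph, as cleanup_and_exit ultimately does. */
    static void *cleanup_thread(void *arg)
    {
        (void)arg;
        pthread_mutex_lock(&graph_lock);
        free(shared_graph);
        shared_graph = NULL;  /* without the lock, a reader may already
                                 hold a dangling pointer at this point */
        pthread_mutex_unlock(&graph_lock);
        return NULL;
    }

    /* Reader path: inspects the graph, like the volfile-attach/log path. */
    static void *reader_thread(void *arg)
    {
        (void)arg;
        pthread_mutex_lock(&graph_lock);
        if (shared_graph != NULL)
            printf("graph->active = %d\n", shared_graph->active);
        pthread_mutex_unlock(&graph_lock);
        return NULL;
    }

    int main(void)
    {
        pthread_t reader, cleaner;

        shared_graph = calloc(1, sizeof(*shared_graph));
        pthread_create(&reader, NULL, reader_thread, NULL);
        pthread_create(&cleaner, NULL, cleanup_thread, NULL);
        pthread_join(reader, NULL);
        pthread_join(cleaner, NULL);
        return 0;
    }

In the real daemon the contended object is the volume graph and the two
paths are cleanup_and_exit versus graph attach/free, but the locking
principle is the same.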

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Run ./tests/bugs/glusterd/reset-brick-and-daemons-follow-quorum.t in a loop

Actual results:


Expected results:


Additional info:

--- Additional comment from Worker Ant on 2019-05-11 17:59:31 UTC ---

REVIEW: https://review.gluster.org/22709 (glusterfsd/cleanup: Protect graph
object under a lock) posted (#1) for review on master by mohammed rafi kc

--- Additional comment from Pranith Kumar K on 2019-05-14 07:09:23 UTC ---

Rafi,
      Could you share the bt of the core so that it is easier to understand why
exactly it crashed?

Pranith

--- Additional comment from Mohammed Rafi KC on 2019-05-14 16:01:36 UTC ---

                Stack trace of thread 30877:
                #0  0x0000000000406a07 cleanup_and_exit (glusterfsd)
                #1  0x0000000000406b5d glusterfs_sigwaiter (glusterfsd)
                #2  0x00007f51000cd58e start_thread (libpthread.so.0)
                #3  0x00007f50ffd1d683 __clone (libc.so.6)

                Stack trace of thread 30879:
                #0  0x00007f51000d3a7a futex_abstimed_wait_cancelable (libpthread.so.0)
                #1  0x00007f51003b8616 syncenv_task (libglusterfs.so.0)
                #2  0x00007f51003b9240 syncenv_processor (libglusterfs.so.0)
                #3  0x00007f51000cd58e start_thread (libpthread.so.0)
                #4  0x00007f50ffd1d683 __clone (libc.so.6)

                Stack trace of thread 30881:
                #0  0x00007f50ffd14cdf __GI___select (libc.so.6)
                #1  0x00007f51003ef1cd runner (libglusterfs.so.0)
                #2  0x00007f51000cd58e start_thread (libpthread.so.0)
                #3  0x00007f50ffd1d683 __clone (libc.so.6)

                Stack trace of thread 30880:
                #0  0x00007f51000d3a7a futex_abstimed_wait_cancelable (libpthread.so.0)
                #1  0x00007f51003b8616 syncenv_task (libglusterfs.so.0)
                #2  0x00007f51003b9240 syncenv_processor (libglusterfs.so.0)
                #3  0x00007f51000cd58e start_thread (libpthread.so.0)
                #4  0x00007f50ffd1d683 __clone (libc.so.6)

                Stack trace of thread 30876:
                #0  0x00007f51000d7500 __GI___nanosleep (libpthread.so.0)
                #1  0x00007f510038a346 gf_timer_proc (libglusterfs.so.0)
                #2  0x00007f51000cd58e start_thread (libpthread.so.0)
                #3  0x00007f50ffd1d683 __clone (libc.so.6)

                Stack trace of thread 30882:
                #0  0x00007f50ffd1e06e epoll_ctl (libc.so.6)
                #1  0x00007f51003d931e event_handled_epoll (libglusterfs.so.0)
                #2  0x00007f50eed9a781 socket_event_poll_in (socket.so)
                #3  0x00007f51003d8c9b event_dispatch_epoll_handler (libglusterfs.so.0)
                #4  0x00007f51000cd58e start_thread (libpthread.so.0)
                #5  0x00007f50ffd1d683 __clone (libc.so.6)

                Stack trace of thread 30875:
                #0  0x00007f51000cea6d __GI___pthread_timedjoin_ex (libpthread.so.0)
                #1  0x00007f51003d8387 event_dispatch_epoll (libglusterfs.so.0)
                #2  0x0000000000406592 main (glusterfsd)
                #3  0x00007f50ffc44413 __libc_start_main (libc.so.6)
                #4  0x00000000004067de _start (glusterfsd)

                Stack trace of thread 30878:
                #0  0x00007f50ffce97f8 __GI___nanosleep (libc.so.6)
                #1  0x00007f50ffce96fe __sleep (libc.so.6)
                #2  0x00007f51003a4f5a pool_sweeper (libglusterfs.so.0)
                #3  0x00007f51000cd58e start_thread (libpthread.so.0)
                #4  0x00007f50ffd1d683 __clone (libc.so.6)

                Stack trace of thread 30883:
                #0  0x00007f51000d6b8d __lll_lock_wait (libpthread.so.0)
                #1  0x00007f51000cfda9 __GI___pthread_mutex_lock (libpthread.so.0)
                #2  0x00007f510037cd1f _gf_msg_plain_internal (libglusterfs.so.0)
                #3  0x00007f510037ceb3 _gf_msg_plain (libglusterfs.so.0)
                #4  0x00007f5100382d43 gf_log_dump_graph (libglusterfs.so.0)
                #5  0x00007f51003b514f glusterfs_process_svc_attach_volfp (libglusterfs.so.0)
                #6  0x000000000040b16d mgmt_process_volfile (glusterfsd)
                #7  0x0000000000410792 mgmt_getspec_cbk (glusterfsd)
                #8  0x00007f51003256b1 rpc_clnt_handle_reply (libgfrpc.so.0)
                #9  0x00007f5100325a53 rpc_clnt_notify (libgfrpc.so.0)
                #10 0x00007f5100322973 rpc_transport_notify (libgfrpc.so.0)
                #11 0x00007f50eed9a45c socket_event_poll_in (socket.so)
                #12 0x00007f51003d8c9b event_dispatch_epoll_handler (libglusterfs.so.0)
                #13 0x00007f51000cd58e start_thread (libpthread.so.0)
                #14 0x00007f50ffd1d683 __clone (libc.so.6)

--- Additional comment from Pranith Kumar K on 2019-05-15 05:34:33 UTC ---

(In reply to Mohammed Rafi KC from comment #3)
> [stack trace from comment #3 snipped; quoted in full above]

Was graph->active NULL? What led to the crash?

--- Additional comment from Worker Ant on 2019-05-17 18:08:44 UTC ---

REVIEW: https://review.gluster.org/22743 (afr/frame: Destroy frame after
afr_selfheal_entry_granular) posted (#1) for review on master by mohammed rafi kc

--- Additional comment from Worker Ant on 2019-05-21 11:37:12 UTC ---

REVIEW: https://review.gluster.org/22743 (afr/frame: Destroy frame after
afr_selfheal_entry_granular) merged (#3) on master by Pranith Kumar Karampuri
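
The patch title above points at a frame-lifecycle leak: a call frame used by
the granular entry self-heal was not destroyed once the heal completed. As a
generic illustration only (plain C with hypothetical names, not the actual
AFR code), the pattern is to tear down the per-operation frame exactly once,
after the operation finishes:

    #include <stdlib.h>

    /* Hypothetical per-operation frame; not glusterfs' call_frame_t. */
    struct frame {
        void *local;  /* per-heal state */
    };

    static struct frame *frame_create(void)
    {
        return calloc(1, sizeof(struct frame));
    }

    static void frame_destroy(struct frame *f)
    {
        free(f->local);
        free(f);
    }

    static int selfheal_entry_granular(struct frame *f)
    {
        (void)f;  /* ... perform the granular entry heal using f ... */
        return 0;
    }

    int do_entry_heal(void)
    {
        struct frame *f = frame_create();
        if (f == NULL)
            return -1;

        int ret = selfheal_entry_granular(f);

        /* The fix pattern: destroy the frame once the heal is done,
           on every return path, instead of leaking it. */
        frame_destroy(f);
        return ret;
    }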

--- Additional comment from Worker Ant on 2019-05-31 11:28:15 UTC ---

REVIEW: https://review.gluster.org/22709 (glusterfsd/cleanup: Protect graph
object under a lock) merged (#10) on master by Amar Tumballi
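
Since the description mentions two cleanups racing, here is one more hedged
sketch (hypothetical names again, not the merged change itself) of the
free-once discipline the patch title suggests: every cleanup path takes the
same lock, and only the first caller actually frees the graph:

    #include <pthread.h>
    #include <stdlib.h>

    /* Hypothetical graph object; not glusterfs_graph_t. */
    struct graph {
        int cleanup_started;
    };

    static struct graph *shared_graph;
    static pthread_mutex_t graph_lock = PTHREAD_MUTEX_INITIALIZER;

    /* Both cleanup_and_exit and a parallel graph-free thread may land
       here; the lock plus the flag ensure exactly one of them frees. */
    static void *graph_cleanup_once(void *arg)
    {
        struct graph *to_free = NULL;

        (void)arg;
        pthread_mutex_lock(&graph_lock);
        if (shared_graph != NULL && !shared_graph->cleanup_started) {
            shared_graph->cleanup_started = 1;
            to_free = shared_graph;
            shared_graph = NULL;
        }
        pthread_mutex_unlock(&graph_lock);

        free(to_free);  /* non-NULL in at most one thread */
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;

        shared_graph = calloc(1, sizeof(*shared_graph));
        pthread_create(&t1, NULL, graph_cleanup_once, NULL);
        pthread_create(&t2, NULL, graph_cleanup_once, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        return 0;
    }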


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1708926
[Bug 1708926] Invalid memory access while executing cleanup_and_exit