[Bugs] [Bug 1568373] New: timer: Possible race condition between gf_timer_* routines

bugzilla at redhat.com bugzilla at redhat.com
Tue Apr 17 11:25:55 UTC 2018


https://bugzilla.redhat.com/show_bug.cgi?id=1568373

            Bug ID: 1568373
           Summary: timer: Possible race condition between gf_timer_*
                    routines
           Product: Red Hat Gluster Storage
           Version: 3.4
         Component: core
          Keywords: Triaged
          Severity: medium
          Assignee: vbellur at redhat.com
          Reporter: sheggodu at redhat.com
        QA Contact: rhinduja at redhat.com
                CC: bugs at gluster.org, ndevos at redhat.com,
                    rhs-bugs at redhat.com, sankarshan at redhat.com,
                    skoduri at redhat.com, storage-qa-internal at redhat.com
        Depends On: 1568374, 1509189
            Blocks: 1508817, 1564465 (glusterfs-3.12.8), 1565590



+++ This bug was initially created as a clone of Bug #1509189 +++

Description of problem:

As mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1508817#c4, there
is a chance of hitting a race between gf_timer_registry_destroy(),
gf_timer_call_cancel() and gf_timer_proc(), leading to a use-after-free.

As explained by Dan, the flow is as below -
gf_timer_proc() is called, locks reg, and gets an event.  It unlocks reg, and
calls the callback.

Now, gf_timer_registry_destroy() is called, and removes reg from ctx, and joins
on gf_timer_proc().

Now, gf_timer_call_cancel() is called on the event being processed.  It cannot
find reg (since it's been removed from ctx), so it frees the event.

Now the callback returns into gf_timer_proc(), which tries to free the event,
but it has already been freed, resulting in a double free.
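
The interleaving above can be replayed deterministically in a small
single-threaded model. This is only a sketch under stated assumptions: the
model_* names and struct layouts are hypothetical and are not the GlusterFS
definitions, and a release is counted instead of calling free() so the replay
stays well defined.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real GlusterFS structures. */
struct model_event {
    int release_count;   /* how many times the event was "freed" */
};

struct model_ctx {
    void *timer;         /* registry pointer; NULL once destroy detached it */
};

/* Count a release instead of calling free(), so a second one is observable. */
static void model_release(struct model_event *ev)
{
    assert(ev != NULL);
    ev->release_count++;
}

/* Destroy path: detach the registry from the context. */
static void model_registry_detach(struct model_ctx *ctx)
{
    ctx->timer = NULL;
}

/* Cancel path as described in the report: no registry, so free the event. */
static void model_cancel_before_fix(struct model_ctx *ctx,
                                    struct model_event *ev)
{
    if (ctx->timer == NULL) {
        model_release(ev);
        return;
    }
    /* The normal path (unlink under the registry lock) is omitted here. */
}

/* Replay the reported order of events and return the number of releases. */
static int replay_race(void)
{
    int registry_placeholder = 0;
    struct model_ctx ctx = { &registry_placeholder };
    struct model_event ev = { 0 };

    /* The timer thread has already picked ev and is inside its callback. */
    model_registry_detach(&ctx);          /* destroy removes reg from ctx */
    model_cancel_before_fix(&ctx, &ev);   /* cancel: first release */
    model_release(&ev);                   /* timer thread: second release */
    return ev.release_count;
}
```

In this model replay_race() reports two releases of the same event, which
corresponds to the double free in the real code.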




--- Additional comment from Worker Ant on 2017-11-03 06:44:11 EDT ---

REVIEW: https://review.gluster.org/18652 (timer: Fix possible race during
cleanup) posted (#1) for review on master by soumya k

--- Additional comment from Worker Ant on 2017-11-21 03:56:56 EST ---

COMMIT: https://review.gluster.org/18652 committed in master by "soumya k"
<skoduri at redhat.com> with the commit message: timer: Fix possible race during
cleanup

As mentioned in bug 1509189, there is a possible race
between gf_timer_call_cancel(), gf_timer_proc() and
gf_timer_registry_destroy(), leading to a use-after-free.

Problem:

1) gf_timer_proc() is called, locks reg, and gets an event.
It unlocks reg, and calls the callback.

2) Meanwhile gf_timer_registry_destroy() is called, and removes
reg from ctx, and joins on gf_timer_proc().

3) gf_timer_call_cancel() is called on the event being
processed.  It cannot find reg (since it's been removed from ctx),
so it frees the event.

4) The callback returns into gf_timer_proc(), which tries to free
the event, but it has already been freed, resulting in a double free.

Solution:
The fix is to bail out in gf_timer_call_cancel() when the registry
is not found. The reasoning is that gf_timer_call_cancel() is only
ever called on an existing event, so a valid registry must have
existed when that event was created. The only reason the registry
cannot be found now is that it was set to NULL when context
cleanup started.
Since gf_timer_proc() takes care of releasing all the remaining
events active on that registry, it seems safe to bail out
in gf_timer_call_cancel().
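
The bail-out can be illustrated with the same kind of single-threaded model.
This is a sketch, not the committed patch: the model_* names, the struct
layouts and the return values are hypothetical, and releases are counted
rather than passed to free().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real GlusterFS structures. */
struct model_event {
    int release_count;   /* how many times the event was "freed" */
};

struct model_ctx {
    void *timer;         /* registry pointer; NULL once cleanup has started */
};

static void model_release(struct model_event *ev)
{
    assert(ev != NULL);
    ev->release_count++;
}

/*
 * Cancel with the bail-out: when the registry is gone, leave the event
 * alone and let the timer thread release it.
 * Returns 0 on bail-out and 1 when the normal path ran (model-only values).
 */
static int model_cancel_fixed(struct model_ctx *ctx, struct model_event *ev)
{
    if (ctx == NULL || ctx->timer == NULL)
        return 0;        /* registry not found: do not touch the event */

    /* Normal path in this model: the event is still queued, release it. */
    model_release(ev);
    return 1;
}

/* Replay the same order of events as in the problem description. */
static int replay_with_fix(void)
{
    struct model_ctx ctx = { NULL };   /* destroy already detached reg */
    struct model_event ev = { 0 };

    (void)model_cancel_fixed(&ctx, &ev);   /* bails out, no release */
    model_release(&ev);                    /* timer thread: only release */
    return ev.release_count;
}
```

With the bail-out in place the model releases the event exactly once, from
the timer thread side.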

Change-Id: Ia9b088533141c3bb335eff2fe06b52d1575bb34f
BUG: 1509189
Reported-by: Daniel Gryniewicz <dang at redhat.com>
Signed-off-by: Soumya Koduri <skoduri at redhat.com>

--- Additional comment from Shyamsundar on 2018-03-15 07:19:42 EDT ---

This bug is getting closed because a release has been made available that
should address the reported issue. In case the problem is still not fixed with
glusterfs-4.0.0, please open a new bug report.

glusterfs-4.0.0 has been announced on the Gluster mailing lists [1]; packages
for several distributions should become available in the near future. Keep an
eye on the Gluster Users mailing list [2] and the update infrastructure for
your distribution.

[1] http://lists.gluster.org/pipermail/announce/2018-March/000092.html
[2] https://www.gluster.org/pipermail/gluster-users/


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1508817
[Bug 1508817] [Ganesha] : Ganesha crashed while restarting Ganesha post vol
stop/deletes in loop.
https://bugzilla.redhat.com/show_bug.cgi?id=1509189
[Bug 1509189] timer: Possible race condition between gf_timer_* routines
https://bugzilla.redhat.com/show_bug.cgi?id=1564465
[Bug 1564465] GlusterFS 3.12.8 tracker
https://bugzilla.redhat.com/show_bug.cgi?id=1565590
[Bug 1565590] timer: Possible race condition between gf_timer_* routines
https://bugzilla.redhat.com/show_bug.cgi?id=1568374
[Bug 1568374] timer: Possible race condition between gf_timer_* routines