[Gluster-devel] NetBSD swapcontext() portability fix

Emmanuel Dreyfus manu at netbsd.org
Thu Aug 9 16:32:20 UTC 2012


I am moving this discussion from http://review.gluster.com/#change,3794 to
gluster-devel@, as it is more convenient to discuss it here.  Jeff
Darcy's position on the problem is below.

The patch sets SYNCENV_PROC_MAX to 1 for NetBSD. If I understand correctly,
that addresses the problem raised here: there is only one thread in a
syncenv. Did I get it wrong?
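
To make the idea concrete, here is a minimal sketch of a per-platform
thread cap. It is not the patch itself: the non-NetBSD value of 16 is a
placeholder, the use of the compiler's __NetBSD__ macro is an assumption
about how the platform is selected, and syncenv_proc_max() is a
hypothetical helper added only so the value can be inspected.

```c
/* Sketch only: cap the number of worker threads per syncenv.
 * __NetBSD__ is predefined by the compiler on NetBSD; the value 16
 * for other platforms is a placeholder, not taken from the patch. */
#if defined(__NetBSD__)
#define SYNCENV_PROC_MAX 1   /* one thread per syncenv, as in the patch */
#else
#define SYNCENV_PROC_MAX 16  /* placeholder default (assumption) */
#endif

/* Hypothetical helper: report the compile-time cap. */
int syncenv_proc_max(void)
{
    return SYNCENV_PROC_MAX;
}
```

With the cap at 1, every task in a syncenv is necessarily resumed by the
thread that suspended it, at the cost of running that syncenv's tasks
one at a time.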

I asked about the issue on tech-kern at netbsd.org. I got a first reply
suggesting that setjmp()/longjmp() would be better suited than
swapcontext() for that job. Any opinions?
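
For reference, here is a minimal single-threaded sketch of what
swapcontext() is used for: giving a task its own stack, letting it
yield, and resuming it later. All names (sched_ctx, task_ctx,
task_body, run_task_demo) are hypothetical and not from the GlusterFS
tree. Note that setjmp()/longjmp() alone cannot create the separate
stack set up here with makecontext(), so a replacement would need some
other way to provide one.

```c
#include <stdlib.h>
#include <ucontext.h>

#define TASK_STACK_SIZE (64 * 1024)

static ucontext_t sched_ctx;  /* context of the "scheduler" (hypothetical) */
static ucontext_t task_ctx;   /* context of the task (hypothetical) */
static int steps;             /* counts how far the task has progressed */

static void task_body(void)
{
    steps++;                             /* first half of the task */
    swapcontext(&task_ctx, &sched_ctx);  /* yield back to the scheduler */
    steps++;                             /* second half, after resumption */
    /* returning here continues at uc_link, i.e. sched_ctx */
}

/* Runs the task in two slices; returns the step count, or -1 on error. */
int run_task_demo(void)
{
    char *stack = malloc(TASK_STACK_SIZE);
    if (stack == NULL)
        return -1;

    steps = 0;
    if (getcontext(&task_ctx) == -1) {
        free(stack);
        return -1;
    }
    task_ctx.uc_stack.ss_sp = stack;
    task_ctx.uc_stack.ss_size = TASK_STACK_SIZE;
    task_ctx.uc_link = &sched_ctx;       /* where to go when task_body returns */
    makecontext(&task_ctx, task_body, 0);

    swapcontext(&sched_ctx, &task_ctx);  /* run the task until it yields */
    swapcontext(&sched_ctx, &task_ctx);  /* resume it until it returns */

    free(stack);
    return steps;
}
```

Both swapcontext() calls in run_task_demo() happen on the same thread.
The problem under discussion only appears when the second one is issued
from a different thread than the first.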

----- Forwarded message from "Jeff Darcy (Code Review)" <root at dev.gluster.com> -----
Change subject: NetBSD swapcontext() portability fix
......................................................................


Patch Set 1:

I don't think tasks A and B have as strong an affinity to thread
(not task) X as you claim.  If they did, then work couldn't be
moved from a busy thread to an idle one and we might as well just
use straight pthreads throughout.  When A needs to suspend, e.g.
in SYNCOP after STACK_WIND_COOKIE, it gets put on a per-syncenv
wait queue.  When the callback occurs, e.g. from a transport polling
thread, A gets moved to a per-syncenv run queue, from whence it
might be picked up and resumed from any thread in that syncenv.
When resumption hits syncenv_switchto, if we're not in the same
thread as before, then (according to you) the swapcontext will
cause a preemption of the original thread that might be executing
B.

I don't dispute your findings.  Clearly the fix does avoid one bug,
which would cause predictable failures.  I'm worried about the less
predictable failures, such as deadlocks that might involve small
timing windows and be hard to hit even under heavy load.  Secondarily,
I'm worried about fixes that might work and even be completely
airtight at the expense of reducing the parallelism of requests
within the syncop subsystem.

Before we expend tremendous effort working through all the layers
of our own code and that of pthreads/ucontext, I'd really like to
know whether ucontext is worth it at all on NetBSD or similarly
behaving platforms.  If the performance difference is small, then
it might be better to have explicit enforcement of task/thread
affinity (and thread-scaling mechanisms more appropriate to that
environment).  Then we don't have to deal with all of the hidden
assumptions between synctask_xxx and __xxx and SYNCOP, synctask_wrap
and syncenv_processor, and so on ad infinitum.  That's how we got
into this mess, not how we get out.

----- End forwarded message -----
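
The flow Jeff describes (suspend onto a wait queue, move to a run queue
on callback, resume from any thread in the syncenv) can be sketched
roughly as follows. The types and functions are simplified, hypothetical
stand-ins rather than the real synctask/syncenv code, and locking is
left out entirely; a real implementation would need to protect both
queues with a mutex.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real task and env types. */
struct task {
    int          id;
    int          last_worker;  /* worker that most recently picked the task */
    struct task *next;
};

struct env {
    struct task *waitq;  /* suspended tasks waiting for a callback */
    struct task *runq;   /* tasks ready to be resumed */
};

static void push(struct task **q, struct task *t)
{
    t->next = *q;
    *q = t;
}

/* Remove t from queue q; returns 1 if it was found, 0 otherwise. */
static int unlink_task(struct task **q, struct task *t)
{
    for (; *q != NULL; q = &(*q)->next) {
        if (*q == t) {
            *q = t->next;
            t->next = NULL;
            return 1;
        }
    }
    return 0;
}

/* The task suspends: park it on the wait queue. */
void env_suspend(struct env *e, struct task *t)
{
    push(&e->waitq, t);
}

/* The callback fired: move the task from the wait queue to the run queue. */
void env_wake(struct env *e, struct task *t)
{
    if (unlink_task(&e->waitq, t))
        push(&e->runq, t);
}

/* Any worker may take the next runnable task, not only the one
 * that ran it before it suspended. */
struct task *env_pick(struct env *e, int worker)
{
    struct task *t = e->runq;
    if (t != NULL) {
        e->runq = t->next;
        t->next = NULL;
        t->last_worker = worker;
    }
    return t;
}
```

Nothing in env_pick() ties a task to the worker that ran it earlier,
which is the missing task/thread affinity the review comment points at.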

-- 
Emmanuel Dreyfus
manu at netbsd.org



