[Gluster-devel] Reg. multi thread epoll NetBSD failures

Ben England bengland at redhat.com
Fri Jan 23 20:47:59 UTC 2015


Gluster-ians,

Would it be OK to temporarily disable multi-thread-epoll on NetBSD, unless there is some huge demand for it there? NetBSD may be useful for exposing race conditions, but it's not clear to me that all of these race conditions would occur in a non-NetBSD environment, so are we chasing problems that non-NetBSD users can never see? What do people think? If so, why bust our heads figuring them out on NetBSD right now?

Attached is a tiny, crude, and possibly out-of-date patch for making multi-thread-epoll tunable. If we make the number of epoll threads settable, we could add conditional compilation to set GLUSTERFS_EPOLL_MAXTHREADS to 1 for NetBSD without much trouble, while still allowing people to experiment with it there.
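
To make that concrete, here is a rough sketch of the kind of conditional I have in mind (illustrative only, not the attached patch; the non-NetBSD default of 4 is just a placeholder):

/* Keep the epoll/poll thread count settable, but default NetBSD builds
 * to a single thread so the multi-thread races stay out of the NetBSD
 * regression runs. */
#if defined(__NetBSD__)
#define GLUSTERFS_EPOLL_MAXTHREADS 1
#else
#define GLUSTERFS_EPOLL_MAXTHREADS 4   /* placeholder default, still tunable at runtime */
#endif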

From a performance perspective, let's review why the multi-thread-epoll patch is worth the trouble. The original goal was to allow far greater CPU utilization by Gluster than we typically see. To do this, we want multiple Gluster RPC sockets to be read and processed in parallel within a single process. This matters to clients (glusterfs, libgfapi) that have to talk to many bricks (e.g. JBOD, erasure coding), and to brick processes (glusterfsd) that have to talk to many clients. It also matters for SSD support (cache tiering), because the glusterfsd process has to keep up with SSD hardware and caches that can offer orders of magnitude more IOPS than a single disk drive or even a RAID LUN, and the single glusterfsd epoll thread is currently the bottleneck in such configurations. The multi-thread-epoll enhancement is similar in spirit to multi-queue Ethernet drivers that spread load across CPU cores; RDMA and 40-Gbps networking may also hit this bottleneck. We don't want a small fraction of CPU cores (often just one) to be the bottleneck; we want the network or storage hardware to be the bottleneck instead.
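
For anyone who hasn't looked at the patch itself, the core idea is roughly the following standalone toy (Linux-only, since epoll is Linux-specific; the real event-epoll code of course differs in how threads and socket ownership are managed):

/* Toy illustration of multi-thread-epoll: several threads share one
 * epoll instance, so ready sockets are dispatched across CPU cores
 * instead of being funneled through a single epoll thread.
 * build: gcc -pthread epoll-sketch.c */
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>

#define NUM_EPOLL_THREADS 4             /* the count the patch would make tunable */

static void *epoll_worker(void *arg)
{
    int epfd = *(int *)arg;
    struct epoll_event ev;
    uint64_t val;

    for (;;) {
        /* Each worker blocks in epoll_wait(); the kernel hands ready
         * events to whichever threads are waiting. */
        if (epoll_wait(epfd, &ev, 1, -1) <= 0)
            continue;
        /* Non-blocking read: only the thread that wins the race
         * services the event, the others go back to epoll_wait(). */
        if (read(ev.data.fd, &val, sizeof(val)) > 0)
            printf("thread %lu handled event on fd %d\n",
                   (unsigned long)pthread_self(), ev.data.fd);
    }
    return NULL;
}

int main(void)
{
    int epfd = epoll_create1(0);
    int efd = eventfd(0, EFD_NONBLOCK); /* stands in for one RPC socket */
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = efd };
    uint64_t one = 1;
    pthread_t tids[NUM_EPOLL_THREADS];
    int i;

    epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &ev);
    for (i = 0; i < NUM_EPOLL_THREADS; i++)
        pthread_create(&tids[i], NULL, epoll_worker, &epfd);

    for (i = 0; i < 5; i++) {           /* generate a few events, then exit */
        if (write(efd, &one, sizeof(one)) < 0)
            break;
        usleep(100000);
    }
    return 0;
}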

Finally, is it possible with multi-thread-epoll that we no longer need the io-threads translator (Anand Avati's suggestion), which offloads incoming requests to worker threads? In that case the epoll threads ARE the server-side thread pool, which could reduce context switching and latency further. I for one look forward to finding out, but I don't want to invest in more performance testing than we have already done unless the patch is going to be accepted upstream.
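
To illustrate what I mean by that (toy code again, not io-threads or the epoll code): model A hands each request from the event thread to a worker over a queue, model B services it on the thread that received it, which is what multi-thread-epoll would allow:

/* build: gcc -pthread dispatch-sketch.c */
#include <stdio.h>
#include <pthread.h>

#define NREQ 8                          /* NREQ itself doubles as a stop sentinel */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int pending = -1;                /* -1 means the one-slot queue is empty */

static void service_request(int id)
{
    printf("request %d serviced on thread %lu\n",
           id, (unsigned long)pthread_self());
}

/* io-threads style worker: services requests handed off by the event thread */
static void *worker(void *arg)
{
    int id;

    (void)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        while (pending < 0)
            pthread_cond_wait(&cond, &lock);
        id = pending;
        pending = -1;
        pthread_cond_broadcast(&cond);
        pthread_mutex_unlock(&lock);
        if (id == NREQ)                 /* sentinel: stop */
            break;
        service_request(id);
    }
    return NULL;
}

int main(void)
{
    pthread_t w;
    int i;

    /* Model A: offload -- every request crosses a queue and wakes
     * another thread before it is serviced. */
    pthread_create(&w, NULL, worker, NULL);
    for (i = 0; i <= NREQ; i++) {
        pthread_mutex_lock(&lock);
        while (pending >= 0)            /* wait for the slot to drain */
            pthread_cond_wait(&cond, &lock);
        pending = i;
        pthread_cond_broadcast(&cond);
        pthread_mutex_unlock(&lock);
    }
    pthread_join(w, NULL);

    /* Model B: inline -- the (epoll) thread that read the request runs
     * it directly; no queue, no extra wakeup or context switch. */
    for (i = 0; i < NREQ; i++)
        service_request(i);

    return 0;
}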

thanks for your help,

-Ben England, Red Hat Perf. Engr.


----- Original Message -----
> From: "Shyam" <srangana at redhat.com>
> To: "Emmanuel Dreyfus" <manu at netbsd.org>
> Cc: "Gluster Devel" <gluster-devel at gluster.org>
> Sent: Friday, January 23, 2015 2:48:14 PM
> Subject: [Gluster-devel] Reg. multi thread epoll NetBSD failures
> 
> Patch: http://review.gluster.org/#/c/3842/
> 
> Manu,
> 
> I was not able to find the NetBSD job mentioned in your last review
> comment; pointers to that would help.
> 
> Additionally,
> 
> What is the support status of epoll on NetBSD? I thought NetBSD favored
> the kqueue means of event processing over epoll, and that epoll was not
> supported on NetBSD (or *BSD).
> 
> I ask because this patch specifically changes the number of epoll
> threads; as a result, it may have a different effect on NetBSD, which
> (to my understanding) should be using either poll or kqueue.
> 
> Could you shed some light on this and on the current status of epoll on
> NetBSD?
> 
> Thanks,
> Shyam
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
> 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: event-epoll-tunable.patch
Type: text/x-patch
Size: 1387 bytes
Desc: not available
URL: <http://www.gluster.org/pipermail/gluster-devel/attachments/20150123/29a8c89b/attachment.bin>

