[Gluster-devel] Spurious regression failure analysis from runs over the wkend

Tue Feb 24 16:46:38 UTC 2015

On 24 Feb 2015, at 14:39, Shyam <srangana at redhat.com> wrote:
> On 02/23/2015 11:20 PM, Justin Clift wrote:
>> On 23 Feb 2015, at 19:32, Shyam <srangana at redhat.com> wrote:
>> <snip>
>>>> 4 of the regression runs also created coredumps.  Uploaded the
>>>> archived_builds and logs here:
>>>> 
>>>>  http://mirror.salasaga.org/gluster/
>>>> 
>>>> (are those useful?)
>>> 
>>> Yes, these are useful as they contain a very similar crash in each of the cores, so we could be looking at a single problem to fix here.
>> 
>> Do you have a few minutes to check if this coredump (release-3.6 branch) is
>> also the same problem?
>> 
>> http://mirror.salasaga.org/gluster/release-3.6/bulkregression17/
> 
> Yup, it is the same as the others. One of the threads is in cleanup_and_exit, and the other was processing a disconnect and crashed while cleaning up a list (the stacks are similar to the older one, and the same as the one in bug #1195415).

Thanks Shyam, that's awesome. Better to have one bug that needs fixing than two. :)

And now we know right away at least two of the branches it's in.

<snip>
>> Ran 20x regression runs on the release-3.6 branch head, and the above was
>> the only one to coredump.  Several spurious errors, but that's for a later
>> email (probably tomorrow). :)
> 
> This rules out MT epoll as the cause, though the crash is probably more frequent with MT epoll (based on the statistics in this mail).
> 
> It needs a fix regardless.

Yep.  Am I ok to remove that stuff from the mirror.salasaga.org website, or
should I leave it there for a while?

+ Justin

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift