[Gluster-devel] brick stop responding (3.4.0qa8)

Anand Avati anand.avati at gmail.com
Sun Feb 3 22:13:40 UTC 2013


Yeah, a lot of threads are "missing"! Do the logs have anything unusual?

Avati

On Sun, Feb 3, 2013 at 7:00 AM, Emmanuel Dreyfus <manu at netbsd.org> wrote:

> 3.4.0qa8 works fine, but after a while, a NetBSD/amd64 brick stops
> responding (while NetBSD/i386 servers seem to work fine, so it looks
> 64-bit specific). ktrace shows it looping around poll calls. Here
> is what I can see if I stop it and inspect with gdb:
>
> #0  0x00007f7ff62759da in ___lwp_park50 () from /lib/libc.so.12
> (gdb) bt
> #0  0x00007f7ff62759da in ___lwp_park50 () from /lib/libc.so.12
> #1  0x00007f7ff6c086b9 in pthread_cond_timedwait ()
>    from /usr/lib/libpthread.so.1
> #2  0x00007f7ff200abd1 in iot_worker (data=0x7f7ff6fe3120) at io-threads.c:157
> #3  0x00007f7ff6c09d75 in ?? () from /usr/lib/libpthread.so.1
> #4  0x00007f7ff62759f0 in ___lwp_park50 () from /lib/libc.so.12
> #5  0x00007f7fee000000 in ?? ()
> #6  0x00007f7ff7fffcc0 in ?? ()
> #7  0x0000000111110001 in ?? ()
> #8  0x0000000033330003 in ?? ()
> #9  0x0000000000000000 in ?? ()
> (gdb) frame 2
> #2  0x00007f7ff200abd1 in iot_worker (data=0x7f7ff6fe3120) at io-threads.c:157
> 157                                     ret = pthread_cond_timedwait (&conf->cond,
> (gdb) list
> 152                                     pri = -1;
> 153                             }
> 154                             while (conf->queue_size == 0) {
> 155                                     conf->sleep_count++;
> 156
> 157                                     ret = pthread_cond_timedwait (&conf->cond,
> 158                                                                   &conf->mutex,
> 159                                                                   &sleep_till);
> 160                                     conf->sleep_count--;
> 161
> (gdb) print conf->cond
> $1 = {ptc_magic = 1431633925, ptc_lock = 0 '\000', ptc_waiters = {
>     ptqh_first = 0x7f7feec00000, ptqh_last = 0x7f7fee000230},
>   ptc_mutex = 0x7f7ff6fe3120, ptc_private = 0x0}
> (gdb) print conf->mutex
> $2 = {ptm_magic = 858980355, ptm_errorcheck = 0 '\000', ptm_pad1 = "\000\000",
>   ptm_interlock = 0 '\000', ptm_pad2 = "\000\000", ptm_owner = 0x0,
>   ptm_waiters = 0x0, ptm_recursed = 0, ptm_spare2 = 0x0}
>
> NB: ptc_magic and ptm_magic are correct.
>
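The magic check above can be reproduced by converting the decimal values gdb printed to hex. A minimal sketch follows; the two constants are my recollection of the values in NetBSD's libpthread headers, not something stated in this thread, so treat them as assumptions and verify them against your source tree. The helper names are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* ASSUMPTION: values recalled from NetBSD's libpthread headers.
 * Check them against the tree you are actually debugging. */
#define ASSUMED_PT_MUTEX_MAGIC 0x33330003u
#define ASSUMED_PT_COND_MAGIC  0x55550005u

/* Returns 1 when the value gdb printed equals the expected magic. */
int magic_matches(uint32_t printed, uint32_t expected)
{
    return printed == expected;
}

/* Prints the two decimal values from the gdb session in hex,
 * together with the result of the comparison. */
void report_magic(void)
{
    uint32_t ptc_magic = 1431633925u; /* from "print conf->cond"  */
    uint32_t ptm_magic = 858980355u;  /* from "print conf->mutex" */

    printf("ptc_magic = 0x%08x (%s)\n", (unsigned)ptc_magic,
           magic_matches(ptc_magic, ASSUMED_PT_COND_MAGIC) ? "ok" : "BAD");
    printf("ptm_magic = 0x%08x (%s)\n", (unsigned)ptm_magic,
           magic_matches(ptm_magic, ASSUMED_PT_MUTEX_MAGIC) ? "ok" : "BAD");
}
```

Calling report_magic() from a small main() prints 0x55550005 and 0x33330003. The second value also appears as frame #8 of the backtrace.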
>
> (gdb) print sleep_till
> $3 = {tv_sec = 1359902843, tv_nsec = 0}
>
> This is also fine.
>
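One way to confirm that sleep_till is sane is to decode tv_sec as an absolute UTC time, since pthread_cond_timedwait takes an absolute deadline. A small sketch using only standard C; the function name is hypothetical.

```c
#include <stddef.h>
#include <time.h>

/* Formats an absolute deadline (seconds since the Epoch) as UTC text.
 * Returns the number of characters written, or 0 on failure. */
size_t format_deadline_utc(time_t tv_sec, char *buf, size_t len)
{
    struct tm *tm_utc = gmtime(&tv_sec);

    if (tm_utc == NULL)
        return 0;
    return strftime(buf, len, "%Y-%m-%d %H:%M:%S UTC", tm_utc);
}
```

For tv_sec = 1359902843 this gives 2013-02-03 14:47:23 UTC, the same day as the report, so the deadline is a plausible wall-clock time and not garbage.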
>
> (gdb) info threads
>   Id   Target Id         Frame
> * 1    LWP 1             0x00007f7ff200abd1 in iot_worker (data=0x7f7ff6fe3120)
>     at io-threads.c:157
>
> The current thread <Thread ID 1> has terminated.  See `help thread'.
> (gdb)
>
> Isn't it supposed to have multiple threads?
>
> --
> Emmanuel Dreyfus
> http://hcpnet.free.fr/pubz
> manu at netbsd.org