[Gluster-devel] about afr

Tue Feb 3 14:50:16 UTC 2009

Nicolas,

When you restart the server, log entries indicating EBADFD are fine; AFR
will try the operation on the other server. When you hit the situation
where the glusterfs client hangs, can you attach gdb to the glusterfs
process and mail us the backtrace?

gdb -p <pid of glusterfs>
Then type "bt" at the gdb command prompt.
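
If attaching interactively is awkward, the same backtrace can be captured in
one shot. The following is only a minimal sketch, assuming gdb and Python 3
are installed on the client machine and that you have permission to attach to
the process. The helper names gdb_backtrace_command and capture_backtrace are
hypothetical, and "thread apply all bt" is used instead of plain "bt" so that
every thread is included.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that attaches to pid, dumps all threads, and exits."""
    # -p attaches to a running process, -batch exits after the commands have run,
    # -ex supplies the command to execute.
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def capture_backtrace(pid, timeout=60):
    """Run gdb against pid and return its combined stdout/stderr as text."""
    result = subprocess.run(
        gdb_backtrace_command(pid),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
        timeout=timeout,
    )
    return result.stdout
```

For example, print(capture_backtrace(pid)) with the pid of the hung glusterfs
client prints text that can be pasted straight into a reply.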

I just want to confirm that glusterfs is not blocked in a system call
(it should not be, as we have non-blocking I/O now).
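
As a quick first check before sending the backtrace, the kernel's view of the
process state can be read from /proc. This is a rough sketch, assuming a Linux
client and Python 3; the helper names parse_state and process_state are
hypothetical. A state of "D (disk sleep)" means uninterruptible sleep, which
usually points at a blocked system call, while "S (sleeping)" is normal for a
process waiting for events. It does not replace the gdb backtrace.

```python
def parse_state(status_text):
    """Return the value of the 'State:' line from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("State:"):
            # The line looks like "State:\tS (sleeping)".
            return line.split(":", 1)[1].strip()
    return None


def process_state(pid):
    """Read /proc/<pid>/status and return the state of the main thread."""
    with open("/proc/%d/status" % pid) as status_file:
        return parse_state(status_file.read())
```

Note that /proc/<pid>/status reports only the main thread; the other threads
can be checked the same way through /proc/<pid>/task/<tid>/status.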

Can you also check whether removing the performance translators helps? If
it does, we can narrow it down to the problem translator.

Krishna

On Tue, Feb 3, 2009 at 5:18 PM, nicolas prochazka
<prochazka.nicolas at gmail.com> wrote:
> ok,
> So now I know there are a few bugs:
>
> 1 - When I stop and restart a server, I get the EBADFD bug
> 2 - When I stop a server:
>       - with --disable-direct-io-mode: my big image file becomes corrupt
> (missing data ...)
>       - without --disable-direct-io-mode: my process hangs and the CPU load
> grows a lot (by process)
>
> Any ideas?
>
> Regards,
> Nicolas Prochazka
>
>  On Tue, Feb 3, 2009 at 5:42 AM, Raghavendra G <raghavendra at zresearch.com>
> wrote:
>>
>> Hi Nicolas,
>>
>> On Tue, Feb 3, 2009 at 12:01 AM, nicolas prochazka
>> <prochazka.nicolas at gmail.com> wrote:
>>>
>>> I inspected the log and found something interesting:
>>> All is OK,
>>> I stopped 10.98.98.2 and restarted it:
>>>
>>> 2009-02-02 15:00:32 D [client-protocol.c:6498:notify] brick_10.98.98.2:
>>> got GF_EVENT_CHILD_UP
>>> 2009-02-02 15:00:32 D [socket.c:924:socket_connect] brick_10.98.98.2:
>>> connect () called on transport already connected
>>> 2009-02-02 15:00:32 N [client-protocol.c:5786:client_setvolume_cbk]
>>> brick_10.98.98.2: connection and handshake succeeded
>>> 2009-02-02 15:00:40 D [fuse-bridge.c:1945:fuse_statfs] glusterfs-fuse:
>>> 17399: STATFS
>>> 2009-02-02 15:00:40 D [fuse-bridge.c:368:fuse_entry_cbk] glusterfs-fuse: