[Gluster-devel] Re: Bonnie++ bailing out

Krishna Srinivas krishna at zresearch.com
Sat Sep 20 19:19:43 UTC 2008


KwangErn,
Did increasing transport-timeout help?
Regards
Krishna

On Wed, Sep 17, 2008 at 11:13 AM, Raghavendra G
<raghavendra.hg at gmail.com> wrote:
> Hi,
>
> Can you try a higher transport-timeout? Currently you have the
> transport-timeout configured to 30s. Can you progressively increase the
> transport timeout and observe the behaviour? I think 180s can be kept as the
> ceiling.
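[For reference, the timeout under discussion is set per protocol/client
volume in the client spec file. A minimal sketch of what that stanza could
look like; the volume name, hostname, and subvolume here are placeholders,
not taken from the poster's actual config:]

```
volume gfs-client2
  type protocol/client
  option transport-type tcp/client
  option remote-host server.example.com     # hypothetical server hostname
  option remote-subvolume gfs-storage2      # hypothetical remote subvolume
  option transport-timeout 180              # seconds; raise progressively from 30
end-volume
```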
>
> regards,
>
> On Tue, Sep 16, 2008 at 1:01 PM, KwangErn Liew <ke.liew at gmail.com> wrote:
>>
>> Hm, a new error occurred when running Bonnie++ with one pass instead of 10.
>>
>> ====================
>>  $ bonnie++ -u bonnie -d /home/storage -s 7904m:256k -n 1024:-2:5m:30
>> Using uid:100, gid:65534.
>> Writing with putc()...done
>> Writing intelligently...done
>> Rewriting...done
>> Reading with getc()...Can't getc(3)
>> ====================
>>
>> The error logs are attached!
>>
>> Strangely, the server-side logs don't quite correspond with the
>> client's, in terms of timing and execution.
>>
>>
>> KwangErn
>>
>> On Sat, Sep 13, 2008 at 1:18 PM, Krishna Srinivas <krishna at zresearch.com>
>> wrote:
>> > KwangErn,
>> > Can you give us the setup details?
>> > Thanks
>> > Krishna
>> >
>> > On Sat, Sep 13, 2008 at 2:12 PM, KwangErn Liew <ke.liew at gmail.com>
>> > wrote:
>> >> Hm, it seems to be 'normal'. I have just run dd across the network and
>> >> this is what I have...
>> >>
>> >> $ dd if=/dev/zero of=/home/storage/testfile bs=16384k count=100
>> >> 100+0 records in
>> >> 100+0 records out
>> >> 1677721600 bytes (1.7 GB) copied, 82.5314 seconds, 20.3 MB/s
>> >>
>> >>  $ ls storage -l
>> >> total 1638408
>> >> drwxr-xr-x 2 bonnie nogroup         21 2008-09-13 08:34 .
>> >> drwxr-xr-x 7 root   root          4096 2008-09-11 21:28 ..
>> >> -rw-r--r-- 1 root   root    1677721600 2008-09-13 08:35 testfile
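[The dd numbers above are internally consistent; a quick sanity check,
noting that dd reports decimal units, so "MB/s" means 10^6 bytes per
second:]

```python
# Verify dd's reported throughput: bs=16384k, count=100 in 82.5314 s.
nbytes = 100 * 16384 * 1024        # 100 blocks of 16384 KiB
assert nbytes == 1677721600        # matches "1677721600 bytes" in dd's output

mb_per_s = nbytes / 82.5314 / 1e6  # dd uses decimal megabytes
print(round(mb_per_s, 1))          # 20.3, matching dd's "20.3 MB/s"
```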
>> >>
>> >> ==== Client ====
>> >> 2008-09-13 08:35:29 D [fuse-bridge.c:1510:fuse_writev_cbk] glusterfs-fuse:
>> >> 458332: WRITE => 131072/131072,1677459456/1677590528
>> >> 2008-09-13 08:35:29 D [fuse-bridge.c:1548:fuse_write] glusterfs-fuse:
>> >> 458333: WRITE (0x516760,
>> >>  size=131072, offset=1677590528)
>> >> 2008-09-13 08:35:29 D [fuse-bridge.c:1510:fuse_writev_cbk]
>> >> glusterfs-fuse:
>> >> 458333: WRITE => 131072/131072,1677590528/1677721600
>> >> 2008-09-13 08:35:29 D [fuse-bridge.c:1572:fuse_flush] glusterfs-fuse:
>> >> 458334: FLUSH 0x516760
>> >> 2008-09-13 08:35:29 D [fuse-bridge.c:857:fuse_err_cbk] glusterfs-fuse:
>> >> 458334: FLUSH() ERR => 0
>> >> 2008-09-13 08:35:29 D [fuse-bridge.c:1599:fuse_release] glusterfs-fuse:
>> >> 458335: CLOSE 0x516760
>> >> 2008-09-13 08:35:29 E [afr.c:3012:afr_close_cbk] gfs-afr:
>> >> (path=/testfile
>> >> child=gfs-client2) op_ret=-1 op_errno=77(File descriptor in bad state)
>> >> 2008-09-13 08:35:29 D [fuse-bridge.c:857:fuse_err_cbk] glusterfs-fuse:
>> >> 458335: CLOSE() ERR => 0
>> >> 2008-09-13 08:35:29 D [inode.c:287:__inode_passivate] fuse/inode:
>> >> passivating inode(2013354535) lru=1/0 active=1 purge=30
>> >> 2008-09-13 08:35:31 D
>> >> [client-protocol.c:4126:client_protocol_reconnect]
>> >> gfs-client2: attempting reconnect
>> >> 2008-09-13 08:35:31 D [socket.c:924:socket_connect] gfs-client2:
>> >> connect ()
>> >> called on transport already connected
>> >> 2008-09-13 08:35:39 D [client-protocol.c:4065:client_setvolume_cbk]
>> >> gfs-client2: SETVOLUME on remote-host succeeded
>> >> 2008-09-13 08:35:39 D [afr.c:6309:notify] gfs-afr: GF_EVENT_CHILD_UP
>> >> from
>> >> gfs-client2
>> >>
>> >>
>> >> ==== Server ====
>> >>
>> >> 2008-09-13 08:34:38 D [socket.c:636:socket_proto_state_machine] gfs-server2:
>> >> partial header read on NB socket. continue later
>> >> 2008-09-13 08:34:38 D [socket.c:636:socket_proto_state_machine] gfs-server2:
>> >> partial header read on NB socket. continue later
>> >> 2008-09-13 08:34:38 D [socket.c:636:socket_proto_state_machine] gfs-server2:
>> >> partial header read on NB socket. continue later
>> >> 2008-09-13 08:35:39 W [socket.c:93:__socket_rwv] gfs-server2: EOF from
>> >> peer
>> >> 62.212.91.118:1022
>> >> 2008-09-13 08:35:39 D [socket.c:568:socket_proto_state_machine] gfs-server2:
>> >> socket read failed (Transport endpoint is not connected) in state 1 (62.212.91.118:1022)
>> >> 2008-09-13 08:35:39 D [inode.c:287:__inode_passivate]
>> >> gfs-storage2/inode:
>> >> passivating inode(1879077150) lru=3/1024 active=1 purge=32
>> >> 2008-09-13 08:35:39 D [server-protocol.c:6182:server_protocol_cleanup]
>> >> gfs-server2: cleaned up transport state for client 62.212.91.118:1022
>> >> 2008-09-13 08:35:39 D [socket.c:1287:fini] gfs-server2: transport
>> >> 0x50d980
>> >> destroyed
>> >> 2008-09-13 08:35:39 D [addr.c:166:gf_auth] gfs-storage2: allowed = "*",
>> >> received addr = "62.212.91.118"
>> >> 2008-09-13 08:35:39 D [server-protocol.c:5544:mop_setvolume]
>> >> gfs-server2:
>> >> accepted client from 62.212.91.118:1021
>> >> 2008-09-13 08:38:58 D [inode.c:268:__inode_activate]
>> >> gfs-storage2/inode:
>> >> activating inode(1879076204), lru=2/1024 active=2 purge=32
>> >> 2008-09-13 08:38:58 D [inode.c:287:__inode_passivate]
>> >> gfs-storage2/inode:
>> >> passivating inode(1879076204) lru=3/1024 active=1 purge=32
>> >> 2008-09-13 08:38:58 D [inode.c:268:__inode_activate]
>> >> gfs-storage2/inode:
>> >> activating inode(1879076204), lru=2/1024 active=2 purge=32
>> >> 2008-09-13 08:38:58 D [inode.c:287:__inode_passivate]
>> >> gfs-storage2/inode:
>> >> passivating inode(1879076204) lru=3/1024 active=1 purge=32
>> >>
>> >>
>> >> Does that mean GlusterFS will always destroy the connection once the
>> >> process
>> >> is complete?
>> >>
>> >>
>> >> KwangErn
>> >> _______________________________________________
>> >> Gluster-devel mailing list
>> >> Gluster-devel at nongnu.org
>> >> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>> >>
>> >
>>
>>
>
>
>
> --
> Raghavendra G
>
> A centipede was happy quite, until a toad in fun,
> Said, "Pray, which leg comes after which?",
> This raised his doubts to such a pitch,
> He fell flat into the ditch,
> Not knowing how to run.
> -Anonymous
>




