[Gluster-users] returning EBADFD / no proper reply from server, returning ENOTCONN
Andrew McGill
list2008 at lunch.za.net
Tue Nov 11 08:59:52 UTC 2008
Greetings glusterfs users,
I have the errors below in /var/log/glusterfs.log. My guess is that these are
simply network errors which the software handled adequately, but that is not
obvious from the log itself.
* Were these network errors handled by AFR?
* Without AFR, would the application have seen a filesystem error
(e.g. "Transport endpoint not connected")? (How about if the network error
was on the namespace brick?)
* Is there a recommended action for errors in the error log, or some other
way of ensuring the integrity of the filesystem (like glusterfsck ...)?
The volume is defined as ...
volume u100-node6
  type protocol/client
  option transport-type tcp/client
  option transport-timeout 10sec
  option remote-host node6
  option remote-subvolume u100-node6
  option username dkpaa
  option password XXXXXXXXASDBH
end-volume

volume afr4
  type cluster/afr
  subvolumes u100-node7 u100-node6
end-volume

volume unify0
  type cluster/unify
  subvolumes afr0 afr1 afr2 afr3 afr4
  option namespace u25-node4
  option rr.limits.min-free-disk 5%
  option scheduler rr
end-volume
Log file says:
2008-11-11 01:57:01 C [client-protocol.c:212:call_bail] u100-node6: bailing
transport
2008-11-11 01:57:01 E [client-protocol.c:4834:client_protocol_cleanup]
u100-node6: forced unwinding frame type(1) op(14) reply=@0x860d208
2008-11-11 01:57:01 E [client-protocol.c:3254:client_write_cbk] u100-node6: no
proper reply from server, returning ENOTCONN
2008-11-11 01:57:01 E [afr.c:2393:afr_writev_cbk] afr4:
(path=/backup5/intelligence.local/rdiff-backup-data/increments/home/pcformat/tmp/analog/cache.2008-11-10T01:30:12+02:00.diff.gz
child=u100-node6) op_ret=-1 op_errno=107
2008-11-11 02:36:14 C [client-protocol.c:212:call_bail] u100-node6: bailing
transport
2008-11-11 02:36:14 E [client-protocol.c:4834:client_protocol_cleanup]
u100-node6: forced unwinding frame type(1) op(14) reply=@0x89308a8
2008-11-11 02:36:14 E [client-protocol.c:3254:client_write_cbk] u100-node6: no
proper reply from server, returning ENOTCONN
2008-11-11 02:36:14 E [afr.c:2393:afr_writev_cbk] afr4:
(path=/backup5/intelligence.local/rdiff-backup-data/increments/home/pcformat/tmp/analog/cache.out.2008-11-10T01:30:12+02:00.diff.gz
child=u100-node6) op_ret=-1 op_errno=107
2008-11-11 03:06:34 E [client-protocol.c:1238:client_flush] u100-node6: :
returning EBADFD
2008-11-11 03:06:34 E [afr.c:2649:afr_flush_cbk] afr4:
(path=/backup5/intelligence.local/rdiff-backup-data/mirror_metadata.2008-11-10T01:30:12+02:00.snapshot.gz
child=u100-node6) op_ret=-1 op_errno=77
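For reference, the op_errno values in the AFR lines above are plain Linux errno numbers: 107 is ENOTCONN and 77 is EBADFD, which matches the messages logged by client-protocol.c just before each AFR line. The following is only a rough sketch of how one might decode them in bulk. The function name summarize_afr_failures and the regular expression are my own illustration and not part of glusterfs, and the errno table is hardcoded with the Linux values, since other platforms number them differently.

```python
import re

# Linux errno numbers (asm-generic/errno.h). Hardcoded on purpose:
# other platforms assign different numbers to these names.
LINUX_ERRNO = {
    77: ("EBADFD", "File descriptor in bad state"),
    107: ("ENOTCONN", "Transport endpoint is not connected"),
}

# Tail of an AFR callback record. The record is wrapped across lines in
# the log, so whitespace between the fields may include newlines.
AFR_FAILURE = re.compile(r"child=(\S+)\)\s+op_ret=(-?\d+)\s+op_errno=(\d+)")


def summarize_afr_failures(log_text):
    """Return (child, errno_name, description) for each failed AFR callback."""
    results = []
    for child, op_ret, op_errno in AFR_FAILURE.findall(log_text):
        if int(op_ret) >= 0:
            continue  # not a failure
        name, desc = LINUX_ERRNO.get(
            int(op_errno), ("errno %s" % op_errno, "unknown")
        )
        results.append((child, name, desc))
    return results


# Two records copied from the client log above.
sample = """\
2008-11-11 01:57:01 E [afr.c:2393:afr_writev_cbk] afr4:
(path=/backup5/intelligence.local/rdiff-backup-data/increments/home/pcformat/tmp/analog/cache.2008-11-10T01:30:12+02:00.diff.gz
child=u100-node6) op_ret=-1 op_errno=107
2008-11-11 03:06:34 E [afr.c:2649:afr_flush_cbk] afr4:
(path=/backup5/intelligence.local/rdiff-backup-data/mirror_metadata.2008-11-10T01:30:12+02:00.snapshot.gz
child=u100-node6) op_ret=-1 op_errno=77
"""

for child, name, desc in summarize_afr_failures(sample):
    print(child, name, desc)
```

All of the failures in this log name the same child, u100-node6, which is the subvolume whose transport bailed.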
The server side sees the client going away:
2008-11-11 01:57:01 E [protocol.c:271:gf_block_unserialize_transport] server:
EOF from peer (192.168.15.43:1001)
2008-11-11 01:57:01 E [server-protocol.c:186:generic_reply] server:
transport_writev failed
2008-11-11 02:36:14 E [protocol.c:271:gf_block_unserialize_transport] server:
EOF from peer (192.168.15.43:1021)
2008-11-11 02:36:14 E [server-protocol.c:186:generic_reply] server:
transport_writev failed
2008-11-11 07:30:39 E [protocol.c:271:gf_block_unserialize_transport] server:
EOF from peer (192.168.15.43:999)
2008-11-11 07:30:39 E [protocol.c:271:gf_block_unserialize_transport] server:
EOF from peer (192.168.15.43:1020)
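To check whether the server-side EOFs line up with the client-side bails, the timestamps in the two logs can be compared. This is a minimal sketch only, assuming the client and server clocks are synchronized to the second. The function name matching_disconnects and the patterns are hypothetical and simply follow the log layout shown in this message.

```python
import re

TIMESTAMP = r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"

# Client side: the transport timeout fired and the connection was dropped.
CLIENT_BAIL = re.compile(
    TIMESTAMP + r" C \[client-protocol\.c:\d+:call_bail\]"
)
# Server side: the peer closed the connection.
SERVER_EOF = re.compile(
    TIMESTAMP + r" E \[protocol\.c:\d+:gf_block_unserialize_transport\]"
)


def matching_disconnects(client_log, server_log):
    """Return sorted timestamps present in both a client bail and a server EOF."""
    bails = set(CLIENT_BAIL.findall(client_log))
    eofs = set(SERVER_EOF.findall(server_log))
    return sorted(bails & eofs)


# Lines copied from the client and server logs in this message.
client_sample = """\
2008-11-11 01:57:01 C [client-protocol.c:212:call_bail] u100-node6: bailing
transport
2008-11-11 02:36:14 C [client-protocol.c:212:call_bail] u100-node6: bailing
transport
"""

server_sample = """\
2008-11-11 01:57:01 E [protocol.c:271:gf_block_unserialize_transport] server:
EOF from peer (192.168.15.43:1001)
2008-11-11 01:57:01 E [server-protocol.c:186:generic_reply] server:
transport_writev failed
2008-11-11 02:36:14 E [protocol.c:271:gf_block_unserialize_transport] server:
EOF from peer (192.168.15.43:1021)
2008-11-11 07:30:39 E [protocol.c:271:gf_block_unserialize_transport] server:
EOF from peer (192.168.15.43:999)
"""

for stamp in matching_disconnects(client_sample, server_sample):
    print(stamp)
```

In the excerpts above, the two client bails share their timestamps with server EOFs, while the 07:30:39 EOFs have no matching bail in the client log shown.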