[Gluster-devel] errors in my log
m.c.wilkins at massey.ac.nz
Tue Apr 21 08:36:59 UTC 2009
hi all,
i am having problems with 2.0.0rc7: i get "Transport endpoint is not
connected" from time to time. the problem is causing us a major
headache, as it basically makes our gluster setup unusable.
i posted this to the users list, but haven't had a response yet, so
i hope you can help.
i'm wondering if i could just tweak some timeouts or something. it is
really odd though: the machines are connected to the same gigabit
switch, and are not heavily loaded (well, i don't think so anyway, but i
didn't check at the time).
---
hi,
i'm running 2.0.0rc7 (config below) in a nufa setup. please help me
out, i'm getting some errors in my logs:
on tur-awc1 i have:
2009-04-21 01:24:43 E [client-protocol.c:533:client_ping_timer_expired] tur-awc3-0: ping timer expired! bailing transport
2009-04-21 01:24:43 E [saved-frames.c:169:saved_frames_unwind] tur-awc3-0: forced unwinding frame type(1) op(MKNOD)
2009-04-21 01:24:43 E [fuse-bridge.c:1274:fuse_rename_cbk] glusterfs-fuse: 102353757: /090417_HWI-EAS209_0011_FC30K/Data/IPAR_1.3/Bustard1.3.2_20-04-2009_mjscolli/GERALD_20-04-2009_mjscolli/s_5_0009_realign.txt.tmp -> /090417_HWI-EAS209_0011_FC30K/Data/IPAR_1.3/Bustard1.3.2_20-04-2009_mjscolli/GERALD_20-04-2009_mjscolli/s_5_0009_realign.txt => -1 (Transport endpoint is not connected)
2009-04-21 01:24:43 E [saved-frames.c:169:saved_frames_unwind] tur-awc3-0: forced unwinding frame type(1) op(STAT)
2009-04-21 01:24:43 E [dht-common.c:762:dht_attr_cbk] nufa: subvolume tur-awc3-0 returned -1 (Transport endpoint is not connected)
2009-04-21 01:24:43 E [saved-frames.c:169:saved_frames_unwind] tur-awc3-0: forced unwinding frame type(1) op(STAT)
2009-04-21 01:24:43 E [dht-common.c:762:dht_attr_cbk] nufa: subvolume tur-awc3-0 returned -1 (Transport endpoint is not connected)
2009-04-21 01:24:43 E [saved-frames.c:169:saved_frames_unwind] tur-awc3-0: forced unwinding frame type(1) op(STAT)
2009-04-21 01:24:43 E [dht-common.c:762:dht_attr_cbk] nufa: subvolume tur-awc3-0 returned -1 (Transport endpoint is not connected)
2009-04-21 01:24:43 E [saved-frames.c:169:saved_frames_unwind] tur-awc3-0: forced unwinding frame type(2) op((null))
2009-04-21 01:24:43 E [client-protocol.c:630:client_ping_cbk] tur-awc3-0: timer must have expired
2009-04-21 01:24:45 N [client-protocol.c:6159:client_setvolume_cbk] tur-awc3-0: connection and handshake succeeded
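With a log this repetitive, a quick tally per translator and function can make the pattern easier to see. Below is a minimal Python sketch, assuming every line follows the layout seen above (timestamp, one-letter level, [file:line:function], translator name, message). The regex and the `summarize` helper are illustrative names, not part of GlusterFS, and the sample lines are copied from the log above.

```python
import re
from collections import Counter

# Assumed layout: "<date> <time> <level> [<file>:<line>:<function>] <name>: <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]) "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<name>[^:]+): (?P<msg>.*)$"
)


def summarize(lines):
    """Count log entries per (level, translator name, function)."""
    counts = Counter()
    for raw in lines:
        match = LOG_LINE.match(raw.strip())
        if match is None:
            continue  # skip anything that does not fit the assumed layout
        counts[(match["level"], match["name"], match["func"])] += 1
    return counts


# A few lines taken from the tur-awc1 log above.
SAMPLE = [
    "2009-04-21 01:24:43 E [client-protocol.c:533:client_ping_timer_expired] tur-awc3-0: ping timer expired! bailing transport",
    "2009-04-21 01:24:43 E [saved-frames.c:169:saved_frames_unwind] tur-awc3-0: forced unwinding frame type(1) op(MKNOD)",
    "2009-04-21 01:24:43 E [saved-frames.c:169:saved_frames_unwind] tur-awc3-0: forced unwinding frame type(1) op(STAT)",
    "2009-04-21 01:24:43 E [dht-common.c:762:dht_attr_cbk] nufa: subvolume tur-awc3-0 returned -1 (Transport endpoint is not connected)",
    "2009-04-21 01:24:45 N [client-protocol.c:6159:client_setvolume_cbk] tur-awc3-0: connection and handshake succeeded",
]

if __name__ == "__main__":
    for (level, name, func), n in summarize(SAMPLE).most_common():
        print(level, name, func, n)
```

To run it over a whole log, pass an open file object to `summarize` instead of `SAMPLE`.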
it seems to be having problems communicating with tur-awc3, and in
that machine's log i have:
2009-04-21 01:24:45 N [server-protocol.c:7513:mop_setvolume] server: accepted client from 130.123.129.121:1013
2009-04-21 01:24:45 E [socket.c:102:__socket_rwv] server: writev failed (Broken pipe)
2009-04-21 01:24:45 N [server-protocol.c:8268:notify] server: 130.123.129.121:1011 disconnected
2009-04-21 04:03:12 W [nufa.c:219:nufa_lookup] nufa: incomplete layout failure for path=/
2009-04-21 04:03:12 W [fuse-bridge.c:301:need_fresh_lookup] fuse-bridge: revalidate of / failed (Resource temporarily unavailable)
any idea what is happening? i can confirm that both machines are up
and under almost no load, connected on the same gigabit switch.
130.123.129.121 is the IP address of tur-awc1.
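One way to back up the "both machines are up" check is to test whether the brick port accepts TCP connections from the client side. This is only a rough Python sketch: the `can_connect` helper is a made-up name, the host names and port 16996 are taken from the config in this post, and a successful connect shows only that the port is reachable, not that the server answers pings in time.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    # Host names and port as used in the volfile in this post.
    for host in ("tur-awc1", "tur-awc2", "tur-awc3"):
        print(host, can_connect(host, 16996))
```

Running this in a loop around the time the errors show up would help tell a network-level drop apart from a server that is simply slow to respond.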
any help much appreciated
Matt
---
oh and my config is:
volume posix0
  type storage/posix
  option directory /export/brick-newgluster
end-volume

volume locks0
  type features/locks
  subvolumes posix0
end-volume

volume brick0
  type performance/io-threads
  subvolumes locks0
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option listen-port 16996
  option auth.addr.brick0.allow *
  subvolumes brick0
end-volume

volume tur-awc1-0
  type protocol/client
  option transport-type tcp
  option remote-port 16996
  option remote-host tur-awc1
  option remote-subvolume brick0
end-volume

volume tur-awc2-0
  type protocol/client
  option transport-type tcp
  option remote-port 16996
  option remote-host tur-awc2
  option remote-subvolume brick0
end-volume

volume tur-awc3-0
  type protocol/client
  option transport-type tcp
  option remote-port 16996
  option remote-host tur-awc3
  option remote-subvolume brick0
end-volume

volume nufa
  type cluster/nufa
  option local-volume-name `hostname`-0
  subvolumes tur-awc1-0 tur-awc2-0 tur-awc3-0
end-volume
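Since `local-volume-name` is built from the output of `hostname`, it may be worth confirming on each machine that the resulting name matches one of the nufa subvolumes. The Python sketch below assumes the name is simply the host name with `-0` appended, as in the volfile above. The helper names are hypothetical, and `socket.gethostname()` is used as a stand-in that may differ from what the `hostname` command prints on a given system.

```python
import socket

# Subvolume names as listed under "volume nufa" in the volfile above.
SUBVOLUMES = ["tur-awc1-0", "tur-awc2-0", "tur-awc3-0"]


def local_volume_name(hostname, suffix="-0"):
    """Build the name the volfile would produce from `hostname`-0."""
    return hostname + suffix


def check_local_volume(hostname, subvolumes):
    """Return (name, ok), where ok is True if name is one of the subvolumes."""
    name = local_volume_name(hostname)
    return name, name in subvolumes


if __name__ == "__main__":
    name, ok = check_local_volume(socket.gethostname(), SUBVOLUMES)
    print(name, "matches a subvolume" if ok else "does NOT match any subvolume")
```

A host name that comes back fully qualified (for example with a domain suffix) would fail this check, which is the sort of mismatch the sketch is meant to surface.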
Matt