[Gluster-users] df causes hang
Joe Warren-Meeks
joe at encoretickets.co.uk
Sat Jan 15 11:40:56 UTC 2011
Hey guys,
I've been using glusterfs to share a volume between two webservers
happily for quite a while.
However, for some reason, they've got into a bit of a state such that
typing 'df -k' causes both to hang, resulting in a loss of service for 42
seconds. Any ideas what might be causing this?
I see the following messages in the log files:
Server1
Glusterfs.log: (i.e. the client log)
[2011-01-15 11:22:54] E [saved-frames.c:165:saved_frames_unwind]
10.10.130.11-1: forced unwinding frame type(1) op(LOOKUP)
[2011-01-15 11:22:54] E [saved-frames.c:165:saved_frames_unwind]
10.10.130.11-1: forced unwinding frame type(1) op(LOOKUP)
[2011-01-15 11:22:54] E [saved-frames.c:165:saved_frames_unwind]
10.10.130.11-1: forced unwinding frame type(1) op(LOOKUP)
[2011-01-15 11:22:54] E [saved-frames.c:165:saved_frames_unwind]
10.10.130.11-1: forced unwinding frame type(1) op(LOOKUP)
[2011-01-15 11:22:54] E [saved-frames.c:165:saved_frames_unwind]
10.10.130.11-1: forced unwinding frame type(2) op(PING)
[2011-01-15 11:22:54] N [client-protocol.c:6976:notify] 10.10.130.11-1:
disconnected
[2011-01-15 11:22:54] N [client-protocol.c:6228:client_setvolume_cbk]
10.10.130.11-1: Connected to 10.10.130.11:6996, attached to remote
volume 'brick1'.
[2011-01-15 11:22:54] N [client-protocol.c:6228:client_setvolume_cbk]
10.10.130.11-1: Connected to 10.10.130.11:6996, attached to remote
volume 'brick1'.
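To see at a glance which operations were in flight when the connection dropped, the "forced unwinding" entries can be tallied by op name. This is only a minimal sketch: the function name count_unwound_ops is made up for illustration, and it assumes the log text uses the same "forced unwinding frame type(N) op(NAME)" wording as the entries above.

```python
import re
from collections import Counter

# Matches the op name in "forced unwinding frame type(N) op(NAME)".
UNWIND_RE = re.compile(r"forced unwinding frame type\((\d+)\) op\((\w+)\)")


def count_unwound_ops(log_text):
    """Return a Counter mapping op name -> number of force-unwound frames.

    Hypothetical helper; works on the raw log text, wrapped or not,
    because the matched phrase never spans a line break.
    """
    return Counter(m.group(2) for m in UNWIND_RE.finditer(log_text))


if __name__ == "__main__":
    sample = (
        "[2011-01-15 11:22:54] E [saved-frames.c:165:saved_frames_unwind] "
        "10.10.130.11-1: forced unwinding frame type(1) op(LOOKUP)\n"
        "[2011-01-15 11:22:54] E [saved-frames.c:165:saved_frames_unwind] "
        "10.10.130.11-1: forced unwinding frame type(2) op(PING)\n"
    )
    for op, n in count_unwound_ops(sample).most_common():
        print(op, n)
```

Run against the Server1 client log excerpt above, this would report four LOOKUP frames and one PING frame.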
Glusterfsd.log:
[2011-01-15 11:22:54] N [server-protocol.c:6748:notify] server-tcp:
10.10.130.12:1023 disconnected
[2011-01-15 11:22:54] N [server-protocol.c:6748:notify] server-tcp:
10.10.130.11:1022 disconnected
[2011-01-15 11:22:54] N [server-protocol.c:6748:notify] server-tcp:
10.10.130.12:1022 disconnected
[2011-01-15 11:22:54] N [server-helpers.c:842:server_connection_destroy]
server-tcp: destroyed connection of
w3-4176-2010/10/19-06:35:34:26343-10.10.130.11-1
[2011-01-15 11:22:54] N [server-protocol.c:6748:notify] server-tcp:
10.10.130.11:1018 disconnected
[2011-01-15 11:22:54] N [server-helpers.c:842:server_connection_destroy]
server-tcp: destroyed connection of
w2-827-2011/01/15-11:09:38:7996-10.10.130.11-1
[2011-01-15 11:22:54] N [server-protocol.c:5812:mop_setvolume]
server-tcp: accepted client from 10.10.130.12:1019
[2011-01-15 11:22:54] N [server-protocol.c:5812:mop_setvolume]
server-tcp: accepted client from 10.10.130.12:1018
[2011-01-15 11:22:54] N [server-protocol.c:5812:mop_setvolume]
server-tcp: accepted client from 10.10.130.11:1023
[2011-01-15 11:22:54] N [server-protocol.c:5812:mop_setvolume]
server-tcp: accepted client from 10.10.130.11:1019
Server2
Client log:
[2011-01-15 11:21:47] E
[client-protocol.c:415:client_ping_timer_expired] 10.10.130.11-1: Server
10.10.130.11:6996 has not responded in the last 42 seconds,
disconnecting.
[2011-01-15 11:21:47] E [saved-frames.c:165:saved_frames_unwind]
10.10.130.11-1: forced unwinding frame type(1) op(STATFS)
[2011-01-15 11:21:47] E [saved-frames.c:165:saved_frames_unwind]
10.10.130.11-1: forced unwinding frame type(1) op(LOOKUP)
[2011-01-15 11:21:47] E [saved-frames.c:165:saved_frames_unwind]
10.10.130.11-1: forced unwinding frame type(1) op(LOOKUP)
[2011-01-15 11:21:47] N [client-protocol.c:6976:notify] 10.10.130.11-1:
disconnected
[2011-01-15 11:22:54] N [client-protocol.c:6228:client_setvolume_cbk]
10.10.130.11-1: Connected to 10.10.130.11:6996, attached to remote
volume 'brick1'.
[2011-01-15 11:22:54] N [client-protocol.c:6228:client_setvolume_cbk]
10.10.130.11-1: Connected to 10.10.130.11:6996, attached to remote
volume 'brick1'.
Note that the 2nd server doesn't show anything in the server log.
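The gap between the "disconnected" entry and the following "Connected to" entry in a client log can be measured with a few lines of Python. This is a rough sketch, not a GlusterFS tool: outage_seconds is a hypothetical helper, and it assumes each log entry sits on a single line in the real log file (the excerpts above are wrapped by the mail client).

```python
import re
from datetime import datetime

# Leading "[YYYY-MM-DD HH:MM:SS]" timestamp, as in the log excerpts.
TS_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]")


def outage_seconds(lines):
    """Seconds from the first 'disconnected' entry to the next 'Connected to'.

    Returns None if no disconnect/reconnect pair is found.
    Assumes one log entry per line (an assumption about the real file).
    """
    down_at = None
    for line in lines:
        m = TS_RE.match(line)
        if not m:
            continue
        ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        if down_at is None and line.rstrip().endswith("disconnected"):
            down_at = ts
        elif down_at is not None and "Connected to" in line:
            return (ts - down_at).total_seconds()
    return None


if __name__ == "__main__":
    sample = [
        "[2011-01-15 11:21:47] N [client-protocol.c:6976:notify] "
        "10.10.130.11-1: disconnected",
        "[2011-01-15 11:22:54] N [client-protocol.c:6228:client_setvolume_cbk] "
        "10.10.130.11-1: Connected to 10.10.130.11:6996, attached to remote "
        "volume 'brick1'.",
    ]
    print(outage_seconds(sample))
```

For the Server2 client log above, the two timestamps are 11:21:47 and 11:22:54, which is 67 seconds between the disconnect and the reconnect.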
My glusterfsd.vol:
volume posix1
type storage/posix
option directory /data/export
end-volume
volume brick1
type features/locks
subvolumes posix1
end-volume
volume server-tcp
type protocol/server
option transport-type tcp
option auth.addr.brick1.allow *
option transport.socket.listen-port 6996
option transport.socket.nodelay on
subvolumes brick1
end-volume
My repstore.vol:
## file auto generated by /usr/bin/glusterfs-volgen (mount.vol)
# Cmd line:
# $ /usr/bin/glusterfs-volgen --name repstore1 --raid 1 10.10.130.11:/data/export 10.10.130.12:/data/export
# RAID 1
# TRANSPORT-TYPE tcp
volume 10.10.130.12-1
type protocol/client
option transport-type tcp
option remote-host 10.10.130.12
option transport.socket.nodelay on
option transport.remote-port 6996
option remote-subvolume brick1
end-volume
volume 10.10.130.11-1
type protocol/client
option transport-type tcp
option remote-host 10.10.130.11
option transport.socket.nodelay on
option transport.remote-port 6996
option remote-subvolume brick1
end-volume
volume mirror-0
type cluster/replicate
subvolumes 10.10.130.11-1 10.10.130.12-1
end-volume
volume writebehind
type performance/write-behind
option cache-size 4MB
subvolumes mirror-0
end-volume
volume iocache
type performance/io-cache
option cache-size `grep 'MemTotal' /proc/meminfo | awk '{print $2 * 0.2 / 1024}' | cut -f1 -d.`MB
option cache-timeout 60
subvolumes writebehind
end-volume
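The io-cache cache-size expression above takes MemTotal (reported in kB in /proc/meminfo), multiplies it by 0.2, divides by 1024, and drops the fractional part, i.e. 20% of RAM in whole MB. The same arithmetic can be sketched in Python to check what value a given machine would end up with. The function name io_cache_size_mb is hypothetical, and integer division by 5120 is used here as an equivalent of "* 0.2 / 1024" followed by truncation.

```python
def io_cache_size_mb(meminfo_text):
    """Return 20% of MemTotal in whole MB, given the text of /proc/meminfo.

    Hypothetical helper mirroring the grep | awk | cut pipeline:
    kB * 0.2 / 1024 is the same as kB / 5120, truncated to an integer.
    """
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal"):
            kb = int(line.split()[1])  # second field is the size in kB
            return kb // 5120
    raise ValueError("MemTotal not found")


if __name__ == "__main__":
    # Example input only; the real value comes from the machine's /proc/meminfo.
    example = "MemTotal:        8174592 kB\nMemFree:          123456 kB\n"
    print(f"option cache-size {io_cache_size_mb(example)}MB")
```

With the made-up MemTotal of 8174592 kB in the example, the result is 1596, so the option would read "cache-size 1596MB".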
-- joe.
Joe Warren-Meeks
Director Of Systems Development
ENCORE TICKETS LTD
Encore House, 50-51 Bedford Row, London WC1R 4LR
Direct line: +44 (0)20 7492 1506
Reservations: +44 (0)20 7492 1500
Fax: +44 (0)20 7831 4410
Email: joe at encoretickets.co.uk
web: www.encoretickets.co.uk