[Gluster-users] Gluster IO thread unresponsive
Daniel Pereira
d.pereira at skillupjapan.co.jp
Thu Jan 6 02:37:30 UTC 2011
Hi everyone,
I hope you can help me, or at least point me in the right direction.
I'm running stress tests on a 16-disk striped setup, using 300 MB files
created with "dd if=/dev/zero of=smalltest.file bs=1048576 count=300".
The tests read those files back in a loop, with the same block size and
count: "dd of=/dev/null if=smalltest.file bs=1048576 count=300".
Both the server and the client machine have a dual-core Intel Xeon and are
connected by a 10 Gbit link; the OS is Ubuntu 10.10 with kernel 2.6.36.2.
The tests were done with a freshly compiled git version of GlusterFS
(v3.1.1-52-gcbba1c3).
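For anyone who wants to reproduce the read side of the test without dd, here is a minimal Python sketch of the same idea: read a file sequentially in 1048576-byte blocks, over and over, and time each pass. It is only an approximation of the dd loop described above. The function names read_file and read_loop are made up for illustration, and the demo uses a small temporary file rather than a 300 MB file on the GlusterFS mount.

```python
import os
import tempfile
import time

BLOCK_SIZE = 1048576  # same value as bs=1048576 in the dd commands


def read_file(path, block_size=BLOCK_SIZE):
    """Read the whole file sequentially in block_size chunks; return bytes read."""
    total = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    return total


def read_loop(path, passes):
    """Re-read the file `passes` times, yielding (bytes_read, seconds) per pass."""
    for _ in range(passes):
        start = time.monotonic()
        nbytes = read_file(path)
        yield nbytes, time.monotonic() - start


if __name__ == "__main__":
    # Hypothetical small demo file; the real test used 300 MB files on the mount.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\0" * (3 * BLOCK_SIZE))
    try:
        for nbytes, secs in read_loop(tmp.name, passes=2):
            print(f"read {nbytes} bytes in {secs:.4f} s")
    finally:
        os.unlink(tmp.name)
```

Pointing read_loop at a file on the mounted volume with a large number of passes gives roughly the same access pattern as the dd loop, with per-pass timings that make a stall easy to spot.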
After some time (always after a couple of hours), during which GlusterFS
works fine and only reads the created files, the CPU usage of one of the
glusterfsd threads goes up to 340% (using almost all of the 4 cores) and it
seems to become unresponsive.
Is this a known issue, or is there a mistake on my part (configs, etc.)?
There isn't any relevant information in the server logs (attached file), but
the client logs show the following (also attached):
[2011-01-05 20:43:41.258672] E
[client-handshake.c:116:rpc_client_ping_timer_expired] stripe1-client-8:
Server 10.0.0.1:24033 has not responded in the last 42 seconds,
disconnecting.
[2011-01-05 20:43:41.271821] E [rpc-clnt.c:338:saved_frames_unwind]
(-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x88) [0x7f032d9f6678]
(-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7d)
[0x7f032d9f5ddd] (-->/usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0xe)
[0x7f032d9f5d3e]))) rpc-clnt: forced unwinding frame type(GlusterFS 3.1)
op(READ(12)) called at 2011-01-05 20:42:26.310399
[2011-01-05 20:43:41.272034] E [rpc-clnt.c:338:saved_frames_unwind]
(-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x88) [0x7f032d9f6678]
(-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7d)
[0x7f032d9f5ddd] (-->/usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0xe)
[0x7f032d9f5d3e]))) rpc-clnt: forced unwinding frame type(GlusterFS 3.1)
op(READ(12)) called at 2011-01-05 20:42:26.310523
[2011-01-05 20:43:41.272088] E [rpc-clnt.c:338:saved_frames_unwind]
(-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x88) [0x7f032d9f6678]
(-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7d)
[0x7f032d9f5ddd] (-->/usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0xe)
[0x7f032d9f5d3e]))) rpc-clnt: forced unwinding frame type(GlusterFS 3.1)
op(READ(12)) called at 2011-01-05 20:42:26.310556
[2011-01-05 20:43:41.272140] E [rpc-clnt.c:338:saved_frames_unwind]
(-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x88) [0x7f032d9f6678]
(-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7d)
[0x7f032d9f5ddd] (-->/usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0xe)
[0x7f032d9f5d3e]))) rpc-clnt: forced unwinding frame type(GlusterFS 3.1)
op(READ(12)) called at 2011-01-05 20:42:26.310575
[2011-01-05 20:43:41.272191] E [rpc-clnt.c:338:saved_frames_unwind]
(-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x88) [0x7f032d9f6678]
(-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7d)
[0x7f032d9f5ddd] (-->/usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0xe)
[0x7f032d9f5d3e]))) rpc-clnt: forced unwinding frame type(GlusterFS 3.1)
op(READ(12)) called at 2011-01-05 20:42:26.310595
[2011-01-05 20:43:41.272241] E [rpc-clnt.c:338:saved_frames_unwind]
(-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x88) [0x7f032d9f6678]
(-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7d)
[0x7f032d9f5ddd] (-->/usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0xe)
[0x7f032d9f5d3e]))) rpc-clnt: forced unwinding frame type(GlusterFS 3.1)
op(READ(12)) called at 2011-01-05 20:42:26.310618
[2011-01-05 20:43:41.272292] E [rpc-clnt.c:338:saved_frames_unwind]
(-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x88) [0x7f032d9f6678]
(-->/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7d)
[0x7f032d9f5ddd] (-->/usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0xe)
[0x7f032d9f5d3e]))) rpc-clnt: forced unwinding frame type(GlusterFS
Handshake) op(PING(3)) called at 2011-01-05 20:42:59.256051
[2011-01-05 20:43:41.272315] I [client.c:1590:client_rpc_notify]
stripe1-client-8: disconnected
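As a quick sanity check on the timestamps in this log excerpt, the following Python sketch computes how long the PING and the oldest READ had been outstanding when the client gave up on the server. It is not part of the original test. The helper name pending_seconds is made up for illustration, and the timestamps are copied from the log lines above.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # timestamp format used in the GlusterFS log lines


def pending_seconds(called_at, logged_at):
    """Seconds between a request being issued and the log line reporting its failure."""
    delta = datetime.strptime(logged_at, FMT) - datetime.strptime(called_at, FMT)
    return delta.total_seconds()


# PING: "called at" time vs. the ping-timer-expired message.
ping_wait = pending_seconds("2011-01-05 20:42:59.256051",
                            "2011-01-05 20:43:41.258672")

# Oldest READ: "called at" time vs. its forced-unwind message.
read_wait = pending_seconds("2011-01-05 20:42:26.310399",
                            "2011-01-05 20:43:41.271821")

print(f"PING outstanding for {ping_wait:.3f} s")
print(f"oldest READ outstanding for {read_wait:.3f} s")
```

The PING had been waiting just over 42 seconds, which matches the "has not responded in the last 42 seconds" message, while the oldest READ had already been pending for roughly 75 seconds by then.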
The configuration for each subvolume is:
+------------------------------------------------------------------------------+
volume stripe1-posix
type storage/posix
option directory /data_b
end-volume
volume stripe1-access-control
type features/access-control
subvolumes stripe1-posix
end-volume
volume stripe1-locks
type features/locks
subvolumes stripe1-access-control
end-volume
volume stripe1-io-threads
type performance/io-threads
subvolumes stripe1-locks
end-volume
volume /data_b
type debug/io-stats
subvolumes stripe1-io-threads
end-volume
volume stripe1-server
type protocol/server
option transport-type tcp
option auth.addr./data_b.allow *
subvolumes /data_b
end-volume
+------------------------------------------------------------------------------+
The server/client configuration used is:
+------------------------------------------------------------------------------+
volume stripe1-client-0
type protocol/client
option remote-host 10.0.0.1
option remote-subvolume /data_b
option transport-type tcp
end-volume

<repeat 15 times for each subvolume>

volume stripe1-stripe-0
type cluster/stripe
option block-size 1MB
subvolumes stripe1-client-0 stripe1-client-1 stripe1-client-2 stripe1-client-3 stripe1-client-4 stripe1-client-5 stripe1-client-6 stripe1-client-7 stripe1-client-8 stripe1-client-9 stripe1-client-10 stripe1-client-11 stripe1-client-12 stripe1-client-13 stripe1-client-14 stripe1-client-15
end-volume

volume stripe1-write-behind
type performance/write-behind
subvolumes stripe1-stripe-0
end-volume

volume stripe1-read-ahead
type performance/read-ahead
option page-count 128
option page-size 8388608
subvolumes stripe1-write-behind
end-volume

volume stripe1-io-cache
type performance/io-cache
option cache-size 1GB
subvolumes stripe1-read-ahead
end-volume

volume stripe1-quick-read
type performance/quick-read
subvolumes stripe1-io-cache
end-volume

volume stripe1-stat-prefetch
type performance/stat-prefetch
subvolumes stripe1-quick-read
end-volume

volume stripe1
type debug/io-stats
subvolumes stripe1-stat-prefetch
end-volume
+------------------------------------------------------------------------------+
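To give a rough picture of how a 300 MB test file would be spread over the 16 subvolumes with a 1MB block size, here is a small Python sketch. It assumes plain round-robin striping that starts at the first subvolume (block N goes to subvolume N mod 16) and treats "1MB" as 1048576 bytes. Both are assumptions made for illustration and were not checked against the cluster/stripe source, and the function names are made up.

```python
BLOCK_SIZE = 1024 * 1024    # "option block-size 1MB", assumed to mean 1 MiB
SUBVOLUMES = 16             # stripe1-client-0 .. stripe1-client-15
FILE_SIZE = 300 * 1048576   # dd bs=1048576 count=300


def subvolume_for_offset(offset):
    """Index of the subvolume holding the byte at `offset` (round-robin assumption)."""
    return (offset // BLOCK_SIZE) % SUBVOLUMES


def blocks_per_subvolume(file_size):
    """Number of stripe blocks each subvolume holds for a file of `file_size` bytes."""
    counts = [0] * SUBVOLUMES
    nblocks = -(-file_size // BLOCK_SIZE)  # ceiling division
    for block in range(nblocks):
        counts[block % SUBVOLUMES] += 1
    return counts


if __name__ == "__main__":
    for index, count in enumerate(blocks_per_subvolume(FILE_SIZE)):
        print(f"stripe1-client-{index}: {count} blocks")
```

Under these assumptions every sequential pass over a test file touches all 16 subvolumes, with 18 or 19 blocks read from each, so a single stalled server process such as the one behind stripe1-client-8 would be enough to block the whole read.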
TIA,
Daniel
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: glusterclientlog.txt
URL: <http://supercolony.gluster.org/pipermail/gluster-users/attachments/20110106/fd8dec83/attachment.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: glusterserverlog.txt
URL: <http://supercolony.gluster.org/pipermail/gluster-users/attachments/20110106/fd8dec83/attachment-0001.txt>