[Gluster-users] Slow open on Server or Client
Jürgen Winkler
juergen.winkler at xidras.com
Thu Sep 16 16:05:07 UTC 2010
Hi,
I found out what happens when our webserver needs a long time (more
than 1 second) to deliver a picture via glusterfs.
Here is what happens on the servers:
gfs-server 1:
A request was sent from the glusterfs-client at 11:09:31.XXXXX
4583 11:09:36.299385 <... futex resumed> ) = 0 <7.740114>
4570 11:09:36.299404 epoll_wait(3, <unfinished ...>
4583 11:09:36.299416 lstat("/gluster-storage/macm01//videos02/6839/preview/490ab55407a36.jpg", {st_mode=S_IFREG|0777, st_size=5116, ...}) = 0 <0.000155>
4583 11:09:36.299628 lgetxattr("/gluster-storage/macm01//videos02/6839/preview/490ab55407a36.jpg", "trusted.posix.gen", "Lj3\xe7\x00q\xeb\xe1", 8) = 8 <0.000009>
4583 11:09:36.299680 open("/gluster-storage/macm01//videos02/6839/preview/490ab55407a36.jpg", O_RDONLY) = 56 <0.000070>
4583 11:09:36.299789 read(56, "\377\330\377\340\0\20JFIF\0\1\1\0\0\1\0\1\0\0\377\376\0>CREATOR:"..., 5116) = 5116 <0.016504>
4583 11:09:36.316340 close(56) = 0 <0.000006>
4583 11:09:36.316371 lgetxattr("/gluster-storage/macm01//videos02/6839/preview/490ab55407a36.jpg", "trusted.afr.distribute1", 0x0, 0) = 12 <0.000008>
4583 11:09:36.316409 lgetxattr("/gluster-storage/macm01//videos02/6839/preview/490ab55407a36.jpg", "trusted.afr.distribute1", "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", 12) = 12 <0.000007>
4583 11:09:36.316447 lgetxattr("/gluster-storage/macm01//videos02/6839/preview/490ab55407a36.jpg", "trusted.afr.distribute2", 0x0, 0) = 12 <0.000035>
4583 11:09:36.316528 lgetxattr("/gluster-storage/macm01//videos02/6839/preview/490ab55407a36.jpg", "trusted.afr.distribute2", "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", 12) = 12 <0.000142>
4583 11:09:36.316712 lgetxattr("/gluster-storage/macm01//videos02/6839/preview/490ab55407a36.jpg", "glusterfs.inodelk-count", 0x0, 0) = -1 EOPNOTSUPP (Operation not supported) <0.000008>
4583 11:09:36.316752 lgetxattr("/gluster-storage/macm01//videos02/6839/preview/490ab55407a36.jpg", "glusterfs.entrylk-count", 0x0, 0) = -1 EOPNOTSUPP (Operation not supported) <0.000006>
4583 11:09:36.316787 lgetxattr("/gluster-storage/macm01//videos02/6839/preview/490ab55407a36.jpg", "trusted.glusterfs.dht", 0x0, 0) = -1 ENODATA (No data available) <0.000033>
4583 11:09:36.316850 lgetxattr("/gluster-storage/macm01//videos02/6839/preview/490ab55407a36.jpg", "trusted.glusterfs.dht.linkto", 0x0, 0) = -1 ENODATA (No data available) <0.000028>
What could it be that blocks the futex for 7.7 seconds? Why is no
other thread handling this request in the meantime?
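One way to find such stalls systematically, instead of reading the trace by eye, is to scan the strace output for syscalls whose elapsed time (the number strace appends in angle brackets when run with -T) exceeds a threshold. A minimal sketch, assuming the "PID  HH:MM:SS.usec  call  <duration>" line format shown above:

```python
import re

# Matches "PID HH:MM:SS.usec <anything> <duration>" as produced by
# strace -f -tt -T; the duration in angle brackets comes last on the line.
LINE = re.compile(r'^(\d+)\s+(\d+:\d+:\d+\.\d+)\s+(.*)\s+<(\d+\.\d+)>$')

def slow_calls(lines, threshold=1.0):
    """Yield (pid, timestamp, call, seconds) for syscalls slower than threshold."""
    for line in lines:
        m = LINE.match(line.strip())
        if m and float(m.group(4)) > threshold:
            yield m.group(1), m.group(2), m.group(3), float(m.group(4))

# Two lines from the trace above: only the futex wait exceeds 1 second.
log = [
    '4583 11:09:36.299385 <... futex resumed> ) = 0 <7.740114>',
    '4583 11:09:36.316340 close(56) = 0 <0.000006>',
]
for pid, ts, call, secs in slow_calls(log):
    print(pid, ts, secs)
```

Feeding the whole strace file through this would show at a glance whether the 7.7-second futex wait is the only long call or one of several.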
gfs-server 2:
According to the straces, the same share on the second server does
nothing for several seconds. Why?
21089 11:09:28.582044 readv(16, [{":O\0\0\0\2\"\0\0\0\0*", 12}], 1) = 12 <0.000011>
21089 11:09:28.583087 readv(16, <unfinished ...>
21090 11:09:33.739852 <... nanosleep resumed> NULL) = 0 <5.228398>
21089 11:09:33.739884 <... readv resumed> [{"\0\0\0\0\3\361s\261\0\0\0\1\0\0\0\33\0\0\1\266\0\0@\226\0\0\377\376\0\0\377\375"..., 546}], 1) = 546 <0.000067>
21090 11:09:33.739956 nanosleep({1, 0}, <unfinished ...>
21089 11:09:33.739975 futex(0x615188, FUTEX_WAKE_PRIVATE, 1) = 1 <0.000008>
21103 11:09:33.740006 <... futex resumed> ) = 0 <5.251553>
21089 11:09:33.740019 epoll_wait(3, {{EPOLLIN, {u32=16, u64=38654705680}}, {EPOLLIN, {u32=19, u64=51539607571}}, {EPOLLIN, {u32=27, u64=81604378651}}, {EPOLLIN, {u32=24, u64=73014444056}}, {EPOLLIN, {u32=13, u64=103079215117}}}, 257, 4294967295) = 5 <0.000007>
21103 11:09:33.740057 lstat("/gluster-storage/macm01//videos08/content/galleries/20634/deleted/ur3efhj5fnvs5g8n4ln1bkdms2_14_1232038384", <unfinished ...>
21089 11:09:33.740081 readv(16, [{":O\0\0\0\0l\0\0\0\0*", 12}], 1) = 12 <0.000006>
21103 11:09:33.740112 <... lstat resumed> {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 <0.000035>
21089 11:09:33.740140 readv(16, [{"\0\0\0\0\3\361s\262\0\0\0\2\0\0\0\5\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 108}], 1) = 108 <0.000007>
21103 11:09:33.740184 lgetxattr("/gluster-storage/macm01//videos08/content/galleries/20634/deleted/ur3efhj5fnvs5g8n4ln1bkdms2_14_1232038384", "trusted.posix.gen" <unfinished ...>
21089 11:09:33.740210 writev(16, [{":O\0\0\0\0l\0\0\0\0*", 12}, {"\0\0\0\0\3\361s\262\0\0\0\5\0\0\0\5\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 108}], 2 <unfinished ...>
21103 11:09:33.740254 <... lgetxattr resumed> , "LR\xdc\xdd\x00\xa4\x91\x80", 8) = 8 <0.000047>
21089 11:09:33.740273 <... writev resumed> ) = 120 <0.000042>
21103 11:09:33.740292 lgetxattr("/gluster-storage/macm01//videos08/content/galleries/20634/deleted/ur3efhj5fnvs5g8n4ln1bkdms2_14_1232038384", "glusterfs.content" <unfinished ...>
21089 11:09:33.740318 readv(19, [{":O\0\0\0\0\204\0\0\0\0*", 12}], 1) = 12 <0.000008>
21103 11:09:33.740351 <... lgetxattr resumed> , 0x0, 0) = -1 EOPNOTSUPP (Operation not supported) <0.000036>
21089 11:09:33.740368 readv(19, <unfinished ...>
21103 11:09:33.740384 lgetxattr("/gluster-storage/macm01//videos08/content/galleries/20634/deleted/ur3efhj5fnvs5g8n4ln1bkdms2_14_1232038384", "trusted.afr.distribute1" <unfinished ...>
21090 11:09:35.502257 <... nanosleep resumed> NULL) = 0 <1.762285>
Later, at 11:09:44, when another request shows up for the same file,
the servers look up the file (lstat and lgetxattr) and everything
works smoothly.
But on the first attempt, the servers behave very differently!
And here is what the client says:
37512 11:09:31.027420 open("/SAN/contentmanager/content_videos02/6839/preview/490ab55407a36.jpg", O_RDONLY|O_NONBLOCK) = 78 <5.290506>
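The client-side latency above can also be measured without strace, by timing the open/read/close cycle directly. A small sketch; the temporary file below is only a placeholder so the snippet runs anywhere, and in the real case you would point it at a file under the glusterfs mount and compare the first (cold) access against a repeat:

```python
import os
import tempfile
import time


def timed_open_read(path):
    """Open, read, and close a file; return the elapsed wall time in seconds."""
    start = time.monotonic()
    fd = os.open(path, os.O_RDONLY)  # the client trace used O_RDONLY|O_NONBLOCK
    try:
        os.read(fd, 1 << 20)
    finally:
        os.close(fd)
    return time.monotonic() - start


# Demo on a throwaway local file of the same size as the JPEG in the trace
# (5116 bytes); substitute a path on the glusterfs mount for a real measurement.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"x" * 5116)
    name = f.name

cold = timed_open_read(name)  # first access
warm = timed_open_read(name)  # repeat access
print(f"cold={cold:.6f}s warm={warm:.6f}s")
os.unlink(name)
```

If the glusterfs mount shows the same pattern as the trace (a multi-second cold open, then fast repeats), the delay is in the first lookup path on the servers rather than in the data transfer itself.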
Thx