[Gluster-devel] quota.t hangs on NetBSD machines
manu at netbsd.org
Thu Dec 31 11:01:37 UTC 2015
On Thu, Dec 31, 2015 at 03:40:54PM +0530, Raghavendra Talur wrote:
We have threads sleeping, either voluntarily (nanosleep) or not (lwp_park).
c5223a80 (glusterfs) is in
Waiting while reading on a socket. Probably FUSE, but it would be nice
to be certain.
c5346540 (glusterfs) is in
This is ordinary sigtimedwait(), but the timeout argument (third) is zero,
which can let it sleep forever. Is it expected?
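For comparison, here is a minimal sketch of a bounded wait, where a non-NULL timeout is passed so the call cannot sleep forever. The helper name wait_signal_bounded and the millisecond parameter are assumptions made for illustration; this is not taken from the glusterfs sources, and a multithreaded program would use pthread_sigmask() instead of sigprocmask().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <time.h>

/*
 * Hypothetical helper: wait for signo for at most timeout_ms milliseconds.
 * Returns the signal number if it arrives in time, or -1 with
 * errno == EAGAIN if the timeout expires first.
 */
int
wait_signal_bounded(int signo, long timeout_ms)
{
	sigset_t set;
	struct timespec ts;

	assert(timeout_ms >= 0);

	sigemptyset(&set);
	sigaddset(&set, signo);

	/* The signal must be blocked so that it stays pending for us. */
	if (sigprocmask(SIG_BLOCK, &set, NULL) == -1)
		return -1;

	ts.tv_sec = timeout_ms / 1000;
	ts.tv_nsec = (timeout_ms % 1000) * 1000000L;

	/* A non-NULL third argument bounds the sleep. */
	return sigtimedwait(&set, NULL, &ts);
}
```

If the unbounded wait is intentional (a dedicated signal-handling thread, for instance), then this thread is probably not the one to look at.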
c5418020 (glusterfs) is in
This is ordinary poll(2). The struct timespec for the timeout is at
db721f18 and again this is an infinite timeout:
crash> x db721f18,2
db721f18: 0 0
(NB: 2 words because we run on a 32-bit machine; struct timespec is a
32-bit time_t and a 32-bit long)
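As a rough illustration of how the two dumped words map onto the fields, here is a small sketch assuming the layout described in the note. The type and function names (timespec32, timespec32_from_words, timespec32_is_zero) are made up for this example and are not kernel definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout from the note above: 32-bit time_t, then 32-bit long. */
struct timespec32 {
	int32_t tv_sec;
	int32_t tv_nsec;
};

/* The two-word dump only makes sense if the struct is exactly 8 bytes. */
_Static_assert(sizeof(struct timespec32) == 8, "unexpected padding");

/* Copy two consecutive 32-bit words from a memory dump into the struct. */
struct timespec32
timespec32_from_words(const uint32_t words[2])
{
	struct timespec32 ts;

	assert(words != NULL);
	memcpy(&ts, words, sizeof(ts));
	return ts;
}

/* True when both fields are zero, as in the dump at db721f18. */
int
timespec32_is_zero(const struct timespec32 *ts)
{
	assert(ts != NULL);
	return ts->tv_sec == 0 && ts->tv_nsec == 0;
}
```

With the words shown by the debugger (0 and 0), timespec32_is_zero() returns 1.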
c53692c0 (perfused) is in
Waiting for data (either from the kernel or from glusterfs, I do not know).
Again we have an infinite timeout.
I note that the FUSE filesystem is responding. Since perfused is
not multithreaded, this suggests it is not the stuck process. It may
have missed a request or reply, though, which would leave the calling
process stuck.
Speaking of the calling process, I believe it is the quota utility?
It is indeed waiting for a reply from the filesystem:
UID PID PPID CPU PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
0 15221 1406 1546 85 0 3360 1080 puffsrpl I pts/0- 0:00.06 tests/basic/quota /mnt/glusterfs/0/test_dir/1.txt 256 48
Here is its backtrace obtained from gdb:
#0 0xbb69b6f7 in write () from /usr/lib/libc.so.12
#1 0x080489c0 in nwrite (fd=3, buf=0xbb501000, count=262144)
#2 0x08048a8b in file_write (
filename=0xbf7ffcb2 "/mnt/glusterfs/0/test_dir/1.txt", bs=262144, count=48)
#3 0x08048b64 in main (argc=4, argv=0xbf7feba0) at tests/basic/quota.c:83
It is waiting for a write to complete, but we still do not know which process
got the request but not the reply. Do you see any way to tell?
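For readers without the test source at hand, frame #1 (nwrite) is presumably a write-until-done loop around write(2). The sketch below is an assumption about its general shape, not the actual code from tests/basic/quota.c; the name write_all is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write exactly count bytes to fd, retrying on
 * short writes and EINTR. Returns count on success, -1 on error.
 */
ssize_t
write_all(int fd, const void *buf, size_t count)
{
	const char *p = buf;
	size_t left = count;

	assert(buf != NULL || count == 0);

	while (left > 0) {
		/* This is the call that blocks if the filesystem never replies. */
		ssize_t n = write(fd, p, left);

		if (n == -1) {
			if (errno == EINTR)
				continue;	/* interrupted before any data was written */
			return -1;
		}
		p += n;
		left -= (size_t)n;
	}
	return (ssize_t)count;
}
```

Whatever the real loop looks like, it cannot make progress while write(2) itself is asleep in the kernel, which matches the puffsrpl wait channel in the ps output.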
manu at netbsd.org