[Gluster-users] concurrent "gluster volume status" crashes the command (v3.4 and v3.7)

Engelmann Florian florian.engelmann at everyware.ch
Tue Nov 10 11:57:34 UTC 2015


Dear list,

Running "gluster volume status" concurrently on all 3 GlusterFS nodes (which are actually LXC containers) somehow crashes the command. Two nodes reply "Another transaction is in progress. Please try again after sometime." and on the 3rd node the command hangs forever. Stopping the hanging command and running it again also results in "Another transaction is in progress. Please try again after sometime." on that machine.
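To trigger this, the three commands have to start at roughly the same time. Below is a minimal sketch of a reproducer, assuming passwordless SSH to the nodes. The hostnames (gluster1 to gluster3), the helper names and the 30-second timeout are placeholders for illustration, not taken from our setup.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Text the gluster CLI prints when the cluster-wide lock is held.
LOCK_MESSAGE = "Another transaction is in progress"


def classify(cmd, timeout=30):
    """Run cmd once and report 'ok', 'locked', 'hung' or 'error'."""
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # the lock message may go to stderr
            universal_newlines=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # The local process is killed after the timeout; this is the
        # "hangs forever" case.
        return "hung"
    if LOCK_MESSAGE in proc.stdout:
        return "locked"
    return "ok" if proc.returncode == 0 else "error"


def run_concurrently(commands, timeout=30):
    """Start every command in its own thread and collect the outcomes."""
    with ThreadPoolExecutor(max_workers=max(1, len(commands))) as pool:
        futures = {
            name: pool.submit(classify, cmd, timeout)
            for name, cmd in commands.items()
        }
        return {name: future.result() for name, future in futures.items()}


if __name__ == "__main__":
    # Hypothetical hostnames; replace with the real nodes.
    nodes = ["gluster1", "gluster2", "gluster3"]
    commands = {n: ["ssh", n, "gluster", "volume", "status"] for n in nodes}
    print(run_concurrently(commands))
```

With the behaviour described above, I would expect two nodes to be reported as "locked" and one as "hung", but that depends on timing.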

The strace output of the command ends like this:

[...]
connect(7, {sa_family=AF_LOCAL, sun_path="/var/run/gluster/quotad.socket"}, 110) = -1 ENOENT (No such file or directory)
fcntl(7, F_GETFL)                       = 0x802 (flags O_RDWR|O_NONBLOCK)
fcntl(7, F_SETFL, O_RDWR|O_NONBLOCK)    = 0
epoll_ctl(3, EPOLL_CTL_ADD, 7, {EPOLLIN|EPOLLPRI|EPOLLOUT|EPOLLONESHOT, {u32=1, u64=4294967297}}) = 0
pipe([8, 9])                            = 0
fcntl(9, F_SETFD, FD_CLOEXEC)           = 0
pipe([10, 11])                          = 0
fcntl(10, F_GETFL)                      = 0 (flags O_RDONLY)
fstat(10, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f67780e5000
lseek(10, 0, SEEK_CUR)                  = -1 ESPIPE (Illegal seek)
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f67780d9a50) = 28493
close(-1)                               = -1 EBADF (Bad file descriptor)
close(11)                               = 0
close(-1)                               = -1 EBADF (Bad file descriptor)
close(9)                                = 0
read(8, "", 4)                          = 0
close(8)                                = 0
read(10, "gsyncd.py 0.0.1\n", 4096)     = 16
wait4(28493, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 28493
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=28493, si_status=0, si_utime=5, si_stime=1} ---
close(10)                               = 0
munmap(0x7f67780e5000, 4096)            = 0
close(-1)                               = -1 EBADF (Bad file descriptor)
close(-2)                               = -1 EBADF (Bad file descriptor)
close(-1)                               = -1 EBADF (Bad file descriptor)
mmap(NULL, 8392704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f6773545000
mprotect(0x7f6773545000, 4096, PROT_NONE) = 0
clone(child_stack=0x7f6773d44f70, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f6773d459d0, tls=0x7f6773d45700, child_tidptr=0x7f6773d459d0) = 28496
mmap(NULL, 8392704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f6772d44000
mprotect(0x7f6772d44000, 4096, PROT_NONE) = 0
clone(child_stack=0x7f6773543f70, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f67735449d0, tls=0x7f6773544700, child_tidptr=0x7f67735449d0) = 28497
futex(0x7f67735449d0, FUTEX_WAIT, 28497, NULLAnother transaction is in progress. Please try again after sometime.
 <unfinished ...>
+++ exited with 1 +++

I had to stop all volumes and restart glusterd to solve that problem.

Host OS: Ubuntu 14.04 LTS
LXC OS:  Ubuntu 14.04 LTS


We first hit this issue with 3.4.2 (official Ubuntu package) and upgraded to 3.7.5 (Launchpad) to check whether the problem still exists. It does. Any ideas?

Thank you for your help,
Florian

