[Gluster-users] Replace Brick fails (gluster 3.2.5)

Heiko Schröter schroete at iup.physik.uni-bremen.de
Sat Nov 29 22:23:07 UTC 2014


Sorry, at least for the next 18 months I am stuck with 3.2.5 (lifetime of
the project).
But maybe someone from the dev team could give a hint on where to start
debugging with these error messages.
It seems to be related to the "rpc XDR" encoding mechanism. The only
thing I've found so far is that it might be a "LOCALE" problem. I
checked the locale settings, and they were indeed different.
So I recompiled everything with the correct locale but still get the error.
"strace" gives me no real clue.

Is there a way to debug or trace the gluster RPC XDR encoding mechanism?

Heiko


[2014-11-29 23:10:52.912545] W [rpc-common.c:38:xdr_serialize_generic] 
(-->/usr/lib/glusterfs/3.2.5qa9/xlator/mgmt/glusterd.so(glusterd_op_send_cli_response+0x97) 
[0x7f801881d7e7] 
(-->/usr/lib/glusterfs/3.2.5qa9/xlator/mgmt/glusterd.so(glusterd_submit_reply+0x65) 
[0x7f801880f695] 
(-->/usr/lib/glusterfs/3.2.5qa9/xlator/mgmt/glusterd.so(glusterd_serialize_reply+0x56) 
[0x7f801880f576]))) 0-xdr: XDR encoding failed
[2014-11-29 23:10:52.912556] E 
[glusterd-utils.c:404:glusterd_serialize_reply] 0-: Failed to encode message
[2014-11-29 23:10:52.912561] E 
[glusterd-utils.c:446:glusterd_submit_reply] 0-: Failed to serialize reply
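
Each entry in the log excerpt above names the source file, line number and function that emitted it, which is probably the most direct pointer into the 3.2.5 sources. Below is a minimal Python sketch for pulling those fields out, assuming the entry layout seen in this excerpt. The helper `parse_log_entry` and its field names are made up for illustration and are not part of gluster.

```python
import re

# Layout assumed from the excerpt:
#   [timestamp] LEVEL [file:line:function] message
LOG_RE = re.compile(
    r"\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\]\s+(?P<msg>.*)",
    re.DOTALL,
)

def parse_log_entry(entry):
    """Split one log entry into its fields; returns None if it does not match."""
    # The mail client wrapped long entries, so collapse all whitespace first.
    match = LOG_RE.match(" ".join(entry.split()))
    if match is None:
        return None
    fields = match.groupdict()
    fields["line"] = int(fields["line"])
    return fields

# One of the wrapped entries from the excerpt above.
entry = (
    "[2014-11-29 23:10:52.912556] E \n"
    "[glusterd-utils.c:404:glusterd_serialize_reply] 0-: Failed to encode message"
)
parsed = parse_log_entry(entry)
print(parsed["file"], parsed["line"], parsed["func"])
```

Applied to the three entries, this gives rpc-common.c:38, glusterd-utils.c:404 and glusterd-utils.c:446 as the places that reported the failure.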


strace end:
.....
read(5, "# /etc/services\n#\n# Network serv"..., 4096) = 4096
read(5, " private\t77/tcp\t\t\t\t# any private"..., 4096) = 4096
read(5, "e\nemfis-cntl\t141/udp\nimap\t\t143/t"..., 4096) = 4096
read(5, "dialog\t360/tcp\t\t\t\t# scoi2odialog"..., 4096) = 4096
read(5, "\t\tdqs313_intercell\ncryptoadmin\t6"..., 4096) = 4096
read(5, "# Citrix ICA Client\nica\t\t1494/ud"..., 4096) = 4096
read(5, "05/udp\nlstp\t\t2559/tcp\t\t\t# \nlstp\t"..., 4096) = 4096
read(5, "t-pmp\t\t5351/udp\ndns-llq\t\t5352/tc"..., 4096) = 4096
read(5, "p\t\t\t# OpenPGP HTTP Keyserver\nhkp"..., 4096) = 3373
read(5, "", 4096)                       = 0
close(5)                                = 0
munmap(0x7f8cf229a000, 4096)            = 0
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 5
setsockopt(5, SOL_SOCKET, SO_RCVBUF, [524288], 4) = 0
setsockopt(5, SOL_SOCKET, SO_SNDBUF, [524288], 4) = 0
setsockopt(5, SOL_TCP, TCP_NODELAY, [1], 4) = 0
fcntl(5, F_GETFL)                       = 0x2 (flags O_RDWR)
fcntl(5, F_SETFL, O_RDWR|O_NONBLOCK)    = 0
setsockopt(5, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
setsockopt(5, SOL_TCP, TCP_KEEPIDLE, [20], 4) = 0
setsockopt(5, SOL_TCP, TCP_KEEPINTVL, [2], 4) = 0
bind(5, {sa_family=AF_INET, sin_port=htons(1023), 
sin_addr=inet_addr("0.0.0.0")}, 16) = 0
connect(5, {sa_family=AF_INET, sin_port=htons(24007), 
sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in 
progress)
epoll_ctl(3, EPOLL_CTL_ADD, 5, {EPOLLIN|EPOLLPRI|EPOLLOUT, {u32=5, 
u64=5}}) = 0
mmap(NULL, 8392704, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f8cef018000
mprotect(0x7f8cef018000, 4096, PROT_NONE) = 0
clone(child_stack=0x7f8cef817ff0, 
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, 
parent_tidptr=0x7f8cef8189d0, tls=0x7f8cef818700, 
child_tidptr=0x7f8cef8189d0) = 2840
mmap(NULL, 8392704, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f8cee817000
mprotect(0x7f8cee817000, 4096, PROT_NONE) = 0
clone(child_stack=0x7f8cef016ff0, 
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, 
parent_tidptr=0x7f8cef0179d0, tls=0x7f8cef017700, 
child_tidptr=0x7f8cef0179d0) = 2841
epoll_wait(3, {{EPOLLOUT, {u32=5, u64=5}}}, 257, -1) = 1
getsockopt(5, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
getsockname(5, {sa_family=AF_INET, sin_port=htons(1023), 
sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
futex(0x621ea4, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 0x621e60, 2) = 1
epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN|EPOLLPRI, {u32=5, u64=5}}) = 0
epoll_wait(3, {{EPOLLIN, {u32=5, u64=5}}}, 257, -1) = 1
readv(5, [{"\200\2\0\30", 4}], 1)       = 4
readv(5, [{"\0\0\0\1\0\0\0\1", 8}], 1)  = 8
readv(5, 
[{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\0\0\1\0\0\0\0\0\0\0\0"..., 
131088}], 1) = 131088
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2309, ...}) = 0
write(4, "[2014-11-29 23:21:41.942744] I ["..., 118) = 118
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) 
= 0x7f8cf229a000
write(1, "replace-brick failed to start\n", 30replace-brick failed to start
) = 30
futex(0x621e24, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 0x621de0, 2) = 1
epoll_wait(3,  <unfinished ...>
+++ exited with 1 +++
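
The three readv() calls near the end of the trace look like a standard ONC RPC reply arriving on the CLI socket: a 4-byte record marker, then the first 8 bytes of the message, then the rest. Below is a minimal Python sketch that decodes those bytes. It assumes gluster's socket transport uses the usual RFC 5531 record marking, which the source does not confirm. The helper name `decode_record_marker` is made up for illustration.

```python
import struct

def decode_record_marker(header):
    """Decode a 4-byte ONC RPC record marker (assumed RFC 5531 framing)."""
    if len(header) != 4:
        raise ValueError("record marker must be exactly 4 bytes")
    (word,) = struct.unpack(">I", header)      # big-endian 32-bit word
    last_fragment = bool(word & 0x80000000)    # top bit: last fragment flag
    length = word & 0x7FFFFFFF                 # low 31 bits: fragment length
    return last_fragment, length

# strace prints octal escapes: "\200\2\0\30" is 0x80 0x02 0x00 0x18.
last_fragment, length = decode_record_marker(b"\x80\x02\x00\x18")

# The next 8 bytes: transaction id and message type (0 = CALL, 1 = REPLY).
xid, msg_type = struct.unpack(">II", b"\x00\x00\x00\x01\x00\x00\x00\x01")

print(last_fragment, length, xid, msg_type)
```

Under that assumption, the marker announces a single fragment of 131096 bytes, which matches the 8 + 131088 bytes read afterwards. The message type of 1 means the CLI did receive a complete reply from glusterd before printing "replace-brick failed to start".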






On 28.11.2014 at 12:50, Atin Mukherjee wrote:
> 3.2.5 is too old, can you please upgrade your cluster to recent version
> of glusterfs bits and try it out?
>
> ~Atin
>
> On 11/28/2014 05:17 PM, Heiko Schröter wrote:
>> Unable to set cli op
>
>


