[Bugs] [Bug 1173732] New: Glusterd fails when script set_geo_rep_pem_keys.sh is executed on peer

bugzilla at redhat.com bugzilla at redhat.com
Fri Dec 12 19:29:42 UTC 2014


https://bugzilla.redhat.com/show_bug.cgi?id=1173732

            Bug ID: 1173732
           Summary: Glusterd fails when script set_geo_rep_pem_keys.sh is
                    executed on peer
           Product: GlusterFS
           Version: 3.6.1
         Component: geo-replication
          Severity: urgent
          Assignee: bugs at gluster.org
          Reporter: vnosov at stonefly.com
                CC: bugs at gluster.org, gluster-bugs at redhat.com



Description of problem:

Geo-replication is in the active state. The slave system has 2 nodes, and the
slave volume is replicated between them. Running the script
"/usr/local/libexec/glusterfs/set_geo_rep_pem_keys.sh geoaccount"
on one of the slave nodes causes glusterd on the other slave node to hit an
assertion failure and crash.
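
The crash can be reproduced and observed directly on the two slave nodes. A
minimal sketch of the check, reusing the script, log path and addresses quoted
in this report; the tail/pgrep monitoring commands below are an assumption,
just one way to watch for the failure:

# On slave node SC-10-10-200-142: run the pem-key setup script for the
# geo-replication user.
/usr/local/libexec/glusterfs/set_geo_rep_pem_keys.sh geoaccount

# On the peer slave node 192.168.5.141: follow the glusterd log and check
# whether the daemon is still alive once the script has finished on the peer.
tail -f /var/log/glusterfs/usr-local-etc-glusterfs-glusterd.vol.log &
pgrep -x glusterd || echo "glusterd is no longer running"
gluster peer status    # errors out when the local glusterd is down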


Version-Release number of selected component (if applicable): 3.6.1


How reproducible: 100%


Steps to Reproduce:
1. On one of the slave nodes, run:
   [root@SC-10-10-200-142 log]# /usr/local/libexec/glusterfs/set_geo_rep_pem_keys.sh geoaccount
2. Check glusterd on the other slave node (192.168.5.141): it hits an
   assertion failure and crashes.

Actual results:
Contents of the log file
"/var/log/glusterfs/usr-local-etc-glusterfs-glusterd.vol.log" from the failed
node "192.168.5.141":

[2014-12-12 18:14:33.975724] W [socket.c:611:__socket_rwv] 0-management: readv
on /var/run/ef07a1f029b2d014b26651eab86a2517.socket failed (Invalid argument)
The message "I [MSGID: 106006]
[glusterd-handler.c:4257:__glusterd_nodesvc_rpc_notify] 0-management: nfs has
disconnected from glusterd." repeated 39 times between [2014-12-12
18:12:36.960062] and [2014-12-12 18:14:33.975855]
[2014-12-12 18:14:36.976121] W [socket.c:611:__socket_rwv] 0-management: readv
on /var/run/ef07a1f029b2d014b26651eab86a2517.socket failed (Invalid argument)
[2014-12-12 18:14:36.976221] I [MSGID: 106006]
[glusterd-handler.c:4257:__glusterd_nodesvc_rpc_notify] 0-management: nfs has
disconnected from glusterd.
[2014-12-12 18:14:39.976446] W [socket.c:611:__socket_rwv] 0-management: readv
on /var/run/ef07a1f029b2d014b26651eab86a2517.socket failed (Invalid argument)
[2014-12-12 18:14:42.976771] W [socket.c:611:__socket_rwv] 0-management: readv
on /var/run/ef07a1f029b2d014b26651eab86a2517.socket failed (Invalid argument)
[2014-12-12 18:14:45.977159] W [socket.c:611:__socket_rwv] 0-management: readv
on /var/run/ef07a1f029b2d014b26651eab86a2517.socket failed (Invalid argument)
[2014-12-12 18:14:48.977489] W [socket.c:611:__socket_rwv] 0-management: readv
on /var/run/ef07a1f029b2d014b26651eab86a2517.socket failed (Invalid argument)
[2014-12-12 18:14:49.050113] E [mem-pool.c:242:__gf_free] (-->
/usr/local/lib/libglusterfs.so.0(_gf_log_callingfn+0x1ab)[0x7f44075f8ffb] (-->
/usr/local/lib/libglusterfs.so.0(__gf_free+0xb4)[0x7f4407626084] (-->
/usr/local/lib/libglusterfs.so.0(data_destroy+0x55)[0x7f44075f36e5] (-->
/usr/local/lib/libglusterfs.so.0(dict_destroy+0x3e)[0x7f44075f40ae] (-->
/usr/local/lib/glusterfs/3.6.1/xlator/mgmt/glusterd.so(glusterd_destroy_req_ctx+0x17)[0x7f43fd2bba57]
))))) 0-: Assertion failed: GF_MEM_HEADER_MAGIC == *(uint32_t *)ptr
The message "I [MSGID: 106006]
[glusterd-handler.c:4257:__glusterd_nodesvc_rpc_notify] 0-management: nfs has
disconnected from glusterd." repeated 4 times between [2014-12-12
18:14:36.976221] and [2014-12-12 18:14:48.977580]
pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash:
2014-12-12 18:14:49
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.6.1
/usr/local/lib/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x7f44075f8c26]
/usr/local/lib/libglusterfs.so.0(gf_print_trace+0x2ed)[0x7f440761315d]
/lib64/libc.so.6[0x30008329a0]
/usr/local/lib/libglusterfs.so.0(__gf_free+0xcc)[0x7f440762609c]
/usr/local/lib/libglusterfs.so.0(data_destroy+0x55)[0x7f44075f36e5]
/usr/local/lib/libglusterfs.so.0(dict_destroy+0x3e)[0x7f44075f40ae]
/usr/local/lib/glusterfs/3.6.1/xlator/mgmt/glusterd.so(glusterd_destroy_req_ctx+0x17)[0x7f43fd2bba57]
/usr/local/lib/glusterfs/3.6.1/xlator/mgmt/glusterd.so(glusterd_op_sm+0x2f8)[0x7f43fd2c0198]
/usr/local/lib/glusterfs/3.6.1/xlator/mgmt/glusterd.so(__glusterd_handle_commit_op+0x108)[0x7f43fd2a3af8]
/usr/local/lib/glusterfs/3.6.1/xlator/mgmt/glusterd.so(glusterd_big_locked_handler+0x3f)[0x7f43fd2a106f]
/usr/local/lib/libglusterfs.so.0(synctask_wrap+0x12)[0x7f4407635762]
/lib64/libc.so.6[0x3000843bf0]
---------
[2014-12-12 18:20:21.089219] I [MSGID: 100030] [glusterfsd.c:2018:main]
0-/usr/local/sbin/glusterd: Started running /usr/local/sbin/glusterd version
3.6.1 (args: /usr/local/sbin/glusterd --pid-file=/run/glusterd.pid)
[2014-12-12 18:20:21.095768] I [glusterd.c:1214:init] 0-management: Maximum
allowed open file descriptors set to 65536
[2014-12-12 18:20:21.095863] I [glusterd.c:1259:init] 0-management: Using
/var/lib/glusterd as working directory
[2014-12-12 18:20:21.103419] E [rpc-transport.c:266:rpc_transport_load]
0-rpc-transport: /usr/local/lib/glusterfs/3.6.1/rpc-transport/rdma.so: cannot
open shared object file: No such file or directory
[2014-12-12 18:20:21.103476] W [rpc-transport.c:270:rpc_transport_load]
0-rpc-transport: volume 'rdma.management': transport-type 'rdma' is not valid
or not found on this machine
[2014-12-12 18:20:21.103550] W [rpcsvc.c:1524:rpcsvc_transport_create]
0-rpc-service: cannot create listener, initing the transport failed
[2014-12-12 18:20:28.294458] I
[glusterd-store.c:2043:glusterd_restore_op_version] 0-glusterd: retrieved
op-version: 30600
[2014-12-12 18:20:29.175538] I
[glusterd-handler.c:3146:glusterd_friend_add_from_peerinfo] 0-management:
connect returned 0
[2014-12-12 18:20:29.175663] I [rpc-clnt.c:969:rpc_clnt_connection_init]
0-management: setting frame-timeout to 600
[2014-12-12 18:20:29.181732] I [glusterd.c:146:glusterd_uuid_init]
0-management: retrieved UUID: c0efccec-b0a0-a091-e517-00114331bbc4
Final graph:
+------------------------------------------------------------------------------+
  1: volume management
  2:     type mgmt/glusterd
  3:     option rpc-auth.auth-glusterfs on
  4:     option rpc-auth.auth-unix on
  5:     option rpc-auth.auth-null on
  6:     option transport.socket.listen-backlog 128
  7:     option rpc-auth-allow-insecure on
  8:     option geo-replication-log-group geogroup
  9:     option mountbroker-geo-replication.geoaccount nas-volume-0001
 10:     option mountbroker-root /var/mountbroker-root
 11:     option ping-timeout 30
 12:     option transport.socket.read-fail-log off
 13:     option transport.socket.keepalive-interval 2
 14:     option transport.socket.keepalive-time 10
 15:     option transport-type rdma
 16:     option working-directory /var/lib/glusterd
 17: end-volume
 18:
+------------------------------------------------------------------------------+
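
The backtrace above shows the assertion "GF_MEM_HEADER_MAGIC == *(uint32_t *)ptr"
failing in __gf_free() while glusterd_destroy_req_ctx() was tearing down a dict
during commit-op handling, followed by signal 11; the tail of the log shows
glusterd being started again at 18:20:21. A minimal recovery/verification
sketch, reusing the binary path and arguments from that restart line; the gdb
step assumes a core file was written, which this report does not confirm:

# Restart the management daemon with the same arguments seen in the log.
/usr/local/sbin/glusterd --pid-file=/run/glusterd.pid

# Confirm the node rejoined the cluster and the slave volume is healthy.
gluster peer status
gluster volume status nas-volume-0001

# If a core file was produced, capture the backtrace for this bug
# (the core location depends on the system's kernel.core_pattern setting).
gdb -batch -ex bt /usr/local/sbin/glusterd /path/to/core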

On Slave system:

[root@SC-10-10-200-141 log]# gluster volume info

Volume Name: gl-eae5fffa4556b4602-804-1418085976-nas-metadata
Type: Replicate
Volume ID: 41a6696e-780c-4582-b773-b544ce81dc53
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 192.168.5.141:/exports/nas-metadata-on-SC-10.10.200.141/nas-metadata
Brick2: 192.168.5.142:/exports/nas-metadata-on-SC-10.10.200.142/nas-metadata
Options Reconfigured:
performance.read-ahead: off
performance.write-behind: off
performance.stat-prefetch: off
nfs.disable: on
network.frame-timeout: 5
network.ping-timeout: 5
nfs.addr-namelookup: off

Volume Name: nas-volume-0001
Type: Replicate
Volume ID: 8eb4615c-0b1e-4eac-905a-b03c24e934f7
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 192.168.5.142:/exports/nas-segment-0008/nas-volume-0001
Brick2: 192.168.5.141:/exports/nas-segment-0003/nas-volume-0001
Options Reconfigured:
performance.read-ahead: off
performance.write-behind: off
performance.stat-prefetch: off
nfs.disable: on
nfs.addr-namelookup: off


On Master system:
[root@SC-10-10-200-182 log]# gluster volume geo-replication nas-volume-loc
geoaccount@192.168.5.141::nas-volume-0001 status detail

MASTER NODE          : SC-10-10-200-182.example.com
MASTER VOL           : nas-volume-loc
MASTER BRICK         : /exports/nas-segment-0001/nas-volume-loc
SLAVE                : 192.168.5.141::nas-volume-0001
STATUS               : Active
CHECKPOINT STATUS    : N/A
CRAWL STATUS         : Changelog Crawl
FILES SYNCD          : 39
FILES PENDING        : 0
BYTES PENDING        : 0
DELETES PENDING      : 0
FILES SKIPPED        : 0


Expected results:
glusterd on the other slave node keeps running; running set_geo_rep_pem_keys.sh
on one slave node should not crash glusterd on its peer.


Additional info:

-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.

