[Bugs] [Bug 1702316] Cannot upgrade 5.x volume to 6.1 because of unused 'crypt' and 'bd' xlators

bugzilla at redhat.com bugzilla at redhat.com
Wed May 8 07:45:19 UTC 2019


https://bugzilla.redhat.com/show_bug.cgi?id=1702316

robdewit <rob.dewit at coosto.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|needinfo?(rob.dewit at coosto.com)|



--- Comment #3 from robdewit <rob.dewit at coosto.com> ---
Hi,

I tried upgrading one of the nodes again (a rough shell sketch of the same sequence follows the list):

1) shut down glusterd 5.6
2) install 6.1
3) start glusterd 6.1
4) the brick process fails to start (no working brick)
5) shut down glusterd 6.1
6) downgrade to 5.6
7) start glusterd 5.6
8) the brick works fine again
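
Roughly the same sequence as shell commands (a sketch only; the systemd unit name, the package manager and the package names are assumptions about this RPM-based box, not the exact commands used):

    # 1) stop glusterd 5.6
    systemctl stop glusterd
    # 2) install 6.1
    yum install glusterfs-server-6.1
    # 3) start glusterd 6.1
    systemctl start glusterd
    # 4) the brick does not come up (see brick log below)
    # 5)-7) stop glusterd, downgrade back to 5.6, start it again
    systemctl stop glusterd
    yum downgrade glusterfs-server-5.6
    systemctl start glusterd
    # 8) the brick is working fine again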


Volume status shows only the other two nodes, because the node running 6.1 fails
to start its brick process.
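
For reference, the output below is what the standard status query prints on a node where glusterd is still up (the exact invocation is an assumption; the command itself is the stock gluster CLI):

    gluster volume status jf-vol0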


=== START volume status ===
Status of volume: jf-vol0
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.10.0.25:/local.mnt/glfs/brick      49153     0          Y       20952
Brick 10.10.0.208:/local.mnt/glfs/brick     49153     0          Y       29631
Self-heal Daemon on localhost               N/A       N/A        Y       3487 
Self-heal Daemon on 10.10.0.208             N/A       N/A        Y       27031

Task Status of Volume jf-vol0
------------------------------------------------------------------------------
There are no active volume tasks
=== END volume status ===


=== START glusterd.log ===
[2019-05-08 07:23:26.043605] I [MSGID: 100030] [glusterfsd.c:2849:main]
0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 6.1 (args:
/usr/sbin/glusterd --pid-file=/run/glusterd.pid)
[2019-05-08 07:23:26.044499] I [glusterfsd.c:2556:daemonize] 0-glusterfs: Pid
of current running process is 21399
[2019-05-08 07:23:26.047235] I [MSGID: 106478] [glusterd.c:1422:init]
0-management: Maximum allowed open file descriptors set to 65536
[2019-05-08 07:23:26.047270] I [MSGID: 106479] [glusterd.c:1478:init]
0-management: Using /var/lib/glusterd as working directory
[2019-05-08 07:23:26.047284] I [MSGID: 106479] [glusterd.c:1484:init]
0-management: Using /var/run/gluster as pid file working directory
[2019-05-08 07:23:26.051068] I [socket.c:931:__socket_server_bind]
0-socket.management: process started listening on port (44950)
[2019-05-08 07:23:26.051268] E [rpc-transport.c:297:rpc_transport_load]
0-rpc-transport: /usr/lib64/glusterfs/6.1/rpc-transport/rdma.so: cannot open
shared object file: No such file or directory
[2019-05-08 07:23:26.051282] W [rpc-transport.c:301:rpc_transport_load]
0-rpc-transport: volume 'rdma.management': transport-type 'rdma' is not valid
or not found on this machine
[2019-05-08 07:23:26.051292] W [rpcsvc.c:1985:rpcsvc_create_listener]
0-rpc-service: cannot create listener, initing the transport failed
[2019-05-08 07:23:26.051302] E [MSGID: 106244] [glusterd.c:1785:init]
0-management: creation of 1 listeners failed, continuing with succeeded
transport
[2019-05-08 07:23:26.053127] I [socket.c:902:__socket_server_bind]
0-socket.management: closing (AF_UNIX) reuse check socket 13
[2019-05-08 07:23:28.584285] I [MSGID: 106513]
[glusterd-store.c:2394:glusterd_restore_op_version] 0-glusterd: retrieved
op-version: 50000
[2019-05-08 07:23:28.650177] I [MSGID: 106544]
[glusterd.c:152:glusterd_uuid_init] 0-management: retrieved UUID:
5104ed01-f959-4a82-bbd6-17d4dd177ec2
[2019-05-08 07:23:28.656448] E [mem-pool.c:351:__gf_free]
(-->/usr/lib64/glusterfs/6.1/xlator/mgmt/glusterd.so(+0x49190) [0x7fa26784e190]
-->/usr/lib64/glusterfs/6.1/xlator/mgmt/glusterd.so(+0x48f72) [0x7fa26784df72]
-->/usr/lib64/libglusterfs.so.0(__gf_free+0x21d) [0x7fa26d1f31dd] ) 0-:
Assertion failed: mem_acct->rec[header->type].size >= header->size
[2019-05-08 07:23:28.683589] I [MSGID: 106498]
[glusterd-handler.c:3669:glusterd_friend_add_from_peerinfo] 0-management:
connect returned 0
[2019-05-08 07:23:28.686748] I [MSGID: 106498]
[glusterd-handler.c:3669:glusterd_friend_add_from_peerinfo] 0-management:
connect returned 0
[2019-05-08 07:23:28.686787] W [MSGID: 106061]
[glusterd-handler.c:3472:glusterd_transport_inet_options_build] 0-glusterd:
Failed to get tcp-user-timeout
[2019-05-08 07:23:28.686819] I [rpc-clnt.c:1005:rpc_clnt_connection_init]
0-management: setting frame-timeout to 600
[2019-05-08 07:23:28.687629] I [rpc-clnt.c:1005:rpc_clnt_connection_init]
0-management: setting frame-timeout to 600
Final graph:
+------------------------------------------------------------------------------+
  1: volume management
  2:     type mgmt/glusterd
  3:     option rpc-auth.auth-glusterfs on
  4:     option rpc-auth.auth-unix on
  5:     option rpc-auth.auth-null on
  6:     option rpc-auth-allow-insecure on
  7:     option transport.listen-backlog 1024
  8:     option event-threads 1
  9:     option ping-timeout 0
 10:     option transport.socket.read-fail-log off
 11:     option transport.socket.keepalive-interval 2
 12:     option transport.socket.keepalive-time 10
 13:     option transport-type rdma
 14:     option working-directory /var/lib/glusterd
 15: end-volume
 16:
+------------------------------------------------------------------------------+
[2019-05-08 07:23:28.687625] W [MSGID: 106061]
[glusterd-handler.c:3472:glusterd_transport_inet_options_build] 0-glusterd:
Failed to get tcp-user-timeout
[2019-05-08 07:23:28.689771] I [MSGID: 101190]
[event-epoll.c:680:event_dispatch_epoll_worker] 0-epoll: Started thread with
index 0
[2019-05-08 07:23:29.388437] I [MSGID: 106493]
[glusterd-rpc-ops.c:468:__glusterd_friend_add_cbk] 0-glusterd: Received ACC
from uuid: 88496e0c-298b-47ef-98a1-a884ca68d7d4, host: 10.10.0.208, port: 0
[2019-05-08 07:23:29.393409] I [glusterd-utils.c:6312:glusterd_brick_start]
0-management: starting a fresh brick process for brick /local.mnt/glfs/brick
[2019-05-08 07:23:29.395426] I [rpc-clnt.c:1005:rpc_clnt_connection_init]
0-management: setting frame-timeout to 600
[2019-05-08 07:23:29.460728] I [rpc-clnt.c:1005:rpc_clnt_connection_init]
0-nfs: setting frame-timeout to 600
[2019-05-08 07:23:29.460868] I [MSGID: 106131]
[glusterd-proc-mgmt.c:86:glusterd_proc_stop] 0-management: nfs already stopped
[2019-05-08 07:23:29.460911] I [MSGID: 106568]
[glusterd-svc-mgmt.c:253:glusterd_svc_stop] 0-management: nfs service is
stopped
[2019-05-08 07:23:29.461360] I [rpc-clnt.c:1005:rpc_clnt_connection_init]
0-glustershd: setting frame-timeout to 600
[2019-05-08 07:23:29.462857] I [MSGID: 106131]
[glusterd-proc-mgmt.c:86:glusterd_proc_stop] 0-management: glustershd already
stopped
[2019-05-08 07:23:29.462902] I [MSGID: 106568]
[glusterd-svc-mgmt.c:253:glusterd_svc_stop] 0-management: glustershd service is
stopped
[2019-05-08 07:23:29.462959] I [MSGID: 106567]
[glusterd-svc-mgmt.c:220:glusterd_svc_start] 0-management: Starting glustershd
service
[2019-05-08 07:23:30.465107] I [rpc-clnt.c:1005:rpc_clnt_connection_init]
0-quotad: setting frame-timeout to 600
[2019-05-08 07:23:30.465293] I [MSGID: 106131]
[glusterd-proc-mgmt.c:86:glusterd_proc_stop] 0-management: quotad already
stopped
[2019-05-08 07:23:30.465314] I [MSGID: 106568]
[glusterd-svc-mgmt.c:253:glusterd_svc_stop] 0-management: quotad service is
stopped
[2019-05-08 07:23:30.465351] I [rpc-clnt.c:1005:rpc_clnt_connection_init]
0-bitd: setting frame-timeout to 600
[2019-05-08 07:23:30.465477] I [MSGID: 106131]
[glusterd-proc-mgmt.c:86:glusterd_proc_stop] 0-management: bitd already stopped
[2019-05-08 07:23:30.465489] I [MSGID: 106568]
[glusterd-svc-mgmt.c:253:glusterd_svc_stop] 0-management: bitd service is
stopped
[2019-05-08 07:23:30.465517] I [rpc-clnt.c:1005:rpc_clnt_connection_init]
0-scrub: setting frame-timeout to 600
[2019-05-08 07:23:30.465633] I [MSGID: 106131]
[glusterd-proc-mgmt.c:86:glusterd_proc_stop] 0-management: scrub already
stopped
[2019-05-08 07:23:30.465645] I [MSGID: 106568]
[glusterd-svc-mgmt.c:253:glusterd_svc_stop] 0-management: scrub service is
stopped
[2019-05-08 07:23:30.465689] I [rpc-clnt.c:1005:rpc_clnt_connection_init]
0-snapd: setting frame-timeout to 600
[2019-05-08 07:23:30.465772] I [rpc-clnt.c:1005:rpc_clnt_connection_init]
0-gfproxyd: setting frame-timeout to 600
[2019-05-08 07:23:30.466776] I [MSGID: 106493]
[glusterd-rpc-ops.c:681:__glusterd_friend_update_cbk] 0-management: Received
ACC from uuid: 88496e0c-298b-47ef-98a1-a884ca68d7d4
[2019-05-08 07:23:30.466822] I [MSGID: 106493]
[glusterd-rpc-ops.c:468:__glusterd_friend_add_cbk] 0-glusterd: Received ACC
from uuid: a6ff7d5b-1e8d-4cdc-97cf-4e03b89462a3, host: 10.10.0.25, port: 0
[2019-05-08 07:23:30.490461] I [MSGID: 106493]
[glusterd-rpc-ops.c:681:__glusterd_friend_update_cbk] 0-management: Received
ACC from uuid: a6ff7d5b-1e8d-4cdc-97cf-4e03b89462a3
[2019-05-08 07:23:47.540967] I [MSGID: 106584]
[glusterd-handler.c:5995:__glusterd_handle_get_state] 0-management: Received
request to get state for glusterd
[2019-05-08 07:23:47.541003] I [MSGID: 106061]
[glusterd-handler.c:5517:glusterd_get_state] 0-management: Default output
directory: /var/run/gluster/
[2019-05-08 07:23:47.541052] I [MSGID: 106061]
[glusterd-handler.c:5553:glusterd_get_state] 0-management: Default filename:
glusterd_state_20190508_092347
=== END glusterd.log ===


=== START glustershd.log ===
[2019-05-08 07:23:29.465963] I [MSGID: 100030] [glusterfsd.c:2849:main]
0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 6.1 (args:
/usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p
/var/run/gluster/glustershd/glustershd.pid -l /var/log/glusterfs/glustershd.log
-S /var/run/gluster/dc47fa45e83d2326.socket --xlator-option
*replicate*.node-uuid=5104ed01-f959-4a82-bbd6-17d4dd177ec2 --process-name
glustershd --client-pid=-6)
[2019-05-08 07:23:29.466783] I [glusterfsd.c:2556:daemonize] 0-glusterfs: Pid
of current running process is 29165
[2019-05-08 07:23:29.469726] I [socket.c:902:__socket_server_bind]
0-socket.glusterfsd: closing (AF_UNIX) reuse check socket 10
[2019-05-08 07:23:29.471280] I [MSGID: 101190]
[event-epoll.c:680:event_dispatch_epoll_worker] 0-epoll: Started thread with
index 0
[2019-05-08 07:23:29.471317] I [glusterfsd-mgmt.c:2443:mgmt_rpc_notify]
0-glusterfsd-mgmt: disconnected from remote-host: localhost
[2019-05-08 07:23:29.471326] I [glusterfsd-mgmt.c:2463:mgmt_rpc_notify]
0-glusterfsd-mgmt: Exhausted all volfile servers
[2019-05-08 07:23:29.471518] I [MSGID: 101190]
[event-epoll.c:680:event_dispatch_epoll_worker] 0-epoll: Started thread with
index 1
[2019-05-08 07:23:29.471540] W [glusterfsd.c:1570:cleanup_and_exit]
(-->/usr/lib64/libgfrpc.so.0(+0xe7b3) [0x7f8e5adb37b3] -->/usr/sbin/glusterfs()
[0x411629] -->/usr/sbin/glusterfs(cleanup_and_exit+0x57) [0x409db7] ) 0-:
received signum (1), shutting down
=== END glustershd.log ===


=== START local.mnt-glfs-brick.log ===
[2019-05-08 07:23:29.396753] I [MSGID: 100030] [glusterfsd.c:2849:main]
0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 6.1 (args:
/usr/sbin/glusterfsd -s 10.10.0.177 --volfile-id
jf-vol0.10.10.0.177.local.mnt-glfs-brick -p
/var/run/gluster/vols/jf-vol0/10.10.0.177-local.mnt-glfs-brick.pid -S
/var/run/gluster/ccdac309d72f1df7.socket --brick-name /local.mnt/glfs/brick -l
/var/log/glusterfs/bricks/local.mnt-glfs-brick.log --xlator-option
*-posix.glusterd-uuid=5104ed01-f959-4a82-bbd6-17d4dd177ec2 --process-name brick
--brick-port 49153 --xlator-option jf-vol0-server.listen-port=49153)
[2019-05-08 07:23:29.397519] I [glusterfsd.c:2556:daemonize] 0-glusterfs: Pid
of current running process is 28996
[2019-05-08 07:23:29.400575] I [socket.c:902:__socket_server_bind]
0-socket.glusterfsd: closing (AF_UNIX) reuse check socket 10
[2019-05-08 07:23:29.401901] I [MSGID: 101190]
[event-epoll.c:680:event_dispatch_epoll_worker] 0-epoll: Started thread with
index 1
[2019-05-08 07:23:29.402622] I [MSGID: 101190]
[event-epoll.c:680:event_dispatch_epoll_worker] 0-epoll: Started thread with
index 0
[2019-05-08 07:23:29.402631] I [glusterfsd-mgmt.c:2443:mgmt_rpc_notify]
0-glusterfsd-mgmt: disconnected from remote-host: 10.10.0.177
[2019-05-08 07:23:29.402649] I [glusterfsd-mgmt.c:2463:mgmt_rpc_notify]
0-glusterfsd-mgmt: Exhausted all volfile servers
[2019-05-08 07:23:29.402770] W [glusterfsd.c:1570:cleanup_and_exit]
(-->/usr/lib64/libgfrpc.so.0(+0xe7b3) [0x7fe46b1f77b3]
-->/usr/sbin/glusterfsd() [0x411629]
-->/usr/sbin/glusterfsd(cleanup_and_exit+0x57) [0x409db7] ) 0-: received signum
(1), shutting down
[2019-05-08 07:23:29.403338] I [socket.c:3754:socket_submit_outgoing_msg]
0-glusterfs: not connected (priv->connected = 0)
[2019-05-08 07:23:29.403353] W [rpc-clnt.c:1704:rpc_clnt_submit] 0-glusterfs:
failed to submit rpc-request (unique: 0, XID: 0x2 Program: Gluster Portmap,
ProgVers: 1, Proc: 5) to rpc-transport (glusterfs)
[2019-05-08 07:23:29.403420] W [glusterfsd.c:1570:cleanup_and_exit]
(-->/usr/lib64/libgfrpc.so.0(+0xe7b3) [0x7fe46b1f77b3]
-->/usr/sbin/glusterfsd() [0x411629]
-->/usr/sbin/glusterfsd(cleanup_and_exit+0x57) [0x409db7] ) 0-: received signum
(1), shutting down
=== END local.mnt-glfs-brick.log ===
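
The state dump that follows is the file referenced at the end of glusterd.log above (default output directory /var/run/gluster/, filename glusterd_state_20190508_092347); it was produced with the get-state CLI (assuming the standard command form):

    # writes /var/run/gluster/glusterd_state_<timestamp> on the local node
    gluster get-state glusterd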


=== START glusterd_state_20190508_092347 ===
[Global]
MYUUID: 5104ed01-f959-4a82-bbd6-17d4dd177ec2
op-version: 50000

[Global options]

[Peers]
Peer1.primary_hostname: 10.10.0.208
Peer1.uuid: 88496e0c-298b-47ef-98a1-a884ca68d7d4
Peer1.state: Peer in Cluster
Peer1.connected: Connected
Peer1.othernames:
Peer2.primary_hostname: 10.10.0.25
Peer2.uuid: a6ff7d5b-1e8d-4cdc-97cf-4e03b89462a3
Peer2.state: Peer in Cluster
Peer2.connected: Connected
Peer2.othernames:

[Volumes]
Volume1.name: jf-vol0
Volume1.id: f90d35dd-b2a4-461b-9ae9-dcfc68dac322
Volume1.type: Replicate
Volume1.transport_type: tcp
Volume1.status: Started
Volume1.profile_enabled: 0
Volume1.brickcount: 3
Volume1.Brick1.path: 10.10.0.177:/local.mnt/glfs/brick
Volume1.Brick1.hostname: 10.10.0.177
Volume1.Brick1.port: 49153
Volume1.Brick1.rdma_port: 0
Volume1.Brick1.port_registered: 0
Volume1.Brick1.status: Stopped
Volume1.Brick1.spacefree: 1891708428288Bytes
Volume1.Brick1.spacetotal: 1891966050304Bytes
Volume1.Brick2.path: 10.10.0.25:/local.mnt/glfs/brick
Volume1.Brick2.hostname: 10.10.0.25
Volume1.Brick3.path: 10.10.0.208:/local.mnt/glfs/brick
Volume1.Brick3.hostname: 10.10.0.208
Volume1.snap_count: 0
Volume1.stripe_count: 1
Volume1.replica_count: 3
Volume1.subvol_count: 1
Volume1.arbiter_count: 0
Volume1.disperse_count: 0
Volume1.redundancy_count: 0
Volume1.quorum_status: not_applicable
Volume1.snapd_svc.online_status: Offline
Volume1.snapd_svc.inited: True
Volume1.rebalance.id: 00000000-0000-0000-0000-000000000000
Volume1.rebalance.status: not_started
Volume1.rebalance.failures: 0
Volume1.rebalance.skipped: 0
Volume1.rebalance.lookedup: 0
Volume1.rebalance.files: 0
Volume1.rebalance.data: 0Bytes
Volume1.time_left: 0
Volume1.gsync_count: 0
Volume1.options.cluster.readdir-optimize: on
Volume1.options.cluster.self-heal-daemon: enable
Volume1.options.cluster.lookup-optimize: on
Volume1.options.network.inode-lru-limit: 200000
Volume1.options.performance.md-cache-timeout: 600
Volume1.options.performance.cache-invalidation: on
Volume1.options.performance.stat-prefetch: on
Volume1.options.features.cache-invalidation-timeout: 600
Volume1.options.features.cache-invalidation: on
Volume1.options.diagnostics.brick-sys-log-level: INFO
Volume1.options.diagnostics.brick-log-level: INFO
Volume1.options.diagnostics.client-log-level: INFO
Volume1.options.transport.address-family: inet
Volume1.options.nfs.disable: on
Volume1.options.performance.client-io-threads: off


[Services]
svc1.name: glustershd
svc1.online_status: Offline

svc2.name: nfs
svc2.online_status: Offline

svc3.name: bitd
svc3.online_status: Offline

svc4.name: scrub
svc4.online_status: Offline

svc5.name: quotad
svc5.online_status: Offline


[Misc]
Base port: 49152
Last allocated port: 49153
=== END glusterd_state_20190508_092347 ===
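
For completeness: the op-version recorded above (50000, i.e. the 5.x level) can be read back and, once every peer runs 6.1, raised with the usual CLI. Listed purely as a reference, not as a confirmed workaround for this bug:

    # read the current cluster-wide op-version
    gluster volume get all cluster.op-version
    # raise it only after all peers run 6.1 (60000 corresponds to release 6.0)
    gluster volume set all cluster.op-version 60000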
