[Bugs] [Bug 1223207] New: glusterd crashed on the node when tried to detach a tier after restoring data from the snapshot.

bugzilla at redhat.com
Wed May 20 06:14:42 UTC 2015


https://bugzilla.redhat.com/show_bug.cgi?id=1223207

            Bug ID: 1223207
           Summary: glusterd crashed on the node when tried to detach a
                    tier after restoring data from the snapshot.
           Product: Red Hat Gluster Storage
           Version: 3.0
         Component: gluster-snapshot
          Keywords: Triaged
          Severity: urgent
          Priority: urgent
          Assignee: rjoseph at redhat.com
          Reporter: asengupt at redhat.com
        QA Contact: storage-qa-internal at redhat.com
                CC: annair at redhat.com, asengupt at redhat.com,
                    bugs at gluster.org, dlambrig at redhat.com,
                    gluster-bugs at redhat.com, rkavunga at redhat.com,
                    trao at redhat.com
        Depends On: 1215002
            Blocks: 1186580 (qe_tracker_everglades)



+++ This bug was initially created as a clone of Bug #1215002 +++

Description of problem:
I saw glusterd crash on the node when I tried to detach a tier after restoring
data from the snapshot.

Version-Release number of selected component (if applicable):

[root@rhsqa14-vm1 ~]# rpm -qa | grep gluster
glusterfs-3.7dev-0.994.gitf522001.el6.x86_64
glusterfs-devel-3.7dev-0.994.gitf522001.el6.x86_64
glusterfs-geo-replication-3.7dev-0.994.gitf522001.el6.x86_64
glusterfs-resource-agents-3.7dev-0.952.gita7f1d08.el6.noarch
glusterfs-debuginfo-3.7dev-0.952.gita7f1d08.el6.x86_64
glusterfs-libs-3.7dev-0.994.gitf522001.el6.x86_64
glusterfs-api-3.7dev-0.994.gitf522001.el6.x86_64
glusterfs-fuse-3.7dev-0.994.gitf522001.el6.x86_64
glusterfs-extra-xlators-3.7dev-0.994.gitf522001.el6.x86_64
glusterfs-regression-tests-3.7dev-0.994.gitf522001.el6.x86_64
glusterfs-rdma-3.7dev-0.994.gitf522001.el6.x86_64
glusterfs-cli-3.7dev-0.994.gitf522001.el6.x86_64
glusterfs-server-3.7dev-0.994.gitf522001.el6.x86_64
glusterfs-api-devel-3.7dev-0.994.gitf522001.el6.x86_64
[root@rhsqa14-vm1 ~]#

[root@rhsqa14-vm1 ~]# glusterfs --version
glusterfs 3.7dev built on Apr 13 2015 07:14:26
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2013 Red Hat, Inc. <http://www.redhat.com/>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
It is licensed to you under your choice of the GNU Lesser
General Public License, version 3 or any later version (LGPLv3
or later), or the GNU General Public License, version 2 (GPLv2),
in all cases as published by the Free Software Foundation.
[root@rhsqa14-vm1 ~]#


Steps to Reproduce:
1. Create a normal distributed-replicated (distrep) volume.
2. Attach a tier to it.
3. FUSE-mount it and add some data.
4. Create a snapshot and activate it.
5. Access the snapshot on the mount point and check that everything is fine.
6. Delete a few files/dirs from the mount point.
7. Stop the volume and restore the snapshot.
8. Start the volume and check that the data is available.
9. Detach the tier (a command-level sketch of the whole flow follows this list).
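
A minimal command-level sketch of this flow, for reference. The host, volume,
and snapshot names below (host1/host2, vol0, snap0) and the brick paths are
illustrative assumptions, not taken from this report, and the exact
attach-tier/detach-tier syntax varied across 3.7 dev builds:

# All names below are assumptions; adjust to the actual cluster.
gluster volume create vol0 replica 2 \
    host1:/rhs/brick3/vol0 host2:/rhs/brick3/vol0 \
    host1:/rhs/brick4/vol0 host2:/rhs/brick4/vol0
gluster volume start vol0
gluster volume attach-tier vol0 replica 2 \
    host1:/rhs/hot/vol0 host2:/rhs/hot/vol0
mount -t glusterfs host1:/vol0 /mnt/vol0    # FUSE mount; write some data
gluster snapshot create snap0 vol0
gluster snapshot activate snap0
rm -rf /mnt/vol0/dir1                       # delete a few files/dirs
gluster volume stop vol0
gluster snapshot restore snap0
gluster volume start vol0
gluster volume detach-tier vol0 start       # glusterd crashed at this stage
gluster volume detach-tier vol0 commit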

Actual results:

glusterd crashed. In addition:
- The restored data is available on the mount point.
- Detaching the tier removed the snapshot that had been created while the tier
  was attached.
- On the gluster nodes, 'gluster peer status' shows the peers in Peer Rejected
  state.
- No operations can be performed on the volumes.

[root@rhsqa14-vm1 ~]# gluster v create Mint replica 2
10.70.46.233:/rhs/brick3/M1 10.70.46.236:/rhs/brick3/M1
10.70.46.233:/rhs/brick4/M1 10.70.46.236:/rhs/brick4/M1 force
volume create: Mint: failed: Host 10.70.46.236 is not in 'Peer in Cluster'
state
[root@rhsqa14-vm1 ~]#


[root@rhsqa14-vm1 ~]# gluster peer status
Number of Peers: 3

Hostname: 10.70.46.240
Uuid: 0f69be9f-0055-41ba-89e8-34ef4c33b521
State: Peer Rejected (Connected)

Hostname: 10.70.46.243
Uuid: cd48cc5a-2f4c-4d53-847e-c67c2f7aefd9
State: Peer Rejected (Connected)

Hostname: 10.70.46.236
Uuid: 76dd61a5-e5e0-4a93-8f1d-8d5de71fca14
State: Peer Rejected (Connected)
[root@rhsqa14-vm1 ~]#

Expected results:

glusterd should not crash. After the snapshot restore, detaching the tier
should succeed, and the peers should remain in 'Peer in Cluster' state.

Additional info:

I have uploaded the sosreport here:

http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/007/


log messages:

[root@rhsqa14-vm1 ~]# less /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
[2015-04-20 05:03:08.285701] I
[glusterd-handler.c:1317:__glusterd_handle_cli_get_volume] 0-glusterd: Received
get vol req
[2015-04-20 05:03:08.289612] I
[glusterd-handler.c:1317:__glusterd_handle_cli_get_volume] 0-glusterd: Received
get vol req
[2015-04-20 05:03:08.292597] I
[glusterd-handler.c:1317:__glusterd_handle_cli_get_volume] 0-glusterd: Received
get vol req
[2015-04-20 05:03:08.295527] I
[glusterd-handler.c:1317:__glusterd_handle_cli_get_volume] 0-glusterd: Received
get vol req
[2015-04-20 05:49:21.790376] E
[glusterd-snapshot.c:5331:glusterd_snapshot_status_prevalidate] 0-management:
Snapshot (mix_snap) does not exist
[2015-04-20 05:49:21.790678] W
[glusterd-snapshot.c:7769:glusterd_snapshot_prevalidate] 0-management: Snapshot
status validation failed
[2015-04-20 05:49:21.790720] W [glusterd-mgmt.c:155:gd_mgmt_v3_pre_validate_fn]
0-management: Snapshot Prevalidate Failed
[2015-04-20 05:49:21.790741] E
[glusterd-mgmt.c:691:glusterd_mgmt_v3_pre_validate] 0-management: Pre
Validation failed for operation Snapshot on local node
[2015-04-20 05:49:21.790761] E
[glusterd-mgmt.c:1945:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Pre
Validation Failed
[2015-04-20 05:49:51.841947] E
[glusterd-snapshot.c:3386:glusterd_handle_snapshot_info] 0-management: Snapshot
(mix_snap) does not exist
[2015-04-20 05:49:51.841978] W
[glusterd-snapshot.c:8336:glusterd_handle_snapshot_fn] 0-management: Snapshot
info failed
[2015-04-20 05:52:56.699396] E
[glusterd-snapshot.c:3386:glusterd_handle_snapshot_info] 0-management: Snapshot
(mix) does not exist
[2015-04-20 05:52:56.699447] W
[glusterd-snapshot.c:8336:glusterd_handle_snapshot_fn] 0-management: Snapshot
info failed
[2015-04-20 05:55:06.886334] E
[glusterd-snapshot.c:3386:glusterd_handle_snapshot_info] 0-management: Snapshot
(mix_snap) does not exist
[2015-04-20 05:55:06.886385] W
[glusterd-snapshot.c:8336:glusterd_handle_snapshot_fn] 0-management: Snapshot
info failed
[2015-04-20 06:12:10.157110] W
[glusterd-snapshot-utils.c:312:glusterd_snap_volinfo_find] 0-management: Snap
volume
b8482da332324836b016225ae8c2e669.10.70.46.233.var-run-gluster-snaps-b8482da332324836b016225ae8c2e669-brick2-mix
not found
[2015-04-20 06:12:10.190691] I [glusterd-pmap.c:227:pmap_registry_bind] 0-pmap:
adding brick /var/run/gluster/snaps/b8482da332324836b016225ae8c2e669/brick2/mix
on port 49163
[2015-04-20 06:12:10.192862] I [rpc-clnt.c:972:rpc_clnt_connection_init]
0-management: setting frame-timeout to 600
[2015-04-20 06:12:10.217393] W
[glusterd-snapshot-utils.c:312:glusterd_snap_volinfo_find] 0-management: Snap
volume
b8482da332324836b016225ae8c2e669.10.70.46.233.var-run-gluster-snaps-b8482da332324836b016225ae8c2e669-brick3-mix
not found
[2015-04-20 06:12:10.251479] I [glusterd-pmap.c:227:pmap_registry_bind] 0-pmap:
adding brick /var/run/gluster/snaps/b8482da332324836b016225ae8c2e669/brick3/mix
on port 49164
[2015-04-20 06:12:10.253502] I [rpc-clnt.c:972:rpc_clnt_connection_init]
0-management: setting frame-timeout to 600
[2015-04-20 06:12:10.273797] W
[glusterd-snapshot-utils.c:312:glusterd_snap_volinfo_find] 0-management: Snap
volume
b8482da332324836b016225ae8c2e669.10.70.46.233.var-run-gluster-snaps-b8482da332324836b016225ae8c2e669-brick5-mix
not found
[2015-04-20 06:12:10.298518] I [glusterd-pmap.c:227:pmap_registry_bind] 0-pmap:
adding brick /var/run/gluster/snaps/b8482da332324836b016225ae8c2e669/brick5/mix
on port 49165
[2015-04-20 06:12:10.300905] I [rpc-clnt.c:972:rpc_clnt_connection_init]
0-management: setting frame-timeout to 600
[2015-04-20 06:21:54.350244] I
[glusterd-handler.c:1317:__glusterd_handle_cli_get_volume] 0-glusterd: Received
get vol req
[2015-04-20 06:28:10.400329] W [socket.c:3059:socket_connect] 0-snapd: Ignore
failed connection attempt on , (No such file or directory) 
[2015-04-20 06:28:11.887406] I
[glusterd-utils.c:3981:glusterd_nfs_pmap_deregister] 0-: De-registered MOUNTV3
successfully
[2015-04-20 06:28:11.888200] I
[glusterd-utils.c:3986:glusterd_nfs_pmap_deregister] 0-: De-registered MOUNTV1
successfully
...skipping...
)[0x7f4aa1adcfc0] (-->
/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x1a3)[0x356f410143] )))))
0-management: Lock for vol glusterfs_shared_storage not held
[2015-04-24 05:07:26.977951] W [glusterd-locks.c:647:glusterd_mgmt_v3_unlock]
(--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1e0)[0x356f022140] (-->
/usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x481)[0x7f4aa1b78fa1]
(-->
/usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x2a0)[0x7f4aa1af48c0]
(-->
/usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x60)[0x7f4aa1adcfc0]
(--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x1a3)[0x356f410143] )))))
0-management: Lock for vol mix not held
[2015-04-24 05:07:26.978375] W [glusterd-locks.c:647:glusterd_mgmt_v3_unlock]
(--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1e0)[0x356f022140] (-->
/usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x481)[0x7f4aa1b78fa1]
(-->
/usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x2a0)[0x7f4aa1af48c0]
(-->
/usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x60)[0x7f4aa1adcfc0]
(--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x1a3)[0x356f410143] )))))
0-management: Lock for vol test not held
[2015-04-24 05:07:26.978788] W [glusterd-locks.c:647:glusterd_mgmt_v3_unlock]
(--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1e0)[0x356f022140] (-->
/usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x481)[0x7f4aa1b78fa1]
(-->
/usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x2a0)[0x7f4aa1af48c0]
(-->
/usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x60)[0x7f4aa1adcfc0]
(--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x1a3)[0x356f410143] )))))
0-management: Lock for vol testing not held
[2015-04-24 05:07:26.979191] W [glusterd-locks.c:647:glusterd_mgmt_v3_unlock]
(--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1e0)[0x356f022140] (-->
/usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(glusterd_mgmt_v3_unlock+0x481)[0x7f4aa1b78fa1]
(-->
/usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(__glusterd_peer_rpc_notify+0x2a0)[0x7f4aa1af48c0]
(-->
/usr/lib64/glusterfs/3.7dev/xlator/mgmt/glusterd.so(glusterd_big_locked_notify+0x60)[0x7f4aa1adcfc0]
(--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x1a3)[0x356f410143] )))))
0-management: Lock for vol tri not held
[2015-04-24 05:07:34.023839] I
[glusterd-rpc-ops.c:463:__glusterd_friend_add_cbk] 0-glusterd: Received RJT
from uuid: cd48cc5a-2f4c-4d53-847e-c67c2f7aefd9, host: 10.70.46.243, port: 0
[2015-04-24 05:07:34.409049] I
[glusterd-handshake.c:1151:__glusterd_mgmt_hndsk_versions_ack] 0-management:
using the op-version 30700
[2015-04-24 05:07:36.182749] I
[glusterd-handshake.c:1151:__glusterd_mgmt_hndsk_versions_ack] 0-management:
using the op-version 30700
[2015-04-24 05:07:53.414628] I
[glusterd-handler.c:2337:__glusterd_handle_incoming_friend_req] 0-glusterd:
Received probe from uuid: cd48cc5a-2f4c-4d53-847e-c67c2f7aefd9
[2015-04-24 05:07:53.445409] E [MSGID: 106010]
[glusterd-utils.c:2608:glusterd_compare_friend_volume] 0-management: Version of
Cksums everglades differ. local cksum = 2068641408, remote cksum = 1397023877
on peer 10.70.46.243
[2015-04-24 05:07:53.445844] I
[glusterd-handler.c:3491:glusterd_xfer_friend_add_resp] 0-glusterd: Responded
to 10.70.46.243 (0), ret: 0
[2015-04-24 05:08:14.940876] I
[glusterd-rpc-ops.c:463:__glusterd_friend_add_cbk] 0-glusterd: Received RJT
from uuid: 76dd61a5-e5e0-4a93-8f1d-8d5de71fca14, host: 10.70.46.236, port: 0
[2015-04-24 05:08:15.261211] I
[glusterd-handler.c:2337:__glusterd_handle_incoming_friend_req] 0-glusterd:
Received probe from uuid: 76dd61a5-e5e0-4a93-8f1d-8d5de71fca14
[2015-04-24 05:08:15.291703] E [MSGID: 106010]
[glusterd-utils.c:2608:glusterd_compare_friend_volume] 0-management: Version of
Cksums everglades differ. local cksum = 2068641408, remote cksum = 1397023877
on peer 10.70.46.236
[2015-04-24 05:08:15.292301] I
[glusterd-handler.c:3491:glusterd_xfer_friend_add_resp] 0-glusterd: Responded
to 10.70.46.236 (0), ret: 0
[2015-04-24 05:11:12.917953] I
[glusterd-handler.c:1262:__glusterd_handle_cli_list_friends] 0-glusterd:
Received cli list req
[root@rhsqa14-vm1 ~]#

--- Additional comment from Anand Avati on 2015-05-12 09:26:16 EDT ---

REVIEW: http://review.gluster.org/10761 (glusterd: function to create duplicate
should copy subvol_count) posted (#1) for review on master by mohammed rafi  kc
(rkavunga at redhat.com)

--- Additional comment from Anand Avati on 2015-05-12 10:52:00 EDT ---

REVIEW: http://review.gluster.org/10761 (glusterd: function to create duplicate
of volinfo should copy subvol_count) posted (#2) for review on master by
mohammed rafi  kc (rkavunga at redhat.com)
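
The patch summary above suggests the root cause: glusterd's routine for
duplicating a volinfo did not carry over subvol_count, leaving the duplicate
created during snapshot restore inconsistent and breaking the subsequent
detach-tier. A minimal sketch of that kind of fix, using hypothetical struct
and function names (the real glusterd source differs):

/* Sketch only: names/layout are assumptions based on the commit message
 * "function to create duplicate of volinfo should copy subvol_count";
 * this is not the actual glusterd code. */
struct volinfo {
        int subvol_count;       /* number of subvolumes in the volume */
        /* ... many other fields ... */
};

static int
volinfo_dup (const struct volinfo *src, struct volinfo *dst)
{
        if (!src || !dst)
                return -1;
        /* ... copy the other fields, as the function already did ... */
        dst->subvol_count = src->subvol_count;  /* the copy the fix adds */
        return 0;
}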


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1186580
[Bug 1186580] QE tracker bug for Everglades
https://bugzilla.redhat.com/show_bug.cgi?id=1215002
[Bug 1215002] glusterd crashed on the node when tried to detach a tier
after restoring data from the snapshot.