<div dir="ltr">Hi Avra,<br><div><div class="gmail_extra"><br><div class="gmail_quote">On 20 February 2017 at 02:51, Avra Sengupta <span dir="ltr"><<a href="mailto:asengupt@redhat.com" target="_blank">asengupt@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<div class="m_-6721563728597169759moz-cite-prefix">Hi D,<br>
<br>
It seems you tried to take a clone of a snapshot when that snapshot was not activated.<br></div></div></blockquote><div><br></div><div>Correct. As my command history shows, I then noticed the issue, checked the snapshot's status and activated it. I included those steps in the history just to clear up any doubts when reading the logs.<br><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div text="#000000" bgcolor="#FFFFFF"><div class="m_-6721563728597169759moz-cite-prefix">
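<div>(For reference, the check and activation were just these two commands, which also appear in the history further down:<br>
gluster snapshot status<br>
gluster snapshot activate teste_GMT-2017.02.15-12.44.04)<br></div>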
However, in this scenario the cloned volume should not be in an inconsistent state. I will try to reproduce this and see if it's a bug. Meanwhile, could you please answer the following queries:<br>
1. How many nodes were in the cluster?<br></div></div></blockquote><div><br>There are 4 nodes in a (2+1)x2 setup.<br></div><div>s0 replicates to s1, with an arbiter on s2, and s2 replicates to s3, with an arbiter on s0.<br></div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div text="#000000" bgcolor="#FFFFFF"><div class="m_-6721563728597169759moz-cite-prefix">
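<div>(For context, a layout like that would typically have been created with something along these lines; the volume name matches ours, but the brick paths here are purely illustrative:<br>
gluster volume create data replica 3 arbiter 1 \<br>
  s0:/gluster/data/brick s1:/gluster/data/brick s2:/gluster/data/arbiter \<br>
  s2:/gluster/data/brick s3:/gluster/data/brick s0:/gluster/data/arbiter)<br></div>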
2. How many bricks does the snapshot data-bck_GMT-2017.02.09-14.15.43 have?<br></div></div></blockquote><div> </div><div>6 bricks, including the 2 arbiters.<br> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div text="#000000" bgcolor="#FFFFFF"><div class="m_-6721563728597169759moz-cite-prefix">
3. Was the snapshot clone command issued from a node which did not have any bricks for the snapshot data-bck_GMT-2017.02.09-14.15.43?<br></div></div></blockquote><div><br></div><div>All commands were issued from s0. All volumes have bricks on every node in the cluster.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div text="#000000" bgcolor="#FFFFFF"><div class="m_-6721563728597169759moz-cite-prefix">
4. I see you tried to delete the new cloned volume. Did the new cloned volume land in this state after the failure to create the clone, or after the failure to delete the clone?<br></div></div></blockquote><div><br></div><div>I noticed there was something wrong as soon as I created the clone. The clone command completed; however, I was then unable to do anything with it because the clone didn't exist on s1-s3.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div text="#000000" bgcolor="#FFFFFF"><div class="m_-6721563728597169759moz-cite-prefix">
<br>
If you want to remove the half-baked volume from the cluster, please proceed with the following steps.<br>
1. Bring down glusterd on all nodes by running the following command on all nodes:<br>
$ systemctl stop glusterd<br>
Verify that glusterd is down on all nodes by running the following command on all nodes:<br>
$ systemctl status glusterd<br>
2. Delete the following repo from all the nodes (on whichever nodes it exists):<br>
/var/lib/glusterd/vols/data-teste<br></div></div></blockquote><div><br></div><div>The repo only exists on s0, but stopping glusterd on only s0 and deleting the directory didn't work; the directory was restored as soon as glusterd was restarted. I haven't yet tried stopping glusterd on *all* nodes before doing this (roughly the sequence sketched below), although I'll need to plan for that, as it'll take the entire cluster off the air.<br><br></div><div>Thanks for the reply,<br></div><div> Doug<br></div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div text="#000000" bgcolor="#FFFFFF"><div class="m_-6721563728597169759moz-cite-prefix">
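<div>(Roughly, the full sequence sketched out; the stop/status/delete steps are exactly as described above, and bringing glusterd back up afterwards is implied rather than spelled out:<br>
# on every node in the cluster:<br>
systemctl stop glusterd<br>
systemctl status glusterd   # confirm it is inactive on all nodes before continuing<br>
# on whichever nodes the stale directory exists (currently only s0):<br>
rm -rf /var/lib/glusterd/vols/data-teste<br>
# once the directory is gone everywhere, restart glusterd on each node:<br>
systemctl start glusterd)<br></div>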
<br>
Regards,<br>
Avra<div><div class="h5"><br>
<br>
On 02/16/2017 08:01 PM, Gambit15 wrote:<br>
</div></div></div>
<blockquote type="cite"><div><div class="h5">
<div dir="ltr">
<div>
<div>
<div>Hey guys,<br>
</div>
I tried to create a new volume from a cloned snapshot yesterday; however, something went wrong during the process, and I'm now stuck with the new volume having been created on the server I ran the commands on (s0), but not on the rest of the peers. I'm unable to delete this new volume from the server, as it doesn't exist on the peers.<br>
<br>
</div>
What do I do?<br>
</div>
Any insights into what may have gone wrong?<br>
<div>
<div>
<div>
<div>
<div>
<div><br>
CentOS 7.3.1611</div>
<div>Gluster 3.8.8<br>
<br>
</div>
<div>The command history and an extract from etc-glusterfs-glusterd.vol.log are included below.<br>
</div>
<div><br>
gluster volume list<br>
gluster snapshot list<br>
gluster snapshot clone data-teste data-bck_GMT-2017.02.09-14.15.43<br>
gluster volume status data-teste<br>
gluster volume delete data-teste<br>
gluster snapshot create teste data<br>
gluster snapshot clone data-teste teste_GMT-2017.02.15-12.44.04<br>
gluster snapshot status<br>
gluster snapshot activate teste_GMT-2017.02.15-12.44.04<br>
gluster snapshot clone data-teste teste_GMT-2017.02.15-12.44.04<br>
<br>
<br>
[2017-02-15 12:43:21.667403] I [MSGID: 106499] [glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management: Received status volume req for volume data-teste<br>
[2017-02-15 12:43:21.682530] E [MSGID: 106301] [glusterd-syncop.c:1297:gd_stage_op_phase] 0-management: Staging of operation 'Volume Status' failed on localhost : Volume data-teste is not started<br>
[2017-02-15 12:43:43.633031] I [MSGID: 106495] [glusterd-handler.c:3128:__glusterd_handle_getwd] 0-glusterd: Received getwd req<br>
[2017-02-15 12:43:43.640597] I [run.c:191:runner_log] (-->/usr/lib64/glusterfs/3.8.8/xlator/mgmt/glusterd.so(+0xcc4b2) [0x7ffb396a14b2] -->/usr/lib64/glusterfs/3.8.8/xlator/mgmt/glusterd.so(+0xcbf65) [0x7ffb396a0f65] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7ffb44ec31c5] ) 0-management: Ran script: /var/lib/glusterd/hooks/1/delete/post/S57glusterfind-delete-post --volname=data-teste<br>
[2017-02-15 13:05:20.103423] E [MSGID: 106122] [glusterd-snapshot.c:2397:glusterd_snapshot_clone_prevalidate] 0-management: Failed to pre validate<br>
[2017-02-15 13:05:20.103464] E [MSGID: 106443] [glusterd-snapshot.c:2413:glusterd_snapshot_clone_prevalidate] 0-management: One or more bricks are not running. Please run snapshot status command to see brick status.<br>
Please start the stopped brick and then issue snapshot clone command<br>
[2017-02-15 13:05:20.103481] W [MSGID: 106443] [glusterd-snapshot.c:8563:glusterd_snapshot_prevalidate] 0-management: Snapshot clone pre-validation failed<br>
[2017-02-15 13:05:20.103492] W [MSGID: 106122] [glusterd-mgmt.c:167:gd_mgmt_v3_pre_validate_fn] 0-management: Snapshot Prevalidate Failed<br>
[2017-02-15 13:05:20.103503] E [MSGID: 106122] [glusterd-mgmt.c:884:glusterd_mgmt_v3_pre_validate] 0-management: Pre Validation failed for operation Snapshot on local node<br>
[2017-02-15 13:05:20.103514] E [MSGID: 106122] [glusterd-mgmt.c:2243:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Pre Validation Failed<br>
[2017-02-15 13:05:20.103531] E [MSGID: 106027] [glusterd-snapshot.c:8118:glusterd_snapshot_clone_postvalidate] 0-management: unable to find clone data-teste volinfo<br>
[2017-02-15 13:05:20.103542] W [MSGID: 106444] [glusterd-snapshot.c:9063:glusterd_snapshot_postvalidate] 0-management: Snapshot create post-validation failed<br>
[2017-02-15 13:05:20.103561] W [MSGID: 106121] [glusterd-mgmt.c:351:gd_mgmt_v3_post_validate_fn] 0-management: postvalidate operation failed<br>
[2017-02-15 13:05:20.103572] E [MSGID: 106121] [glusterd-mgmt.c:1660:glusterd_mgmt_v3_post_validate] 0-management: Post Validation failed for operation Snapshot on local node<br>
[2017-02-15 13:05:20.103582] E [MSGID: 106122] [glusterd-mgmt.c:2363:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Post Validation Failed<br>
[2017-02-15 13:11:15.862858] W [MSGID: 106057] [glusterd-snapshot-utils.c:410:glusterd_snap_volinfo_find] 0-management: Snap volume c3ceae3889484e96ab8bed69593cf6d3.s0.run-gluster-snaps-c3ceae3889484e96ab8bed69593cf6d3-brick1-data-brick not found [Argumento inválido]<br>
[2017-02-15 13:11:16.314759] I [MSGID: 106143] [glusterd-pmap.c:250:pmap_registry_bind] 0-pmap: adding brick /run/gluster/snaps/c3ceae3889484e96ab8bed69593cf6d3/brick1/data/brick on port 49452<br>
[2017-02-15 13:11:16.316090] I [rpc-clnt.c:1046:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600<br>
[2017-02-15 13:11:16.348867] W [MSGID: 106057] [glusterd-snapshot-utils.c:410:glusterd_snap_volinfo_find] 0-management: Snap volume c3ceae3889484e96ab8bed69593cf6d3.s0.run-gluster-snaps-c3ceae3889484e96ab8bed69593cf6d3-brick6-data-arbiter not found [Argumento inválido]<br>
[2017-02-15 13:11:16.558878] I [MSGID: 106143] [glusterd-pmap.c:250:pmap_registry_bind] 0-pmap: adding brick /run/gluster/snaps/c3ceae3889484e96ab8bed69593cf6d3/brick6/data/arbiter on port 49453<br>
[2017-02-15 13:11:16.559883] I [rpc-clnt.c:1046:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600<br>
[2017-02-15 13:11:23.279721] E [MSGID: 106030] [glusterd-snapshot.c:4736:glusterd_take_lvm_snapshot] 0-management: taking snapshot of the brick (/run/gluster/snaps/c3ceae3889484e96ab8bed69593cf6d3/brick1/data/brick) of device /dev/mapper/v0.dc0.cte--g0-c3ceae3889484e96ab8bed69593cf6d3_0 failed<br>
[2017-02-15 13:11:23.279790] E [MSGID: 106030] [glusterd-snapshot.c:5135:glusterd_take_brick_snapshot] 0-management: Failed to take snapshot of brick s0:/run/gluster/snaps/c3ceae3889484e96ab8bed69593cf6d3/brick1/data/brick<br>
[2017-02-15 13:11:23.279806] E [MSGID: 106030] [glusterd-snapshot.c:6484:glusterd_take_brick_snapshot_task] 0-management: Failed to take backend snapshot for brick s0:/run/gluster/snaps/data-teste/brick1/data/brick volume(data-teste)<br>
[2017-02-15 13:11:23.286678] E [MSGID: 106030] [glusterd-snapshot.c:4736:glusterd_take_lvm_snapshot] 0-management: taking snapshot of the brick (/run/gluster/snaps/c3ceae3889484e96ab8bed69593cf6d3/brick6/data/arbiter) of device /dev/mapper/v0.dc0.cte--g0-c3ceae3889484e96ab8bed69593cf6d3_1 failed<br>
[2017-02-15 13:11:23.286735] E [MSGID: 106030] [glusterd-snapshot.c:5135:glusterd_take_brick_snapshot] 0-management: Failed to take snapshot of brick s0:/run/gluster/snaps/c3ceae3889484e96ab8bed69593cf6d3/brick6/data/arbiter<br>
[2017-02-15 13:11:23.286749] E [MSGID: 106030] [glusterd-snapshot.c:6484:glusterd_take_brick_snapshot_task] 0-management: Failed to take backend snapshot for brick s0:/run/gluster/snaps/data-teste/brick6/data/arbiter volume(data-teste)<br>
[2017-02-15 13:11:23.286793] E [MSGID: 106030] [glusterd-snapshot.c:6626:glusterd_schedule_brick_snapshot] 0-management: Failed to create snapshot<br>
[2017-02-15 13:11:23.286813] E [MSGID: 106441] [glusterd-snapshot.c:6796:glusterd_snapshot_clone_commit] 0-management: Failed to take backend snapshot data-teste<br>
[2017-02-15 13:11:25.530666] E [MSGID: 106442] [glusterd-snapshot.c:8308:glusterd_snapshot] 0-management: Failed to clone snapshot<br>
[2017-02-15 13:11:25.530721] W [MSGID: 106123] [glusterd-mgmt.c:272:gd_mgmt_v3_commit_fn] 0-management: Snapshot Commit Failed<br>
[2017-02-15 13:11:25.530735] E [MSGID: 106123] [glusterd-mgmt.c:1427:glusterd_mgmt_v3_commit] 0-management: Commit failed for operation Snapshot on local node<br>
[2017-02-15 13:11:25.530749] E [MSGID: 106123] [glusterd-mgmt.c:2304:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Commit Op Failed<br>
[2017-02-15 13:11:25.532312] E [MSGID: 106027] [glusterd-snapshot.c:8118:glusterd_snapshot_clone_postvalidate] 0-management: unable to find clone data-teste volinfo<br>
[2017-02-15 13:11:25.532339] W [MSGID: 106444] [glusterd-snapshot.c:9063:glusterd_snapshot_postvalidate] 0-management: Snapshot create post-validation failed<br>
[2017-02-15 13:11:25.532353] W [MSGID: 106121] [glusterd-mgmt.c:351:gd_mgmt_v3_post_validate_fn] 0-management: postvalidate operation failed<br>
[2017-02-15 13:11:25.532367] E [MSGID: 106121] [glusterd-mgmt.c:1660:glusterd_mgmt_v3_post_validate] 0-management: Post Validation failed for operation Snapshot on local node<br>
[2017-02-15 13:11:25.532381] E [MSGID: 106122] [glusterd-mgmt.c:2363:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Post Validation Failed<br>
[2017-02-15 13:29:53.779020] E [MSGID: 106062] [glusterd-snapshot-utils.c:2391:glusterd_snap_create_use_rsp_dict] 0-management: failed to get snap UUID<br>
[2017-02-15 13:29:53.779073] E [MSGID: 106099] [glusterd-snapshot-utils.c:2507:glusterd_snap_use_rsp_dict] 0-glusterd: Unable to use rsp dict<br>
[2017-02-15 13:29:53.779096] E [MSGID: 106108] [glusterd-mgmt.c:1305:gd_mgmt_v3_commit_cbk_fn] 0-management: Failed to aggregate response from node/brick<br>
[2017-02-15 13:29:53.779136] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Commit failed on s3. Please check log file for details.<br>
[2017-02-15 13:29:54.136196] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Commit failed on s1. Please check log file for details.<br>
The message "E [MSGID: 106108] [glusterd-mgmt.c:1305:gd_mgmt_v3_commit_cbk_fn] 0-management: Failed to aggregate response from node/brick" repeated 2 times between [2017-02-15 13:29:53.779096] and [2017-02-15 13:29:54.535080]<br>
[2017-02-15 13:29:54.535098] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Commit failed on s2. Please check log file for details.<br>
[2017-02-15 13:29:54.535320] E [MSGID: 106123] [glusterd-mgmt.c:1490:glusterd_mgmt_v3_commit] 0-management: Commit failed on peers<br>
[2017-02-15 13:29:54.535370] E [MSGID: 106123] [glusterd-mgmt.c:2304:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Commit Op Failed<br>
[2017-02-15 13:29:54.539708] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Post Validation failed on s1. Please check log file for details.<br>
[2017-02-15 13:29:54.539797] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Post Validation failed on s3. Please check log file for details.<br>
[2017-02-15 13:29:54.539856] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Post Validation failed on s2. Please check log file for details.<br>
[2017-02-15 13:29:54.540224] E [MSGID: 106121] [glusterd-mgmt.c:1713:glusterd_mgmt_v3_post_validate] 0-management: Post Validation failed on peers<br>
[2017-02-15 13:29:54.540256] E [MSGID: 106122] [glusterd-mgmt.c:2363:glusterd_mgmt_v3_initiate_snap_phases] 0-management: Post Validation Failed<br>
The message "E [MSGID: 106062] [glusterd-snapshot-utils.c:2391:glusterd_snap_create_use_rsp_dict] 0-management: failed to get snap UUID" repeated 2 times between [2017-02-15 13:29:53.779020] and [2017-02-15 13:29:54.535075]<br>
The message "E [MSGID: 106099] [glusterd-snapshot-utils.c:2507:glusterd_snap_use_rsp_dict] 0-glusterd: Unable to use rsp dict" repeated 2 times between [2017-02-15 13:29:53.779073] and [2017-02-15 13:29:54.535078]<br>
[2017-02-15 13:31:14.285666] I [MSGID: 106488] [glusterd-handler.c:1537:__glusterd_handle_cli_get_volume] 0-management: Received get vol req<br>
[2017-02-15 13:32:17.827422] E [MSGID: 106027] [glusterd-handler.c:4670:glusterd_get_volume_opts] 0-management: Volume cluster.locking-scheme does not exist<br>
[2017-02-15 13:34:02.635762] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Pre Validation failed on s1. Volume data-teste does not exist<br>
[2017-02-15 13:34:02.635838] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Pre Validation failed on s2. Volume data-teste does not exist<br>
[2017-02-15 13:34:02.635889] E [MSGID: 106116] [glusterd-mgmt.c:135:gd_mgmt_v3_collate_errors] 0-management: Pre Validation failed on s3. Volume data-teste does not exist<br>
[2017-02-15 13:34:02.636092] E [MSGID: 106122] [glusterd-mgmt.c:947:glusterd_mgmt_v3_pre_validate] 0-management: Pre Validation failed on peers<br>
[2017-02-15 13:34:02.636132] E [MSGID: 106122] [glusterd-mgmt.c:2009:glusterd_mgmt_v3_initiate_all_phases] 0-management: Pre Validation Failed<br>
[2017-02-15 13:34:20.313228] E [MSGID: 106153] [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Staging failed on s2. Error: Volume data-teste does not exist<br>
[2017-02-15 13:34:20.313320] E [MSGID: 106153] [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Staging failed on s1. Error: Volume data-teste does not exist<br>
[2017-02-15 13:34:20.313377] E [MSGID: 106153] [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Staging failed on s3. Error: Volume data-teste does not exist<br>
[2017-02-15 13:34:36.796455] E [MSGID: 106153] [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Staging failed on s1. Error: Volume data-teste does not exist<br>
[2017-02-15 13:34:36.796830] E [MSGID: 106153] [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Staging failed on s3. Error: Volume data-teste does not exist<br>
[2017-02-15 13:34:36.796896] E [MSGID: 106153] [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Staging failed on s2. Error: Volume data-teste does not exist<br>
<br>
</div>
<div>Many thanks!<br>
</div>
<div> D<br>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
<br>
</div></div>
</blockquote>
<br>
</div>
</blockquote></div><br></div></div></div>