<div dir="ltr"><div dir="ltr">Hi Nithya<div><br></div><div>Thanks for replying so quickly. It is very much appreciated.</div><div><br></div><div>There are lots if " [No space left on device] " errors which I can not understand as there are much space on all of the nodes.</div><div><br></div><div>A little bit of background will be useful in this case. I had cluster of seven nodes of varying capacity(73, 73, 73, 46, 46, 46,46 TB) . The cluster was almost 90% full so every node has almost 8 to 15 TB free space. I added two new nodes with 100TB each and ran fix-layout which completed successfully.</div><div><br></div><div>After that I started remove-brick operation. I don't think that any point , any of the nodes were 100% full. Looking at my ganglia graph, there is minimum 5TB always available at every node.</div><div><br></div><div>I was keeping an eye on remove-brick status and for very long time there was no failures and then at some point these 17000 failures appeared and it stayed like that.</div><div><br></div><div> Thanks</div><div><br></div><div>Kashif</div><div> </div><div><br></div><div><br></div><div> </div><div><br></div><div>Let me explain a little bit of background. </div><div><br></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Feb 4, 2019 at 5:09 AM Nithya Balachandran <<a href="mailto:nbalacha@redhat.com">nbalacha@redhat.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Hi,<div><br></div><div>The status shows quite a few failures. Please check the rebalance logs to see why that happened. We can decide what to do based on the errors.</div><div>Once you run a commit, the brick will no longer be part of the volume and you will not be able to access those files via the client.</div><div>Do you have sufficient space on the remaining bricks for the files on the removed brick?</div><div><br></div><div>Regards,</div><div>Nithya</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, 4 Feb 2019 at 03:50, mohammad kashif <<a href="mailto:kashif.alig@gmail.com" target="_blank">kashif.alig@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr"><div dir="ltr">Hi<div><br></div><div>I have a pure distributed gluster volume with nine nodes and trying to remove one node, I ran </div><div>gluster volume remove-brick atlasglust nodename:/glusteratlas/brick007/gv0 start<br></div><div><br></div><div>It completed but with around 17000 failures</div><div><br></div><div><div> Node Rebalanced-files size scanned failures skipped status run time in h:m:s</div><div> --------- ----------- ----------- ----------- ----------- ----------- ------------ --------------</div><div> nodename 4185858 27.5TB 6746030 17488 0 completed 405:15:34</div></div><div><br></div><div>I can see that there is still 1.5 TB of data on the node which I was trying to remove.</div><div><br></div><div>I am not sure what to do now? Should I run remove-brick command again so the files which has been failed can be tried again?</div><div> </div><div>or should I run commit first and then try to remove node again?</div><div><br></div><div>Please advise as I don't want to remove files.</div><div><br></div><div>Thanks</div><div><br></div><div>Kashif</div><div><br></div><div><br></div><div><br></div></div></div></div>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
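
For reference, a minimal sketch of the checks discussed above. The rebalance log path, the node on which to run the commands, and the two volume options at the end are assumptions based on a default Gluster install; adjust the volume name, brick path, and log location to match your setup.

  # On the node whose brick is being removed: why did the migrations fail?
  # (assumed default rebalance log location; the file name follows the volume name)
  grep -iE "no space left|ENOSPC" /var/log/glusterfs/atlasglust-rebalance.log | tail -20

  # Confirm the brick filesystems really have free space AND free inodes;
  # "No space left on device" is also returned when inodes run out.
  df -h /glusteratlas/brick007
  df -i /glusteratlas/brick007

  # Check whether a minimum-free-disk reserve is set on the volume; DHT avoids
  # placing files on bricks below this threshold even if df shows space.
  gluster volume get atlasglust cluster.min-free-disk

  # If your Gluster version supports the posix reserve, check it too; writes to a
  # brick fail with ENOSPC once its free space drops below this percentage.
  gluster volume get atlasglust storage.reserve

  # Re-check the pending remove-brick before deciding between commit and a retry.
  gluster volume remove-brick atlasglust nodename:/glusteratlas/brick007/gv0 status

If the log shows the failures were ENOSPC on specific target bricks, that narrows the cause down to real capacity, inode exhaustion, or a reserve setting on those bricks.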