[Gluster-users] Fwd: Added bricks with wrong name and now need to remove them without destroying volume.

Wed Feb 27 22:08:16 UTC 2019

It sounds like the new bricks were added and their filesystems were
mounted over the top of the existing bricks. Running

    gluster volume status <volume> detail

will give you the data you need to find where the real files are. You
can look in those locations to confirm that the data is still intact.
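
A minimal sketch of those read-only checks, assuming a POSIX shell on
one of the brick servers. The function name inspect_cmds and the
volume and path values are placeholders for illustration, and findmnt
and ls are my additions rather than commands named above. The function
only prints the commands, so nothing runs until you have reviewed them.

```shell
# Print, rather than run, a few read-only checks for one brick.
# inspect_cmds and its arguments are hypothetical names for this sketch.
inspect_cmds() {
  volume="$1"      # gluster volume name
  brick_dir="$2"   # local directory that backs the brick
  # Per-brick details as gluster reports them.
  echo "gluster volume status ${volume} detail"
  # Which filesystem is actually mounted at (or above) the brick directory.
  echo "findmnt --target ${brick_dir}"
  # Whether the directory holds data or looks like an empty filesystem.
  echo "ls -A ${brick_dir}"
}

# Placeholder values; substitute the real volume name and brick path.
inspect_cmds "vol.name" "/bricks/dataX"
```

Run the printed commands by hand on each server and compare what is
mounted at the brick directory with what /etc/fstab says should be
there.
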
Stopping the gluster volume is a good first step. Then, as a safeguard,
you can unmount the filesystem that holds the data you want to keep.
Now remove the gluster volume(s) that are the problem (all of them, if
needed). Remount the real filesystem(s). Create new gluster volumes
with the correct names.
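
The sequence above can be sketched as a dry-run plan, assuming a POSIX
shell. The function name rebuild_plan and every volume name, mount
point, and brick in the example call are hypothetical. The bricks are
passed in the order they should be listed, since the earlier message
quoted below stresses that order matters for non-replicated layouts.
The function prints the commands instead of running them; the umount
and mount lines would apply on each server that holds a data
filesystem.

```shell
# Print the stop / unmount / delete / remount / create sequence.
# rebuild_plan and all arguments are hypothetical names for this sketch.
rebuild_plan() {
  old_volume="$1"   # volume to stop and delete
  new_volume="$2"   # name for the recreated volume
  data_mount="$3"   # mount point of the filesystem holding the data
  shift 3           # remaining arguments: bricks, in the desired order
  echo "gluster volume stop ${old_volume}"
  # Safeguard: take the data filesystem out of reach before deleting.
  echo "umount ${data_mount}"
  echo "gluster volume delete ${old_volume}"
  # Relies on an /etc/fstab entry for the mount point.
  echo "mount ${data_mount}"
  echo "gluster volume create ${new_volume} $*"
}

# Placeholder values only; review the output before running anything.
rebuild_plan "wrong.name" "vol.name" "/bricks/data1" \
  "server1:/bricks/data1/vol.name" "server2:/bricks/data1/vol.name"
```

Printing the plan first makes it easy to compare the brick list
against the output of gluster volume info before anything is changed.
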
On Wed, 2019-02-27 at 16:56 -0500, Tami Greene wrote:
> That makes sense.  System is made of four data arrays with a hardware
> RAID 6 and then the distributed volume on top.  I honestly don't know
> how that works, but the previous administrator said we had
> redundancy.  I'm hoping there is a way to bypass the safeguard of
> migrating data when removing a brick from the volume, which in my
> beginner's mind, would be a straightforward way of remedying the
> problem.  Hopefully once the empty bricks are removed, the "missing"
> data will be visible again in the volume.
> 
> On Wed, Feb 27, 2019 at 3:59 PM Jim Kinney <jim.kinney at gmail.com>
> wrote:
> > Keep in mind that gluster is a metadata process. It doesn't really
> > touch the actual volume files. The exception is the .glusterfs and
> > .trashcan folders in the very top directory of the gluster volume.
> > 
> > When you create a gluster volume from bricks, it doesn't format the
> > filesystem. It uses what's already there.
> > 
> > So if you remove a volume and all its bricks, you've not deleted
> > data.
> > 
> > That said, if you are using anything but replicated bricks, which
> > is what I use exclusively for my needs, then reassembling them into
> > a new volume with the correct name might be tricky. If you list the
> > bricks in exactly the same order when creating the correctly named
> > volume as when the wrongly named volume was created, it should
> > place data on the drives the same way as before and not scramble
> > anything.
> > 
> > On Wed, 2019-02-27 at 14:24 -0500, Tami Greene wrote:
> > > I sent this and realized I hadn't registered.  My apologies for
> > > the duplication
> > > Subject: Added bricks with wrong name and now need to remove them
> > > without destroying volume.
> > > To:  <gluster-users at gluster.org>
> > > 
> > > 
> > > 
> > > Yes, I broke it. Now I need help fixing it.
> > >  
> > > I have an existing Gluster volume, spread over 16 bricks and 4
> > > servers; 1.5P of space with 49% currently used.  Added an
> > > additional 4 bricks and a server, as we expect a large influx of
> > > data in the next 4 to 6 months.  The system had been established
> > > by my predecessor, who is no longer here.
> > >  
> > > First solo addition of bricks to gluster.
> > >  
> > > Everything went smoothly until “gluster volume add-brick Volume
> > > newserver:/bricks/dataX/vol.name”
> > >     (I don’t have the exact response, as I worked on this for
> > >     almost 5 hours last night.) Unable to add-brick as “it is
> > >     already mounted” or something to that effect.
> > >     Double-checked my instructions and the names of the bricks.
> > >     Everything seemed correct.  Tried to add again, adding
> > >     “force.”  Again, “unable to add-brick”
> > >     Because of the keyword (in my mind) “mounted” in the error, I
> > >     checked /etc/fstab, where the name of the mount point is
> > >     simply /bricks/dataX.
> > > This convention was the same across all servers, so I thought I
> > > had discovered an error in my notes and changed the name to
> > > newserver:/bricks/dataX. 
> > > Still had to use force, but the bricks were added.
> > > Restarted the gluster volume vol.name. No errors.
> > > Rebooted, but /vol.name did not mount on reboot as /etc/fstab
> > > instructs. So I attempted to mount manually and discovered I had
> > > a big mess on my hands.
> > >     “Transport endpoint not connected” in addition to other
> > >     messages.
> > >     Discovered an issue between certificates and the
> > >     auth.ssl-allow list because of the hostname of the new server.
> > >     I made a correction and /vol.name mounted.
> > >     However, df -h indicated the 4 new bricks were not being seen,
> > >     as 400T were missing from what should have been available.
> > >  
> > > Thankfully, I could add something to vol.name on one machine and
> > > see it on another machine and I wrongly assumed the volume was
> > > operational, even if the new bricks were not recognized.
> > > So I tried to correct the main issue by running
> > >     gluster volume remove vol.name newserver/bricks/dataX/
> > >     Received a prompt that data will be migrated before the brick
> > >     is removed, continue? (or something to that effect), and I
> > >     started the process, thinking this won’t take long because
> > >     there is no data.
> > >     After 10 minutes and no apparent progress on the process, I
> > >     did panic, thinking worst-case scenario: it is writing zeros
> > >     over my data.
> > >     Executed the stop command and there was still no progress,
> > >     and I assume it was due to there being no data on the brick to
> > >     be removed, causing the program to hang.
> > >     Found the process ID and killed it.
> > > 
> > > 
> > > This morning, while all clients and servers can access /vol.name;
> > > not all of the data is present.  I can find it under cluster, but
> > > users cannot reach it.  I am, again, assuming it is because of the
> > > 4 bricks that have been added, but aren't really a part of the
> > > volume because of their incorrect name.
> > >  
> > > So, how do I proceed from here?
> > > 
> > > 
> > > 1. Remove the 4 empty bricks from the volume without damaging
> > > data.
> > > 2. Correctly clear any metadata about these 4 bricks ONLY so they
> > > may be added correctly.
> > > 
> > > 
> > > If this doesn't restore the volume to full functionality, I'll
> > > write another post if I cannot find an answer in the notes or
> > > online.
> > >  
> > > Tami-- 
> > > 
> > > 
> > > 
> > -- 
> > James P. Kinney III
> > Every time you stop a school, you will have to build a jail. What
> > you gain at one end you lose at the other. It's like feeding a dog
> > on his own tail. It won't fatten the dog.
> > - Speech 11/23/1900 Mark Twain
> > http://heretothereideas.blogspot.com/
> > 
> 
> 
-- 
James P. Kinney III

Every time you stop a school, you will have to build a jail. What you
gain at one end you lose at the other. It's like feeding a dog on his
own tail. It won't fatten the dog.
- Speech 11/23/1900 Mark Twain

http://heretothereideas.blogspot.com/
