[Gluster-users] Recovering from remove-brick where shards did not rebalance

Xavi Hernandez jahernan at redhat.com
Wed Sep 8 08:57:51 UTC 2021


Hi Anthony,

On Tue, Sep 7, 2021 at 8:20 PM Anthony Hoppe <anthony at vofr.net> wrote:

> I am currently playing with concatenating main file + shards together.  Is
> it safe to assume that a shard with the same ID and sequence number
> (5da7d7b9-7ff3-48d2-8dcd-4939364bda1f.242 for example) is identical across
> bricks?  That is, I can copy all the shards into a single location
> overwriting and/or discarding duplicates, then concatenate them together in
> order?  Or is it more complex than that?
>

Assuming it's a replicated volume, a given shard should appear on all
bricks of the same replicated subvolume. If there were no pending heals,
they should all have the same contents; you can easily check that by
running md5sum (or similar) on each copy.
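
For example (a quick sketch; the brick paths below are made up, adjust
them to your actual brick directories):

    # Compare checksums of the same shard on each brick of the replica.
    md5sum /bricks/brick1/.shard/5da7d7b9-7ff3-48d2-8dcd-4939364bda1f.242 \
           /bricks/brick2/.shard/5da7d7b9-7ff3-48d2-8dcd-4939364bda1f.242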

On distributed-replicated volumes it's possible to have the same shard on
two different subvolumes. In this case one of the subvolumes contains the
real file, and the other a special 0-byte placeholder file with mode
'---------T'. You need to take the real file and ignore the placeholder.
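
A quick way to spot those placeholders (again a sketch, with a made-up
brick path) is to look for empty files that have only the sticky bit set:

    # 0-byte linkto placeholders show up as ---------T in ls -l.
    find /bricks/brick1/.shard -type f -size 0 -perm 1000 -ls

If in doubt, a placeholder also carries the trusted.glusterfs.dht.linkto
xattr, which you can inspect with getfattr.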

Shards may be smaller than the shard size. In this case you should extend
each such shard to the full shard size before concatenating it with the
rest (for example using "truncate -s"). The last shard may also be smaller;
it doesn't need to be extended.
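
For example, assuming the default 64MB shard size (you can confirm yours
with "gluster volume get <volname> features.shard-block-size"):

    # Pad a partial shard up to the full shard size. The extension is
    # sparse, so it doesn't consume extra disk space.
    truncate -s 64M 5da7d7b9-7ff3-48d2-8dcd-4939364bda1f.242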

Once you have all the shards, you can concatenate them. Note that the first
shard of a file (or shard 0) is not inside the .shard directory. You must
take it from the location where the file is normally seen.
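
Putting it all together, a rough and untested sketch (the file name,
paths and shard count are only examples, and it assumes you have already
gathered the real shards in one place and padded them as described):

    # The GFID is printed as hex; insert dashes (8-4-4-4-12) to get the
    # name used for the shards under .shard (<gfid>.1, <gfid>.2, ...).
    getfattr -n trusted.gfid -e hex /bricks/brick1/vms/disk.img

    # Shard 0 is the base file itself; append the numbered shards in
    # order after it.
    cp /bricks/brick1/vms/disk.img /recovery/disk.img
    for i in $(seq 1 242); do
        cat /bricks/brick1/.shard/5da7d7b9-7ff3-48d2-8dcd-4939364bda1f.$i \
            >> /recovery/disk.img
    done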

Regards,

Xavi


>
> ------------------------------
>
> *From: *"anthony" <anthony at vofr.net>
> *To: *"gluster-users" <gluster-users at gluster.org>
> *Sent: *Tuesday, September 7, 2021 10:18:07 AM
> *Subject: *Re: [Gluster-users] Recovering from remove-brick where shards
> did not rebalance
>
> I've been playing with re-adding the bricks and here is some interesting
> behavior.
>
> When I try to force add the bricks to the volume while it's running, I get
> complaints about one of the bricks already being a member of a volume.  If
> I stop the volume, I can then force-add the bricks.  However, the volume
> won't start without force.  Once the volume is force started, all of the
> bricks remain offline.
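>
> For reference, this is roughly the command sequence I'm attempting
> (volume name and brick paths changed):
>
>     gluster volume stop myvol
>     gluster volume add-brick myvol node1:/bricks/b2 node2:/bricks/b2 \
>         node3:/bricks/b2 force
>     gluster volume start myvol force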
>
> I feel like I'm close...but not quite there...
>
> ------------------------------
>
> *From: *"anthony" <anthony at vofr.net>
> *To: *"Strahil Nikolov" <hunter86_bg at yahoo.com>
> *Cc: *"gluster-users" <gluster-users at gluster.org>
> *Sent: *Tuesday, September 7, 2021 7:45:44 AM
> *Subject: *Re: [Gluster-users] Recovering from remove-brick where shards
> did not rebalance
>
> I was contemplating these options, actually, but not finding anything in
> my research showing someone had tried either before gave me pause.
>
> One thing I wasn't sure about when doing a force add-brick was whether
> gluster would wipe the existing data from the added bricks.  Sounds like
> that may not be the case?
>
> With regards to concatenating the main file + shards, how would I go about
> identifying the shards that pair with the main file?  I see the shards have
> sequence numbers, but I'm not sure how to match the identifier to the main
> file.
>
> Thanks!!
>
> ------------------------------
>
> *From: *"Strahil Nikolov" <hunter86_bg at yahoo.com>
> *To: *"anthony" <anthony at vofr.net>, "gluster-users" <
> gluster-users at gluster.org>
> *Sent: *Tuesday, September 7, 2021 6:02:36 AM
> *Subject: *Re: [Gluster-users] Recovering from remove-brick where shards
> did not rebalance
>
> The data should be recoverable by concatenating the main file with all
> shards. Then you can copy the data back via the FUSE mount point.
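>
> For example (the reassembled file and mount point are placeholders):
>
>     # Copy the reassembled file back in through the FUSE mount so that
>     # Gluster shards and replicates it again properly.
>     cp /recovery/disk.img /mnt/myvol/disk.img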
>
> I think that some users reported that add-brick with the force option
> allows you to 'undo' the situation and 're-add' the data, but I have never
> tried that and I cannot guarantee that it will even work.
>
> The simplest way is to recover from a recent backup, but sometimes this
> leads to some data loss.
>
> Best Regards,
> Strahil Nikolov
>
> On Tue, Sep 7, 2021 at 9:29, Anthony Hoppe
> <anthony at vofr.net> wrote:
> Hello,
>
> I did a bad thing and ran a remove-brick on a set of bricks in a
> distributed-replicate volume where the rebalance did not successfully
> migrate all files.  In sleuthing around the various bricks on the 3 node
> pool, it appears that a number of the files within the volume may have been
> stored as shards.  With that, I'm unsure how to proceed with recovery.
>
> Is it possible to re-add the removed bricks somehow and then do a heal?
> Or is there a way to recover data from shards somehow?
>
> Thanks!
> ________
>
>
>
> Community Meeting Calendar:
>
> Schedule -
> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
> Bridge: https://meet.google.com/cpu-eiue-hvk
> Gluster-users mailing list
> Gluster-users at gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>