[Gluster-users] Disperse volume recovery and healing

Xavi Hernandez jahernan at redhat.com
Tue Mar 20 08:11:13 UTC 2018


On Tue, Mar 20, 2018 at 5:26 AM, Victor T <hero_of_nothing_1 at hotmail.com>
wrote:

> That makes sense. In the case of "file damage," it would show up as files
> that could not be healed in logfiles or gluster volume heal [volume] info?
>

If the damage affects more bricks than the volume redundancy, then probably
yes. These files or directories will appear permanently in "gluster volume
heal <volname> info". In some cases, especially for directories, they can be
healed manually. But this always needs to be done with extra care and it
depends on each case, so I don't recommend doing it without help from
someone who knows what is happening.
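
For example, assuming the volume is named "myvol" (adjust to your own
volume name), a quick way to check is:

    # list entries still pending heal on each brick; entries that keep
    # showing up here across several runs are the ones that may need
    # manual attention
    gluster volume heal myvol info

    # confirm that all bricks and the self-heal daemons are online
    gluster volume status myvol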

> Say we have acceptable backups, are there procedures to somehow overwrite
> the bad gfid with a copy from a good backup?
>

If a file is damaged but its parent directory is healthy, it should be
possible to delete the file and then restore it from backup. All this
should be done from the mount point, never directly on the bricks.
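
As a rough sketch of that procedure (assuming the volume is mounted at
/mnt/myvol and the damaged file is dir/file.dat, both just example names):

    # work only on the Gluster mount point, never on the bricks
    rm /mnt/myvol/dir/file.dat

    # restore the good copy from backup through the same mount point
    cp /backup/dir/file.dat /mnt/myvol/dir/

    # verify that nothing is left pending heal
    gluster volume heal myvol info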

Xavi


> ------------------------------
> *From:* Xavi Hernandez <jahernan at redhat.com>
> *Sent:* Monday, March 19, 2018 12:28:46 AM
>
> *To:* Victor T
> *Cc:* gluster-users at gluster.org
> *Subject:* Re: [Gluster-users] Disperse volume recovery and healing
>
> Hi Victor,
>
> On Sun, Mar 18, 2018 at 3:47 AM, Victor T <hero_of_nothing_1 at hotmail.com>
> wrote:
>
>
> *No. After bringing up one brick and before stopping the next one, you
> need to be sure that there are no damaged files. You shouldn't reboot a
> node if "gluster volume heal <volname> info" shows damaged files.*
>
> What happens in this case then? I'm thinking about a situation where the
> servers are kept in an environment that we don't control - i.e. the cloud.
> If the VMs are forcibly rebooted without enough time to complete a heal
> before the next one goes down, then it cannot be guaranteed that the data
> is safe? This has happened to me with Azure before, during the
> Meltdown/Spectre incident.
>
>
> This is something that needs to be considered before deploying a dispersed
> volume (in fact any kind of volume). If multiple bricks can be restarted at
> *any* time, without *any* prior notification, then it's hard to guarantee
> much. You can think of this as similar to a RAID array: if someone starts
> removing and adding disks without control, you will surely lose the entire
> volume unless there is enough time between disk removals to rebuild newly
> added or reconnected disks. In the case of Gluster you won't lose the
> entire volume, but some files could be damaged.
>
> If you can't control the sequence of reboots but you know when they will
> happen, the best thing you can do is to stop volume access (at least write
> access). This will prevent any corruption, even if multiple bricks are
> restarted at the same time. Note that in this situation it's possible that
> you lose quorum, so the volume would be inaccessible anyway.
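>
> For example, assuming the volume is named "myvol", one way to block write
> access around a maintenance window you can't control would be something
> like:
>
>     # make the volume read-only while the reboots happen
>     gluster volume set myvol features.read-only on
>
>     # ... maintenance / uncontrolled reboots ...
>
>     # restore normal access once all bricks are back up and healed
>     gluster volume set myvol features.read-only off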
>
> Xavi
>
> ------------------------------
> *From:* Xavi Hernandez <jahernan at redhat.com>
> *Sent:* Thursday, March 15, 2018 11:46:52 PM
>
> *To:* Victor T
> *Cc:* gluster-users at gluster.org
> *Subject:* Re: [Gluster-users] Disperse volume recovery and healing
>
> On Fri, Mar 16, 2018 at 4:57 AM, Victor T <hero_of_nothing_1 at hotmail.com>
> wrote:
>
> Xavi, does that mean that even if every node was rebooted one at a time
> even without issuing a heal that the volume would have no issues after
> running gluster volume heal [volname] when all bricks are back online?
>
>
> No. After bringing up one brick and before stopping the next one, you need
> to be sure that there are no damaged files. You shouldn't reboot a node if
> "gluster volume heal <volname> info" shows damaged files.
>
> The command "gluster volume heal <volname>" is only a tool to force heal
> to progress (until the bug is fixed).
>
> Xavi
>
>
>
> ------------------------------
> *From:* Xavi Hernandez <jahernan at redhat.com>
> *Sent:* Thursday, March 15, 2018 12:09:05 AM
> *To:* Victor T
> *Cc:* gluster-users at gluster.org
> *Subject:* Re: [Gluster-users] Disperse volume recovery and healing
>
> Hi Victor,
>
> On Wed, Mar 14, 2018 at 12:30 AM, Victor T <hero_of_nothing_1 at hotmail.com>
> wrote:
>
> I have a question about how disperse volumes handle brick failure. I'm
> running version 3.10.10 on all systems. If I have a disperse volume in a
> 4+2 configuration with 6 servers each serving 1 brick, and maintenance
> needs to be performed on all systems, are there any general steps that need
> to be taken to ensure data is not lost or service interrupted? For example,
> can I just reboot each system sequentially after making sure the
> service is running on all servers before rebooting the next system? Or is
> there a need to force/wait for a heal after each brick comes back online?
> If I have two bricks down for multiple days and then bring them back in, is
> there a need to issue a heal or something like a rebalance before rebooting
> the other servers? There's lots of documentation about other volume types,
> but it seems information specific to dispersed volumes is a bit hard to
> find. Thanks a bunch.
>
>
> With a 4+2 configuration you could bring down up to 2 bricks simultaneously
> for maintenance. However, if something happens to one of the remaining 4
> bricks, the volume would stop working. So in this case I would recommend not
> having more than one server down for maintenance at the same time unless the
> downtime is very short.
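>
> For reference, a 4+2 layout like the one described is typically created
> with something along these lines (server and brick paths are just
> examples):
>
>     gluster volume create myvol disperse 6 redundancy 2 \
>         server1:/bricks/b1 server2:/bricks/b1 server3:/bricks/b1 \
>         server4:/bricks/b1 server5:/bricks/b1 server6:/bricks/b1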
>
> Once the stopped servers come back up again, you need to wait until all
> files are healed before proceeding with the next server. Failing to do so
> means that some files could end up with more than 2 non-healthy versions,
> which would make them inaccessible until enough healthy versions are
> available again.
>
> Self-heal should be automatically triggered once the bricks come online,
> however there was a bug (https://bugzilla.redhat.com/show_bug.cgi?id=1547662)
> that could cause delays in the self-heal process. This bug should be fixed in
> the next version. In the meantime you can force self-heal to progress by
> issuing "gluster volume heal <volname>" commands each time it seems to have
> stopped.
>
> Once the output of "gluster volume heal <volname> info" reports 0 pending
> files on all bricks, you can proceed with the maintenance of the next
> server.
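>
> In practice, something like this small loop (volume name "myvol" is just
> an example) can be run on one of the servers to keep nudging self-heal
> along and wait until nothing is pending:
>
>     # repeat until "heal info" reports 0 entries on every brick
>     while gluster volume heal myvol info | grep -q '^Number of entries: [1-9]'
>     do
>         gluster volume heal myvol   # re-trigger heal (workaround for the bug above)
>         sleep 60
>     done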
>
> There is no need to do any rebalance for down bricks. Rebalance is basically
> needed only when the volume is expanded with more bricks.
>
> Xavi
>
>