[Gluster-users] Gluster 3.6.9 missing files during remove migration operations
Ravishankar N
ravishankar at redhat.com
Fri Apr 29 01:31:39 UTC 2016
3.6.9 does not contain all the fixes needed to trigger auto-heal when
modifying the replica count using the replace-brick/add-brick commands.
For replace-brick, you might want to try out the manual steps mentioned
in the "Replacing brick in Replicate/Distributed Replicate volumes"
section of [1].
For add-brick, the steps mentioned by Anuradha in [2] should work.
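
To give a rough idea of the shape of it (the volume name and brick paths
below are only placeholders, and [1] has the complete replace-brick
procedure), the pattern on 3.6.x is to commit the new brick and then
drive the heal yourself:

    # swap the old brick for the new one
    gluster volume replace-brick testvol server1:/bricks/old server1:/bricks/new commit force
    # 3.6.9 may not start the heal on its own, so trigger and monitor it by hand
    gluster volume heal testvol full
    gluster volume heal testvol info
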
HTH,
Ravi
[1]
http://gluster.readthedocs.io/en/latest/Administrator%20Guide/Managing%20Volumes/#replace-brick
[2] https://www.gluster.org/pipermail/gluster-users/2016-January/025083.html
On 04/29/2016 01:51 AM, Bernard Gardner wrote:
> Further to this, I've continued my testing and discovered that during
> the same type of migration operation (add-brick followed by
> remove-brick in a replica=2 config), I sometimes see shell wildcard
> expansion return multiple instances of the same filename - so it
> seems that the namespace of a FUSE-mounted filesystem is somewhat
> mutable while a brick removal is in progress. This behaviour is
> intermittent, but it occurs frequently enough that I'd call it repeatable.
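>
> (An ad-hoc way to spot the duplicates, for anyone wanting to reproduce
> this, is just to look for repeated names in a leaf directory listing:
>
>     # any output here means the same entry is showing up more than once
>     ls /mnt/a/b/c | sort | uniq -d
> )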
>
> Does anyone have any feedback on my earlier question about whether
> this is expected behavior or a bug?
>
> Thanks,
> Bernard.
>
> On 20 April 2016 at 19:55, Bernard Gardner <bernard at sprybts.com> wrote:
>
> Hi,
>
> I'm running gluster 3.6.9 on Ubuntu 14.04 on a single test server
> (under Vagrant and VirtualBox), with 4 filesystems in addition to
> the root: 2 are XFS directly on the disk, and the other 2 are XFS
> on LVM. The scenario I'm testing is the migration of our production
> gluster onto LVM so that we can use the snapshot features in 3.6 to
> implement offline backups.
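>
> (For reference, roughly the kind of LVM layout this involves - gluster
> snapshots need thinly provisioned LVs, and the device names and sizes
> below are only placeholders:
>
>     pvcreate /dev/sdb
>     vgcreate vg_bricks /dev/sdb
>     lvcreate -L 18G -T vg_bricks/brickpool            # thin pool
>     lvcreate -V 16G -T vg_bricks/brickpool -n brick1  # thin LV for the brick
>     mkfs.xfs -i size=512 /dev/vg_bricks/brick1
>     mkdir -p /bricks/brick1 && mount /dev/vg_bricks/brick1 /bricks/brick1
> )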
>
> On my test machine, I configured a volume with replica 2 and 2
> bricks (both bricks on the same server). I then started the volume,
> mounted it back onto the same server under /mnt, and populated /mnt
> with a 3-level-deep hierarchy of 16 directories per level, with 10
> 1kB files in each leaf directory. So there are 40960 files in the
> filesystem (16x16x16x10), named like a/b/c/abc.0.
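>
> (In outline - hostname, volume name, brick paths and file names are
> placeholders - the setup was along these lines:
>
>     # replica 2 with both bricks on one host; force skips the same-server warning
>     gluster volume create testvol replica 2 server1:/bricks/b1 server1:/bricks/b2 force
>     gluster volume start testvol
>     mount -t glusterfs server1:/testvol /mnt
>
>     # 16x16x16 directories with 10 x 1kB files per leaf = 40960 files
>     for d in /mnt/{0..15}/{0..15}/{0..15}; do
>         mkdir -p "$d"
>         for i in $(seq 0 9); do
>             dd if=/dev/zero of="$d/file.$i" bs=1k count=1 2>/dev/null
>         done
>     done
> )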
>
> For my first test, I did a "replace-brick commit force" to swap
> the first brick in my config for a new brick on one of the
> XFS-on-LVM filesystems. This resulted in the /mnt filesystem
> appearing empty until I manually started a full heal on the volume,
> after which the files and directories started to re-appear on the
> mounted filesystem. After the heal completed everything looked OK,
> but that's not going to work for our production systems. This
> appeared to be the approach suggested in
> https://www.gluster.org/pipermail/gluster-users/2012-October/011502.html
> for a replicated volume.
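>
> (With illustrative brick paths, that test was essentially:
>
>     gluster volume replace-brick testvol server1:/bricks/b1 server1:/bricks/lvm1 commit force
>     # /mnt appeared empty at this point until the heal was kicked off manually
>     gluster volume heal testvol full
>     gluster volume heal testvol info    # watched until the heal finished
> )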
>
> For my second attempt, I rebuilt the test system from scratch,
> built and mounted the gluster volume the same way, and populated it
> with the same test files. I then did a volume add-brick and added
> both of the XFS-on-LVM filesystems to the configuration. The
> directory tree was copied to the new bricks, but no files were
> moved. I then did a volume remove-brick on the 2 initial bricks,
> and the system started migrating the files to the new filesystems.
> This looked more promising, but during the migration I ran
> find /mnt -type f | wc -l a number of times, and on one of those
> checks the number of files was 39280 instead of 40960. I wasn't
> able to observe exactly which files were missing; when I ran the
> command again immediately, and every other time during the
> migration, it reported 40960 files.
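>
> (Again with illustrative paths, the second test boiled down to:
>
>     gluster volume add-brick testvol server1:/bricks/lvm1 server1:/bricks/lvm2
>     gluster volume remove-brick testvol server1:/bricks/b1 server1:/bricks/b2 start
>     gluster volume remove-brick testvol server1:/bricks/b1 server1:/bricks/b2 status
>
>     # run repeatedly while the migration was in progress; one run
>     # reported 39280 files instead of 40960
>     find /mnt -type f | wc -l
>
>     # and once status showed completed:
>     gluster volume remove-brick testvol server1:/bricks/b1 server1:/bricks/b2 commit
> )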
>
> Is this expected behavior, or have I stumbled on a bug?
>
> Is there a better workflow for completing this migration?
>
> The production system runs in AWS and has 6 gluster servers across
> 2 availability zones, each with a single 600GB brick on an EBS
> volume; these are configured into one 1.8TB volume with replication
> across the availability zones. We are planning to create the new
> volumes with about 10% headroom left in the LVM config for holding
> snapshots, and we hope to implement a backup solution by taking a
> gluster snapshot followed by an EBS snapshot, giving us a
> consistent point-in-time offline backup (the gluster snapshot would
> be deleted once the EBS snapshot has been taken). I haven't yet
> figured out the details of how we would restore from the snapshots
> (I can test that scenario once I have a working local migration
> procedure and have migrated our test environment in AWS to support
> snapshots).
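>
> (The backup sequence we have in mind is roughly the following -
> snapshot and volume names are placeholders, and the EBS step would go
> through the AWS CLI or API:
>
>     gluster snapshot create backup1 prodvol
>     # EBS snapshot of the underlying volume is taken here, e.g.
>     # aws ec2 create-snapshot --volume-id <ebs-volume-id>
>     gluster snapshot delete backup1
> )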
>
> Thanks,
> Bernard.
>