[Gluster-users] Migrating data from a failing filesystem
Ravishankar N
ravishankar at redhat.com
Wed Sep 24 16:10:19 UTC 2014
On 09/24/2014 07:35 PM, james.bellinger at icecube.wisc.edu wrote:
> Thanks for the info!
> I started the remove-brick start and, of course, the brick went read-only
> in less than an hour.
> This morning I checked the status a couple of minutes apart and found:
>
> Node          Rebalanced-files      size      scanned    failures       status
> ----------    ----------------    --------    --------   ---------   -------------
> gfs-node04          6634           590.7GB      81799       14868     in progress
> ...
> gfs-node04          6669           596.5GB      86584       15271     in progress
>
> I'm not sure exactly what it is doing here: between those two readings it
> scanned 4785 files, hit 403 failures, and rebalanced only 35.
What it is supposed to be doing is to scan all the files in the volume and,
for the files present on the brick being removed (i.e. gfs-node04:/sdb),
migrate (rebalance) them to the other bricks in the volume. Let it run to
completion. The rebalance log should give you an idea of what the 403
failures are.
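For example, on gfs-node04 you could pull the error-level entries out of the
rebalance log; the path below is the usual default under /var/log/glusterfs,
so adjust it if your install logs elsewhere:

    grep ' E \[' /var/log/glusterfs/scratch-rebalance.log | tail -n 20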
> The used amount on the partition hasn't
> changed.
Probably because, after the files are copied to the other bricks, the
unlinks/rmdirs on the source brick are failing since its FS is mounted
read-only.
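You can confirm whether the brick's FS has flipped to read-only again by
checking /proc/mounts on gfs-node04 (assuming the brick is mounted at /sdb):

    grep ' /sdb ' /proc/mounts    # "ro" in the options field means the kernel remounted it read-only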
> If anything, the _other_ brick on the server is shrinking!
Because the data is being copied into this brick as a part of migration?
> (Which is related to the question I had before that you mention below.)
>
> gluster volume remove-brick scratch gfs-node04:/sdb start
What is your original volume configuration ('gluster volume info scratch')?
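If your gluster version supports it, 'gluster volume status scratch detail'
alongside the volume info would also help, since it reports per-brick disk
usage:

    gluster volume info scratch
    gluster volume status scratch detail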
> but...
> df /sda
> Filesystem      1K-blocks         Used       Available  Use%  Mounted on
> /dev/sda      12644872688  10672989432      1844930140   86%  /sda
> ...
> /dev/sda      12644872688  10671453672      1846465900   86%  /sda
>
> Have I shot myself in the other foot?
> jim
>
>
>
>
>
>> On 09/23/2014 08:56 PM, james.bellinger at icecube.wisc.edu wrote:
>>> I inherited a non-replicated gluster system based on antique hardware.
>>>
>>> One of the brick filesystems is flaking out, and remounts read-only. I
>>> repair it and remount it, but this is only postponing the inevitable.
>>>
>>> How can I migrate files off a failing brick that intermittently turns
>>> read-only? I have enough space, thanks to a catastrophic failure on
>>> another brick; I don't want to present people with another one. But if I
>>> understand migration correctly, references have to be deleted, which
>>> isn't possible if the filesystem turns read-only.
>> What you could do is initiate the migration with 'remove-brick start'
>> and monitor the progress with 'remove-brick status'. Irrespective of
>> whether the rebalance completes or fails (due to the brick turning
>> read-only), you could anyway update the volume configuration with
>> 'remove-brick commit'. Now if the brick still has files left, mount the
>> gluster volume on that node and copy the files from the brick to the
>> volume via the mount. You can then safely rebuild the array, add a
>> different brick, or whatever.
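To spell that out end to end, here is a rough sketch (the mount point below
is just an example, and note that the copy deliberately skips the brick's
internal .glusterfs directory):

    # drain the brick, watch progress, then drop it from the volume config
    gluster volume remove-brick scratch gfs-node04:/sdb start
    gluster volume remove-brick scratch gfs-node04:/sdb status
    gluster volume remove-brick scratch gfs-node04:/sdb commit
    # on gfs-node04: mount the volume and copy whatever is still left on the
    # old brick, skipping gluster's internal metadata directory
    mkdir -p /mnt/scratch
    mount -t glusterfs gfs-node04:/scratch /mnt/scratch
    rsync -av --exclude=/.glusterfs /sdb/ /mnt/scratch/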
>>
>>> What I want to do is migrate the files off, remove it from gluster,
>>> rebuild the array, rebuild the filesystem, and then add it back as a
>>> brick. (Actually what I'd really like is to hear that the students are
>>> all done with the system and I can turn the whole thing off, but theses
>>> aren't complete yet.)
>>>
>>> Any advice or words of warning will be appreciated.
>> Looks like your bricks have been in trouble for over a year now
>> (http://gluster.org/pipermail/gluster-users.old/2013-September/014319.html).
>> Better get them fixed sooner than later! :-)
> Oddly enough the old XRAID systems are holding up better than the VTRAK
> arrays. That doesn't help me much, though, since they're so small.
>
>> HTH,
>> Ravi
>>
>>> James Bellinger
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>>
>