[Gluster-users] Migrating data from a failing filesystem
james.bellinger at icecube.wisc.edu
Wed Sep 24 18:14:45 UTC 2014
> On 09/24/2014 07:35 PM, james.bellinger at icecube.wisc.edu wrote:
>> Thanks for the info!
>> I started the remove-brick start and, of course, the brick went
>> read-only
>> in less than an hour.
>> This morning I checked the status a couple of minutes apart and found:
>>
>> Node         Rebalanced-files   size      scanned   failures   status
>> -----------  ----------------   -------   -------   --------   -----------
>> gfs-node04   6634               590.7GB   81799     14868      in progress
>> ...
>> gfs-node04   6669               596.5GB   86584     15271      in progress
>>
>> I'm not sure exactly what it is doing here: between those two snapshots it
>> scanned 4785 more files, hit 403 more failures, and rebalanced only 35.
> What it is supposed to be doing is scan all the files in the volume and,
> for the files present on itself, i.e. gfs-node04:/sdb, migrate (rebalance)
> them onto the other bricks in the volume. Let it run to completion.
> The rebalance log should give you an idea of what the 403 failures are.
I'll have a look at that.
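Assuming the default log location and naming, that should be
/var/log/glusterfs/scratch-rebalance.log on gfs-node04; something along these
lines ought to pull out the failed entries:

    grep -iE 'error|failed' /var/log/glusterfs/scratch-rebalance.log | tail -n 50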
>> The used amount on the partition hasn't
>> changed.
> Probably because after copying the files to the other bricks, the
> unlinks/rmdirs on itself are failing because of the FS being mounted
> read-only.
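That would explain it; the read-only flip is easy enough to confirm from the
node's mount table, e.g.:

    grep sdb /proc/mounts    # look for "ro" in the mount options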
>> If anything, the _other_ brick on the server is shrinking!
> Because the data is being copied into this brick as a part of migration?
No, the space used on the read/write brick is decreasing. The read-only
one isn't changing, of course.
FWIW, this operation seems to have triggered a failure elsewhere, so I was
a little occupied in getting a filesystem working again. (I can hardly
wait to remainder this system...)
>> (Which is related to the question I had before that you mention below.)
>>
>> gluster volume remove-brick scratch gfs-node04:/sdb start
> What is your original volume configuration? (gluster vol info scratch)?
$ sudo gluster volume info scratch
Volume Name: scratch
Type: Distribute
Volume ID: de1fbb47-3e5a-45dc-8df8-04f7f73a3ecc
Status: Started
Number of Bricks: 12
Transport-type: tcp,rdma
Bricks:
Brick1: gfs-node01:/sdb
Brick2: gfs-node01:/sdc
Brick3: gfs-node01:/sdd
Brick4: gfs-node03:/sda
Brick5: gfs-node03:/sdb
Brick6: gfs-node03:/sdc
Brick7: gfs-node04:/sda
Brick8: gfs-node04:/sdb
Brick9: gfs-node05:/sdb
Brick10: gfs-node06:/sdb
Brick11: gfs-node06:/sdc
Brick12: gfs-node05:/sdc
Options Reconfigured:
cluster.min-free-disk: 30GB
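For what it's worth, the drain on that one brick can be re-checked at any time
with the same brick argument; a sketch, with an arbitrary polling interval:

    sudo gluster volume remove-brick scratch gfs-node04:/sdb status
    watch -n 60 'sudo gluster volume remove-brick scratch gfs-node04:/sdb status'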
>> but...
>> df /sda
>> Filesystem     1K-blocks      Used           Available     Use%  Mounted on
>> /dev/sda       12644872688    10672989432    1844930140    86%   /sda
>> ...
>> /dev/sda       12644872688    10671453672    1846465900    86%   /sda
>>
>> Have I shot myself in the other foot?
>> jim
>>
>>
>>
>>
>>
>>> On 09/23/2014 08:56 PM, james.bellinger at icecube.wisc.edu wrote:
>>>> I inherited a non-replicated gluster system based on antique hardware.
>>>>
>>>> One of the brick filesystems is flaking out, and remounts read-only.
>>>> I
>>>> repair it and remount it, but this is only postponing the inevitable.
>>>>
>>>> How can I migrate files off a failing brick that intermittently turns
>>>> read-only? I have enough space, thanks to a catastrophic failure on
>>>> another brick; I don't want to present people with another one. But if
>>>> I understand migration correctly, references have to be deleted, which
>>>> isn't possible if the filesystem turns read-only.
>>> What you could do is initiate the migration with 'remove-brick start'
>>> and monitor the progress with 'remove-brick status'. Irrespective of
>>> whether the rebalance completes or fails (due to the brick turning
>>> read-only), you can still update the volume configuration with
>>> 'remove-brick commit'. Then, if the brick still has files left, mount the
>>> gluster volume on that node and copy the files from the brick into the
>>> volume via the mount. You can then safely rebuild the array, add a
>>> different brick, or whatever.
>>>
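Spelling that out for my own notes -- roughly this sequence, where the mount
point and the rsync options are just my guesses at how I'll do the final copy:

    # 1. drain the failing brick (already started)
    sudo gluster volume remove-brick scratch gfs-node04:/sdb start
    sudo gluster volume remove-brick scratch gfs-node04:/sdb status

    # 2. once it finishes (or gives up), drop the brick from the volume
    sudo gluster volume remove-brick scratch gfs-node04:/sdb commit

    # 3. mount the volume on gfs-node04 and copy over whatever is left,
    #    skipping gluster's internal metadata directory on the brick
    sudo mkdir -p /mnt/scratch
    sudo mount -t glusterfs gfs-node04:/scratch /mnt/scratch
    sudo rsync -a --ignore-existing --exclude=.glusterfs /sdb/ /mnt/scratch/

    # 4. after rebuilding the array and filesystem, add it back
    sudo gluster volume add-brick scratch gfs-node04:/sdb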
>>>> What I want to do is migrate the files off, remove it from gluster,
>>>> rebuild the array, rebuild the filesystem, and then add it back as a
>>>> brick. (Actually what I'd really like is to hear that the students
>>>> are
>>>> all done with the system and I can turn the whole thing off, but
>>>> theses
>>>> aren't complete yet.)
>>>>
>>>> Any advice or words of warning will be appreciated.
>>> Looks like your bricks have been in trouble for over a year now
>>> (http://gluster.org/pipermail/gluster-users.old/2013-September/014319.html).
>>> Better get them fixed sooner rather than later! :-)
>> Oddly enough the old XRAID systems are holding up better than the VTRAK
>> arrays. That doesn't help me much, though, since they're so small.
>>
>>> HTH,
>>> Ravi
>>>
>>>> James Bellinger
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Gluster-users mailing list
>>>> Gluster-users at gluster.org
>>>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>>>
>>
>
>