[Gluster-users] remove-brick question

Ravishankar N ravishankar at redhat.com
Thu Sep 19 03:10:50 UTC 2013


On 09/19/2013 02:04 AM, james.bellinger at icecube.wisc.edu wrote:
> Thanks for your replies.
>
> My first question is:  can I safely issue a "commit" when the volume does
> not seem to have drained?
Hi James,
You should be okay issuing a commit to update the volume information,
since the rebalance status command shows 'completed' (albeit with
failures for some files). Even if the volume information does get
messed up in the process, it would at least not affect the data present
on the back-end bricks.
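
For reference, the commit uses the same volume and brick as your start
command:

    gluster volume remove-brick scratch gfs-node01:/sda commit

Afterwards, 'gluster volume info scratch' should no longer list
gfs-node01:/sda among the bricks.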

Thanks,
Ravi
>
> One of the other arrays failed completely near the end of the draining, so
> I'm guaranteed some data loss in any event; I just don't want to put the
> system in an unstable state.
>
> I have questions about that too, but I'll save those for another message.
>
> Thanks,
> James Bellinger
>
>> On 09/17/2013 03:26 AM, james.bellinger at icecube.wisc.edu wrote:
>>> I inherited a system with a wide mix of array sizes (no replication) in
>>> 3.2.2, and wanted to drain data from a failing array.
>>>
>>> I upgraded to 3.3.2, and began a
>>> gluster volume remove-brick scratch "gfs-node01:/sda" start
>>>
>>> After some time I got this:
>>> gluster volume remove-brick scratch "gfs-node01:/sda" status
>>> Node          Rebalanced-files      size     scanned    failures       status
>>> ---------     ----------------   -------   ---------   ---------  -----------
>>> localhost                    0    0Bytes           0           0  not started
>>> gfs-node06                   0    0Bytes           0           0  not started
>>> gfs-node03                   0    0Bytes           0           0  not started
>>> gfs-node05                   0    0Bytes           0           0  not started
>>> gfs-node01          2257394624     2.8TB     5161640      208878    completed
>>>
>>> Two things jump instantly to mind:
>>> 1) The number of failures is rather large
>> Could you check the rebalance logs (/var/log/scratch-rebalance.log) to
>> figure out what the error messages are?
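>>
>> For example, assuming the usual log format where error lines carry an
>> " E " severity marker, something like this should pull them out:
>>
>>   grep " E " /var/log/scratch-rebalance.log | less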
>>> 2) A _different_ disk seems to have been _partially_ drained.
>>> /dev/sda              2.8T  2.7T   12G 100% /sda
>>> /dev/sdb              2.8T  769G  2.0T  28% /sdb
>>> /dev/sdc              2.8T  2.1T  698G  75% /sdc
>>> /dev/sdd              2.8T  2.2T  589G  79% /sdd
>>>
>>>
>> I know this sounds silly, but just to be sure: is /dev/sda actually
>> mounted at "gfs-node01:/sda"?
>> If yes, the files that _were_ successfully rebalanced should have been
>> moved from gfs-node01:/sda to one of the other bricks. Is that the case?
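>>
>> You can cross-check both with something like (paths as in your df
>> output):
>>
>>   df -h /sda
>>   gluster volume info scratch | grep -i brick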
>>
>>> When I mount the system it is read-only (another problem I want to fix
>> Again, the mount logs could shed some light on that ...
>> (BTW, a successful remove-brick start/status sequence should be
>> followed by the 'commit' command to ensure the volume information gets
>> updated.)
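>> The full sequence, with the same volume and brick as above, would be:
>>
>>   gluster volume remove-brick scratch gfs-node01:/sda start
>>   gluster volume remove-brick scratch gfs-node01:/sda status
>>   gluster volume remove-brick scratch gfs-node01:/sda commit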
>>
>>> ASAP) so I'm pretty sure the failures aren't due to users changing the
>>> system underneath me.
>>>
>>> Thanks for any pointers.
>>>
>>> James Bellinger
>>>