[Gluster-users] Advice on rebuilding underlying filesystem

Andrew Smith smith.andrew.james at gmail.com
Fri Apr 11 21:48:51 UTC 2014


Yes, I think that is right, though I would love some expert confirmation.

I did a REMOVE, REFORMAT, then an ADD + REMOVE. The final ADD + REMOVE
combination did not behave as predicted because the first remove unbalanced
the system. 

On Apr 11, 2014, at 5:43 PM, Machiel Groeneveld <machielg at gmail.com> wrote:

> That message uses add + remove. You did remove + add. Maybe that matters?
> 
> Sent from my iPad
> 
>> On 11 Apr 2014, at 23:38, Andrew Smith <smith.andrew.james at gmail.com> wrote:
>> 
>> 
>> My understanding is that “replace-brick” is deprecated 
>> 
>> http://www.gluster.org/pipermail/gluster-users/2012-October/034473.html
>> 
>> And that the “add-brick” followed by “remove-brick” should behave
>> the same way.
>> 
>> It does not behave as predicted, I think, because my system is 
>> unbalanced. I have no idea whether or no the “replace-brick” 
>> command will behave differently. 
>> 
>> Andy
>> 
>> 
>>> On Apr 11, 2014, at 5:34 PM, Machiel Groeneveld <machielg at gmail.com> wrote:
>>> 
>>> Isn't that what replace-brick is for?
>>> 
>>> 
>>>> On 11 Apr 2014, at 23:32, Andrew Smith <smith.andrew.james at gmail.com> wrote:
>>>> 
>>>> 
>>>> Hi, I have a problem, which I hope for your sake, is uncommon.
>>>> 
>>>> I built a Gluster  volume with 8 bricks, 4 80TB and 4 68TB with 
>>>> a total capacity of about 600TB. The underlying filesystem 
>>>> is BTRFS. 
>>>> 
>>>> I found out after the system was half full that BTRFS was a 
>>>> bad idea. BTRFS doesn’t have inodes. It allocates some fraction
>>>> of the disk space to metadata and when it runs out, it allocates
>>>> more. This allocation process on large volumes is painfully slow
>>>> and brings effective write speeds down to only a few MB/s with long 
>>>> timeouts. The data can be read at high speeds, but writing to the
>>>> volume is a big fat mess. Reading is still fairly fast though, 
>>>> so access to the my data by users is acceptable. 
>>>> 
>>>> I need to keep this volume available and I don’t have a second 
>>>> copy of the hardware to rebuild the system on. So, I need to do 
>>>> an in-situ transition from BTRFS to XFS. 
>>>> 
>>>> To do this, I first cleared out some data to free up metadata space,
>>>> and then with much difficulty managed to do a 
>>>> 
>>>> # gluster volume remove-brick 
>>>> 
>>>> I retired the removed brick and then reformatted it with XFS and added
>>>> it back to my Gluster volume. At this point, I thought I was nearly 
>>>> home. I thought I could retire a second brick and the data would 
>>>> be copied to the empty brick. However, this is not what happens.
>>>> Some data ends up on the newly added brick, but some of the data 
>>>> flows elsewhere, which due to the BTRFS problem is a nightmare.
>>>> 
>>>> I assume this is because when I took my volume from 8 bricks to 7, it 
>>>> became unbalanced. The data on the brick that I was retiring 
>>>> belongs on several different bricks and so I am not just doing a 
>>>> substitution.
>>>> 
>>>> I need to be able to tell my Gluster volume to include all the bricks, 
>>>> but do not write files to any of the BTRFS bricks so that it puts data
>>>> only on the XFS brick. If I could somehow tell Gluster that these bricks
>>>> were full, that would suffice. 
>>>> 
>>>> I could do a "rebalance migrate-data" to make make the data on the BTRFS
>>>> volumes more uniform, but I don’t know how this will work. Does reposition
>>>> the data brick by brick or file by file. Brick by brick would be bad, since
>>>> the last brick to rebalance would need to receive all the data that it requires
>>>> before it would get to write data out to free up metadata space. 
>>>> 
>>>> There is a “rebalance-brick” option in the man page, but I don’t see that 
>>>> documented. This may be useful, but I have no idea what it will do.
>>>> 
>>>> Is there a solution to my problem? Whip it and start over is not helpful. 
>>>> Any help on how I can predict where data will go will also help.
>>>> 
>>>> Andy
>>>> _______________________________________________
>>>> Gluster-users mailing list
>>>> Gluster-users at gluster.org
>>>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>> 




More information about the Gluster-users mailing list