[Gluster-users] rebalance and its alternatives
Hans Lambermont
hans at shapeways.com
Fri Apr 26 10:32:01 UTC 2013
Hi Vijay,
Vijay Bellur wrote on 20130426:
> On 04/25/2013 12:57 AM, Hans Lambermont wrote:
> >For a replicated volume with a brick pair nearly full and several fresh
> >and empty bricks just added do you think the following would work ?
> >
> >Walk the nearly-full brick filesystem to find files on it. Use this list
> >to move them one by one away from the glusterfs volume to temp space,
> >and then move them back in.
> >I'm hoping this achieves a brick-targeted rebalance.
> >
> >Would this work ?
>
> This should normally work.
I went ahead and tried it. It works quite well for files that were added
to the volume before I added new bricks. I used this :
find /bricks/f/ -type f -size +1M -mtime +40 -exec outandin.sh {} \;
with outandin.sh being a small script :
#!/bin/bash
# Move one file out of the gluster volume to /tmp and back in, then report
# what happened to the copy on the original brick.
set -e
BRICKFILEWITHPATH="$1"
# Strip the brick root to get the path inside the volume. The pattern must
# match the brick root of the paths that find passes in.
INGLUSTERFILEWITHPATH=$(echo "$BRICKFILEWITHPATH" | sed -e 's@/gluster/./@@')
GLUSTERFILEWITHPATH="/volumemountpoint/$INGLUSTERFILEWITHPATH"
FILE=$(basename "$GLUSTERFILEWITHPATH")
SIZE=$(/usr/bin/du -hs "$BRICKFILEWITHPATH" | /usr/bin/cut -f1)
echo -en "$SIZE\t$BRICKFILEWITHPATH"
# Out of the volume, then back in under the same name.
/bin/mv "$GLUSTERFILEWITHPATH" "/tmp/$FILE"
/bin/mv "/tmp/$FILE" "$GLUSTERFILEWITHPATH"
if [ -f "$BRICKFILEWITHPATH" ]; then
    NEWSIZE=$(stat --printf='%s' "$BRICKFILEWITHPATH")
    if [ "$NEWSIZE" -gt 0 ]; then
        echo -e "\tbut it moved BACK with size $NEWSIZE"
    else
        echo -e "\tmoved away leaving a ---------T empty file"
    fi
else
    echo -e "\tmoved away"
fi
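One weakness of staging files as /tmp/$FILE is that two files with the same
basename in different directories would share one temp path. Below is a
minimal sketch of just the move-out-and-back step using a private staging
directory from mktemp -d. The function name out_and_in is made up for
illustration, and the sketch leaves out the brick-side reporting that the
script above does.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): move one file out of
# its directory and back in, staging it in a private temp directory so that
# files sharing a basename cannot overwrite each other.
out_and_in() {
    file="$1"
    name=$(basename -- "$file")
    stage=$(mktemp -d) || return 1

    # Move the file out; if that fails nothing was staged, so clean up.
    if ! mv -- "$file" "$stage/$name"; then
        rmdir -- "$stage"
        return 1
    fi

    # Move it back to its original path; on failure keep the staged copy.
    if ! mv -- "$stage/$name" "$file"; then
        echo "move back failed; file kept at $stage/$name" >&2
        return 1
    fi

    rmdir -- "$stage"
}
```

It would be called with the path on the volume mount point, for example
out_and_in "$GLUSTERFILEWITHPATH", in place of the two /bin/mv lines.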
> >The reason I'm looking into this solution is that a regular rebalance just
> >takes too long. Long as in 100+ days.
> >
> >Do you know of other alternatives ? Or to make rebalance start
> >rebalancing files right away ?
>
> What is the size of your volume?
26 TiB on 4 nodes, with the volume around 60% full, 20 M directories, 40 M files.
> Can you please provide details of the glusterfs version and the
> command that was issued for rebalancing?
I'm now using 3.3.1. The previous rebalance, of half the current data, was on
3.2.5 and took 7 days for the fix-layout and 40 days for the migrate-data
rebalance. Extrapolating that to today's data amount gives me about 100 days.
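The estimate can be checked with a few lines of shell arithmetic. This is a
rough sketch that assumes rebalance time grows linearly with the amount of
data, which is an assumption and not something measured; the variable names
are illustrative.

```shell
#!/bin/bash
# Durations observed on 3.2.5 for half of today's data (from the text above).
fix_layout_days=7
migrate_days=40
# Assumption: twice the data takes twice as long.
scale=2
estimate=$(( (fix_layout_days + migrate_days) * scale ))
echo "estimated rebalance time: $estimate days"
```

That gives 94 days, which is where the figure of about 100 days comes from.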
The commands I used on 3.2.5 were :
gluster volume rebalance volume1 fix-layout start
I then checked status until it said "rebalance step 1: layout fix complete", after which I used :
gluster volume rebalance volume1 migrate-data start
I expect to use the same commands on 3.3.1
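The wait between the two steps could be scripted instead of checked by hand.
The sketch below is a generic polling helper; wait_for_status is a made-up
name, and the text it waits for is the 3.2.5 status message quoted above,
which may read differently in other versions.

```shell
#!/bin/bash
# Hypothetical helper: run a status command repeatedly until its output
# contains the given text.
# Arguments: text to wait for, seconds between polls, then the command to run.
wait_for_status() {
    pattern="$1"
    interval="$2"
    shift 2
    until "$@" 2>&1 | grep -qF -- "$pattern"; do
        sleep "$interval"
    done
}

# Intended use with the 3.2.5 commands from this mail (needs a live cluster,
# so it is left commented out here):
# wait_for_status "layout fix complete" 60 \
#     gluster volume rebalance volume1 status &&
#     gluster volume rebalance volume1 migrate-data start
```

Taking the status command as arguments keeps the helper independent of the
exact gluster syntax, so it can be tried out with any command that prints
the expected text.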
I'm about to do a new rebalance, but I first need the open file descriptor
leak of https://bugzilla.redhat.com/show_bug.cgi?id=928631 fixed. The last
time I tested the release-3.3 head I could not write data to the volume
(https://bugzilla.redhat.com/show_bug.cgi?id=956245), so I'm going to retest
that.
If you have any advice on how to best move forward I'm very interested in it.
regards,
Hans Lambermont
--
Hans Lambermont | Senior Architect
(t) +31407370104 (w) www.shapeways.com