[Gluster-users] problems after gluster volume remove-brick
Olav Peeters
opeeters at gmail.com
Wed Jan 21 14:33:09 UTC 2015
Adding to my previous mail..
I find a couple of strange errors in the rebalance log
(/var/log/glusterfs/sr_vol01-rebalance.log)
e.g.:
[2015-01-21 10:00:32.123999] E
[afr-self-heal-entry.c:1135:afr_sh_entry_impunge_newfile_cbk]
0-sr_vol01-replicate-11: creation of /some/file/on/the/volume.data on
sr_vol01-client-23 failed (No space left on device)
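Since that is an AFR self-heal message, I am also keeping an eye on
pending heals while this is going on; assuming the usual check applies
here, roughly:

gluster volume heal sr_vol01 info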
Why does the rebalance seemingly not take into account the space left
on the available disks?
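My understanding, and it is only an assumption on my part, is that
cluster.min-free-disk is the DHT option that should steer new files
away from nearly full bricks, so I will double-check what it is set to
on this volume, roughly:

gluster volume info sr_vol01 | grep min-free-disk
gluster volume set sr_vol01 cluster.min-free-disk 10%

(the 10% is just an example value, and "volume info" only lists options
that have been reconfigured away from their defaults).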
This is the current situation on this particular node:
[root at gluster03 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroup-lv_root
50G 2.4G 45G 5% /
tmpfs 7.8G 0 7.8G 0% /dev/shm
/dev/sda1 485M 95M 365M 21% /boot
/dev/sdb1 1.9T 577G 1.3T 31% /export/brick1gfs03
/dev/sdc1 1.9T 154G 1.7T 9% /export/brick2gfs03
/dev/sdd1 1.9T 413G 1.5T 23% /export/brick3gfs03
/dev/sde1 1.9T 1.5T 417G 78% /export/brick4gfs03
/dev/sdf1 1.9T 1.6T 286G 85% /export/brick5gfs03
/dev/sdg1 1.9T 1.4T 443G 77% /export/brick6gfs03
/dev/sdh1 1.9T 33M 1.9T 1% /export/brick7gfs03
/dev/sdi1 466G 62G 405G 14% /export/brick8gfs03
/dev/sdj1 466G 166G 301G 36% /export/brick9gfs03
/dev/sdk1 466G 466G 20K 100% /export/brick10gfs03
/dev/sdl1 466G 450G 16G 97% /export/brick11gfs03
/dev/sdm1 1.9T 206G 1.7T 12% /export/brick12gfs03
/dev/sdn1 1.9T 306G 1.6T 17% /export/brick13gfs03
/dev/sdo1 1.9T 107G 1.8T 6% /export/brick14gfs03
/dev/sdp1 1.9T 252G 1.6T 14% /export/brick15gfs03
Why are brick10 and brick11 over-utilised when there is plenty of space
on brick 6, 14, etc.?
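For what it is worth, I am also watching the free space as gluster
itself reports it per brick, and the rebalance progress per node,
roughly:

gluster volume status sr_vol01 detail
gluster volume rebalance sr_vol01 status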
Does anyone have any idea?
Cheers,
Olav
On 21/01/15 13:18, Olav Peeters wrote:
> Hi,
> two days ago I started a gluster volume remove-brick on a
> Distributed-Replicate volume (21 x 2, spread over 3 nodes in total).
>
> I wanted to remove 4 bricks per node which are smaller than the others
> (on each node I have 7 x 2TB disks and 4 x 500GB disks).
> I am still on gluster 3.5.2, and I was not aware that using disks of
> different sizes is only supported as of 3.6.x (am I correct?).
>
> I started with 2 paired disks like so:
> gluster volume remove-brick VOLNAME node03:/export/brick8node03
> node02:/export/brick10node02 start
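> (For reference, my understanding of the full cycle, and please correct
> me if I have this wrong, is start, then status until completed, then
> commit:
>
> gluster volume remove-brick VOLNAME BRICK1 BRICK2 start
> gluster volume remove-brick VOLNAME BRICK1 BRICK2 status
> gluster volume remove-brick VOLNAME BRICK1 BRICK2 commit
>
> I have not run the commit step.)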
>
> I followed the progress (which was very slow):
> gluster volume remove-brick volume_name node03:/export/brick8node03
> node02:/export/brick10node02 status
> After a day, the progress of node03:/export/brick8node03 showed
> "completed"; the other brick remained "in progress".
>
> This morning several VMs with VDIs on the volume started showing disk
> errors, and a couple of glusterfs mounts returned a "disk is full"
> type of error on the volume, which is currently only ca. 41% filled
> with data.
>
> Via df -h I saw that most of the 500GB disks were indeed 100% full,
> while others were nearly empty.
> Gluster seems to have gone a bit nuts while rebalancing the data.
>
> I did a:
> gluster volume remove-brick VOLNAME node03:/export/brick8node03
> node02:/export/brick10node02 stop
> and a:
> gluster volume rebalance VOLNAME start
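> (I am following it per node with, roughly:
>
> gluster volume rebalance VOLNAME status
>
> assuming that is still the right way to monitor a rebalance on 3.5.x.)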
>
> Progress is again very slow, and some of the disks/bricks which were
> ca. 98% full are now 100% full.
> The situation seems to be getting worse in some cases and slowly
> improving in others, e.g. for another pair of bricks (from 100% to
> 97%).
>
> There clearly has been some data corruption. Some VMs don't want to
> boot anymore, throwing disk errors.
>
> How do I proceed?
> Wait a very long time for the rebalance to complete and hope that the
> data corruption is automatically mended?
>
> Upgrade to 3.6.x and hope that the issues (which might be related to
> me using bricks of different sizes) are resolved and again risk a
> remove-brick operation?
>
> Should I rather do a:
> gluster volume rebalance VOLNAME migrate-data start
>
> Should I have done a replace-brick instead of a remove-brick operation
> originally? I thought that replace-brick is becoming obsolete.
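> (If replace-brick is the way to go after all, my understanding, and
> this is only an assumption on my part, is that the commit force form
> is the one that is still recommended, roughly:
>
> gluster volume replace-brick VOLNAME node03:/export/brick8node03
> node03:/export/newbrick commit force
>
> where node03:/export/newbrick is just a hypothetical new brick path.)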
>
> Thanks,
> Olav