[Gluster-users] Rebalancing newly added bricks

Nithya Balachandran nbalacha at redhat.com
Mon Sep 9 03:36:28 UTC 2019


On Sat, 7 Sep 2019 at 00:03, Strahil Nikolov <hunter86_bg at yahoo.com> wrote:

> As was mentioned, you might have to run the rebalance on the other node -
> but it is better to wait until this node is finished.
>
>
Hi Strahil,

Rebalance does not need to be run on the other node - the operation is a
volume-wide one. Only a single node per replica set migrates files in
the version used in this case.
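
For example (a minimal sketch using the volume name from this thread), the
status command can be run from either node and reports per-node progress for
the whole volume:

# gluster volume rebalance tank status

Only the node that is actually migrating files for a replica set will show a
non-zero Rebalanced-files count; the other node only scans.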

Regards,
Nithya

Best Regards,
> Strahil Nikolov
>
> On Friday, 6 September 2019 at 15:29:20 GMT+3, Herb Burnswell <
> herbert.burnswell at gmail.com> wrote:
>
>
>
>
> On Thu, Sep 5, 2019 at 9:56 PM Nithya Balachandran <nbalacha at redhat.com>
> wrote:
>
>
>
> On Thu, 5 Sep 2019 at 02:41, Herb Burnswell <herbert.burnswell at gmail.com>
> wrote:
>
> Thanks for the replies.  The rebalance is running and the brick
> percentages are not adjusting as expected:
>
> # df -hP |grep data
> /dev/mapper/gluster_vg-gluster_lv1_data   60T   49T   11T  83% /gluster_bricks/data1
> /dev/mapper/gluster_vg-gluster_lv2_data   60T   49T   11T  83% /gluster_bricks/data2
> /dev/mapper/gluster_vg-gluster_lv3_data   60T  4.6T   55T   8% /gluster_bricks/data3
> /dev/mapper/gluster_vg-gluster_lv4_data   60T  4.6T   55T   8% /gluster_bricks/data4
> /dev/mapper/gluster_vg-gluster_lv5_data   60T  4.6T   55T   8% /gluster_bricks/data5
> /dev/mapper/gluster_vg-gluster_lv6_data   60T  4.6T   55T   8% /gluster_bricks/data6
>
> At the current pace it looks like this will continue to run for another
> 5-6 days.
>
> I appreciate the guidance..
>
>
> What is the output of the rebalance status command?
> Can you check if there are any errors in the rebalance logs on the node
> on which you see rebalance activity?
> If there are a lot of small files on the volume, the rebalance is expected
> to take time.
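>
> As a quick sketch (assuming the default log location on this install), the
> rebalance log for this volume lives on the node doing the migration and can
> be checked for error entries with something like:
>
> # grep ' E ' /var/log/glusterfs/tank-rebalance.log | tail
>
> Error lines in Gluster logs carry the ' E ' severity marker, so an empty
> result here normally means nothing was logged at error level.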
>
> Regards,
> Nithya
>
>
> My apologies, that was a typo.  I meant to say:
>
> "The rebalance is running and the brick percentages are NOW adjusting as
> expected"
>
> I did expect the rebalance to take several days.  The rebalance log is not
> showing any errors.  Status output:
>
> # gluster vol rebalance tank status
>       Node   Rebalanced-files     size    scanned   failures   skipped        status   run time in h:m:s
>  ---------   ----------------  -------  ---------  ---------  --------  ------------  ------------------
>  localhost            1251320   35.5TB    2079527          0         0   in progress            139:9:46
>    serverB                  0   0Bytes          7          0         0     completed            63:47:55
> volume rebalance: tank: success
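>
> Given the multi-day runtime, a simple way to keep an eye on it (plain
> watch, nothing Gluster-specific) is something like:
>
> # watch -n 300 'gluster volume rebalance tank status'
>
> which re-runs the status command every five minutes.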
>
> Thanks again for the guidance.
>
> HB
>
>
>
>
>
> On Mon, Sep 2, 2019 at 9:08 PM Nithya Balachandran <nbalacha at redhat.com>
> wrote:
>
>
>
> On Sat, 31 Aug 2019 at 22:59, Herb Burnswell <herbert.burnswell at gmail.com>
> wrote:
>
> Thank you for the reply.
>
> I started a rebalance with force on serverA as suggested.  Now I see
> 'activity' on that node:
>
> # gluster vol rebalance tank status
>       Node   Rebalanced-files     size    scanned   failures   skipped        status   run time in h:m:s
>  ---------   ----------------  -------  ---------  ---------  --------  ------------  ------------------
>  localhost               6143    6.1GB       9542          0         0   in progress               0:4:5
>    serverB                  0   0Bytes          7          0         0   in progress               0:4:5
> volume rebalance: tank: success
>
> But I am not seeing any activity on serverB.  Is this expected?  Does the
> rebalance need to run on each node even though it says both nodes are 'in
> progress'?
>
>
> It looks like this is a replicate volume. If that is the case then yes,
> this is expected - you are running an old version of Gluster for which
> this is the default behaviour (only one node per replica set migrates
> files).
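>
> If you want to confirm both assumptions (a minimal sketch, nothing beyond
> the stock CLI), the volume type and the installed release can be checked
> with:
>
> # gluster --version | head -1
> # gluster volume info tank | grep -E 'Type|Number of Bricks'
>
> A Type of Distributed-Replicate and a 3.x release line would match what is
> described in this thread.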
>
> Regards,
> Nithya
>
> Thanks,
>
> HB
>
> On Sat, Aug 31, 2019 at 4:18 AM Strahil <hunter86_bg at yahoo.com> wrote:
>
> The rebalance status shows 0 Bytes.
>
> Maybe you should try 'gluster volume rebalance <VOLNAME> start force'?
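>
> For the volume in this thread that would be (a minimal sketch, the same
> command with the volume name filled in):
>
> # gluster volume rebalance tank start force
>
> As I read the linked admin guide, a plain start skips migrating a file if
> the move would leave the destination brick with less free space than the
> source, while start force migrates it anyway.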
>
> Best Regards,
> Strahil Nikolov
>
> Source:
> https://docs.gluster.org/en/latest/Administrator%20Guide/Managing%20Volumes/#rebalancing-volumes
> On Aug 30, 2019 20:04, Herb Burnswell <herbert.burnswell at gmail.com> wrote:
>
> All,
>
> RHEL 7.5
> Gluster 3.8.15
> 2 Nodes: serverA & serverB
>
> I am not deeply knowledgeable about Gluster and its administration, but we
> have a 2-node cluster that has been running for about a year and a half.  All
> has worked fine to date.  Until now, our main volume consisted of two 60TB
> bricks on each of the cluster nodes.  As we reached capacity on the volume we
> needed to expand, so we have added four new 60TB bricks to each of the
> cluster nodes.  The bricks are now visible, and the total size of the volume
> is as expected:
>
> # gluster vol status tank
> Status of volume: tank
> Gluster process                             TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick serverA:/gluster_bricks/data1         49162     0          Y       20318
> Brick serverB:/gluster_bricks/data1         49166     0          Y       3432
> Brick serverA:/gluster_bricks/data2         49163     0          Y       20323
> Brick serverB:/gluster_bricks/data2         49167     0          Y       3435
> Brick serverA:/gluster_bricks/data3         49164     0          Y       4625
> Brick serverA:/gluster_bricks/data4         49165     0          Y       4644
> Brick serverA:/gluster_bricks/data5         49166     0          Y       5088
> Brick serverA:/gluster_bricks/data6         49167     0          Y       5128
> Brick serverB:/gluster_bricks/data3         49168     0          Y       22314
> Brick serverB:/gluster_bricks/data4         49169     0          Y       22345
> Brick serverB:/gluster_bricks/data5         49170     0          Y       22889
> Brick serverB:/gluster_bricks/data6         49171     0          Y       22932
> Self-heal Daemon on localhost               N/A       N/A        Y       22981
> Self-heal Daemon on serverA.example.com     N/A       N/A        Y       6202
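>
> As an extra sanity check on the expanded layout above (a sketch using only
> the standard CLI), the volume type, replica grouping and brick order can be
> listed with:
>
> # gluster volume info tank
>
> In a distributed-replicate volume the bricks are grouped into replica sets
> in the order shown there, which is worth confirming after an add-brick.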
>
> After adding the bricks we ran a rebalance from serverA as:
>
> # gluster volume rebalance tank start
>
> The rebalance completed:
>
> # gluster volume rebalance tank status
>                  Node   Rebalanced-files     size   scanned   failures   skipped      status   run time in h:m:s
>             ---------   ----------------  -------  --------  ---------  --------  ----------  ------------------
>             localhost                  0   0Bytes         0          0         0   completed              3:7:10
>   serverA.example.com                  0   0Bytes         0          0         0   completed               0:0:0
> volume rebalance: tank: success
>
> However, when I run a df, the two original bricks still show all of the
> consumed space (this is the same on both nodes):
>
> # df -hP
> Filesystem                               Size  Used Avail Use% Mounted on
> /dev/mapper/vg0-root                     5.0G  625M  4.4G  13% /
> devtmpfs                                  32G     0   32G   0% /dev
> tmpfs                                     32G     0   32G   0% /dev/shm
> tmpfs                                     32G   67M   32G   1% /run
> tmpfs                                     32G     0   32G   0% /sys/fs/cgroup
> /dev/mapper/vg0-usr                       20G  3.6G   17G  18% /usr
> /dev/md126                              1014M  228M  787M  23% /boot
> /dev/mapper/vg0-home                     5.0G   37M  5.0G   1% /home
> /dev/mapper/vg0-opt                      5.0G   37M  5.0G   1% /opt
> /dev/mapper/vg0-tmp                      5.0G   33M  5.0G   1% /tmp
> /dev/mapper/vg0-var                       20G  2.6G   18G  13% /var
> /dev/mapper/gluster_vg-gluster_lv1_data   60T   59T  1.1T  99% /gluster_bricks/data1
> /dev/mapper/gluster_vg-gluster_lv2_data   60T   58T  1.3T  98% /gluster_bricks/data2
> /dev/mapper/gluster_vg-gluster_lv3_data   60T  451M   60T   1% /gluster_bricks/data3
> /dev/mapper/gluster_vg-gluster_lv4_data   60T  451M   60T   1% /gluster_bricks/data4
> /dev/mapper/gluster_vg-gluster_lv5_data   60T  451M   60T   1% /gluster_bricks/data5
> /dev/mapper/gluster_vg-gluster_lv6_data   60T  451M   60T   1% /gluster_bricks/data6
> localhost:/tank                          355T  116T  239T  33% /mnt/tank
>
> We were expecting the used space to be redistributed across the (now six)
> bricks on each node after the rebalance.  Is that not what a rebalance does?
> Is this expected behavior?
>
> Can anyone provide some guidance as to what the behavior here means and
> whether there is anything that we need to do at this point?
>
> Thanks in advance,
>
> HB
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>

