[Gluster-users] Rebalancing newly added bricks

Herb Burnswell herbert.burnswell at gmail.com
Fri Sep 6 12:28:59 UTC 2019


On Thu, Sep 5, 2019 at 9:56 PM Nithya Balachandran <nbalacha at redhat.com>
wrote:

>
>
> On Thu, 5 Sep 2019 at 02:41, Herb Burnswell <herbert.burnswell at gmail.com>
> wrote:
>
>> Thanks for the replies.  The rebalance is running and the brick
>> percentages are not adjusting as expected:
>>
>> # df -hP |grep data
>> /dev/mapper/gluster_vg-gluster_lv1_data   60T   49T   11T  83% /gluster_bricks/data1
>> /dev/mapper/gluster_vg-gluster_lv2_data   60T   49T   11T  83% /gluster_bricks/data2
>> /dev/mapper/gluster_vg-gluster_lv3_data   60T  4.6T   55T   8% /gluster_bricks/data3
>> /dev/mapper/gluster_vg-gluster_lv4_data   60T  4.6T   55T   8% /gluster_bricks/data4
>> /dev/mapper/gluster_vg-gluster_lv5_data   60T  4.6T   55T   8% /gluster_bricks/data5
>> /dev/mapper/gluster_vg-gluster_lv6_data   60T  4.6T   55T   8% /gluster_bricks/data6
>>
>> At the current pace it looks like this will continue to run for another
>> 5-6 days.
>>
>> I appreciate the guidance..
>>
>>
> What is the output of the rebalance status command?
> Can you check if there are any errors in the rebalance logs on the node
> on which you see rebalance activity?
> If there are a lot of small files on the volume, the rebalance is expected
> to take time.
>
> Regards,
> Nithya
>

My apologies, that was a typo.  I meant to say:

"The rebalance is running and the brick percentages are NOW adjusting as
expected"

I did expect the rebalance to take several days.  The rebalance log is not
showing any errors.  Status output:

# gluster vol rebalance tank status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------   --------------
                               localhost          1251320        35.5TB       2079527             0             0          in progress        139:9:46
                                 serverB                0        0Bytes             7             0             0            completed        63:47:55
volume rebalance: tank: success
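
For the record, this is roughly how I checked the rebalance log for errors on the node showing activity (a quick sketch; it assumes the default log location under /var/log/glusterfs and our volume name, tank):

# grep ' E ' /var/log/glusterfs/tank-rebalance.log         # error-level entries
# grep ' W ' /var/log/glusterfs/tank-rebalance.log | tail  # most recent warnings, if any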

Thanks again for the guidance.

HB



>
>>
>> On Mon, Sep 2, 2019 at 9:08 PM Nithya Balachandran <nbalacha at redhat.com>
>> wrote:
>>
>>>
>>>
>>> On Sat, 31 Aug 2019 at 22:59, Herb Burnswell <
>>> herbert.burnswell at gmail.com> wrote:
>>>
>>>> Thank you for the reply.
>>>>
>>>> I started a rebalance with force on serverA as suggested.  Now I see
>>>> 'activity' on that node:
>>>>
>>>> # gluster vol rebalance tank status
>>>>                                     Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
>>>>                                ---------      -----------   -----------   -----------   -----------   -----------         ------------   --------------
>>>>                                localhost             6143         6.1GB          9542             0             0          in progress         0:4:5
>>>>                                  serverB                0        0Bytes             7             0             0          in progress         0:4:5
>>>> volume rebalance: tank: success
>>>>
>>>> But I am not seeing any activity on serverB.  Is this expected?  Does
>>>> the rebalance need to run on each node even though it says both nodes are
>>>> 'in progress'?
>>>>
>>>>
>>> It looks like this is a replicate volume. If that is the case, then yes,
>>> this is expected: you are running an old version of Gluster for which this
>>> was the default behaviour.
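>>>
>>> If you want to double-check the volume type and the running version,
>>> something along these lines should work (using your volume name, tank):
>>>
>>> # gluster volume info tank | grep -E 'Type|Number of Bricks'
>>> # gluster --version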
>>>
>>> Regards,
>>> Nithya
>>>
>>>> Thanks,
>>>>
>>>> HB
>>>>
>>>> On Sat, Aug 31, 2019 at 4:18 AM Strahil <hunter86_bg at yahoo.com> wrote:
>>>>
>>>>> The rebalance status shows 0 Bytes.
>>>>>
>>>>> Maybe you should try with the 'gluster volume rebalance <VOLNAME>
>>>>> start force' ?
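>>>>>
>>>>> For your volume that would presumably be:
>>>>>
>>>>> # gluster volume rebalance tank start force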
>>>>>
>>>>> Best Regards,
>>>>> Strahil Nikolov
>>>>>
>>>>> Source:
>>>>> https://docs.gluster.org/en/latest/Administrator%20Guide/Managing%20Volumes/#rebalancing-volumes
>>>>> On Aug 30, 2019 20:04, Herb Burnswell <herbert.burnswell at gmail.com>
>>>>> wrote:
>>>>>
>>>>> All,
>>>>>
>>>>> RHEL 7.5
>>>>> Gluster 3.8.15
>>>>> 2 Nodes: serverA & serverB
>>>>>
>>>>> I am not deeply knowledgeable about Gluster and its administration, but we
>>>>> have a 2-node cluster that has been running for about a year and a half.
>>>>> All has worked fine to date.  Our main volume has consisted of two 60TB
>>>>> bricks on each of the cluster nodes.  As we reached capacity on the volume
>>>>> we needed to expand, so we added four new 60TB bricks to each of the
>>>>> cluster nodes.  The new bricks are now seen, and the total size of the
>>>>> volume is as expected:
>>>>>
>>>>> # gluster vol status tank
>>>>> Status of volume: tank
>>>>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>>>>> ------------------------------------------------------------------------------
>>>>> Brick serverA:/gluster_bricks/data1         49162     0          Y       20318
>>>>> Brick serverB:/gluster_bricks/data1         49166     0          Y       3432
>>>>> Brick serverA:/gluster_bricks/data2         49163     0          Y       20323
>>>>> Brick serverB:/gluster_bricks/data2         49167     0          Y       3435
>>>>> Brick serverA:/gluster_bricks/data3         49164     0          Y       4625
>>>>> Brick serverA:/gluster_bricks/data4         49165     0          Y       4644
>>>>> Brick serverA:/gluster_bricks/data5         49166     0          Y       5088
>>>>> Brick serverA:/gluster_bricks/data6         49167     0          Y       5128
>>>>> Brick serverB:/gluster_bricks/data3         49168     0          Y       22314
>>>>> Brick serverB:/gluster_bricks/data4         49169     0          Y       22345
>>>>> Brick serverB:/gluster_bricks/data5         49170     0          Y       22889
>>>>> Brick serverB:/gluster_bricks/data6         49171     0          Y       22932
>>>>> Self-heal Daemon on localhost               N/A       N/A        Y       22981
>>>>> Self-heal Daemon on serverA.example.com     N/A       N/A        Y       6202
>>>>>
>>>>> After adding the bricks we ran a rebalance from serverA as:
>>>>>
>>>>> # gluster volume rebalance tank start
>>>>>
>>>>> The rebalance completed:
>>>>>
>>>>> # gluster volume rebalance tank status
>>>>>                                     Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
>>>>>                                ---------      -----------   -----------   -----------   -----------   -----------         ------------   --------------
>>>>>                                localhost                0        0Bytes             0             0             0            completed         3:7:10
>>>>>                      serverA.example.com                0        0Bytes             0             0             0            completed          0:0:0
>>>>> volume rebalance: tank: success
>>>>>
>>>>> However, when I run a df, the two original bricks still show all of
>>>>> the consumed space (this is the same on both nodes):
>>>>>
>>>>> # df -hP
>>>>> Filesystem                                Size  Used Avail Use% Mounted on
>>>>> /dev/mapper/vg0-root                      5.0G  625M  4.4G  13% /
>>>>> devtmpfs                                   32G     0   32G   0% /dev
>>>>> tmpfs                                      32G     0   32G   0% /dev/shm
>>>>> tmpfs                                      32G   67M   32G   1% /run
>>>>> tmpfs                                      32G     0   32G   0% /sys/fs/cgroup
>>>>> /dev/mapper/vg0-usr                        20G  3.6G   17G  18% /usr
>>>>> /dev/md126                               1014M  228M  787M  23% /boot
>>>>> /dev/mapper/vg0-home                      5.0G   37M  5.0G   1% /home
>>>>> /dev/mapper/vg0-opt                       5.0G   37M  5.0G   1% /opt
>>>>> /dev/mapper/vg0-tmp                       5.0G   33M  5.0G   1% /tmp
>>>>> /dev/mapper/vg0-var                        20G  2.6G   18G  13% /var
>>>>> /dev/mapper/gluster_vg-gluster_lv1_data    60T   59T  1.1T  99% /gluster_bricks/data1
>>>>> /dev/mapper/gluster_vg-gluster_lv2_data    60T   58T  1.3T  98% /gluster_bricks/data2
>>>>> /dev/mapper/gluster_vg-gluster_lv3_data    60T  451M   60T   1% /gluster_bricks/data3
>>>>> /dev/mapper/gluster_vg-gluster_lv4_data    60T  451M   60T   1% /gluster_bricks/data4
>>>>> /dev/mapper/gluster_vg-gluster_lv5_data    60T  451M   60T   1% /gluster_bricks/data5
>>>>> /dev/mapper/gluster_vg-gluster_lv6_data    60T  451M   60T   1% /gluster_bricks/data6
>>>>> localhost:/tank                           355T  116T  239T  33% /mnt/tank
>>>>>
>>>>> We were thinking that the used space would be redistributed across the
>>>>> now six bricks after the rebalance.  Is that not what a rebalance does?
>>>>> Is this expected behavior?
>>>>>
>>>>> Can anyone provide some guidance as to what the behavior is here and
>>>>> whether there is anything we need to do at this point?
>>>>>
>>>>> Thanks in advance,
>>>>>
>>>>> HB
>>>>>