[Gluster-users] Rebalancing newly added bricks

Herb Burnswell herbert.burnswell at gmail.com
Wed Sep 18 20:28:42 UTC 2019


>
> Hi,
>
> Rebalance will abort itself if it cannot reach any of the nodes. Are all
> the bricks still up and reachable?
>
> Regards,
> Nithya
>

Yes, the bricks appear to be fine. I restarted the rebalance and the
process is moving along again:

# gluster vol rebalance tank status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost           226973        14.9TB       1572952             0             0          in progress           44:26:48
                                 serverB                0        0Bytes        631667             0             0            completed            37:2:14
volume rebalance: tank: success
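
(For reference, restarting a failed rebalance is just a matter of issuing the
start command again; a minimal sketch, assuming no special options are needed.
The "force" keyword only belongs on the end if you deliberately want to
override the default free-space checks:)

# gluster volume rebalance tank start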

# df -hP |grep data
/dev/mapper/gluster_vg-gluster_lv1_data   60T   24T   36T  40%  /gluster_bricks/data1
/dev/mapper/gluster_vg-gluster_lv2_data   60T   24T   36T  40%  /gluster_bricks/data2
/dev/mapper/gluster_vg-gluster_lv3_data   60T   17T   43T  29%  /gluster_bricks/data3
/dev/mapper/gluster_vg-gluster_lv4_data   60T   17T   43T  29%  /gluster_bricks/data4
/dev/mapper/gluster_vg-gluster_lv5_data   60T   19T   41T  31%  /gluster_bricks/data5
/dev/mapper/gluster_vg-gluster_lv6_data   60T   19T   41T  31%  /gluster_bricks/data6
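
(If the rebalance stalls again, the per-node rebalance log is usually the
first thing to check. Assuming a default installation it lives under
/var/log/glusterfs as <volume>-rebalance.log, so something along these lines
will surface disconnects as they happen:)

# tail -f /var/log/glusterfs/tank-rebalance.log | grep -Ei 'disconnect|not responded|failed'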

Thanks,

HB



>
>
>
>
>>
>> # gluster vol rebalance tank status
>>                                     Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
>>                                ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
>>                                localhost          1348706        57.8TB       2234439             9             6               failed          190:24:3
>>                                  serverB                0        0Bytes             7             0             0            completed           63:47:55
>> volume rebalance: tank: success
>>
>> # gluster vol status tank
>> Status of volume: tank
>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>> ------------------------------------------------------------------------------
>> Brick serverA:/gluster_bricks/data1         49162     0          Y       20318
>> Brick serverB:/gluster_bricks/data1         49166     0          Y       3432
>> Brick serverA:/gluster_bricks/data2         49163     0          Y       20323
>> Brick serverB:/gluster_bricks/data2         49167     0          Y       3435
>> Brick serverA:/gluster_bricks/data3         49164     0          Y       4625
>> Brick serverA:/gluster_bricks/data4         49165     0          Y       4644
>> Brick serverA:/gluster_bricks/data5         49166     0          Y       5088
>> Brick serverA:/gluster_bricks/data6         49167     0          Y       5128
>> Brick serverB:/gluster_bricks/data3         49168     0          Y       22314
>> Brick serverB:/gluster_bricks/data4         49169     0          Y       22345
>> Brick serverB:/gluster_bricks/data5         49170     0          Y       22889
>> Brick serverB:/gluster_bricks/data6         49171     0          Y       22932
>> Self-heal Daemon on localhost               N/A       N/A        Y       6202
>> Self-heal Daemon on serverB                 N/A       N/A        Y       22981
>>
>> Task Status of Volume tank
>>
>> ------------------------------------------------------------------------------
>> Task                 : Rebalance
>> ID                   : eec64343-8e0d-4523-ad05-5678f9eb9eb2
>> Status               : failed
>>
>> # df -hP |grep data
>> /dev/mapper/gluster_vg-gluster_lv1_data   60T   31T   29T  52%  /gluster_bricks/data1
>> /dev/mapper/gluster_vg-gluster_lv2_data   60T   31T   29T  51%  /gluster_bricks/data2
>> /dev/mapper/gluster_vg-gluster_lv3_data   60T   15T   46T  24%  /gluster_bricks/data3
>> /dev/mapper/gluster_vg-gluster_lv4_data   60T   15T   46T  24%  /gluster_bricks/data4
>> /dev/mapper/gluster_vg-gluster_lv5_data   60T   15T   45T  25%  /gluster_bricks/data5
>> /dev/mapper/gluster_vg-gluster_lv6_data   60T   15T   45T  25%  /gluster_bricks/data6
>>
>>
>> The rebalance log on serverA shows a disconnect from serverB:
>>
>> [2019-09-08 15:41:44.285591] C
>> [rpc-clnt-ping.c:160:rpc_clnt_ping_timer_expired] 0-tank-client-10: server
>> <serverB>:49170 has not responded in the last 42 seconds, disconnecting.
>> [2019-09-08 15:41:44.285739] I [MSGID: 114018]
>> [client.c:2280:client_rpc_notify] 0-tank-client-10: disconnected from
>> tank-client-10. Client process will keep trying to connect to glusterd
>> until brick's port is available
>> [2019-09-08 15:41:44.286023] E [rpc-clnt.c:365:saved_frames_unwind] (-->
>> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7ff986e8b132] (-->
>> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7ff986c5299e] (-->
>> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7ff986c52aae] (-->
>> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x90)[0x7ff986c54220] (-->
>> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x2b0)[0x7ff986c54ce0] )))))
>> 0-tank-client-10: forced unwinding frame type(GlusterFS 3.3)
>> op(FXATTROP(34)) called at 2019-09-08 15:40:44.040333 (xid=0x7f8cfac)
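>>
>> (Incidentally, the 42-second window in the first log line is Gluster's
>> network.ping-timeout, which defaults to 42 seconds. As a rough sketch, on
>> reasonably recent releases the current value can be checked, and raised if
>> the network genuinely needs more headroom, with:)
>>
>> # gluster volume get tank network.ping-timeout
>> # gluster volume set tank network.ping-timeout 60
>>
>> (Raising it only buys time; it does not fix whatever made serverB stop
>> responding.)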
>>
>> Does this type of failure cause data corruption?  What is the best course
>> of action at this point?
>>
>> Thanks,
>>
>> HB
>>
>> On Wed, Sep 11, 2019 at 11:58 PM Strahil <hunter86_bg at yahoo.com> wrote:
>>
>>> Hi Nithya,
>>>
>>> Thanks for the detailed explanation.
>>> It makes sense.
>>>
>>> Best Regards,
>>> Strahil Nikolov
>>> On Sep 12, 2019 08:18, Nithya Balachandran <nbalacha at redhat.com> wrote:
>>>
>>>
>>>
>>> On Wed, 11 Sep 2019 at 09:47, Strahil <hunter86_bg at yahoo.com> wrote:
>>>
>>> Hi Nithya,
>>>
>>> I was just recalling your previous e-mail, which left me with the
>>> impression that old volumes need that.
>>> This is the one I mean:
>>>
>>> >It looks like this is a replicate volume. If that is the case then yes,
>>> >you are running an old version of Gluster for which this was the default
>>> >behaviour.
>>>
>>>
>>> Hi Strahil,
>>>
>>> I'm providing a little more detail here which I hope will explain things.
>>> Rebalance has always been a volume-wide operation: a *rebalance start*
>>> operation starts rebalance processes on all nodes of the volume. However,
>>> different processes behave differently. In earlier releases, all nodes
>>> would crawl the bricks and update the directory layouts, but only one node
>>> in each replica/disperse set would actually migrate files, so the rebalance
>>> status would only show one node doing any "work" (scanning, rebalancing,
>>> etc.). That one node, however, processes all the files in its replica sets.
>>> Rerunning rebalance on other nodes makes no difference, as it will always
>>> be the same node that ends up migrating files.
>>> So, for instance, for a replicate volume with server1:/brick1,
>>> server2:/brick2 and server3:/brick3 in that order, only the rebalance
>>> process on server1 would migrate files. In newer releases, all 3 nodes
>>> migrate files.
>>>
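>>> (As an illustration, the brick ordering that defines the replica sets, and
>>> therefore which node ends up migrating files on those older releases, can
>>> be read from the volume definition, e.g.:)
>>>
>>> # gluster volume info tank
>>>
>>> (Consecutive groups of <replica count> bricks in that listing form the
>>> replica sets, and under the older behaviour it is the node hosting the
>>> first brick of each set that migrates the files.)
>>>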
>>> The rebalance status does not capture the directory operations of fixing
>>> layouts which is why it looks like the other nodes are not doing anything.
>>>
>>> Hope this helps.
>>>
>>> Regards,
>>> Nithya
>>>
>>> Best Regards,
>>> Strahil Nikolov
>>> On Sep 9, 2019 06:36, Nithya Balachandran <nbalacha at redhat.com> wrote:
>>>
>>>
>>>
>>> On Sat, 7 Sep 2019 at 00:03, Strahil Nikolov <hunter86_bg at yahoo.com>
>>> wrote:
>>>
>>> As was mentioned, you might have to run rebalance on the other node as
>>> well, but it is better to wait until this node is done.
>>>
>>>
>>> Hi Strahil,
>>>
>>> Rebalance does not need to be run on the other node - the operation is a
>>> volume-wide one. Only a single node per replica set will migrate files in
>>> the version used in this case.
>>>
>>> Regards,
>>> Nithya
>>>
>>> Best Regards,
>>> Strahil Nikolov
>>>
>>> On Friday, September 6, 2019 at 15:29:20 GMT+3, Herb Burnswell <
>>> herbert.burnswell at gmail.com>
>>>
>>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>
>