[Gluster-users] update to 4.1.6-1 and fix-layout failing

Mon Jan 7 15:18:31 UTC 2019

On Fri, 4 Jan 2019 at 17:10, mohammad kashif <kashif.alig at gmail.com> wrote:

> Hi Nithya
>
> The rebalance log has only these warnings:
>
> [2019-01-04 09:59:20.826261] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-atlasglust-client-5: error returned while attempting to connect to host:(null), port:0
> [2019-01-04 09:59:20.828113] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-atlasglust-client-6: error returned while attempting to connect to host:(null), port:0
> [2019-01-04 09:59:20.832017] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-atlasglust-client-4: error returned while attempting to connect to host:(null), port:0
>

Please send me the rebalance logs if possible. Are pplxgluster08 and
pplxgluster09 the newly added nodes? Are no directories being created on them?
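
If the full log is too large to send, a summary of the error and warning entries would already help. The following is only a rough sketch, not a Gluster tool: it assumes one entry per line in the layout visible in the excerpts quoted in this thread (timestamp, severity letter, optional MSGID, source location, domain, message). The names `summarize` and `summarize_file` are made up for illustration, and the path to the rebalance log is supplied by the caller.

```python
import re
from collections import Counter

# Entry layout as seen in the excerpts quoted in this thread (an assumption
# about the general format, not taken from Gluster documentation):
# [timestamp] SEVERITY [MSGID: n] [file:line:function] domain: message
LOG_ENTRY = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] "
    r"(?P<sev>[A-Z]) "
    r"(?:\[MSGID: (?P<msgid>\d+)\] )?"
    r"\[(?P<src>[^\]]+)\] "
    r"(?P<domain>\S+): (?P<msg>.*)"
)


def summarize(lines, severities=("E", "W")):
    """Count entries per (severity, function) for the given severities."""
    counts = Counter()
    for line in lines:
        m = LOG_ENTRY.match(line.strip())
        if m and m.group("sev") in severities:
            # src looks like "rpc-clnt.c:1753:rpc_clnt_submit"; keep the function.
            function = m.group("src").rsplit(":", 1)[-1]
            counts[(m.group("sev"), function)] += 1
    return counts


def summarize_file(path):
    """Hypothetical helper: summarize a log file given its path."""
    with open(path, errors="replace") as fh:
        return summarize(fh)
```

Calling `summarize_file(path).most_common()` lists the most frequent error and warning sources first. Lines of the form `The message "..." repeated N times` do not start with a timestamp, so this sketch does not count them.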

>
> gluster volume rebalance atlasglust status
>
> Node                              status                    run time in h:m:s
> ------------------------------    ----------------------    -----------------
> localhost                         fix-layout in progress    1:0:59
> pplxgluster02.physics.ox.ac.uk    fix-layout in progress    1:0:59
> pplxgluster03.physics.ox.ac.uk    fix-layout in progress    1:0:59
> pplxgluster04.physics.ox.ac.uk    fix-layout in progress    1:0:59
> pplxgluster05.physics.ox.ac.uk    fix-layout in progress    1:0:59
> pplxgluster06.physics.ox.ac.uk    fix-layout in progress    1:0:59
> pplxgluster07.physics.ox.ac.uk    fix-layout in progress    1:0:59
> pplxgluster08.physics.ox.ac.uk    fix-layout in progress    1:0:59
> pplxgluster09.physics.ox.ac.uk    fix-layout in progress    1:0:59
>
> But there has been no new entry in the logs for the last hour, and I can't
> see any new directories being created.
>
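
One way to put a number on "no new entry for the last hour" is to compare the timestamp of the last log entry with the current time. This is a minimal sketch under two assumptions: each entry begins with a bracketed timestamp as in the excerpts quoted in this thread, and `now` is given in the same timezone as the log timestamps. The function names are hypothetical.

```python
import re
from datetime import datetime

# Leading "[YYYY-MM-DD HH:MM:SS.ffffff]" timestamp, as in the quoted excerpts.
TIMESTAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\]")


def last_entry_time(lines):
    """Return the timestamp of the last timestamped line, or None."""
    last = None
    for line in lines:
        m = TIMESTAMP.match(line)
        if m:
            last = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
    return last


def seconds_since_last_entry(lines, now):
    """Seconds between the last log entry and `now` (same timezone assumed)."""
    last = last_entry_time(lines)
    if last is None:
        return None
    return (now - last).total_seconds()
```

A large value while `rebalance status` still reports "fix-layout in progress" matches the symptom described here, although a quiet log alone does not prove that the process is stuck.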
> Thanks
>
> Kashif
>
>
> On Fri, Jan 4, 2019 at 10:42 AM Nithya Balachandran <nbalacha at redhat.com>
> wrote:
>
>>
>>
>> On Fri, 4 Jan 2019 at 15:48, mohammad kashif <kashif.alig at gmail.com>
>> wrote:
>>
>>> Hi
>>>
>>> I have updated our distributed gluster storage from 3.12.9-1 to 4.1.6-1.
>>> The existing cluster had seven servers totalling around 450 TB. The OS is
>>> CentOS 7. The update went OK and I could access files.
>>> Then I added two more servers of 90 TB each to the cluster and started
>>> fix-layout:
>>>
>>> gluster volume rebalance atlasglust fix-layout start
>>>
>>> Some directories were created on the new servers, and then directory
>>> creation stopped, although rebalance status showed fix-layout as still
>>> running. I think it stopped creating new directories after this error:
>>>
>>> E [MSGID: 106061] [glusterd-utils.c:10697:glusterd_volume_rebalance_use_rsp_dict] 0-glusterd: failed to get index
>>> The message "E [MSGID: 106061] [glusterd-utils.c:10697:glusterd_volume_rebalance_use_rsp_dict] 0-glusterd: failed to get index" repeated 7 times between [2019-01-03 13:16:31.146779] and [2019-01-03 13:16:31.158612]
>>>
>>>
>>> There are also many warnings like this:
>>>
>>> [2019-01-03 16:04:34.120777] I [MSGID: 106499] [glusterd-handler.c:4314:__glusterd_handle_status_volume] 0-management: Received status volume req for volume atlasglust
>>> [2019-01-03 17:04:28.541805] W [rpc-clnt.c:1753:rpc_clnt_submit] 0-management: error returned while attempting to connect to host:(null), port:0
>>>
>>
>> These are the glusterd logs. Do you see any errors in the rebalance logs
>> for this volume?
>>
>>
>>> I waited for around 12 hours, then stopped fix-layout and started it
>>> again. I can see the same error again:
>>>
>>> [2019-01-04 09:59:20.825930] E [MSGID: 106061] [glusterd-utils.c:10697:glusterd_volume_rebalance_use_rsp_dict] 0-glusterd: failed to get index
>>> The message "E [MSGID: 106061] [glusterd-utils.c:10697:glusterd_volume_rebalance_use_rsp_dict] 0-glusterd: failed to get index" repeated 7 times between [2019-01-04 09:59:20.825930] and [2019-01-04 09:59:20.837068]
>>>
>>> Please advise, as this is our production service.
>>>
>>> At the moment, I have stopped clients from using the file system. Would it
>>> be OK to allow clients to access the file system while fix-layout is still
>>> running?
>>>
>>> Thanks
>>>
>>> Kashif
>>>
>>>
>>>
>>
>>