[Gluster-users] Getting timedout error while rebalancing
Sanju Rakonde
srakonde at redhat.com
Fri Feb 8 09:03:24 UTC 2019
Hi Deepu,
I can see multiple errors in the glusterd log.
[2019-02-06 13:22:21.012490] E
[glusterd-rpc-ops.c:1429:__glusterd_commit_op_cbk]
(-->/lib64/libgfrpc.so.0(+0xec20) [0x7f278d201c20]
-->/usr/lib64/glusterfs/4.1.7/xlator/mgmt/glusterd.so(+0x7762a)
[0x7f2781f1d62a]
-->/usr/lib64/glusterfs/4.1.7/xlator/mgmt/glusterd.so(+0x75213)
[0x7f2781f1b213] ) 0-: Assertion failed: rsp.op == txn_op_info.op
----> This error is repeated multiple times in the log.
[2019-02-06 11:16:32.474268] E [MSGID: 106218]
[glusterd-rebalance.c:460:glusterd_rebalance_cmd_validate] 0-glusterd:
Volume test-volume is not a distribute type or contains only 1 brick
[2019-02-06 11:16:32.474361] E [MSGID: 106301]
[glusterd-op-sm.c:4669:glusterd_op_ac_send_stage_op] 0-management: Staging
of operation 'Volume Rebalance' failed on localhost : Volume test-volume is
not a distribute volume or contains only 1 brick.
Not performing rebalance
[2019-02-06 13:18:35.253045] I [MSGID: 106482]
[glusterd-brick-ops.c:448:__glusterd_handle_add_brick] 0-management:
Received add brick req
[2019-02-06 13:18:35.253080] E [MSGID: 106026]
[glusterd-brick-ops.c:483:__glusterd_handle_add_brick] 0-management: Volume
192.168.185.xxx:/home/data/repl does not exist [Invalid argument]
----> Did the add-brick succeed?
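
To see how often each error repeats without reading the whole file, the log can be summarized by message ID and function name. The following is only a minimal sketch: it assumes each entry occupies a single line in the log file (unlike the wrapped excerpts above) and follows the "[timestamp] LEVEL [MSGID: n] [file:line:function]" layout seen in this thread. The names ENTRY and summarize_errors are made up for illustration.

```python
import re
from collections import Counter

# Layout assumed from the excerpts in this thread:
# [timestamp] LEVEL [MSGID: n] [file.c:line:function] ...
# The MSGID part is optional, as in the "Assertion failed" entry.
ENTRY = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"(?:\[MSGID:\s*(?P<msgid>\d+)\]\s+)?"
    r"\[(?P<src>[^\]]+)\]"
)


def summarize_errors(lines):
    """Count error-level (E) entries per (msgid, function) pair."""
    counts = Counter()
    for line in lines:
        match = ENTRY.match(line)
        if match and match.group("level") == "E":
            # src looks like "glusterd-rebalance.c:460:function_name"
            function = match.group("src").split(":")[-1]
            counts[(match.group("msgid"), function)] += 1
    return counts
```

It could be fed an open log file, for example summarize_errors(open(path)), and the result sorted with most_common() so that the frequently repeated errors come first.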
It is difficult to confirm anything by looking only at the glusterd log.
Please share the glusterd, cli and cmd_history logs from all the nodes, and
also provide the output of the commands below.
1. gluster --version
2. gluster vol info
3. gluster vol status
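
The three outputs can be gathered into one report per node with a small script. This is a sketch under assumptions, not an official tool: the helper name collect and the 150-second limit are arbitrary choices for illustration, and it has to be run on every node by a user who is allowed to invoke the gluster CLI. The log files themselves still need to be copied separately (they are usually found under /var/log/glusterfs/, but please verify the location on your installation).

```python
import subprocess

# The three commands requested in this thread.
COMMANDS = [
    ["gluster", "--version"],
    ["gluster", "vol", "info"],
    ["gluster", "vol", "status"],
]


def collect(runner=subprocess.run):
    """Run each command and return the combined output as one text report."""
    sections = []
    for argv in COMMANDS:
        try:
            # timeout=150 is an arbitrary illustrative limit
            result = runner(argv, capture_output=True, text=True, timeout=150)
            body = result.stdout + result.stderr
        except (OSError, subprocess.TimeoutExpired) as exc:
            # For example: gluster is not installed, or the call hung
            body = f"could not run: {exc}\n"
        sections.append(f"$ {' '.join(argv)}\n{body}")
    return "\n".join(sections)
```

Calling print(collect()) on a node prints the report, which can then be attached to the reply together with the logs.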
Thanks,
Sanju
On Thu, Feb 7, 2019 at 1:26 AM deepu srinivasan <sdeepugd at gmail.com> wrote:
> Please find the glusterd.log file attached.
>
> On Wed, Feb 6, 2019 at 2:01 PM Atin Mukherjee <amukherj at redhat.com> wrote:
>
>>
>>
>> On Tue, Feb 5, 2019 at 8:43 PM Nithya Balachandran <nbalacha at redhat.com>
>> wrote:
>>
>>>
>>>
>>> On Tue, 5 Feb 2019 at 17:26, deepu srinivasan <sdeepugd at gmail.com>
>>> wrote:
>>>
>>>> Hi Nithya,
>>>> We have a test gluster setup and are testing the rebalance option of
>>>> gluster. We started with a 1x3 replicated volume that had some data on
>>>> it.
>>>> command : gluster volume create test-volume replica 3
>>>> 192.168.xxx.xx1:/home/data/repl 192.168.xxx.xx2:/home/data/repl
>>>> 192.168.xxx.xx3:/home/data/repl
>>>>
>>>> Now we tried to expand the cluster storage by adding three more bricks.
>>>> command : gluster volume add-brick test-volume 192.168.xxx.xx4:/home/data/repl
>>>> 192.168.xxx.xx5:/home/data/repl 192.168.xxx.xx6:/home/data/repl
>>>>
>>>> So after the brick addition we tried to rebalance the layout and the
>>>> data.
>>>> command : gluster volume rebalance test-volume fix-layout start
>>>> The command exited with status "Error : Request timed out".
>>>>
>>>
>>> This sounds like an error in the cli or glusterd. Can you send the
>>> glusterd.log from the node on which you ran the command?
>>>
>>
>> It seems to me that glusterd took more than 120 seconds to process the
>> command, and hence the cli timed out. We can confirm this by checking the
>> rebalance status below, which indicates that the rebalance did kick in and
>> eventually completed. We need to understand why it took so long, so
>> please pass on the cli and glusterd logs from all the nodes, as Nithya
>> requested.
>>
>>
>>> regards,
>>> Nithya
>>>
>>>>
>>>> After the failure of the command, we tried to view the status of the
>>>> command and it is something like this :
>>>>
>>>> Node             Rebalanced-files  size     scanned  failures  skipped  status     run time in h:m:s
>>>> ---------------  ----------------  -------  -------  --------  -------  ---------  -----------------
>>>> localhost        41                41.0MB   8200     0         0        completed  0:00:09
>>>> 192.168.xxx.xx4  79                79.0MB   8231     0         0        completed  0:00:12
>>>> 192.168.xxx.xx6  58                58.0MB   8281     0         0        completed  0:00:10
>>>> 192.168.xxx.xx2  136               136.0MB  8566     0         136      completed  0:00:07
>>>> 192.168.xxx.xx4  129               129.0MB  8566     0         129      completed  0:00:07
>>>> 192.168.xxx.xx6  201               201.0MB  8566     0         201      completed  0:00:08
>>>>
>>>> Is the rebalance working fine? Why did gluster throw the
>>>> error "Error : Request timed out"?
>>>>
>>>> On Tue, Feb 5, 2019 at 4:23 PM Nithya Balachandran <
>>>> nbalacha at redhat.com> wrote:
>>>>
>>>>> Hi,
>>>>> Please provide the exact step at which you are seeing the error. It
>>>>> would be ideal if you could copy-paste the command and the error.
>>>>>
>>>>> Regards,
>>>>> Nithya
>>>>>
>>>>>
>>>>>
>>>>> On Tue, 5 Feb 2019 at 15:24, deepu srinivasan <sdeepugd at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi everyone. I am getting "Error : Request timed out" while doing a
>>>>>> rebalance. I have added new bricks to my replicated volume, i.e. it was
>>>>>> first a 1x3 volume and I added three more bricks to make it a
>>>>>> distributed-replicated volume (2x3). What should I do about the timeout
>>>>>> error?
>>>>>> _______________________________________________
>>>>>> Gluster-users mailing list
>>>>>> Gluster-users at gluster.org
>>>>>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>>>>
--
Thanks,
Sanju