[Gluster-users] random disconnects of peers
Strahil Nikolov
hunter86_bg at yahoo.com
Sun Sep 18 22:11:53 UTC 2022
By the way, try to capture the traffic on the systems and compare whether only specific packets fail to reach the destination.
Overall, Jumbo Frames won't give you a double-digit percentage improvement, so in your case I would switch to an MTU of 1500.
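For example, something along these lines on both the hypervisor and the storage node would let you compare the two captures afterwards (the interface name and peer IP are just placeholders; 24007 is the glusterd management port, the bricks listen on their own ports):

tcpdump -i eno1 -s 0 -w /tmp/gluster-peer.pcap host 10.0.0.11 and tcp port 24007

If large frames show up in the capture on one side but never arrive on the other, you have found the device that drops them.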
Best Regards,
Strahil Nikolov
I already updated the firmware of the NICs a few weeks ago. The switch
firmware is up to date. I already changed the whole switch to a
completely different model (maybe 6 months ago) without any effect.
There are no other systems attached to the switch that use jumbo
frames.
On 18.09.2022 21:07, Strahil Nikolov wrote:
> We are currently shooting in the dark...
> If possible, update the firmware of the NICs and of the switch.
>
> Have you checked whether other systems (on the same switch) have
> issues with Jumbo Frames?
>
> Best Regards,
> Strahil Nikolov
>
>> Yes, I did test the ping with a jumbo frame MTU and it worked
>> without problems. There is no firewall between the storage nodes and
>> the hypervisors. They are using the same layer 2 subnet, so there is
>> only the switch in between. On the switch, jumbo frames are enabled
>> for the specific VLAN.
>>
>> I also increased the tx and rx queue lengths, without success in
>> relation to the problem.
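>> For reference, the sort of commands involved (interface name and
>> values are only illustrative):
>>
>>   ip link set eno1 txqueuelen 10000
>>   ethtool -G eno1 rx 4096 tx 4096
>>
>> ethtool -g eno1 shows which ring sizes the NIC actually accepts.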
>>
>> On 17.09.2022 10:39, Strahil Nikolov wrote:
>>> Usually this kind of problem can originate in many places.
>>> When you set the MTU to 9000, did you test with ping and the "Do
>>> not fragment" flag?
>>>
>>> If there is a device on the path that is not configured for (or
>>> doesn't support) MTU 9000, it will fragment all packets, and that
>>> can lead to excessive CPU consumption on the device. I have seen
>>> many firewalls that do not use JF by default.
>>>
>>> ping <IP/HOSTNAME from brick definition> -M do -s 8972
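>>>
>>> (8972 bytes of ICMP payload plus the 20-byte IPv4 header and the
>>> 8-byte ICMP header add up to exactly 9000, so with "-M do" the ping
>>> fails outright on any hop that cannot pass 9000-byte frames instead
>>> of silently fragmenting.)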
>>>
>>> Best Regards,
>>> Strahil Nikolov
>>>
>>> On Friday, 16 September 2022 at 22:24:14 GMT+3, Gionatan Danti
>>> <g.danti at assyoma.it> wrote:
>>>
>>> On 2022-09-16 18:41, dpgluster at posteo.de wrote:
>>>> I have made extensive load tests in the last few days and figured
>>>> out that it's definitely a network-related issue. I changed from
>>>> jumbo frames (MTU 9000) to the default MTU of 1500. With an MTU of
>>>> 1500 the problem doesn't occur. I'm able to push the io-wait of
>>>> our Gluster storage servers to the maximum values the disks allow
>>>> without any error or connection loss between the hypervisors or
>>>> the storage nodes.
>>>>
>>>> As mentioned in multiple Gluster best-practice guides, it's
>>>> recommended to use jumbo frames in Gluster setups for better
>>>> performance. So I would like to use jumbo frames in my datacenter.
>>>>
>>>> What could be the issue here?
>>>
>>> I would try with a jumbo frame setting of 4074 (or 4088) bytes.
>>>
>>> Regards.
>>>
>>> --
>>> Danti Gionatan
>>> Supporto Tecnico
>>> Assyoma S.r.l. - www.assyoma.it
>>> email: g.danti at assyoma.it - info at assyoma.it
>>> GPG public key ID: FF5F32A8