[Gluster-users] random disconnects of peers

Strahil Nikolov hunter86_bg at yahoo.com
Sun Sep 18 22:11:53 UTC 2022


By the way, try to capture the traffic on both systems and compare whether only specific packets fail to reach the destination.
Overall, jumbo frames won't give you a double-digit improvement, so in your case I would switch to an MTU of 1500.
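A minimal sketch of such a capture (interface name, peer address and output path are only examples) would be to run on each node:

tcpdump -i eth0 -s 0 -w /tmp/gluster-peer.pcap host <peer-ip>

and then compare the two capture files in wireshark/tshark to see whether large frames leave one host but never arrive on the other.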
Best Regards,
Strahil Nikolov
 
 
    I already updated the firmware of the NICs a few weeks ago. The switch 
firmware is up to date. I had already swapped the switch for a completely 
different model (maybe 6 months ago) without any effect. There are no 
other systems attached to the switch that use jumbo frames.
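
(For reference, the NIC driver and firmware versions in use can be checked on each node with something like "ethtool -i eth0", where the interface name is just an example; the output includes the driver, version and firmware-version fields.)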


On 18.09.2022 at 21:07, Strahil Nikolov wrote:
> We are currently shooting in the dark...
> If possible, update the firmware of the NICs and the firmware of the switch.
> 
> Have you checked whether other systems (on the same switch) have issues
> with jumbo frames?
> 
> Best Regards,
> Strahil Nikolov
> 
>> Yes, I did test the ping with a jumbo frame MTU and it worked without
>> problems. There is no firewall between the storage nodes and the
>> hypervisors. They are on the same layer 2 subnet, so there is only
>> the switch in between. On the switch, jumbo frames are enabled for
>> the specific VLAN.
>> 
>> I also increased the tx and rx queue lengths, without success in
>> relation to the problem.
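>> 
>> (For reference, the queue lengths can be adjusted with something like
>> the following; the interface name and sizes are examples only:
>> 
>> ip link set dev eth0 txqueuelen 10000
>> ethtool -G eth0 rx 4096 tx 4096
>> 
>> i.e. a longer software transmit queue and larger NIC ring buffers.)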
>> 
>> On 17.09.2022 at 10:39, Strahil Nikolov wrote:
>>> Usually this kind of problem can have many causes.
>>> When you set the MTU to 9000, did you test with ping and the "Do
>>> not fragment" flag?
>>> 
>>> If there is a device on the path that is not configured for (or
>>> doesn't support) an MTU of 9000, it will fragment all packets, and
>>> that can lead to excessive CPU consumption on that device. I have
>>> seen many firewalls that do not use jumbo frames by default.
>>> 
>>> ping <IP/HOSTNAME from brick definition> -M do -s 8972
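>>> 
>>> (8972 bytes of ICMP payload plus 8 bytes of ICMP header and 20 bytes
>>> of IPv4 header add up to exactly 9000 bytes, so with -M do the ping
>>> only succeeds if the whole path really carries 9000-byte frames.)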
>>> 
>>> Best Regards,
>>> Strahil Nikolov
>>> 
>>> On Friday, 16 September 2022 at 22:24:14 GMT+3, Gionatan Danti
>>> <g.danti at assyoma.it> wrote:
>>> 
>>> On 2022-09-16 at 18:41, dpgluster at posteo.de wrote:
>>>> I have made extensive load tests in the last few days and figured
>>>> out that it's definitely a network-related issue. I changed from
>>>> jumbo frames (MTU 9000) to the default MTU of 1500. With an MTU of
>>>> 1500 the problem doesn't occur. I'm able to push the io-wait of our
>>>> gluster storage servers to the maximum the disks can sustain without
>>>> any error or connection loss between the hypervisors and the storage
>>>> nodes.
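>>>> 
>>>> (For reference, such a test change is typically done with something
>>>> like "ip link set dev eth0 mtu 1500"; the interface name is only an
>>>> example, and the setting also has to be made persistent in the
>>>> distro's network configuration to survive a reboot.)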
>>>> 
>>>> As mentioned in multiple Gluster best-practice guides, jumbo frames
>>>> are recommended in Gluster setups for better performance. So I would
>>>> like to use jumbo frames in my datacenter.
>>>> 
>>>> What could be the issue here?
>>> 
>>> I would try with a jumbo frame setting of 4074 (or 4088) bytes.
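>>> 
>>> (If you test that, the matching unfragmented ping for an MTU of 4074
>>> would be "ping <IP/HOSTNAME> -M do -s 4046", since the ICMP payload
>>> is the MTU minus 28 bytes of IPv4 and ICMP headers.)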
>>> 
>>> Regards.
>>> 
>>> --
>>> Danti Gionatan
>>> Supporto Tecnico
>>> Assyoma S.r.l. - www.assyoma.it
>>> email: g.danti at assyoma.it - info at assyoma.it
>>> GPG public key ID: FF5F32A8
  

