[Gluster-users] Upgrade 5.5 -> 5.6: network traffic bug fixed?

Hu Bert revirii at googlemail.com
Mon Apr 29 07:17:06 UTC 2019


Good morning,

back in the office... ;-) I reactivated quick-read on both volumes and
watched the traffic, which now looks normal. I did umount/mount both
gluster volumes after installing the 5.5 -> 5.6 upgrade, but it seems
that this wasn't enough? The change only took effect after a reboot
(kernel update...) of all clients - maybe some old client processes
were still running?
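For reference, a dry-run sketch of the kind of full client restart the fix seems to need (server, volume, and mount point names below are made-up examples; the pkill line is only relevant if a stale glusterfs client process survives the umount):

```shell
#!/bin/sh
# Hypothetical example names -- adjust to your own setup.
SERVER="gluster1"
VOL="gv0"
MNT="/shared/$VOL"

# Print the steps instead of executing them, so the sequence can be
# reviewed before it is run on a live client.
remount_steps() {
    echo "gluster volume set $VOL performance.quick-read on"
    echo "umount $MNT"
    echo "pkill -f glusterfs.*$MNT"              # kill any leftover client process
    echo "mount -t glusterfs $SERVER:/$VOL $MNT" # fresh process picks up the fix
}

remount_steps
```

Removing the echoes would execute the sequence for real; the point is that a plain remount keeps nothing of the old process, whereas a mount reused by a still-running glusterfs process would keep the pre-fix code loaded.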

I'll keep watching the network traffic and report back if I see it
going higher than usual.
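In case it helps anyone watching for the same symptom: a quick way to sample throughput between munin intervals is to read the byte counters from /proc/net/dev (Linux-only sketch; the interface name and sampling window in the usage below are arbitrary examples):

```shell
#!/bin/sh
# Print the received-bytes counter of one interface from /proc/net/dev.
# The field split also copes with fused "eth0:12345"-style lines.
rx_bytes() {
    awk -v ifc="$1" -F'[: ]+' '{ sub(/^ +/, ""); if ($1 == ifc) print $2 }' /proc/net/dev
}

# Sample twice, N seconds apart, and print the average receive rate in MBit/s.
rx_rate_mbit() {
    a=$(rx_bytes "$1"); sleep "$2"; b=$(rx_bytes "$1")
    echo $(( (b - a) * 8 / $2 / 1000000 ))
}
```

For example, `rx_rate_mbit eth0 10` prints a ~10-second average; values far above the usual baseline (<20 MBit/s here) would show the same spike as the munin graphs.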


Best regards,
Hubert

On Tue, Apr 23, 2019 at 15:34, Poornima Gurusiddaiah
<pgurusid at redhat.com> wrote:
>
> Hi,
>
> Thank you for the update, sorry for the delay.
>
> I did some more tests, but couldn't reproduce the spike in network bandwidth usage with quick-read enabled. Have you remounted the clients after upgrading? The fix will not take effect until the client process is restarted.
> If you have already restarted the client processes, then something in the live system's workload must be triggering a bug in quick-read. A wireshark capture, if possible, would help to debug this further.
>
> Regards,
> Poornima
>
> On Tue, Apr 16, 2019 at 6:25 PM Hu Bert <revirii at googlemail.com> wrote:
>>
>> Hi Poornima,
>>
>> thx for your efforts. I ran a couple of tests and the results are the
>> same, so the options are not related. Still, I'm not able to
>> reproduce the problem on my testing system, although the volume
>> options are the same.
>>
>> About 1.5 hours ago I set performance.quick-read to on again and
>> watched: load/iowait went up (not bad at the moment, little traffic),
>> and network traffic went up as well - from <20 MBit/s to 160 MBit/s.
>> After deactivating quick-read, traffic dropped to <20 MBit/s again.
>>
>> munin graph: https://abload.de/img/network-client4s0kle.png
>>
>> The 2nd peak is from the last test.
>>
>>
>> Thx,
>> Hubert
>>
>> On Tue, Apr 16, 2019 at 09:43, Hu Bert <revirii at googlemail.com> wrote:
>> >
>> > In my first test on my testing setup the traffic was at a normal
>> > level, so I thought I was "safe". But on my live system the network
>> > traffic was a multiple of what one would expect.
>> > performance.quick-read was enabled in both setups; the only
>> > differences in the volume options between live and testing are:
>> >
>> > performance.read-ahead: testing on, live off
>> > performance.io-cache: testing on, live off
>> >
>> > I ran another test on my testing setup: deactivated both options and
>> > copied 9 GB of data. Now the traffic went up there as well, from
>> > ~9-10 MBit/s before to up to 100 MBit/s with both options off. Does
>> > performance.quick-read require one of those options set to 'on'?
>> >
>> > I'll start another test shortly and activate one of those 2 options;
>> > maybe there's a connection between those 3 options?
>> >
>> >
>> > Best Regards,
>> > Hubert
>> >
>> > On Tue, Apr 16, 2019 at 08:57, Poornima Gurusiddaiah
>> > <pgurusid at redhat.com> wrote:
>> > >
>> > > Thank you for reporting this. I had done testing on my local setup and the issue was resolved even with quick-read enabled. Let me test it again.
>> > >
>> > > Regards,
>> > > Poornima
>> > >
>> > > On Mon, Apr 15, 2019 at 12:25 PM Hu Bert <revirii at googlemail.com> wrote:
>> > >>
>> > >> fyi: after setting performance.quick-read to off, network traffic
>> > >> dropped to normal levels, and client load/iowait went back to normal as well.
>> > >>
>> > >> client: https://abload.de/img/network-client-afterihjqi.png
>> > >> server: https://abload.de/img/network-server-afterwdkrl.png
>> > >>
>> > >> On Mon, Apr 15, 2019 at 08:33, Hu Bert <revirii at googlemail.com> wrote:
>> > >> >
>> > >> > Good Morning,
>> > >> >
>> > >> > today I updated my replica 3 setup (debian stretch) from version 5.5
>> > >> > to 5.6, as I thought the network traffic bug (#1673058) was fixed and
>> > >> > I could re-activate 'performance.quick-read'. See release notes:
>> > >> >
>> > >> > https://review.gluster.org/#/c/glusterfs/+/22538/
>> > >> > http://git.gluster.org/cgit/glusterfs.git/commit/?id=34a2347780c2429284f57232f3aabb78547a9795
>> > >> >
>> > >> > The upgrade went fine, and then I watched iowait and network traffic.
>> > >> > It seems that the network traffic went up after the upgrade and the
>> > >> > reactivation of performance.quick-read. Here are some graphs:
>> > >> >
>> > >> > network client1: https://abload.de/img/network-clientfwj1m.png
>> > >> > network client2: https://abload.de/img/network-client2trkow.png
>> > >> > network server: https://abload.de/img/network-serverv3jjr.png
>> > >> >
>> > >> > gluster volume info: https://pastebin.com/ZMuJYXRZ
>> > >> >
>> > >> > Just wondering if the network traffic bug really got fixed or if this
>> > >> > is a new problem. I'll wait a couple of minutes and then deactivate
>> > >> > performance.quick-read again, just to see if the network traffic goes
>> > >> > down to normal levels.
>> > >> >
>> > >> >
>> > >> > Best regards,
>> > >> > Hubert
>> > >> _______________________________________________
>> > >> Gluster-users mailing list
>> > >> Gluster-users at gluster.org
>> > >> https://lists.gluster.org/mailman/listinfo/gluster-users
