[Gluster-users] GFS performance under heavy traffic

David Cunningham dcunningham at voisonics.com
Mon Dec 23 00:09:43 UTC 2019


Hi Strahil,

Thanks for that. We do have one backup server specified, but will add the
second backup as well.
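
For reference, the updated fstab entry will probably look something like
this ("gfs3" is just a placeholder for the third node's hostname):

gfs1:/gvol0 /mnt/glusterfs/ glusterfs defaults,direct-io-mode=disable,_netdev,backupvolfile-server=gfs2:gfs3,fetch-attempts=10 0 0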


On Sat, 21 Dec 2019 at 11:26, Strahil <hunter86_bg at yahoo.com> wrote:

> Hi David,
>
> Also consider using the mount option to specify backup servers via
> 'backupvolfile-server=server2:server3' (you can define more, but I don't
> think replica volumes greater than 3 are useful, except maybe in some
> special cases).
>
> That way, when the primary is lost, your client can reach a backup
> server without disruption.
>
> P.S.: The client may 'hang' if the primary server was rebooted
> ungracefully, as the communication must time out before FUSE switches to
> the next server. There is a special script for killing gluster processes
> in '/usr/share/gluster/scripts' which can be used to set up a systemd
> service that does that for you on shutdown.
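>
> A rough sketch of such a unit (the exact script name and location may
> differ between versions - check the scripts directory shipped with your
> glusterfs packages):
>
> # /etc/systemd/system/stop-gluster-on-shutdown.service
> [Unit]
> Description=Kill Gluster processes cleanly at shutdown
> After=network.target
>
> [Service]
> Type=oneshot
> RemainAfterExit=yes
> ExecStart=/bin/true
> # ExecStop runs at shutdown, before the network is torn down
> ExecStop=/usr/share/glusterfs/scripts/stop-all-gluster-processes.sh
>
> [Install]
> WantedBy=multi-user.target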
>
> Best Regards,
> Strahil Nikolov
> On Dec 20, 2019 23:49, David Cunningham <dcunningham at voisonics.com> wrote:
>
> Hi Strahil,
>
> Ah, that is an important point. One of the nodes is not accessible from
> the client, and we assumed that it only needed to reach the GFS node that
> was mounted, so we didn't think anything of it.
>
> We will try making all nodes accessible, as well as setting
> "direct-io-mode=disable".
>
> Thank you.
>
>
> On Sat, 21 Dec 2019 at 10:29, Strahil Nikolov <hunter86_bg at yahoo.com>
> wrote:
>
> Actually, I haven't made myself clear.
> A FUSE mount on the client side connects directly to all of the bricks
> that make up the volume.
> If for some reason (bad routing, a firewall block) the client can reach
> only 2 out of 3 bricks, this constantly triggers healing (as one of the
> bricks is never updated), which degrades performance and causes excessive
> network usage.
> As your attachment is from one of the gluster nodes, this could be the
> case.
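>
> A quick way to check whether heals are constantly being queued is to run
> the following on one of the gluster nodes and watch whether the entry
> counts keep growing ('gvol0' being the volume name):
>
> gluster volume heal gvol0 info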
>
> Best Regards,
> Strahil Nikolov
>
> On Friday, 20 December 2019 at 01:49:56 GMT+2, David Cunningham <
> dcunningham at voisonics.com> wrote:
>
>
> Hi Strahil,
>
> The chart attached to my original email is taken from the GFS server.
>
> I'm not sure what you mean by accessing all bricks simultaneously. We've
> mounted it from the client like this:
> gfs1:/gvol0 /mnt/glusterfs/ glusterfs
> defaults,direct-io-mode=disable,_netdev,backupvolfile-server=gfs2,fetch-attempts=10
> 0 0
>
> Should we do something different to access all bricks simultaneously?
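>
> For what it's worth, here is a rough way to list which bricks this client
> currently has TCP connections to (assuming the FUSE client process shows
> up as 'glusterfs'):
>
> ss -tnp | grep glusterfs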
>
> Thanks for your help!
>
>
> On Fri, 20 Dec 2019 at 11:47, Strahil Nikolov <hunter86_bg at yahoo.com>
> wrote:
>
> I'm not sure whether you measured the traffic from the client side
> (tcpdump on a client machine) or from the server side.
>
> In both cases, please verify that the client accesses all bricks
> simultaneously, as failing to do so can cause unnecessary heals.
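>
> For example, something like this on the client should capture both the
> management traffic and the brick traffic (the interface name and the
> brick port range are typical defaults and may need adjusting):
>
> tcpdump -ni eth0 port 24007 or portrange 49152-49251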
>
> Have you thought about upgrading to v6? There are some enhancements in v6
> which could be beneficial.
>
> Yet, it is indeed strange that so much traffic is generated with FUSE.
>
> Another approach is to test with NFS-Ganesha, which supports pNFS and can
> natively speak with Gluster; that can bring you closer to the previous
> setup and also provide some extra performance.
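>
> If you go that way, a Gluster export in ganesha.conf looks roughly like
> this (the volume and host names are only examples):
>
> EXPORT {
>     Export_Id = 1;
>     Path = "/gvol0";
>     Pseudo = "/gvol0";
>     Access_Type = RW;
>     FSAL {
>         Name = GLUSTER;
>         Hostname = "gfs1";
>         Volume = "gvol0";
>     }
> }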
>
>
> Best Regards,
> Strahil Nikolov
>
>
>
>

-- 
David Cunningham, Voisonics Limited
http://voisonics.com/
USA: +1 213 221 1092
New Zealand: +64 (0)28 2558 3782