[Gluster-users] performance.cache-size for high-RAM clients/servers, other tweaks for performance, and improvements to Gluster docs
vladkopy at gmail.com
Tue Apr 10 05:38:51 UTC 2018
You definitely need to add mount options in /etc/fstab.
Use the ones from here
I also went with local mounts to get better performance.
Also, the 3.12 or 3.10 branches would be preferable for production.
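As a rough illustration of what mount options look like on a GlusterFS FUSE line in /etc/fstab, the volume entry from the original post could be extended as below. The options themselves (backup-volfile-servers, direct-io-mode) are real mount.glusterfs options, but the values chosen here are assumptions for the sake of example. They are not necessarily the ones referred to above, and they are not tested recommendations.

```
# /etc/fstab (sketch; option values are assumptions, test before relying on them)
# backup-volfile-servers: other nodes to fetch the volfile from if the local glusterd is down
# direct-io-mode: set explicitly instead of leaving it at the default
localhost:/apkmirror_data1 /mnt/apkmirror_data1 glusterfs defaults,_netdev,backup-volfile-servers=nexus2:forge:hive,direct-io-mode=disable 0 0
```

Changed options only take effect on the next mount, so the volume has to be unmounted and mounted again after editing the file.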
On Fri, Apr 6, 2018 at 4:12 AM, Artem Russakovskii <archon810 at gmail.com> wrote:
> Hi again,
> I'd like to expand on the performance issues and plead for help. Here's
> one case which shows these odd hiccups: https://i.imgur.com/CXBPjTK.gifv.
> In this GIF where I switch back and forth between copy operations on 2
> servers, I'm copying a 10GB dir full of .apk and image files.
> On server "hive" I'm copying straight from the main disk to an attached
> volume block (xfs). As you can see, the transfers are relatively speedy and
> don't hiccup.
> On server "citadel" I'm copying the same set of data to a 4-way replicated
> Gluster volume which uses block storage for its bricks. As you can see,
> performance is much worse, and there are frequent pauses of many seconds
> where nothing seems to be happening - just freezes.
> All 4 servers have the same specs, and all of them have performance issues
> with gluster and no such issues when raw xfs block storage is used.
> hive has long finished copying the data, while citadel is barely chugging
> along and is expected to take probably half an hour to an hour. I have over
> 1TB of data to migrate, at which point if we went live, I'm not even sure
> gluster would be able to keep up instead of bringing the machines and
> services down.
> Here's the cluster config, though performance seemed no different before
> vs. after I applied the customizations.
> Volume Name: apkmirror_data1
> Type: Replicate
> Volume ID: 11ecee7e-d4f8-497a-9994-ceb144d6841e
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 4 = 4
> Transport-type: tcp
> Brick1: nexus2:/mnt/nexus2_block1/apkmirror_data1
> Brick2: forge:/mnt/forge_block1/apkmirror_data1
> Brick3: hive:/mnt/hive_block1/apkmirror_data1
> Brick4: citadel:/mnt/citadel_block1/apkmirror_data1
> Options Reconfigured:
> cluster.quorum-count: 1
> cluster.quorum-type: fixed
> network.ping-timeout: 5
> network.remote-dio: enable
> performance.rda-cache-limit: 256MB
> performance.readdir-ahead: on
> performance.parallel-readdir: on
> network.inode-lru-limit: 500000
> performance.md-cache-timeout: 600
> performance.cache-invalidation: on
> performance.stat-prefetch: on
> features.cache-invalidation-timeout: 600
> features.cache-invalidation: on
> cluster.readdir-optimize: on
> performance.io-thread-count: 32
> server.event-threads: 4
> client.event-threads: 4
> performance.read-ahead: off
> cluster.lookup-optimize: on
> performance.cache-size: 1GB
> cluster.self-heal-daemon: enable
> transport.address-family: inet
> nfs.disable: on
> performance.client-io-threads: on
> The mounts are done as follows in /etc/fstab:
> /dev/disk/by-id/scsi-0Linode_Volume_citadel_block1 /mnt/citadel_block1 xfs defaults 0 2
> localhost:/apkmirror_data1 /mnt/apkmirror_data1 glusterfs defaults,_netdev 0 0
> I'm really not sure if direct-io-mode mount tweaks would do anything here,
> what the value should be set to, and what it is by default.
> The OS is OpenSUSE 42.3, 64-bit. 80GB of RAM, 20 CPUs, hosted by Linode.
> I'd really appreciate any help in the matter.
> Thank you.
> Founder, Android Police <http://www.androidpolice.com>, APK Mirror
> <http://www.apkmirror.com/>, Illogical Robot LLC
> beerpla.net | +ArtemRussakovskii
> <https://plus.google.com/+ArtemRussakovskii> | @ArtemR
> On Thu, Apr 5, 2018 at 11:13 PM, Artem Russakovskii <archon810 at gmail.com> wrote:
>> I'm trying to squeeze performance out of gluster on 4 80GB RAM 20-CPU
>> machines where Gluster runs on attached block storage (Linode) in 4
>> replicate bricks, and so far everything I tried results in sub-optimal
>> performance.
>> There are many files - mostly images, several million - and many
>> operations take minutes, copying multiple files (even if they're small)
>> suddenly freezes up for seconds at a time, then continues, iostat
>> frequently shows large r_await and w_awaits with 100% utilization for the
>> attached block device, etc.
>> But anyway, there are many guides out there for small-file performance
>> improvements, but more explanation is needed, and I think more tweaks
>> should be possible.
>> My question today is about performance.cache-size. Is this the size of a
>> cache in RAM? If so, how do I view the current cache size to see whether it
>> gets full and whether I should increase it? Is it advisable to bump it up
>> if I have many tens of gigs of RAM free?
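On the cache-size question: as far as I know, performance.cache-size sets the size of the in-memory read cache used by the io-cache translator (quick-read reads the same key), so it does live in RAM, in the client process. Below is a minimal sketch for checking what is configured and for peeking at client-side cache state. It assumes the volume name from your output. The function names are made up for this example, and the statedump path and grep pattern are my assumptions about a typical install, since section names can differ between versions.

```shell
#!/bin/sh
# Sketch only: volume name taken from the post; everything else is an assumption.
VOL="apkmirror_data1"

show_cache_settings() {
    # Bail out quietly on machines without the gluster CLI.
    if ! command -v gluster >/dev/null 2>&1; then
        echo "gluster CLI not found; nothing to show"
        return 0
    fi
    # Show the value currently in effect (default or reconfigured).
    gluster volume get "$VOL" performance.cache-size
    # List every cache-related option with its current value.
    gluster volume get "$VOL" all | grep -i cache
}

dump_client_state() {
    # $1 is the PID of the glusterfs FUSE mount process
    # (one way to find it: pgrep -af glusterfs).
    # SIGUSR1 asks the process to write a statedump.
    kill -USR1 "$1"
    # Give the process a moment to finish writing the dump.
    sleep 1
    # Dumps normally land in /var/run/gluster; show the io-cache sections.
    grep -A5 "io-cache" /var/run/gluster/glusterdump."$1".dump.* 2>/dev/null
}

show_cache_settings
```

If the configured size turns out to be small relative to the working set, `gluster volume set apkmirror_data1 performance.cache-size <size>` is the knob to raise it. Whether that helps depends on how read-heavy the workload is, so it is worth measuring before and after.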
>> More generally, in the last 2 months since I first started working with
>> gluster and set a production system live, I've been feeling frustrated
>> because Gluster has a lot of poorly-documented and confusing options. I
>> really wish the documentation could be improved with examples and better
>> explanations.
>> Specifically, it'd be absolutely amazing if the docs offered a strategy
>> for setting each value and ways of determining more optimal values. For
>> example, for performance.cache-size, if it said something like "run command
>> abc to see your current cache size, and if it's hurting, up it, but be
>> aware that it's limited by RAM," it'd be already a huge improvement to the
>> docs. And so on with other options.
>> The gluster team is quite helpful on this mailing list, but in a reactive
>> rather than proactive way. Perhaps it's tunnel vision once you've worked on
>> a project for so long where less technical explanations and even proper
>> documentation of options takes a back seat, but I encourage you to be more
>> proactive about helping us understand and optimize Gluster.
>> Thank you.