[Gluster-users] Blocking IO when hot tier promotion daemon runs

Tue Jan 9 16:21:27 UTC 2018

I've recently enabled an SSD-backed 2 TB hot tier on my 150 TB
distributed-replicated volume (2 servers, 3 bricks per server).

I'm seeing IO get blocked across all client FUSE threads for 10 to 15
seconds while the promotion daemon runs. When these delays occur, the
'glustertierpro' thread jumps to 99% CPU usage on both servers. They happen
every 25 minutes, which matches my tier-promote-frequency setting of 1500
seconds.
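
A minimal sketch of one way to time such stalls from a client, not something
taken from my actual setup: it times a single small write on the mount and
prints the elapsed whole seconds. The MOUNT variable, the /mnt/gv0 path in the
usage note and the .io_probe file name are hypothetical; MOUNT falls back to
/tmp so the sketch runs anywhere.

```shell
#!/bin/sh
# Hypothetical mount point of the volume; falls back to /tmp for a dry run.
MOUNT="${MOUNT:-/tmp}"

# Time one small write (create + remove) in directory $1 and print the
# elapsed time in whole seconds.
probe_once() {
    start=$(date +%s)
    echo probe > "$1/.io_probe" && rm -f "$1/.io_probe"
    end=$(date +%s)
    echo $((end - start))
}

elapsed=$(probe_once "$MOUNT")
echo "$(date -u '+%H:%M:%S') write took ${elapsed}s"
```

Run in a loop (for example `while sleep 1; do MOUNT=/mnt/gv0 sh probe.sh; done`,
with probe.sh being whatever name the script is saved under), any line
reporting 10s or more can be compared against the times the promotion daemon
runs.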

I suspect this has something to do with the SQLite heat database; maybe
something is getting locked while it runs the query to determine which files
to promote. My volume contains approximately 18 million files.

Has anybody else seen this? I expect these delays to get worse as I add more
files to my volume, which will cause significant problems.

Here are my hot tier settings:

# gluster volume get gv0 all | grep tier
cluster.tier-pause                      off
cluster.tier-promote-frequency          1500
cluster.tier-demote-frequency           3600
cluster.tier-mode                       cache
cluster.tier-max-promote-file-size      10485760
cluster.tier-max-mb                     64000
cluster.tier-max-files                  100000
cluster.tier-query-limit                100
cluster.tier-compact                    on
cluster.tier-hot-compact-frequency      86400
cluster.tier-cold-compact-frequency     86400

# gluster volume get gv0 all | grep threshold
cluster.write-freq-threshold            2
cluster.read-freq-threshold             5

# gluster volume get gv0 all | grep watermark
cluster.watermark-hi                    92
cluster.watermark-low                   75
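
To make the intervals above easier to read, here is a small sketch that
converts the two frequency settings from seconds to minutes. The values are
copied by hand from the output above rather than read from gluster, and it
assumes the demote frequency is in seconds like the promote frequency.

```shell
#!/bin/sh
# Values copied by hand from the 'gluster volume get gv0 all' output above.
settings='cluster.tier-promote-frequency 1500
cluster.tier-demote-frequency 3600'

# Print each interval in seconds and in whole minutes
# (assumes both options are expressed in seconds).
report=$(printf '%s\n' "$settings" |
    awk '{ printf "%s %d s = %d min\n", $1, $2, $2 / 60 }')
printf '%s\n' "$report"
```

The 1500 second promote frequency works out to the 25 minute interval at
which the IO stalls show up.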