[Gluster-users] Blocking IO when hot tier promotion daemon runs
Tom Fite
tomfite at gmail.com
Wed Jan 10 15:17:30 UTC 2018
The sizes of the files are extremely varied; there are millions of small
(<1 MB) files and thousands of files larger than 1 GB.

Attached are the tier logs for gluster1 and gluster2. They are full of
"demotion failed" messages, which is also reflected in the status:
[root@pod-sjc1-gluster1 gv0]# gluster volume tier gv0 status
Node                 Promoted files  Demoted files  Status       run time in h:m:s
---------            ---------       ---------      ---------    ---------
localhost            25940           0              in progress  112:21:49
pod-sjc1-gluster2    0               2917154        in progress  112:21:49
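A quick way to tally these failures per node is to count them in the tier
daemon log (the log path below is an assumption for this setup; adjust to
wherever tierd logs on your install):

# grep -c "demotion failed" /var/log/glusterfs/gv0-tier.log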
Is it normal for promotions to happen only on one node and demotions only
on the other, rather than on both?
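For reference, the per-thread CPU spikes described in my original message
below can be confirmed in batch mode; this is a minimal sketch using the
thread name I observed:

# top -bH -n 1 | grep glustertier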
Volume info:
[root@pod-sjc1-gluster1 ~]# gluster volume info
Volume Name: gv0
Type: Distributed-Replicate
Volume ID: d490a9ec-f9c8-4f10-a7f3-e1b6d3ced196
Status: Started
Snapshot Count: 13
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: pod-sjc1-gluster1:/data/brick1/gv0
Brick2: pod-sjc1-gluster2:/data/brick1/gv0
Brick3: pod-sjc1-gluster1:/data/brick2/gv0
Brick4: pod-sjc1-gluster2:/data/brick2/gv0
Brick5: pod-sjc1-gluster1:/data/brick3/gv0
Brick6: pod-sjc1-gluster2:/data/brick3/gv0
Options Reconfigured:
performance.cache-refresh-timeout: 60
performance.stat-prefetch: on
server.allow-insecure: on
performance.flush-behind: on
performance.rda-cache-limit: 32MB
network.tcp-window-size: 1048576
performance.nfs.io-threads: on
performance.write-behind-window-size: 4MB
performance.nfs.write-behind-window-size: 512MB
performance.io-cache: on
performance.quick-read: on
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.cache-invalidation: on
performance.md-cache-timeout: 600
network.inode-lru-limit: 90000
performance.cache-size: 4GB
server.event-threads: 16
client.event-threads: 16
features.barrier: disable
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: on
cluster.lookup-optimize: on
server.outstanding-rpc-limit: 1024
auto-delete: enable
# gluster volume status
Status of volume: gv0
Gluster process                              TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Hot Bricks:
Brick pod-sjc1-gluster2:/data/hot_tier/gv0   49219     0          Y       26714
Brick pod-sjc1-gluster1:/data/hot_tier/gv0   49199     0          Y       21325
Cold Bricks:
Brick pod-sjc1-gluster1:/data/brick1/gv0     49152     0          Y       3178
Brick pod-sjc1-gluster2:/data/brick1/gv0     49152     0          Y       4818
Brick pod-sjc1-gluster1:/data/brick2/gv0     49153     0          Y       3186
Brick pod-sjc1-gluster2:/data/brick2/gv0     49153     0          Y       4829
Brick pod-sjc1-gluster1:/data/brick3/gv0     49154     0          Y       3194
Brick pod-sjc1-gluster2:/data/brick3/gv0     49154     0          Y       4840
Tier Daemon on localhost                     N/A       N/A        Y       20313
Self-heal Daemon on localhost                N/A       N/A        Y       32023
Tier Daemon on pod-sjc1-gluster1             N/A       N/A        Y       24758
Self-heal Daemon on pod-sjc1-gluster2        N/A       N/A        Y       12349

Task Status of Volume gv0
------------------------------------------------------------------------------
There are no active volume tasks
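Since I suspect the sqlite heat database (see my original message below),
here is a minimal sketch for inspecting it directly. The database location
under each cold brick and the gf_file_tb table name are assumptions based
on the changetimerecorder (CTR) translator defaults and may vary by
version:

# DB=$(find /data/brick1/gv0/.glusterfs -maxdepth 1 -name '*.db' | head -n 1)
# sqlite3 "$DB" "SELECT COUNT(*) FROM gf_file_tb;"

If that count is on the order of 18 million rows, a slow full-table query
during each promotion cycle would be consistent with the stalls I'm seeing.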
On Tue, Jan 9, 2018 at 10:33 PM, Hari Gowtham <hgowtham at redhat.com> wrote:
> Hi,
>
> Can you send the volume info, the volume status output, and the tier logs?
> I also need to know the sizes of the files that are being stored.
>
> On Tue, Jan 9, 2018 at 9:51 PM, Tom Fite <tomfite at gmail.com> wrote:
> > I've recently enabled an SSD-backed 2 TB hot tier on my 150 TB
> > distributed-replicate volume (2 servers, 3 bricks per server).
> >
> > I'm seeing IO get blocked across all client FUSE threads for 10 to 15
> > seconds while the promotion daemon runs. I see the 'glustertierpro'
> > thread jump to 99% CPU usage on both boxes when these delays occur, and
> > they happen every 25 minutes (my tier-promote-frequency setting of 1500
> > seconds).
> >
> > I suspect this has something to do with the heat database in sqlite;
> > maybe something is getting locked while it runs the query to determine
> > which files to promote. My volume contains approximately 18 million
> > files.
> >
> > Has anybody else seen this? I suspect that these delays will get worse
> > as I add more files to my volume, which will cause significant problems.
> >
> > Here are my hot tier settings:
> >
> > # gluster volume get gv0 all | grep tier
> > cluster.tier-pause off
> > cluster.tier-promote-frequency 1500
> > cluster.tier-demote-frequency 3600
> > cluster.tier-mode cache
> > cluster.tier-max-promote-file-size 10485760
> > cluster.tier-max-mb 64000
> > cluster.tier-max-files 100000
> > cluster.tier-query-limit 100
> > cluster.tier-compact on
> > cluster.tier-hot-compact-frequency 86400
> > cluster.tier-cold-compact-frequency 86400
> >
> > # gluster volume get gv0 all | grep threshold
> > cluster.write-freq-threshold 2
> > cluster.read-freq-threshold 5
> >
> > # gluster volume get gv0 all | grep watermark
> > cluster.watermark-hi 92
> > cluster.watermark-low 75
> >
> > _______________________________________________
> > Gluster-users mailing list
> > Gluster-users at gluster.org
> > http://lists.gluster.org/mailman/listinfo/gluster-users
>
>
>
> --
> Regards,
> Hari Gowtham.
>
Attachments:
gluster2-tierd.log (1979348 bytes): <http://lists.gluster.org/pipermail/gluster-users/attachments/20180110/9c4538b4/attachment-0002.obj>
gluster1-tierd.log (1970058 bytes): <http://lists.gluster.org/pipermail/gluster-users/attachments/20180110/9c4538b4/attachment-0003.obj>