[Bugs] [Bug 1403984] New: Node node high CPU - healing entries increasing
bugzilla at redhat.com
Mon Dec 12 19:36:00 UTC 2016
https://bugzilla.redhat.com/show_bug.cgi?id=1403984
Bug ID: 1403984
Summary: Node node high CPU - healing entries increasing
Product: GlusterFS
Version: 3.8
Component: core
Severity: urgent
Assignee: bugs at gluster.org
Reporter: tu2Bgone at gmail.com
CC: bugs at gluster.org
Created attachment 1230893
--> https://bugzilla.redhat.com/attachment.cgi?id=1230893&action=edit
statedump from gluster node with high load
Description of problem:
3-node Fedora 23 (Cloud Edition) cluster in AWS (m4.xlarge) with a 2.5 TB
volume.
One of the 3 nodes shows high CPU usage while the number of healing entries
keeps increasing. The logs on one of the nodes with low CPU usage keep
receiving messages similar to this:
I [MSGID: 115072] [server-rpc-fops.c:1640:server_setattr_cbk] 0-marketplace_nfs-server: 6954047: SETATTR /ftpdata/<removed>/2_kamih2.zip (c46c2f49-9688-4617-9541-a7181b495f80) ==> (Operation not permitted) [Operation not permitted]
I asked about this message on the mailing list previously but received no
answer.
The logs on the other two nodes (one with high CPU and one with low) are
quiet, with occasional heal messages.
To reduce the load we have to stop reading from and writing to the volumes.
After the trouble described in
https://bugzilla.redhat.com/show_bug.cgi?id=1402621 we upgraded the cluster
to 3.8. The cluster of 3 x m4.xlarge hosts (4 CPUs, 16 GB RAM each) supports
at most 12 clients.
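One way to check whether the number of healing entries is really growing is to sample the output of `gluster volume heal marketplace_nfs info` at intervals and compare the per-brick totals. The following is only a rough sketch: it assumes the text output consists of a `Brick <host:/path>` header per brick followed later by a `Number of entries: <N>` line, and `heal_entry_counts` is a hypothetical helper name, not part of GlusterFS.

```python
import re

# Assumed line shapes in the `gluster volume heal <VOLNAME> info` text output.
_BRICK_RE = re.compile(r"^Brick\s+(\S+)\s*$")
_ENTRIES_RE = re.compile(r"^Number of entries:\s*(\S+)\s*$")


def heal_entry_counts(heal_info_text):
    """Return {brick: pending_entries} parsed from heal-info text output.

    Bricks whose entry count is not a number (for example a brick that is
    not reachable) are left out of the result.
    """
    counts = {}
    brick = None
    for line in heal_info_text.splitlines():
        match = _BRICK_RE.match(line)
        if match:
            brick = match.group(1)
            continue
        match = _ENTRIES_RE.match(line)
        if match and brick is not None:
            value = match.group(1)
            if value.isdigit():
                counts[brick] = int(value)
            brick = None
    return counts
```

Feeding it the captured command output once a minute and logging `sum(counts.values())` would show whether the backlog is growing, flat, or draining.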
Version-Release number of selected component (if applicable):
3.8
How reproducible:
Performance issue occurring regularly.
Steps to Reproduce:
Running find /path -type f -exec stat {} \; can cause a significant load
increase, even when run from just one host.
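To put a number on the reproduction step, the same metadata walk can be timed from a single client. This is a minimal sketch, assuming the volume is mounted somewhere on the client; `stat_walk` is a hypothetical helper name. Unlike the find command above it does not spawn one `stat` process per file, so it isolates the cost of the filesystem calls themselves.

```python
import os
import stat
import time


def stat_walk(root):
    """Stat every regular file under root.

    Returns (regular_files, errors, elapsed_seconds). Symlinks are not
    followed, which mirrors `find root -type f -exec stat {} \\;`.
    """
    regular_files = 0
    errors = 0
    start = time.monotonic()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)
            except OSError:
                # The file vanished or could not be accessed during the walk.
                errors += 1
                continue
            if stat.S_ISREG(info.st_mode):
                regular_files += 1
    elapsed = time.monotonic() - start
    return regular_files, errors, elapsed
```

Calling `stat_walk("/path")` on the mount while watching CPU usage on the three servers gives a repeatable measurement to compare before and after any option change.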
Additional info:
sudo gluster volume info
Volume Name: marketplace_nfs
Type: Distributed-Replicate
Volume ID: 528de1b5-0bd5-488b-83cf-c4f3f747e6cd
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: 10.90.5.105:/data/data0/marketplace_nfs
Brick2: 10.90.3.14:/data/data3/marketplace_nfs
Brick3: 10.90.4.195:/data/data0/marketplace_nfs
Brick4: 10.90.5.105:/data/data1/marketplace_nfs
Brick5: 10.90.3.14:/data/data1/marketplace_nfs
Brick6: 10.90.4.195:/data/data1/marketplace_nfs
Options Reconfigured:
performance.client-io-threads: on
performance.io-thread-count: 12
server.event-threads: 3
client.event-threads: 3
server.outstanding-rpc-limit: 256
cluster.self-heal-readdir-size: 16KB
cluster.self-heal-window-size: 3
diagnostics.brick-log-level: INFO
network.ping-timeout: 15
cluster.quorum-type: none
performance.readdir-ahead: on
cluster.self-heal-daemon: enable
performance.cache-size: 1024MB
cluster.lookup-optimize: on
cluster.data-self-heal-algorithm: diff
nfs.disable: off
cluster.server-quorum-ratio: 51%
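For reading the layout above: "2 x 3 = 6" means two replica sets of three bricks each, and GlusterFS forms replica sets from consecutive bricks in the listed order. A small sketch that groups the bricks from this report accordingly (`replica_sets` is a hypothetical helper, not a GlusterFS API):

```python
# Bricks in the order shown by `gluster volume info` in this report.
BRICKS = [
    "10.90.5.105:/data/data0/marketplace_nfs",
    "10.90.3.14:/data/data3/marketplace_nfs",
    "10.90.4.195:/data/data0/marketplace_nfs",
    "10.90.5.105:/data/data1/marketplace_nfs",
    "10.90.3.14:/data/data1/marketplace_nfs",
    "10.90.4.195:/data/data1/marketplace_nfs",
]


def replica_sets(bricks, replica_count):
    """Split an ordered brick list into consecutive groups of replica_count."""
    if replica_count <= 0 or len(bricks) % replica_count != 0:
        raise ValueError("brick count must be a positive multiple of replica_count")
    return [
        bricks[i:i + replica_count]
        for i in range(0, len(bricks), replica_count)
    ]


# Replica count 3, taken from "Number of Bricks: 2 x 3 = 6".
SETS = replica_sets(BRICKS, 3)
```

Under that grouping, each of the three hosts holds exactly one brick of each replica set, so every host participates in healing for both sets.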
sudo gluster volume status
Status of volume: marketplace_nfs
Gluster process                                  TCP Port  RDMA Port  Online  Pid
-----------------------------------------------------------------------------------
Brick 10.90.5.105:/data/data0/marketplace_nfs    49155     0          Y       20611
Brick 10.90.3.14:/data/data3/marketplace_nfs     49158     0          Y       23161
Brick 10.90.4.195:/data/data0/marketplace_nfs    49155     0          Y       5504
Brick 10.90.5.105:/data/data1/marketplace_nfs    49156     0          Y       20616
Brick 10.90.3.14:/data/data1/marketplace_nfs     49159     0          Y       23166
Brick 10.90.4.195:/data/data1/marketplace_nfs    49156     0          Y       5509
NFS Server on localhost                          2049      0          Y       23250
Self-heal Daemon on localhost                    N/A       N/A        Y       23262
NFS Server on ip-10-90-4-195.ec2.internal        2049      0          Y       25289
Self-heal Daemon on ip-10-90-4-195.ec2.internal  N/A       N/A        Y       25297
NFS Server on ip-10-90-5-105.ec2.internal        2049      0          Y       8405
Self-heal Daemon on ip-10-90-5-105.ec2.internal  N/A       N/A        Y       8416
Task Status of Volume marketplace_nfs
------------------------------------------------------------------------------
There are no active volume tasks
Status of volume: marketplace_uploads
Gluster process                                  TCP Port  RDMA Port  Online  Pid
-----------------------------------------------------------------------------------
Brick 10.90.4.195:/data/data2/uploads            49157     0          Y       5528
Brick 10.90.3.14:/data/data2/uploads             49160     0          Y       23180
Brick 10.90.5.105:/data/data2/uploads            49157     0          Y       20621
NFS Server on localhost                          2049      0          Y       23250
Self-heal Daemon on localhost                    N/A       N/A        Y       23262
NFS Server on ip-10-90-4-195.ec2.internal        2049      0          Y       25289
Self-heal Daemon on ip-10-90-4-195.ec2.internal  N/A       N/A        Y       25297
NFS Server on ip-10-90-5-105.ec2.internal        2049      0          Y       8405
Self-heal Daemon on ip-10-90-5-105.ec2.internal  N/A       N/A        Y       8416
Task Status of Volume marketplace_uploads
------------------------------------------------------------------------------
There are no active volume tasks