[Bugs] [Bug 1339246] New: High IO/load causes VMs to enter Paused state
bugzilla at redhat.com
Tue May 24 13:10:22 UTC 2016
https://bugzilla.redhat.com/show_bug.cgi?id=1339246
Bug ID: 1339246
Summary: High IO/load causes VMs to enter Paused state
Product: GlusterFS
Version: 3.7.11
Component: sharding
Severity: high
Assignee: bugs at gluster.org
Reporter: Sustugriel at gmail.com
QA Contact: bugs at gluster.org
CC: bugs at gluster.org
Description of problem:
Taking a backup image of a running VM's disk consistently causes the VM to
enter a paused state. It can then be resumed with no issues.
The problem started after we enabled the sharding translator. Sharding has
otherwise led to drastically faster heal times.
Version-Release number of selected component (if applicable): 3.7.11-1
How reproducible: 50-100%. It is intermittent: sometimes the machines pause,
other times they do not. It does not seem to be related to disk size.
Steps to Reproduce:
1. Create and install an oVirt environment using GlusterFS as storage on a
Distributed Replicate volume.
2. Use default volume options, except enabling the sharding translator.
3. Create a Windows Server 2012 R2 VM and take a backup image using a VSS
capture utility such as BackupExec, Acronis, or Windows Server Backup.
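Step 2 could be sketched roughly as follows. This is only an illustration: the
volume name "vmstore" is a placeholder (the report does not name the volume),
and the script only prints the `gluster volume set` commands instead of running
them. The two option values are the ones listed under "Volume options" in this
report.

```python
# Sketch: build the `gluster volume set` commands for step 2.
# VOLUME is a hypothetical placeholder; the report does not name the volume.
VOLUME = "vmstore"

# Sharding options as listed under "Volume options" in this report.
SHARD_OPTIONS = {
    "features.shard": "on",
    "features.shard-block-size": "512MB",
}


def shard_commands(volume, options):
    """Return one `gluster volume set` command line per option."""
    return [
        f"gluster volume set {volume} {key} {value}"
        for key, value in options.items()
    ]


if __name__ == "__main__":
    # Print the commands only; nothing is executed against a cluster.
    for cmd in shard_commands(VOLUME, SHARD_OPTIONS):
        print(cmd)
```

The commands are printed rather than executed so they can be reviewed before
being run on one of the hosts.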
Actual results:
Machines pause seconds after the backup starts. Hosts did not go down and
bricks did not go down. The machines can be resumed immediately, and the
resume succeeds.
Expected results:
Machines should not pause.
Additional info:
Distributed replicate volume.
Number of bricks: 6
Replica Count: 3
Volume options:
cluster.self-heal-window-size: 256
cluster.data-self-heal-algorithm: full
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
server.allow-insecure: on
storage.owner-gid: 36
network.ping-timeout: 10
features.shard-block-size: 512MB
features.shard: on
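
For reference, the layout implied by the numbers above can be worked out with a
few lines of arithmetic. This is a rough sketch, not a statement about this
particular cluster: it assumes "512MB" means 512 MiB, and the 100 GiB image size
in the usage note is a made-up example, since the report gives no disk sizes.

```python
# Sketch: layout arithmetic for the volume described in this report.
BRICKS = 6         # "Number of bricks: 6"
REPLICA_COUNT = 3  # "Replica Count: 3"
# Assumption: the "512MB" shard block size is interpreted as 512 MiB.
SHARD_BLOCK_BYTES = 512 * 1024 * 1024


def replica_sets(bricks, replica_count):
    """Number of replica sets that files are distributed across."""
    if bricks % replica_count != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    return bricks // replica_count


def shard_blocks(image_bytes, block_bytes=SHARD_BLOCK_BYTES):
    """Number of block-sized pieces an image of image_bytes is split into."""
    # Integer ceiling division, avoids floating point rounding.
    return -(-image_bytes // block_bytes)


if __name__ == "__main__":
    print(replica_sets(BRICKS, REPLICA_COUNT))
    # Hypothetical 100 GiB disk image, chosen only for illustration.
    print(shard_blocks(100 * 1024 ** 3))
```

With these inputs the volume has 2 replica sets of 3 bricks each, and a
hypothetical 100 GiB image would be split into 200 pieces of 512 MiB.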