[Bugs] [Bug 1339246] New: High IO/load causes VMs to enter Paused state

bugzilla at redhat.com
Tue May 24 13:10:22 UTC 2016


https://bugzilla.redhat.com/show_bug.cgi?id=1339246

            Bug ID: 1339246
           Summary: High IO/load causes VMs to enter Paused state
           Product: GlusterFS
           Version: 3.7.11
         Component: sharding
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: Sustugriel at gmail.com
        QA Contact: bugs at gluster.org
                CC: bugs at gluster.org



Description of problem:
Taking a backup image of a running VM's disk consistently causes the VM to
enter a paused state. It can then be resumed with no issues.

The problem started after the sharding translator was enabled. Sharding itself
has led to drastically faster heal times.

Version-Release number of selected component (if applicable): 3.7.11-1


How reproducible: 50-100%. It's intermittent: sometimes the machines pause,
other times they don't. It does not seem to be related to disk size.


Steps to Reproduce:
1. Create and install an oVirt environment using GlusterFS as storage on a
Distributed-Replicate volume.
2. Use default volume options, except for enabling the sharding translator.
3. Create a Windows Server 2012 R2 VM and take a backup image using a VSS
capture utility such as BackupExec, Acronis, Windows Server Backup, etc.

Actual results:
Machines pause seconds after the backup has started. Neither the hosts nor the
bricks went down. The VMs can be resumed immediately, and the resume succeeds.

Expected results:
Machines should not pause.

Additional info:
Distributed replicate volume.
Number of bricks: 6
Replica Count: 3

Volume options:
cluster.self-heal-window-size: 256
cluster.data-self-heal-algorithm: full
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
server.allow-insecure: on
storage.owner-gid: 36
network.ping-timeout: 10
features.shard-block-size: 512MB
features.shard: on
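
For anyone trying to reproduce this on a test volume, the options listed above
could be applied with `gluster volume set <VOLNAME> <KEY> <VALUE>`. The sketch
below is only an illustration: the volume name `vmstore` and the helper
`build_set_commands` are placeholders and are not part of the original setup.
It prints the commands rather than running them.

```python
# Options copied from the "Volume options" list in this report.
VOLUME_OPTIONS = """\
cluster.self-heal-window-size: 256
cluster.data-self-heal-algorithm: full
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
server.allow-insecure: on
storage.owner-gid: 36
network.ping-timeout: 10
features.shard-block-size: 512MB
features.shard: on
"""


def build_set_commands(volume, options_text):
    """Return one `gluster volume set` argument list per 'key: value' line."""
    commands = []
    for line in options_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first colon only, so the value is kept intact.
        key, value = (part.strip() for part in line.split(":", 1))
        commands.append(["gluster", "volume", "set", volume, key, value])
    return commands


if __name__ == "__main__":
    # "vmstore" is a hypothetical volume name, not taken from the report.
    for argv in build_set_commands("vmstore", VOLUME_OPTIONS):
        print(" ".join(argv))
```

Printing the commands first makes it easy to review them before running
anything against a real volume.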


