[Bugs] [Bug 1349953] New: thread CPU saturation limiting throughput on write workloads
bugzilla at redhat.com
Fri Jun 24 15:59:04 UTC 2016
https://bugzilla.redhat.com/show_bug.cgi?id=1349953
Bug ID: 1349953
Summary: thread CPU saturation limiting throughput on write
workloads
Product: GlusterFS
Version: 3.8.0
Component: fuse
Assignee: bugs at gluster.org
Reporter: mpillai at redhat.com
CC: bugs at gluster.org
Description of problem:
In a distributed iozone benchmark involving sequential writes to
large files, we are seeing poor write throughput when there are multiple
threads per client. Stats on the clients show a single glusterfs thread at 100%
CPU utilization, while overall CPU utilization on the clients is low.
Version-Release number of selected component (if applicable):
glusterfs*-3.8.0-1.el7.x86_64 (on both clients and servers)
RHEL 7.1 (clients)
RHEL 7.2 (servers)
How reproducible:
consistently
Steps to Reproduce:
The h/w setup involves 6 servers and 6 clients, with 10gE network. Each server
has 12 hard disks for a total of 72 drives. A single 12x(4+2) EC volume is
created and fuse mounted on the 6 clients. Iozone is run in distributed mode
from the clients, as below (in this case, with 4 threads per client):
iozone -+m ${IOZONE_CONF} -i 0 -w -+n -c -C -e -s 20g -r 64k -t 24
For comparison, results were also obtained with a 3x2 dist-rep volume. In this
case, the disks on each server are aggregated into a 12-disk RAID-6 device on
which the gluster brick is created.
Actual results:
Throughput for 12x(4+2) dist-disperse volume with each brick on a single disk:
throughput for 24 initial writers = 738076.08 kB/sec
Throughput for 3x2 dist-replicated volume with bricks on 12-disk RAID-6:
throughput for 24 initial writers = 1817252.84 kB/sec
Expected results:
1. EC should exceed replica-2 performance on this workload:
EC needs to write out fewer bytes than replica-2: a 4+2 disperse volume writes
6/4 = 1.5x the number of bytes written by the application. Replica-2, on the
other hand, writes 2x at the gluster level, and each 12-disk RAID-6 brick adds
another 12/10 = 1.2x, for 1.2 * 2 = 2.4x the number of bytes actually
written.
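The write-amplification arithmetic above can be checked with a quick calculation (numbers taken from the volume layouts in this report):

```shell
# Bytes written to disk per application byte, for the two volume layouts.
# EC 4+2: 6 fragments stored per 4 data fragments.
ec=$(awk 'BEGIN { printf "%.1f", 6 / 4 }')
# replica-2 on 12-disk RAID-6: 2 gluster copies, each amplified 12/10 by RAID-6.
rep=$(awk 'BEGIN { printf "%.1f", (12 / 10) * 2 }')
echo "EC 4+2 amplification:        ${ec}x"
echo "replica-2+RAID-6 amplification: ${rep}x"
```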
For this large-file workload, EC should be capable of achieving higher
throughput than replica-2, but it is not. For some other write-intensive
large-file benchmarks, we have seen EC exceed replica-2+RAID-6 by a significant
margin, so we need to determine why that is not happening here.
2. Write throughput for both EC and replica-2 is much less than what the h/w
setup is capable of.
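For scale, a rough network-only ceiling can be computed from the setup described above (an assumption: each client has a single 10 Gbit/s link at line rate, ignoring disk and protocol overheads):

```shell
# Rough aggregate write ceiling imposed by the client NICs alone:
# 6 clients, 10 Gbit/s each, 10 Gbit/s ~= 1250 MB/s.
ceiling=$(awk 'BEGIN { printf "%.0f", 6 * 10 * 1000 / 8 }')   # MB/s
echo "aggregate client network ceiling: ${ceiling} MB/s"
```

Both measured results (roughly 0.7 GB/s and 1.8 GB/s) are well below this ceiling.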
Additional info:
Output of "top -bH -d 10" on the clients looks like the below:
top - 09:12:10 up 191 days, 7:52, 0 users, load average: 0.56, 0.26, 0.51
Threads: 289 total, 1 running, 288 sleeping, 0 stopped, 0 zombie
%Cpu(s): 10.9 us, 5.3 sy, 0.0 ni, 83.5 id, 0.0 wa, 0.0 hi, 0.2 si, 0.0 st
KiB Mem : 65728904 total, 58975920 free, 929824 used, 5823160 buff/cache
KiB Swap: 32964604 total, 32964148 free, 456 used. 64008760 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
21160 root 20 0 892360 233600 3492 S 99.8 0.4 0:33.07 glusterfs
21155 root 20 0 892360 233600 3492 S 22.7 0.4 0:07.88 glusterfs
21156 root 20 0 892360 233600 3492 S 22.5 0.4 0:07.98 glusterfs
21154 root 20 0 892360 233600 3492 S 22.2 0.4 0:08.29 glusterfs
21157 root 20 0 892360 233600 3492 S 21.8 0.4 0:08.02 glusterfs
21167 root 20 0 53752 19484 816 S 2.9 0.0 0:00.96 iozone
21188 root 20 0 53752 18528 816 S 2.8 0.0 0:00.95 iozone
21202 root 20 0 53752 19484 816 S 2.6 0.0 0:00.84 iozone
[...]
One of the glusterfs threads is at almost 100% CPU utilization for the duration
of the test. This is seen with both EC and replica-2, but the results suggest
that EC performance takes the bigger hit.
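One way to confirm which glusterfs thread is pegged is to rank a process's threads by accumulated CPU time from /proc (a sketch; for the bug, the PID would be the glusterfs client process, 21160 in the top output above):

```shell
# Rank a process's threads by accumulated CPU time (utime + stime,
# in clock ticks), busiest thread first.
threads_by_cputime() {
  pid=$1
  for stat in /proc/"$pid"/task/*/stat; do
    tid=$(basename "$(dirname "$stat")")
    # Strip the "pid (comm) " prefix so a comm containing spaces cannot
    # shift the field numbers; utime/stime then land in fields 12 and 13.
    sed 's/^[^)]*) //' "$stat" | awk -v tid="$tid" '{ print $12 + $13, "tid", tid }'
  done | sort -rn
}

# Example run against the current shell; substitute the glusterfs PID.
threads_by_cputime $$
```

Once the busiest TID is known, a stack sample of that thread (e.g. via gdb or pstack) would show where the CPU time is going.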
Volume options that have been changed (for all runs):
cluster.lookup-optimize: on
server.event-threads: 4
client.event-threads: 4
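For reference, these options would have been applied with commands of this form (the volume name below is a placeholder):

```shell
# <volname> is a placeholder for the actual volume name.
gluster volume set <volname> cluster.lookup-optimize on
gluster volume set <volname> server.event-threads 4
gluster volume set <volname> client.event-threads 4
```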