[Gluster-users] very High CPU load of brick servers while write performance is very slow
Mingfan Lu
mingfan.lu at gmail.com
Sat Feb 8 06:51:53 UTC 2014
The CPU load on some of the brick servers is very high and write performance is
very slow: writing a single file to the volume with dd gives only 10+ KB/sec.
Any comments?
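For reference, the write test was roughly of this form (the mount point, block
size and count below are placeholders, not the exact command that was run):

# write 100 MB through the Gluster FUSE mount; fsync before dd reports throughput
# /mnt/prodvolume stands in for the actual client mount point
dd if=/dev/zero of=/mnt/prodvolume/ddtest.bin bs=1M count=100 conv=fsync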
More information >>>>>
Volume Name: prodvolume
Type: Distributed-Replicate
Volume ID: f3fc24b3-23c7-430d-8ab1-81a646b1ce06
Status: Started
Number of Bricks: 17 x 3 = 51 (I have 51 servers)
Transport-type: tcp
Bricks:
....
Options Reconfigured:
performance.io-thread-count: 32
auth.allow: *,10.121.48.244,10.121.48.82
features.limit-usage: /:400TB
features.quota: on
server.allow-insecure: on
features.quota-timeout: 5
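(For completeness: options like these are normally applied with "gluster volume
set" and the quota subcommands. The commands below only mirror the values shown
above and are not a transcript of what was actually run.)

gluster volume set prodvolume performance.io-thread-count 32
gluster volume set prodvolume server.allow-insecure on
gluster volume set prodvolume features.quota-timeout 5
gluster volume set prodvolume auth.allow '*,10.121.48.244,10.121.48.82'
# quota is enabled and limited via the quota subcommand
gluster volume quota prodvolume enable
gluster volume quota prodvolume limit-usage / 400TB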
Most of the CPU utilization comes from system/kernel mode:
top - 14:47:13 up 219 days, 23:36, 2 users, load average: 17.76, 20.98, 24.74
Tasks: 493 total, 1 running, 491 sleeping, 0 stopped, 1 zombie
Cpu(s): 8.2%us, 49.0%sy, 0.0%ni, 42.2%id, 0.1%wa, 0.0%hi, 0.4%si, 0.0%st
Mem: 132112276k total, 131170760k used, 941516k free, 71224k buffers
Swap: 4194296k total, 867216k used, 3327080k free, 110888216k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6226 root 20 0 2677m 496m 2268 S 1183.4 0.4 89252:09 glusterfsd
27994 root 20 0 1691m 77m 2000 S 111.6 0.1 324333:47 glusterfsd
14169 root 20 0 14.9g 23m 1984 S 51.3 0.0 3700:30 glusterfsd
20582 root 20 0 2129m 1.4g 1708 S 12.6 1.1 198:03.53 glusterfs
24528 root 20 0 0 0 0 S 6.3 0.0 14:18.60 flush-8:16
17717 root 20 0 21416 11m 8268 S 5.0 0.0 14:51.18 oprofiled
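(To map the hot glusterfsd PID back to the brick it serves, "gluster volume
status" lists a Pid for every brick; the sketch below just assumes the volume
name shown above.)

# the Pid column identifies which brick process 6226 belongs to
gluster volume status prodvolume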
Using perf top -p 6226, most of the cycles are spent in _spin_lock:
Events: 49K cycles
72.51% [kernel] [k] _spin_lock
4.00% libpthread-2.12.so [.] pthread_mutex_lock
2.63% [kernel] [k] _spin_unlock_irqrestore
1.61% libpthread-2.12.so [.] pthread_mutex_unlock
1.59% [unknown] [.] 0xffffffffff600157
1.57% [xfs] [k] xfs_inobt_get_rec
1.41% [xfs] [k] xfs_btree_increment
1.27% [xfs] [k] xfs_btree_get_rec
1.17% libpthread-2.12.so [.] __lll_lock_wait
0.96% [xfs] [k] _xfs_buf_find
0.95% [xfs] [k] xfs_btree_get_block
0.88% [kernel] [k] copy_user_generic_string
0.50% [xfs] [k] xfs_dialloc
0.48% [xfs] [k] xfs_btree_rec_offset
0.47% [xfs] [k] xfs_btree_readahead
0.41% [kernel] [k] futex_wait_setup
0.41% [kernel] [k] futex_wake
0.35% [kernel] [k] system_call_after_swapgs
0.33% [xfs] [k] xfs_btree_rec_addr
0.30% [kernel] [k] __link_path_walk
0.29% io-threads.so.0.0.0 [.] __iot_dequeue
0.29% io-threads.so.0.0.0 [.] iot_worker
0.25% [kernel] [k] __d_lookup
0.21% libpthread-2.12.so [.] __lll_unlock_wake
0.20% [kernel] [k] get_futex_key
0.18% [kernel] [k] hash_futex
0.17% [kernel] [k] do_futex
0.15% [kernel] [k] thread_return
0.15% libpthread-2.12.so [.] pthread_spin_lock
0.14% libc-2.12.so [.] _int_malloc
0.14% [kernel] [k] sys_futex
0.14% [kernel] [k] wake_futex
0.14% [kernel] [k] _atomic_dec_and_lock
0.12% [kernel] [k] kmem_cache_free
0.12% [xfs] [k] xfs_trans_buf_item_match
0.12% [xfs] [k] xfs_btree_check_sblock
0.11% libc-2.12.so [.] vfprintf
0.11% [kernel] [k] futex_wait
0.11% [kernel] [k] kmem_cache_alloc
0.09% [kernel] [k] acl_permission_check
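To see who is actually taking that spinlock, a call-graph profile of the same
process should help; something along these lines (the PID and duration just
match the snapshot above):

# sample the hot brick process with call graphs for ~30 seconds
perf record -g -p 6226 -- sleep 30
# then inspect the call chains leading into _spin_lock
perf report --stdio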
Using oprofile, I found that the CPU time mostly breaks down as follows:
CPU: Intel Sandy Bridge microarchitecture, speed 2000.02 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit
mask of 0x00 (No unit mask) count 100000
samples % linenr info image name app
name symbol name
-------------------------------------------------------------------------------
288683303 41.2321 clocksource.c:828 vmlinux
vmlinux sysfs_show_available_clocksources
288683303 100.000 clocksource.c:828 vmlinux
vmlinux sysfs_show_available_clocksources [self]
-------------------------------------------------------------------------------
203797076 29.1079 clocksource.c:236 vmlinux
vmlinux clocksource_mark_unstable
203797076 100.000 clocksource.c:236 vmlinux
vmlinux clocksource_mark_unstable [self]
-------------------------------------------------------------------------------
42321053 6.0446 (no location information) xfs
xfs /xfs
42321053 100.000 (no location information) xfs
xfs /xfs [self]
-------------------------------------------------------------------------------
23662768 3.3797 (no location information) libpthread-2.12.so
libpthread-2.12.so pthread_mutex_lock
23662768 100.000 (no location information) libpthread-2.12.so
libpthread-2.12.so pthread_mutex_lock [self]
-------------------------------------------------------------------------------
10867915 1.5522 (no location information) libpthread-2.12.so
libpthread-2.12.so pthread_mutex_unlock
10867915 100.000 (no location information) libpthread-2.12.so
libpthread-2.12.so pthread_mutex_unlock [self]
-------------------------------------------------------------------------------
7727828 1.1038 (no location information) libpthread-2.12.so
libpthread-2.12.so __lll_lock_wait
7727828 100.000 (no location information) libpthread-2.12.so
libpthread-2.12.so __lll_lock_wait [self]
-------------------------------------------------------------------------------
6296394 0.8993 blk-sysfs.c:260 vmlinux
vmlinux queue_rq_affinity_store
6296394 100.000 blk-sysfs.c:260 vmlinux
vmlinux queue_rq_affinity_store [self]
-------------------------------------------------------------------------------
3543413 0.5061 sched.h:293 vmlinux
vmlinux ftrace_profile_templ_sched_stat_template
3543413 100.000 sched.h:293 vmlinux
vmlinux ftrace_profile_templ_sched_stat_template [self]
-------------------------------------------------------------------------------
2960958 0.4229 msi.c:82 vmlinux
vmlinux msi_set_enable
2960958 100.000 msi.c:82 vmlinux
vmlinux msi_set_enable [self]
-------------------------------------------------------------------------------
2814515 0.4020 clocksource.c:249 vmlinux
vmlinux clocksource_watchdog
2814515 100.000 clocksource.c:249 vmlinux
vmlinux clocksource_watchdog [self]
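For reference, a report like the one above can be produced with a legacy
opcontrol-based oprofile session; the setup is roughly the following (the
vmlinux path and sampling duration are just examples, the event and count
mirror the report header):

opcontrol --init
opcontrol --setup --vmlinux=/usr/lib/debug/lib/modules/$(uname -r)/vmlinux --event=CPU_CLK_UNHALTED:100000
opcontrol --start
sleep 60                      # let the slow dd workload run while sampling
opcontrol --dump
opreport --callgraph --symbols > oprofile-callgraph.txt
opcontrol --shutdown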