[Bugs] [Bug 1388837] New: enabling features.shard makes glusterfs replicate and arbiter volume performance bad

bugzilla at redhat.com
Wed Oct 26 09:38:00 UTC 2016


https://bugzilla.redhat.com/show_bug.cgi?id=1388837

            Bug ID: 1388837
           Summary: enabling features.shard makes glusterfs replicate and
                    arbiter volume performance bad
           Product: GlusterFS
           Version: 3.8
         Component: sharding
          Severity: medium
          Assignee: bugs at gluster.org
          Reporter: maorong.hu at horebdata.cn
        QA Contact: bugs at gluster.org
                CC: bugs at gluster.org
   External Bug ID: CentOS 1375225



Description of problem:

   On a glusterfs replica 3 or replica 3 arbiter 1 volume, setting
features.shard enable (features.shard on) makes volume performance bad.
   Another report of bad performance on an arbiter volume with shard enabled:
https://bugzilla.redhat.com/show_bug.cgi?id=1375125 . I tested with the nightly
build (2016-10-25) from
http://artifacts.ci.centos.org/gluster/nightly/release-3.8/7/x86_64/?C=M;O=D
and the problem still exists, so I think the real problem is shard, not
arbiter.
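
For reference, when sharding is on, only the first shard-block-size bytes of a
file live in the base file; the rest is split into pieces under the hidden
.shard directory on each brick, named <base-file-gfid>.<shard-number>. A
minimal way to see this (brick path taken from the volume info below; the
client mount point /mnt/data_volume is hypothetical):

# on a brick host: list the shard pieces created for sharded files
ls -lh /data_sdc/brick3/.shard/
# on the client: the base file's gfid, which prefixes its shard names
getfattr -n glusterfs.gfid.string /mnt/data_volume/file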

Version-Release number of selected component (if applicable):
   Tested GlusterFS versions: 3.7, 3.8, and 3.9, and also the 2016-10-25
nightly build linked above.



How reproducible:


Steps to Reproduce:
1. Create a glusterfs replica 3 volume and a replica 3 arbiter 1 volume:
[root@horeba ~]# gluster volume info data_volume

Volume Name: data_volume
Type: Replicate
Volume ID: 48d74735-db85-44e8-b0d2-1c8cf651418c
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 192.168.10.71:/data_sdc/brick3
Brick2: 192.168.10.72:/data_sdc/brick3
Brick3: 192.168.10.73:/data_sdc/brick3
Options Reconfigured:
features.shard-block-size: 512MB
features.shard: on
nfs.disable: on
cluster.data-self-heal-algorithm: full
server.allow-insecure: on
auth.allow: *
network.ping-timeout: 10
storage.owner-gid: 36
storage.owner-uid: 36
cluster.server-quorum-type: server
cluster.quorum-type: auto
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
performance.readdir-ahead: on
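
For completeness, a sketch of how such a volume could have been created (brick
paths as listed above; the actual create commands were not captured in this
report):

gluster volume create data_volume replica 3 \
        192.168.10.71:/data_sdc/brick3 \
        192.168.10.72:/data_sdc/brick3 \
        192.168.10.73:/data_sdc/brick3
gluster volume start data_volume
gluster volume set data_volume features.shard on
gluster volume set data_volume features.shard-block-size 512MB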

[root@horeba ~]# gluster v info data_volume3

Volume Name: data_volume3
Type: Distributed-Replicate
Volume ID: cd5f4322-11e3-4f18-a39d-f0349b8d2a0c
Status: Started
Snapshot Count: 0
Number of Bricks: 4 x (2 + 1) = 12
Transport-type: tcp
Bricks:
Brick1: 192.168.10.71:/data_sdaa/brick
Brick2: 192.168.10.72:/data_sdaa/brick
Brick3: 192.168.10.73:/data_sdaa/brick (arbiter)
Brick4: 192.168.10.71:/data_sdc/brick
Brick5: 192.168.10.73:/data_sdc/brick
Brick6: 192.168.10.72:/data_sdc/brick (arbiter)
Brick7: 192.168.10.72:/data_sde/brick
Brick8: 192.168.10.73:/data_sde/brick
Brick9: 192.168.10.71:/data_sde/brick (arbiter)
Brick10: 192.168.10.71:/data_sde/brick1
Brick11: 192.168.10.72:/data_sdc/brick1
Brick12: 192.168.10.73:/data_sdaa/brick1 (arbiter)
Options Reconfigured:
features.shard: on
server.allow-insecure: on
features.shard-block-size: 512MB
storage.owner-gid: 36
storage.owner-uid: 36
nfs.disable: on
cluster.data-self-heal-algorithm: full
auth.allow: *
network.ping-timeout: 10
performance.low-prio-threads: 32
performance.io-thread-count: 32
cluster.server-quorum-type: server
cluster.quorum-type: auto
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
performance.readdir-ahead: on
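
Similarly, a sketch for the arbiter volume: with replica 3 arbiter 1, every
third brick in the create command becomes the arbiter, matching the (arbiter)
markers above:

gluster volume create data_volume3 replica 3 arbiter 1 \
        192.168.10.71:/data_sdaa/brick 192.168.10.72:/data_sdaa/brick 192.168.10.73:/data_sdaa/brick \
        192.168.10.71:/data_sdc/brick 192.168.10.73:/data_sdc/brick 192.168.10.72:/data_sdc/brick \
        192.168.10.72:/data_sde/brick 192.168.10.73:/data_sde/brick 192.168.10.71:/data_sde/brick \
        192.168.10.71:/data_sde/brick1 192.168.10.72:/data_sdc/brick1 192.168.10.73:/data_sdaa/brick1
gluster volume start data_volume3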



2. Mount each volume on one host and test:

   With features.shard on for both volumes as above, run a dd test:

replica 3 arbiter 1, shard enabled:
[root@horebc test]# for i in `seq 3`; do dd if=/dev/zero of=./file bs=1G count=1 oflag=direct ; done
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 56.3563 s, 19.1 MB/s
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 56.8704 s, 18.9 MB/s
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 54.8892 s, 19.6 MB/s

replica 3, shard enabled:
[root@horebc test2]# for i in `seq 3`; do dd if=/dev/zero of=./file bs=1G count=1 oflag=direct ; done
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 6.46174 s, 166 MB/s
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 6.39413 s, 168 MB/s
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 6.36879 s, 169 MB/s
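
Note that with a 512MB shard-block-size, every 1G direct write spans the base
file plus one shard, so each run crosses a shard boundary. A hedged variant
(file name illustrative) that stays within a single block could help isolate
that cost:

[root@horebc test2]# for i in `seq 3`; do dd if=/dev/zero of=./file_512m bs=512M count=1 oflag=direct ; done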


[root@horeba ~]# gluster v reset data_volume3 features.shard
volume reset: success: reset volume successful
[root@horeba ~]# gluster v reset data_volume features.shard
volume reset: success: reset volume successful
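
As a sanity check that the reset took effect, gluster volume get (available in
3.8) should now report the option at its default of off:

[root@horeba ~]# gluster volume get data_volume features.shard
[root@horeba ~]# gluster volume get data_volume3 features.shard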



replica 3, shard disabled (after reset):
[root@horebc test2]# for i in `seq 3`; do dd if=/dev/zero of=./file bs=1G count=1 oflag=direct ; done
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 1.85271 s, 580 MB/s
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 1.85781 s, 578 MB/s
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 1.85364 s, 579 MB/s

replica 3 arbiter 1, shard disabled (after reset):
[root@horebc test]# for i in `seq 3`; do dd if=/dev/zero of=./file1 bs=1G count=1 oflag=direct ; done
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 1.40569 s, 764 MB/s
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 1.33287 s, 806 MB/s
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 1.32026 s, 813 MB/s
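
If it helps with triage, wrapping the dd run in a profile session would show
per-fop latency on each brick while sharding is enabled; a sketch:

gluster volume profile data_volume start
# ... repeat the dd loop on the client ...
gluster volume profile data_volume info
gluster volume profile data_volume stop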


3. Compare the throughput of the shard-enabled and shard-disabled runs.

Actual results:
   As we can see, with shard enabled the replica 3 volume drops from ~580 MB/s
to ~168 MB/s, and the replica 3 arbiter 1 volume drops from ~790 MB/s to
~19 MB/s: replicate performance is bad, and arbiter performance is far worse.



Additional info:
