[Bugs] [Bug 1174016] New: network.compression fails simple '--ioengine=sync' fio test

bugzilla at redhat.com
Sun Dec 14 20:20:58 UTC 2014


https://bugzilla.redhat.com/show_bug.cgi?id=1174016

            Bug ID: 1174016
           Summary: network.compression fails simple '--ioengine=sync' fio
                    test
           Product: GlusterFS
           Version: mainline
         Component: compression-xlator
          Keywords: Triaged
          Severity: high
          Assignee: bugs at gluster.org
          Reporter: ndevos at redhat.com
                CC: bugs at gluster.org, gluster-bugs at redhat.com
            Blocks: 1073763



+++ This bug was initially created as a clone of Bug #1073763 +++

Description of problem:

I have two volumes configured that are basically identical except for the name, the path,
and the fact that network.compression is enabled on one of them (no other compression
options were changed).

Running test #1 from this page ( http://docs.gz.ro/fio-perf-tool-nutshell.html )
fails on the compression-enabled volume, while the same test succeeds without any
problem on the volume without wire compression.  At first, both were mounted with
direct-io-mode=enable, but I disabled this on the compressed mount in the hope that
it would help.  Omitting the direct-io-mode option in /etc/fstab did not help either.

FYI: btrfs is used everywhere.
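
For anyone trying to reproduce this, here is a minimal sketch of the setup (the volume
name "test-comp", the server names and the brick paths are placeholders; the compression
options are the ones shown in the volume info further down):

    # create and start a replicated test volume (names and paths are placeholders)
    gluster volume create test-comp replica 3 \
        server1:/export/brick/test-comp \
        server2:/export/brick/test-comp \
        server3:/export/brick/test-comp
    gluster volume set test-comp network.compression on
    gluster volume set test-comp network.compression.mode server
    gluster volume start test-comp

    # mount it over FUSE and run fio test #1 from the page referenced above
    mount -t glusterfs server1:/test-comp /mnt/test-comp
    cd /mnt/test-comp
    fio --size=20g --bs=64k --rw=write --ioengine=sync --name=fio.write.out.1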

fio test #1 on the plain (uncompressed) mount (healthy state):

[root@core-n1 ssd-vol-benchmark-n001]# fio --size=20g --bs=64k --rw=write --ioengine=sync --name=fio.write.out.1
fio.write.out.1: (g=0): rw=write, bs=64K-64K/64K-64K/64K-64K, ioengine=sync, iodepth=1
fio-2.0.13
Starting 1 process
fio.write.out.1: Laying out IO file(s) (1 file(s) / 20480MB)
Jobs: 1 (f=1): [W] [100.0% done] [0K/532.3M/0K /s] [0 /8516 /0  iops] [eta 00m:00s]
fio.write.out.1: (groupid=0, jobs=1): err= 0: pid=25928: Fri Mar  7 00:52:17 2014
  write: io=20480MB, bw=546347KB/s, iops=8536 , runt= 38385msec
    clat (usec): min=27 , max=1300.1K, avg=115.08, stdev=2277.05
     lat (usec): min=27 , max=1300.1K, avg=116.23, stdev=2277.05
    clat percentiles (usec):
     |  1.00th=[   32],  5.00th=[   33], 10.00th=[   34], 20.00th=[   35],
     | 30.00th=[   40], 40.00th=[   51], 50.00th=[   95], 60.00th=[  163],
     | 70.00th=[  175], 80.00th=[  187], 90.00th=[  203], 95.00th=[  213],
     | 99.00th=[  235], 99.50th=[  249], 99.90th=[  342], 99.95th=[  350],
     | 99.99th=[  426]
    bw (KB/s)  : min=41277, max=604032, per=100.00%, avg=558639.53, stdev=69053.48
    lat (usec) : 50=37.10%, 100=13.49%, 250=48.91%, 500=0.49%, 750=0.01%
    lat (usec) : 1000=0.01%
    lat (msec) : 2=0.01%, 4=0.01%, 100=0.01%, 2000=0.01%
  cpu          : usr=2.65%, sys=4.15%, ctx=327684, majf=0, minf=166
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=327680/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
  WRITE: io=20480MB, aggrb=546346KB/s, minb=546346KB/s, maxb=546346KB/s, mint=38385msec, maxt=38385msec
[root@core-n1 ssd-vol-benchmark-n001]#


fio test #1 on the network.compression-enabled mount (fail state -- presumably because
of wire compression):

[root@core-n1 ssd-vol-benchmark-n002]# fio --size=20g --bs=64k --rw=write --ioengine=sync --name=fio.write.out.1
fio.write.out.1: (g=0): rw=write, bs=64K-64K/64K-64K/64K-64K, ioengine=sync, iodepth=1
fio-2.0.13
Starting 1 process
fio.write.out.1: Laying out IO file(s) (1 file(s) / 20480MB)
fio: pid=26846, err=5/file:engines/sync.c:67, func=xfer, error=Input/output error

fio.write.out.1: (groupid=0, jobs=1): err= 5 (file:engines/sync.c:67, func=xfer, error=Input/output error): pid=26846: Fri Mar  7 01:04:43 2014
  write: io=262144 B, bw=36571KB/s, iops=714 , runt=     7msec
    clat (usec): min=35 , max=990 , avg=293.50, stdev=464.98
     lat (usec): min=36 , max=994 , avg=295.00, stdev=466.62
    clat percentiles (usec):
     |  1.00th=[   35],  5.00th=[   35], 10.00th=[   35], 20.00th=[   35],
     | 30.00th=[   55], 40.00th=[   55], 50.00th=[   55], 60.00th=[   94],
     | 70.00th=[   94], 80.00th=[  988], 90.00th=[  988], 95.00th=[  988],
     | 99.00th=[  988], 99.50th=[  988], 99.90th=[  988], 99.95th=[  988],
     | 99.99th=[  988]
    lat (usec) : 50=20.00%, 100=40.00%, 1000=20.00%
  cpu          : usr=0.00%, sys=0.00%, ctx=7, majf=0, minf=46
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=16.7%, 4=83.3%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=5/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
  WRITE: io=256KB, aggrb=36571KB/s, minb=36571KB/s, maxb=36571KB/s, mint=7msec, maxt=7msec
[root@core-n1 ssd-vol-benchmark-n002]#
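
For what it's worth, fio's sync engine issues plain read(2)/write(2) calls (no O_DIRECT
unless --direct=1 is passed), so a simple buffered dd onto the compressed mount should be
enough to check whether ordinary writes hit the same Input/output error (the file name is
just a placeholder):

    # plain buffered 64k writes on the compressed mount, same I/O path as ioengine=sync
    dd if=/dev/zero of=/import/gluster/ssd-vol-benchmark-n002/dd.test bs=64k count=20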



[root@core-n1 ~]# gluster pool list
UUID                                    Hostname                         State
3e49dd15-c05f-4ef8-b1e0-c29c59623b45    core-n2.storage-s0.example.vpn   Connected
2a637a4b-b9a9-4dd5-80b4-78474e9e33cb    core-n5.storage-s0.example.vpn   Connected
6cc8c574-a237-4819-a175-e7218c8606d8    core-n6.storage-s0.example.vpn   Connected
36201442-b264-4078-a2d3-ff61a266f9d3    core-n4.storage-s0.example.vpn   Connected
10ac9d8a-ced5-4964-b8f3-802e7ccd2f2f    core-n3.storage-s0.example.vpn   Connected
7e46d31a-cd08-4677-8fd6-a5e7b9d7e7fe    localhost                        Connected
[root@core-n1 ~]#



[root@core-n1 ~]# gluster volume info ssd-vol-benchmark-n001

Volume Name: ssd-vol-benchmark-n001
Type: Distributed-Replicate
Volume ID: 30dde773-bcba-4bd1-8ed9-6865571283db
Status: Started
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: core-n1.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n001
Brick2: core-n2.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n001
Brick3: core-n3.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n001
Brick4: core-n4.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n001
Brick5: core-n5.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n001
Brick6: core-n6.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n001
Options Reconfigured:
cluster.server-quorum-type: server
server.allow-insecure: on
performance.io-thread-count: 12
auth.allow: 10.30.*
cluster.server-quorum-ratio: 51%
[root@core-n1 ssd-vol-benchmark-n002]# gluster volume info ssd-vol-benchmark-n002

Volume Name: ssd-vol-benchmark-n002
Type: Distributed-Replicate
Volume ID: 809535b2-19d1-457c-a9f7-8b66094b358b
Status: Started
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: core-n1.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n002
Brick2: core-n2.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n002
Brick3: core-n3.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n002
Brick4: core-n4.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n002
Brick5: core-n5.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n002
Brick6: core-n6.storage-s0.example.vpn:/export/ssd-brick-n001/ssd-vol-benchmark-n002
Options Reconfigured:
network.compression.mode: server
network.compression: on
cluster.server-quorum-type: server
server.allow-insecure: on
performance.io-thread-count: 12
auth.allow: 10.30.*
cluster.server-quorum-ratio: 51%
[root@core-n1 ssd-vol-benchmark-n002]#
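
To double-check that the CDC xlator is the variable, compression can be toggled off on
the existing volume and the test repeated; a sketch (the remount is only there to be sure
the client picks up the regenerated volfile):

    gluster volume set ssd-vol-benchmark-n002 network.compression off
    umount /import/gluster/ssd-vol-benchmark-n002
    mount /import/gluster/ssd-vol-benchmark-n002
    # re-run fio test #1, then set network.compression back to on to return to the failing state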


Relevant /etc/fstab entries:


# sda
UUID=3d3b896d-67ef-444c-9f48-4bef621144b6  /boot  ext4   defaults  1 2
UUID=d92d59fe-4e85-4619-8b72-793297d4c076  swap   swap   defaults  0 0
UUID=3fd8c539-dc9e-482b-8f74-76369bf8763e  /                       btrfs 
autodefrag,compress=zlib,ssd,thread_pool=12,subvol=root            0 0
UUID=3fd8c539-dc9e-482b-8f74-76369bf8763e  /home                   btrfs 
autodefrag,compress=zlib,ssd,thread_pool=12,subvol=home            0 0
UUID=3fd8c539-dc9e-482b-8f74-76369bf8763e  /export/ssd-brick-n001  btrfs 
autodefrag,compress=zlib,ssd,thread_pool=12,subvol=ssd-brick-n001  0 0

# sdb
LABEL=hdd-n001  /export/hdd-brick-n001  btrfs 
autodefrag,compress=zlib,thread_pool=12,noatime,subvol=hdd-brick-n001  0 0

# sdc
LABEL=hdd-n002  /export/hdd-brick-n002  btrfs 
autodefrag,compress=zlib,thread_pool=12,noatime,subvol=hdd-brick-n002  0 0

# ssd-vol-benchmark-n001
core-n1.storage-s0.example.vpn:/ssd-vol-benchmark-n001 
/import/gluster/ssd-vol-benchmark-n001  glusterfs 
defaults,_netdev,backup-volfile-servers=core-n2.storage-s0.example.vpn:core-n3.storage-s0.example.vpn:core-n4.storage-s0.example.vpn:core-n5.storage-s0.example.vpn:core-n6.storage-s0.example.vpn,direct-io-mode=enable,volume-name=ssd-vol-benchmark-n001
 0 0

# ssd-vol-benchmark-n002
core-n1.storage-s0.example.vpn:/ssd-vol-benchmark-n002 
/import/gluster/ssd-vol-benchmark-n002  glusterfs 
defaults,_netdev,backup-volfile-servers=core-n2.storage-s0.example.vpn:core-n3.storage-s0.example.vpn:core-n4.storage-s0.example.vpn:core-n5.storage-s0.example.vpn:core-n6.storage-s0.example.vpn,direct-io-mode=enable,volume-name=ssd-vol-benchmark-n002
 0 0

# ssd-vol-ovirt-iops-n001
core-n1.storage-s0.example.vpn:/ssd-vol-ovirt-iops-n001 
/import/gluster/ssd-vol-ovirt-iops-n001  glusterfs 
defaults,_netdev,backup-volfile-servers=core-n2.storage-s0.example.vpn:core-n3.storage-s0.example.vpn:core-n4.storage-s0.example.vpn:core-n5.storage-s0.example.vpn:core-n6.storage-s0.example.vpn,direct-io-mode=enable,volume-name=ssd-vol-ovirt-iops-n001
 0 0
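
For completeness, the direct-io-mode experiment mentioned in the description can also be
done with an explicit mount instead of editing /etc/fstab (same mount point as above;
direct-io-mode accepts enable/disable):

    # mount the compressed volume with direct I/O explicitly disabled
    mount -t glusterfs -o direct-io-mode=disable \
        core-n1.storage-s0.example.vpn:/ssd-vol-benchmark-n002 \
        /import/gluster/ssd-vol-benchmark-n002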





Running:

[root@core-n1 ~]# yum list installed|grep gluster
glusterfs.x86_64                     3.5.0-0.5.beta3.fc19    @/glusterfs-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-api.x86_64                 3.5.0-0.5.beta3.fc19    @/glusterfs-api-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-api-devel.x86_64           3.5.0-0.5.beta3.fc19    @/glusterfs-api-devel-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-cli.x86_64                 3.5.0-0.5.beta3.fc19    @/glusterfs-cli-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-devel.x86_64               3.5.0-0.5.beta3.fc19    @/glusterfs-devel-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-fuse.x86_64                3.5.0-0.5.beta3.fc19    @/glusterfs-fuse-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-geo-replication.x86_64     3.5.0-0.5.beta3.fc19    @/glusterfs-geo-replication-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-libs.x86_64                3.5.0-0.5.beta3.fc19    @/glusterfs-libs-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-rdma.x86_64                3.5.0-0.5.beta3.fc19    @/glusterfs-rdma-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-regression-tests.x86_64    3.5.0-0.5.beta3.fc19    @/glusterfs-regression-tests-3.5.0-0.5.beta3.fc19.x86_64
glusterfs-server.x86_64              3.5.0-0.5.beta3.fc19    @/glusterfs-server-3.5.0-0.5.beta3.fc19.x86_64
[root@core-n1 ~]# uname -a
Linux core-n1.example.com 3.13.5-101.fc19.x86_64 #1 SMP Tue Feb 25 21:25:32 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
[root@core-n1 ~]#

--- Additional comment from josh at wrale.com on 2014-03-07 07:43:17 CET ---

I just ran the same two tests on my HDD bricks (the first two tests were on SSD bricks)
and got the same result (volumes ending in -n002 have compression enabled, while volumes
ending in -n001 do not):

[root@core-n1 hdd-vol-benchmark-n002]# fio --size=20g --bs=64k --rw=write --ioengine=sync --name=fio.write.out.1
fio.write.out.1: (g=0): rw=write, bs=64K-64K/64K-64K/64K-64K, ioengine=sync, iodepth=1
fio-2.0.13
Starting 1 process
fio.write.out.1: Laying out IO file(s) (1 file(s) / 20480MB)
fio: pid=28809, err=5/file:engines/sync.c:67, func=xfer, error=Input/output error

fio.write.out.1: (groupid=0, jobs=1): err= 5 (file:engines/sync.c:67, func=xfer, error=Input/output error): pid=28809: Fri Mar  7 01:40:19 2014
  write: io=262144 B, bw=32000KB/s, iops=625 , runt=     8msec
    clat (usec): min=40 , max=1037 , avg=441.50, stdev=462.21
     lat (usec): min=43 , max=1041 , avg=444.75, stdev=462.63
    clat percentiles (usec):
     |  1.00th=[   40],  5.00th=[   40], 10.00th=[   40], 20.00th=[   40],
     | 30.00th=[  114], 40.00th=[  114], 50.00th=[  114], 60.00th=[  572],
     | 70.00th=[  572], 80.00th=[ 1032], 90.00th=[ 1032], 95.00th=[ 1032],
     | 99.00th=[ 1032], 99.50th=[ 1032], 99.90th=[ 1032], 99.95th=[ 1032],
     | 99.99th=[ 1032]
    lat (usec) : 50=20.00%, 250=20.00%, 750=20.00%
    lat (msec) : 2=20.00%
  cpu          : usr=0.00%, sys=0.00%, ctx=8, majf=0, minf=47
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=16.7%, 4=83.3%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=5/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
  WRITE: io=256KB, aggrb=32000KB/s, minb=32000KB/s, maxb=32000KB/s, mint=8msec, maxt=8msec
[root@core-n1 hdd-vol-benchmark-n002]#
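
If it helps with triage: the client-side error should show up in the FUSE mount log when
the fio job dies. Something like the following pulls the recent error/warning lines -- the
exact log file name is an assumption (it is normally derived from the mount point path):

    grep -E ' [EW] \[' /var/log/glusterfs/import-gluster-ssd-vol-benchmark-n002.log | tail -n 50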


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1073763
[Bug 1073763] network.compression fails simple '--ioengine=sync' fio test