[Gluster-users] Slow performance from simple tar -x && rm -r benchmark

Sun Mar 18 11:20:58 UTC 2012

Hi. I've been considering using GlusterFS as backing storage for one of our
applications, and wanted to get a feel for its reliability and performance on a
toy test cluster.

I took two 16-core Opteron 6128 machines running stock Linux 3.2.2 kernels,
each with a SATA-backed ext4 filesystem mounted with default options, to use
as backing storage for a test glusterfs volume:

  # grep store /proc/mounts 
  /dev/sda2 /store ext4 rw,relatime,user_xattr,acl,barrier=1,data=ordered 0 0

The machines are connected directly by a gigabit Ethernet crossover link.

I built the software from glusterfs-3.2.5.tar.gz with no special configure
options except for a few --XXXdir changes to match our filesystem layout and
--enable-fusermount to avoid having to build the FUSE userspace separately.

I created /etc/glusterfs and /etc/glusterfs/glusterd.vol:

  # cat /etc/glusterfs/glusterd.vol 
  volume management
      type mgmt/glusterd
      option working-directory /etc/glusterd
      option transport-type socket
      option transport.socket.keepalive-time 10
      option transport.socket.keepalive-interval 2
  end-volume
  #

and started glusterd. I then taught the machines about each other's existence:

  # gluster peer probe 172.16.101.11
  Probe successful
  # gluster peer status
  Number of Peers: 1

  Hostname: 172.16.101.11
  Uuid: 52e9f1a2-8404-4945-a769-4b569ec982ed
  State: Accepted peer request (Connected)

and then created and mounted a mirror volume:

  # gluster volume create test replica 2 transport tcp 172.16.101.{9,11}:/store
  Creation of volume test has been successful. Please start the volume to access data.
  # gluster volume start test
  Starting volume test has been successful
  # mount -t glusterfs localhost:/test /mnt/test
  #

Mounting it on both machines, I can see that a file I add on one appears on the
other and so on. Great! Streaming write performance to a large file is close to
that of a local write:

  # time bash -c 'dd if=/dev/zero of=/mnt/test/bigfile bs=1M count=1000; sync'
  1000+0 records in
  1000+0 records out
  1048576000 bytes (1.0 GB) copied, 11.4892 s, 91.3 MB/s

  real    0m11.531s
  user    0m0.000s
  sys     0m1.085s

vs a local write:

  # time bash -c 'dd if=/dev/zero of=/store2/bigfile bs=1M count=1000; sync'
  1000+0 records in
  1000+0 records out
  1048576000 bytes (1.0 GB) copied, 10.67 s, 98.3 MB/s

  real    0m10.912s
  user    0m0.000s
  sys     0m1.753s

However, if I try a simple metadata-intensive benchmark such as unpacking and
deleting a Linux kernel source tree, performance on the glusterfs mount is a
factor of eleven worse than on local storage:

  # time bash -c 'tar xfz ~/linux-3.3-rc7.tgz; rm -rf linux-3.3-rc7'

  real    4m20.493s
  user    0m24.835s
  sys     0m7.119s

vs the same command run on local storage:

  # time bash -c 'tar xfz ~/linux-3.3-rc7.tgz; rm -rf linux-3.3-rc7'

  real    0m23.196s
  user    0m20.775s
  sys     0m2.287s
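
For reference, the factor of eleven comes straight from the two "real" times
above. A quick sketch of the arithmetic, assuming an awk is available (the
helper name to_seconds is only for illustration):

```shell
# Convert an "XmY.YYYs" wall-clock time, as printed by `time`, into seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

gluster=$(to_seconds 4m20.493s)    # real time on the glusterfs mount
local_fs=$(to_seconds 0m23.196s)   # real time on local storage

# prints "slowdown: 11.2x"
awk -v g="$gluster" -v l="$local_fs" 'BEGIN { printf "slowdown: %.1fx\n", g / l }'
```

Note that user and sys time are similar in both runs, so nearly all of the
extra four minutes is spent waiting rather than computing.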

Is this normal, or do I have something badly misconfigured? Is there anything I
can do to improve performance when creating and deleting small files on
glusterfs? I see that I already have client-side write-behind in the default
/etc/glusterd/vols/test/test-fuse.vol translator stack that was created for me,
and I have tried playing with its parameters a bit, without any real effect.
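
In case a smaller reproducer than a full kernel tree is useful, the same
create-and-delete pattern can be approximated with a rough sketch like the one
below. The function name smallfile_bench, the scratch directory name and the
default file count are my own inventions, and it assumes a date that supports
+%s, so treat it as illustrative only; it only has one-second resolution.

```shell
# Hypothetical helper: create and then delete N small files under DIR,
# reporting elapsed wall-clock seconds for each phase.
smallfile_bench() {
    dir=$1
    n=${2:-1000}
    work="$dir/smallfile-bench.$$"      # scratch directory, removed at the end
    mkdir -p "$work" || return 1

    start=$(date +%s)
    i=0
    while [ "$i" -lt "$n" ]; do
        echo data > "$work/f$i"         # one tiny file per iteration
        i=$((i + 1))
    done
    mid=$(date +%s)
    rm -rf "$work"
    end=$(date +%s)

    echo "create: $((mid - start))s delete: $((end - mid))s files: $n"
}
```

Running something like "smallfile_bench /mnt/test 10000" and then
"smallfile_bench /store2 10000" should give a like-for-like comparison of the
glusterfs mount against local storage without the tar decompression cost.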

Best wishes,

Chris.