[Gluster-users] odd results/questions about performance
Alex Mosolov
Alex.Mosolov at hillcrestlabs.com
Tue Sep  2 15:13:11 UTC 2008

Hi guys,
 
I've set up a glusterfs cluster for image serving from a web server.
Overall it works great, but after running some performance tests, I'm
seeing some very odd results that I'm hoping somebody can shed some
light on.
 
The setup is as follows: three machines, each running a glusterfs server
that exports a namespace brick and a regular data brick (both over TCP);
a glusterfs client on each machine with afr set up across all three
servers for both the data bricks and the namespace, and unify on top;
and a web server on each machine.
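
To summarize, the stacks end up roughly like this (the full volume
configs are at the end of this message):

Client, top to bottom:
    wb (write-behind)
      -> iot (io-threads)
        -> unify (scheduler rr)
             data:      clients-afr    = afr over "brick" on 192.168.20.1/.2/.3
             namespace: clients-ns-afr = afr over "brick-ns" on 192.168.20.1/.2/.3

Server, on each machine:
    protocol/server -> brick (io-threads) -> plocks (posix-locks) -> posix (storage/posix)
    protocol/server -> brick-ns (storage/posix)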
 
For the tests, I used JMeter to request 8k and 16k files from the web
server, which in turn read them from glusterfs.  In general, glusterfs
throughput was between 50% and 90% of what I get serving the same files
directly from local disk.
 
The oddities:
 
- Reading the same file repeatedly was orders of magnitude slower than
reading randomly from a set of 100+ files.  The glusterfs server log has
messages about a lock request for that file being queued.  I thought this
might be due to access-time updates, so I tried mounting with noatime,
but it made no difference.
 
- setting "option read-subvolume" on the unify translator to point to a
local machine results in lower performance than pointing to a remote
machine.  If read-subvolume points to the local machine, but the
glusterfs server is down on it, the performance is *significantly
better* than it is if the read-subvolume points to a remote machine or
isn't specified.
 
- Adding an io-cache translator on the client reduced performance
considerably (both at the top of the translator stack and between afr
and its subvolumes).  Adding read-ahead at the top of the translator
stack also reduced performance considerably.  Adding the write-behind
translator improved read performance.  (A sketch of where these were
placed is after this list.)
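
For reference, here is roughly where the extra performance translators
were placed relative to the client config below (the option values shown
here are illustrative only, not necessarily the exact ones tested):

# io-cache at the top of the client translator stack
# (the other variation was one io-cache per protocol/client volume,
#  between afr and its subvolumes)
volume ioc
    type performance/io-cache
    subvolumes wb
    option cache-size 64MB     # illustrative
    option page-size 128KB     # illustrative
end-volume

# read-ahead at the top of the stack (separate experiment)
volume ra
    type performance/read-ahead
    subvolumes wb
    option page-size 128KB     # illustrative
    option page-count 4        # illustrative
end-volume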
 
 
I'd really appreciate any insight into these.  These behaviors are the
opposite of what I'd expect, but I'm sure they make sense to someone
familiar with the internal workings of glusterfs.  Any other ideas for
improving performance further would also be welcome.
 
 
Thank you,
Alex
 
 
Glusterfs is running with direct_io enabled (the default).  The version
used is 1.3.10, running on RHEL 4, x86_64.
 
Server volume config:
 
volume posix
  type storage/posix
  option directory /var/opt/hcserv/glusterfs/export
end-volume
 
volume plocks
  type features/posix-locks
  subvolumes posix
end-volume
 
volume brick
  type performance/io-threads
  option thread-count 2
  subvolumes plocks
end-volume
 
volume brick-ns
  type storage/posix
  option directory /var/opt/hcserv/glusterfs/export-ns
end-volume
 
volume server
  type protocol/server
  option transport-type tcp/server
  subvolumes brick brick-ns
  option auth.ip.brick.allow *
  option auth.ip.brick-ns.allow *
end-volume
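
(For reference, each server runs glusterfsd pointed at this spec file,
with something like the following; the path is illustrative:

glusterfsd -f /etc/glusterfs/glusterfs-server.vol )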
 
Client volume config:
 
volume 192.168.20.1
    type protocol/client
    option transport-type tcp/client
    option remote-host 192.168.20.1
    option remote-subvolume brick
end-volume
 
volume 192.168.20.1-ns
    type protocol/client
    option transport-type tcp/client
    option remote-host 192.168.20.1
    option remote-subvolume brick-ns
end-volume
 
volume 192.168.20.2
    type protocol/client
    option transport-type tcp/client
    option remote-host 192.168.20.2
    option remote-subvolume brick
end-volume
 
volume 192.168.20.2-ns
    type protocol/client
    option transport-type tcp/client
    option remote-host 192.168.20.2
    option remote-subvolume brick-ns
end-volume
 
volume 192.168.20.3
    type protocol/client
    option transport-type tcp/client
    option remote-host 192.168.20.3
    option remote-subvolume brick
end-volume
 
volume 192.168.20.3-ns
    type protocol/client
    option transport-type tcp/client
    option remote-host 192.168.20.3
    option remote-subvolume brick-ns
end-volume
 
volume clients-afr
    type cluster/afr
    subvolumes 192.168.20.1 192.168.20.2 192.168.20.3
end-volume
 
volume clients-ns-afr
    type cluster/afr
    subvolumes 192.168.20.1-ns 192.168.20.2-ns 192.168.20.3-ns
end-volume
 
volume unify
    type cluster/unify
    subvolumes clients-afr
    option scheduler rr
    option namespace clients-ns-afr
end-volume
 
volume iot
    type performance/io-threads
    option thread-count 2
    subvolumes unify
end-volume
 
volume wb
    type performance/write-behind
    subvolumes iot
    option aggregate-size 1MB
end-volume
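
(The client is mounted with the stock glusterfs command pointed at this
spec file, with something like the following; paths are illustrative:

glusterfs -f /etc/glusterfs/glusterfs-client.vol /mnt/glusterfs )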
 