[Gluster-devel] Performance report and some issues

Jordi Moles jordi at cdmon.com
Thu Mar 6 12:58:49 UTC 2008


Hi,

I want to report the performance issues I've run into so far with 
glusterfs mainline 2.5, patch 690, and fuse-2.7.2glfs8.

I'm setting up a mail system that runs entirely on Xen 3.2.0; every 
"actual" piece of the mail system is a Xen virtual machine.

Anyway... the virtual machines accessing glusterfs are 6 Dovecot servers 
and 4 Postfix servers. There are also 6 storage nodes, each exporting its 
own disk to the gluster filesystem. Two of the nodes export 2 disks: one 
for the glusterfs data and the other for the namespace.

These are the conf files:

****nodes with namespace****

volume esp
    type storage/posix
    option directory /mnt/compartit
end-volume

volume espa
    type features/posix-locks
    subvolumes esp
end-volume

volume espai
   type performance/io-threads
   option thread-count 15
   option cache-size 512MB
   subvolumes espa
end-volume

volume nm
    type storage/posix
    option directory /mnt/namespace
end-volume

volume ultim
    type protocol/server
    subvolumes espai nm
    option transport-type tcp/server
    option auth.ip.espai.allow *
    option auth.ip.nm.allow *
end-volume

*************


***nodes without namespace*****

volume esp
    type storage/posix
    option directory /mnt/compartit
end-volume

volume espa
    type features/posix-locks
    subvolumes esp
end-volume

volume espai
   type performance/io-threads
   option thread-count 15
   option cache-size 512MB
   subvolumes espa
end-volume

volume ultim
    type protocol/server
    subvolumes espai
    option transport-type tcp/server
    option auth.ip.espai.allow *
end-volume

*****************************


***clients****

volume espai1
    type protocol/client
    option transport-type tcp/client
    option remote-host 192.168.1.204
    option remote-subvolume espai
end-volume

volume espai2
    type protocol/client
    option transport-type tcp/client
    option remote-host 192.168.1.205
    option remote-subvolume espai
end-volume

volume espai3
    type protocol/client
    option transport-type tcp/client
    option remote-host 192.168.1.206
    option remote-subvolume espai
end-volume

volume espai4
    type protocol/client
    option transport-type tcp/client
    option remote-host 192.168.1.207
    option remote-subvolume espai
end-volume

volume espai5
    type protocol/client
    option transport-type tcp/client
    option remote-host 192.168.1.213
    option remote-subvolume espai
end-volume

volume espai6
    type protocol/client
    option transport-type tcp/client
    option remote-host 192.168.1.214
    option remote-subvolume espai
end-volume

volume namespace1
    type protocol/client
    option transport-type tcp/client
    option remote-host 192.168.1.204
    option remote-subvolume nm
end-volume

volume namespace2
    type protocol/client
    option transport-type tcp/client
    option remote-host 192.168.1.205
    option remote-subvolume nm
end-volume

volume grup1
    type cluster/afr
    subvolumes espai1 espai2
end-volume

volume grup2
    type cluster/afr
    subvolumes espai3 espai4
end-volume

volume grup3
    type cluster/afr
    subvolumes espai5 espai6
end-volume

volume nm
    type cluster/afr
    subvolumes namespace1 namespace2
end-volume

volume ultim
    type cluster/unify
    subvolumes grup1 grup2 grup3
    option scheduler rr
    option namespace nm
end-volume

************

The thing is that with earlier patches the whole system used to hang, 
with many different error messages.

Right now it has been up for days without a single hang, but I'm facing 
serious performance issues.

Just running an "ls" command can take around 3 seconds to show anything 
when the system is "under load". It doesn't happen at all when there is 
no activity, so I don't think it has anything to do with Xen. Actually, 
"under load" can mean as little as 3 mails arriving per second. I'm 
monitoring everything, and no virtual machine uses more than roughly 20% 
of its CPU.
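
One thing I haven't tried yet is putting "performance/stat-prefetch" on 
the client side, on top of the unify volume, since I understand it is 
meant to speed up readdir/stat-heavy operations like "ls". I'm not sure 
it's the right fix, but something like this is what I have in mind 
(completely untested, and the volume name is just one I made up):

***possible stat-prefetch on the client (untested)***

volume sp
    type performance/stat-prefetch
    subvolumes ultim
end-volume

************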

At first I had the log level on both nodes and clients set to DEBUG, but 
now it is just WARNING, and I've restarted everything many times.

It was suggested that I use "type performance/io-threads" on the node 
side. That actually helped: before that it wasn't 3 seconds, but 5 or 
more. I've tried different values for "thread-count" and "cache-size".
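
On the client side I only have protocol/client, afr and unify; I haven't 
tried stacking performance translators such as write-behind or read-ahead 
on top of the unify volume. If that is worth trying, this is roughly what 
I would add at the end of the client config (the option values are just 
guesses on my part):

***possible client-side performance translators (untested)***

volume wb
    type performance/write-behind
    option aggregate-size 128KB
    subvolumes ultim
end-volume

volume ra
    type performance/read-ahead
    option page-size 128KB
    option page-count 16
    subvolumes wb
end-volume

************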

The system is supposed to handle a large amount of traffic, far more 
than 3 mails a second.

What do you think about the whole setup? Should I keep using a namespace? 
Should I use dedicated nodes for the namespace? Should I use different 
values for io-threads?

One last thing... I'm using ReiserFS on the "storage devices" that the 
nodes export. Should I be using XFS or something else?

The logs don't show any errors now... I don't have a clue what is 
failing at this point.

I would appreciate any ideas you could give me.

Thank you.




