[Gluster-devel] Performance report and some issues
Jordi Moles
jordi at cdmon.com
Thu Mar 6 17:01:01 UTC 2008
Hi,

I thought that apart from write-behind on the postfix side, which is all about writing, I could also add io-cache on the dovecot side.

So I added both over the afr blocks (write-behind on postfix and io-cache on dovecot), but performance hasn't improved at all. I've tried just one of them (write-behind or io-cache) and also both at the same time, but I can't see any improvement.

If I look at postfix's log file, I can see that mails are delivered "in groups": there are stretches of about 5 seconds in which no mail is delivered, and then about 5 mails are delivered in the same second. It's odd; I've never seen this in other mail systems I've set up, especially when mails are sent to the system every second.
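Concretely, what I tried on the dovecot side is one io-cache volume loaded over each afr group, like this (only a sketch of my attempt; the volume name "cache1" and the page-size/cache-size values are examples I am still tuning):

volume cache1
type performance/io-cache
option page-size 256KB
option cache-size 64MB
subvolumes grup1
end-volume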
*************** Earlier I wrote:
Hi,
thanks for the info, but... where should I add the write-behind code?
Under afr? Over afr?
Something like this:
***clients****
volume espa1
type protocol/client
option transport-type tcp/client
option remote-host 192.168.1.204
option remote-subvolume espai
end-volume
volume espai1
type performance/write-behind
option aggregate-size 1MB
option flush-behind on
subvolumes espa1
end-volume
volume espa2
type protocol/client
option transport-type tcp/client
option remote-host 192.168.1.205
option remote-subvolume espai
end-volume
volume espai2
type performance/write-behind
option aggregate-size 1MB
option flush-behind on
subvolumes espa2
end-volume
.......
.......
volume namespace1
type protocol/client
option transport-type tcp/client
option remote-host 192.168.1.204
option remote-subvolume nm
end-volume
volume namespace2
type protocol/client
option transport-type tcp/client
option remote-host 192.168.1.205
option remote-subvolume nm
end-volume
volume grup1
type cluster/afr
subvolumes espai1 espai2
end-volume
volume grup2
type cluster/afr
subvolumes espai3 espai4
end-volume
volume grup3
type cluster/afr
subvolumes espai5 espai6
end-volume
volume nm
type cluster/afr
subvolumes namespace1 namespace2
end-volume
volume ultim
type cluster/unify
subvolumes grup1 grup2 grup3
option scheduler rr
option namespace nm
end-volume
************
or maybe like this:
***clients****
volume espai1
type protocol/client
option transport-type tcp/client
option remote-host 192.168.1.204
option remote-subvolume espai
end-volume
volume espai2
type protocol/client
option transport-type tcp/client
option remote-host 192.168.1.205
option remote-subvolume espai
end-volume
.......
........
volume namespace1
type protocol/client
option transport-type tcp/client
option remote-host 192.168.1.204
option remote-subvolume nm
end-volume
volume namespace2
type protocol/client
option transport-type tcp/client
option remote-host 192.168.1.205
option remote-subvolume nm
end-volume
volume gru1
type cluster/afr
subvolumes espai1 espai2
end-volume
volume grup1
type performance/write-behind
option aggregate-size 1MB
option flush-behind on
subvolumes gru1
end-volume
volume gru2
type cluster/afr
subvolumes espai3 espai4
end-volume
volume grup2
type performance/write-behind
option aggregate-size 1MB
option flush-behind on
subvolumes gru2
end-volume
...
...
volume ultim
type cluster/unify
subvolumes grup1 grup2 grup3
option scheduler rr
option namespace nm
end-volume
************
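A third possibility I have been wondering about would be to load write-behind just once, on top of the unify volume, so all writes pass through a single instance (only a sketch, not tested; the volume name "final" is just an example):

volume final
type performance/write-behind
option aggregate-size 1MB
option flush-behind on
subvolumes ultim
end-volume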
Amar S. Tumballi wrote:
> Hi Jordi,
>  I see no performance translators on the client side. You can load
> write-behind and read-ahead/io-cache on the client side. Without
> write-behind loaded, write performance will be *very* low.
>
> Regards,
> Amar
> On Thu, Mar 6, 2008 at 4:58 AM, Jordi Moles <jordi at cdmon.com> wrote:
>
>> Hi,
>>
>> I want to report the performance issues I've had so far with
>> glusterfs mainline 2.5, patch 690 and fuse-2.7.2glfs8.
>>
>> I'm setting up a mail system, which is entirely run by Xen 3.2.0;
>> every "actual" piece of the mail system is a Xen virtual machine.
>>
>> Anyway, the virtual machines accessing glusterfs are 6 dovecots and 4
>> postfixes. There are also 6 nodes, which share their own disk with the
>> gluster filesystem. Two of the nodes share 2 disks each: one for the
>> glusterfs data, and the other for the namespace.
>>
>> these are the conf files:
>>
>> ****nodes with namespace****
>>
>> volume esp
>> type storage/posix
>> option directory /mnt/compartit
>> end-volume
>>
>> volume espa
>> type features/posix-locks
>> subvolumes esp
>> end-volume
>>
>> volume espai
>> type performance/io-threads
>> option thread-count 15
>> option cache-size 512MB
>> subvolumes espa
>> end-volume
>>
>> volume nm
>> type storage/posix
>> option directory /mnt/namespace
>> end-volume
>>
>> volume ultim
>> type protocol/server
>> subvolumes espai nm
>> option transport-type tcp/server
>> option auth.ip.espai.allow *
>> option auth.ip.nm.allow *
>> end-volume
>>
>> *************
>>
>>
>> ***nodes without namespace*****
>>
>> volume esp
>> type storage/posix
>> option directory /mnt/compartit
>> end-volume
>>
>> volume espa
>> type features/posix-locks
>> subvolumes esp
>> end-volume
>>
>> volume espai
>> type performance/io-threads
>> option thread-count 15
>> option cache-size 512MB
>> subvolumes espa
>> end-volume
>>
>> volume ultim
>> type protocol/server
>> subvolumes espai
>> option transport-type tcp/server
>> option auth.ip.espai.allow *
>> end-volume
>>
>> *****************************
>>
>>
>> ***clients****
>>
>> volume espai1
>> type protocol/client
>> option transport-type tcp/client
>> option remote-host 192.168.1.204
>> option remote-subvolume espai
>> end-volume
>>
>> volume espai2
>> type protocol/client
>> option transport-type tcp/client
>> option remote-host 192.168.1.205
>> option remote-subvolume espai
>> end-volume
>>
>> volume espai3
>> type protocol/client
>> option transport-type tcp/client
>> option remote-host 192.168.1.206
>> option remote-subvolume espai
>> end-volume
>>
>> volume espai4
>> type protocol/client
>> option transport-type tcp/client
>> option remote-host 192.168.1.207
>> option remote-subvolume espai
>> end-volume
>>
>> volume espai5
>> type protocol/client
>> option transport-type tcp/client
>> option remote-host 192.168.1.213
>> option remote-subvolume espai
>> end-volume
>>
>> volume espai6
>> type protocol/client
>> option transport-type tcp/client
>> option remote-host 192.168.1.214
>> option remote-subvolume espai
>> end-volume
>>
>> volume namespace1
>> type protocol/client
>> option transport-type tcp/client
>> option remote-host 192.168.1.204
>> option remote-subvolume nm
>> end-volume
>>
>> volume namespace2
>> type protocol/client
>> option transport-type tcp/client
>> option remote-host 192.168.1.205
>> option remote-subvolume nm
>> end-volume
>>
>> volume grup1
>> type cluster/afr
>> subvolumes espai1 espai2
>> end-volume
>>
>> volume grup2
>> type cluster/afr
>> subvolumes espai3 espai4
>> end-volume
>>
>> volume grup3
>> type cluster/afr
>> subvolumes espai5 espai6
>> end-volume
>>
>> volume nm
>> type cluster/afr
>> subvolumes namespace1 namespace2
>> end-volume
>>
>> volume ultim
>> type cluster/unify
>> subvolumes grup1 grup2 grup3
>> option scheduler rr
>> option namespace nm
>> end-volume
>>
>> ************
>>
>> The thing is that with earlier patches the whole system used to hang,
>> with many different error messages.
>>
>> Right now it's been up for days without any hang at all, but I'm
>> facing serious performance issues.
>>
>> Just running an "ls" command can take about 3 seconds to show anything
>> when the system is "under load". It doesn't happen at all when there's
>> no activity, so I don't think it has anything to do with Xen. Actually,
>> "under load" can mean as little as 3 mails arriving per second. I'm
>> monitoring everything, and no virtual machine is using more than about
>> 20% of CPU.
>>
>> At first I had the log level on both nodes and clients set to DEBUG,
>> but now it's just WARNING, and I've restarted everything many times.
>>
>> It was suggested that I use "type performance/io-threads" on the node
>> side. It actually helped: before that, "ls" took not 3 seconds but 5
>> or more. I've tried different values for "thread-count" and also for
>> "cache-size".
>>
>> The system is supposed to handle a large amount of traffic, far more
>> than 3 mails a second.
>>
>> What do you think about the whole setup? Should I keep using a
>> namespace? Should I use dedicated nodes for the namespaces? Should I
>> try different values for io-threads?
>>
>> One last thing: I'm using reiserfs on the "storage devices" that the
>> nodes share. Should I be using XFS or something else?
>>
>> The logs don't show any kind of error now, so I don't have a clue
>> about what is failing.
>>
>> I would be pleased if you could give me some ideas.
>>
>> Thank you.
>>
>>
>> _______________________________________________
>> Gluster-devel mailing list
>> Gluster-devel at nongnu.org
>> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>>
>
>
>