[Gluster-devel] Performance report and some issues
Jordi Moles
jordi at cdmon.com
Thu Mar 6 17:01:01 UTC 2008
Hi,

I thought that apart from write-behind on the postfix side, which is all about writing, I could also add io-cache on the dovecot side.

So I added both over the afr blocks (write-behind on postfix and io-cache on dovecot), but performance hasn't improved at all. I've tried just one of them (write-behind or io-cache) and also both at the same time, but I can't see any improvement.

If I look at postfix's log file, I can see that mails are delivered "in groups": there are stretches of about 5 seconds in which no mail is delivered, and then about 5 mails are delivered in the same second. It's odd; I've never seen this in other mail systems I've set up, especially when mails are sent to the system every second.
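Concretely, what I tried on the dovecot side is one io-cache volume loaded over each afr group, like this (only a sketch of my attempt; the volume name "cache1" and the page-size/cache-size values are examples I am still tuning):

volume cache1
type performance/io-cache
option page-size 256KB
option cache-size 64MB
subvolumes grup1
end-volume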
*************** Earlier I wrote:
Hi,
thanks for the info, but... where should I add the write-behind code?
Under afr? Over afr?
Something like this:
***clients****
volume espa1
type protocol/client
option transport-type tcp/client
option remote-host 192.168.1.204
option remote-subvolume espai
end-volume
volume espai1
type performance/write-behind
option aggregate-size 1MB
option flush-behind on
subvolumes espa1
end-volume
volume espa2
type protocol/client
option transport-type tcp/client
option remote-host 192.168.1.205
option remote-subvolume espai
end-volume
volume espai2
type performance/write-behind
option aggregate-size 1MB
option flush-behind on
subvolumes espa2
end-volume
.......
.......
volume namespace1
type protocol/client
option transport-type tcp/client
option remote-host 192.168.1.204
option remote-subvolume nm
end-volume
volume namespace2
type protocol/client
option transport-type tcp/client
option remote-host 192.168.1.205
option remote-subvolume nm
end-volume
volume grup1
type cluster/afr
subvolumes espai1 espai2
end-volume
volume grup2
type cluster/afr
subvolumes espai3 espai4
end-volume
volume grup3
type cluster/afr
subvolumes espai5 espai6
end-volume
volume nm
type cluster/afr
subvolumes namespace1 namespace2
end-volume
volume ultim
type cluster/unify
subvolumes grup1 grup2 grup3
option scheduler rr
option namespace nm
end-volume
************
or maybe like this:
***clients****
volume espai1
type protocol/client
option transport-type tcp/client
option remote-host 192.168.1.204
option remote-subvolume espai
end-volume
volume espai2
type protocol/client
option transport-type tcp/client
option remote-host 192.168.1.205
option remote-subvolume espai
end-volume
.......
........
volume namespace1
type protocol/client
option transport-type tcp/client
option remote-host 192.168.1.204
option remote-subvolume nm
end-volume
volume namespace2
type protocol/client
option transport-type tcp/client
option remote-host 192.168.1.205
option remote-subvolume nm
end-volume
volume gru1
type cluster/afr
subvolumes espai1 espai2
end-volume
volume grup1
type performance/write-behind
option aggregate-size 1MB
option flush-behind on
subvolumes gru1
end-volume
volume gru2
type cluster/afr
subvolumes espai3 espai4
end-volume
volume grup2
type performance/write-behind
option aggregate-size 1MB
option flush-behind on
subvolumes gru2
end-volume
...
...
volume ultim
type cluster/unify
subvolumes grup1 grup2 grup3
option scheduler rr
option namespace nm
end-volume
************
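A third possibility I have been wondering about would be to load write-behind just once, on top of the unify volume, so all writes pass through a single instance (only a sketch, not tested; the volume name "final" is just an example):

volume final
type performance/write-behind
option aggregate-size 1MB
option flush-behind on
subvolumes ultim
end-volume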
Amar S. Tumballi wrote:
> Hi Jordi,
>  I see no performance translators on the client side. You can load
> write-behind and read-ahead/io-cache on the client side. Without
> write-behind loaded, write performance will be *very* low.
>
> Regards,
> Amar
> On Thu, Mar 6, 2008 at 4:58 AM, Jordi Moles <jordi at cdmon.com> wrote:
>
>> Hi,
>>
>> I want to report the performance issues I've had so far with
>> glusterfs mainline 2.5, patch 690 and fuse-2.7.2glfs8.
>>
>> I'm setting up a mail system, which is entirely run by Xen 3.2.0;
>> every "actual" piece of the mail system is a Xen virtual machine.
>>
>> Anyway, the virtual machines accessing glusterfs are 6 dovecots and 4
>> postfixes. There are also 6 nodes, which share their own disk with the
>> gluster filesystem. Two of the nodes share 2 disks each: one for the
>> glusterfs data, and the other for the namespace.
>>
>> these are the conf files:
>>
>> ****nodes with namespace****
>>
>> volume esp
>> type storage/posix
>> option directory /mnt/compartit
>> end-volume
>>
>> volume espa
>> type features/posix-locks
>> subvolumes esp
>> end-volume
>>
>> volume espai
>> type performance/io-threads
>> option thread-count 15
>> option cache-size 512MB
>> subvolumes espa
>> end-volume
>>
>> volume nm
>> type storage/posix
>> option directory /mnt/namespace
>> end-volume
>>
>> volume ultim
>> type protocol/server
>> subvolumes espai nm
>> option transport-type tcp/server
>> option auth.ip.espai.allow *
>> option auth.ip.nm.allow *
>> end-volume
>>
>> *************
>>
>>
>> ***nodes without namespace*****
>>
>> volume esp
>> type storage/posix
>> option directory /mnt/compartit
>> end-volume
>>
>> volume espa
>> type features/posix-locks
>> subvolumes esp
>> end-volume
>>
>> volume espai
>> type performance/io-threads
>> option thread-count 15
>> option cache-size 512MB
>> subvolumes espa
>> end-volume
>>
>> volume ultim
>> type protocol/server
>> subvolumes espai
>> option transport-type tcp/server
>> option auth.ip.espai.allow *
>> end-volume
>>
>> *****************************
>>
>>
>> ***clients****
>>
>> volume espai1
>> type protocol/client
>> option transport-type tcp/client
>> option remote-host 192.168.1.204
>> option remote-subvolume espai
>> end-volume
>>
>> volume espai2
>> type protocol/client
>> option transport-type tcp/client
>> option remote-host 192.168.1.205
>> option remote-subvolume espai
>> end-volume
>>
>> volume espai3
>> type protocol/client
>> option transport-type tcp/client
>> option remote-host 192.168.1.206
>> option remote-subvolume espai
>> end-volume
>>
>> volume espai4
>> type protocol/client
>> option transport-type tcp/client
>> option remote-host 192.168.1.207
>> option remote-subvolume espai
>> end-volume
>>
>> volume espai5
>> type protocol/client
>> option transport-type tcp/client
>> option remote-host 192.168.1.213
>> option remote-subvolume espai
>> end-volume
>>
>> volume espai6
>> type protocol/client
>> option transport-type tcp/client
>> option remote-host 192.168.1.214
>> option remote-subvolume espai
>> end-volume
>>
>> volume namespace1
>> type protocol/client
>> option transport-type tcp/client
>> option remote-host 192.168.1.204
>> option remote-subvolume nm
>> end-volume
>>
>> volume namespace2
>> type protocol/client
>> option transport-type tcp/client
>> option remote-host 192.168.1.205
>> option remote-subvolume nm
>> end-volume
>>
>> volume grup1
>> type cluster/afr
>> subvolumes espai1 espai2
>> end-volume
>>
>> volume grup2
>> type cluster/afr
>> subvolumes espai3 espai4
>> end-volume
>>
>> volume grup3
>> type cluster/afr
>> subvolumes espai5 espai6
>> end-volume
>>
>> volume nm
>> type cluster/afr
>> subvolumes namespace1 namespace2
>> end-volume
>>
>> volume ultim
>> type cluster/unify
>> subvolumes grup1 grup2 grup3
>> option scheduler rr
>> option namespace nm
>> end-volume
>>
>> ************
>>
>> The thing is that with earlier patches the whole system used to hang,
>> with many different error messages.
>>
>> Right now it's been up for days without any hang at all, but I'm
>> facing serious performance issues.
>>
>> Just running an "ls" command can take about 3 seconds to show anything
>> when the system is "under load". It doesn't happen at all when there's
>> no activity, so I don't think it has anything to do with Xen. Actually,
>> "under load" can mean as little as 3 mails arriving per second. I'm
>> monitoring everything, and no virtual machine is using more than about
>> 20% of CPU.
>>
>> At first I had the log level on both nodes and clients set to DEBUG,
>> but now it's just WARNING, and I've restarted everything many times.
>>
>> It was suggested that I use "type performance/io-threads" on the node
>> side. It actually helped: before that, "ls" took not 3 seconds but 5
>> or more. I've tried different values for "thread-count" and also for
>> "cache-size".
>>
>> The system is supposed to handle a large amount of traffic, far more
>> than 3 mails a second.
>>
>> What do you think about the whole setup? Should I keep using a
>> namespace? Should I use dedicated nodes for the namespaces? Should I
>> try different values for io-threads?
>>
>> One last thing: I'm using reiserfs on the "storage devices" that the
>> nodes share. Should I be using XFS or something else?
>>
>> The logs don't show any kind of error now, so I don't have a clue
>> about what is failing.
>>
>> I would be pleased if you could give me some ideas.
>>
>> Thank you.
>>
>>
>> _______________________________________________
>> Gluster-devel mailing list
>> Gluster-devel at nongnu.org
>> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>>
>
>
>