[Gluster-users] Poor Gluster performance

Lars Hanke debian at lhanke.de
Wed Feb 18 23:10:49 UTC 2015


Am 18.02.2015 um 23:26 schrieb Ben Turner:
> ----- Original Message -----
>> From: "Lars Hanke" <debian at lhanke.de>
>> To: "Ben Turner" <bturner at redhat.com>
>> Cc: gluster-users at gluster.org
>> Sent: Wednesday, February 18, 2015 5:09:19 PM
>> Subject: Re: [Gluster-users] Poor Gluster performance
>>
>> Am 18.02.2015 um 22:05 schrieb Ben Turner:
>>> ----- Original Message -----
>>>> From: "Lars Hanke" <debian at lhanke.de>
>>>> To: gluster-users at gluster.org
>>>> Sent: Wednesday, February 18, 2015 3:01:54 PM
>>>> Subject: [Gluster-users] Poor Gluster performance
>>>>
>>>> I set up a distributed, replicated volume consisting of just 2 bricks on
>>>> two physical nodes. The nodes are peered using a dedicated GB ethernet
>>>> and can be accessed from the clients using a separate GB ethernet NIC.
>>>>
>>>> Doing a simple dd performance test I see about 11 MB/s for read and
>>>> write. Running a local setup, i.e. both bricks on the same machine and
>>>> local mount, I saw even 500 MB/s. So the network should be the limiting
>>>> factor. But using NFS or CIFS on the same network I see 110 MB/s.
>>>>
>>>> Is gluster 10 times slower than NFS?
>>>
>>> Something is going on there.  On my gigabit setups I see 100-120 MB / sec
>>> writes for pure distribute and about 45-55 MB / sec with replica 2.  What
>>> block size are you using?  I could see that if you were writing something
>>> like 4k or under but 64k and up you should be getting about what I said.
>>> Can you tell me more about your test?
>>
>> Block size is 50M:
>>
>> root at gladsheim:/# mount -t glusterfs node2:/test ~/mnt
>> root at gladsheim:/# dd if=/dev/zero of=~/mnt/testfile.null bs=50M count=10
>> 10+0 records in
>> 10+0 records out
>> 524288000 bytes (524 MB) copied, 46.6079 s, 11.2 MB/s
>> root at gladsheim:/# dd if=~/mnt/testfile.null of=/dev/null bs=50M count=10
>> 10+0 records in
>> 10+0 records out
>> 524288000 bytes (524 MB) copied, 45.7487 s, 11.5 MB/s
>>
>> It doesn't depend on whether I use node1 or node2 for the mount.
>
> Here is how I usually run:
>
> [root at gqac022 gluster-mount]# time `dd if=/dev/zero of=/gluster-mount/test.txt bs=1024k count=1000; sync`
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes (1.0 GB) copied, 9.12639 s, 115 MB/s
> real	0m9.205s
> user	0m0.000s
> sys	0m0.670s
>
> [root at gqac022 gluster-mount]# sync; echo 3 > /proc/sys/vm/drop_caches
>
> [root at gqac022 gluster-mount]# dd if=./test.txt of=/dev/null bs=1024k count=1000
> 1000+0 records in
> 1000+0 records out
> 1048576000 bytes (1.0 GB) copied, 9.04464 s, 116 MB/s
>
> And with your commands:
>
> [root at gqac022 gluster-mount]# dd if=/dev/zero of=/gluster-mount/testfile.null bs=50M count=10
> 10+0 records in
> 10+0 records out
> 524288000 bytes (524 MB) copied, 5.00876 s, 105 MB/s
>
> [root at gqac022 gluster-mount]# sync; echo 3 > /proc/sys/vm/drop_caches
>
> [root at gqac022 gluster-mount]# dd if=./testfile.null of=/dev/null bs=1024k count=1000
> 500+0 records in
> 500+0 records out
> 524288000 bytes (524 MB) copied, 4.51992 s, 116 MB/s
>
> Normally to troubleshoot these issues I break the storage stack into its individual pieces and test each one.  Try running on the bricks outside gluster and see what you are getting.  What tuning are you using?  Is anything nonstandard?  What are the disks?
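That layer-by-layer approach could be sketched roughly as below. This is a hedged sketch, not from the thread: BRICK and MOUNT are placeholder paths (substitute a real brick directory such as /srv/bricks/test1/data and a real glusterfs mount point), and conv=fsync assumes GNU dd.

```shell
# Sketch: test each layer of the stack in isolation.
# BRICK and MOUNT are placeholders; point them at a real brick
# directory and a glusterfs mount before trusting the numbers.
BRICK=${BRICK:-$(mktemp -d)}    # e.g. /srv/bricks/test1/data
MOUNT=${MOUNT:-}                # e.g. /root/mnt (glusterfs mount); empty skips

# Layer 1: brick filesystem. conv=fsync makes dd flush to disk before
# it reports a rate, so the page cache cannot inflate the result.
dd if=/dev/zero of="$BRICK/layer1.bin" bs=1M count=100 conv=fsync 2>&1 | tail -n1

# Layer 2: the gluster mount, only when one is actually configured.
if [ -n "$MOUNT" ] && [ -d "$MOUNT" ]; then
    dd if=/dev/zero of="$MOUNT/layer2.bin" bs=1M count=100 conv=fsync 2>&1 | tail -n1
fi
```

Comparing the two rates shows where the throughput is lost: if layer 1 is fast but layer 2 is slow, the problem is in gluster or the network, not the disks.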

No tuning so far. Just peered and tried my luck.

node1 is an LXC container using LVM2 on RAID5 as its brick. Here I see the following raw performance:

root at muspelheim:~# time `dd if=/dev/zero 
of=/srv/bricks/test1/data/test.txt bs=1024k count=1000; sync`
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 0.313791 s, 3.3 GB/s

real    0m8.444s
user    0m0.004s
sys     0m0.452s
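The 3.3 GB/s figure is almost certainly the page cache rather than the RAID: dd itself returned after 0.31 s, but the backquoted sync pushed the real time to 8.4 s, i.e. roughly 1000 MB / 8.4 s ≈ 120 MB/s of actual disk throughput. A small sketch (assuming GNU dd) that makes the difference visible:

```shell
# Without fsync, dd reports page-cache speed; with conv=fsync it must
# flush to disk before printing its rate, giving the honest number.
TMP=$(mktemp -d)
dd if=/dev/zero of="$TMP/cached.bin"  bs=1M count=100 2>&1 | tail -n1
dd if=/dev/zero of="$TMP/flushed.bin" bs=1M count=100 conv=fsync 2>&1 | tail -n1
```

On a spinning-disk system the second rate is typically a small fraction of the first.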

self-mount on the node yields:

root at muspelheim:~# time `dd if=/dev/zero of=~/mnt/test.txt bs=1024k 
count=1000; sync`
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 10.2122 s, 103 MB/s

real    0m15.492s
user    0m0.012s
sys     0m0.556s

which matches the data rate to node2. node2 is a chroot on a Synology NAS:

root at urdaborn:~# time `dd if=/dev/zero 
of=/srv/bricks/test1/data/test2.txt bs=1024k count=1000; sync`
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 5.31103 s, 197 MB/s

real    0m5.693s
user    0m0.012s
sys     0m4.680s

root at urdaborn:~# time `dd if=/dev/zero of=./mnt/test2.txt bs=1024k 
count=1000; sync`
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 15.172 s, 69.1 MB/s

real    0m15.765s
user    0m0.010s
sys     0m3.353s

So this is already only slightly more than half the network
bandwidth. The connection between the nodes is a dedicated GB ethernet,
so there is no interference from other services on the net.

Back on the test client (LXC on yet another physical machine):

root at gladsheim:/# time `dd if=/dev/zero of=~/mnt/testfile.null bs=1024k 
count=1000; sync`
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 91.6532 s, 11.4 MB/s

real    1m34.534s
user    0m0.004s
sys     0m2.280s

All physical machines are attached to the same switch.

Regards,
  - lars.
