[Gluster-users] Very slow performance on Sharded GlusterFS

Mon Jul 3 15:12:05 UTC 2017

Hi Krutika,

Have you been able to look at my profiles? Do you have any clues, ideas, or suggestions?

Thanks,

-Gencer

From: Krutika Dhananjay [mailto:kdhananj at redhat.com] 
Sent: Friday, June 30, 2017 3:50 PM
To: gencer at gencgiyen.com
Cc: gluster-user <gluster-users at gluster.org>
Subject: Re: [Gluster-users] Very slow performance on Sharded GlusterFS

Just noticed that the way you have configured your brick order during volume-create makes both replicas of every set reside on the same machine.
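To illustrate the point about brick order: with `replica 2`, each consecutive pair of bricks on the `volume create` command line forms one replica set, so listing all ten bricks of one host first pairs each brick with a neighbour on the same machine. The following is only a sketch of an interleaved order, not the command that was actually used. It assumes the host names and brick paths from the volume info quoted further down in this thread, and it assumes the volume is being recreated from scratch. The helper name `interleave_bricks` is made up for this example, and the script only prints the command so it can be reviewed first.

```shell
#!/bin/sh
# Build a brick list in which every consecutive pair spans both hosts,
# so that each replica set has one copy on each machine.
# interleave_bricks is a hypothetical helper, not part of the gluster CLI.
interleave_bricks() {
    host_a=$1
    host_b=$2
    count=$3
    list=""
    i=1
    while [ "$i" -le "$count" ]; do
        list="$list $host_a:/bricks/brick$i $host_b:/bricks/brick$i"
        i=$((i + 1))
    done
    # Drop the leading space added by the first iteration.
    echo "${list# }"
}

BRICKS=$(interleave_bricks sr-09-loc-50-14-18 sr-10-loc-50-14-18 10)

# Print the command instead of running it; review it before using it.
echo "gluster volume create testvol replica 2 transport tcp $BRICKS"
```

With this order, Brick1 and Brick2 of the resulting volume would sit on different hosts, and likewise for every following pair.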

That apart, do you see any difference if you change shard-block-size to 512MB? Could you try that?
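As a rough illustration of why the block size could matter, the sketch below counts how many blocks a file is split into at the shard sizes mentioned in this thread. The 10GB file size is an assumption chosen for the arithmetic, not a figure from the tests, and `shard_count` is a hypothetical helper. The last line only prints the `volume set` command, using the `features.shard-block-size` option name that appears in the volume info output. As far as I know, a changed block size applies only to files created afterwards, so the test file would need to be written again.

```shell
#!/bin/sh
# Number of blocks a file is split into for a given shard-block-size.
# Both sizes are in MB; shard_count is a hypothetical helper.
shard_count() {
    file_mb=$1
    shard_mb=$2
    # Round up: a partial final block still needs its own shard.
    echo $(( (file_mb + shard_mb - 1) / shard_mb ))
}

# Assumed example: a 10GB (10240MB) file at each block size from the thread.
for size in 4 32 512; do
    echo "shard-block-size ${size}MB: $(shard_count 10240 "$size") blocks"
done

# Print the command for changing the option; <VOL> is a placeholder.
echo "gluster volume set <VOL> features.shard-block-size 512MB"
```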

If it doesn't help, could you share the volume-profile output for both tests (separately)?

Here's what you do:

1. Start profile before starting your test - it could be dd or it could be file download.

# gluster volume profile <VOL> start

2. Run your test - again either dd or file-download.

3. Once the test has completed, run `gluster volume profile <VOL> info` and redirect its output to a tmp file.

4. Stop profile

# gluster volume profile <VOL> stop

And attach the volume-profile output file that you saved at a temporary location in step 3.
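The four steps above can be wrapped in a small script so that each test gets its own profile file. This is a minimal sketch, not an official tool: the function name `profile_run`, the `GLUSTER` override variable, the output path, and the mount point in the usage comment are all assumptions made for this example. Only the three `gluster volume profile` subcommands come from the steps above.

```shell
#!/bin/sh
# GLUSTER can be overridden (for example GLUSTER=echo) for a dry run.
GLUSTER=${GLUSTER:-gluster}

# profile_run VOL OUTFILE COMMAND [ARGS...]
# Starts profiling, runs the given workload, saves the profile info
# to OUTFILE, then stops profiling. Returns the workload's exit status.
profile_run() {
    vol=$1
    out=$2
    shift 2
    "$GLUSTER" volume profile "$vol" start || return 1
    # Run the workload passed as the remaining arguments.
    "$@"
    status=$?
    "$GLUSTER" volume profile "$vol" info > "$out"
    "$GLUSTER" volume profile "$vol" stop
    return "$status"
}

# Usage (hypothetical mount point and file names):
# profile_run testvol /tmp/profile-dd.txt \
#     dd if=/dev/zero of=/mnt/testvol/ddtest bs=1M count=1024 conv=fdatasync
```

Running it once per test, with a different output file each time, keeps the dd and download profiles separate as requested.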

-Krutika

On Fri, Jun 30, 2017 at 5:33 PM, <gencer at gencgiyen.com> wrote:

Hi Krutika,

Sure, here is volume info:

root at sr-09-loc-50-14-18:/# gluster volume info testvol

Volume Name: testvol
Type: Distributed-Replicate
Volume ID: 30426017-59d5-4091-b6bc-279a905b704a
Status: Started
Snapshot Count: 0
Number of Bricks: 10 x 2 = 20
Transport-type: tcp
Bricks:
Brick1: sr-09-loc-50-14-18:/bricks/brick1
Brick2: sr-09-loc-50-14-18:/bricks/brick2
Brick3: sr-09-loc-50-14-18:/bricks/brick3
Brick4: sr-09-loc-50-14-18:/bricks/brick4
Brick5: sr-09-loc-50-14-18:/bricks/brick5
Brick6: sr-09-loc-50-14-18:/bricks/brick6
Brick7: sr-09-loc-50-14-18:/bricks/brick7
Brick8: sr-09-loc-50-14-18:/bricks/brick8
Brick9: sr-09-loc-50-14-18:/bricks/brick9
Brick10: sr-09-loc-50-14-18:/bricks/brick10
Brick11: sr-10-loc-50-14-18:/bricks/brick1
Brick12: sr-10-loc-50-14-18:/bricks/brick2
Brick13: sr-10-loc-50-14-18:/bricks/brick3
Brick14: sr-10-loc-50-14-18:/bricks/brick4
Brick15: sr-10-loc-50-14-18:/bricks/brick5
Brick16: sr-10-loc-50-14-18:/bricks/brick6
Brick17: sr-10-loc-50-14-18:/bricks/brick7
Brick18: sr-10-loc-50-14-18:/bricks/brick8
Brick19: sr-10-loc-50-14-18:/bricks/brick9
Brick20: sr-10-loc-50-14-18:/bricks/brick10
Options Reconfigured:
features.shard-block-size: 32MB
features.shard: on
transport.address-family: inet
nfs.disable: on

-Gencer.

From: Krutika Dhananjay [mailto:kdhananj at redhat.com]
Sent: Friday, June 30, 2017 2:50 PM
To: gencer at gencgiyen.com
Cc: gluster-user <gluster-users at gluster.org>
Subject: Re: [Gluster-users] Very slow performance on Sharded GlusterFS

Could you please provide the volume-info output?

-Krutika

On Fri, Jun 30, 2017 at 4:23 PM, <gencer at gencgiyen.com> wrote:

Hi,

I have 2 nodes with 20 bricks in total (10+10).

First test: 

2 Nodes with Distributed – Striped – Replicated (2 x 2)

10GbE Speed between nodes

“dd” performance: 400mb/s and higher

Downloading a large file from internet and directly to the gluster: 250-300mb/s

Now the same test without stripe but with sharding. The results are the same whether I set the shard size to 4MB or 32MB. (Again, replica 2 here.)

dd performance: 70mb/s

Download directly to the gluster performance: 60mb/s

Now, if we run this test twice at the same time (two dd runs or two downloads), each one drops below 25mb/s.

I thought sharding would be at least equal, or maybe a little slower, but these results are terribly slow.

I tried tuning (cache, window-size etc..). Nothing helps.

I am using GlusterFS 3.11 on Debian 9. The kernel is also tuned. The disks are “xfs” and 4TB each.

Is there any tweak/tuning out there to make it fast?

Or is this expected behavior? If it is, it is unacceptable. I cannot use this in production as it is terribly slow.

The reason I use sharding instead of stripe is that I would like to eliminate the problem of files that are bigger than the brick size.

Thanks,

Gencer.

_______________________________________________
Gluster-users mailing list
Gluster-users at gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users
