[Gluster-users] Slow write times to gluster disk
Pranith Kumar Karampuri
pkarampu at redhat.com
Wed May 10 17:27:46 UTC 2017
On Wed, May 10, 2017 at 10:15 PM, Pat Haley <phaley at mit.edu> wrote:
>
> Hi Pranith,
>
> Not entirely sure (this isn't my area of expertise). I'll run your answer
> by some other people who are more familiar with this.
>
> I am also uncertain how to interpret the results when we add the dd
> tests writing to the /home area (no gluster, still on the same machine):
>
> - dd test without oflag=sync (rough average of multiple tests)
>   - gluster w/ fuse mount: 570 MB/s
>   - gluster w/ nfs mount: 390 MB/s
>   - nfs (no gluster): 1.2 GB/s
> - dd test with oflag=sync (rough average of multiple tests)
>   - gluster w/ fuse mount: 5 MB/s
>   - gluster w/ nfs mount: 200 MB/s
>   - nfs (no gluster): 20 MB/s
>
> Given that the non-gluster area is a RAID-6 of 4 disks while each brick of
> the gluster area is a RAID-6 of 32 disks, I would naively expect the writes
> to the gluster area to be roughly 8x faster than to the non-gluster.
>
I think a better test is to write to a file over NFS, with no gluster
involved, to a location that is not inside the brick but is on the same
disk(s). If you are mounting the partition as the brick, then we can write
to a file inside the .glusterfs directory, something like
<brick-path>/.glusterfs/<file-to-be-removed-after-test>.
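A rough sketch of that test as a shell snippet. TESTDIR is a stand-in, not an actual path from this thread: point it at a directory on the same disks as the brick, reached over plain NFS with no gluster in the path.

```shell
#!/bin/sh
# TESTDIR is a stand-in: on the real setup it would be something like the
# brick's .glusterfs directory mounted over plain NFS. Defaults to /tmp so
# the snippet runs as-is.
TESTDIR=${TESTDIR:-/tmp}

# 64 MiB synchronous write; raise count to 4096 for a 4 GiB run matching
# the tests quoted earlier in this thread.
dd if=/dev/zero of="$TESTDIR/ddtest.tmp" bs=1048576 count=64 oflag=sync

# Record the size, then remove the test file as suggested above.
BYTES=$(stat -c %s "$TESTDIR/ddtest.tmp")
rm -f "$TESTDIR/ddtest.tmp"
echo "wrote $BYTES bytes"
```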
> I still think we have a speed issue; I can't tell if fuse vs nfs is part
> of the problem.
>
I got interested in this post because I read that FUSE was slower than NFS,
which is counter-intuitive to my understanding, so I wanted clarification.
Now that I have it (FUSE outperformed NFS without sync), we can resume
testing as described above and try to find what is going on. Based on your
email ID I am guessing you are in Boston and I am in Bangalore, so if you
are okay with this debugging taking multiple days because of the timezones,
I will be happy to help. Please be a bit patient with me; I am under a
release crunch, but I am very curious about the problem you posted.
> Was there anything useful in the profiles?
>
Unfortunately the profiles didn't help me much. I think because we are
collecting them from an active volume, they contain a lot of information
not pertaining to dd, so it is difficult to isolate dd's contribution. So I
went through your post again and noticed something I hadn't paid much
attention to earlier, namely oflag=sync; I ran my own tests with FUSE on my
setup and sent that reply.
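For anyone reproducing this, the dd variants discussed in the thread differ only in when data is forced to disk. A sketch of the three cases, writing to /tmp as a stand-in for the mount under test:

```shell
#!/bin/sh
F=/tmp/ddsync-demo.tmp   # stand-in path; the real test targets the mount

# Plain buffered write: data can sit in the page cache, the fastest case.
dd if=/dev/zero of="$F" bs=1048576 count=16

# oflag=sync: the file is opened O_SYNC, so every write() must reach disk
# before the syscall returns -- the slow case on the FUSE mount.
dd if=/dev/zero of="$F" bs=1048576 count=16 oflag=sync

# conv=fsync: buffered writes plus one fsync() at the end. (Per the O_SYNC
# discussion quoted below, gluster NFS instead turns each synchronous
# write into write()+fsync(), which the server can batch across threads.)
dd if=/dev/zero of="$F" bs=1048576 count=16 conv=fsync

# Record the size written by each run, then clean up.
BYTES=$(stat -c %s "$F")
rm -f "$F"
echo "each run wrote $BYTES bytes"
```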
>
> Pat
>
>
>
> On 05/10/2017 12:15 PM, Pranith Kumar Karampuri wrote:
>
> Okay good. At least this validates my doubts. Handling O_SYNC in gluster
> NFS and FUSE is a bit different.
> When an application opens a file with O_SYNC on a FUSE mount, each write
> syscall has to be written to disk as part of the syscall, whereas in the
> case of NFS there is no concept of open. NFS performs the write through a
> handle that says it needs to be a synchronous write, so the write()
> syscall is performed first and then an fsync() is performed; a write on an
> fd opened with O_SYNC thus becomes write+fsync. My guess is that when
> multiple threads do this write+fsync() operation on the same file,
> multiple writes get batched together before being written to disk, so the
> throughput on the disk increases.
>
> Does it answer your doubts?
>
> On Wed, May 10, 2017 at 9:35 PM, Pat Haley <phaley at mit.edu> wrote:
>
>>
>> Without the oflag=sync and only a single test of each, the FUSE is going
>> faster than NFS:
>>
>> FUSE:
>> mseas-data2(dri_nascar)% dd if=/dev/zero count=4096 bs=1048576
>> of=zeros.txt conv=sync
>> 4096+0 records in
>> 4096+0 records out
>> 4294967296 bytes (4.3 GB) copied, 7.46961 s, 575 MB/s
>>
>>
>> NFS
>> mseas-data2(HYCOM)% dd if=/dev/zero count=4096 bs=1048576 of=zeros.txt
>> conv=sync
>> 4096+0 records in
>> 4096+0 records out
>> 4294967296 bytes (4.3 GB) copied, 11.4264 s, 376 MB/s
>>
>>
>>
>> On 05/10/2017 11:53 AM, Pranith Kumar Karampuri wrote:
>>
>> Could you let me know the speed without oflag=sync on both the mounts? No
>> need to collect profiles.
>>
>> On Wed, May 10, 2017 at 9:17 PM, Pat Haley <phaley at mit.edu> wrote:
>>
>>>
>>> Here is what I see now:
>>>
>>> [root at mseas-data2 ~]# gluster volume info
>>>
>>> Volume Name: data-volume
>>> Type: Distribute
>>> Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
>>> Status: Started
>>> Number of Bricks: 2
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: mseas-data2:/mnt/brick1
>>> Brick2: mseas-data2:/mnt/brick2
>>> Options Reconfigured:
>>> diagnostics.count-fop-hits: on
>>> diagnostics.latency-measurement: on
>>> nfs.exports-auth-enable: on
>>> diagnostics.brick-sys-log-level: WARNING
>>> performance.readdir-ahead: on
>>> nfs.disable: on
>>> nfs.export-volumes: off
>>>
>>>
>>>
>>> On 05/10/2017 11:44 AM, Pranith Kumar Karampuri wrote:
>>>
>>> Is this the volume info you have?
>>>
>>> > [root at mseas-data2 ~]# gluster volume info
>>> >
>>> > Volume Name: data-volume
>>> > Type: Distribute
>>> > Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
>>> > Status: Started
>>> > Number of Bricks: 2
>>> > Transport-type: tcp
>>> > Bricks:
>>> > Brick1: mseas-data2:/mnt/brick1
>>> > Brick2: mseas-data2:/mnt/brick2
>>> > Options Reconfigured:
>>> > performance.readdir-ahead: on
>>> > nfs.disable: on
>>> > nfs.export-volumes: off
>>>
>>> I copied this from an old thread from 2016. This is a distribute
>>> volume. Did you change any of the options in between?
>>>
>>> --
>>>
>>> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>>> Pat Haley Email: phaley at mit.edu
>>> Center for Ocean Engineering Phone: (617) 253-6824
>>> Dept. of Mechanical Engineering Fax: (617) 253-8125
>>> MIT, Room 5-213 http://web.mit.edu/phaley/www/
>>> 77 Massachusetts Avenue
>>> Cambridge, MA 02139-4301
>>>
>>> --
>> Pranith
>>
>
>
--
Pranith