[Gluster-users] Slow write times to gluster disk

Pat Haley phaley at mit.edu
Wed May 10 21:18:26 UTC 2017


Hi Pranith,

Since we are mounting the partitions as the bricks, I tried the dd test 
writing to <brick-path>/.glusterfs/<file-to-be-removed-after-test>. The 
results without oflag=sync were 1.6 GB/s (faster than gluster, but not as 
fast as I was expecting given the 1.2 GB/s to the no-gluster area w/ 
fewer disks).
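
For what it's worth, here is a rough sanity check of that expectation. It is only a sketch using the figures from this thread (a 4-disk RAID-6 for the no-gluster area, a 32-disk RAID-6 per brick, 1.2 GB/s and 1.6 GB/s without oflag=sync). The helper name raid6_data_disks is made up for illustration, and the assumption that sequential write speed scales with disk count is a naive one.

```python
# Back-of-the-envelope check; all figures are the ones quoted in this thread.

def raid6_data_disks(total_disks: int) -> int:
    """RAID-6 keeps two disks' worth of parity, so n disks hold n - 2 of data."""
    return total_disks - 2

home_disks, brick_disks = 4, 32      # RAID-6 sizes of the two areas
home_rate, brick_rate = 1.2, 1.6     # GB/s, dd without oflag=sync

naive_ratio = brick_disks / home_disks
data_ratio = raid6_data_disks(brick_disks) / raid6_data_disks(home_disks)
observed_ratio = brick_rate / home_rate

print(f"ratio by total disks: {naive_ratio:.1f}x")
print(f"ratio by data disks:  {data_ratio:.1f}x")
print(f"observed ratio:       {observed_ratio:.2f}x")
```

Either way the observed gain is far below the naive expectation. One caveat: without oflag=sync, dd may be measuring the page cache as much as the disks, so this comparison is probably not a clean measure of the arrays.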

Pat


On 05/10/2017 01:27 PM, Pranith Kumar Karampuri wrote:
>
>
> On Wed, May 10, 2017 at 10:15 PM, Pat Haley <phaley at mit.edu> wrote:
>
>
>     Hi Pranith,
>
>     Not entirely sure (this isn't my area of expertise). I'll run your
>     answer by some other people who are more familiar with this.
>
>     I am also uncertain about how to interpret the results when we
>     also add the dd tests writing to the /home area (no gluster, still
>     on the same machine)
>
>       * dd test without oflag=sync (rough average of multiple tests)
>           o gluster w/ fuse mount: 570 MB/s
>           o gluster w/ nfs mount: 390 MB/s
>           o nfs (no gluster): 1.2 GB/s
>       * dd test with oflag=sync (rough average of multiple tests)
>           o gluster w/ fuse mount: 5 MB/s
>           o gluster w/ nfs mount: 200 MB/s
>           o nfs (no gluster): 20 MB/s
>
>     Given that the non-gluster area is a RAID-6 of 4 disks while each
>     brick of the gluster area is a RAID-6 of 32 disks, I would naively
>     expect the writes to the gluster area to be roughly 8x faster than
>     to the non-gluster.
>
>
> I think a better test is to write to a file using nfs without any 
> gluster, to a location that is not inside the brick but is on the 
> same disk(s). If you are mounting the partition as the brick, then we 
> can write to a file inside the .glusterfs directory, something like 
> <brick-path>/.glusterfs/<file-to-be-removed-after-test>.
>
>
>     I still think we have a speed issue, I can't tell if fuse vs nfs
>     is part of the problem.
>
>
> I got interested in the post because I read that fuse speed is lower 
> than nfs speed, which is counter-intuitive to my understanding, so I 
> wanted clarification. Now that I have it (fuse outperformed nfs 
> without sync), we can resume testing as described above and try to 
> find what the problem is. Based on your email-id I am guessing you are 
> from Boston, and I am from Bangalore, so if you are okay with doing 
> this debugging over multiple days because of timezones, I will be 
> happy to help. Please be a bit patient with me; I am under a release 
> crunch, but I am very curious about the problem you posted.
>
>       Was there anything useful in the profiles?
>
>
> Unfortunately the profiles didn't help me much. I think we are 
> collecting the profiles from an active volume, so they contain a lot 
> of information that does not pertain to dd, and it is difficult to 
> isolate the contribution of dd. So I went through your post again and 
> found something I didn't pay much attention to earlier, i.e. 
> oflag=sync, so I did my own tests on my setup with FUSE and sent that 
> reply.
>
>
>     Pat
>
>
>
>     On 05/10/2017 12:15 PM, Pranith Kumar Karampuri wrote:
>>     Okay good. At least this validates my doubts. Handling of O_SYNC
>>     in gluster NFS and fuse is a bit different.
>>     When an application opens a file with O_SYNC on a fuse mount, each
>>     write syscall has to be written to disk as part of the syscall,
>>     whereas in the case of NFS there is no concept of open. NFS
>>     performs the write through a handle saying it needs to be a
>>     synchronous write, so the write() syscall is performed first and
>>     then it performs fsync(). So a write on an fd with O_SYNC becomes
>>     write+fsync. My guess is that when multiple threads do this
>>     write+fsync() operation on the same file, multiple writes are
>>     batched together to be written to disk, so the throughput on the
>>     disk increases.
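
To make the two write patterns described above concrete, here is a minimal sketch that times them against a plain buffered write on a local file. It is not gluster code: the names timed_write and run_all, the file name zeros.bin, and the small block count are all made up for illustration, and it assumes a Unix-like system where os.O_SYNC is available.

```python
import os
import tempfile
import time

BLOCK = b"\0" * (1024 * 1024)   # 1 MiB per write, like bs=1048576
COUNT = 8                        # kept small on purpose; the dd runs used 4096


def timed_write(path, extra_flags=0, fsync_each=False):
    """Write COUNT blocks to path and return (bytes_written, seconds)."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | extra_flags
    fd = os.open(path, flags, 0o600)
    written = 0
    start = time.perf_counter()
    try:
        for _ in range(COUNT):
            written += os.write(fd, BLOCK)
            if fsync_each:
                os.fsync(fd)     # the write()-then-fsync() pattern
    finally:
        os.close(fd)
    return written, time.perf_counter() - start


def run_all(directory):
    """Time the three write modes on one scratch file inside directory."""
    target = os.path.join(directory, "zeros.bin")
    results = {
        "buffered": timed_write(target),
        "O_SYNC": timed_write(target, extra_flags=os.O_SYNC),
        "write+fsync": timed_write(target, fsync_each=True),
    }
    os.remove(target)
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name, (nbytes, secs) in run_all(tmp).items():
            print(f"{name:12s} {nbytes / secs / 1e6:10.1f} MB/s")
```

On a local filesystem the last two modes will likely look similar. The batching effect guessed at above would only show up with several writers going through the NFS server, which this sketch does not attempt to reproduce.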
>>
>>     Does it answer your doubts?
>>
>>     On Wed, May 10, 2017 at 9:35 PM, Pat Haley <phaley at mit.edu> wrote:
>>
>>
>>         Without the oflag=sync and only a single test of each, the
>>         FUSE is going faster than NFS:
>>
>>         FUSE:
>>         mseas-data2(dri_nascar)% dd if=/dev/zero count=4096
>>         bs=1048576 of=zeros.txt conv=sync
>>         4096+0 records in
>>         4096+0 records out
>>         4294967296 bytes (4.3 GB) copied, 7.46961 s, 575 MB/s
>>
>>
>>         NFS
>>         mseas-data2(HYCOM)% dd if=/dev/zero count=4096 bs=1048576
>>         of=zeros.txt conv=sync
>>         4096+0 records in
>>         4096+0 records out
>>         4294967296 bytes (4.3 GB) copied, 11.4264 s, 376 MB/s
>>
>>
>>
>>         On 05/10/2017 11:53 AM, Pranith Kumar Karampuri wrote:
>>>         Could you let me know the speed without oflag=sync on both
>>>         the mounts? No need to collect profiles.
>>>
>>>         On Wed, May 10, 2017 at 9:17 PM, Pat Haley <phaley at mit.edu> wrote:
>>>
>>>
>>>             Here is what I see now:
>>>
>>>             [root@mseas-data2 ~]# gluster volume info
>>>
>>>             Volume Name: data-volume
>>>             Type: Distribute
>>>             Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
>>>             Status: Started
>>>             Number of Bricks: 2
>>>             Transport-type: tcp
>>>             Bricks:
>>>             Brick1: mseas-data2:/mnt/brick1
>>>             Brick2: mseas-data2:/mnt/brick2
>>>             Options Reconfigured:
>>>             diagnostics.count-fop-hits: on
>>>             diagnostics.latency-measurement: on
>>>             nfs.exports-auth-enable: on
>>>             diagnostics.brick-sys-log-level: WARNING
>>>             performance.readdir-ahead: on
>>>             nfs.disable: on
>>>             nfs.export-volumes: off
>>>
>>>
>>>
>>>             On 05/10/2017 11:44 AM, Pranith Kumar Karampuri wrote:
>>>>             Is this the volume info you have?
>>>>
>>>>             > [root@mseas-data2 ~]# gluster volume info
>>>>             >
>>>>             > Volume Name: data-volume
>>>>             > Type: Distribute
>>>>             > Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
>>>>             > Status: Started
>>>>             > Number of Bricks: 2
>>>>             > Transport-type: tcp
>>>>             > Bricks:
>>>>             > Brick1: mseas-data2:/mnt/brick1
>>>>             > Brick2: mseas-data2:/mnt/brick2
>>>>             > Options Reconfigured:
>>>>             > performance.readdir-ahead: on
>>>>             > nfs.disable: on
>>>>             > nfs.export-volumes: off
>>>>             I copied this from an old thread from 2016. This is a
>>>>             distribute volume. Did you change any of the options in
>>>>             between?
>>>
>>>             -- 
>>>
>>>             -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>>>             Pat Haley                          Email:  phaley at mit.edu
>>>             Center for Ocean Engineering       Phone:  (617) 253-6824
>>>             Dept. of Mechanical Engineering    Fax:    (617) 253-8125
>>>             MIT, Room 5-213                    http://web.mit.edu/phaley/www/
>>>             77 Massachusetts Avenue
>>>             Cambridge, MA  02139-4301
>>>
>>>         -- 
>>>         Pranith
>>
>>     -- 
>>     Pranith
>
> -- 
> Pranith
-- 

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley                          Email:  phaley at mit.edu
Center for Ocean Engineering       Phone:  (617) 253-6824
Dept. of Mechanical Engineering    Fax:    (617) 253-8125
MIT, Room 5-213                    http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA  02139-4301