[Gluster-users] Slow write times to gluster disk

Pat Haley phaley at mit.edu
Wed May 10 16:45:04 UTC 2017


Hi Pranith,

Not entirely sure (this isn't my area of expertise).  I'll run your 
answer by some other people who are more familiar with this.

I am also uncertain about how to interpret the results once we add the
dd tests writing to the /home area (no gluster, still on the same
machine); the write loop each test performs is sketched below the list.

  * dd test without oflag=sync (rough average of multiple tests)
      o gluster w/ fuse mount: 570 MB/s
      o gluster w/ nfs mount: 390 MB/s
      o nfs (no gluster): 1.2 GB/s
  * dd test with oflag=sync (rough average of multiple tests)
      o gluster w/ fuse mount: 5 MB/s
      o gluster w/ nfs mount: 200 MB/s
      o nfs (no gluster): 20 MB/s
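
For concreteness, each dd test above boils down to the following write
loop. This is a minimal C sketch of what dd does with and without
oflag=sync, not the exact tool; the output path is made up:

    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>

    /* Mimics: dd if=/dev/zero of=PATH bs=1048576 count=4096 [oflag=sync]
     * Pass 0 or O_SYNC as sync_flag; the path used here is hypothetical. */
    static int write_test(const char *path, int sync_flag)
    {
        static char buf[1 << 20];               /* 1 MiB of zeros */
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | sync_flag, 0644);
        if (fd < 0)
            return -1;
        memset(buf, 0, sizeof buf);
        for (int i = 0; i < 4096; i++)          /* 4096 x 1 MiB = 4 GiB */
            if (write(fd, buf, sizeof buf) != (ssize_t)sizeof buf) {
                close(fd);
                return -1;
            }
        return close(fd);
    }

    int main(void)
    {
        write_test("zeros.txt", 0);             /* plain dd           */
        write_test("zeros.txt", O_SYNC);        /* dd with oflag=sync */
        return 0;
    }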

Given that the non-gluster area is a RAID-6 of 4 disks while each brick 
of the gluster area is a RAID-6 of 32 disks, I would naively expect 
writes to the gluster area to be roughly 8x faster (32 disks / 4 disks) 
than to the non-gluster area.

I still think we have a speed issue, but I can't tell whether fuse vs 
nfs is part of the problem.  Was there anything useful in the profiles?

Pat


On 05/10/2017 12:15 PM, Pranith Kumar Karampuri wrote:
> Okay good. At least this validates my suspicion. Handling O_SYNC in
> gluster NFS and fuse is a bit different.
> When an application opens a file with O_SYNC on a fuse mount, each
> write syscall has to be written to disk as part of the syscall,
> whereas in the case of NFS there is no concept of open. NFS performs
> the write through a handle that marks it as a synchronous write, so
> the write() syscall is performed first and then an fsync() is
> performed. So a write on an fd with O_SYNC becomes write+fsync. My
> guess is that when multiple threads do this write+fsync() operation
> on the same file, multiple writes are batched together before being
> written to disk, so the throughput on the disk increases.
>
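> In code, the two paths look roughly like this (a minimal C sketch,
> purely illustrative; this is not the actual gluster source, and the
> function names are made up):
>
>     #include <fcntl.h>
>     #include <unistd.h>
>
>     /* fuse mount: O_SYNC is honoured on the fd, so every write()
>      * must reach the disk before it returns. */
>     void fuse_style(const char *path, const char *buf, size_t len)
>     {
>         int fd = open(path, O_WRONLY | O_CREAT | O_SYNC, 0644);
>         write(fd, buf, len);   /* blocks until the data is on disk */
>         close(fd);
>     }
>
>     /* NFS: no client-side open with O_SYNC; a stable write is
>      * carried out as a plain write followed by an fsync.  When many
>      * threads do this on one file, the writes pending between
>      * fsync() calls can be flushed to disk as a single batch. */
>     void nfs_style(const char *path, const char *buf, size_t len)
>     {
>         int fd = open(path, O_WRONLY | O_CREAT, 0644);
>         write(fd, buf, len);   /* may land only in the page cache  */
>         fsync(fd);             /* flush; batching can happen here  */
>         close(fd);
>     }
>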
> Does that answer your doubts?
>
> On Wed, May 10, 2017 at 9:35 PM, Pat Haley <phaley at mit.edu> wrote:
>
>
>     Without oflag=sync, and with only a single test of each, FUSE is
>     faster than NFS:
>
>     FUSE:
>     mseas-data2(dri_nascar)% dd if=/dev/zero count=4096 bs=1048576
>     of=zeros.txt conv=sync
>     4096+0 records in
>     4096+0 records out
>     4294967296 bytes (4.3 GB) copied, 7.46961 s, 575 MB/s
>
>
>     NFS:
>     mseas-data2(HYCOM)% dd if=/dev/zero count=4096 bs=1048576
>     of=zeros.txt conv=sync
>     4096+0 records in
>     4096+0 records out
>     4294967296 bytes (4.3 GB) copied, 11.4264 s, 376 MB/s
>
>
>
>     On 05/10/2017 11:53 AM, Pranith Kumar Karampuri wrote:
>>     Could you let me know the speed without oflag=sync on both the
>>     mounts? No need to collect profiles.
>>
>>     On Wed, May 10, 2017 at 9:17 PM, Pat Haley <phaley at mit.edu> wrote:
>>
>>
>>         Here is what I see now:
>>
>>         [root@mseas-data2 ~]# gluster volume info
>>
>>         Volume Name: data-volume
>>         Type: Distribute
>>         Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
>>         Status: Started
>>         Number of Bricks: 2
>>         Transport-type: tcp
>>         Bricks:
>>         Brick1: mseas-data2:/mnt/brick1
>>         Brick2: mseas-data2:/mnt/brick2
>>         Options Reconfigured:
>>         diagnostics.count-fop-hits: on
>>         diagnostics.latency-measurement: on
>>         nfs.exports-auth-enable: on
>>         diagnostics.brick-sys-log-level: WARNING
>>         performance.readdir-ahead: on
>>         nfs.disable: on
>>         nfs.export-volumes: off
>>
>>
>>
>>         On 05/10/2017 11:44 AM, Pranith Kumar Karampuri wrote:
>>>         Is this the volume info you have?
>>>
>>>         > [root@mseas-data2 ~]# gluster volume info
>>>         >
>>>         > Volume Name: data-volume
>>>         > Type: Distribute
>>>         > Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
>>>         > Status: Started
>>>         > Number of Bricks: 2
>>>         > Transport-type: tcp
>>>         > Bricks:
>>>         > Brick1: mseas-data2:/mnt/brick1
>>>         > Brick2: mseas-data2:/mnt/brick2
>>>         > Options Reconfigured:
>>>         > performance.readdir-ahead: on
>>>         > nfs.disable: on
>>>         > nfs.export-volumes: off
>>>         I copied this from an old thread from 2016. It is a
>>>         distribute volume. Did you change any of the options in
>>>         between?
>>
>>     -- 
>>     Pranith
> -- 
> Pranith
-- 

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley                          Email:  phaley at mit.edu
Center for Ocean Engineering       Phone:  (617) 253-6824
Dept. of Mechanical Engineering    Fax:    (617) 253-8125
MIT, Room 5-213                    http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA  02139-4301