[Gluster-users] Slow write times to gluster disk

Thu Apr 13 22:18:30 UTC 2017

Hi Ravi (and list),

We are planning on testing the NFS route to see what kind of speed-up we 
get.  A little research led us to the following:

https://gluster.readthedocs.io/en/latest/Administrator%20Guide/NFS-Ganesha%20GlusterFS%20Integration/

Is this the correct path to take to mount 2 xfs volumes as a single gluster 
file system volume?  If not, what would be a better path?

Pat

On 04/11/2017 12:21 AM, Ravishankar N wrote:
> On 04/11/2017 12:42 AM, Pat Haley wrote:
>>
>> Hi Ravi,
>>
>> Thanks for the reply.  And yes, we are using the gluster native 
>> (fuse) mount.  Since this is not my area of expertise I have a few 
>> questions (mostly clarifications)
>>
>> Is a factor of 20 slow-down typical when comparing a fuse-mounted 
>> filesystem with an NFS-mounted filesystem, or should we also be 
>> looking for additional issues?  (Note the first dd test described 
>> below was run on the server that hosts the file-systems so no network 
>> communication was involved).
>
> Though both the gluster bricks and the mounts are on the same physical 
> machine in your setup, the I/O still passes through the different 
> layers of the kernel/user-space fuse stack. That said, I don't know 
> whether a 20x slow-down on gluster vs an NFS share is normal. Why don't 
> you try doing a gluster NFS mount on the machine, run the dd test, and 
> compare it with the gluster fuse mount results?
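
A minimal sketch of that comparison, under some assumptions: the volume 
name (myvol) and the test mount point (/mnt/gnfs-test) are placeholders 
that do not come from this thread, and gluster NFS is assumed to be 
enabled for the volume. Gluster NFS speaks NFSv3, hence vers=3. By default 
the script only prints the commands; setting RUN=1 (as root) would execute 
them.

```shell
# Assumptions (not from the thread): VOLNAME and GNFS_MNT are placeholders.
SERVER=mseas-data2
VOLNAME=${VOLNAME:-myvol}
GNFS_MNT=${GNFS_MNT:-/mnt/gnfs-test}

# Print each command instead of running it; set RUN=1 to really execute.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "+ $*"; fi
}

# Mount the volume through the gluster NFS server (NFSv3 over TCP).
run mkdir -p "$GNFS_MNT"
run mount -t nfs -o vers=3,mountproto=tcp "$SERVER:/$VOLNAME" "$GNFS_MNT"

# Same 1 GB write on both mounts; conv=fdatasync makes dd flush the data
# to disk before it reports a rate.
run dd if=/dev/zero of="$GNFS_MNT/zero-gnfs" bs=1M count=1000 conv=fdatasync
run dd if=/dev/zero of=/gdata/zero-fuse bs=1M count=1000 conv=fdatasync

run umount "$GNFS_MNT"
```

Using the same bs, count and conv flags on both mounts keeps the two 
numbers comparable.
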
>
>>
>> You also mention tweaking "write-behind xlator settings". Would you 
>> expect better speed improvements from switching the mounting from 
>> fuse to gnfs or from tweaking the settings?  Also are these mutually 
>> exclusive or would there be additional benefits from both switching to 
>> gnfs and tweaking?
> You should test these out and find the answers yourself. :-)
>
>>
>> My next question is to make sure I'm clear on the comment "if the 
>> gluster node containing the gnfs server goes down, all mounts done 
>> using that node will fail".  If you have 2 servers, each hosting 1 
>> brick of the overall gluster FS, and one server fails, then for gnfs 
>> nothing on either server is visible to other nodes, while under fuse 
>> only the files on the dead server are not visible.  Is this what you 
>> meant?
> Yes, for gnfs mounts, all I/O from the various mounts goes to the gnfs 
> server process (on the machine whose IP was used at the time of 
> mounting) which then sends the I/O to the brick processes. For fuse, 
> the gluster fuse mount itself talks directly to the bricks.
>>
>> Finally, you mention "even for gnfs mounts, you can achieve fail-over 
>> by using CTDB".  Do you know if CTDB would have any performance 
>> impact (i.e. in a worst-case scenario could adding CTDB to gnfs erase 
>> the speed benefits of going to gnfs in the first place)?
> I don't think it would. You can even achieve load balancing via CTDB 
> to use different gnfs servers for different clients. But I don't know 
> if this is needed/helpful in your current setup where everything 
> (bricks and clients) seems to be on just one server.
>
> -Ravi
>> Thanks
>>
>> Pat
>>
>>
>> On 04/08/2017 12:58 AM, Ravishankar N wrote:
>>> Hi Pat,
>>>
>>> I'm assuming you are using gluster native (fuse mount). If it helps, 
>>> you could try mounting it via gluster NFS (gnfs) and then see if 
>>> there is an improvement in speed. Fuse mounts are slower than gnfs 
>>> mounts but you get the benefit of avoiding a single point of 
>>> failure. (Unlike fuse mounts, if the gluster node containing the gnfs 
>>> server goes down, all mounts done using that node will fail.) For 
>>> fuse mounts, you could try tweaking the write-behind xlator settings 
>>> to see if it helps. See the performance.write-behind and 
>>> performance.write-behind-window-size options in `gluster volume set 
>>> help`. Of course, even for gnfs mounts, you can achieve fail-over by 
>>> using CTDB.
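
As a rough sketch of what trying those two options could look like: the 
volume name (myvol) is a placeholder and 4MB is only an illustrative 
window size, neither comes from this thread. The script prints the 
commands by default; setting RUN=1 on a gluster server would execute them.

```shell
# Assumptions (not from the thread): VOLNAME is a placeholder for the real
# volume name, and 4MB is just an example value to experiment with.
VOLNAME=${VOLNAME:-myvol}
WINDOW=${WINDOW:-4MB}

# Print each command instead of running it; set RUN=1 to really execute.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "+ $*"; fi
}

# Make sure the write-behind translator is on, then change its window size.
run gluster volume set "$VOLNAME" performance.write-behind on
run gluster volume set "$VOLNAME" performance.write-behind-window-size "$WINDOW"

# Show the volume, including the options that have been reconfigured.
run gluster volume info "$VOLNAME"
```

Re-running the same dd test after each change would show whether the 
setting makes any difference on this hardware.
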
>>>
>>> Thanks,
>>> Ravi
>>>
>>> On 04/08/2017 12:07 AM, Pat Haley wrote:
>>>>
>>>> Hi,
>>>>
>>>> We noticed a dramatic slowness when writing to a gluster disk when 
>>>> compared to writing to an NFS disk. Specifically when using dd 
>>>> (data duplicator) to write a 4.3 GB file of zeros:
>>>>
>>>>   * on NFS disk (/home): 9.5 Gb/s
>>>>   * on gluster disk (/gdata): 508 Mb/s
>>>>
>>>> The gluster disk is 2 bricks joined together, no replication or 
>>>> anything else. The hardware is (literally) the same:
>>>>
>>>>   * one server with 70 hard disks  and a hardware RAID card.
>>>>   * 4 disks in a RAID-6 group (the NFS disk)
>>>>   * 32 disks in a RAID-6 group (the max allowed by the card,
>>>>     /mnt/brick1)
>>>>   * 32 disks in another RAID-6 group (/mnt/brick2)
>>>>   * 2 hot spares
>>>>
>>>> Some additional information and more test results (after changing 
>>>> the log level):
>>>>
>>>> glusterfs 3.7.11 built on Apr 27 2016 14:09:22
>>>> CentOS release 6.8 (Final)
>>>> RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS-3 3108 
>>>> [Invader] (rev 02)
>>>>
>>>>
>>>>
>>>> *Create the file to /gdata (gluster)*
>>>> [root@mseas-data2 gdata]# dd if=/dev/zero of=/gdata/zero1 bs=1M count=1000
>>>> 1000+0 records in
>>>> 1000+0 records out
>>>> 1048576000 bytes (1.0 GB) copied, 1.91876 s, *546 MB/s*
>>>>
>>>> *Create the file to /home (ext4)*
>>>> [root@mseas-data2 gdata]# dd if=/dev/zero of=/home/zero1 bs=1M count=1000
>>>> 1000+0 records in
>>>> 1000+0 records out
>>>> 1048576000 bytes (1.0 GB) copied, 0.686021 s, *1.5 GB/s* - 3 times 
>>>> as fast
>>>>
>>>>
>>>> *Copy from /gdata to /gdata (gluster to gluster)*
>>>> [root@mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
>>>> 2048000+0 records in
>>>> 2048000+0 records out
>>>> 1048576000 bytes (1.0 GB) copied, 101.052 s, *10.4 MB/s* - 
>>>> realllyyy slooowww
>>>>
>>>>
>>>> *Copy from /gdata to /gdata, 2nd time (gluster to gluster)*
>>>> [root@mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
>>>> 2048000+0 records in
>>>> 2048000+0 records out
>>>> 1048576000 bytes (1.0 GB) copied, 92.4904 s, *11.3 MB/s* - 
>>>> realllyyy slooowww again
>>>>
>>>>
>>>>
>>>> *Copy from /home to /home (ext4 to ext4)*
>>>> [root@mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero2
>>>> 2048000+0 records in
>>>> 2048000+0 records out
>>>> 1048576000 bytes (1.0 GB) copied, 3.53263 s, *297 MB/s* - 30 times 
>>>> as fast
>>>>
>>>>
>>>> *Copy from /home to /home (ext4 to ext4)*
>>>> [root@mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero3
>>>> 2048000+0 records in
>>>> 2048000+0 records out
>>>> 1048576000 bytes (1.0 GB) copied, 4.1737 s, *251 MB/s* - 30 times 
>>>> as fast
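
One detail in the dd output above is worth working through. The write 
tests used bs=1M, but the copy tests gave no bs= option, so dd fell back 
to its default block size of 512 bytes. The record counts confirm this; 
the variable names below are only labels for the figures quoted above.

```shell
# Figures copied from the dd output quoted above.
bytes=1048576000        # total bytes copied in every test
records_copy=2048000    # records reported by the copy tests (no bs= given)
records_write=1000      # records reported by the write tests (bs=1M)

# Block size actually used = total bytes / number of records.
bs_copy=$(( bytes / records_copy ))
bs_write=$(( bytes / records_write ))

echo "copy tests:  ${bs_copy} bytes per record"
echo "write tests: ${bs_write} bytes per record"
echo "ratio:       $(( bs_write / bs_copy ))x as many records in the copy tests"
```

So the copy numbers and the write numbers are not like-for-like: each copy 
moved the same data in 2048 times as many records. A fairer copy test 
would probably give an explicit block size, for example 
dd if=/gdata/zero1 of=/gdata/zero2 bs=1M, and adding conv=fdatasync would 
make dd flush the data to disk before it reports a rate.
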
>>>>
>>>>
>>>> As a test, can we copy data directly to the xfs mountpoint 
>>>> (/mnt/brick1) and bypass gluster?
>>>>
>>>>
>>>> Any help you could give us would be appreciated.
>>>>
>>>> Thanks
>>>>
>>>> -- 
>>>>
>>>> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>>>> Pat Haley                          Email:  phaley at mit.edu
>>>> Center for Ocean Engineering       Phone:  (617) 253-6824
>>>> Dept. of Mechanical Engineering    Fax:    (617) 253-8125
>>>> MIT, Room 5-213                    http://web.mit.edu/phaley/www/
>>>> 77 Massachusetts Avenue
>>>> Cambridge, MA  02139-4301
>>>>
>>>>
>>>
>>>
>>
>
>

-- 

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley                          Email:  phaley at mit.edu
Center for Ocean Engineering       Phone:  (617) 253-6824
Dept. of Mechanical Engineering    Fax:    (617) 253-8125
MIT, Room 5-213                    http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA  02139-4301
