[Gluster-users] Slow write times to gluster disk
Ravishankar N
ravishankar at redhat.com
Fri Apr 14 04:57:22 UTC 2017
I'm not sure whether the version you are running (glusterfs 3.7.11) works
with NFS-Ganesha, as the link seems to suggest version >=3.8 as a
prerequisite. Adding Soumya for help. If it is not supported, then you
might have to go the plain glusterNFS way.
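
For reference, a plain gluster NFS (gnfs) mount plus the same dd write test
could look roughly like the sketch below. It only prints the commands unless
DRY_RUN=0 is set. The volume name "gvol" and the mount point "/gdata_nfs" are
placeholders I made up, not names from this thread; the host name is the one
shown in the dd output quoted below. The built-in gluster NFS server speaks
NFSv3, hence vers=3.

```shell
#!/bin/sh
# Sketch only: "gvol" and "/gdata_nfs" are placeholder names, not from the thread.
GLUSTER_HOST="${GLUSTER_HOST:-mseas-data2}"
GLUSTER_VOLUME="${GLUSTER_VOLUME:-gvol}"
NFS_MOUNTPOINT="${NFS_MOUNTPOINT:-/gdata_nfs}"

# Print each command; execute it only when DRY_RUN=0 is set explicitly.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

gnfs_mount_and_test() {
    # Check that the volume's built-in NFS server is up.
    run gluster volume status "$GLUSTER_VOLUME" nfs
    run mkdir -p "$NFS_MOUNTPOINT"
    # Mount the volume over NFSv3/TCP instead of the native fuse client.
    run mount -t nfs -o vers=3,mountproto=tcp \
        "$GLUSTER_HOST:/$GLUSTER_VOLUME" "$NFS_MOUNTPOINT"
    # Same write test as the one quoted below, against the NFS mount.
    run dd if=/dev/zero of="$NFS_MOUNTPOINT/zero1" bs=1M count=1000
}

gnfs_mount_and_test
```

Review the printed commands first, then re-run as root with DRY_RUN=0 and
compare the dd figure with the one from the fuse mount.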
Regards,
Ravi
On 04/14/2017 03:48 AM, Pat Haley wrote:
>
> Hi Ravi (and list),
>
> We are planning on testing the NFS route to see what kind of speed-up
> we get. A little research led us to the following:
>
> https://gluster.readthedocs.io/en/latest/Administrator%20Guide/NFS-Ganesha%20GlusterFS%20Integration/
>
> Is this the correct path to take to mount 2 xfs volumes as a single
> gluster file system volume? If not, what would be a better path?
>
>
> Pat
>
>
>
> On 04/11/2017 12:21 AM, Ravishankar N wrote:
>> On 04/11/2017 12:42 AM, Pat Haley wrote:
>>>
>>> Hi Ravi,
>>>
>>> Thanks for the reply. And yes, we are using the gluster native
>>> (fuse) mount. Since this is not my area of expertise, I have a few
>>> questions (mostly clarifications).
>>>
>>> Is a factor-of-20 slow-down typical when comparing a fuse-mounted
>>> filesystem with an NFS-mounted filesystem, or should we also be
>>> looking for additional issues? (Note that the first dd test described
>>> below was run on the server that hosts the file systems, so no
>>> network communication was involved.)
>>
>> Though both the gluster bricks and the mounts are on the same
>> physical machine in your setup, the I/O still passes through the
>> different layers of the kernel/user-space fuse stack. I don't know
>> whether a 20x slow-down on gluster vs. an NFS share is normal, though.
>> Why don't you try a gluster NFS mount on the machine, run the dd
>> test, and compare it with the gluster fuse mount results?
>>
>>>
>>> You also mention tweaking "write-behind xlator settings". Would you
>>> expect better speed improvements from switching the mounting from
>>> fuse to gnfs or from tweaking the settings? Also, are these mutually
>>> exclusive, or would there be additional benefits from both switching
>>> to gnfs and tweaking?
>> You should test these out and find the answers yourself. :-)
>>
>>>
>>> My next question is to make sure I'm clear on the comment "if the
>>> gluster node containing the gnfs server goes down, all mounts done
>>> using that node will fail". If you have 2 servers, each with 1 brick
>>> in the overall gluster FS, and one server fails, then for gnfs
>>> nothing on either server is visible to other nodes, while under fuse
>>> only the files on the dead server are not visible. Is this what you
>>> meant?
>> Yes, for gnfs mounts, all I/O from the various mounts goes to the gnfs
>> server process (on the machine whose IP was used at the time of
>> mounting), which then sends the I/O to the brick processes. For fuse,
>> the gluster fuse mount itself talks directly to the bricks.
>>>
>>> Finally, you mention "even for gnfs mounts, you can achieve
>>> fail-over by using CTDB". Do you know if CTDB would have any
>>> performance impact (i.e., in a worst-case scenario, could adding CTDB
>>> to gnfs erase the speed benefits of going to gnfs in the first place)?
>> I don't think it would. You can even achieve load balancing via CTDB
>> to use different gnfs servers for different clients. But I don't know
>> if this is needed or helpful in your current setup, where everything
>> (bricks and clients) seems to be on just one server.
>>
>> -Ravi
>>> Thanks
>>>
>>> Pat
>>>
>>>
>>> On 04/08/2017 12:58 AM, Ravishankar N wrote:
>>>> Hi Pat,
>>>>
>>>> I'm assuming you are using the gluster native (fuse) mount. If it
>>>> helps, you could try mounting it via gluster NFS (gnfs) and then
>>>> see if there is an improvement in speed. Fuse mounts are slower
>>>> than gnfs mounts, but you get the benefit of avoiding a single point
>>>> of failure. (With gnfs, unlike fuse mounts, if the gluster node
>>>> containing the gnfs server goes down, all mounts done using that
>>>> node will fail.) For fuse mounts, you could try tweaking the
>>>> write-behind xlator settings to see if it helps. See the
>>>> performance.write-behind and performance.write-behind-window-size
>>>> options in `gluster volume set help`. Of course, even for gnfs
>>>> mounts, you can achieve fail-over by using CTDB.
>>>>
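
A rough sketch of trying the two write-behind options named above. It prints
the commands and runs them only with DRY_RUN=0. The volume name "gvol" is a
placeholder, and the 4MB window is just an arbitrary trial value to compare
against whatever default `gluster volume set help` reports on your version.

```shell
#!/bin/sh
# Sketch only: "gvol" is a placeholder volume name; 4MB is an arbitrary trial value.
GLUSTER_VOLUME="${GLUSTER_VOLUME:-gvol}"
WINDOW_SIZE="${WINDOW_SIZE:-4MB}"

# Print each command; execute it only when DRY_RUN=0 is set explicitly.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

tune_write_behind() {
    # List the available options with their defaults and descriptions.
    run gluster volume set help
    # Make sure the write-behind translator is enabled for the volume.
    run gluster volume set "$GLUSTER_VOLUME" performance.write-behind on
    # Try a larger write-behind window, then re-run the dd test.
    run gluster volume set "$GLUSTER_VOLUME" \
        performance.write-behind-window-size "$WINDOW_SIZE"
    # Reconfigured options are listed in the volume info output.
    run gluster volume info "$GLUSTER_VOLUME"
}

tune_write_behind
```

Change one setting at a time and repeat the same dd run after each change, so
any difference can be attributed to that setting.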
>>>> Thanks,
>>>> Ravi
>>>>
>>>> On 04/08/2017 12:07 AM, Pat Haley wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> We noticed dramatic slowness when writing to a gluster disk
>>>>> compared to writing to an NFS disk. Specifically, when using dd
>>>>> (data duplicator) to write a 4.3 GB file of zeros:
>>>>>
>>>>> * on NFS disk (/home): 9.5 Gb/s
>>>>> * on gluster disk (/gdata): 508 Mb/s
>>>>>
>>>>> The gluster disk is 2 bricks joined together, no replication or
>>>>> anything else. The hardware is (literally) the same:
>>>>>
>>>>> * one server with 70 hard disks and a hardware RAID card.
>>>>> * 4 disks in a RAID-6 group (the NFS disk)
>>>>> * 32 disks in a RAID-6 group (the max allowed by the card,
>>>>> /mnt/brick1)
>>>>> * 32 disks in another RAID-6 group (/mnt/brick2)
>>>>> * 2 hot spares
>>>>>
>>>>> Some additional information and more test results (after changing
>>>>> the log level):
>>>>>
>>>>> glusterfs 3.7.11 built on Apr 27 2016 14:09:22
>>>>> CentOS release 6.8 (Final)
>>>>> RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS-3 3108
>>>>> [Invader] (rev 02)
>>>>>
>>>>>
>>>>>
>>>>> *Create the file to /gdata (gluster)*
>>>>> [root@mseas-data2 gdata]# dd if=/dev/zero of=/gdata/zero1 bs=1M count=1000
>>>>> 1000+0 records in
>>>>> 1000+0 records out
>>>>> 1048576000 bytes (1.0 GB) copied, 1.91876 s, *546 MB/s*
>>>>>
>>>>> *Create the file to /home (ext4)*
>>>>> [root@mseas-data2 gdata]# dd if=/dev/zero of=/home/zero1 bs=1M count=1000
>>>>> 1000+0 records in
>>>>> 1000+0 records out
>>>>> 1048576000 bytes (1.0 GB) copied, 0.686021 s, *1.5 GB/s* - 3 times
>>>>> as fast
>>>>>
>>>>>
>>>>> *Copy from /gdata to /gdata (gluster to gluster)*
>>>>> [root@mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
>>>>> 2048000+0 records in
>>>>> 2048000+0 records out
>>>>> 1048576000 bytes (1.0 GB) copied, 101.052 s, *10.4 MB/s* -
>>>>> realllyyy slooowww
>>>>>
>>>>>
>>>>> *Copy from /gdata to /gdata, 2nd time (gluster to gluster)*
>>>>> [root@mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
>>>>> 2048000+0 records in
>>>>> 2048000+0 records out
>>>>> 1048576000 bytes (1.0 GB) copied, 92.4904 s, *11.3 MB/s* -
>>>>> realllyyy slooowww again
>>>>>
>>>>>
>>>>>
>>>>> *Copy from /home to /home (ext4 to ext4)*
>>>>> [root@mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero2
>>>>> 2048000+0 records in
>>>>> 2048000+0 records out
>>>>> 1048576000 bytes (1.0 GB) copied, 3.53263 s, *297 MB/s* - 30 times
>>>>> as fast
>>>>>
>>>>>
>>>>> *Copy from /home to /home (ext4 to ext4)*
>>>>> [root@mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero3
>>>>> 2048000+0 records in
>>>>> 2048000+0 records out
>>>>> 1048576000 bytes (1.0 GB) copied, 4.1737 s, *251 MB/s* - 30 times
>>>>> as fast
>>>>>
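
One thing the record counts in the dd output above suggest: the copy tests
report 2048000 records for 1048576000 bytes, which works out to 512-byte
blocks (dd's default when no bs= is given), while the create tests used
bs=1M. So the create and copy figures are not measured on the same footing.
A small sketch of the arithmetic, using only the numbers quoted above
(the function name is mine):

```shell
#!/bin/sh
# Block size dd used = bytes copied / number of records reported.
dd_block_size() {
    bytes=$1
    records=$2
    echo $((bytes / records))
}

# Figures taken from the dd output quoted above.
echo "create test: $(dd_block_size 1048576000 1000) bytes per record"
echo "copy test:   $(dd_block_size 1048576000 2048000) bytes per record"
```

Re-running the copy with an explicit block size, e.g.
`dd if=/gdata/zero1 of=/gdata/zero2 bs=1M`, would make it comparable to the
create test. Whether that closes the gap on the gluster mount is something to
measure rather than assume. With GNU dd, adding conv=fdatasync also includes
the final flush to disk in the reported time.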
>>>>>
>>>>> As a test, can we copy data directly to the xfs mountpoint
>>>>> (/mnt/brick1) and bypass gluster?
>>>>>
>>>>>
>>>>> Any help you could give us would be appreciated.
>>>>>
>>>>> Thanks
>>>>>
>>>>> --
>>>>>
>>>>> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>>>>> Pat Haley                          Email:  phaley at mit.edu
>>>>> Center for Ocean Engineering       Phone:  (617) 253-6824
>>>>> Dept. of Mechanical Engineering    Fax:    (617) 253-8125
>>>>> MIT, Room 5-213                    http://web.mit.edu/phaley/www/
>>>>> 77 Massachusetts Avenue
>>>>> Cambridge, MA 02139-4301
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Gluster-users mailing list
>>>>> Gluster-users at gluster.org
>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>>
>>>>
>>>
>>
>>
>