[Gluster-users] Slow write times to gluster disk

Soumya Koduri skoduri at redhat.com
Mon Apr 17 07:18:41 UTC 2017



On 04/14/2017 10:27 AM, Ravishankar N wrote:
> I'm not sure if the version you are running (glusterfs 3.7.11) works
> with NFS-Ganesha, as the link seems to suggest version >=3.8 as a
> pre-requisite. Adding Soumya for help. If it is not supported, then you
> might have to go the plain gluster NFS way.

Even gluster 3.7.x will work with NFS-Ganesha, but the configuration 
steps changed in 3.8, which is why that pre-requisite was added to the 
doc. IIUC from your mail below, you would like to try NFS (preferably 
gNFS rather than NFS-Ganesha), which may perform better than the fuse 
mount. In that case, the gNFS server comes up by default (up to 
release 3.7.x) and no additional steps are needed to export the volume 
via gNFS. Let me know if you have any issues accessing volumes via gNFS.
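
For reference, checking and mounting would look roughly like this (a 
sketch only; "gdata" and <server-ip> below stand in for your actual 
volume name and server address, and note that gNFS serves NFSv3 only):

# on a gluster node: confirm the gNFS server is up and the volume is exported
gluster volume status gdata nfs
showmount -e <server-ip>

# only needed if NFS was explicitly disabled on the volume earlier
gluster volume set gdata nfs.disable off

# on the client: force NFSv3 since gNFS does not serve v4
mount -t nfs -o vers=3,tcp <server-ip>:/gdata /mnt/gdata-nfs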

Regards,
Soumya

> Regards,
> Ravi
>
> On 04/14/2017 03:48 AM, Pat Haley wrote:
>>
>> Hi Ravi (and list),
>>
>> We are planning on testing the NFS route to see what kind of speed-up
>> we get.  A little research led us to the following:
>>
>> https://gluster.readthedocs.io/en/latest/Administrator%20Guide/NFS-Ganesha%20GlusterFS%20Integration/
>>
>> Is this the correct path to take to mount 2 xfs volumes as a single
>> gluster file system volume?  If not, what would be a better path?
>>
>>
>> Pat
>>
>>
>>
>> On 04/11/2017 12:21 AM, Ravishankar N wrote:
>>> On 04/11/2017 12:42 AM, Pat Haley wrote:
>>>>
>>>> Hi Ravi,
>>>>
>>>> Thanks for the reply.  And yes, we are using the gluster native
>>>> (fuse) mount.  Since this is not my area of expertise, I have a few
>>>> questions (mostly clarifications).
>>>>
>>>> Is a factor-of-20 slow-down typical when comparing a fuse-mounted
>>>> filesystem to an NFS-mounted filesystem, or should we also be
>>>> looking for additional issues?  (Note that the first dd test
>>>> described below was run on the server that hosts the file systems,
>>>> so no network communication was involved.)
>>>
>>> Though both the gluster bricks and the mounts are on the same
>>> physical machine in your setup, the I/O still passes through the
>>> different layers of the kernel/user-space fuse stack, although I
>>> don't know whether a 20x slow-down of gluster vs. an NFS share is
>>> normal. Why don't you try a gluster NFS mount on the same machine,
>>> run the dd test, and compare it with the gluster fuse mount results?
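>>>
>>> Something along these lines would do for the comparison (just a
>>> sketch; substitute your actual volume name for <volname>):
>>>
>>> mount -t nfs -o vers=3,tcp localhost:/<volname> /mnt/gnfs
>>> dd if=/dev/zero of=/mnt/gnfs/zero-gnfs bs=1M count=1000
>>> dd if=/dev/zero of=/gdata/zero-fuse bs=1M count=1000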
>>>
>>>>
>>>> You also mention tweaking "write-behind xlator settings".  Would
>>>> you expect a bigger speed improvement from switching the mount
>>>> from fuse to gnfs or from tweaking the settings?  Also, are these
>>>> mutually exclusive, or would there be additional benefits from both
>>>> switching to gnfs and tweaking?
>>> You should test these out and find the answers yourself. :-)
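>>>
>>> As a starting point for the write-behind knobs (the values below are
>>> only illustrative, not a recommendation):
>>>
>>> gluster volume set help | grep -A 3 write-behind
>>> gluster volume set <volname> performance.write-behind on
>>> gluster volume set <volname> performance.write-behind-window-size 4MB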
>>>
>>>>
>>>> My next question is to make sure I'm clear on the comment "if the
>>>> gluster node containing the gnfs server goes down, all mounts done
>>>> using that node will fail".  If you have 2 servers, each holding 1
>>>> brick of the overall gluster FS, and one server fails, then for gnfs
>>>> nothing on either server is visible to other nodes, while under fuse
>>>> only the files on the dead server are not visible.  Is this what you
>>>> meant?
>>> Yes, for gnfs mounts, all I/O from the various mounts goes to the
>>> gnfs server process (on the machine whose IP was used at the time of
>>> mounting), which then sends the I/O to the brick processes. For fuse,
>>> the gluster fuse mount itself talks directly to the bricks.
>>>>
>>>> Finally, you mention "even for gnfs mounts, you can achieve
>>>> fail-over by using CTDB".  Do you know if CTDB would have any
>>>> performance impact (i.e. in a worst-case scenario could adding CTDB
>>>> to gnfs erase the speed benefits of going to gnfs in the first place)?
>>> I don't think it would. You can even achieve load balancing via CTDB
>>> by using different gnfs servers for different clients. But I don't
>>> know if this is needed/helpful in your current setup, where everything
>>> (bricks and clients) seems to be on just one server.
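>>>
>>> If you do end up trying CTDB, the core of it is two small config
>>> files on each node, roughly like this (the IPs below are made up):
>>>
>>> # /etc/ctdb/nodes -- fixed internal IPs of the gnfs servers
>>> 10.0.0.1
>>> 10.0.0.2
>>>
>>> # /etc/ctdb/public_addresses -- floating IPs that clients mount
>>> 192.168.1.50/24 eth0
>>> 192.168.1.51/24 eth0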
>>>
>>> -Ravi
>>>> Thanks
>>>>
>>>> Pat
>>>>
>>>>
>>>> On 04/08/2017 12:58 AM, Ravishankar N wrote:
>>>>> Hi Pat,
>>>>>
>>>>> I'm assuming you are using the gluster native (fuse) mount. If it
>>>>> helps, you could try mounting the volume via gluster NFS (gnfs) and
>>>>> then see if there is an improvement in speed. Fuse mounts are
>>>>> slower than gnfs mounts, but you get the benefit of avoiding a
>>>>> single point of failure. Unlike fuse mounts, if the gluster node
>>>>> containing the gnfs server goes down, all mounts done using that
>>>>> node will fail.
>>>>> For fuse mounts, you could try tweaking the write-behind xlator
>>>>> settings to see if it helps. See the performance.write-behind and
>>>>> performance.write-behind-window-size options in `gluster volume set
>>>>> help`. Of course, even for gnfs mounts, you can achieve fail-over
>>>>> by using CTDB.
>>>>>
>>>>> Thanks,
>>>>> Ravi
>>>>>
>>>>> On 04/08/2017 12:07 AM, Pat Haley wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> We noticed dramatic slowness when writing to a gluster disk
>>>>>> compared to writing to an NFS disk, specifically when using dd
>>>>>> (data duplicator) to write a 4.3 GB file of zeros:
>>>>>>
>>>>>>   * on the NFS disk (/home): 9.5 GB/s
>>>>>>   * on the gluster disk (/gdata): 508 MB/s
>>>>>>
>>>>>> The gluster disk is 2 bricks joined together, no replication or
>>>>>> anything else. The hardware is (literally) the same:
>>>>>>
>>>>>>   * one server with 70 hard disks and a hardware RAID card.
>>>>>>   * 4 disks in a RAID-6 group (the NFS disk)
>>>>>>   * 32 disks in a RAID-6 group (the max allowed by the card,
>>>>>>     /mnt/brick1)
>>>>>>   * 32 disks in another RAID-6 group (/mnt/brick2)
>>>>>>   * 2 hot spares
>>>>>>
>>>>>> Some additional information and more test results (after changing
>>>>>> the log level):
>>>>>>
>>>>>> glusterfs 3.7.11 built on Apr 27 2016 14:09:22
>>>>>> CentOS release 6.8 (Final)
>>>>>> RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS-3 3108
>>>>>> [Invader] (rev 02)
>>>>>>
>>>>>>
>>>>>>
>>>>>> *Create the file to /gdata (gluster)*
>>>>>> [root at mseas-data2 gdata]# dd if=/dev/zero of=/gdata/zero1 bs=1M
>>>>>> count=1000
>>>>>> 1000+0 records in
>>>>>> 1000+0 records out
>>>>>> 1048576000 bytes (1.0 GB) copied, 1.91876 s, *546 MB/s*
>>>>>>
>>>>>> *Create the file to /home (ext4)*
>>>>>> [root at mseas-data2 gdata]# dd if=/dev/zero of=/home/zero1 bs=1M
>>>>>> count=1000
>>>>>> 1000+0 records in
>>>>>> 1000+0 records out
>>>>>> 1048576000 bytes (1.0 GB) copied, 0.686021 s, *1.5 GB/s* - 3 times
>>>>>> as fast
>>>>>>
>>>>>>
>>>>>> *Copy from /gdata to /gdata (gluster to gluster)*
>>>>>> [root at mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
>>>>>> 2048000+0 records in
>>>>>> 2048000+0 records out
>>>>>> 1048576000 bytes (1.0 GB) copied, 101.052 s, *10.4 MB/s* -
>>>>>> realllyyy slooowww
>>>>>>
>>>>>>
>>>>>> *Copy from /gdata to /gdata, 2nd time (gluster to gluster)*
>>>>>> [root at mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
>>>>>> 2048000+0 records in
>>>>>> 2048000+0 records out
>>>>>> 1048576000 bytes (1.0 GB) copied, 92.4904 s, *11.3 MB/s* -
>>>>>> realllyyy slooowww again
>>>>>>
>>>>>>
>>>>>>
>>>>>> *Copy from /home to /home (ext4 to ext4)*
>>>>>> [root at mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero2
>>>>>> 2048000+0 records in
>>>>>> 2048000+0 records out
>>>>>> 1048576000 bytes (1.0 GB) copied, 3.53263 s, *297 MB/s* - 30 times
>>>>>> as fast
>>>>>>
>>>>>>
>>>>>> *Copy from /home to /home (ext4 to ext4)*
>>>>>> [root at mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero3
>>>>>> 2048000+0 records in
>>>>>> 2048000+0 records out
>>>>>> 1048576000 bytes (1.0 GB) copied, 4.1737 s, *251 MB/s* - 30 times
>>>>>> as fast
>>>>>>
>>>>>>
>>>>>> As a test, can we copy data directly to the xfs mountpoint
>>>>>> (/mnt/brick1) and bypass gluster?
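>>>>>>
>>>>>> Something like this is what we had in mind (just a scratch file,
>>>>>> removed right afterwards; we understand gluster would not know
>>>>>> about anything written straight to the brick filesystem):
>>>>>>
>>>>>> dd if=/dev/zero of=/mnt/brick1/zero-brick bs=1M count=1000
>>>>>> rm /mnt/brick1/zero-brick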
>>>>>>
>>>>>>
>>>>>> Any help you could give us would be appreciated.
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> --
>>>>>>
>>>>>> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>>>>>> Pat Haley                          Email:  phaley at mit.edu
>>>>>> Center for Ocean Engineering       Phone:  (617) 253-6824
>>>>>> Dept. of Mechanical Engineering    Fax:    (617) 253-8125
>>>>>> MIT, Room 5-213                    http://web.mit.edu/phaley/www/
>>>>>> 77 Massachusetts Avenue
>>>>>> Cambridge, MA  02139-4301
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Gluster-users mailing list
>>>>>> Gluster-users at gluster.org
>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
>

