[Gluster-users] Slow performance on samba with small files

Gary Lloyd g.lloyd at keele.ac.uk
Thu Feb 9 16:18:05 UTC 2017


Was just reading the small file section of the 3.9 release notes:

http://blog.gluster.org/2016/11/announcing-gluster-3-9/

Setting these options does seem to increase transfer speeds on small files
by quite a lot:

  # gluster volume set <volname> features.cache-invalidation on
  # gluster volume set <volname> features.cache-invalidation-timeout 600
  # gluster volume set <volname> performance.stat-prefetch on
# This one seemed to have the biggest impact on small-file performance for me
  # gluster volume set <volname> performance.cache-invalidation on
  # gluster volume set <volname> performance.md-cache-timeout 600
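
For anyone wanting to try the same set of options in one go, here is a rough
sketch of a wrapper. The function name apply_md_cache_opts and the DRY_RUN
variable are made up for illustration; only the option names and values come
from the list above. By default it just prints the commands so they can be
reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: print (or run) the "gluster volume set" commands
# for the options listed above. DRY_RUN is an assumed convention of this
# sketch, not a gluster feature; DRY_RUN=1 (the default) only echoes.
apply_md_cache_opts() {
    vol="$1"
    for pair in \
        "features.cache-invalidation on" \
        "features.cache-invalidation-timeout 600" \
        "performance.stat-prefetch on" \
        "performance.cache-invalidation on" \
        "performance.md-cache-timeout 600"
    do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "gluster volume set $vol $pair"
        else
            # $pair is left unquoted on purpose so it splits into
            # the option name and its value.
            gluster volume set "$vol" $pair || return 1
        fi
    done
}
```

Calling "apply_md_cache_opts <volname>" prints the five commands; setting
DRY_RUN=0 runs them. "gluster volume info <volname>" should then list them
under the reconfigured options.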


Setting performance.cache-samba-metadata (meant only for SMB access):

  # gluster volume set <volname> performance.cache-samba-metadata on

causes my client to keep losing the state of the server: the shares often
disappear or become inaccessible, and I can only get them back by logging
off and back on to the machine. This is with the distro Samba 4.4.4.

Has anyone here had the same issue? Does the version of Samba need to be
newer to support the feature?

Thanks

*Gary Lloyd*
________________________________________________
I.T. Systems:Keele University
Finance & IT Directorate
Keele:Staffs:IC1 Building:ST5 5NB:UK
+44 1782 733063
________________________________________________

On 8 February 2017 at 11:49, Дмитрий Глушенок <glush at jet.msk.su> wrote:

> For _every_ file copied, Samba performs readdir() to get all entries of the
> destination folder. The list is then searched for the filename (to prevent
> name collisions, as SMB shares are not case sensitive). The more files in
> the folder, the more time readdir() takes. It is a lot worse for Gluster,
> because a single folder's contents are distributed among many servers and
> Gluster has to join many directory listings (requested over the network)
> into one and return it to the caller.
>
> Rsync does not perform readdir(); it just checks file existence with
> stat(), IIRC. And as modern Gluster versions by default check for a file
> only at its destination (when the volume is balanced), the check is
> relatively fast.
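
To make the difference between the two checks concrete, here is a minimal
sketch in plain sh. It is only an illustration of the idea described above,
not Samba's or rsync's actual code; the function names exists_by_listing and
exists_by_stat are made up. The first scans the whole directory and compares
names case-insensitively, so its cost grows with the number of entries; the
second does a single lookup of the exact name.

```shell
#!/bin/sh
# exists_by_listing DIR NAME
# Walk every entry in DIR and compare lowercased names, i.e. the
# readdir()-style, case-insensitive collision check.
# (Hidden files are skipped here for brevity.)
exists_by_listing() {
    dir="$1"
    want=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    for entry in "$dir"/*; do
        [ -e "$entry" ] || continue
        name=$(basename "$entry" | tr '[:upper:]' '[:lower:]')
        if [ "$name" = "$want" ]; then
            return 0
        fi
    done
    return 1
}

# exists_by_stat DIR NAME
# One direct lookup of the exact name, i.e. the stat()-style check.
exists_by_stat() {
    [ -e "$1/$2" ]
}
```

Copying N files into one folder with the first approach means N full
listings of a folder that keeps growing, which is where the time goes when
each listing also has to be assembled from several servers.
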
>
> You can hack Samba to prevent such checks if your goal is just to get files
> copied less slowly (provided you are sure the files you are copying do not
> exist at the destination). But try running 'ls -l' on a _not_ cached folder
> with thousands of files: it will take tens of seconds. This is time your
> users will waste browsing shares.
>
> On 8 Feb 2017, at 13:17, Gary Lloyd <g.lloyd at keele.ac.uk> wrote:
>
> Thanks for the reply
>
> I've just done a bit more testing. If I use rsync from a gluster client to
> copy the same files to the mount point it only takes a couple of minutes.
> For some reason it's very slow on samba though (version 4.4.4).
>
> I have tried various samba tweaks / settings and have yet to get
> acceptable write speed on small files.
>
>
> On 8 February 2017 at 10:05, Дмитрий Глушенок <glush at jet.msk.su> wrote:
>
>> Hi,
>>
>> There are a number of tweaks/hacks to make it better, but IMHO overall
>> performance with small files is still unacceptable for folders with
>> thousands of entries.
>>
>> If your shares are not too large to be placed on a single filesystem and
>> you still want to use Gluster, it is possible to run a VM on top of Gluster.
>> Inside that VM you can create a ZFS/NTFS filesystem to be shared.
>>
>> On 8 Feb 2017, at 12:10, Gary Lloyd <g.lloyd at keele.ac.uk> wrote:
>>
>> Hi
>>
>> I am currently testing Gluster 3.9 replicated/distributed on CentOS 7.3
>> with Samba/CTDB.
>> I have been able to get it all up and running, but writing small files is
>> really slow.
>>
>> If I copy large files from gluster backed samba I get almost wire speed
>> (We only have 1Gb at the moment). I get around half that speed if I copy
>> large files to the gluster backed samba system, which I am guessing is due
>> to it being replicated (This is acceptable).
>>
>> Small file write performance seems really poor for us though:
>> As an example, I have an Eclipse IDE workspace folder that is 6MB in size
>> and has around 6000 files in it. A lot of these files are <1k in size.
>>
>> If I copy this up to gluster backed samba it takes almost one hour to get
>> there.
>> With our basic samba deployment it only takes about 5 minutes.
>>
>> Both systems reside on the same disks/SAN.
>>
>>
>> I was hoping that we would be able to move away from using a proprietary
>> SAN to house our network shares and use gluster instead.
>>
>> Does anyone have any suggestions of anything I could tweak to make it
>> better ?
>>
>> Many Thanks
>>
>>
>> --
>> Dmitry Glushenok
>> Jet Infosystems
>>
>>
>
> --
> Dmitry Glushenok
> Jet Infosystems
>
>