[Gluster-users] Slow performance on samba with small files
glush at jet.msk.su
Wed Feb 8 11:49:57 UTC 2017
For _every_ file copied, Samba performs readdir() to get all entries of the destination folder. The list is then searched for the filename (to prevent name collisions, since SMB shares are not case sensitive). The more files in the folder, the longer readdir() takes. It is a lot worse on Gluster, because the contents of a single folder are distributed among many servers, and Gluster has to merge many directory listings (each requested over the network) into one before returning it to the caller.
Rsync does not perform readdir(); IIRC it just checks for file existence with stat(). And since modern Gluster versions by default look for a file only at its expected destination (when the volume is balanced), that check is relatively fast.
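To make the difference concrete, here is a rough Python sketch of the two existence checks described above. It is only an illustration, not Samba's or rsync's actual code; the function names are made up for this example.

```python
import os
import tempfile


def exists_case_insensitive(directory, name):
    """Scan the whole directory listing and compare names case-insensitively.

    One full readdir pass per check, so the cost grows with the number of
    entries in the directory.
    """
    wanted = name.casefold()
    with os.scandir(directory) as entries:
        return any(entry.name.casefold() == wanted for entry in entries)


def exists_exact(directory, name):
    """Single stat() on the exact path; cost does not depend on directory size."""
    try:
        os.stat(os.path.join(directory, name))
        return True
    except FileNotFoundError:
        return False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for i in range(1000):
            open(os.path.join(d, f"file{i}.txt"), "w").close()
        # The scan has to walk the listing; the stat goes straight to one path.
        print(exists_case_insensitive(d, "FILE42.TXT"))
        print(exists_exact(d, "file42.txt"))
```

On a local filesystem both are cheap. The point is that the first one repeats a full listing for every file copied, and on Gluster each listing has to be assembled from several servers.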
You can hack Samba to skip these checks if your goal is just to make copying less slow (and you are sure the files you are copying do not already exist at the destination). But try running 'ls -l' on a _not_ cached folder with thousands of files: it will take tens of seconds. That is the time your users will waste browsing shares.
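If you want a number instead of a feeling, a small script like the one below can time such a listing on the Gluster mount. It is a sketch that assumes one readdir pass plus one lstat per entry is a fair approximation of what 'ls -l' does; the function name is made up, and the result only means something on a folder that is not already cached.

```python
import os
import sys
import time


def timed_long_listing(directory):
    """Return (entry_count, seconds) for one readdir pass plus an lstat per entry."""
    start = time.perf_counter()
    count = 0
    with os.scandir(directory) as entries:
        for entry in entries:
            # Roughly the per-entry metadata that 'ls -l' needs.
            entry.stat(follow_symlinks=False)
            count += 1
    return count, time.perf_counter() - start


if __name__ == "__main__":
    path = sys.argv[1] if len(sys.argv) > 1 else "."
    n, secs = timed_long_listing(path)
    print(f"{n} entries in {secs:.2f} s")
```

Run it with the folder path as the only argument, once against the Gluster mount and once against a local disk, to compare.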
> On 8 Feb 2017, at 13:17, Gary Lloyd <g.lloyd at keele.ac.uk> wrote:
> Thanks for the reply
> I've just done a bit more testing. If I use rsync from a gluster client to copy the same files to the mount point it only takes a couple of minutes.
> For some reason it's very slow on samba though (version 4.4.4).
> I have tried various samba tweaks / settings and have yet to get acceptable write speed on small files.
> Gary Lloyd
> I.T. Systems:Keele University
> Finance & IT Directorate
> Keele:Staffs:IC1 Building:ST5 5NB:UK
> +44 1782 733063
> On 8 February 2017 at 10:05, Дмитрий Глушенок <glush at jet.msk.su> wrote:
> There are a number of tweaks/hacks to make it better, but IMHO overall performance with small files is still unacceptable for folders with thousands of entries.
> If your shares are small enough to fit on a single filesystem and you still want to use Gluster, it is possible to run a VM on top of Gluster. Inside that VM you can create a ZFS/NTFS filesystem to be shared.
>> On 8 Feb 2017, at 12:10, Gary Lloyd <g.lloyd at keele.ac.uk> wrote:
>> I am currently testing gluster 3.9 replicated/distributed on CentOS 7.3 with samba/ctdb.
>> I have been able to get it all up and running, but writing small files is really slow.
>> If I copy large files from gluster backed samba I get almost wire speed (We only have 1Gb at the moment). I get around half that speed if I copy large files to the gluster backed samba system, which I am guessing is due to it being replicated (This is acceptable).
>> Small file write performance seems really poor for us though:
>> As an example I have an eclipse IDE workspace folder that is 6MB in size that has around 6000 files in it. A lot of these files are <1k in size.
>> If I copy this up to gluster backed samba it takes almost one hour to get there.
>> With our basic samba deployment it only takes about 5 minutes.
>> Both systems reside on the same disks/SAN.
>> I was hoping that we would be able to move away from using a proprietary SAN to house our network shares and use gluster instead.
>> Does anyone have any suggestions for anything I could tweak to make it better?
>> Many Thanks
>> Gary Lloyd
>> I.T. Systems:Keele University
>> Finance & IT Directorate
>> Keele:Staffs:IC1 Building:ST5 5NB:UK
> Dmitry Glushenok
> Jet Infosystems