[Gluster-users] MacOSX Finder performance woes

Thu Dec 20 02:29:41 UTC 2012

I'm rolling out four 6-node GlusterFS clusters for my employer.  Each
node is ~33TB of RAID6-backed storage (16x 3TB SATA disks in RAID6 with a
hot spare hanging off an LSI controller, with 2x SSDs configured for
caching), and Gluster is configured in distribute-replicate.  Each cluster
is roughly 200TB of raw space, 100TB usable after replication.
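
As a back-of-envelope check of those figures, here is a small Python
sketch.  The replica count of 2 is an assumption inferred from the
200TB raw / 100TB usable ratio, and the per-node size is the approximate
~33TB figure.

```python
# Rough capacity arithmetic for the clusters described above.
NODES_PER_CLUSTER = 6
TB_PER_NODE = 33      # approximate per-node figure
REPLICA_COUNT = 2     # assumption: inferred from raw vs. usable space
CLUSTERS = 4

raw_tb = NODES_PER_CLUSTER * TB_PER_NODE
usable_tb = raw_tb / REPLICA_COUNT

print(f"per cluster: {raw_tb} TB raw, {usable_tb:.0f} TB usable")
print(f"all {CLUSTERS} clusters: {raw_tb * CLUSTERS} TB raw, "
      f"{usable_tb * CLUSTERS:.0f} TB usable")
```

That gives 198TB raw and 99TB usable per cluster, which is where the
rounded 200TB / 100TB figures come from.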

Storage on each node is formatted as XFS with 512-byte inodes, and the
nodes run a fully patched CentOS6 and Gluster 3.3.1.  Each node has a
6-core Xeon processor (with HT for 12 threads) and 32GB of RAM.  Each node
runs 2x 10Gbps Ethernet over fiber in a bonded configuration (single IP
address per node) for a total of 20Gbit/s per node.

GlusterFS FUSE performance under Linux is great (clients run a mix of
Ubuntu 12.04 LTS for workstations and CentOS6 for servers).  Samba
performance back to Windows 7 clients is great.  NFS performance, via both
Gluster's userspace NFS server and CentOS6's native NFS4 kernel server,
is great for most other systems where we can't get the Gluster FUSE client
loaded (large industry-specific Linux boxes that are provided by vendors as
a "black box" solution, and only allow limited access via NFS or
SMB/CIFS).  All testing so far under those conditions shows throughput
orders of magnitude faster than our existing single-NAS solutions.

MacOSX Finder performance is a problem, however.  There's a huge bug in
MacOSX itself that prevents using NFS at all (discussions on other mailing
lists suggest it appeared somewhere around 10.6, and persists through
10.7 and 10.8).

Mounting via SMB under OSX is more stable than NFS.  However, in folders
with a large number of files, Finder goes looking for a corresponding Apple
Resource Fork file (for every "filename.ext", it looks for a
"._filename.ext").  Running tcpdump and wireshark on the Gluster nodes
shows that each resulting "FILE_NOT_FOUND" response takes a very long time
to get back to the client.  A single node configured as a pure NAS with the
same software (but without Gluster) is lightning fast.  As soon as
GlusterFS comes into play, the reporting of each "FILE_NOT_FOUND" slows
down dramatically, causing a directory with ~1000 images in it to take
well over 5 minutes to display its contents in MacOSX Finder.
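
For anyone who wants to reproduce the lookup pattern without Finder, here
is a minimal Python sketch that stats the "._filename.ext" companion of
every file in a directory and times the whole pass.  It only approximates
what Finder does (the real client may issue different SMB calls, and
client-side caching can hide the cost), and the function name is made up
for illustration.

```python
import os
import time


def time_sidecar_lookups(directory):
    """Stat the "._name" companion of every regular file in `directory`.

    Returns (files_checked, missing, elapsed_seconds).
    """
    # List the real files first so the directory scan is not timed.
    names = [
        entry.name for entry in os.scandir(directory)
        if entry.is_file() and not entry.name.startswith("._")
    ]
    missing = 0
    start = time.perf_counter()
    for name in names:
        try:
            os.stat(os.path.join(directory, "._" + name))
        except FileNotFoundError:
            missing += 1  # the negative lookup is the slow case of interest
    elapsed = time.perf_counter() - start
    return len(names), missing, elapsed
```

Calling time_sidecar_lookups() on a directory under a Gluster-backed mount
and again on the same data exported from a plain non-Gluster node should
show whether the per-lookup cost differs in the same way the packet
captures suggest.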

This problem is resolved somewhat by switching to AFP (via Netatalk loaded
on the GlusterFS nodes), but AFP has its own problems unique to that
protocol, and I'd rather stick to GlusterFS-FUSE, NFS or SMB, in that order
of preference.

It's worth noting that through the terminal, these problems don't exist.
Mounting via SMB, browsing to the volume in a terminal and running "ls" or
"find" style commands retrieves file listings at a similar speed to Linux
and Windows.  The problem is limited to clients using Finder to browse
directories, and again particularly ones with a large number of files that
don't have matching Apple Resource Fork files.  (Of note, creating empty
files of the matching "._filename.ext" format solves the performance
problem, but litters our filestores with millions of empty files, which we
don't want.)
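
For reference, the empty-file workaround can be sketched in Python as
below.  The function name and the dry-run default are my own choices for
illustration; in dry-run mode nothing is written, so it can also be used
just to count how many empty files a given tree would end up with.

```python
import os


def create_empty_sidecars(root, dry_run=True):
    """Create an empty "._name" next to every file under `root` lacking one.

    With dry_run=True (the default) nothing is written; the paths that
    would be created are only collected and returned.
    """
    created = []
    for dirpath, _dirnames, filenames in os.walk(root):
        existing = set(filenames)
        for name in filenames:
            if name.startswith("._"):
                continue  # already a sidecar; do not create "._._name"
            sidecar = "._" + name
            if sidecar in existing:
                continue  # companion file is already present
            path = os.path.join(dirpath, sidecar)
            if not dry_run:
                # "x" mode fails instead of truncating if the file has
                # appeared since the directory was listed.
                with open(path, "x"):
                    pass
            created.append(path)
    return created
```

Running len(create_empty_sidecars(path)) against a share gives the number
of empty files the workaround would add, which is the clutter we are trying
to avoid.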

I understand the problem is not strictly Gluster's issue.  Finder is
looking for a heck of a lot of files that don't exist (which is a pretty
silly design), and as far as we can see it only tends to occur when Samba
re-exports GlusterFS volumes.  Likewise, Apple's NFS bug, which has now
persisted across three releases of their OS, is pretty horrible.  But
hopefully I can at least describe the problem and prompt some testing by
others.

I haven't had a chance to test a MacOSX FUSE client due to time
constraints, but that would at least answer the question of whether the
lag in reporting files not found is Gluster's or Samba's.

-Dan