[Gluster-devel] Doing LS with a lot of directory, files

Tom Myny tom.myny at tigron.be
Thu Apr 24 10:47:37 UTC 2008


Hello,

I'm running afr on two storage servers, with three clients.
So far we have copied over 500 million small files onto it, split across
directories that each contain 1000 files.
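
As a rough check of the scale involved, the sketch below works out how many
directories such a layout needs. It assumes a simple two-level tree (parent
directories holding leaf directories holding files), which is a guess at the
layout and not something stated above; the function name and parameters are
made up for illustration, and 500 million is only the lower bound mentioned.

```python
def directory_counts(total_files, files_per_dir, dirs_per_parent):
    """Return (leaf_dirs, parent_dirs) for an assumed two-level layout."""
    # Integer ceiling division, so a partly filled directory still counts.
    leaf_dirs = (total_files + files_per_dir - 1) // files_per_dir
    parent_dirs = (leaf_dirs + dirs_per_parent - 1) // dirs_per_parent
    return leaf_dirs, parent_dirs


# Current layout: 1000 files per directory, 1000 directories per parent.
current = directory_counts(500_000_000, 1000, 1000)
# Planned layout: the same files, but only 100 directories per parent.
planned = directory_counts(500_000_000, 1000, 100)

print("current (leaf, parent):", current)
print("planned (leaf, parent):", planned)
```

Under these assumptions the number of leaf directories stays the same and
only the number of parent directories grows when the fan-out drops from
1000 to 100.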

When running ls in a directory containing 1000 directories, we see the
following issues:


- ls takes more than 15 minutes to complete in a directory with 1000
folders. (This will also be split into 100 folders later, but right now it
is a big problem.)
	-> Yes, for now it's ls --color=auto by default on Debian :D
- When doing copies from other clients, those copies halt until that ls is
complete.
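
One way to see how much of the time comes from the colouring is to compare a
names-only listing with one that also stats every entry, since a colourising
ls generally needs per-entry metadata while a plain listing may not. The
sketch below is only a measurement aid under that assumption; the function
names are made up, and whether the stat calls are the real bottleneck on an
afr/unify setup is something to verify, not a given.

```python
import os
import time


def list_names_only(path):
    """Read directory entry names only, without a stat call per entry."""
    return sorted(os.listdir(path))


def list_with_stat(path):
    """Read names and lstat() each entry, roughly what a colourising ls needs."""
    result = []
    for name in sorted(os.listdir(path)):
        info = os.lstat(os.path.join(path, name))
        result.append((name, info.st_mode))
    return result


def timed(func, path):
    """Return (number of entries, elapsed seconds) for one listing function."""
    start = time.perf_counter()
    entries = func(path)
    return len(entries), time.perf_counter() - start


if __name__ == "__main__":
    target = "."  # change to a directory on the mounted volume
    for func in (list_names_only, list_with_stat):
        count, seconds = timed(func, target)
        print(f"{func.__name__}: {count} entries in {seconds:.3f}s")
```

From the shell, running ls --color=never (or \ls to bypass the alias) against
the same directory gives a comparable quick check.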


Is there a way to:

1) Make ls faster? (OK, I know it can't be as fast as on the filesystem
itself, but on the local filesystem (or an NFS system) it takes at most
15 seconds.)
2) Keep other processes from freezing while someone is running ls?
(When we check the storage servers, they show a load of 0.00.)

The filesystems we use are based on xfs.
An example of a server config:

volume sas-ds
        type storage/posix
        option directory /sas/data
end-volume

volume sas-ns
        type storage/posix
        option directory /sas/ns
end-volume

volume sata-ds
        type storage/posix
        option directory /sata/data
end-volume

volume sata-ns
        type storage/posix
        option directory /sata/ns
end-volume

volume sas-backup-ds
        type protocol/client
        option transport-type tcp/client
        option remote-host x.x.x.x
        option remote-subvolume sas-ds
end-volume

volume sas-backup-ns
        type protocol/client
        option transport-type tcp/client
        option remote-host x.x.x.x
        option remote-subvolume sas-ns
end-volume

...

volume sas-unify
        type cluster/unify
        subvolumes sas-ds-afr
        option namespace sas-ns-afr
        option scheduler rr
end-volume

volume sata-unify
        type cluster/unify
        subvolumes sata-ds-afr
        option namespace sata-ns-afr
        option scheduler rr
end-volume

volume sas
        type performance/io-threads
        option thread-count 16
        option cache-size 256MB
        subvolumes sas-unify
end-volume

volume sata
        type performance/io-threads
        option thread-count 16
        option cache-size 256MB
        subvolumes sata-unify
end-volume

..

I hope to fix this, because we want to double this next year :)


Regards,
Tom





