[Gluster-users] How many subfolders in parent folders ?
laurent.chouinard at ubisoft.com
Wed May 8 21:56:17 UTC 2013
The performance of this will depend heavily on the hardware that your cluster uses.
For example, I have a cluster here of 4 nodes with 4 SSDs each, and a replication factor set to 2. Bricks are using XFS. Nodes are inter-connected with 10 GigE networking.
The folder structure we use is based on hexadecimal representation: we take the first two bytes of the file name to decide where it goes. This way, we end up with 256 top folders, each holding 256 subfolders.
That makes a total of only 65 792 folders (256 x 256 + 256), and if you spread 100 million different files across the 65 536 bottom-level folders, the result is a bit over 1500 files per folder, which is very reasonable.
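To make the layout and the arithmetic above concrete, here is a minimal Python sketch. It rests on assumptions: file names are taken to start with at least four hexadecimal digits (two bytes written as hex), each byte becomes one folder level, and the function name shard_path and the exact path format are hypothetical, not our actual script.

```python
from pathlib import PurePosixPath

HEX_DIGITS = set("0123456789abcdef")

def shard_path(file_name: str) -> PurePosixPath:
    """Map a file name to <byte1>/<byte2>/<file_name> (hypothetical layout).

    Assumes the name begins with four hex digits, i.e. two bytes.
    """
    prefix = file_name[:4].lower()
    if len(prefix) < 4 or not set(prefix) <= HEX_DIGITS:
        raise ValueError(f"name does not start with 4 hex digits: {file_name!r}")
    return PurePosixPath(prefix[:2]) / prefix[2:4] / file_name

# Folder count for a two-level layout with 256 choices per level.
top_folders = 256
leaf_folders = 256 * 256
total_folders = top_folders + leaf_folders

# Average files per bottom-level folder for 100 million files,
# assuming the names are evenly distributed.
files_per_leaf = 100_000_000 / leaf_folders

print(shard_path("a3f49c01.dat"))
print(total_folders, round(files_per_leaf))
```

The even spread only holds if the leading bytes of the names are well distributed (hashes or random IDs, for example). Sequential names would pile up in a few folders.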
As for how fast such a layout can be crawled: my cluster here has 3 million files at the moment. If I run a "find ." command from inside one of the 256 top folders, it takes 14 seconds. I can extrapolate that to just over 59 minutes (256 x 14 seconds) if I were to crawl through all of them.
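The extrapolation is simple multiplication. A short sketch, assuming every top folder takes about the same time as the one I measured and that the files are evenly spread (both are assumptions, not measurements):

```python
# Measured: one "find ." inside a single top folder.
seconds_per_top_folder = 14
top_folders = 256
total_files = 3_000_000

# Assumes crawl time scales linearly across all top folders.
total_seconds = seconds_per_top_folder * top_folders
total_minutes = total_seconds / 60

# Rough implied crawl rate, assuming an even spread of files.
files_per_top_folder = total_files / top_folders
files_per_second = files_per_top_folder / seconds_per_top_folder

print(f"{total_seconds} s = {total_minutes:.1f} min")
print(f"about {files_per_second:.0f} files/s")
```

On different hardware, or with a cold cache, the per-folder time could be quite different, so treat this as an order-of-magnitude estimate.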
I would advise against nesting too deep (folder in folder in folder), because as you multiply the possibilities, you'll end up with a file system with millions of entries just for the folders themselves.
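To see how quickly the folder count grows with depth, here is a small sketch. It assumes 256 choices per level (one hex byte per level, as in the layout above); the function name folder_count is made up for illustration.

```python
def folder_count(depth: int, fanout: int = 256) -> int:
    """Total number of folders in a fully populated tree of the given depth."""
    return sum(fanout ** level for level in range(1, depth + 1))

for depth in (1, 2, 3, 4):
    print(depth, f"{folder_count(depth):,}")
```

With two levels you get 65,792 folders. A third level already means over 16 million folders, and a fourth over 4 billion, which is more folders than most data sets have files.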