[Gluster-users] Performance: lots of small files, hdd, nvme etc.

Mon Apr 3 17:00:48 UTC 2023

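Hello,

you can read files directly from the underlying brick filesystem first
(ext4, XFS, ...), for example: /srv/glusterfs/wwww/brick.

As a fallback, if a file is missing on the local brick, read it through
the mounted GlusterFS path, for example: /mnt/shared/www/...
That access also heals the missing entries on the local node.

Writes are the only thing that has to go through the mount.glusterfs
mount point.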
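
A minimal sketch of that read path in Python, only to illustrate the idea.
It assumes a replicated volume where the local brick holds a full copy of
the data; the function names are made up for this example, and the two
root paths are just the example paths from above.

```python
from pathlib import Path

# Example paths from this mail; adjust to your own setup.
BRICK_ROOT = Path("/srv/glusterfs/wwww/brick")  # local brick (ext4/XFS), read-only use
MOUNT_ROOT = Path("/mnt/shared/www")            # GlusterFS client mount


def read_file(rel_path, brick_root=BRICK_ROOT, mount_root=MOUNT_ROOT):
    """Read from the local brick first, fall back to the GlusterFS mount."""
    try:
        return (Path(brick_root) / rel_path).read_bytes()
    except FileNotFoundError:
        # Not on the local brick: go through the mount, which performs the
        # GlusterFS lookup (and raises FileNotFoundError if it does not exist).
        return (Path(mount_root) / rel_path).read_bytes()


def write_file(rel_path, data, mount_root=MOUNT_ROOT):
    """Write only through the GlusterFS mount, never to the brick."""
    target = Path(mount_root) / rel_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
```

A webserver would call read_file("123/456/123456789/default.jpg") for
reads and write_file(...) for uploads, so only misses on the local brick
and writes pay the GlusterFS cost.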

On 3/30/2023 11:26 AM, Hu Bert wrote:
> - workload: the (un)famous "lots of small files" setting
> - currently 70% of the volume is used: ~32TB
> - file size: few KB up to 1MB
> - so there are hundreds of millions of files (and millions of directories)
> - each image has an ID
> - under the base dir the IDs are split into 3 digits
> - dir structure: /basedir/(000-999)/(000-999)/ID/[lotsoffileshere]
> - example for ID 123456789: /basedir/123/456/123456789/default.jpg
> - maybe this structure isn't good and e.g. this would be better:
> /basedir/IDs/[here the files] - so millions of ID-dirs directly under
> /basedir/
> - frequent access to the files by webservers (nginx, tomcat): lookup
> if file exists, read/write images etc.
> - Strahil mentioned: "Keep in mind that negative searches (searches of
> non-existing/deleted objects) has highest penalty." <--- that happens
> very often...
> - server load on high traffic days: > 100 (mostly iowait)
> - bad are server reboots (read filesystem info etc.)
> - really bad is a sw raid rebuild/resync
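
For reference, the quoted ID layout can be written as a small helper, so
the same relative path can be used against the brick and against the
mount. This is only a sketch: the function name is made up, and the
zero-padding for IDs shorter than 6 digits is an assumption, since the
quoted mail does not say how such IDs are handled.

```python
def id_to_path(image_id, filename, basedir="/basedir"):
    """Map an image ID to /basedir/(000-999)/(000-999)/ID/filename."""
    id_str = str(image_id)
    # Assumption: short IDs are zero-padded for the two prefix directories only;
    # the ID directory itself keeps the unpadded ID.
    padded = id_str.zfill(6)
    return f"{basedir}/{padded[0:3]}/{padded[3:6]}/{id_str}/{filename}"


# Example from the quoted mail:
# id_to_path(123456789, "default.jpg") -> /basedir/123/456/123456789/default.jpg
```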

-- 
S pozdravom / Yours sincerely
Ing. Jan Hudoba

http://www.jahu.sk