[Gluster-users] Memory usage high on server sides
Chris Jin
chris at pikicentral.com
Thu Apr 15 04:18:42 UTC 2010
Hi Tejas,
> Problems you saw -
>
> 1) High memory usage on client where gluster volume is mounted
Memory usage for clients is 0% after copying.
$ ps auxf
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 19692 1.3 0.0 262148 6980 ? Ssl Apr12 61:33 /sbin/glusterfs --log-level=NORMAL --volfile=/u2/git/modules/shared/glusterfs/clients/r2/c2.vol /gfs/r2/f2
> 2) High memory usage on server
Yes.
$ ps auxf
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 26472 2.2 29.1 718100 600260 ? Ssl Apr09 184:09 glusterfsd -f /etc/glusterfs/servers/r2/f1.vol
root 26485 1.8 39.8 887744 821384 ? Ssl Apr09 157:16 glusterfsd -f /etc/glusterfs/servers/r2/f2.vol
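If it helps, the growth is easy to watch over time with a plain procps
one-liner (nothing gluster-specific here):

$ ps -C glusterfsd -o pid,rss,vsz,etime,args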
> 3) 2 days to copy 300 GB data
More than 700GB, actually. There are two folders: the first is copied to
server 1 and server 2, and the second to server 2 and server 3. The vol
files are below.
> About the config, can you provide the following for both old and new systems -
>
> 1) OS and kernel level on gluster servers and clients
Debian Kernel 2.6.18-6-amd64
$ uname -a
Linux fs2 2.6.18-6-amd64 #1 SMP Tue Aug 19 04:30:56 UTC 2008 x86_64 GNU/Linux
> 2) volume file from servers and clients
#####Server Vol file (f1.vol)
# The same settings for f2.vol and f3.vol, just different dirs and ports
# f1 f3 for Server 1, f1 f2 for Server 2, f2 f3 for Server 3
volume posix1
  type storage/posix
  option directory /gfs/r2/f1
end-volume

volume locks1
  type features/locks
  subvolumes posix1
end-volume

volume brick1
  type performance/io-threads
  option thread-count 8
  subvolumes locks1
end-volume

volume server-tcp
  type protocol/server
  option transport-type tcp
  option auth.addr.brick1.allow 192.168.0.*
  option transport.socket.listen-port 6991
  option transport.socket.nodelay on
  subvolumes brick1
end-volume
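Each glusterfsd instance listens on its own port, which can be confirmed
with a standard netstat (the -p column shows the owning process):

$ netstat -tlnp | grep glusterfsd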
#####Client Vol file (c1.vol)
# The same settings for c2.vol and c3.vol
# s2 s3 for c2, s3 s1 for c3
volume s1
  type protocol/client
  option transport-type tcp
  option remote-host 192.168.0.31
  option transport.socket.nodelay on
  option transport.remote-port 6991
  option remote-subvolume brick1
end-volume

volume s2
  type protocol/client
  option transport-type tcp
  option remote-host 192.168.0.32
  option transport.socket.nodelay on
  option transport.remote-port 6991
  option remote-subvolume brick1
end-volume

volume mirror
  type cluster/replicate
  option data-self-heal off
  option metadata-self-heal off
  option entry-self-heal off
  subvolumes s1 s2
end-volume

volume writebehind
  type performance/write-behind
  option cache-size 100MB
  option flush-behind off
  subvolumes mirror
end-volume

volume iocache
  type performance/io-cache
  option cache-size `grep 'MemTotal' /proc/meminfo | awk '{print $2 * 0.2 / 1024}' | cut -f1 -d.`MB
  option cache-timeout 1
  subvolumes writebehind
end-volume

volume quickread
  type performance/quick-read
  option cache-timeout 1
  option max-file-size 256Kb
  subvolumes iocache
end-volume

volume statprefetch
  type performance/stat-prefetch
  subvolumes quickread
end-volume
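The clients mount these vol files the same way as in the ps output above;
for c1 that would presumably be (the c1.vol path and /gfs/r2/f1 mount
point are inferred from the c2 command line, not copied from the machine):

$ /sbin/glusterfs --log-level=NORMAL --volfile=/u2/git/modules/shared/glusterfs/clients/r2/c1.vol /gfs/r2/f1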
> 3) Filesystem type of backend gluster subvolumes
ext3
> 4) How close to full the backend subvolumes are
Each server has new 2TB hard disks, so the backend subvolumes are far
from full.
> 5) The exact copy command .. did you mount the volumes from
> old and new system on a single machine and did cp or used rsync
> or some other method ? If something more than just a cp, please
> send the exact command line you used.
The old file system uses DRBD and NFS.
The exact command is
sudo cp -R -v -p -P /nfsmounts/nfs3/photo .
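For what it's worth, a roughly equivalent rsync invocation (restartable
if the copy dies partway) would be:

$ rsync -a -v /nfsmounts/nfs3/photo .

rsync's -a covers the recursive, permission-preserving, symlink-keeping
behavior of the cp flags above.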
> 6) How many files/directories ( tentative ) in that 300GB data ( would help in
> trying to reproduce inhouse with a smaller test bed ).
I cannot tell exactly, but the file sizes range from 1KB to 200KB, with
an average around 20KB.
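If an exact count would help with the test bed, I can run something like
this against the old NFS mount (it will take a while over NFS):

$ find /nfsmounts/nfs3/photo -type f | wc -l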
> 7) Was there other load on the new or old system ?
The old systems are still in use by the web servers. The new systems run
on the same machines, but on different hard disks.
> 8) Any other patterns you noticed.
At one point, one client tried to connect to a server via its external
IP address.
Also, using the distribute translator across all three mirrors makes the
system about twice as slow as using three separately mounted folders
(a sketch of that setup is below).
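For that distribute test, the extra translator on the client side looked
roughly like this, layered over the three mirrors (volume names here are
illustrative, not the exact vol file):

volume dist
  type cluster/distribute
  subvolumes mirror1 mirror2 mirror3
end-volume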
Is this information enough?
Please take a look.
Regards,
Chris