[Gluster-users] Gluster high RPC calls and reply
Gurdeep Singh (Guru)
guru at bazaari.com.au
Mon Jul 7 13:33:06 UTC 2014
Hello Niels,
I ran nethogs on the interface to see which processes might be using the bandwidth:
NetHogs version 0.8.0
PID USER PROGRAM DEV SENT RECEIVED
18611 root /usr/sbin/glusterfsd tun0 16.307 17.547 KB/sec
1055 root /usr/sbin/glusterfs tun0 17.249 16.259 KB/sec
13439 guru sshd: guru@pts/0 tun0 0.966 0.051 KB/sec
18625 root /usr/sbin/glusterfs tun0 0.000 0.000 KB/sec
18629 root /usr/sbin/glusterfs tun0 0.000 0.000 KB/sec
9636 root /usr/sbin/glusterd tun0 0.000 0.000 KB/sec
? root unknown TCP 0.000 0.000 KB/sec
TOTAL 34.523 33.856 KB/sec
It's the glusterfs and glusterfsd processes.
I looked at the capture file and can see that the LOOKUPs are being made on random files.
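To see which basenames dominate, tshark should be able to pull them straight out of the capture. This is only a sketch: it assumes a reasonably recent tshark (for the -Y display filter) built with Wireshark's GlusterFS dissector and its 'glusterfs.bname' field, and capture.pcap stands in for my capture file:

# tshark -r capture.pcap -Y glusterfs.bname -T fields -e glusterfs.bname | sort | uniq -c | sort -rn | head

Counting the basenames this way would show whether the files are really random or just a few hot entries being looked up over and over.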
For PID information, please see this:
[guru@srv2 ~]$ sudo netstat -tpn | grep 49152
tcp 0 0 127.0.0.1:49152 127.0.0.1:1012 ESTABLISHED 18611/glusterfsd
tcp 0 0 127.0.0.1:49152 127.0.0.1:1016 ESTABLISHED 18611/glusterfsd
tcp 0 0 127.0.0.1:1016 127.0.0.1:49152 ESTABLISHED 18625/glusterfs
tcp 0 0 10.8.0.6:1021 10.8.0.1:49152 ESTABLISHED 1055/glusterfs
tcp 0 0 10.8.0.6:49152 10.8.0.1:1017 ESTABLISHED 18611/glusterfsd
tcp 0 0 10.8.0.6:1020 10.8.0.1:49152 ESTABLISHED 18629/glusterfs
tcp 0 0 127.0.0.1:1023 127.0.0.1:49152 ESTABLISHED 18629/glusterfs
tcp 0 0 10.8.0.6:49152 10.8.0.1:1022 ESTABLISHED 18611/glusterfsd
tcp 0 0 10.8.0.6:49152 10.8.0.1:1021 ESTABLISHED 18611/glusterfsd
tcp 0 0 127.0.0.1:49152 127.0.0.1:1023 ESTABLISHED 18611/glusterfsd
tcp 0 0 127.0.0.1:1012 127.0.0.1:49152 ESTABLISHED 1055/glusterfs
tcp 0 0 10.8.0.6:1019 10.8.0.1:49152 ESTABLISHED 18625/glusterfs
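As a cross-check, 'gluster volume status' lists the brick port and PID in one place, which avoids the netstat/ps round trip (output omitted here):

# gluster volume status gv0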
[guru@srv2 ~]$ ps -v 18611
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.8/FAQ
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
18611 ? Ssl 14:12 0 0 650068 20404 2.0 /usr/sbin/glusterfsd -s srv2 --volfile-id gv0.srv2.root-gluster-vol0 -p /var/lib/glusterd/vols/gv0
[guru@srv2 ~]$ ps -v 18629
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.8/FAQ
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
18629 ? Ssl 0:04 0 0 333296 17380 1.7 /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/glusterd/glustershd/run/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/823fa3197e2d1841be8881500723b063.socket --xlator-option *replicate*.node-uuid=84af83c9-0a29-
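Since PID 18629 is the self-heal daemon, I also want to check whether it is actually crawling anything. If the following keeps listing entries, the shd alone could explain a steady stream of LOOKUPs:

# gluster volume heal gv0 info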
[guru@srv2 ~]$ ps -v 18625
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.8/FAQ
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
18625 ? Ssl 0:03 0 0 239528 41040 4.0 /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/5ad5b036fd636cc5dddffa73593e4089.socket
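PID 18625 is the Gluster NFS server. My understanding is that if nothing mounts the volume over NFS, this daemon can be switched off per volume, which would rule it out as a traffic source:

# gluster volume set gv0 nfs.disable on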
[guru@srv2 ~]$ sudo nethogs tun0
Waiting for first packet to arrive (see sourceforge.net bug 1019381)
[guru@srv2 ~]$ rpm -qa | grep gluster
glusterfs-3.5.1-1.el6.x86_64
glusterfs-cli-3.5.1-1.el6.x86_64
glusterfs-libs-3.5.1-1.el6.x86_64
glusterfs-fuse-3.5.1-1.el6.x86_64
glusterfs-server-3.5.1-1.el6.x86_64
glusterfs-api-3.5.1-1.el6.x86_64
[guru@srv2 ~]$
I don’t see anything odd here. Please suggest.
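In the meantime I can enable volume profiling, which should break down the file operations (LOOKUP included) per brick and help narrow down the source:

# gluster volume profile gv0 start
# gluster volume profile gv0 info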
Thanks,
Gurdeep.
On 7 Jul 2014, at 9:06 pm, Niels de Vos <ndevos at redhat.com> wrote:
> On Sun, Jul 06, 2014 at 11:28:51PM +1000, Gurdeep Singh (Guru) wrote:
>> Hello,
>>
>> I have set up Gluster in replicate mode and it's working fine.
>>
>> I am seeing constant chatter between the hosts of LOOKUP calls and
>> LOOKUP replies, and I am trying to understand why this traffic is
>> being generated constantly. Please look at the attached image. This
>> traffic uses around 200 KB/s of bandwidth and is exhausting the
>> allocated monthly bandwidth on our two VPSes.
>
> You can use Wireshark to identify which process does the LOOKUP calls.
> For this, do the following:
>
> 1. select a LOOKUP Call
> 2. enable the 'packet details' pane (found in the main menu, 'view')
> 3. expand the 'Transmission Control Protocol' tree
> 4. check the 'Source port' of the LOOKUP Call
>
> Together with the 'Source' and the 'Source port' you can go to the
> server that matches the 'Source' address. A command like this would give
> you the PID of the process in the right column:
>
> # netstat -tpn | grep $SOURCE_PORT
>
> And with 'ps -v $PID' you can check which process is responsible for the
> LOOKUP. This process can be a fuse-mount, self-heal-daemon or any other
> glusterfs-client. Depending on the type of client, you may be able to tune
> the workload or other options a little.
>
> In Wireshark you can also check which filename is LOOKUP'd: just expand
> the 'GlusterFS' part in the 'packet details' and check the 'Basename'.
> Maybe this filename (without the directory structure) gives you an
> idea of which activity is causing the LOOKUPs.
>
> HTH,
> Niels
>
>>
>> The configuration I have for Gluster is:
>>
>> [guru@srv1 ~]$ sudo gluster volume info
>> [sudo] password for guru:
>>
>> Volume Name: gv0
>> Type: Replicate
>> Volume ID: dc8dc3f2-f5bd-4047-9101-acad04695442
>> Status: Started
>> Number of Bricks: 1 x 2 = 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: srv1:/root/gluster-vol0
>> Brick2: srv2:/root/gluster-vol0
>> Options Reconfigured:
>> cluster.lookup-unhashed: on
>> performance.cache-refresh-timeout: 60
>> performance.cache-size: 1GB
>> storage.health-check-interval: 30
>>
>>
>>
>> Please suggest how to fine tune the RPC calls/reply.
>
>