[Gluster-users] Gluster high RPC calls and reply
Gurdeep Singh (Guru)
guru at bazaari.com.au
Mon Jul 7 13:52:36 UTC 2014
[guru at srv2 ~]$ ps -v 1055
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.8/FAQ
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
1055 ? Ssl 86:01 31 0 319148 33092 3.2 /usr/sbin/glusterfs --volfile-server=srv2 --volfile-id=/gv0 /var/www/html/image/
[guru at srv2 ~]$
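(The "bad syntax" warning above is only procps objecting to the BSD-style 'v'
option being passed with a dash; the output itself is still valid. One way to
avoid the warning, assuming a procps-based ps, is to request the columns
explicitly, for example:

[guru at srv2 ~]$ ps -o pid,stat,time,rss,pmem,args -p 1055

which should print much the same information without the complaint.)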
On 7 Jul 2014, at 11:49 pm, Pranith Kumar Karampuri <pkarampu at redhat.com> wrote:
>
> On 07/07/2014 07:03 PM, Gurdeep Singh (Guru) wrote:
>> Hello Niels,
>>
>> I ran nethogs on the interface to see which process might be using the bandwidth:
>>
>> NetHogs version 0.8.0
>>
>> PID USER PROGRAM DEV SENT RECEIVED
>> 18611 root /usr/sbin/glusterfsd tun0 16.307 17.547 KB/sec
>> 1055 root /usr/sbin/glusterfs tun0 17.249 16.259 KB/sec
>> 13439 guru sshd: guru at pts/0 tun0 0.966 0.051 KB/sec
>> 18625 root /usr/sbin/glusterfs tun0 0.000 0.000 KB/sec
>> 18629 root /usr/sbin/glusterfs tun0 0.000 0.000 KB/sec
>> 9636 root /usr/sbin/glusterd tun0 0.000 0.000 KB/sec
>> ? root unknown TCP 0.000 0.000 KB/sec
>>
>> TOTAL 34.523 33.856 KB/sec
>>
>>
>>
>>
> Which process corresponds to '1055'?
>
> Pranith
>> They are the glusterfs and glusterfsd processes.
>>
>> I looked at the capture file and can see that the lookups are being made on random files.
>>
>> For PID information, please see this:
>>
>> [guru at srv2 ~]$ sudo netstat -tpn | grep 49152
>> tcp 0 0 127.0.0.1:49152 127.0.0.1:1012 ESTABLISHED 18611/glusterfsd
>> tcp 0 0 127.0.0.1:49152 127.0.0.1:1016 ESTABLISHED 18611/glusterfsd
>> tcp 0 0 127.0.0.1:1016 127.0.0.1:49152 ESTABLISHED 18625/glusterfs
>> tcp 0 0 10.8.0.6:1021 10.8.0.1:49152 ESTABLISHED 1055/glusterfs
>> tcp 0 0 10.8.0.6:49152 10.8.0.1:1017 ESTABLISHED 18611/glusterfsd
>> tcp 0 0 10.8.0.6:1020 10.8.0.1:49152 ESTABLISHED 18629/glusterfs
>> tcp 0 0 127.0.0.1:1023 127.0.0.1:49152 ESTABLISHED 18629/glusterfs
>> tcp 0 0 10.8.0.6:49152 10.8.0.1:1022 ESTABLISHED 18611/glusterfsd
>> tcp 0 0 10.8.0.6:49152 10.8.0.1:1021 ESTABLISHED 18611/glusterfsd
>> tcp 0 0 127.0.0.1:49152 127.0.0.1:1023 ESTABLISHED 18611/glusterfsd
>> tcp 0 0 127.0.0.1:1012 127.0.0.1:49152 ESTABLISHED 1055/glusterfs
>> tcp 0 0 10.8.0.6:1019 10.8.0.1:49152 ESTABLISHED 18625/glusterfs
>> [guru at srv2 ~]$ ps -v 18611
>> Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.8/FAQ
>> PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
>> 18611 ? Ssl 14:12 0 0 650068 20404 2.0 /usr/sbin/glusterfsd -s srv2 --volfile-id gv0.srv2.root-gluster-vol0 -p /var/lib/glusterd/vols/gv0
>> [guru at srv2 ~]$ ps -v 18629
>> Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.8/FAQ
>> PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
>> 18629 ? Ssl 0:04 0 0 333296 17380 1.7 /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/glusterd/glustershd/run/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/823fa3197e2d1841be8881500723b063.socket --xlator-option *replicate*.node-uuid=84af83c9-0a29-
>> [guru at srv2 ~]$ ps -v 18625
>> Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.8/FAQ
>> PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
>> 18625 ? Ssl 0:03 0 0 239528 41040 4.0 /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/5ad5b036fd636cc5dddffa73593e4089.socket
>> [guru at srv2 ~]$ sudo nethogs tun0
>> Waiting for first packet to arrive (see sourceforge.net bug 1019381)
>> [guru at srv2 ~]$ rpm -qa | grep gluster
>> glusterfs-3.5.1-1.el6.x86_64
>> glusterfs-cli-3.5.1-1.el6.x86_64
>> glusterfs-libs-3.5.1-1.el6.x86_64
>> glusterfs-fuse-3.5.1-1.el6.x86_64
>> glusterfs-server-3.5.1-1.el6.x86_64
>> glusterfs-api-3.5.1-1.el6.x86_64
>> [guru at srv2 ~]$
>>
>>
>> I don’t see anything odd here. Please suggest.
>>
>> Thanks,
>> Gurdeep.
>>
>>
>>
>>
>>
>> On 7 Jul 2014, at 9:06 pm, Niels de Vos <ndevos at redhat.com> wrote:
>>
>>> On Sun, Jul 06, 2014 at 11:28:51PM +1000, Gurdeep Singh (Guru) wrote:
>>>> Hello,
>>>>
>>>> I have set up Gluster as a replicate volume and it's working fine.
>>>>
>>>> I am seeing constant chatter between the hosts in the form of LOOKUP calls
>>>> and replies, and I am trying to understand why this traffic is being
>>>> generated continuously. Please look at the attached image. This traffic
>>>> uses around 200 KB/s of constant bandwidth and is exhausting the monthly
>>>> bandwidth allocation on our two VPS instances.
>>>
>>> You can use Wireshark to identify which process is issuing the LOOKUP calls.
>>> To do this, proceed as follows:
>>>
>>> 1. select a LOOKUP Call
>>> 2. enable the 'packet details' pane (found in the main menu, 'view')
>>> 3. expand the 'Transmission Control Protocol' tree
>>> 4. check the 'Source port' of the LOOKUP Call
>>>
>>> With the 'Source' address and the 'Source port' you can go to the server
>>> that matches the 'Source' address. A command like the following will give
>>> you the PID of the process in the right-hand column:
>>>
>>> # netstat -tpn | grep $SOURCE_PORT
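>>>
>>> (If netstat is not available, the same port-to-PID mapping can usually be
>>> obtained with lsof or ss, assuming those tools are installed:
>>>
>>> # lsof -nP -iTCP:$SOURCE_PORT
>>> # ss -tpn sport = :$SOURCE_PORT
>>>
>>> Both should list the owning process next to the connection.)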
>>>
>>> And with 'ps -v $PID' you can check which process is responsible for the
>>> LOOKUP. This process can be a FUSE mount, the self-heal daemon or any other
>>> glusterfs client. Depending on the type of client, you may be able to tune
>>> the workload or other options a little.
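>>>
>>> (As a convenience, all gluster-related connections and their owning
>>> processes can be listed in one pass; a rough sketch, assuming a
>>> procps-based ps and that netstat shows the PID/Program column:
>>>
>>> for pid in $(netstat -tpn | awk '/gluster/ {split($7, a, "/"); print a[1]}' | sort -u)
>>> do
>>>     ps -o pid,args -p "$pid"
>>> done
>>>
>>> This only wraps the netstat and ps steps described above.)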
>>>
>>> In Wireshark you can also check which filename is being LOOKUP'd: just
>>> expand the 'GlusterFS' part in the 'packet details' and check the
>>> 'Basename'. Maybe this filename (without the directory structure) gives you
>>> an idea of which activity is causing the LOOKUPs.
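>>>
>>> (The same check can be scripted with tshark; the field name below is what
>>> the GlusterFS dissector is expected to register for the basename, so treat
>>> it as an assumption, and depending on the tshark version the display-filter
>>> option is -Y or -R. With a capture file, e.g. lookup.pcap:
>>>
>>> # tshark -r lookup.pcap -Y 'glusterfs.bname' -T fields -e glusterfs.bname | sort | uniq -c | sort -rn | head
>>>
>>> The most frequent basenames should point at the files being looked up most
>>> often.)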
>>>
>>> HTH,
>>> Niels
>>>
>>>>
>>>> The configuration I have for Gluster is:
>>>>
>>>> [guru at srv1 ~]$ sudo gluster volume info
>>>> [sudo] password for guru:
>>>>
>>>> Volume Name: gv0
>>>> Type: Replicate
>>>> Volume ID: dc8dc3f2-f5bd-4047-9101-acad04695442
>>>> Status: Started
>>>> Number of Bricks: 1 x 2 = 2
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: srv1:/root/gluster-vol0
>>>> Brick2: srv2:/root/gluster-vol0
>>>> Options Reconfigured:
>>>> cluster.lookup-unhashed: on
>>>> performance.cache-refresh-timeout: 60
>>>> performance.cache-size: 1GB
>>>> storage.health-check-interval: 30
>>>>
>>>>
>>>>
>>>> Please suggest how to fine-tune these RPC calls/replies.
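>>>>
>>>> (One way to see which bricks and file operations generate the traffic,
>>>> without packet captures, might be the built-in profiler; it reports
>>>> per-brick FOP counts, including LOOKUP, while enabled:
>>>>
>>>> # gluster volume profile gv0 start
>>>> # gluster volume profile gv0 info
>>>> # gluster volume profile gv0 stop
>>>>
>>>> This is only a way of narrowing the source down, not a tuning change in
>>>> itself.)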
>>>
>>>
>>
>>
>>
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>