[Gluster-users] Gluster high RPC calls and reply
Gurdeep Singh (Guru)
guru at bazaari.com.au
Mon Jul 7 13:33:06 UTC 2014
Hello Niels,
I ran nethogs on the interface to see which processes might be using the bandwidth:
NetHogs version 0.8.0
PID USER PROGRAM DEV SENT RECEIVED
18611 root /usr/sbin/glusterfsd tun0 16.307 17.547 KB/sec
1055 root /usr/sbin/glusterfs tun0 17.249 16.259 KB/sec
13439 guru sshd: guru@pts/0 tun0 0.966 0.051 KB/sec
18625 root /usr/sbin/glusterfs tun0 0.000 0.000 KB/sec
18629 root /usr/sbin/glusterfs tun0 0.000 0.000 KB/sec
9636 root /usr/sbin/glusterd tun0 0.000 0.000 KB/sec
? root unknown TCP 0.000 0.000 KB/sec
TOTAL 34.523 33.856 KB/sec
It's the glusterfs and glusterfsd processes.
I looked at the capture file and can see that the LOOKUPs are being made on random files.
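To see which basenames dominate, tshark should be able to pull them straight out of the capture. This is only a sketch: it assumes a reasonably recent tshark (for the -Y display filter) built with Wireshark's GlusterFS dissector and its 'glusterfs.bname' field, and capture.pcap stands in for my capture file:

# tshark -r capture.pcap -Y glusterfs.bname -T fields -e glusterfs.bname | sort | uniq -c | sort -rn | head

Counting the basenames this way would show whether the files are really random or just a few hot entries being looked up over and over.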
For PID information, please see this:
[guru@srv2 ~]$ sudo netstat -tpn | grep 49152
tcp 0 0 127.0.0.1:49152 127.0.0.1:1012 ESTABLISHED 18611/glusterfsd
tcp 0 0 127.0.0.1:49152 127.0.0.1:1016 ESTABLISHED 18611/glusterfsd
tcp 0 0 127.0.0.1:1016 127.0.0.1:49152 ESTABLISHED 18625/glusterfs
tcp 0 0 10.8.0.6:1021 10.8.0.1:49152 ESTABLISHED 1055/glusterfs
tcp 0 0 10.8.0.6:49152 10.8.0.1:1017 ESTABLISHED 18611/glusterfsd
tcp 0 0 10.8.0.6:1020 10.8.0.1:49152 ESTABLISHED 18629/glusterfs
tcp 0 0 127.0.0.1:1023 127.0.0.1:49152 ESTABLISHED 18629/glusterfs
tcp 0 0 10.8.0.6:49152 10.8.0.1:1022 ESTABLISHED 18611/glusterfsd
tcp 0 0 10.8.0.6:49152 10.8.0.1:1021 ESTABLISHED 18611/glusterfsd
tcp 0 0 127.0.0.1:49152 127.0.0.1:1023 ESTABLISHED 18611/glusterfsd
tcp 0 0 127.0.0.1:1012 127.0.0.1:49152 ESTABLISHED 1055/glusterfs
tcp 0 0 10.8.0.6:1019 10.8.0.1:49152 ESTABLISHED 18625/glusterfs
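As a cross-check, 'gluster volume status' lists the brick port and PID in one place, which avoids the netstat/ps round trip (output omitted here):

# gluster volume status gv0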
[guru@srv2 ~]$ ps -v 18611
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.8/FAQ
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
18611 ? Ssl 14:12 0 0 650068 20404 2.0 /usr/sbin/glusterfsd -s srv2 --volfile-id gv0.srv2.root-gluster-vol0 -p /var/lib/glusterd/vols/gv0
[guru@srv2 ~]$ ps -v 18629
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.8/FAQ
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
18629 ? Ssl 0:04 0 0 333296 17380 1.7 /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/glusterd/glustershd/run/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/823fa3197e2d1841be8881500723b063.socket --xlator-option *replicate*.node-uuid=84af83c9-0a29-
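Since PID 18629 is the self-heal daemon, I also want to check whether it is actually crawling anything. If the following keeps listing entries, the shd alone could explain a steady stream of LOOKUPs:

# gluster volume heal gv0 info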
[guru@srv2 ~]$ ps -v 18625
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.8/FAQ
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
18625 ? Ssl 0:03 0 0 239528 41040 4.0 /usr/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/5ad5b036fd636cc5dddffa73593e4089.socket
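PID 18625 is the Gluster NFS server. My understanding is that if nothing mounts the volume over NFS, this daemon can be switched off per volume, which would rule it out as a traffic source:

# gluster volume set gv0 nfs.disable on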
[guru@srv2 ~]$ sudo nethogs tun0
Waiting for first packet to arrive (see sourceforge.net bug 1019381)
[guru@srv2 ~]$ rpm -qa | grep gluster
glusterfs-3.5.1-1.el6.x86_64
glusterfs-cli-3.5.1-1.el6.x86_64
glusterfs-libs-3.5.1-1.el6.x86_64
glusterfs-fuse-3.5.1-1.el6.x86_64
glusterfs-server-3.5.1-1.el6.x86_64
glusterfs-api-3.5.1-1.el6.x86_64
[guru@srv2 ~]$
I don’t see anything odd here. Please suggest.
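In the meantime I can enable volume profiling, which should break down the file operations (LOOKUP included) per brick and help narrow down the source:

# gluster volume profile gv0 start
# gluster volume profile gv0 info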
Thanks,
Gurdeep.
On 7 Jul 2014, at 9:06 pm, Niels de Vos <ndevos at redhat.com> wrote:
> On Sun, Jul 06, 2014 at 11:28:51PM +1000, Gurdeep Singh (Guru) wrote:
>> Hello,
>>
>> I have set up Gluster in replicate mode and it's working fine.
>>
>> I am seeing constant chatter between the hosts of LOOKUP calls and
>> LOOKUP replies, and I am trying to understand why this traffic is
>> being generated constantly. Please look at the attached image. This
>> traffic uses around 200 KB/s of bandwidth and is exhausting the
>> allocated monthly bandwidth on our two VPSes.
>
> You can use Wireshark to identify which process does the LOOKUP calls.
> For this, do the following:
>
> 1. select a LOOKUP Call
> 2. enable the 'packet details' pane (found in the main menu, 'view')
> 3. expand the 'Transmission Control Protocol' tree
> 4. check the 'Source port' of the LOOKUP Call
>
> Together with the 'Source' and the 'Source port' you can go to the
> server that matches the 'Source' address. A command like this would give
> you the PID of the process in the right column:
>
> # netstat -tpn | grep $SOURCE_PORT
>
> And with 'ps -v $PID' you can check which process is responsible for the
> LOOKUP. This process can be a fuse-mount, self-heal-daemon or any other
> glusterfs-client. Depending on the type of client, you may be able to tune
> the workload or other options a little.
>
> In Wireshark you can also check which filename is LOOKUP'd: just expand
> the 'GlusterFS' part in the 'packet details' and check the 'Basename'.
> Maybe this filename (without the directory structure) gives you an
> idea of which activity is causing the LOOKUPs.
>
> HTH,
> Niels
>
>>
>> The configuration I have for Gluster is:
>>
>> [guru@srv1 ~]$ sudo gluster volume info
>> [sudo] password for guru:
>>
>> Volume Name: gv0
>> Type: Replicate
>> Volume ID: dc8dc3f2-f5bd-4047-9101-acad04695442
>> Status: Started
>> Number of Bricks: 1 x 2 = 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: srv1:/root/gluster-vol0
>> Brick2: srv2:/root/gluster-vol0
>> Options Reconfigured:
>> cluster.lookup-unhashed: on
>> performance.cache-refresh-timeout: 60
>> performance.cache-size: 1GB
>> storage.health-check-interval: 30
>>
>>
>>
>> Please suggest how to fine tune the RPC calls/reply.
>
>