[Gluster-users] rpc_client_ping_timer_expired logic

Pranith Kumar Karampuri pkarampu at redhat.com
Fri Feb 5 05:26:29 UTC 2016



On 02/04/2016 08:26 PM, Khoi Mai wrote:
> Hi Gluster community,
>
> Can someone with insight into how rpc_client_ping_timer_expired
> operates explain it? I would love to learn more about it. The reason
> is that last week 2 fuse clients produced the same disconnect
> message but reconnected immediately afterwards. What I'd like to
> know is what may have caused this behavior and where else I can look
> to build an understanding of the root cause. The gluster node does
> show the same disconnect/reconnect.

The way it works: when the first message is sent to the server, a ping
RPC is also sent to the server and a timer is started. The timer is 42
seconds by default (it can be changed with network.ping-timeout).

If the ping response arrives, the earlier timer is stopped and a new
42-second timer is started for the next ping message.

If the ping response does not arrive within 42 seconds, the timer
expires. At that point, if there was no other transport activity (no
other messages sent or received), the transport is disconnected and a
reconnect is attempted. Otherwise it assumes the ping response may
still arrive after some more time, so it extends the timer by another
42 seconds to see whether the response comes.
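
The logic above can be modeled roughly as follows. This is only a toy
sketch in Python, not GlusterFS code: the class and method names
(PingTimer, arm, on_activity, on_ping_reply, on_expiry) are hypothetical,
and time is passed in by the caller instead of using real timers.

```python
class PingTimer:
    """Toy model of the client-side ping timer (hypothetical names)."""

    def __init__(self, timeout=42.0):
        self.timeout = timeout      # corresponds to network.ping-timeout
        self.deadline = None        # when the current timer would expire
        self.saw_activity = False   # other messages seen since the timer was armed
        self.connected = True

    def arm(self, now):
        # Called when a ping is sent: start a fresh timer.
        self.deadline = now + self.timeout
        self.saw_activity = False

    def on_activity(self):
        # Any non-ping message sent or received on the transport.
        self.saw_activity = True

    def on_ping_reply(self, now):
        # Reply arrived: stop the old timer and arm one for the next ping.
        self.arm(now)

    def on_expiry(self, now):
        # Timer fired without a ping reply.
        if self.saw_activity:
            # The transport looks alive, so wait one more timeout period.
            self.arm(now)
            return "extended"
        # No reply and no other traffic: disconnect (a reconnect would follow).
        self.connected = False
        self.deadline = None
        return "disconnected"


if __name__ == "__main__":
    timer = PingTimer(timeout=10)   # as with network.ping-timeout: 10
    timer.arm(now=0)
    timer.on_activity()
    print(timer.on_expiry(now=10))  # other traffic was seen
    print(timer.on_expiry(now=20))  # nothing at all was seen
```

With network.ping-timeout set to 10, as in the volume options quoted
below, each window is 10 seconds, which matches the "has not responded
in the last 10 seconds" text in the log messages.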

Pranith
>
> Jan 28 14:25:27 omhq1cab GlusterFS[1640]: [2016-01-28 20:25:27.685703] 
> C [client-handshake.c:127:rpc_client_ping_timer_expired] 
> 0-prodstatic-client-3: server 72.36.4.204:49155 has not responded in 
> the last 10 seconds, disconnecting.
>
>
> Jan 28 14:24:52 omhq1ca9 GlusterFS[1612]: [2016-01-28 20:24:52.589450] 
> C [client-handshake.c:127:rpc_client_ping_timer_expired] 
> 0-prodstatic-client-3: server 72.36.4.204:49155 has not responded in 
> the last 10 seconds, disconnecting.
>
> My setup for the volume is as follows. Brick4 was the one that
> appeared unresponsive to the clients. I have an environment where
> multiple clients (30+) mount this volume, and none of the others
> logged any issues with Brick4.
>
> Volume Name: prodstatic
> Type: Distributed-Replicate
> Volume ID: 187c241d-0eeb-4405-80f2-c704ea44bc36
> Status: Started
> Number of Bricks: 2 x 4 = 8
> Transport-type: tcp
> Bricks:
> Brick1: server1140:/export/content/static
> Brick2: server1c5d:/export/content/static
> Brick3: server11ad:/export/content/static
> *Brick4: server1781:/export/content/static*
> Brick5: server1c56:/export/content/static
> Brick6: server1c58:/export/content/static
> Brick7: server1c57:/export/content/static
> Brick8: server1c59:/export/content/static
> Options Reconfigured:
> network.ping-timeout: 10
> server.allow-insecure: on
> features.quota: on
>
> Thanks
> Khoi
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
