[Bugs] [Bug 1807785] New: seeing error message in glustershd.log on volume start(or may be as part of shd graph regeneration) inet_pton failed with return code 0 [Invalid argument]

bugzilla at redhat.com bugzilla at redhat.com
Thu Feb 27 09:19:04 UTC 2020


https://bugzilla.redhat.com/show_bug.cgi?id=1807785

            Bug ID: 1807785
           Summary: seeing error message in glustershd.log on volume
                    start(or may be as part of shd graph regeneration)
                    inet_pton failed with return code 0 [Invalid argument]
           Product: GlusterFS
           Version: 7
            Status: NEW
         Component: eventsapi
          Severity: low
          Assignee: bugs at gluster.org
          Reporter: jahernan at redhat.com
                CC: amukherj at redhat.com, avishwan at redhat.com,
                    nchilaka at redhat.com, rhs-bugs at redhat.com,
                    sanandpa at redhat.com, sheggodu at redhat.com,
                    storage-qa-internal at redhat.com, vdas at redhat.com
        Depends On: 1732443, 1752330
  Target Milestone: ---
    Classification: Community



+++ This bug was initially created as a clone of Bug #1752330 +++

+++ This bug was initially created as a clone of Bug #1732443 +++

Description of problem:
=======================
I am seeing below error message in shd.log when I start a new volume(or may be
as part of shd graph regeneration event).
The message is not so helpful for an admin to understand what it means and
hence can be a bit annoying



[2019-07-23 12:09:53.168761] E [MSGID: 101172] [events.c:89:_gf_event]
0-distrep-replicate-2: inet_pton failed with return code 0 [Invalid argument]

###########

[2019-07-23 12:09:53.154339] I [rpc-clnt.c:2035:rpc_clnt_reconfig]
0-distrep-client-7: changing port to 49153 (from 0)
[2019-07-23 12:09:53.154388] I [socket.c:871:__socket_shutdown]
0-distrep-client-7: intentional socket shutdown(14)
[2019-07-23 12:09:53.154870] I [rpc-clnt.c:2035:rpc_clnt_reconfig]
0-shd1-client-2: changing port to 49154 (from 0)
[2019-07-23 12:09:53.155279] I [socket.c:871:__socket_shutdown]
0-shd1-client-2: intentional socket shutdown(24)
[2019-07-23 12:09:53.160810] I [rpc-clnt.c:2035:rpc_clnt_reconfig]
0-shd2-client-0: changing port to 49155 (from 0)
[2019-07-23 12:09:53.160885] I [socket.c:871:__socket_shutdown]
0-shd2-client-0: intentional socket shutdown(16)
[2019-07-23 12:09:53.161088] I [rpc-clnt.c:2035:rpc_clnt_reconfig]
0-shd2-client-1: changing port to 49155 (from 0)
[2019-07-23 12:09:53.161208] I [socket.c:871:__socket_shutdown]
0-shd2-client-1: intentional socket shutdown(26)
[2019-07-23 12:09:53.165383] I [rpc-clnt.c:2035:rpc_clnt_reconfig]
0-shd2-client-2: changing port to 49155 (from 0)
[2019-07-23 12:09:53.165715] I [socket.c:871:__socket_shutdown]
0-shd2-client-2: intentional socket shutdown(22)
[2019-07-23 12:09:53.168402] I [MSGID: 114046]
[client-handshake.c:1106:client_setvolume_cbk] 0-distrep-client-6: Connected to
distrep-client-6, attached to remote volume '/gluster/brick2/distrep-sv3'.
[2019-07-23 12:09:53.168455] I [MSGID: 108002] [afr-common.c:5589:afr_notify]
0-distrep-replicate-2: Client-quorum is met
[2019-07-23 12:09:53.168761] E [MSGID: 101172] [events.c:89:_gf_event]
0-distrep-replicate-2: inet_pton failed with return code 0 [Invalid argument]
[2019-07-23 12:09:53.172515] I [MSGID: 114046]
[client-handshake.c:1106:client_setvolume_cbk] 0-shd1-client-1: Connected to
shd1-client-1, attached to remote volume '/gluster/brick1/shd1'.
[2019-07-23 12:09:53.172559] I [MSGID: 108002] [afr-common.c:5589:afr_notify]
0-shd1-replicate-0: Client-quorum is met
[2019-07-23 12:09:53.173251] E [MSGID: 101172] [events.c:89:_gf_event]
0-shd1-replicate-0: inet_pton failed with return code 0 [Invalid argument]
[2019-07-23 12:09:53.173591] I [MSGID: 114046]
[client-handshake.c:1106:client_setvolume_cbk] 0-distrep-client-7: Connected to
distrep-client-7, attached to remote volume '/gluster/brick2/distrep-sv3'.
[2019-07-23 12:09:53.175599] I [MSGID: 114046]
[client-handshake.c:1106:client_setvolume_cbk] 0-shd1-client-2: Connected to
shd1-client-2, attached to remote volume '/gluster/brick1/shd1'.
[2019-07-23 12:09:53.179349] I [MSGID: 114046]
[client-handshake.c:1106:client_setvolume_cbk] 0-shd2-client-1: Connected to
shd2-client-1, attached to remote volume '/gluster/brick1/shd2'.
[2019-07-23 12:09:53.179391] I [MSGID: 108005]
[afr-common.c:5235:__afr_handle_child_up_event] 0-shd2-replicate-0: Subvolume
'shd2-client-1' came back up; going online.
[2019-07-23 12:09:53.179656] E [MSGID: 101172] [events.c:89:_gf_event]
0-shd2-replicate-0: inet_pton failed with return code 0 [Invalid argument]
[2019-07-23 12:09:53.180373] I [MSGID: 114046]
[client-handshake.c:1106:client_setvolume_cbk] 0-shd2-client-0: Connected to
shd2-client-0, attached to remote volume '/gluster/brick1/shd2'.
[2019-07-23 12:09:53.180422] I [MSGID: 108002] [afr-common.c:5589:afr_notify]
0-shd2-replicate-0: Client-quorum is met
[2019-07-23 12:09:53.180698] E [MSGID: 101172] [events.c:89:_gf_event]
0-shd2-replicate-0: inet_pton failed with return code 0 [Invalid argument]
[2019-07-23 12:09:53.181257] I [MSGID: 114046]
[client-handshake.c:1106:client_setvolume_cbk] 0-shd2-client-2: Connected to
shd2-client-2, attached to remote volume '/gluster/brick1/shd2'.

Version-Release number of selected component (if applicable):
==============
6.0.8 (shd-mux is reverted)

How reproducible:
================
always

Steps to Reproduce:
1.create a 2x3 volume and start it
2. create a new 1x3 volume and start it 


Actual results:
=============
glustershd.log has above error message logged when we start a new volume

Expected results:
===================
either the message can be removed if it is not of much value, or else it can be
rephrased to something more meaningful


Additional info:
==============
not attaching logs as it is straightforward to reproduce

--- Additional comment from Vivek Das on 2019-09-10 09:47:33 UTC ---

This is marked blocker based on 'Regression' keyword.

I am seeing the error messages in samba client logs as well.
We have not seen functional impact as of now. But would like to know what
introduced this in rhgs-3.5.0. An error message is what we need to avoid to be
seen by customers.

Sunil your take on this please.

--- Additional comment from Atin Mukherjee on 2019-09-12 02:33:11 UTC ---

>From _gf_event () :

   /* Socket Configurations */                                                 
    server.sin_family = AF_INET;                                                
    server.sin_port = htons(EVENT_PORT);                                        
    ret = inet_pton(server.sin_family, host, &server.sin_addr);                 
    if (ret <= 0) {                                                             
        gf_msg(this->name, GF_LOG_ERROR, EINVAL, LG_MSG_INVALID_ARG,            
               "inet_pton failed with return code %d", ret);                    
        goto out;                                                               
    }             


RETURN VALUE 
       inet_pton() returns 1 on success (network address was successfully
       converted).  0 is returned if src does not contain a character string
       representing a valid network address in the specified address family.
       If af does not contain a valid address family, -1 is returned and
       errno is set to EAFNOSUPPORT.

[2019-07-23 12:09:53.168761] E [MSGID: 101172] [events.c:89:_gf_event]
0-distrep-replicate-2: inet_pton failed with return code 0 [Invalid argument]

So based on the above log it looks like in certain cases host is NULL and
that's why we're bailing out. This seems like a wrong functionality?

Aravinda - could you take a look at this?

--- Additional comment from Aravinda VK on 2019-09-12 05:47:06 UTC ---

(In reply to Atin Mukherjee from comment #5)
> From _gf_event () :
> 
>    /* Socket Configurations */                                              
> 
>     server.sin_family = AF_INET;                                            
> 
>     server.sin_port = htons(EVENT_PORT);                                    
> 
>     ret = inet_pton(server.sin_family, host, &server.sin_addr);             
> 
>     if (ret <= 0) {                                                         
> 
>         gf_msg(this->name, GF_LOG_ERROR, EINVAL, LG_MSG_INVALID_ARG,        
> 
>                "inet_pton failed with return code %d", ret);                
> 
>         goto out;                                                           
> 
>     }             
> 
> 
> RETURN VALUE 
>        inet_pton() returns 1 on success (network address was successfully
>        converted).  0 is returned if src does not contain a character string
>        representing a valid network address in the specified address family.
>        If af does not contain a valid address family, -1 is returned and
>        errno is set to EAFNOSUPPORT.
> 
> [2019-07-23 12:09:53.168761] E [MSGID: 101172] [events.c:89:_gf_event]
> 0-distrep-replicate-2: inet_pton failed with return code 0 [Invalid argument]
> 
> So based on the above log it looks like in certain cases host is NULL and
> that's why we're bailing out. This seems like a wrong functionality?
> 
> Aravinda - could you take a look at this?

I will look into this issue today.

--- Additional comment from Aravinda VK on 2019-09-12 07:26:20 UTC ---

`-s localhost` passed in the argument is converted to `::1`. But
`server.sin_family = AF_INET;` is used instead of `AF_INET6`. Due to this
`inet_pton` is failing.

Example `ps ax | grep shd`

13449 ?        Ssl    0:00 /usr/sbin/glusterfs -s localhost --volfile-id
shd/gv1 -p /var/run/gluster/shd/gv1/gv1-shd.pid -l
/var/log/glusterfs/glustershd.log -S /var/run/gluster/5d6ae69815eaaafd.socket
--xlator-option *replicate*.node-uuid=855ed18b-5c6f-4889-b523-5b7864d2a842
--process-name glustershd --client-pid=-6

I will explore and update the proper fix to handle this issue.

--- Additional comment from Atin Mukherjee on 2019-09-16 02:29:07 UTC ---

(In reply to Aravinda VK from comment #7)
> `-s localhost` passed in the argument is converted to `::1`. But
> `server.sin_family = AF_INET;` is used instead of `AF_INET6`. Due to this
> `inet_pton` is failing.
> 
> Example `ps ax | grep shd`
> 
> 13449 ?        Ssl    0:00 /usr/sbin/glusterfs -s localhost --volfile-id
> shd/gv1 -p /var/run/gluster/shd/gv1/gv1-shd.pid -l
> /var/log/glusterfs/glustershd.log -S
> /var/run/gluster/5d6ae69815eaaafd.socket --xlator-option
> *replicate*.node-uuid=855ed18b-5c6f-4889-b523-5b7864d2a842 --process-name
> glustershd --client-pid=-6
> 
> I will explore and update the proper fix to handle this issue.

Aravinda - IIUC, the impact would be that we'd miss to generate event in such
cases given gf_event API is bailing out, correct? And if so then this should be
a blocker?


Referenced Bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1732443
[Bug 1732443] seeing error message in glustershd.log on volume start(or may be
as part of shd graph regeneration) inet_pton failed with return code 0 [Invalid
argument]
https://bugzilla.redhat.com/show_bug.cgi?id=1752330
[Bug 1752330] seeing error message in glustershd.log on volume start(or may be
as part of shd graph regeneration) inet_pton failed with return code 0 [Invalid
argument]
-- 
You are receiving this mail because:
You are the assignee for the bug.


More information about the Bugs mailing list