[Gluster-devel] [erik.jacobson at hpe.com: [Gluster-users] gluster forcing IPV6 on our IPV4 servers, glusterd fails (was gluster update question regarding new DNS resolution requirement)]

Yaniv Kaul ykaul at redhat.com
Tue Sep 21 15:42:56 UTC 2021


Perhaps part of the problem is that if we DISABLE the IPv6 support
at ./configure time, I don't expect to see lines such as
(from the af_inet_client_get_remote_sockaddr() function):
    if (inet_pton(AF_INET6, remote_host, &serveraddr)) {
        sockaddr->sa_family = AF_INET6;
    }

    /* TODO: gf_resolve is a blocking call. kick in some
       non blocking dns techniques */
    ret = gf_resolve_ip6(remote_host, remote_port, sockaddr->sa_family,
                         &this->dnscache, &addr_info);
    if (ret == -1) {
        gf_log(this->name, GF_LOG_ERROR, "DNS resolution failed on host %s",
               remote_host);
        goto err;
    }

still enabled in the code.
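
For reference, a minimal sketch of what a compile-time guard around that
check could look like. The macro SKETCH_ENABLE_IPV6 and the helper
sketch_detect_family() are hypothetical names for illustration, not GlusterFS
ones; only inet_pton() is a real call. Note that inet_pton(AF_INET6, ...)
returns 1 only for a valid IPv6 literal, so for a dotted quad such as
172.23.0.16 a check like the one quoted above leaves the family at whatever
value it already had.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical build switch; a real build would derive it from ./configure. */
#ifndef SKETCH_ENABLE_IPV6
#define SKETCH_ENABLE_IPV6 1
#endif

/* Hypothetical helper: return the family to use for remote_host, starting
 * from the family the caller already holds in "current". */
sa_family_t
sketch_detect_family(const char *remote_host, sa_family_t current)
{
#if SKETCH_ENABLE_IPV6
    struct in6_addr v6;

    /* inet_pton() returns 1 only when remote_host is a valid IPv6 literal. */
    if (inet_pton(AF_INET6, remote_host, &v6) == 1)
        return AF_INET6;
#else
    (void)remote_host; /* IPv6 detection compiled out */
#endif
    /* Anything else (IPv4 literal, host name) keeps the incoming family. */
    return current;
}
```

With the switch on, an IPv4 literal that arrives with AF_INET6 already set
still leaves this helper as AF_INET6, which would be consistent with the
family=10 in Erik's log.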

However, I do feel 'transport.address-family' should have been set to IPv4
to force IPv4 regardless.
Then the question is why socket_client_get_remote_sockaddr() calls
client_fill_address_family() to get the address family, but the next step
in the flow, the call to af_inet_client_get_remote_sockaddr(), has this
information and ignores it (as we see above).
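
To make the question concrete, here is a rough sketch of a lookup that honors
an already-chosen family and only guesses when none was configured.
sketch_pick_family() and sketch_numeric_lookup() are made-up names for
illustration, not the GlusterFS functions; the calls used (inet_pton,
getaddrinfo, freeaddrinfo) are standard POSIX. AI_NUMERICHOST is set so the
sketch never touches DNS.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical: keep an explicitly configured family; otherwise infer it
 * from the address literal, or return AF_UNSPEC for a host name. */
int
sketch_pick_family(int configured, const char *host)
{
    struct in_addr v4;
    struct in6_addr v6;

    if (configured != AF_UNSPEC)
        return configured; /* the configured family always wins */
    if (inet_pton(AF_INET, host, &v4) == 1)
        return AF_INET;
    if (inet_pton(AF_INET6, host, &v6) == 1)
        return AF_INET6;
    return AF_UNSPEC; /* not a literal; let the resolver decide */
}

/* Hypothetical: numeric-only lookup of host in the given family.
 * Returns the getaddrinfo() result code (0 on success). */
int
sketch_numeric_lookup(const char *host, int family)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int rc;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = family;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_NUMERICHOST; /* no DNS queries */

    rc = getaddrinfo(host, NULL, &hints, &res);
    if (rc == 0)
        freeaddrinfo(res);
    return rc;
}
```

Calling sketch_numeric_lookup("172.23.0.16", AF_INET6) fails, and on glibc I
would expect the error to be EAI_ADDRFAMILY, the same "Address family for
hostname not supported" text as in the log, while the same call with AF_INET
succeeds.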
Y.


On Tue, Sep 21, 2021 at 12:12 PM Erik Jacobson <erik.jacobson at hpe.com>
wrote:

> Dear devel team -
>
> I botched the email address here. I type "hpcm-devel" like 30 times a
> day so I mistyped that. Sorry about that.
>
> Any advice appreciated and see attached patch that "gets it going for
> us" but obviously not something you could release.
>
> Erik
>
>
>
> ---------- Forwarded message ----------
> From: Erik Jacobson <erik.jacobson at hpe.com>
> To: gluster-users at gluster.org, hpcm-devel at gluster.org
> Cc:
> Bcc:
> Date: Mon, 20 Sep 2021 16:46:12 -0500
> Subject: [Gluster-users] gluster forcing IPV6 on our IPV4 servers,
> glusterd fails (was gluster update question regarding new DNS resolution
> requirement)
> I pretended I'm a low-level C programmer with network and filesystem
> experience for a few hours.
>
> I'm not sure what the right solution is but what was happening was the
> code was trying to treat our IPV4 hosts as AF_INET6 and the family was
> incompatible with our IPV4 IP addresses. Yes, we need to move to IPV6
> but we're hoping to do that on our own time (~50 years like everybody
> else :)
>
> I found a chunk of the code that seemed to be force-setting us to
> AF_INET6.
>
> While I'm sure it is not 100% the correct patch, the patch attached and
> pasted below is working for me so I'll integrate it with our internal
> build to continue testing.
>
> Please let me know if there is a configuration item I missed or a
> different way to do this. I added -devel to this email.
>
> In the previous thread, you would have seen that we're testing a
> hopeful change that will upgrade our deployed customers from gluster
> 7.9 to gluster 9.3.
>
> Thank you!! Advice on next steps would be appreciated !!
>
>
> diff -Narup glusterfs-9.3-ORIG/rpc/rpc-transport/socket/src/name.c glusterfs-9.3-NEW/rpc/rpc-transport/socket/src/name.c
> --- glusterfs-9.3-ORIG/rpc/rpc-transport/socket/src/name.c      2021-06-29 00:27:44.381408294 -0500
> +++ glusterfs-9.3-NEW/rpc/rpc-transport/socket/src/name.c       2021-09-20 16:34:28.969425361 -0500
> @@ -252,9 +252,16 @@ af_inet_client_get_remote_sockaddr(rpc_t
>      /* Need to update transport-address family if address-family is not provided
>         to command-line arguments
>      */
> +    /* HPE This is forcing our IPV4 servers in to an IPV6 address
> +     * family that is not compatible with IPV4. For now we will just set it
> +     * to AF_INET.
> +     */
> +    /*
>      if (inet_pton(AF_INET6, remote_host, &serveraddr)) {
>          sockaddr->sa_family = AF_INET6;
>      }
> +    */
> +    sockaddr->sa_family = AF_INET;
>
>      /* TODO: gf_resolve is a blocking call. kick in some
>         non blocking dns techniques */
>
>
> On Mon, Sep 20, 2021 at 11:35:35AM -0500, Erik Jacobson wrote:
> > I missed the other important log snip:
> >
> > The message "E [MSGID: 101075] [common-utils.c:520:gf_resolve_ip6] 0-resolver: error in getaddrinfo [{family=10}, {ret=Address family for hostname not supported}]" repeated 620 times between [2021-09-20 15:49:23.720633 +0000] and [2021-09-20 15:50:41.731542 +0000]
> >
> > So I will dig in to the code some here.
> >
> >
> > On Mon, Sep 20, 2021 at 10:59:30AM -0500, Erik Jacobson wrote:
> > > Hello all! I hope you are well.
> > >
> > > We are starting a new software release cycle and I am trying to find a
> > > way to upgrade customers from our build of gluster 7.9 to our build of
> > > gluster 9.3
> > >
> > > When we deploy gluster, we forcibly remove all references to any host
> > > names and use only IP addresses. This is because, if for any reason a
> > > DNS server is unreachable, even if the peer files have IPs and DNS, it
> > > causes glusterd to be unable to reach peers properly. We can't really
> > > rely on /etc/hosts either because customers take artistic license with
> > > their /etc/hosts files and don't realize the problems that can cause.
> > >
> > > So our deployed peer files look something like this:
> > >
> > > uuid=46a4b506-029d-4750-acfb-894501a88977
> > > state=3
> > > hostname1=172.23.0.16
> > >
> > > That is, with full intention, we avoid host names.
> > >
> > > When we upgrade to gluster 9.3, we fall over with these errors and
> > > gluster is now partitioned and the updated gluster servers can't reach
> > > anybody:
> > >
> > > [2021-09-20 15:50:41.731543 +0000] E [name.c:265:af_inet_client_get_remote_sockaddr] 0-management: DNS resolution failed on host 172.23.0.16
> > >
> > >
> > > As you can see, we have defined on purpose everything using IPs but in
> > > 9.3 it appears this method fails. Are there any suggestions short of
> > > putting real host names in peer files?
> > >
> > >
> > >
> > > FYI
> > >
> > > This supercomputer will be using gluster for part of its system
> > > management. It is how we deploy the Image Objects (squashfs images)
> > > hosted on NFS today and served by gluster leader nodes and also store
> > > system logs, console logs, and other data.
> > >
> > > https://www.olcf.ornl.gov/frontier/
> > >
> > >
> > > Erik
> > > ________
> > >
> > >
> > >
> > > Community Meeting Calendar:
> > >
> > > Schedule -
> > > Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
> > > Bridge: https://meet.google.com/cpu-eiue-hvk
> > > Gluster-users mailing list
> > > Gluster-users at gluster.org
> > > https://lists.gluster.org/mailman/listinfo/gluster-users
> -------
>
> Community Meeting Calendar:
> Schedule -
> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
> Bridge: https://meet.google.com/cpu-eiue-hvk
>
> Gluster-devel mailing list
> Gluster-devel at gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
>
>