[Gluster-devel] [erik.jacobson at hpe.com: [Gluster-users] gluster forcing IPV6 on our IPV4 servers, glusterd fails (was gluster update question regarding new DNS resolution requirement)]

Erik Jacobson erik.jacobson at hpe.com
Tue Sep 21 20:33:21 UTC 2021


> However, I do feel 'transport.address-family' should have been set to IPv4 to
> force IPv4 regardless.

If I missed a setting and this is all my fault, that would be welcome
news (although I'm not sure what I'll do for our deployed gluster 7.3
and 7.9 systems that update to 9.3 with a cluster manager update).

But I was looking and I didn't immediately see in the CLI where to set
this.

It's not in 'gluster volume set' help that I can see.

I'm happy to be educated! Thank you.

> Then the question is why socket_client_get_remote_sockaddr() calls
> client_fill_address_family() to get the address family, while the next step
> in the flow, af_inet_client_get_remote_sockaddr(), has this information but
> ignores it (as we see above).
> Y.
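
To make that expectation concrete, here is a minimal sketch (not glusterfs
code) of the precedence I would have guessed at. choose_family() is a
hypothetical helper I made up for illustration: an explicitly configured
family wins, and only when none is configured (AF_UNSPEC) is the family
inferred from the address literal with inet_pton().

```c
#define _POSIX_C_SOURCE 200112L
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper, not a glusterfs function.
 * configured_family: AF_INET, AF_INET6, or AF_UNSPEC when nothing was set.
 * Returns the family to use, or AF_UNSPEC when remote_host is not a
 * numeric address and would need a real name lookup. */
int
choose_family(int configured_family, const char *remote_host)
{
    unsigned char buf[sizeof(struct in6_addr)];

    /* An explicit setting is never overridden. */
    if (configured_family != AF_UNSPEC)
        return configured_family;

    /* inet_pton() returns 1 only when the string parses in that family. */
    if (inet_pton(AF_INET, remote_host, buf) == 1)
        return AF_INET;
    if (inet_pton(AF_INET6, remote_host, buf) == 1)
        return AF_INET6;

    return AF_UNSPEC;
}
```

With this, choose_family(AF_UNSPEC, "172.23.0.16") gives AF_INET, and
choose_family(AF_INET, ...) stays AF_INET no matter what the host looks like.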
> 
> 
> On Tue, Sep 21, 2021 at 12:12 PM Erik Jacobson <erik.jacobson at hpe.com> wrote:
> 
>     Dear devel team -
> 
>     I botched the email address here. I type "hpcm-devel" like 30 times a
>     day so I mistyped that. Sorry about that.
> 
>     Any advice appreciated and see attached patch that "gets it going for
>     us" but obviously not something you could release.
> 
>     Erik
> 
> 
> 
>     ---------- Forwarded message ----------
>     From: Erik Jacobson <erik.jacobson at hpe.com>
>     To: gluster-users at gluster.org, hpcm-devel at gluster.org
>     Date: Mon, 20 Sep 2021 16:46:12 -0500
>     Subject: [Gluster-users] gluster forcing IPV6 on our IPV4 servers, glusterd
>     fails (was gluster update question regarding new DNS resolution
>     requirement)
>     I pretended I'm a low-level C programmer with network and filesystem
>     experience for a few hours.
> 
>     I'm not sure what the right solution is but what was happening was the
>     code was trying to treat our IPV4 hosts as AF_INET6 and the family was
>     incompatible with our IPV4 IP addresses. Yes, we need to move to IPV6
>     but we're hoping to do that on our own time (~50 years like everybody
>     else :)
> 
>     I found a chunk of the code that seemed to be force-setting us to
>     AF_INET6.
> 
>     While I'm sure it is not 100% the correct patch, the patch attached and
>     pasted below is working for me so I'll integrate it with our internal
>     build to continue testing.
> 
>     Please let me know if there is a configuration item I missed or a
>     different way to do this. I added -devel to this email.
> 
>     In the previous thread, you would have seen that we're testing a
>     hopeful change that will upgrade our deployed customers from gluster
>     7.9 to gluster 9.3.
> 
>     Thank you!! Advice on next steps would be appreciated !!
> 
> 
>     diff -Narup glusterfs-9.3-ORIG/rpc/rpc-transport/socket/src/name.c glusterfs-9.3-NEW/rpc/rpc-transport/socket/src/name.c
>     --- glusterfs-9.3-ORIG/rpc/rpc-transport/socket/src/name.c      2021-06-29 00:27:44.381408294 -0500
>     +++ glusterfs-9.3-NEW/rpc/rpc-transport/socket/src/name.c       2021-09-20 16:34:28.969425361 -0500
>     @@ -252,9 +252,16 @@ af_inet_client_get_remote_sockaddr(rpc_t
>          /* Need to update transport-address family if address-family is not provided
>             to command-line arguments
>          */
>     +    /* HPE This is forcing our IPV4 servers into an IPV6 address
>     +     * family that is not compatible with IPV4. For now we will just set it
>     +     * to AF_INET.
>     +     */
>     +    /*
>          if (inet_pton(AF_INET6, remote_host, &serveraddr)) {
>              sockaddr->sa_family = AF_INET6;
>          }
>     +    */
>     +    sockaddr->sa_family = AF_INET;
> 
>          /* TODO: gf_resolve is a blocking call. kick in some
>             non blocking dns techniques */
> 
> 
>     On Mon, Sep 20, 2021 at 11:35:35AM -0500, Erik Jacobson wrote:
>     > I missed the other important log snip:
>     >
>     > The message "E [MSGID: 101075] [common-utils.c:520:gf_resolve_ip6] 0-resolver: error in getaddrinfo [{family=10}, {ret=Address family for hostname not supported}]" repeated 620 times between [2021-09-20 15:49:23.720633 +0000] and [2021-09-20 15:50:41.731542 +0000]
>     >
>     > So I will dig in to the code some here.
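
For anyone who wants to see the same kind of failure outside of gluster, here
is a small sketch that asks getaddrinfo() to resolve a numeric host while
restricted to one address family. resolve_with_family() is a hypothetical
helper of mine, not a glusterfs function, and AI_NUMERICHOST is my own
addition so the test never touches DNS; I have not checked which flags
gf_resolve_ip6 actually passes.

```c
#define _POSIX_C_SOURCE 200112L
#include <netdb.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper, not glusterfs code.
 * Returns 0 when host resolves within the requested family, otherwise
 * the non-zero EAI_* code reported by getaddrinfo(). */
int
resolve_with_family(const char *host, int family)
{
    struct addrinfo hints;
    struct addrinfo *result = NULL;
    int rc;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = family;        /* AF_INET or AF_INET6 */
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_NUMERICHOST; /* assumption: numeric literals only */

    rc = getaddrinfo(host, NULL, &hints, &result);
    if (rc == 0)
        freeaddrinfo(result);
    return rc;
}
```

I'd expect resolve_with_family("172.23.0.16", AF_INET) to return 0 and the
same host with AF_INET6 to return a non-zero error, which gai_strerror() can
turn into text. The family=10 in the log above is AF_INET6 on Linux.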
>     >
>     >
>     > On Mon, Sep 20, 2021 at 10:59:30AM -0500, Erik Jacobson wrote:
>     > > Hello all! I hope you are well.
>     > >
>     > > We are starting a new software release cycle and I am trying to find a
>     > > way to upgrade customers from our build of gluster 7.9 to our build of
>     > > gluster 9.3
>     > >
>     > > When we deploy gluster, we forcibly remove all references to any host
>     > > names and use only IP addresses. This is because, if for any reason a
>     > > DNS server is unreachable, even if the peer files have IPs and DNS, it
>     > > causes glusterd to be unable to reach peers properly. We can't really
>     > > rely on /etc/hosts either because customers take artistic license with
>     > > their /etc/hosts files and don't realize the problems that can cause.
>     > >
>     > > So our deployed peer files look something like this:
>     > >
>     > > uuid=46a4b506-029d-4750-acfb-894501a88977
>     > > state=3
>     > > hostname1=172.23.0.16
>     > >
>     > > That is, with full intention, we avoid host names.
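
As an illustration of that rule, here is a rough sketch of how one could check
that a peer file line carries an IP literal instead of a host name. Both
functions are hypothetical helpers written for this mail; they only assume the
key=value layout shown above and that the keys of interest start with
"hostname".

```c
#define _POSIX_C_SOURCE 200112L
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Returns 1 if s is a numeric IPv4 or IPv6 literal, 0 otherwise. */
int
is_ip_literal(const char *s)
{
    unsigned char buf[sizeof(struct in6_addr)];

    return inet_pton(AF_INET, s, buf) == 1 ||
           inet_pton(AF_INET6, s, buf) == 1;
}

/* Check one "key=value" line from a peer file.
 * Returns 1 when the line is acceptable: either it is not a hostname*
 * entry, or its value is an IP literal. Returns 0 when a hostname*
 * entry holds anything else (for example a DNS name). */
int
peer_line_ok(const char *line)
{
    const char *eq = strchr(line, '=');
    char value[256];
    size_t len;

    if (eq == NULL || strncmp(line, "hostname", 8) != 0)
        return 1;

    /* Copy the value without any trailing newline. */
    len = strcspn(eq + 1, "\r\n");
    if (len >= sizeof(value))
        return 0;
    memcpy(value, eq + 1, len);
    value[len] = '\0';

    return is_ip_literal(value);
}
```

Run over the three lines above, every line passes; a line such as
hostname1=leader1.example.com would be flagged.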
>     > >
>     > > When we upgrade to gluster 9.3, we fall over with these errors and
>     > > gluster is now partitioned and the updated gluster servers can't reach
>     > > anybody:
>     > >
>     > > [2021-09-20 15:50:41.731543 +0000] E [name.c:265:af_inet_client_get_remote_sockaddr] 0-management: DNS resolution failed on host 172.23.0.16
>     > >
>     > >
>     > > As you can see, we have defined on purpose everything using IPs but in
>     > > 9.3 it appears this method fails. Are there any suggestions short of
>     > > putting real host names in peer files?
>     > >
>     > >
>     > >
>     > > FYI
>     > >
>     > > This supercomputer will be using gluster for part of its system
>     > > management. It is how we deploy the Image Objects (squashfs images),
>     > > which are hosted on NFS today and served by gluster leader nodes, and
>     > > how we store system logs, console logs, and other data.
>     > >
>     > > https://www.olcf.ornl.gov/frontier/   
>     > >
>     > >
>     > > Erik
>     > > ________
>     > >
>     > >
>     > >
>     > > Community Meeting Calendar:
>     > >
>     > > Schedule -
>     > > Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>     > > Bridge: https://meet.google.com/cpu-eiue-hvk   
>     > > Gluster-users mailing list
>     > > Gluster-users at gluster.org
>     > > https://lists.gluster.org/mailman/listinfo/gluster-users   
>     -------
> 
>     Community Meeting Calendar:
>     Schedule -
>     Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>     Bridge: https://meet.google.com/cpu-eiue-hvk
> 
>     Gluster-devel mailing list
>     Gluster-devel at gluster.org
>     https://lists.gluster.org/mailman/listinfo/gluster-devel
> 
> 

