[Gluster-devel] [erik.jacobson at hpe.com: [Gluster-users] gluster forcing IPV6 on our IPV4 servers, glusterd fails (was gluster update question regarding new DNS resolution requirement)]
Paul Jakma
paul at jakma.org
Tue Sep 21 12:15:44 UTC 2021
Hi,
I'd love to have more of a discussion on this. There are issues in the
code around IPv4 assumptions, and wider issues around identity that make
Gluster hard to operate on a dual-stack / many-network setup.
E.g., the assumption that a peer IP will have a hostname that resolves
to the same IP.
Paul
On Tue, 21 Sep 2021, Mohit Agrawal wrote:
> Hi,
>
> In gluster we support one address family (either IPV4 or IPV6), and the
> user decides which address family to use. It is a configurable option
> whose value the user can set in the volfile.
> It seems you are facing an issue because, as you mentioned, "HPE is
> forcing IPV4 servers into an IPV6 address". I think you can avoid the
> error if you pass address-family=inet6
> (xlator-option="transport.address-family=inet6") when mounting the
> volume; in that case you should not face the issue.
> For more details please refer to https://github.com/gluster/glusterfs/pull/2666
>
> Thanks,
> Mohit Agrawal
>
>
>
> On Tue, Sep 21, 2021 at 2:42 PM Erik Jacobson <erik.jacobson at hpe.com> wrote:
>
>> Dear devel team -
>>
>> I botched the email address here. I type "hpcm-devel" like 30 times a
>> day so I mistyped that. Sorry about that.
>>
>> Any advice appreciated and see attached patch that "gets it going for
>> us" but obviously not something you could release.
>>
>> Erik
>>
>>
>>
>> ---------- Forwarded message ----------
>> From: Erik Jacobson <erik.jacobson at hpe.com>
>> To: gluster-users at gluster.org, hpcm-devel at gluster.org
>> Cc:
>> Bcc:
>> Date: Mon, 20 Sep 2021 16:46:12 -0500
>> Subject: [Gluster-users] gluster forcing IPV6 on our IPV4 servers,
>> glusterd fails (was gluster update question regarding new DNS resolution
>> requirement)
>> I pretended I'm a low-level C programmer with network and filesystem
>> experience for a few hours.
>>
>> I'm not sure what the right solution is but what was happening was the
>> code was trying to treat our IPV4 hosts as AF_INET6 and the family was
>> incompatible with our IPV4 IP addresses. Yes, we need to move to IPV6
>> but we're hoping to do that on our own time (~50 years like everybody
>> else :)
>>
>> I found a chunk of the code that seemed to be force-setting us to
>> AF_INET6.
>>
>> While I'm sure it is not 100% the correct patch, the patch attached and
>> pasted below is working for me so I'll integrate it with our internal
>> build to continue testing.
>>
>> Please let me know if there is a configuration item I missed or a
>> different way to do this. I added -devel to this email.
>>
>> In the previous thread, you would have seen that we're testing a
>> hopeful change that will upgrade our deployed customers from gluster
>> 7.9 to gluster 9.3.
>>
>> Thank you!! Advice on next steps would be appreciated !!
>>
>>
>> diff -Narup glusterfs-9.3-ORIG/rpc/rpc-transport/socket/src/name.c glusterfs-9.3-NEW/rpc/rpc-transport/socket/src/name.c
>> --- glusterfs-9.3-ORIG/rpc/rpc-transport/socket/src/name.c 2021-06-29 00:27:44.381408294 -0500
>> +++ glusterfs-9.3-NEW/rpc/rpc-transport/socket/src/name.c 2021-09-20 16:34:28.969425361 -0500
>> @@ -252,9 +252,16 @@ af_inet_client_get_remote_sockaddr(rpc_t
>> /* Need to update transport-address family if address-family is not provided
>> to command-line arguments
>> */
>> + /* HPE This is forcing our IPV4 servers into an IPV6 address
>> + * family that is not compatible with IPV4. For now we will just set it
>> + * to AF_INET.
>> + */
>> + /*
>> if (inet_pton(AF_INET6, remote_host, &serveraddr)) {
>> sockaddr->sa_family = AF_INET6;
>> }
>> + */
>> + sockaddr->sa_family = AF_INET;
>>
>> /* TODO: gf_resolve is a blocking call. kick in some
>> non blocking dns techniques */
>>
>>
>> On Mon, Sep 20, 2021 at 11:35:35AM -0500, Erik Jacobson wrote:
>>> I missed the other important log snip:
>>>
>>> The message "E [MSGID: 101075] [common-utils.c:520:gf_resolve_ip6] 0-resolver: error in getaddrinfo [{family=10}, {ret=Address family for hostname not supported}]" repeated 620 times between [2021-09-20 15:49:23.720633 +0000] and [2021-09-20 15:50:41.731542 +0000]
>>>
>>> So I will dig into the code some here.
>>>
>>>
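For reference, the getaddrinfo failure in that log can probably be reproduced outside gluster. On Linux, family=10 is AF_INET6, and asking getaddrinfo() for an IPv4 literal while forcing AF_INET6 in the hints fails without ever touching DNS. The sketch below assumes that this is roughly what gf_resolve_ip6 ends up doing; resolve_numeric() is a hypothetical helper, not gluster code. On glibc I would expect gai_strerror() of the returned code to give the same text as in the log.

```c
#define _POSIX_C_SOURCE 200112L

#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper (not glusterfs code): resolve a numeric host
 * string under a forced address family and return getaddrinfo()'s
 * result code (0 on success, a non-zero EAI_* value on failure). */
int resolve_numeric(const char *host, int family)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int rc;

    assert(host != NULL);

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = family;         /* AF_INET or AF_INET6 */
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_NUMERICHOST;  /* literals only, no DNS lookup */

    rc = getaddrinfo(host, NULL, &hints, &res);
    if (rc == 0)
        freeaddrinfo(res);            /* only valid to free on success */
    return rc;
}
```

Calling resolve_numeric("172.23.0.16", AF_INET) should succeed, while the same address with AF_INET6 should return an error, which matches the symptom of an IPv4-only peer list being resolved with an IPv6 family.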
>>> On Mon, Sep 20, 2021 at 10:59:30AM -0500, Erik Jacobson wrote:
>>>> Hello all! I hope you are well.
>>>>
>>>> We are starting a new software release cycle and I am trying to find a
>>>> way to upgrade customers from our build of gluster 7.9 to our build of
>>>> gluster 9.3
>>>>
>>>> When we deploy gluster, we forcibly remove all references to any host
>>>> names and use only IP addresses. This is because, if for any reason a
>>>> DNS server is unreachable, even if the peer files have IPs and DNS, it
>>>> causes glusterd to be unable to reach peers properly. We can't really
>>>> rely on /etc/hosts either because customers take artistic license with
>>>> their /etc/hosts files and don't realize the problems that can cause.
>>>>
>>>> So our deployed peer files look something like this:
>>>>
>>>> uuid=46a4b506-029d-4750-acfb-894501a88977
>>>> state=3
>>>> hostname1=172.23.0.16
>>>>
>>>> That is, with full intention, we avoid host names.
>>>>
>>>> When we upgrade to gluster 9.3, we fall over with these errors and
>>>> gluster is now partitioned and the updated gluster servers can't reach
>>>> anybody:
>>>>
>>>> [2021-09-20 15:50:41.731543 +0000] E [name.c:265:af_inet_client_get_remote_sockaddr] 0-management: DNS resolution failed on host 172.23.0.16
>>>>
>>>>
>>>> As you can see, we have defined on purpose everything using IPs but in
>>>> 9.3 it appears this method fails. Are there any suggestions short of
>>>> putting real host names in peer files?
>>>>
>>>>
>>>>
>>>> FYI
>>>>
>>>> This supercomputer will be using gluster for part of its system
>>>> management. It is how we deploy the Image Objects (squashfs images)
>>>> hosted on NFS today and served by gluster leader nodes and also store
>>>> system logs, console logs, and other data.
>>>>
>>>> https://www.olcf.ornl.gov/frontier/
>>>>
>>>>
>>>> Erik
>>>> ________
>>>>
>>>>
>>>>
>>>> Community Meeting Calendar:
>>>>
>>>> Schedule -
>>>> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>>>> Bridge: https://meet.google.com/cpu-eiue-hvk
>>>> Gluster-users mailing list
>>>> Gluster-users at gluster.org
>>>> https://lists.gluster.org/mailman/listinfo/gluster-users
>> -------
>>
>> Community Meeting Calendar:
>> Schedule -
>> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>> Bridge: https://meet.google.com/cpu-eiue-hvk
>>
>> Gluster-devel mailing list
>> Gluster-devel at gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-devel
>>
>>
>
--
Paul Jakma | paul at jakma.org | @pjakma | Key ID: 0xD86BF79464A2FF6A
Fortune:
About the only thing on a farm that has an easy time is the dog.