[Gluster-devel] Problems when using different hostnames in a bricks and a peer

Rarylson Freitas rarylson at gmail.com
Thu Jul 2 13:52:57 UTC 2015


Hi Atin,

You are right! I was using version 3.5 in production, and when I checked
the Gluster source code, I looked at the wrong commit (not the latest
commit in the master branch).

The solution I proposed is already implemented in the current code. It was
done in the function gd_peerinfo_find_from_addrinfo, in the file
xlators/mgmt/glusterd/src/glusterd-peer-utils.c.

Thanks for your tip! And sorry for any inconvenience.

--
*Rarylson Freitas*

On Thu, Jul 2, 2015 at 2:01 AM, Atin Mukherjee <amukherj at redhat.com> wrote:

> Which gluster version are you using? The better peer identification
> feature (available from 3.6 onwards) should tackle this problem IMO.
>
> ~Atin
>
> On 07/02/2015 10:05 AM, Rarylson Freitas wrote:
> > Hi,
> >
> > Recently, my company needed to change the hostnames used in our Gluster
> > pool.
> >
> > Initially, we had two Gluster nodes called storage1 and storage2. Our
> > volumes used two bricks: storage1:/MYVOLUME and storage2:/MYVOLUME. We
> > put the storage1 and storage2 IPs in the /etc/hosts file of our nodes
> > and of our client servers.
> >
> > After some time, more client servers started using Gluster, and we
> > discovered that using hostnames without a domain (via /etc/hosts) on all
> > client servers is a pain in the a$$ :(. So we decided to change them to
> > something like storage1.mydomain.com and storage2.mydomain.com.
> >
> > Remember that, at this point, we already had some volumes (with bricks):
> >
> > $ gluster volume info MYVOL
> > [...]
> > Brick1: storage1:/MYDIR
> > Brick2: storage2:/MYDIR
> >
> > For simplicity, let's consider that we had two Gluster nodes, each one
> > with the following entries in /etc/hosts:
> >
> > 10.10.10.1  storage1
> > 10.10.10.2  storage2
> >
> > To implement the hostname change, we changed the /etc/hosts file to:
> >
> > 10.10.10.1  storage1 storage1.mydomain.com
> > 10.10.10.2  storage2 storage2.mydomain.com
> >
> > And we ran on storage1:
> >
> > $ gluster peer probe storage2.mydomain.com
> > peer probe: success
> >
> > Everything worked well for some time, but glusterd started to fail
> > after any reboot:
> >
> > $ service glusterfs-server status
> > glusterfs-server start/running, process 14714
> > $ service glusterfs-server restart
> > glusterfs-server stop/waiting
> > glusterfs-server start/running, process 14860
> > $ service glusterfs-server status
> > glusterfs-server stop/waiting
> >
> > To start the service again, we had to roll back the hostname1
> > setting to storage2 in /var/lib/glusterd/peers/OUR_UUID.
> >
> > After some trial and error, we discovered that if we changed the order
> > of the entries in /etc/hosts and repeated the process, everything
> > worked.
> >
> > That is, from:
> >
> > 10.10.10.1  storage1 storage1.mydomain.com
> > 10.10.10.2  storage2 storage2.mydomain.com
> >
> > To:
> >
> > 10.10.10.1  storage1.mydomain.com storage1
> > 10.10.10.2  storage2.mydomain.com storage2
> >
> > And then ran:
> >
> > gluster peer probe storage2.mydomain.com
> > service glusterfs-server restart
> >
> > So we checked the glusterd debug log and the GlusterFS source code, and
> > discovered that the key was the function
> > glusterd_friend_find_by_hostname, in the file
> > xlators/mgmt/glusterd/src/glusterd-utils.c. This function is called for
> > each brick that isn't a local brick and does the following things:
> >
> >    - It checks if the brick hostname is equal to some peer hostname;
> >    - If it is, this peer is the friend we are looking for;
> >    - If not, it gets the brick IP (resolving the hostname with the
> >    function getaddrinfo) and checks if the brick IP is equal to the peer
> >    hostname;
> >       - That is, this would match if we had run gluster peer probe
> >       10.10.10.2, since the brick IP (storage2 resolves to 10.10.10.2)
> >       would then be equal to the peer "hostname" (10.10.10.2);
> >    - If it is, this peer is the friend we are looking for;
> >    - If not, it gets the reverse name of the brick IP (using the
> >    function getnameinfo) and checks if that name is equal to the peer
> >    hostname;
> >       - This is why changing the order of the entries in /etc/hosts
> >       worked as a workaround for us;
> >    - If not, it returns an error (and glusterd will fail).
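
The lookup order described above can be modelled roughly as follows. This
is only a sketch under assumptions: it does not call glusterd or the real
getaddrinfo/getnameinfo. Instead it uses a toy /etc/hosts parser in which a
reverse lookup returns the first name on the line (the canonical name). The
names parse_hosts, make_resolvers and find_peer_legacy are made up for
illustration and do not exist in GlusterFS.

```python
from typing import Callable, Dict, List, Optional, Tuple

HOSTS_BEFORE = """\
10.10.10.1  storage1 storage1.mydomain.com
10.10.10.2  storage2 storage2.mydomain.com
"""
HOSTS_AFTER = """\
10.10.10.1  storage1.mydomain.com storage1
10.10.10.2  storage2.mydomain.com storage2
"""

Resolver = Callable[[str], Optional[str]]


def parse_hosts(text: str) -> Dict[str, List[str]]:
    """Map each IP in hosts-file text to its names, in file order."""
    table: Dict[str, List[str]] = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments
        if len(fields) >= 2:
            table.setdefault(fields[0], []).extend(fields[1:])
    return table


def make_resolvers(hosts_text: str) -> Tuple[Resolver, Resolver]:
    """Build toy forward/reverse resolvers (hypothetical stand-ins)."""
    table = parse_hosts(hosts_text)

    def forward(name: str) -> Optional[str]:
        for ip, names in table.items():
            if name in names:
                return ip
        return None

    def reverse(ip: str) -> Optional[str]:
        names = table.get(ip)
        return names[0] if names else None  # first name is canonical

    return forward, reverse


def find_peer_legacy(brick_host: str, peer_hosts: List[str],
                     forward: Resolver, reverse: Resolver) -> Optional[str]:
    """Return the matching peer "hostname", or None if nothing matches."""
    # Step 1: the brick hostname equals a stored peer hostname.
    if brick_host in peer_hosts:
        return brick_host
    # Step 2: the brick IP equals a stored peer "hostname" (peer probed by IP).
    ip = forward(brick_host)
    if ip is None:
        return None
    if ip in peer_hosts:
        return ip
    # Step 3: the reverse name of the brick IP equals a stored peer hostname.
    name = reverse(ip)
    if name is not None and name in peer_hosts:
        return name
    return None


peers = ["storage2.mydomain.com"]  # stored after the second peer probe
for label, hosts in (("before", HOSTS_BEFORE), ("after", HOSTS_AFTER)):
    fwd, rev = make_resolvers(hosts)
    print(label, find_peer_legacy("storage2", peers, fwd, rev))
```

In this model the brick host storage2 finds no peer with the first
/etc/hosts ordering, because the reverse name is storage2 while the stored
peer hostname is storage2.mydomain.com. With the second ordering, step 3
matches.
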
> >
> > However, we think that comparing the brick IP (from resolving the brick
> > hostname) with the peer IP (from resolving the peer hostname) would be a
> > simpler and more comprehensive solution. As long as the brick and the
> > peer have the same IP, it would work even if their hostnames differ.
> >
> > The solution could be:
> >
> >    - It checks if the brick hostname is equal to some peer hostname;
> >    - If it is, this peer is the friend we are looking for;
> >    - If not, it gets both the brick IPs (resolving the brick hostname
> >    with the function getaddrinfo) and the peer IPs (resolving the peer
> >    hostname) and, for each IP pair, checks if the brick IP is equal to
> >    the peer IP;
> >    - If it is, this peer is the friend we are looking for;
> >    - If not, it returns an error (and glusterd will fail).
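
A minimal sketch of the address comparison proposed above, in Python
rather than glusterd's C. It is not the code of
gd_peerinfo_find_from_addrinfo. The names resolve_all and
find_peer_by_address are hypothetical, and the resolver is passed in as a
parameter so the matching logic can be tried without real DNS or
/etc/hosts entries.

```python
import socket
from typing import Callable, Iterable, Optional, Set


def resolve_all(host: str) -> Set[str]:
    """Every address getaddrinfo reports for host; empty set on failure."""
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return set()
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return {info[4][0] for info in infos}


def find_peer_by_address(
    brick_host: str,
    peer_hosts: Iterable[str],
    resolve: Callable[[str], Set[str]] = resolve_all,
) -> Optional[str]:
    """Return the peer hostname that refers to the same node as brick_host."""
    peers = list(peer_hosts)
    # Step 1: cheap exact hostname comparison.
    for peer in peers:
        if peer == brick_host:
            return peer
    # Step 2: compare resolved addresses; any shared address is a match.
    brick_addrs = resolve(brick_host)
    for peer in peers:
        if brick_addrs & resolve(peer):
            return peer
    # Step 3: no match, the caller treats this as an error.
    return None
```

With a resolver that maps both storage2 and storage2.mydomain.com to
10.10.10.2, the brick host storage2 matches the peer
storage2.mydomain.com regardless of the order of names in /etc/hosts.
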
> >
> > What do you think about it?
> > --
> >
> > *Rarylson Freitas*
> > Computer Engineer
> >
> >
> >
> > _______________________________________________
> > Gluster-devel mailing list
> > Gluster-devel at gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-devel
> >
>
> --
> ~Atin
>