[Gluster-devel] Problems when using different hostnames in a bricks and a peer
Atin Mukherjee
atin.mukherjee83 at gmail.com
Thu Jul 2 15:49:42 UTC 2015
Not a problem at all. I am here to help, Rarylson :)
-Atin
On Jul 2, 2015 7:23 PM, "Rarylson Freitas" <rarylson at gmail.com> wrote:
> Hi Atin,
>
> You are right!!! I was using version 3.5 in production, and when I
> checked the Gluster source code, I checked the wrong commit (not the latest
> commit in the master branch).
>
> You've already implemented my proposed solution. It was done in the
> function gd_peerinfo_find_from_addrinfo, in the file
> xlators/mgmt/glusterd/src/glusterd-peer-utils.c.
>
> Thanks for your tip! And sorry for any inconvenience.
>
> --
> *Rarylson Freitas*
>
> On Thu, Jul 2, 2015 at 2:01 AM, Atin Mukherjee <amukherj at redhat.com>
> wrote:
>
>> Which Gluster version are you using? The better peer identification
>> feature (available from 3.6 onwards) should tackle this problem, IMO.
>>
>> ~Atin
>>
>> On 07/02/2015 10:05 AM, Rarylson Freitas wrote:
>> > Hi,
>> >
>> > Recently, my company needed to change our hostnames used in the Gluster
>> > Pool.
>> >
>> > At first, we had two Gluster nodes called storage1 and storage2. Our
>> > volumes used two bricks: storage1:/MYVOLUME and storage2:/MYVOLUME. We
>> > put the storage1 and storage2 IPs in the /etc/hosts file of our nodes
>> > and of our client servers.
>> >
>> > After some time, more client servers started using Gluster, and we
>> > discovered that using hostnames without a domain (via /etc/hosts) on all
>> > client servers is a pain in the a$$ :(. So we decided to change them to
>> > something like storage1.mydomain.com and storage2.mydomain.com.
>> >
>> > Remember that, at this point, we already had some volumes (with bricks):
>> >
>> > $ gluster volume info MYVOL
>> > [...]
>> > Brick1: storage1:/MYDIR
>> > Brick2: storage2:/MYDIR
>> >
>> > For simplicity, let's consider that we had two Gluster nodes, each one
>> > with the following entries in /etc/hosts:
>> >
>> > 10.10.10.1 storage1
>> > 10.10.10.2 storage2
>> >
>> > To implement the hostname changes, we changed the /etc/hosts file to:
>> >
>> > 10.10.10.1 storage1 storage1.mydomain.com
>> > 10.10.10.2 storage2 storage2.mydomain.com
>> >
>> > And we ran on storage1:
>> >
>> > $ gluster peer probe storage2.mydomain.com
>> > peer probe: success
>> >
>> > Everything worked well for some time, but glusterd started to fail
>> > after any reboot:
>> >
>> > $ service glusterfs-server status
>> > glusterfs-server start/running, process 14714
>> > $ service glusterfs-server restart
>> > glusterfs-server stop/waiting
>> > glusterfs-server start/running, process 14860
>> > $ service glusterfs-server status
>> > glusterfs-server stop/waiting
>> >
>> > To start the service again, it was necessary to roll back the hostname1
>> > setting to storage2 in /var/lib/glusterd/peers/OUR_UUID.
>> >
>> > After some trial and error, we discovered that if we changed the order
>> > of the entries in /etc/hosts and repeated the process, everything worked.
>> >
>> > That is, from:
>> >
>> > 10.10.10.1 storage1 storage1.mydomain.com
>> > 10.10.10.2 storage2 storage2.mydomain.com
>> >
>> > To:
>> >
>> > 10.10.10.1 storage1.mydomain.com storage1
>> > 10.10.10.2 storage2.mydomain.com storage2
>> >
>> > And then ran:
>> >
>> > gluster peer probe storage2.mydomain.com
>> > service glusterfs-server restart
>> >
>> > So we checked the glusterd debug log and the GlusterFS source code, and
>> > discovered that the big secret was the function
>> > glusterd_friend_find_by_hostname, in the file
>> > xlators/mgmt/glusterd/src/glusterd-utils.c. This function is called for
>> > each brick that isn't a local brick and does the following things:
>> >
>> >    - It checks if the brick hostname is equal to some peer hostname;
>> >       - If it is, this peer is our wanted friend;
>> >    - If not, it gets the brick IP (resolving the hostname using the
>> >      function getaddrinfo) and checks if the brick IP is equal to the
>> >      peer hostname;
>> >       - That is, we could have run gluster peer probe 10.10.10.2, since
>> >         the brick IP (storage2 resolves to 10.10.10.2) would then be
>> >         equal to the peer "hostname" (10.10.10.2);
>> >       - If it is, this peer is our wanted friend;
>> >    - If not, it gets the reverse name of the brick IP (using the function
>> >      getnameinfo) and checks if the brick reverse name is equal to the
>> >      peer hostname;
>> >       - This is why changing the order of the entries in /etc/hosts
>> >         worked as a workaround for us;
>> >    - If not, it returns an error (and glusterd will fail).
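
The lookup order described in the list above can be sketched in Python as follows. This is only an illustration of the behaviour as described in this thread, not the glusterd C code: find_peer_old, make_resolvers and the two host tables are hypothetical names, and resolve/reverse are stand-ins for getaddrinfo/getnameinfo. It assumes that a reverse lookup returns the first name listed on the matching /etc/hosts line.

```python
# Sketch of the described lookup order (NOT the actual glusterd code).
# `resolve` and `reverse` are hypothetical stand-ins for
# getaddrinfo() and getnameinfo().

def find_peer_old(brick_host, peer_hosts, resolve, reverse):
    """Return the matching peer "hostname", or None if no peer matches."""
    # Step 1: direct comparison of brick hostname and peer hostnames.
    if brick_host in peer_hosts:
        return brick_host
    for ip in resolve(brick_host):
        # Step 2: compare the brick IP with the peer "hostname" strings.
        if ip in peer_hosts:
            return ip
        # Step 3: compare the reverse name of the brick IP.
        name = reverse(ip)
        if name in peer_hosts:
            return name
    return None  # glusterd would fail at this point


def make_resolvers(hosts):
    """Model /etc/hosts as a list of (ip, [names]) entries.

    Assumption: the first name on a line is what a reverse lookup returns.
    """
    def resolve(name):
        return [ip for ip, names in hosts if name in names]

    def reverse(ip):
        for addr, names in hosts:
            if addr == ip:
                return names[0]
        return None

    return resolve, reverse


short_first = [("10.10.10.1", ["storage1", "storage1.mydomain.com"]),
               ("10.10.10.2", ["storage2", "storage2.mydomain.com"])]
fqdn_first = [("10.10.10.1", ["storage1.mydomain.com", "storage1"]),
              ("10.10.10.2", ["storage2.mydomain.com", "storage2"])]

# Peer hostname as stored after probing storage2.mydomain.com.
peers = ["storage2.mydomain.com"]

print(find_peer_old("storage2", peers, *make_resolvers(short_first)))  # None
print(find_peer_old("storage2", peers, *make_resolvers(fqdn_first)))
# prints storage2.mydomain.com
```

With the short name first, the reverse lookup yields storage2, which no longer matches the stored peer hostname; with the FQDN first, step 3 matches.
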
>> >
>> > However, we think that comparing the brick IP (resolving the brick
>> > hostname) with the peer IP (resolving the peer hostname) would be a
>> > simpler and more comprehensive solution. Even if the brick and the peer
>> > have different hostnames, it would work as long as they have the same IP.
>> >
>> > The solution could be:
>> >
>> >    - It checks if the brick hostname is equal to some peer hostname;
>> >       - If it is, this peer is our wanted friend;
>> >    - If not, it gets both the brick IPs (resolving the brick hostname
>> >      using the function getaddrinfo) and the peer IPs (resolving the peer
>> >      hostname) and, for each IP pair, checks if a brick IP is equal to a
>> >      peer IP;
>> >       - If it is, this peer is our wanted friend;
>> >    - If not, it returns an error (and glusterd will fail).
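
A minimal sketch of the proposed IP-pair comparison, again in Python for illustration only. The names find_peer_by_ip, HOSTS and resolve are hypothetical; resolve stands in for getaddrinfo and reads from an in-memory table so the example runs without DNS. It does not claim to reproduce gd_peerinfo_find_from_addrinfo.

```python
# Sketch of the proposed lookup: match brick and peer by resolved IP.
# HOSTS is a hypothetical in-memory stand-in for /etc/hosts.
HOSTS = {
    "storage1": ["10.10.10.1"],
    "storage1.mydomain.com": ["10.10.10.1"],
    "storage2": ["10.10.10.2"],
    "storage2.mydomain.com": ["10.10.10.2"],
}


def resolve(name):
    """Stand-in for getaddrinfo(): return every IP known for `name`."""
    return HOSTS.get(name, [])


def find_peer_by_ip(brick_host, peer_hosts, resolve=resolve):
    """Return the peer whose addresses overlap the brick's, or None."""
    # Step 1: direct hostname comparison, as before.
    if brick_host in peer_hosts:
        return brick_host
    # Step 2: resolve both sides and look for any shared IP.
    brick_ips = set(resolve(brick_host))
    for peer in peer_hosts:
        if brick_ips & set(resolve(peer)):
            return peer
    return None  # no match: the caller would report an error


# The brick still uses the short name, the peer uses the FQDN.
print(find_peer_by_ip("storage2", ["storage2.mydomain.com"]))
# prints storage2.mydomain.com
```

Because only addresses are compared, the result no longer depends on the order of the names in /etc/hosts.
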
>> >
>> > What do you think about it?
>> > --
>> >
>> > *Rarylson Freitas*
>> > Computer Engineer
>> >
>> >
>> >
>> > _______________________________________________
>> > Gluster-devel mailing list
>> > Gluster-devel at gluster.org
>> > http://www.gluster.org/mailman/listinfo/gluster-devel
>> >
>>
>> --
>> ~Atin
>>