[Gluster-users] Failed rebalance resulting in major problems
Jeff Darcy
jdarcy at redhat.com
Mon Nov 11 19:33:20 UTC 2013
On 11/11/2013 02:15 PM, Shawn Heisey wrote:
> Is this possibly a result of my split-network architecture? I have
> a total of six gluster peers. The four servers with bricks have two
> networks, both gigabit - a back-end network where they can talk to
> each other, and a network (with a default gateway) where they can
> talk to the other two peers. Name resolution for gluster on those
> machines is done via hosts files that override DNS. The hosts files
> use the back-end network, DNS uses the other network.
>
> The other two peers have no bricks, but act as NFS/CIFS entry points
> from the rest of the network - network access servers. Their name
> resolution is all DNS. Those NAS servers also have a number of
> other network cards in them so that various networks can reach the
> storage without traversing our central firewall and overloading it.
There's nothing about a split-network configuration like yours that
would cause something like this *by itself*, but anything that creates
greater complexity also creates new possibilities for something to go
wrong. Just to be safe, if I were you, I'd double- and triple-check the
DNS and /etc/hosts configurations on all machines to make sure some tiny
error didn't creep in. If your bricks are at the same paths on each
machine and a hostname resolves to the wrong address, a machine could
think it's connecting to one brick and actually end up connecting to
another. I haven't thought through all of the ramifications, but just
thinking about how that might affect rebalance makes me a bit queasy.
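
As a rough sketch of that double-check, assuming you have already copied
the /etc/hosts contents from each brick server into one place, something
like the following would flag any name that maps to different addresses
on different machines. The function names, machine names, and addresses
are hypothetical, and the "first entry wins" rule is an assumption about
how the resolver reads the file. Only compare machines that are supposed
to agree (for example the four brick servers), since the NAS peers
resolve through DNS onto the other network by design.

```python
from collections import defaultdict


def parse_hosts(text):
    """Parse /etc/hosts-style text into {hostname: address}.

    Assumption: if a name appears more than once, the first entry wins.
    """
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        for name in names:
            mapping.setdefault(name.lower(), address)
    return mapping


def find_disagreements(hosts_by_machine):
    """Return {hostname: {address: [machines]}} for every hostname that
    maps to more than one address across the given machines."""
    seen = defaultdict(lambda: defaultdict(list))
    for machine, text in sorted(hosts_by_machine.items()):
        for name, address in parse_hosts(text).items():
            seen[name][address].append(machine)
    return {name: dict(addrs) for name, addrs in seen.items() if len(addrs) > 1}


if __name__ == "__main__":
    # Hypothetical sample data: server2 has brick2 pointing at brick1's address.
    samples = {
        "server1": "10.0.0.1 brick1\n10.0.0.2 brick2\n",
        "server2": "10.0.0.1 brick1\n10.0.0.1 brick2\n",
    }
    for name, addrs in find_disagreements(samples).items():
        print(name, addrs)
```

This only compares the hosts files with each other. It does not query
DNS or gluster itself, so what each peer actually resolves at runtime
should still be confirmed by hand.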