[Gluster-users] Failed rebalance resulting in major problems
Shawn Heisey
gluster at elyograg.org
Thu Nov 7 21:04:42 UTC 2013
(resending because my reply only went to Lukáš)
On 11/7/2013 3:20 AM, Lukáš Bezdička wrote:
> I strongly suggest not using 3.3.1 or whole 3.3 branch. I would only
> go for 3.4.1 on something close to production and even there I
> wouldn't yet use rebalance/shrinking. We give gluster heavy testing
> before it goes to production. As for updating, why don't you build
> your own packages? We have maintained our own builds for several years
> now, with our patches, which gladly end up in gluster upstream sooner or later.
When I built the system, Gluster 3.3.1 and CentOS 6.3 were the latest
versions available. Before I added the new storage last week, I got
onto the IRC channel and asked whether I should install the same version
on the new servers, install the new version on the new servers, or
upgrade the entire cluster before adding anything. I got no actual
answers to that question, and there wasn't really a lot of discussion
that I noticed. If someone did answer my question at that time, I
missed it.
I decided to play it safe by installing the 3.3.1 version on the new
servers. It was a slightly newer revision, but I was told that there
were only packaging differences, that the code itself was unchanged. I
installed CentOS 6.4, which I figured would be safe because Gluster is
user-space and it's typically safe to upgrade RHEL/CentOS minor versions.
Before we deployed, I did do tests on my testbed where I added new
storage bricks, did rebalances, removed bricks, etc. There were no
problems with adding bricks or rebalancing, but I had nowhere near as
many files or space used as we have in production. I did encounter a
bug with removing bricks, which I filed:
https://bugzilla.redhat.com/show_bug.cgi?id=862347
Except for the 91 files that appear to be simply gone and unrecoverable,
I am pretty much done dealing with the fallout ... but I still have
nearly 9TB of data that needs to migrate before the bricks will be
evenly filled, and I can't be sure that the same failure won't happen when I
request another rebalance, or the next time we need to increase the volume size by
adding bricks. I really need an expert to evaluate our setup and make
recommendations.
I sent a request off to Red Hat Consulting for help on this, but I
haven't heard anything back from them.
Thanks,
Shawn