[Gluster-users] Rebalance times in 3.2.5 vs 3.4.2
matted at MIT.EDU
Thu Feb 27 05:57:28 UTC 2014
Hopefully I'm not derailing this thread too far, but I have a related
rebalance progress/speed issue.
I have a rebalance process started that's been running for 3-4 days. Is
there a good way to see if it's running successfully, or might this be a
sign of some problem?
This is on a 4-node distribute setup with v3.4.2 and 45T of data.
The *-rebalance.log has been silent since some informational messages when
the rebalance started. There were a few initial warnings and errors:
E [client-handshake.c:1397:client_setvolume_cbk] 0-cluster2-client-0:
SETVOLUME on remote-host failed: Authentication failed
W [client-handshake.c:1365:client_setvolume_cbk] 0-cluster2-client-4:
failed to set the volume (Permission denied)
W [client-handshake.c:1391:client_setvolume_cbk] 0-cluster2-client-4:
failed to get 'process-uuid' from reply dict
W [socket.c:514:__socket_rwv] 0-cluster2-client-3: readv failed (No data
"gluster volume status" reports that the rebalance is in progress, the
process listed in vols/<volname>/rebalance/<hash>.pid is still running on
the server, but "gluster volume rebalance <volname> status" reports 0 for
everything (files scanned or rebalanced, failures, run time).
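
For anyone wanting to repeat the two checks I did by hand (is the process
named in the pid file still alive, and how long has the rebalance log been
silent), here is a minimal Python sketch. The function names and the paths
you pass in are my own placeholders, not anything shipped with Gluster, and
it assumes a POSIX host where os.kill with signal 0 only probes for
existence. It shows liveness and log staleness only; it cannot tell whether
the rebalance is actually making progress.

```python
import os
import time


def pid_is_running(pid_file):
    """Return True if the PID stored in pid_file belongs to a live process."""
    with open(pid_file) as fh:
        pid = int(fh.read().strip())
    try:
        os.kill(pid, 0)  # signal 0: existence check only, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def seconds_since_modified(path, now=None):
    """Return how many seconds ago the file at path was last written."""
    now = time.time() if now is None else now
    return now - os.path.getmtime(path)


def summarize(pid_file, log_file):
    """Build a one-line summary of process liveness and log staleness."""
    alive = pid_is_running(pid_file)
    idle_hours = seconds_since_modified(log_file) / 3600
    return f"rebalance process alive: {alive}; log idle for {idle_hours:.1f} h"
```

Call summarize() with the real pid file and log file locations on your
server; I have deliberately not hard-coded any path.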
On Thu, Feb 27, 2014 at 12:39 AM, Shylesh Kumar <shmohan at redhat.com> wrote:
> Hi Viktor,
> Lots of optimizations and improvements went in for 3.4 so it should be
> faster than 3.2.
> Just to make sure what's happening, could you please check the rebalance
> log, which will be in
> /var/log/glusterfs/<volname>-rebalance.log, and see whether there is any
> progress?
> Viktor Villafuerte wrote:
>> Can anybody confirm or dispute whether this is normal?
>> On Tue 25 Feb 2014 15:21:40, Viktor Villafuerte wrote:
>>> Hi all,
>>> I have a distributed-replicated set with 2 servers (replicas) and am
>>> trying to add another set of replicas: 1 x (1x1) => 2 x (1x1)
>>> I have about 23G of data which I copy onto the first replica, check
>>> everything and then add the other set of replicas and eventually
>>> rebalance fix-layout, migrate-data.
>>> Now on
>>> Gluster v3.2.5 this took about 30 mins (to rebalance + migrate-data)
>>> Gluster v3.4.2 this has been running for almost 4 hours and it's still
>>> not finished
>>> As I may have to do this in production, where the amount of data is
>>> significantly larger than 23G, I'm looking at about three weeks of wait
>>> to rebalance :)
>>> Now my question is whether this is as it's meant to be. I can see that v3.4.2
>>> gives me more info about the rebalance process etc, but that surely
>>> cannot justify the enormous time difference.
>>> Is this normal/expected behaviour? If so I will have to stick with the
>>> v3.2.5 as it seems way quicker.
>>> Please, let me know if there is any 'well known' option/way/secret to
>>> speed the rebalance up on v3.4.2.
>>> Viktor Villafuerte
>>> Optus Internet Engineering
>>> t: 02 808-25265
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org